We know that Python is an object-oriented language. Object-oriented programming focuses on data and how it is grouped rather than on standalone functions and logic. It treats data and the functionality associated with it as objects.
Objects are entities that can store data as well as provide functions to operate on that data. But how do we create objects? We create them by using classes. Today, we are going to learn about Python dataclasses in detail, along with their use cases.
But first, we need to revise classes in Python.
What are classes in Python?
A class is a collection of data items and their associated methods. It defines a code template that can be used to create objects. Let us understand this with a basic example.
Let us say you want to build an office. The first thing you would require is a building plan from the architect. After that, you would start constructing the building. In this example, the building plan created by the architect is the class and the constructed building is called the object of that particular class.
Similarly, in Python we need classes in order to create objects. Classes are created using the class keyword. Inside a class, we have two types of items: attributes and methods.
What are attributes? Attributes describe a class. They are basically variables that store values used to describe a class. For example, if we have a class called cars, then it will have attributes like the number of doors, number of wheels, engine horsepower, etc.
Methods are functions inside a class that describe what the class can do. Continuing our previous example, if we have a class called cars, then its methods would be the car moving forward, changing gears, opening the door, etc.
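To make attributes and methods concrete, here is a minimal sketch of the car example. The class name Car, its attributes, and the change_gear method are made up for illustration and do not come from any library.

```python
class Car:
    """A hypothetical Car class, used only for illustration."""

    def __init__(self, doors, wheels, horsepower):
        # Attributes: variables that describe each Car object
        self.doors = doors
        self.wheels = wheels
        self.horsepower = horsepower
        self.gear = 1

    def change_gear(self, gear):
        # Method: a function that operates on the object's data
        self.gear = gear
        return self.gear


# Create an object (the "building") from the class (the "building plan")
my_car = Car(doors=4, wheels=4, horsepower=150)
print(my_car.doors)
my_car.change_gear(3)
print(my_car.gear)
```

Each object created from Car gets its own copy of the attributes, while the methods are shared by all objects of the class.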
Python 3.7 introduced a new module that makes classes easier to write when they mainly hold data: the dataclasses module. This module provides built-in support for storing data in classes and for representing that data. Dataclasses are created by applying the dataclass decorator from the dataclasses module to a class.
What are decorators in Python?
Python decorators are used to apply additional functionality to functions and classes. They provide this functionality without our having to write additional code inside the function or class itself.
For example, if we have a function that splits a string and then produces an output but we want the output to be uppercase, we would apply a decorator to the function that would convert all output to upper case.
An example of this is given below:
    def make_upper(function):
        # Wrap the given function so that its result is upper-cased
        def upper():
            f = function()
            return f.upper()
        return upper

    @make_upper  # decorator
    def helloworld():
        return "hello world"

    print(helloworld())
Output:
HELLO WORLD
Decorators are applied in Python using the @ symbol. A decorator can be user-defined, or it can come from a module or class that we import. Decorators can be applied not only to functions but also to classes, as we shall see below.
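As a rough sketch of a decorator applied to a class, the example below attaches an extra method to a class. The names add_describe, describe, and Book are hypothetical and chosen only for illustration; this is not how the dataclass decorator itself is implemented.

```python
def add_describe(cls):
    # Attach a new method to the class, then return the same class
    def describe(self):
        return f"{cls.__name__} with attributes {vars(self)}"

    cls.describe = describe
    return cls


@add_describe  # class decorator
class Book:
    def __init__(self, title):
        self.title = title


book = Book("The Ghost Story")
print(book.describe())
```

The class body never defines describe, yet every Book object has it, because the decorator added it when the class was created.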
In simple terms, decorators decorate an existing function or class with more functionality. Now that we know what classes and decorators are, let us have a look at what dataclasses are.
What is a dataclass?
A dataclass in Python is a class written specifically for the storage and representation of data. Dataclasses come with automatically generated methods that take care of storing the data and representing it.
Now that we know the basics, let us have a look at how dataclasses are created and used in Python. Dataclasses are provided by the dataclasses module, which has been part of Python's standard library since version 3.7. A module is a file of Python code that provides pre-defined functionality.
From the dataclasses module we import dataclass, which is a decorator. After we have imported it, we apply the decorator to the new class that we create, which gives that class the properties of a dataclass.
Here is the source code:
    from dataclasses import dataclass

    @dataclass  # dataclass decorator
    class authors:
        name: str  # Type hints
        aid: int
        books: str
In this code, we have created a new class named authors, and we have applied the dataclass property to it using the dataclass decorator. If you observe the code, you will see that dataclasses have a new manner of writing variables inside them.
Every variable in a class that has the dataclass decorator is declared with a type hint. Type hints are loosely comparable to declaring variables in other programming languages. Where many other languages write the data type before the variable name, in Python the data type is written after the variable name, separated by a colon. The decorator uses these annotations to work out which variables are fields of the dataclass. Note that Python does not enforce type hints when the program runs.
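The short sketch below illustrates both points, reusing the authors class from above. It uses fields() from the dataclasses module to list the declared fields, and then passes a string where an int is hinted to show that the hint is not checked at runtime. The value "not-a-number" is made up for the demonstration.

```python
from dataclasses import dataclass, fields


@dataclass
class authors:
    name: str
    aid: int
    books: str


# The type hints are recorded as field metadata on the class...
for f in fields(authors):
    print(f.name, f.type)

# ...but they are not checked when an object is created
a = authors("Ivan", "not-a-number", "The Ghost Story")
print(type(a.aid))
```

If the types need to be verified, that has to be done separately, for example with a static type checker or with explicit checks in the code.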
Use case of Dataclass in Python
As discussed above, dataclasses are specifically used for the representation of data and its storage. Therefore, the two most popular use cases of dataclasses are printing an object and checking two objects for equality. Before we discuss why dataclasses are used instead of normal classes, let us have a look at a code example that compares the two.
    from dataclasses import dataclass

    @dataclass  # dataclass decorator
    class authors:
        name: str  # Type hints
        aid: int
        books: str

    class author:
        def __init__(self, name, aid, books):
            self.name = name
            self.aid = aid
            self.books = books

    Obj1 = authors("Ivan", 1254, "The Ghost Story")
    Obj2 = authors("Ivan", 1254, "The Ghost Story")
    Obj3 = author("Ivan", 1254, "The Ghost Story")
    Obj4 = author("Ivan", 1254, "The Ghost Story")
    # Obj1 and Obj2 are dataclass objects
    # Obj3 and Obj4 are normal class objects

    print("Difference in Print Operation")
    print(Obj1)
    print(Obj3)

    print("\nDifference in Equality Check")
    print(Obj1 == Obj2)
    print(Obj3 == Obj4)
Output:
Difference in Print Operation
authors(name='Ivan', aid=1254, books='The Ghost Story')
<__main__.author object at 0x0000023EB6AA04F0>
Difference in Equality Check
True
False
In the above example, we can observe that the print operation behaves very differently on a dataclass object and on a normal class object. When we print an object of a normal class, the output we receive is the default representation: the name of the class and the memory address of the object.
But when we use the print operation on a dataclass object, the output we receive is a readable representation that shows the class name together with all of the object's attributes and their values.
We also saw how the equality operator "==" works for normal classes and for a dataclass. When we compare two objects of a normal class, the interpreter by default checks whether both are the very same object in memory, so two separately created objects are not equal even if their attributes match.
But when we use the equality operator on two dataclass objects, it checks whether they belong to the same class and whether their attributes hold equal values.
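To see what the dataclass decorator saves us from writing, here is a rough hand-written approximation of the same behaviour for the normal author class. It is a simplified sketch, not the exact code that the decorator generates.

```python
class author:
    def __init__(self, name, aid, books):
        self.name = name
        self.aid = aid
        self.books = books

    def __repr__(self):
        # Readable representation, similar in shape to what a dataclass prints
        return (
            f"author(name={self.name!r}, aid={self.aid!r}, "
            f"books={self.books!r})"
        )

    def __eq__(self, other):
        # Only compare objects of the same class, attribute by attribute
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.name, self.aid, self.books) == (
            other.name,
            other.aid,
            other.books,
        )


obj3 = author("Ivan", 1254, "The Ghost Story")
obj4 = author("Ivan", 1254, "The Ghost Story")
print(obj3)
print(obj3 == obj4)
```

With these two methods written by hand, the normal class prints its attributes and compares by value, which is what the decorator provides automatically from the type-hinted variables.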
Conclusion
We learned about normal classes and their limitations when it comes to the representation and storage of data. We then learned about dataclasses, which the dataclasses module provides specifically for storing and representing data. You should now understand the difference between a normal class and a dataclass.