This Architecture Decision Record (ADR) compares two Python options for defining structured, type-hinted data: Data Classes and the `attrs` library. Based on the comparison, we justify our choice of `attrs` over Data Classes.
- Ease of use
- Performance
- Extensibility and customization
- Compatibility with different Python versions
- Type safety
- Python Data Classes (available in Python 3.7+)
- The `attrs` library
Python Data Classes:
- Introduced in Python 3.7, they offer a simple and clean syntax for defining classes with minimal boilerplate.
- Automatically generates special methods like `__init__`, `__repr__`, and `__eq__`.
- Supports default values, including mutable defaults through `default_factory`.
- Provides `asdict` and `astuple` utility functions for easy conversion.
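As a rough illustration of the helpers listed above, the sketch below shows `default_factory` giving each instance its own list, and `asdict`/`astuple` converting an instance. The `Tag` class and its fields are hypothetical and not part of this ADR's data model.

```python
from dataclasses import asdict, astuple, dataclass, field
from typing import List


@dataclass
class Tag:  # hypothetical class, used only for illustration
    label: str
    # default_factory builds a fresh list per instance; a plain `= []`
    # default is rejected by dataclasses with a ValueError.
    aliases: List[str] = field(default_factory=list)


first = Tag("soil")
second = Tag("water")
first.aliases.append("ground")  # does not affect `second`

as_dict = asdict(first)    # {'label': 'soil', 'aliases': ['ground']}
as_tuple = astuple(first)  # ('soil', ['ground'])
```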
Example:

    from dataclasses import dataclass, field
    from datetime import datetime
    from typing import Dict, Optional, Tuple


    @dataclass
    class Site:
        name: str
        gps_location: Tuple[float, float]
        metadata: Optional[Dict[str, str]] = None
        # default_factory runs on every instantiation; a bare datetime.now()
        # would be evaluated only once, when the class is defined.
        date_of_creation: datetime = field(default_factory=datetime.now)


    site = Site("Example Site", (40.7128, -74.0060), {"description": "An example site"})
    print(site)

attrs:
- Provides a more concise syntax and additional features.
- Uses a class decorator (`@attr.s` in the examples here) to generate special methods.
- Offers a wide range of customization options, such as specifying custom validation functions, customizing generated methods, and providing default values.
- Supports `slots`, which can improve performance and reduce memory usage.
Example:

    import attr
    from typing import Dict, Optional, Tuple
    from datetime import datetime


    @attr.s
    class Site:
        name: str = attr.ib()
        gps_location: Tuple[float, float] = attr.ib()
        metadata: Optional[Dict[str, str]] = attr.ib(default=None)
        date_of_creation: datetime = attr.ib(factory=datetime.now)


    site = Site("Example Site", (40.7128, -74.0060), {"description": "An example site"})
    print(site)

Python Data Classes:
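- Part of the standard library; the generated methods are ordinary Python code, so instances perform like hand-written classes.
- No built-in support for `slots` before Python 3.10, so `__slots__` must be declared manually on 3.7 to 3.9 for better memory usage. Python 3.10 added `@dataclass(slots=True)`.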
attrs:
- Claims to have better performance than Data Classes in certain scenarios.
- Performance gains could be significant when using the `slots=True` option, which reduces memory overhead.
NOTE
`__slots__` is a Python feature that can reduce memory usage and, in some cases, speed up attribute access for your classes.
By default, each instance of a Python class stores its attributes in a per-instance dictionary called `__dict__`. Because `__dict__` is dynamic, you can add or remove attributes at runtime. However, this flexibility comes at the cost of increased memory overhead.
`__slots__` lets you declare a fixed set of attributes for a class. Instances then store those attributes in a compact, fixed layout instead of the default `__dict__`. This can lead to significant memory savings, especially when a program creates a large number of instances.
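The difference can be seen without any library. The following is a minimal sketch in plain Python; `PlainPoint` and `SlottedPoint` are hypothetical classes used only for illustration.

```python
class PlainPoint:
    """Ordinary class: attributes live in a per-instance __dict__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint:
    """Same fields, but declared up front through __slots__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


plain = PlainPoint(1, 2)
slotted = SlottedPoint(1, 2)

plain.z = 3  # allowed: __dict__ accepts new attributes at runtime
plain_has_dict = hasattr(plain, "__dict__")      # True
slotted_has_dict = hasattr(slotted, "__dict__")  # False

try:
    slotted.z = 3  # "z" is not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True
```

The trade-off is visible in the last lines: the slotted class gives up the ability to grow new attributes in exchange for the smaller per-instance footprint.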
To use slots with `attrs`, pass `slots=True` to the class decorator. Here's an example:

    import attr
    from typing import List, Optional


    @attr.s(slots=True)
    class Person:
        name: str = attr.ib()
        age: int = attr.ib()
        email: Optional[str] = attr.ib(default=None)
        # factory=list gives each instance its own empty list.
        hobbies: List[str] = attr.ib(factory=list)

Python Data Classes:
- Limited customization options for generated methods.
- Validation and conversion must be implemented manually, typically in `__post_init__`; per-field metadata is available through `field(metadata=...)`.
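To make the manual work concrete, here is a minimal sketch of hand-written conversion and validation in `__post_init__`. The `Reading` class and its fields are hypothetical, chosen only to illustrate the pattern.

```python
from dataclasses import dataclass


@dataclass
class Reading:  # hypothetical class, used only for illustration
    depth_cm: float
    soil_id: str

    def __post_init__(self):
        # Hand-written conversion: accept "12.5" or 12 and store a float.
        self.depth_cm = float(self.depth_cm)
        # Hand-written normalization and validation.
        self.soil_id = self.soil_id.strip().lower()
        if not self.soil_id:
            raise ValueError("soil_id must not be empty")


reading = Reading("12.5", "  Clay ")
```

Every class that needs this behaviour has to repeat the same kind of code, which is the cost this comparison is pointing at.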
attrs:
- Offers a wide range of customization options, such as specifying custom validation functions, customizing generated methods, and providing default values.
- Supports `slots`, which can improve performance and memory usage.
- Built-in support for validators, converters, and metadata.
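The built-in converter and metadata support can be sketched as follows. This assumes the `attrs` package is installed; the `Reading` class, its fields, and the `"unit"` metadata key are hypothetical and serve only as an illustration.

```python
import attr


@attr.s(slots=True)
class Reading:  # hypothetical class, used only for illustration
    # The converter runs on the raw argument before it is stored.
    depth_cm: float = attr.ib(converter=float, metadata={"unit": "cm"})
    soil_id: str = attr.ib(converter=str.lower)


reading = Reading("12.5", "Clay")

# Metadata is read back through the class's attribute definitions.
unit = attr.fields(Reading).depth_cm.metadata["unit"]
```

Here `reading.depth_cm` ends up as the float `12.5` and `reading.soil_id` as `"clay"`, without any hand-written `__init__` or `__post_init__` logic.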
Python Data Classes:
- Available only in Python 3.7 and later versions.
attrs:
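- Older releases are compatible with Python 2.7 and Python 3.4+. Recent releases have dropped Python 2 and early 3.x versions, so check the `attrs` changelog for the current minimum.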
Python Data Classes:
- Data Classes do not include built-in runtime validation for type safety.
- Manual implementation of runtime checks in the `__post_init__` method is required to ensure type safety at runtime.
- Offers seamless integration with Python's type hinting system, providing static type safety and improved developer experience.
- Supports the use of `TypeVar`, `NewType`, and other typing constructs.
- Compatible with static type checkers like `mypy`, which can catch type-related issues at development time.
- Python Data Classes are natively supported in Python 3.7+ and are well-integrated with most Python editors and IDEs.
- Type hints used in Data Classes can provide code completion, linting, and error checking, enhancing the development experience.
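A manual runtime type check in `__post_init__` might look like the sketch below. It is deliberately simple: it handles only plain classes such as `str` and `float`, and the `elevation_m` field is a hypothetical addition for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Site:
    name: str
    elevation_m: float  # hypothetical field, used only for illustration

    def __post_init__(self):
        # Works only for plain classes such as str or float; generic
        # annotations like Dict[str, str] would need extra handling.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, got {type(value).__name__}"
                )


site = Site("Example Site", 12.0)

try:
    Site("Example Site", "high")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Static checkers such as `mypy` would flag the second call before the code runs; the runtime check only matters for data that arrives without static guarantees, for example from parsed files.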
attrs:
- `attrs` provides additional safety through built-in support for validators and converters, which help ensure that the data conforms to the expected types and constraints at runtime.
- Custom validation functions can be specified using the `validator` parameter in the attribute definition.
- `attrs` also integrates well with Python's type hinting system, providing static type safety and improved developer experience.
- Can work with `TypeVar`, `NewType`, and other typing constructs.
- Compatible with static type checkers like `mypy`.
- As a third-party library, `attrs` may require additional configuration or plugins to enable full editor support.
- However, once configured, most Python editors and IDEs can provide code completion, linting, and error checking for `attrs` classes, similar to Data Classes.
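As a short sketch of the runtime checks described above, the example below uses two validators that ship with `attrs`, `attr.validators.instance_of` and `attr.validators.in_`. It assumes the `attrs` package is installed, and the list of allowed soil values is illustrative.

```python
import attr


@attr.s(slots=True)
class Site:
    # instance_of checks the runtime type when the instance is created.
    name: str = attr.ib(validator=attr.validators.instance_of(str))
    # in_ restricts the value to a fixed set of options.
    soil_id: str = attr.ib(
        validator=attr.validators.in_(["clay", "dirt", "in-between"])
    )


site = Site("Sample Site", "clay")

try:
    Site(42, "clay")  # wrong type for name
    type_rejected = False
except TypeError:
    type_rejected = True

try:
    Site("Sample Site", "sand")  # value outside the allowed set
    value_rejected = False
except ValueError:
    value_rejected = True
```

The type annotations remain what static checkers read; the validators add the runtime enforcement on top.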
Python Data Classes:

    from dataclasses import dataclass


    def validate_soil_id(value, name):
        valid_soils = ['clay', 'dirt', 'in-between']
        if value not in valid_soils:
            raise ValueError(f"{name} must be one of {', '.join(valid_soils)}")


    # frozen=True makes instances read-only: assigning to or deleting an
    # attribute raises dataclasses.FrozenInstanceError.
    @dataclass(frozen=True)
    class Site:
        name: str
        soil_id: str

        def __post_init__(self):
            # Data Classes do not validate on their own, so check by hand.
            validate_soil_id(self.soil_id, "Soil ID")


    site = Site("Sample Site", "clay")
    print(site)

attrs:

    import attr


    def validate_soil_id(instance, attribute, value):
        valid_soils = ['clay', 'dirt', 'in-between']
        if value not in valid_soils:
            raise ValueError(f"{attribute.name} must be one of {', '.join(valid_soils)}")


    @attr.s(slots=True)
    class Site:
        name: str = attr.ib()
        soil_id: str = attr.ib(validator=validate_soil_id)


    site = Site("Sample Site", "clay")
    print(site)

| Feature | Python Data Classes | attrs |
|---|---|---|
| Built-in Python support | Yes | No |
| Runtime validation | No (manual) | Yes |
| Static type checking | Yes | Yes |
| Default values for attributes | Yes | Yes |
| Custom attribute converters | No (manual) | Yes |
| Support for slots | Yes (manual before Python 3.10) | Yes |
| Flexible decorators/ordering | Limited | Yes |
| Readonly/Immutable attributes | Yes (`frozen=True`) | Yes |
| Type hinting compatibility | Yes | Yes |
| Mypy and other type checkers | Yes | Yes |
| IDE/editor integration | Native | Plugin |
- Python Data Classes documentation: https://docs.python.org/3/library/dataclasses.html
- `attrs` library documentation: https://www.attrs.org/en/stable/index.html
- PEP 557 - Data Classes: https://www.python.org/dev/peps/pep-0557/
- Python `__slots__` magic: https://stackoverflow.com/questions/472000/usage-of-slots
- Type hinting in Python: https://docs.python.org/3/library/typing.html
- Mypy - Optional Static Typing for Python: https://mypy.readthedocs.io/en/stable/index.html