A domain model is an effective tool for software development. It can be used to express really complex business logic, and to verify and validate the stakeholders' understanding of the domain. Building rich domain models in Rails is hard, primarily because Active Record doesn't play well with the domain model approach.
One way to deal with this problem is to use an ORM implementing the data mapper pattern. Unfortunately, there is no production-ready ORM doing that for Ruby. DataMapper 2 is going to be the first one.
Another way is to use Active Record just as a persistence mechanism and build a rich domain model on top of it. That's what I'm going to talk about here.
First, let's take a look at some problems caused by using a class that extends Active Record to express a domain concept:
- The class is aware of Active Record. Therefore, you need to load Active Record to run your tests.
- An instance of the class is responsible for saving and updating itself. It makes mocking and stubbing harder.
- Every instance exposes such low-level methods as 'update_attributes!'. They give you too much power to change the internal state of objects. Power corrupts. That's why we see 'update_attributes' used in so many places.
- "Has many" associations allow bypassing an aggregate root. Too much power, and as we all know, it corrupts.
- Every instance is responsible for validating itself. It's hard to test. On top of that, it makes validations much harder to compose.
Following Rich Hickey's motto of splitting things apart, the best solution I see is to split every Active Record class into three different classes:
- Entity
- Data Object
- Repository
The core idea here is that every entity, when instantiated, is given a data object. The entity delegates access to its fields to the data object. The data object doesn't have to be an Active Record object. You can always provide a stub or an OpenStruct instead. Since the entity is a plain old Ruby object, it doesn't know how to save, validate, or update itself. It also doesn't know how to fetch itself from the database.
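To make the delegation idea concrete, here is a minimal plain-Ruby sketch. It is not the Model module shown later: SketchOrder and discount! are hypothetical names, and the standard library's Forwardable stands in for ActiveSupport's delegate so that the example runs without Rails.

```ruby
require 'forwardable'
require 'ostruct'

# A plain Ruby entity. It only assumes that its data object responds to
# amount and amount=. (Hypothetical class, for illustration only.)
class SketchOrder
  extend Forwardable
  def_delegators :@data, :amount, :amount=

  def initialize(data)
    @data = data
  end

  # Domain behaviour lives on the entity, not on the data object.
  def discount!(percent)
    self.amount = amount - amount * percent / 100
  end
end

# An OpenStruct plays the role of the data object here.
order = SketchOrder.new(OpenStruct.new(amount: 200))
order.discount!(10)
puts order.amount # prints 180
```

Because the entity relies only on amount and amount=, any object responding to them (an Active Record instance, a Struct, an OpenStruct) can back it.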
A repository is responsible for fetching data objects from the database and constructing entities. It is also responsible for creating and updating entities. To cope with its responsibilities, the repository has to know how to map data objects to entities. A registry of all data objects and their corresponding entities is created to do exactly that.
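The code in this article calls such a registry but never shows its implementation. One way it could look, purely as a sketch: SketchRegistry is a hypothetical name, and only the three calls used in this article (associate, data_class_for, model_class_for) are modelled.

```ruby
# Hypothetical two-way lookup between entity classes and data classes.
module SketchRegistry
  @data_classes = {}   # entity class => data class
  @model_classes = {}  # data class   => entity class

  class << self
    # Records the pairing in both directions.
    def associate(model_class, data_class)
      @data_classes[model_class] = data_class
      @model_classes[data_class] = model_class
    end

    # Raises KeyError if the entity class was never registered.
    def data_class_for(model_class)
      @data_classes.fetch(model_class)
    end

    # Raises KeyError if the data class was never registered.
    def model_class_for(data_class)
      @model_classes.fetch(data_class)
    end
  end
end

# Stand-ins for an entity and its data class (a Struct instead of Active Record).
class SketchItem; end
SketchItemData = Struct.new(:name, :amount)

SketchRegistry.associate(SketchItem, SketchItemData)
puts SketchRegistry.data_class_for(SketchItem)      # prints SketchItemData
puts SketchRegistry.model_class_for(SketchItemData) # prints SketchItem
```

Keeping both directions in one place lets an entity build its data object and lets a repository wrap fetched data objects into the right entity class.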
Let's take a look at a practical application of this approach. Order and Item are two entities that form an aggregate. This is the schema we can use to store them in the database:
create_table "orders", :force => true do |t|
  t.decimal "amount", :null => false
  t.date "deliver_at"
  t.datetime "created_at", :null => false
  t.datetime "updated_at", :null => false
end

create_table "items", :force => true do |t|
  t.string "name", :null => false
  t.decimal "amount", :null => false
  t.integer "order_id", :null => false
  t.datetime "created_at", :null => false
  t.datetime "updated_at", :null => false
end
As you can see we didn't have to adapt the schema for this approach.
All entities are plain old Ruby objects that include the Model module:
class Order
  include Model

  # delegates id, id=, amount, amount=, deliver_at, deliver_at= to the data object
  fields :id, :amount, :deliver_at

  # ...
end

class Item
  include Model

  fields :id, :amount, :name
end
where the Model module is defined as:
module Model
  def self.included(base)
    base.extend ClassMethods
  end

  attr_accessor :_data

  def initialize _data = _new_instance
    if _data.kind_of?(Hash)
      @_data = _new_instance _data
    else
      @_data = _data
    end
  end

  protected

  #...

  def _new_instance hash = {}
    # using the registry to get the corresponding data class
    Registry.data_class_for(self.class).new hash
  end

  module ClassMethods
    def fields *field_names
      field_names.each do |field_name|
        # delegate is provided by ActiveSupport
        self.delegate field_name, to: :_data
        self.delegate "#{field_name}=", to: :_data
      end
    end
  end
end
As the Order and Item classes form an aggregate, we can get a reference to an item only through its order. Therefore, we need to implement only one repository:
module OrderRepository
  extend Repository

  # All ActiveRecord classes are defined in the repository.
  class OrderData < ActiveRecord::Base
    self.table_name = "orders"
    attr_accessible :amount, :deliver_at
    validates :amount, numericality: true
    has_many :items, class_name: 'OrderRepository::ItemData', foreign_key: 'order_id'
  end

  class ItemData < ActiveRecord::Base
    self.table_name = "items"
    attr_accessible :amount, :name
    validates :amount, numericality: true
    validates :name, presence: true
  end

  # mappings between models and data objects are defined here.
  # "root: true" means that the OrderData class will be used
  # when working with this repository.
  set_model_class Order, for: OrderData, root: true
  set_model_class Item, for: ItemData

  def self.find_by_amount amount
    where(amount: amount)
  end
end
Where the Repository module is defined as:
module Repository
  def persist model
    data(model).save!
  end

  def find id
    model_class.new(data_class.find id)
  end

  protected

  def where attrs
    # we search the database using the root data class and wrap
    # the results into the instances of the model class
    data_class.where(attrs).map do |data|
      model_class.new data
    end
  end

  def data model
    model._data
  end

  def set_model_class model_class, options
    raise "Data class is not provided" unless options[:for]
    Registry.associate(model_class, options[:for])
    if options[:root]
      singleton_class.send :define_method, :data_class do
        options[:for]
      end
      singleton_class.send :define_method, :model_class do
        model_class
      end
    end
  end
end
Now, let's see how we can use all these classes in an application.
test "using a data object directly (maybe used for reporting purposes)" do
  order = OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today
  order.items.create! amount: 6, name: 'Item 1'
  order.items.create! amount: 4, name: 'Item 2'
  assert_equal 2, order.reload.items.size
  assert_equal 6, order.items.first.amount
end

test "using a saved model" do
  order_data = OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today
  order = Order.new(order_data)
  order.amount = 15
  assert_equal 15, order.amount
end

test "creating a new model" do
  order = Order.new
  order.amount = 15
  assert_equal 15, order.amount
end

test "using hash to initialize a model" do
  order = Order.new amount: 15
  assert_equal 15, order.amount
end

test "using a repository to fetch models from the database" do
  OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today
  orders = OrderRepository.find_by_amount 10
  assert_equal 10, orders.first.amount
end

test "persisting models" do
  order = Order.new amount: 10
  OrderRepository.persist order
  assert order.id.present?
  assert_equal 10, order.amount
end

test "using data structure instead of a data object (can be used for testing)" do
  order = Order.new OpenStruct.new
  order.amount = 99
  assert_equal 99, order.amount
end
One important aspect of building rich domain models hasn't been covered yet. How are the associations between an aggregate root and its children managed? How do we access items?
The simplest approach would be to build an array of Item instances ourselves using the Active Record association.
class Order
  include Model

  fields :id, :amount, :deliver_at

  def items
    _data.items.map{|i| Item.new i}
  end

  def add_item attrs
    Item.new(_data.items.new(attrs))
  end
end
The problem here is that everyone is forced to use the _data variable, which is really undesirable. We can provide a controlled accessor to the data object by adding the collection and wrap methods to the Model module.
module Model
  # returns a Rails has_many association
  def collection name
    _data.send(name)
  end

  # wraps a collection of data objects into instances of the model class
  def wrap collection
    return [] if collection.empty?
    model_class = Registry.model_class_for(collection.first.class)
    collection.map{|c| model_class.new c}
  end
end
Order using collection and wrap:
class Order
  include Model

  def items
    wrap(collection :items)
  end

  def add_item attrs
    # wrap expects a collection, so the new data object is passed in an array
    wrap([collection(:items).new(attrs)]).first
  end
end
Though the changes may not seem significant at first, they are crucial. There is no need to access the _data variable anymore. On top of that, we don't have to create instances of Item ourselves.
But the collection and wrap methods are just the bare minimum. One can easily imagine the syntactic sugar we can add on top of them.
module Model
  module ClassMethods
    def collections *collection_names
      collection_names.each do |collection_name|
        define_method collection_name do
          wrap(collection collection_name)
        end
      end
    end
  end
end

class Order
  include Model

  fields :id, :amount, :deliver_at
  collections :items

  def add_item attrs
    # wrap expects a collection, so the new data object is passed in an array
    wrap([collection(:items).new(attrs)]).first
  end
end
Now, let's see how we can use it in our application:
test "using a saved aggregate with children" do
  order_data = OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today
  order_data.items.create! amount: 6, name: 'Item 1'
  order = Order.new order_data
  assert_equal 6, order.items.first.amount
end

test "persisting an aggregate with children" do
  order = Order.new amount: 10
  order.add_item name: 'item1', amount: 5
  OrderRepository.persist order
  from_db = OrderRepository.find(order.id)
  assert_equal 5, from_db.items.first.amount
end
Since data objects are hidden, and aren't supposed to be accessed directly by the client code, we need to change the way we run validations. There are lots of available options, one of which is the following:
module DataValidator
  def self.validate model
    data = model._data
    data.valid?
    data.errors.full_messages
  end
end
That's how you'd use it in the code:
test "using data validation for a saved model" do
  order_data = OrderRepository::OrderData.create! amount: 10, deliver_at: Date.today
  order = Order.new(order_data)
  assert_equal [], DataValidator.validate(order)
end

test "using data validation for a new model" do
  order = Order.new amount: 10
  assert_equal [], DataValidator.validate(order)
end
You don't have to return an array of strings. It can be a hash or even a special object. The important idea here is to separate entities from their validations. Once again, by splitting things apart we end up with a better design. Why? For one thing, we can compose validations at run time based on, for instance, user settings. For another, we can validate a group of objects together, so there is no need to copy errors from one object to another.
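Both benefits can be sketched in plain Ruby. Everything below is an assumption made for illustration and is not part of the code above: the SketchValidators module, its two rules, and the require_delivery_date setting are hypothetical, and OpenStruct objects stand in for entities.

```ruby
require 'ostruct'

# Hypothetical validators: each one is a callable that takes an object and
# returns an array of error messages (empty when the object is valid).
module SketchValidators
  POSITIVE_AMOUNT = lambda do |order|
    order.amount.to_i > 0 ? [] : ['Amount must be positive']
  end

  DELIVERY_DATE_PRESENT = lambda do |order|
    order.deliver_at ? [] : ['Delivery date is missing']
  end

  # Picks the validators at run time, here from a hypothetical user setting.
  def self.for_settings(settings)
    validators = [POSITIVE_AMOUNT]
    validators << DELIVERY_DATE_PRESENT if settings[:require_delivery_date]
    validators
  end

  # Validates a group of objects together and returns one flat list of errors.
  def self.validate_all(objects, validators)
    objects.flat_map do |object|
      validators.flat_map { |validator| validator.call(object) }
    end
  end
end

orders = [
  OpenStruct.new(amount: 10, deliver_at: nil),
  OpenStruct.new(amount: 0, deliver_at: nil)
]
validators = SketchValidators.for_settings(require_delivery_date: true)
p SketchValidators.validate_all(orders, validators)
```

Since the errors for the whole group come back as a single list, nothing has to be copied from one object's errors to another's.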
Separating persistence from domain model has a tremendous impact on the architecture of our applications. The following is the traditional Rails app architecture.
That’s what we get if we separate persistence.
You don't have to be one of the Three Amigos to see the flaws of the traditional Rails app architecture: the domain classes depend on the database and Rails. The architecture illustrated by the second diagram doesn’t have these flaws, which allows us to keep the domain logic abstract and framework-agnostic.
- The persistence logic has been extracted into OrderRepository. Having a separate object is beneficial in many ways. For instance, it simplifies testing, as it can be mocked or faked.
- Instances of Order and Item are no longer responsible for saving or updating themselves. The only way to do it is to use domain-specific methods.
- Low-level methods (such as update_attributes!) are no longer exposed.
- There is no ItemRepository and no has_many associations. The result is an enforced aggregate boundary.
- Having validations separated enables better composability and simplifies testing.
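As an illustration of the first point, a repository can be replaced in tests by an in-memory fake. This is only a sketch under assumptions: InMemoryOrderRepository and SketchOrderStub are hypothetical names, and the fake mirrors just the persist and find methods of the Repository module.

```ruby
# Hypothetical in-memory stand-in exposing the same persist/find interface
# as the Repository module, so domain code can be tested without a database.
class InMemoryOrderRepository
  def initialize
    @orders = {}
    @last_id = 0
  end

  # Assigns an id on first save, mimicking what the database would do.
  def persist(order)
    order.id ||= (@last_id += 1)
    @orders[order.id] = order
  end

  # Raises KeyError for unknown ids (Active Record would raise RecordNotFound).
  def find(id)
    @orders.fetch(id)
  end
end

# A Struct plays the role of the Order entity in this sketch.
SketchOrderStub = Struct.new(:id, :amount)

repository = InMemoryOrderRepository.new
order = SketchOrderStub.new(nil, 10)
repository.persist(order)
puts repository.find(order.id).amount # prints 10
```

Code that receives its repository as a parameter can be exercised against such a fake and later pointed at OrderRepository unchanged.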
The suggested approach is fairly simple, but provides some real value when it comes to expressing complex domains.
Fields and associations can be defined in a declarative fashion, but at the same time, we still have access to the Active Record object when it’s required.
The approach plays really well with legacy applications. Nothing has to be rewritten or redesigned from scratch. Just start using your existing Active Record models as data classes when building new functionality.