More on logical models differing from physical ones
On the topic of logical models and physical models different, I found the following short anecdote in Paul Oldfield’s Domain Modeling whitepaper:
Anecdote: One project for which I worked, and against my advice, chose to design an OO system based on a data model. They made no attempt to get the responsibilities in the right place, despite an explicit requirement from the customer that the system be flexible in response to changing requirements. After major redesigns in response to changing requirements, the plug was pulled on the project before it delivered any code at all to the customer. It had become clear that the design process being employed could not keep up with the rate of change of requirements, and never would.
I’m in the process of designing a new application. I can’t say what it is yet, but I can say that there is a difference between the physical and logical models. Roughly, the problem corresponds to the image below.

First, lots of external data is collected from some source. The data goes into the unshaded white tables.
Then, through some process, we translate data from those tables into another set of tables. We do this in the database both so that we can have before and after copies for sanity checking purposes, but also so that the rules and constraints in the database can act as checks on the transformation itself.
Only the shaded tables in the physical model end up being relevant for display to users on the front-end. However, those tables are highly normalized and may have a variety of peculiarities about them (from a domain modeler’s perspective); those peculiarities come from the necessary imposition of the the database constraints and heavy normalization we needed to guarantee the data’s integrity.
So the third step is to translate these physical tables into classes that more logically represent our problem. The logical model hides all of decisions of the data layer for the programmer who is creating the web front-end. The input tables disappear, and the other tables are recombined in ways that make sense for the application. Some normalization can be lost because it’s not meaningful here.
Certainly, I could create a front-end that functions perfectly well without this layer. But it will be much easier to create and modify the front-end in the future with this layer in place. Although it seems like “work” to create the new abstraction layer, it vastly simplifies the problem of creating the front-end once the translation layer exists. The act of “translation” happens in one place, not in every controller or view that needs to work directly with data from the database.
In fact, as I’ve been arguing already (though not on this blog), you are doing this step all the time already, because no matter how you think of it, you do have to somehow translate data from the data layer into the format it ultimately gets displayed as on the front-end. The difference is that this model recognizes this process and brings all of the code that accomplishes it together in one place. That makes adjusting to changing requirements, or keeping database changes hidden from consumers a much simpler task.
The rub is that one needs to learn how to design the “domain inspired logical model.” Right now that seems to be the biggest gap. ORMs like ActiveRecord trick you into thinking that the data model and the object-oriented logical model are the same, and that’s something we need to overcome.
Physical vs Logical Modeling in Rails
I gave a talk on Tuesday at the Boston Ruby User’s Group meeting. The talk was quite exciting, with about 60 people split rather evenly between skeptics and converts to my message. Although they are fairly sparse for consumption without having first heard the talk, the slides are attached. As I understand more and more of the initial objections folks have to splitting up domain and data modeling this way, I’ll be working toward a more comprehensive argument and set of examples. Yes, ideally for the next edition of my book. Here’s the
PDF.
Speeding up ActiveRecord with Hashes, Take 2
A few weeks ago, I posted about the release of my gem, hash_extension, which makes Ruby hashes act a little more like plain old objects. That’s a good thing, because ActiveRecord requests that return hashes instead of full-blown ActiveRecord objects are about 50% faster. 50% is not a performance tweak — it’s basically the best thing that’s ever happened to ActiveRecord.
In version 0.0.1, the only way to get back hashes from ActiveRecord for use with hash_extension was to use select_all:
Foo.connection.select_all("select * from foo")
In the docs for hash_extension I put out a call for someone to extend ActiveRecord itself to return hashes from more natural, ActiveRecord-esque methods. Elliot Laster answered that call, and now, in version 0.0.2, find_as_hashes and find_by_sql_as_hashes are now available:
Foo.find_as_hashes(:all)
Foo.find_as_hashes(:first)
Foo.find_as_hashes(:all, :conditions => "bar = 'baz'")
Foo.find_by_sql_as_hashes("complex sql goes here")
To learn more about the gem and to download, go here.
ActiveRecord is slow. Hashes are fast.
Introducing the hash_extension gem…
…the first gem associated with my upcoming book, Scaling to Enterprise with Ruby on Rails.
Are you tired of hearing that Ruby is slow? Well, Ruby is slow, in many ways. The trick to a fast site is to not use the parts of Ruby and Rails that are slow in places where performance counts. For example, loading ActiveRecord objects happens to be extremely slow. Something simple like the following statement may take very little database time, but then will spin through the slow process of ActiveRecord object creation.
MyObject.find(:all)
On my dual core macbook pro, on a table with 40k records, this takes 7 seconds of Ruby time. Conversely, the following query, which returns not an array of ActiveRecord (MyObject) objects, but an array of hashes with all the same properties, takes just over 3 seconds:
MyObject.connection.select_all("select * from my_objects")
So if you don’t need the associations or methods that come with the full ActiveRecord version of your data, you can save a lot of Ruby cycles by using hashes instead — over 50% of the overhead. The problem is that the two statements above are not drop-in replacements for each other. Objects follow dot notation (f.attr) whereas hashes follow, well, hash notation (f['attr']). So to switch to the hash result, you would have to update all your code to follow hash notation instead of dot notation, and that’s a pain (not to mention ugly).
hash_extension to the rescue! This gem allows you to access hashes just as you would regular objects. With this gem, the following is possible:
>> hash = Hash.new >> hash.foo = 'bar' >> hash.foo => 'bar'
Now the two statements above are interchangeable. If you have slow pages in need of tuning, and you’re loading lots of object for display purposes only (e.g., you don’t actually need the weight associated with the full objects), this is an easy way to eek out some more performance.You can download the gem here and read more about how to set it up and use it here.

2 comments