Dan Chak’s Blog

More on logical models differing from physical ones

Posted in rails by Dan Chak on November 17, 2008

On the topic of logical models differing from physical models, I found the following short anecdote in Paul Oldfield’s Domain Modeling whitepaper:

Anecdote: One project for which I worked, and against my advice, chose to design an OO system based on a data model. They made no attempt to get the responsibilities in the right place, despite an explicit requirement from the customer that the system be flexible in response to changing requirements. After major redesigns in response to changing requirements, the plug was pulled on the project before it delivered any code at all to the customer. It had become clear that the design process being employed could not keep up with the rate of change of requirements, and never would.

I’m in the process of designing a new application. I can’t say what it is yet, but I can say that there is a difference between the physical and logical models. Roughly, the problem corresponds to the image below.

[Image: translation]

First, lots of external data is collected from some source. The data goes into the unshaded white tables.

Then, through some process, we translate data from those tables into another set of tables. We do this in the database for two reasons: so that we have before and after copies for sanity checking, and so that the rules and constraints in the database can act as checks on the transformation itself.
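To make the "constraints as checks" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the table names (raw_readings, sensors, readings), the columns, and the sample data are hypothetical and are not the schema of the application described here. The raw table accepts anything, the translated tables carry the constraints, and any raw row that cannot be translated without violating a constraint is reported instead of being silently copied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical input table: no constraints, takes whatever arrives.
    CREATE TABLE raw_readings (
        sensor_name TEXT,
        reading     REAL
    );
    -- Hypothetical translated tables: normalized and constrained.
    CREATE TABLE sensors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE readings (
        id        INTEGER PRIMARY KEY,
        sensor_id INTEGER NOT NULL REFERENCES sensors(id),
        value     REAL NOT NULL CHECK (value >= 0)
    );
""")

# Step 1: collect external data as-is. The third row is deliberately bad.
with conn:
    conn.executemany(
        "INSERT INTO raw_readings (sensor_name, reading) VALUES (?, ?)",
        [("a", 1.5), ("a", 2.0), ("b", -3.0), ("b", 4.0)],
    )


def translate(conn):
    """Copy raw rows into the constrained tables, one transaction per row.

    Returns the rowids of raw rows that the database constraints rejected.
    """
    rejected = []
    rows = conn.execute(
        "SELECT rowid, sensor_name, reading FROM raw_readings ORDER BY rowid"
    ).fetchall()
    for rowid, name, reading in rows:
        try:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(
                    "INSERT OR IGNORE INTO sensors (name) VALUES (?)", (name,)
                )
                sensor_id = conn.execute(
                    "SELECT id FROM sensors WHERE name = ?", (name,)
                ).fetchone()[0]
                conn.execute(
                    "INSERT INTO readings (sensor_id, value) VALUES (?, ?)",
                    (sensor_id, reading),
                )
        except sqlite3.IntegrityError:
            rejected.append(rowid)
    return rejected


# Step 2: translate, letting the constraints police the transformation.
rejected = translate(conn)
raw_count = conn.execute("SELECT COUNT(*) FROM raw_readings").fetchone()[0]
translated_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(f"raw={raw_count} translated={translated_count} rejected={rejected}")
```

Because the raw table is left untouched, the before and after copies can be compared at any time, and the rejected list points at exactly the rows that need attention.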

Only the shaded tables in the physical model end up being relevant for display to users on the front-end. However, those tables are highly normalized and may have a variety of peculiarities about them (from a domain modeler’s perspective); those peculiarities come from the necessary imposition of the database constraints and heavy normalization we needed to guarantee the data’s integrity.

So the third step is to translate these physical tables into classes that more logically represent our problem. The logical model hides all of the decisions of the data layer from the programmer who is creating the web front-end. The input tables disappear, and the other tables are recombined in ways that make sense for the application. Some normalization can be lost because it’s not meaningful here.

Certainly, I could create a front-end that functions perfectly well without this layer. But it will be much easier to create and modify the front-end in the future with this layer in place. Although it seems like “work” to create the new abstraction layer, it vastly simplifies the problem of creating the front-end once the translation layer exists. The act of “translation” happens in one place, not in every controller or view that needs to work directly with data from the database.
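As a rough illustration of what such a layer can look like, here is a small, self-contained Python sketch. The row shapes, the SensorSummary class, and the load_sensor_summaries and render functions are all hypothetical names invented for this example; in a real application the rows would come from the shaded physical tables rather than from in-memory lists. The point is only that the recombination of normalized rows happens in one function, and the front-end code sees nothing but the logical class.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical physical rows, shaped like two normalized tables.
SENSOR_ROWS = [(1, "a"), (2, "b")]                       # (id, name)
READING_ROWS = [(1, 1, 1.5), (2, 1, 2.5), (3, 2, 4.0)]   # (id, sensor_id, value)


@dataclass(frozen=True)
class SensorSummary:
    """Logical model: what the front-end actually wants to show."""
    name: str
    latest: float
    average: float


def load_sensor_summaries(sensor_rows, reading_rows):
    """The single place where physical rows become logical objects."""
    values_by_sensor = {}
    for _reading_id, sensor_id, value in sorted(reading_rows):
        values_by_sensor.setdefault(sensor_id, []).append(value)

    summaries = []
    for sensor_id, name in sensor_rows:
        values = values_by_sensor.get(sensor_id)
        if values:  # sensors without readings are not shown
            summaries.append(
                SensorSummary(name=name, latest=values[-1], average=mean(values))
            )
    return summaries


def render(summary):
    """Stand-in for a view: it knows nothing about tables or foreign keys."""
    return f"{summary.name}: latest {summary.latest}, average {summary.average}"


summaries = load_sensor_summaries(SENSOR_ROWS, READING_ROWS)
lines = [render(s) for s in summaries]
print("\n".join(lines))
```

If the physical tables are later split or renamed, only load_sensor_summaries has to change; render and anything else that consumes SensorSummary stays as it is.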

In fact, as I’ve been arguing already (though not on this blog), you are doing this step all the time already, because no matter how you think of it, you do have to somehow translate data from the data layer into the format in which it is ultimately displayed on the front-end. The difference is that this model recognizes this process and brings all of the code that accomplishes it together in one place. That makes adjusting to changing requirements, or keeping database changes hidden from consumers, a much simpler task.

The rub is that one needs to learn how to design the “domain inspired logical model.” Right now that seems to be the biggest gap. ORMs like ActiveRecord trick you into thinking that the data model and the object-oriented logical model are the same, and that’s something we need to overcome.


2 Responses


  1. Todd Jonker said, on November 30, 2008 at 3:08 am

    Hey Dan, congrats on the new book.

    Interestingly, this logical/physical distinction is one of the “issues” I observed with Rails… way back many years ago when you were trying to sell me on it. I’ve since read much more material on Rails, and none of it has addressed this issue. It’s a serious impediment to evangelizing to those of us that care about scale, abstraction, and evolutionary design.

    Oh well, these days we’ve learned that we don’t need databases anyway. ;-)

  2. Robert Young said, on February 21, 2009 at 8:41 pm

    Waiting for the book, so I’m reading through your blogs. Given your (proper) regard for normalization, a couple of thoughts.

    1) for those who complain that performance must suffer, I suggest, assuming you will have your hands on the whole thing, using solid state disks for your primary table data. Since you have a full normal form data structure, you also have a (metaphorically speaking) minimal cover of the logical data requirements. No other model will be as parsimonious. Certainly not any xml datastore. An SSD database machine with an xNF database will also be faster than any non-normalized datastore, even on the same SSD database machine.

    2) for those who complain that the (any) Relational Model must be less than the (any) Object Data Model, the response has to be that such complaints are ill informed. Each object of a class is distinguished by its instance data only, which is why OODBMS’s have failed. Not only is there no need for multiple copies of method text, there is good reason not to have such multiple copies, they have to be kept synchronized, etc. A waste of time and storage.

    When instantiated into any sensible operating system (or VM), the OS has only one copy of the method texts of a class. Each object has its data written separately, all referencing the single set of method texts. The most parsimonious data structure is the RM; always.

