||1. Refining ORM further
John Barkley Schatz
|The limitations mentioned in this article can be overcome and additional benefits can be obtained by: 1) adopting the Data Mapper pattern over the Active Record pattern, and 2) interacting with DBMS table views instead of the real tables in order to avoid having to generate complicated SQL in the PHP code.|
Data Mapper is discussed by Martin Fowler:
Leaving complicated data queries to the DBMS (in the table view code) recalls a criticism of ORM, that it tries to solve the wrong side of the natural impedance mismatch between objects and database relations:
"The information principle underpinning relational databases implies that object orientation itself is inadequate for the full needs of data manipulation, and it is that 'paradigm' as a whole that should be addressed. If this were the case, ORM would be left redundant. In this view, the 'impedance mismatch' and the supposed need for object-relational mapping arises from the mistaken equation of object and relation (table or view in SQL speak). The correct mapping in the relational model is between object and type. Also, ORM systems tend to perform worse than writing SQL directly for more complex tasks. However, most ORM systems allow writing raw SQL to some degree."
Using simple SQL in the PHP code avoids the ORM performance limitations when using either Active Record or Data Mapper; instead, we can leverage the DBMS's strength of executing complicated SQL stored in views.
However, while Active Record is a wrapper around the database table, Data Mapper allows for moving data between objects and their data store without either knowing about each other. The data store can be anything: a database, an LDAP server, an XML document, whatever.
So if Active Record is convenient, Data Mapper is nirvana. As explained by Fowler: "With Data Mapper the in-memory objects needn't know even that there's a database present; they need no SQL interface code, and certainly no knowledge of the database schema. (The database schema is always ignorant of the objects that use it.) Since it's a form of Mapper (473), Data Mapper itself is even unknown to the domain layer."
Although I have not tried it, one ORM tool based on the Data Mapper pattern is xyster:
I would just add a final thought about using simple SQL in the PHP code. It is my understanding that most DBMS only allow Select queries on table views. I do not have experience with many DBMSs, however, some may be interested to know that PostgreSQL can also handle Insert/Update/Delete on views by means of an ingenious rules system.
||2. Re: Refining ORM further
|2008-10-27 04:31:57 - In reply to message 1 from John Barkley Schatz|
|I don't think that resorting to table views is a solution to the problem as they are never as efficient as simple table selects, inserts, updates and deletes.|
Anyway, as mentioned in the article the greatest problem is that most people that attempt to use ORMs depart from the table schema, instead of departing from from the object model. That should be called ROM, not ORM.
If you depart from the object model, you do not have data type ambiguity problems, you can easily generate database schema definitions or upgrades when the object model changes, and of course you do not need views, which are something not quite portable between databases.
The PHPClasses site uses Metastorage which is an ORM code generation tool that departs from the object model. The generated code is as efficient as hand written code. That code does not rely on an ORM run-time library.
The only dependency is on Metabase which is a database abstraction layer that provides database independence to the generated code, but even that could be replaced by a code that accesses databases directly with the native PHP database extensions.
The result is that development happens much faster and the application code is much more reliable as no database access code is written by humans, so there are no human mistakes adding bugs to the ORM code.
||3. Re: Refining ORM further
John Barkley Schatz
|2008-10-29 20:23:03 - In reply to message 2 from Manuel Lemos|
|"Making liberal use of views is a key aspect of good SQL database design. Views allow you to encapsulate the details of the structure of your tables, which might change as your application evolves, behind consistent interfaces." -- Chapter 3.2 Advanced Features: Views, PostgreSQL 8.3.3 Documentation.|
Inefficiency is not a known issue as far as I can tell: if SQL from application code targets a RDBMS view, this does not cause the RDBMS to run SQL on the database tables twice. In addition, views can complement the efficient code generation objective of Metastorage, as the following discussion of equivalence suggests:
"Equivalence is a characteristic of the view. A view is equivalent to its source query. When queries are run against views, the query is modified. For example, if there exists a view named Accounts_view and the content is:
(money_received - money_sent) AS balance,
FROM table_customers c
JOIN accounts_table a
ON a.customerid = c.customer_id
"The application [e.g., Metastorage?] would simply run a simple query such as:
"The RDBMS then takes the simple query, replaces the equivalent view, then sends the following to the optimiser:
FROM (SELECT name,
(money_received - money_sent) AS balance,
FROM table_customers c JOIN accounts_table a
ON a.customerid = c.customer_id )
"From this point on the optimizer takes the query, removes unnecessary complexity (i.e. it is not necessary to read the address, since the parent invocation does not make use of it) and then sends the query to the SQL engine for processing." -- http://en.wikipedia.org/wiki/View_(database)
Perhaps I misunderstand what was meant by views being never as efficient? Nevertheless, I did not suggest substituting views for simple table selects but rather to simplify complicated queries.
Also, views are portable between database management systems to the extent that views are specified in the SQL standard. No RDBMS adheres perfectly to the SQL standard, of course, so this is not a compelling reason to avoid using views.
My post was intended as a positive and complementary contribution to the topic under discussion. Many programmers (they freely admit this) have a shallow appreciation of the RDBMS and the relational data model. This is evidenced by official statistics of the USA's Department of Labor, for example, which indicate that 40% of all database application development projects fail, primarily due to deficiencies in the design of the database. By the way, this is the worst failure rate of any industry! I suspect there are positive benefits to be gained from harnessing what is a common and powerful RDBMS capability in projects such as Metabase or, indeed, in any PHP coding project that accesses a database.
One of the most important reasons to use views at all times, even when a view mirrors a real table, is that views enhance the security of the database by ensuring that uncontrolled, direct access to the tables is impossible from outside the RDBMS.
Views can be used in almost any place a real table can be used. Building views upon other views is not uncommon.
Concluding with a short summary extracted from the Wikipedia article cited above.
"Views can provide advantages over tables:
- Views can subset the data contained in a table
- Views can join and simplify multiple tables into a single virtual table
- Views can act as aggregated tables, where aggregated data (sum, average etc.) are calculated and presented as part of the data
- Views can hide the complexity of data, for example a view could appear as Sales2000 or Sales2001, transparently partitioning the actual underlying table
- Views take very little space to store; only the definition is stored, not a copy of all the data they present
- Depending on the SQL engine used, views can provide extra security.
- Views can limit the exposure to which a table or tables are exposed to the outer world
"Just like functions (in programming) provide abstraction, views can be used to create abstraction. Also, just like functions, views can be nested, thus one view can aggregate data from other views. Without the use of views it would be much harder to normalise databases above second normal form. Views can make it easier to create lossless join decomposition."
This just sounds to me like a resource that Metastorage (and PHP programmers in general) can leverage to its advantage.