eZ component: PersistentObject, Design, 1.4 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ :Author: Frederik Holljen Tobias Schlitt :Revision: $Rev: 3576 $ :Date: $Date: 2006-09-25 13:44:15 +0400 (Mon, 25 Sep 2006) $ :Status: Final .. contents:: Scope ===== The scope of this document is to describe the proposed enhancements for the PersistentObject component version 1.4. The general goal for this version is to implement various features described by these issues in our issue tracker: - #10151: Improved Database and PersistentObject datatype support (especially binary data). - #10373: Several relations to the same table for PersistentObject. - #10913: Complex data types (e.g. DateTime) for PersistentObject. - #12287: ezcPersistentSession refactoring. Each of the issues and their proposed solutions are explained in a seperate chapter in this document. Improved datatype support (especially binary data) [#10151] =========================================================== Background ---------- PDO uses several different ways of binding parameters to queries. Currently these are: - PDO::PARAM_BOOL - PDO::PARAM_NULL - PDO::PARAM_INT - PDO::PARAM_LOB - PDO::PARAM_STR (default) - PDO::PARAM_INPUT_OUTPUT Currently unsported by PDO. Used for in/out parameters of a stored procedure. Currently only the default paramter PARAM_STR is supported by persistent object. This does not mean that you can not store integers and booleans as the database drivers will figure this out themselves. However, for some field types especially you are required to use parameters. An example of this is binary data which can contain '\0'. If you use the default parameter PDO::PARAM_STR the remaining data will simply be ignored since the end of the string has already been encountered. The support for PDO PARAM_ types was recently added to the Database component as a fix for issue `#010943`_. .. _`#010943`: http://issues.ez.no/IssueView.php?Id=10943 Design ------ The proposed solution is to add a columnType property for all definitions where a column is defined. The columnType can be one of the PDO PARAM_ type parameters and directly defines the parameter type provided to PDO when building the queries. This currently affects: - ezcPersistentObjectIdProperty (definition) - ezcPersistentObjectProperty (definition) - ezcPersistentSession (usage) Open questions -------------- > It would be nice to make large binary data available as a resource both for > writing and reading. PDO::PARAM_LOB can handle this if used together with > PDO::FETCH_BOUND. Should we, and how do we make this available to the user? > Write support will most probably work out of the box. If the value is a > resource the whole file will be read automatically by PDO as long as PARAM_LOB > is used. This is not possible with the current design for PersistentObject, since we only bind values and not parameters. We could only support this in the load() method, where the SELECT query is build completly internally and cannot be modified by the user. Everywhere else we would need to keep track of the bound variables and can not rely on correctly bound parameters when using find(). In addition, this functionality seems to be broken in MySQL and SQLite (see: http://bugs.php.net/bug.php?id=40913). As discussed on the mailinglist: No support for FETCH_BOUND, because of missing architecture infrastructure. > Should we use the PDO::PARAM_BOOL/INT by default? That is do we actually > want to require to use this for anything else than PARAM_LOB since ints/bools > etc. actually work fine already. What does using these parameters gain us. It does not work for those parameter types in all cases (referring to DR) so we should support all of them. Since ezcDbQuery::bindValue() uses PDO::PARAM_STRING as default, we can silently rely on this as the default for PersistentObject definitions, too. Several relations to the same table for PersistentObject [#10373] ================================================================= Background ---------- Since version 1.2 PersistentObject supports relations. The definition of a relation will typically look something like: :: $def->relations["Address"] = new ezcPersistentManyToManyRelation( "persons", "addresses", "persons_addresses" ); $def->relations["Address"]->columnMap = array( new ezcPersistentDoubleTableMap( "id", "person_id", "address_id", "id" ) ); The method to fetch a relation with this definition will typically look something like: :: // signature getRelatedObjects( object $relationObject, string $className ) $addresses = $session->getRelatedObjects( $person, "Address" ); Basically we can see that both the definition and the fetching of related objects is bound to the related class type. This means that a class can only have one relation type to another class. Usually this is sufficient. However, sometimes it is necessary to have more than one relation to the same class e.g. a Person class with the relations "mother" and "father" to the very same Person class. PersistentSession design ------------------------ The proposed solution is to add an optional name for relations. This way it is possible to use the name to specify the exact relation you want to fetch if there are more than one. All methods used to fetch and store relations will have the optional parameter appended to them. The affected methods and their new signatures are: :: ezcPersistentSession::getRelatedObject( object $object, string $className, string $relationName = null ) ezcPersistentSession::getRelatedObjects( object $object, string $className, string $relationName = null ) ezcPersistentSession::addRelatedObject( object $object, object $relationObject, string $relationName = null ); ezcPersistentSession::removeRelatedObject( object $object, object $relationObject, string $relationName = null ); The new parameter makes it possible to reach one new exceptional state: - ezcPersistentUndeterministicRelationException is thrown when several relations to one class is defined but no relation name is used. Definition Design ----------------- Similar to the changes to PersistentSession we need to change the persistent object definition to store the relation name for classes that have more than one relation to another class. To do this we want to introduce a wrapper struct that can hold several named relations: :: class ezcPersistentRelationCollection { public $namedRelations = array(); } An implementation of the example with persons would then look something like: :: $personRelations = new ezcPersistentRelationCollection; $def->relations['Person'] = $personRelations; $personRelations['Mother'] = new ezcPersistentOneToOneRelation( "persons", "persons" ); $personRelations['Mother']->columnMap = array( new ezcPersistentSingleTableMap( "id", "mother_id" ) ); $personRelations['Father'] = new ezcPersistentOneToOneRelation( "persons", "persons" ); $personRelations['Father']->columnMap = array( new ezcPersistentSingleTableMap( "id", "father_id" ) ); The changes do not break BC and require changes in PersistentSession to work. An extra if will have to be introduced to check for the existence of named relations. This will not harm performance in a significant way. Complex data types (e.g. DateTime) for PersistentObject [#10913] ================================================================ Background ---------- Currently, PersistentObject simply loads column values from a database and assigns those to PHP object properties. When storing a persistent object, the property values are simply stored into their corresponding columns. A typical example for such a conversion is the DateTime object of PHP 5.2. To store date and time values database independantly an integer field is usually used, containing unix timestamp information. In the business logic, this is somewhat unhandy and an instance of DateTime is desired for comfortable handling. The conversion between both date/time representations must currently be performed manually. This feature should allow the conversion to be handled by PersistentObject transparently. Design ------ To support highest flexibility for conversions, a very basic interface is introduced, which must be implemented by a class to use its instances as "conversion objects". An instance of such a class may then be assigned to an instance of ezcPersistentObjectProperty to enable the conversion for it. This way, a single conversion object might also be used for multiple properties, to reduce the number of objects to be instanciated. The interface ezcPersistentObjectPropertyConversion defines the following methods: fromDatabase( $databaseValue ) This method will be called by ezcPersistentSession right after a value has been read from the database column and before this value gets assigned to the PHP objects property. The $databaseValue parameter will contain the value read from the database. The return value of this method will be assigned to the object property, instead of the original database value. toDatabase( $propertyValue ) Corresponding to fromDatabase(), this method will be invoked right before object values are stored into the database (on insert and update). The given parameter $propertyValue will contain the object property contents and the value returned by this method will be stored into the database. For the DateTime example, fromDatabase() will receive the integer unix timestamp value and will return a PHP DateTime object. toDatabase() will receive an instance of DateTime and return an integer unix timestamp value. As a reference implementation, the described DateTime conversion will be used. If more (generally sensible) conversions are found during the implementation or are requested by users, it might happen that additional conversion classes will be shipped with this version of PersistentObject. ezcPersistentSession refactoring [#12287] ========================================= Background ---------- The ezcPersistentSession class is the very heart of the PersistentObject component. It contains the complete application logic of the component, structured in many public (25) and a few protected and private (5) methods consisting of more than 1200 lines of code and documentation. The mass of code starts getting unmaintainable by now and some code-synergies could be realized between different methods. Therefore a refactoring of the class and its surrounding classes will be described in this section to solve the named issues. The public API of ezcPersistentSession will not be affected by this, but only delegation might happen internally, without being visible to the user of the API. Design ------ To reduce the pure number of method implementations in this class, the following new classes are suggested to group the functionality: - ezcPersistentLoadHandler - public function load( $class, $id ) - public function loadIfExists( $class, $id ) - public function loadIntoObject( $pObject, $id ) - public function refresh( $pObject ) - public function find( ezcQuerySelect $query, $class ) - public function findIterator( ezcQuerySelect $query, $class ) - public function getRelatedObjects( $object, $relatedClass ) - public function getRelatedObject( $object, $relatedClass ) - public function createFindQuery( $class ) - public function createRelationFindQuery( $object, $relatedClass ) - ezcPersistentSaveHandler - public function save( $pObject ) - public function update( $pObject ) - public function saveOrUpdate( $pObject ) - public function addRelatedObject( $object, $relatedObject ) - public function createUpdateQuery( $class ) - public function updateFromQuery( ezcQueryUpdate $query ) - private function saveInternal( $pObject, $doPersistenceCheck = true, - private function updateInternal( $pObject, $doPersistenceCheck = true ) - ezcPersistentDeleteHandler - public function delete( $pObject ) - public function removeRelatedObject( $object, $relatedObject ) - public function createDeleteQuery( $class ) - public function deleteFromQuery( ezcQueryDelete $query ) - private function cascadeDelete( $object, $relatedClass, ezcPersistentRelation $relation ) These classes will just act as method containers. An instance of each class will be held in a virtual-property of the ezcPersistentSession instance. All will receive the ezcPersistentSession instance as a ctor parameter, to be able to interact with it and the remaining classes. The methods stubs of ezcPersistentSession will remain as they are, dispatching to instances of the newly created classes, which will then perform the actual work. Some common methods will be left in ezcPersistentSession, defined as public but marked as private (lack of friendship relations for PHP classes). In addition, the code of the outsourced methods will be consolidated into new, common methods (like executing a query and checking the result, optional checks of getState() methods, ...) of ezcPersistentSession, where possible. Notes and ideas --------------- This refactoring approach will obviously have an impact on performance, since for each public method call on ezcPersistentSession there will be an additional method invocation added. In addition, the consolidiation of common code into helper methods (held on ezcPersistentSession) will impact performance. On the other hand, the extraction of methods from ezcPersistentSession into handler-classes will raise the maintainability of the whole component and the consolidation of common code will reduce the complexity of the implemented methods, which also raises maintainability, code-reuse and error-tollerance. In addition, it makes the implementation of new features easier. An idea would be to instanciate the newly created handler classes only when needed (via virtual-property checks), which would reduce the ammount of code to load on instantiating ezcPersistentSession and the memory usage occupied by objects. .. Local Variables: mode: rst fill-column: 79 End: vim: et syn=rst tw=79