eZ component: Cache, Design, 1.4 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ $Author$ $Revision$ $Date$ :Status: Draft .. contents:: ===== Scope ===== The scope of this document is to design the features to be implemented for the Cache component version 1.4. This version will incorporate the following features and fixes, which will be described in detail in this document: - #12587: Hierarchic caching for the Cache component ================================================== #12587: Hierarchic caching for the Cache component ================================================== The idea behind this feature is to provide hierarchical multi level caching for the Cache component. Currently the Cache component only supports 1 cache handler to be asked to restore a certain object. Either the handler returns cached data for the desired object (hit) or it returns false to indicate that it does not have valid data (miss). There is no possibility to instruct the Cache component to search other caches in case of a miss. For this reason, hierarchic caches will be introduced. Design ====== This section is meant to design the enhancements defined by the requirements section above. .. Warning:: The design described below adds another level of abstraction to the Cache component, The use of this abstraction can significantly reduce the performance, if done in an unintended way. ezcCacheStack ------------- An object of the ezcCacheStack class is the main instance to provide the hierarchical stack mechanism. The stack object takes care of managing several cache storages, the unified access for storing and restoring cache items and the associated objects needed to realize this. The class ezcCacheStack extends the class ezcCacheStorage and will therefore be able to be used with ezcCacheManager. However, this introduces the need to store all necessary settings for this class in an options object, to enable the internal delayed initialization mechanism of ezcCacheManager. The class will only implement the methods required by ezcCacheStorage to be implemented and will not support additional features of some cache storages, like explicit searching for items. :: class ezcCacheStack extends ezcCacheStorage { public function __construct( $location, $options ); public function store( $id, $data, $attributes = array() ); public function restore( $id, $attributes, $search ); public function delete( $id, $attributes, $search ); public function countDataItems( $id, $attributes ); public function getRemainingLifetime( $id, $attributes ); public function pushStorage( ezcCacheStackStorageConfiguration $storageConf ); public function popStorage(); public function getStorages(); public function countStorages(); public function reset(); } The $location parameter of the ctor will be ignored by the class, because a unique location is not required by the stack. The $search parameter for the restore() and delete() methods is forwarded to the child storages. Aside of this the parameter instructs the stack to search through all its contained cache storages, instead of stopping as soon as a fitting item was found in one of them. The requirements defined for cache stacking force cache storages to implement several additional functionalities. Therefore, a new interface ezcCacheStackableStorage will be introduced, to define the necessary methods. This also ensures, that caches are not stacked recursively, since ezcCacheStack won't implement this interface itself. Internally, an object of this class is composed from several other objects. The stack is created by using the pushStorage() method to add a new storage to the top of the stack. A storage can be removed from the top using the popStorage() method, which also returns the removed storage configuration. The getStackedCaches() method returns all stacked cache storage objects as an array. This method can be used by user applications to directly influence the cache storages. The cache stack will implement the "propergate on store" strategy. This means that a cache item will be stored on all levels of the stack at once, as it is initially stored. This consumes more space when an item is stored, but can make the storage process itself faster, in case items need to be replaced often. Replaced items don't need to be bubbled down to deeper levels if a cache storage runs full. The store() method will work as follows: :: $metaStorage->lock(); $metaData = $metaStorage->restoreMetaData(); foreach ( $storageStack as $storage ) { $replacementStrategy->store( $storageConfiguration, $metaData, $itemId, $itemData, $itemAttributes ); } $metaStorage->storeMetaData( $data ); $metaStorage->unlock(); The restore() method will work as follows: :: $metaStorage->lock(); $metaData = $metaStorage->restoreMetaData(); $res = false; foreach ( $storageStack as $level => $storage ) { $res = $replacementStrategy->restore( $storageConfiguration, $metaData, $itemId, $itemAttributes, $search ); if ( $res !== false ) { if ( $options->bubbleUpOnRestore === true ) { // Store retored item in higher level storages foreach ( $storageStack as $storeLevel => $storage ) { if ( $storeLevel === $level ) { break; } $replacementStrategy->store( $storageConfiguration, $metaData, $itemId, $res, $itemAttributes ); } } break; } } $metaStorage->storeMetaData( $metaData ); $metaStorage->unlock(); return $res; .. Warning:: If the $bubbleUpOnRestore option is set to true, the item will be stored on higher levels only with those attributes, that have been used for the restoring, since there is no possibility to request the stored attributes for an item from a storage. The delete() method is as simple, but somewhat inefficient, since it needs to delete the desired item from every storage on the stack. :: $metaStorage->lock(); $metaData = $metaStorage->restoreMetaData(); foreach ( $storageStack as $storage ) { $replacementStrategy->delete( $storageConfiguration, $metaData, $itemId, $itemAttributes ); } The call to delete() has to happen on each of the stacked storages, although a physical deletion might only happen on some or even one or none of them. Since none of the caches might have all items fulfilling the criteria given to the countDataItems() method in place, this one will summarize the occurances in all storages. The getRemainingLifetime() method will return the maximum lifetime found over all storages, since different storages in the cache might have different lifetimes. The latter issue is generally undesired, but might be configured by a user. For both methods, an iteration through all storages is required. The convenience method countStorages() returns the number of storages currently on the stack. The reset() method will reset the complete stack. It will remove the meta data from the meta data storage and will clean up the complete content of all storages on the stack. ezcCacheStackableStorage ------------------------ The interface ezcCacheStackableStorage is used to ensure, that storage classes that can be stacked implement the necessary functionality. The following methods are needed: :: interface ezcCacheStackableStorage { array(string) purge( $limit = null ); array(string) delete(...); void reset(); } The delete() method, while being defined already in the ezcCacheStorage base class, needs to be redefined here. There is no addition necessary in the parameter list, but the delete() method needs to return an array of item IDs that have been deleted, to ensure that those are properly synchronized with the meta data. The purge method is needed to make the storage purge all outdated items. In case a cache storage runs full (determined by the replacement strategy), first all outdated items will be purged, before items are deleted using the original replacement strategy. The purge() method needs to return the IDs of the purged items, to allow the replacement strategy to keep its meta data in sync. The reset() method will be called by the ezcCacheStack instance, if ezcCacheStack->reset() is called. The effect of this method must be, that all content of the storage is cleaned out. ezcCacheStackMetaDataStorage ---------------------------- This interface defines the methods to be implemented by a meta data storage, that is used to store the data utilized by an ezcCacheStackReplacementStrategy. The following methods need to be implemented: :: interface ezcCacheStackMetaDataStorage { ezcCacheStackMetaData restoreMetaData(); void storeMetaData( ezcCacheStackMetaData $metaData ); void lock(); void unlock(); } The store-/restoreMetaData() methods are used by the stack to obtain and save the meta data. The stack submits this data for analysis and manipulation to the instance of ezcCacheStackReplacementStrategy and takes care for storing and restoring it. To keep the meta data consistent between requests, the meta data storage needs to support locking via the lock() and unlock() methods. This lock must affect only the lock() method itself, since the stack won't operate on the meta data without locking the storage. .. Warning:: There is not way around locking to keep the meta data consistent between requests. This means, that access to the complete stack will be exclusive and requests might need to wait for other requests to free the cache lock again. ezcCacheStackOptions -------------------- An object of this class is used to configures the cache stack. It extends the ezcCacheStorageOptions class, to be compatible with all other mechanisms. The 'ttl' and 'extension' options are ignored, because each of the stacked caches must be able to implement its own set of options. The following options are part of this class: 'configurator' This option can define a configurator class, which implements the ezcCacheStackConfigurator interface. This class will then be used to perform the initial configuration of the stacked storages. Changes to this value after the stack has been instanciated do not have any effect. If the value of this option is null (default) no initialization will be performed. 'metaStorage' This option can either be set to null or can contain an implementation of ezcCacheMetaDataStorage, which will be used to store the meta information for the stack, that is defined by the ezcCacheStackReplacementStrategy. If null is given here, the top most storage of the stack will be used. 'replacementStrategy' The object defined in this option will be used as the replacement strategy for storages in the stack and must be an instance of ezcCacheStackReplacementStrategy. The default is an instance of ezcCacheLruReplacementStrategy. 'bubbleUpOnRestore' This option controlls, wether items restored from a lower cache level should be bubbled up to all higher ones. The default here is false, to save the overhead of writing to the storages while restoring. ezcCacheStackConfigurator ------------------------- The ezcCacheStackConfigurator class is used as the delayed initialization mechanism for ezcCacheStack objects. Delayed initilization in the Cache component is realized through the dedicated ezcCacheManager class, which creates cache instances on the fly, as soon as they are needed. Since this mechanism does only allow cache storages to be configured via options, an instance of ezcCacheStackOptions can contain an ezcCacheStackConfigurator class in its $configurator property. This class will be used after construction of the ezcCacheStack to initially configure it. The interface requires the following methods to be implemented: :: interface ezcCacheStackConfigurator { static void configure( ezcCacheStack $stack ); } After being instatiated, the $stack is submitted to this method, which can then perform any arbitrary operation on it. Most commonly a set ezcCacheStackStorageConfiguration instances will be created and added to the $stack via the pushStorage() method. ezcCacheStackStorageConfiguration --------------------------------- Instances of this class are used in the ezcCacheStack to configure the behavior of the storage inside the stack. Beside the storage itself additional properties are needed. The following properties are needed. 'id' Unique ID of the storage in the stack. This should be unique over all stacks and storages, if possible. If used properly, the $location parameter of the storage can be used here. 'storage' The instance of ezcCacheStackableStorage that is configured with this object. 'itemLimit' The maximum number of items to be stored in this cache. Since the size of cache items cannot be determined inside PHP, the only way to limit the items stored in a cache storage is to use the number of items, stored. 'freeRate' This option is an float between 0 and 1 indicating a fraction value. In case the cache storage runs full, this rate of item slots (measured from the $itemLimit) will be freed. This mechanism ensures that running full of a cache does not occur too often. Storage configurations are used with the ezcCacheStack::pushStorage() method. This methods receives a unique ID in addition, to identify a configured stack. This ID is needed by the replacement strategies in the meta data. ezcCacheStackReplacementStrategy -------------------------------- This interface defines methods that are used by ezcCacheStack to realize the actual storing of cache items. Inside theses methods, the replacement of cache items inside the actual storage needs to be handled, as well as the update of ezcCacheStackMetaData. Since all state information for a replacement strategy is stored in the ezcCacheStack instance, there is no need to create an instance of it. Therefore, all methods are declared static and retrieve the necessary state information in their signature. The $storageConfiguration parameter is the ezcCacheStackStorageConfiguration of the storage to utilize. Beside the storage itself, it also contains the configuration and especially the ID of the storage. The interface will define the following methods: :: interface ezcCacheStackReplacementStrategy { public static void store( $storageConfiguration, $metaData, $itemId, $itemData, $itemAttributes = array() ); public static void restore( $storageConfiguration, $metaData, $id, $attributes, $search ); public static void delete( $storageConfiguration, $metaData, $id, $attributes, $search ); } The store() method performs the most complex operation in this case. To illustrate its working, the following pseudo code is used: :: $item = $storage->restore( $id ); if ( $item === false && $meta['itemsStored'] >= $limit ) { // The item was not found and the limit is exceeded. First purge the // outdate items, then eventually free more items using the strategy // implemented. free( $storage, $metaData, $freeRate ); // Update the meta information according to the freeing update( $metaData ); } // There is enough space to store the item now $storage->store(...); update( $metaData ); The restore() method does not work as complex as the store() method does. Its algorithm is defined by the following pseudo code: :: $item = $storage->restore(...); if ( $item !== false ) { // Notice access to this item in meta information update( $metaData ); } return $item; The delete() method needs to take care, that deleted items are also removed from the meta data. A replacement strategy should always store its name in the meta data container, to ensure that a switch between strategies is possible. In case a replacement strategy receives an invalid meta data structure, it must throw an exception. To avoid this, the complete stack must be resetted by a call to ezcCacheStack->reset(). ezcCacheLruReplacementStrategy ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This replacement strategy works after the Least-Recently-Used algorithm (LRU). It discards items, which have not been requested (reading or writing) for the longest time span. The meta data of this replacement strategy consists of an array, which is indexed by the IDs of the stored data items assigned to the last access time of the item. When a new item is added to the cache, its ID is added with the current time stamp. Each time the data item is read or updated, the time stamp is actualized. In case the item is removed, the ID is unset in this array. For the case that the cache runs full, the array of timestamps first needs to be sorted. Then the first X elements (indicated by $freeRate) will be removed from the array and the affected items removed from the cache storage (array_splice()). ezcCacheLfuReplacementStrategy ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The Least-Frequently-Used algorithm (LFU), in contrast to LRU, stores the number of accesses of a cache item. In case the cache needs to be freed, those items are discarded, that have been used least frequently. The meta data structure is pretty similar to the one of LRU strategy: The cache item IDs are used as the key, being assigned to the number of accesses to this item. In case the cache runs full, this list is sorted and the first X elements (determined by $freeRate) are removed. The cache items assigned to the determined IDs are deleted from the cache storage. ezcCacheStackMetaData --------------------- This struct is used to represent meta information, that is stored in an ezcCacheStorage. In addition to the array structure it provides, it contains an $identifier property that is to be set by the replacement strategy. If a replacement strategy receives an incorrect meta information object, it must thrown an exception. :: class ezcCacheStackMetaData extends ezcBaseStruct { public $identifier; public $data = array(); } A storage is responsible for arbitrary storage of the meta information struct itself. The APC storage might store it directly, the file system plain storage might serialize its contents and the file system array storage might convert it into a complete array. .. Local Variables: mode: rst fill-column: 79 End: vim: et syn=rst tw=79