Caution: This documentation is for eZ Publish legacy, from version 3.x to 5.x.

Clustering

The clustering feature makes it possible to run an eZ publish site on several web servers. A site that runs on a cluster of servers can handle more traffic and will typically perform better under load. Please note that this feature was significantly improved in 3.8.

Versions prior to eZ publish 3.8 could be run in a clustered environment, but these configurations were subject to occasional race conditions when files were updated or removed. Since all cache files and images were stored locally on separate filesystems (one for each web server), the files had to be synchronized using "rsync" or similar tools.

From 3.8, it is possible to store all content-related caches, images and binary files in the database. Database transactions are used to ensure that all the cluster nodes use the same cache files and have access to the same image and binary files. In other words, when content is updated, the changes automatically become available to all the web servers in the cluster. This solves the caching and synchronization issues that affected earlier versions running in a clustered environment. In addition, it makes it easier to do backups and to migrate the solution to other platforms.

From 3.9, an additional HTTP header called "Served-by" is supported. This feature has been added for the purpose of testing and debugging. It is typically useful when you need to check from the client side which server handled the request. The following example shows a part of a server response that contains this header:

...
Last-Modified: Fri, 29 Jun 2007 09:35:54 GMT
Served-by: 62.70.12.230
Content-Language: en-GB
...
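When debugging a cluster from the client side, it can help to extract this header programmatically and count which node answered each request. The following is a minimal sketch in Python, not part of eZ publish; the helper names "served_by" and "tally" are hypothetical, and the sample text reuses the response fragment shown above.

```python
from collections import Counter
from email.parser import Parser
from typing import Iterable, Optional


def served_by(raw_headers: str) -> Optional[str]:
    """Return the value of the Served-by header, or None if it is absent."""
    # The email parser understands "Name: value" header lines and
    # looks header names up case-insensitively.
    return Parser().parsestr(raw_headers, headersonly=True).get("Served-by")


def tally(responses: Iterable[str]) -> Counter:
    """Count how many of the given responses each cluster node served."""
    return Counter(served_by(r) or "unknown" for r in responses)


# Header fragment taken from the example response above.
sample = (
    "Last-Modified: Fri, 29 Jun 2007 09:35:54 GMT\n"
    "Served-by: 62.70.12.230\n"
    "Content-Language: en-GB\n"
)

print(served_by(sample))  # prints 62.70.12.230
```

Running "tally" over the headers of many requests gives a rough picture of how requests are spread across the servers.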

Note that when clustering is used, it is recommended to run the site in a Virtual Host environment on the different servers.

How it works

Data that must be synchronized between the different servers is stored using the database:

  • Binary files
  • Image and image alias files
  • Caches related to content:
    • Content view cache
    • Template block cache
    • Expiry cache
    • URL alias cache
    • RSS cache
    • User info cache
    • Class identifier cache
    • Sort key cache

Other files are stored using the filesystem, including (but not limited to):

  • INI files
  • Template files
  • Compiled templates
  • PHP files
  • Log files
  • Caches that are not related to content:
    • Global INI cache
    • INI cache
    • Codepage cache
    • Character transformation cache
    • Template cache
    • Template override cache

Content view cache

When eZ publish is displaying a page (a content node), it will execute the "view" view of the "content" module and include the output in the pagelayout. If the output is cached, the cache file(s) will be read and served. If not, the system will fetch the content stored in the eZ publish object database, render the necessary templates, generate a web page and store the resulting XHTML on the filesystem before serving it. As previously mentioned, these files can now (from 3.8) be stored in the database and thus the files (along with changes) are easily and immediately available to all servers in the cluster.
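The read-or-render logic described above can be illustrated with a small Python model. This is a conceptual sketch only, not eZ publish code: the "Node" class, the cache key format and the "publish" helper are hypothetical, and a plain dictionary stands in for the database-backed cache that all servers share.

```python
from typing import Callable, Dict


class Node:
    """One web server in the cluster (hypothetical model)."""

    def __init__(self, name: str, shared_cache: Dict[str, str],
                 render: Callable[[str], str]) -> None:
        self.name = name
        self.cache = shared_cache  # stands in for the database-backed store
        self.render = render       # stands in for fetching content and rendering templates
        self.renders = 0           # how many times this node had to render

    def view(self, node_id: str) -> str:
        key = f"content/view/{node_id}"
        if key in self.cache:      # cached: read and serve the stored output
            return self.cache[key]
        self.renders += 1          # not cached: render, store, then serve
        html = self.render(node_id)
        self.cache[key] = html
        return html


def publish(shared_cache: Dict[str, str], node_id: str) -> None:
    """Drop the cached view when content changes (assumed expiry model)."""
    shared_cache.pop(f"content/view/{node_id}", None)


shared: Dict[str, str] = {}
web1 = Node("web1", shared, lambda n: f"<p>node {n}</p>")
web2 = Node("web2", shared, lambda n: f"<p>node {n}</p>")

web1.view("42")  # web1 renders and stores the page
web2.view("42")  # web2 is served from the shared cache without rendering
```

Because both nodes read and write the same store, a page rendered by one server is immediately reusable by the other, which is the property the database storage provides.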

Images and image aliases

The approach described above is also used for images and image aliases (image variations). However, the solution is a bit more complicated because, prior to 3.8, images were served directly by Apache. Since the web server cannot fetch files from the database by itself, the images need to be served using a PHP script called "index_image.php". This applies to all content images, but not to images that belong to the design. Please note that you'll need to add new rewrite rules in order to instruct Apache to use "index_image.php" when serving images.
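The decision that the rewrite rules have to encode can be sketched as a small Python function. This is an illustration only and not the actual Apache configuration: the URL layout (content images under "var/.../storage/images", design images under "design/") and the function name "image_handler_for" are assumptions made for this example.

```python
import re

# Assumed URL layout, for illustration only: content images live under
# /var/[<site>/]storage/images or /var/[<site>/]storage/images-versioned.
CONTENT_IMAGE = re.compile(r"^/var/([^/]+/)?storage/images(-versioned)?/")


def image_handler_for(path: str) -> str:
    """Return which component would serve the given image request path."""
    if CONTENT_IMAGE.match(path):
        # Content image: stored in the database, so PHP has to serve it.
        return "index_image.php"
    # Design image: still a plain file on disk, served directly by the web server.
    return "apache"


print(image_handler_for("/var/storage/images/news/photo.jpg"))
print(image_handler_for("/design/standard/images/logo.gif"))
```

The real rewrite rules for your installation should be taken from the eZ publish documentation for your version rather than derived from this sketch.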

Cluster file handlers

A new cluster file handler mechanism was added in 3.8. It makes it possible to store, retrieve, rename, delete, etc. files using the database. The cluster file handlers are located in the "kernel/classes/clusterfilehandlers" directory of the eZ publish installation. The following cluster file handlers are known to the system by default:

  • ezfs (eZFSFileHandler)
  • ezdb (eZDBFileHandler)

eZFSFileHandler

This handler makes it possible to use the filesystem when dealing with files.

eZDBFileHandler

This handler makes it possible to use the database when dealing with files (in a cluster environment, this would typically be images, uploaded binary files and content-related caches). It is split into different back-ends that are compatible with the supported database engines (MySQL, PostgreSQL, Oracle, etc.).
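The idea of two interchangeable handlers behind one set of file operations can be sketched in Python as follows. This is a conceptual model under stated assumptions, not the eZ publish implementation: the class names, method names and table layout are hypothetical, and SQLite stands in for the supported database engines.

```python
import os
import sqlite3
from abc import ABC, abstractmethod


class FileHandler(ABC):
    """Operations every handler offers (hypothetical interface)."""

    @abstractmethod
    def store(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def fetch(self, path: str) -> bytes: ...

    @abstractmethod
    def rename(self, old: str, new: str) -> None: ...

    @abstractmethod
    def delete(self, path: str) -> None: ...


class FSHandler(FileHandler):
    """Keeps files on the local filesystem below a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root

    def _full(self, path: str) -> str:
        return os.path.join(self.root, path)

    def store(self, path: str, data: bytes) -> None:
        full = self._full(path)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as fh:
            fh.write(data)

    def fetch(self, path: str) -> bytes:
        with open(self._full(path), "rb") as fh:
            return fh.read()

    def rename(self, old: str, new: str) -> None:
        target = self._full(new)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        os.replace(self._full(old), target)

    def delete(self, path: str) -> None:
        os.remove(self._full(path))


class DBHandler(FileHandler):
    """Keeps files as rows in a database table shared by all nodes."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        with self.conn:
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS files "
                "(path TEXT PRIMARY KEY, data BLOB NOT NULL)"
            )

    def store(self, path: str, data: bytes) -> None:
        with self.conn:  # one transaction: committed on success, rolled back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO files (path, data) VALUES (?, ?)",
                (path, data),
            )

    def fetch(self, path: str) -> bytes:
        row = self.conn.execute(
            "SELECT data FROM files WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            raise FileNotFoundError(path)
        return bytes(row[0])

    def rename(self, old: str, new: str) -> None:
        with self.conn:
            self.conn.execute("UPDATE files SET path = ? WHERE path = ?", (new, old))

    def delete(self, path: str) -> None:
        with self.conn:
            self.conn.execute("DELETE FROM files WHERE path = ?", (path,))
```

Code that only calls "store", "fetch", "rename" and "delete" does not need to know which handler is configured, which is what makes it possible to switch between filesystem and database storage through a setting.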

Custom handlers

It is possible to extend the system by implementing your own handlers and/or back-ends. This should be done using the extension system (and not by modifying the original eZ publish kernel files).

The "ExtensionDirectories[]" array located under the "[ClusteringSettings]" block of the "file.ini" configuration file specifies the extension directories where eZ publish should search for additional cluster file handlers. By default, eZ publish will search in the "clusterfilehandlers" subdirectory inside your extension.

Example

If you have an extension "myExtension" that includes a cluster file handler "cfh", you should add the following lines under the "[ClusteringSettings]" block in your "file.ini.append.php" file:

FileHandler=cfh
ExtensionDirectories[]=myExtension

These settings will instruct eZ publish to use your custom cluster file handler located in "extension/myExtension/clusterfilehandlers/cfhfilehandler.php".
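The relationship between the two settings and the resulting file path can be written out as a small Python helper. This is a sketch that generalizes from the single example above; the function name "handler_path" is hypothetical, and the "<handler>filehandler.php" naming pattern is inferred from that example rather than taken from the eZ publish source.

```python
import posixpath


def handler_path(handler: str, extension: str) -> str:
    """Build the path where a custom cluster file handler is expected.

    handler:   value of the FileHandler setting, for example "cfh"
    extension: one entry of ExtensionDirectories[], for example "myExtension"
    """
    # Naming pattern inferred from the example in the text (an assumption).
    filename = f"{handler}filehandler.php"
    return posixpath.join("extension", extension, "clusterfilehandlers", filename)


print(handler_path("cfh", "myExtension"))
# prints extension/myExtension/clusterfilehandlers/cfhfilehandler.php
```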

Svitlana Shatokhina (10/08/2006 9:27 am)

Julia Shymova (19/07/2007 8:17 am)

Svitlana Shatokhina, Balazs Halasy, Julia Shymova