Caution: This documentation is for eZ Publish legacy, from version 3.x to 5.x.

Clustering

The clustering feature makes it possible to run an eZ Publish site on several web servers. A site that is running on a cluster of servers will have better performance and will be able to handle more traffic.

It is possible to store all content-related caches, images and binary files in the database. Database transactions are used to ensure that all the cluster nodes use the same cache files and have access to the same image and binary files. In other words, when content is updated, the changes automatically become available to all the web servers in the cluster. This functionality was significantly improved in 3.10.

Note that when clustering is used, it is recommended to run the site in a Virtual Host environment on the different servers.

Changes introduced in 3.10

In eZ Publish versions prior to 3.10, clearing the caches led to physical removal of the cache files. This operation can be quite time-consuming.

From 3.10, the system marks the cache files as invalid instead of removing them physically from the database or filesystem. This is done either by marking each particular cache file as expired or by setting the global expiry (the latter typically happens when a large number of changes is needed, e.g. when clearing all the caches of a specific type). The global expiry is a timestamp that is used as an expiry value for all the caches in the system. If the global expiry is set to a certain date, all cache files that are older than this date will not be used. Note that the system will re-use old/expired cache file entries when re-creating the caches.
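The expiry rules described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not eZ Publish's actual code (which is PHP); the names "CacheEntry", "is_usable" and the record fields are hypothetical, and treating an entry whose timestamp equals the global expiry as still valid is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    """Hypothetical cache record; field names are illustrative only."""
    name: str
    mtime: int             # Unix timestamp of the last write
    expired: bool = False  # set when this particular entry is marked invalid


def is_usable(entry: CacheEntry, global_expiry: Optional[int]) -> bool:
    """Return True if the entry may still be served."""
    # An entry that was individually marked as expired is never served.
    if entry.expired:
        return False
    # Entries older than the global expiry timestamp are treated as stale.
    if global_expiry is not None and entry.mtime < global_expiry:
        return False
    return True


entries = [
    CacheEntry("old", mtime=1_000),
    CacheEntry("fresh", mtime=3_000),
    CacheEntry("marked", mtime=3_000, expired=True),
]
usable = [e.name for e in entries if is_usable(e, global_expiry=2_000)]
print(usable)  # ['fresh']
```

Nothing is deleted in this sketch: stale entries simply stop being served, which mirrors why invalidation is cheaper than physical removal.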

In order to physically remove the cache files from the database, the "bin/php/ezcache.php" script needs to be run with the "--purge" option. The following example shows how to remove the content caches that are more than two days old:

php bin/php/ezcache.php --clear-id=content --purge --expiry='-2 days'


For more information about the available parameters, run the script with the "--help" option:

php bin/php/ezcache.php --help

Note that 3.10 does not support clustering for PostgreSQL databases. The code is optimized for performance and targets MySQL databases using the InnoDB engine. The number of database connections allowed in MySQL must be increased by 30-50%, because the new cluster code opens an extra connection when writing content to the database (this connection checks whether the file has been modified since the write lock was acquired, so that an unnecessary write can be skipped). If persistent connections are enabled, the cluster code no longer shares connections with normal database calls, so the number of connections previously used will have to be doubled.
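As a rough worked example of the sizing guidance above, the following Python sketch computes the new connection limit from the current one. The function names are hypothetical, and the starting value of 100 connections is an assumption chosen only for illustration. In MySQL the limit is normally governed by the "max_connections" setting, although the text above does not name it.

```python
from typing import Tuple


def ceil_percent(value: int, percent: int) -> int:
    """Return value * percent / 100, rounded up, using integer arithmetic."""
    return (value * percent + 99) // 100


def estimate_connection_limit(current: int, persistent: bool = False) -> Tuple[int, int]:
    """Return a (low, high) estimate for the new connection limit."""
    if persistent:
        # With persistent connections the previous number has to be doubled.
        return current * 2, current * 2
    # Otherwise the guidance is an increase of 30-50%.
    return ceil_percent(current, 130), ceil_percent(current, 150)


print(estimate_connection_limit(100))                   # (130, 150)
print(estimate_connection_limit(100, persistent=True))  # (200, 200)
```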

Changes introduced in 3.9

From 3.9, an additional HTTP header called "Served-by" is supported. This feature has been added for the purpose of testing and debugging. It is typically useful when you need to check from the client side which server handled the request. The following example shows a part of a server response that contains this header:

...
Last-Modified: Fri, 29 Jun 2007 09:35:54 GMT
Served-by: 62.70.12.230
Content-Language: en-GB
...
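For a client-side check, the header can be read from the response like any other. The sketch below is an assumption-laden illustration in Python: it parses a raw header block (the sample values are copied from the response above) with the standard library, and the helper name "served_by" is hypothetical.

```python
from email.parser import Parser
from typing import Optional

# Sample header block, taken from the response excerpt above.
RAW_HEADERS = (
    "Last-Modified: Fri, 29 Jun 2007 09:35:54 GMT\n"
    "Served-by: 62.70.12.230\n"
    "Content-Language: en-GB\n"
    "\n"
)


def served_by(raw_headers: str) -> Optional[str]:
    """Return the value of the Served-by header, or None if it is absent."""
    headers = Parser().parsestr(raw_headers, headersonly=True)
    # Header lookup is case-insensitive.
    return headers.get("Served-by")


print(served_by(RAW_HEADERS))  # 62.70.12.230
```

Against a live site, the same value should be available from the headers of the response object returned by "urllib.request.urlopen". Requesting the same page several times and comparing the values shows which cluster nodes are answering.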

How it works

Data that must be synchronized between the different servers is stored using the database:

  • Binary files
  • Image and image alias files
  • Caches related to content:
    • Content view cache
    • Template block cache
    • Expiry cache
    • URL alias cache
    • RSS cache
    • User info cache
    • Class identifier cache
    • Sort key cache

Other files are stored using the filesystem, including (but not limited to):

  • INI files
  • Template files
  • Compiled templates
  • PHP files
  • Log files
  • Caches that are not related to content:
    • Global INI cache
    • INI cache
    • Codepage cache
    • Character transformation cache
    • Template cache
    • Template override cache

Content view cache

When eZ Publish displays a page (a content node), it executes the "view" view of the "content" module and includes the output in the pagelayout. If the output is cached, the cache file(s) will be read and served. If not, the system will fetch the content stored in the eZ Publish object database, render the necessary templates, generate a web page and store the resulting XHTML on the filesystem before serving it. As previously mentioned, from 3.8 these files can be stored in the database, so the files (along with any changes) are immediately available to all servers in the cluster.
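The lookup-or-render flow described above can be sketched as follows. This is a simplified Python illustration, not the eZ Publish implementation: a plain dictionary stands in for the shared cluster database, and "view_node" and "render" are hypothetical names.

```python
from typing import Callable, Dict, List


def view_node(node_id: int, cache: Dict[int, str], render: Callable[[int], str]) -> str:
    """Serve a node from the shared cache, rendering and storing it on a miss."""
    cached = cache.get(node_id)
    if cached is not None:
        return cached
    output = render(node_id)
    cache[node_id] = output  # now visible to every server sharing the cache
    return output


shared_cache: Dict[int, str] = {}  # stands in for the cluster database
render_calls: List[int] = []


def render(node_id: int) -> str:
    """Pretend to render templates; records each call for the demonstration."""
    render_calls.append(node_id)
    return f"<div>node {node_id}</div>"


first = view_node(42, shared_cache, render)   # miss: rendered by "server A"
second = view_node(42, shared_cache, render)  # hit: "server B" reuses the output
print(first == second, len(render_calls))     # True 1
```

Because both calls share one store, the second request does not render again, which is the benefit of keeping the view cache in the database rather than on each server's local filesystem.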

Images and image aliases

The approach described above is also used for images and image aliases (image variations). However, the solution is a bit more complicated because, before 3.8, images were served directly by Apache. Since the web server itself cannot fetch files from the database, the images need to be served using a PHP script called "index_image.php". This applies to all content images, but not to images that are part of the design. Please note that you'll need to add new rewrite rules in order to instruct Apache to use "index_image.php" when serving images.

Cluster file handlers

A new cluster file handler mechanism was added in 3.8. It makes it possible to store, retrieve, rename and delete files (among other operations) using the database. The cluster file handlers are located in the "kernel/classes/clusterfilehandlers" directory of the eZ Publish installation. The following cluster file handlers are known to the system by default:

  • ezfs (eZFSFileHandler)
  • ezdb (eZDBFileHandler)

eZFSFileHandler

This handler makes it possible to use the filesystem when dealing with files.

eZDBFileHandler

This handler makes it possible to use the database when dealing with files (in a cluster environment, this would typically be images, uploaded binary files and content-related caches, etc.). It is split into different back-ends that are compatible with the supported database engines (note: currently only MySQL is supported).
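The idea of interchangeable handlers behind one set of file operations can be sketched as below. This is a hypothetical Python illustration only: the class and method names are not eZ Publish's real API, and a dictionary stands in for the database back-end.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class FileHandler(ABC):
    """Hypothetical interface covering the operations named in the text."""

    @abstractmethod
    def store(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def fetch(self, path: str) -> bytes: ...

    @abstractmethod
    def rename(self, old: str, new: str) -> None: ...

    @abstractmethod
    def delete(self, path: str) -> None: ...


class FSHandler(FileHandler):
    """Keeps files on the local filesystem under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def store(self, path: str, data: bytes) -> None:
        target = self.root / path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)

    def fetch(self, path: str) -> bytes:
        return (self.root / path).read_bytes()

    def rename(self, old: str, new: str) -> None:
        target = self.root / new
        target.parent.mkdir(parents=True, exist_ok=True)
        (self.root / old).rename(target)

    def delete(self, path: str) -> None:
        (self.root / path).unlink()


class DBHandler(FileHandler):
    """Keeps files in a shared table; a dict stands in for the database."""

    def __init__(self, table: Dict[str, bytes]) -> None:
        self.table = table

    def store(self, path: str, data: bytes) -> None:
        self.table[path] = data

    def fetch(self, path: str) -> bytes:
        return self.table[path]

    def rename(self, old: str, new: str) -> None:
        self.table[new] = self.table.pop(old)

    def delete(self, path: str) -> None:
        del self.table[path]


def roundtrip(handler: FileHandler) -> bytes:
    """Exercise every operation once; the caller never sees the storage type."""
    handler.store("var/cache/a.txt", b"hello")
    handler.rename("var/cache/a.txt", "var/cache/b.txt")
    data = handler.fetch("var/cache/b.txt")
    handler.delete("var/cache/b.txt")
    return data


with tempfile.TemporaryDirectory() as tmp:
    results = [roundtrip(FSHandler(tmp)), roundtrip(DBHandler({}))]
print(results)  # [b'hello', b'hello']
```

The calling code is identical for both handlers, which is what allows a site to switch between filesystem and database storage through configuration alone.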

Custom handlers

It is possible to extend the system by implementing your own handlers and/or back-ends. This should be done using the extension system (and not by modifying the original eZ Publish kernel files).

The "ExtensionDirectories[]" array located under the "[ClusteringSettings]" block of the "file.ini" configuration file specifies the extension directories where eZ Publish should search for additional cluster file handlers. By default, eZ Publish will search in the "clusterfilehandlers" subdirectory inside your extension.

Example

If you have an extension "myExtension" that includes a cluster file handler "cfh", you should add the following lines under the "[ClusteringSettings]" block in your "file.ini.append.php" file:

FileHandler=cfh
ExtensionDirectories[]=myExtension

These settings will instruct eZ Publish to use your custom cluster file handler located in "extension/myExtension/clusterfilehandlers/cfhfilehandler.php".
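The way the two settings combine into a file path can be sketched as follows. This Python snippet is only an illustration of the naming pattern seen in the example above. The function name is hypothetical, and both the search order (kernel directory first, then extensions) and the assumption that kernel handlers follow the same "<name>filehandler.php" pattern are guesses that the text does not confirm.

```python
from typing import List


def candidate_handler_paths(file_handler: str, extension_dirs: List[str]) -> List[str]:
    """List the locations where a handler named by FileHandler might be found."""
    filename = f"{file_handler}filehandler.php"
    # Assumed: built-in handlers live in the kernel directory named in the text.
    paths = [f"kernel/classes/clusterfilehandlers/{filename}"]
    # Each ExtensionDirectories[] entry contributes one more location.
    paths += [f"extension/{ext}/clusterfilehandlers/{filename}" for ext in extension_dirs]
    return paths


for path in candidate_handler_paths("cfh", ["myExtension"]):
    print(path)
# kernel/classes/clusterfilehandlers/cfhfilehandler.php
# extension/myExtension/clusterfilehandlers/cfhfilehandler.php
```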

Svitlana Shatokhina (10/08/2006 9:27 am)

Gaetano Giunta (19/05/2009 2:53 pm)

Svitlana Shatokhina, Balazs Halasy, Gaetano Giunta

