Caution: This documentation is for eZ Publish legacy, from version 3.x to 5.x.

Search engine

The system comes with a built-in search engine that integrates tightly with the content structure. It can index everything that is entered through the native content model.

In eZ Publish, a content class describes the actual data structures (for example news articles, products, etc.). Classes are built up of attributes, which are represented by datatypes. An attribute can be the title of an article, the price of a product and so on. It is possible to control which attributes should be indexed by the search engine. This is done with the "Searchable" checkboxes while editing a class. Some datatypes (for example float, price, etc.) do not support indexing. Please refer to the datatype overview page to see which datatypes can be indexed.

When an object is published, the attributes that are marked searchable will be indexed by the search engine. It will then be possible to use the search interface to find words or phrases that are a part of the published content. For example, if the user searches for "backpack", the system will return a list of all kinds of objects where the word "backpack" occurs. This is the default behavior. The following screenshot shows the standard search interface.

Standard search interface
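
The relationship between searchable attributes and search results can be illustrated with a small word index. This is only a conceptual sketch in Python, not the eZ Publish implementation: the `objects` dictionary, the `(value, searchable)` tuples and the `build_index` function are hypothetical names invented for the illustration.

```python
from collections import defaultdict

# Hypothetical content objects: each attribute is a (value, searchable) pair.
objects = {
    1: {"title": ("Blue backpack", True), "price": ("49.90", False)},
    2: {"title": ("Hiking boots", True), "body": ("Fits any backpack strap", True)},
    3: {"title": ("Tent", True), "body": ("Sleeps two", True)},
}

def build_index(objects):
    """Map each lowercased word to the ids of objects whose searchable attributes contain it."""
    index = defaultdict(set)
    for object_id, attributes in objects.items():
        for value, searchable in attributes.values():
            if not searchable:
                continue  # attributes not marked searchable are never indexed
            for word in value.lower().split():
                index[word].add(object_id)
    return index

index = build_index(objects)
print(sorted(index["backpack"]))  # ids of every object containing the word "backpack"
```

In this sketch the price attribute is not marked searchable, so its value never reaches the index, while a search for "backpack" finds the word in any searchable attribute of any object.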

Advanced search

The advanced search interface makes it possible to tweak and narrow the search. The following features are supported:

  • Search for several words at the same time (for example "car bike train").
  • Search for an exact phrase (for example "cheap cars in Scandinavia").
  • Class level filtering (limit the search to a specific class).
  • Attribute level filtering (search only a specific attribute).
  • Tree level filtering (limit the search to a part of the node tree).
  • Section filtering (limit the search to objects that belong to a certain section).
  • Time filtering (yesterday, last week/month/3-months/year).

The following screenshot shows the advanced search interface.

Advanced search interface


Wildcard searching

By default, the search engine matches only complete words or phrases. If the user searches for "demo", the system will not return objects that contain words like "demolition", "demonstration" and so on. eZ Publish does support wildcard searching, but it must be turned on by adding the following lines to a configuration override for "site.ini":

[SearchSettings]
EnableWildcard=true

When the wildcard search feature is turned on, the asterisk character can be used as a wildcard, for example "demo*". In this case, eZ Publish will return a list of objects that contain words starting with "demo", such as "demonstration" and "demolition". The result also includes objects that contain the word specified before the asterisk on its own. In other words, objects containing only the word "demo" will also be returned.

Please note that the asterisk can only be used after a word. This means that the following search queries are invalid: "*demo" and "some*thing".
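
The matching rules described above can be summarised in a few lines of Python. This is a hedged model of the documented behavior only, not code taken from eZ Publish: the function names `is_valid_query` and `matches` and the `wildcard_enabled` flag are assumptions made for the illustration, and the handling of a trailing asterisk while wildcards are disabled is a guess.

```python
def is_valid_query(term):
    """Accept a plain word, or a word followed by a single trailing asterisk."""
    stem = term[:-1] if term.endswith("*") else term
    return bool(stem) and "*" not in stem

def matches(term, word, wildcard_enabled=False):
    """Return True if an indexed word satisfies the search term."""
    if not is_valid_query(term):
        raise ValueError(f"invalid query: {term!r}")
    term, word = term.lower(), word.lower()
    if wildcard_enabled and term.endswith("*"):
        # Prefix match; the bare stem (for example "demo") also matches.
        return word.startswith(term[:-1])
    # Standard search: complete words only.
    # Assumption: with wildcards disabled, "demo*" is compared literally.
    return word == term

print(matches("demo*", "demolition", wildcard_enabled=True))
print(is_valid_query("*demo"), is_valid_query("some*thing"))
```

With `wildcard_enabled=True`, "demo*" matches "demo", "demolition" and "demonstration", while "*demo" and "some*thing" are rejected as invalid queries.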

Warning! There is a good reason for the wildcard search being turned off by default: it requires a lot more processing time than the standard search. Enabling it increases the overall system load, so the server might have to be upgraded to keep response times acceptable.

Logical operators

Inline logical operators like "AND" and "OR" are not supported. This means that it is not possible to specify search queries like "cars AND minivans" or "trucks OR vans". However, it is in fact possible to do an AND search. This can be done by making use of the "Search for all of the following words" input field in the advanced search interface. For example, if the user inputs "cars bikes" then the system will return a list of objects that contain both of these words. The order of the words is insignificant.
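
The effect of the "Search for all of the following words" field can be sketched as a set comparison. This is an illustrative Python model under stated assumptions, not the engine's actual query logic: `search_all_words` and the sample `documents` are hypothetical.

```python
def search_all_words(query, documents):
    """Return ids of documents that contain every word in the query, in any order."""
    wanted = set(query.lower().split())
    return [
        doc_id
        for doc_id, text in documents.items()
        if wanted <= set(text.lower().split())  # every wanted word must be present
    ]

# Hypothetical indexed texts keyed by object id.
documents = {
    1: "bikes and cars for sale",
    2: "used cars",
    3: "cars bikes trains",
}

print(search_all_words("cars bikes", documents))
```

Because the query is treated as a set of words, "cars bikes" and "bikes cars" select the same objects, which mirrors the statement that word order is insignificant.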

Search statistics

The setup part of the administration interface provides access to a page that lists the words and phrases that have been searched for, along with the average number of results returned for each. The following screenshot shows the search statistics interface.

Search statistics


The "Reset statistics" button will simply clear the search log.

Balazs Halasy (16/11/2005 2:25 pm)

Balazs Halasy (16/11/2005 2:31 pm)


Comments

  • Searching inside PDF files

    The search engine doesn't search inside .pdf files. Is this normal behavior?
    Thank you.
  • Search statistics - public site

    Our search statistics only record the search words from the admin module. How do I change that so the statistics cover the public website?