Concept - Search Services

1 Introduction

To provide full-text search, guided search, and other advanced search features, Intershop 7 offers search services for product and content searches in consumer storefronts. The search services interface enables the integration of external search engines, which build and query search indexes and add features such as full-text search, guided search (after-search navigation), and stop word and synonym configuration to Intershop 7 based e-commerce applications.

2 References

Have a look at Cookbook - Search for related common questions.

3 Feature Overview

By integrating an external search engine, the search services provide additional search options for Intershop 7. Depending on the actual search engine, these options may include:

  • Simple Search Index Interface: The search services interface supports the creation and configuration of search indexes to include specified standard and custom product attributes for product searches and content parameters for content searches.
  • Synonyms and Antonyms: The search services interface provides the ability to configure and store synonym and antonym configurations.
  • Guided Search (After-Search Navigation): The search services interface provides a standard pipeline and template set to add a guided search function to an Intershop 7 storefront. When a user submits a query to the simple search, a set of filter attributes is presented along with the results. The filter attribute values can be used to refine the search and thus narrow down the result set. Behind each filter attribute value, a number indicates how many hits the respective value has produced.
  • Dynamic Navigation: The search services interface can be used to implement a dynamic navigation. Dynamic navigation facilitates catalog browsing by indicating how many products each category and sub-category contains.
  • Suggest Function: The search services interface can also provide a dynamic recommendation function that offers possible completions for search term fragments while the user types the search term. Along with each recommended completion, the number of possible hits, an optional image, and a suggestion type may be indicated.
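
The hit counts shown behind each filter attribute value in a guided search can be thought of as a grouping over the result set. The following is a minimal sketch in Java under that assumption, not Intershop code: the class FacetCountSketch, the attribute name displaySize, and the sample hits are hypothetical, and a real search engine would normally compute these counts inside the index.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the Intershop 7 API.
class FacetCountSketch {

    // Counts how many hits carry each value of the given filter attribute.
    // Hits that do not have the attribute are ignored.
    static Map<String, Integer> countValues(List<Map<String, String>> hits, String attribute) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> hit : hits) {
            String value = hit.get(attribute);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, String>> hits = List.of(
            Map.of("name", "Monitor A", "displaySize", "24 inch"),
            Map.of("name", "Monitor B", "displaySize", "27 inch"),
            Map.of("name", "Monitor C", "displaySize", "24 inch"));

        // prints {24 inch=2, 27 inch=1}
        System.out.println(countValues(hits, "displaySize"));
    }
}
```

Each entry of the returned map corresponds to one filter attribute value and the number displayed behind it in the storefront.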

4 Attributes - What is Indexed

This section introduces general search index management concepts, which are common for all search service implementations.
Index creation (and, consequently, search) is based on configurable attributes. Hence, indexing is generally based on defining index attributes and filter attributes, as well as on defining synonyms and stop words.

Note

Any additional search functionality (tuning, campaigns, etc.) depends on the actual search engine implemented as search service in Intershop 7.

4.1 Indexed Attributes

  • Standard Product Attributes:
    Certain standard product attributes are automatically included with the index, most of them for technical purposes, such as "productuuid". These attributes cannot be managed in Commerce Management. In addition, the index automatically includes information regarding catalog and category structure, starting with the top level category 0, down to category level 5. This information is used to enable the filter mechanism using catalog categories in the storefront. If the multiple category assignment attribute is added to the index, it is possible to include the category assignments for products that are assigned to more than one category. The category levels 1 through 5 are then replaced by the special CategoryUUIDLevelMulti field. The category display names for the product are still indexed and are taken from the default category.
  • Business Attributes:
    Business attributes defined on products are offered on the Indexed Attributes tab, where they can be selected for inclusion in the index.
  • Custom attributes:
    Custom attributes defined on a product can be included in the index by creating a new filter on the Indexed Attributes tab.

The standard product attributes to be indexed automatically as well as additional business attributes to be offered on the Indexed Attributes tab in Intershop Commerce Management are defined in the search index type definition file for the corresponding search index.

4.2 Filter Attributes

Intershop 7 lets you configure filter attributes for guided search in the consumer storefront. These filters appear in the storefront after a simple search and help users refine their search results.

Filters are managed in the search index configuration, and their definitions are stored in the corresponding configuration file (ISH-Config.xml). Each filter is based on an indexed attribute and is applied to all objects of the index. For each filter, you can choose between different behaviors, e.g., sorting, display depending on hit count, filter ranges, etc.

Additionally, you can set the scope of the filter - i.e., global or constrained to specific categories. This mechanism is used for filter attributes that are relevant for specific categories only. For example, a filter attribute Display Size has been defined for the catalog category TFT Monitors. In the inTRONICS storefront, this filter attribute will only be displayed if users submit a simple search and then drill down to the category TFT Monitors.
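
The scope rule can be illustrated with a short Java sketch. It is only an assumption about how such a rule might be evaluated, not the Intershop implementation: the class FilterScopeSketch, its Filter type, and the Manufacturer filter are hypothetical, and a null category is used here to mark a global filter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Intershop 7 API.
class FilterScopeSketch {

    static final class Filter {
        final String attribute;
        final String category; // null means the filter is global

        Filter(String attribute, String category) {
            this.attribute = attribute;
            this.category = category;
        }
    }

    // Returns the attributes of all filters that apply to the category the user
    // has drilled down to. Pass null if the user has not selected a category.
    static List<String> visibleFilters(List<Filter> filters, String currentCategory) {
        List<String> visible = new ArrayList<>();
        for (Filter filter : filters) {
            if (filter.category == null || filter.category.equals(currentCategory)) {
                visible.add(filter.attribute);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Filter> filters = List.of(
            new Filter("Manufacturer", null),
            new Filter("Display Size", "TFT Monitors"));

        System.out.println(visibleFilters(filters, null));           // prints [Manufacturer]
        System.out.println(visibleFilters(filters, "TFT Monitors")); // prints [Manufacturer, Display Size]
    }
}
```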

Note

An indexed attribute can only be used for one filter.

Note

These are the standard configuration views for managing filter attributes as deployed by default. The actual view depends on the implementation of the search engine used.

Note

Depending on the search engine used, filter attributes may also be managed in the search engine's own back office, in which case there is no need to implement this on the Intershop 7 side.


4.3 Synonyms

Synonyms tell a search engine which words have similar meanings even though they are not similar in form (for example: "Jeans" > "Pants").
Antonyms tell a search engine that two words have nothing to do with one another, even though the search engine would otherwise report a similarity because their forms are similar (for example: "TFT" > "DVD").

The search services provide a configuration view to define both synonyms and antonyms in the back office on the Synonym tab of the respective index. The difference lies in the reduction assigned to the word relationship: synonyms have a reduction of 0%, antonyms have a reduction of 100%. Intermediate values may be used to define word relationships that are neither exact synonyms nor antonyms. For example, the relationship Jeans > Pants could be given a slight reduction (such as 5%), so that a search for "Pants" also finds "Blue Jeans" but still lists "corduroy pants" earlier.
Synonym and antonym relations defined in the back office are mapped to entries in the dedicated synonyms section of the general configuration file. A search engine specific implementation may map these synonym configurations to the search engine's own configuration, or may replace the Synonym tab completely if the search engine provides its own configuration pages.
For each synonym and antonym relation, you can define the reduction separately depending on the direction in which the relation is traversed. For example, when a search engine searches semantic relations for "TFT", it may identify "Monitor" as synonym item, but apply a 5% reduction. On the other hand, when looking for items semantically related to "Monitor", a search engine may identify "TFT" as a synonym, without applying a reduction. An antonym relation is defined connecting "TFT" and "DVD", indicating that these words are not related semantically.
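
The direction-dependent reduction can be sketched in Java as follows. This is a simplified model for illustration, not the Intershop data model: the class WordRelationSketch and its methods are hypothetical, and it assumes that a reduction of r percent translates into a relevance weight of (100 - r) / 100, which an actual search engine may handle differently.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalDouble;

// Hypothetical illustration only; not part of the Intershop 7 API.
class WordRelationSketch {

    // Maps "from>to" to the reduction (0 to 100 percent) for that direction.
    private final Map<String, Integer> reductions = new HashMap<>();

    private static String key(String from, String to) {
        return from.toLowerCase(Locale.ROOT) + ">" + to.toLowerCase(Locale.ROOT);
    }

    // Defines a relation for one direction only; the reverse direction is set separately.
    void relate(String from, String to, int reductionPercent) {
        if (reductionPercent < 0 || reductionPercent > 100) {
            throw new IllegalArgumentException("Reduction must be between 0 and 100");
        }
        reductions.put(key(from, to), reductionPercent);
    }

    // Weight assumed for hits on 'to' when the user searched for 'from'.
    // Empty if no relation is configured for this direction.
    OptionalDouble weight(String from, String to) {
        Integer reduction = reductions.get(key(from, to));
        return reduction == null
            ? OptionalDouble.empty()
            : OptionalDouble.of((100 - reduction) / 100.0);
    }

    public static void main(String[] args) {
        WordRelationSketch relations = new WordRelationSketch();
        relations.relate("TFT", "Monitor", 5);   // synonym with a slight reduction
        relations.relate("Monitor", "TFT", 0);   // exact synonym in the other direction
        relations.relate("TFT", "DVD", 100);     // antonym
        relations.relate("DVD", "TFT", 100);     // antonym, reverse direction

        System.out.println(relations.weight("TFT", "Monitor").getAsDouble());
        System.out.println(relations.weight("Monitor", "TFT").getAsDouble());
        System.out.println(relations.weight("TFT", "DVD").getAsDouble());
    }
}
```

Storing each direction as its own entry keeps the two reductions independent, which mirrors the TFT and Monitor example above.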

Note

Depending on the search engine actually integrated, the synonym evaluation may differ. Possible differences in the search approach include, among others, recursive evaluation, synonym lookup (exact or fuzzy search), and whether the search covers single words only or also phrases.

4.4 Stop Words

Stop words can be defined in the back office to exclude words that have no relevance from the index. Typical examples are function words such as "the", "a", etc., which carry little meaning and can usually be ignored when searching.
Stop words defined in the back office are mapped to a dedicated stop word section in the general configuration file.
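
The effect of a stop word list can be sketched as a filtering step applied to the terms of a text. The following Java example is a minimal sketch under simplifying assumptions (whitespace tokenization, lower-casing); the class StopWordSketch is hypothetical, and real search engines typically apply stop words inside their own analysis chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of the Intershop 7 API.
class StopWordSketch {

    // Splits the text on whitespace, lower-cases it and drops all stop words.
    static List<String> removeStopWords(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        Set<String> stopWords = Set.of("the", "a");

        // prints [monitor, with, stand]
        System.out.println(removeStopWords("The monitor with a stand", stopWords));
    }
}
```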

5 What is a Search Service?

A search service integrates search functionality into the storefront. It consists of:

  • A specific SearchIndex which communicates with a 3rd party search engine/server (incl. ISH-Config.xml configuration file for each instance which stores attributes to index, filter attributes, etc.)
  • Search feature configuration file, e.g., SFProductSearch.solr.xml which defines DataProviders, standard attributes, ImportHandler, ... to instantiate the specific SearchIndex object
  • SearchIndexType
  • SearchIndexTypeService

The search services use the managed service framework. They therefore implement the corresponding interfaces and inherit functionality such as enabling or disabling per domain via the back office Services tab, configuration, etc.

Figure: Search index as a managed service (simplified)

5.1 SearchIndex

A class which implements the SearchIndex interface and/or extends the SearchIndexBaseImpl provides the actual implementation/functionality for a specific search engine.
This includes primarily the communication with the external search server/engine by:

  • Executing the query (after converting the query into the search engine specific format)
  • Mapping the search engine specific search result to an Intershop 7 compliant format

5.2 SearchIndexType

For each specific SearchIndex there exists a SearchIndexType which can hold several instances of this SearchIndex with the same and/or different configurations (ISH-Config.xml). Only one of these SearchIndexes handled by the same SearchIndexType instance can be online at the same time for the same domain.
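
The rule that only one index per type and domain can be online at a time can be modeled with a single reference that is replaced when another index goes online. The Java sketch below is an assumption about one way to express this rule, not the actual SearchIndexType implementation: the class OnlineIndexSketch, its methods, and the index IDs are hypothetical, and one instance stands for one type in one domain.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of the Intershop 7 API.
// One instance models the indexes of one index type within one domain.
class OnlineIndexSketch {

    private final Set<String> indexIds = new HashSet<>();
    private String onlineIndexId; // at most one index is online; null if none

    void addIndex(String indexId) {
        indexIds.add(indexId);
    }

    // Takes the given index online. Because only a single ID is stored,
    // a previously online index implicitly goes offline.
    void setOnline(String indexId) {
        if (!indexIds.contains(indexId)) {
            throw new IllegalArgumentException("Unknown index: " + indexId);
        }
        onlineIndexId = indexId;
    }

    String getOnlineIndexId() {
        return onlineIndexId;
    }

    public static void main(String[] args) {
        OnlineIndexSketch productIndexes = new OnlineIndexSketch();
        productIndexes.addIndex("products-v1");
        productIndexes.addIndex("products-v2");

        productIndexes.setOnline("products-v1");
        productIndexes.setOnline("products-v2");
        System.out.println(productIndexes.getOnlineIndexId()); // prints products-v2
    }
}
```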

A base implementation (BaseSearchIndexType) offers the functionality for handling SearchIndexes that was previously found in the now deprecated SearchIndexMgrImpl.

5.3 SearchIndexTypeService

This class is primarily responsible for managing the SearchIndexTypes. A SearchIndexType - and thus the actual SearchIndex - can be enabled/disabled per domain. This is done by the SearchIndexTypeService which instantiates one SearchIndexType per domain (if set to enabled).
Because the SearchIndexTypeService holds the configuration file (e.g., SFProductSearch.solr.xml, which declares DataProviders, standard attributes, and the ImportHandler), different search services can be implemented that use the same SearchIndex class but different configurations.

5.4 SearchIndexServiceMgr

This object manages all SearchIndexTypeServices. More importantly, it provides the functionality for working with the actual SearchIndexes: you can get the online indexes, create new ones, or delete them. Internally, it delegates the calls to the proper services.
Furthermore, it handles all events in a distributed environment to keep indexes in sync.
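
The relationship between the manager, the type services, and the per-domain types can be sketched as a chain of lookups. The Java example below is a simplified model written for this article, not the Intershop classes: ServiceDelegationSketch and its nested IndexType, TypeService, and ServiceMgr types are hypothetical stand-ins, as are the service and domain names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration only; not part of the Intershop 7 API.
class ServiceDelegationSketch {

    // Stand-in for an index type that exists once per enabled domain.
    static final class IndexType {
        final String domain;
        IndexType(String domain) { this.domain = domain; }
    }

    // Stand-in for a type service: creates one IndexType per enabled domain.
    static final class TypeService {
        private final Set<String> enabledDomains = new HashSet<>();
        private final Map<String, IndexType> typesByDomain = new HashMap<>();

        void enable(String domain) { enabledDomains.add(domain); }

        void disable(String domain) {
            enabledDomains.remove(domain);
            typesByDomain.remove(domain);
        }

        Optional<IndexType> typeFor(String domain) {
            if (!enabledDomains.contains(domain)) {
                return Optional.empty();
            }
            return Optional.of(typesByDomain.computeIfAbsent(domain, IndexType::new));
        }
    }

    // Stand-in for the manager: knows all type services and delegates to them.
    static final class ServiceMgr {
        private final Map<String, TypeService> services = new HashMap<>();

        void register(String serviceName, TypeService service) {
            services.put(serviceName, service);
        }

        Optional<IndexType> lookup(String serviceName, String domain) {
            TypeService service = services.get(serviceName);
            return service == null ? Optional.empty() : service.typeFor(domain);
        }
    }

    public static void main(String[] args) {
        TypeService productSearch = new TypeService();
        productSearch.enable("shopA");

        ServiceMgr manager = new ServiceMgr();
        manager.register("product-search", productSearch);

        System.out.println(manager.lookup("product-search", "shopA").isPresent()); // prints true
        System.out.println(manager.lookup("product-search", "shopB").isPresent()); // prints false
    }
}
```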

6 Data Indexing from the Database

Figure: Search service indexing

The data transformation (db -> index) is handled by the SearchIndexImportHandler (a search engine specific implementation) held by the SearchIndex.
Each SearchIndexType is built from a configuration file (e.g., SFProductSearch.solr.xml), which therefore also applies to the SearchIndexes it handles.
This configuration includes, among others:

  • Declaration of DataProviders
  • Declaration of attributes (standard and business) in conjunction with the proper DataProvider

In the back office, the Indexed Attributes tab of a newly created index initially shows all available attributes defined in this configuration file. If an attribute is already assigned to the Indexed Attributes section instead of the Available Attributes section, it is defined as "standard" in the configuration file and therefore cannot be removed from the index.
After a search index has been created in the back office (but has not been built yet), a configuration file (ISH-Config.xml) is generated for this specific index in the shared file system (see figure). It stores all attributes that are marked to be indexed, including the proper DataProviders.

6.1 Indexing

Once you have finished assigning the attributes that are to be indexed (referred to below as AttributesToIndex, ATI), you can build the index.
The workflow is as follows:

  • The SearchIndexImportHandler has a list of all ATI (see ISH-Config.xml)
  • For each ATI the proper DataProvider is called (as mentioned defined at the config file)
    • A specific DataProvider knows/implements the functionality how to retrieve the data from database (e.g., via query, ORM-layer, etc.) and how to process it
  • Retrieved and processed data gets transformed to a specific search engine format
  • Transformed data is provided to the search server, e.g.,
    • Sent to the search engine via HTTP or FTP
    • Written to the shared file system for direct file access by the search server
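
The workflow above can be sketched in Java as a small pipeline. This is a minimal sketch under stated assumptions, not the SearchIndexImportHandler implementation: the class IndexingWorkflowSketch, the AttributeProvider interface, the attribute names sku and name, and the "key=value" output format are all hypothetical, and the hand-over to the search server is replaced by printing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration only; not part of the Intershop 7 API.
class IndexingWorkflowSketch {

    // Stand-in for a DataProvider: knows how to retrieve one attribute value for a product.
    interface AttributeProvider {
        String valueFor(String productId);
    }

    // For every product, calls the provider of each attribute to index
    // and collects the values into one document per product.
    static List<Map<String, String>> collect(List<String> productIds,
                                             Map<String, AttributeProvider> providers) {
        List<Map<String, String>> documents = new ArrayList<>();
        for (String productId : productIds) {
            Map<String, String> document = new LinkedHashMap<>();
            for (Map.Entry<String, AttributeProvider> entry : providers.entrySet()) {
                document.put(entry.getKey(), entry.getValue().valueFor(productId));
            }
            documents.add(document);
        }
        return documents;
    }

    // Transforms a document into an engine-specific format.
    // A "key=value;key=value" line is used here purely as a placeholder format.
    static String transform(Map<String, String> document) {
        StringJoiner joiner = new StringJoiner(";");
        for (Map.Entry<String, String> entry : document.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // The providers return fixed sample values instead of querying a database.
        Map<String, AttributeProvider> providers = new LinkedHashMap<>();
        providers.put("sku", productId -> productId);
        providers.put("name", productId -> "Name of " + productId);

        // The transformed data would be handed to the search server; it is printed here instead.
        for (Map<String, String> document : collect(List.of("P1", "P2"), providers)) {
            System.out.println(transform(document));
        }
    }
}
```

Keeping one provider per attribute means that a new attribute only requires a new provider entry, while the collection and transformation steps stay unchanged.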

7 References

Advanced Search Module Implementation and Customization: dev_asm.pdf
Search Service Solr Documentation
