Concept - Mass Data Replication (valid to 11)

Introduction

ICM data replication generally refers to the process of first updating data in a source system and then synchronizing the data with a target system. The replication mechanism allows content to be developed and maintained in the background (i.e., in a source system that is offline to the public) without interfering with the target system that is online.

ICM provides two basic ways to update live system data in a data replication environment: mass data replication, which is intended for large volumes of data, and business object replication, which is intended for rapid updates of selected data. Both methods use the same communication channels but differ in the way they collect data in the source system and inject it into the target system.

Note

This concept describes mass data replication only.

Glossary

Term	Description
Staging	Refers to a framework that provides basic functionality for transferring data in the form of database or file system data from a source system to a target cluster. Often used as a synonym for data replication (which is actually incorrect).
Data Replication	Data replication is a process that transfers large amounts of data from a source cluster to a target cluster. In a typical scenario, storefront data (such as product data) and other settings are first updated in an editing system and then transferred to a live system. This mechanism allows large amounts of content to be developed and maintained in the background without significant disturbance to the production system. The mechanism for transferring individual business objects in an ad-hoc manner is called object replication (developer and administrator perspective) or publishing (shop manager perspective).
Editing system	In a data replication environment, the editing system is a dedicated ICM installation used to prepare or update the storefront data in the background without disturbing the operation of the live system. The term emphasizes the purpose of the system in the data replication environment as seen by a Data Replication Manager.
Source system	Describes an ICM system used to import and test new data that is then to be transferred to another ICM system by means of data replication. Thus, it is often used as a synonym for editing system in a data replication environment. The term emphasizes the data flow in a data replication environment.
Offline system	Often used as a synonym for a source system in a data replication environment. The term emphasizes the public accessibility of the system in the data replication environment.
Live system	In a data replication environment, the live system is a dedicated ICM installation that serves the live storefront and receives the data that has been prepared in the editing system. The term emphasizes the purpose of the system in the data replication environment as seen by a Data Replication Manager.
Target system	Describes an ICM system that receives data transferred from another ICM system (the source system) by means of data replication. Thus, it is often used as a synonym for live system in a data replication environment. The term emphasizes the data flow in a data replication environment.
Online system	Often used as a synonym for a target system in a data replication environment. The term emphasizes the public accessibility of the system in the data replication environment.
Target system vs. Target cluster	A target system refers to an ICM cluster that is the receiver (the target) of a data replication process. From a data replication perspective, a target system owns one web server address and one database schema, though it may consist of multiple web and application servers. A target cluster refers to a logical group of multiple (possibly geographically distributed) target systems that share the same cluster ID, with each of them owning its own shared file system, web address, and database schema. All target systems of a target cluster are updated in parallel by the same replication process.

(Mass Data) Replication and Staging

ICM's data replication mechanism is based on three different frameworks: the staging framework, the JDBC framework, and the locking framework (see figure below).

While the complete replication mechanism handles data synchronization as an all-encompassing, business-process-centered operation, staging provides the fundamental data transport mechanism and thus represents the technical perspective.

Figure: Mass Data Replication: Involved Frameworks

Staging Framework

The staging framework provides the fundamental entities and processes needed to identify and access the content affected by data replication, to model the assignment of content to replication processes, and to initiate and manage process execution.

The data replication mechanism does not replace the staging framework. It extends the staging framework in order to facilitate the management and execution of staging processes.

Note

Although the term staging has often been used as a synonym for replication, it is in fact only one ICM component involved in a Mass Data Replication process.

JDBC Framework

The JDBC framework and SQL are used to initiate data transfer between database instances or schemata.

Locking Framework

The locking framework prevents different processes within an ICM cluster (such as import processes, jobs, or data replication processes) from accessing the same resources at the same time, e.g., database tables or file system content. Therefore, each process has to impose a virtual lock on any resource it is going to access in order to ensure no other process can concurrently modify the resource.

Basic Architecture and Infrastructure for Data Replication

Basically, the data replication mechanism of ICM relates two kinds of systems: source systems and target systems.

To support multiple data centers, target systems, which may be situated in different locations, are grouped into (logical) target clusters. The same applies to source systems and (logical) source clusters.

Note

All target systems of one target cluster may be active at the same time, whereas only one source system may be active (up and running) at any one time.

For an easier understanding, the following figure shows a simplified view with only one editing and one target system; Multi Data Center functionality is described later in a separate paragraph in more detail.

Figure: Simplified Basic Architecture

One target system includes one or more application servers, the Web server, the Web adapter, and a target database account. In fact, the number of application servers, Web servers, and Web adapters is irrelevant to the data replication mechanism; it only has to be sufficient to process incoming requests properly.

One source system also includes one or more application servers, Web servers, Web adapters, and a source database account. Again, the number of application servers, Web servers, and Web adapters is irrelevant to the data replication mechanism. Typically, the sizing requirements for a source system are lower, as the source system does not have to process online requests.

All target systems of a target cluster have to use the same clusterID. All editing systems of the corresponding editing cluster also have to share one clusterID, which must differ from that of the target cluster.

A source cluster can be connected to multiple target clusters. However, each data replication process is directed at exactly one target cluster. It is not possible to update multiple target clusters from a source system in one data replication process. Nevertheless, all target systems belonging to the target cluster selected for a replication process are updated by the same replication process.

Mass data replication is based on the following fundamental paradigms:

One replication process involves exactly one source system and one target cluster. A target cluster consists of at least one target system. Data replication is handled by one subordinated staging process per target system.

A replication process is defined to be atomic, meaning that it finishes successfully only if all subordinated staging processes finish successfully. If even one subordinated staging process fails, i.e., replication to one of the target systems forming the current target cluster fails, the whole replication process is considered failed.
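
The atomicity rule can be illustrated with a minimal Python sketch. The names used here (StagingState, replication_succeeded) are hypothetical and not part of ICM; the sketch only shows how a single failed subordinated staging process causes the whole replication process to fail.

```python
from enum import Enum


class StagingState(Enum):
    """Hypothetical outcome of one subordinated staging process."""
    COMPLETED = "completed"
    FAILED = "failed"


def replication_succeeded(states_by_target_system):
    """Return True only if the staging process of every target system completed.

    A target cluster consists of at least one target system, so an empty
    mapping is treated as an invalid setup rather than as a success.
    """
    if not states_by_target_system:
        raise ValueError("a target cluster needs at least one target system")
    return all(state is StagingState.COMPLETED
               for state in states_by_target_system.values())
```

Rejecting an empty mapping is a design choice of this sketch: otherwise a process without any target system would count as successful.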

Basic mechanism:

A replication process is started by the editing system. Via HTTP, the editing system calls a Web service in each assigned target system to inform it that new data is available and which data this is.

Each target system then pulls the announced data from the editing system: file system data is downloaded via HTTP, and database data is retrieved by the target system via the database.

Mass Data Replication From a Business Point of View

Data Replication Workflow

From a user's perspective, data replication is separated into two main stages: first defining data replication tasks, and afterwards executing these tasks as data replication processes. Both stages are managed in the editing system and are described in more detail below.

Role Concept

Corresponding to these two stages, two basic user roles for data replication can be distinguished: Data Replication Manager and System Administrator.

Data Replication Managers operate within the back office of a particular business unit (i.e., enterprise or channel). They do not need any technical knowledge of data replication. They create replication tasks and assign them to the System Administrator for execution. For example, the Data Replication Manager could be an editor who maintains product and catalog information of a consumer channel of the source system. The editor then creates the task to replicate the data to the consumer channel of the target system.

The System Administrator acts as Data Replication Manager of the system unit (central e-selling administration, i.e., Operations back office). System Administrators oversee data replication across the whole system from a technical perspective. Their duties include receiving the replication tasks from the Data Replication Managers of the individual business units, combining them into data replication processes for execution, assigning the appropriate target cluster, and starting the replication processes.
Additionally, the System Administrator can trigger the rollback of publication processes if necessary and monitors the progress of replication processes.

Each business unit (channel, enterprise/sales partner) contains an access privilege Data Replication Manager, which is linked to the permission SLD_MANAGE_DATA_REPLICATION. The Data Replication Tasks module of ICM's back office becomes accessible if the user holds the access privilege Data Replication Manager for the particular business unit.
The System Administrator has the same permission. However, compared to the context of a business unit, the functionality of the Data Replication Tasks module is limited to processing published tasks by means of the additional module Data Replication Processes.

Replication Tasks

Data replication tasks determine the content to be replicated. They are defined by the responsible Data Replication Managers individually for each channel in the sales organization or partner back office. For example, the Data Replication Manager of the channel “PrimeTechSpecials” can define data replication tasks for this particular channel, using the consumer channel management plug-in in the sales organization back office.

For each data replication task the Data Replication Manager has to define:

Start Date
The start date sets the earliest time at which a replication task should be executed.
Replication Groups
To each replication task, one or more data replication groups have to be assigned. Replication groups define the kind of data to be replicated from the view of business objects.

Once defined, data replication tasks are submitted to the System Administrator for execution.

Replication Groups

A data replication group identifies the content to be replicated from a business object's point of view. Thus, the replication group can encapsulate the various content types (file content, database content) that are needed to replicate the selected business object. For example, the data replication group “Organization” includes the organization profile, the departments, the users and roles, and all preferences defined for an organization.
Each replication group refers to a certain content domain.

Replication Process

To execute data replication tasks, the System Administrator defines data replication processes in the central administration front end.

For each data replication process, the System Administrator defines:

Target Cluster
A source system can be connected to multiple target clusters (each consisting of one or more target systems). However, each replication process can address a single target cluster only.
Replication Tasks
Each replication process executes one or more replication tasks as submitted by the responsible Data Replication Managers. Only replication tasks whose start date has been reached can be included in a replication process.
Activation Rules
Data replication processes can be started either manually, or by a scheduled job at predefined times.
Data Replication Type
From a business point of view, the data replication type determines how data is handled during replication. Possible replication types include Data Transfer, Data Publishing, Data Transfer and Publishing, and Undo.
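
The start date rule for replication tasks can be sketched as a simple filter. This is an illustration only: ScheduledTask and eligible_tasks are hypothetical names, and the sketch assumes that a task becomes eligible as soon as the current time is equal to or later than its start date.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScheduledTask:
    """Hypothetical, simplified stand-in for a data replication task."""
    name: str
    start_date: datetime


def eligible_tasks(tasks, now):
    """Keep only the tasks whose start date has been reached at time `now`."""
    return [task for task in tasks if task.start_date <= now]
```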

Replication Types

For each replication process, a data replication type is set by the System Administrator. From a business point of view, the data replication type determines whether new data is transferred and published in one single process or in separate processes. After a replication process that included a successful publication, a one-step-back undo process can additionally be run.
The following replication types are available:

Data Transfer
This process transfers the data to the target cluster. However, it does not trigger a table or directory switch (publication).
Data Publishing
This process publishes data that have already been transferred to the target cluster. The process triggers all necessary table and directory switches as well as concomitant database commits to persist the changes (publication and cache refresh).
Note
Data publishing can only be executed on the results of a process of type Data Transfer executed immediately before.
Data Transfer and Publishing
This process accomplishes a complete replication process.
Undo
An Undo process rolls back a data replication process of type Data Publishing or Data Transfer & Publishing which has been completed successfully. Undo restores the target cluster state prior to executing the data replication task that is rolled back.
Note
Undo does not support undoing processes of type Data Transfer. Also, Undo can only roll back the most recent data replication process.

Mass Data Replication From a Technical Perspective

Replication Process Phases

A complete data replication process consists of the following main phases, as described in the figure below:

Figure: Phases of a replication process

Preparation
During this phase, the content involved in the current replication process is prepared. For example, the database tables will be analyzed to guarantee optimal execution plans for SQL statements used during the replication process.
Moreover, index files are created and packed into the distribution directory <IS.INSTANCE.SHARE>/dist/staging, containing information on files to be replicated.
Synchronization
The replication process merges content to be replicated (new content of source system), with content that should not be changed (old content of target system belonging to other domains).
During the synchronization phase, the old content of the target system that should not be changed is replicated to physical shadow containers (tables or directories).
Replication
During the replication phase, the new content is copied to the shadow containers of the target system. Database content and file system content are handled separately.
Publication
The final step of the data replication process is to publish the replicated content, for example by performing a switch between live and shadow tables (full replication of database content) or between active and inactive directories (replication of file system content). As a result, any new or changed data is available to online users, and deleted data no longer appears in the storefront.
Note
The publication phase is not executed if any of the preceding phases ended with an error.
Cache Refresh
There are several caches in ICM to ensure high performance. These caches are refreshed whenever new content has been published.

The process details for the individual phases differ depending on the content type to be replicated and the staging processor used to execute the replication process.

Replication Types

When preparing a replication process, the System Administrator has to set a data replication type. From a technical point of view, the data replication type determines which replication phases are actually performed for the respective data replication tasks. The following replication types are available:

Data Transfer
This process transfers the data to the target cluster, involving the phases preparation, synchronization, and replication. However, it does not trigger a table or directory switch (phase publication).
Data Publishing
This process publishes data that have already been transferred to the target cluster. The process triggers all necessary table and directory switches as well as concomitant database commits to persist the changes (i.e., phases publication and cache refresh).
Note
Data publishing can only be executed on the results of a process of type Data Transfer that was executed immediately before.
Data Transfer and Publishing
This process accomplishes a complete replication process.
Undo
An Undo process rolls back a data replication process of type Data Publishing or Data Transfer & Publishing which has been completed successfully. Undo restores the target cluster state prior to the execution of the data replication task that is rolled back.
Note
Undo does not support undoing processes of type Data Transfer. Also, Undo can only roll back the most recent data replication process.
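
The relation between replication types and phases described above can be written down as a small lookup table. The following Python sketch is illustrative only; PHASES_BY_TYPE and phases_for are hypothetical names. The phases of an Undo process are not listed in this section, so the sketch deliberately leaves that type out instead of guessing.

```python
# Phases of a complete replication process, in execution order.
PHASES = ("preparation", "synchronization", "replication",
          "publication", "cache refresh")

# Which phases each replication type performs, as stated in the text.
PHASES_BY_TYPE = {
    "Data Transfer": PHASES[:3],
    "Data Publishing": PHASES[3:],
    "Data Transfer and Publishing": PHASES,
}


def phases_for(replication_type):
    """Return the phases performed by the given replication type."""
    try:
        return PHASES_BY_TYPE[replication_type]
    except KeyError:
        raise ValueError(
            f"no phase list known for type {replication_type!r}") from None
```

The table makes it visible that Data Transfer followed by Data Publishing covers the same phases as Data Transfer and Publishing in a single process.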

Mass Data Replication Process Model

Replication and Staging Processes

In the active source system, the following objects are created for each data replication process: a ReplicationProcess object, a StagingProcess object representing the target cluster, and one additional StagingProcess object for each target system of the target cluster assigned to the replication process. All of them are tightly integrated with the locking framework, as shown in the figure below.

Figure: Internal structure of a Mass Data Replication process

The ReplicationProcess is mainly used to organize and visualize individual data replication processes. The ReplicationProcess always starts one StagingProcess, representing the target cluster, and additionally one StagingProcess for each target system of this cluster.

The StagingProcess stores meta-information associated with a data replication process, such as the name of the target cluster and the data replication type set for the process by the System Administrator. The StagingProcess objects are created automatically by the staging framework.

The StagingProcess uses additional staging sub-processes (one StagingProcess instance per target system) to keep track of the process states of each of the target systems, while its own process state provides a summarized process state for the whole target cluster and in this way for the whole staging process.

Both ReplicationProcess and StagingProcess are wrapper classes which extend functionality provided by the Process class of the locking framework.
Note
The locking framework provides the necessary persistent objects. For example, the wrapper class ReplicationProcess contains the persistent object Process. Replication-specific information of the ReplicationProcess is mapped onto custom attributes of the Process object.

Replication Process Model

A ReplicationProcess consists of ReplicationTask objects and is created and started by the System Administrator.

A ReplicationTask is created by the Data Replication Managers of the respective business unit. The Data Replication Manager defines the content of a ReplicationTask. A ReplicationTask consists of at least one ReplicationTaskAssignment.

A ReplicationTaskAssignment references exactly one StagingGroup and one Domain, thus embodying a ReplicationGroup.

ReplicationGroups can be selected by the Data Replication Managers in the back office of their business unit.

Figure: Mass Data Replication process model
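
The relationships of the replication process model can be sketched with plain Python data classes. These classes only mirror the names and cardinalities described above and are not the ICM implementation; the field names and the helper method staging_components are assumptions made for illustration, including the idea that identical staging group and domain pairs are collected only once.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ReplicationTaskAssignment:
    """References exactly one staging group and one domain."""
    staging_group: str
    domain: str


@dataclass
class ReplicationTask:
    """Created by a Data Replication Manager; needs at least one assignment."""
    name: str
    assignments: List[ReplicationTaskAssignment] = field(default_factory=list)

    def validate(self) -> None:
        if not self.assignments:
            raise ValueError(f"task {self.name!r} has no assignments")


@dataclass
class ReplicationProcess:
    """Created by the System Administrator from submitted tasks."""
    target_cluster: str
    tasks: List[ReplicationTask] = field(default_factory=list)

    def staging_components(self) -> List[Tuple[str, str]]:
        """Collect distinct (staging group, domain) pairs of all tasks."""
        pairs: List[Tuple[str, str]] = []
        for task in self.tasks:
            task.validate()
            for assignment in task.assignments:
                pair = (assignment.staging_group, assignment.domain)
                if pair not in pairs:
                    pairs.append(pair)
        return pairs
```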

Staging Process Model

A staging process consists of several components describing the content affected by this process.

Figure: Staging process model

Each StagingProcess has a type. The types which a StagingProcess can assume correspond to the data replication types which the System Administrator can set for each replication process in the back office (see Replication Types described in the section before):

Replication (Data Transfer)
Publication
ReplicationPublication (Data Transfer and Publishing in a single process)
Undo

Each StagingProcess references one or more StagingProcessComponents. A StagingProcessComponent references exactly one domain and one StagingGroup.

Staging Resources (StagingResourceAssignment)

The staging framework uses resource definitions of the locking framework to lock affected resources (e.g., tables) whenever a data replication process is executed. Thus, the staging mechanism prevents the respective resources from being changed by other processes (e.g., jobs, imports) while a replication process is underway.

Mass Data Replication Entity Model

The entity model describes the content components to be transferred by replication processes, making use of fundamental concepts of the staging framework such as StagingGroup, StagingTable and StagingDirectory.

Data Replication Groups and Staging Groups

Data replication groups identify the content to be transferred between source and target system from a business point of view, e.g., catalogs, channels or product prices. Replication groups are configured via an XML configuration file, replication.xml, located in <IS.INSTANCE.SHARE>/system/config/cluster.

Replication groups can be conceived of as staging group-to-domain assignments. Hence, replication groups relate logical data containers (domains) with physical data containers (staging groups, bundling database tables or staging directories).

Assignment of Staging Groups to Replication Groups

There is no persistent object representing a data replication group. Replication groups are used at the pipeline layer (see below) and at the template layer (to visualize the organization of replication processes).

The staging group-to-domain assignment takes place when a data replication group is assigned to a replication task. The pipelines ProcessReplicationGroupAssignment[channelType] are responsible for the staging group-to-domain assignments. These pipelines are channel type-specific. They perform the following actions:

The pipeline's start node Start is called with the replication group ID as defined in replication.xml. The pipeline analyzes this ID and, depending on it, selects a jump node that targets a specific sub-pipeline to handle the staging group assignment for this replication group.

For each replication group, a specific sub-pipeline is triggered which assigns the required staging groups to the respective domains and adds this assignment to the ReplicationTask to which the replication group is added.

The pipelet AddStagingGroupToReplicationTask of cartridge bc_foundation performs the assignment. It is called in the sub-pipeline once for each staging group-to-domain combination required by the specified replication group.

For example, the sub-pipeline ProcessReplicationGroupAssignment_52-ProductPrices handles the staging group-to-domain assignments when adding the data replication group “Product Prices” of the B2C channel to a data replication task. The pipeline contains two instances of the pipelet AddStagingGroupToReplicationTask, assigning the staging groups “Prices” and “PRICING_PriceScale”, respectively.
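
The dispatch logic of these pipelines can be approximated in a few lines of Python. This is a conceptual sketch, not pipeline code: the lookup table stands in for the jump nodes and sub-pipelines, the function stands in for repeated calls of AddStagingGroupToReplicationTask, and the replication group ID "ProductPrices" and the domain name used in the usage note are assumptions. Only the staging group names “Prices” and “PRICING_PriceScale” are taken from the example above.

```python
# Stand-in for the jump nodes: replication group ID -> required staging groups.
STAGING_GROUPS_BY_REPLICATION_GROUP = {
    "ProductPrices": ("Prices", "PRICING_PriceScale"),
}


def assign_replication_group(task_assignments, replication_group_id, domain):
    """Add one (staging group, domain) pair per required staging group."""
    staging_groups = STAGING_GROUPS_BY_REPLICATION_GROUP.get(replication_group_id)
    if staging_groups is None:
        raise KeyError(f"unknown replication group {replication_group_id!r}")
    for staging_group in staging_groups:
        task_assignments.append((staging_group, domain))
    return task_assignments
```

Calling assign_replication_group([], "ProductPrices", "SomeChannelDomain") yields two assignments, one for each staging group of the replication group.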

Note

For staging groups to be assigned successfully, all referenced domains must exist at the time the replication group is assigned to the replication task. For example, a catalog to be replicated has to be created before you add the replication group Catalogs to a replication task.

Staging Groups

A staging group consists of several staging entities of the same type and contains the configuration that determines how those entities are replicated (the staging processor).

Staging Entities

A staging entity describes an atomic data container for a certain type of content: database tables, materialized database views, or file system content. Accordingly, the following types of staging entities have to be distinguished:

Staging Groups bundling Database Tables
StagingTables represent tables in the database, such as PRODUCT. Staging groups of content type DATABASE contain only staging entities of type StagingTable. For details, see Staging Tables below.

Staging Groups bundling File System Content
StagingDirectories represent directories in the file system including all underlying file system content like files and sub-directories. Staging groups of content type FILE SYSTEM contain only staging entities of type StagingDirectory. For details, see Staging Directory below.

Staging Groups bundling Materialized Views
StagingMViews represent materialized views in the database. Staging groups of content type DATABASE MATVIEW contain only staging entities of type StagingMView. For details, see Staging Materialized Views below.

Figure: Staging Group and Staging Entities

StagingTable

The staging entity StagingTable represents a database table. A StagingTable can be domain-specific or not.

A simple staging table (being not domain-specific) has to fulfill the following requirements:

The table does not contain columns of type LONG or LONG RAW. SQL statements used for database replication during the staging process do not support tables with these column types. Columns of type BLOB and CLOB, however, are supported.

The table possesses a primary key. This is necessary to identify each row unambiguously.

The table does not reference other tables that do not belong to staging content. In order to avoid inconsistent references, tables connected via foreign keys have to be staged together.

Tables containing domain-specific content additionally have to fulfill the following requirement:

The table contains a column storing domain identifiers, or references another table containing a column storing the domain identifier. This column is used to assign staging content to domains (i.e., units or sites). A table which does not include or reference this column will automatically be assigned to the system domain.
By default, the column name DOMAINID is assumed. However, it is possible to specify a different column name by modifying the respective definitions in the StagingGroup.properties file used by DBInit or DBMigrate (see section on Staging Group Preparation below).

Tables that are writable in the storefront (and hence will be replicated using a delta replication mechanism) additionally have to fulfill the following requirement:

The table contains a column storing the modification time of the respective table row. This column has to be named LASTMODIFIED, must be of type DATE, and needs to be updated on each change of the corresponding row. A mechanism is provided which sets the current date in case of changes.
Note
If the column does not exist, it is not possible to track changes.

When creating custom persistent objects using INTERSHOP Studio, the column is generated automatically when setting property ModificationTimeTracking for the respective class to true.
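
The table requirements listed above can be summarized as a checklist. The following Python sketch is a simplified illustration under stated assumptions: TableInfo and staging_problems are hypothetical names, the table metadata is passed in by hand rather than read from a database, and neither the foreign key requirement nor a domain column reachable through a referenced table is checked.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

UNSUPPORTED_TYPES = {"LONG", "LONG RAW"}


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical, simplified description of a database table."""
    name: str
    columns: Dict[str, str]              # column name -> column type
    primary_key: Tuple[str, ...] = ()
    domain_specific: bool = False
    writable_in_storefront: bool = False
    domain_column: str = "DOMAINID"      # default column name per the text


def staging_problems(table: TableInfo) -> List[str]:
    """Return the violated requirements; an empty list means none found."""
    problems = []
    for column, column_type in table.columns.items():
        if column_type.upper() in UNSUPPORTED_TYPES:
            problems.append(
                f"column {column} has unsupported type {column_type}")
    if not table.primary_key:
        problems.append("table has no primary key")
    if table.domain_specific and table.domain_column not in table.columns:
        problems.append(f"missing domain column {table.domain_column}")
    if table.writable_in_storefront:
        if table.columns.get("LASTMODIFIED", "").upper() != "DATE":
            problems.append("missing LASTMODIFIED column of type DATE")
    return problems
```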

StagingDirectory

The staging entity StagingDirectory represents a directory containing file system content to be replicated. The staging directories reside in the numbered subdirectories of each site directory. The entire content within these directories can be replicated. The directory tree may look as pictured below:

Figure: Staging Directories in ICM

Note

Data replication can include unit directories in <IS.INSTANCE.SHARE>/sites/<site>/<.active>/units, where <.active> references the currently active directory (1 or 2). Note also that unit directories in <IS.INSTANCE.SHARE>/sites/<site>/units cannot be replicated, since they do not contain any staging-relevant content.

The .active file, located in the site directory, contains the number of the directory currently used by the application server (either 1 or 2), i.e., it defines the active directory. The other numbered directory stores the changed or new files. Upon publication, the content of the .active file is altered to point to the new active directory. The look-up mechanism of the application server reads this information and uses the specified directory.
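
The directory switch can be pictured with a short Python sketch. It is a conceptual illustration only and not the ICM implementation: the function names are hypothetical, and the sketch assumes that the .active file contains nothing but the digit 1 or 2, optionally followed by whitespace.

```python
from pathlib import Path


def active_directory(site_dir):
    """Read the number of the currently active directory from .active."""
    number = (Path(site_dir) / ".active").read_text().strip()
    if number not in ("1", "2"):
        raise ValueError(f"unexpected .active content: {number!r}")
    return number


def shadow_directory(site_dir):
    """Return the number of the inactive directory that receives new files."""
    return "2" if active_directory(site_dir) == "1" else "1"


def publish(site_dir):
    """Switch directories by pointing .active at the former shadow directory."""
    new_active = shadow_directory(site_dir)
    (Path(site_dir) / ".active").write_text(new_active)
    return new_active
```

Because only the small .active file changes, the switch itself is quick, while the potentially slow file transfer has already happened in the shadow directory.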

StagingMView

The staging entity StagingMView, together with the MViewStagingProcessor, is used to update materialized views whose original tables were affected by replication processes. The new content of materialized views is published using database synonyms.

MViews are refreshed in the background during the replication process.

Staging Processors and Staging Decorators

Staging processors provide the core methods for the replication of different content types, such as database content or file system content.

Staging processor decorators provide additional functionality to extend the functionality of staging processors. The decorators perform tasks before or after a state has changed during a data replication process (cf. Replication Process Phases above).

Every StagingGroup is associated with a StagingProcessorConfig. The StagingProcessorConfig determines which staging processor has to be used to replicate the content represented by the staging group, i.e., it defines in which way data is replicated. Each staging group is assigned one staging processor, which may or may not be extended by one or more staging decorators. As a result, a StagingProcessorConfig consists of exactly one StagingProcessor and zero or more decorators.

Figure: StagingProcessorConfig

Depending on the content type, there exist different staging processors and decorators, implementing various methods to replicate data from editing to target systems. A detailed description of the available standard processors and decorators is given in the next section.

The Standard Staging Processors and Staging Decorators

Staging Processor Model

All staging processors are derived from the class BasicStagingProcessor. This class defines the signatures of several hooks called by the pipelets of the staging process pipelines. The following figure depicts the class hierarchy of the standard staging processors. All processor classes and all but two of the decorator classes are provided by the core cartridge; the RefreshSearchIndexesDecorator is implemented in bc_search, and the ShippingRuleEngineStagingProcessorDecorator comes with bc_shipping.

Figure: Staging Processor Model: Class Hierarchy

For each data replication phase (Preparation, Synchronization, Replication, Publication, Cache Refresh; see Replication Process Phases above), a staging processor provides the following hook functionality:

The first method, onPre<Phase>Hook, is called at the beginning of the phase with all staging process components describing content affected by this assigned processor. It may be used to initialize some objects or to get system resources like database connections.

The second method, on<Phase>Hook, actually executes the phase (such as replication or synchronization of content).

When the second method has been successfully called for all assigned staging process components, the third method, onPost<Phase>Hook, is called to clean up objects or to release system resources.

In case of an error, the fourth method, onError<Phase>Hook, is called. It is used to release system resources and to perform error handling.

The staging processor classes provide specific implementations of these hook methods, depending on the type of content and replication mechanism.
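The order in which the four hooks are called can be sketched as follows. This is an illustration only: RecordingProcessor and run_phase are hypothetical names, the real hooks are Java methods called by pipelets, and the sketch assumes that an error in any hook of a phase leads to the error hook.

```python
class RecordingProcessor:
    """Hypothetical processor that only records which hooks were called."""

    def __init__(self, fail_on=None):
        self.calls = []
        self.fail_on = fail_on  # component name that triggers a simulated error

    def on_pre_phase(self, phase, components):
        self.calls.append(f"onPre{phase}Hook")

    def on_phase(self, phase, component):
        if component == self.fail_on:
            raise RuntimeError(f"{phase} failed for {component}")
        self.calls.append(f"on{phase}Hook({component})")

    def on_post_phase(self, phase, components):
        self.calls.append(f"onPost{phase}Hook")

    def on_error_phase(self, phase, components):
        self.calls.append(f"onError{phase}Hook")


def run_phase(processor, phase, components):
    """Call the hooks of one phase in the order described in the text."""
    try:
        processor.on_pre_phase(phase, components)
        for component in components:
            processor.on_phase(phase, component)
        processor.on_post_phase(phase, components)
        return True
    except Exception:
        # Assumption: any failure within the phase ends in the error hook.
        processor.on_error_phase(phase, components)
        return False
```

For example, run_phase(RecordingProcessor(), "Replication", ["PRODUCT", "CATALOG"]) records the pre hook, one phase hook per component, and the post hook, in that order.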

The staging processor objects are created by a factory. The factory uses the default constructor of each processor object for initialization.

StagingProcessors

Staging Processors for File System Content

File system content is handled by sub-classes of FileSystemStagingProcessor providing functionality (hooks) for the publication phase of a staging process (switching directories of site content).

ICM includes the SimpleFileSystemStagingProcessor as default implementation class for the FileSystemStagingProcessor. This processor first creates binary index files in the source system, keeping information on the staging directories in $IS_SITES/<site>/<.active>. The index files are stored in $IS_DIST (default=/intershop/sites/root/temp-share/dist).

Then the same procedure is executed in the target system. Afterwards, the target system downloads the binary index files from the source system and checks them for changed file system content by comparing them with its own index files. The target system then downloads the changed files directly into the shadow directory of the target system.

There is another implementation of the FileSystemStagingProcessor, the DRPIndexFileSystemStagingProcessor. Instead of binary index files, it uses a DRP index (an XML representation of file system content) on the target and source directories to detect changes in file content. Apart from that, the procedure is basically equivalent to that of the SimpleFileSystemStagingProcessor. For performance and resource reasons (memory usage), it is recommended to use the SimpleFileSystemStagingProcessor for new projects.

Note

The DRPIndexFileSystemStagingProcessor uses a modified DRP index mechanism. In contrast to the standard mechanism, the index file used for file comparison contains the rounded time stamp and the size of each file instead of a checksum, which reduces the time needed to build the DRP index file.
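The idea of comparing two indexes that hold a size and a rounded time stamp per file can be sketched as follows. The functions build_index and compare_indexes, the dictionary layout, and the rounding granularity of two seconds are assumptions made for illustration; the actual binary and DRP index formats are not described here.

```python
import os


def build_index(root, mtime_granularity=2):
    """Map each relative file path to (size, rounded modification time)."""
    index = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            # Round the time stamp down to a multiple of the granularity.
            rounded = int(info.st_mtime // mtime_granularity) * mtime_granularity
            relative = os.path.relpath(path, root).replace(os.sep, "/")
            index[relative] = (info.st_size, rounded)
    return index


def compare_indexes(source_index, target_index):
    """Return (files to transfer, files to delete in the target)."""
    changed = sorted(
        path for path, meta in source_index.items()
        if target_index.get(path) != meta
    )
    obsolete = sorted(path for path in target_index if path not in source_index)
    return changed, obsolete
```

Comparing size and time stamp is cheaper than computing a checksum, at the price of missing a change that preserves both values.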

File replication based on FileSystemStagingProcessor involves the following phases:

Preparation
DRPIndexFileSystemStagingProcessor:
Generation (or re-use) of DRP indexes of the source and target directories. Both indexes are compared. Changed files are zipped at the source system and are then copied (along with a list of files to be deleted) to the source system directory $IS_DIST (default=/intershop/sites/root/temp-share/dist).
SimpleFileSystemStagingProcessor:
Generation of a binary index of the source system’s file content.
Synchronization
DRPIndexFileSystemStagingProcessor only:
Old content of the target system that should not be changed is replicated from the active directory (1) to the shadow directory (2).
Replication
DRPIndexFileSystemStagingProcessor:

The target system downloads the generated zip archives from the source system. The zip files are extracted into the shadow directory. Files to be deleted are removed in the shadow directory.

SimpleFileSystemStagingProcessor:

Generation of a binary index of the target system’s file content. The two indexes are compared. Changed files are downloaded to the target system, obsolete files are deleted.

Publication
The file *.active is changed according to the number of the shadow directory.
Cache Refresh
The change of the file *.active is propagated to the application servers of the target system. All affected caches are cleared at the target system (page cache, PO cache).

Figure: Replication of file system content

Staging Processors for Database Content

The base class for all staging processors handling database content is the abstract class DatabaseStagingProcessor. It provides methods for transaction, database connection and statement handling. Furthermore, it collects all affected persistent objects being involved in the current replication process.

Database staging processors come in two basic types: full replication processors and delta (partial) replication processors. Both mechanisms are described below.

Full Replication Processors

In case of full replication, database content is transferred from the source to the target system regardless of changes. Full replication is used for most types of database content, except tables that are writable in the target system (such as promotion codes on a system used as live system).

Performance tests have shown that in most cases it is faster to delete all data from a table and refill it completely, including the changed data, than to update only the changed table entries.

Full data replication is available for global (not domain-specific) and for domain-specific data. Global means that the data to be replicated is not selected based on a DOMAINID. Domain-specific means that the data is selected by a DOMAINID column.

The full replication mechanism relies on the following basic database objects: for each table to be replicated there are two tables with suffixes $1 and $2 added to the original table name (one used as the live, i.e., the currently active, and the other one as a shadow table) and an additional database synonym with the original table name pointing to the current live table; see also figure below. The Java functionality accesses the database table via the synonym.

Full replication involves the following steps:

Preparation
Database tables are analyzed on source and target system to collect statistical data in order to optimize the replication process.
Synchronization
The shadow tables are cleared. Subsequently, data which are not replicated (e.g., of domains which are not involved in the current replication process) are copied on the target system from the currently active tables into the shadow tables.
Replication
Data are replicated from the active tables of the source system into the shadow tables of the target system, using either the database link or direct access to the source database schema.
Publication
Old synonyms of the tables involved in the replication process are dropped. New synonyms are created that map to the former shadow tables, thus making them the active tables, while the former active tables become the shadow tables.
Cache Refresh
All affected caches are cleared at the target system (page cache, PO cache).

Figure: Replication of database content using the full replication mechanism
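The synchronization, replication, and publication steps can be tried out with the following sketch. It uses SQLite from the Python standard library instead of Oracle, a view as a stand-in for the synonym, and a hypothetical three-column table; the class FullReplicationSketch and its methods are illustrative names, not part of ICM.

```python
import sqlite3


class FullReplicationSketch:
    """Toy model of live/shadow tables; a view stands in for the synonym."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table
        self.live = f"{table}$1"
        self.shadow = f"{table}$2"
        for name in (self.live, self.shadow):
            conn.execute(
                f'CREATE TABLE "{name}" '
                "(id TEXT PRIMARY KEY, domainid TEXT, value TEXT)"
            )
        self._point_synonym(self.live)

    def _point_synonym(self, target):
        self.conn.execute(f'DROP VIEW IF EXISTS "{self.table}"')
        self.conn.execute(
            f'CREATE VIEW "{self.table}" AS SELECT * FROM "{target}"'
        )

    def synchronize(self, staged_domains):
        """Clear the shadow table, then keep rows of domains not being staged."""
        marks = ",".join("?" * len(staged_domains))
        self.conn.execute(f'DELETE FROM "{self.shadow}"')
        self.conn.execute(
            f'INSERT INTO "{self.shadow}" SELECT * FROM "{self.live}" '
            f"WHERE domainid NOT IN ({marks})",
            list(staged_domains),
        )

    def replicate(self, source_rows, staged_domains):
        """Copy the staged domains' rows from the source into the shadow table."""
        rows = [row for row in source_rows if row[1] in staged_domains]
        self.conn.executemany(
            f'INSERT INTO "{self.shadow}" VALUES (?, ?, ?)', rows
        )

    def publish(self):
        """Repoint the synonym; the former shadow table becomes the live table."""
        self._point_synonym(self.shadow)
        self.live, self.shadow = self.shadow, self.live
```

Until publish() is called, readers of the synonym still see the old content. Afterwards the old content remains in the new shadow table, which is what makes an Undo possible for this mechanism.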

Delta (Partial) Replication Processors

In case of delta replication, only content which has actually changed is transferred from the source to the target system. The replicated content is directly inserted into the live (i.e., active) tables of the target system and published by committing the respective database transaction.

Note

For this reason, no Undo is possible for replication processes which include content replicated with delta replication processors.

Delta replication is used for database data which is writable in the target system. It is needed whenever data is created or changed independently or concurrently not only in the source system, but also in the target system. An example is promotion codes, which are created in the editing system and changed (redeemed) in the target system.

All delta staging processors are derived from the abstract class TransactionalStagingProcessor, which itself is derived from DatabaseStagingProcessor. The TransactionalStagingProcessor provides a method to enable the deletion triggers needed to track the deletion of table rows.

Delta replication comprises the following steps:

Preparation
Database tables are analyzed on source and target system to collect statistical data in order to optimize the replication process.
Synchronization
All changed data in the active tables of the target system is copied to the shadow table.

In case of any error, the active table is copied completely into the shadow table. This is necessary for the data replication type Undo.

Replication
All data which has changed since the last replication process is copied from the active tables of the source system to the active tables of the target system (using either the database link or direct access to the source database schema).

These data records are detected by means of the column LASTMODIFIED. The replication is carried out in one large transaction.

Publication
The large transaction is committed.
Note
Synonyms of the tables in the target system are not changed in delta replications.
Cache Refresh
All affected caches are cleared at the target system (page cache, PO cache).

Figure: Replication of database content using the delta replication mechanism
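The selection of changed rows by LASTMODIFIED can be illustrated with plain dictionaries. This sketch is not the SQL that ICM executes: merge_changed_rows, the row layout, and the integer time stamps are hypothetical, and the single database transaction is only imitated by working on a copy.

```python
def merge_changed_rows(source, live):
    """Copy rows that are new or newer in `source` into `live`.

    Both arguments map a primary key to a dict with a 'lastmodified' entry.
    Returns the sorted list of keys that were inserted or updated.
    """
    staged = dict(live)  # work on a copy so a failure leaves `live` untouched
    merged = []
    for key, row in source.items():
        current = staged.get(key)
        if current is None or current["lastmodified"] < row["lastmodified"]:
            staged[key] = dict(row)
            merged.append(key)
    # "Commit": the changes become visible in one step.
    live.clear()
    live.update(staged)
    return sorted(merged)
```

A row that was changed more recently in the target system than in the source system, such as a redeemed promotion code, is left as it is.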

Default Database Staging Processor Classes

ICM includes the following database staging processor classes:

Full replication

FullStagingProcessor
This staging processor is used for system content (i.e., not domain-specific) such as regional settings, permissions, or roles. With this processor, ORACLE direct load SQL statements are used (TRUNCATE, INSERT /*+ APPEND */) to reduce replication time.
With this processor, simple SQL statements are used (DELETE, INSERT).

FullDomainSpecificStagingProcessor
This staging processor is used for tables containing domain-specific data. With this processor, ORACLE direct load SQL statements are used (TRUNCATE, INSERT /*+ APPEND */) to reduce replication time.
With this processor, simple SQL statements are used (DELETE, INSERT).

MViewStagingProcessor
This staging processor is used for updating materialized views in the database, whose original tables were affected by data replication processes.

Delta replication

MergeDomainSpecificStagingProcessor
This staging processor is used to replicate database content residing in database tables that are changed in the source as well as in the target system. For this reason, the replication occurs in one huge transaction. It uses the SQL 'MERGE' statement to transfer new and updated content, and uses deletion tracking with deletion triggers to detect rows removed in the editing system.
Note
The 'MERGE' statement has a restriction: It does not work on tables having a column with a context index. So, only tables with normal indexes are supported.
Note
Because the replication runs in one huge transaction, the publication phase cannot be started separately. Furthermore, the undo process is not supported, which saves the time otherwise needed to back up the old content of the live system.

IncrementalDomainSpecificStagingProcessor
This staging processor is used to replicate database content residing in database tables that are changed in the source as well as in the target system. For this reason, the replication occurs in one huge transaction. It uses the SQL 'MERGE' statement to transfer new and updated content, based on the LASTMODIFIED column.

Table rows which do not exist in the editing system are deleted from the live system.

Note

The 'MERGE' statement has a restriction: It does not work on tables having a column with a context index. So, only tables with normal indexes are supported.

Note

Because the replication runs in one huge transaction, the publication phase cannot be started separately. Furthermore, the undo process is not supported, which saves the time otherwise needed to back up the old content of the live system.

AppendDomainSpecificStagingProcessor
This staging processor is used to replicate only new content of domain-specific tables. Old content of the live system is never overwritten. Removed rows in the editing system will never be deleted in the live system.
Note
This processor replicates its contents in the publication phase in order to support separated Replication and Publication modes.

DeleteAppendDomainSpecificStagingProcessor
This staging processor is used to replicate new and deleted content of domain-specific tables. Existing old content of the live system is never overwritten. Removed rows in the editing system will be deleted in the live system.
Note
This processor replicates its contents in the publication phase in order to support separated Replication and Publication modes.
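The difference between the two append processors can be sketched with dictionaries keyed by primary key. The functions append_only and delete_append are hypothetical, and the restriction to the domains of the current staging process is left out for brevity.

```python
def append_only(source, live):
    """Insert rows whose key is missing in `live`; never overwrite or delete."""
    added = sorted(key for key in source if key not in live)
    for key in added:
        live[key] = source[key]
    return added


def delete_append(source, live):
    """Delete rows missing in `source`, then insert rows missing in `live`."""
    removed = sorted(key for key in live if key not in source)
    for key in removed:
        del live[key]
    return removed, append_only(source, live)
```

In both cases, a row that exists in both systems keeps the value it has in the live system.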

Basic SQL Commands for Selected Staging Processors

Processor	Deleting Data From Shadow Tables	Inserting Data	Undoing Replication
FullStagingProcessor	TRUNCATE TABLE {0} REUSE STORAGE	INSERT /*+ APPEND */ INTO <shadow_table_name> dst SELECT * FROM <live_synonym_name> src and INSERT /*+ APPEND */ INTO <shadow_table_name> dst SELECT * FROM <source_table_in_editing_system> src or, alternatively, INSERT /*+ PARALLEL(dst, <nn>) */ INTO <shadow_table_name> dst SELECT /*+ PARALLEL(src, <nn>) */ * FROM <live_synonym_name> src and INSERT /*+ PARALLEL(dst, <nn>) */ INTO <shadow_table_name> dst SELECT /*+ PARALLEL(src, <nn>) */ * FROM <source_table_in_editing_system> src	SQL statement to save content that should not be undone: same as inserting data.
FullDomainSpecificStagingProcessor	TRUNCATE TABLE {0} REUSE STORAGE	INSERT /*+ APPEND */ INTO <shadow_table_name> dst SELECT * FROM <live_synonym_name> src WHERE <column_name_of_DOMAINID> NOT IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid = <current_stagingprocess_id> AND staginggroupid = <current_staginggroupid>) and INSERT /*+ APPEND */ INTO <shadow_table_name> dst SELECT * FROM <source_table_in_editing_system> src WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid = <current_stagingprocess_id> AND staginggroupid = <current_staginggroupid>) or, alternatively, INSERT /*+ PARALLEL(dst, <nn>) */ INTO <shadow_table_name> dst SELECT /*+ PARALLEL(src, <nn>) */ * FROM <live_synonym_name> src WHERE <column_name_of_DOMAINID> NOT IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid = <current_stagingprocess_id> AND staginggroupid = <current_staginggroupid>) and INSERT /*+ PARALLEL(dst, <nn>) */ INTO <shadow_table_name> dst SELECT /*+ PARALLEL(src, <nn>) */ * FROM <source_table_in_editing_system> src WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid = <current_stagingprocess_id> AND staginggroupid = <current_staginggroupid>)	INSERT INTO <shadow_table_name> SELECT * FROM <live_synonym_name> WHERE <column_name_of_DOMAINID> = <domainID>
MViewStagingProcessor	ddl.drop_materialized_view(<mview_name>);	SELECT query FROM user_mviews WHERE mview_name=<mview_name> UNION ALL SELECT query FROM user_synonyms s JOIN user_mviews v ON (s.table_name=v.mview_name) WHERE synonym_name=<mview_name>	same as inserting data
AppendDomainSpecificStagingProcessor	none	INSERT INTO <live_table_name> SELECT * FROM <source_table_in_editing_system> src WHERE NOT EXISTS (SELECT * FROM <live_table_name> dst WHERE src.<primary_key>=dst.<primary_key>) AND (<column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>))	transactional
DeleteAppendDomainSpecificStagingProcessor	DELETE FROM <live_table_name> dst WHERE (<primary_keys_of_table>) IN (SELECT <primary_keys_of_table> FROM <live_table_name> dst WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>) MINUS SELECT <primary_keys_of_table> FROM <source_table_in_editing_system> src WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>))	INSERT INTO <live_table_name> SELECT * FROM <source_table_in_editing_system> src WHERE (<primary_keys_of_table>) IN (SELECT <primary_keys_of_table> FROM <source_table_in_editing_system> src WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>) MINUS SELECT <primary_keys_of_table> FROM <live_table_name> dst WHERE <column_name_of_DOMAINID> IN (SELECT stagingdomainid FROM stagingprocesscomponent WHERE stagingprocessid=<current_stagingprocess_id> AND staginggroupid=<current_staginggroupid>))	transactional
IncrementalDomainSpecificStagingProcessor	DELETE FROM <live_table_name> WHERE ((<column_name_of_DOMAINID>=<domainid_of_current_component>) AND (<primary_keys_of_table>) NOT IN (SELECT <primary_keys_of_table> FROM <source_table_in_editing_system> WHERE <column_name_of_DOMAINID>=<domainid_of_current_component>))	MERGE INTO <live_table_name> dst USING (SELECT s.* FROM <source_table_in_editing_system> s LEFT OUTER JOIN <live_table_name> d ON (s.<comparison_key_of_table>=d.<comparison_key_of_table>) WHERE <column_name_of_DOMAINID>=<domainid_of_current_component> AND (d.lastmodified IS NULL OR d.lastmodified<s.lastmodified)) src ON (s.<primary_key_of_table>=d.<primary_key_of_table>) WHEN MATCHED THEN UPDATE SET dst.<assigned_column_names>=src.<assigned_column_names> WHEN NOT MATCHED THEN INSERT (<column_names>) VALUES (src.<column_names>)	transactional
MergeDomainSpecificStagingProcessor	DELETE FROM <live_table_name> WHERE (<primary_keys_of_table>) IN (SELECT <primary_keys_of_table> FROM <source_deletion_table_in_editing_system> WHERE (<column_name_of_DOMAINID>=<domainid_of_current_component>)) and DELETE FROM <source_deletion_table_in_editing_system> WHERE (<primary_keys_of_table>) IN (SELECT <primary_keys_of_table> FROM <live_deletion_table>)	MERGE INTO <live_table_name> dst USING (SELECT s.* FROM <source_table_in_editing_system> s LEFT OUTER JOIN <live_table_name> d ON (s.<comparison_key_of_table>=d.<comparison_key_of_table>) WHERE (<column_name_of_DOMAINID>=<domainid_of_current_component>) AND (d.lastmodified IS NULL OR d.lastmodified<s.lastmodified)) src ON (s.<comparison_key_of_table>=d.<comparison_key_of_table>) WHEN MATCHED THEN UPDATE SET dst.<assigned_column_names>=src.<assigned_column_names> WHEN NOT MATCHED THEN INSERT (<column_names>) VALUES (src.<column_names>)	transactional

Staging Decorators

Staging processor decorators add special functionality to a staging processor. All staging processor decorators are derived from the abstract class StagingProcessorDecorator, which itself is derived from BasicStagingProcessor. It is possible to use more than one staging processor decorator for a staging processor.

Like the staging processors themselves, the staging processor decorators are specific to the content type (file system content or database content).
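How several decorators can wrap one processor is sketched below using the classic decorator pattern. Whether ICM chains its decorators by delegation in exactly this way is an assumption; all class names are hypothetical, and the two sample decorators only loosely mirror the behavior described for the constraint and index decorators later in this section.

```python
class BasicProcessorSketch:
    """Hypothetical base class with one pre, run, and post hook per phase."""

    def on_pre(self, phase, log):
        pass

    def on_run(self, phase, log):
        pass

    def on_post(self, phase, log):
        pass


class CopyProcessorSketch(BasicProcessorSketch):
    def on_run(self, phase, log):
        log.append(f"processor:{phase}")


class DecoratorSketch(BasicProcessorSketch):
    """Forwards every hook to the wrapped processor or decorator."""

    def __init__(self, delegate):
        self.delegate = delegate

    def on_pre(self, phase, log):
        self.delegate.on_pre(phase, log)

    def on_run(self, phase, log):
        self.delegate.on_run(phase, log)

    def on_post(self, phase, log):
        self.delegate.on_post(phase, log)


class ConstraintToggleSketch(DecoratorSketch):
    def on_pre(self, phase, log):
        if phase == "Synchronization":
            log.append("disable constraints")
        super().on_pre(phase, log)

    def on_post(self, phase, log):
        super().on_post(phase, log)
        if phase == "Replication":
            log.append("enable constraints")


class IndexToggleSketch(DecoratorSketch):
    def on_pre(self, phase, log):
        if phase == "Synchronization":
            log.append("mark indexes unusable")
        super().on_pre(phase, log)

    def on_post(self, phase, log):
        super().on_post(phase, log)
        if phase == "Replication":
            log.append("rebuild indexes")


def run(processor, phases):
    log = []
    for phase in phases:
        processor.on_pre(phase, log)
        processor.on_run(phase, log)
        processor.on_post(phase, log)
    return log
```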

Staging Processor Decorators for File System Content

The base class for all staging processor decorators handling file system content is the abstract class StagingProcessorDecorator.

Staging processor decorators for file system content add functionality to extend the pure transportation of files provided by the FileSystemStagingProcessor classes. This can include a reload of replicated files in the target system(s).

Default File System Staging Processor Decorator Classes

ICM provides the following file system staging processor decorator classes:

RefreshLocalizationsDecorator
The RefreshLocalizationsDecorator is provided by cartridge core and is used to refresh the localization set in target system(s) for domains included in the current staging process (onPreRefreshCacheHook).

RefreshSearchIndexesDecorator
The RefreshSearchIndexesDecorator is provided by cartridge bc_search and is used to reload the search indexes in target system(s) for domains included in the current staging process (onPreRefreshCacheHook).

RuleRepositoryStagingProcessorDecorator
The RuleRepositoryStagingProcessorDecorator is provided by cartridge bc_ruleengine and is used to reload the rules in target system(s) for domains included in the current staging process, after the rules have been transferred (onPreRefreshCacheHook).

ABTestStatisticsStagingProcessorDecorator
The ABTestStatisticsStagingProcessorDecorator is provided by cartridge bc_marketing and is used to create new, empty ABTestStatistics for new ABTestGroups after ABTests have been transferred (onPostPublicationHook). ABTestStatistics are not part of the replication process, so that live and editing systems keep separate statistics.

Staging Processor Decorators for Database Content

The base class for all staging processor decorators handling database content is the abstract class DatabaseStagingProcessorDecorator, which itself is derived from StagingProcessorDecorator.

Database staging processor decorators should be used to handle table statistics, indexes, or constraints. They can also provide the possibility to execute additional database queries before or after a staging is done.

Default Database Staging Processor Decorator Classes

ICM provides the following database staging processor decorator classes:

AnalyzeTablesDecorator
The AnalyzeTablesDecorator is provided by cartridge core and used to analyze tables of the source and target systems during the replication process. It collects statistical data about the tables (for the Oracle Cost Based Optimizer - CBO). In the source system, tables are analyzed by the onPrePreparationHook, in the target system by the onPostReplicationHook.

DisableConstraintsDecorator
The DisableConstraintsDecorator is provided by cartridge core and is used to disable all constraints on shadow tables of the target system before the synchronization phase starts (onPreSynchronizationHook). After the replication phase, the constraints will be enabled again, depending on the staging.live.enable_foreignkeys property in staging.properties (onPostReplicationHook for constraints in then-shadow tables, onPostPublicationHook for foreign keys referencing then-live tables).

RebuildIndexesStagingProcessorDecorator
The RebuildIndexesStagingProcessorDecorator is provided by cartridge core and rebuilds all indexes in the onPostReplicationHook, which refer to tables that are assigned to the selected staging processor and belong to the given staging components.

UnusableIndexesStagingProcessorDecorator
The UnusableIndexesStagingProcessorDecorator is provided by cartridge core and sets all indexes of shadow tables that are assigned to the staging processor referenced by this decorator to unusable (onPreSynchronizationHook).
Note
This decorator requires the RebuildIndexesStagingProcessorDecorator described above.

RemoveCatalogDecorator
The RemoveCatalogDecorator is provided by cartridge bc_mvc and is used to mark the catalog domains as deleted that are removed by the replication process (onPostPublicationHook). In case of an Undo replication process, the restored catalog domains are enabled again.

ExecuteQueryDecorator
The ExecuteQueryDecorator is provided by cartridge core. It relies on the FullStagingProcessor switching the $1 and $2 tables in the publication phase. In addition, it uses the ICM Query Framework to execute query files in each staging hook to perform the replication.
The staging queries to be executed have to follow the syntax requirements of the Query Framework and have to reside in the directory queries/staging. By convention, they have to be named following the schema <tablename>_<hookname>, with hookname being:
```
"on[Pre|Post|Error][Preparation|Synchronization|Replication|Publication|RefreshCache]Hook.query"
```
E.g.: PRODUCT_onErrorReplicationHook.query.
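The naming convention can be spelled out with a short sketch that enumerates the hook names allowed by the pattern above and builds a query file name. It assumes that table name and hook name are joined by a single underscore; the functions hook_names and query_file_name are hypothetical helpers, not part of the Query Framework.

```python
from itertools import product

PREFIXES = ("Pre", "Post", "Error")
PHASES = (
    "Preparation",
    "Synchronization",
    "Replication",
    "Publication",
    "RefreshCache",
)


def hook_names():
    """All hook names matching on[Pre|Post|Error][<Phase>]Hook."""
    return [f"on{prefix}{phase}Hook" for prefix, phase in product(PREFIXES, PHASES)]


def query_file_name(table, hook):
    """Build <tablename>_<hookname>.query for a hook allowed by the pattern."""
    if hook not in hook_names():
        raise ValueError(f"not a hook name covered by the pattern: {hook}")
    return f"{table}_{hook}.query"
```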

ShippingRuleEngineStagingProcessorDecorator
The ShippingRuleEngineStagingProcessorDecorator is provided by cartridge bc_shipping and is used to reload the shipping rules in target system(s) for domains included in the current staging process, after the rules of cartridge bc_ruleengine have been transferred (onPostRefreshCacheHook).

Configuring Staging Processors

Staging processors are configured in the global staging configuration file staging.properties, which is located in $IS_CLUSTER_CONFIG (default=/intershop/system-conf/cluster).

Each staging processor configuration entry consists of

A key that represents the staging processor
The key is used later on to assign the staging processor to a staging group (see Assignment of Staging Groups to Replication Groups).

The staging processor class
You can use custom processor classes or any of the staging processor classes discussed in Staging Processors for File System Content and Staging Processors for Database Content.

The (optional) staging processor decorators to execute with the staging processor
You can use custom decorator classes or any of the processor decorator classes discussed in Staging Decorators.

For a detailed description see the section Replication Configuration.

Communication Between Replication Systems

As stated before, mass data replication is started by the editing system, which informs each assigned target system about a new replication process. Each target system then pulls the announced data from the editing system.

Command Flow Direction

Communication between the application servers of source and target system(s) is based on a web service (SOAP) and HTTP. The direction of the command communication flow is from source to target system: the source system sends SOAP requests to the Web server of the target system(s), which then forwards these requests to an application server belonging to the server group configured to handle Replication.

Data Flow Direction

File system content in data replication is retrieved by the target system via HTTP from the source system.

To allow the database content to be replicated from the editing system to the target system(s), an additional communication channel connects the database schema of the source system with the database schema of each involved target system. Here, two basic replication scenarios can be distinguished:

Local Database Replication
The local database replication scenario assumes a single database instance both for the source and the target system. Source and target system use different users/schemata, with both users/schemata working on separate table sets.
Remote Database Replication
In a remote database replication scenario, two different database instances are used, possibly residing on different hosts.

Note

Local data replication can have significant performance advantages over remote data replication. Use local data replication whenever possible.

In case of remote database replication, the connection is enabled by means of a database link from the target to the source system.
In case target and source system use database schemata in the same database instance (local database replication), the source system can grant access to certain tables to the target system.

Authentication of the Source System in a Target System: StagingIdentification

A special identification mechanism prevents the target system from performing data replication tasks triggered by other systems than the source system.

After getting a SOAP call from the naming service in the target system, the following steps are performed in order to uniquely identify the source system:

In the source system, the createPermissionID() method in the StagingMgr is called.
The StagingMgr thus creates a new UUID and inserts it into the STAGINGIDENTIFICATION table of the source system using JDBC.
Subsequently, an arbitrary method of the StagingService is called in the target system, whereby this UUID is passed as a parameter.
In the target system, the StagingService calls the checkPermissionID() method in the StagingMgr, providing the UUID.
The StagingMgr of the target system searches for this UUID in the STAGINGIDENTIFICATION table in the source system database, using either the database link or direct access to the source database schema.

If the StagingMgr finds the UUID, it accepts the call, as the source system is now unambiguously identified. After processing the call, the target system's StagingMgr deletes the UUID from the STAGINGIDENTIFICATION table in the source system database, again using either the database link or direct access to the source database schema.
If the UUID is not found, the StagingMgr denies the access and throws an IdentificationException.
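The one-time identification can be sketched as follows. The sketch replaces the STAGINGIDENTIFICATION table by a Python set that both sides can reach, which stands in for the access via database link or shared schema; the class and method names are hypothetical, and IdentificationError merely mirrors the IdentificationException mentioned above.

```python
import uuid


class IdentificationError(Exception):
    """Raised when a call carries an unknown permission ID."""


class SourceSystemSketch:
    def __init__(self):
        # Stand-in for the STAGINGIDENTIFICATION table of the source system.
        self.staging_identification = set()

    def create_permission_id(self):
        permission_id = str(uuid.uuid4())
        self.staging_identification.add(permission_id)
        return permission_id


class TargetSystemSketch:
    def __init__(self, source_table):
        # The target reads the source table directly (database link or grant).
        self.source_table = source_table

    def handle_call(self, permission_id, action):
        if permission_id not in self.source_table:
            raise IdentificationError("unknown permission ID")
        try:
            return action()
        finally:
            # The ID is removed after processing, so it cannot be reused.
            self.source_table.discard(permission_id)
```

Because only the source system can write to its own table, a caller that presents a valid ID has proven that it is the source system.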

System Preparation for Data Replication

Before database content or file system directories can be replicated, some preparations are required by the staging processors in order to create the environment. Preparing the environment for data replication is the task of the preparer classes StagingGroupPreparer, StagingEnvironmentPreparer and related preparer classes (e.g., StagingTablePreparer), which are executed on DBInit.

Note

The Staging framework depends on the identical structure of the tables to be replicated. Moreover, it depends on identical UUIDs of all domains in the database and of all staging configuration.

Note

There is no automatic process which initially copies the database content from the editing system to the target system(s). Instead, it is the task of the installation and deployment process to align the databases.

The easiest and highly recommended way to ensure this is to execute a DBInit in the editing system, then to export the database with ant export (in the editing system) and to import the resulting database dump in the target system(s) using ant import.

Another way would be to use DBMigrate on both the editing and the target systems. In this case, all relevant UUIDs would have to be predefined.

The following subsection describes the default preparers provided with ICM. Please refer to the corresponding JavaDoc and the configuration examples in the respective Cookbook for more detailed information on configuring these preparers.

Preparers to Create the Database Staging Configuration - DBInit

The StagingGroupPreparer (DBInit)

The StagingGroupPreparer class is the first preparer class called when preparing the database for data replication. It prepares all staging groups, staging tables, staging materialized views and staging directories and stores their configuration data in the corresponding STAGINGGROUP, STAGINGTABLE, STAGINGMVIEW and STAGINGDIRECTORY tables. Prepared staging groups can then be used by the pipeline ProcessReplicationGroupAssignment when assigning data replication groups to data replication tasks (see Assignment of Staging Groups to Replication Groups).

Note

The StagingGroupPreparer has to be executed before the StagingEnvironmentPreparer is executed.

To prepare staging groups, the StagingGroupPreparer usually uses the property files StagingGroup.properties (Staging Group preparation) and StagingGroupInformation.properties (Staging Processor - to - Staging Group assignment), which are part of the sub-package dbinit.data.staging (included in the dbinit.jar) of each cartridge.

The ResourceAssignmentPreparer (DBInit)

The Staging framework uses the Locking framework to ensure exclusive access to the affected resources (i.e., database tables, files) during a replication process. Thus, it prevents inconsistent data caused by jobs, imports, etc. running in parallel.

Staging resource assignments are usually defined in ResourceAssignments.properties. They map staging groups (the key) onto one or more resource definitions of the locking framework (value).

The StagingEnvironmentPreparer (DBInit)

The StagingEnvironmentPreparer creates the environment (such as special database tables or views) which is necessary to replicate database tables.

The StagingEnvironmentPreparer

Reads all configured staging groups of the current cartridge
Gets the assigned staging processor for each staging group
Gets the StagingTablePreparer associated with each staging processor.

The retrieved staging table preparer actually creates the necessary database structures. Since staging processors may impose different requirements on their database environment, each staging processor invokes its own StagingTablePreparer.

Figure: StagingEnvironmentPreparer and StagingTablePreparer

Database Environment for Full Replication

The following figure shows the environment that the StagingTablePreparer creates for database tables (foobar and foobar_AV in the sample below) that are replicated via full replication (see Full Replication Processors).

The preparer:

Renames the source table to the live table, e.g., table foobar$1.
Creates a synonym for the live table, e.g., synonym foobar.
Copies the live table structure to the shadow table, e.g., table foobar$2.
Creates a source view, e.g., view foobar$S.

The resulting database structure is shown here:

Figure: Database Environment: Full Replication

The created database objects and their purpose are:

Object	Purpose
TABLE	Database tables contain the actual data (live table foobar$1, shadow table foobar$2).
SYNONYM	Table data are accessed by the Java application servers via synonyms (synonym foobar).
VIEW	Views give the staging process access to the table content in its corresponding domain context, even if the accessed object itself does not have a domain ID. For example, the view foobar_AV$S joins the synonyms foobar_AV and foobar to get the domain ID from the table foobar$1.
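
As a minimal sketch (not ICM code), the naming convention of the foobar example can be written down as follows. The class FullReplicationNames and its methods are hypothetical; only the suffixes $1, $2, and $S and the synonym name are taken from the description above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that derives the object names of the example above; not ICM code. */
final class FullReplicationNames
{
    private final String table;

    FullReplicationNames(String table)
    {
        this.table = table;
    }

    String liveTable()   { return table + "$1"; } // live table
    String shadowTable() { return table + "$2"; } // shadow table with the same structure
    String synonym()     { return table; }        // name used by the application servers
    String sourceView()  { return table + "$S"; } // view used by the staging process

    /** The four preparation steps in the order listed above, as plain descriptions. */
    List<String> preparationSteps()
    {
        return Arrays.asList(
            "rename table " + table + " to " + liveTable(),
            "create synonym " + synonym() + " for " + liveTable(),
            "create table " + shadowTable() + " with the structure of " + liveTable(),
            "create view " + sourceView());
    }

    public static void main(String[] args)
    {
        for (String step : new FullReplicationNames("foobar").preparationSteps())
        {
            System.out.println(step);
        }
    }
}
```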

Database Environment for Delta Replication

For tables replicated via delta replication, a more complex environment is required, due to the change tracking mechanism used. Changes are tracked in each staging table using a time frame defined by the last successful staging process and the current time. Inserts and updates are detected by the values in the LASTMODIFIED column in each staging table.

Note

Each persistent object is responsible for setting the LASTMODIFIED column on inserts and updates. If the persistent object is generated using jGen, this functionality is created automatically.

Deletions are tracked using a deletion trigger and a special deletion table. The deletion trigger and deletion table for each $1 and $2 table are created by the DeletionTrackingStagingTablePreparer (see the figure below). The deletion table stores:

The primary key and domain identifier of the source table, and
The LASTMODIFIED column containing the deletion time.

The deletion trigger establishes the deletion tracking mechanism by copying the primary key and domain identifier from the source table into the deletion table and setting the LASTMODIFIED column to the current database date.
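
The change tracking described above can be illustrated with a small in-memory sketch. It is not ICM code: the class DeltaTrackingSketch, the row layout, and the use of plain numbers as timestamps are hypothetical, and the exact boundaries of the time frame (here: after the last staging process, up to and including the current time) are an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory illustration of LASTMODIFIED-based change tracking; not ICM code. */
final class DeltaTrackingSketch
{
    /** Hypothetical row layout: primary key, domain ID, and LASTMODIFIED timestamp. */
    static final class Row
    {
        final String key;
        final String domainId;
        final long lastModified;

        Row(String key, String domainId, long lastModified)
        {
            this.key = key;
            this.domainId = domainId;
            this.lastModified = lastModified;
        }
    }

    private final Map<String, Row> table = new TreeMap<>();         // stands in for the staging table
    private final Map<String, Row> deletionTable = new TreeMap<>(); // stands in for the deletion table

    /** Insert or update: the persistent object sets LASTMODIFIED. */
    void upsert(String key, String domainId, long now)
    {
        table.put(key, new Row(key, domainId, now));
    }

    /** Delete: mimics the trigger by copying key and domain ID and stamping the deletion time. */
    void delete(String key, long now)
    {
        Row removed = table.remove(key);
        if (removed != null)
        {
            deletionTable.put(key, new Row(removed.key, removed.domainId, now));
        }
    }

    List<String> changedKeys(long lastStaging, long now)
    {
        return keysInWindow(table, lastStaging, now);
    }

    List<String> deletedKeys(long lastStaging, long now)
    {
        return keysInWindow(deletionTable, lastStaging, now);
    }

    /** ASSUMPTION: the time frame is (lastStaging, now]; the real boundary handling may differ. */
    private static List<String> keysInWindow(Map<String, Row> rows, long lastStaging, long now)
    {
        List<String> keys = new ArrayList<>();
        for (Row row : rows.values())
        {
            if (row.lastModified > lastStaging && row.lastModified <= now)
            {
                keys.add(row.key);
            }
        }
        return keys;
    }

    public static void main(String[] args)
    {
        DeltaTrackingSketch tracking = new DeltaTrackingSketch();
        tracking.upsert("p1", "d1", 10);
        tracking.upsert("p2", "d1", 20);
        tracking.delete("p2", 40);
        System.out.println("changed: " + tracking.changedKeys(15, 50));
        System.out.println("deleted: " + tracking.deletedKeys(15, 50));
    }
}
```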

To prepare tables for delta replication, the preparer:

Renames the source table to the live table, e.g., table foobar$1.
Creates a synonym for the live table, e.g., synonym foobar.
Copies the live table structure to the shadow table, e.g., table foobar$2.
Creates deletion tracking for the live table, e.g., deletion trigger T$foobar$1 and deletion table D$foobar$1.
Creates deletion tracking for the shadow table, e.g., deletion trigger T$foobar$2 and deletion table D$foobar$2.
Creates a live deletion synonym for the live deletion table, e.g., synonym D$foobar.
Creates a source view, e.g., view foobar$S.
Creates a source deletion view, e.g., view D$foobar$S.

In this way, the preparer creates the structure shown in the next figure.

Figure: Database Environment: Delta Replication

Preparers to Change the Database Staging Configuration - DBMigrate

Change Staging Group Configurations

AddStagingGroupsInformationPreparer
This preparer adds or updates the staging processor configuration, the (optionally) assigned domain, and the localized staging group information (display name, description) of staging groups that already exist in the database.

Note

This preparer does not add new staging groups.

Note

When changing a database StagingProcessor, first remove the staging environment from all staging entities of the staging group (using DeleteStagingEntitiesEnvironmentPreparer), then re-create the staging environment for the new staging processor (using MigrateStagingEnvironment).

Change Staging Group and Staging Entities Configurations

AddStagingGroupsPreparer
This preparer is used to add new staging groups AND the corresponding staging entities of these staging groups.

Note

This preparer does not update existing staging groups.

UpdateStagingGroupsPreparer
This preparer is used to update the attributes (group configuration) AND to re-create the corresponding staging entities of staging groups belonging to the current cartridge.

Note

This preparer cannot add new staging groups.

RemoveStagingGroupsWithEntitiesPreparer
This preparer is used to remove given staging groups AND all their assigned staging entities. Additionally, the replication task assignments and the staging group resource assignments of the respective staging groups are removed.

Note

This preparer does NOT remove the staging environment (i.e., the $1, $2, $S, etc.) from the staging tables to be removed. Call the DeleteStagingEntitiesEnvironmentPreparer beforehand to strip the staging environment from the staging entities to be removed.

Change Staging Entities Configurations

DeleteStagingEntitiesEnvironmentPreparer
This preparer is used to remove the staging environment from staging entities (staging tables).

DeleteStagingEntitiesPreparer
This preparer is used to delete existing staging entities from staging groups.

Note

This preparer does not remove staging groups, even if the staging group would become empty.

UpdateStagingEntitiesPreparer
This preparer is used to add or update staging entities of a single staging group.

Note

This preparer can neither add new staging groups nor remove staging entities.

Change Staging Resource Assignment

AddResourceAssignmentsPreparer
This preparer is used to add additional resource assignments to staging groups.

Change Staging Environment

MigrateStagingEnvironment
This preparer is used to migrate the staging environment of the current cartridge. It is normally used after staging groups or staging entities have been changed / added.

Replication Configuration

(Mass) Data Replication uses several configuration files:

staging.properties provides basic settings like the system role (edit, live) for the current ICM system, time-out settings, staging processor settings.

replication-clusters.xml provides the communication settings for an editing system to connect to live system(s).

replication.xml defines the replication groups that are usable in the back office. Recurring mass data replication processes can also be defined here.

staging.properties

This file is used in both source (editing) and target (live) systems. Configurable settings are:

Property	Default (Development)	Type	Range	Description
General Settings:
staging.system.type	none	String	editing, live, none	Defines the type of staging system. editing: The system is used to import, add, update, or delete staging content. live: The system is used to process storefront requests; it gets new content from the editing system through the staging process. none: The system does not use staging mechanisms.
staging.WebServerURL	inactive, empty	URL		The web server URL used by staging processes (optional). In the live system, the property configures the URL of the SOAP staging service. In the editing system, it configures the web server from which the files are downloaded. If no value is set, the standard web server URL configured in appserver.properties is used.
Database Communication Settings:
staging.database.connection.factory	com.intershop.beehive.core.capi.staging.DefaultConnectionFactory	String		Defines the database connection factory to be used during the staging process. com.intershop.beehive.core.capi.staging.OracleDriverConnectionFactory: does not use the configured JDBC pool (avoids problems caused by a known Oracle bug when using DB links). com.intershop.beehive.core.capi.staging.DefaultConnectionFactory: uses the standard JDBC pool (e.g., UCP).
Parallelism Section:				These properties should be set in live and editing system. They are used for configuring the parallelism behavior during a staging process.
staging.process.NrOfParallelProcessors	2	Integer		The number of staging processors executed in parallel.
staging.process.EntityParallelism	3	Integer		The number of entities replicated in parallel per staging processor. Note: Currently only the `FullDomainSpecificStagingProcessor` supports this setting.
staging.process.StatementParallelism	1	Integer		The number of parallel threads within the database that execute an SQL statement. Note: See the PARALLEL hint in Oracle (it works only with Oracle Enterprise Edition).
staging.process.MinRowsForStatementParallelism	1000000	Integer		The minimum number of rows a table must have for its content to be replicated with the parallel SQL hints configured in the property above.
Timeout section:				These properties should be set in the live system. If a timeout is reached, the staging process continues its execution and a corresponding error is written to the error log file. Warning: If a timeout is reached, the page cache may contain inconsistent data. Fix: Restart all application servers that did not respond and remove the page cache.
staging.locking.acquisition.timeout	1200 = 20min	Integer		The maximum time the staging process waits for resources (in s).
staging.timeout.cacheRefresh	600 = 10min	Integer		The maximum time the staging process waits for each application server to refresh the cache of the persistence layer (in s).
staging.timeout.switchDirectories	600 = 10min	Integer		The maximum time the staging process waits for each application server to switch its directories (in s).
staging.timeout.waitingForState	7200 = 2h	Integer		The maximum time the staging process waits for a new state during a staging process (in s). Note: This property is also required in the source system.
Staging Processor Configuration Section:				This section contains the configuration of the staging processors. The settings map a staging processor name, as used in the Staging Processor - to - Staging Group assignment (StagingGroupInformation.properties), where it merely serves as an alias, to the implementing staging processor Java class and to the Java class(es) of the assigned staging processor decorator(s). Since these settings are subject to release-specific changes, only the syntax of the assignment is shown here, together with an example taken from the IS 7.0 release. For information on the standard staging processor and decorator classes, see Standard Staging Processors and Staging Decorators. Warning: If these properties contain invalid entries, staging can result in data corruption! Make sure you have understood the documentation before changing these settings!
Syntax:
staging.processor.<StagingProcessorName>.className = <implementingClassInclusiveJavaPackage>
staging.processor.<StagingProcessorName>.decorator.<consecutiveNumbered> = <implementingDecoratorClassInclusiveJavaPackage>
Example: FullDomainSpecificStagingProcessor				Configuration of the database staging processor that transfers domain-specific data (Products, Discounts, etc.).
staging.processor.FullDomainSpecificStagingProcessor.className = com.intershop.beehive.core.capi.staging.process.FullDomainSpecificStagingProcessor
staging.processor.FullDomainSpecificStagingProcessor.decorator.0 = com.intershop.beehive.core.capi.staging.process.AnalyzeTablesDecorator
staging.processor.FullDomainSpecificStagingProcessor.decorator.1 = com.intershop.beehive.core.capi.staging.process.DisableConstraintsDecorator
staging.processor.FullDomainSpecificStagingProcessor.decorator.2 = com.intershop.component.mvc.capi.staging.RemoveCatalogDecorator
staging.processor.FullDomainSpecificStagingProcessor.decorator.3 = com.intershop.beehive.core.capi.staging.process.ExecuteQueryDecorator
Staging index/constraint performance section:				These settings should be set in the target system. They work with `FullDomainSpecificStagingProcessor` or derived classes and add the following features around the existing `insert /*+ append */ ...` statement. Unusable indexes: indexes unusable; insert ...; [alter session force parallel ddl parallel <nr>]; indexes rebuild nologging. Disabled constraints: constraints disable; insert ...; [alter session force parallel ddl parallel <nr>]; constraints enable validate.
staging.process.unusableIndex.rowCountLimit[.TableName]	0	Integer		Sets the global or table-specific limit (table row count) for enabling unusable index processing. The global value can be overridden per table. Note: Write table names in UPPERCASE letters. Hint: Avoid the global setting "enabled every time"; for small tables, handling takes more processing time with this feature enabled than with it disabled. RowCountLimit values: 0 = disabled (default); 1 = enabled every time; >1 = enabled if the table row count is greater than or equal to the value.
Examples:				Default value: `staging.process.unusableIndex.rowCountLimit = 0`; valid for tables with rowcount > 100000: `staging.process.unusableIndex.rowCountLimit = 100000`; valid if the rowcount of table PRODUCT is > 200000: `staging.process.unusableIndex.rowCountLimit.PRODUCT = 200000`
staging.process.unusableIndex.rebuildParallelism	1	Integer		Sets the number of parallel threads within the database that execute an SQL rebuild statement for unusable indexes.
staging.process.disableConstraint.rowCountLimit[.TableName]	see Examples below	Integer resp. String		Sets the global or table-specific limit (table row count) for enabling disable constraint processing. The global value can be overridden per table. Note: Write table names in UPPERCASE letters. Hint: Avoid the global setting "enabled every time"; for small tables, handling takes more processing time with this feature enabled than with it disabled. RowCountLimit values: 0 = disabled (default); 1 = enabled every time; >1 = enabled if the table row count is greater than or equal to the value. Use `${staging.process.unusableIndex.rowCountLimit}` to apply identical rowcount limits (this is the default setting).
Examples:				Default value: `staging.process.disableConstraint.rowCountLimit = ${staging.process.unusableIndex.rowCountLimit}`; valid if the rowcount of table PRODUCT is > 200000: `staging.process.disableConstraint.rowCountLimit.PRODUCT = ${staging.process.unusableIndex.rowCountLimit.PRODUCT}`
staging.process.disableConstraint.enableParallelism	see Description column	Integer resp. String		Sets the number of parallel threads within the database that execute an SQL disable constraint statement. Use `${staging.process.unusableIndex.rebuildParallelism}` to apply identical thread limits (this is the default setting).
staging.contextIndexCreationMode	sync	String	sync, async, disabled	Defines the behavior of the staging process with regard to the creation of context indexes. The global value can be overridden per table. Note: Write table names in UPPERCASE letters. If there is no value or no valid value for a staging table, the general setting is used. If the general setting is not set or not valid, the default 'sync' is used. Valid `contextIndexCreationMode` values (in upper or lower case): sync = synchronous, staging ends after the indexes have been created (default); async = asynchronous, staging ends without waiting for the indexes; disabled = no context index is created.
Examples:				Default value: `staging.contextIndexCreationMode=sync`; valid for table PRODUCT: `staging.contextIndexCreationMode.PRODUCT=async`
Staging Processor Configuration Section, older version:				This section contains the configuration of the staging processors as it was valid in ESL6.5. These settings may no longer be up to date. Since they are subject to release-specific changes, this outdated information will probably be removed soon.
staging.processor.FullDomainSpecificStagingProcessor.className	c.i.b.c.c.s.p.FullDomainSpecificStagingProcessor	String		Configuration of the database staging processor that transfers domain-specific data (Products, Discounts, etc.). The processor replicates only the content of the selected domains during a batch process. The processor class is used to stage tables containing domain-specific content.
staging.processor.FullFastDomainSpecificStagingProcessor.decorator.0	c.i.b.c.c.s.p.AnalyzeTablesDecorator	String		See previous description. This decorator is used to analyze tables of the editing and live system during the staging process. In the editing system, tables are analyzed on the preparation hook; in the live system, on the replication hook.
staging.processor.FullFastDomainSpecificStagingProcessor.decorator.1	c.i.b.c.c.s.p.DisableConstraintsDecorator	String		See previous description. This decorator is used to disable all constraints on shadow tables of the live system before the synchronization starts. After replication, the constraints are enabled again.
staging.processor.FullFastDomainSpecificStagingProcessor.decorator.2	c.i.c.m.c.s.RemoveCatalogDecorator	String		See previous description. This decorator is used to mark catalog domains as deleted that have been removed by the replication process.
staging.processor.FullFastDomainSpecificStagingProcessor.decorator.3	c.i.b.c.c.s.p.ExecuteQueryDecorator	String		This staging processor is based on the full staging processor, which switches the $1 and $2 tables in the publication phase. In addition, it calls query files on each staging hook to perform the replication.
staging.processor.FullStagingProcessor.className	c.i.b.c.c.s.p.FullStagingProcessor	String		Configuration of the database staging processor transferring system content like regional settings, permissions, roles, etc. This processor is used to perform staging processes for tables containing system-wide content.
staging.processor.FullStagingProcessor.decorator.0	c.i.b.c.c.s.p.AnalyzeTablesDecorator	String		This decorator is used to analyze tables of the editing and live system during the staging process. In the editing system, tables are analyzed on the preparation hook; in the live system, on the replication hook.
staging.processor.FullStagingProcessor.decorator.1	c.i.b.c.c.s.p.DisableConstraintsDecorator	String		This decorator is used to disable all constraints on shadow tables of the live system before the synchronization starts. After replication, the constraints are enabled again.
staging.processor.DeltaDomainSpecificStagingProcessor.className	c.i.b.c.c.s.p.MergeDomainSpecificStagingProcessor	String		Configuration of the database staging processor transferring domain-specific content that may be written in the storefront of the live system (like Users). This staging processor is used to replicate content of database tables that are changed in the source as well as in the target system. For this reason, the replication occurs in one huge transaction. It uses the SQL 'MERGE' statement to transfer new and updated content, and deletion tracking with deletion triggers to handle rows removed in the editing system. The 'MERGE' statement has a restriction: it does not work on tables that have a column with a context index. Therefore, only tables with normal indexes are supported.
staging.processor.DeltaDomainSpecificStagingProcessor.decorator.0	c.i.b.c.c.s.p.DisableConstraintsDecorator	String		This decorator is used to disable all constraints on shadow tables of the live system before the synchronization starts. After replication, the constraints are enabled again.
staging.processor.AppendDomainSpecificStagingProcessor.className	c.i.b.c.c.s.p.AppendDomainSpecificStagingProcessor	String		Configuration of the database staging processor transferring domain-specific content that is only appended to the live system content. Old content is neither replicated, deleted, nor changed.
staging.processor.MergeDomainSpecificStagingProcessor.className	c.i.b.c.c.s.p.MergeDomainSpecificStagingProcessor	String		Configuration of the database staging processor transferring domain-specific content that may be written in the storefront of the live system (like Users) and has a lot of rows in the live system. The Undo process is not supported.
staging.processor.FileSystemStagingProcessor.className	c.i.b.c.c.s.p.SimpleFileSystemStagingProcessor	String		Configuration of the file system staging processor transferring simple files (gifs,...).
staging.processor.LocalizationStagingProcessor.className	c.i.b.c.c.s.p.SimpleFileSystemStagingProcessor	String		Configuration of the file system staging processor transferring localization files. It is also based on the file system staging processor.
staging.processor.LocalizationStagingProcessor.decorator.0	c.i.b.c.c.s.p.RefreshLocalizationsDecorator	String		The decorator reloads the localization files in the live system after the localization files have been replicated.
staging.processor.SearchIndexesStagingProcessor.className	c.i.b.c.c.s.p.SimpleFileSystemStagingProcessor	String		Configuration of the file system staging processor transferring search indexes. It is also based on the file system staging processor.
staging.processor.SearchIndexesStagingProcessor.decorator.0	c.i.c.f.c.r.RefreshSearchIndexesDecorator	String		The decorator refreshes the search indexes on each application server in the live system.
staging.processor.MViewStagingProcessor.className	c.i.b.c.c.s.p.MViewStagingProcessor	String		Configuration of the mview staging processor refreshing materialized views that reference affected tables.
staging.processor.RulesStagingProcessor.className	c.i.b.c.c.s.p.FullFastDomainSpecificStagingProcessor	String		Configuration of the database staging processor transferring rules. This processor uses direct path SQL statements to improve performance when replicating huge amounts of data. During replication, the indexes are not maintained. After the replication has finished, all indexes affected by the replication are rebuilt. Furthermore, replicated rules are reloaded in the target system. This staging processor operates in the same way as `FullStagingProcessor`, but uses special SQL statements that disable redo logging in the Oracle database. Note: In case of a database crash, the data inserted by this staging processor is not recoverable because only direct load DML is used.
staging.processor.RulesStagingProcessor.decorator.0	c.i.b.c.c.s.p.AnalyzeTablesDecorator	String		This decorator is used to analyze tables of the editing and live system during the staging process. In the editing system, tables are analyzed on the preparation hook; in the live system, on the replication hook.
staging.processor.RulesStagingProcessor.decorator.1	c.i.b.c.c.s.p.DisableConstraintsDecorator	String		This decorator is used to disable all constraints on shadow tables of the live system before the synchronization starts. After replication, the constraints are enabled again.
staging.processor.RulesStagingProcessor.decorator.2	c.i.b.c.c.s.p.ExecuteQueryDecorator	String		This staging processor is based on the full staging processor, which switches the $1 and $2 tables in the publication phase. In addition, it calls query files on each staging hook to perform the replication.
staging.processor.RulesStagingProcessor.decorator.3	c.i.c.s.c.s.ShippingRuleEngineStagingProcessorDecorator	String		This decorator is used to reload the shipping rules after the rules of cartridge `bc_ruleengine` have been transferred.
staging.objects.chunksize	inactive, 15	Integer		Business Object Replication: If the user plans to replicate a lot of objects (e.g. 10000 products), these objects will be sent in several loops, 15 objects each loop and the cache refresh is started after all objects have been sent and merged. Note Remember that Business Object Replication is only meant for emergency updates of a few objects. If you want to replicate a lot of data use the Mass Data Tasks menu.
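
The per-table override rule described for staging.process.unusableIndex.rowCountLimit can be sketched as follows. This is an illustration under assumptions, not the ICM implementation: the class RowCountLimitSketch and its methods are hypothetical, and treating negative values like 0 (disabled) is a guess.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical resolver for the rowCountLimit semantics described in the table; not ICM code. */
final class RowCountLimitSketch
{
    static final String KEY = "staging.process.unusableIndex.rowCountLimit";

    /** A table-specific entry (UPPERCASE table name) wins over the global one; the default is 0. */
    static long limitFor(Properties props, String tableName)
    {
        String global = props.getProperty(KEY, "0");
        String tableKey = KEY + "." + tableName.toUpperCase(Locale.ROOT);
        return Long.parseLong(props.getProperty(tableKey, global).trim());
    }

    /** 0 = disabled, 1 = enabled every time, >1 = enabled if rowCount >= limit. */
    static boolean isEnabled(long limit, long rowCount)
    {
        if (limit <= 0)
        {
            return false; // ASSUMPTION: negative values are treated like 0
        }
        if (limit == 1)
        {
            return true;
        }
        return rowCount >= limit;
    }

    public static void main(String[] args)
    {
        Properties props = new Properties();
        props.setProperty(KEY, "100000");
        props.setProperty(KEY + ".PRODUCT", "200000");

        long productLimit = limitFor(props, "PRODUCT");
        System.out.println("PRODUCT limit: " + productLimit);
        System.out.println("enabled for 150000 rows: " + isEnabled(productLimit, 150000));
    }
}
```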

Staging Decorator/Processor Configuration via Guice

Staging processors and decorators are no longer declared in staging.properties; they are now defined in code using Guice modules. Customers who use customized staging properties can still use them: the content of the staging properties is used instead of what is defined in the Guice modules.

Priority of Staging Properties and Guice Module Registrations

If staging.properties contains a registration of a processor, all code bindings (for the processor and its decorators) of this processor are ignored. This means that the system does not support a mixed scenario of code and properties for a specific processor. It is possible, however, to declare one processor in Guice (code) and another in properties.

Because the Guice bindings for the decorators of such a processor are deactivated completely, a project can override standard decorators as in previous versions. If a project only needs to add processors or decorators, it is recommended to register them via code. A single processor and its decorators can be overwritten with an entry in the staging properties.
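
The priority rule can be illustrated with a small sketch. It is not the ICM resolution code: the class ProcessorPrioritySketch, the Registration type, and the class names in main are hypothetical. The sketch only shows that a properties entry replaces the complete code registration of a processor, including its decorators, while other processors keep their code bindings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of "properties win over code bindings per processor"; not ICM code. */
final class ProcessorPrioritySketch
{
    /** Simplified registration: implementing class name plus ordered decorator class names. */
    static final class Registration
    {
        final String processorClass;
        final List<String> decoratorClasses;

        Registration(String processorClass, List<String> decoratorClasses)
        {
            this.processorClass = processorClass;
            this.decoratorClasses = decoratorClasses;
        }
    }

    /**
     * Returns the effective registrations by processor name. An entry from the properties
     * replaces the whole code registration of that processor; decorators are never merged.
     */
    static Map<String, Registration> effective(Map<String, Registration> codeBindings,
                    Map<String, Registration> propertyEntries)
    {
        Map<String, Registration> result = new TreeMap<>(codeBindings);
        result.putAll(propertyEntries);
        return result;
    }

    public static void main(String[] args)
    {
        Map<String, Registration> code = new HashMap<>();
        code.put("FullStagingProcessor", new Registration("FullStagingProcessor",
                        Arrays.asList("AnalyzeTablesDecorator", "DisableConstraintsDecorator")));

        Map<String, Registration> props = new HashMap<>();
        // "com.example.CustomProcessor" is a made-up project class.
        props.put("FullStagingProcessor",
                        new Registration("com.example.CustomProcessor", Collections.<String>emptyList()));

        Registration effective = effective(code, props).get("FullStagingProcessor");
        System.out.println(effective.processorClass + " " + effective.decoratorClasses);
    }
}
```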

Examples

CoreStagingModule

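import com.google.inject.AbstractModule;
import com.google.inject.multibindings.MapBinder;

// Imports of the ICM staging processor and decorator factory classes are omitted here.
public class CoreStagingModule extends AbstractModule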
{

    @Override
    protected void configure()
    {
        MapBinder<String, BasicStagingProcessor> processorBinder = MapBinder.newMapBinder(binder(), String.class,
                        BasicStagingProcessor.class);
        MapBinder<String, StagingDecoratorFactory> decoratorBinder = MapBinder
                        .newMapBinder(binder(), String.class, StagingDecoratorFactory.class)
                        .permitDuplicates();

        /*
         * FullDomainSpecificStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "FullDomainSpecificStagingProcessor",
                        FullDomainSpecificStagingProcessor.class, AnalyzeTablesDecoratorFactory.class,
                        DisableConstraintsDecoratorFactory.class, ExecuteQueryDecoratorFactory.class);

        /*
         * FullStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "FullStagingProcessor", FullStagingProcessor.class,
                        AnalyzeTablesDecoratorFactory.class, DisableConstraintsDecoratorFactory.class);

        /*
         * DeltaDomainSpecificStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "DeltaDomainSpecificStagingProcessor",
                        MergeDomainSpecificStagingProcessor.class, DisableConstraintsDecoratorFactory.class);

        /*
         * AppendDomainSpecificStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "AppendDomainSpecificStagingProcessor",
                        AppendDomainSpecificStagingProcessor.class, ExecuteQueryDecoratorFactory.class);

        /*
         * DeleteAppendDomainSpecificStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "DeleteAppendDomainSpecificStagingProcessor",
                        DeleteAppendDomainSpecificStagingProcessor.class, ExecuteQueryDecoratorFactory.class);

        /*
         * MergeDomainSpecificStagingProcessor
         */
        processorBinder.addBinding("MergeDomainSpecificStagingProcessor")
                        .to(MergeDomainSpecificStagingProcessor.class);

        /*
         * MergeDomainSpecificAndQueryStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "MergeDomainSpecificAndQueryStagingProcessor",
                        MergeDomainSpecificStagingProcessor.class, ExecuteQueryDecoratorFactory.class);

        /*
         * TemplateStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "TemplateStagingProcessor", SimpleFileSystemStagingProcessor.class,
                        CompileTemplatesDecoratorFactory.class);

        /*
         * LocalizationStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "LocalizationStagingProcessor",
                        FullDomainSpecificStagingProcessor.class, RefreshLocalizationsDecoratorFactory.class);

        /*
         * PipelineStagingProcessor
         */
        addBinding(processorBinder, decoratorBinder, "PipelineStagingProcessor", SimpleFileSystemStagingProcessor.class,
                        RefreshPipelinesDecoratorFactory.class);

        /*
         * MViewStagingProcessor
         */
        processorBinder.addBinding("MViewStagingProcessor")
                        .to(MViewStagingProcessor.class);
    }

    private void addBinding(final MapBinder<String, BasicStagingProcessor> processorBinder,
                    final MapBinder<String, StagingDecoratorFactory> decoratorBinder, final String name,
                    final Class<? extends BasicStagingProcessor> processor,
                    final Class<? extends StagingDecoratorFactory>... decorators)
    {
        processorBinder.addBinding(name)
                        .to(processor);
        for (Class<? extends StagingDecoratorFactory> decorator : decorators)
        {
            decoratorBinder.addBinding(name)
                            .to(decorator);
        }
    }
}

BcShippingStagingModule

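import com.google.inject.AbstractModule;
import com.google.inject.multibindings.MapBinder;

// Imports of the ICM staging processor and decorator factory classes are omitted here.
public class BcShippingStagingModule extends AbstractModule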
{

    /*
     * This module configures the RulesStagingProcessor.
     *
     * Configuration of the database staging processor transferring rules. This processor uses direct path SQL
     * statements to improve performance when replicating huge amounts of data. During replication, the indexes
     * are not maintained. After the replication has finished, all indexes affected by the replication are
     * rebuilt. Furthermore, replicated rules are reloaded in the target system.
     */
    @Override
    public void configure()
    {
        MapBinder<String, BasicStagingProcessor> processorBinder = MapBinder.newMapBinder(binder(), String.class,
                        BasicStagingProcessor.class);
        processorBinder.addBinding("RulesStagingProcessor")
                        .to(FullDomainSpecificStagingProcessor.class);

        MapBinder<String, StagingDecoratorFactory> stagingBinder = MapBinder.newMapBinder(binder(), String.class,
                        StagingDecoratorFactory.class).permitDuplicates();
        stagingBinder.addBinding("RulesStagingProcessor")
                        .to(AnalyzeTablesDecoratorFactory.class);
        stagingBinder.addBinding("RulesStagingProcessor")
                        .to(DisableConstraintsDecoratorFactory.class);
        stagingBinder.addBinding("RulesStagingProcessor")
                        .to(ExecuteQueryDecoratorFactory.class);
        stagingBinder.addBinding("RulesStagingProcessor")
                        .to(ShippingRuleEngineStagingProcessorDecoratorFactory.class);
    }

}

Factories

Since a decorator needs a processor to decorate, each decorator must have a factory that takes the processor as a parameter. The binding therefore binds the factory, not the decorator itself.
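
The factory idea can be sketched with simplified stand-in types. All names below (SketchProcessor, SketchDecorator, SketchDecoratorFactory, LoggingDecoratorFactory) are hypothetical and are not the ICM interfaces; the real StagingDecoratorFactory signature may look different.

```java
/** Stand-in for a staging processor; hypothetical, not the ICM type. */
interface SketchProcessor
{
    String name();
}

/** Stand-in for a decorator that wraps a processor; hypothetical. */
interface SketchDecorator
{
    String describe();
}

/** The factory receives the processor to decorate; this is the type that would be bound. */
interface SketchDecoratorFactory
{
    SketchDecorator create(SketchProcessor processor);
}

/** Example factory: the decorator it creates keeps a reference to its processor. */
final class LoggingDecoratorFactory implements SketchDecoratorFactory
{
    @Override
    public SketchDecorator create(final SketchProcessor processor)
    {
        return new SketchDecorator()
        {
            @Override
            public String describe()
            {
                return "LoggingDecorator for " + processor.name();
            }
        };
    }

    public static void main(String[] args)
    {
        SketchProcessor processor = () -> "FullStagingProcessor";
        System.out.println(new LoggingDecoratorFactory().create(processor).describe());
    }
}
```

Binding the factory rather than the decorator lets the decorator be created only once the processor instance it belongs to is known.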

replication-clusters.xml

In replication-clusters.xml the communication parameters used for replication are specified. Actually, these settings define the communication infrastructure for both mass data replication and business object replication (fast publishing, e.g., of products).

replication-clusters.xml resides in editing (source) system(s).

TargetSystem and TargetCluster

The ICM supports system setups that can be spatially distributed over multiple data centers, each data center keeping its own database, webservers and appservers with (among others) their own database users and web-URLs. From a physical and IT technical point of view, the ICM systems in all data centers are different systems, but from a business point of view, they may form a logical unit.

For data replication this means that the target of one replication process might be not only one single ICM system but several ICM systems residing in multiple data centers. Therefore, the concept of data replication with one target system as the recipient of replication data was extended to replication target clusters.

A replication target cluster represents the recipient of replication data from a business point of view. Logically one recipient, technically it consists of one or more replication target systems, while a target system represents one (technical) ICM cluster with its own web URL and database user.

A Data Replication Manager will now select one replication target cluster as the target of a data replication process, and under the surface the replication mechanism will have to transfer the data to every target system belonging to the selected target cluster.

Accordingly, the data replication configuration must list the replication target clusters that the respective source system may update. It must also record which target systems belong to each target cluster and how these target systems can be reached.

Configuration

The replication-clusters.xml (to be found in /intershop/replication-conf/replication-clusters.xml) defines the communication parameters for both mass data replication and business object replication. It is required in the source (editing) system of a data replication environment.

The XML file structure is defined in replication.xsd.

Some example configurations are shown in the Cookbook - Mass Data Replication - Administration.

Basic Content

The file contains the replication-configuration element, which declares the XSD schema location, and one target clusters list.

basic replication config with target clusters list

<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    xsi:schemaLocation="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication replication.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication">

    <target-clusters>
        ....
    </target-clusters>

</replication-configuration>

The target clusters list keeps one or more target cluster definitions, each identified by an id attribute.

target clusters list with target cluster definitions

    <target-clusters>
        <target-cluster id="Cluster1">
            ..
        </target-cluster>
        ..
        ..
        <target-cluster id="ClusterN">
            ..
        </target-cluster>
    </target-clusters>

Each target cluster configuration has to set an id. A target cluster configuration contains one target systems list.

target cluster definition with target systems list

        <target-cluster id="Cluster42">
            <target-systems>
                ...
            </target-systems>
        </target-cluster>

The target systems list keeps one or more target system definitions, each identified by an id attribute and holding an active attribute.

target systems list with target system definitions

            <target-systems>
                <target-system id="TargetSystem1" active="true">
                    ..
                </target-system>
                ..
                ..
                <target-system id="TargetSystemN" active="false">
                    ..
                 </target-system>
            </target-systems>

Each configuration of a target system owns an id attribute, an active attribute, and a set of connection parameters.
The active attribute can be "true" or "false". It defines whether a target system configuration is used for data replication or not.
The connection parameters involve:
1. The web server URL of the target system.
2. If required, the target system's server group definition (see following explanation).
3. The source system's server group definition.
4. The database access configuration.
If the target system uses URLMapping, then the complete URL to the SOAP servlet (including the server group to be used in the target system) has to be given, according to the settings for intershop.urlmapping.urlPrefix and intershop.urlmapping.servlet.webadapter in appserver.properties of the target server.
Note
In this case you must not set an explicit target server group!

target system definition; Web server URL when using URLMapping

                <target-system id="TargetSystem_with_URLMapping" active="true">
                    <webserver-url>http://ts3.mydomain.com:80/INTERSHOP/servlett/BOS/SOAP</webserver-url>
                    ..
                 </target-system>

If no URLMapping is configured in appserver.properties of the target system, ICM will use default settings for its servlet paths. In this case, provide only the target web server URL consisting of protocol, hostname and port.
Additionally, you have to provide a target server group.

target system definition; Web server URL and target server group when not using URLMapping

                <target-system id="TargetSystem_without_URLMapping" active="true">
                    <webserver-url>http://ts2.mydomain.com:80</webserver-url>
                    <target-server-group>STG</target-server-group>
                    ..
                 </target-system>

Provide the server groups to be used in the source system.

target system definition; Source system server group

                <target-system id="TargetSystem" active="true">
                    ..
                    <source-server-group>BOS</source-server-group>
                    ..
                 </target-system>

Specify the database connection to be used by the target system.
There are two ways for a target system to access the source system's database: database link or direct schema access.
If a database link is to be used, provide the name of a database link that has been created in the target system to access the editing database.

target system definition; Database access via database link

                <target-system id="TargetSystem_using_DBLink" active="true">
                    ..
                    <source-database-link>ISEDITING.world</source-database-link>
                 </target-system>

Alternatively, if the database schemata of the source and target system reside in the same database instance, you may prefer direct schema access from the target schema to the source schema. In this case, provide the name of the target schema. ICM will grant the target database schema the required access rights to the source database schema.

target system definition; Database access via direct schema access

                <target-system id="TargetSystem_using_DirectAccess" active="true">
                    ..
                    <target-database-user>INTERSHOP_LIVE</target-database-user>
                 </target-system>

Note

It is possible to use database access via database link from one, and direct database access from another target system within one target cluster.

Complete Example Configuration

The following example shows a basic replication-clusters.xml with one target cluster and three target systems.

basic replication-clusters.xml

<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    xsi:schemaLocation="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication replication.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication">

    <target-clusters>
        <target-cluster id="Cluster42">
            <target-systems>
                <target-system id="TargetSystem1" active="true">
                    <webserver-url>http://ts1.mydomain.com:80</webserver-url>
                    <source-server-group>BOS</source-server-group>
                    <target-server-group>BOS</target-server-group>
                    <source-database-link>ISEDITING.world</source-database-link>
                </target-system>
                <target-system id="TargetSystem2" active="false">
                    <webserver-url>http://ts2.mydomain.com:80</webserver-url>
                    <source-server-group>BOS</source-server-group>
                    <target-server-group>STG</target-server-group>
                    <target-database-user>INTERSHOP_LIVE</target-database-user>
                </target-system>
                <target-system id="TargetSystem_with_URLMapping" active="true">
                    <webserver-url>http://ts3.mydomain.com:80/INTERSHOP/servlett/BOS/SOAP</webserver-url>
                    <source-server-group>WFS</source-server-group>
                    <source-database-link>ISEDITING.world</source-database-link>
                 </target-system>
            </target-systems>
        </target-cluster>
    </target-clusters>


</replication-configuration>

Explanations:

"Cluster42":
The file contains one cluster definition for the cluster named "Cluster42", which involves three target systems, "TargetSystem1", "TargetSystem2" and "TargetSystem_with_URLMapping".

"TargetSystem1":

Is active, i.e., will be used as a replication target
Uses default URL mapping in target system
Uses server group BOS in both the source and the target system
Uses database link ISEDITING.world, which has to be defined in the target database schema to point to the source schema

"TargetSystem2":

Is inactive, i.e., will not be used as a replication target
Uses default URL mapping in target system
Uses server group BOS in source and server group STG in target system
Uses direct database access to the source schema. The target database schema is named INTERSHOP_LIVE; the system will grant access in the source schema to INTERSHOP_LIVE.

"TargetSystem_with_URLMapping":

Is active, i.e., will be used as a replication target.
Uses changed URL mapping in target system, according to the settings for intershop.urlmapping.urlPrefix and intershop.urlmapping.servlet.webadapter in appserver.properties of the target server, where URLPrefix /INTERSHOP and urlmapping.servlet /servlett (sic!) is used.
Uses server group BOS in target system.
Note
The target server group is defined only as part of the <webserver-url> and is not given as <target-server-group>!
Uses server group WFS in source system.
Uses database link ISEDITING.world, which has to be defined in the target database schema to point to the editing schema.
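
To make the roles of the active attribute and the two database access variants concrete, the following sketch reads a configuration with the JDK's DOM parser and lists the active target systems of one cluster. It is an illustration under assumptions, not the way ICM itself evaluates the file: the class and method names are hypothetical, the sample XML is a shortened variant of the example above (URLs and server groups omitted, "TargetSystem3" invented for the demo), and no XSD validation is performed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ActiveTargetSystems {

    // Namespace as used in the example configuration.
    static final String NS = "http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication";

    // Shortened sample; "TargetSystem3" is a hypothetical addition for this demo.
    static final String SAMPLE =
        "<replication-configuration xmlns=\"" + NS + "\">"
        + "<target-clusters><target-cluster id=\"Cluster42\"><target-systems>"
        + "<target-system id=\"TargetSystem1\" active=\"true\">"
        + "<source-database-link>ISEDITING.world</source-database-link></target-system>"
        + "<target-system id=\"TargetSystem2\" active=\"false\">"
        + "<target-database-user>INTERSHOP_LIVE</target-database-user></target-system>"
        + "<target-system id=\"TargetSystem3\" active=\"true\">"
        + "<target-database-user>INTERSHOP_LIVE</target-database-user></target-system>"
        + "</target-systems></target-cluster></target-clusters>"
        + "</replication-configuration>";

    /** Parses the XML text with a namespace-aware DOM parser. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse replication configuration", e);
        }
    }

    /** Returns "id: access mode" for every target system of the cluster that has active="true". */
    static List<String> activeSystems(String xml, String clusterId) {
        List<String> result = new ArrayList<>();
        NodeList clusters = parse(xml).getElementsByTagNameNS(NS, "target-cluster");
        for (int i = 0; i < clusters.getLength(); i++) {
            Element cluster = (Element) clusters.item(i);
            if (!clusterId.equals(cluster.getAttribute("id"))) {
                continue; // cluster IDs are compared case-sensitively
            }
            NodeList systems = cluster.getElementsByTagNameNS(NS, "target-system");
            for (int j = 0; j < systems.getLength(); j++) {
                Element system = (Element) systems.item(j);
                if (!"true".equals(system.getAttribute("active"))) {
                    continue; // inactive systems are not used as replication targets
                }
                boolean usesLink =
                    system.getElementsByTagNameNS(NS, "source-database-link").getLength() > 0;
                result.add(system.getAttribute("id") + ": "
                    + (usesLink ? "database link" : "direct schema access"));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        activeSystems(SAMPLE, "Cluster42").forEach(System.out::println);
    }
}
```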

replication-clusters.xml in K8s

When deploying ICM to a K8s cluster via the Helm chart, a replication-clusters.xml is automatically created using the values provided in the replication section. This generated file is then mounted within icm-as with role=source at the path /intershop/replication-conf/replication-clusters.xml (the exact location this file is loaded from).

As of icm-as-chart version 1.7.0, the Helm chart supports only 1-step replication (edit + live system).

Note that database links are not supported by Azure SQL MI and Azure SQL Database, so the live-db user only needs access to the edit-db.

replication.xml

Together with replication-clusters.xml, the various replication.xml files contain the configuration for the data replication functionality of ICM. While replication-clusters.xml defines the communication channels for both mass data replication and business object replication (for example, fast publishing of products), replication.xml is only used by mass data replication.
The replication.xml files define the replication groups and their descriptions that can be used in the back office. In addition, mass data replication processes can be predefined in this file.

The structure is defined in the replication.xsd file.

ICM supports 1 to n replication.xml files in total, but only 0 to 1 per cartridge. They are looked up as a classpath resource at <cartridge-jar>/resources/<cartridge-name>/replication/replication.xml (source location: <cartridge-root>/src/main/resources/resources/<cartridge-name>/replication/replication.xml). This allows each cartridge to define/add its own groups, processes and tasks.

There is an additional replication.xml that is located outside of any of the cartridges. Its location can be configured using the intershop.replication.group.configuration configuration property. If this property exists, it must contain either:

A file path pointing to a replication.xml file accessible by the icm-as container, or

A URL pointing to an internal or external replication.xml resource (this can even be the response of a microservice).

All the replication.xml files located in cartridges and the additional replication.xml file are then merged and interpreted as if they were one file.

Basic Structure

The file replication.xml consists of three parts:

The definition of the data replication groups for the data replication manager's back office,
The definition of mass data replication processes, which can be executed by jobs (recurring or single times),
And the definition of the replication tasks, which are referenced by the replication processes.

While the groups section is mandatory, processes and tasks definitions are optional.

The following schema shows the basic structure of replication.xml.

replication.xml, Basic structure

<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    xsi:schemaLocation="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication replication.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication">

    <!--
    In this (mandatory) section all replication groups are defined, that are shown in the Data Replication Manager's backoffice.
    -->

    <groups>
    ...
    </groups>

    <!--
    In this (optional) section all replication processes are specified, that can be replicated by job 'Regular Replication Process' in SLDSystem (manually,
    i.e., single time, or on a regular, i.e., recurring base) .
    -->

    <processes>
    ...
    </processes>

    <!--
    This (optional) section contains all replication tasks, that can be reused by several replication processes. Each referenced
    replication task is created at the beginning of replication process in according enterprise or channel.
    -->

    <tasks>
    ...
    </tasks>

</replication-configuration>

Replication Group Configuration

The following schema shows an excerpt of replication.xml dealing with the replication group definition.

A replication group definition consists of:

A unique replication group ID. This ID is used in the ProcessReplicationGroupAssignment pipelines to determine the required assignment sub pipeline.
A list of business unit types. This list defines in which kind of business units (organization, channel) the replication group is shown to the data replication manager. Currently, the types 20 (organization), 30 (partner channel), and 52 (b2c channel) are supported by default.
A sequence of locale-specific name / description definitions. Each locale definition consists of:
- A locale id. It defines for which locale the replication group name and description are presented,
- A replication group name as to be shown in backoffice, and
- A textual description of the replication group, as to be shown in backoffice.

The example below depicts the replication group "Search Indexes" with configurations for locales "en_US" and "de_DE".

replication.xml, Replication Group structure

<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    ...>
    ...

    <!--
    In this (mandatory) section are defined all replication groups, that are shown in the Data Replication Manager's backoffice.
    -->

    <groups>

      <group id="SEARCH_INDEXES" >
        <business-unit-types>20 30 52</business-unit-types>
        <locale id="en_US">
          <name>Search Indexes</name>
          <description>Search indexes and their configuration, search query definitions (predefined product filters), and search redirects. 
                       Note: The objects group that is indexed, e.g. PRODUCTS and PAGELETS, must be added to avoid inconsistencies.
          </description>
        </locale>
        <locale id="de_DE">
          <name>Suchindizes</name>
          <description>Suchindizes und Indexkonfiguration, vordefinierte Suchanfragen und Such-Redirects. 
                       Achtung: Die Replikationsgruppe, die indizierte Objekte enthält (z.B. PRODUCTS, CATALOG oder PAGELETS), muss ebenfalls
                       repliziert werden.
          </description>
        </locale>
      </group>

      <group id="..." >
      ...
      </group>

    </groups>

    ...
</replication-configuration>

Predefined Mass Data Replication Processes

The following schema shows an excerpt of replication.xml dealing with the (mass data) replication process definition. These process definitions can be read by the job Regular Replication Process in domain SLDSystem to create automated replication processes.

Note

If no predefined replication processes are needed, remove the "processes" section from replication.xml or comment it out.

A replication process definition consists of:

A replication process ID. This ID is used in the job configuration of Replication Process Scheduler as attribute ReplicationProcessID.

Note

Since the job Regular Replication Process can hold only one process ID in its attribute ReplicationProcessID, a separate job must be created (copied) for each replication process that is to be executed by a job.

A replication process type. Valid types for predefined replication processes are:
- Replication
- Publication and
- ReplicationPublication
A process description.
The ID of the target cluster for the replication process. The ID is case-sensitive. It refers to the ID of a target cluster as defined in replication-clusters.xml.
A sequence of replication task references. These references are the IDs of replication task definitions, which have to be defined also in replication.xml (for details see below).

The example below depicts the definition of a replication process "nightly" of type "ReplicationPublication" with attached replication tasks "PrimeTechProducts" and "PrimeTechSpecialsProducts".

replication.xml, Replication process definition

<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    ...>
    ...

    <!--
    In this (optional) section there are specified all replication processes that can be replicated by job 'Regular Replication Process' in SLDSystem (manually,
    i.e., single time, or on a regular, i.e., recurring base) .
    -->

    <processes>
      ...
      <process id="nightly">
        <type>ReplicationPublication</type>
        <description>This process is started every night.</description>
        <target-cluster-id>Cluster42</target-cluster-id>
        <task ref="PrimeTechProducts"/>
        <task ref="PrimeTechSpecialsProducts"/>
      </process>
      ...
    </processes>

    ...
</replication-configuration>

Predefined Mass Data Replication Tasks

The following schema shows an excerpt of replication.xml dealing with the (mass data) replication task definition. These task definitions are referenced by job Regular Replication Process in domain SLDSystem when creating automated replication processes.

Note

If no predefined replication processes / tasks are needed, remove the "processes" and "tasks" sections from replication.xml or comment them out.

A replication task definition consists of:

A replication task ID. This ID is used by the replication process configuration in replication.xml as replication task references.

Note

Task IDs in replication task definitions and task references in replication process definitions are case-sensitive.

An organization (name) whose data, or whose channels' data, has to be replicated (mandatory).

Note

The organization name is case-sensitive and needs to be written like in table DOMAININFORMATION.

A channel (name), for which data have to be replicated (optional; only if channel data have to be replicated).

Note

The channel name is case-sensitive and needs to be written like in table DOMAININFORMATION.

A task description.
A sequence of replication group references (the groups which are required to be part of the replication task). These group references are the IDs of replication group definitions which have to be defined also in replication.xml (for details see above).

The example below depicts the definition of

A replication task "PrimeTechProducts", transferring data of replication group PRODUCTS of the organization "PrimeTech", and
A replication task "PrimeTechSpecialsProducts", transferring data of replication groups CATALOGS and PRODUCTS of the channel "PrimeTechSpecials" of organization "PrimeTech".

replication.xml, Replication task definition

<?xml version="1.0" encoding="UTF-8" ?>
<replication-configuration
    ...>
    ...

    <!--
    This (optional) section contains all replication tasks that can be reused by several replication processes. Each referenced
    replication task is created at the beginning of the replication process in the according enterprise or channel.
    -->

    <tasks>
      ...
      <task id="PrimeTechSpecialsProducts">
        <organization>PrimeTech</organization>
        <channel>PrimeTechSpecials</channel>
        <description>Replicates all products of channel PrimeTechSpecials</description>
        <group ref="CATALOGS"/>
        <group ref="PRODUCTS"/>
      </task>
      <task id="PrimeTechProducts">
        <organization>PrimeTech</organization>
        <description>Replicates all products of organization PrimeTech</description>
        <group ref="PRODUCTS"/>
      </task>
      ...
    </tasks>

    ...
</replication-configuration>
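
Because task references are case-sensitive and must resolve to task definitions in replication.xml, a small consistency check can catch typos before a job runs. The following sketch is one possible way to do this with the JDK's DOM parser. It is not part of ICM: the class and method names are hypothetical, and the sample XML deliberately contains a reference with wrong casing.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class TaskReferenceCheck {

    static final String NS = "http://www.intershop.com/xml/ns/enfinity/6.5.0/core/replication";

    // Reduced sample: the second reference differs from its definition only in case.
    static final String SAMPLE =
        "<replication-configuration xmlns=\"" + NS + "\">"
        + "<processes><process id=\"nightly\">"
        + "<task ref=\"PrimeTechProducts\"/>"
        + "<task ref=\"primetechspecialsproducts\"/>"
        + "</process></processes>"
        + "<tasks>"
        + "<task id=\"PrimeTechProducts\"/>"
        + "<task id=\"PrimeTechSpecialsProducts\"/>"
        + "</tasks>"
        + "</replication-configuration>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse replication.xml", e);
        }
    }

    /** Returns "processId -> taskRef" for every task reference without a matching task definition. */
    static List<String> unresolvedTaskRefs(String xml) {
        NodeList tasks = parse(xml).getElementsByTagNameNS(NS, "task");

        // First pass: collect the IDs of all task definitions (elements carrying an id attribute).
        Set<String> definedIds = new HashSet<>();
        for (int i = 0; i < tasks.getLength(); i++) {
            Element task = (Element) tasks.item(i);
            if (task.hasAttribute("id")) {
                definedIds.add(task.getAttribute("id"));
            }
        }

        // Second pass: check every reference (elements carrying a ref attribute), case-sensitively.
        List<String> unresolved = new ArrayList<>();
        for (int i = 0; i < tasks.getLength(); i++) {
            Element task = (Element) tasks.item(i);
            if (task.hasAttribute("ref") && !definedIds.contains(task.getAttribute("ref"))) {
                String processId = ((Element) task.getParentNode()).getAttribute("id");
                unresolved.add(processId + " -> " + task.getAttribute("ref"));
            }
        }
        return unresolved;
    }

    public static void main(String[] args) {
        System.out.println(unresolvedTaskRefs(SAMPLE));
    }
}
```

The same two-pass approach could be applied to group references inside task definitions.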

For customization aspects, see the according information provided in the Cookbook - Mass Data Replication - Customization and Adaption.

Replication Chains

A data replication system can be configured to serve as both source and target system. Hence, it is possible to set up data replication chains in which content is transferred consecutively across multiple systems (e.g., system A replicates to system B, and then system B replicates to system C).

As a business case example, setting up a data replication chain may be required for test or acceptance systems where data or design changes are tested or approved before they go live.

Note

It is NOT supported to create a replication ring, i.e., a setup in which the last target system in a replication chain serves as source system to replicate data back to the original source system (e.g., system A to system B, then system B to system C, and then system C to system A).

The following figure depicts a data replication chain with 3 stages in a simplified form. For easier understanding it shows only target systems instead of target clusters, since only one system in a target cluster can act as an editing system for the next stage in a replication chain.

Figure: Mass Data Replication: Simplified schema of a data replication chain.

When setting up data replication chains, take care of the following topics:

All systems serving as source systems have to set the property staging.system.type to value editing (e.g., system A and system B in a 3-tier chain).
Only the final target system (or all target systems of the final target cluster) has to set the property staging.system.type to value live (e.g., system C).
It is necessary to set up a database connection from each target system to its preceding source system in the chain (e.g., from system C to system B, and from system B to system A). It does not matter whether a database link, direct access, or both are used in the chain.
DBInit has to be executed in the first source system (A). Then a database dump has to be exported in the first source system and has to be imported into every target system (B and C in the example).
File systems of all target systems need to be up-to-date (e.g., the sites directory).
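
The rules for staging.system.type and the ban on replication rings can be expressed as a small check over a description of the chain. The sketch below is purely illustrative: the map-based chain description and the class and method names are assumptions made for this example and do not correspond to any ICM API or configuration format.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ReplicationChainCheck {

    /**
     * Derives the staging.system.type value for a system: every system that
     * replicates to another system is "editing", a pure target is "live".
     */
    static String stagingSystemType(String system, Map<String, List<String>> replicatesTo) {
        return replicatesTo.getOrDefault(system, List.of()).isEmpty() ? "live" : "editing";
    }

    /** Returns true if following the replication direction leads back to an already visited system. */
    static boolean hasRing(Map<String, List<String>> replicatesTo) {
        Set<String> finished = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String system : replicatesTo.keySet()) {
            if (visit(system, replicatesTo, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; a system seen again while still on the current path closes a ring.
    private static boolean visit(String system, Map<String, List<String>> replicatesTo,
                                 Set<String> finished, Set<String> onPath) {
        if (onPath.contains(system)) {
            return true;
        }
        if (finished.contains(system)) {
            return false;
        }
        onPath.add(system);
        for (String target : replicatesTo.getOrDefault(system, List.of())) {
            if (visit(target, replicatesTo, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(system);
        finished.add(system);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> chain = Map.of("A", List.of("B"), "B", List.of("C"));
        System.out.println("A: " + stagingSystemType("A", chain));
        System.out.println("C: " + stagingSystemType("C", chain));
        System.out.println("ring: " + hasRing(chain));
    }
}
```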

Error Detection, Handling, and Recovery

Basics

Replication processes are intended to be an atomic operation, i.e., they are counted as successfully finished only if they

Have been finished successfully in each target system of the target cluster,
Have been finished successfully for each phase of the process (depending on the process type: preparation, synchronization, replication, publication, cache refresh),
Have been finished successfully for each table and each file or directory that is part of any replication group / staging group involved in the process.

Therefore, whenever and wherever an error occurs during a replication process, the whole replication / staging process is aborted and marked as failed.

Error Tracking / Error Detection

By default, the status and errors of replication and staging processes are written into the PROCESS table. The status of a replication process is displayed in the back office (SLDSystem | Data Replication | Replication Processes | Process detail page).

Additionally, errors in replication and staging processes are tracked via log messages.

Out-of-the-box, there is no additional error notification implemented for data replication, but there is a standard mechanism to call a custom pipeline at special stages of a staging process, which can be used to implement and call a custom notification pipeline at the end of a staging process (StagingProcessCustomization -> OnPreCompletition).

Error Logging for Replication Processes Started via Back Office

Log Files

All information related to a staging process (pipelines TriggerStagingProcess, TriggerPublicationProcess, TriggerUndoProcess in editing, and StagingProcess in target system) is written using log messages.

Database Stored Process States

Replication processes, staging processes, and their staging sub processes store their process states in the PROCESS table. If a replication has failed or is blocked, it can be helpful to check the respective process states in the database. Since PROCESS rows contain a LASTMODIFIED column, the most recent replication or staging process can easily be determined by ordering the rows by LASTMODIFIED.

The following table gives an overview of the occurring process states. To check the most recent process for its process state, execute the following SQL command (replace <ProcessName> by the respective name from the table below):

Process state of latest process

select state from PROCESS where name=<ProcessName> order by LASTMODIFIED desc;

Meaning of table columns:

Column	Meaning
Process type	Process type, where a process state does occur (replication, staging, or staging sub process).
Process name	Name of the process type as it occurs in the PROCESS table.
Process state	Name of the process state as it occurs in the PROCESS table.
State type	Defines if the described state is set in the middle of a running staging process (type process), or if state is set at the end of a staging process (type final).
System	Shows where the described state occurs (source, or target system). States marked with source* are set in source system, but observed by the target systems. They are used to synchronize the target systems (publication, cache refresh). States marked with target* are set in target system, but in the corresponding sub-process in the source system, too. The source system observes all sub-processes for synchronization purposes. For non-error states, the source system counts the aggregated target state as set if each of the target systems has reached the according state. For error states, the source system counts the staging process as failed if at least one target system has set an error state; in such case all other target systems will end their staging process, too, setting it as failed.
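
The aggregation rule described for the System column can be sketched in a few lines. This is a simplified illustration and not ICM code: the class, enum, and method names are hypothetical, error states are recognized only by their name prefix ("Error" or "FatalError"), and a target counts as having reached a state only if it currently reports exactly that state.

```java
import java.util.List;

public class TargetStateAggregation {

    enum Aggregate { REACHED, PENDING, FAILED }

    /** Assumption: staging error states are recognizable by the prefix "Error" or "FatalError". */
    static boolean isErrorState(String state) {
        return state.startsWith("Error") || state.startsWith("FatalError");
    }

    /**
     * Aggregates the states reported by all target systems:
     * FAILED as soon as one target reports an error state,
     * REACHED only if every target reports the expected state,
     * PENDING otherwise.
     */
    static Aggregate aggregate(String expectedState, List<String> targetStates) {
        boolean allReached = true;
        for (String state : targetStates) {
            if (isErrorState(state)) {
                return Aggregate.FAILED;
            }
            if (!state.equals(expectedState)) {
                allReached = false;
            }
        }
        return allReached ? Aggregate.REACHED : Aggregate.PENDING;
    }

    public static void main(String[] args) {
        String expected = "ReplicationSuccessfullyFinished";
        System.out.println(aggregate(expected, List.of(expected, expected)));
        System.out.println(aggregate(expected, List.of(expected, "StartingReplication")));
        System.out.println(aggregate(expected, List.of(expected, "ErrorReplication")));
    }
}
```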

Process type	Process name	Process state	State type	System	Description / Possible error cause
Replication process	'ReplicationProcess'	WAITING	process	source	The replication process is prepared but the execution time is not yet reached.
		CANCELED	final	source	The replication process was canceled in the back office.
		RUNNING	process	source	The replication process is underway.
		COMPLETED	final	source	The replication process has successfully finished.
		FAILED	final	source	The replication process has finished due to errors.

Staging process	'StagingProcess'	ErrorInternal	final	source	Any severe failure when calling the source system's staging pipeline. See also staging.log. Possible causes are missing or wrong settings for the pipeline directory or other parameters. Should not normally occur during system operation. Another possible cause is a mismatch of localization.properties between edit and live system; make sure that localization.properties defines the same lead locale in both systems.
		ErrorExecutingEditingStagingPipeline	final	source	Staging pipeline in editing system cannot be executed. Possibly app server error in source system.
		ErrorNonStagedDomains	final	source	Some replication content references at least one domain that is not part of the current replication process nor exists at least in one target system. Check the replication tasks / groups of current replication process. See also staging.log.
		ErrorNonStagedParentSites	final	source	Some replication content belongs to at least one unit whose parent site is not part of the current replication process nor exists at least in one target system. Check the replication tasks / groups of current replication process.
		ErrorConnectToEditingDB	final	source	The source system cannot create the staging identification token. Check DB connectivity.
		ErrorConnectLiveSystem	final	source	The source system's staging web service cannot connect to a target system. Check web reachability of the target system(s).
		ErrorCreatingLiveStagingProcess	final	source	Failure when copying the staging process to (at least) one target system. See also staging.log. Check web reachability of target systems, check DB in target systems.
		ErrorAcquiringLiveLocks	final	source	Failure when acquiring the locks for staging resources in (at least) one target system. See also staging.log. Check DB / locks in target systems.
		ErrorAcquiringEditingLocks	final	source	Failure / timeout when acquiring the locks for staging resources in source system. See also staging.log. Check DB / locks in source system.
		ErrorInitializingStagingProcessors	final	source	Failure when checking the assignments of staging processors for all staging groups. Check staging processor assignments in staging groups and staging.properties settings.
		ErrorStagingProcessModeNotSupported	final	source	(At least) one staging processor does not support the current replication process type, i.e., the staging process mode. See staging.log, too.
		StartingPreparation	process	source	The preparation phase is starting.
		PreparationSuccessfullyFinished	process	source	The preparation phase finished successfully.
		ErrorPreparation	final	source	The preparation phase finished with an error. See log files for more information.
		FatalErrorPreparation	final	source	Fatal error during error handling in preparation phase. See log files for more information. Check DB accessibility.
		ErrorCallingLivePipeline	final	source	An error occurred while the source system called the staging pipeline in a target system. Check the log files, and check the web reachability of the respective target system.
		StartingSynchronization	process	target	The synchronization phase is starting.
		SynchronizationSuccessfullyFinished	process	target	The synchronization phase finished successfully.
		ErrorSynchronization	final	target *	The synchronization phase finished with an error. See log files for more information.
		FatalErrorSynchronization	final	target *	Fatal error during error handling in synchronization phase. See log files for more information. Check DB accessibility.
		StartingReplication	process	target	The replication phase is starting.
		ReplicationSuccessfullyFinished	process	target	The replication phase finished successfully.
		ErrorReplication	final	target *	The replication phase finished with an error. See log files for more information.
		FatalErrorReplication	final	target *	Fatal error during error handling in replication phase. See log files for more information. Check DB accessibility.
		ReplicationProcessCompleted	final	target *	A staging process of type Replication has successfully finished.
		StartPublication	process	source *	This is a state used to get the target systems in sync before the publication phase can start.
		StartingPublication	process	target	The publication phase is starting.
		PublicationSuccessfullyFinished	process	target	The publication phase finished successfully.
		ErrorPublication	final	target *	The publication phase finished with an error. See log files for more information.
		FatalErrorPublication	final	target *	Fatal error during error handling in publication phase. See log files for more information. Check DB accessibility.
		StartRefreshCache	process	source *	This is a state used to get the target systems in sync before the refresh_cache phase can start.
		StartingRefreshCache	process	target	The refresh_cache phase is starting.
		RefreshCacheSuccessfullyFinished	process	target	The refresh_cache phase finished successfully.
		ErrorRefreshCache	final	target *	The refresh_cache phase finished with an error. See log files for more information.
		FatalErrorRefreshCache	final	target *	Fatal error during error handling in refresh_cache phase. See log files for more information. Check DB accessibility.
		StagingProcessCompleted	final	target *	A staging process of type ReplicationPublication or of type Publication has successfully finished.
		ErrorDeterminingUndoContent	final	source	An error occurred while determining the Undo content. See staging.log file for more information.
		StartingSaveNoneUndoContent	process	target	The sub-step SaveNoneUndoContent of Undo phase is starting.
		SaveNoneUndoContentSuccessfullyFinished	process	target	The sub-step SaveNoneUndoContent of Undo phase finished successfully.
		ErrorSaveNoneUndoContent	final	target *	The sub-step SaveNoneUndoContent of Undo phase finished with an error. See log files for more information.
		FatalErrorSaveNoneUndoContent	final	target *	Fatal error during error handling in sub-step SaveNoneUndoContent of Undo phase. See log files for more information. Check DB accessibility.
		StartingRestoreUndoContent	process	target	The sub-step RestoreUndoContent of Undo phase is starting.
		RestoreUndoContentSuccessfullyFinished	process	target	The sub-step RestoreUndoContent of Undo phase finished successfully.
		ErrorRestoreUndoContent	final	target *	The sub-step RestoreUndoContent of Undo phase finished with an error. See log files for more information.
		FatalErrorRestoreUndoContent	final	target *	Fatal error during error handling in sub-step RestoreUndoContent of Undo phase. See log files for more information. Check DB accessibility.
		StagingUndoCompleted	final	target *	A staging process of type UnDo has successfully finished.
		ErrorUndoStaging	final	target *	A staging process of type UnDo has finished with error(s). See log files for more information.
		ErrorInternalInLiveSystem	final	target *	A severe failure occurred when calling the target system's staging pipeline. See also staging.log. Possible causes are missing or wrong settings for the pipeline directory or other parameters. This should not normally occur during system operation.
		ErrorEditingStagingProcessKilled	final	source	At its start-up time, ICM checks the PROCESS table for staging processes with any non-final state (such a process would be broken due to shutdown or crash of the appservers). If so, this process is set to ErrorEditingStagingProcessKilled in the source system.
		ErrorLiveStagingProcessKilled	final	target *	At its start-up time, ICM checks the PROCESS table for a staging process with any non-final state (this process would be broken due to shutdown or crash of the appservers). If so, this process is set to ErrorLiveStagingProcessKilled in the target system.
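The type column and the start-up rule in the last two rows can be illustrated with a small sketch. It is not ICM code: the state names are copied from the table above, but the dictionary covers only a few of them, and the function names and the record layout are assumptions made purely for illustration.

```python
# Illustrative sketch only, not ICM code. State names are taken from the
# table above; the record layout and function names are hypothetical.

# A small subset of the states, mapped to their type ("process" or "final").
STATE_TYPES = {
    "StartingPreparation": "process",
    "PreparationSuccessfullyFinished": "process",
    "ErrorPreparation": "final",
    "StartingPublication": "process",
    "ErrorPublication": "final",
    "StagingProcessCompleted": "final",
    "ErrorEditingStagingProcessKilled": "final",
    "ErrorLiveStagingProcessKilled": "final",
}


def is_final(state: str) -> bool:
    """Return True if the state ends a staging process."""
    return STATE_TYPES[state] == "final"


def mark_killed_processes(processes: list[dict], system: str) -> list[dict]:
    """Mimic the start-up check: any process left in a non-final state is
    treated as broken and set to the 'killed' state of the given system."""
    killed_state = {
        "source": "ErrorEditingStagingProcessKilled",
        "target": "ErrorLiveStagingProcessKilled",
    }[system]
    return [
        p if is_final(p["state"]) else {**p, "state": killed_state}
        for p in processes
    ]


if __name__ == "__main__":
    rows = [
        {"id": 1, "state": "StagingProcessCompleted"},
        {"id": 2, "state": "StartingPublication"},
    ]
    for row in mark_killed_processes(rows, "target"):
        print(row["id"], row["state"])
```

In this sketch, a process that already reached a final state is left untouched, while a process interrupted in a "process" state is moved to the killed state of the system that performs the check.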

Error Handling / Recovery

If an error occurs during a replication process, both the editing system and the live system(s) keep the data that was active before the failed replication process. In this sense, no data recovery is needed after a replication process has failed.

However, there is one situation in which manual intervention might be needed: If the ICM application server that executes the replication process in a target system crashes at the moment it is performing the synonym switches, the synonyms may already point to the newly filled tables while this information has not yet been written to the database table STAGINGTABLE, which is used as an administration table for staging.

If such a situation occurs, open a SQL prompt as the target system's database user and execute the procedure staging.restore_synonyms.

exec staging.restore_synonyms

Possible Causes of Errors

Malfunctioning web connection between source and target system
The source system uses the web connection that is configured in replication-clusters.xml to inform the target systems about new replication processes.
As the corresponding operating system user in the source system (e.g., isas1), check whether you can access the target system's web address as configured. You can use "telnet <webserver> <port>" or a similar tool.

Malfunctioning web connection between target and source system
The target system needs the web connection to download file content. While starting a replication process, the source system transmits its own web address as defined in appserver.properties or - if defined - in staging.properties.
As the corresponding operating system user in the target system (e.g., isas1), check whether you can access the source system's web address as configured. You can use "telnet <webserver> <port>" or a similar tool.
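Where telnet is not installed, the same reachability check can be scripted for either direction. The following is a minimal sketch that only tests whether a TCP connection can be opened; it does not validate the web server's response. The function name `can_connect` is an assumption for illustration, and host and port must be taken from your own configuration.

```python
# Illustrative sketch: a scripted stand-in for the manual telnet check.
# It only verifies that a TCP connection can be opened to host:port.
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example call (placeholders, replace with the configured address):
#   can_connect("<webserver>", <port>)
```

Run the check as the same operating system user that runs the application server, so that user-specific network restrictions are taken into account.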

Database connection fault (wrong DBLink configuration or broken connection) or forbidden database access from the target to the editing database schema
The target system's database user needs access to the source system's database schema to transfer database content.
As the corresponding operating system user in the target system (e.g., isas1), check whether you can connect to the target system's database schema using SQLPlus and the credentials defined in orm.properties. Then check whether you can access source system data, e.g., by:
select using DBLink
```
select count(1) from product@<source_dblink_name>;
```
or
select using direct schema access
```
select count(1) from <source_schema_name>.product;
```

Errors due to database issues (e.g., ORA errors such as constraint violations, nonexistent tables, table changed but $s view not updated...)
Avoid manual copying of data! Replication can do that for you.
Beware of parallel manual creation of organizations, channels, catalogs in source and target systems! Replication can do that for you.
Use the DBMigrate preparers as described above to change tables or to create new tables. Make sure to predefine UUIDs and to execute the DBMigrate preparers in both the source and target system(s) with identical configuration files.
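One simple way to confirm that two copies of a configuration file are identical is to compare their hashes. The following is a minimal sketch, not part of DBMigrate; the function names are assumptions for illustration, and the paths you pass in depend on where your own configuration files are stored.

```python
# Illustrative sketch: compare two copies of a configuration file by hash.
# Not part of DBMigrate; the paths passed in are up to the caller.
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def same_content(source_copy: Path, target_copy: Path) -> bool:
    """Return True if both files are byte-for-byte identical."""
    return sha256_of(source_copy) == sha256_of(target_copy)
```

If the source and target systems are on different hosts, compute the digest on each host and compare the resulting strings.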

Errors due to file system issues (access rights, disk space, quotas)
Check whether the target system(s) provide enough disk space for index files and file downloads (path $IS_SITES). Check the access rights for the file system. Check whether any active quotas limit file transfer (file system, web download, etc.).
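The disk space and access rights checks can be sketched as follows. This is an illustration under assumptions: it treats IS_SITES as an environment variable, the 10 GiB threshold is an arbitrary example value, and the function name `check_directory` is hypothetical. It does not detect quotas, which must be checked with the tools of the respective file system.

```python
# Illustrative sketch: report free space and access flags for a directory.
# Assumes IS_SITES is available as an environment variable; the threshold
# below is an arbitrary example value. Quotas are not detected.
import os
import shutil


def check_directory(path: str, min_free_bytes: int) -> dict:
    """Return free space and access flags for the given directory."""
    usage = shutil.disk_usage(path)
    return {
        "free_bytes": usage.free,
        "enough_space": usage.free >= min_free_bytes,
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Falls back to the current directory if IS_SITES is not set.
    sites = os.environ.get("IS_SITES", ".")
    print(check_directory(sites, 10 * 1024**3))
```

Run the script as the operating system user of the application server, since access rights are evaluated for the calling user.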

Database crash during publication (synonym switch)
See above for restoration of synonyms.

Disclaimer

The information provided in the Knowledge Base may not be applicable to all systems and situations. Intershop Communications will not be liable to any party for any direct or indirect damages resulting from the use of the Customer Support section of the Intershop Corporate Web site, including, without limitation, any lost profits, business interruption, loss of programs or other data on your information handling system.