Document Properties
Kbid: 29655T
Last Modified: 29-Oct-2020
Added to KB: 02-Oct-2020
Public Access: Everyone
Status: Online
Doc Type: Guidelines, Concepts & Cookbooks
Product:
  • ICM 7.10
  • Intershop CaaS

Concept - CaaS Synchronization Process

1 Introduction

This concept describes how the synchronization between different environments works. A nearly identical data set across environments makes it possible, for example, to run tests under (nearly) production conditions on a lower-tier test environment. The sync process can also help to restore a database or to sync it from another environment.

After the sync, a pseudonymization process ensures that no sensitive personal data is stored in local environments.

1.1 References

1.2 Glossary

Term  Description
DB    Database
DEV   Development Team
INT   Integration environment
PRD   Production environment
SFS   Shared file system
UAT   User acceptance test environment

2 Synchronization Process

In contrast to the replication process, which takes place between live and edit ICM clusters, the synchronization process can take place between the edit or live cluster of two environments, for example from PRD edit cluster to UAT or INT edit cluster. 

The sync always occurs from a "higher" to a "lower" environment. 
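
The direction rule can be expressed as a small guard that a project might run before starting a sync. This is only a sketch: the ranking PRD > UAT > INT is inferred from the examples in this document, and the names `TIER_RANK` and `is_allowed_sync` are hypothetical.

```python
# Assumed tier ranking, inferred from the examples in this document
# (PRD -> UAT/INT, UAT -> INT). Adjust it to the project's actual landscape.
TIER_RANK = {"INT": 1, "UAT": 2, "PRD": 3}


def is_allowed_sync(source: str, target: str) -> bool:
    """Return True only if the sync goes from a higher to a lower environment."""
    return TIER_RANK[source] > TIER_RANK[target]
```

With this ranking, `is_allowed_sync("PRD", "UAT")` is accepted, while a sync towards PRD or between identical environments is rejected.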

The sync process is started manually, because it has to be agreed between all parties. Automatic execution (e.g. once per week) is also conceivable, but must be coordinated individually for each project.


The synchronization consists of two processes:

  • Shared File System Synchronization
    This synchronizes the shared file systems of the environments. During this process, shared file system data such as images and general assets is copied, for example from the PRD file system to the UAT file system. This step is considered optional, since missing images, for example, neither break general functionality nor change the semantic correctness of implemented or newly introduced features.
    However, it may still be useful for testing, as the duration of data replication from EDIT to LIVE on the synchronized environment may give an indication of the behavior and duration of replication on the PRD environment.
  • Database Synchronization
    This backs up the database and restores it in the desired environment. The goal is to have the same customer data, products, service configurations etc. on the UAT/INT environment as on the PRD environment.

The customer can run the synchronization for the INT and UAT environments, but not for PRD.

Info

The time required for the synchronization can vary greatly depending on the data volume. A large number of images in particular increases the time required for SFS synchronization, and a large number of products increases the time required for database synchronization. Also note that the initial synchronization takes longer than subsequent ones.

3 Configuration

The configuration of the two synchronization processes can be done in Jenkins.

3.1 Shared File System Synchronization

The shared file system synchronization can be done in Jenkins via the ICM Shared Filesystem Sync job.

Under Build with Parameters, the following parameters are available:

  • Sync direction (SYNC_DIRECTION)
    For example from PRD to INT
  • Sync directories (SYNC_DIRS)
    Defines the scope of the sync (sites or also system/config/domains)
  • Do Sync (DO_SYNC)
    Define whether an actual sync or just a dry run should be done.
  • Delete Files From Target (DO_DELETE)
    Define whether files on the target that are not present on the source should be deleted.
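
As an alternative to the Jenkins UI, a parameterized job can generally be started through the Jenkins remote access API (`buildWithParameters`). The following sketch only prepares such a request and does not send it. The parameter names are taken from the list above. The host name, the value formats (`PRD_TO_INT`, `sites`), the boolean encoding, and the assumption that the job is addressable under its display name are illustrative and need to be checked against the actual Jenkins instance. Authentication (e.g. an API token) is omitted.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_sfs_sync_request(jenkins_url, sync_direction, sync_dirs,
                           do_sync=False, do_delete=False):
    """Prepare (but do not send) a POST request that starts the SFS sync job."""
    # Job name as used in this document; the real job path may differ.
    job = quote("ICM Shared Filesystem Sync")
    url = f"{jenkins_url.rstrip('/')}/job/{job}/buildWithParameters"
    params = {
        "SYNC_DIRECTION": sync_direction,      # value format is an assumption
        "SYNC_DIRS": sync_dirs,                # value format is an assumption
        "DO_SYNC": str(do_sync).lower(),       # "false" is assumed to mean dry run
        "DO_DELETE": str(do_delete).lower(),   # "false" keeps extra target files
    }
    return Request(url, data=urlencode(params).encode("ascii"), method="POST")


# Defaults are the conservative choice: dry run, nothing deleted on the target.
request = build_sfs_sync_request("https://jenkins.example.com", "PRD_TO_INT", "sites")
```

Keeping `do_sync` and `do_delete` off by default means that a caller has to opt in explicitly before files are changed or removed on the target.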

3.2 Database Synchronization

The synchronization of the database can be done in Jenkins using jobs "ICM DB MSSQL Backup" and "ICM DB MSSQL Restore".

If the ICM DB MSSQL Backup job is not available, a restore can still be performed using ICM DB MSSQL Restore, because point-in-time recovery is available in this case.

3.2.1 Database Backup

The database backup is scheduled to run automatically and regularly at database level, for example every night or during the least busy time of day. If the backup job is available (see above), it can also be triggered manually by clicking Build with Parameters in Jenkins.

3.2.2 Database Restore

The restore either uses a previously created database backup or, as described above, point-in-time recovery to bring the data into another environment. For example, a backup of the UAT edit database can be restored to the INT edit database.

The restoration is done in 5 steps:

  1. Monitoring downtime
    This will shut down the monitoring system, so there are no alerts regarding the following shutdown of the application server nodes. 
  2. Stop application server nodes
  3. GIT project checkout
    A git configuration project including scripts that are used for restoring is checked out. 
  4. MSSQL database restore
    The actual restore happens at this point. When restoring from a PRD database, pseudonymization scripts are run afterwards to disguise all sensitive data.
  5. Start application server nodes
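
The order of the steps above, including the conditional pseudonymization, can be sketched as follows. The function `restore_plan` and the step descriptions are hypothetical. The actual work is done by the Jenkins job and the scripts from the Git configuration project, which are not shown here.

```python
def restore_plan(source_env: str, target_env: str) -> list:
    """Return the ordered restore steps for a given source and target."""
    steps = [
        f"schedule monitoring downtime for {target_env}",
        f"stop application server nodes on {target_env}",
        "check out Git configuration project with restore scripts",
        f"restore MSSQL database from {source_env} to {target_env}",
    ]
    if source_env == "PRD":
        # Pseudonymization runs directly after a restore from PRD,
        # before any application server node is started again.
        steps.append(f"run pseudonymization scripts on {target_env}")
    steps.append(f"start application server nodes on {target_env}")
    return steps


for number, step in enumerate(restore_plan("PRD", "UAT"), start=1):
    print(number, step)
```

Placing the pseudonymization before the node startup reflects the intent that no application server serves unmasked PRD data on the target environment.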

3.2.3 Responsibilities

The operations team takes care of customer data pseudonymization during the synchronization process.

  • The development team must inform the operations team about newly introduced database tables (and related artifacts) that contain customer data, so that the operations team can adapt or extend the pseudonymization process.
  • Accounts that must not be pseudonymized (e.g. for test/QA) and their related data need to be agreed with the operations team, so that a set of specific “whitelisted” accounts still works on UAT after a sync from PRD.
  • E-mails, logins etc. from real customers must not be available on UAT.
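
To illustrate the idea of pseudonymization with whitelisted accounts, the following sketch disguises the login and e-mail of an account record unless the login is on an agreed exemption list. It is not the operations team's actual script, which runs at database level. The record layout, the function name `pseudonymize_account`, the salted hash, and the `example.invalid` address format are all assumptions.

```python
import hashlib


def pseudonymize_account(account: dict, keep_logins: set, salt: str) -> dict:
    """Return a copy with login and e-mail disguised unless the login is exempt."""
    if account["login"] in keep_logins:
        # Whitelisted test/QA account: keep it usable after the sync.
        return dict(account)
    # A salted one-way hash yields a stable but unreadable replacement value.
    digest = hashlib.sha256((salt + account["login"]).encode("utf-8")).hexdigest()
    token = digest[:12]
    return {
        **account,
        "login": f"user_{token}",
        "email": f"user_{token}@example.invalid",
    }


customer = {"id": 7, "login": "jane@shop.example", "email": "jane@shop.example"}
masked = pseudonymize_account(customer, keep_logins={"qa-tester"}, salt="per-sync-secret")
```

Non-sensitive fields such as `id` are left untouched, so references between records stay intact after masking.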

The operations team also deactivates automatically running jobs and schedules to prevent unintended actions such as order exports, for example if the backend target is configured incorrectly (see Limitations below).

Furthermore, the development team is responsible for ensuring that PRD configurations, e.g. regarding PayPal account data, are not identical to those on the UAT environment.

3.2.4 Limitations

PRD configuration data (services, backends, targets etc.) is also synchronized to UAT. This potentially allows UAT to communicate with PRD backends and to execute actions that are not intended for UAT.

Solution:

  • The development team provides a (declarative) configuration set that is automatically injected on application server node startup, e.g. via resource files.
  • This configuration set must depend on the environment/cluster. Property files and other mechanisms make it possible to determine exactly whether the application server node belongs to a PRD, UAT or INT environment and whether it is part of a LIVE or EDIT cluster.
  • This leads to a defined service configuration. If desired, test accounts and related passwords (e.g. for SMC, SLDSystem access) can also be reset this way.
  • The configuration set needs to be reviewed continuously and maintained whenever new channels and/or configurations are introduced, so that it matches the intended configuration for the UAT/INT clusters.
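
The selection logic behind such an environment-dependent configuration set can be sketched as follows. This is not ICM code. The property keys, the example URLs, and the function `startup_overrides` are hypothetical and only show how overrides could be chosen per environment and cluster.

```python
# Hypothetical override sets keyed by (environment, cluster).
OVERRIDES = {
    ("UAT", "EDIT"): {"order.export.enabled": "false",
                      "backend.url": "https://backend.uat.example.invalid"},
    ("UAT", "LIVE"): {"order.export.enabled": "false",
                      "backend.url": "https://backend.uat.example.invalid"},
    ("INT", "EDIT"): {"order.export.enabled": "false",
                      "backend.url": "https://backend.int.example.invalid"},
    ("INT", "LIVE"): {"order.export.enabled": "false",
                      "backend.url": "https://backend.int.example.invalid"},
}


def startup_overrides(environment: str, cluster: str) -> dict:
    """Return the settings to inject when an application server node starts."""
    if environment == "PRD":
        # PRD keeps its own configuration; nothing is overridden.
        return {}
    try:
        return dict(OVERRIDES[(environment, cluster)])
    except KeyError:
        # Fail closed: a node without a defined override set should not start
        # with synchronized PRD settings.
        raise ValueError(f"no override set for {environment}/{cluster}") from None
```

Raising an error for unknown combinations is a deliberate choice in this sketch: a newly added cluster without a maintained override set is detected at startup instead of silently talking to PRD backends.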

4 Frequency

The customer decides how often the synchronization is done. Intershop recommends at least one synchronization at the end of each PRD deployment.

Disclaimer

The information provided in the Knowledge Base may not be applicable to all systems and situations. Intershop Communications will not be liable to any party for any direct or indirect damages resulting from the use of the Customer Support section of the Intershop Corporate Web site, including, without limitation, any lost profits, business interruption, loss of programs or other data on your information handling system.
