Document Properties
Kbid: 28A509
Last Modified: 24-Sep-2020
Added to KB: 17-Nov-2017
Public Access: Everyone
Status: Online
Doc Type: Guidelines, Concepts & Cookbooks
Product:
  • ICM 7.9
  • ICM 7.10

Guide - Deployment Solr Cloud Server

1 Introduction

The Solr server can run in cloud mode (using the new Collections API) or in stand-alone mode (using the Core API as before). The new ICM connector uses the Collections API and therefore requires the Solr server to run in Solr Cloud mode.

The official documentation can be found here:

2 Single Server Installation for Development

2.1 Windows

The following steps have to be executed:

  1. Download the Solr package from an Apache download mirror, see Solr 8
  2. Unpack the package to a directory, e.g., c:\dev\solr.
  3. Start a command line (java environment needs to work; test with java -version) at c:\dev\solr\bin.

  4. Test run the Solr Cloud server (start the Solr server with the cloud example). The questions can be answered with the default values (just hit the <return> key).

    solr start -e cloud

2.2 Linux

The following steps have to be executed:

  1. Install Oracle or Open JDK latest version (see Requirements for further details)
  2. Fetch the Solr package.
    Download the Solr package from an Apache download mirror, see Solr 8.
  3. Unpack the package to a directory, e.g., /opt/solr.
  4. Test run the Solr Cloud server (start the Solr server with the cloud example).
    The questions can be answered with the default values (just hit the Enter key).

    ./solr start -e cloud

2.3 Run Solr Cloud for Development

These steps refer to Windows. On Linux, use the corresponding native script instead.

The following steps have to be executed:

  1. Start the Solr server in the /bin directory of your installation with the following options:

    solr start -c

    This will run a Solr server in cloud mode listening on the default port 8983.

    There are many other options. The following table lists some helpful options:

    Command line option    Example                        Description
    -c                     solr start -c                  run in cloud mode (required)
    -p                     solr start -c -p 8000          listen on the given port instead of the default port
    -s                     solr start -c -s C:\indexes    server directory, to place indexes into a different location
    -help                  solr start -help               shows a list of start options

    The embedded ZooKeeper is started on zkPort = port + 1000 (9983 for the default port 8983).

    You can use solr start -c -p <port> -s <serverDirFromPoint1> -z <zooKeeperHost:zkPort> to connect your instance to a running ZooKeeper instance. The embedded ZooKeeper is not started when the -z option is given. Each server contributing to the same Solr cluster needs its own server directory.

  2. Check the Solr server. Connect to http://localhost:<port>/solr to see the admin overview.
  3. Configure the solrZooKeeperHostList property in the environment.properties of your developer home to point to the running Solr embedded ZooKeeper.

    environment.properties
    #list of ZooKeeper connection urls (separated by ';'), which manage the Solr cluster
    solrZooKeeperHostList = localhost:9983
    #prefix to distinguish solr indexes, within the Solr cluster
    #solrClusterIndexPrefix =

    solrClusterIndexPrefix

    Optional, but if multiple ICM installations use the same Solr Cloud cluster, it is necessary to distinguish them. For more details, see Guide - Deployment Solr Cloud Server#Configuration of IS7.

  4. Stop the Solr server if not required.

    solr stop -p <startPort>
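The multi-node options from step 1 can be sketched as follows; ports, directories, and the second node are illustrative assumptions, not a required setup:

```shell
# start the first node; its embedded ZooKeeper listens on 8983 + 1000 = 9983
solr start -c -p 8983 -s C:\indexes\node1

# start a second node against the first node's embedded ZooKeeper;
# note the separate server directory and the -z option
# (no second embedded ZooKeeper is started)
solr start -c -p 8984 -s C:\indexes\node2 -z localhost:9983
```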

3 Setup and Run a Solr Cloud Cluster for Production

3.1 Introduction

Apache Solr provides the ability to set up a cluster of Solr servers. This combines fault tolerance and high availability and provides distributed indexing, distributed search, automatic load balancing, and fail-over for queries. The central configuration and coordination of the entire cluster is done with Apache ZooKeeper. The installation of ZooKeeper and Solr is well described in the references given below:

In addition, an example setup is described in the next section.

3.2 Simple Solr Cluster Setup

The following section describes a minimal Solr cluster setup of 3 servers. Each server runs a ZooKeeper and a Solr server. The host names of the 3 servers are noted in the example configuration as <host name 1>, <host name 2>, <host name 3>. These are just symbolic names used in this document; replace them with real host names that fit your needs. The setup uses the default locations and performs only a minimal set of configuration steps. You may need to adapt memory and other settings according to your requirements. See Solr Production Setup (6.6 | 8.1) in the Apache Solr Reference Guide.

3.2.1 Install the Java JDK

ZooKeeper as well as Solr require a Java JDK. Solr requires version 1.8 or higher, ZooKeeper a Java version 1.6 or higher. 

  1. Install the JDK for your platform with version 1.8 or higher.

  2. Check that you can run Java and check that you have installed an appropriate version.

    # cd /usr/java/jdk1.8.0/bin
    # ./java -version
    java version "1.8.0_131"
    Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)	

3.2.2 Create User and Group

Create a user and a group for this kind of installation. It is also possible to use an existing user.

groupadd -g 200 solr
useradd -g solr -m -u 200 solr

3.2.3 Install and Configure Apache ZooKeeper

All installation steps are executed by the root user.

  1. Get the recommended ZooKeeper version from the Solr documentation and download it.
  2. Install the release to /opt and create a link for the special version.

    tar -xvf zookeeper-3.5.7.tar.gz -C /opt
    ln -s /opt/zookeeper-3.5.7 /opt/zookeeper
  3. Create a configuration file named zoo.cfg in /opt/zookeeper/conf.

    /opt/zookeeper/conf/zoo.cfg
    rm -f /opt/zookeeper/conf/zoo.cfg
    cat <<EOT >> /opt/zookeeper/conf/zoo.cfg
    tickTime=2000
    dataDir=/var/zookeeper/data
    clientPort=2181
    initLimit=5
    syncLimit=2
    server.1=<host name 1>:2888:3888
    server.2=<host name 2>:2888:3888
    server.3=<host name 3>:2888:3888
    EOT
  4. Create necessary directories.

    mkdir -p /var/zookeeper/data
    mkdir -p /var/zookeeper/logs
  5. Create a myid text file in the ZooKeeper data directory /var/zookeeper/data and put the server ID in that file on a single line. For <host name 1> it would be 1. On the other nodes, use the IDs 2 and 3 respectively, matching the server.<id> entries in zoo.cfg.

    echo "1" >/var/zookeeper/data/myid
  6. Set up the logging by changing the default logging environment. Create /opt/zookeeper/conf/zookeeper-env.sh and set the environment variables.

    zookeeper-env.sh
    rm -f /opt/zookeeper/conf/zookeeper-env.sh
    cat <<EOT >> /opt/zookeeper/conf/zookeeper-env.sh
    JAVA_HOME=/usr/java/jdk1.8.0
    ZOO_LOG_DIR=/var/zookeeper/logs
    ZOO_LOG4J_PROP="WARN, ROLLINGFILE"
    EOT
    
  7. Change the owner of all files and directories.

    chown -R solr:solr /opt/zookeeper
    chown -R solr:solr /var/zookeeper
    chown -R solr:solr /opt/zookeeper-3.5.7
  8. Add the service configuration and enable this service.

    cat <<EOT >> /etc/systemd/system/zookeeper.service
    [Unit]
    Description=Zookeeper Server
    After=network.target
    
    [Service]
    Type=forking
    User=solr
    Group=solr
    SyslogIdentifier=zookeeper
    Restart=always
    RestartSec=0s
    ExecStart=/opt/zookeeper/bin/zkServer.sh start
    ExecStop=/opt/zookeeper/bin/zkServer.sh stop
    ExecReload=/opt/zookeeper/bin/zkServer.sh restart
    
    [Install]
    WantedBy=multi-user.target
    EOT
    systemctl daemon-reload
    systemctl enable zookeeper
    # start zookeeper instance
    systemctl start zookeeper
    # stop the service if necessary
    # systemctl stop zookeeper

Note

Repeat all the above steps beginning with the Java installation on each server node.
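Once all nodes are running, the ensemble state can be checked on each node. This is a sketch; note that newer ZooKeeper versions require the stat command to be whitelisted (4lw.commands.whitelist) before the nc variant works:

```shell
# reports "Mode: leader" or "Mode: follower" for this node
/opt/zookeeper/bin/zkServer.sh status

# alternatively, query the 'stat' admin command over the client port
echo stat | nc <host name 1> 2181
```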

3.2.4 Install and Configure Solr

All installation steps are executed by the root user.

  1. Get the latest Solr 8 .tgz package, extract the installation script from it, and execute the script for the user 'solr':

    tar xzf solr-8.6.0.tgz solr-8.6.0/bin/install_solr_service.sh --strip-components=2
    ./install_solr_service.sh solr-8.6.0.tgz -u solr -n
  2. Change the default configuration of Solr server instance.

    1. Set the ZK_HOST variable of the Solr node to connect to your ZooKeeper ensemble.
      This will automatically put your Solr server instance into Solr cloud mode.

    2. Enable and change the SOLR_LOG_LEVEL to set it to the desired log level.

      Info

      Further logging settings may be changed in /var/solr/log4j.properties.
    3. Edit /etc/default/solr.in.sh, uncomment and change the environment variables as below. Other settings like memory size may be changed as needed.

      /etc/default/solr.in.sh
      cat <<EOT >> /etc/default/solr.in.sh
      SOLR_JAVA_HOME=/usr/java/jdk1.8.0
      ZK_HOST=<host name 1>:2181,<host name 2>:2181,<host name 3>:2181
      SOLR_LOG_LEVEL=WARN
      EOT
    4. In data replication environments, a backup location must be available to all Solr nodes and must be configured as an allowed path via the corresponding system property.

      /etc/default/solr.in.sh
      SOLR_OPTS="$SOLR_OPTS -Dsolr.allowPaths=/mnt/solr_backup"
  3. Set up the Solr server service.

    Setup Solr server service
    cat <<EOT >> /etc/systemd/system/solr.service
    [Unit]
    Description=Apache SOLR
    After=network.target
    
    [Service]
    Type=forking
    User=solr
    Group=solr
    Environment=SOLR_INCLUDE=/etc/default/solr.in.sh
    ExecStart=/opt/solr/bin/solr start
    ExecStop=/opt/solr/bin/solr stop
    Restart=always
    
    [Install]
    WantedBy=multi-user.target
    EOT
    systemctl daemon-reload
    systemctl enable solr
    # start solr instance
    systemctl start solr
    # stop the service if necessary
    # systemctl stop solr

    Remove the old init script '/etc/init.d/solr'.

Note

Repeat the installation on all server nodes.

3.2.5 Verify the Installation

  1. Open the URL http://<host name 1>:8983/solr/#/~cloud?view=tree in a browser.
  2. You should see the active nodes in the tree.
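Alternatively, the cluster state can be queried on the command line via the Collections API; this is a sketch, with the host name as a placeholder:

```shell
# list live nodes and collections; any node of the cluster can answer this
curl "http://<host name 1>:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json"
```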

3.2.6 Data Replication Backup Location

If the Solr server is used for an ICM data replication system, a location for backups at the Solr server is required. This must be a shared storage location that is available on all Solr nodes with the same path. See also Guide - Solr Replication for more information. 

  1. Create a backup directory and export it on the first node:

    mkdir /mnt/solr_backup
    
    vi /etc/exports
    
    /mnt/solr_backup <host name 2>(rw,sync,no_subtree_check)
    /mnt/solr_backup <host name 3>(rw,sync,no_subtree_check)
  2. Mount it on the other Solr nodes:

    mount -t nfs <host name 1>:/mnt/solr_backup /mnt/solr_backup
  3. Clean up the backups of old replication processes regularly.

    # delete the backup directories older than 7 days
    find /mnt/solr_backup -mindepth 1 -maxdepth 1 -daystart -mtime +7 -type d -print -exec rm -rf '{}' \;
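The clean-up can be wrapped in a small script and scheduled via cron. The following is a sketch; the script name, the function name, and the cron schedule are assumptions, not part of the product:

```shell
#!/bin/sh
# cleanup_solr_backups.sh (hypothetical helper): delete backup directories
# below the given location that are older than the given number of days.
cleanup_solr_backups() {
    backup_dir="${1:-/mnt/solr_backup}"
    retention_days="${2:-7}"
    # nothing to do if the backup location is not mounted
    [ -d "$backup_dir" ] || return 0
    find "$backup_dir" -mindepth 1 -maxdepth 1 -daystart \
         -mtime +"$retention_days" -type d -print -exec rm -rf '{}' \;
}

cleanup_solr_backups "$@"
```

A cron entry such as `0 3 * * * solr /usr/local/bin/cleanup_solr_backups.sh` (hypothetical path) would then run the clean-up daily at 03:00.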

3.2.7 Configure the Intershop Commerce Management

Configure the solrZooKeeperHostList and the solrClusterIndexPrefix properties to point to the ZooKeeper ensemble in your deployment, see Recipe - Add Solr Cloud Search Service.

solrZooKeeperHostList
solrZooKeeperHostList=<host name 1>:2181,<host name 2>:2181,<host name 3>:2181

solrClusterIndexPrefix

solrClusterIndexPrefix=myserver-1

Optional, but if multiple ICM installations use the same Solr cloud cluster, it is necessary to distinguish them.

Furthermore, adapt the collectionReplicationFactor property; see Recipe: Configure Intershop 7.

collectionReplicationFactor

solr.collectionReplicationFactor=2

Must be set for production systems.

In data replication systems it is required to configure the backup location of the Solr server.

solr.cloudBackupLocation

solr.cloudBackupLocation=/mnt/solr_backup

4 Configuration of IS7

The following configuration options for connecting to and for configuring Solr Cloud exist in IS7. The first name of each entry is the AppServer deployment configuration property (environment.properties), the second one the Solr adapter cartridge property ($IS_SHARE/system/config/cartridges/ac_solr_cloud.properties).

solrZooKeeperHostList / solr.zooKeeperHostList

  List of ZooKeeper host:port configurations (separated by ';') that hold the Solr cluster.
  Defaults to localhost:9983, the default internal ZooKeeper port of a standard installation.
  Example: mysolrzookeeperserver.com:9983

solrClusterIndexPrefix / solr.clusterIndexPrefix

  A prefix to distinguish Solr collections within the Solr cluster if multiple ICM installations use the same Solr Cloud cluster.
  Defaults to the host's simple name and the instance ID of the installation, e.g., myserver-1.
  Example: cbk-prod-live

solr.SocketTimeout

  The socket timeout in seconds that is set at the HttpClient when waiting for data.
  Example: 30

solr.ConnectionTimeout

  The connection timeout in seconds that is set at the HttpClient when waiting for a connection.
  Example: 15

solr.collectionNumberOfShards

  Solr Cloud supports sharding for load balancing and for distributing large amounts of data.
  Using more than one shard for an index requires a sharding strategy.
  The number of shards is used when creating an index and when staging an index to a target (Live) system.
  Default is 1.
  Example: 1

solr.collectionReplicationFactor

  Solr Cloud supports replicas. Solr replicas are copies of the same index data, used for load balancing and fault tolerance.
  The replicas are distributed among the nodes in the Solr cluster.
  The replication factor should not exceed the number of nodes in the Solr cluster.
  The replication factor is used when creating an index and when staging an index to a target (Live) system.
  Default is 1.
  Example: 2

solr.cloudBackupLocation

  Staging of an index (collection) in Solr Cloud requires:

    1. Backing up the collection in the source (Edit) system
    2. Restoring the backup in the target (Live) system

  This requires a shared file system accessible by all Solr Cloud servers in the Edit Solr cloud as well as in the Live Solr cloud.
  It has nothing to do with the IS7 SFS!
  The Solr cloud backup location is used by the Edit Solr cloud servers to write the backup and by the Live Solr cloud servers to read the backup. The IS7 app servers do not require access to it.

  The Solr cloud backup location is mandatory for IS7 replication/staging systems!
  There is no default value for this configuration property.
  Example: /mnt/solr_backup

solr.removeCollectionAfterReplication

  After staging/replication the old collection is still available as a shadow collection.
  This is good in case you need to undo the replication, because you can then switch the alias back to this collection.
  It is bad with respect to resource usage, however, because the shadow collection is kept in the Solr Cloud until the next replication.
  Possible values: true, false
  Default: false
  Example: false

solr.collectionSuffix1 / solr.collectionSuffix2

  In a replication environment the general collection name is just an alias.
  The actual collection name has a suffix, solr.collectionSuffix1 or solr.collectionSuffix2, e.g., ".1" or ".2".
  During replication the shadow collection is removed, and the backup of the collection from the source system is restored to become the new shadow collection.
  Then the alias is switched from the active collection to the shadow collection.
  Thus, the old shadow collection becomes the new active collection and the old active collection becomes the new shadow collection.
  Default for solr.collectionSuffix1: .1
  Default for solr.collectionSuffix2: .2
  Example: solr.collectionSuffix1=.1
           solr.collectionSuffix2=.2
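The alias switch described for the collection suffixes can be illustrated with the Collections API. This is a sketch only; the host, alias, and collection names are placeholders, and the ICM connector performs these calls itself:

```shell
# after replication, point the alias at the freshly restored collection;
# CREATEALIAS replaces an existing alias of the same name
curl "http://<host name 1>:8983/solr/admin/collections?action=CREATEALIAS&name=myindex&collections=myindex.2"

# list the current aliases to verify the switch
curl "http://<host name 1>:8983/solr/admin/collections?action=LISTALIASES&wt=json"
```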

All properties can be overridden in the ${environment}.properties file. The exact name of this properties file depends on your configured environment (e.g., environment=development). Please see environment.properties for more details.

Disclaimer

The information provided in the Knowledge Base may not be applicable to all systems and situations. Intershop Communications will not be liable to any party for any direct or indirect damages resulting from the use of the Customer Support section of the Intershop Corporate Web site, including, without limitation, any lost profits, business interruption, loss of programs or other data on your information handling system.
