Document Properties
Kbid: 28A509
Last Modified: 24-Sep-2020
Added to KB: 17-Nov-2017
Public Access: Everyone
Status: Online
Doc Type: Guidelines, Concepts & Cookbooks
Product:
  • ICM 7.9
  • ICM 7.10

Guide - Deployment Solr Cloud Server

1 Introduction

The Solr server can run in cloud mode (using the new Collections API) or in stand-alone mode (using the Core API as before). The new ICM connector uses the Collections API and therefore requires the Solr server to run in Solr Cloud mode.

The official documentation can be found here:

2 Single Server Installation for Development

2.1 Windows

The following steps have to be executed:

  1. Download the Solr package from an Apache download mirror, see Solr 8
  2. Unpack the package to a directory, e.g., c:\dev\solr.
  3. Start a command line (java environment needs to work; test with java -version) at c:\dev\solr\bin.

  4. Test run the Solr Cloud server (start the Solr server with the cloud example). The questions can be answered with the default values (just hit the <return> key).

    solr start -e cloud

2.2 Linux

The following steps have to be executed:

  1. Install Oracle or Open JDK latest version (see Requirements for further details)
  2. Fetch the Solr package.
    Download the Solr package from an Apache download mirror, see Solr 8.
  3. Unpack the package to a directory, e.g., /opt/solr.
  4. Test run the Solr Cloud server (start the Solr server with the cloud example).
    The questions can be answered with the default values (just hit the Enter key).

    ./solr start -e cloud

2.3 Run Solr Cloud for Development

These steps refer to Windows. On Linux, use the corresponding native script instead.

The following steps have to be executed:

  1. Start the Solr server in the /bin directory of your installation with the following options:

    solr start -c

    This will run a Solr server in cloud mode listening on the default port 8983.

    There are many other options. The following table lists some helpful options:

    Command line option    Example                        Description
    -c                     solr start -c                  run in cloud mode (required)
    -p                     solr start -c -p 8000          listen on the given port instead of the default port
    -s                     solr start -c -s C:\indexes    server directory, to place indexes into a different location
    -help                  solr start -help               shows a list of start options

    The embedded ZooKeeper is started on zkPort = port + 1000 (9983 for the default port 8983).

    You can use solr start -c -p <port> -s <serverDirFromPoint1> -z <zooKeeperHost:zkPort> to connect your instance to a running ZooKeeper instance. The embedded ZooKeeper is not started when the -z option is given. Each server contributing to the same Solr cluster needs its own server directory.

  2. Check the Solr server. Connect to http://localhost:<port>/solr to see the admin overview.
  3. Configure the solrZooKeeperHostList property in the environment.properties of your developer home to point to the running Solr embedded ZooKeeper.

    environment.properties
    #list of ZooKeeper connection urls (separated by ';'), which manage the Solr cluster
    solrZooKeeperHostList = localhost:9983
    #prefix to distinguish solr indexes, within the Solr cluster
    #solrClusterIndexPrefix =

    solrClusterIndexPrefix

    Optional, but if multiple ICM installations use the same Solr Cloud cluster, it is necessary to distinguish them. For more details, see Guide - Deployment Solr Cloud Server#Configuration of IS7.

  4. Stop the Solr server if not required.

    solr stop -p <startPort>
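The multi-node options from step 1 can be sketched as follows; ports, directories, and the second node are illustrative assumptions, not a required setup:

```shell
# start the first node; its embedded ZooKeeper listens on 8983 + 1000 = 9983
solr start -c -p 8983 -s C:\indexes\node1

# start a second node against the first node's embedded ZooKeeper;
# note the separate server directory and the -z option
# (no second embedded ZooKeeper is started)
solr start -c -p 8984 -s C:\indexes\node2 -z localhost:9983
```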

3 Setup and Run a Solr Cloud Cluster for Production

3.1 Introduction

Apache Solr provides the ability to set up a cluster of Solr servers. This combines fault tolerance and high availability and provides distributed indexing, distributed search, automatic load balancing, and fail-over for queries. The central configuration and coordination of the entire cluster is done with Apache ZooKeeper. The installation of ZooKeeper and Solr is well described in the references given below:

In addition, an example setup is described in the next section.

3.2 Simple Solr Cluster Setup

The following section describes a minimal Solr cluster setup of 3 servers. Each server runs a ZooKeeper and a Solr server. The host names of the 3 servers are noted in the example configuration as <host name 1>, <host name 2>, <host name 3>. These are just symbolic names used in this document; replace them with real host names that fit your needs. The setup uses the default locations and performs only a minimal set of configuration steps. You may need to adapt memory and other settings according to your requirements. See Solr Production Setup (6.6 | 8.1) in the Apache Solr Reference Guide.

3.2.1 Install the Java JDK

ZooKeeper as well as Solr require a Java JDK. Solr requires version 1.8 or higher, ZooKeeper a Java version 1.6 or higher. 

  1. Install the JDK for your platform with version 1.8 or higher.

  2. Check that you can run Java and check that you have installed an appropriate version.

    # cd /usr/java/jdk1.8.0/bin
    # ./java -version
    java version "1.8.0_131"
    Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
    Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)	

3.2.2 Create User and Group

Create a user and a group for this kind of installation. It is also possible to use an existing user.

groupadd -g 200 solr
useradd -g solr -m -u 200 solr

3.2.3 Install and Configure Apache ZooKeeper

All installation steps are executed by the root user.

  1. Get the recommended ZooKeeper version from the Solr documentation and download it.
  2. Install the release to /opt and create a link for the special version.

    tar -xvf zookeeper-3.5.7.tar.gz -C /opt
    ln -s /opt/zookeeper-3.5.7 /opt/zookeeper
  3. Create a configuration file named zoo.cfg in /opt/zookeeper/conf.

    /opt/zookeeper/conf/zoo.cfg
    rm -f /opt/zookeeper/conf/zoo.cfg
    cat <<EOT >> /opt/zookeeper/conf/zoo.cfg
    tickTime=2000
    dataDir=/var/zookeeper/data
    clientPort=2181
    initLimit=5
    syncLimit=2
    server.1=<host name 1>:2888:3888
    server.2=<host name 2>:2888:3888
    server.3=<host name 3>:2888:3888
    EOT
  4. Create necessary directories.

    mkdir -p /var/zookeeper/data
    mkdir -p /var/zookeeper/logs
  5. Create a myid text file in the ZooKeeper data directory /var/zookeeper/data and put the server ID in that file on a single line. For <host name 1> it would be 1. On the other nodes, use the IDs 2 and 3 respectively, matching the server.<id> entries in zoo.cfg.

    echo "1" >/var/zookeeper/data/myid
  6. Set up the logging by changing the default logging environment. Create /opt/zookeeper/conf/zookeeper-env.sh and set the environment variables.

    zookeeper-env.sh
    rm -f /opt/zookeeper/conf/zookeeper-env.sh
    cat <<EOT >> /opt/zookeeper/conf/zookeeper-env.sh
    JAVA_HOME=/usr/java/jdk1.8.0
    ZOO_LOG_DIR=/var/zookeeper/logs
    ZOO_LOG4J_PROP="WARN, ROLLINGFILE"
    EOT
    
  7. Change the owner of all files and directories.

    chown -R solr:solr /opt/zookeeper
    chown -R solr:solr /var/zookeeper
    chown -R solr:solr /opt/zookeeper-3.5.7
  8. Add the service configuration and enable this service.

    cat <<EOT >> /etc/systemd/system/zookeeper.service
    [Unit]
    Description=Zookeeper Server
    After=network.target
    
    [Service]
    Type=forking
    User=solr
    Group=solr
    SyslogIdentifier=zookeeper
    Restart=always
    RestartSec=0s
    ExecStart=/opt/zookeeper/bin/zkServer.sh start
    ExecStop=/opt/zookeeper/bin/zkServer.sh stop
    ExecReload=/opt/zookeeper/bin/zkServer.sh restart
    
    [Install]
    WantedBy=multi-user.target
    EOT
    systemctl daemon-reload
    systemctl enable zookeeper
    # start zookeeper instance
    systemctl start zookeeper
    # stop the service if necessary
    # systemctl stop zookeeper

Note

Repeat all the above steps beginning with the Java installation on each server node.
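Once all nodes are running, the ensemble state can be checked on each node. This is a sketch; note that newer ZooKeeper versions require the stat command to be whitelisted (4lw.commands.whitelist) before the nc variant works:

```shell
# reports "Mode: leader" or "Mode: follower" for this node
/opt/zookeeper/bin/zkServer.sh status

# alternatively, query the 'stat' admin command over the client port
echo stat | nc <host name 1> 2181
```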

3.2.4 Install and Configure Solr

All installation steps are executed by the root user.

  1. Get the latest Solr 8 .tgz package, extract the installation script from it, and execute the script for the user 'solr':

    tar xzf solr-8.6.0.tgz solr-8.6.0/bin/install_solr_service.sh --strip-components=2
    ./install_solr_service.sh solr-8.6.0.tgz -u solr -n
  2. Change the default configuration of Solr server instance.

    1. Set the ZK_HOST variable of the Solr node to connect to your ZooKeeper ensemble.
      This will automatically put your Solr server instance into Solr cloud mode.

    2. Enable and change the SOLR_LOG_LEVEL to set it to the desired log level.

      Info

      Further logging settings may be changed in /var/solr/log4j.properties.
    3. Edit /etc/default/solr.in.sh, uncomment and change the environment variables as below. Other settings like memory size may be changed as needed.

      /etc/default/solr.in.sh
      cat <<EOT >> /etc/default/solr.in.sh
      SOLR_JAVA_HOME=/usr/java/jdk1.8.0
      ZK_HOST=<host name 1>:2181,<host name 2>:2181,<host name 3>:2181
      SOLR_LOG_LEVEL=WARN
      EOT
    4. In data replication environments, a backup location must be available to all Solr nodes and must be configured as an allowed path via the corresponding system property.

      /etc/default/solr.in.sh
      SOLR_OPTS="$SOLR_OPTS -Dsolr.allowPaths=/mnt/solr_backup"
  3. Set up the Solr server service.

    Setup Solr server service
    cat <<EOT >> /etc/systemd/system/solr.service
    [Unit]
    Description=Apache SOLR
    After=network.target
    
    [Service]
    Type=forking
    User=solr
    Group=solr
    Environment=SOLR_INCLUDE=/etc/default/solr.in.sh
    ExecStart=/opt/solr/bin/solr start
    ExecStop=/opt/solr/bin/solr stop
    Restart=always
    
    [Install]
    WantedBy=multi-user.target
    EOT
    systemctl daemon-reload
    systemctl enable solr
    # start solr instance
    systemctl start solr
    # stop the service if necessary
    # systemctl stop solr

    Remove the old init script '/etc/init.d/solr'.

Note

Repeat the installation on all server nodes.

3.2.5 Verify the Installation

  1. Open the URL http://<host name 1>:8983/solr/#/~cloud?view=tree in a browser.
  2. You should see the active nodes in the tree.
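Alternatively, the cluster state can be queried on the command line via the Collections API; this is a sketch, with the host name as a placeholder:

```shell
# list live nodes and collections; any node of the cluster can answer this
curl "http://<host name 1>:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json"
```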

3.2.6 Data Replication Backup Location

If the Solr server is used for an ICM data replication system, a location for backups at the Solr server is required. This must be a shared storage location that is available on all Solr nodes with the same path. See also Guide - Solr Replication for more information. 

  1. Create a backup directory and export it on the first node:

    mkdir /mnt/solr_backup
    
    vi /etc/exports
    
    /mnt/solr_backup <host name 2>(rw,sync,no_subtree_check)
    /mnt/solr_backup <host name 3>(rw,sync,no_subtree_check)
  2. Mount it on the other Solr nodes:

    mount -t nfs <host name 1>:/mnt/solr_backup /mnt/solr_backup
  3. Clean up the backups of old replication processes regularly.

    # delete the backup directories older than 7 days
    find /mnt/solr_backup -mindepth 1 -maxdepth 1 -daystart -mtime +7 -type d -print -exec rm -rf '{}' \;
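The clean-up can be wrapped in a small script and scheduled via cron. The following is a sketch; the script name, the function name, and the cron schedule are assumptions, not part of the product:

```shell
#!/bin/sh
# cleanup_solr_backups.sh (hypothetical helper): delete backup directories
# below the given location that are older than the given number of days.
cleanup_solr_backups() {
    backup_dir="${1:-/mnt/solr_backup}"
    retention_days="${2:-7}"
    # nothing to do if the backup location is not mounted
    [ -d "$backup_dir" ] || return 0
    find "$backup_dir" -mindepth 1 -maxdepth 1 -daystart \
         -mtime +"$retention_days" -type d -print -exec rm -rf '{}' \;
}

cleanup_solr_backups "$@"
```

A cron entry such as `0 3 * * * solr /usr/local/bin/cleanup_solr_backups.sh` (hypothetical path) would then run the clean-up daily at 03:00.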

3.2.7 Configure the Intershop Commerce Management

Configure the solrZooKeeperHostList and the solrClusterIndexPrefix properties to point to the ZooKeeper ensemble in your deployment, see Recipe - Add Solr Cloud Search Service.

solrZooKeeperHostList
solrZooKeeperHostList=<host name 1>:2181,<host name 2>:2181,<host name 3>:2181

solrClusterIndexPrefix

solrClusterIndexPrefix=myserver-1

Optional, but if multiple ICM installations use the same Solr cloud cluster, it is necessary to distinguish them.

Furthermore, adapt the collectionReplicationFactor property; see Recipe: Configure Intershop 7.

collectionReplicationFactor

solr.collectionReplicationFactor=2

Must be set for production systems.

In data replication systems it is required to configure the backup location of the Solr server.

solr.cloudBackupLocation

solr.cloudBackupLocation=/mnt/solr_backup

4 Configuration of IS7

The following configuration options for connecting to and for configuring Solr Cloud exist in IS7. The first name of each entry is the AppServer deployment configuration property (environment.properties), the second one the Solr adapter cartridge property ($IS_SHARE/system/config/cartridges/ac_solr_cloud.properties).

solrZooKeeperHostList / solr.zooKeeperHostList

  List of ZooKeeper host:port configurations (separated by ';') that hold the Solr cluster.
  Defaults to localhost:9983, the default internal ZooKeeper port of a standard installation.
  Example: mysolrzookeeperserver.com:9983

solrClusterIndexPrefix / solr.clusterIndexPrefix

  A prefix to distinguish Solr collections within the Solr cluster if multiple ICM installations use the same Solr Cloud cluster.
  Defaults to the host's simple name and the instance ID of the installation, e.g., myserver-1.
  Example: cbk-prod-live

solr.SocketTimeout

  The socket timeout in seconds that is set at the HttpClient when waiting for data.
  Example: 30

solr.ConnectionTimeout

  The connection timeout in seconds that is set at the HttpClient when waiting for a connection.
  Example: 15

solr.collectionNumberOfShards

  Solr Cloud supports sharding for load balancing and for distributing large amounts of data.
  Using more than one shard for an index requires a sharding strategy.
  The number of shards is used when creating an index and when staging an index to a target (Live) system.
  Default is 1.
  Example: 1

solr.collectionReplicationFactor

  Solr Cloud supports replicas. Solr replicas are copies of the same index data, used for load balancing and fault tolerance.
  The replicas are distributed among the nodes in the Solr cluster.
  The replication factor should not exceed the number of nodes in the Solr cluster.
  The replication factor is used when creating an index and when staging an index to a target (Live) system.
  Default is 1.
  Example: 2

solr.cloudBackupLocation

  Staging of an index (collection) in Solr Cloud requires:

    1. Backing up the collection in the source (Edit) system
    2. Restoring the backup in the target (Live) system

  This requires a shared file system accessible by all Solr Cloud servers in the Edit Solr cloud as well as in the Live Solr cloud.
  It has nothing to do with the IS7 SFS!
  The Solr cloud backup location is used by the Edit Solr cloud servers to write the backup and by the Live Solr cloud servers to read the backup. The IS7 app servers do not require access to it.

  The Solr cloud backup location is mandatory for IS7 replication/staging systems!
  There is no default value for this configuration property.
  Example: /mnt/solr_backup

solr.removeCollectionAfterReplication

  After staging/replication the old collection is still available as a shadow collection.
  This is good in case you need to undo the replication, because you can then switch the alias back to this collection.
  It is bad with respect to resource usage, however, because the shadow collection is kept in the Solr Cloud until the next replication.
  Possible values: true, false
  Default: false
  Example: false

solr.collectionSuffix1 / solr.collectionSuffix2

  In a replication environment the general collection name is just an alias.
  The actual collection name has a suffix, solr.collectionSuffix1 or solr.collectionSuffix2, e.g., ".1" or ".2".
  During replication the shadow collection is removed, and the backup of the collection from the source system is restored to become the new shadow collection.
  Then the alias is switched from the active collection to the shadow collection.
  Thus, the old shadow collection becomes the new active collection and the old active collection becomes the new shadow collection.
  Default for solr.collectionSuffix1: .1
  Default for solr.collectionSuffix2: .2
  Example: solr.collectionSuffix1=.1
           solr.collectionSuffix2=.2
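The alias switch described for the collection suffixes can be illustrated with the Collections API. This is a sketch only; the host, alias, and collection names are placeholders, and the ICM connector performs these calls itself:

```shell
# after replication, point the alias at the freshly restored collection;
# CREATEALIAS replaces an existing alias of the same name
curl "http://<host name 1>:8983/solr/admin/collections?action=CREATEALIAS&name=myindex&collections=myindex.2"

# list the current aliases to verify the switch
curl "http://<host name 1>:8983/solr/admin/collections?action=LISTALIASES&wt=json"
```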

All properties can be overridden in the ${environment}.properties file. The exact name of this properties file depends on your configured environment (e.g., environment=development). Please see environment.properties for more details.

Disclaimer

The information provided in the Knowledge Base may not be applicable to all systems and situations. Intershop Communications will not be liable to any party for any direct or indirect damages resulting from the use of the Customer Support section of the Intershop Corporate Web site, including, without limitation, any lost profits, business interruption, loss of programs or other data on your information handling system.
