Solr Operator Documentation

Solr on Kubernetes on local Mac

This tutorial shows how to set up Solr under Kubernetes on your local Mac. The plan is as follows:

  1. Set up Kubernetes and Dependencies
    1. Set up Docker for Mac with K8s
    2. Install an Ingress Controller to reach the cluster on localhost
  2. Install Solr Operator
  3. Start your Solr cluster
  4. Create a collection and index some documents
  5. Scale from 3 to 5 nodes
    1. Using the Horizontal Pod Autoscaler
  6. Upgrade to newer Solr version
  7. Install Kubernetes Dashboard (optional)
  8. Delete the SolrCloud cluster named ‘example’

Set up Kubernetes and Dependencies

Set up Docker for Mac with K8s

# Install Homebrew, if you don't have it already
/bin/bash -c "$(curl -fsSL \
	https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install Docker Desktop for Mac (use edge version to get latest k8s)
brew install --cask docker

# Enable Kubernetes in Docker Settings, or run the command below:
sed -i '' -e 's/"kubernetesEnabled": false/"kubernetesEnabled": true/g' \
    ~/Library/Group\ Containers/group.com.docker/settings.json

# Start Docker for Mac from Finder, or run the command below
open /Applications/Docker.app

# Install Helm, which we'll use to install the operator, and 'watch'
brew install helm watch
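
Before continuing, it can help to confirm that the tools installed above are on your PATH and that kubectl points at the local cluster. The following is a minimal sketch; the helper name `require_tools` is hypothetical, and the context name `docker-desktop` is assumed to be the one Docker Desktop registers.

```shell
# Report any tool that is missing from PATH; return non-zero if one is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (run once Docker Desktop has finished starting Kubernetes):
#   require_tools docker kubectl helm watch \
#     && kubectl config use-context docker-desktop \
#     && kubectl cluster-info
```

If `kubectl cluster-info` cannot reach the control plane, Kubernetes is most likely still starting or not enabled in the Docker settings.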

Install an Ingress Controller

Kubernetes services are by default only accessible from within the k8s cluster. To make them addressable from our laptop, we’ll add an ingress controller:

# Install the nginx ingress controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/cloud/deploy.yaml

# Verify that the ingress controller is running by visiting the Kubernetes dashboard
# and selecting namespace `ingress-nginx`, or by running this command:
kubectl get all --namespace ingress-nginx

# Edit your /etc/hosts file (`sudo vi /etc/hosts`) and replace the 127.0.0.1 line with:
127.0.0.1	localhost default-example-solrcloud.ing.local.domain ing.local.domain default-example-solrcloud-0.ing.local.domain default-example-solrcloud-1.ing.local.domain default-example-solrcloud-2.ing.local.domain dinghy-ping.localhost

Once we have installed Solr on our k8s cluster, this will allow us to address the nodes locally.
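
The hosts entry above names the common service plus one host per Solr node (0, 1 and 2). As a sketch, the line can be generated for any node count, which is handy if you later run more than three nodes and want to reach each one directly. The function name `solr_hosts_line` and its defaults are assumptions taken from the hostnames used in this tutorial; it only generates the Solr-related names.

```shell
# Print an /etc/hosts line for a SolrCloud with the given number of nodes.
# $1 = node count, $2 = ingress domain, $3 = "<namespace>-<cloud>-solrcloud" prefix
solr_hosts_line() {
  nodes=$1
  domain=${2:-ing.local.domain}
  prefix=${3:-default-example-solrcloud}
  line="127.0.0.1 localhost $prefix.$domain $domain"
  i=0
  while [ "$i" -lt "$nodes" ]; do
    line="$line $prefix-$i.$domain"
    i=$((i + 1))
  done
  printf '%s\n' "$line"
}

# Example: print the line for a 5-node cluster, then paste it into /etc/hosts
#   solr_hosts_line 5
```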

Install the Solr Operator

You can follow along here, or follow the instructions in the Official Helm release.

Now that we have the prerequisites set up, let us install the Solr Operator, which will let us easily manage a large Solr cluster.

First, add the Solr Operator Helm repository. (You should only need to do this once.)

$ helm repo add apache-solr https://solr.apache.org/charts
$ helm repo update

Next, install the Solr Operator chart. Note that this uses Helm v3; to use Helm v2, please consult the Helm Chart documentation. This will install the Zookeeper Operator by default.

# Install the Solr & Zookeeper CRDs
$ kubectl create -f https://solr.apache.org/operator/downloads/crds/v0.8.1/all-with-dependencies.yaml
# Install the Solr operator and Zookeeper Operator
$ helm install solr-operator apache-solr/solr-operator --version 0.8.1

Note that the Helm chart version does not contain a v prefix, which the downloads version does. The Helm chart version is the only part of the Solr Operator release that does not use the v prefix.

After installing, you can check to see what lives in the cluster to make sure that the Solr and ZooKeeper operators have started correctly.

$ kubectl get all

NAME                                                   READY   STATUS             RESTARTS   AGE
pod/solr-operator-8449d4d96f-cmf8p                     1/1     Running            0          47h
pod/solr-operator-zookeeper-operator-674676769c-gd4jr  1/1     Running            0          49d

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/solr-operator                     1/1     1            1           49d
deployment.apps/solr-operator-zookeeper-operator  1/1     1            1           49d

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/solr-operator-8449d4d96f                     1         1         1       2d1h
replicaset.apps/solr-operator-zookeeper-operator-674676769c  1         1         1       49d

After inspecting the status of your Kube cluster, you should see a deployment for the Solr Operator as well as the Zookeeper Operator.

Start an example Solr Cloud cluster

To start a Solr Cloud cluster, we will install the Solr Helm chart, which tells the Solr Operator which version of Solr to run, how many nodes to start, how much memory to give each node, and so on.

# Create a 3-node cluster v8.11.2 with 300m Heap each:
helm install example-solr apache-solr/solr --version 0.8.1 \
  --set image.tag=8.11.2 \
  --set solrOptions.javaMemory="-Xms300m -Xmx300m" \
  --set addressability.external.method=Ingress \
  --set addressability.external.domainName="ing.local.domain" \
  --set addressability.external.useExternalAddress="true" \
  --set ingressOptions.ingressClassName="nginx"

# The solr-operator has created a new resource type 'solrclouds' which we can query
# Check the status live as the deploy happens
kubectl get solrclouds -w

# Open a web browser to see a solr node:
# Note that this is the service level, so will round-robin between the nodes
open "http://default-example-solrcloud.ing.local.domain/solr/#/~cloud?view=nodes"
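
The `--set` flags used for the install can also be kept in a values file, which is easier to review and reuse. The sketch below writes the same settings to a file and shows the matching install command in a comment; the function name `write_solr_values` and the file name `solr-values.yaml` are hypothetical choices, and the keys are copied from the flags above.

```shell
# Write the chart settings from the install command to the file named in $1.
write_solr_values() {
  cat > "$1" <<'EOF'
image:
  tag: "8.11.2"
solrOptions:
  javaMemory: "-Xms300m -Xmx300m"
addressability:
  external:
    method: Ingress
    domainName: "ing.local.domain"
    useExternalAddress: true
ingressOptions:
  ingressClassName: "nginx"
EOF
}

# Example:
#   write_solr_values solr-values.yaml
#   helm install example-solr apache-solr/solr --version 0.8.1 -f solr-values.yaml
```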

Create a collection and index some documents

Create a collection via the Collections API.

# Execute the Collections API command
curl "http://default-example-solrcloud.ing.local.domain/solr/admin/collections?action=CREATE&name=mycoll&numShards=1&replicationFactor=3&maxShardsPerNode=2&collection.configName=_default"

# Check in Admin UI that collection is created
open "http://default-example-solrcloud.ing.local.domain/solr/#/~cloud?view=graph"

Now index some documents into the empty collection.

curl -XPOST -H "Content-Type: application/json" \
    -d '[{id: 1}, {id: 2}, {id: 3}, {id: 4}, {id: 5}, {id: 6}, {id: 7}, {id: 8}]' \
    "http://default-example-solrcloud.ing.local.domain/solr/mycoll/update/"

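Depending on the commit settings of the collection, newly indexed documents may not be searchable right away. As a sketch, the helpers below build a document payload, post it with an explicit `commit=true`, and then ask Solr how many documents it sees. The function names are hypothetical; the collection URL is the one used in this tutorial.

```shell
# Print a JSON array of documents with ids 1..N.
solr_docs() {
  count=$1
  i=1
  printf '['
  while [ "$i" -le "$count" ]; do
    [ "$i" -gt 1 ] && printf ','
    printf '{"id":"%s"}' "$i"
    i=$((i + 1))
  done
  printf ']\n'
}

# Post N generated documents to the collection at $1 and commit immediately.
index_docs() {
  solr_docs "$2" | curl -sS -X POST -H "Content-Type: application/json" \
    --data-binary @- "$1/update?commit=true"
}

# Query the collection at $1 without returning rows; numFound is the doc count.
count_docs() {
  curl -sS "$1/select?q=*:*&rows=0"
}

# Example:
#   base="http://default-example-solrcloud.ing.local.domain/solr/mycoll"
#   index_docs "$base" 8
#   count_docs "$base"
```
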
Scale from 3 to 5 nodes

So we wish to add more capacity. Scaling the cluster is a breeze.

# Issue the scale command
kubectl scale --replicas=5 solrcloud/example

After issuing the scale command, start hitting the “Refresh” button in the Admin UI. You will see how the new Solr nodes are added. You can also watch the status via the kubectl get solrclouds command:

kubectl get solrclouds -w

# Hit Control-C when done
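
Instead of watching by hand, a script can poll until the cluster reports the desired number of ready nodes. This is a minimal sketch: `retry` and `ready_nodes_equal` are hypothetical helpers, and the status field `.status.readyReplicas` is an assumption about the SolrCloud resource, so check it against `kubectl get solrcloud example -o yaml` before relying on it.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Succeed once SolrCloud $1 reports $2 ready nodes.
# Assumption: the ready count is exposed as .status.readyReplicas.
ready_nodes_equal() {
  ready=$(kubectl get solrcloud "$1" -o jsonpath='{.status.readyReplicas}')
  [ "$ready" = "$2" ]
}

# Example: poll every 5 seconds, for up to 60 attempts, until 5 nodes are ready
#   retry 60 5 ready_nodes_equal example 5
```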

Horizontal Pod Autoscaler (HPA)

The SolrCloud CRD is set up so that it can run with the HPA. Use the following when creating an HPA object:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-solr
spec:
  maxReplicas: 6
  minReplicas: 3
  scaleTargetRef:
    apiVersion: solr.apache.org/v1beta1
    kind: SolrCloud
    name: example
  metrics:
    ....

Make sure that you are not overwriting the SolrCloud.Spec.replicas field when doing kubectl apply, otherwise you will be undoing the autoscaler’s work. By default, the helm chart does not set the replicas field, so it is safe to use with the HPA.

Upgrade to newer version

So we wish to upgrade to a newer Solr version:

# Take note of the current version, which is 8.11.2
curl -s http://default-example-solrcloud.ing.local.domain/solr/admin/info/system | grep solr-i

# Update the SolrCloud configuration with the new version, keeping all previous settings and the number of nodes set by the autoscaler.
helm upgrade example-solr apache-solr/solr --version 0.8.1 \
  --reuse-values \
  --set image.tag=8.11.3

# Click the 'Show all details' button in Admin UI and start hitting the "Refresh" button
# See how the operator upgrades one pod at a time. Solr version is in the 'node' column
# You can also watch the status with the 'kubectl get solrclouds' command
kubectl get solrclouds -w

# Hit Control-C when done
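
To confirm the upgrade from the command line, the version can be pulled out of the system info response. The filter below is a sketch that assumes the response contains a `solr-spec-version` key; the function name is hypothetical.

```shell
# Read a system info JSON response on stdin and print the Solr spec version.
solr_spec_version() {
  sed -n 's/.*"solr-spec-version" *: *"\([^"]*\)".*/\1/p'
}

# Example: each call may reach a different node through the common service,
# so repeat it a few times during the rolling upgrade
#   curl -s "http://default-example-solrcloud.ing.local.domain/solr/admin/info/system" \
#     | solr_spec_version
```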

Install Kubernetes Dashboard (optional)

Kubernetes Dashboard is a web interface that gives a better overview of your k8s cluster than command-line tools alone. This step is optional; you don’t need it if you’re comfortable with the CLI.

# Install the Dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.4/aio/deploy/recommended.yaml

# You need to authenticate with the dashboard. Get a token:
kubectl -n kubernetes-dashboard describe secret \
    $(kubectl -n kubernetes-dashboard get secret | grep default-token | awk '{print $1}') \
    | grep "token:" | awk '{print $2}'

# Start kubectl proxy in the background (it will listen on localhost:8001)
kubectl proxy &

# Open a browser to the dashboard (note, this is one long URL)
open "http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#/overview?namespace=default"

# Select 'Token' in the UI and paste the token from last step (starting with 'ey...')

Delete the SolrCloud cluster named ‘example’

kubectl delete solrcloud example
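
Because the cluster in this tutorial was created with `helm install example-solr`, Helm still tracks a release of that name. A possible follow-up, sketched here under that assumption, is to uninstall the release as well. The `run` wrapper and the `DRY_RUN` variable are hypothetical helpers that let you preview the commands first.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Remove the Helm release that created the SolrCloud, then list what is left.
remove_example_release() {
  run helm uninstall example-solr
  run kubectl get solrclouds
}

# Example: preview first, then run for real
#   DRY_RUN=1 remove_example_release
#   remove_example_release
```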