Sisense Linux Multinode: MongoDB, Zookeeper and Data Recovery

vsolodkyi · ‎10-18-2022

Problem:

In rare cases, a Sisense Linux Multinode instance becomes corrupted and the only way forward is reinstallation. This is especially challenging when there is no backup from which to recreate the instance. In addition, the uninstallation parameter remove_user_data: false is not recommended in multinode environments, since most storage types require empty filesystems, so reinstallation will result in total data loss.

In this article, we review methods for recovering data when at least one MongoDB or Zookeeper storage is still active. Recovery is done with Sisense's system recovery service, which can recover the data contained in MongoDB and Zookeeper, including Dashboards, Models, Configuration, Users, etc.

Solution:

First, check if at least one PVC and PV of Zookeeper and/or MongoDB is active:
kubectl -n sisense get pv -owide && kubectl -n sisense get pvc -owide

In the example environment used for this article, all storages are online and active.
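To narrow the PVC listing down to the claims that matter here, a minimal sketch follows. It assumes the default column order of kubectl get pvc (NAME first, STATUS second) and that the claim names contain "mongodb" or "zookeeper"; the function name bound_claims is made up for illustration.

```shell
# Print MongoDB/Zookeeper PVCs whose STATUS column is "Bound".
# Reads `kubectl get pvc` table output on stdin.
# Assumption: claim names contain "mongodb" or "zookeeper".
bound_claims() {
  awk 'NR > 1 && $2 == "Bound" && $1 ~ /mongodb|zookeeper/ { print $1 }'
}

# Usage (on a machine with cluster access):
#   kubectl -n sisense get pvc | bound_claims
```

If the function prints at least one MongoDB or Zookeeper claim, there is a storage to recover from.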

Next, scale the system-recovery deployment so at least one pod is up and running:
kubectl -n sisense scale deployment system-recover --replicas=1

Before accessing the recovery pod, we should scale Mongo and Zookeeper services to 0 to prevent them from making any changes to the filesystems during the backup process:
kubectl -n sisense scale sts sisense-mongodb --replicas=0 && kubectl -n sisense scale sts sisense-zookeeper --replicas=0

After that, we are ready to open a shell in the recovery pod:
kubectl -n sisense exec -it system-recover-*HASH* -- bash

*Where the above command says *HASH*, use the hash of your system-recovery pod, as listed by kubectl -n sisense get pods.*
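Instead of copying the hash by hand, the pod name can be looked up. The sketch below assumes the pod name starts with system-recover-, as in the commands in this article; the helper name recover_pod is hypothetical.

```shell
# Pick the first pod whose name starts with "system-recover-".
# Reads `kubectl get pods -o name` output (lines like "pod/<name>") on stdin.
recover_pod() {
  sed -n 's|^pod/\(system-recover-.*\)$|\1|p' | head -n 1
}

# Usage:
#   POD=$(kubectl -n sisense get pods -o name | recover_pod)
#   kubectl -n sisense exec -it "$POD" -- bash
```

The same $POD variable can then stand in for system-recover-*HASH* in the kubectl cp commands.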

If you have more than one active storage for a service, determine which one is the “biggest”, that is, the one with the most used space. This can be read from the Used column of a df -h command:

bash-5.1# df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                 157.2G     31.4G    118.9G  21% /
tmpfs                    64.0M         0     64.0M   0% /dev
tmpfs                    31.5G         0     31.5G   0% /sys/fs/cgroup
10.50.43.182:vol_69de1298f26cc2b9dd3537eb9c6d8377
                          2.0G    248.5M      1.7G  12% /zookeeper1
10.50.43.182:vol_cf3a66b4b6440764c38d522f50944a80
                         20.0G    588.1M     19.4G   3% /mongodb2
10.50.43.180:vol_34989d737766e75c3ac2e5ea97b00d60
                          2.0G    247.7M      1.7G  12% /zookeeper2
10.50.43.183:vol_5f4d239870c545c9622cfc9d8a8a89b9
                          2.0G    250.1M      1.7G  12% /zookeeper0
10.50.43.183:vol_cbb77d9421ed30efcdeade33e27fdd57
                         70.0G   1007.7M     69.0G   1% /sisense
10.50.43.182:vol_f0e2b464bbc37a8592abe63398292738
                         20.0G    598.9M     19.4G   3% /mongodb0
10.50.43.182:vol_2ba80d1bdace1be776f0a1bfacd7f508
                         20.0G    592.1M     19.4G   3% /mongodb1

In our example, /mongodb0 (598.9M used) and /zookeeper0 (250.1M used) are the biggest. Back them up using the following commands, run from the root directory (/) where these storages are mounted:

tar -czvf mongodb0.tar.gz mongodb0 && tar -czvf zookeeper0.tar.gz zookeeper0
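Picking the biggest storage can also be scripted rather than read off by eye. The following is a sketch only: it assumes df -k is available in the pod (so the Used column is a plain number of 1K blocks) and that the mounts are named /mongodbN and /zookeeperN as in the listing above. The function name largest_mount is invented for this example.

```shell
# Print the mount point with the highest "Used" value among mounts named
# /<prefix><number>. Reads `df -k` output on stdin. Long filesystem names that
# wrap onto their own line are skipped because they have a single field.
largest_mount() {
  awk -v p="$1" '
    NF >= 5 && $NF ~ ("^/" p "[0-9]+$") {
      used = $(NF - 3) + 0            # "Used" is the 4th field from the end
      if (best == "" || used > max) { max = used; best = $NF }
    }
    END { if (best != "") print best }
  '
}

# Usage inside the recovery pod:
#   df -k | largest_mount mongodb
#   df -k | largest_mount zookeeper
```

Counting fields from the end of the line keeps the function working whether or not df wraps the filesystem name onto a separate line.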

Now we can exit the pod and copy the backups from the pod to the instance:

kubectl -n sisense cp system-recover-*HASH*:/mongodb0.tar.gz /home/sisense/mongodb0.tar.gz

kubectl -n sisense cp system-recover-*HASH*:/zookeeper0.tar.gz /home/sisense/zookeeper0.tar.gz

*Where the above commands say *HASH*, use the hash of your system-recovery pod.*
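Because the original storages may be wiped during reinstallation, it is worth confirming that the copied archives are intact before going further. A minimal sketch, assuming tar with gzip support on the host; archive_ok is a hypothetical helper name:

```shell
# Succeed only if the file exists, is non-empty, and can be listed from start
# to end as a gzip-compressed tar archive (a truncated copy fails the listing).
archive_ok() {
  [ -s "$1" ] && tar -tzf "$1" > /dev/null 2>&1
}

# Usage on the host after the copy:
#   archive_ok /home/sisense/mongodb0.tar.gz   && echo "mongodb0 archive OK"
#   archive_ok /home/sisense/zookeeper0.tar.gz && echo "zookeeper0 archive OK"
```

This only checks that the archive is readable; it does not validate the MongoDB or Zookeeper data inside it.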

After that, we are ready to reinstall the system or to move the backups to a new system. Both are accomplished using similar steps.

In the new system, scale the system-recovery deployment so that at least one pod is up and running. Scale the MongoDB/Zookeeper pods to 0 to prevent them from making any changes to the filesystems during the restoration process:

kubectl -n sisense scale deployment system-recover --replicas=1 && kubectl -n sisense scale sts sisense-mongodb --replicas=0 && kubectl -n sisense scale sts sisense-zookeeper --replicas=0

Copy the backups to the recovery pod:

kubectl -n sisense cp mongodb0.tar.gz system-recover-*HASH*:/mongodb0.tar.gz

kubectl -n sisense cp zookeeper0.tar.gz system-recover-*HASH*:/zookeeper0.tar.gz

*Where the above commands say *HASH*, use the hash of your system-recovery pod.*

Next, open a shell in the recovery pod:

kubectl -n sisense exec -it system-recover-*HASH* -- bash

*Where the above command says *HASH*, use the hash of your system-recovery pod.*

Inside the pod, remove the data from the MongoDB and Zookeeper storages using the command below:
rm -rf mongodb0/* mongodb1/* mongodb2/* zookeeper0/* zookeeper1/data/version-2/* zookeeper2/data/version-2/*

Note that in zookeeper1 and zookeeper2, content should be removed only from the data/version-2/ directory, because each Zookeeper directory contains a myid file with that instance's ID. zookeeper0 can be emptied completely because its myid file is restored from the zookeeper0 backup.
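The myid rule above can be wrapped in a small helper so that a mistyped path cannot wipe a whole Zookeeper directory. This is a sketch under the directory layout used in this article; clear_zk_snapshots is a made-up name, and the availability of find in the pod is an assumption.

```shell
# Empty <dir>/data/version-2 and leave everything else, including myid, alone.
# ${1:?} aborts if no directory is given, so the glob never expands from "/".
clear_zk_snapshots() {
  dir=${1:?usage: clear_zk_snapshots <zookeeper-dir>}
  rm -rf -- "$dir"/data/version-2/*
  # List the myid file(s) that survived, wherever they sit under <dir>.
  find "$dir" -name myid
}

# Usage inside the recovery pod:
#   clear_zk_snapshots zookeeper1
#   clear_zk_snapshots zookeeper2
```

If the function prints no myid path for a directory, stop and investigate before scaling Zookeeper back up.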

Extract backup archives:

tar -xzvf mongodb0.tar.gz && tar -xzvf zookeeper0.tar.gz

The content is extracted into the mongodb0 and zookeeper0 directories.

Next, copy this content to the second and third storages. For MongoDB:

cp -r mongodb0/* mongodb1/ && cp -r mongodb0/* mongodb2/

For Zookeeper:

cp -r zookeeper0/data/version-2/* zookeeper1/data/version-2/ && cp -r zookeeper0/data/version-2/* zookeeper2/data/version-2/
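Before leaving the pod, the copies can be compared against the source. The sketch below assumes diff is available in the recovery pod; same_as is a hypothetical helper name. Note that the * glob used by the cp commands skips hidden files, so any dotfile present in the source would be reported here as a mismatch.

```shell
# Succeed only if every destination directory has the same files and contents
# as the source directory; print the first destination that differs.
same_as() {
  src=$1; shift
  for dst in "$@"; do
    diff -r "$src" "$dst" > /dev/null || { echo "mismatch: $dst"; return 1; }
  done
}

# Usage inside the recovery pod:
#   same_as mongodb0 mongodb1 mongodb2
#   same_as zookeeper0/data/version-2 zookeeper1/data/version-2 zookeeper2/data/version-2
```

For Zookeeper, only the data/version-2 directories are compared, since the myid files are expected to differ between instances.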

Now we can leave the pod using exit and scale the MongoDB and Zookeeper pods back up:

kubectl -n sisense scale sts sisense-mongodb --replicas=3 && kubectl -n sisense scale sts sisense-zookeeper --replicas=3

Finally, restart all services with the following command:

kubectl -n sisense delete pods --all

Please note the restart process could take some time.
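One way to follow the restart is to list the pods that are not fully up yet. The sketch below assumes the default column order of kubectl get pods (NAME, READY, STATUS) and treats Completed pods as finished jobs; the function name not_ready is made up for illustration.

```shell
# Print pods that are not fully ready. Reads `kubectl get pods --no-headers`
# output on stdin (columns: NAME READY STATUS RESTARTS AGE).
not_ready() {
  awk '$3 != "Completed" {
    split($2, r, "/")                 # READY looks like "1/1"
    if (r[1] != r[2] || $3 != "Running") print $1
  }'
}

# Usage:
#   kubectl -n sisense get pods --no-headers | not_ready
```

Empty output means every pod is running with all of its containers ready.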

If all steps were performed correctly, the system will be ready to be used with recovered data that is contained in MongoDB and Zookeeper including Dashboards, Models, Configuration, Users, etc.

It’s also possible to recover data stored in the /opt/sisense directory, including farms, data from cubes, etc., in the same way. This data is mounted in the system-recovery pod under the /sisense directory.

If you need additional help, please contact Sisense Support or create a Support Case.
