Basic Kubernetes troubleshooting guide [Linux]
This guide provides a general approach for troubleshooting Sisense Linux-based deployments running on Kubernetes. It is intended to help you collect initial information and perform basic diagnostics.
Step-by-Step Guide:
Do not restart anything before collecting the information described in Step 1.
1. Collect Overall Cluster Information
Start by collecting a high-level overview of the cluster and the current state of the Sisense deployment.
Commands:
- kubectl get pods -A
- kubectl -n sisense get events
- kubectl -n sisense get pv
- kubectl -n sisense get pvc
- kubectl get nodes
- kubectl get all -A > all.txt
- kubectl describe all -A > desc.txt
Collecting these outputs provides an overall snapshot of the environment, which will help identify potential issues.
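If the event list is long, sorting by creation timestamp makes the most recent events easier to review. The output file names below are only suggestions:
- kubectl -n sisense get events --sort-by=.metadata.creationTimestamp > events.txt
- kubectl get nodes -o wide > nodes.txt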
2. Validate Core Sisense Services
Verify that the following core services are up and running:
- RabbitMQ
- MongoDB
- Zookeeper
- Storage (FSx, GlusterFS, etc.)
- API Gateway
- Galaxy
Check the pod status for each service in the Sisense namespace:
- kubectl -n sisense get pods
All critical pods should be in the Running state. Any pods in CrashLoopBackOff, Error, or Pending states should be investigated.
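One possible way to quickly surface unhealthy pods is to filter the pod list. Note that the field selector matches the pod phase, so it will not catch a crash-looping pod whose phase is still Running; the grep variant covers that case:
- kubectl -n sisense get pods --field-selector=status.phase!=Running
- kubectl -n sisense get pods | grep -vE 'Running|Completed'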
3. Identify Failed Pods and Gather Details
Locate any failed pods that require investigation, then describe each one:
- kubectl -n sisense describe pod <pod_name>
Analyze the pod events and container termination statuses:
- Exit Code 137:
- Indicates the container was terminated by signal 9 (SIGKILL). In most cases, the container exceeded its memory limit (the Out-Of-Memory Killer was triggered), or the node itself was under memory pressure and killed processes to recover. It can also point to other issues caused by resource pressure.
- Exit Code 1:
- Application-level failure inside the container. Focus on application logs to identify the cause.
- Liveness Probe Failure:
- Kubernetes considers the container unhealthy and restarts it. May indicate that the application is unresponsive or failed internal health checks.
- Readiness Probe Failure:
- Indicates the application is not ready to serve traffic but may still be running. Often related to initialization, dependencies, or internal connectivity issues.
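To read a container's last termination state directly, rather than scanning the full describe output, the following commands can help. The pod and node names are placeholders, and kubectl top requires the metrics-server add-on to be installed in the cluster:
- kubectl -n sisense get pod <pod_name> -o jsonpath='{.status.containerStatuses[*].lastState.terminated}'
- kubectl describe node <node_name> (check the Conditions section for MemoryPressure)
- kubectl -n sisense top pods (shows current CPU/memory usage; helps confirm resource pressure)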
4. Collect Container Logs
After identifying the problematic pod, collect its logs:
- kubectl -n sisense logs <pod_name> -p
(The -p flag retrieves logs from the previous, terminated container instance; omit it to view logs from the currently running container.)
Review the logs for specific error messages, stack traces, or indications of configuration or environment issues.
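A few additional kubectl log options are often useful; the container name below is a placeholder:
- kubectl -n sisense logs <pod_name> -c <container_name> (for pods with multiple containers)
- kubectl -n sisense logs <pod_name> --tail=200 (only the most recent lines)
- kubectl -n sisense logs <pod_name> --since=1h (logs from the last hour)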
5. Next Steps
- Collect all information from steps 1-4.
- Provide full output to Sisense Support for further analysis.
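As an example, the collected files can be bundled into a single archive before sending; the file names below assume the output files suggested in Step 1:
- tar czf sisense-diagnostics.tar.gz all.txt desc.txt events.txt nodes.txt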
Conclusion:
This is a basic troubleshooting workflow. Complex environments may require additional network, storage, or cluster-level diagnostics.
Disclaimer: This post outlines a potential custom workaround for a specific use case or provides instructions regarding a specific task. The solution may not work in all scenarios or Sisense versions, so we strongly recommend testing it in your environment before deployment. If you need further assistance with this, please let us know.