Sisense Kubernetes Cluster Health Check

Check the pods status

1. Check whether there are pods that are not in a Running or Completed state:
   kubectl get po -A -o wide | egrep -v 'Running|Completed'
   -A returns pods from all namespaces (Sisense is usually installed in the sisense namespace); -o wide returns the extended output. If all pods are healthy, the command returns nothing except the header line.
2. If there is no output (all pods are in a Running or Completed state), check whether all containers of the Running pods are READY:
   kubectl get po -A -o wide | egrep 'Running'
   Look at the READY column, which shows x/y, where x is the number of ready containers and y is the total number of containers. If x is less than y, not all containers are ready; refer to the sections below for instructions on troubleshooting this.
3. If all pods are in a Running or Completed state and all containers of the Running pods are READY, check the status of the nodes:
   kubectl get nodes
   All nodes should be in the Ready state. In a single-node environment you will see just one node; in a multi-node environment you should see several.
4. If a node is not in the Ready state, get details by describing it:
   kubectl describe node <node-name>
5. You may also check storage health by running:
   kubectl -n sisense get pvc
6. If all pods are in a Running or Completed state, all containers of the Running pods are READY, and all nodes are Ready, the basic Kubernetes troubleshooting is complete and the issue is not in the Kubernetes infrastructure.

What if kubectl is not running?

1. If Linux does not recognize the kubectl command, either there is an issue with the Kubernetes installation or the current user does not have permission to run kubectl.
2. The main Kubernetes node component is the kubelet. Check the status of the kubelet service (does not apply to RKE deployments):
   systemctl status kubelet
   It should be in an active (running) state.
3. If the kubelet is not in an active (running) state, try restarting it:
   systemctl restart kubelet
4. You can check the kubelet logs by running:
   journalctl -u kubelet
   Press Shift+G to jump to the end of the log.
5. If the kubelet service is missing, there is an issue with the Kubernetes installation.
6. If the kubelet is in an active (running) state, check whether there is a .kube directory in the home directory of the current user:
   cd && ls -la .kube
   If the .kube directory is missing or empty, the current user is not configured to run kubectl and there is a problem with the Kubernetes configuration.

What if the pods are not running correctly?

1. If the commands above return pods in a state other than Running or Completed, or pods whose containers are not all READY, describe the pod to understand why it is unhealthy.
2. For example, suppose a pod has 0/1 containers READY. Assuming the pod is in the sisense namespace, copy its name, in this case external-plugins-5dcf494b77-gtsfk, and run:
   kubectl -n sisense describe pod external-plugins-5dcf494b77-gtsfk
   The two main sections we are interested in evaluating are Conditions and Events.
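For reference, the Conditions and Events sections of the describe output typically look like the abridged excerpt below; the pod and the values shown here are illustrative only, and your output will differ.

   Conditions:
     Type              Status
     Initialized       True
     Ready             False
     ContainersReady   False
     PodScheduled      True
   Events:
     Type     Reason     Age                   From     Message
     ----     ------     ----                  ----     -------
     Warning  Unhealthy  2m (x312 over 26h)    kubelet  Readiness probe failed: HTTP probe failed with statuscode: 503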
Conditions gives you True/False values for whether the pod is:
   - Initialized
   - Ready
   - ContainersReady
   - PodScheduled (the pod has been placed on a node)
   In the example above, the pod has been scheduled to a node (PodScheduled: True) and has been initialized, but it is not Ready because its container is not ready. Events gives you an excerpt from the kubelet log showing events related to the current pod. In the example above, the readiness probe for the pod failed, so the problem is in the application itself and not in the Kubernetes infrastructure.
3. You can check the logs of the pod with:
   kubectl -n sisense logs external-plugins-5dcf494b77-gtsfk
   and look for errors that give a clue about the root cause of the problem.
4. The container may be in a state other than Running. Use describe to check its Conditions and Events as in the previous case:
   kubectl -n sisense describe po external-plugins-5dcf494b77-gtsfk
5. If Conditions and Events do not give you enough information about the root cause of the problem, look at the State/Last State section of the describe output. In this example, the Last State is Terminated and the Reason is OOMKilled (Out Of Memory, Killed), which means Kubernetes killed the container because it exceeded its memory limit. To increase the memory limit, first find the problematic pod:
   kubectl -n sisense get po
   Then find the Kubernetes object managing the pod. In our example:
   kubectl -n sisense get all | grep connectors
   Find the resource without additional random letters/digits appended to its name; in our case it is a deployment. Then edit the resource:
   kubectl -n sisense edit deployment connectors
   Search for the resources section and increase the memory value under limits.
6. Let's consider another example: a pod in a CrashLoopBackOff state. Describe the pod:
   kubectl -n sisense describe po sisense-dgraph-alpha-0
   If that does not show anything obvious, check the logs:
   kubectl -n sisense logs sisense-dgraph-alpha-0
   (add --previous if you do not get any output). In this example, the logs show that the root cause of the issue is "no space left on device", which means more space should be allocated to the pod. In this case, check the status of the persistent volumes and persistent volume claims with:
   kubectl get pv
   kubectl -n sisense get pvc
   You are looking for statuses other than Bound. If you see one, describe the resource, for example:
   kubectl -n sisense describe pvc data-dgraph-0

What else to check?

1. If the cluster looks healthy but performance suffers, check the resource consumption of the Sisense services. Start with the nodes:
   kubectl top nodes
   Note whether CPU% is close to 100% or MEMORY% is close to 85%.
2. To check the resource consumption of individual pods, run:
   kubectl -n sisense top po
   Note any pods with abnormally high memory consumption.

Conclusion

If you are a Kubernetes pro, this article helps you quickly grasp which infrastructure components are involved in a Sisense deployment and what to check next. If you are a Kubernetes newbie, these basic instructions let you troubleshoot quickly and identify the issue so you can seek further help. If you need any additional help, please contact Sisense Support.
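As a quick reference, the basic checks described above can be chained together in a short shell script. The sketch below is only an illustration: it assumes kubectl is configured for the current user, the metrics server is available for the top commands, and Sisense is installed in the sisense namespace; adjust the namespace if your installation differs.

   #!/bin/bash
   # Minimal Sisense Kubernetes health-check sketch.
   # Assumes kubectl is configured and Sisense runs in the "sisense" namespace.
   NAMESPACE="sisense"

   echo "--- Pods not in a Running or Completed state (should be empty apart from the header) ---"
   kubectl get po -A -o wide | egrep -v 'Running|Completed'

   echo "--- Running pods (check the READY column, x/y) ---"
   kubectl get po -A -o wide | egrep 'Running'

   echo "--- Nodes (all should be Ready) ---"
   kubectl get nodes

   echo "--- Persistent volume claims (all should be Bound) ---"
   kubectl -n "$NAMESPACE" get pvc

   echo "--- Resource consumption (requires the metrics server) ---"
   kubectl top nodes
   kubectl -n "$NAMESPACE" top po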