Data Flow Troubleshooting in vuSmartMaps

Most Common Challenges And How To Debug Them

vuSmartMaps is built to monitor complex environments and collects data using a variety of methods, with Kubernetes-based microservices working together internally. With so many moving parts, issues can occur. This document explains the most common issues and challenges faced by users working with O11y Sources in vuSmartMaps.

💡Note: This is a living document and will be updated regularly.

For troubleshooting the installation-related issues, please check this document.

Data Ingestion

vuSmartMaps supports a large number of Observability Sources (O11ySources) and collects health and performance data for them using either an agent-based or an agentless method. The same agent or agentless method can be used for multiple O11ySources, and the most common challenges are usually similar across them. Here are the common issues seen with these data ingestion methods and how you can debug and resolve them.

Data Isn’t Reaching The Input Stream

One of the most common challenges our engineers face is that data does not reach the input topic. The potential causes are usually on the agent side, and you can verify this using the I/O Streams section under ContextStreams.

  1. For an O11ySource, e.g. Linux Monitoring, navigate to the ContextStreams tab and get the input stream details by hovering over the respective blocks.

  2. Navigate to Data Ingestion -> ContextStreams and click Preview for the respective input stream.

  3. Select the Capture Type as Latest to see the latest event details in the input stream.

  4. If you see ‘No data’ in the input stream as shown above, refer to the steps below.

Agent-Based O11ySources

As of now, the following agents are supported:

  1. Logbeat – Used for log collection from Windows and Linux servers. Refer here to learn more about the agent.
  2. Healthbeat – Mainly used for system metrics collection from Windows and Linux servers. It also supports various other modules. Refer here to learn more about the agent.
  3. vuAppAgent – An agent to collect data from various Java applications. Refer here to learn more about the agent.
  4. vuHealthagent – A system metrics collection agent designed for AIX and Solaris operating systems, with the Python version also utilized for gathering OHS metrics. Refer here to learn more about the Java variant of vuhealthagent and refer here for the Python variant.
  5. vuLogagent – A logs collection agent designed for AIX and Solaris operating systems. Refer here to learn more about the agent.
  6. Traces – An agent to collect trace data from applications.

For agent-based O11ySources, here are the potential issues that can prevent data from reaching the input Kafka topic:

  1. The agent is not in a running state.
  2. The agent is running but there is no data to collect and send (e.g. the log file being collected is empty).
  3. The agent configuration is not pointing to the right Kafka address and port.
  4. Local firewall rules on the target server do not allow a TCP connection to Kafka.
  5. Firewall rules don’t allow TCP connections on the Kafka port, or they allow the TCP connection but don’t allow data on the Kafka port.
  6. The Kafka topic doesn’t exist and can’t be created dynamically.
  7. The Kafka pod isn’t running.
  8. The endpoint server has restarted and:
           a. The agent is not running as a service.
           b. The agent is running as a service but is misconfigured.

Based on the above issues, we have created a checklist you should follow with your data source.

1. Check if the agent service is running.

Checking Method:

For Linux:
ps -ef | grep <agent-name>
sudo systemctl status <agent-name>

For Windows:
Press the ⊞ Win + R keys simultaneously, type services.msc, and press ↵ Enter. You can check your service status here.

For AIX:
ps -eaf | grep <agent-name>
lssrc -s <agent-name>

For Solaris:
ps -eaf | grep <agent-name>
svcs <agent-name>

For HP-UX:

Resolution Steps:

Restart the agent service.

For Linux:
sudo systemctl start <agent-name>
For non-service-based installations:
<agent-home>/<agent-name> start
e.g. /home/vunet/healthbeat/healthbeat start

For Windows:
Press the ⊞ Win + R keys simultaneously, type services.msc, and press ↵ Enter. Right-click on the service name and select Restart.
PowerShell command to start an agent:
Start-Service -Name "AgentName"
Cmd command to start an agent:
net start "AgentName"

For AIX:
startsrc -s <agent-name>
For non-service-based installations:
<agent-home>/etc/init.d/<agent-name> start

For Solaris:
svcadm enable <agent-name>
For non-service-based installations:
<agent-home>/etc/init.d/<agent-name> start

For HP-UX:

2. Check if data is available to be sent. There is a chance that there is no data to send, e.g. no new logs are being written, or no new data is available to be reported by cloud integrations.

Checking Method:

For Linux, AIX, Solaris, HP-UX:
tail -100f <log-path>
and check the timestamp to see whether new data is being written.

For Windows:
Open the log file and check the last timestamp to see if any new data is being written.

Resolution Steps:

Check that the service that is pushing logs is running and processing requests. Please inform the solution lead.

3. Check if the agent configuration is pointing to the right Kafka address.

Checking Method:

Check the Kafka IP address, port, and topic name in the agent configuration.

Resolution Steps:

Get the right details, update them in the configuration, and restart the agent.
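For Logbeat/Healthbeat, the Kafka endpoint is set in the agent’s YAML file; a quick way to eyeball it is shown below. The file path and exact key names here are illustrative and may differ per agent and installation — treat this as a sketch, not the definitive configuration layout.

# Inspect the Kafka output section of the agent configuration (path is an example)
grep -A 3 "output.kafka" /home/vunet/healthbeat/healthbeat.yml
# Expect something along the lines of:
#   output.kafka:
#     hosts: ["<vusmartmaps-ip>:9092"]
#     topic: "<input-topic-name>"
# The host, port, and topic must match the I/O Stream shown in ContextStreams.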

4. Check if local firewall rules on the target server are blocking a TCP connection to the Kafka server/port, or data over TCP to the Kafka server/port.

Checking Method:

We should ask customers to check and confirm this.

For Linux:
sudo iptables -S

Resolution Steps:

Please ask customers to fix this.

5. Check if firewall rules don’t allow TCP connections to the Kafka broker/port, or allow the TCP connection but don’t allow data to be forwarded to the Kafka broker/port.

Checking Method:

We should ask customers to check the firewall.

Resolution Steps:

Please ask customers to fix this.
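Before escalating to the firewall team, a plain TCP probe from the agent host toward the Kafka port narrows down checks 4 and 5 quickly (the IP and port below are placeholders):

# From the agent host, test whether the Kafka port is reachable
nc -zv <vusmartmaps-ip> 9092
# If nc is not installed, bash's /dev/tcp works on most Linux hosts:
timeout 5 bash -c 'cat < /dev/null > /dev/tcp/<vusmartmaps-ip>/9092' && echo "Kafka port reachable"

If the port is reachable but data still does not arrive, the block is usually at the data/application layer rather than the TCP connection itself.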

6. Check if the Kafka topic configured in the agent configuration exists in vuSmartMaps.

Checking Method:

Log in to vuSmartMaps, go to the ‘ContextStreams’ tab, and search for the Kafka topic name in the I/O Streams tab’s listing.

You can also verify that the topic exists with kafka-topics.sh --list.

Resolution Steps:

If the I/O stream with the topic name doesn’t exist, create it. If it should have been created as part of the O11ySource, please inform your solution lead.
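If you have backend access, the same check can be done from inside the Kafka pod. The pod name follows the naming used elsewhere in this document, and depending on the image the CLI may be kafka-topics or kafka-topics.sh:

kubectl exec -it kafka-cp-kafka-0 -n vsmaps -- kafka-topics --list --bootstrap-server localhost:9092 | grep <topic-name>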

7. Kafka pods are not in a running state.

Checking Method:

Kafka brokers might be out of resources such as memory, CPU, or disk space, leading to ingestion failures; there could be other reasons as well. Run the below command to get the exact error details and check the error messages:

kubectl describe pod kafka-cp-kafka-0 -n vsmaps

Most common statuses reported by describe:

-> CrashLoopBackOff
The Kafka container in the pod is repeatedly crashing. This typically indicates the Kafka service inside the container is unable to start due to configuration issues, missing files, or other errors; the events section shows the underlying error.

-> ContainerCreating
The pod is stuck in the container creation process. This can occur due to issues with pulling the Kafka container image or insufficient resources like CPU or memory.

-> ImagePullBackOff
Kubernetes is unable to pull the Kafka image from the container registry. The events section might show something like: Failed to pull image "kafka-image:version": rpc error: code = Unknown desc = Error response from daemon: manifest for kafka-image:version not found

-> PodPending
The Kafka pod is stuck in the "Pending" state, meaning it cannot be scheduled onto a node, possibly due to insufficient resources. The events section might provide more details, such as: 0/3 nodes are available: 3 Insufficient memory.

-> OOMKilled
The Kafka process consumed too much memory and was killed by the system. This happens when the Kafka pod exceeds its memory limits.

-> FailedMount
Kafka requires volumes (such as persistent storage) that are not being mounted correctly. The events might show: MountVolume.SetUp failed for volume "kafka-pv" : mount failed: exit status 32

Resolution Steps:

Each of these messages helps identify specific issues with Kafka pods, which can be related to resource allocation, configuration, or environment. The kubectl describe command provides detailed events that can help with debugging.

8. Incorrect Kafka broker configuration.

Checking Method:

Issues like incorrect broker addresses, misconfigured listeners, or replication issues can prevent successful ingestion.

Resolution Steps:

Review the Kafka broker logs for errors, and verify the listeners and advertised.listeners configurations.

9. Check the server’s uptime. Was the server restarted recently? Check whether the agent is running as a service.

Checking Method:

For Linux, AIX, Solaris, HP-UX: uptime

Resolution Steps:

Check the agent service configuration.

10. Still don’t see the data?

Checking Method:

Set the log level to debug and restart the agent to see the detailed log messages.

Logbeat/Healthbeat:
Open the healthbeat/logbeat YAML file, scroll to the end, locate the logging settings, and update the following line:
logging.level: debug

vuhealthagent/vuappagent/vulogagent:
Open the log4j.properties file from the conf.d directory and update the following line:
log4j.rootLogger = DEBUG

Resolution Steps:

Debug logs can provide valuable insights into any issues with data collection. If the problem is related to prerequisites, you might see errors like ‘connection failed’. For configuration or agent module-related issues, you may encounter errors like ‘unexpected key in the YML file’.

For prerequisite issues, ensure all the requirements mentioned in the Getting Started page are met. If the problem lies with the agent, try to identify the error and resolve it. If further assistance is needed, escalate the issue to your solution lead.
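After working through the checklist, you can confirm end to end that events are landing on the input topic by consuming a few messages directly from the Kafka pod. The pod name and CLI name are assumptions based on the commands used elsewhere in this document; replace the topic name with your O11ySource’s input topic:

kubectl exec -it kafka-cp-kafka-0 -n vsmaps -- kafka-console-consumer --bootstrap-server localhost:9092 --topic <input-topic-name> --max-messages 5 --timeout-ms 30000

If messages appear here but the Preview in ContextStreams still shows no data, recheck the I/O Stream’s topic name rather than the agent.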

Agentless O11ySources

For agentless O11ySources, the agent runs inside vuSmartMaps itself as a pod. This is a Telegraf-based agent, and a common Telegraf agent is used for all such agentless O11ySources.
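A quick first check is to confirm that the Telegraf pod for your O11ySource exists and to peek at its recent logs (the namespace and naming follow the kubectl commands used in the checklist below):

kubectl get pods -n vsmaps | grep <o11ysource-name>
kubectl logs --tail=50 <telegraf-pod-name> -n vsmaps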

For agentless O11ySources, here are the potential issues that can prevent data from reaching the input Kafka topic:

  1. The Telegraf pod is not created because of a missing deployment template in the MinIO UI.
  2. The Telegraf pod got created, but it’s not in a running state.
  3. The pod stopped running after changes in configuration.
  4. The Kafka topic doesn’t exist and can’t be created dynamically.
  5. The Kafka topic name is not populated correctly in the config.
  6. The Kafka pod isn’t in a running state.

Based on the above issues, we have created a checklist which you should go through with your data source.

1. Check if the pod for the Telegraf agent and the corresponding pipeline is created and running for the O11ySource.

Checking Method:

Use the kubectl command:
kubectl get pods -n vsmaps | grep <o11ysource-name>

The above command should return at least two pods: one for the Telegraf agent and another for the pipeline.

Resolution Steps:

Log in to the MinIO UI and check whether the deployment template for the respective O11ySource is available under vublock-templates. To access MinIO, go to the URL http://<vuSmartMapsIP>:30910/login. For most O11ySources, generic-telegraf.yaml is used as the deployment template.

If you have backend access, log in to the vublock-store pod and navigate to /app/vublocks/<o11ysrc_name>/<Version>/sources.json. Locate the deployment template being used and verify that it is available in the MinIO vublock-templates bucket.

Check the logs of the orchestration pod to identify any issues that occurred during the deployment of the Telegraf pod:
kubectl logs -f <orchestration-pod> -n vsmaps

2. The Telegraf pod got created but is not in a running state.

Checking Method:

Use the kubectl command:
kubectl get pods -n vsmaps | grep <o11ysource-name>

Resolution Steps:

ContainerCreating – Wait for about 2 minutes for the pod status to change.

ErrImageNeverPull – Describe the pod, and pull the Telegraf image on the node where this Telegraf pod is scheduled:
kubectl describe pod <pod-name> -n vsmaps

CrashLoopBackOff – Check the logs of the pod for more errors:
kubectl logs -f <pod-name> -n vsmaps

Pending –
-> If you see the following in the describe pod output:
Warning  FailedScheduling  3m12s  default-scheduler  0/1 nodes are available: 1 Too many pods. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
You must remove a few unused pods to change this pod from Pending to Running. If you can’t delete any, reach out to your admin to increase resources.
-> If it is caused by the node not being labelled, label the node and then check the pod status (get the label by describing the pod):
kubectl label node <node-name> <labelname>="True"

Error – Describe the pod and check the logs of the pod for more details.

3. The pod stopped running after changes in configuration.

Checking Method:

Log in to vuSmartMaps, go to the respective O11ySource, re-save the source, and check the Telegraf configuration under the vublock/1/1/ bucket in the MinIO UI for the respective O11ySource.

Check the logs of the orchestration pod for any errors/warnings.

Resolution Steps:

Re-save the source with the updated details in case the Telegraf configuration is incorrect.

Run the below command to check the orchestration logs:
kubectl logs -f <orchestration-pod> -n vsmaps

4. The Telegraf pod is running but there is no data in the input topics.

Checking Method:

Check the respective pod logs to identify data collection issues:
kubectl logs -f <telegraf-pod> -n vsmaps

The Telegraf input plugins report data collection errors in these logs.

Resolution Steps:

-> Errors such as ‘Unable to connect to the database’ or ‘Collection took longer than expected’ suggest connectivity issues. Verify the connection and advise the client to enable the necessary connectivity.

-> ‘Permission denied’ errors indicate that the necessary prerequisites are not fulfilled. Review the Getting Started page and collaborate with the client to grant the appropriate permissions for data collection.

5. Check if the Kafka topic is created for the O11ySource.

Checking Method:

Log in to vuSmartMaps, go to the ‘ContextStreams’ tab, and search for the Kafka topic name in the I/O Streams tab’s listing.

Resolution Steps:

As this will always be part of the O11ySource, please inform the solution lead about this.

6. Check if the Kafka topic name is populated correctly in the config.

Checking Method:

Get into the Telegraf pod and check the Kafka settings in the config:

kubectl exec -it -n vsmaps <telegraf-pod-name> -- bash
cat /etc/telegraf/telegraf.conf

At the end, you should see the following configuration:

[[outputs.kafka]]
    # URLs of kafka brokers
    brokers = ["broker:9092"]
    # Kafka topic for producer messages
    topic = "<topic-name>"

Check the <topic-name>.

Resolution Steps:

As this will always be part of the O11ySource, please inform the solution lead about this.

7. Kafka pods are not in a running state.

Checking Method and Resolution Steps:

This is the same check as item 7 in the agent-based checklist above: run kubectl describe pod kafka-cp-kafka-0 -n vsmaps, check the error messages, and interpret the pod status (CrashLoopBackOff, ContainerCreating, ImagePullBackOff, PodPending, OOMKilled, FailedMount) as described there.

Data Isn’t Reaching The Pipeline

There are many cases where data reaches the input Kafka topic but does not reach the output Kafka topic. This usually happens because of an issue in the pipeline. You can verify this using the Pipelines section under ContextStreams.
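To confirm where the flow stops, you can compare the two topics directly from the Kafka pod by consuming a few messages from each. The pod name, CLI name, and topic names are placeholders consistent with the commands used earlier in this document:

# Input topic: should show fresh events if the collection side is healthy
kubectl exec -it kafka-cp-kafka-0 -n vsmaps -- kafka-console-consumer --bootstrap-server localhost:9092 --topic <input-topic-name> --max-messages 5 --timeout-ms 30000

# Output topic: stays empty when the pipeline is not processing
kubectl exec -it kafka-cp-kafka-0 -n vsmaps -- kafka-console-consumer --bootstrap-server localhost:9092 --topic <output-topic-name> --max-messages 5 --timeout-ms 30000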

  1. For an O11ySource, e.g. Linux Monitoring, navigate to the ContextStreams tab and get the pipeline details by hovering over the respective blocks.

  2. Navigate to Data Ingestion -> ContextStreams -> Pipelines and check the status of the respective pipeline.

  3. Click on Debug for the pipeline and verify the data inflow. The ContextStream logs are available in the next tab, View Logs.

The following are the most frequent issues seen in the pipeline when there is no data:

  • The incorrect input or output Kafka topic is used in the pipeline configuration.
  • The pipeline is not in a running state.
  • The new configuration added to the standard pipeline is not correct.
  • Updated configuration changes are not reflected in the pipeline.

Based on the above issues, we have created a checklist which you should go through with your data source.

1. Check the pipeline configuration for the input and output Kafka topics.

Checking Method:

Log in to vuSmartMaps, go to the ‘ContextStreams’ tab, and search for the pipeline name in the ‘Pipelines’ tab’s listing. Edit it and check the input and output Kafka topics associated with it.

Resolution Steps:

If this is not what you expected, fix it only if it’s not a standard O11ySource. If it’s a standard O11ySource, please talk to your Solution Lead.
2. Check the pipeline pod and make sure it’s running.

Checking Method:

Log in to vuSmartMaps, go to the ‘ContextStreams’ tab, and search for the pipeline name in the ‘Pipelines’ tab’s listing. Click on View and navigate to View Logs to see the logs of the pipeline.

Resolution Steps:

Restart the pipeline in case it is in a failed/stopped state. If this data is for a standard O11ySource, please inform the solution lead about this.

3. A recent configuration change is not correct.

Checking Method:

Check the block- and plugin-level statistics using the ContextStream Failure Dashboards. You can also look at the ContextStream logs to identify the errors and which block they come from.

Resolution Steps:

Check the plugin configuration where most of the records are showing exceptions and fix it.

4. A recent configuration change is not reflected in the data in the output topic.

Checking Method:

Check whether you published the pipeline after making the configuration change. If you have published the pipeline, check whether you are seeing exceptions in that plugin using the ContextStream Failure Dashboards.

Resolution Steps:

Stop the pipeline, then do a Save and Publish of the same pipeline to see the updated changes. Check the plugin configuration where most of the records are showing exceptions and fix it.

Data Isn’t Reaching The Output Stream

This is a rare scenario where data is available in both the input stream and the pipeline but not in the output stream. The potential cause is usually a misconfiguration of the output stream. You can verify this using the I/O Streams section under ContextStreams.

The following are the most frequent issues seen in the output stream when there is no data:

  • The incorrect input or output Kafka topic is used in the pipeline configuration.
  • The new configuration added to the standard pipeline is not correct.

Based on the above issues, we have created a checklist which you should go through with your data source.

1. Check the pipeline configuration for the input and output Kafka topics.

Checking Method:

Log in to vuSmartMaps, go to the ‘ContextStreams’ tab, and search for the pipeline name in the ‘Pipelines’ tab’s listing. Edit it and check the input and output Kafka topics associated with it.

Resolution Steps:

If this is not what you expected, fix it only if it’s not a standard O11ySource. If it’s a standard O11ySource, please talk to your Solution Lead.

2. A recent configuration change is not correct.

Checking Method:

Check the block- and plugin-level statistics using the ContextStream Failure Dashboards.

Resolution Steps:

Check the plugin configuration where most of the records are showing exceptions and fix it.

Data Isn’t Being Ingested Into The Hyperscale Tables

There are cases where data is available in both the input and output Kafka topics but is not getting inserted into the Hyperscale data store.
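A quick way to see whether anything is reaching a Hyperscale (ClickHouse) table is to count its rows directly from the ClickHouse pod and re-run the query after a minute or two to check whether the count grows. The pod name follows the commands used in the checklist below; the table name is a placeholder:

kubectl exec -it chi-clickhouse-vusmart-0-0-0 -n vsmaps -- clickhouse-client --query "SELECT count() FROM vusmart.\`<table-name>\`"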

The following are the most frequent issues seen when data is not reaching the Hyperscale tables.

  1. The Hyperscale tables are created in a different database.
  2. The data type for a specific field is not correct, i.e. the field is a float but was added as UInt in the schema, or the field is a string but was added as UInt in the schema, and so on.
  3. The Kafka Engine configuration uses an incorrect Kafka output topic or Kafka address.
  4. ClickHouse isn’t able to ingest data into the tables.
  5. The ClickHouse service is not able to connect to the Kafka service because of a service or network issue.

Based on the above issues, we have created a checklist which you should go through with your data source.

1. Check whether all the tables are created under the default vusmart database inside ClickHouse.

Checking Method:

Log in to the ClickHouse pod using the below commands:

kubectl exec -it chi-clickhouse-vusmart-0-0-0 -n vsmaps -- bash
clickhouse-client

Type use vusmart; and then run the below query to check whether the tables are present in the database:

show tables like '%<part of table name>%'

There should be four different tables for each output topic.

e.g. show tables like '%additional%'
linux-monitor-additional-metrics

Resolution Steps:

If the tables are available in any other database, drop the tables in that database and recreate them under the vusmart database. Report the issue to the solution lead if it is a standard O11ySource.
2. Check if the data type for a specific field is incorrect in the DB schema.

Checking Method:

This is usually visible in the logs. To check the logs, execute the following command:

kubectl logs -f chi-clickhouse-vusmart-0-0-0 -n vsmaps

You can add a grep for your O11ySource as there may be too many logs. You can also redirect the output to a file so that you can collect it for some time and then analyze it.

Resolution Steps:

Once you have identified the field and the mismatched data type, update the data type in the ClickHouse schema. At present, the only simple way to fix this is to drop the table and then recreate it with the correct data type.

3. Check whether the Kafka Engine table is configured with an incorrect output Kafka topic name or address.

Checking Method:

This is available in the database, and you need to execute a command to describe the Kafka Engine table:

show create table <kafka-engine-table-name>

Resolution Steps:

If this is not a standard O11ySource, fix the Kafka topic. If this is a standard O11ySource and the Kafka topic name is not right, please inform your solution lead.
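The parts to verify in the SHOW CREATE TABLE output are the Kafka engine settings. The query can also be run non-interactively from outside the pod; the pod name follows the commands above, the table name is a placeholder, and kafka_broker_list/kafka_topic_list are the standard ClickHouse Kafka engine settings to look for:

kubectl exec -it chi-clickhouse-vusmart-0-0-0 -n vsmaps -- clickhouse-client --query "SHOW CREATE TABLE vusmart.\`<kafka-engine-table-name>\`"
# In the output, kafka_broker_list should point to the Kafka service (e.g. 'broker:9092')
# and kafka_topic_list should match the pipeline's output topic.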
4. ClickHouse isn’t able to ingest the data.

Checking Method:

Check the logs of the chi-clickhouse-vusmart-0-0-0 pod to see whether there are errors/warnings during the data ingestion.

Resolution Steps:

If the error mentions any issue with respect to the O11ySource, please inform your solution lead. If the error relates to disk consumption, drop the tables that are consuming more space.

5. Check the network connectivity between the Kafka and ClickHouse pods.

Checking Method:

Check whether the corresponding service is running/available. You can use the following command to check all services:

kubectl get svc -n vsmaps

Resolution Steps:

If the ClickHouse or Kafka service is not running, you can start it by deleting the corresponding pod.

Data Consumption

Data Isn’t Available In The Dashboard Panels

1. Check the time selected in the Global Time Selector.

Checking Method:

Log in to vuSmartMaps, load the dashboard, and check the time selected in the global time selector in the top right corner.

Resolution Steps:

Try changing the time based on data availability.

2. Check the filters applied, if any.

Checking Method:

There are cases where a dashboard is saved with filters on. Check the filters at the top of the dashboard.

Resolution Steps:

Remove the filters.

3. Check if data is available in the database table used for the panels in the dashboard.

Checking Method:

There is a chance that no data is available in the table used by the panels. Go to the ‘Explore’ tab from the menu bar, select the database and table, and check if data is available in the table. You can also use the ‘Data Modelling Workspace’ to explore the data available in a table.

Alerts Aren’t Getting Generated

1. There is no data for the conditions mentioned in the alert.

Checking Method:

Check the data preview for each rule for the configured time range.

Resolution Steps:

The data preview should have data for the selected alert execution period.

2. The available data is not crossing the thresholds.

Checking Method:

Check whether the thresholds are configured properly.

Resolution Steps:

Provide the thresholds either in the Data Model or in the Alert rule according to the final data for the selected time range.

3. Alert execution failed.

Checking Method:

You get an error like the following when clicking on Save & Execute in the Alert Rule:
Failed to process execution of Alert Rule in the specific time range

Verify whether the evaluation script is correct using the alert execution logs inside the alert-0 pod.

Verify whether the Redis service is running:
kubectl get pods -n vsmaps | grep redis
The Redis pod must be in a running state.

Resolution Steps:

Check the alert-0 pod logs and correct the error in the evaluation script. Redeploy the vunodes helm-chart.
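A minimal sketch of the two backend checks from item 3 above (the pod names follow the ones referenced in this section):

kubectl get pods -n vsmaps | grep redis
# the redis pod should be in a Running state

kubectl logs --tail=100 alert-0 -n vsmaps | grep -iE "error|exception"
# surfaces evaluation-script errors from the most recent alert executions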

Alerts Aren’t Delivered To Channels

1. Check the Alert Channel configuration details.

Checking Method:

Check under Platform Settings -> Preferences. Check whether the channels are reachable and working.

Resolution Steps:

If the channel is not configured, configure the channel settings. Test the channel availability and reachability based on the channel.

2. Check whether the alert channel celery tasks are listed under the dao pod.

Checking Method:

Log in to the dao pod and run the ps -ef command; the following should show up in the output:

daq.services.scan.celery.mail_task
daq.services.scan.celery.teams_task
daq.services.scan.celery.slack_task
daq.services.scan.celery.whatsapp_task

Resolution Steps:

If the celery processes are not running, please inform your solution lead.
3. Alerts are being generated but are not received via email.

Checking Method:

Check the Mail Server settings under Preferences. Check if you are able to send an email using the email script. Check whether “less secure apps” is enabled. Check whether the email ID provided has 2FA disabled.

Resolution Steps:

Update the Mail Server settings correctly. Fix the firewall port if there is no communication with the Mail Server.

  • Check the email server connectivity using telnet.
  • Use the email script and send a test email.
  • Log in to the vuinterface-cairo-o pod and navigate to mickey/vusoft/vumap.
  • Run python send_email.py test test <email id>
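A quick connectivity probe toward the mail server before digging into the application side (the host and port are placeholders; 25, 465, and 587 are the common SMTP ports):

telnet <mail-server-host> 587
# or, if telnet is not available:
nc -zv <mail-server-host> 587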
4. Check the disk space inside the vuinterface/dao/alert pods.

Checking Method:

Verify whether there is sufficient disk space available inside the vuinterface/dao/alert pods. Check the disk space using the df -kh command.

Resolution Steps:

Increase the disk space or delete the logs under the /var/log directory to free up enough space.

Unable To Generate Reports

1. Report generation failed.

Checking Method:

Check whether the report celery task is listed under the dao pod. Log in to the dao pod using:
kubectl exec -it dao-0 -n vsmaps -- bash
Run ps -ef to check whether the below celery task is running:
daq.services.scan.celery.reports_task

Resolution Steps:

Redeploy the vunodes helm-chart and check whether the celery task comes up in the dao-0 pod. Navigate to the helm-charts/vunodes path and redeploy the helm-chart using the below commands:

helm uninstall vunodes -n vsmaps
helm install vunodes . -n vsmaps

Wait until the pods are up and then verify the celery tasks under the dao pod.

2. Report generation failed for Dashboard as a data source.

Checking Method:

Check whether the values.yaml for vunodes is updated with the latest GF_SERVICE_ACCOUNT_TOKEN present under Platform Settings -> Service Accounts in the vuSmartMaps UI.

Resolution Steps:

Generate a new token using the below steps: log in to the UI of the server using vunetadmin credentials, navigate to Platform Settings -> Service Accounts, generate a new token with the role Admin, add it to the values.yaml under the vunodes helm-chart, and redeploy the vunodes helm-chart.

3. Report generated with No Data.

Checking Method:

Check whether there is data in the data store for the selected time range. Check the filters applied, if any.

Resolution Steps:

Use the correct time range and filters accordingly to get the data.

Reports Aren’t Being Received Via Email

1. Reports aren’t being received via email.

Checking Method:

Check whether the report- and email-related celery tasks are listed under the dao pod using the ps -ef command:
daq.services.scan.celery.reports_task
daq.services.scan.celery.mail_task

Check whether valid email preference details are provided under the Preferences section. Check whether an email can be delivered using the email script.

Resolution Steps:

Redeploy the vunodes helm-chart and check whether the celery tasks come up in the dao-0 pod. Provide valid email preference details under the Preferences section. Check the mail server connection.
