vuRCABot uses gathered data to detect system incidents promptly. It analyzes various data points and patterns to identify potential outages and disruptions. When an incident is detected, vuRCABot digs deeper into the data to offer insights into the likely root cause. This involves correlating data from multiple sources, assessing historical trends, and applying advanced algorithms.
vuSmartMaps Insights can be accessed from the left navigation menu (Configure Observability > RCA Workspace).
Create Workspace: The workspaces page shows a list of previously configured Workspaces. Click on the + icon to create a new Workspace.
You can now configure the workspace, which comprises five major sections.
Enter the Workspace Name and Description, choose RCA as the Category, and set the Run Type to Online or Offline as required.
Note: Choose Online as the Run Type for live data, and Offline for third-party systems or data in CSV files. For Offline runs, use the data imported through Import Data.
Click on Create to create your Workspace.
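The workspace form can be pictured as a small settings record. The sketch below is illustrative only: the class name, field names, and example values are hypothetical stand-ins for the UI fields (Workspace Name, Description, Category, Run Type), not a vuSmartMaps API.

```python
from dataclasses import dataclass

# Run Type values taken from the form described above.
VALID_RUN_TYPES = {"Online", "Offline"}


@dataclass(frozen=True)
class WorkspaceSettings:
    """Hypothetical mirror of the Create Workspace form fields."""
    name: str
    description: str
    category: str = "RCA"
    run_type: str = "Online"

    def __post_init__(self):
        if not self.name.strip():
            raise ValueError("Workspace Name must not be empty")
        if self.run_type not in VALID_RUN_TYPES:
            raise ValueError(f"Run Type must be one of {sorted(VALID_RUN_TYPES)}")

    def uses_imported_data(self) -> bool:
        # Offline workspaces read data brought in through Import Data.
        return self.run_type == "Offline"


live = WorkspaceSettings("payments-journey", "Live payment flow", run_type="Online")
replay = WorkspaceSettings("payments-replay", "CSV export from a third-party tool", run_type="Offline")
print(live.uses_imported_data(), replay.uses_imported_data())
```

The only decision the sketch encodes is the one from the note: Online for live data, Offline for imported data.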
Once the Workspace is created, you are directed to the Schema page, where you configure the schema. The schema comprises the Journey, Components, and Graphs sections, and it is where you define the business journey and its metrics.
Metrics can be categorized into three types.
Note: Categorize the metrics accurately, because incidents are detected primarily from the lead indicators.
The journey is the superset of all metrics and components (you can think of it as the business journey). Categorize a metric at the journey level if it does not belong to any particular component.
You can now click Journey, and add a new signal.
For each signal you add, specify the data model and the metric column in that model. Only metric columns in a data model are eligible to be indicators.
Use the listing option to select the data model and its metric column, then categorize the metric using the category listing option. Add other signals in the same way.
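A signal is identified by its data model and metric column, and the schema accepts each such pair only once, whether at the journey level or inside a component. The sketch below shows one way to check a schema draft for repeated signals before submitting it. The dictionary layout, key names, and sample data models are assumptions made for illustration, not the product's storage format.

```python
from collections import defaultdict


def find_duplicate_signals(schema):
    """Return {(data_model, metric_column): [locations]} for signals defined more than once."""
    seen = defaultdict(list)
    for signal in schema.get("journey_signals", []):
        seen[(signal["data_model"], signal["metric_column"])].append("journey")
    for component in schema.get("components", []):
        for signal in component.get("signals", []):
            key = (signal["data_model"], signal["metric_column"])
            seen[key].append(f"component:{component['name']}")
    return {key: places for key, places in seen.items() if len(places) > 1}


# Hypothetical schema draft: success_rate is defined twice by mistake.
schema = {
    "journey_signals": [
        {"data_model": "txn_summary", "metric_column": "success_rate"},
    ],
    "components": [
        {
            "name": "gateway",
            "signals": [
                {"data_model": "gateway_stats", "metric_column": "latency_ms"},
                {"data_model": "txn_summary", "metric_column": "success_rate"},
            ],
        },
    ],
}

duplicates = find_duplicate_signals(schema)
print(duplicates)
```

An empty result means every signal appears in exactly one place.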
Usually, in a business journey, most of the metrics will be defined at the touchpoints/component level. In such cases, you can use the components section to categorize the metrics. You can now add a new component.
Then you can specify the component name. A Component can have two constituents.
1. Signals as a Constituent
In the case of signals as constituents, add a signal and follow the same procedure as for Journey signals to categorize it.
Note: A particular signal (i.e., a unique metric for a data model) can be defined only once in the whole schema. It can be defined either at the journey level or inside a component.
2. Components as a Constituent
For components as constituents, add a new component and select the component as a constituent.
The components listing then shows all the components created previously. Select the ones you need and save them.
Components as Constituents are useful when you want to create sub-graphs.
For example: Say your graph is: A -> C1_big
where,
C1_big = [C11 -> C12 -> C13],
C11 = [M111, M112] (metrics)
C12 = [M121, M122] (metrics)
C13 = [M131, M132] (metrics)
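The nesting in this example can be written down as a small lookup table, which makes it easy to see which metrics end up under C1_big. This is a minimal sketch: the dictionary layout and the `metrics_of` helper are hypothetical, and only the component and metric names come from the example.

```python
# Each component lists either signals (metrics) or other components as constituents.
components = {
    "C1_big": {"components": ["C11", "C12", "C13"]},
    "C11": {"signals": ["M111", "M112"]},
    "C12": {"signals": ["M121", "M122"]},
    "C13": {"signals": ["M131", "M132"]},
}


def metrics_of(name, components, _path=()):
    """Collect every metric under a component, descending into its constituents."""
    if name in _path:
        raise ValueError(f"Component cycle: {' -> '.join(_path + (name,))}")
    entry = components[name]
    metrics = list(entry.get("signals", []))
    for child in entry.get("components", []):
        metrics.extend(metrics_of(child, components, _path + (name,)))
    return metrics


print(metrics_of("C1_big", components))
# ['M111', 'M112', 'M121', 'M122', 'M131', 'M132']
```

The cycle check guards against a component being listed, directly or indirectly, as its own constituent.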
Graphs are another important part of the Schema. They define the topology of the business journey/system after the metrics have been categorized. Graphs can be specified at two levels: the journey level and the component level.
For example, take the same graph as above (A -> C1_big, where C1_big = [C11 -> C12 -> C13]):
Journey-level graph: A -> C1_big
Component-level graph:
C1_big = [C11 -> C12 -> C13]
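To see how the two levels fit together, the sketch below stores each level as a list of connections and expands the journey-level graph into one flat topology. The expansion rule is an assumption made for illustration: a connection into a composite component is attached to the first node of its sub-graph, and a connection out of it to the last node. The product may resolve this differently, and the sketch handles only one level of nesting.

```python
journey_graph = [("A", "C1_big")]
component_graphs = {"C1_big": [("C11", "C12"), ("C12", "C13")]}


def entry_nodes(edges):
    """Nodes of a sub-graph that have no incoming connection."""
    targets = {dst for _, dst in edges}
    return sorted({src for src, _ in edges} - targets)


def exit_nodes(edges):
    """Nodes of a sub-graph that have no outgoing connection."""
    sources = {src for src, _ in edges}
    return sorted({dst for _, dst in edges} - sources)


def flatten(journey_graph, component_graphs):
    """Replace composite components with their sub-graphs (assumed rule, one level deep)."""
    flat = []
    for src, dst in journey_graph:
        srcs = exit_nodes(component_graphs[src]) if src in component_graphs else [src]
        dsts = entry_nodes(component_graphs[dst]) if dst in component_graphs else [dst]
        flat.extend((s, d) for s in srcs for d in dsts)
    for edges in component_graphs.values():
        flat.extend(edges)
    return flat


print(flatten(journey_graph, component_graphs))
# [('A', 'C11'), ('C11', 'C12'), ('C12', 'C13')]
```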
Click Add New Graph and select the graph type.
After selecting the graph type, create connections. Each connection acts as a link between two touchpoints/components. Specify the required details for each connection, then save the connection for that graph type. To create a connection for a particular component, follow the same approach.
After you submit the Schema successfully, you are directed to the Signalizers page. This page is populated automatically with the metrics configured on the Schema page, along with the ML techniques that will run for each metric.
Advanced users or ML Engineers who want to change the hyper-parameters of a particular metric's ML method, such as Anomaly Detection, CHI, or OPI, can click the edit button for that method.
This opens the hyper-parameter editing page. When you finish editing, click the Update button to override the default parameters.
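Conceptually, Update overlays your edits on the method's defaults and leaves every other parameter as it was. The sketch below illustrates that merge. The parameter names and default values are invented for the example and are not the actual vuRCABot hyper-parameters.

```python
# Hypothetical defaults; the real parameter names are shown on the editing page.
DEFAULT_PARAMS = {
    "anomaly_detection": {"sensitivity": 0.5, "window_minutes": 60},
    "chi": {"window_minutes": 15},
}


def with_overrides(method, overrides, defaults=DEFAULT_PARAMS):
    """Return the defaults for one ML method with the user's edits applied on top."""
    base = defaults[method]
    unknown = sorted(set(overrides) - set(base))
    if unknown:
        raise KeyError(f"Unknown hyper-parameters for {method}: {unknown}")
    return {**base, **overrides}


tuned = with_overrides("anomaly_detection", {"sensitivity": 0.8})
print(tuned)
# {'sensitivity': 0.8, 'window_minutes': 60}
```

Rejecting unknown keys keeps a typo from silently leaving the default in place.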
You can now activate the signalizers globally, or activate only specific metric signalizers locally (in the action section of each metric listed on this page).
A pop-up then appears, where you click the Activate button.
vuRCABot then starts creating the required pipelines. Once the pipelines are created, the activation buttons of all signals configured in the workspace are switched on.
You can click on the Next button at the bottom right to go to the next section, Bot Settings.
The Bot Settings page allows users to configure key parameters that govern the behavior, analysis, and response capabilities of their automated Root Cause Analysis (RCA) bots. These settings help fine-tune how incidents are identified, correlated, and resolved, enabling efficient and precise incident management.
Key parameters include options for controlling how and when the RCA bot checks for system abnormalities, such as setting schedule frequencies and defining validation intervals. Hierarchical and signal correlation settings allow for deeper insights into incidents by filtering metrics and correlating indicators to identify root causes more effectively. Timeout configurations, like Incident Blacklist Timeout and Incident Clear Timeout, ensure incidents are managed appropriately over time, either by suppressing noise or marking them as resolved based on inactivity.
You can specify the Topological Correlation Frequency (in minutes) and Training Frequency (in Days).
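The two timeouts can be read as simple time comparisons. The sketch below rests on an assumed interpretation: a blacklisted incident is suppressed until the Incident Blacklist Timeout has elapsed, and an incident with no new activity for the Incident Clear Timeout is marked as cleared. The class, function, and state names are hypothetical, and the durations are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BotSettings:
    """Hypothetical holder for two of the Bot Settings parameters."""
    incident_clear_timeout: timedelta
    incident_blacklist_timeout: timedelta


def incident_state(last_activity, blacklisted_at, now, settings):
    """Classify an incident as 'suppressed', 'cleared', or 'open' (assumed semantics)."""
    if blacklisted_at is not None and now - blacklisted_at < settings.incident_blacklist_timeout:
        return "suppressed"
    if now - last_activity >= settings.incident_clear_timeout:
        return "cleared"
    return "open"


settings = BotSettings(
    incident_clear_timeout=timedelta(minutes=30),
    incident_blacklist_timeout=timedelta(hours=2),
)
now = datetime(2024, 1, 1, 12, 0)

print(incident_state(now - timedelta(minutes=10), None, now, settings))
print(incident_state(now - timedelta(minutes=45), None, now, settings))
print(incident_state(now - timedelta(minutes=5), now - timedelta(hours=1), now, settings))
```

With these placeholder values, the three calls cover an active incident, one that has been quiet longer than the clear timeout, and one that is still inside its blacklist window.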
Again, click Activate in the pop-up.
If vuRCABot is configured successfully, an activation message appears. From this point on, real incidents that occur in the system/journey appear on the incidents page for the times at which they occur.
The storyboard contains insights into a workspace. The initial section gives an overview of the list of metrics configured and their roles and health.
💡 Note: The Storyboards can be viewed in a separate browser window for ease of navigation and better user experience. To do this, users need to click the Open New Window button which will open the respective storyboard group in a separate window.
RCA Storyboard:
If you want to get insights on configured ML methods for these metrics, you can further select the CHI Storyboard and Anomaly Storyboard as follows.
RCA Validation
Alerts Storyboards: The Alert Storyboard offers actionable insights and recommendations to help reduce the volume of alerts and improve overall alert management.
Once done, click on Finish to head back to the listing page and find your newly created RCA workspace.
Under the Action column, click the View Incidents button to navigate to the RCA Incidents page for detailed insights.