VuNet Docs

Event Correlation

Event Correlation uses a set of correlation profiles to intelligently link related events from multiple sources.

There are two types of correlation:

  1. 3T Alert Correlation
  2. ML Alert Correlation

3T Alert Correlation

Rule-based correlation uses a predefined set of correlation profiles (rules) to correlate events originating from a list of event sources. Each profile has a set of instructions that guides the system to combine 2 or more events into a single cluster. The generated cluster is termed a Correlated Event.

For example, 

  • An instruction that says to cluster all the events originating from the location Bangalore is a Correlation Profile. 
  • Similarly, an instruction that guides the system to combine all the events that represent an anomaly in a specific device is a Correlation Profile.

Various metadata can be associated with a Correlated Event, such as the count of raw events, the start time (the timestamp when the correlated event was first formed), the duration, the end time, and so on. This metadata helps users make sense of the problem at hand and reduces the time to fix it.
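As a rough illustration of the idea (not VuNet's implementation), a correlation profile can be thought of as a predicate over events, and a Correlated Event as the matching cluster plus its metadata. The event fields (`id`, `location`, `timestamp`) and the names `correlate` and `from_bangalore` are assumptions made for this sketch, and the start time is approximated by the earliest raw event in the cluster.

```python
from datetime import datetime

# Hypothetical raw events; the field names are assumptions for illustration.
events = [
    {"id": 1, "location": "Bangalore", "timestamp": datetime(2024, 1, 1, 10, 0)},
    {"id": 2, "location": "Mumbai", "timestamp": datetime(2024, 1, 1, 10, 5)},
    {"id": 3, "location": "Bangalore", "timestamp": datetime(2024, 1, 1, 10, 20)},
]


def correlate(events, profile):
    """Cluster the events matched by `profile` and attach summary metadata."""
    matched = sorted((e for e in events if profile(e)), key=lambda e: e["timestamp"])
    if len(matched) < 2:  # a cluster combines 2 or more events
        return None
    # Assumption: start/end are taken from the earliest/latest raw event.
    start, end = matched[0]["timestamp"], matched[-1]["timestamp"]
    return {
        "event_ids": [e["id"] for e in matched],
        "raw_event_count": len(matched),
        "start_time": start,
        "end_time": end,
        "duration": end - start,
    }


def from_bangalore(event):
    """Profile: cluster all events originating from the location Bangalore."""
    return event["location"] == "Bangalore"


correlated_event = correlate(events, from_bangalore)
print(correlated_event["raw_event_count"], correlated_event["duration"])
```

Here the two Bangalore events form one Correlated Event, while the Mumbai event stays uncorrelated.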

Add Workspace

Click on the RCA Workspace on the left Toggle Menu.

The workspaces page shows a list of previously configured Workspaces. Click on the + button to create a new Workspace.

You can now configure the workspace, which comprises 3 major sections:

  1. Basic Details
  2. Setup Event Sources
  3. Setup Correlation Profiles

Basic Details

Enter the Workspace Name and Description, select the Category as 3T Alert Correlation, and choose the Run Type as Online or Offline.

Click on Create and Next at the bottom right to create the Workspace.

Setup Event Sources

Once the Workspace is created, you will be directed to the Event Sources page, where you can add events by selecting the Event Data Model.

  • Select Event Data Model: Choose a Data Model from the drop-down.
  • Enter Description (Optional): Provide a brief description that explains the need for this correlation.
  • + Add Events: You can add multiple events by clicking on the + Add Events button.
  • Delete: Click on the Delete button to delete an Event Data Model.

Click on Next to move to the final step.

Setup Correlation Profiles

Choose the category by which you want to correlate the events. Currently, only the ‘Journey Based’ and ‘Fields Based’ correlation profiles are fully functional.

  1. Network Topology Based: Alerts for device or service availability are suppressed if the system identifies that the alert is caused by an intermediate router, switch, or link failure. This will be functional in an upcoming release.

  2. Journey Based: Alerts that are part of a particular business journey are combined. Whether an alert is part of a business journey is determined by looking for various matches, including journey name, IP addresses, app name, etc.

  3. Fields Based: Users specify the list of fields on which context mapping is based. For example, all alerts with matching values for ‘summary’ and ‘severity’ will be combined.

  4. Tag Values Based: Similar to Fields Based. Here, all alerts having a certain value in the tag are combined. This will be functional in an upcoming release.
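One way to picture the Fields Based profile is as a group-by over the chosen fields. The sketch below is only an interpretation of the description above: it assumes that alerts are combined when they share the same values for every selected field, and that alerts missing a field are left out. The alert records and the function name `group_by_fields` are hypothetical.

```python
from collections import defaultdict

# Hypothetical alerts; only the 'summary' and 'severity' field names
# come from the description above.
alerts = [
    {"id": "a1", "summary": "High CPU", "severity": "critical"},
    {"id": "a2", "summary": "High CPU", "severity": "critical"},
    {"id": "a3", "summary": "Disk full", "severity": "warning"},
    {"id": "a4", "summary": "High CPU"},  # no severity, so it is left out
]


def group_by_fields(alerts, fields):
    """Group alerts that share the same values for every field in `fields`."""
    groups = defaultdict(list)
    for alert in alerts:
        if all(f in alert for f in fields):
            key = tuple(alert[f] for f in fields)
            groups[key].append(alert["id"])
    # Keep only groups that actually combine 2 or more alerts.
    return {key: ids for key, ids in groups.items() if len(ids) >= 2}


clusters = group_by_fields(alerts, ["summary", "severity"])
print(clusters)
```

Adding more fields to the list makes the grouping stricter, so fewer alerts end up in the same cluster.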

Finish

Click on Finish to complete the setup.

You will then return to the listing page. Click on the Activate Workspace button in the Actions column.

Choose the ‘Start Time’ and ‘End Time’ within which events will be considered for correlation. Click on the Start button to begin the correlation.

The correlated alerts appear below. In this example, out of 100 raw events, 77 were correlated and reduced to just 6 correlated events, a suppression of 71%.

Alternatively, you can upload a CSV file to correlate the events and click on the Start button.

The correlated alerts appear below. In this example, out of 4 raw events, 2 were correlated and reduced to 1 correlated event, a suppression of 25%.
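Both percentages quoted above are consistent with computing suppression as the number of raw events removed by correlation (correlated raw events minus the resulting correlated events) divided by the total number of raw events. This formula is inferred from the two examples rather than taken from a documented definition, and the function name is hypothetical.

```python
def suppression_percent(total_raw, correlated_raw, correlated_events):
    """Share of raw events removed from view by correlation, in percent."""
    if total_raw <= 0:
        raise ValueError("total_raw must be positive")
    return 100 * (correlated_raw - correlated_events) / total_raw


print(suppression_percent(100, 77, 6))  # time-range example: (77 - 6) / 100
print(suppression_percent(4, 2, 1))     # CSV example: (2 - 1) / 4
```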

ML Alert Correlation

ML Alert Correlation is a sub-module that helps customers save time while investigating potential downtimes and failures inside the application.

The correlation module helps by analyzing many alert streams from different sources, correlating them by various factors, including data and domain, and reducing the noise.

This reduces false positives as far as possible and limits event/alert fatigue, which helps operators and their teams improve the MTTR.

Add Workspace

Click on the RCA Workspace on the left Toggle Menu.

The workspaces page shows a list of previously configured Workspaces. Click on the + button to create a new Workspace.

You can now configure the workspace, which comprises 3 major sections:

  1. Basic Details
  2. Event Sources
  3. Settings

Basic Details

Enter the Workspace Name and Description, and select the Category as Event Correlation.

Click on Save and Next to create the Workspace.

Event Sources

Once the Workspace is created, you will be directed to the Event Sources page, where you can add events by selecting the Event Data Model.

  • Select Event Data Model: Choose a Data Model from the drop-down.
  • Enter Description (Optional): Provide an optional description.
  • + Add Events: You can add multiple events by clicking on the + Add Events button.
  • Delete: Click on the Delete button to delete an Event Data Model.

Click on Save and Next to move to the next step.

Settings

After successfully configuring Event Sources, you will be directed to the Settings page. 

It has 4 major sections.

  • General Configuration
  • Hyperparameter Configuration
  • First Time Training
  • Schedule

General Configuration

This is the first section and it allows you to configure notification types. It supports Email and WhatsApp notifications.

Email: Enter the recipient’s email address. Use commas to separate multiple addresses. You can also add an email group to notify a set of people.

WhatsApp: Enter the recipient’s mobile number. Additionally, you can add a WhatsApp group to notify a set of people.

Hyperparameter Configuration

This is the second section of the Settings page. It has 2 main segments – Training and Inference.

  • Training: The training phase learns from data to create and adapt the rules by which events/alerts are correlated. The hyperparameters listed here can be tuned, and they directly affect the rules that the algorithm creates.

    • Window Length: The length of the window, in days, within which events are considered for learning the clusters. Defaults to 1 day. Training runs on a schedule.
    • Overlap Length: The length, in days, of the overlap between the event data for two consecutive days. Defaults to 0.5 days. Overlap helps to reduce end-of-day cut-off effects.
    • Filter Noisy Nodes: Events from nodes that frequently generate non-meaningful events are filtered out before clustering and marked as such.
    • Scale Affinity: If true, a 0-1 scaling is applied to the affinity matrix that the correlation engine estimates internally. Scaling prioritizes the formation of larger clusters at the cost of some information on graph node closeness. Enable this if you often see small, non-meaningful correlated events.

  • Inference: The inference phase uses the rules created during training to correlate events in real time. The hyperparameters listed here can be tuned, and they directly affect the correlated events/alerts that are created.

    • Cluster Confidence Threshold: Clustering rules with a confidence below the threshold are deprioritized when generating correlated events. Defaults to 40%, which is a reasonable starting point. Higher confidence can only be achieved once the correlation engine has been enhanced with feedback, so setting a high value here may result in few or no correlated events being created.
    • Detect Noisy Nodes: Select this option to detect nodes that frequently generate non-meaningful events.
    • Cluster Noisy Nodes: Select this option to cluster events from nodes that frequently generate non-meaningful events.
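To make the interplay of Window Length and Overlap Length concrete, the sketch below generates training windows over a time range. It assumes that each new window starts one window length minus one overlap length after the previous one, which is one plausible reading of the description above and not a documented behavior. The function name `training_windows` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def training_windows(start, end, window_days=1.0, overlap_days=0.5):
    """Yield (window_start, window_end) pairs covering [start, end)."""
    if not 0 <= overlap_days < window_days:
        raise ValueError("overlap must be non-negative and shorter than the window")
    window = timedelta(days=window_days)
    # Assumption: consecutive windows share `overlap_days` of event data.
    step = timedelta(days=window_days - overlap_days)
    current = start
    while current < end:
        yield current, min(current + window, end)
        current += step


windows = list(training_windows(datetime(2024, 1, 1), datetime(2024, 1, 3)))
for w_start, w_end in windows:
    print(w_start, "->", w_end)
```

With the defaults, events that occur around midnight fall in the middle of at least one window, which is how an overlap can soften end-of-day cut-off effects.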

Note: A new user may choose to leave the default settings unchanged.
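The effect of the Cluster Confidence Threshold can be pictured as splitting the learned rules into those at or above the threshold and those below it. The sketch below is purely illustrative: the rule records, the confidence values, and the function name `split_by_confidence` are assumptions, and the engine's actual deprioritization logic is not described in this document.

```python
DEFAULT_THRESHOLD = 0.40  # the 40% default mentioned above

# Hypothetical learned rules with a confidence score between 0 and 1.
rules = [
    {"name": "rule_a", "confidence": 0.35},
    {"name": "rule_b", "confidence": 0.80},
    {"name": "rule_c", "confidence": 0.55},
]


def split_by_confidence(rules, threshold=DEFAULT_THRESHOLD):
    """Return (preferred, deprioritized) rule names for a given threshold."""
    preferred = [r["name"] for r in rules if r["confidence"] >= threshold]
    deprioritized = [r["name"] for r in rules if r["confidence"] < threshold]
    return preferred, deprioritized


print(split_by_confidence(rules))                 # default threshold
print(split_by_confidence(rules, threshold=0.9))  # very high threshold
```

With the very high threshold no rule is preferred, which mirrors the warning that a high value may leave few or no correlated events.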

First Time Training

This is the third section of the Settings page. Choose the start time and end time of the data to be used to train the algorithm.

Select a larger range of data for the first run so that the algorithm can learn the rules.

Note: The larger the range of training data, the longer the algorithm takes to learn the rules.

Schedule

This is the last section of the Settings page. The event correlation algorithm runs on a schedule. You can use this page to adjust how frequently training and inference jobs run.

Finish

Click on the Finish button to complete the ML Alert Correlation configuration.

You will then return to the listing page. Click on the Activate Workspace button in the Actions column.

Choose the ‘Start Time’ and ‘End Time’ within which events will be considered for Training or Training and Inference. Click on the Start button to begin.

If you choose Training and Inference, also choose the percentage of the data to be used for inference.

Note: The inference phase uses the rules created during training to correlate events in real time.
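As an illustration of what choosing a percentage for inference could mean, the sketch below splits the selected time range chronologically, so that the most recent portion is kept for inference and the rest is used for training. Whether the product splits the data this way is an assumption, and the function name `split_range` is hypothetical.

```python
from datetime import datetime


def split_range(start, end, inference_percent):
    """Split [start, end) chronologically into training and inference ranges."""
    if not 0 < inference_percent < 100:
        raise ValueError("inference_percent must be between 0 and 100")
    # Assumption: the most recent part of the range is used for inference.
    cutoff = end - (end - start) * inference_percent / 100
    return (start, cutoff), (cutoff, end)


training, inference = split_range(datetime(2024, 1, 1), datetime(2024, 1, 11), 20)
print("training: ", training[0], "->", training[1])
print("inference:", inference[0], "->", inference[1])
```

For a 10-day range and 20% inference, the first 8 days go to training and the last 2 days to inference.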


The Status window shows the progress of training and inference. To view the results, click on the link, which takes you to the Alert Console and displays the results.

Alternatively, you can upload a CSV file to correlate the events and click on the Start button.
