Version: NG-2.15

RCA Workspace

Introduction

Welcome to the RCA Workspace, where you can navigate your journey graphs with ease. The Workspace offers a structured approach, providing a clear separation between different components. This segmentation ensures a customized RCA experience tailored to your specific needs. Within the Workspace, you can define the role of each signal, encapsulate multiple signals, and explore their relationships and dependencies. These features empower you to gain deeper insights and effectively analyze the factors influencing your system's performance.

Overview

The system architecture is built from key components that work together to provide a robust root cause analysis (RCA) solution. It consists of four loosely connected layers: the Data Store, the Data Models, the RCA Workspace, and the RCA Incident module.

The Data Model forms the core, identifying the metrics and signals that are essential for RCA. Data Models are built on the Data Store and let users extract critical insights into system performance. Within the RCA Workspace, users define specific journeys and analyze the associated data, using insights from the Data Models to explore system dependencies and relationships.

As the RCA algorithm processes data and identifies potential issues, it generates Incidents accessible through the RCA Incident module. This seamless integration ensures users have timely access to actionable insights, fostering informed decision-making and proactive issue resolution.

Prerequisites

Data Models and the Data Store must already be configured on vuSmartMaps.

Workspaces

Workspaces read the information provided by the Data Model and map it to an application or business journey.

There are four main Workspaces:

  1. RCA
  2. Time Series Analysis
  3. 3T Alert Correlation
  4. ML Alert Correlation

FAQs

How can the RCA Workspace help me proactively identify and address potential system outages?

The RCA Workspace enables proactive monitoring and analysis of system metrics, allowing Operations Managers to detect anomalies and potential issues before they escalate into outages. By leveraging automated incident detection and root cause analysis, the RCA Workspace helps prevent service disruptions.

What customization options are available for configuring metrics in the RCA Workspace?

Data Analysts can use the Schema Configuration to define and categorize metrics into Lead Indicators, Operational Indicators, and External Indicators. This categorization helps in monitoring business impact and system performance effectively.
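
The categorization described above can be pictured as a simple mapping from metric to indicator type. The sketch below is purely illustrative and is not the vuSmartMaps Schema Configuration format: the metric names, the `IndicatorType` enum, and the `group_by_indicator` helper are all assumptions made for the example.

```python
from collections import defaultdict
from enum import Enum


class IndicatorType(Enum):
    """The three indicator categories named in the Schema Configuration."""
    LEAD = "Lead Indicator"
    OPERATIONAL = "Operational Indicator"
    EXTERNAL = "External Indicator"


# Hypothetical metric names, chosen only to illustrate the categorization.
SCHEMA = {
    "transaction_success_rate": IndicatorType.LEAD,
    "app_server_cpu_percent": IndicatorType.OPERATIONAL,
    "db_query_latency_ms": IndicatorType.OPERATIONAL,
    "payment_gateway_response_ms": IndicatorType.EXTERNAL,
}


def group_by_indicator(schema):
    """Group metric names by their indicator type, preserving input order."""
    groups = defaultdict(list)
    for metric, indicator in schema.items():
        groups[indicator].append(metric)
    return dict(groups)


if __name__ == "__main__":
    for indicator, metrics in group_by_indicator(SCHEMA).items():
        print(f"{indicator.value}: {', '.join(metrics)}")
```

Grouping metrics this way makes it easy to review which signals describe business impact (Lead), internal health (Operational), and dependencies outside the system (External).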

How can I utilize the RCA Workspace to extract critical insights into system performance?

You can leverage the RCA Workspace to configure schemas, categorize metrics, and create visual representations of business journeys. By analyzing data from multiple sources and exploring relationships between metrics, you can gain deeper insights into system performance and make data-driven decisions.

What are the benefits of using the RCA Workspace for Incident Response Teams?

Incident Response Teams benefit from the RCA Workspace by gaining real-time access to actionable insights through the Incidents module. By correlating data from various sources and applying advanced algorithms, the RCA Bot helps detect and resolve incidents efficiently, enabling proactive issue resolution.

How can Time Series Analysis help me detect anomalies in my system?

Time Series Analysis uses forecasting techniques to identify anomalies in data trends, enabling proactive issue detection. It helps visualize performance patterns and spot irregularities before they escalate.
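
As a rough illustration of forecast-based anomaly detection in general, the sketch below uses a moving average as a naive forecast and flags points that deviate from it by more than a few standard deviations. This is a generic technique chosen for the example, not the forecasting method used by Time Series Analysis; the function name, window size, and threshold are assumptions.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0, min_spread=1e-9):
    """Return the indexes of points that deviate from a moving-average forecast.

    Each point is compared with the mean of the preceding `window` points.
    It is flagged when the absolute error exceeds `threshold` times the
    standard deviation of that same history.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        forecast = mean(history)
        # Floor the spread so a perfectly flat history does not divide the
        # tolerance down to zero and flag identical values.
        spread = max(stdev(history), min_spread)
        if abs(values[i] - forecast) > threshold * spread:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical response-time samples with one obvious spike.
    samples = [100, 102, 99, 101, 100, 100, 180, 101]
    print(detect_anomalies(samples))
```

A production system would typically use a forecasting model that accounts for seasonality and trend; the moving average here only demonstrates the compare-against-forecast idea.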

What role does the Schema play in Time Series Analysis?

The Schema defines and categorizes metrics into Lead, Operational, and External Indicators, providing a structured view of system performance. This categorization helps detect issues early by identifying anomalies in key business metrics.

How can Storyboards in Time Series Analysis help me monitor system health and performance trends?

Storyboards display anomaly scoring, masking, and text insights, helping you visualize and interpret system trends. This enables better decision-making and prioritization of actions based on performance patterns.

How does ML Alert Correlation benefit operations teams?

ML Alert Correlation helps operations teams by reducing noise and false positives in alert streams. It streamlines incident investigation by correlating alerts from multiple sources, enabling faster issue detection and resolution.

How can ML Alert Correlation improve my workflow?

By correlating multiple alert streams, ML Alert Correlation reduces alert fatigue and highlights critical issues, allowing engineers to focus on genuine incidents. This improves incident management efficiency and system reliability.

Can ML Alert Correlation help in reducing incident response times?

Yes, ML Alert Correlation reduces incident response times by clustering related alerts and suppressing noise. This enables teams to quickly identify and resolve issues, improving system uptime and minimizing service disruptions.
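
To make the idea of clustering related alerts concrete, the sketch below groups alerts that arrive close together in time. It is a simple rule-based stand-in, not the machine-learning approach used by ML Alert Correlation; the `Alert` fields, the `cluster_alerts` function, and the 60-second gap are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since the epoch
    source: str       # hypothetical field: system that raised the alert
    message: str


def cluster_alerts(alerts, max_gap_seconds=60.0):
    """Group alerts into clusters separated by quiet gaps.

    Alerts are sorted by time; an alert joins the current cluster when it
    arrives within `max_gap_seconds` of the previous alert, otherwise it
    starts a new cluster.
    """
    clusters = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if clusters and alert.timestamp - clusters[-1][-1].timestamp <= max_gap_seconds:
            clusters[-1].append(alert)
        else:
            clusters.append([alert])
    return clusters


if __name__ == "__main__":
    alerts = [
        Alert(0, "db", "High query latency"),
        Alert(30, "app", "Request timeouts"),
        Alert(50, "lb", "Backend unhealthy"),
        Alert(500, "disk", "Low free space"),
    ]
    for cluster in cluster_alerts(alerts):
        print([a.message for a in cluster])
```

Reviewing one cluster instead of several separate alerts is what shortens triage: the first three alerts in the example would be investigated as a single incident.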

How does ML Alert Correlation support gaining insights from alert data?

ML Alert Correlation provides data analysts with insights by identifying patterns and recurring issues in alert data. This helps in optimizing system performance and making informed, data-driven decisions.

Can ML Alert Correlation help business stakeholders understand the impact of incidents on business operations?

Yes, ML Alert Correlation provides business stakeholders with visibility into incident impact by correlating alerts and identifying critical issues. This helps them prioritize responses and mitigate risks to business continuity.