RCA Storyboard

Introduction

RCABot is a ready-to-use ML model that can be configured for a business journey or an individual application to:

  • Provide an “Early Warning” signal based on abnormal behavior observed in any of the health and/or performance metrics of the configured components
  • During incidents, help the operations team resolve issues faster by narrowing down the problematic components and golden signals and highlighting the probable root cause

Types of Views in the RCABot Storyboard

The RCABot storyboard view captures the overall state of a business journey/application at a specific point in time. The views fall into three categories:

1. Incident View

This view indicates that a business journey/application is in an impacted/unstable state at a given point in time. When a journey is in an “Incident” state, issues are occurring in the journey that directly affect the end-user experience. In other words, users are not able to complete their transactions.

This state serves as an alert or warning that something is not functioning as expected, and it requires immediate attention. The storyboard view will:

  • Pinpoint the specific lead indicators or key performance indicators (KPIs) that are affected, which helps in understanding the scope and severity of the issue
  • Provide a list of touchpoints, applications, or components that are unstable or experiencing problems and describe the Probable Root Cause
  • Highlight the dependencies between the impacted components on the journey graph
  • Show recommended actions based on feedback and learning from similar events in the past

2. Early Warning View

This view indicates an “Early Warning” state for a journey at a given point in time. The state serves as a proactive indicator with the following characteristics:

  • Elements within the journey, such as touchpoints, applications, components, or golden signals, are showing signs of instability or problems
  • Importantly, the end-user experience remains unaffected during this stage. This means that lead indicators or key performance indicators (KPIs) are still within acceptable ranges, and users are not yet experiencing issues
  • The purpose of identifying these early warnings is to enable proactive actions to prevent potential issues from escalating and impacting the end-user experience
  • The operations team can use this state to closely monitor the elements that are showing signs of instability and take preventive measures
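
The distinction between the Incident and Early Warning states described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this document, not RCABot's actual logic: the function name, its parameters, and the HEALTHY state are all assumptions introduced for the example.

```python
from enum import Enum


class JourneyState(Enum):
    HEALTHY = "healthy"              # assumed third state, not named in this document
    EARLY_WARNING = "early_warning"
    INCIDENT = "incident"


def classify_journey(lead_indicators_breached: bool, unstable_elements: int) -> JourneyState:
    """Hypothetical rule: breached lead indicators mean end users are impacted."""
    if lead_indicators_breached:
        # End-user experience is affected, so the journey is in an Incident state.
        return JourneyState.INCIDENT
    if unstable_elements > 0:
        # Something is unstable, but lead indicators are still in range.
        return JourneyState.EARLY_WARNING
    return JourneyState.HEALTHY
```

For example, a journey with two unstable components and no breached lead indicators would be classified as EARLY_WARNING under this sketch.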

3. Incident Analysis & Pattern Identification View

The storyboard lets users perform in-depth analysis of historical incidents to achieve the following outcomes:

  • Historical Incident Analysis: This capability is crucial for understanding the evolution of issues and their impact on the business journey.
  • Understanding Hidden Patterns and Behaviors: By allowing detailed analysis, the platform facilitates the identification of hidden patterns and behaviors within historical incidents. This could involve recognizing trends, recurring issues, or anomalies that might not be immediately apparent without a thorough historical perspective.
  • Focus on Golden Signals and Components: Golden signals often represent critical metrics, and understanding their historical behavior can provide insights into the overall health of the system. Similarly, understanding how different components (such as Network components, Infra components, and Application components) behave during incidents is valuable for diagnostics.
  • Identifying Repetitive Nature of Issues: The platform aims to uncover the repetitive nature of issues. Recognizing patterns of recurrence is vital for anticipating and proactively addressing problems before they impact the business journey.
  • Permanent Resolutions: The goal of this analysis is to identify permanent resolutions. Instead of merely addressing symptoms, understanding the historical context allows for root cause identification. Addressing root causes contributes to long-term solutions, enhancing the stability and performance of the business journey.
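
As a rough illustration of what identifying repetitive issues could look like, the sketch below counts how often the same component and golden signal pair appears across historical incidents. The input shape (a list of pairs), the function name, and the threshold are assumptions made for this example; the document does not describe how RCABot performs this analysis.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical history entry: (component name, golden signal name)
Cause = Tuple[str, str]


def recurring_causes(history: List[Cause], min_occurrences: int = 2) -> List[Tuple[Cause, int]]:
    """Return the causes seen at least min_occurrences times, most frequent first."""
    counts = Counter(history)
    return [(cause, n) for cause, n in counts.most_common() if n >= min_occurrences]


history = [
    ("db", "latency"),
    ("api", "errors"),
    ("db", "latency"),
]
repeated = recurring_causes(history)
```

Here only the ("db", "latency") pair is returned, since it is the only cause that occurs more than once in the sample history.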

In summary, the RCABot storyboard provides valuable insights and actionable information for IT and business teams to monitor, manage, and optimize the end-to-end journey, ultimately improving user experiences and operational efficiency.

RCABot Storyboard

This section describes the information and actionable insights available in the RCABot storyboard:

Incident Overview Panel

The overview panel shows the total number of active incidents, incidents cleared by users and the Bot, and incidents closed by users for the selected time window.

It also highlights the following metrics to measure efficiency in identifying incidents, resolving them, and finding their root cause:

  • MTTD: Mean time taken to Detect an incident
  • MTTR: Mean time taken to Resolve an incident
  • MTTFC: Mean time taken to find the Root Cause for incidents
  • AETC: Average expected time to Clear incidents
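
The mean-time metrics above are averages of per-incident timings. The sketch below shows one way they could be computed, assuming each incident carries its own TTD, TTR, and TTFRC values in minutes and that values not yet available (for example, TTR on an active incident) are skipped. The class, field, and function names are hypothetical, and the exact calculation used by the product may differ.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Optional


@dataclass
class IncidentTimings:
    ttd_minutes: Optional[float]    # time taken to detect
    ttr_minutes: Optional[float]    # time taken to resolve (None while active)
    ttfrc_minutes: Optional[float]  # time taken to find root cause


def mean_of_known(values: Iterable[Optional[float]]) -> Optional[float]:
    """Average the values that are present; return None if there are none."""
    known = [v for v in values if v is not None]
    return mean(known) if known else None


def overview_metrics(incidents: List[IncidentTimings]) -> Dict[str, Optional[float]]:
    return {
        "MTTD": mean_of_known(i.ttd_minutes for i in incidents),
        "MTTR": mean_of_known(i.ttr_minutes for i in incidents),
        "MTTFC": mean_of_known(i.ttfrc_minutes for i in incidents),
    }


metrics = overview_metrics([
    IncidentTimings(ttd_minutes=4.0, ttr_minutes=30.0, ttfrc_minutes=10.0),
    IncidentTimings(ttd_minutes=6.0, ttr_minutes=None, ttfrc_minutes=20.0),
])
```

With these two sample incidents, MTTD is the average of 4 and 6 minutes, while MTTR is based only on the one incident that has been resolved.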

Incident Card

An individual card is displayed for every incident, summarizing the following information:

  • The severity of the incident (Critical/Error/Warning).
  • Brief Incident description which precisely indicates the impacted/affected Lead Indicators.
  • Metrics like TTD (Time taken to Detect), TTR (Time taken to Resolve) & TTFRC (Time taken to find Root Cause).
  • Incident Status, Duration along with start and end time (if cleared) for the incident.
  • Trend of impacted Lead Indicator.
  • A detailed description of the incident.
  • Probable Root Cause.
  • Expected time to close (ETC) for the active incident.
  • Confidence Level on reporting this incident.

The pagination option in the bottom-right corner lets you page through all the incidents.
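
For readers who think in terms of data structures, the card fields listed above map naturally onto a simple record. The following is a hypothetical model only: the field names, types, and the duration helper are assumptions for illustration and do not reflect an actual RCABot schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class IncidentCard:
    severity: str                  # "Critical", "Error", or "Warning"
    summary: str                   # brief description naming the impacted lead indicators
    status: str                    # for example "Active" or "Cleared" (assumed values)
    start_time: datetime
    end_time: Optional[datetime]   # set only once the incident is cleared
    probable_root_cause: str
    confidence: float              # assumed to be a 0.0 to 1.0 score

    def duration(self, now: datetime) -> timedelta:
        """Elapsed time: up to end_time if cleared, otherwise up to now."""
        return (self.end_time or now) - self.start_time


card = IncidentCard(
    severity="Critical",
    summary="Login success rate dropped",
    status="Cleared",
    start_time=datetime(2024, 1, 1, 10, 0),
    end_time=datetime(2024, 1, 1, 10, 45),
    probable_root_cause="Database connection pool exhausted",
    confidence=0.9,
)
elapsed = card.duration(now=datetime(2024, 1, 1, 12, 0))
```

Because the sample card is already cleared, its duration is measured to the end time rather than to the current time.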

RCA Card

When the user clicks an “Incident Card,” an RCA card view appears below it, highlighting the list of probable root causes and additional details for the incident:

  • List of impacted golden signals
  • List of Probable Root Causes
  • Dependency of impacted components on the Journey graph, and Recommended Action
  • Options to provide feedback
  • Options to share details about the incident using any configured communication channel
  • Options to create tickets and modify the workspace configuration

Summary Card

This view shows the list and trend of the impacted golden signals under each of the impacted components.
