
Data Pipeline and Parser Configuration – Data Streams

Introduction

Data Streams are an integral part of the data flow process within the vuSmartMaps™ platform. They play a pivotal role in acquiring, processing, and storing data for further analysis. Understanding how data flows from the source to its ultimate destination is essential for efficient monitoring and analysis.

Data Stream pipelines are constructed using blocks, each of which executes a series of plugins. These pipelines are highly versatile, enabling users to structure them in various ways to suit their specific needs. The primary function of a Data Stream pipeline is to read data from one I/O stream, process it, and send it to another. This multi-stage process enables data transformation and enrichment, preparing the data for storage and further analysis.
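To make the read–transform–write pattern concrete, here is a minimal, self-contained Python sketch. It assumes nothing about the platform's internals: the in-memory queues stand in for I/O streams, and the plugin functions (`parse_json`, `add_severity`) are hypothetical examples of the kind of work a pipeline block performs, not vuSmartMaps APIs.

```python
import json
from queue import Queue

def parse_json(record):
    # Plugin: parse the raw payload into a structured dict.
    return json.loads(record)

def add_severity(record):
    # Plugin: enrich the record with a derived field.
    record["severity"] = "high" if record.get("latency_ms", 0) > 500 else "normal"
    return record

def run_pipeline(input_stream, output_stream, plugins):
    """Read records from one stream, apply each plugin in order, write to another."""
    while not input_stream.empty():
        record = input_stream.get()
        for plugin in plugins:
            record = plugin(record)
        output_stream.put(record)

input_stream, output_stream = Queue(), Queue()
input_stream.put('{"host": "web-01", "latency_ms": 742}')
run_pipeline(input_stream, output_stream, [parse_json, add_severity])
print(output_stream.get())  # {'host': 'web-01', 'latency_ms': 742, 'severity': 'high'}
```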

Data Processing Journey

Data is collected from the target system through Observability Sources. It then undergoes significant transformation during the data processing phase, handled by Data Streams. This phase is organized into distinct sections: I/O Streams, Data Pipeline, and DataStore Connectors, each with a unique role in processing data. Additionally, the Flows tab displays the data processing journey.

  • I/O Streams and Data Pipeline: The data collected by Agents enters the platform via an input stream and proceeds through a series of transformations within the Data Stream pipeline. This phase is crucial for data processing and structuring.
  • DataStore Connectors: After processing within the Data Stream pipeline, the data is directed to an output stream. The role of DataStore Connectors is to facilitate the transfer of this processed data to designated data storage destinations, such as Elasticsearch or TSDB.
  • Flows: The Flows section provides a visual representation of the entire data flow journey. It acts as a valuable reference to understand how data moves from its source to storage, making it an essential feature for system observability.

With Data Streams, users can better comprehend the data flow within the vuSmartMaps platform. This understanding helps optimize system performance, enables proactive monitoring, and makes it easier to extract valuable insights from the data processed within the platform.

In the subsequent sections of this user guide, we will explore Data Streams in more detail, including their configuration, key functionalities, and how they contribute to an enhanced data flow experience.

Working with Data Streams

The Data Streams page can be accessed from the platform's left navigation menu by navigating to Data Ingestion > Data Stream.

From the Data Streams landing page, you can work with the different configuration options.

The user interface of the Data Stream section is composed of four primary tabs, each designed to facilitate specific actions and configurations, enhancing your ability to harness the full potential of data stream management. Let’s take a closer look at these tabs:

  1. I/O Streams
    This tab enables you to categorize and organize data by creating unique I/O streams. Here, you can create, edit, view, preview, and delete I/O streams to optimize your data organization.
  2. Data Pipeline
    Data transformation is at the core of this tab. You can configure your data pipeline to refine raw data into a more meaningful and actionable format. Options include creating, viewing, editing, and debugging your data pipeline to streamline the transformation process.
  3. DataStore Connectors
    The DataStore Connector tab focuses on the delivery of transformed data to a permanent storage unit, such as Elasticsearch or MySQL. Here, you can create, view, edit, or delete connectors to ensure seamless data storage and accessibility.
  4. Flows
    The Flows tab provides a dynamic visual representation of your data’s journey. It allows you to customize the flow of data within Input Data Streams, Pipelines, Output Data Streams, and DataStore Connectors. You can zoom in, zoom out, or reset the flow dimensions to gain deeper insights into the data flow.

These tabs collectively empower you to configure, manage, and visualize the data flow within your system effectively, facilitating smoother data processing, storage, and analysis. In the upcoming sections, we will delve into each tab’s functionalities to provide a comprehensive understanding of their roles in the data stream management process.

I/O Streams

I/O Streams serve as temporary storage units in the data processing journey. Each I/O stream is uniquely named across the entire data stream cluster, ensuring clear and distinct identification for your data.
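As an illustration of the concept, the following Python sketch models an I/O stream as a named, temporary, bounded buffer whose name must be unique. The `IOStreamRegistry` class, its methods, and the capacity parameter are illustrative assumptions, not the platform's implementation.

```python
from collections import deque

class IOStreamRegistry:
    """Hypothetical registry of named, temporary I/O streams."""

    def __init__(self):
        self._streams = {}

    def create(self, name, capacity=10000):
        # Names must be unique across the cluster; reject duplicates.
        if name in self._streams:
            raise ValueError(f"I/O stream '{name}' already exists")
        self._streams[name] = deque(maxlen=capacity)  # temporary, bounded storage
        return self._streams[name]

    def delete(self, name):
        # Remove a stream that is no longer needed.
        self._streams.pop(name, None)

registry = IOStreamRegistry()
raw_logs = registry.create("raw-syslog-input")
raw_logs.append({"message": "connection timeout", "host": "db-02"})
```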

You can configure the I/O Streams in the following ways:

  • Create 
    Set up new I/O streams tailored to your data categorization requirements.
  • Edit
    Modify existing I/O streams to adapt to evolving data organization needs.
  • View
    Access a comprehensive overview of your configured I/O streams.
  • Preview
    Get a sneak peek into the content of your I/O streams.
  • Delete
    Efficiently remove I/O streams that are no longer needed, maintaining a clutter-free workspace for your data.

With these functions, you gain the flexibility to tailor your I/O streams to match your specific data organization preferences, enhancing data management within the Data Streams feature.

Data Pipeline

The Data Pipeline plays a pivotal role in converting raw data into a format that holds more significance for the end user. This transformation is achieved through a diverse range of plugins, including enrichment, manipulation, and more. A Data Pipeline reads data from an I/O stream and, after applying these transformations, sends it to another I/O stream.
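The sketch below illustrates the kind of transformation a pipeline plugin might apply: parsing a raw log line into structured fields and enriching it with a derived flag. The log format, regular expression, and field names are hypothetical examples, not a configuration the platform prescribes.

```python
import re

LOG_PATTERN = re.compile(
    r"(?P<ts>\S+) (?P<host>\S+) (?P<level>INFO|WARN|ERROR) (?P<msg>.*)"
)

def transform(line):
    """Turn a raw log line into a structured, enriched record."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return {"raw": line, "parse_error": True}  # keep unparsable data
    record = match.groupdict()
    record["alert"] = record["level"] == "ERROR"   # enrichment: derived flag
    return record

print(transform("2024-05-01T10:00:00Z web-01 ERROR disk usage at 95%"))
# {'ts': '2024-05-01T10:00:00Z', 'host': 'web-01', 'level': 'ERROR',
#  'msg': 'disk usage at 95%', 'alert': True}
```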

Data Pipeline offers the following configurations:

  • Create
    Establish new data pipelines tailored to your data transformation requirements.
  • View
    Access detailed information on your existing data pipelines, gaining insight into their configuration.
  • Edit
    Modify and fine-tune your data pipelines to adapt to evolving data processing needs.
  • Delete
    Efficiently remove data pipelines that are no longer necessary, ensuring a streamlined workspace.
  • Debug
    Identify and address potential issues within your data pipelines, promoting optimal data transformation and accuracy.

With these configurations, Data Pipeline empowers you to efficiently process and enhance your data, making it more valuable and meaningful to end users within the Data Streams feature.

DataStore Connector

The transformed data residing within data streams finds its way to a permanent storage destination, such as Elasticsearch or MySQL, via the DataStore Connector.
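Conceptually, a connector takes each processed record from an output stream and indexes it in the storage backend. The following Python sketch shows this idea against Elasticsearch's standard `_doc` REST endpoint; the URL, index name, and function are assumptions for illustration, not the platform's connector code.

```python
import json
import urllib.request

def write_to_elasticsearch(record, index="processed-logs",
                           base_url="http://localhost:9200"):
    """Index one document via Elasticsearch's standard _doc REST endpoint."""
    request = urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)  # Elasticsearch acknowledges with JSON

write_to_elasticsearch({"host": "web-01", "severity": "high"})
```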

You can configure the DataStore Connectors in the following ways:

  • Create
    Establish new DataStore Connectors to enable the seamless transfer of data from data streams to your chosen storage destination.
  • View
    Access detailed information about your existing DataStore Connectors, providing insights into their configuration and performance.
  • Edit
    Modify and refine your DataStore Connectors to adapt to evolving storage requirements, ensuring data accuracy and accessibility.
  • Delete
    Efficiently remove DataStore Connectors that are no longer needed, maintaining a clean and organized workspace.

With these configurations, DataStore Connectors facilitate the secure and efficient transfer of data from Data Streams to a permanent storage unit, ensuring data integrity and accessibility for end users.

Flows

Flows serve as a dynamic visual representation, elucidating the intricate data flow within the system. They vividly illustrate the path data follows, originating at the data source, traversing collection agents, and concluding in permanent storage.

Managing Flow Dimensions

  • Input Data Stream: Select specific devices and elements within the Input Data Stream to customize and tailor the data flow.
  • Pipeline Data: Tweak and configure the flow within the pipeline, ensuring an optimized and efficient journey for your data.
  • Output Data Stream: Customize the flow parameters within the Output Data Stream to align with your specific requirements.
  • DataStore Connector: Manage and fine-tune the final stretch of the data’s journey into permanent storage, ensuring data integrity and accessibility.

Flow Dimension Controls

  • Zoom In (+): Enlarge specific sections of the flow for detailed examination and fine-tuning.
  • Zoom Out (-): Zoom out to get an overarching view of the entire data journey.
  • Reset to Normal ([]): Revert the flow dimensions to their standard settings for a comprehensive perspective.

Flows empower you to visualize, adapt, and optimize the data’s journey from its origin to permanent storage, enhancing your understanding and control over the data processing pipeline.

Further Reading

  1. I/O Streams Configuration
  2. Data Pipeline Configuration
  3. DataStore Connector Configuration
