Working with Data Model

To add a new Data Model, simply click the + icon on the Data Models listing page.

Data Model creation is a four-step process. These steps are as follows:

  1. Basic Information: Fill in the basic details.
  2. Prepare Data Query: Configure the data query.
  3. Do Transform: Apply transformations as needed.
  4. Set Properties: Define the properties of your Data Model.

Basic Information

You’ll land on the Data Model’s configuration page, where you provide basic information about the data model.

  • Data Model Name: Give your data model a unique name for easy reference.
  • Input Data Type: You can choose between “Metric” or “Event” based on your needs.
    • Metric: Use this for metric-based data models.
    • Event: Select this for alert-related or event-based data models.
  • Description (Optional): You can add a brief description for future reference.

Explore Data Store

It is a helpful feature that offers users a quick view of the information stored in a Data Store. It provides information about tables or indices and their schema, and it assists users in selecting and configuring the correct field in the data model and workspace stages. You can manage your data stores directly from the listing page; refer to the Data Store section for more details.

Prepare Data Query 

This step is where you can prepare your queries to fetch the required data.

The syntax for building and writing queries may differ depending on the selected data store. For each data store, you can either build your queries with a query builder or write raw queries. Here are the tasks you can do in this section:

  1. Build a Query with a Query Builder
  2. Specify a Raw Query
  3. Run all Queries
  4. Query editor functions

Build a Query with a Query Builder

A model query can be built in two ways:

  1. Using Build Query
  2. Using Write Query

To start configuring the query, you must first select the Data Store. For more details on managing Data Stores, please see the Data Store section.

💡Note: In the example below, the Elasticsearch Data Store is considered.

A data model can have multiple queries based on your needs. For example, if you want to use data from different Data Stores, you can create more than one query and use transformations in the subsequent steps to combine them.

Example: If you use Hyperscale DB for Query 1 and ElasticSearch Data Store for Query 2, simply use a Join transform in step 3 to merge them according to your needs.

Using Build Query Option

Now, you can create a database query using the Query Builder. This tool helps you build basic select queries for various supported databases using VuNet's Query Language.

In the “Select Table” section, choose the table or index from which you want to retrieve data.

Field for Time filtering (Optional)

If your data is time-series data, you typically want the Data Model to return data for the time range selected in your dashboard. In that case, select the field containing the timestamp corresponding to your time-series data. This dynamically serves the Global Time Filter (Last 15 min, 24 hours, 7 days, and so on) as input for the given query. If no field is selected, the query will retrieve all the data from the Data Store, regardless of the chosen time range.
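
For illustration, assuming a table named metrics with an @timestamp field (hypothetical names), a "Last 15 min" selection in the Global Time Filter conceptually translates into a time-range predicate like the one below. This is a sketch only; the actual generated syntax depends on the selected Data Store.

-- Illustrative sketch of the predicate injected by the Global Time Filter
SELECT host, cpu_util
FROM metrics
WHERE "@timestamp" >= now() - INTERVAL 15 MINUTE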

Elasticsearch Filter Parameter

You can add a filter to fetch data from the Data Store. It must follow proper Elasticsearch syntax. Please note that this is an optional field and is only visible when using Elasticsearch Data Sources.

It uses a key:value syntax and the following is an example:

 (sev:error AND (state:Prog OR state:Comp))

💡Note:

1. If you choose the Hyperscale Data Store, then the Hyperscale Filter Parameters must follow proper SQL syntax.

Example: CPU usage >= 70 AND memory usage >= 70

If the selected Data Store is Elasticsearch, a few additional fields are available, as described below.

2. If you choose the Hyperscale Data Store, then you can optionally use VQL (VuNet Query Language) Filters. VQL is an easy-to-use, text-based query language for filtering data. Use VQL to filter documents based on substring matches, value matches, the existence of a value for a field, and similar simple and compound conditions.

Before using the VQL filter:

Output

Click on Run Query to view the result.

After using the VQL filter:

Output

Click on Run Query to view the result.

Aggregation in Elasticsearch

  1. Aggregations: Metric aggregation allows you to compute metrics in a specified form from the selected data points. For example: You can use the average aggregation with the memory utilization field to get an average value of memory utilization.
  2. Group By (Optional): Group-by allows you to organize data into different groups/buckets/categories. For example: A metric (CPU Utilization) is calculated for each host which means you are bucketing data at the host level.
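
As a rough SQL analogue (the table and field names here are illustrative, not part of the product), combining a metric aggregation with a Group By behaves like:

-- Average memory utilization, bucketed per host
SELECT host, AVG(memory_util) AS avg_memory_util
FROM system_metrics
GROUP BY host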

Aggregations

Choose an Aggregation from the dropdown based on your requirements.



To add a new Aggregation, click on the + button.



There are different aggregations available, and you can select one from the dropdown. Please note that all aggregations fetch data based on the time range selected in the Global Time Selector. Here is a list of the supported aggregations:

a. Min: This allows you to fetch the minimum value of the selected field.


  • Select Aggregation: Choose min from the drop down.
  • Select Field: Choose a field of interest.
  • Display name: It is the name you can give to this aggregation. This name will be visible to users in Step 3 and Step 4 of the Data Model configuration.

b. Average: Calculates the average value for a given field.

c. Count: Provides the total number of documents in that index.

d. Max: The max aggregation returns the maximum value of a numeric field. Example: Max aggregate for the ‘duration’ field will return the highest value of the ‘duration’ field in all the records.

e. Percentiles: Computes a specific percentile for all the values in a given field. For example, the 10th percentile for ‘timetodetect’ is the value that is greater than the ‘timetodetect’ value of 10% of the transactions.

💡Note: If you choose Hyperscale DB, you can also select a “Calculation Type” for Percentiles, which is otherwise available only in Write Query.

f. Sum: The sum aggregation returns the total sum of a numeric field.

g. Top Hits: Top Hits allows you to get the most recent data for the selected field. For example: Top Hits for a status field of a service will always return the most recent status of the service.

h. Unique Count: Provides a distinct count for selected fields. For example: A unique count of the host field will provide the number of unique hosts available in the selected table/index.

Group By (Optional)

You can group the aggregation into different buckets using methods like Date Histogram, Histogram, and Terms. Click on ‘Add New Group By’ to configure a bucket.

Click on Add Group By to include the buckets.

The following bucket types are supported by Group By:

a. Date Histogram: You can use the “Date Histogram” to group time series data into continuous intervals:

In the Date Histogram, there are three fields that need to be filled in to make the date histogram aggregation work (a conceptual SQL sketch follows the list):

  • Select Fields: Select the date and time field that represents the record time. In most cases, it would be @timestamp.
  • Time Unit: The unit for the time interval. Auto is the default value.
  • Interval: The time interval, expressed in the selected time unit.
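
Conceptually, a Date Histogram groups records into fixed time intervals. As a rough sketch, assuming a ClickHouse-backed table such as click.apm_span from the Write Query example (field names are illustrative), a 5-minute Date Histogram behaves like:

-- Illustrative sketch: 5-minute time buckets with one aggregated value per bucket
SELECT toStartOfInterval("@timestamp", INTERVAL 5 MINUTE) AS bucket,
       AVG(duration) AS avg_duration
FROM click.apm_span
GROUP BY bucket
ORDER BY bucket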

b. Histogram: It is used to group numerical data into nearby intervals based on the frequency of occurrence. The “Interval” represents the size of these intervals in seconds.

c. Terms: Terms are used to group data based on the value of the selected field.

Under “Terms,” there are four fields you need to fill in to make the terms aggregation work (a conceptual SQL analogue follows this list). These fields include:

  • Order By: The results of the grouped data are ordered/sorted based on the field selected in Order By.
    • Note: While using Group By, you can also sort your data based on your preference. The sorting field can be different from the field used for grouping; hence, there is a separate selection.
    • Example: Given (State, City) records MH Pune, MP Bhopal, and MH Mumbai, ordering by State and then City gives: MH Mumbai, MH Pune, MP Bhopal.
  • Field: This is the field that will be considered for ordering the data.
  • Order: This determines whether the data is sorted in ascending or descending order. There is no default sort, so you have to choose one of the two.
  • Size: This limits the number of buckets/groups that the query should return.
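
As a conceptual SQL analogue (illustrative names only), a Terms group-by with Order By, Order, and Size roughly corresponds to:

-- Group by host, order buckets by document count (descending), keep 10 buckets
SELECT host, COUNT(*) AS doc_count
FROM events
GROUP BY host
ORDER BY doc_count DESC
LIMIT 10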

Advanced Settings: You can select a specific time zone by enabling Advanced Settings.

  • Usually, Data Models execute queries using the timezone of the database configured when the Data Store was set up.
  • In a rare case scenario, if the timezone of timestamps in a particular table is somehow different from the one configured for the selected Data Store, then make use of Advanced Settings to set the appropriate timezone for that particular query.
  • This option overrides the setting at the Data Store level and is not intended to be used unless you know the consequences.

For example, if the timezone configured at the Data Store level is UTC, but for some reason one particular table/index in that Data Store follows the Asia/Kolkata timezone, then enable Advanced Settings and override the timezone to Asia/Kolkata for that particular query on that table.

💡Note: The display timezone (i.e. the timezone in which data will be displayed in previews, etc.) will always be the local timezone.

Dynamic Filters

Check out how to configure this in the “Support for Dynamic Filters in Dashboards” section below.

Write Query 

You can specify a raw database query for any data source if you want to use more advanced source-based queries. To do this, click on Write Query and enter the raw query. For example, here is a sample SQL query to calculate the top failed transactions.

It selects span_attributes_http_target from the table click.apm_span and counts the failed transactions.

Script:

SELECT
    span_attributes_http_target as Transaction,
    count() as Failed
FROM click.apm_span
WHERE $__timeFilter("@timestamp")
  AND span_attributes_http_target != ''
  AND status_code > 0
  AND $__dynamicFilter('resource_attributes_application', 'Application')
GROUP BY span_attributes_http_target
ORDER BY Failed DESC

💡Note:

  • Valid syntax: Existing and added features.
    • $__timeFilter("time_column")  # with double quotes [EXISTING]
    • $__timeFilter(time_column)  # without quotes [ADDED FEATURE]
    • $__timeFilter(db_function("time_column", *args, **kwargs))  # with function wrapping [ADDED FEATURE]
    • $__timeFilter(db_function(time_column, *args, **kwargs))  # with function wrapping [ADDED FEATURE]
  • Invalid syntax: Do not use. Deprecated old syntax, will be removed in a future release. Continues to work currently due to backward compatibility.
    • $__timeFilter('time_column')  # with single quotes [EXISTING, DEPRECATED]

After completing this query, you can preview, validate, and move on to the next step.

Run Query

You can now preview any query that is created. There are two ways to preview,

  • Run Query: This will run a single data model query.
  • Run all Queries: This will run all the Data Model queries. It will display all the results of every query specified in that data model.

Run Query: Click on the run query option to get a preview of the selected query.

The results will be displayed as shown below.

To run all the queries, click on the Run all queries option at a model level to preview all the queries in a data model. 

The results will be displayed as shown below.

Once the preview looks accurate, click on Validate and Next to move to the third step.

Skip Validation

During the creation of a data model where data is currently unavailable in the data store, queries cannot be validated. In such instances, clicking the ‘Validate and Next’ button will result in an error. To bypass data checks and proceed to the next step, you can enable ‘Skip data checks’ before clicking on ‘Validate and Next’.

You can skip the validation by clicking Skip Data Checks as shown below.

Such skipped Data Models will have a warning tag before the name on the landing page as shown below.

Other Query Options

You can use other operations on a query, such as Copy, Saved Search, Disable, and Delete.

Copy: This allows you to copy an existing query to create a new query within the same data model.

Saved Search: This allows you to populate a query from a Saved Search.

Click on Proceed to auto-populate the query.

Disable: You can disable a query in a data model. When a data model is used somewhere, a disabled query won’t be executed.

Delete: You can delete a query both at the local and global levels. This means you can remove a query entirely from your data model.

Collapsible arrow: You can click the arrow button to hide a query’s content and keep only one query expanded, so you can focus on it.

Do Transform

After successfully validating the query section, you’ll enter the transformation section. The transformations are used for data manipulation. You can combine data from two tables or data stores by using a join transformation when required.

On this page, you’ll see a list of supported transformations. Each transformation here comes with a description of its highlights. Clicking on a transformation will take you to the respective transformation screen, where you can create and configure the transformation.

Single Query Transform: 

  1. Organize Fields
  2. Add Field
  3. Filter By
  4. Aggregate
  5. Sort By
  6. Top N

Multiple Query Transform: In addition to incorporating all the single query transformations, two additional transformations are available for handling multiple queries.

  1. Join
  2. Concatenate

Organize Fields

The “Organize Fields” transformation allows you to rename fields or columns in a query. Here’s how to use it:

  • Choose the Organize Fields option.
  • Select the query you want to modify.

  • Select the Source field from the dropdown and provide the new field name in the Target field.
  • If you wish to rename other fields from the same query, click ‘Add Field’ to add another field you would like to rename.

  • If you need to organize fields for another query in the same data model, use the Add transformation option, select ‘Organize field’, and follow the same steps.
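
In effect, Organize Fields is similar to aliasing columns in SQL. A minimal sketch, with illustrative source and target names (query_output is a placeholder for the query’s result):

-- Rename source fields to friendlier target names; other fields are unaffected
SELECT span_attributes_http_target AS Transaction,
       avg_duration AS Average_Duration
FROM query_output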

Add Field

The “Add Field” transformation allows you to add a new field to the output of a query. You can do this using arithmetic operations, custom values, or expressions. 

Select a query in the Query dropdown. Choose a mode:

  • Arithmetic Operation
  • Custom
  • Expressions

Arithmetic Operation

With the “Arithmetic Operation” mode, you can create a new field based on arithmetic operations between existing fields. Here’s how to do it:

  • Select a Field from the Dropdown.
  • Choose an Arithmetic Operation of your choice (addition, subtraction, multiplication, or division).
  • Select a second Field to perform the arithmetic operation with.
  • Provide an Alias for the newly calculated field.

The existing fields will stay the same, and a new field will be added based on your specifications.
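
Conceptually, the Arithmetic Operation mode adds a derived column, much like the SQL sketch below (the field names and the query_output placeholder are illustrative):

-- New field "ratio" computed from two existing fields; the existing fields are kept
SELECT avg_duration, doc_count, avg_duration / doc_count AS ratio
FROM query_output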

Output

Custom

In the “Custom” mode, you can create a new field with a static value in the Query. The value can be a number (integer or float) or a string. Here’s how to do it:

  • Provide the Field Value, which can be a number or string.
  • Provide an Alias for this new field.

Example: You can give a field value of 100 and name it New Field 1. You can add a new “Add Field” transform to find the ratio of Average Duration to Count using Arithmetic Operation mode. The resulting ratio is stored under the Alias name “Divide”.

Now the resulting “Divide” column is multiplied by the static value “New Field 1”, and the result is stored under the Alias name “Percentage”.

Output:

Once you’ve entered this information, click the Run Transform button to see the changes reflected in the preview.

In the output, you’ll discover columns such as “New Field 1” with a static value, the “Divide” column representing the result of avg_duration divided by count, and finally, the “Percentage” column showcasing the multiplication of the Divide column and New Field 1.

Expressions

In the “Expressions” mode, you can create a new field based on a user-defined expression that involves operations like addition, subtraction, multiplication, or division between two or more existing fields or static floats/integers in the same Query. Here’s how to do it:

  • Write a custom expression involving two or more fields and multiplicative factors as applicable.
  • Enclose the field names you want to compute in curly braces, like this: {column1}.
  • Use round brackets for grouping, for example to compute a complex expression such as 2*({column1}+{column2})/{column3}.
  • Do not use square brackets [].
  • Ensure that curly braces are only used for specifying field names in the expression.
  • Provide an Alias for each newly calculated field.
  • You can compute multiple fields using multiple expressions in the same transformation.

Example: Here we are calculating the percentage of average duration to the count using the expression {Avg_Duration}/{count}*100 and the result is stored under the Alias name New 1

💡Note:

  • If the brackets are not balanced, or if the field data types don’t match the requested computation (e.g. str/float), a corresponding error is thrown.
  • Nested fields such as {system.cpu.util.norm}, {system.cpu.util.min}, and {system.cpu.util.init} are separate selectable fields. If selected, they become separate fields in the visualized data. Essentially, any nested-value fields are flattened.

Once you’ve entered your expressions and aliases, click the Run Transformation button to see the changes reflected in the preview. The existing fields will remain unaffected, and new fields will be added according to the specifications of your expressions.

Filter By

The “Filter By” transformation allows you to filter or remove specific portions of the data in a frame.

Here’s how to use it:

  • Choose the query from the dropdown menu and specify the condition for data filtration.
  • Select the field and enter the corresponding matching value(s) to perform the filtering.

This transformation helps you refine your data by including specific records based on the defined conditions and matching values.

Example: You are filtering the data on the condition that Avg_Duration is greater than 0.5.

Table before filtering

Filter-By Transformation

Table after Filtering
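
As a rough SQL analogue of this example (assuming the condition keeps matching rows; query_output is a placeholder):

-- Keep only the rows where Avg_Duration exceeds 0.5
SELECT *
FROM query_output
WHERE Avg_Duration > 0.5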

Aggregate

The “Aggregate” transformation is used to summarize or aggregate the output of a query. It also offers time-bucketing capabilities.

 Here’s how to use it:

  • Select the query from the dropdown that you want to aggregate.
  • If needed, choose one or more “Group By” fields to group your data.
  • Next, you have two options:
    • Aggregate without Time Bucketing
    • Aggregate with Time Bucketing

The “Aggregate” transformation allows you to condense and analyze your data, providing you with insights and summaries based on your chosen fields and grouping criteria. Time bucketing helps in organizing data based on time intervals, which can be particularly useful for time-series data.

a) Aggregate without Time Bucketing

This process allows you to aggregate data based on the chosen source field, perform various calculations, and create new fields with summarized information. You can add and remove multiple aggregations as needed.

Example: We are calculating the Unique Count of the source field event_id; the result is stored under the alias name Unique and grouped by Severity.

  • Disable Time Bucketing: Time bucketing should be disabled by default for this option.
  • Choose a Source Field: Select the field you want to aggregate.
  • Select the Calculation: Choose from the available options for aggregation:
    • Mean
    • Min
    • Max
    • Count
    • Total
    • First
    • Last
    • Unique Count
  • Provide an Alias: Give a name to the aggregated field. This name will represent the aggregated result.
  • Add Multiple Aggregations: You can add multiple aggregations by clicking the “+ Add” button. You can perform multiple aggregation operations on the same field or different fields simultaneously.
  • Remove an Aggregation: If you want to remove an aggregation that you’ve added, click the “Delete” button next to that aggregation.

Output:

After clicking on Run Transform you will find the Unique Counts of the Event ID grouped by the severity.
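
The aggregation above is conceptually similar to the following SQL (illustrative placeholder names):

-- Unique count of event_id per Severity
SELECT Severity, COUNT(DISTINCT event_id) AS "Unique"
FROM query_output
GROUP BY Severity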

b) Aggregate with Time Bucketing

This process allows you to perform time-based aggregation on data, summarizing information within specified time intervals. The other aggregation settings, like source field and calculation methods, are similar to aggregating without time bucketing.

Example: You are calculating the Unique Count of the source field event_id; the result is stored under the alias name Unique and grouped by Severity.

You are bucketing the start_time with intervals of 10mins and the result is stored under Bucket Label.

  • Enable Time Bucketing: Turn on the Time Bucket option to perform time-based aggregation.
  • Choose a Time Column: Select the column that contains time-related data, which will be used for time-based aggregation.
  • Specify Bucketing Interval: Define the time interval in minutes for creating time buckets. Data will be aggregated within these specified time intervals.
  • Configure Aggregation: Configure aggregation settings as you did in the previous case (without time bucketing). Choose the source field, select the calculation method, provide an alias, and add or remove multiple aggregations as needed.
  • Preview: After configuring all the settings, click the Run Transform button to apply these aggregations with time bucketing. You can review the aggregated results in the preview.

Output:

After clicking on Run Transform you will find the Unique Counts of the Event ID grouped by the severity with the start_time interval of 10 minutes.

Sort By

Sort By transformation is used to sort a query output by the selected field. When reverse is enabled, the values are sorted in descending order.

To use the Sort By transformation, select a query and then choose the field for sorting.

  • Select a Query.
  • Choose a Field. The field is a multi-select dropdown. If multiple fields are selected, multi-level sorting is enabled. i.e. data is first ordered by the values in the first selected field, then by the values in the second, and so on.
  • Optionally, enable Reverse for descending order.
  • Click Run Transformation.

Ascending Sort-By: Select a query and choose the field. The Reverse button is disabled by default, which sorts the values in ascending order.

Descending Sort-By: Click the Reverse button to sort the values in descending order.

Top N

This transformation helps you choose the top N rows from a data frame, allowing for further operations.

Join

This transformation allows you to join the outputs of two queries into a single frame, functioning like a typical SQL join.

  • Choose two queries in the Query dropdown.
  • Select the fields to join the queries using the On Fields dropdown. It will show only the intersection of fields from both queries.

💡Note: Please find the individual outputs of Query 1 and Query 2.

Here, before using the Join transformation, you configure two queries in the Prepare Data Query section, where Query 1 gives you the count of Respcode.

The output of Query 1.

Query 2 displays data on the number of Respcode per transaction category.

The output of Query 2

Once done, choose the type of Join you want to perform. Available options are:

Left join: It returns all records from the left table and the matched records from the right table.

Right join: It returns all records from the right table and the matched records from the left table.

Inner join: It returns only the records that have matching values in both tables.

Outer join: It returns all records from both tables, combining matching rows where they exist.

Click on the Run Transform button to see the changes in the preview.

💡Note: You can join multiple fields across multiple queries using this transformation.
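
The join types above behave like their SQL counterparts. A conceptual sketch using the two queries from this example (query1_output, query2_output, and the column names are illustrative placeholders):

-- Left join: keep every row of Query 1 and attach matching rows of Query 2 on Respcode
SELECT q1.Respcode, q1.respcode_count, q2.transaction_category
FROM query1_output AS q1
LEFT JOIN query2_output AS q2
  ON q1.Respcode = q2.Respcode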

Concatenate

Concatenate combines the results of two Queries into one frame.

Choose two Queries in any order from the dropdown.

Pick either Horizontal (side by side) or Vertical (stacked) concatenation.

  • Horizontal – Concatenate the stages horizontally (left to right)
  • Vertical – Concatenate the stages vertically (top to bottom)

Click Run Transform to see the merged results.
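
Conceptually, Vertical concatenation behaves like a SQL UNION ALL over results that share a similar schema, while Horizontal concatenation places the columns of both results side by side. A minimal sketch of the vertical case (placeholder names):

-- Stack the rows of two query outputs that share the same columns
SELECT "@timestamp", host, cpu_util FROM query1_output
UNION ALL
SELECT "@timestamp", host, cpu_util FROM query2_output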

 a) Horizontal Concatenate

You are Horizontally concatenating Query 1 and Query 2 to create a new table.

As seen above, this is aggregated data from two or more sources having a similar number of rows in them.

Output:

b) Vertical Concatenate

You are Vertically concatenating Query 1 and Query 2 to create a new table.

As seen above, this is time series data with a similar schema coming from two different queries.

To concatenate these into a single stage/frame, use Vertical Concatenate.

Output:

💡Note: Preview is restricted to a maximum of 20 records.

Set Properties

You can use the ‘Set Properties’ step to select the metrics and buckets that should be exposed for this data model. Additionally, you can set properties for ML modeling and threshold for alerting purposes.

Click Add Metric Attribute to add the Metric Name, Severity Rules, and other attributes. These definitions are used by the Visualization, Alerting, Reporting, and Machine Learning modules to handle the metrics and their monitoring/reporting.

Select Metrics: Pick a field representing the Metric or key data point. You have the following 2 sections to make the required configurations.

1. Severity Rules: A severity rule is a value range at which a metric might require monitoring or alerting.

  • No Thresholding: Choose this if you do not want to set any thresholds.
  • Use Dynamic Thresholding: It helps you in auto-baselining and Health Scoring for the given metric using ML. This happens when we use the Data Model in Alerts. Dynamic thresholds help detect problems that may have gone unnoticed when using static thresholds.

  • Use Static Thresholding: Set numeric values to receive notifications when the metric exceeds specific thresholds (up to 10 thresholds).

2. Other Attributes (Optional): The attributes presented in this section are tailored for Data Models utilized within the RCA Workspace.

Each metric can now be given information on the following properties.

  • Select Metric Name: Choose a unique name for the metric to use in your workspaces.
  • Upper Cap (optional): If you know the maximum value, you can specify it here (numeric).
  • Frequency (optional): Indicate how often the metric data is recorded in minutes (numeric).
  • Signal Category: Categorize your metric as Bounded (known upper and lower limits), Unbounded (unknown limits), or Binary (only two values).



💡Note: All these settings are very useful when the Data Model is used in the RCA Workspace.

Define Time Fields in Data Model

This helps you enable the timestamp for the given Data Model. Time-series data is data recorded over consistent intervals of time. You should enable this whenever the final data is time-series data, which can be the case when bucketed data is fetched or when raw time-series data is queried.

Define unique keys for which the metrics from this Data Model are to be tracked

Unique Keys are fields in the data model output that help in distinguishing different categories of data in the data model. The Alerting and Machine Learning modules use these to track the metrics separately for each value of the unique keys.

Run Set Properties: Once you’ve finished configuring your metrics, simply click the Run Set Properties button at the top right. This will validate the metric properties, and the Data Model will perform a background check based on the new information you’ve provided.

Preview

Click on the blue eye icon to preview the results.

Save

Once validation is complete, the save button gets activated at the bottom right. Click on it to save the Data model. 

With this, the Data Model creation is now complete.

Support for Dynamic Filters in Dashboards for UTM and Insights Visualization

Dynamic filters allow you to set up custom filters within your queries using the “$” symbol with a variable.

For Hyperscale DB using Build Query:

Enable the Toggle button to configure the Dynamic Filters in the Build Query section of the Data Model.

Select the Field name, based on which you want to filter the data, and provide the Variable Name. The same variable name will be used to configure ‘variable’ in the dashboard and use it as a Filter. (Variable name is case sensitive). Click on the + New Filter button to add multiple filters.

This is what the whole Build Query configuration looks like for a sample ibmb_transactions table.

For Timescale Datastore (PSQL):

  • When defining a query for a data model, include placeholders for where clauses based on important filtering fields.
    • Example: SELECT name, ip, location FROM T1 WHERE $__dynamicFilter("location", "location") GROUP BY location

    • Example: If we have multiple dynamic filters in a single query, then the query should be sent as: SELECT c1, c2, c3 FROM T1 WHERE $__dynamicFilter("c1", "c1") AND $__dynamicFilter("c2", "c2")

  • When running these queries, if the variable (e.g., "location") is not available, the where clause is excluded. However, if the variable is present, it is used to create the where clause based on the user’s selection in the dashboards.
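
For illustration, assuming a dashboard variable bound to column c1 is set to the hypothetical value 'foo', the macro roughly expands as sketched below; the exact expansion is handled internally and may differ.

-- Hypothetical expansion when the variable for c1 is selected in the dashboard
SELECT c1, c2, c3 FROM T1 WHERE c1 = 'foo'
-- Hypothetical expansion when the variable is not set (the where clause is excluded)
SELECT c1, c2, c3 FROM T1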

💡Note: 

    • Valid syntax: Added features.
      • $__dynamicFilter("column", $variable)  # with double quotes [ADDED FEATURE]
      • $__dynamicFilter(column, $variable)  # without quotes [ADDED FEATURE]
      • $__dynamicFilter(db_function("column", *args, **kwargs), $variable)  # with function wrapping [ADDED FEATURE]
      • $__dynamicFilter(db_function(column, *args, **kwargs), $variable)  # with function wrapping [ADDED FEATURE]
    • Invalid syntax: Do not use. Deprecated old syntax, will be removed in a future release. Continues to work currently due to backward compatibility.
      • $__dynamicFilter('column', 'variable')  # with single quotes [EXISTING, DEPRECATED]
    • For the Elasticsearch Data Store, Dashboard Filters can be used with both Build Query and Write/Raw Query.
    • For the PSQL Data Store, use Write Query exclusively for Dashboard Filters.
