Working with Data Model > New Data Model Creation

New Data Model Creation

To add a new Data Model, simply click the + icon on the Data Models home page.

To create a new Data Model, click the “Create” button and follow these four simple steps:

  1. Basic Information: Fill in the basic details.
  2. Prepare Data Query: Configure the data query.
  3. Do Transform: Apply transformations as needed.
  4. Set Properties: Define the properties of your Data Model.

Basic Information

You’ll land on the Data Model’s configuration page, where you provide basic information about the data model.

  • Data Model Name: Give your data model a unique name for easy reference.
  • Input Data Type: You can choose between “Metric” or “Event” based on your needs.
    • Metric: Use this for metric-based data models.
    • Event: Select this for alert-related or event-based data models.
  • Description (Optional): You can add a brief description for future reference.

Explore Data Store

It is a helpful feature that offers users a quick view of the information stored in a Data Store. It provides information about tables or indices, their schema, and sample data.

This feature assists users in selecting and configuring the correct values in the data model and workspace stages. Additionally, it offers insights into the raw data through visualizations.

Add Query 

  1. Data Model Query
  2. Build a Query with a Query Builder 
  3. Specify a Raw Query
  4. Run Query 
  5. Query editor functions

Data Model Query

A data model query in data models is like a database select query used to fetch data from a database. Keep in mind that the query language or syntax can vary between different databases, so some familiarity with these queries is helpful to understand how they work.

Build a Query with a Query Builder

A model query can be built in two ways: 

  1. Using a Query Builder
  2. Directly specifying a Raw query.

“Prepare Data Query” is one of the three important sections in a Data Model creation.

You need to pick a Data Source when creating a query, but remember that your data sources should have already been set up using Data Store configurations.

A model can have multiple queries based on your needs. For example, if you want to use data from different Data Stores, you can create more than one query and use transformations to combine them later.

Build a Database Query

Now, you can create a database query using the Query Builder. This tool helps you build basic select queries for various supported databases using Vunet’s Query Language.

In the “Select Table” section, choose the table or index from which you want to retrieve data. This listing may include Elasticsearch data sources and more.

Note: The “Field of Time Filtering” column, which is optional, includes settings related to time. For instance, you can select options like Timestamp or Date for time-related configurations.

Field for Time filtering (Optional)

Specify the timestamp column of that table. 

If the selected Datastore is elasticsearch, we’ll have a few more fields that involve:

1. Number of rows

You can set the maximum size limit for the query results. This field is specific to Elasticsearch Data Sources.

2. ES Filter

You can choose a filter based on your needs. Please note that this is an optional field and is only visible when using Elasticsearch Data Sources.

To support Aggregation in ElasticSearch you have the following 2 options:

  1. Aggregations
  2. Group By (Optional)

Aggregations

Choose an Aggregation from the dropdown based on your requirements.



To add new fields, click on the + button.



  • Min: Finds the minimum value for a given field.

  • Average: Calculates the average value for a given field.

  • Count: Provides the total number of documents in that index.

  • Max: Finds the maximum value for a given field. 
  • Percentiles: Shows where a certain percentage of observed values occur. 
  • Sum: Adds up all the values for a given field. 
  • Top Hits: Tracks the most relevant document being aggregated.
  • Unique Count: Provides a distinct count for selected fields.

Group By (Optional)

Click on Add Group By to include the Buckets.

Following Buckets supported by Group By.

1. Date Histogram: You can use the “Date Histogram” to group time series data into continuous intervals:

In the Date Histogram, we further have 3 field that needs to be filled to make term aggregation work which involves:

  • Select Fields: Select the required field.
  • Time Unit: It is the unit for which time interval needs to take place. Auto is the default value for this.
  • Interval: It is the time interval. Units will be as per time unit.

2. Histogram: It is used to group numerical data into contiguous intervals based on the frequency of occurrence. The “Interval” represents the size of these intervals, typically in seconds.

3. Terms: The “Terms” query is used to find documents based on a precise value.

Under “Terms,” there are four fields you need to fill in to make the term aggregation work. These fields include:

  • Order By: This allows you to add one or more sorts on specific fields.
  • Field: This is the field for which the term needs to be considered. This field must match.
  • Order: This determines the order in which the data needs to be sorted, either in ascending or descending order.
  • Size: This sets the size limit for which bucket to take place.

Time Zone: You can select a specific time zone by enabling Advanced Settings and choosing the desired time zone.

Specify a Raw Query 

You can specify a raw database query for any data source if you want to use more advanced source-based queries.

To do this, click on Edit Query and enter the raw query. For example, a sample SQL query (for data sources like PostgreSQL, MySQL, etc) could be:

SELECT timestamp, host, the value from network_heath where type==’in_bandwidth_p’ and $__timeFilter(“timestamp”)

$__timeFilter(“timestamp”)  is a mandatory step to be given in any raw query 

Note: This is used to pass the time range to the query, and you cannot use any other custom time range based on the original query language.

After completing this query, you can preview, validate, and move on to the next step.

Run Query

You can now preview any query that is created. There are two ways to preview,

  • Run Query: This will run a single data model query.
  • Run all Queries: This will run all the Data Model queries. It will display all the results of every query specified in that data model.

Run Query: Click on the run query option to get a preview of the selected query.

The results will be displayed as shown below.

To run all the queries, click on the Run all queries option at a model level to preview all the queries in a data model. 

The results will be displayed as shown below.

Once the preview looks accurate click on Validate and Next to move to the third step. 

Skip Validation

When you click Validate and Next sometimes it throws an error.

You can skip the validation by clicking ticking Skip Data Checker as shown below. This helps in creating Data Models for future use when the Data is unavailable.

Such skipped Data Models will have a warning tag before the name on the landing page as shown below.

Other Query Options

There are other operations you can use to Copy, Delete, and Disable a query. 

Copy: This allows you to copy an existing query to create a new query within the same data model.

Disable: You can disable a query in a data model. When a data model is used somewhere, a disabled query won’t be executed.

Delete: You can delete a query both at the local and global levels. This means you can remove a query entirely from your data model.

Do Transform

After successfully validating the query section, you’ll enter the transformation section.

Transformations are used for data manipulation. Keep in mind that they are not always mandatory for data models with only one query, but they are necessary, especially joint transformations, for data models that have more than one query.

On this page, you’ll see a list of supported transformations. Each transformation comes with a description of its highlights.

Clicking on a transformation will take you to the respective transformation screen, where you can create and configure the transformation.

Single Query Transform: 

  1. Organize Fields
  2. Add Field
  3. Filter By
  4. Aggregate
  5. Sort By
  6. Top N

Multiple Query Transform: 

  1. Join
  2. Concatenate

Organize Fields

The “Organize Fields” transformation allows you to rename fields or columns in a query. Here’s how to use it:

  • Select the query you want to modify.

  • Choose the “Organize” option for that specific query.
  • Click the Add button to select the fields or columns you want to transform.
  • If you wish to rename other fields from the same query, repeat the steps from ‘Add’ for each one.
  • If you need to organize fields for another query in the same model, use the Add transformation option and follow the same steps.

After you’ve organized the fields as needed, you can either run the transformation immediately or proceed to validate the transformation and move to the next section.

Add Field

The “Add Field” transformation allows you to add a new field to the output of a query. You can do this using arithmetic operations, custom values, or expressions. 

Select a query in the Query dropdown. Choose a mode:

  • Arithmetic Operation
  • Custom
  • Expressions

Arithmetic Operation

With the “Arithmetic Operation” mode, you can create a new field based on arithmetic operations between existing fields. Here’s how to do it:

  • Select a Field from the Dropdown.
  • Choose an Arithmetic Operation of your choice (addition, subtraction, multiplication, or division).
  • Select a second Field to perform the arithmetic operation with.
  • Provide an Alias for the newly calculated field.

The existing fields will stay the same, and a new field will be added based on your specifications.

Custom

In the “Custom” mode, you can create a new field with a static value in the Query. The value can be a number (integer or float) or a string. Here’s how to do it:

  • Provide the Field Value, which can be a number or string.
  • Provide an Alias for this new field.

Once you’ve entered this information, click on the Run Transform button to see the changes reflected in the preview.

Expressions

In the “Expressions” mode, you can create a new field based on a user-defined expression that involves operations like addition, subtraction, multiplication, or division between two or more existing fields or static floats/integers in the same Query. Here’s how to do it:

  • Write a custom expression involving two or more fields and multiplicative factors as applicable.
  • Enclose the field names you want to compute in curly braces, like this: {column1}.
  • Use round brackets for other purposes, for example, to compute a complex expression. For example, 2*({column1}+{column2})/{column3}.
  • Do not use square brackets [].
  • Ensure that curly braces are only used for specifying field names in the expression.
  • Provide an Alias for each newly calculated field.
  • You can compute multiple fields using multiple expressions in the same transformation.

Once you’ve entered your expressions and aliases, click on the Run Transformation button to see the changes reflected in the preview. The existing fields will remain unaffected, and new fields will be added according to the specifications of your expressions.

Filter By 

The “Filter By” transformation allows you to filter or remove specific portions of the data in a frame.

Here’s how to use it:

  • Select the query from the dropdown for which you want to apply the filter.
  • Choose the condition that you want to use for filtering the data.
  • Select the field that you want to filter based on.
  • Provide the matching field values to specify the filtering criteria.

This transformation helps you refine your data by including or excluding specific records based on the defined conditions and matching values.

Aggregate

The “Aggregate” transformation is used to summarize or aggregate the output of a query. It also offers time-bucketing capabilities.

 Here’s how to use it:

  • Select the query from the dropdown that you want to aggregate.
  • If needed, choose one or more “Group By” fields to group your data.
  • Next, you have two options:
    • Aggregate without Time Bucketing
    • Aggregate with Time Bucketing

The “Aggregate” transformation allows you to condense and analyze your data, providing you with insights and summaries based on your chosen fields and grouping criteria. Time bucketing helps in organizing data based on time intervals, which can be particularly useful for time-series data.

1. Aggregate without Time Bucketing

  • Disable Time Bucketing: Time bucketing should be disabled by default for this option.
  • Choose a Source Field: Select the field you want to aggregate.
  • Select the Calculation: Choose from the available options for aggregation:
    • Mean
    • Min
    • Max
    • Count
    • Total
    • First
    • Last
    • Unique Count
  • Provide an Alias: Give a name to the aggregated field. This name will represent the aggregated result.
  • Add Multiple Aggregations: You can add multiple aggregations by clicking the “+ Add” button. You can perform multiple aggregation operations on the same field or different fields simultaneously.
  • Remove an Aggregation: If you want to remove an aggregation that you’ve added, click the “Delete” button next to that aggregation.

This process allows you to aggregate data based on the chosen source field, perform various calculations, and create new fields with summarized information. You can add and remove multiple aggregations as needed.

2. Aggregate with Time Bucketing

  • Enable Time Bucketing: Turn on the Time Bucket option to perform time-based aggregation.
  • Choose a Time Column: Select the column that contains time-related data, which will be used for time-based aggregation.
  • Specify Bucketing Interval: Define the time interval in minutes for creating time buckets. Data will be aggregated within these specified time intervals.
  • Configure Aggregation: Configure aggregation settings as you did in the previous case (without time bucketing). Choose the source field, select the calculation method, provide an alias, and add or remove multiple aggregations as needed.
  • Preview: After configuring all the settings, click the Run Transform button to apply these aggregations with time bucketing. You can review the aggregated results in the preview.

This process allows you to perform time-based aggregation on data, summarizing information within specified time intervals. The other aggregation settings, like source field and calculation methods, are similar to aggregating without time bucketing.

Sort By

To use the Sort By transformation:

  • Select a Query.
  • Choose a Field.
  • Optionally, enable Reverse for descending order.
  • Click Run Transformation.

To use the Sort By transformation, select a query and then choose the field for sorting.

Top N

This transformation helps you choose the top N rows from a data frame, allowing for further operations.

Join

This transformation allows you to join the outputs of two queries into a single frame, functioning like a typical SQL join.

  • Choose two queries in the Query dropdown.
  • Select the fields to join the queries using the On Fields dropdown. It will show only the intersection of fields from both queries.
  • Once done, choose the type of Join you want to perform. Available options are
    • Left join: It returns all records from the left table and the matched records from the right table.
    • Right join: It returns all records from the right table and the matched records from the left table.
    • Inner join: It returns records that have matching values in both tables

Click on the Run Transform button to see the changes in the preview.

Concatenate

Concatenate combines the results of two Queries into one frame.

Choose two Queries in any order from the dropdown.

Pick either Horizontal (side by side) or Vertical (stacked) concatenation.

  • Horizontal – Concatenate the stages horizontally (left to right)
  • Vertical – Concatenate the stages vertically (top to bottom)

Click Run Transform to see the merged results.

Examples

 1. Horizontal Concatenate

Your data from queries look like this below

As seen above, this is aggregated data from two or more sources having a similar number of rows in them.

To concatenate these into a single stage/frame, use Horizontal Concatenate.

2. Vertical Concatenate

Your data from queries look like below:

As seen above, this is time series data with a similar schema coming from two different queries.

To concatenate these into a single stage/frame, use Vertical Concatenate.

After the transformation, your data will look like the below:

Note: Preview is restricted to a max of 20 records.

Set Properties

Set Properties allows you to specify the metrics that a data model will work with. This selection is crucial for further analysis and monitoring.

First, choose the timestamp column, which should be a common column across all the metrics defined in this section.

  • Grouping Fields (Optional):
      • Grouping fields help distinguish different data series in the Data Model.
      • If you’re not sure, just select all the fields used for categorization, except the one related to time.
  • Select Metrics (Optional):
    • Pick a field that represents the Metric or key data point.
    • You can choose as many metrics as needed and describe their properties using the dropdown menu.

The goal is to make it more straightforward and easier for users to understand.

Each metric can now be given with information on the following properties. 

  • Metric Name: Choose a unique name for the metric to use in your workspaces.
  • Upper Cap (optional): If you know the maximum value, you can specify it here (numeric).
  • Frequency (optional): Indicate how often the metric data is recorded in minutes (numeric).
  • Signal Category: Categorize your metric as Bounded (known upper and lower limits), Unbounded (unknown limits), or Binary (only two values).
  • Critical Thresholds: Set numeric values to receive notifications when the metric exceeds specific thresholds (up to 10 thresholds).

  • Dimensions (Will be available in future iterations): In future updates, you’ll be able to specify additional dimensions to analyze data at a more detailed level. For example, for a metric like “Transaction Volume,” you could add dimensions like “Merchant Name.”

To add another metric, simply click the Assign metric column.

Delete: If you want to delete a metric, you can use the delete option. 

Run Set Properties: Once you’ve finished configuring your metrics, simply click the Run Set Properties button at the top right. This will validate the metric properties, and the Data Model will perform a background check based on the new information you’ve provided.

Preview

Click on the blue eye icon to preview the results.

Save

Once validation is complete, the save button gets activated at the bottom right. Click on it to save the Data model. 

With this, The Data Model creation is now complete.

Support for Dynamic Filters in Dashboards for UTM and Insights Visualization

Dynamic filters allow you to set up custom filters within your queries using the “$” symbol with a variable. Depending on your DataStore type, follow these guidelines:

For Elasticsearch DataStore:

  • Configure your query in the build query section.
  • Ensure the selected field value matches the “Variable” name.

For Timescale Datastore (PSQL):

  • When defining a query for a data model, including placeholders for where clauses based on important filtering fields.
    • Example: select name, ip, location group by the location where $__dynamicFilter(“location”, “location”)

  • Example: If we have multiple dynamic filters in a single query, then the query should be sent as SELECT c1, c2, c3 FROM T1 WHERE $__dynamicFilter(“c1”, “c1”) AND $__dynamicFilter(“c2”, “c2”)
  • When running these queries, if the variable (e.g., “location”) is not available, the where clause is excluded. However, if the variable is present, it’s used to create the where clause based on the user’s selection in the dashboards.

Note: 

  • For Elastic Search DataStore, use both Build Query and Write/Raw Query Dashboard Filters.
  • For PSQL DataStore, use Write Query exclusively for Dashboard Filters.

Further Reading

  1. Accessing Data Modelling Workspace

Resources

Browse through our resources to learn how you can accelerate digital transformation within your organisation.

Unveiling our all powerful IBMB Observability ExperienceCenter. Click Here