Extract Data

Introduction
The Extract Data module updates the destination (proxy and/or database) with the latest data from the data model. Jobs are set up per signal ID or Data Model, and each job ensures that a rolling window of the most recent few days of data is always available at the destination.
Working with Data Extraction
vuSmartMaps Data Extraction can be accessed from the left navigation menu (Data Ingestion > Data Extraction).

To create a new Data Extraction module, click on the + button.
Select Data Model: Choose a Data Model from the list for which the data needs to be extracted.

Destination: The extracted data can be stored in a tabular structure in two locations.
- Proxy Store: Enable Proxy Store if the extracted data is to be stored in a data lake.
- Data Store: Enable Data Store if the extracted data is to be stored in the in-house database.
You can enable both destinations at the same time.

Execution Frequency: The interval at which the data extraction job runs for the given source.

Save: Click Save at the top right of the window to complete the creation of the Data Extraction module.
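The steps above can be summarized as a simple job definition. The sketch below is purely illustrative: the field names (`data_model`, `destinations`, `execution_frequency_minutes`) are assumptions for explanation, not vuSmartMaps' actual configuration schema.

```python
# Illustrative Data Extraction job definition.
# Field names are hypothetical, not the product's actual schema.
extraction_job = {
    "data_model": "transaction_metrics",  # hypothetical Data Model name
    "destinations": {
        "proxy_store": True,   # store in the data lake
        "data_store": True,    # store in the in-house database
    },
    "execution_frequency_minutes": 15,     # how often the job runs
}

def validate_job(job: dict) -> bool:
    """Basic sanity checks: at least one destination enabled and a positive frequency."""
    dests = job["destinations"]
    has_destination = dests.get("proxy_store") or dests.get("data_store")
    return bool(has_destination) and job["execution_frequency_minutes"] > 0

print(validate_job(extraction_job))  # → True
```

Note that, as described above, both destinations may be enabled at once; a job with neither destination enabled would have nowhere to write.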

Data Extraction Actions
Edit
To edit a Data Extraction profile, click the Edit button associated with the specific profile.

Delete
Deleting a Data Extraction profile is straightforward. Locate the profile you want to remove, and in the Action column, click the Delete option.
Multi Delete
To delete multiple profiles at once, select the profiles you want to remove by checking the boxes on the left, then click the Delete button at the top right.

FAQs
How do I handle data extraction when dealing with very large datasets that might exceed system capacity?
To manage large datasets efficiently, use the following strategies:
- Batch Processing: Split the dataset into smaller batches and extract them separately.
- Incremental Extraction: Extract only the data that has changed since the last extraction, reducing the load.
- Parallel Processing: Run multiple extraction jobs simultaneously to speed up the process.
Ensure that the destination systems (Proxy Store or in-house database) have sufficient storage and processing capacity.
Can I extract and store the same data in both a proxy store and an in-house database simultaneously?
Yes, you can configure your data extraction process to store data in both locations.
- Select Data Model: Choose the required Data Model for extraction.
- Enable Proxy Store: Store data in the data lake.
- Enable Data Store: Store data in the in-house database.
- Configure Settings: Adjust necessary settings, such as data format and paths.
This dual-storage setup ensures redundancy and data availability.
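Conceptually, dual storage fans each extracted batch out to every enabled destination. The sink classes below are placeholders for the Proxy Store and in-house database, used only to illustrate the idea.

```python
class ProxyStoreSink:
    """Placeholder for the data-lake (Proxy Store) destination."""
    def __init__(self):
        self.rows = []
    def write(self, batch):
        self.rows.extend(batch)

class DataStoreSink:
    """Placeholder for the in-house database (Data Store) destination."""
    def __init__(self):
        self.rows = []
    def write(self, batch):
        self.rows.extend(batch)

def write_to_destinations(batch, sinks):
    """Fan one extracted batch out to every enabled destination."""
    for sink in sinks:
        sink.write(batch)

proxy, db = ProxyStoreSink(), DataStoreSink()
write_to_destinations([{"id": 1}, {"id": 2}], [proxy, db])
print(len(proxy.rows), len(db.rows))  # → 2 2
```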
How do I ensure my data extraction jobs run frequently enough to keep the data current?
To maintain up-to-date data, set an appropriate execution frequency for extraction jobs:
- Set Frequency: Specify the interval (e.g., hourly, daily) during job setup.
- Monitor Execution: Regularly check job completion times to verify they are running as scheduled.
- Optimize Jobs: If you detect data latency, adjust the frequency or optimize the extraction process.
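One simple way to monitor execution, sketched below under the assumption that you can read each job's last run time: a job is considered on schedule if its last run falls within one interval (plus a small grace period) of the current time. The function and its parameters are hypothetical helpers, not product features.

```python
from datetime import datetime, timedelta

def is_job_on_schedule(last_run: datetime, frequency: timedelta,
                       now: datetime,
                       grace: timedelta = timedelta(minutes=5)) -> bool:
    """A job is on schedule if its last run is within one interval (plus grace) of now."""
    return now - last_run <= frequency + grace

now = datetime(2024, 1, 1, 12, 0)
# Hourly job that last ran 50 minutes ago: on schedule.
print(is_job_on_schedule(datetime(2024, 1, 1, 11, 10), timedelta(hours=1), now))  # → True
# Hourly job that last ran 2 hours ago: lagging, so adjust frequency or investigate.
print(is_job_on_schedule(datetime(2024, 1, 1, 10, 0), timedelta(hours=1), now))   # → False
```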
If I delete a data extraction profile by mistake, is there a way to recover it?
No, once a data extraction profile is deleted, it cannot be recovered. Double-check selections before confirming deletions.
What steps should I take if my data extraction job is not running as scheduled?
First, verify the job’s configuration settings:
- Execution Frequency: Ensure the frequency is correctly set.
- Destination Settings: Check that the data model and destination (Proxy Store or in-house database) configurations are accurate.
- Troubleshoot Errors: Review any error logs to identify issues affecting the job schedule.
