
Extract Data

Introduction

The Data Extraction module updates the destination (proxy store and/or database) with the latest data from the data model. A job is set up for each signal ID or data model.

These jobs ensure that the most recent few days of data are always available at the destination.

Working with Data Extraction

vuSmartMaps Data Extraction can be accessed from the left navigation menu (Data Ingestion > Data Extraction).

To create a new Data Extraction module, click on the + button.

Select Data Model: Choose a Data Model from the list for which the data needs to be extracted.

Destination: The extracted data can be stored in a tabular structure in 2 locations.

  1. Proxy Store: Enable Proxy Store if the extracted data is to be stored in a data lake.
  2. Data Store: Enable Data Store if the extracted data is to be stored in the in-house database.

Note: You can enable both destinations at the same time. 

Execution Frequency: The interval at which the data extraction job runs for the given source.

Save: Click Save at the top right of the window to complete the creation of the Data Extraction module.
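Conceptually, the steps above produce a small job configuration record. The sketch below is purely illustrative — the field names (`data_model`, `destinations`, `execution_frequency_minutes`) are hypothetical and not the product's actual schema:

```python
# Illustrative sketch of a Data Extraction job configuration.
# All field names are hypothetical; vuSmartMaps' real schema may differ.

extraction_job = {
    "data_model": "transaction_metrics",   # the Data Model to extract
    "destinations": {
        "proxy_store": True,    # write extracted data to the data lake
        "data_store": True,     # write extracted data to the in-house database
    },
    "execution_frequency_minutes": 60,     # how often the job runs
}

# As the note above says, both destinations may be enabled at once.
enabled = [name for name, on in extraction_job["destinations"].items() if on]
print(enabled)  # ['proxy_store', 'data_store']
```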

Data Extraction Actions

Edit

To edit a Data Extraction profile, click the Edit button associated with the specific module.

Delete

Deleting a Data Extraction profile is straightforward. Locate the profile you want to remove, and in the Action column, click the Delete option.

Multi Delete

To delete multiple profiles at once, simply select the profiles you want to remove by checking the boxes on the left, and then click the Delete button at the top right.


FAQs

How can I handle very large datasets without overloading the system?

For very large datasets, it's crucial to use strategies that prevent system overload:

  • Batch Processing: Divide the dataset into smaller batches and extract each batch separately.
  • Incremental Extraction: Only extract and update the data that has changed since the last extraction. This reduces the volume of data processed in each cycle.
  • Parallel Processing: Utilize parallel processing to handle multiple smaller extractions simultaneously, speeding up the overall process.
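The first two strategies can be sketched together: walk the extraction window in fixed-size batches, pulling only rows changed since the last run. This is a minimal illustration, not the product's implementation; `fetch_rows` is a hypothetical stand-in for whatever query interface the source exposes:

```python
# Hypothetical sketch of incremental, batched extraction.
from datetime import datetime, timedelta

def fetch_rows(since, until):
    """Placeholder for a source query; returns rows changed in [since, until)."""
    return [{"ts": since, "value": 42}]  # dummy data for illustration

def extract_incremental(last_run, now, batch=timedelta(hours=1)):
    """Walk the window since the last run in fixed-size batches
    instead of issuing one large pull."""
    rows, start = [], last_run
    while start < now:
        end = min(start + batch, now)
        rows.extend(fetch_rows(start, end))   # each batch is a separate query
        start = end
    return rows

now = datetime(2024, 1, 1, 6)
rows = extract_incremental(datetime(2024, 1, 1), now)
print(len(rows))  # 6 hourly batches since the last run
```

For parallel processing, the same per-batch calls could be dispatched to a worker pool instead of run sequentially, since the batches are independent.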

Ensure your destination systems (e.g., Proxy Store or in-house database) are optimized for handling large data volumes with sufficient storage and processing capacity.

Can I store extracted data in both a proxy store and an in-house database?

Yes, you can configure your data extraction process to store data in both a proxy store and an in-house database. Follow these steps:

  • Select Data Model: When setting up a new Data Extraction module, choose the appropriate Data Model.
  • Enable Proxy Store: Check the option for Proxy Store to route data to your data lake.
  • Enable Data Store: Also enable the Data Store option to store data in your in-house database.
  • Configure Settings: Adjust other necessary settings such as data formats, paths, and authentication details.

This dual-storage setup ensures redundancy and availability across different platforms.

How do I keep the extracted data current?

To keep your data current, configure the execution frequency of your data extraction jobs to match your requirements. Follow these steps:

  • Set Frequency: During the setup of your data extraction job, specify how often the job should run (e.g., hourly, daily, weekly).
  • Monitor Performance: Regularly monitor the performance and completion times of your extraction jobs to ensure they are running as scheduled.

  • Adjust as Needed: If data latency is detected, adjust the frequency settings or optimize the data extraction process to improve performance.

Can a deleted Data Extraction profile be recovered?

No. Once a data extraction profile is deleted, it cannot be recovered. Double-check selections before confirming deletions.

What should I check if a data extraction job is not running as expected?

First, check the job's configuration and ensure the execution frequency is set correctly. Then verify there are no errors in the Data Model or destination settings.
