Configuring the Monitoring and Processing of Raw Data
- Updated 2025-10-28
A Data Preprocessor instance scans the raw data areas to detect new files, file changes, or deleted files, and processes the data according to processing rules.
- In Data Preparation, click Data Preprocessor Instances.
- Select an instance and click .
- On the Monitoring tab, configure scanning and processing with the following options.
| Setting | Description |
| --- | --- |
| Changes in Raw Data Areas | Scans and processes the data in the raw data areas when files are created, modified, or deleted. Note: The operating system notifies SystemLink about every new, modified, or deleted file. Therefore, do not use this setting if you expect a very large number of files or file changes. |
| File Scan Schedule | Creates a raw data scanning schedule. |
| Continuous Scan | Checks at regular intervals whether data has been added or changed in the raw data area. |
| Job Files | Batch processes specific files or folders. |
| Processing Rules | Specifies the reaction to file changes. |
| Timeout per File | Maximum amount of time, in seconds, for processing a file. |
| Number of Parallel Requests to the Compute Nodes | Number of requests that the Data Preprocessor instance sends to the compute nodes for execution. The maximum value is 64. The default for a new Data Preprocessor instance is 4. Increasing this number raises throughput and also increases processing resource utilization on the machine. |
| Index Adaptor | Configures the adaptor. Adaptors connect the Data Preprocessor instance with a database, where they store the instance data. The connected databases may differ in performance and functionality. Note: This setting is displayed only if the Data Preprocessor instance uses a database other than the standard database. |
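The numeric limits listed above can be captured in a small pre-check before you enter values in the dialog. The following Python sketch is illustrative only: the class `MonitoringSettings` and its field names are hypothetical and are not part of any SystemLink API. Only the maximum of 64 and the default of 4 come from the settings described above. The lower bound of 1 and the requirement for a positive timeout are assumptions.

```python
from dataclasses import dataclass

MAX_PARALLEL_REQUESTS = 64      # maximum stated for this setting
DEFAULT_PARALLEL_REQUESTS = 4   # default for a new Data Preprocessor instance


@dataclass(frozen=True)
class MonitoringSettings:
    """Hypothetical container for two Monitoring tab values (not a SystemLink API)."""

    timeout_per_file_s: int
    parallel_requests: int = DEFAULT_PARALLEL_REQUESTS

    def __post_init__(self):
        # Assumption: a timeout must be a positive number of seconds.
        if self.timeout_per_file_s <= 0:
            raise ValueError("Timeout per File must be a positive number of seconds")
        # Assumption: at least one request; the upper bound of 64 is documented.
        if not 1 <= self.parallel_requests <= MAX_PARALLEL_REQUESTS:
            raise ValueError(
                f"Number of Parallel Requests must be between 1 and {MAX_PARALLEL_REQUESTS}"
            )


# Usage: the default applies when no value is given; 65 is rejected.
settings = MonitoringSettings(timeout_per_file_s=120)
print(settings.parallel_requests)
```

A check like this only guards against out-of-range input. The right value within the range still depends on how much processing load the machine can take.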
Related Information
- Specifying the Reaction to File Changes
Specify how a Data Preprocessor instance reacts to data changes in the raw data area. These rules apply every time the system scans and processes files.
- Continuously Scanning Raw Data Areas for New or Deleted Files
Scan raw data areas in short intervals to detect whether files were added or deleted. The continuous scan is fast because it does not scan for modified files.
- Scanning and Processing Files Manually
Scan or process the files in the raw data areas of a Data Preprocessor instance manually if the automated process is deactivated, if the file system scan didn't work properly, or if you need to process a new or changed file immediately.
- Batch Processing Folders or Files
Use job files to prioritize the harmonization of new files in a batch process. Enable the processing of job files and specify their location.
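The reason a continuous scan can be fast, as noted above, is that detecting added or deleted files only requires comparing two lists of file names. The following Python sketch illustrates that idea in general terms. It is not how SystemLink implements the scan, and the function names `snapshot` and `diff_snapshots` are hypothetical.

```python
from pathlib import Path


def snapshot(root):
    """Return the set of relative paths of all files currently under root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def diff_snapshots(previous, current):
    """Return (added, deleted) as sorted lists of relative paths.

    Only path names are compared. No timestamps, sizes, or contents are
    inspected, so a file that was modified in place is not reported.
    """
    added = sorted(current - previous)
    deleted = sorted(previous - current)
    return added, deleted
```

Calling `snapshot` at each interval and passing the previous and current results to `diff_snapshots` yields the new and deleted files. Detecting modified files would additionally require comparing per-file metadata or contents, which is the extra work that a full scan performs.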
