A Data Preprocessor instance scans the raw data areas to detect new files, file changes, or deleted files. The processing
rules you configure in
define whether the Data Preprocessor instance also processes the data.
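The detection of new, changed, and deleted files can be pictured as a comparison of two directory snapshots. The following Python sketch is only an illustration of that idea and is not how SystemLink implements scanning. The function names `snapshot` and `diff_snapshots` are hypothetical, and using modification time plus file size as the change signal is an assumption.

```python
from pathlib import Path


def snapshot(root):
    """Map every file below root to (modification time in ns, size in bytes)."""
    state = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            info = path.stat()
            state[str(path)] = (info.st_mtime_ns, info.st_size)
    return state


def diff_snapshots(old, new):
    """Return sorted lists of (created, modified, deleted) paths between two snapshots."""
    old_paths, new_paths = set(old), set(new)
    created = sorted(new_paths - old_paths)
    deleted = sorted(old_paths - new_paths)
    # A file counts as modified when it exists in both snapshots
    # but its (mtime, size) pair differs. This is an assumed heuristic.
    modified = sorted(p for p in old_paths & new_paths if old[p] != new[p])
    return created, modified, deleted
```

Calling `snapshot` on a raw data area at regular intervals and passing consecutive results to `diff_snapshots` gives a rough picture of what a periodic scan has to decide for each file.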
- In Data Preparation, click Data Preprocessor Instances.
- Select an instance and click .
- Configure the scanning process with the following options.
| Setting | Description |
| --- | --- |
| Changes in Raw Data Areas | Scans and processes the data in the raw data areas when files are created, modified, or deleted. Note: The operating system notifies SystemLink about every new, modified, or deleted file. Therefore, do not use this setting if you expect a very large number of files or file changes. |
| File Scan Schedule | Creates a raw data scanning schedule. |
| Continuous Scan | Checks at regular intervals whether data has been added or changed in the raw data area. |
| Timeout per File | Maximum amount of time, in seconds, for processing a file. |
| Number of Parallel Requests to the Computing Nodes | Number of requests that the Data Preprocessor instance sends in parallel to the computing nodes for execution. The maximum value for this setting is 64. The default setting for a new Data Preprocessor instance is 4. Increasing this number raises throughput and increases processing resource utilization on the machine. |
| Index Adaptor | Configures the adaptor. Adaptors connect the Data Preprocessor instance with a database, where they store the instance data. The connected databases may differ in performance and functionality. Note: This setting is displayed only if the Data Preprocessor instance uses a different database than the standard database. |
| Job Files | Batch processing of specific files or folders. |
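To illustrate the effect of the Number of Parallel Requests to the Computing Nodes setting, the following Python sketch caps the number of concurrent workers at the documented maximum of 64 and uses the documented default of 4. It is a generic illustration, not the SystemLink implementation or API. The names `clamp_parallel_requests` and `process_files` are hypothetical, and the lower bound of 1 is an assumption, since the table only states the maximum.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_REQUESTS = 64      # maximum value stated in the settings table
DEFAULT_PARALLEL_REQUESTS = 4   # default for a new instance, per the table


def clamp_parallel_requests(value=DEFAULT_PARALLEL_REQUESTS):
    """Limit the value to the range 1..64 (the lower bound of 1 is an assumption)."""
    return max(1, min(int(value), MAX_PARALLEL_REQUESTS))


def process_files(paths, process_one, parallel_requests=DEFAULT_PARALLEL_REQUESTS):
    """Run process_one on every path with at most parallel_requests workers."""
    paths = list(paths)
    workers = clamp_parallel_requests(parallel_requests)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        return dict(zip(paths, pool.map(process_one, paths)))
```

A larger worker count lets more files be handled at once, at the cost of more CPU and memory use on the machine, which mirrors the trade-off described for this setting.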