Data Extraction Performance for Standard Files

Learn how SystemLink Enterprise performs when extracting test data from standard files.

Standard files include Bench Data Connector (BDC) files and Standard Test Data Format (STDF) files. For each file type, NI tested data extraction for three test data scenarios using Notebook executions and measured performance.

Testing Conditions for Data Extraction

NI evaluated performance under the following infrastructure and measurement conditions.

NI did not evaluate the impact of the following parameters.
  • Multiple file extractions inside a single execution
  • Parallel extractions
  • Existing data sets
Table 50. Infrastructure Conditions and Measurement Conditions for Data Extraction Tests

| Condition | Description | Specifications |
| --- | --- | --- |
| Execution pod | Execution pod deployed and managed by Kubernetes (AWS EKS) | Memory: 2 GB; CPU: 100 m |
| Test Monitor service | Test Monitor service deployed and managed by Kubernetes (AWS EKS) | Replication: Auto scale (default: 2, maximum: 10); Node specification: type r6a.4xlarge, 16 vCPU, 128 GiB RAM; Pod specification: CPU 250 m, memory 320 Mi (up to 512 Mi) |
| Test Monitor database | Test Monitor database sourced as a single instance of RDS PostgreSQL | PostgreSQL version: 14.7; Instance class: db.t4g.xlarge; vCPU: 4; RAM: 16 GiB |

Data Extraction Performance for BDC Files

Bench Data Connector (BDC) files are test data files that contain parametric data about tests. BDC files consist of the following three types of rows.
Table 51. BDC Row Types and Descriptions

| Row type | Description |
| --- | --- |
| Column headers | The name of the column. Each column header must be unique. |
| Column types | The type of data that the column contains. The column type must be META, STD, COND, or INF. |
| Measurement data | The values associated with each measurement |

Refer to Bench Data Connector Logging Libraries to learn more about BDC files.

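The two rules in Table 51 (unique column headers, and column types limited to META, STD, COND, or INF) can be checked before a file is submitted for extraction. The following is a minimal sketch, not part of SystemLink Enterprise or the BDC libraries: the function name `validate_bdc_columns`, the in-memory list inputs, and the exact uppercase comparison are all assumptions made for illustration.

```python
# Allowed column types, as listed in Table 51.
ALLOWED_COLUMN_TYPES = {"META", "STD", "COND", "INF"}


def validate_bdc_columns(headers, column_types):
    """Return a list of problems found in the two leading BDC rows.

    `headers` and `column_types` are hypothetical in-memory lists holding
    the column header row and the column type row, one entry per column.
    """
    problems = []

    # Every column needs exactly one header and one type.
    if len(headers) != len(column_types):
        problems.append(
            f"{len(headers)} headers but {len(column_types)} column types"
        )

    # Each column header must be unique.
    seen = set()
    for name in headers:
        if name in seen:
            problems.append(f"duplicate column header: {name!r}")
        seen.add(name)

    # Each column type must be META, STD, COND, or INF
    # (assumption: the comparison is exact and case-sensitive).
    for name, col_type in zip(headers, column_types):
        if col_type not in ALLOWED_COLUMN_TYPES:
            problems.append(f"column {name!r} has invalid type {col_type!r}")

    return problems
```

An empty list means both rows satisfy the two rules. The column names passed in are whatever the file uses; the sketch does not assume any particular naming.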
Table 52. Data Set Characteristics for each Test Scenario for BDC Files

| Characteristic | Scenario 1: High Mix–High Volume | Scenario 2: Medium Mix–Medium Volume | Scenario 3: Low Mix–Low Volume |
| --- | --- | --- | --- |
| Scenario description | Many results with many steps | Medium results with medium steps | Few results with few steps |
| STD columns | 4 | 4 | 4 |
| COND columns | 20 | 15 | 10 |
| INF columns | 20 | 15 | 10 |
| META columns | 16 | 16 | 16 |
| Number of results (META column combinations) | 89 | 61 | 36 |
| Number of steps (COND column combinations) | 10-890 | 10-610 | 10-360 |
| Number of measurements per step | 25 | 25 | 25 |
| Approximate total number of measurements | 1 million | 500,000 | 200,000 |
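Table 52 gives a range of steps per scenario rather than a single count, so the approximate totals cannot be recomputed from the table alone. The sketch below shows one way the numbers could fit together. It rests on an assumption that the source does not state: that the step counts grow in increments of 10 across the results, so that result k has 10 × k steps. The function name `estimate_total_measurements` is hypothetical.

```python
MEASUREMENTS_PER_STEP = 25  # from Table 52


def estimate_total_measurements(num_results, step_increment=10):
    """Estimate the total measurements for one BDC scenario.

    ASSUMPTION (not stated in the source): result k has
    k * step_increment steps, which reproduces the published
    ranges 10-890, 10-610, and 10-360.
    """
    steps_per_result = [step_increment * k for k in range(1, num_results + 1)]
    return sum(steps_per_result) * MEASUREMENTS_PER_STEP


for name, num_results in [("Scenario 1", 89), ("Scenario 2", 61), ("Scenario 3", 36)]:
    print(name, estimate_total_measurements(num_results))
```

Under that assumption the estimates are 1,001,250, 472,750, and 166,500 measurements, which are in line with the approximate totals of 1 million, 500,000, and 200,000 in the table.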
Table 53. Performance Summary for each Test Scenario for BDC Files

| Characteristic | Scenario 1: High Mix–High Volume | Scenario 2: Medium Mix–Medium Volume | Scenario 3: Low Mix–Low Volume |
| --- | --- | --- | --- |
| Scenario description | Many results with many steps | Medium results with medium steps | Few results with few steps |
| Average extraction time | 12 minutes 6 seconds | 4 minutes 53 seconds | 1 minute 52 seconds |
| Approximate rate of extraction per hour | 5 files | 10 files | 28 files |

Data Extraction Performance for STDF Files

Standard Test Data Format (STDF) files are binary files that store test and measurement data from semiconductor manufacturing and testing. Refer to the Standard Test Data Format (STDF) Specification to learn more about STDF files.

Table 54. Data Set Characteristics for each Test Scenario for STDF Files

| Characteristic | Scenario 1: High Mix–Low Volume | Scenario 2: Low Mix–Very High Volume | Scenario 3: Low Mix–High Volume |
| --- | --- | --- | --- |
| Scenario description | Many results with few steps | Few results with very high steps | Few results with many steps |
| Number of results | 5,000 | 20 | 20 |
| Number of steps | 1,000 | 54,000 | 25,000 |
| Number of measurements per step | 1 | 1 | 1 |
| Approximate total number of measurements | 5 million | 1 million | 500,000 |
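The approximate totals for the STDF scenarios follow from multiplying the other three rows of the table. The sketch below assumes that "Number of steps" means steps per result, which the source does not state explicitly but which matches the published totals. The function name `total_measurements` is hypothetical.

```python
def total_measurements(num_results, steps_per_result, measurements_per_step=1):
    """Total measurements = results x steps per result x measurements per step.

    ASSUMPTION: the "Number of steps" row is a per-result count.
    """
    return num_results * steps_per_result * measurements_per_step


# (results, steps) pairs for the three STDF scenarios.
stdf_scenarios = {
    "Scenario 1": (5_000, 1_000),
    "Scenario 2": (20, 54_000),
    "Scenario 3": (20, 25_000),
}

for name, (results, steps) in stdf_scenarios.items():
    print(name, total_measurements(results, steps))
```

This gives 5,000,000, 1,080,000, and 500,000 measurements, which round to the approximate totals of 5 million, 1 million, and 500,000.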
Table 55. Performance Summary for each Test Scenario for STDF Files

| Characteristic | Scenario 1: High Mix–Low Volume | Scenario 2: Low Mix–Very High Volume | Scenario 3: Low Mix–High Volume |
| --- | --- | --- | --- |
| Scenario description | Many results with few steps | Few results with very high steps | Few results with many steps |
| Average extraction time | 2 hours 22 minutes 18 seconds | 38 minutes 10 seconds | 18 minutes 10 seconds |
| Approximate rate of extraction per 10 hours | 4 files | 15 files | 33 files |
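The STDF rates per 10 hours are consistent with dividing the 10-hour window by the average extraction time and rounding down to whole files. The sketch below illustrates that arithmetic. It assumes extractions run one after another with no gap between them, since NI did not evaluate parallel extractions. The function name `files_per_window` is hypothetical.

```python
from datetime import timedelta


def files_per_window(average_extraction, window):
    """Whole files that fit in `window` when extractions run back to back.

    ASSUMPTION: extractions are sequential and each one takes exactly
    the average extraction time.
    """
    # Floor division of two timedelta values returns an int.
    return window // average_extraction


ten_hours = timedelta(hours=10)

stdf_average_times = {
    "Scenario 1": timedelta(hours=2, minutes=22, seconds=18),
    "Scenario 2": timedelta(minutes=38, seconds=10),
    "Scenario 3": timedelta(minutes=18, seconds=10),
}

for name, average in stdf_average_times.items():
    print(name, files_per_window(average, ten_hours))
```

The results are 4, 15, and 33 files, matching the published STDF rates. Treat the calculation as a planning estimate for a comparable deployment, not as a guarantee for other infrastructure or data sets.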