This document presents results from a set of single-point benchmarks that NI R&D ran on two NI real-time controllers. The benchmarks are intended to indicate the single-point performance of NI hardware and software. In isolation, they cannot serve as a guide to entire system performance, but they can assist in selecting the appropriate platform for a particular application by comparing the different hardware/software combinations on a set of simple, standard tests.
In this paper, we begin with a description of the two different hardware platforms tested, CompactRIO (cRIO) and PXI, follow with a description of the single-point tests, and end with detailed benchmark numbers for each of the tests.
The NI CompactRIO programmable automation controller (PAC) is a low-cost reconfigurable control and acquisition system designed for applications that require high performance and reliability. The system combines an open embedded architecture with small size, extreme ruggedness, and hot-swappable industrial I/O modules. CompactRIO is powered by reconfigurable I/O (RIO) FPGA technology.
Figure 1: CompactRIO System
To program the CompactRIO system, you need LabVIEW, the LabVIEW Real-Time Module, and the LabVIEW FPGA Module. In our tests, we used the LabVIEW FPGA Module to gain access to the I/O, while computation, local logging, and communication with the host ran on the real-time controller.
Figure 2: cRIO AI/AO Sample Diagram on Real-Time Controller
PCI eXtensions for Instrumentation (PXI) is the open, multivendor standard for measurement and automation with access to a wide variety of I/O and communication modules, including data acquisition, modular instrumentation, reconfigurable I/O (RIO), image acquisition, motion control, Ethernet, serial, CAN, DeviceNet, reflective memory, and more. With PXI, you automatically benefit from the low cost, ease of use, modularity, and flexibility of PC technology. The PXI system is programmed using LabVIEW and the LabVIEW Real-Time Module, and in these tests we used NI DAQmx software and a data acquisition board for access to single-point I/O.
Figure 3: PXI Controller
In addition to the PXI form factor, the same suite of software products can be run on standard off-the-shelf PC hardware.
Figure 4: DAQmx AI/AO Sample Diagram
3. Single-Point Tests
NI R&D designed a set of single-point tests to gauge the performance of the systems across a variety of program architectures. The table below lists the tests along with a brief description. Each test used 1, 4, and 16 channels.
Table 1: Single-Point Tests
We created tests to determine the fastest loop the systems could attain without losing data or being “late.” These loop rates are reported later in this paper. A loop iteration is considered late if the software is unable to receive a sample from the I/O hardware, process that sample, and output the result before the next input sample is ready.
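The lateness criterion above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each iteration is assumed to start exactly on its sample tick, the function names are hypothetical, and the timing values are made up rather than taken from the benchmarks.

```python
def count_late_iterations(sample_period_us, iteration_times_us):
    """Count iterations that took longer than one sample period.

    An iteration is treated as late when receiving, processing, and
    outputting a sample took longer than the interval between input samples.
    """
    return sum(1 for t in iteration_times_us if t > sample_period_us)


def max_sustainable_rate_hz(iteration_times_us):
    """Highest sample rate at which none of the measured iterations is late."""
    worst_case_us = max(iteration_times_us)
    return 1_000_000 / worst_case_us


# Hypothetical per-iteration execution times in microseconds (not benchmark data).
measured_us = [40.0, 42.5, 39.0, 50.0, 41.0]

print(count_late_iterations(45.0, measured_us))  # one iteration exceeds 45 us
print(max_sustainable_rate_hz(measured_us))      # limited by the 50 us worst case
```

Under these assumptions, the slowest iteration, not the average, sets the fastest loop rate that can run without a late iteration.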
The NI-DAQmx and NI-RIO drivers are capable of hardware-timed single-point operations, and they provide feedback to help ensure that the software keeps up with the hardware clock. In the next three sections, we briefly review how lateness checking can be accomplished with each of these drivers.
See Appendix A for a complete discussion of lateness checking.
A key consideration when designing a LabVIEW real-time application is whether the system needs to concurrently perform its time-critical function along with other non-time-critical operations, such as local disk logging or communication with the host. This decision dictates the basic architecture of the real-time application.
Polling can be used as the I/O mode when the system has no non-time-critical responsibilities or, more realistically, when the system uses a state machine to schedule time-critical and non-time-critical tasks to operate sequentially. For most I/O drivers, polling mode is faster than interrupt mode.
Although slower than polling mode, interrupt mode is more common for real-time applications because most applications contain a mix of time-critical and non-time-critical functions occurring simultaneously. Interrupts allow the I/O portion of a diagram to suspend its operation and allow other code, such as communication and logging, to run while the hardware is in the process of acquiring data. Once the hardware has finished its acquisition, it raises an interrupt to notify the software that it should resume its time-critical I/O processing.
As noted in the table, we ran the single-point tests using the appropriate I/O mode to highlight the differences between these two modes of operation.
A variation on the polling mode architecture that we did not use is to place the microsecond wait function in the time-critical loop in conjunction with the polling I/O mode. This approach allows the programmer to reserve a predetermined block of time for concurrent non-time-critical code. The polling numbers presented in this paper establish the maximum rate for such a mode, assuming a zero-microsecond wait; increasing the duration of the microsecond wait lengthens the loop period linearly and reduces the loop rate accordingly.
Figure 5: DAQmx Example with Microsecond Wait
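As a rough illustration of how a microsecond wait affects the achievable rate, the following Python sketch assumes that the wait simply adds to the zero-wait loop period. The function name and the 20 kHz starting rate are hypothetical and are not taken from the benchmark results.

```python
def loop_rate_with_wait_hz(max_polling_rate_hz, wait_us):
    """Estimate the loop rate when a fixed wait is added to every iteration.

    Assumption: the wait adds directly to the zero-wait loop period, so the
    period grows linearly with the wait and the rate is its reciprocal.
    """
    base_period_us = 1_000_000 / max_polling_rate_hz
    return 1_000_000 / (base_period_us + wait_us)


# Hypothetical zero-wait polling rate of 20 kHz (50 us period).
for wait_us in (0, 50, 100):
    print(wait_us, round(loop_rate_with_wait_hz(20_000, wait_us), 1))
```

With these example numbers, a 50 µs wait doubles the loop period and therefore halves the rate. The time reserved for non-time-critical code is paid for directly in loop rate.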
An additional consideration when building systems that require concurrent time-critical and non-time-critical functions is how to transfer data from the time-critical loop to the non-time-critical loop so that it can be logged or communicated back to the host. Real-Time FIFOs, available as part of the LabVIEW Real-Time Module, provide this functionality. Figure 6 shows how tests T3 and T4 use Real-Time FIFOs to provide jitter-free communication between the time-critical code and the communication/logging functions.
Figure 6. Use of Real-Time FIFOs
Buffer size selection is an important consideration when programming with RT FIFOs, and altering the size of a FIFO buffer might produce a large change in the final loop rate a test can achieve. Larger buffers generally provide better performance at the expense of higher memory usage. For these tests, all buffer sizes were fixed at 4 KB to ensure that the tests could be run on even the most memory-constrained devices.
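The idea of a fixed-size FIFO between the two loops can be sketched in Python as follows. This is a conceptual model only, not the LabVIEW RT FIFO implementation: the class name is hypothetical, the drop-on-full behavior is an assumption, and Python itself offers no real-time guarantees. The element count assumes that the 4 KB buffer holds 8-byte double-precision values, which this paper does not state.

```python
from collections import deque


class BoundedFifo:
    """Fixed-capacity FIFO whose write never blocks the producer (sketch)."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self.overflow_count = 0  # number of samples dropped because the FIFO was full

    def write(self, value):
        """Called from the time-critical loop; drops the value if the FIFO is full."""
        if len(self._items) >= self._capacity:
            self.overflow_count += 1
            return False
        self._items.append(value)
        return True

    def read(self):
        """Called from the non-time-critical loop; returns None if empty."""
        if not self._items:
            return None
        return self._items.popleft()


# Assumption: a 4 KB buffer of 8-byte doubles holds 512 elements.
CAPACITY = 4096 // 8
fifo = BoundedFifo(CAPACITY)

for sample in range(600):      # producer runs ahead of the consumer
    fifo.write(float(sample))

print(fifo.overflow_count)     # samples that did not fit
print(fifo.read())             # oldest sample is returned first
```

A larger capacity lets the time-critical loop run further ahead of the logging or communication loop before samples are dropped, at the cost of more memory.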
Test T4 included communication with a host PC, and the method we chose for this communication was TCP/IP. LabVIEW 8.0 introduced the network-published shared variable, which is a simplified mechanism for communicating between LabVIEW programs across the network. Currently, the network-published shared variable uses a network protocol optimized for supervisory monitoring of large numbers of variables and not for the high-speed streaming required by these tests. In future versions of LabVIEW, National Instruments will optimize the network-published shared variable to support the streaming use case. For more information regarding the shared variable, please refer to the white paper titled LabVIEW Shared Variable, available on ni.com.
4. Test Results
Final test results, along with the specific hardware and software versions, are provided in the following sections.
- NI cRIO-9024: Real-Time Controller
- 800 MHz PowerPC processor
- 512 MB RAM
- NI cRIO-9205: 32-Ch ±200 mV to ±10 V, 16-Bit, 250 kS/s Analog Input Module(s)
- NI cRIO-9264: 16-Channel Analog Output Module
Software installed on controller
- LabVIEW Real-Time 2011
- NI-RIO 4.0.0
Figure 7. cRIO-9024 Benchmark Results
- NI PXIe-8108 RT
- 2.53 GHz Intel Core 2 Duo T9400 dual-core processor
- 1 GB 800 MHz DDR2 RAM
- NI PXIe-6361
- NI PXIe-6363
Software installed on controller
- LabVIEW Real-Time 2011
- NI-DAQmx 9.3.5
Figure 8. PXIe-8108 With SMP Benchmark Results
Figure 9. PXIe-8108 No SMP Benchmark Results
5. Comparative Graphs
Figure 10. T1 Test Results Comparison
Figure 11. T2a Test Results Comparison
Figure 12. T2b Test Results Comparison
Figure 13. T3 Test Results Comparison
Figure 14. T4 Test Results Comparison
6. Appendix A
The DAQmx 9.3.5 driver provides feedback to guarantee that no input samples have been lost and that all output channels are updated with their corresponding values before the Sample Clock signal. To achieve this, DAQmx 9.3.5 uses the “Wait For Next Sample Clock” VI to force synchronization of the software I/O loop with the Sample Clock signal of one of its I/O tasks.
For more detailed information on DAQmx lateness detection, please refer to the DAQmx Online Help or the following document on the NI Developer Zone: DAQmx Hardware-Timed Single-point Lateness Checking.
Lateness checking on RIO devices is implemented in the FPGA by using a handshaking mechanism to determine if the real-time software VI is able to receive the input samples, process the data, and provide corresponding output values before the next sample clock signal occurs. The handshaking mechanism can use interrupts, or it can poll a user-defined register to signal that a new I/O cycle has begun.
The diagrams for the interrupt mechanism are depicted below in Figure 15 and Figure 16. The FPGA VI generates an interrupt when new input data is available and then waits for the time-critical VI to acknowledge the interrupt.
Figure 15: NI-RIO Diagram (FPGA)
The time-critical VI must service the interrupt within the duration of a single sample period. This includes reading the input samples, processing the samples, and writing the output values back to the FPGA.
Figure 16: NI-RIO diagram (Real-Time)
The following diagram illustrates the interrupt-based handshaking employed on the NI-RIO benchmarking tests:
Figure 17: Interrupt-based handshaking employed on NI-RIO tests
The polling mechanism is implemented in a very similar way. The time-critical VI polls an FPGA register to determine when data is available and does not reset the value of that register until it has finished the I/O tasks (processing and output). At that point, the FPGA is allowed to move on to the next I/O iteration.
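The polling handshake described above can be modeled with a small Python simulation. This is a sketch under stated assumptions, not NI-RIO code: the class, the function, and the way lateness is counted on the simulated FPGA side are all hypothetical.

```python
class FpgaSim:
    """Stand-in for the FPGA side of the handshake (hypothetical model)."""

    def __init__(self):
        self.data_ready = False  # register polled by the time-critical loop
        self.late_count = 0      # assumed way of recording late iterations
        self.input_value = 0.0
        self.output_value = 0.0

    def sample_clock_tick(self, new_input):
        """Simulate one sample clock edge."""
        if self.data_ready:
            # Software has not finished the previous I/O cycle: count it as late.
            self.late_count += 1
            return
        self.input_value = new_input
        self.data_ready = True


def time_critical_iteration(fpga, process):
    """One pass of the polling loop; returns True if a sample was handled."""
    if not fpga.data_ready:
        return False
    fpga.output_value = process(fpga.input_value)
    # The register is reset only after processing and output are complete.
    fpga.data_ready = False
    return True


fpga = FpgaSim()
fpga.sample_clock_tick(1.0)
time_critical_iteration(fpga, lambda x: 2 * x)  # handled within the period
fpga.sample_clock_tick(2.0)
fpga.sample_clock_tick(3.0)                     # previous sample still pending
print(fpga.output_value, fpga.late_count)
```

In this model the second pending sample is never processed before the next clock tick, so the simulated FPGA reports one late iteration.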
7. Appendix B
The tests for the cRIO and PCI/PXI RIO benchmarks were written such that all data acquired through the FPGA modules/board was transferred to the real-time controller for processing. An alternative approach for tests T1 (Analog In + Analog Out) and T2a (Analog In + PID + Analog Out) is to perform all the processing directly on the FPGA. This approach can achieve much faster loop rates than the numbers shown in the respective graphs. Single-channel PID calculations on both the cRIO and PCI/PXI RIO approach 150 kHz and are mainly limited by the performance of the A/D chips.
8. Appendix C
Because the data shown reflects the latest hardware and software versions, we have attached a PDF below containing previously published data.