1. Introduction to Improving System Throughput
"How do I maximize my automated test system throughput?" is a common question posed by engineers and scientists. For years, engineers have employed numerous strategies to extract more speed from their systems, both in R&D laboratories and on the manufacturing floor. These optimization techniques have often included brute-force measures, such as cutting down the number of tests or purchasing redundant instruments. This whitepaper describes four strategies for maximizing throughput without making such sacrifices:
- Choose the highest-throughput bus for your application
- Select software that takes full advantage of the latest processor technology
- Use hardware synchronization
- Design a system architecture that supports parallel test and resource sharing
2. Strategy 1: Choose Highest-Throughput Bus for Your Application
On the surface, it may appear straightforward to select a bus based on its bandwidth alone. The theoretical bandwidth gives an indication of performance, but, unfortunately, it is not that simple. In reality, four major factors affect instrument bus performance: bandwidth, latency, implementation, and application. And because most test-industry buses are based on PC buses, they follow the PC trend in which system buses such as PCI and PCI Express outperform communication buses such as USB and LAN.
Figure 1. Bandwidth versus Latency of Mainstream Test and Measurement Buses
Many bus performance comparisons focus solely on bus bandwidth and ignore the other hardware and software components that influence actual bus performance. Bandwidth is the data rate, usually measured in millions of bytes per second (MB/s). Latency is the transfer time, usually measured in microseconds (µs). For example, in Ethernet transfers, large blocks of data are broken into small segments and sent as multiple packets; the latency is the time required to transfer one of these packets. Figure 1 compares the theoretical bandwidths and latencies of mainstream test and measurement buses.
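To see why both numbers matter, consider a simple first-order model of transfer time: per-packet latency plus payload size divided by bandwidth. The following Python sketch uses illustrative numbers only, not benchmarks of any particular bus.

```python
# First-order model of bus transfer time: each packet pays the latency
# cost, and the payload itself moves at the bandwidth rate.
# All numbers below are illustrative, not vendor benchmarks.

def transfer_time_us(payload_bytes, bandwidth_mb_s, latency_us, packet_bytes=1500):
    """Estimate transfer time in microseconds for a payload split into packets."""
    packets = -(-payload_bytes // packet_bytes)    # ceiling division
    payload_time = payload_bytes / bandwidth_mb_s  # bytes / (MB/s) gives µs
    return packets * latency_us + payload_time

# A hypothetical high-bandwidth, high-latency bus versus a slower,
# low-latency one, each moving a small 100-byte command:
fast_bus = transfer_time_us(100, bandwidth_mb_s=125.0, latency_us=1000.0)
slow_bus = transfer_time_us(100, bandwidth_mb_s=1.0, latency_us=30.0)
print(fast_bus > slow_bus)  # for small transfers, latency dominates
```

Under this model, a high-bandwidth bus can lose to a lower-bandwidth, low-latency bus for short command/response traffic, which is one reason benchmarking with your actual application traffic matters more than comparing datasheet bandwidths.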
The implementation of the bus software, firmware, and hardware affects its performance. Not all instruments are created equal. A PC implemented with a faster processor and more RAM performs better than one with a slower processor and less RAM. The same holds true for instruments. The implementation trade-offs made by the instrument designer, whether working with a user-defined virtual instrument or vendor-defined traditional one, have an impact on the instrument performance. One of the main benefits of virtual instruments is that the end user, as the instrument designer, decides the optimal implementation trade-offs.
The final major factor affecting bus performance is the application or how the instrument is used. The instrument I/O hardware and firmware, CPU/RAM combination, software application, and measurement speed affect the bus performance. Changing any one of these components may change the bus performance, just like changing from one bus to another affects overall system performance. In some applications, the measurement subsystem is the bottleneck, and in others, the bottleneck is the processor subsystem. Understanding which subsystem is the bottleneck provides a path to improving performance. And the key point remains that in order for a particular bus to be high-performance or even usable, it must exceed the application requirements.
These factors impact whether an instrument bus exceeds required performance. Use benchmarks to compare actual performance between potential instruments. For more detail on choosing an instrumentation bus, view the What Makes a Bus High Performance? whitepaper.
3. Strategy 2: Select Software that Takes Full Advantage of the Latest Processor Technology
Multicore processors are the latest innovation in the PC industry. These first multicore processors contain two cores, or computing engines, located in one physical processor - hence the name dual-core processors. Processors with more than two cores also are on the horizon. Dual-core processors can simultaneously execute two computing tasks. This is advantageous in multitasking environments, such as Windows XP, in which you simultaneously run multiple applications. Two applications - National Instruments LabVIEW and Microsoft Excel, for example - each can access a separate processor core at the same time, thus improving overall performance for applications such as data logging.
The other major performance advantage of dual-core processors is gained by multithreaded applications. Multithreading gives an application the ability to separate its tasks into individual threads, and a dual-core processor can execute two of these threads simultaneously. Dual-core PCs demonstrate significant performance improvements, especially for multithreaded applications. Benchmarks in NI LabVIEW 8 show a performance improvement of up to 25 percent for single-threaded applications between the National Instruments PXI-8105 dual-core embedded controller and the NI PXI-8196 single-core embedded controller (2.0 GHz Intel Pentium M processor 760), which have equivalent processor clock rates. This improvement is the result of numerous enhancements in the processor and chipset between these two generations of Intel architectures. The benefit of the PXI-8105's dual-core processor appears in the multithreaded application benchmarks, which demonstrate an improvement of up to 100 percent compared to the PXI-8196 embedded controller.
Multithreading can significantly improve the performance of a test system. During the testing of a unit under test (UUT), some actions are independent in terms of their functionality and resource usage. For example, when testing a device, you may need to initialize several instruments. The initialization of different instruments is totally independent. Thus, you can break those tasks into multiple threads to improve performance. To fully take advantage of this capability, you must choose software that supports multithreading from driver level and application level to test executive level. For example, you need to choose hardware devices that include multithreaded software drivers. For this reason, all NI hardware devices are released with multithreaded driver software.
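The independent-initialization idea above can be sketched in a few lines of Python. The instrument names are invented and a short sleep stands in for real driver initialization calls; the point is that three independent initializations overlap instead of running back to back.

```python
# Sketch: initializing independent instruments in parallel threads.
# The instrument names are hypothetical, and time.sleep() stands in
# for a slow driver initialization call.
import threading
import time

def init_instrument(name, results):
    time.sleep(0.2)            # stand-in for real instrument initialization
    results[name] = "ready"

results = {}
threads = [threading.Thread(target=init_instrument, args=(name, results))
           for name in ("dmm", "scope", "switch")]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The three 0.2 s initializations overlap, so the total is close to
# one initialization time rather than the 0.6 s a serial loop would take.
print(sorted(results))
```

The same pattern applies at any layer of the stack: as long as the driver is multithread-safe, independent setup steps can be dispatched concurrently and joined before testing begins.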
Also, consider the skill level required to take advantage of these multicore processors. Text-based programming languages, for example, include built-in multithreading libraries to create and manage threads. However, in text-based languages, where code typically runs sequentially, it can be difficult to visualize how various sections of code run in parallel. Because threads within a single program usually share data, communication among threads to coordinate data and resources is critical, and you must implement it carefully to avoid incorrect behavior when running in parallel. You must also write extra code to manage these threads. Thus, converting a single-threaded application into a multithreaded one can be time-consuming and error-prone.
Text-based programming languages must incorporate special synchronization functions when sharing resources such as memory. If you do not implement multithreading properly in the application, you may experience unexpected behavior. Conflicts can occur when multiple threads request shared resources simultaneously or share data space in memory. Current tools, such as multithread-aware debuggers, help a great deal, but in most cases, you must carefully keep track of the source code to prevent conflicts. In addition, you must often adapt your code to accommodate parallel programming.
In contrast, graphical programming tools such as LabVIEW can easily represent parallel processes and are inherently multithreaded. For example, two independent loops running without any dependencies automatically execute in separate threads.
On computers with dual-core processors, multithreaded LabVIEW VIs can handle performance increases without any intervention on your part. The multiple threads are scheduled to run on separate processor cores without external instruction. For more detail, view the Using LabVIEW to Create Multithreaded Applications for Maximum Performance and Reliability whitepaper.
4. Strategy 3: Use Hardware Synchronization
Nearly all automated test systems require engineers to synchronize two or more instruments or switches. For example, in a fuel cell test application, an engineer must program a digital multimeter to scan through hundreds of individual cells using a switch. There are two types of synchronization to choose from: software-timed and hardware-timed. Software-timed synchronization uses triggers generated by a software call, and hardware-timed synchronization uses triggers generated by signals connected to the hardware trigger lines. With software synchronization, there is significant software dependency and overhead because software-timed systems need software intervention to advance the trigger for each reading. The timing of these scans is subject to the performance of the software system because the software shares resources with all other applications that require processor time.
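The software overhead described above can be sketched as a scan loop in which the host program must advance every step itself. The driver calls below are hypothetical placeholders, and the fixed settling delay illustrates the worst-case padding a software-timed loop must add before each reading.

```python
# Sketch of software-timed scanning: the host advances every step, so
# each channel pays a fixed worst-case settling delay plus any OS
# scheduling jitter. close_channel() and read_dmm() are hypothetical
# placeholders, not a real driver API.
import time

SETTLING_DELAY = 0.005   # worst-case switch settling time, padded for safety

def close_channel(channel):
    pass                 # placeholder for a switch driver call

def read_dmm():
    return 0.0           # placeholder for a DMM driver call

readings = []
start = time.perf_counter()
for channel in range(10):
    close_channel(channel)
    time.sleep(SETTLING_DELAY)   # software must always wait for the worst case
    readings.append(read_dmm())
elapsed = time.perf_counter() - start
# 10 channels x 5 ms of mandatory padding: at least 50 ms of pure waiting,
# regardless of how quickly the switch actually settled.
print(len(readings))
```

Hardware handshaking removes exactly this padding: the switch asserts a trigger the moment it has settled, so no fixed worst-case delay (or processor involvement) sits between measurements.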
Test throughput is maximized by implementing hardware-timed synchronization, which removes the processor dependency. Hardware synchronization takes advantage of triggers from both instruments trying to communicate. Using the DMM/switch example above, instead of adding software delay to allow for the switch settling time, the instrument and switch interact by sending triggers to each other when complete or ready. The instrument sends a trigger signal when the reading is complete, and the switch sends a trigger once it has settled and is ready for the next measurement. All trigger interaction is hardware-controlled, which minimizes time wasted between measurements and guarantees maximum throughput. Hardware synchronization between the switch and the instrument is completely independent of the software environment and is not affected by software.
Figure 3. Hardware Synchronization Scheme for a Multimeter and Switch
An example of hardware synchronization is shown in Figure 3. In this example, the NI PXI-4070 FlexDMM DMM communicates with the NI PXI-2530 high-channel multiplexer switch. After the first channel is closed, the switch trigger tells the DMM to take a measurement. When the DMM completes the measurement, its trigger tells the switch to move on to the next channel, and the cycle repeats without software intervention. Because of this determinism, additional software activity (such as adding extra data processing) does not affect the results of the hardware handshaking test as it does when software scanning is used.
The execution time improvement of hardware handshaking can be seen in Table 1, which shows benchmark results for 1,000 switched DMM scans using an NI PXI-2501 FET switch and a PXI-4070 FlexDMM. Hardware handshaking increased throughput by more than 46 percent. Also note that the hardware-handshaking case is more resilient to extra software processing tasks, which severely slow the execution of software scanning.
Table 1. Execution Time Comparison for 1,000 Switched DMM Scans (software scanning versus hardware handshaking, with and without extra processing, and the resulting percent throughput increase using hardware handshaking)
Hardware synchronization maximizes the determinism and throughput of the system. Additionally, if you use PXI instrumentation, all trigger signals are passed over the PXI synchronization backplane, eliminating the need for external wiring and simplifying setup. For more details, view the Optimize Manufacturing Test Throughput Using Measurement Hardware Synchronization whitepaper.
5. Strategy 4: Design a System Architecture that Supports Parallel Test and Resource Sharing
You may have explored ways to enhance test system throughput through parallel testing in the past. However, the latest off-the-shelf test executive software tools simplify parallel test system implementation. These tools increase test throughput and drive down test system costs. In general, parallel testing involves testing multiple products or subcomponents simultaneously. A parallel test station typically shares a set of test equipment across multiple test sockets, but, in some cases, it may have a separate set of hardware for each UUT. Most nonparallel test systems test only one product or subcomponent at a time, leaving expensive test hardware idle for more than 50 percent of the test time. With parallel testing, then, you can increase the throughput of manufacturing test systems without spending a lot of money to duplicate and fan out additional test systems. The following sections discuss ways parallel testing can reduce the cost of test and describe various approaches for implementing parallel testing in your test systems.
Choosing a Parallel Test Architecture
While you can implement parallel testing in most existing test systems, modular test system architectures deliver the best results when used in a parallel testing environment. Test management software, such as NI TestStand, and modular PXI hardware components offer many features for obtaining the highest performance out of a parallel test system. However, you can implement parallel testing using much of your existing test hardware without further hardware investment. Once you have selected your test architecture, the next step is to select the best process model based on your desired UUT test behavior.
Common Parallel Process Models
When testing the proper assembly or functionality of a UUT, there are a variety of tasks to perform in any test system. These tasks include a mix of model or family-specific tests as well as many procedures that have nothing to do with actually testing the UUT. A process model separates the system-level tasks from the UUT-specific tests to significantly reduce development efforts and increase code reuse. Some of the tasks that a process model handles are tracking the UUT identification number, initializing instruments, launching test executions, collecting test results, creating test reports, and logging test results to a database. NI TestStand provides two process models, the parallel process model and the batch process model, to facilitate the general test flow of parallel testing based on your UUT test requirements.
You can use a parallel process model to test multiple independent test sockets. With this model, you can start and stop testing on any UUT at any time. For example, you might have five test sockets for performing radio board tests. Using the parallel process model, you can load a new board into an open socket while the other sockets test other boards. Figure 4 illustrates how the parallel process executes.
Figure 4. Parallel Process Model Flow Chart
Alternatively, you can use a batch process model to control a set of test sockets that test multiple UUTs as a group. For example, you might have a set of circuit boards attached to a common carrier. The batch model ensures that you start and finish testing all boards at the same time. The batch model also provides batch synchronization features. For instance, if a particular step applies to the batch as a whole, you can specify that the step runs only once per batch. You can also specify that certain steps or groups of steps cannot run on more than one UUT at a time, or that certain steps must run on all UUTs simultaneously.
If you are trying to increase your test system performance while lowering your cost, providing each test socket with a dedicated set of instruments is not a feasible solution. Implementing a parallel test system often does not require any additional hardware investment. With parallel testing, you can share existing instrumentation in the test system among multiple test sockets. Decreasing idle time during a UUT test cycle provides substantial performance improvements without additional hardware costs. In many cases, you can add other inexpensive instruments to further optimize overall system performance while sharing the more expensive hardware among the test sockets.
Prior to the availability of off-the-shelf test management software, programming the allocation of shared instrumentation among multiple test sockets running a parallel test system required that you add a large amount of low-level synchronization code to test programs. Critical sections and mutexes often were intertwined with the actual code, making it difficult to program or reuse sections in future test systems.
By implementing parallel test systems that leverage many of the built-in features in NI TestStand, you can effortlessly control the sharing of instruments and synchronize multiple devices under test. You can use synchronization step types and configurable test properties at the individual test level to manage resource sharing between tests in a sequence. The synchronization step types used in test sequences often include lock, rendezvous, queue, notification, wait, and batch synchronization step types. Figure 5 shows how you can use a lock step while testing two UUTs.
Figure 5. This example test sequence uses a combination of lock step types to prevent multiple tests from trying to access the same instrument simultaneously.
For more details, view the Benefits of Parallel Testing whitepaper.
6. Summary: Maximizing Throughput in an Automated Test System
When developing a system where throughput is critical, you should choose the highest-throughput bus, select software that takes full advantage of the latest processors, use hardware synchronization, and design a system architecture that supports parallel test and resource sharing. The overriding theme of these four strategies is to choose a hardware and software platform that takes full advantage of the latest PC technology, such as PXI and LabVIEW.
7. Relevant Products and Whitepapers
National Instruments, a leader in automated test, is committed to providing the hardware and software products engineers need to create these next-generation test systems.
- NI TestStand Test Management Framework
- LabVIEW Graphical Programming Environment
- SignalExpress Interactive Measurement Software
- Modular Instruments (Oscilloscopes, Multimeters, RF, Switching, and more)
- Multifunction Data Acquisition
- PXI System Components (Chassis and Controllers)
- Instrument Control (GPIB, USB, and LAN)
NI offers the Designing Next Generation Test Systems Developers Guide, a collection of whitepapers designed to help you develop test systems that lower your cost, increase your test throughput, and scale with future requirements. To download the complete developers guide (120 pages), visit ni.com/automatedtest.