NI PXIe-8135 Embedded Controller
The PXIe-8135 controller provides optimal streaming performance for certain PCI Express links that are connected to the PXIe backplane. To see which slots in which chassis are affected by these special behaviors, refer to the following figure:
Figure 4. PCI/PCIe Links on Chassis Backplane
Links 3 and 4 both return 128-byte write completions rather than the 64-byte write completions returned by links 1 and 2. Links 3 and 4 come from an x8 link. For the controller to return 128-byte completions, the read requests must be 128 or 256 bytes (the default PCI Express capability is 512 bytes). With Modular Instruments arbitrary waveform generators and high-speed digital I/O devices, you can accomplish this by setting the Preferred Packet Size attribute in the API to 256 bytes.
Resizing windows, including minimize and maximize operations, can severely affect streaming applications. This is a CPU limitation rather than an issue with data transfer across the PCIe bus.
To turn off this effect, do the following:
- Go to Start>>My Computer, right-click on My Computer and choose Properties
- Choose the Advanced tab and click Settings under Performance
- Under the Visual Effects tab, choose Custom, and uncheck:
  - Animate windows when minimizing and maximizing (recommended)
  - Show window contents while dragging (optional)
Figure 5. Performance Options Window Used to Change Window Visual Effects
In the PXIe-1062Q chassis, slots 3, 4, and 5 each have a dedicated x4 PCIe link to the controller, which allows high-bandwidth measurements from these slots. The PCI connections from slots 2, 3, 5, 6, 7, and 8 share a PCIe-PCI bridge to the controller and perform at PCI transfer rates.
The best performance in the PXIe-1065 chassis comes from using express devices in slots 7 and 8. Each of these slots has a dedicated x4 link to the host controller. These slots do not share bandwidth with any other devices, and no multi-slot switching on the backplane needs to be taken into account.
Slots 9-18 all share one PCIe switch. Each of slots 9-14 has an individual x4 link to the switch, and slots 15-18 share a x1 link to the PCIe switch. This greatly reduces the ability to perform high-bandwidth generations or measurements with multiple devices in slots 9-18 simultaneously.
The slots on the PXIe-1075 chassis are broken into groups that share 4 PCIe switches. Note that the right-hand PCI segment is connected to switch 3, not switch 4, via a PCIe/PCI bridge. If you are trying to maximize PCI performance by minimizing the stress on the switch that also carries the PCIe/PCI bridge, reduce the load on switches 1 (left-hand PCI segment) and 3 (right-hand PCI segment).
LabVIEW I/O Performance
The standard LabVIEW File I/O VIs (Open/Create/Replace File.vi, Read From Binary File.vi, Close File.vi) need to be functional for all situations. Until the release of LabVIEW 8.6, these functions could not be optimized for streaming applications. LabVIEW 8.6 added a boolean input to the Open/Create/Replace File.vi that disables buffering, which optimizes the function for streaming applications. When using this option, you must read from or write to the file in integer multiples of the disk sector size. The Read and Write VIs return an error if reads or writes of an inappropriate size are attempted. The following figure shows this option on the Open/Create/Replace File.vi.
Figure 6. Disable Buffering Option for the Open/Create/Replace File.vi in LabVIEW
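The sector-size rule above comes down to simple arithmetic. The following C sketch is illustrative only: the helper names are hypothetical, and the 512-byte sector size used in the usage note is an assumption, since the real value depends on the volume and should be queried from it.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if nbytes is a whole number of sectors
   (a valid size for an unbuffered read or write), 0 otherwise. */
int is_sector_multiple(size_t nbytes, size_t sector_size)
{
    return sector_size != 0 && nbytes % sector_size == 0;
}

/* Hypothetical helper: rounds nbytes up to the next multiple of
   sector_size. sector_size must be nonzero. */
size_t round_up_to_sector(size_t nbytes, size_t sector_size)
{
    size_t remainder = nbytes % sector_size;
    return remainder == 0 ? nbytes : nbytes + (sector_size - remainder);
}
```

With an assumed 512-byte sector, a 1000-byte request is not a valid unbuffered transfer size, and rounding it up gives 1024 bytes.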
There is also an analogous option for C file I/O: a special flag on the Windows CreateFile function disables buffering as well (see code below).
Figure 7. C Code for Creating a File that Disables Buffering
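As a rough illustration of the kind of call described above, the following sketch opens a file with the Windows `FILE_FLAG_NO_BUFFERING` flag. It is not the original figure's code: the function name `open_unbuffered`, the access mode, and the `OPEN_ALWAYS` disposition are assumptions chosen for the example. On platforms other than Windows the function is a stub that returns `NULL`.

```c
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical wrapper: opens (or creates) a file with buffering disabled.
   Returns the file handle as a void pointer, or NULL on failure.
   On non-Windows platforms this is a stub that always returns NULL. */
void *open_unbuffered(const char *path)
{
#ifdef _WIN32
    HANDLE h = CreateFileA(path,
                           GENERIC_READ | GENERIC_WRITE, /* assumed access mode */
                           0,                            /* no sharing */
                           NULL,                         /* default security */
                           OPEN_ALWAYS,                  /* assumed disposition */
                           FILE_FLAG_NO_BUFFERING,       /* disable buffering */
                           NULL);
    return (h == INVALID_HANDLE_VALUE) ? NULL : (void *)h;
#else
    (void)path; /* unused in the stub */
    return NULL;
#endif
}
```

With this flag, Windows also expects transfer sizes, file offsets, and buffer addresses to be aligned to the sector size, which matches the sector-multiple restriction noted for the LabVIEW VIs. Close the returned handle with `CloseHandle` when done.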
Effects of Using a Virus Scanner
Virus scanners can have a significant impact on any time-critical application where the application requires access to disk or depends on scheduled access to CPU resources.
Virus scanners can interrupt sustained operation of an application for things like scheduled daily scans or scheduled daily updates to the scanner. National Instruments recommends disabling the scheduled scans and updates for the entire duration of extended streaming applications. The on-access scanner can be left in place to provide real-time protection from viruses.
The HDD-8264 12-drive RAID array is one of the highest-performing RAID arrays that National Instruments offers (for higher performance, consider the HDD-8266). The HDD-8264 has maximum possible transfer rates of approximately 685 MB/s write and 740 MB/s read. These values describe the limitations of the RAID controller’s transfer rate across the PXI Express x4 link. The rates are lower when reading from and writing to disk in a streaming application using hardware like an arbitrary waveform generator or high-speed digitizer. Benchmarks are shown below for streaming to arbitrary waveform generators.
The other option for a RAID array is the HDD-8263 4-drive RAID array. This RAID has possible peak rates of about 325 MB/s with files located on the outer rim of the hard disk, as shown later in this section.
MXI Express x4 Remote Controller
NI PXIe/PCIe-8371, NI PXIe/PCIe-8372 Remote PCI Express Control of PXI Express
On the 2-port PCIe-8372, port 2 provides higher performance (throughput) than port 1. This is due to the internal architecture of the PCI Express switch used on this product. The 1-port PCIe-8371 exposes port 2 and depopulates port 1.
The maximum aggregate data rate at which data can be sent upstream (e.g. digitizers writing to memory) through the PXIe-8370 is no more than 799 MB/s, due to hardware limitations on this module.
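A quick way to check a planned configuration against an aggregate limit like this is to sum the per-device data rates. The sketch below is only a back-of-the-envelope aid: the function names are hypothetical, the device figures in the usage note (200 MS/s at 2 bytes per sample) are made-up examples, and 1 MB is taken to mean 10^6 bytes, which is an assumption.

```c
#include <stddef.h>

/* Data rate of one streaming device in bytes per second. */
unsigned long long device_rate(unsigned long long samples_per_s,
                               unsigned int bytes_per_sample)
{
    return samples_per_s * bytes_per_sample;
}

/* Returns 1 if the summed device rates stay within limit_bytes_per_s,
   0 otherwise. */
int fits_within_link(const unsigned long long *rates, size_t count,
                     unsigned long long limit_bytes_per_s)
{
    unsigned long long total = 0;
    size_t i;
    for (i = 0; i < count; i++) {
        total += rates[i];
    }
    return total <= limit_bytes_per_s;
}
```

Under these assumptions, one hypothetical 200 MS/s, 2-byte-per-sample device produces 400 MB/s and fits under a 799 MB/s limit, while two such devices together (800 MB/s) do not.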
There are a few things a user can do to slightly optimize the read and write rates of the RAID array. One factor is where on the physical hard disk the file is written. Performance is considerably better when the file is located near the outer rim of the hard disk.
For example, a write-to-disk test was run on the HDD-8263 with a 950 GB file that was broken into three parts. The first segment of the file was located near the inner rim of the disks, the second segment somewhere near the middle, and the third segment started at the outer rim of the disks.
Figure 8. Performance of LV Write to Different Disk Locations
As you can see, the first part of the write, near the middle of the disks, operates at significantly lower rates than the writes of the two file segments that follow.
Another way to optimize file reads and writes is to pre-allocate the file space and then replace the contents when doing a write-to-disk operation. This tactic is only applicable for streaming to disk, for applications such as high-speed digitizer or HSDIO streaming. Note that in the above example, the file space was allocated dynamically by the operating system, so slightly increased performance could be possible with pre-allocated file space. When using this approach, make sure to use the open option for the operation (0:open) input of the Open/Create/Replace File.vi.
Figure 9. Open/Create/Replace File.vi Using "Open" Input to Overwrite the Pre-Allocated File
If you use "replace" or "create or replace" as the input, the application will replace the files that already exist instead of reusing them as they are.
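The same open-versus-replace distinction can be sketched with standard C file I/O, where the `"r+b"` mode opens an existing file without truncating it and `"wb"` replaces it. This is an analogy under stated assumptions, not the LabVIEW implementation: the helper names are hypothetical, and whether seeking to the last byte reserves physical disk space (rather than creating a sparse file) depends on the file system.

```c
#include <stdio.h>

/* Hypothetical helper: pre-allocates `size` bytes by writing one byte at
   the final offset. Returns 0 on success, -1 on failure. Sizes are limited
   to the range of long in this sketch. */
int preallocate_file(const char *path, long size)
{
    FILE *f;
    int ok;
    if (size <= 0) {
        return -1;
    }
    f = fopen(path, "wb"); /* creates the file, or replaces an existing one */
    if (f == NULL) {
        return -1;
    }
    ok = fseek(f, size - 1, SEEK_SET) == 0 && fputc(0, f) != EOF;
    if (fclose(f) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}

/* Hypothetical helper: overwrites the start of an existing file in place.
   "r+b" opens without truncating, so the pre-allocated size is kept. */
int overwrite_in_place(const char *path, const void *data, size_t nbytes)
{
    FILE *f = fopen(path, "r+b");
    size_t written;
    if (f == NULL) {
        return -1;
    }
    written = fwrite(data, 1, nbytes, f);
    if (fclose(f) != 0) {
        return -1;
    }
    return written == nbytes ? 0 : -1;
}

/* Returns the file size in bytes, or -1 on error. */
long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size;
    if (f == NULL) {
        return -1;
    }
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    size = ftell(f);
    fclose(f);
    return size;
}
```

After `preallocate_file` creates the file, a call to `overwrite_in_place` leaves its size unchanged. Opening the same file with `"wb"` instead would discard the pre-allocated space.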
Another optimization when reading and writing with RAID is to use overlapped I/O (asynchronous reads/writes). While this is fine for C applications, it is not practical for LabVIEW applications due to LabVIEW’s synchronous dataflow programming model. However, by writing to multiple files simultaneously on the HDD-8264, you can achieve aggregate rates that are faster than reading from or writing to one file. The reason is that the “dead-time” between writes is used by the write operations to the other file(s). As long as the read or write size is large enough, the penalty for relocating the write head to different locations on disk for the file(s) should be small compared to the performance benefit. Taking this option a step further, you can format the HDD-8264 as 3 separate RAID volumes of 4 drives each and read from or write to 3 separate files, each of which is pre-allocated on its own volume. This should allow reads and writes on the outer edge of each volume’s respective disks.
National Instruments recommends reading from or writing to multiple files that are located on separate volumes of the HDD-8264.