Managing Large Data Sets in LabVIEW

Publish Date: Jan 28, 2012

Overview

One of the great strengths of LabVIEW is automatic memory management. This memory management allows the user to easily create strings, arrays, and clusters with none of the worries C/C++ users constantly have. However, this memory management is designed to be absolutely safe, so data is copied quite frequently. This normally causes no problems, but when the data size in a wire starts creeping into the megabyte range, copies start causing memory headaches, culminating in an out of memory error. While LabVIEW is not optimized for large data wires, it can be used with large data sets, provided the programmer knows a few tricks and is prepared for large block diagrams.

Table of Contents

  1. Methods for Large Data Storage
  2. Displaying Data
  3. Breaking the 2 GB File Size Barrier (LabVIEW 7.1 and earlier)
  4. Interfacing with 64-Bit DLLs (LabVIEW 7.1 and earlier)
  5. Sample Code

1. Methods for Large Data Storage

Sometimes, you need to store large data sets in memory. To store large data sets without memory problems, you need a storage mechanism that allows you to save one copy of the data and access the data in chunks, which allows transport of the data without a large memory hit. One common solution to this problem is a functional global, which is also called a shift register global. Another solution is a single-element queue.
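LabVIEW block diagrams are graphical, so the functional global idea is sketched here in C as an analogy only. A function-local static buffer plays the role of the uninitialized shift register, and an action selector plays the role of the enum input. The names waveform_buffer and buf_action are hypothetical and do not correspond to anything in GigaLabVIEW.llb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical action selector, standing in for the enum input of a functional global. */
typedef enum { BUF_INIT, BUF_WRITE_CHUNK, BUF_READ_CHUNK, BUF_FREE } buf_action;

/* One function owns the only copy of the data. Callers move small chunks in
 * and out instead of passing the whole data set around.
 * Returns the buffer size for BUF_INIT, otherwise the number of elements moved. */
size_t waveform_buffer(buf_action action, size_t offset, double *chunk, size_t count)
{
    static double *data = NULL;   /* persists between calls, like a shift register */
    static size_t size = 0;

    switch (action) {
    case BUF_INIT:                /* count is the total number of elements */
        free(data);
        data = calloc(count, sizeof *data);
        size = data ? count : 0;
        return size;
    case BUF_WRITE_CHUNK:
    case BUF_READ_CHUNK:
        assert(chunk != NULL || count == 0);
        if (data == NULL || offset >= size) return 0;
        if (count > size - offset) count = size - offset;   /* clip to the buffer */
        if (action == BUF_WRITE_CHUNK)
            memcpy(data + offset, chunk, count * sizeof *data);
        else
            memcpy(chunk, data + offset, count * sizeof *data);
        return count;
    case BUF_FREE:
        free(data);
        data = NULL;
        size = 0;
        return 0;
    }
    return 0;
}
```

Only the requested chunk is ever copied, so the cost of an access depends on the chunk size rather than on the size of the stored data set.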

You can create the functionality of a C/C++ pointer to a buffer using a reentrant VI with the VI server. Opening the buffer VI through the VI server produces a reference to the VI that can be passed anywhere. If you want another buffer, open another one with the VI server. Since the VI is reentrant, another instance will be created. Note that using a VI reference causes every access of the data to run in the UI thread, which slows program execution. You can get around this problem by creating a different VI for every buffer and making each VI non-reentrant.

GLV_WaveformBuffer.vi in GigaLabVIEW.llb is an example of the functional global concept. To see the concept in action, open GLV_TypicalMemoryStoreAndBrowse.vi. Set the number of points to 100 thousand or more and run the VI. Now open GLV_GigaLabVIEWMemoryStoreAndBrowse.vi and do the same. Note the difference in responsiveness and memory usage. Increase the number of points to one million or more to see major differences. GLV_GigaLabVIEWMemoryStoreAndBrowse.vi uses the shift register database in GLV_WaveformBuffer.vi to hold the data. The database also holds a frame buffer, which is a pre-decimated array of the full data that can be reused without recalculating. It also uses the same chunking and display algorithms as GLV_GigaLabVIEWGenerateAndDisplay.vi, which is described in the Displaying Data section.

GLV_GigaLabVIEWMemoryStoreAndBrowseQ.vi in GigaLabVIEW.llb is an example of the single-element queue concept. It uses standard object-oriented techniques to create the queues and manipulate them. Use is identical to GLV_GigaLabVIEWMemoryStoreAndBrowse.vi. The different shift registers in GLV_WaveformBuffer.vi are implemented as separate queues for this example. Which method you use is largely a matter of taste and programming style.

Now that you know how to create large data sets, how much memory can you expect to allocate? The answer depends on several factors. When LabVIEW allocates an array, it requests a contiguous memory section. If your memory is fragmented, you may get an out-of-memory error even though you still have hundreds of megabytes of free memory. You can work around this somewhat by allocating your data in chunks. If you write your repository VI(s) correctly, your access to it should not change. LabVIEW uses signed 32-bit memory access, so your total memory will never be over 2 GBytes. Windows OSs use unsigned 32-bit memory access, but reserve the high 2 GBytes for system use and allow program execution only in the low 2 GBytes (server versions are more flexible). In addition, system DLLs occupy most of the high quarter of the 2 GByte user data space. Thus, a practical limit on a Windows system is 1.0 - 1.5 GBytes.
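The chunked allocation idea can be sketched in C as follows. This is an illustration under assumptions, not the layout used by the sample VIs: the type chunked_array and its functions are hypothetical names. The point is that many moderate blocks are easier to obtain from fragmented memory than one huge block, while callers still index the data as if it were contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunked array: many moderate blocks instead of one huge one. */
typedef struct {
    double **chunks;      /* table of chunk pointers */
    size_t chunk_len;     /* elements per chunk */
    size_t n_chunks;
    size_t length;        /* total logical elements */
} chunked_array;

/* Returns 0 on success, -1 if any allocation fails (nothing is leaked). */
int chunked_alloc(chunked_array *a, size_t length, size_t chunk_len)
{
    assert(a != NULL && chunk_len > 0);
    a->chunk_len = chunk_len;
    a->length = length;
    a->n_chunks = (length + chunk_len - 1) / chunk_len;   /* round up */
    a->chunks = calloc(a->n_chunks ? a->n_chunks : 1, sizeof *a->chunks);
    if (a->chunks == NULL) { a->n_chunks = 0; a->length = 0; return -1; }
    for (size_t i = 0; i < a->n_chunks; i++) {
        a->chunks[i] = calloc(chunk_len, sizeof **a->chunks);
        if (a->chunks[i] == NULL) {
            while (i > 0) free(a->chunks[--i]);   /* undo the earlier chunks */
            free(a->chunks);
            a->chunks = NULL;
            a->n_chunks = 0;
            a->length = 0;
            return -1;
        }
    }
    return 0;
}

/* Callers index as if the storage were contiguous. */
double *chunked_at(chunked_array *a, size_t index)
{
    assert(a != NULL && index < a->length);
    return &a->chunks[index / a->chunk_len][index % a->chunk_len];
}

void chunked_free(chunked_array *a)
{
    for (size_t i = 0; i < a->n_chunks; i++) free(a->chunks[i]);
    free(a->chunks);
    a->chunks = NULL;
    a->n_chunks = 0;
    a->length = 0;
}
```

Because all access goes through chunked_at, the chunk size can be changed later without touching the calling code, which mirrors the remark that access to a well-written repository VI should not change.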

Different versions of LabVIEW fragment memory in different ways. This changes the maximum array size you can allocate. In LabVIEW 7.x and earlier, you can typically allocate slightly more than 1 GByte in a single array. LabVIEW 8.x, due to its larger feature set, only allows a maximum array size of about 800 MBytes.

2. Displaying Data

To see speed improvements in action, open GLV_TypicalGenerateAndDisplay.vi from GigaLabVIEW.llb. Open your OS's memory monitor. Set the number of points to 1 million or more and run it. Note the execution time and memory use. Close the VI. Now open and run GLV_GigaLabVIEWGenerateAndDisplay.vi using the same number of points. Note the differences in time and memory usage. GLV_TypicalGenerateAndDisplay.vi generates the entire data set at once using the standard LabVIEW sine wave generator, and then plots it to the screen. GLV_GigaLabVIEWGenerateAndDisplay.vi generates the data in chunks, decimates these chunks, and throws the original data away. It also uses a lower level sine wave generator that generates a simple array. When the waveform datatype is used for the graph, the number of points is low enough that a copy is not very costly. Note that converting this VI from using the waveform datatype sine generator to using the lower level, simple array generator resulted in a 20% increase in speed in LabVIEW 7.0 and later.
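The text does not say which decimation algorithm the example VI uses. The C sketch below assumes min/max decimation, a common choice for display because peaks remain visible after the reduction. The function name decimate_min_max is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Reduce n samples to at most `bins` (min, max) pairs so that peaks survive.
 * `out` must have room for 2 * bins doubles.
 * Returns the number of doubles written to `out`. */
size_t decimate_min_max(const double *in, size_t n, size_t bins, double *out)
{
    assert(in != NULL && out != NULL && bins > 0);
    if (n == 0) return 0;
    if (bins > n) bins = n;                   /* never more bins than samples */
    size_t written = 0;
    for (size_t b = 0; b < bins; b++) {
        size_t start = b * n / bins;
        size_t end = (b + 1) * n / bins;      /* exclusive; at least start + 1 */
        double lo = in[start], hi = in[start];
        for (size_t i = start + 1; i < end; i++) {
            if (in[i] < lo) lo = in[i];
            if (in[i] > hi) hi = in[i];
        }
        out[written++] = lo;
        out[written++] = hi;
    }
    return written;
}
```

Applied to one chunk at a time, this keeps only a few values per screen pixel, so the full-resolution chunk can be discarded as soon as it has been decimated.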

3. Breaking the 2 GB File Size Barrier (LabVIEW 7.1 and earlier)

LabVIEW 8.0 introduced 64-bit file pointers, so the techniques in this section are not needed. Earlier versions of LabVIEW use 32-bit, signed integers for file pointers. This directly limits the addressable size of a file to 2 GBytes. Streaming to disk at 10 MBytes/sec, which is easily done with NI digitizers, fills the 32-bit signed integer space (2^31 bytes) in about 3 minutes and 35 seconds. There are two direct options to overcome this problem.

The first is fairly simple. If you are using a Windows operating system with an NTFS formatted disk partition (must be Windows NT, 2000, or XP), you can simply write to disk using the LabVIEW write primitive. Do not wire anything to the offset and wire the position mode to current. You can write until you run out of disk space.

For a simple example of this, open GLV_StreamToDisk.vi in GigaLabVIEW.llb. This example saves an ascending sequence of double precision floating point numbers to disk. Set the amount of data on the front panel, run, and then sit back and relax while it stores data. The default chunk size, 65,000 bytes, was experimentally determined to be the speed optimum for Windows based systems.

To read the data back, reverse the process. Use the read primitive with no offset input. You can use the offset up to the 2 GByte boundary. If you use it outside that boundary, you will get an end-of-file error. However, if you simply read data sequentially from disk, you can read until the end of the file. The VI GLV_ReadFromDisk.vi shows this process.

This trick works only for the Windows OSes mentioned above. In addition, it is not possible to seek to an arbitrary location above the 2 GByte boundary in the file and read the data there.
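The pattern of writing and reading at the current file position, without ever supplying an offset, can be sketched in C as an analogy. This is not the LabVIEW write primitive, and C library file size limits differ by platform. The function names are hypothetical, and the 65,000 byte chunk size is simply the value quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CHUNK_BYTES 65000u                             /* chunk size quoted in the text */
#define CHUNK_DOUBLES (CHUNK_BYTES / sizeof(double))   /* 8125 when a double is 8 bytes */

/* Append `total` ascending doubles at the current file position, one chunk
 * at a time, without seeking. Returns the number of values written. */
size_t stream_ascending(FILE *f, size_t total)
{
    static double chunk[CHUNK_DOUBLES];
    size_t written = 0;
    assert(f != NULL);
    while (written < total) {
        size_t n = total - written;
        if (n > CHUNK_DOUBLES) n = CHUNK_DOUBLES;
        for (size_t i = 0; i < n; i++) chunk[i] = (double)(written + i);
        size_t done = fwrite(chunk, sizeof(double), n, f);
        written += done;
        if (done != n) break;                 /* disk full or I/O error */
    }
    return written;
}

/* Read sequentially from the current position to end of file.
 * Stores the sum of all values in *sum and returns how many were read. */
size_t read_sequential(FILE *f, double *sum)
{
    static double chunk[CHUNK_DOUBLES];
    size_t count = 0, n;
    assert(f != NULL && sum != NULL);
    *sum = 0.0;
    while ((n = fread(chunk, sizeof(double), CHUNK_DOUBLES, f)) > 0) {
        for (size_t i = 0; i < n; i++) *sum += chunk[i];
        count += n;
    }
    return count;
}
```

Neither function tracks a byte offset, so no 32-bit position value ever has to hold the file size.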

This brings us to the second option – use a 64-bit file utility with LabVIEW. A good example of this is HDF5. HDF5 is a binary, hierarchical file utility designed, written, and maintained by the National Center for Supercomputing Applications (NCSA). It is free for any sort of use, since it is funded by the US government. For full information, source code, and binaries of HDF5, visit http://hdf.ncsa.uiuc.edu/HDF5/. Using HDF5, or any other 64-bit utility, requires the ability to pass 64-bit numbers to the utility. This brings us to our last topic.

4. Interfacing with 64-Bit DLLs (LabVIEW 7.1 and earlier)

LabVIEW 8.0 introduced full support for 64-bit integers, so the techniques in this section are not needed. For earlier versions, there are two options for interfacing with a 64-bit DLL. The first is to write a C/C++ wrapper, which only exposes data structures that LabVIEW can natively handle. Since this defeats the ease-of-use of LabVIEW, we will discuss the second option - use the call library node and access the DLL directly, using a bit of digital sleight-of-hand.

You can represent a 64-bit number by a cluster of two 32-bit numbers. Mathematical operations between 64-bit numbers can be coded using well-known algorithms for arbitrary precision arithmetic. These algorithms are beyond the scope of this paper, but you can easily find them on the web. In addition, these algorithms can also be found in the book The Art of Computer Programming, Volume 2: Seminumerical Algorithms by Donald Knuth.
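As a minimal sketch of the arithmetic involved, the C code below adds and subtracts 64-bit values held as two 32-bit words, using only 32-bit operations. The struct u64_pair is a hypothetical stand-in for the LabVIEW cluster. Multiplication and division need the longer algorithms referenced above and are not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the LabVIEW cluster of two 32-bit numbers. */
typedef struct { uint32_t hi; uint32_t lo; } u64_pair;

u64_pair u64_pair_add(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;                  /* wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);      /* wrapped if the sum is below an operand */
    r.hi = a.hi + b.hi + carry;          /* overflow past 64 bits is discarded */
    return r;
}

u64_pair u64_pair_sub(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo - b.lo;                  /* wraps modulo 2^32 */
    uint32_t borrow = (a.lo < b.lo);     /* low word needed to borrow */
    r.hi = a.hi - b.hi - borrow;
    return r;
}

/* Quick self-check: 0x00000000FFFFFFFF + 1 must carry into the high word. */
void u64_pair_self_check(void)
{
    u64_pair a = {0u, 0xFFFFFFFFu};
    u64_pair one = {0u, 1u};
    u64_pair s = u64_pair_add(a, one);
    assert(s.hi == 1u && s.lo == 0u);
}
```

The carry test relies on unsigned wraparound: if the low-word sum is smaller than either operand, the addition overflowed 32 bits.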

Now that you have a method to represent and do math on 64-bit numbers, how do you get them to the DLL? The easiest way is to typecast 64-bit entities to doubles, and then pass in a double whenever the DLL asks for a 64-bit integer. Since the typecast does a binary image transform, similar to a union in C, all will be well, provided you have the high and low order double words in the proper order. The graphic at right shows the proper order and the cast to a double. The call library node will take care of byte ordering for the particular platform. This method also works for arrays and will even give the right padding on architectures that require it, such as SPARC. This is because the double is a 64-bit entity. So if the DLL has a function prototype of:

int32 fooFunc(uint64 length, uint64 *elements)

the prototype you create in the call library node looks like

long fooFunc(double length, double *elements).
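The bit-for-bit reinterpretation that the typecast performs can be sketched in C. This is an illustration of the idea, not of the call library node itself. The function names are hypothetical, and the code assumes a platform where double is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret 64 integer bits as a double without changing any bit.
 * memcpy is used instead of a pointer cast to avoid aliasing problems. */
double u64_bits_as_double(uint32_t hi, uint32_t lo)
{
    uint64_t bits = ((uint64_t)hi << 32) | lo;   /* high word goes in the upper 32 bits */
    double d;
    assert(sizeof d == sizeof bits);             /* assumption: 64-bit double */
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* What a routine expecting a 64-bit integer sees when handed that double. */
uint64_t double_bits_as_u64(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

The resulting double is meaningless as a floating point value. It is only a 64-bit container, so it must never be used in arithmetic before it reaches the DLL.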

You cannot use this trick to get the return value of a function. If the function you wish to use has a prototype

uint64 barFunc(void)

then older versions of LabVIEW have no way to access the full return value. LabVIEW can only get the bottom 32 bits because function return values are returned in the registers of the processor, while items in the call list are passed on the program stack. On the stack, the only thing that matters is that LabVIEW and the DLL are using the same sized object. For function return values, integer and floating point values are returned in different registers, so LabVIEW has no way of accessing the top 32 bits of a returned 64-bit integer. A C/C++ wrapper is necessary. Using the above example, the wrapper is of the form

/* Hand the 64-bit result back through a pointer argument so that all 64 bits reach LabVIEW. */
void barFuncWrapper(uint64 *barFuncData){
   *barFuncData = barFunc();
   return;
}

Fortunately, this is usually not necessary. Two examples of interfacing to 64-bit DLLs are available in NI products – both interface to HDF5.

The HWS file utility is a C/C++ wrapper that was designed with LabVIEW interfacing in mind. Since HDF5 is very low-level and difficult to master, the HWS API puts a standard LabVIEW file I/O interface over the HDF5 complexity. HWS is currently available with the NI-HSDIO, NI-SCOPE, and NI-FGEN drivers, the Analog and Digital Waveform Editors, and any DriverCD dated August 2004 and later.

The sfpFile utility set is a LabVIEW utility that interfaces directly to HDF5 with as few C wrappers as possible. It is available from ni.com, but is not supported by National Instruments. It embodies the principles of direct use of a 64-bit DLL from LabVIEW. Two example VIs from this utility set are included in GigaLabVIEW.llb. The first is H5Screate_simple.vi, which is a direct call to the HDF5 DLL with prototype

int32  H5Screate_simple(int32  rank, const uint64 *dims, const uint64 *maxdims).

The LabVIEW call library node prototype is

long H5Screate_simple(long rank, double *dims, double *maxdims).

The second is DU64_DBLToDU64.vi, an example of how to convert a double precision floating point number into a cluster of two 32-bit integers, and then cast back into a double for passing to the HDF5 routines. Doubles provide a convenient method of keeping track of large file pointer integers in LabVIEW since their mantissa holds 52 bits (53 counting the implicit leading bit). Since NTFS only has a 48-bit data space, this works well. Addition, subtraction, and multiplication of integer-valued floating point numbers are exact as long as the result stays below 2^53.
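One possible way to split an integer-valued double into two 32-bit words is sketched below in C. It is not taken from DU64_DBLToDU64.vi. The names u64_pair, double_to_u64_pair, and u64_pair_to_double are hypothetical, and the input is assumed to be a non-negative integer below 2^53.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the LabVIEW cluster of two 32-bit integers. */
typedef struct { uint32_t hi; uint32_t lo; } u64_pair;

/* Split a non-negative, integer-valued double below 2^53 into 32-bit words. */
u64_pair double_to_u64_pair(double value)
{
    const double two32 = 4294967296.0;                     /* 2^32 */
    assert(value >= 0.0 && value < 9007199254740992.0);    /* 2^53 */
    u64_pair p;
    p.hi = (uint32_t)(value / two32);     /* exact division, then truncation */
    p.lo = (uint32_t)(value - (double)p.hi * two32);       /* exact remainder */
    return p;
}

/* Recombine the words into a double; exact while the value is below 2^53. */
double u64_pair_to_double(u64_pair p)
{
    return (double)p.hi * 4294967296.0 + (double)p.lo;
}
```

Dividing by a power of two only changes the exponent of the double, so no precision is lost before the truncation that extracts the high word.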

Note that HWS and sfpFile produce the same file format. It is just the API that differs.

5. Sample Code

GigaLabVIEW.llb contains all the sample code mentioned in the tutorial. If you wish, you may also download the HDF5 DLLs to prevent the HDF5 examples from looking for nonexistent libraries. Place them in your system directory.
 
Related Links:
LabVIEW 2011 Help: Memory Management for Large Data Sets
LabVIEW Windows Routines for Data Compression
HDF Group - HDF5
Can I Edit and Create Hierarchical Data Format (HDF5) files in LabVIEW? (Download sfpfile)
Determining When and Where LabVIEW Creates a New Buffer (Download Buffer Viewer)