Writing Data-Management-Ready TDMS Files

Publish Date: Jun 02, 2015

Overview

Technical Data Management Streaming (TDMS) is the most common file format used by National Instruments software to store acquired data channels, and it is also open to third-party tools. For an overview of the benefits of the TDMS file format and the many programs and APIs that write and read these files, please review the following document:

Developer Zone tutorial: The NI TDMS File Format

For an overview of the bit-level binary structure of the TDMS file, please review the following document:

Developer Zone tutorial: TDMS File Format Internal Structure

Often, just writing a valid TDMS file is not enough. This document outlines the best practices for TDMS file creation so that later on you can optimally find, analyze, compare, and report the acquired data channels. Following these TDMS file recommendations will enable additional data management functionality, improve channel timing clarity, and maximize loading speed.

Table of Contents

  1. Writing Properties That Enable Data Management
  2. Associating Time Information With Data Channels
  3. Writing TDMS Files That Load Quickly
  4. Want More Best Practices?
  5. Related Links

1. Writing Properties That Enable Data Management

TDMS data files typically contain two different types of information: the acquired data arrays, often called bulk data, and the set-up conditions and/or scalar results, often called meta-data. TDMS files always store the bulk data in individual 1D arrays called channels. TDMS files can store meta-data as scalar name-value pairs, called properties, which can be attached to the file level, the channel group level, or the channel level. What information you store as properties, which level you attach those properties to, and how you name each property have a great effect on the usefulness of the meta-data for data management purposes.

Writing Meta-Data to Properties, Not Channels

Meta-data should always be stored to properties, which are searchable, instead of to data channels, whose values are not searchable. TDMS data channels are designed for arrays of data for graphing and analysis. Storing scalar set-up information and scalar result values to one-value data channels confuses the reading application because it mixes “real” data channels with “decoy” data channels. When named scalar information is stored to properties instead, it enables a vastly improved browsing and searching experience in many TDMS reading applications such as National Instruments LabVIEW and DIAdem. For example, see the simultaneous display of channel data values (graph) and channel properties (table) that follows:

Figure 1. Channel Data and Properties

Writing Properties With Valid Names

A TDMS property has three components: the property value, the property name, and the property level. The TDMS file format allows any characters to be used in the property name, but many of the applications that write and read TDMS files place restrictions on property names. Following the property name recommendations below ensures that the property names remain unchanged regardless of which TDMS reading application is used.

  • Property names should contain only letters, digits, and the underscore character.
  • Property names should have either a letter or the underscore as the first character.
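
The two naming rules above can be checked before a file is written. The following is a minimal sketch in Python, not part of any NI API: the function names are hypothetical, and it assumes that "letters" means the ASCII letters A-Z and a-z.

```python
import re

# Assumption: "letters" means ASCII letters. The pattern encodes both rules:
# first character is a letter or underscore, the rest are letters, digits,
# or underscores.
_PORTABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_portable_property_name(name: str) -> bool:
    """Return True if the name follows both naming recommendations."""
    return _PORTABLE_NAME.fullmatch(name) is not None


def make_portable_property_name(name: str) -> str:
    """Hypothetical helper: replace disallowed characters with underscores
    and prefix an underscore if the first character is not a letter or
    underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if not cleaned or not (cleaned[0].isalpha() or cleaned[0] == "_"):
        cleaned = "_" + cleaned
    return cleaned
```

Sanitizing names in the writing application, rather than relying on each reading application to do so, keeps the name you search for identical to the name you wrote.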

Two special property names, listed in the following table, are important to use in all TDMS files.

Table 1. Special Property Names

Level   | Name        | Description
File    | DateTime    | Start DateTime of the whole TDMS file
Channel | Unit_String | String representation of the channel unit

Writing Properties to the Right TDMS Level

A TDMS property has three components: the property value, the property name, and the property level. The TDMS file format allows you to save a property to the file, any of the channel groups in the file, or any of the channels in any of the channel groups. Where you save the property makes a big difference in its usefulness. The following recommendations maximize your ability to search for desired parts of a TDMS file based on one or more property conditions.

  • Save a property to the file level only if that property pertains to every channel group and every channel in the TDMS file.
  • Save a property to a particular channel group only if that property pertains to all the channels in that channel group.
  • Save a property to a particular channel if that property pertains to only that channel.

The converse rules also apply:

  • If the property pertains to all the channels of a particular channel group, then save the property to the channel group instead of to each of the channels in that channel group.
  • If the property pertains to each of the channel groups in the TDMS file, then save the property to the file instead of to each of the channel groups in the TDMS file.

A common mistake is to write a collection of set-up properties to a channel group called Setup Info which contains no channels. This makes it impossible to search for selected channel groups or selected channels that satisfy set-up property conditions.
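
The level rules above amount to moving a property up one level whenever every child carries the same value. The following Python sketch illustrates that idea on plain dictionaries. It does not use any TDMS library, and the function name and data layout are assumptions made for illustration.

```python
def promote_shared_properties(children):
    """Split per-child properties into (shared, remaining).

    `children` maps a child name (for example a channel name) to its
    property dict. A property is "shared" only if every child has it with
    an identical value; shared properties belong on the parent level.
    """
    if not children:
        return {}, {}
    names = list(children)
    first = children[names[0]]
    shared = {
        key: value
        for key, value in first.items()
        if all(key in children[n] and children[n][key] == value for n in names)
    }
    remaining = {
        name: {k: v for k, v in props.items() if k not in shared}
        for name, props in children.items()
    }
    return shared, remaining
```

Applied to the channels of one channel group, `shared` holds the properties to write at the group level and `remaining` holds the properties that stay on each channel. The same function can be applied again to the group-level properties to decide what belongs at the file level.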

2. Associating Time Information With Data Channels

Acquired data channels have associated timing information, either implicitly (constant sampling rate) or explicitly (channel of time values). A reading application must be able to find this time information reliably in order to analyze the acquired data automatically. Note that the terms timing and time here are intended to include a larger category of associated x-axis information.

Most often, acquired data is plotted versus time on the x-axis, but other examples of associated x-axis information include angle, frequency, displacement, and so on. The following recommendations pertain equally to these other associated x-axis quantities, even though the term time is used below for simplicity. Two commonly used methods of recording the associated timing information are implicit waveform channel properties and explicit date/time channels. Each requires a number of additional channel properties to be completely documented.

Associating Time Channels and Data Channels

If your TDMS file’s associated time values for the acquired data channels are stored in one or more explicit time channels, then you need a convention to indicate which acquired data channels are associated with which explicit time channels. A convention is needed here because the TDMS file format does not provide a built-in method to make this association.

The clearest and simplest approach is to always have only one explicit time channel inside each channel group and to always position the explicit time channel as the first channel in that channel group. This leads to two common cases: one explicit time channel plus one acquired data channel in each channel group (XY) or one explicit time channel plus multiple acquired data channels in each channel group (XYYY), as the following figures illustrate:

Figure 2. XY Channel Group

Figure 3. XYYY Channel Group
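
A reading application can rely on the "first channel is the time channel" convention as in the following Python sketch. The group is modeled as an ordered dictionary of channel name to values. The function name and data layout are hypothetical, and no TDMS library is involved.

```python
def split_xy_group(group_channels):
    """Pair the first channel (x) with every other channel (y) in a group.

    `group_channels` maps channel names to equal-length value lists, in the
    order the channels appear in the channel group.
    """
    names = list(group_channels)
    if len(names) < 2:
        raise ValueError("need one time channel plus at least one data channel")
    x_name, *y_names = names
    x_values = group_channels[x_name]
    pairs = {}
    for y_name in y_names:
        y_values = group_channels[y_name]
        if len(y_values) != len(x_values):
            raise ValueError(f"{y_name!r} does not match the length of {x_name!r}")
        pairs[y_name] = list(zip(x_values, y_values))
    return x_name, pairs
```

The same code handles both the XY case (one data channel) and the XYYY case (several data channels), which is one reason the single-time-channel convention is simple to consume.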

Writing Complete DateTime Channels and Properties

TDMS files offer a native datetime data type for both properties and data channels. When you save datetime information, make sure to always use this built-in datetime option. Writing a numeric value of elapsed seconds is not sufficient to record a datetime, because different applications that write and read TDMS files have different conventions for the starting datetime value and even the increment metric (seconds as opposed to days). For NI LabVIEW programs, you should always wire a brown datetime wire directly to the property value or channel data input, as shown by the following:

Figure 4. DateTime Property and Data Channel in NI LabVIEW
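
To illustrate why a bare number of elapsed seconds is ambiguous, the following Python sketch interprets the same numeric value under three conventions. The LabVIEW timestamp epoch (January 1, 1904 UTC) and the Unix epoch (January 1, 1970 UTC) are well-known conventions; the function name and the choice of these three interpretations are only illustrative.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
LABVIEW_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)


def interpret_elapsed(value: float) -> dict:
    """Return the datetimes one number could mean under three conventions."""
    return {
        "seconds since 1970": UNIX_EPOCH + timedelta(seconds=value),
        "seconds since 1904": LABVIEW_EPOCH + timedelta(seconds=value),
        "days since 1970": UNIX_EPOCH + timedelta(days=value),
    }
```

The first two interpretations always differ by 24107 days (about 66 years), and the third differs by a factor of 86400 in scale. A native datetime value avoids having to guess which convention the writer used.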

An additional consideration for datetime values is the recording and reading of geographic location (time zone). Some applications that write and read TDMS files are geo-relative (assume the same time zone), while others are geo-absolute (UTC, based on Greenwich, England). If the TDMS writing and reading applications do not match in their geography expectations, the datetime values you read can differ from the datetime values you wrote. To safeguard against this possibility, it is best to save an additional UTC_Offset property (as a real number) that stores the difference, in (fractional) hours, between the local time of the TDMS writing application and Greenwich Mean Time.
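
A value for the UTC_Offset property can be computed as in the following Python sketch. The sign convention (local time minus UTC, so that zones east of Greenwich are positive) is an assumption, since the text above does not fix one; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def utc_offset_hours(local_dt: datetime) -> float:
    """Return local time minus UTC in fractional hours.

    Assumption: positive values mean the writer is east of Greenwich.
    """
    offset = local_dt.utcoffset()
    if offset is None:
        raise ValueError("datetime must be timezone-aware")
    return offset.total_seconds() / 3600.0


# Offset of the machine running the writing application, at this moment:
current_offset = utc_offset_hours(datetime.now().astimezone())
```

Because the offset is computed for a specific datetime, it reflects daylight saving time if that was in effect when the file was written.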

Writing Complete Waveform Channels

If your data channels are acquired at irregular time intervals, then an explicit time channel is required to document the timing accurately. More often, though, all data channels are acquired at a constant sampling rate, usually hardware timed. In this case, simply storing the timing information as a set of channel properties is an elegant and entirely sufficient approach to accurately document their timing—no explicit time channels are needed. The standard waveform property names for TDMS files are listed in the following table. 

Table 2. Standard Waveform Property Names

Name            | Example | Required? | Description
wf_xname        | Time    | Required  | Name of the x-axis quantity
wf_xunit_string | s       | Required  | Unit of the x-axis quantity
wf_start_offset | 0       | Required  | Start offset value of the x-axis
wf_increment    | 0.001   | Required  | Increment value of the x-axis
wf_starttime    |         | Optional  | Start DateTime value of the time axis
wf_samples      |         | Optional  | Number of values of the x-axis

Note that the property names are case sensitive and must be in lowercase. The wf_xname and wf_xunit_string properties are not set by default in NI LabVIEW, so you need to add those properties yourself to every waveform channel in the TDMS file.
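
The waveform properties are enough to rebuild the x-axis without an explicit time channel: the x-value of sample i is wf_start_offset + i × wf_increment. The following Python sketch checks for the four required properties and reconstructs the axis. The property names come from Table 2; the function names and the use of a plain dictionary for the channel properties are assumptions.

```python
REQUIRED_WF_PROPERTIES = (
    "wf_xname",
    "wf_xunit_string",
    "wf_start_offset",
    "wf_increment",
)


def missing_waveform_properties(props: dict) -> list:
    """List required waveform properties absent from a channel's properties.

    The lookup is case sensitive, matching the lowercase requirement.
    """
    return [name for name in REQUIRED_WF_PROPERTIES if name not in props]


def waveform_x_values(props: dict, n_samples: int) -> list:
    """Rebuild the implicit x-axis for a channel with n_samples values."""
    missing = missing_waveform_properties(props)
    if missing:
        raise KeyError(f"missing waveform properties: {missing}")
    start = props["wf_start_offset"]
    increment = props["wf_increment"]
    return [start + i * increment for i in range(n_samples)]
```

A check such as `missing_waveform_properties` is useful in the writing application as a final step before closing the file, because an incomplete set of properties is easy to overlook.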

3. Writing TDMS Files That Load Quickly

The TDMS file format was designed to stream data as quickly as possible while still being flexible enough to accommodate changes in the number of channels and their sampling rates during the acquisition. Data files that stream quickly, though, do not necessarily load quickly. The TDMS file is an entirely binary file that consists of multiple sections, one layered on top of the other as you write to the file. These sections contain buffers of data values assigned to one or more channels and/or meta-data properties attached to one or more levels. The fewer sections the TDMS file has, in general, the faster it will load.

Each time a TDMS file is written or read, a TDMS_Index file is created that contains a map of the binary sections. Subsequent reads of the same TDMS file consult the TDMS_Index file to determine the correct byte positions to read out each channel and property collection from the TDMS file. Roughly speaking, if the resulting TDMS_Index file is similar in size to its TDMS file, then this TDMS file is “fragmented,” meaning it has more sections than it needs and will therefore load more slowly than it should. There are several approaches you can implement, both during and after the acquisition, that will minimize the number of unnecessary sections and maximize the reading speed of the resulting TDMS file.
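
The size comparison described above can be automated as a rough screening step. The following Python sketch is only a heuristic: the 0.5 threshold is an arbitrary assumption standing in for "similar in size", the function names are hypothetical, and the on-disk helper assumes the index file sits next to the data file with "_index" appended to its name (for example data.tdms and data.tdms_index).

```python
import os


def index_to_data_ratio(tdms_bytes: int, index_bytes: int) -> float:
    """Return the index file size as a fraction of the data file size."""
    if tdms_bytes <= 0:
        raise ValueError("TDMS file size must be positive")
    return index_bytes / tdms_bytes


def looks_fragmented(tdms_bytes: int, index_bytes: int, threshold: float = 0.5) -> bool:
    """Heuristic: flag the file when the index is a large share of the data.

    The default threshold is an assumption, not a documented limit.
    """
    return index_to_data_ratio(tdms_bytes, index_bytes) >= threshold


def looks_fragmented_on_disk(tdms_path: str, threshold: float = 0.5) -> bool:
    """Apply the heuristic to a TDMS file and its index file on disk."""
    index_path = tdms_path + "_index"  # assumed naming: data.tdms_index
    return looks_fragmented(
        os.path.getsize(tdms_path), os.path.getsize(index_path), threshold
    )
```

A file flagged by this check is a candidate for the defragmentation steps described later in this section.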

Writing TDMS Files with Minimal Fragmentation

First off, if you are acquiring data with NI’s data acquisition hardware, consider using the NI-DAQmx TDMS writing capability, since it automatically writes un-fragmented TDMS files. If you are using NI LabVIEW to acquire your data channels, you can choose the VIs from the TDMS Advanced palette to write a minimally fragmented TDMS file. If you are using the standard TDMS writing functions, the following tips will help minimize the TDMS file fragmentation.

  • Write all the TDMS properties either before or after the data acquisition (loop).
  • Write data points from multiple channels using a 2D array and one TDMS write function.
  • Write at least 1000 data points at a time to each acquired data channel in the TDMS file.
  • If you have to write 1 data point at a time, set the channel property NI_MinimumBufferSize equal to 1000.
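
The idea behind the last two tips, collecting single points until at least 1000 are available and then writing them in one call, can also be applied in the acquiring application itself. The following Python sketch shows such a buffer. It is not the NI_MinimumBufferSize mechanism and uses no TDMS library: `write_chunk` is a hypothetical callback standing in for whatever function actually writes a block of values to the channel.

```python
class ChunkBuffer:
    """Collect single data points and hand them on in large chunks."""

    def __init__(self, write_chunk, min_points=1000):
        self._write_chunk = write_chunk  # hypothetical: writes one block
        self._min_points = min_points
        self._pending = []

    def append(self, value):
        """Add one point; write a chunk once enough points are pending."""
        self._pending.append(value)
        if len(self._pending) >= self._min_points:
            self.flush()

    def flush(self):
        """Write any pending points, for example at the end of acquisition."""
        if self._pending:
            self._write_chunk(self._pending)
            self._pending = []
```

Calling `flush()` once after the acquisition loop writes the final partial chunk, so no points are lost when the total is not a multiple of the chunk size.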

Defragmenting TDMS Files After the Acquisition

Even if your data acquisition constraints force you to create fragmented TDMS data files, you can still address the issue after the acquisition. If you are using NI LabVIEW, you can execute the TDMS Defragment function to rewrite the TDMS file with minimal fragmentation. Alternatively, if you load the TDMS data file into NI DIAdem and simply resave it, the resulting TDMS data file will be minimally fragmented.

Writing Load-Speed-Enhancing Channel Properties

If your target application for reading TDMS files is NI DIAdem, you can dramatically improve loading speed by attaching the following four properties to every data channel in the TDMS file. Unless all four properties are present and populated with valid values when a given TDMS data channel is loaded, NI DIAdem automatically calculates all four in order to speed up graph axis auto-scaling. If the properties are already attached to each data channel with valid values, the TDMS file loads much faster into NI DIAdem.

Table 3. Properties to Improve Loading Speed in NI DIAdem

Name       | Example      | Description
Minimum    | -3.14        | The minimum value of the channel
Maximum    | 3.14         | The maximum value of the channel
Monotony   | Not monotone | If the channel is monotone rising or falling
NoValueKey | No           | If any NaN values are in the channel
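
The four values can be computed in the writing application before the file is closed, as in the following Python sketch. Only the strings "Not monotone" and "No" appear in Table 3; the strings "Increasing", "Decreasing", and "Yes" are placeholders and should be checked against what NI DIAdem actually expects. The sketch also assumes that NaN values are ignored for the minimum, maximum, and monotony, and that monotone means strictly rising or falling.

```python
import math


def channel_summary(values):
    """Compute candidate values for the four load-speed properties."""
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        raise ValueError("channel contains no valid values")
    has_nan = len(valid) != len(values)
    # Assumption: strict monotony, evaluated on the non-NaN values only.
    rising = all(a < b for a, b in zip(valid, valid[1:]))
    falling = all(a > b for a, b in zip(valid, valid[1:]))
    if rising:
        monotony = "Increasing"  # placeholder string, not from Table 3
    elif falling:
        monotony = "Decreasing"  # placeholder string, not from Table 3
    else:
        monotony = "Not monotone"
    return {
        "Minimum": min(valid),
        "Maximum": max(valid),
        "Monotony": monotony,
        "NoValueKey": "Yes" if has_nan else "No",  # "Yes" is a placeholder
    }
```

Computing these values while the data is still in memory costs one pass over each channel, and it is that pass that the reading application would otherwise have to repeat on every load.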

4. Want More Best Practices?

You collect data to make decisions. However, inefficiently organizing the raw data may cause problems when analyzing your data. The key to organizing raw data in your application involves thinking about the current system requirements and how the file can adapt for future application needs.


» Best Practices Guide

5. Related Links
