Writing Data-Management-Ready TDMS Files

Overview

Technical Data Management Streaming (TDMS) is the file format most commonly used by NI software to store acquired data channels, and it is also open to third-party tools. For an overview of the benefits of the TDMS file format and the many programs and APIs that write and read these files, please review the following document:

The NI TDMS File Format

For an overview of the bit-level binary structure of the TDMS file, please review the following document:

TDMS File Format Internal Structure

Often, just writing a valid TDMS file is not enough. This document outlines the best practices for TDMS file creation so that later on you can optimally find, analyze, compare, and report the acquired data. Following these TDMS file recommendations will enable additional data management functionality, improve channel timing clarity, and maximize loading speed.

Writing Properties That Enable Data Management

TDMS data files typically contain two different types of information: the acquired data arrays, often called bulk data, and the set-up conditions and/or scalar results, often called meta-data. TDMS files always store the bulk data in individual 1D arrays called channels. TDMS files can store meta-data as scalar name-value pairs, called properties, which can be attached to the file level, the channel group level, or the channel level. What information you store as properties, which level you attach those properties to, and how you name each property have a great effect on the usefulness of the meta-data for data management purposes.

 

Writing Meta-Data to Properties, Not Channels

Meta-data should always be stored to properties, which are searchable, instead of to data channels, whose values are not searchable. TDMS data channels are designed for arrays of data for graphing and analysis. Storing scalar set-up information and scalar result values to one-value data channels confuses the reading application, because it mixes “real” data channels with “decoy” data channels. When named scalar information is stored instead to properties, it vastly improves the browsing and searching experience in many TDMS reading applications such as NI LabVIEW and DIAdem. For example, see the intuitive simultaneous display of channel data values (graph) and channel properties (table) that follows:

 

Figure 1. Channel Data and Properties

 

Writing Properties With Valid Names

A TDMS property has three components: the property value, the property name, and the property level. The TDMS file format allows any characters to be used in the property name, but many of the TDMS file writing and reading applications have property name restrictions. Following the property name recommendations below ensures that the property names remain unchanged regardless of which TDMS reading application is used.

  • Property Names should contain only letters, digits, and the underscore (_) or tilde (~) character.
  • Property Names should have either a letter or the underscore as the first character.
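The two naming recommendations above can be checked mechanically before a file is written. The following is a minimal sketch, assuming that "letters" means the ASCII letters A-Z and a-z; the function name is hypothetical and is not part of any NI API.

```python
import re

# Encodes the two recommendations above (ASCII letters are an assumption):
#   first character:      a letter or the underscore
#   remaining characters: letters, digits, underscore, or tilde
_PORTABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_~]*")


def is_portable_property_name(name):
    """Return True if the name follows both naming recommendations."""
    return _PORTABLE_NAME.fullmatch(name) is not None
```

A name such as "Test Date" fails the check because of the space, and "2nd_run" fails because it starts with a digit.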

Two special property names, listed in the following table, are important to use in all TDMS files. Note that both names must be written in lowercase exactly as shown.

 

Table 1. Special Property Names

Level | Name | Description
File | datetime | Start DateTime of the whole TDMS file
Channel | unit_string | String representation of the channel unit

 

The "datetime" property on the File level is the only date or time property that is guaranteed to be queryable in all DataFinders.  This is because a date or time property is only queryable in a DataFinder if it has been optimized.  The "datetime" property on the File level is one of a small number of base model properties that are always optimized in every DataFinder.  The "unit_string" property on the Channel level is the starting point of engineering unit handling in all NI data management software.  Store your engineering unit symbol in this property, and you will have many more options for unit management downstream.

 

Writing Properties to the Right TDMS Level

A TDMS property has three components: the property value, the property name, and the property level. The TDMS file format allows you to save a property to the file, any of the channel groups in the file, or any of the channels in any of the channel groups. Where you save the property makes a big difference in its usefulness. The following recommendations maximize your ability to search for desired parts of a TDMS file based on one or more property conditions.

  • Save a property to the file level only if that property pertains to every channel group and every channel in the TDMS file.
  • Save a property to a particular channel group only if that property pertains to all the channels in that channel group.
  • Save a property to a particular channel if that property pertains to only that channel.

The converse rules also apply:

  • If the property pertains to all the channels of a particular channel group, then save the property to the channel group instead of to each of the channels in that channel group.
  • If the property pertains to each of the channel groups in the TDMS file, then save the property to the file instead of to each of the channel groups in the TDMS file.
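The level rules above amount to moving a property up one level whenever every child carries the same name and value. The sketch below illustrates this on plain dictionaries of name-value pairs; it does not touch a TDMS file. The function name and the `pinned` parameter are hypothetical, and `pinned` exists so that properties that belong on the channel level by convention (such as unit_string) are never moved.

```python
def hoist_shared(children, parent, pinned=frozenset()):
    """Move properties that are identical in every child up to the parent.

    children: list of dicts, one per channel (or one per channel group)
    parent:   dict of the enclosing channel group (or of the file)
    pinned:   names that must stay on the child level (assumption)
    """
    if not children:
        return
    first = children[0]
    shared = {
        name: value
        for name, value in first.items()
        if name not in pinned
        and all(name in child and child[name] == value for child in children[1:])
    }
    for name, value in shared.items():
        if name in parent and parent[name] != value:
            continue  # conflicting parent value: leave the children untouched
        parent[name] = value
        for child in children:
            del child[name]
```

Calling it first with the channels of each group and then with the groups of the file applies both converse rules in order.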

A common mistake is to write a collection of set-up properties to a channel group called something like Setup Info which contains no channels. Because of the rules behind the hierarchical DataFinder searching, this makes it impossible to search for selected channel groups or selected channels that satisfy these set-up property conditions.

 

Associating Time Information With Data Channels

Acquired data channels have associated timing information, either implicitly (constant sampling rate) or explicitly (channel of time values). In order to automatically analyze these data channels, the associated time information must be reliably identifiable when reading the TDMS file. Note that the terms timing and time here are intended to include a larger category of associated x-axis information.

Most often, acquired data is plotted versus time on the x-axis, but other examples of associated x-axis information include angle, frequency, displacement, and so on. The following recommendations pertain equally to these other associated x-axis quantities, even though the term time is used below for simplicity. Two commonly used methods of recording the associated timing information are implicit waveform channel properties and explicit date/time channels. Each requires a number of additional channel properties to be completely documented.

 

Associating Time Channels and Data Channels

If your TDMS file’s associated time values for the acquired data channels are stored in one or more explicit time channels, then you need a convention to indicate which acquired data channels are associated with which explicit time channels. A convention is needed here because the TDMS file format does not provide a built-in method to make this association.

The clearest and simplest approach is to always have only one explicit time channel inside each channel group and to always position the explicit time channel as the first channel in that channel group. This leads to two common cases: one explicit time channel plus one acquired data channel in each channel group (XY) or one explicit time channel plus multiple acquired data channels in each channel group (XYYY), as the following figures illustrate:

 

Figure 2. XY Channel Group

 

 

Figure 3. XYYY Channel Group

 

If you adopt this suggestion to have only one time channel in each channel group, then you should set the “wf_xcolumns” property on the channel group level to have the value "one", which will tell all NI software that the first channel in that group is the time channel.  If for some reason your time channel is not first, you can set the “xchannel” property on the channel group level and fill it with the name of the time channel in that channel group (in the current example, that name is “Time”).

 

Table 2. Properties to set for XY Channel Relationships

Level | Name | Value
Group | wf_xcolumns | one
Group | xchannel | Time
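A reading application could resolve the time channel of a group from these two properties roughly as follows. This is only a sketch: the function name is hypothetical, and giving "xchannel" precedence when both properties are present is an assumption, not documented NI behavior.

```python
def find_time_channel(group_properties, channel_names):
    """Return the name of the group's time channel, or None if undeclared.

    group_properties: dict of properties on the channel group level
    channel_names:    channel names in their stored order
    """
    explicit = group_properties.get("xchannel")
    if explicit is not None:
        # Assumption: an explicit name wins over the positional convention.
        if explicit not in channel_names:
            raise ValueError("xchannel names a channel that is not in the group")
        return explicit
    if group_properties.get("wf_xcolumns") == "one" and channel_names:
        return channel_names[0]  # convention: the first channel is the time channel
    return None
```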

 

Writing Complete DateTime Channels and Properties

TDMS files offer a native datetime data type for both properties and data channels. When you save datetime information, make sure to always use this built-in datetime option. Writing a numeric value of elapsed seconds is not sufficient to record a datetime, because different applications that write and read TDMS files have different conventions for the starting datetime value and even the increment metric (seconds as opposed to days). For NI LabVIEW programs, you should always wire a brown datetime wire directly to the property value or channel data input, as shown by the following:

 

Figure 4. DateTime Property and Data Channel in NI LabVIEW

 

An additional consideration for datetime values is the recording and reading of geographic location (time zone). Some applications that write and read TDMS files are geo-relative (assume the same time zone), while others are geo-absolute (UTC, based on Greenwich, England). If the TDMS writing and reading applications do not match in their geography expectations, your read datetime values can be different from your written datetime values. To safeguard against this possibility, it is best to save an additional UTC_Offset property (as a real number) which stores the number of (fractional) hours between the TDMS writing application and Greenwich Mean Time.
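The value for the UTC_Offset property can be derived from a time-zone-aware timestamp taken on the writing machine. The sketch below assumes the sign convention "local time minus UTC" (positive east of Greenwich), which the text above does not specify; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def utc_offset_hours(local_time):
    """Return the offset of an aware datetime from UTC in fractional hours.

    Sign convention (assumption): local time minus UTC.
    """
    offset = local_time.utcoffset()
    if offset is None:
        raise ValueError("a time-zone-aware datetime is required")
    return offset.total_seconds() / 3600.0
```

On the writing machine, `utc_offset_hours(datetime.now().astimezone())` yields the offset of the local time zone at the moment of the call, including any daylight saving shift.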

 

Writing Complete Waveform Channels

If your data channels are acquired at irregular time intervals, then an explicit time channel is required to document the timing accurately. More often, though, all data channels are acquired at a constant sampling rate, usually hardware timed. In this case, simply storing the timing information as a set of channel properties is an elegant and entirely sufficient approach to accurately document their timing—no explicit time channels are needed. The standard waveform property names for TDMS files are listed in the following table. 

 

Table 3. Standard Waveform Property Names

Name | Example | Required? | Description
wf_xname | Time | Required | Name of the x-axis quantity
wf_xunit_string | s | Required | Unit of the x-axis quantity
wf_start_offset | 0 | Required | Start offset value of the x-axis
wf_increment | 0.001 | Required | Increment value of the x-axis
wf_start_time | | Optional | Start DateTime value of the time axis
wf_samples | 1000 | Required | Number of values of the x-axis

 

Note that the property names are case sensitive and must be in lowercase. The wf_xname and wf_xunit_string properties are not set by default in NI LabVIEW, so you need to add those properties yourself to every waveform channel in the TDMS file.
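To make the role of these properties concrete, the sketch below checks that a channel carries all required waveform properties and rebuilds its implicit x-axis as x[i] = wf_start_offset + i * wf_increment. It works on a plain dictionary of channel properties; the function names are hypothetical.

```python
REQUIRED_WAVEFORM_PROPERTIES = (
    "wf_xname",
    "wf_xunit_string",
    "wf_start_offset",
    "wf_increment",
    "wf_samples",
)


def missing_waveform_properties(channel_properties):
    """List the required waveform properties the channel does not have."""
    return [n for n in REQUIRED_WAVEFORM_PROPERTIES if n not in channel_properties]


def waveform_x_values(channel_properties):
    """Rebuild the implicit x-axis from the waveform properties."""
    missing = missing_waveform_properties(channel_properties)
    if missing:
        raise ValueError("incomplete waveform channel, missing: " + ", ".join(missing))
    start = float(channel_properties["wf_start_offset"])
    step = float(channel_properties["wf_increment"])
    count = int(channel_properties["wf_samples"])
    return [start + i * step for i in range(count)]
```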

 

Writing TDMS Files That Load Quickly

The TDMS file format was designed to stream data as quickly as possible while still being flexible enough to accommodate changes in the number of channels and their sampling rates during the acquisition. Data files that stream quickly, though, do not necessarily load quickly. The TDMS file is an entirely binary file that consists of multiple sections, one layered on top of the other as you write to the file. These sections contain buffers of data values assigned to one or more channels and/or meta-data properties attached to one or more levels. The fewer sections the TDMS file has, in general, the faster it will load.

Each time a TDMS file is written or read, a TDMS_Index file is created that contains a map of the binary sections. Subsequent reads of the same TDMS file consult the TDMS_Index file to read all the properties and to determine the correct byte positions to read out each stored block of each channel. Roughly speaking, if the resulting TDMS_Index file is similar in size to its TDMS file, then this TDMS file is “fragmented,” meaning it has more sections than it needs and will therefore load slower than it should. There are several approaches you can implement, both during and after the acquisition, that will minimize the number of unnecessary sections and maximize the reading speed of the resulting TDMS file.
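The size comparison described above can be turned into a rough screening check. In the sketch below, the 0.5 threshold is an arbitrary placeholder and not an NI recommendation, and the index file is assumed to sit next to the data file under the same name with "_index" appended; the function names are hypothetical.

```python
import os


def index_ratio(data_bytes, index_bytes):
    """Size of the index file relative to the size of the TDMS file."""
    if data_bytes <= 0:
        raise ValueError("the TDMS file must not be empty")
    return index_bytes / data_bytes


def looks_fragmented(data_bytes, index_bytes, threshold=0.5):
    """Flag a file whose index is similar in size to the data file.

    threshold is a placeholder value (assumption); tune it on your own files.
    """
    return index_ratio(data_bytes, index_bytes) >= threshold


def looks_fragmented_on_disk(tdms_path, threshold=0.5):
    """Same check on disk; assumes the index file is tdms_path + '_index'."""
    return looks_fragmented(
        os.path.getsize(tdms_path),
        os.path.getsize(tdms_path + "_index"),
        threshold,
    )
```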

 

Writing TDMS Files with Minimal Fragmentation

First off, if you are acquiring data with NI’s data acquisition hardware, consider using the NI-DAQmx TDMS writing capability, since it automatically writes un-fragmented TDMS files. If you are using NI LabVIEW to acquire your data channels, you can choose the VIs from the TDMS Advanced palette to write a minimally fragmented TDMS file. If you are using the standard TDMS writing functions, the following tips will help minimize the TDMS file fragmentation.

  • Write all the TDMS properties either before or after the data acquisition (loop).
  • Write property values from same-datatype properties using a 2D array and one TDMS write function.
  • Write at least 1000 data points at a time to each data channel in the TDMS file.
  • If you have to write 1 data point at a time, set the channel property NI_MinimumBufferSize equal to 1000.
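When samples arrive one at a time, the "at least 1000 data points" tip can also be met on the application side by buffering before each write. The class below is a minimal sketch of that idea: `write_block` stands in for whatever function actually appends a block of values to the channel, and both it and the class name are hypothetical.

```python
class ChunkedChannelWriter:
    """Collect single samples and hand them on in blocks of min_points."""

    def __init__(self, write_block, min_points=1000):
        if min_points < 1:
            raise ValueError("min_points must be at least 1")
        self._write_block = write_block  # callable that takes a list of values
        self._min_points = min_points
        self._pending = []

    def append(self, value):
        """Buffer one sample; write a block once enough have accumulated."""
        self._pending.append(value)
        if len(self._pending) >= self._min_points:
            self.flush()

    def flush(self):
        """Write whatever is buffered; call once when the acquisition ends."""
        if self._pending:
            self._write_block(self._pending)
            self._pending = []
```

The final `flush()` may write a shorter block, which adds at most one small section per channel.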

 

Defragmenting TDMS Files After the Acquisition

Even if your data acquisition constraints force you to create fragmented TDMS data files, you can still address the issue after the acquisition. If you are using NI LabVIEW, you can execute the TDMS Defragment function to rewrite the TDMS file with minimal fragmentation. Alternatively, if you load the TDMS data file into NI DIAdem and simply resave it, the resulting TDMS data file will be minimally fragmented.

 

Writing Load-Speed-Enhancing Channel Properties

If your target application to read TDMS files is NI DIAdem, you can dramatically improve loading speed into NI DIAdem by creating the following four properties attached to every numeric or datetime data channel in the TDMS file. If any one of these four properties is missing or not populated with a valid value when a given TDMS data channel is being loaded, then NI DIAdem will automatically calculate all four properties for that channel in order to speed up graph axis auto-scaling. If these properties are already created and filled with valid values and attached to each data channel in the TDMS file, then that TDMS file will load much faster into NI DIAdem, because that calculation for each channel will be avoided during the TDMS file loading process.

Note that the property names are case sensitive and must be in lowercase. None of these four channel properties are set by default in NI LabVIEW, so you need to add them yourself to every such data channel you create in the TDMS file.

 

Table 4. Properties to Improve Loading Speed in NI DIAdem

Name | Example | Description
minimum | -3.14 | The minimum value of the channel
maximum | 3.14 | The maximum value of the channel
monotony | Not monotone | If the channel is monotone rising or falling
novaluekey | No | If any NaN values are in the channel
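The four values can be computed in the writing application before the properties are attached. The sketch below handles numeric channels only. Of the strings it returns, only "Not monotone" and "No" appear in the table above; "Increasing", "Decreasing", and "Yes" are placeholders, so check the exact values your DIAdem version expects. Ignoring NaN values when computing the other three properties is also an assumption.

```python
import math


def load_speed_properties(values):
    """Compute minimum, maximum, monotony, and novaluekey for one channel."""
    present = [v for v in values if not math.isnan(v)]
    if not present:
        raise ValueError("channel has no non-NaN values")
    steps = list(zip(present, present[1:]))
    if steps and all(a < b for a, b in steps):
        monotony = "Increasing"  # placeholder string (assumption)
    elif steps and all(a > b for a, b in steps):
        monotony = "Decreasing"  # placeholder string (assumption)
    else:
        monotony = "Not monotone"
    return {
        "minimum": min(present),
        "maximum": max(present),
        "monotony": monotony,
        # "Yes" is a placeholder string (assumption); "No" is from the table.
        "novaluekey": "Yes" if len(present) != len(values) else "No",
    }
```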


Want More Best Practices?

You collect data to make decisions. However, inefficiently organizing the raw data may cause problems when analyzing your data. The key to organizing raw data in your application involves thinking about the current system requirements and how the file can adapt for future application needs.


» Best Practices Guide

 

 
