Data streaming speeds of common applications reach or exceed MS/s rates. In an application that collects one single channel of data at 1 MS/s, a total of 1,000,000 data points will be collected in a one second acquisition. In a matter of minutes, billions of data points can be saved to gigabytes of hard drive space.
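The arithmetic above can be sketched quickly. This is a back-of-the-envelope estimate only; it assumes 8-byte double-precision samples, and the rates and durations are illustrative, not tied to any particular hardware:

```python
# Storage estimate for a continuous streaming acquisition.
# Assumption: each sample is stored as an 8-byte double.
BYTES_PER_SAMPLE = 8

def acquisition_size_bytes(rate_hz, seconds, channels=1):
    """Total bytes written for a continuous acquisition."""
    return rate_hz * seconds * channels * BYTES_PER_SAMPLE

# One channel at 1 MS/s for one second -> 1,000,000 samples, 8 MB.
one_second = acquisition_size_bytes(1_000_000, 1)

# Four channels at 1 MS/s for ten minutes -> 2.4 billion samples, 19.2 GB.
ten_minutes = acquisition_size_bytes(1_000_000, 600, channels=4)
```

Even a few minutes of multichannel streaming at these rates lands in the gigabyte range, which is what makes the loading behavior of the analysis tool matter.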
When Microsoft Excel loads a data file, it attempts to load every single data point into memory. With the release of the 64-bit version of Microsoft Excel 2010, this is less of a limitation, as the application has a larger addressable memory space; however, loading the entirety of a large data set into Excel can still take many minutes due to the sheer volume of data involved. Furthermore, Excel stores not just the numerical value in each cell but also numeric formatting, cell formatting, formulas, spreadsheet links, Internet hyperlinks, and comments. This cell-centric flexibility is ideal for business spreadsheets where cell-level visibility is key, but it adds unnecessary memory overhead for data sets with millions of values. To avoid potential memory problems, Excel imposes a limit on the maximum number of rows and columns. The introduction of Excel 2007 increased the total number of rows per worksheet from 65,536 to 1,048,576 (2^20) and the total number of columns from 256 to 16,384 (2^14). Figures 5 and 6 contrast Excel's row and column limits with DIAdem's ability to manipulate 500,000,000 rows (points) as only a fraction of its limit.
Figure 5. Excel can only load just over 1 million rows of data for any given column. This is a limitation for scientists and engineers.
Figure 6. DIAdem can easily handle extremely large data sets. This image shows an example of 500,000,000 (one-half billion) data points in a channel, nearly 500 times the maximum number of rows allowed by Excel.
As shown in Figure 5, an acquisition of one single channel at 1 MS/s would exceed the number of data points Excel can load after just over one second. Many engineers and scientists feel forced to let the limitations of their post-processing software dictate the terms of their acquisition, either reducing acquisition rates or segmenting acquisitions across numerous data files, which creates a nightmare for data management and organization.
DIAdem was designed to manipulate measurement data in both small and large volumes, and can process up to 2,000,000,000 data points (2^31) per channel across 65,536 (2^16) total data channels. Additionally, DIAdem includes selective loading, data reduction, and register loading features specifically designed for working with extremely large data sets.
DIAdem can selectively load a subset of the data channels contained in a data file, whereas Excel always imports all of the columns from a data file. If you only need 1 channel from a very large data file containing 10 channels, loading only the 10% of the data values you actually need is much faster and more efficient than Excel's approach of loading 100% of the data when 90% is overhead.
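The idea of selective loading can be illustrated outside DIAdem. The sketch below uses pandas as a stand-in (DIAdem's own loaders work differently, and the channel names here are invented): only the requested column is kept, rather than all columns in the file.

```python
import io
import pandas as pd

# Stand-in for a multi-channel data file (channel names are made up).
csv_data = io.StringIO(
    "ch1,ch2,ch3\n"
    "1.0,2.0,3.0\n"
    "4.0,5.0,6.0\n"
)

# Selective loading: keep only the one channel we actually need,
# instead of materializing every column in memory.
needed = pd.read_csv(csv_data, usecols=["ch2"])
```

With a binary format and a real loader, skipping unwanted channels also avoids reading their bytes from disk, which is where the speedup DIAdem advertises comes from.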
When files are loaded with data reduction, DIAdem loads data from a selected row range and/or condenses every N rows into one representative value, whereas Excel always loads all the data rows.
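A minimal sketch of the "condense every N rows into one representative value" step, using NumPy rather than DIAdem's own loader (the block mean is used here as the representative value; other reductions such as min/max envelopes are equally possible):

```python
import numpy as np

def reduce_every_n(values, n):
    """Condense each consecutive block of n values into its mean.

    Assumes len(values) is a multiple of n, for simplicity.
    """
    return np.asarray(values, dtype=float).reshape(-1, n).mean(axis=1)

raw = np.arange(12)               # 12 raw samples
reduced = reduce_every_n(raw, 4)  # 3 representative values
```

Combined with loading only a selected row range, this kind of reduction lets a viewer display a faithful overview of a huge channel while touching a fraction of its rows.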
When files are register loaded, DIAdem uses the existing data file on disk as in-place virtual memory—DIAdem does not load all the values from the data file at once but instead registers how to access blocks of data values on-demand. This makes register loaded channels read-only, but it enables very quick graphing and inspection of extremely large data sets, as shown in Figure 6.
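Register loading is analogous to memory-mapping a file: the operating system pages blocks in from disk only when they are indexed. The sketch below uses `numpy.memmap` to show the concept (this is not DIAdem's mechanism, just a familiar parallel); note the read-only mode, mirroring the read-only nature of register loaded channels.

```python
import os
import tempfile
import numpy as np

# Write one million doubles to a scratch file standing in for a data file.
path = os.path.join(tempfile.mkdtemp(), "channel.dat")
np.arange(1_000_000, dtype=np.float64).tofile(path)

# Map the file instead of reading it: values are fetched from disk
# on demand, block by block, as they are indexed.
mapped = np.memmap(path, dtype=np.float64, mode="r")  # read-only view

last = mapped[-1]  # touches only the final block of the file
```

Because only the accessed blocks are ever read, inspecting the ends or plotting a decimated overview of a multi-gigabyte channel stays fast, which matches the behavior described for Figure 6.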
View a user solution describing how DIAdem processes massive amounts of data to help predict and monitor earthquake activity.