Dr. Kohji Ohbayashi 大林 康二 - Kitasato University, Graduate School of Medical Science 北里大学 大学院 医療系研究科 教授
D. Choi 崔 東学 - Kitasato University, Center for Natural Science 北里大学 一般教育部 自然科学教育センター
H. Hiro-Oka 廣岡 秀明 - Kitasato University, Center for Natural Science 北里大学 一般教育部 自然科学教育センター
A. Kubota 久保田 敦 - System House Co. 株式会社システムハウス
T. Ohno 大野 努 - System House Co. 株式会社システムハウス
R. Ikeda 池田 練造 - System House Co. 株式会社システムハウス つくば事業所 所長
K. Shimizu 清水 公也 - Kitasato University, Department of Ophthalmology 北里大学 医学部 眼科学教室 教授
Optical coherence tomography (OCT) is a noninvasive imaging technique that provides subsurface, cross-sectional images of materials. Interest in OCT technology continues to grow because it provides much finer resolution than other imaging techniques such as magnetic resonance imaging (MRI) or positron emission tomography (PET). Additionally, the method requires little preparation and is extremely safe for the patient because it uses low optical power and no ionizing radiation.
OCT uses a low-power light source and the corresponding light reflections to create images – a method similar to ultrasound, but it measures light instead of sound. When the light beam is projected into a sample, much of the light is scattered, but a small amount reflects as a collimated beam, which can be detected and used to create an image.
OCT is a promising diagnostic tool in many medical fields. In OCT applications, imaging speed is crucial for fast inspection and for achieving good image quality without motion artifacts. Even when inspecting the human eye, which can be held relatively still using a chin rest, we need a fast A-scan rate to eliminate motion artifacts. In endoscopic OCT of organs such as those in the digestive and respiratory systems, the tissue being imaged cannot be fixed in place, so we must use ultrahigh-speed OCT methods to eliminate motion artifacts. Moreover, in noninvasive real-time optical biopsy, the imaging speed must be fast enough to display the 3D image in real time for immediate diagnosis, just like a conventional endoscope. A few methods have previously been proposed for ultrahigh-speed OCT, but none have succeeded in real-time display of 3D OCT movies.
We developed our first-generation ultrahigh-speed spectral domain (SD) OCT system to be capable of an A-scan rate of 60 MHz. The key element of the method was the use of optical demultiplexers to separate 256 narrow spectral bands from a broadband incident light source. This allowed simultaneous, parallel detection of an interference fringe signal (a critical requirement for OCT imaging) at all wavelengths in the measured spectrum. As a result, the A-scan rate was equal to the digitization rate of the analog-to-digital converters (ADCs) in the system.
We based our data acquisition system around 32 NI PXI-5105 high-density 8-channel digitizers, which we used to digitize all 256 spectral bands simultaneously at a 60 MS/s sample rate. We used the digitizers’ onboard memory to acquire data, then transferred it to a PC and used LabVIEW to process and visualize the data. The NI PXI-6652 timing and synchronization module and NI-TClk synchronization technology were another critical part of our first-generation system, providing phase coherency among all channels in the system to within tens of picoseconds.
Using the optical demultiplexers in an SD-OCT system as spectral analyzers, we achieved OCT imaging of 60 million axial scans per second. Using a resonant scanner for lateral scanning, we demonstrated a 16 kHz frame rate with 1,400 A-lines per frame, a 3 mm depth range, and 23 µm resolution.
Next-Generation System Architecture
While we could capture volumetric OCT videos with our former system, the massive amount of data acquired by all channels simultaneously meant we were limited by the onboard memory of the digitizers. Overall, the duration of an OCT video was limited to about 2.5 seconds. After transferring the data to the PC, we required about three hours to fully process and render the 3D video data. In the end, real-time optical biopsy (a primary goal for endoscopic OCT) was not possible with this system. However, our new system achieves real-time 3D OCT image display (in essence 4D OCT display) with an A-scan, B-scan, and volume rate of 10 MHz, 4 kHz, and 12 volumes per second, respectively.
The experimental system is shown in Figure 1. The setup is similar to our previous system, but we made two main modifications – we chose a 1,310 nm center wavelength instead of 1,550 nm and we upgraded our data acquisition system for real-time processing.
In our system, the light source is a broadband superluminescent diode. A filter selects the wavelength range to match the optical demultiplexers. We amplify output light from the diode with a semiconductor optical amplifier and divide it equally between the sample arm and reference arm with a coupler. Our system directs the sample arm light onto the sample with a collimator lens and an objective lens. We use a resonant scanner and a galvano mirror to scan the light beam across the sample. The resonant frequency of the scanner is 4 kHz, which determines the B-scan rate of the system. The illumination optics also collect back-scattered or back-reflected light from the sample, and an optical circulator directs it to another optical amplifier. We combine the amplified output and the reference light with another coupler. The reference arm includes an optical circulator, a collimator lens, and a reference mirror.
The system sends the outputs from the coupler to two optical demultiplexers for balanced detection. The optical demultiplexers divide the light into 320 wavelength channels, which are then directed to differential photoreceivers. The photoreceiver outputs are sent to the data acquisition system. Although the OCT system can achieve a faster A-scan rate based on the 50 MHz A/D conversion speed, we selected a 10 MHz A-scan rate for the initial work.
Data Acquisition and Real-Time Processing
We built our 320-channel data acquisition system around NI FlexRIO modular FPGA hardware programmed with the NI LabVIEW FPGA Module, a graphical design language that we can use to design the FPGA circuitry without needing to know VHDL coding. NI FlexRIO combines interchangeable, customizable I/O adapter modules with a user-programmable FPGA module in a PXI or PXI Express chassis. FPGAs enable implementation of processing algorithms in hardware, so we achieved significant increases in our processing performance by moving portions of our code from the PC to the FPGA.
Figure 2 shows a diagram of our data acquisition system. For high-speed acquisition, we use the NI 5751 adapter module, which has a 50 MS/s sample rate on 16 simultaneous channels with 14-bit resolution. The adapter module interfaces to the NI PXIe-7962R FPGA module, which we use to perform the first stage of processing – subtraction of the sample-cut noise and multiplication of a window function. In total, we have 20 modules across two PXI Express chassis, so we use two NI PXIe-6674T timing and synchronization modules to distribute clocks for the system and assure precise phase synchronization across all the channels in the system.
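The module count and the first processing stage described above can be illustrated with a minimal Python sketch. It is not the LabVIEW FPGA code used in the system. It assumes that "sample-cut noise" refers to a background spectrum recorded with the sample arm blocked, and it uses a Hann window purely as an example, since the text does not name the window function. The function names are hypothetical.

```python
import math

CHANNELS = 320             # spectral channels (from the text)
CHANNELS_PER_MODULE = 16   # simultaneous channels per NI 5751 (from the text)


def modules_needed(channels=CHANNELS, per_module=CHANNELS_PER_MODULE):
    """Number of adapter/FPGA module pairs needed to cover all channels."""
    return math.ceil(channels / per_module)


def hann_window(n):
    """Hann window of length n (the choice of window is an assumption)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def preprocess(spectrum, background, window):
    """Subtract a stored background spectrum, then multiply by the window."""
    if not (len(spectrum) == len(background) == len(window)):
        raise ValueError("spectrum, background, and window must have equal length")
    return [(s - b) * w for s, b, w in zip(spectrum, background, window)]


if __name__ == "__main__":
    print(modules_needed())  # 320 channels / 16 channels per module
```

With 320 channels and 16 channels per adapter module, `modules_needed()` gives the 20 modules mentioned in the text.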
Building our system around the NI PXI platform was critical in achieving the necessary performance. First, we needed the high data throughput of PXI Express. With x4 PCIe routed to each slot on the backplane, it can sustain data throughput over 700 MB/s to each module. In addition, the PXI Express backplane architecture allows point-to-point communication between different instruments using direct DMA, eliminating the need to send data through the host processor or memory for inter-module communication. For our throughput and processing requirements, we needed to communicate between several NI FlexRIO FPGA modules with low latency and high throughput. Implementing the direct DMA necessary to meet this need normally requires complex, low-level programming. Fortunately, NI peer-to-peer (P2P) streaming technology provided a high-level abstraction which allowed us to easily connect multiple FPGAs in the system without worrying about the low-level implementation. Instead, we focused our expertise on the FPGA algorithms that determine the imaging performance of the system.
Using P2P streaming, we stream the preprocessed data on the NI PXIe-7962R FPGA modules to two NI PXIe-7965R FPGA modules, which are built around the high-performance Virtex 5 SX95 FPGA. These FPGAs contain a large number of digital signal processing (DSP) slices to optimize them for signal processing, so we used them for our intense fast Fourier transform (FFT) processing. To achieve 3D imaging capabilities, the two FPGAs in the system computed more than 700,000 512-point FFTs every second. While we used LabVIEW FPGA IP for most of our development, we were able to easily integrate Xilinx CORE Generator™ VHDL IP into our LabVIEW FPGA application to achieve this complex FFT processing.
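To illustrate what each 512-point FFT does, the following Python sketch converts one spectral fringe into a depth profile (A-scan). It is an illustration only, not the Xilinx CORE Generator IP used on the FPGAs. It assumes that the 320 channel samples are zero-padded to 512 points and that the channels are evenly spaced in wavenumber; neither detail is stated in the text. The synthetic single-reflector fringe and all names are hypothetical.

```python
import cmath
import math

FFT_SIZE = 512     # FFT length (from the text)
N_CHANNELS = 320   # samples per A-scan (from the text)


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def a_scan(fringe, fft_size=FFT_SIZE):
    """Zero-pad one preprocessed fringe and return its magnitude depth profile."""
    if len(fringe) > fft_size:
        raise ValueError("fringe is longer than the FFT size")
    padded = list(fringe) + [0.0] * (fft_size - len(fringe))
    spectrum = fft(padded)
    # The input is real, so only the first half of the bins is unique.
    return [abs(v) for v in spectrum[: fft_size // 2]]


# Synthetic fringe from a single reflector (hypothetical test signal).
DEPTH_BIN = 40
fringe = [math.cos(2.0 * math.pi * DEPTH_BIN * k / FFT_SIZE) for k in range(N_CHANNELS)]
profile = a_scan(fringe)
peak_bin = max(range(len(profile)), key=profile.__getitem__)
```

A single reflector produces a cosine fringe across the spectral channels, and the FFT maps its fringe frequency to one depth bin, so `peak_bin` lands on `DEPTH_BIN`.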
Using LabVIEW to integrate and control the different parts of the system, we transferred data over a high-speed MXI-Express fiber-optic interface from the PXI system to a quad-core Dell Precision T7500 PC with an NVIDIA Quadro FX 3800 Graphics Processing Unit (GPU) to perform real-time 3D rendering and display. We also needed to log data for extended time periods for applicability to group screening tests for cancer. While our architecture does not limit the image acquisition time, we enabled logging of up to 100 minutes on our prototype system by storing data to the NI HDD-8264 RAID system, which provided 3 TB of hard drive space.
When we defined our system, we estimated our data quantities and required throughput speeds as listed in Table 1. The number of channels (320) determines the samples per A-scan (depth scan). Then we chose 256 A-scans in a B-scan (2D cross-sectional image) to obtain reasonable image quality. This meant 81,920 data samples comprised our B-scan. With 256 B-scans per volume scan, we had 20,971,520 samples per volume, and using 14-bit ADCs, each sample was two bytes. Because our goal was to achieve a 12 volume/s volume rate, we needed an overall data rate of a little more than 500 MB/s, which we achieved with the NI PXIe-8375 MXI-Express interface.
| Item | Data Quantity and Traffic Speed |
|---|---|
| Sample number per A-scan | 320 samples |
| Sample number per B-scan (frame) | 320 × 256 = 81,920 samples |
| Sample number per V-scan (volume) | 320 × 256 × 256 = 20,971,520 samples |
| Volume number per second | 12 volumes/second |
| Number of B-scans per second | 256 × 12 = 3,072 B-scans/second |
| Sample number per second | 20,971,520 × 12 = 251,658,240 samples/second |
| Number of bytes per second | 251,658,240 × 2 bytes = 503,316,480 bytes/second |
| FFT numbers per second | 256 × 256 × 12 = 786,432 FFTs/second |
| Total logging bytes | 503,316,480 × 60 × 100 = 3,019,898,880,000 bytes |

Table 1. Data Quantities and Data Traffic Speed
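The quantities in Table 1 follow from a handful of design parameters, and the short Python sketch below reproduces the arithmetic. The variable names are ours; the input values are taken from the text.

```python
# Design parameters stated in the text.
SAMPLES_PER_A_SCAN = 320   # one sample per spectral channel
A_SCANS_PER_B_SCAN = 256
B_SCANS_PER_VOLUME = 256
VOLUMES_PER_SECOND = 12
BYTES_PER_SAMPLE = 2       # 14-bit ADC samples stored in two bytes
LOGGING_MINUTES = 100

samples_per_b_scan = SAMPLES_PER_A_SCAN * A_SCANS_PER_B_SCAN
samples_per_volume = samples_per_b_scan * B_SCANS_PER_VOLUME
b_scans_per_second = B_SCANS_PER_VOLUME * VOLUMES_PER_SECOND
samples_per_second = samples_per_volume * VOLUMES_PER_SECOND
bytes_per_second = samples_per_second * BYTES_PER_SAMPLE
# One FFT per A-scan.
ffts_per_second = A_SCANS_PER_B_SCAN * B_SCANS_PER_VOLUME * VOLUMES_PER_SECOND
total_logging_bytes = bytes_per_second * 60 * LOGGING_MINUTES

if __name__ == "__main__":
    print(f"bytes/second:        {bytes_per_second:,}")
    print(f"FFTs/second:         {ffts_per_second:,}")
    print(f"total logging bytes: {total_logging_bytes:,}")
```

The results match the "a little more than 500 MB/s" data rate, the "more than 700,000" FFTs per second, and the roughly 3 TB of storage needed for 100 minutes of logging.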
A photograph of our system is shown in Figure 3. The left-hand rack contains our bank of photoreceivers and the right-hand rack is the data acquisition system. The two PXI chassis are in the center of the rack, while signal breakout boxes are above and below. The Dell PC is on the floor at the bottom left.
We designed the system with three different real-time display modes: (a) continuous display of rendered 3D images, (b) continuous 2D cross-sectional frame scanning in a 3D cube along each of the axes, and (c) continuous display of all acquired B-scan images.
Figure 4 shows an example of a continuously rendered 3D display. The sample is the skin of a human finger. The rendered images are continuously refreshed, and we can change the viewing direction arbitrarily in real time. Figure 4a shows the fingerprint pattern clearly, while Figure 4b shows the sweat glands. We can observe the sweat glands changing in real time.
Although data is continuously transferred from the data acquisition system to the PC at 12 volumes/s, the volumetric data arrays must be reformatted for the GPU rendering process, which causes a data processing bottleneck. Currently, our prototype system refreshes the rendered images twice per second, but the GPU board was benchmarked at four volume renderings per second. Therefore, we can increase the refresh rate of the system by further optimizing our algorithms.
Figure 5a shows a cut from a real-time video displaying rendered OCT images of the three-layer structure in an extracted pig esophagus. Real-time display of 2D cross-sectional slice scanning along an axis designated as x, y, or z is also possible, as shown in Figures 5b, 5c, and 5d. The depth range is 4 mm. This image penetration depth is sufficient to detect early stage cancer.
While displaying 3D rendered images, we can also virtually cut part of the tissue and reveal the inside structure in real time. Figure 6a is a rendered image of a piece of chicken meat viewed from above. Figure 6b shows a virtual cut of the thin surface layer of the tissue. As we increase the thickness of the layer to cut, as in Figures 6c and 6d, we can see a rod-like object in the chicken meat. The virtual cutting process can be done reversibly in real time. In this particular case, we can see strong reflected light from a steel sewing needle that we inserted into the sample. As we rotate the rendered image, we can see the rod directly without a virtual cut as shown in Figure 6e. Virtually cutting a rendered image is useful for estimating depth and the spread of cancer in real time.
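One simple way to picture a virtual cut is as a mask applied to a copy of the volume before rendering. The Python sketch below is an illustration under our own assumptions, not the rendering code used in the system: it treats the tissue surface on each A-line as the first depth sample at or above a threshold, and hides everything from the top of the A-line down to `thickness` samples below that surface. The function name, the threshold-based surface detection, and the `volume[y][x][z]` layout are all hypothetical.

```python
def virtual_cut(volume, thickness, threshold):
    """Return a copy of volume[y][x][z] with a surface layer hidden.

    For each A-line, the surface is the first depth index whose intensity
    is >= threshold. All samples above the surface and the `thickness`
    samples starting at the surface are set to 0. The input volume is
    left unchanged, so the cut can be undone by re-rendering the original.
    """
    if thickness < 0:
        raise ValueError("thickness must be non-negative")
    cut = []
    for b_scan in volume:
        new_b_scan = []
        for a_line in b_scan:
            surface = next(
                (z for z, v in enumerate(a_line) if v >= threshold),
                len(a_line),  # no surface found: hide the whole A-line
            )
            end = min(surface + thickness, len(a_line))
            new_b_scan.append([0.0] * end + list(a_line[end:]))
        cut.append(new_b_scan)
    return cut


# Tiny hypothetical volume: one B-scan containing one A-line.
a_line = [0.0, 0.0, 5.0, 5.0, 5.0, 9.0, 5.0]
volume = [[a_line]]
thin_cut = virtual_cut(volume, thickness=2, threshold=1.0)
deep_cut = virtual_cut(volume, thickness=3, threshold=1.0)
```

Increasing `thickness` removes more of the surface layer, which is how a strongly reflecting object buried in the tissue, like the needle in Figure 6, would come into view.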
Overall, we leveraged the flexibility and scalability of the PXI platform and NI FlexRIO to develop the world’s first real-time 3D OCT imaging system. We used LabVIEW to program, integrate, and control the different parts of the system, combining high-channel-count acquisition with FPGA and GPU processing for real-time computation, rendering, and display.
Using FPGA-based processing enabled by NI FlexRIO, we computed more than 700,000 512-point FFTs every second to achieve 3D imaging, while maintaining high channel density for the 320-channel system. Using PXI, we leveraged high-throughput data transfers over PCI Express, accurate timing and synchronization of multiple modules, and “peer-to-peer data streams” to transfer data directly between FPGA modules without going to the host. We were also able to maintain a high-throughput connection between our I/O and the GPU processing in the host PC, as well as to integrate RAID hardware for extended image data logging.
With this processing system, we demonstrated continuous real-time display of 3D OCT images, and we can rotate the rendered 3D image in any direction in real time. Observation of tissues, such as the trachea or esophagus, with good image penetration depth demonstrates the applicability of our method to optical biopsy. Further, revealing the inside of a structure by virtually cutting the tissue surface in real time would be very useful for cancer diagnosis. We even observed dynamic tissue changes with our system, which surgeons could use to observe blood flow and tissue changes during surgery.
This work is supported by the “Development of Systems and Technology for Advanced Measurement and Analysis” program of the Japan Science and Technology Agency (JST).
Dr. Kohji Ohbayashi 大林 康二
Kitasato University, Graduate School of Medical Science 北里大学 大学院 医療系研究科 教授
Kitasato 1-15-1, Sagamihara