FPGA Hardware Implementation and Experimental Verification of Direction-of-Arrival Estimation Algorithm Using LU Decomposition

Dr. Nizar Tayem, Texas A&M University-Commerce, Assistant professor, Engineering and Technology, Electrical Engineering

"We found NI platform to be the most suitable for hardware implementation of our DOA estimation algorithms. "

- Dr. Nizar Tayem, Texas A&M University-Commerce, Assistant professor, Engineering and Technology, Electrical Engineering

The Challenge:

The Department of Electrical Engineering at Prince Mohammad Bin Fahd University needed to implement a computationally intensive direction-of-arrival (DOA) algorithm in an FPGA to estimate arrival angles as fast as possible while using the fewest hardware resources. Researchers also wanted to experimentally verify the algorithm’s performance and estimation accuracy.

The Solution:

Using NI hardware and the high-throughput mathematical functions in the LabVIEW FPGA Module, researchers developed a complete RF testbed that can experimentally validate the performance of the proposed DOA estimation algorithm based on lower-upper (LU) decomposition. A uniform linear array of four antenna elements is deployed at the testbed receiver, which is connected to PXI modules that handle the RF signal acquisition, down-conversion, digitization, and high-throughput processing.


Dr. Nizar Tayem - Texas A&M University-Commerce, Assistant professor, Engineering and Technology, Electrical Engineering
Ahmed Hussain - Prince Mohammad Bin Fahd University, College of Electrical Engineering, Lecturer and Lab Coordinator
Dr. Soliman Abdel-Hamid - Staffordshire University, Associate Professor in Telecommunications and Signal Processing


Estimating the DOA angles of impinging RF signals is an active and important research area with practical applications in both civilian and military fields such as sonar and radar for source localization; multiple input, multiple output (MIMO) systems; and beamforming smart antenna arrays in mobile communication.

The practical significance of a DOA estimation method can be established only through real-time testing on actual hardware, which validates the method in terms of accuracy, computational speed, hardware resource requirements, and implementation cost.

Though the DOA estimation literature reports several algorithms, only a few of these are suitable for hardware implementation. Algorithms based on singular value decomposition (SVD) or eigenvalue decomposition (EVD), such as ESPRIT and MUSIC, are not well suited to hardware implementation because of their higher complexity, resource requirements, and computation time. In contrast, algorithms based on LU factorization, Cholesky decomposition, or QR decomposition have lower computational complexity, which makes them suitable for real-time hardware implementation.

DOA estimation algorithms involve complex numerical calculations that must run in real time under high-speed and high-accuracy constraints. This makes hardware realization a challenging task for designers, and the choice of algorithm plays a vital role in the performance of the implementation. Advances in IC technology, in particular FPGAs that provide large quantities of logic gates on a single chip, now make the implementation of these algorithms possible.


Therefore, the challenge is to implement the proposed DOA estimation algorithms in hardware and build a complete hardware prototype for experimental verification and real-time test of the DOA estimation algorithms.


Proposed DOA Estimation Algorithm

In the proposed algorithm, we used LU factorization to decompose the data correlation matrix into signal and noise subspaces and, from these, to find the DOAs of multiple incident RF sources. LU factorization is less complex than QR factorization and requires roughly half the number of floating-point operations (flops). Fewer flops reduce both the memory storage and the processing time.


We considered two methods in the proposed algorithm for hardware implementation: partial L matrix (LU-L) and partial U matrix (LU-U). We verified these methods through simulations in The MathWorks, Inc. MATLAB® software before implementing them on a Xilinx Virtex-5 FPGA using the high-throughput mathematical functions in the LabVIEW FPGA Module. We then validated the proposed DOA estimation algorithms experimentally through real-time tests on a hardware prototype built using the NI PXI platform and through hardware simulations using LabVIEW FPGA.

We compared the proposed algorithms with QR decomposition-based DOA estimation methods (QR-R, QR-Q) in terms of estimation accuracy, resource utilization, and processing time. Both simulations and real-time experiments establish LU-U as the best method overall. QR-R has slightly better estimation accuracy than LU-U, but this comes at a much higher cost in FPGA resource consumption and processing time. LU-U consumes the fewest FPGA resources, whereas QR-R consumes the most. In addition, LU-U is the fastest in computing the DOA estimates.
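The case study does not spell out how the partial L or partial U matrix is turned into angle estimates, so the following Python/NumPy sketch is only one plausible reading, not the authors' implementation. It assumes that the first K columns of L (LU-L) or the first K rows of U (LU-U) span the signal subspace for K sources, and it assumes an ESPRIT-style shift-invariance step on a half-wavelength uniform linear array to recover the angles. The function names `lu_no_pivot` and `estimate_doa_lu` are hypothetical, and the arithmetic is floating point rather than the fixed-point arithmetic used on the FPGA.

```python
import numpy as np

def lu_no_pivot(a):
    """Doolittle LU factorization without pivoting, so that a = l @ u."""
    n = a.shape[0]
    l = np.eye(n, dtype=complex)
    u = np.array(a, dtype=complex)
    for k in range(n - 1):
        l[k + 1:, k] = u[k + 1:, k] / u[k, k]             # multipliers for column k
        u[k + 1:, :] -= np.outer(l[k + 1:, k], u[k, :])   # eliminate below the pivot
    return l, u

def estimate_doa_lu(r, num_sources, variant="U"):
    """Estimate DOAs in degrees from an N x N correlation matrix r.

    Assumption: half-wavelength ULA with steering phase exp(j*pi*n*sin(theta)).
    """
    l, u = lu_no_pivot(r)
    if variant == "U":
        basis = u[:num_sources, :].conj().T   # first K rows of U (assumed LU-U)
    else:
        basis = l[:, :num_sources]            # first K columns of L (assumed LU-L)
    # Shift invariance: rows 1..N-1 of the basis equal rows 0..N-2 times psi.
    psi, *_ = np.linalg.lstsq(basis[:-1], basis[1:], rcond=None)
    phases = np.angle(np.linalg.eigvals(psi))
    return np.sort(np.degrees(np.arcsin(phases / np.pi)))

# Illustrative, nearly noise-free correlation matrix for two sources on four elements.
angles = np.radians([-20.0, 30.0])
n = np.arange(4)[:, None]
a = np.exp(1j * np.pi * n * np.sin(angles))   # 4 x 2 steering matrix
r = a @ a.conj().T + 1e-6 * np.eye(4)
print(estimate_doa_lu(r, 2, "U"))   # close to -20 and 30 degrees
print(estimate_doa_lu(r, 2, "L"))   # close to -20 and 30 degrees
```

Pivoting is omitted because a correlation matrix with a noise term is Hermitian positive definite, for which LU factorization without row exchanges exists. Only the leading K columns or rows are used, which is why the methods are called "partial".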


Hardware Implementation

To implement the proposed DOA estimation algorithms, we selected Xilinx Virtex-5 SXT FPGA target hardware and programmed it using LabVIEW software. LabVIEW graphical software helped us configure NI-certified hardware modules in a block diagram fashion so we could develop prototyping designs quickly.



As the hardware implementation model in Figure 1 shows, signals received from the uniform linear array (ULA) are downconverted, digitized, and stored in a first-in-first-out (FIFO) queue. We execute these steps on the host (PC) and run the DOA estimation algorithm on the FPGA target. Direct memory access (DMA) transfers the signal data from the host to the FPGA through the FIFO at high speed.


To achieve high throughput, we used a pipelined architecture for the FPGA implementation of the proposed DOA estimation algorithms (Figure 2). The different stages of the pipeline represent the major operations of the algorithm. Data flows from one stage of the pipeline to the next one, which permits high-throughput implementation for the chosen algorithm.



FPGA Resource Utilization and Processing Time

We implemented the proposed algorithms in hardware on the PXIe-7965 PXI FPGA Module for FlexRIO, which features a DSP-focused Xilinx Virtex-5 SXT FPGA with 512 MB of onboard RAM.


We programmed the FPGA using the LabVIEW FPGA Module, which features high-throughput mathematical operations designed for FPGA targets. We chose a fixed-point data type for the implementation because it offers acceptable accuracy with much lower resource usage and higher speed. A floating-point data type provides higher accuracy, but at the cost of significantly more FPGA resources and lower speed.

We developed separate LabVIEW code files, called virtual instruments (VIs), that implement the proposed DOA algorithms using LU-U and LU-L factorization. We also developed LabVIEW FPGA code using QR-Q and QR-R factorization for comparison. We compiled all of these VIs to test and evaluate the DOA estimation algorithms in real time and to produce a report on the FPGA resources consumed and the processing speed achieved (clock rate, in MHz).

Table 1 shows the FPGA resources consumed (for word length of 16 bits and integer size of 8 bits) in the implementation of DOA estimation algorithms using QR-Q, QR-R, LU-U, and LU-L. The DOA estimation using LU-U consumes the fewest resources while QR-R consumes the most resources.
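To give a feel for what a word length of 16 bits with an integer size of 8 bits implies, the short Python sketch below quantizes values to that grid. It assumes a signed format in which the sign bit is counted in the integer part, leaving 8 fractional bits, and it assumes round-to-nearest with saturation; the case study does not state the signedness, rounding, or overflow modes used on the FPGA. The name `to_fixed_point` is hypothetical.

```python
import numpy as np

def to_fixed_point(x, word_length=16, integer_length=8):
    """Round x to a signed fixed-point grid and saturate at the range limits."""
    frac_bits = word_length - integer_length
    step = 2.0 ** -frac_bits                    # smallest representable increment
    lo = -(2.0 ** (integer_length - 1))         # sign bit counted in the integer part
    hi = 2.0 ** (integer_length - 1) - step
    scaled = np.round(np.asarray(x, dtype=float) / step) * step
    return np.clip(scaled, lo, hi)

x = np.array([0.1, -3.14159, 200.0])
q = to_fixed_point(x)
print(q)                                              # 200.0 saturates at the upper limit
print(np.abs(q[:2] - x[:2]).max() <= 2.0 ** -9)       # in-range error is at most half a step
```

Under these assumptions the resolution is 2^-8 (about 0.0039) and the representable range is -128 to just under 128, which illustrates the accuracy that is traded for lower resource usage.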



Figure 3 shows the device utilization (as percentages) and the processing speed (clock rate, in MHz) for each DOA estimation method. Overall, LU-U outperforms all other methods in terms of resource utilization and processing time.









Real-Time Experimental Verification

We implemented real-time experimental verification of the proposed algorithms using the NI PXI platform, which features a data acquisition module, digitizers, RF downconverters, RF upconverters, local oscillators, arbitrary waveform generators, and an FPGA module for FlexRIO with Xilinx Virtex-5.



Experimental Setup

Figure 4 shows the experimental setup with two transmitters and a uniform linear array with four antenna elements deployed at the receiver. The interelement spacing between the receiver antennas is half a wavelength (λ/2).
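The array geometry determines the phase progression that a far-field source produces across the elements. As a minimal sketch, the Python function below builds the steering vector of a four-element uniform linear array with λ/2 spacing under the usual narrowband plane-wave model. The name `steering_vector`, the sign convention of the phase, and measuring the angle from broadside are assumptions made for illustration.

```python
import numpy as np

def steering_vector(theta_deg, num_elements=4, spacing_wavelengths=0.5):
    """Phase response of a uniform linear array to a plane wave from theta_deg.

    The angle is measured from broadside; element 0 is the phase reference.
    """
    n = np.arange(num_elements)
    phase_step = 2 * np.pi * spacing_wavelengths * np.sin(np.radians(theta_deg))
    return np.exp(1j * phase_step * n)

print(np.round(steering_vector(0.0), 3))    # broadside: all elements in phase
print(np.round(steering_vector(30.0), 3))   # 90 degrees of phase shift per element
```

With half-wavelength spacing the phase step per element is π·sin(θ), so it stays within ±π for every angle between -90° and 90° and each direction maps to a unique phase progression.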



The NI PXI transmitter is implemented as shown in Figure 5. LabVIEW built-in functions for source coding, channel coding, and modulation first generate a signal in the digital domain. An arbitrary waveform generator (AWG) module (PXI-5421) then converts this digital signal to an intermediate frequency (IF) analog signal. Next, an upconverter module (PXIe-5652) converts the analog signal to an RF signal. Finally, an RF amplifier module (PXI-5691) amplifies the signal before transmission. All these modules are housed in the PXI chassis shown in Figure 6. The transmitter unit acts as a source in the far-field region of the receiver.




The AWG runs at a maximum sample rate of 100 MS/s. The IF signal has a frequency of 25 MHz, and the maximum frequency of the RF signal generated by the upconverter is 2.7 GHz.


Figure 7 shows the receiver units on the NI PXI chassis. Each receiver unit is composed of an RF downconverter (PXIe-5601) and a high-speed digitizer (PXIe-5622).


The downconverter operates at a maximum frequency of 2.7 GHz with a bandwidth of 15 MHz. The received signal is downconverted to an IF signal of 15 MHz, which is then fed to a digitizer operating at a maximum sample rate of 64 MS/s. The outputs of the digitizers are modulated signals in (I, Q) form, from which the amplitude and phase information of the message signal is extracted.
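As a rough illustration of this last step, the Python sketch below converts (I, Q) samples to amplitude and phase and forms a sample estimate of the data correlation matrix from the four receiver channels. The function names `iq_to_polar` and `sample_correlation` are hypothetical, and the random data merely stands in for digitizer output; the actual processing in the testbed is done in LabVIEW.

```python
import numpy as np

def iq_to_polar(i, q):
    """Return (amplitude, phase in radians) for in-phase and quadrature samples."""
    z = np.asarray(i, dtype=float) + 1j * np.asarray(q, dtype=float)
    return np.abs(z), np.angle(z)

def sample_correlation(snapshots):
    """Average the outer products of the snapshots.

    snapshots: complex array of shape (num_antennas, num_snapshots).
    """
    x = np.asarray(snapshots)
    return x @ x.conj().T / x.shape[1]

amplitude, phase = iq_to_polar([1.0, 0.0], [1.0, 2.0])
print(amplitude, phase)

# Stand-in data: four channels, 100 complex snapshots each.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 100)) + 1j * rng.standard_normal((4, 100))
r = sample_correlation(x)
print(r.shape, np.allclose(r, r.conj().T))   # 4 x 4 and Hermitian
```

The resulting 4 x 4 Hermitian matrix is the kind of input that a factorization-based DOA method then decomposes.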


Results and Conclusion

We evaluated the performance of the two proposed DOA estimation algorithms based on LU decomposition using real-time experiments. When we compared their performance with that of QR decomposition-based algorithms, LU-U was the best overall method for DOA estimation: it had the lowest FPGA resource utilization, processing time, and computational complexity, with estimation accuracy only slightly below that of QR-R.

The work presented in this case study is significant in that it experimentally establishes LU decomposition as an efficient method for DOA estimation in terms of estimation accuracy, hardware resources, and processing time.


We intend to continue developing more efficient DOA estimation algorithms and to implement and experimentally validate them using NI hardware and software. In the future, we will consider more antenna elements and different array configurations, such as L-shaped and circular arrays. More antenna elements will improve estimation accuracy, but at a significantly higher cost in terms of hardware resources and processing time. This will require smarter and more efficient algorithms that reduce computational complexity without compromising accuracy.


MATLAB® is a registered trademark of The MathWorks, Inc.


Author Information:

Dr. Nizar Tayem
Texas A&M University-Commerce, Assistant professor, Engineering and Technology, Electrical Engineering
Texas A&M University-Commerce
Commerce, TX 75428
United States
Tel: 2145580071


Figure 1. Hardware Implementation Model
Figure 2. Pipelined Execution of DOA Estimation Algorithm Based on LU Factorization
Table 1. FPGA Resources Consumed for DOA Estimation Algorithms Using QR and LU
Figure 3. Percentages for Device Utilization and Processing Time for DOA Estimation With 16/8 Data Size
Figure 4. Experimental Setup Showing Two Transmitters (in the foreground) and a Four-Element Antenna Array and PXI System (in the background)
Figure 5. Transmitter Unit Block Diagram
Figure 6. NI PXI Transmitter Modules in the NI PXI Platform Chassis
Figure 7. NI PXI Receiver Modules in the NI PXI Platform Chassis