FPGAs for Wireless Engineers Series: Don’t Let FPGA Compiles Be a Bottleneck

Publish Date: Aug 14, 2018


Wireless engineers often want to use over-the-air signals to go from concept to prototype. Software-defined radios (SDRs) such as the USRP (Universal Software Radio Peripheral) device provide a flexible way to meet that need. With today’s applications demanding higher bandwidths and lower latencies, more of the signal processing needs to be implemented on the FPGAs of SDRs. However, wireless engineers programming FPGAs often face the following challenges:

1. Difficulties interfacing between FPGAs and RF signals, the host CPU, and other resources like on-chip memory
2. Unfamiliar programming paradigms for algorithm implementation, and
3. Long compile times

In this series on “FPGA Prototyping for Wireless Engineers”, learn how the LabVIEW Communications System Design Suite (LabVIEW Communications) and NI SDR hardware can help you overcome each of these key challenges and quickly create real-time, over-the-air testbeds without FPGA knowledge.

Table of Contents

Part 1. Immediately Connect Your FPGA Algorithms to I/O

Part 2. Go from a Concept to FPGA Code with No HDL Expertise

Part 3. Don’t Let FPGA Compiles Be a Bottleneck


Part 3. Don’t Let FPGA Compiles Be a Bottleneck

When you create an FPGA design, long compile times are frustrating and have traditionally slowed development. However, by following this simple unit-test workflow in LabVIEW Communications, you can reduce the time wasted on unsuccessful FPGA compiles and be confident that when you do run a compile, the output functions correctly.


Summary of Steps

  1. Conduct a Functional Simulation on a CPU
  2. Convert the Algorithm From Floating Point to Fixed Point
  3. Simulate the FPGA Input and Output Interfaces
  4. Conduct FPGA-in-the-Loop Test; Compile Algorithm and Input/Output Interfaces
  5. Integrate the FPGA-in-the-Loop Code Into the Main Design


Step 1. Conduct a Functional Simulation on a CPU

In the second installment of the “FPGA Prototyping for Wireless Engineers” series, you completed this step by creating a floating-point implementation of a 20 MHz, LTE-like OFDM modulator and a testbench to verify that it functions correctly (see Figure 1). You generated a random stream of QAM symbols to be modulated and then analyzed them at the output to verify that the OFDM modulation was correct.




Figure 1. (a) The completed testbench for the OFDM modulator is running on the CPU. You generate a test vector and run it through the algorithm. Then you analyze the output. (b) When the outputs are plotted, you observe the expected magnitude and phase.
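The same testbench pattern (generate a test vector, run it through the algorithm, analyze the output) can be sketched in text form. The Python below is only an illustration and not the LabVIEW Communications diagram from Figure 1: the FFT size, cyclic-prefix length, QPSK constellation, and all function names are hypothetical, and the sizes are deliberately tiny rather than the 20 MHz LTE-like parameters.

```python
import cmath
import random

N_FFT = 64   # hypothetical FFT size, kept small so a naive DFT stays fast
N_CP = 16    # hypothetical cyclic-prefix length
QPSK = [complex(i, q) / 2 ** 0.5 for i in (-1, 1) for q in (-1, 1)]


def idft(bins):
    """Naive inverse DFT: frequency-domain bins to time-domain samples."""
    n = len(bins)
    return [sum(b * cmath.exp(2j * cmath.pi * k * t / n)
                for k, b in enumerate(bins)) / n
            for t in range(n)]


def dft(samples):
    """Naive forward DFT, used only by the testbench to check the output."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, s in enumerate(samples))
            for k in range(n)]


def ofdm_modulate(symbols):
    """Algorithm under test: IDFT, then prepend the cyclic prefix."""
    time_samples = idft(symbols)
    return time_samples[-N_CP:] + time_samples


def ofdm_demodulate(samples):
    """Reference receiver: drop the cyclic prefix, then DFT."""
    return dft(samples[N_CP:])


def run_testbench(seed=0):
    """Generate a random test vector, modulate it, and analyze the output."""
    rng = random.Random(seed)
    tx_symbols = [rng.choice(QPSK) for _ in range(N_FFT)]
    waveform = ofdm_modulate(tx_symbols)
    rx_symbols = ofdm_demodulate(waveform)
    return max(abs(a - b) for a, b in zip(tx_symbols, rx_symbols))


worst_error = run_testbench()
print(f"worst-case symbol error: {worst_error:.2e}")
```

Because this check runs entirely on the CPU, a mistake in the algorithm shows up in seconds rather than after a compile.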



Step 2. Convert the Algorithm From Floating Point to Fixed Point

Converting a floating-point algorithm to fixed point is a crucial step in the FPGA design flow: FPGAs are resource constrained, so designs tend to require fixed-point math. Done by hand, this step is often tedious as you weigh trade-offs between algorithm fidelity and FPGA footprint. LabVIEW Communications accelerates the process with the automated Convert to Fixed-Point tool. You can use the same testbench from Step 1 (see Figure 2) to compare the golden floating-point implementation with the fixed-point implementation. To see how this process works, refer to the “Convert Algorithms From Floating Point to Fixed Point in LabVIEW Communications” white paper.


Figure 2. The floating-point implementation of the OFDM transmitter is in the top branch, and the fixed-point implementation is in the bottom branch. The algorithm was converted using the automated Convert to Fixed-Point tool in LabVIEW Communications. The corresponding magnitude and phase outputs are compared to confirm that the implementation is sufficiently accurate.
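To make the fidelity-versus-footprint trade-off concrete, the following Python sketch quantizes a floating-point test vector at several word lengths and reports the worst-case error against the golden values. It is not the Convert to Fixed-Point tool or its algorithm. The `to_fixed` helper, the choice of one integer bit (the sign bit), round-to-nearest, and saturation are all assumptions made for illustration.

```python
import math
import random


def to_fixed(value, word_bits, int_bits):
    """Quantize to signed fixed point (hypothetical helper).

    Assumes the sign bit is counted in int_bits, rounds to nearest,
    and saturates instead of wrapping on overflow.
    """
    frac_bits = word_bits - int_bits
    scale = 1 << frac_bits
    raw = math.floor(value * scale + 0.5)
    raw_min = -(1 << (word_bits - 1))
    raw_max = (1 << (word_bits - 1)) - 1
    raw = max(raw_min, min(raw_max, raw))
    return raw / scale


def worst_error(samples, word_bits, int_bits=1):
    """Largest absolute difference between golden and quantized samples."""
    return max(abs(x - to_fixed(x, word_bits, int_bits)) for x in samples)


rng = random.Random(1)
golden = [rng.uniform(-0.9, 0.9) for _ in range(1000)]  # floating-point reference
errors = {bits: worst_error(golden, bits) for bits in (8, 12, 16)}
for bits, err in errors.items():
    print(f"{bits}-bit word: worst-case error {err:.2e}")
```

Within the representable range, the error stays at about half of one least-significant bit, so each extra fractional bit roughly halves the error while widening every register and multiplier that carries the value.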



Step 3. Simulate the FPGA Input and Output Interfaces

Now that the algorithm is in fixed point and functionally correct, you need to implement the FPGA interfaces into the algorithm. When working with FPGA math, you must simulate both the algorithm itself (steps 1 and 2) and the interfaces into and out of the algorithm before running a lengthy compile because the interfaces often involve bit packing and buffering that can be a source of user error.


The interface into the algorithm on the FPGA from the desktop computer testbench involves a few general steps, which are illustrated in Figure 3 and described below:

  1. Pack input data on the host for efficient streaming to the FPGA.
  2. Write the data from the host to a "Host to Target" DMA FIFO. This will be accessible by the simulated FPGA code.
  3. In the FPGA code, read data from the "Host to Target" DMA FIFO and unpack the data.
  4. Send data through the algorithm in the simulated FPGA code; the host testbench simultaneously waits for one symbol to be produced from the FPGA.
  5. In the FPGA code, pack the output from the algorithm and stream to a "Target to Host" DMA FIFO. This will be accessible by the host testbench.
  6. Read data from the "Target to Host" DMA FIFO on the host testbench.
  7. Unpack the data and analyze the output to ensure the algorithm is correctly implemented.


Figure 3. (a) The top diagram shows the testbench on the host with interfaces to the FPGA. (b) The bottom diagram shows the simulated FPGA code including interfaces into and out of your algorithm. The data follows this path: (1) Data starts on the host and is packetized. (2) The packet is written to a "Host to Target" DMA FIFO. (3) In the FPGA code, the packet is read from the DMA FIFO and unpacked. (4) The data is modulated by "OFDM Tx_FXP" on the FPGA. Meanwhile, the host waits for the algorithm to finish. (5) The data is packed and written to a "Target to Host" DMA FIFO. (6) The DMA FIFO is read by the host testbench and the data is unpacked. (7) The output from the testbench is analyzed on the host to ensure the FPGA algorithm is correctly implemented.
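The seven-step data path can also be sketched in Python to show why bit packing deserves its own simulation. Everything here is an assumption for illustration: the 32-bit word layout (16-bit signed I in the upper half, 16-bit signed Q in the lower half), the use of `collections.deque` as a stand-in for the DMA FIFOs, and the `swap_iq` placeholder that takes the place of the real algorithm. None of these names come from LabVIEW Communications.

```python
from collections import deque


def pack_iq(i, q):
    """Pack two signed 16-bit values into one 32-bit word (assumed layout)."""
    return ((i & 0xFFFF) << 16) | (q & 0xFFFF)


def _signed16(x):
    """Reinterpret a 16-bit field as a two's-complement signed value."""
    return x - 0x10000 if x & 0x8000 else x


def unpack_iq(word):
    """Inverse of pack_iq: recover the signed I and Q values."""
    return _signed16((word >> 16) & 0xFFFF), _signed16(word & 0xFFFF)


def swap_iq(i, q):
    """Placeholder algorithm (hypothetical): swap I and Q."""
    return q, i


def simulated_fpga(host_to_target, target_to_host, algorithm):
    """Steps 3 to 5: read and unpack, run the algorithm, pack and write."""
    while host_to_target:
        i, q = unpack_iq(host_to_target.popleft())
        i_out, q_out = algorithm(i, q)
        target_to_host.append(pack_iq(i_out, q_out))


def run_host_testbench(samples):
    """Steps 1, 2, 6, and 7: pack and write, then read back and unpack."""
    host_to_target, target_to_host = deque(), deque()
    for i, q in samples:
        host_to_target.append(pack_iq(i, q))
    simulated_fpga(host_to_target, target_to_host, swap_iq)
    return [unpack_iq(target_to_host.popleft()) for _ in range(len(samples))]


samples = [(0, 0), (1, -1), (32767, -32768), (-12345, 6789)]
results = run_host_testbench(samples)
print(results == [(q, i) for i, q in samples])
```

Including negative and full-scale values in the test vector is deliberate, since a missing sign extension or a swapped half-word passes unnoticed when only small positive samples are used.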



Step 4. Conduct FPGA-in-the-Loop Test; Compile Algorithm and Input/Output Interfaces

The algorithm and the input and output interfaces are confirmed to be logically correct, so the first compile can now be run. Note that up to this point, the design flow was focused on simulations to verify as much of the implementation logic as possible before compiling code to an FPGA circuit. The simulated FPGA code in Figure 3 can be compiled without any changes, and you should be confident that the algorithm will successfully compile and function as expected given the rigorous simulations so far. Once the compile is done, the testbench on the desktop can be run again with minor changes to deploy the compiled bitfile to the FPGA.



Step 5. Integrate the FPGA-in-the-Loop Code Into the Main Design

Now that you know the algorithm and its interfaces function correctly when deployed to hardware and you have a good sense of the FPGA resources required for the algorithm, you need to integrate the algorithm into your overall design, whether it be a custom system or the Application Frameworks, and either conduct system-level simulations or run a compile. You can copy the code in Figure 3b into the overall design and modify the FIFO interfaces to connect to the correct parts of the complete signal processing chain. Download the evaluation and explore in-product tutorials to learn how!


When running a compile, you should use the LabVIEW FPGA Compile Cloud Service, which can run faster and does not consume local processor resources. Because compiles are not deterministic, you can launch multiple compiles in the cloud for the same design to increase the chances that at least one of them succeeds. This is recommended for larger designs.
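As a rough way to reason about how many parallel compiles to launch, the sketch below assumes each compile succeeds independently with the same probability. Both the independence assumption and the 60 percent figure are hypothetical and are not measurements of the compile service.

```python
def chance_any_succeeds(p_single, n_compiles):
    """Probability that at least one of n independent compiles succeeds."""
    return 1.0 - (1.0 - p_single) ** n_compiles


P_SINGLE = 0.6  # hypothetical per-compile success rate for a large design
for n in (1, 2, 3, 5):
    print(f"{n} compile(s): {chance_any_succeeds(P_SINGLE, n):.3f}")
```

Under these assumptions the benefit shrinks with each additional compile, so a handful of parallel runs captures most of the gain.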


By following these five steps when designing FPGA signal processing, you will reduce the time you waste on unsuccessful FPGA compiles and ensure the compiles that are run are more likely to function correctly the first time.
