MIMO Application Framework White Paper


Author: Erik Luther, Senior Product Marketing Manager, RF and SDR

Massive MIMO is an exciting area of 5G wireless research. For next-generation wireless data networks, it promises significant gains that offer the ability to accommodate more users at higher data rates with better reliability while consuming less power. Using the NI Massive MIMO Software Architecture, researchers can build Massive MIMO testbeds to rapidly prototype large-scale antenna systems using award-winning LabVIEW system design software and state-of-the-art NI USRP™ RIO software defined radios (SDRs). With a simplified design flow for creating FPGA-based logic and streamlined deployment for high-performance processing, researchers in this field can meet the demands of prototyping these highly complex systems with a unified hardware and software design flow.

Introduction to Massive MIMO

Exponential growth in the number of mobile devices and the amount of wireless data they consume is driving researchers to investigate new technologies and approaches to address the mounting demand. The next generation of wireless data networks, called the fifth generation or 5G, must address not only future capacity constraints but also existing challenges with current communication systems, such as network reliability, coverage, energy efficiency, and latency. Massive MIMO, a candidate for 5G technology, promises significant gains in wireless data rates and link reliability by using significantly more antennas at the base transceiver station (BTS) than current wireless technologies do. This approach radically departs from the BTS architecture of current standards, which uses up to eight antennas in a sectorized topology. With hundreds of antenna elements, Massive MIMO uses precoding techniques to focus energy on targeted mobile users. Directing the wireless energy to specific users reduces radiated power and, at the same time, decreases interference to other users. This is particularly attractive in today’s interference-limited cellular networks. If the promise of Massive MIMO holds true, 5G networks of the future will be faster and accommodate more users with better reliability and increased energy efficiency.

With so many antenna elements, Massive MIMO has several system challenges not encountered in today’s networks. For example, today’s advanced data networks based on LTE or LTE-Advanced require pilot overhead proportional to the number of antennas at the base station. Massive MIMO manages pilot overhead for a large number of antennas by using time division duplexing (TDD) between uplink and downlink and assuming channel reciprocity. Channel reciprocity allows channel state information obtained from uplink pilots to be used in the downlink precoder. Additional challenges in realizing Massive MIMO include scaling data buses and interfaces by an order of magnitude or more and achieving distributed synchronization among a large number of independent RF transceivers.

These timing, processing, and data collection challenges make prototyping vital. For researchers to validate theory, this means moving from theoretical work to testbeds. Using real-world waveforms in real-world scenarios, researchers can develop prototypes to determine the feasibility and commercial viability of Massive MIMO. As with any new wireless standard or technology, the transition from concept to prototype impacts the time to actual deployment and commercialization. 


1. Massive MIMO Prototype Synopsis

Outlined below is a complete Massive MIMO prototyping system. It includes the hardware and software needed to build the world’s most versatile, flexible, and scalable Massive MIMO testbed capable of real-time, two-way communication over bands and bandwidths of interest to the research community. With NI software defined radios (SDRs) and LabVIEW system design software, the modular nature of the MIMO system allows for scaling from only a few antennas to a 128-antenna Massive MIMO system. Because the hardware is flexible, it can be redeployed in other configurations as wireless research needs evolve over time, such as distributed nodes in an ad-hoc network or multi-cell coordinated networks.

Figure 1. One of the World's First Real-Time Massive MIMO Testbeds, Created at Lund University Using the Massive MIMO Prototyping System


The Massive MIMO prototyping system includes fully-functional software that implements an LTE-like PHY and TDD for mobile access. Table 1 shows the system and protocol parameters supported by the Massive MIMO prototyping system.

Table 1. Massive MIMO System Parameters


2. Massive MIMO System Architecture

Massive MIMO envisioned for cellular applications consists of the BTS and user equipment (UE) or mobile users. Massive MIMO, however, departs from the conventional topology by allocating a large number of BTS antennas to communicate with multiple UEs simultaneously. In the flexible NI prototyping system, the BTS uses a system design factor of approximately 10 base station antenna elements per UE, providing 12 users with simultaneous, full-bandwidth access to the 128-antenna base station. A design factor of at least 10 base station antennas per UE has been shown to allow most of the theoretical gains to be harvested and to give optimal maximum ratio combining (MRC) decoder performance.

In a Massive MIMO system, a set of UEs concurrently transmit an orthogonal pilot set to the BTS. The uplink pilots received at the BTS can then be used to estimate the channel matrix. In the downlink time slot, this channel estimate is used to compute a precoder for the downlink signals. Ideally, this results in each mobile user receiving an interference-free channel with the message intended for them. Precoder design is an open area of research and can be tailored to various system design objectives. For instance, MIMO precoders and decoders can be designed to null interference at other users, minimize total radiated power, or reduce the peak-to-average power ratio of transmitted RF signals.
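
The uplink-pilot and downlink-precoder sequence described above can be sketched in plain Python. This is a minimal, noise-free illustration and not the reference design's implementation: the DFT-based pilot set, the least-squares estimator, the maximum-ratio transmission (MRT) precoder, and all function names are assumptions chosen for clarity, and channel reciprocity is taken to be exact.

```python
import cmath
import random


def make_pilots(num_ues):
    """Orthogonal pilots (assumed): rows of a DFT matrix, one sequence per UE."""
    return [[cmath.exp(-2j * cmath.pi * k * t / num_ues) for t in range(num_ues)]
            for k in range(num_ues)]


def estimate_channel(received, pilots):
    """Least-squares estimate: correlate each antenna's samples with each pilot."""
    n = len(pilots)
    return [[sum(p.conjugate() * r for p, r in zip(pilots[k], rx_row)) / n
             for k in range(n)]
            for rx_row in received]


def mrt_precoder(h_est):
    """MRT weights: conjugate of the estimated channel, unit norm per UE."""
    num_ant, num_ues = len(h_est), len(h_est[0])
    w = [[0j] * num_ues for _ in range(num_ant)]
    for k in range(num_ues):
        norm = sum(abs(h_est[m][k]) ** 2 for m in range(num_ant)) ** 0.5
        for m in range(num_ant):
            w[m][k] = h_est[m][k].conjugate() / norm
    return w


def downlink_gains(h, w):
    """Gain from UE j's stream to UE k, reusing the uplink channel (reciprocity)."""
    num_ant, num_ues = len(h), len(h[0])
    return [[sum(h[m][k] * w[m][j] for m in range(num_ant)) for j in range(num_ues)]
            for k in range(num_ues)]


def simulate(num_ant=128, num_ues=12, seed=0):
    """Run one uplink pilot phase and return (true channel, estimate, gains)."""
    rng = random.Random(seed)
    # h[m][k]: random complex gain between BTS antenna m and UE k.
    h = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(num_ues)]
         for _ in range(num_ant)]
    pilots = make_pilots(num_ues)
    # Noise-free uplink: each antenna sees every UE's pilot through its channel.
    received = [[sum(h[m][k] * pilots[k][t] for k in range(num_ues))
                 for t in range(num_ues)]
                for m in range(num_ant)]
    h_est = estimate_channel(received, pilots)
    return h, h_est, downlink_gains(h, mrt_precoder(h_est))
```

In the returned gain matrix, the diagonal entries are the wanted signal at each UE and the off-diagonal entries are inter-user interference. MRT does not null that interference; it relies on the large antenna count to make it small relative to the wanted signal, which is one reason other precoders remain an open design choice.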

Although many configurations are possible with this architecture, the flexible prototyping system supports up to 20 MHz of instantaneous real-time bandwidth, scales from 2 to 128 antennas, and can be used with multiple independent UEs. The LTE-like protocol uses a 2,048-point fast Fourier transform (FFT) and a 0.5 ms slot time, as shown in Table 1.
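
To make the FFT size and slot time concrete, the short sketch below derives the OFDM timing they imply. The FFT size and slot time come from the text; the 30.72 MS/s sample rate and the cyclic prefix lengths are assumptions borrowed from standard 20 MHz LTE, since Table 1 is not reproduced here.

```python
from fractions import Fraction

FFT_SIZE = 2048                     # from the text
SLOT_TIME_S = Fraction(1, 2000)     # 0.5 ms slot, from the text
SAMPLE_RATE_HZ = 30_720_000         # assumption: LTE 20 MHz sampling rate
CP_LENGTHS = [160] + [144] * 6      # assumption: LTE normal cyclic prefix, in samples


def numerology(fft_size=FFT_SIZE, slot_time=SLOT_TIME_S,
               sample_rate=SAMPLE_RATE_HZ, cp_lengths=CP_LENGTHS):
    """Derive basic OFDM timing figures using exact rational arithmetic."""
    samples_per_slot = sample_rate * slot_time
    symbol_samples = [cp + fft_size for cp in cp_lengths]
    return {
        "subcarrier_spacing_hz": Fraction(sample_rate, fft_size),
        "useful_symbol_time_us": Fraction(fft_size, sample_rate) * 1_000_000,
        "samples_per_slot": samples_per_slot,
        "symbols_per_slot": len(symbol_samples),
        # True when the symbols plus their prefixes fill the slot with no gap.
        "slot_filled_exactly": sum(symbol_samples) == samples_per_slot,
    }
```

Under these assumptions, the subcarrier spacing is 15 kHz and seven OFDM symbols with their cyclic prefixes fill one 0.5 ms slot exactly.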

Massive MIMO Hardware and Software Elements

Designing a Massive MIMO system requires four key attributes:

  1. Flexible SDRs that can acquire and transmit RF signals
  2. Accurate time and frequency synchronization among the radio heads
  3. A high-throughput and lossless bus for moving and aggregating large amounts of data
  4. High-performance processing for PHY and media access control (MAC) execution to meet the real-time performance requirements

The NI-based flexible prototyping system combines SDRs, clock distribution modules, high-throughput PXI systems, and LabVIEW to provide a robust, deterministic prototyping platform for research. This section details the various hardware and software elements used in both the NI-based Massive MIMO base station and UE terminals.

USRP Software Defined Radio

The USRP RIO software defined radio provides an integrated 2x2 MIMO transceiver and a high-performance Xilinx Kintex-7 FPGA for accelerating baseband processing, all within a half-width 1U rack-mountable enclosure. It connects to the system controller through cabled PCI Express x4, allowing up to 800 MB/s of streaming data transfer to a desktop or PXI Express host computer (or to a laptop at 200 MB/s over ExpressCard). Figure 2 provides a block diagram overview of the USRP RIO hardware.

Figure 2. USRP RIO Hardware (a) and System Block Diagram (b)

Cabled PCI Express Switch Box 

The CPS-8910 cabled PCIe switch box (CPS) combines multiple USRP RIO PCIe connections into a single stream, simplifying the system while allowing efficient aggregation of high numbers of channels. The switch box can support up to eight separate downstream devices through cabled Generation 1 PCI Express x4 connections. In the MIMO configuration, these streams are combined into a single Generation 2 PCI Express x8 connection, allowing an aggregate data rate of up to 3.2 GB/s. In other configurations, a Generation 1 PCI Express x4 upstream connection can be used instead. Additionally, the CPS-8910 enables peer-to-peer streaming between USRP devices. Both copper and fiber optic PCIe cables are supported. Figure 3 shows a block diagram overview of the CPS-8910 hardware.

Figure 3. Switch Box (a) and System Diagram (b)

PXI Express Chassis Backplane

The flexible prototyping system uses the PXIe-1085, an advanced 18-slot PXI chassis that features PCI Express Generation 3 technology in every slot for high-throughput, low-latency applications. The chassis is capable of 4 GB/s of per-slot bandwidth and 12 GB/s of system bandwidth. Figure 4 shows the dual-switch backplane architecture.

Figure 4. 18-Slot PXIe-1085 Chassis (a) and System Diagram (b)

High-Performance Reconfigurable FPGA Processing Module

The flexible prototyping system uses FlexRIO FPGA modules to add flexible, high-performance processing, programmable with the LabVIEW FPGA Module, within the PXI form factor. The PXIe-7976R FlexRIO FPGA module can be used standalone, providing a large and customizable Xilinx Kintex-7 410T with PCI Express Generation 2 x8 connectivity to the PXI Express backplane.

Figure 5. PXIe-7976R FlexRIO Module (a) and System Diagram (b)

8-Channel Clock Synchronization

The Ettus Research OctoClock 8-channel clock distribution module provides both frequency and time synchronization for up to eight USRP devices by amplifying and splitting an external 10 MHz reference and pulse per second (PPS) signal eight ways through matched-length traces. The OctoClock-G adds an internal time and frequency reference using an integrated GPS-disciplined oscillator (GPSDO). Figure 6 shows a system overview of the OctoClock-G. 

Figure 6. OctoClock-G Module (a) and System Diagram (b)


3. LabVIEW System Design Environment

The flexible prototyping system uses LabVIEW as an integrated tool for managing system-level hardware and software details; visualizing system information in a GUI; developing general-purpose processor (GPP), real-time, and FPGA code; and deploying code to the research testbed. With LabVIEW, users can integrate additional programming approaches such as ANSI C/C++ through call library nodes, VHDL through the IP integration node, and even .m file scripts through the LabVIEW MathScript RT Module. This makes it possible to develop high-performance implementations that are also highly readable and customizable. All hardware and software are managed in a single LabVIEW project, which gives the researcher the ability to deploy code to all processing elements and run testbed scenarios from a single environment.

Figure 7. LabVIEW Project and LabVIEW FPGA Application

Massive MIMO BTS Reference Design Architecture

The hardware and software platform elements above combine to form a testbed that scales from a few antennas to more than 128 synchronized antennas. For simplicity, this white paper outlines 32-, 64-, and 128-antenna configurations. The 128-antenna system includes 64 dual-channel USRP RIO devices that route data through PCIe switch boxes to a single PXIe chassis. The master chassis aggregates data for centralized processing with both FPGA co-processors and a PXI controller based on a quad-core Intel i7.

In Figure 8, the master uses the PXIe-1085 chassis as the main data aggregation node and real-time signal processing engine. The PXI chassis provides 17 open slots for input/output devices, timing and synchronization, FlexRIO FPGA boards for real-time signal processing, and extension modules to connect PCIe switch boxes. A 128-antenna Massive MIMO BTS requires very high data throughput to aggregate and process I and Q samples for both transmit and receive on 128 channels in real time. The PXIe-1085 is well suited to this task, supporting PCI Express Generation 2 x8 data paths capable of up to 3.2 GB/s throughput.

Figure 8. Scalable Massive MIMO System Diagram Combining PXI and USRP RIO

In slot 1 of the master chassis, the PXIe-8135 RT controller or embedded computer acts as a central system controller. The PXIe-8135 RT features a 2.3 GHz quad-core Intel Core i7-3610QE processor (3.3 GHz maximum in single-core Turbo Boost mode). The master chassis houses eight PXIe-8384 (S1 to S8) interface modules to connect the PCIe switch boxes to the master system. Each switch box in turn aggregates eight USRP RIO connections. The connection between the PXI chassis and the switch boxes uses PCI Express Generation 2 x8, providing up to 3.2 GB/s between the chassis and each switch box. 
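
A rough throughput budget shows how these link rates relate to the I/Q data they carry. The device counts and link capacities below come from the text. The sample rate (30.72 MS/s, as in 20 MHz LTE) and the sample width (16-bit I plus 16-bit Q) are assumptions, so the results are an estimate rather than a specification.

```python
SAMPLE_RATE_HZ = 30_720_000               # assumption: LTE-like 20 MHz sampling rate
BYTES_PER_SAMPLE = 4                      # assumption: 16-bit I plus 16-bit Q
CHANNELS_PER_USRP = 2                     # from the text: integrated 2x2 transceiver
USRPS_PER_SWITCH = 8                      # from the text
SWITCHES = 8                              # from the text
USRP_LINK_BYTES_PER_S = 800_000_000       # from the text: cabled PCI Express x4
SWITCH_LINK_BYTES_PER_S = 3_200_000_000   # from the text: PCI Express Gen 2 x8


def throughput_budget():
    """Return per-direction I/Q rates in bytes per second at each aggregation level."""
    per_channel = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE
    per_usrp = per_channel * CHANNELS_PER_USRP
    per_switch = per_usrp * USRPS_PER_SWITCH
    total = per_switch * SWITCHES
    return {
        "per_channel": per_channel,
        "per_usrp": per_usrp,
        "per_switch": per_switch,
        "total": total,
        # Fraction of each link's quoted capacity used by one direction of traffic.
        "usrp_link_utilization": per_usrp / USRP_LINK_BYTES_PER_S,
        "switch_link_utilization": per_switch / SWITCH_LINK_BYTES_PER_S,
    }
```

Under these assumptions, each switch box carries just under 2 GB/s per direction, which fits within its 3.2 GB/s upstream link, and the 128-channel total comes to about 15.7 GB/s, in line with the aggregate figure quoted in the conclusion.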

The system also features four PXIe-7976R FlexRIO FPGA co-processor modules to address the real-time signal-processing requirements for the Massive MIMO system. The slot locations provide an example configuration where the FPGAs can be cascaded to support data processing from each of the subnodes. Each FlexRIO module can receive or transmit data across the backplane to the other FlexRIO modules and to all the USRP RIOs with less than 5 microseconds of latency and up to 3.2 GB/s throughput per FPGA co-processor.

Timing and Synchronization

Timing and synchronization are important aspects of any system that deploys large numbers of radios; thus, they are critical in a Massive MIMO system. The BTS system shares a common 10 MHz reference clock and a digital trigger to start acquisition or generation on each radio, ensuring synchronization across the entire system (see Figure 9). The PXIe-6674T timing and synchronization module with OCXO, located in slot 10 of the master chassis, produces a very stable and accurate 10 MHz reference clock (80 ppb accuracy) and supplies a digital trigger for device synchronization to the master OctoClock-G clock distribution module. The OctoClock-G then buffers and supplies the 10 MHz reference (MCLK) and trigger (MTrig) to OctoClock modules one through eight, which feed the USRP RIO devices, thereby ensuring that each antenna shares the 10 MHz reference clock and master trigger. The proposed timing and synchronization architecture offers very precise control of each radio/antenna element.
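
The two-level clock distribution tree described above can be sized with simple fan-out arithmetic. The sketch below is illustrative only: the function and key names are hypothetical, and the counts are derived from the eight-way fan-out and dual-channel radios mentioned in the text, not taken from the Table 2 parts list.

```python
import math


def clock_tree(num_antennas, channels_per_radio=2, fanout=8):
    """Count the devices in a master-plus-leaf 10 MHz and trigger distribution tree."""
    radios = math.ceil(num_antennas / channels_per_radio)
    leaf_octoclocks = math.ceil(radios / fanout)
    if leaf_octoclocks > fanout:
        # One master module has only `fanout` outputs to drive leaf modules.
        raise ValueError("a single master OctoClock-G cannot feed this many modules")
    return {
        "radios": radios,
        "leaf_octoclocks": leaf_octoclocks,
        "master_octoclocks": 1,
        "spare_leaf_outputs": leaf_octoclocks * fanout - radios,
    }
```

For example, `clock_tree(128)` accounts for 64 radios fed by eight leaf modules with no spare outputs, which is the largest configuration one master module can drive in this two-level arrangement.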

Figure 9. Massive MIMO Clock Distribution Diagram

Table 2 provides a quick reference of the base station parts list for the 32-, 64-, and 128-antenna systems. It includes hardware devices and cables used to connect the devices as shown in Figure 1.

Table 2. Massive MIMO Base Station Parts List


4. BTS Software Architecture

The base station software is designed to meet the system objectives outlined in Table 1 with OFDM PHY processing distributed among the FPGAs in the USRP RIO devices and MIMO PHY processing elements distributed among the FPGAs in the PXI master chassis. Higher-level MAC functions run on the Intel-based general-purpose processor (GPP) in the PXI controller. The system architecture allows for large amounts of data processing with the low latency needed to maintain channel reciprocity. Precoding parameters are transferred directly from the receiver to the transmitter to maximize system performance.

Figure 10. Massive MIMO Data and Processing Diagram

Starting at the antenna, the OFDM PHY processing is performed in the FPGA, which allows the most computationally intensive processing to happen near the antenna. The resulting computations are then combined at the MIMO receiver IP, where channel information is resolved for each user and each subcarrier. The calculated channel parameters are transferred to the MIMO TX block, where precoding is applied, focusing energy on the return path at a single user. Although some aspects of the MAC are implemented in the FPGA, the majority of it and other upper-layer processing are implemented on the GPP. The specific algorithms used for each stage of the system are an active area of research. The entire system is reconfigurable and implemented in LabVIEW and LabVIEW FPGA, optimized for speed without sacrificing readability.
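
As one example of what the per-subcarrier MIMO receive step might compute, the sketch below applies maximum ratio combining (MRC) to a single subcarrier. It is not the reference design's algorithm: it assumes a perfectly known channel, no noise, QPSK symbols, and hypothetical function names.

```python
import random

# QPSK constellation with unit average power (assumed modulation).
QPSK = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]


def mrc_detect(h, y):
    """Combine antenna samples per UE: s_hat_k = h_k^H y / ||h_k||^2."""
    num_ant, num_ues = len(h), len(h[0])
    estimates = []
    for k in range(num_ues):
        energy = sum(abs(h[m][k]) ** 2 for m in range(num_ant))
        combined = sum(h[m][k].conjugate() * y[m] for m in range(num_ant))
        estimates.append(combined / energy)
    return estimates


def hard_decision(symbol):
    """Map a soft estimate to the nearest QPSK constellation point."""
    return min(QPSK, key=lambda point: abs(symbol - point))


def uplink_subcarrier(num_ant=128, num_ues=4, seed=0):
    """Simulate one noise-free subcarrier; return (sent, detected) symbols."""
    rng = random.Random(seed)
    # h[m][k]: random complex gain between BTS antenna m and UE k.
    h = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(num_ues)]
         for _ in range(num_ant)]
    sent = [rng.choice(QPSK) for _ in range(num_ues)]
    # Each antenna receives the superposition of all UE symbols.
    y = [sum(h[m][k] * sent[k] for k in range(num_ues)) for m in range(num_ant)]
    detected = [hard_decision(s) for s in mrc_detect(h, y)]
    return sent, detected
```

In the reference design this kind of step would run for every subcarrier on the FPGA co-processors; the Python loop here only illustrates the arithmetic for one subcarrier.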


5. User Equipment

Each User Equipment (UE) represents a handset or other wireless device with single input, single output (SISO) wireless capabilities. The UE prototype uses USRP RIO, with an integrated GPSDO, connected to a laptop using cabled PCI Express to an ExpressCard. The GPSDO is important because it provides improved frequency accuracy and enables synchronization and geo-location capability if needed in future system expansion. A typical testbed implementation includes multiple UEs where each USRP RIO might represent one or two UE devices. Software on the UE is implemented much like the BTS; however, it is implemented as a single antenna system, placing the PHY in the FPGA of the USRP RIO and the MAC layer on the host PC.

Figure 11. Typical UE Setup With Laptop and USRP RIO

Table 3 provides a quick reference of parts used in a single UE system. It includes hardware devices and cables used to connect the devices as shown in Figure 11. Alternatively, a PCI Express connection can be used if a desktop is chosen for the UE controller.

Table 3. UE Equipment List


6. Example Massive MIMO Deployments

a. Lund University

Figure 12. Massive MIMO testbed at Lund University: 100 antennas, 10 UEs, 3.7 GHz Carrier Frequency


b. Southeast University (SEU)



Figure 13. Massive MIMO testbed at SEU: 128 antennas, 12 UEs, 5.8 GHz Carrier Frequency


c. Bristol University

Figure 14. Massive MIMO testbed at Bristol University: 128 antennas, 12 UEs, 3.51 GHz Carrier Frequency


7. Conclusion

NI technology is revolutionizing the prototyping of high-end research systems with LabVIEW system design software coupled with the USRP RIO and PXI platforms. This white paper details one of the first working real-time Massive MIMO solutions. The flexible NI prototyping system uses a unique combination of technologies that enables time and frequency synchronization for a large number of radios and provides the throughput required to transfer and aggregate I and Q samples at a rate over 15.7 GB/s on the uplink and downlink. The reference design also uses design flows for the FPGA that simplify high-performance processing on the PHY and MAC layers to meet real-time timing requirements.

To ensure that these products meet the specific needs of wireless researchers, NI is actively collaborating with leading researchers and thought leaders. These collaborations advance exciting fields of study and facilitate the sharing of approaches, IP, and best practices among those needing and using tools like the flexible prototyping system.

Read more solutions at ni.com/5g.


8. References

E. Larsson, O. Edfors, F. Tufvesson, and T. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Communications Magazine, vol. 52, no. 2, pp. 186-195, Feb. 2014.

H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, “Energy and spectral efficiency of very large multiuser MIMO systems,” CoRR, vol. abs/1112.3810, 2011.

F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, “Scaling up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 40-60, Jan. 2013.

National Instruments, “National Instruments and Lund University Announce Massive MIMO Collaboration,” ni.com/newsroom/release/national-instruments-and-lund-university-announce-massive-mimo-collaboration/en/, Feb. 2014.