The RX baseband operates in the 250 MHz baseband clock domain. The RX baseband block diagram is shown in Figure 12. Blue arrows indicate the data path, and yellow arrows indicate the control path. Details about each block are available in the following sections.
The data source block selects the source for the receiver. Data can be taken from RF, from the TX baseband using internal loopback, or from the host using a host-to-target FIFO. The stream always has a sample rate of 80 MS/s for all sources. The synchronization detects the packet start and compensates for the estimated carrier frequency offset. In parallel, the power measurement block calculates the received signal power. The stream is given to the RX I/Q Processing block, where the samples are transformed to the frequency domain. Then channel estimation, equalization, and phase tracking are performed. The constellation with field assignment information is provided to the RX Bit Processing block. Inside this block, the modulation is reversed, and the bits are deinterleaved, decoded using a Viterbi core, and descrambled. This bit stream is given to the RX PHY state machine. This state machine interprets the signal fields in the PPDU, such as L-SIG and VHT-SIG-A, and generates control information for I/Q processing, bit processing, and the MAC. The PSDU, which can be an MPDU or A-MPDU, is extracted from the bit stream and delivered to the MAC as unsigned bytes. The MAC interprets the header information of the PSDU and transfers data to the host application using the LabVIEW target-to-host interface.
Every module is designed to keep up with the data rate from the upstream module, so there is no need for throttle control inside the modules. The timing of the transfers is described in the following sections.
Figure 12 RX Baseband Block Diagram
Power Measurement
The power measurement module calculates the baseband signal power and the RF input power. The block diagram is shown in Figure 13.
Figure 13 Power Measurement Block Diagram
Based on the incoming samples, x, the signal power, s, is calculated over a window of 64 samples as described in Equation 1. The output of this calculation is updated each time 64 new samples have arrived. The next step is the iterative calculation of the base-10 logarithm. The value of s is shifted left n times until the MSB contains a one. The number of shifts, n, and an LUT indexed by the six MSBs of the shifted value s’ are used to calculate the signal power p in logarithmic scale. This value represents the baseband signal power in dBFS.
Equation 1 Signal Power Calculation
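The shift-and-LUT logarithm can be illustrated with a short Python sketch. It is not the FPGA implementation: the 32-bit word length, the LUT contents (bin midpoints), and the names `WIDTH`, `LUT`, and `power_db` are assumptions chosen for illustration only.

```python
import math

WIDTH = 32      # assumed word length of the power value s
LUT_BITS = 6    # the six MSBs of the shifted value index the LUT

# LUT entry i approximates 10*log10 of a mantissa in [i/64, (i+1)/64).
# After normalization the top bit is always set, so only 32..63 are used.
LUT = [10 * math.log10((i + 0.5) / 2**LUT_BITS) if i >= 2**(LUT_BITS - 1) else 0.0
       for i in range(2**LUT_BITS)]

def power_db(s: int) -> float:
    """Approximate 10*log10(s / 2**WIDTH) for 0 < s < 2**WIDTH."""
    if s <= 0:
        return float("-inf")
    n = 0
    while not (s >> (WIDTH - 1)) & 1:   # shift left until the MSB is one
        s <<= 1
        n += 1
    idx = s >> (WIDTH - LUT_BITS)       # six MSBs of the shifted value s'
    # Each left shift doubled the value, so subtract n * 10*log10(2).
    return LUT[idx] - n * 10 * math.log10(2)
```

With a 64-entry LUT of bin midpoints, the approximation error in this sketch stays below 0.1 dB.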
Based on p, the RF input power r is calculated using the power calibration offset and the RF gain (see Equation 2). Both values are provided by the host. The analog gain value is subtracted from p because applying gain before the ADC means that the RF input power is lower than the measured signal power. The power calibration offset is based on the calibration data of the device. It maps the baseband signal power at minimum gain to the corresponding reference power level at the RF input port. This mapping is assumed to be linear at all gain levels.
Equation 2 RF Input Power Calculation
The value of r is compared against the given CCA energy threshold. If this threshold is exceeded, the CCA energy detect signal is asserted. Together with the baseband and the RF power, this value is available at the output of the Power Measurement VI.
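The RF input power calculation and the CCA energy comparison can be sketched as follows. The source states only that the gain is subtracted from p. That the calibration offset is added, and that the gain is expressed relative to the calibration point, are assumptions of this sketch, as are the function names.

```python
def rf_input_power_dbm(p_dbfs: float, rf_gain_db: float, cal_offset_db: float) -> float:
    """RF input power r from baseband power p (assumed form of Equation 2)."""
    # Gain applied before the ADC raises p, so it is subtracted again.
    # Assumption: the calibration offset is additive.
    return p_dbfs - rf_gain_db + cal_offset_db

def cca_energy_detect(r_dbm: float, threshold_dbm: float) -> bool:
    """Assert CCA energy detect when r exceeds the configured threshold."""
    return r_dbm > threshold_dbm
```

For example, `rf_input_power_dbm(-20.0, 30.0, 10.0)` yields -40.0 under these assumptions.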
The purpose of the synchronization module is to find the packet start in the continuous sample stream. For the implemented algorithm, the ideal position of the packet start is the center of the L-LTF field, also referred to as the first sample of L-LTF-2 (the second OFDM symbol of L-LTF).
The block diagram of the synchronization unit is shown in Figure 14 and details of data types, control information, and identifiers used in equations can be found in Table 5.
The synchronization is fed from the data source at a sample rate of 80 MS/s. In the baseband clock domain of 250 MHz, a valid sample arrives approximately every third clock cycle. Each VI must use the enable chain to update only on valid samples. No VI changes the data rate.
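The ratio of 80 MS/s to 250 MHz means that valid samples are spaced 3 or 4 clock cycles apart. The following Python sketch models one possible valid-strobe pattern with a simple accumulator. The actual pattern produced by the data source is not specified here, so the model and the name `valid_strobes` are assumptions.

```python
def valid_strobes(num_clocks: int, sample_rate: int = 80, clock_rate: int = 250):
    """Return one Boolean per clock cycle; True marks a valid sample."""
    acc = 0
    strobes = []
    for _ in range(num_clocks):
        acc += sample_rate
        if acc >= clock_rate:      # one sample period has elapsed
            acc -= clock_rate
            strobes.append(True)
        else:
            strobes.append(False)
    return strobes
```

Over 250 clock cycles this model produces exactly 80 valid samples, and downstream logic updates only on the cycles marked True.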
Figure 14 Synchronization Block Diagram
|Module||Identifier||Output Data Type||Output Control Information|
|Timing Metric Calculation||tm||FXP 2.6||-|
|Timing Metric Peak Search||-||U16|
|Timing Metric Valley Search||-||U16||Valley index|
|Timing Metric Evaluation||-||Boolean||CCA signal detect, Packet start sample index|
|CFO Removal||-||CFX 3.13||-|
|Frame Alignment||-||CFX 3.13||Packet start|
Table 5 Synchronization Data Types and Control Information
The synchronization block is implemented in two parallel paths (refer to Figure 14) to minimize the latency on the data path. The upper path finds the packet start sample index and estimates the carrier frequency offset (CFO) based on the Schmidl and Cox algorithm . These estimates are used by the lower path, which is the main data path, to compensate CFO and generate the packet start pulse for downstream modules.
For testing purposes, there is a bypass for the synchronization block where the packet start index can be given from the host. This path is not included in Figure 14. Use this bypass in combination with RX samples from the host or internal loopback to characterize the RX baseband without the impact of synchronization algorithms.
As shown in Figure 14, the upper path of the synchronization block starts by calculating the autocorrelation of the received signal x (see Equation 3). As the length of one period of the non-HT short training field is 64 samples at an 80 MS/s sample rate (refer to Section 18.3.3 in ), the length of the autocorrelation window CP (see Equation 3) is set to 64. The normalized magnitude and the phase of the autocorrelation window, s, are given at the output of the autocorrelation module for each sample. Under ideal conditions, this autocorrelation scheme results in a normalized magnitude equal to 1, as shown in Figure 15.
Equation 3 Synchronization Autocorrelation
Figure 15 Simplified Signal Charts of Synchronization
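The autocorrelation over a window of CP samples can be sketched in Python as follows. The normalization by the energy of the delayed window is a common Schmidl and Cox style choice and is an assumption here. The exact form of Equation 3 may differ, and the name `autocorr` is hypothetical.

```python
import cmath

CP = 64  # autocorrelation window length at 80 MS/s

def autocorr(x, d, cp=CP):
    """Return (normalized magnitude, phase) of the autocorrelation at index d."""
    # Correlate the window starting at d with the window cp samples later.
    p = sum(x[d + m].conjugate() * x[d + m + cp] for m in range(cp))
    energy = sum(abs(x[d + m + cp]) ** 2 for m in range(cp))
    mag = abs(p) / energy if energy > 0 else 0.0
    return mag, cmath.phase(p)
```

For a signal that repeats with a period of CP samples, such as the short training field, the magnitude is 1 and the phase equals the rotation that a carrier frequency offset causes over CP samples.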
To find the transition from L-STF to L-LTF, a so-called synchronization timing metric is calculated based on the magnitude of the normalized autocorrelation, as shown in Equation 4. The ideal behavior of this timing metric, tm, is also shown in Figure 15.
Equation 4 Synchronization Timing Metric
Based on the indices of the minimum and maximum value of this metric and the distance between minimum and maximum, the sample index of the packet start is calculated using the following steps:
- The timing metric valley search looks for the valley of the timing metric. This valley is 2 * CP samples after the start of L-STF. A valley is found if a given number of samples is under a defined threshold. The sample index of the center of all samples under the threshold is given by the valley search as the valley index. A 16-bit wide wrapping counter is used to give that index. The phase of the timing metric is captured on the first value that exceeds the threshold.
- A peak search is also used for the timing metric. Under ideal conditions, this index is CP samples after the start of L-LTF. It uses the same principle as the valley search.
- Those two search results are combined in the Timing Metric Evaluation. This module checks the distance between valley and peak, which has to be within a given range of 9 * CP ± CP. As an additional check, the autocorrelation value must be under a given threshold. This check is valid for a packet start because during L-LTF the autocorrelation reports a low value.
Based on this algorithm and corresponding processing delays, the packet start sample index is calculated and given to the Frame Alignment module. During the calculation, the number of samples to cut into the OFDM guard interval is taken into account.
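The distance check between valley and peak can be illustrated with the following Python sketch. It assumes that the peak follows the valley and that both indices come from the same 16-bit wrapping counter. The function names are hypothetical.

```python
CP = 64

def wrapped_distance(peak_index: int, valley_index: int, bits: int = 16) -> int:
    """Distance from valley to peak on a wrapping counter."""
    return (peak_index - valley_index) % (1 << bits)

def distance_ok(peak_index: int, valley_index: int, cp: int = CP) -> bool:
    """True if the distance lies within 9 * CP +/- CP samples."""
    d = wrapped_distance(peak_index, valley_index)
    return 8 * cp <= d <= 10 * cp
```

The modulo operation keeps the check valid when the counter wraps between the valley and the peak, for example for a valley at index 65500 and a peak at index 540.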
In addition to timing estimation, the phase of the autocorrelation is averaged over CP values and used for CFO estimation. This CFO estimation is based on the phase output of the peak search.
In the lower path of the synchronization block, the estimated CFO is compensated by applying a digital frequency shift. The CFO estimate is used for all OFDM symbols of the entire packet. The Frame Alignment module generates the packet start trigger pulse at the sample index given by the timing metric evaluation.
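The relation between the averaged autocorrelation phase and the frequency shift can be sketched in Python. The sketch assumes that the phase was measured over a lag of CP = 64 samples at 80 MS/s. The names `cfo_hz` and `remove_cfo` are hypothetical.

```python
import cmath
import math

def cfo_hz(phase: float, cp: int = 64, fs: float = 80e6) -> float:
    """Carrier frequency offset implied by a phase rotation over cp samples."""
    return phase * fs / (2 * math.pi * cp)

def remove_cfo(x, phase: float, cp: int = 64):
    """Apply a digital frequency shift that cancels the estimated offset."""
    # The per-sample rotation is phase / cp, so sample n is rotated back
    # by n * phase / cp.
    return [v * cmath.exp(-1j * phase * n / cp) for n, v in enumerate(x)]
```

For example, a phase of about 0.402 rad over 64 samples corresponds to an offset of 80 kHz in this model.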
After the synchronization has indicated a packet start signal, further triggering of packet start signals is blocked until the synchronization is rearmed by the PHY RX end indication, generated at the end of the packet. This blocked status is indicated by the asserted CCA signal detect signal.
Figure 16 Synchronization Latency
The latencies for the different modules in the Synchronization block are illustrated in Figure 16. The left part of the figure contains the modules of the upper path. The latencies of those modules sum to 68 clock cycles. Given the sample rate of 80 MS/s at a 250 MHz clock rate, this time is equivalent to about 22 samples (68 × 80 / 250 ≈ 21.8). Since the peak of the timing metric is located 256 samples before L-LTF-2, the packet start index is calculated before the packet start signal is asserted, and there is no effective delay.
The latency of the lower data path is shown in the right part of Figure 16. This latency increases the length of the RX processing path by 15 clock cycles.
RX IQ Processing
The purpose of the RX IQ Processing block is to restore the transmitted I/Q constellation. The block diagram is shown in Figure 17. Details of data types, control information, and identifiers used in equations are presented in Table 6.
Figure 17 RX IQ Processing Block Diagram
|Module||Identifier||Output Data Type||Output Control Information|
|Synchronization||-||CFX 3.13||Packet Start|
|Sample Timing Generation||-||Sample Timing|
|Cyclic Prefix Removal||-|
|FFT||R||CFX 4.21||OFDM symbol index|
|Channel Estimation||Hest||Subcarrier index|
|Channel Equalization||Yest||CFX 2.14||Field Map|
|Pilot Phase Estimation||ß||FXP 1.14||-|
|Phase Correction||Xest||CFX 2.14||Field Map|
Table 6 RX IQ Processing Data Types and Control Information
The Sample Timing Generation module gets samples from the Synchronization module along with the packet start index. It starts passing samples to downstream modules as soon as the packet start signal is asserted. It stops passing samples once the last OFDM symbol, whose index is given by the RX PHY state machine, is finished. The control information is carried by the sample timing cluster, which contains the following elements:
- OFDM symbol index
- Sample index (within the OFDM symbol; 0 to 319)
- Packet start flag
- OFDM symbol start flag
- Valid flag
The sample index is used by the Cyclic Prefix Removal module to invalidate the first 64 samples of each OFDM symbol.
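The sample timing cluster and the cyclic prefix removal can be modeled in Python as follows. The field names mirror the listed cluster elements. The class and function names are hypothetical, and the model ignores the clock-level enable chain.

```python
from dataclasses import dataclass, replace

SYMBOL_LEN = 320   # samples per OFDM symbol at 80 MS/s
CP_LEN = 64        # cyclic prefix samples to invalidate

@dataclass(frozen=True)
class SampleTiming:
    ofdm_symbol_index: int
    sample_index: int          # 0 to 319 within the OFDM symbol
    packet_start: bool
    ofdm_symbol_start: bool
    valid: bool

def sample_timing(num_symbols: int):
    """Generate the timing cluster for every sample of a packet."""
    for sym in range(num_symbols):
        for idx in range(SYMBOL_LEN):
            yield SampleTiming(sym, idx, sym == 0 and idx == 0, idx == 0, True)

def remove_cp(timing: SampleTiming) -> SampleTiming:
    """Invalidate the first 64 samples of each OFDM symbol."""
    return replace(timing, valid=timing.valid and timing.sample_index >= CP_LEN)
```

Applying `remove_cp` to every element leaves 256 valid samples per OFDM symbol, which matches the FFT length.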
The next downstream module is the FFT, which is a wrapper for the Xilinx FFT core. It contains a 256-point FFT operation using a Radix 4, Burst I/O architecture. A toggling negation realizes the FFT shift to have the DC at the 128th output value. The FFT starts execution as soon as 256 samples are provided. During the execution, no samples are taken on the input. A FIFO is placed before the input to capture the samples that arrive in the meantime. On finishing execution, the 256 subcarriers are provided at the output consecutively. The OFDM symbol index from the incoming sample timing cluster is passed through this module, parallel to the data stream. The maximum gain of the FFT is 256 if the energy is limited to only one subcarrier. Therefore, the fixed point data type is extended by nine bits to capture this output dynamic range of the FFT module. The output of the FFT is divided by 256 to have the same scaling as on the input of the IFFT in the transmitter chain. The resulting fixed point format is <4.21>.
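The effect of the toggling negation and the division by 256 can be checked with a small Python model. The sketch assumes that the negation is applied to every second input sample and that output indices are counted from zero. It uses a direct DFT instead of the Xilinx core, and the function names are hypothetical.

```python
import cmath

N = 256

def dft(x):
    """Direct O(N^2) DFT, sufficient for a small illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def shifted_fft(x):
    """Negate every second input sample, transform, and scale by 1/256."""
    # Multiplying by (-1)^t shifts the spectrum by N/2 bins.
    toggled = [v if t % 2 == 0 else -v for t, v in enumerate(x)]
    return [v / N for v in dft(toggled)]
```

A constant input (pure DC) of amplitude 1 appears at output index 128 with amplitude 1, which shows both the shift and the scaling.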
The Demapper block aligns two control information clusters with the data stream. The first cluster is the subcarrier timing cluster, which contains the following elements:
- OFDM symbol index
- Subcarrier index (0 to 255)
- Frequency offset index (named k in  and ; -128 to 128)
- OFDM symbol start flag
- Valid flag
The frequency offset index is generated based on the control information from the RX PHY state machine (refer to section 0). The second control information cluster is the field map. This cluster is made up of Booleans, and each Boolean represents one field of the 802.11 packet structure, such as L-SIG, L-LTF, VHT-SIG-A, pilot subcarrier, or data subcarrier. Similar to a one-hot code, only one of these Booleans is asserted for each sample. The packet structure is known to the Demapper module. Downstream modules can take this field map to filter for specific fields, such as the pilot subcarriers.
The channel estimation is computed using the second L-LTF OFDM symbol for 802.11a and VHT-LTF for 802.11ac. The inverse channel transfer function is calculated for each subcarrier R individually using the L-LTF definitions L from section 220.127.116.11.3 of  as shown in Equation 5. The signal names are included in Figure 17. The frequency offset index k from the subcarrier timing is used. The channel estimation block is implemented in a parallel path to minimize latency on the data path. The values of Hest are given to the channel equalization module, where they are stored in memory. They have the same data type as the incoming subcarriers. Beginning with the L-SIG, the channel equalization uses those values to apply zero forcing and obtain the signal Yest. The fixed point format of <2.14> is sufficient to represent the values of Yest. Larger values are saturated.
Equation 5 Channel Estimation and Compensation
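The inverse channel estimate and the zero-forcing equalization can be sketched in Python. The sketch assumes Hest = L / R for the training symbol and Yest = R · Hest for later symbols, with saturation to the range of a signed <2.14> value. Unused subcarriers (L = 0) are set to zero. All function names are hypothetical.

```python
def estimate_inverse_channel(r_ltf, l_ref):
    """Per-subcarrier inverse channel: known training value / received value."""
    return [l / r if l != 0 and r != 0 else 0j for r, l in zip(r_ltf, l_ref)]

def saturate(v: complex, limit: float = 2.0, lsb: float = 2**-14) -> complex:
    """Clamp real and imaginary parts to the assumed <2.14> range."""
    def clamp(a: float) -> float:
        return max(-limit, min(limit - lsb, a))
    return complex(clamp(v.real), clamp(v.imag))

def equalize(r, h_est):
    """Zero forcing: multiply each subcarrier by the stored inverse channel."""
    return [saturate(x * h) for x, h in zip(r, h_est)]
```

In a noise-free case, equalizing a data symbol with the estimate from the training symbol returns the transmitted constellation points.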
The signal Yest is passed to the pilot phase modules, which follow the same structure as the channel estimation and equalization. Because the cyclic prefix is removed, any residual carrier frequency offset that remains after the synchronization leads to a phase jump between consecutive OFDM symbols. The phase for the current OFDM symbol αn is calculated based on the pilot sequences P. These sequences are taken from Section 18.104.22.168 of  and Section 22.214.171.124 of  at the frequency offset index k. The phase offset between OFDM symbols is compensated by adding the difference to the last phase estimation from OFDM symbol n - 1. The estimated phase ß of OFDM symbol n is applied to the OFDM symbol n + 1 by the Phase Correction module. This operation does not change the magnitude of the values, so the fixed point format is kept.
Equation 6 Phase Estimation and Compensation
As the last step of the RX I/Q processing, the clockwise rotation of VHT-SIG-A2 (refer to section 126.96.36.199 of ) is reversed.
|Sample Timing Generation||~ 1 sample / 3 clock cycles (320 samples per OFDM symbol)|
|Cyclic Prefix Removal||~ 1 sample / 3 clock cycles (256 samples per OFDM symbol)|
|FFT||256 subcarriers / OFDM symbol burstwise|
|Pilot Phase Estimation||1 phase estimate / OFDM symbol|
|Phase Correction||256 subcarriers / OFDM symbol burstwise|
Table 7 RX IQ Processing Transfer Timing
The timing of the data stream is changed inside the RX I/Q Processing module. A summary for all submodules is given in Table 7. The input is given by the digital downconversion at a sample rate of 80 MS/s. The Cyclic Prefix Removal module removes 64 samples per OFDM symbol from the stream. Because of the chosen FFT architecture configuration, the output of the Xilinx core is given burstwise. This transfer timing is kept for all downstream modules. The only exception is the Pilot Phase Estimation module, which computes one phase estimate per OFDM symbol.
Figure 18 RX IQ Processing Latency
The overall latency of the RX I/Q Processing module for the last sample of the OFDM symbol is 665 clock cycles (refer to Figure 18). The FFT latency is smaller than reported by the Xilinx IP Generator, and this latency includes loading of all 256 samples. During the packet, the FFT executes and unloads samples in 617 clock cycles after the last sample arrived. The remaining clock cycles per OFDM symbol are used to transfer data from the input FIFO to the FFT core. By the time the last sample is available on the input, the FIFO is empty, and it is passed to the core as fast as possible. The delay of the FIFO is unknown, which is indicated in Figure 18. All other modules have a fixed latency.
RX Bit Processing
The RX bit-processing chain deinterleaves, decodes, and descrambles the data. It provides the received bits to the RX PHY state machine and the PSDU bytes to the MAC. The block diagram is shown in Figure 19. Details about data types and control information are given in Table 8.
Figure 19 RX Bit Processing Block Diagram
|Module||Output Data Type||Output Control Information|
|RX IQ Processing||CFX 2.14||Field Map|
|Align Configuration||Bit Processing Configuration|
|LLR Demapper||FXP8.0 Array (8 elements)|
Array (2 elements)
Table 8 RX Bit Processing Data Types and Control Information
The first module of the chain is the Packet Termination module. It passes all samples that have OFDM symbol indices in the subcarrier timing cluster below the value given from the RX PHY state machine. Passing only these samples ensures that the packet end is correctly processed. If you abort the current packet reception, this module terminates all I/Q data by setting the last OFDM symbol index to 0.
The next block is the Align Configuration module, which has two functions. The first function is to align the bit-processing configuration cluster from the RX PHY state machine with the start of a new OFDM symbol. All other control information is terminated in this module. The bit-processing configuration is transferred parallel to the data stream, and it contains the following information:
- Packet format
- Coding rate
- PSDU length (in bytes)
- Valid bits in current OFDM symbol
- Descrambler enable flag
- Viterbi flush required flag
The second function is the filtering of all noncoded fields for downstream modules. It uses the field map provided by the RX I/Q processing chain.
The I/Q samples in the coded fields are processed by the LLR Demapper block. Based on the given modulation scheme, an array of up to eight softbits is given at the output. The data type of each softbit is unsigned 8-bit integer.
The Softbit Serializer module takes this array of softbits and provides the serialized stream on the output. The number of valid softbits in the array is derived from the modulation. An internal FIFO is used to buffer softbits on the input.
The Deinterleaver module reverts the BCC interleaver operations defined in section 184.108.40.206 of  and section 220.127.116.11 of . The write operation into the memory is based on equation 22-82 of , which reverses the second permutation. The read operation is based on equation 22-77 of , which reverses the first permutation. Reading is started as soon as all softbits of the current OFDM symbol are saved to memory. A double page memory is used, which enables reading and writing at the same time.
Based on Figure 18-9 and 20-11 of , the Depuncturer module converts the incoming bit stolen data sequence to the bit inserted data sequence. Each bit gets a puncturing flag attached depending on whether it was transmitted or left out. One element of A and the corresponding element of B are combined into an array of two elements, where A and B are as defined in Figure 18-9 and 20-11 of .
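The conversion from the bit-stolen to the bit-inserted sequence can be sketched in Python. The keep patterns below are the usual BCC puncturing patterns for rates 2/3, 3/4, and 5/6 and should be checked against the referenced figures. The placeholder value 0 for inserted softbits and all names are assumptions of this sketch.

```python
# 1 = transmitted, 0 = stolen; one entry per input bit of a puncturing period.
PATTERNS = {
    "1/2": ([1], [1]),
    "2/3": ([1, 1], [1, 0]),
    "3/4": ([1, 1, 0], [1, 0, 1]),
    "5/6": ([1, 1, 0, 1, 0], [1, 0, 1, 0, 1]),
}

def depuncture(softbits, rate):
    """Return a list of [A, B] pairs; each element is (softbit, punctured_flag).

    The input length must cover whole puncturing periods.
    """
    keep_a, keep_b = PATTERNS[rate]
    period = len(keep_a)
    out, pos, i = [], 0, 0
    while pos < len(softbits):
        pair = []
        for keep in (keep_a[i % period], keep_b[i % period]):
            if keep:
                pair.append((softbits[pos], False))
                pos += 1
            else:
                pair.append((0, True))   # placeholder, flagged as punctured
        out.append(pair)
        i += 1
    return out
```

For rate 3/4, four received softbits expand to three A/B pairs, two of which contain one flagged placeholder each.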
The array is given to the Viterbi decoder, which is a wrapper for the Xilinx Viterbi core. The softbits are converted to the Xilinx format before passing them on to the s_axis_data_tdata input. The s_axis_data_tuser input is used to provide the punctured flags from the Depuncturer module and the block valid flag. The block valid has the same latency as the data path. After the last softbit of the current code word, the Viterbi is flushed to get the remaining bits out of the core. For flushing, strong zeros are pushed to the input with the block valid set to FALSE. On the output of the core, the block valid information can filter out the zeros from the flushing operation. The bits of the code word are provided at the output.
The Descrambler module processes the bits at the output of the decoder. If descrambling is disabled, the input bits are passed through to the output. On activation, detected by the rising edge of the enable signal, the Descrambler module assumes it is receiving a packet starting with the SERVICE field and uses the first seven bits to extract the scrambler seed. Those initial bits are overwritten by zeros. Afterward, all bits are descrambled with the recovered seed until deactivation.
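The seed recovery can be illustrated with the 802.11 scrambler polynomial x^7 + x^4 + 1. Because the first seven SERVICE bits are zero before scrambling, the first seven received bits equal the scrambling sequence and therefore define the shift register state. The following Python sketch is a behavioral model with hypothetical function names, not the FPGA implementation.

```python
def lfsr_sequence(state, count):
    """Generate scrambling bits; state holds x1..x7, most recent output first."""
    state = list(state)
    seq = []
    for _ in range(count):
        bit = state[3] ^ state[6]        # taps x^4 and x^7
        seq.append(bit)
        state = [bit] + state[:6]
    return seq

def scramble(bits, seed_state):
    """Transmitter-side scrambler, used here only to create test data."""
    return [b ^ s for b, s in zip(bits, lfsr_sequence(seed_state, len(bits)))]

def descramble(rx_bits):
    """Recover the state from the first seven bits and descramble the rest."""
    state = rx_bits[6::-1]               # bits 6..0, most recent first
    seq = lfsr_sequence(state, len(rx_bits) - 7)
    # The seven seed bits are overwritten by zeros on the output.
    return [0] * 7 + [b ^ s for b, s in zip(rx_bits[7:], seq)]
```

Scrambling a bit stream that starts with sixteen zero SERVICE bits and passing the result through `descramble` returns the original stream, independent of the seed used by the transmitter.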
The output is transmitted to the RX PHY state machine. Before sending to the MAC, the bit stream is filtered by the PSDU Masking module. The SERVICE, TAIL, and PAD fields are removed, and the bits are concatenated to bytes. The length of the PSDU is given by the configuration. Padding bits are removed. For the 802.11ac format, parts of the PAD field may be included in the PSDU data stream (refer to Section 18.104.22.168 for more information).
|Packet Termination||256 subcarriers burstwise / OFDM symbol|
|Align Configuration||NSD data subcarriers / OFDM symbol (48-108) burstwise; gaps due to pilots|
|Softbit Serializer||NCBPS coded softbits / OFDM symbol (48-864) burstwise; gaps due to pilots|
|Deinterleaver||NCBPS coded softbits / OFDM symbol (48-864) burstwise|
|Depuncturing||NDBPS encoded stream values or data bits / OFDM symbol (24-720); peak rate: 1 value / clock cycle|
|PSDU Masking||NDBPS/8 data bytes / OFDM symbol (3-90); peak rate: 1 byte / 10 clock cycles|
Table 9 RX Bit Processing Transfer Timing
The output timing of the submodules is given in Table 9. The number of values depends on the format, bandwidth, and MCS. The referred variables can be found in Table 18-4, 18-5 of  and Table 22-30, 22-38 of . In brackets, the minimum and maximum values are given indicating the valid range. The minimum value is based on L-SIG, which uses non-HT mode with MCS 0. The maximum value is based on VHT 40 MHz transmissions using MCS 9.
The RX I/Q Processing module provides 256 subcarriers in one burst. The first module that changes this pattern is the Align Configuration module. Only subcarriers belonging to coded fields remain on the output. Since there are multiple pilot tones, this stream contains gaps. The serialized stream on the output of the Softbit Serializer module can have many more valid items per OFDM symbol. Nevertheless, the pilot gaps remain if you are using BPSK modulation, where each subcarrier is translated to one softbit by the LLR Demapper. The gaps are gone after the Deinterleaver module because the softbit stream is read burstwise from the internal memory. The Depuncturer adds gaps to this data stream when there are two valid bits of stream A and B available. Adding punctured bits does not produce gaps. The Xilinx Viterbi core generates data on the output as soon as new bits are provided to the input. As a result, the output pattern is not changed. The masking of the PSDU reduces the data rate by a factor of 8. At a coding rate of 5/6, the peak rate is reached.
Figure 20 RX Bit Processing Latency
The latency of the RX Bit Processing chain depends on the format, bandwidth, and MCS. Similar to Table 9, Figure 20 refers to the two corner cases L-SIG and highest MCS at highest bandwidth. The latency is given for the last subcarrier of the packet generated by the RX I/Q Processing module. Most of the modules have a fixed latency.
The delay of the Softbit Serializer depends on the modulation. For BPSK, each subcarrier is mapped to one softbit, so the serialization does not add any delay. The internal FIFO is empty when the last value arrives. The FIFO delay is unknown. The latency is 2 clock cycles because of internal registers. For 256-QAM, each softbit array has to be split into eight softbits on the output. When the last value arrives, 108 (NSD) of 864 (NCBPS) softbits have been processed on the output. The delay for the last softbit plus the two register stages results in a latency of 758 clock cycles.
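The two corner cases follow from the same arithmetic, shown here as a Python sketch. It assumes that one softbit array arrives per clock cycle, that one softbit leaves per clock cycle, and that two register stages are added. The function name is hypothetical.

```python
def softbit_serializer_latency(n_sd: int, n_bpscs: int, register_stages: int = 2) -> int:
    """Latency in clock cycles for the last softbit of an OFDM symbol."""
    n_cbps = n_sd * n_bpscs
    # When the last array arrives, n_sd softbits have already been output.
    return (n_cbps - n_sd) + register_stages
```

For BPSK with 48 data subcarriers this gives 2 clock cycles, and for 256-QAM with 108 data subcarriers it gives 864 - 108 + 2 = 758 clock cycles.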
The Deinterleaver has to store one complete OFDM symbol of softbits. The read operation starts as soon as the last value arrives. NCBPS softbits have to be read before the last sample is available on the output of the Deinterleaver. An additional latency of 11 is incurred because of the pipeline stages.
The latency of the Viterbi decoder is determined by the Xilinx Viterbi decoder core. An additional latency of one is incurred due to one pipeline stage.
The latency for other configurations can be calculated using Equation 7 with values from Table 18-4, 18-5 of  or Table 22-30, 22-38 of .
Equation 7 RX Bit Processing Latency
RX PHY State Machine
The RX PHY state machine, which is based on Figure 22-37 of , provides the configuration for RX IQ and RX Bit Processing modules and generates indications for the MAC. Notice that the synchronization is controlled indirectly by the state machine. The RX end indication is used to rearm the synchronization. Notice also that the PHY is not capable of decoding VHT MU PPDUs, so the reception of VHT-SIG-B is skipped as described in section 22.3.21 of . The state diagram is given in Figure 21. The word timing in this diagram refers to the timestamp when the last sample of the packet was received, which is included in the PHY RX end indication.
Figure 21 RX PHY State Machine States
This is the startup state. In this state, the internal configuration is reset such that it can receive the first coded field in the packet (L-SIG in the primary subband). This setting consists of the 802.11a format, 20 MHz bandwidth, disabled scrambler, and MCS 0. The unknown length of the packet means that the last OFDM symbol index is set to the maximum unsigned 16-bit integer value of 65,535.
As soon as the synchronization detects a packet, the processing chain uses the configuration from the Initialization state to provide the 24 bits of the SIGNAL field to the RX PHY state machine. The received bits are verified to be a valid L-SIG field based on Section 18.3.4 of . The L-SIG check includes verifying the following conditions:
- R4 of RATE field is one
- Bit 4 is zero
- SIGNAL TAIL field is all zeros
- Parity bit is matching
- LENGTH > 0
The result of the check is used as the condition L-SIG valid in the state machine. As soon as this condition is evaluated, the state machine leaves this state.
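The listed conditions can be expressed as a short Python sketch. It assumes the usual SIGNAL field layout: RATE in bits 0 to 3 (R1 to R4), a reserved bit 4, LENGTH in bits 5 to 16 with the LSB first, even parity in bit 17, and six tail bits. The helper `make_l_sig` exists only to build test vectors, and both names are hypothetical.

```python
def check_l_sig(bits) -> bool:
    """Return True if the 24 decoded SIGNAL bits pass all listed checks."""
    if len(bits) != 24:
        return False
    r4_ok = bits[3] == 1                       # R4 of RATE field is one
    reserved_ok = bits[4] == 0                 # bit 4 is zero
    tail_ok = not any(bits[18:24])             # SIGNAL TAIL is all zeros
    parity_ok = sum(bits[0:18]) % 2 == 0       # even parity over bits 0..17
    length = sum(b << i for i, b in enumerate(bits[5:17]))
    return r4_ok and reserved_ok and tail_ok and parity_ok and length > 0

def make_l_sig(rate_bits, length):
    """Build a SIGNAL field for testing (hypothetical helper)."""
    bits = list(rate_bits) + [0] + [(length >> i) & 1 for i in range(12)]
    bits.append(sum(bits) % 2)                 # parity bit
    return bits + [0] * 6                      # tail bits
```

A field built with RATE bits 1101 and a LENGTH of 100 passes the check, while flipping the parity bit or setting LENGTH to 0 makes it fail.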
If L-SIG is invalid, the reception of the current packet is aborted. The last OFDM symbol index is set to 0. This forces the Sample Timing Generation module of the RX I/Q processing chain and the Packet Termination module of the RX bit processing chain to finish the current OFDM symbol and stop. Because there is no packet length information available at this point in time, the timing information is marked as invalid. Furthermore, the internal format violation flag is set.
If a valid L-SIG was received, the next state depends on the packet format selected from the host. In the 802.11a format, all necessary information is available from L-SIG interpretation to start reception of the data. The index of the last OFDM symbol is calculated based on equation 18-11 of . This index as well as MCS and PSDU length are provided to the processing chain. In addition, the PHY RX start indication is generated, and the packet frame timing is set to valid.
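The number of data OFDM symbols follows from the LENGTH field and the number of data bits per symbol. The Python sketch below assumes the standard non-HT relation with 16 SERVICE bits and 6 TAIL bits. How this count maps to the last OFDM symbol index used by the processing chain depends on the index offset of the first data symbol, which is not modeled here. The function name is hypothetical.

```python
def n_sym_non_ht(length_bytes: int, n_dbps: int) -> int:
    """Number of data OFDM symbols for a non-HT packet."""
    total_bits = 16 + 8 * length_bytes + 6      # SERVICE + PSDU + TAIL
    return -(-total_bits // n_dbps)             # integer ceiling division
```

For example, a PSDU of 100 bytes at 24 data bits per symbol needs 35 symbols, and a PSDU of 78 bytes at 216 data bits per symbol needs 3 symbols.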
If the L-SIG was valid and the format is set to 802.11ac, further information from VHT-SIG-A is needed to decode the packet. Only the index of the last OFDM symbol can be calculated, which also results in a known packet timing. The format still remains 802.11a because the VHT-SIG-A is coded like an L-SIG with MCS 0.
Similar to the RX L-SIG state, the processing chain is configured to provide the bits of the VHT-SIG-A to the RX PHY state machine. The code word of VHT-SIG-A is provided in two OFDM symbols. The Viterbi decoder flush required flag in the bit processing configuration cluster is set for the second OFDM symbol. This bit-processing configuration cluster is aligned with the data stream in the RX Bit Processing block by the Align Configuration module (see Section 126.96.36.199). Hence, accurate indication of the current OFDM symbol index is available from the RX Bit Processing module and can be used to set the Viterbi decoder flush required flag.
The 48 bits of the VHT-SIG-A are captured, and its validity is verified based on Section 188.8.131.52.3 of . The condition VHT-SIG-A valid is based on the following checks:
- Bandwidth is supported by PHY
- Group ID indicates VHT SU PPDU (0 or 63)
- Short GI is set to zero (disabled)
- B2 is set to zero (BCC encoding)
- CRC checksum is matching
If VHT-SIG-A is invalid, the reception is aborted as in the RX L-SIG state, except that the timing information is already known from the successful L-SIG reception.
If VHT-SIG-A is valid, the bandwidth, format, MCS, and PSDU length are configured in the processing chain. Since there is no specific length information given in VHT-SIG-A, the PSDU length is calculated using Equation 22-112 of . This PSDU length is greater than or equal to the exact payload size. Padding bits are included in the PSDU and given to the MAC.
Wait for last sample
This state can abort a running reception when an invalid signal field occurs. The state machine waits until the Sample Timing Generation module of the RX I/Q Processing module indicates that the last sample of the current OFDM symbol has been processed. The global timestamp is captured at this point in time to provide the end of the packet as the new frame timing. The PHY RX end indication is generated using the internal information about format violation and timing validity in addition to this new frame timing.
End of PSDU RX
This state is entered when the signaling information was correctly received and the data field is to be decoded. Similar to the RX VHT-SIG-A state, flushing the Viterbi decoder is enabled only for the last OFDM symbol of the packet, which is identified by the last OFDM symbol index computed in the RX L-SIG state. Furthermore, in 802.11a format, the number of valid data bits in the last OFDM symbol before tail and padding is known and configured to the Viterbi module so that the TAIL bits are the last to be decoded. For 802.11ac format, the padding is inserted before the tail bits. The valid bits limitation is not used in this case. All bits of the last OFDM symbol are processed by the Viterbi decoder.
The state is left as soon as the PSDU Masking module in RX bit processing indicates that the last byte of PSDU has been decoded. A PHY RX end indication with the frame timing information is generated. Similar to the wait for last sample state, the frame timing is based on the global timestamp captured when the last sample of the packet was processed in Sample Timing Generation module in RX I/Q Processing.
Wait for packet end
The state is left after 1000 clock cycles, which is the duration of one OFDM symbol. This waiting period is required because either the RX IQ processing or RX bit processing chain or both could still be working on samples that must be terminated before setting the configuration for a new L-SIG reception. Since the processing is based on OFDM symbol boundaries, after the duration of one symbol, all modules are in idle state.
The overall timing of the RX chain including Synchronization, I/Q and bit processing, and the RX PHY state machine is shown in Figure 22 for 802.11a packets. Time is represented on the horizontal axis. On the vertical axis, several selected modules with important outputs or that change the transfer timing are displayed. The colored rectangles correspond to the data values of one OFDM symbol. The size and the placement along the time axis are related to the latencies and transfer timings of the modules. The black arrows show important control signals between processing chain and state machine and between PHY and MAC. The arrows are based on the timing information. Neither the start nor the end position has to be related to the module that generates or consumes this control information.
Figure 22 RX PHY Timing for 802.11a Packets
Figure 22 shows the timing of the receiver for an 802.11a packet with MCS 7 and NSYM=3. RF and Synchronization add the latency between over-the-air transmission and the synchronization output. The first OFDM symbol after the packet start is L-LTF-2. The RX PHY state machine has configured the RX IQ and Bit Processing modules to receive L-SIG.
As soon as the last sample of the L-SIG field is available in the FFT, the execution starts. The burstwise unloading of data is done in parallel to reception of the next OFDM symbol on the FFT module input. L-LTF-2 is terminated in the Align Configuration module of the RX Bit Processing block.
As the first dynamic field, the L-SIG is the first field handled by the RX Bit Processing. L-SIG uses MCS 0, so it has only 24 data bits, and the latency is much smaller than one OFDM symbol duration. The decoding and flushing of the Viterbi decoder takes most of the time. The RX PHY state machine can update the configuration cluster for the reception of the coded data symbols based on the L-SIG field contents long before the next OFDM symbol is unloaded by the FFT.
Starting with L-DATA-1, the RX Bit Processing chain uses MCS 7. This results in a larger number of bits on the Softbit Serializer module output. The Viterbi decoder keeps about 200 bits stored due to the internal latency. Because of this storage, the output of the RX Bit Processing chain is not aligned to OFDM symbol boundaries. All other DATA symbols are similar in timing. The Viterbi decoder output starts off with the remaining bits from the previous OFDM symbol.
The code word ends in the last OFDM symbol, and the Viterbi decoder is flushed. Due to padding bits, the decoding can end before the last bit has been received. The PSDU Masking module notifies the RX PHY state machine, which sends out the RX PHY end indication with the timestamp of the last sample of the packet and goes to the wait for packet end state. The RX PHY state machine remains in this state to terminate the remaining bits of the last OFDM symbol coming out of the RX Bit Processing chain. After the Initialization state, the RX chain is ready to process a new packet.
Figure 23 RX PHY Timing for Invalid Packets
Figure 23 illustrates the termination of the reception in case L-SIG was not valid. An invalid VHT-SIG-A is handled similarly. As in Figure 22, L-SIG is provided to the RX PHY state machine. Once it is determined that the L-SIG field contents are invalid, the last OFDM symbol index is set to zero, and the state machine goes to the wait for last sample state (shortened to wait in Figure 23).
At this point in time the FFT is filled with data from the next OFDM symbol and cannot be aborted immediately. The Sample Timing generation module in RX IQ Processing block completes the current OFDM symbol and notifies RX PHY state machine after the last sample. RX PHY state machine switches to wait for packet end state and waits for the duration of one OFDM symbol. During this time the FFT unloads the remaining data. This data is terminated in the Packet Termination module of the RX Bit Processing chain.
Figure 24 RX PHY Timing for 802.11ac Packets
The reception of an 802.11ac packet with a bandwidth of 40 MHz at MCS 9 is shown in Figure 24. Since the process before L-SIG is equal to Figure 22, it is left out. After the L-SIG field, the RX PHY state machine switches to RX VHT-SIG-A state and updates the format to 802.11ac. The Demapper module in the RX I/Q Processing chain enumerates further subcarriers for 802.11ac. The timing of VHT-SIG-A reception is similar to L-SIG in RX IQ and Bit Processing chain. Because the VHT-SIG-A field only has a small number of bits and the Viterbi decoder is flushed at the end of VHT-SIG-A2, the 48 bits arrive in one burst at the RX PHY state machine.
If VHT-SIG-A is determined to be invalid, the reception would be aborted similar to the L-SIG invalid case illustrated in Figure 23. In this case, VHT-STF would be the last OFDM symbol getting out of the FFT.
If VHT-SIG-A is valid, the parameters bandwidth and MCS are obtained from the field and used to set the configuration clusters for the processing chain. PHY RX start indication is sent to MAC and the RX PHY state machine transitions to the End of PSDU RX state and waits for end of decoding.
The next OFDM symbols contain training sequences and VHT-SIG-B. This information is not handled in RX Bit processing chain.
In this example, the RX Bit Processing is configured for MCS 9, which uses 256-QAM modulation. This scenario results in a large number of bits generated by the LLR Demapper, which are serialized by the Softbit Serializer. For this large number of bits, reading and writing of the Deinterleaver memory overlap, which is the reason for having a double page memory in this module. The Viterbi decoder core is flushed on the last OFDM symbol as in 802.11a format. The RX PHY end indication is sent by the state machine once the last byte has been provided to the MAC.
184.108.40.206 MAC RX
The module MAC RX implements low-level latency-critical MAC reception functionality, i.e. validation and recognition of received packets and triggering of ACK responses. Input to the module are SDUs delivered from the PHY together with associated control information. The MAC RX module performs frame validation, consisting of subframe detection for packets received in 802.11ac format and the FCS check for all received MPDUs. Subsequently MPDU type recognition is performed. For supported frame types, MAC header evaluation and address filtering is done. Finally MSDU extraction is performed by a configurable filter operation. In addition to these packet-handling related functionalities, the module MAC RX also handles CCA information from the PHY RX and forwards frame timing information from the PHY RX to the MAC TX.
As shown in Figure 25, the module MAC RX consists of five major submodules:
- A-MPDU Frame Validate
- MPDU Validate
- MPDU Recognize
- MPDU Filter
- Channel State
All five submodules are described in more detail in the following sections. Notice that the overall internal structure of MAC RX roughly follows the concept of the IEEE 802.11 SDL specifications. Refer to Section J.5 of  for more information about the SDL specifications.
Figure 25 MAC RX Block Diagram
The module A-MPDU Frame Validate is only applicable for packets received in 802.11ac format. It checks the MPDU delimiter and provides the contained information to subsequent modules. For received packets in 802.11a format, the module passes all data through without any change. For the 802.11 Application Framework version 1.1, this module can handle only A-MPDUs with one A-MPDU subframe, that is, with one MPDU.
The module MPDU Validate performs frame validation by means of the FCS field. The FCS check is done based on IEEE 32-bit CRC as specified in  Section 220.127.116.11. During the check, the 4 FCS bytes are removed. The module implements a small state machine to extract control information, such as the frame end timing validity and value, from the MPDU start indication primitive and the PHY RX end indication primitive. This information is collected in the RX info indication and forwarded to the Channel State module.
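The FCS check described above can be sketched in a few lines of host-side Python. This is only an illustration, not the FPGA implementation: it assumes that `zlib.crc32` from the Python standard library computes the same IEEE 32-bit CRC and that the FCS is carried least significant byte first. The function names `append_fcs` and `validate_mpdu` are hypothetical.

```python
import struct
import zlib


def append_fcs(mpdu_block: bytes) -> bytes:
    """Append a 4-byte FCS (IEEE CRC-32, assumed LSB-first byte order)."""
    fcs = struct.pack("<I", zlib.crc32(mpdu_block) & 0xFFFFFFFF)
    return mpdu_block + fcs


def validate_mpdu(mpdu: bytes):
    """Return (fcs_ok, mpdu_without_fcs).

    The 4 FCS bytes are removed during the check, regardless of the result.
    """
    if len(mpdu) < 4:
        return False, b""
    body, fcs = mpdu[:-4], mpdu[-4:]
    expected = struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF)
    return fcs == expected, body
```

A frame built with `append_fcs` passes `validate_mpdu`, while the same frame with a single flipped bit fails the check; in both cases the returned data no longer contains the FCS bytes.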
The module MPDU Recognize detects the frame type of received MAC PDUs. For supported frame types, MAC header evaluation is executed, including destination MAC address check (for all frames) or Source MAC address extraction (for frames with address field 2, such as data frames). Currently supported frame types are Data and ACK. For Data frames received with correct FCS and a matching address, an ACK transmission request is generated and forwarded to MAC TX.
The module MPDU Filter implements a configurable filter operation on received MPDUs. The filter can be configured to block MPDUs with FCS error, address mismatch or unsupported frame type. The filter also allows removal of MAC headers. The filtered received data and control information is converted into a serial data stream for transferring it using a target-to-host FIFO to the host.
The default filter configuration for 802.11 Application Framework version 1.1 is as follows:
- Remove header: TRUE
- Block unsupported frame types: TRUE
- Block FCS errors: TRUE
- Block address mismatch: TRUE
- Block header recognize error: TRUE
As a result of this configuration, only the frame body of received Data MPDUs with correct FCS and matching address is sent to the host. For the 802.11 Application Framework version 1.1, this is sufficient since no further MAC operation is implemented on the host. If you want to perform MAC operations on the host, the filter configuration has to be adapted. If, for example, the MAC header field Duration is evaluated on the host, remove header must be set to FALSE. Then the MAC headers of ACK and Data frames are forwarded to the host. After evaluating the Duration field, the host completes MAC header removal for Data frames.
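The effect of the filter settings can be illustrated with a small model. The following sketch assumes a simplified per-MPDU status record; the class names `FilterConfig` and `MpduInfo` and the function `filter_mpdu` are hypothetical and only mirror the five settings listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilterConfig:
    # Defaults mirror the version 1.1 configuration listed above.
    remove_header: bool = True
    block_unsupported_frame_types: bool = True
    block_fcs_errors: bool = True
    block_address_mismatch: bool = True
    block_header_recognize_error: bool = True


@dataclass
class MpduInfo:
    # Hypothetical status record for one received MPDU.
    header: bytes
    body: bytes
    fcs_ok: bool = True
    address_match: bool = True
    supported_type: bool = True
    header_recognized: bool = True


def filter_mpdu(cfg: FilterConfig, mpdu: MpduInfo) -> Optional[bytes]:
    """Return the bytes forwarded to the host, or None if the MPDU is blocked."""
    if cfg.block_fcs_errors and not mpdu.fcs_ok:
        return None
    if cfg.block_address_mismatch and not mpdu.address_match:
        return None
    if cfg.block_unsupported_frame_types and not mpdu.supported_type:
        return None
    if cfg.block_header_recognize_error and not mpdu.header_recognized:
        return None
    return mpdu.body if cfg.remove_header else mpdu.header + mpdu.body
```

With the default configuration only the frame body reaches the host. Setting `remove_header` to `False` forwards header and body together, which corresponds to the case where the host evaluates header fields such as Duration.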
The module Channel State gathers CCA status information, including energy detection and signal detection, and information about received frames, including frame end timing validity/value, and DIFS/EIFS indicator, and then it provides it to MAC TX.
The TX baseband operates in the baseband clock domain of 250 MHz. Its block diagram is shown in Figure 26.
The data source block selects the source for the transmitter. Data is always taken from a host to target FIFO, or you can disable this feature when the TX MAC is bypassed. The TX MAC accepts TX requests to start after SIFS (ACK frames) or after backoff procedure (data frames). It multiplexes the requests in priority order and generates a TX start request for the PHY. The TX Bit Processing module serializes the bytes received from the TX MAC and scrambles, encodes, punctures, and interleaves these bits. The TX I/Q Processing module modulates the bits according to the settings of the TX vector, which is defined in the IEEE specifications and collects all TX parameters. The module furthermore applies channel duplication and rotation as needed for 802.11ac format. The modulated symbols are then transformed into the time domain using an IFFT. The resulting I/Q samples are transferred to the TX to RF FIFO, which is an internal loopback FIFO for operation without RF and a target to host FIFO for debugging purposes (refer to Figure 26).
Every module is designed to keep up with the data rate from the upstream module, so there is no need for throttle control inside the modules. The timing of the transfers is described in the following sections.
Figure 26 TX Baseband Block Diagram
18.104.22.168 MAC MPDU Assembly
The current design is able to generate ACK frames and A-MPDU frames in the format described in Section 1.2. The frames consist of a header and an optional body. The header and body form the MPDU block, which is completed by the FCS in the MAC TX to a MPDU.
You can use the TX Data Source and TX MAC Bypass modules to bypass all MAC TX processing. When bypassed, all bytes from the T2H TX Data FIFO are streamed directly to the PHY TX. The host ensures that the FIFO contains valid MPDU data.
The header generation modules utilize the frame configuration cluster, which contains all supported header fields. This cluster is serialized in MAC Frame Header Generator into a continuous byte stream.
An ACK frame is generated with the help of the MPDU generation request register filled by the MAC RX. As the ACK carries no body, only the header must be serialized. When the ACK frame is ready to send, a TX after SIFS request (see Section 22.214.171.124) is triggered.
To generate a DATA frame, an ICP TX message (see Section 6.2.2) is decoded from the T2H TX Data FIFO. The header of the ICP message is transferred into a frame configuration, which is then used to create the A-MPDU block byte stream along with the payload data received from the ICP TX message. This step is coordinated by the Data Manager state machine in Prepare MPDU, which also triggers the TX after backoff request (refer to Section 126.96.36.199) to send the DATA frame.
188.8.131.52 MAC TX
The module MAC TX implements low-level latency critical MAC transmission functionality, i.e. the timing aligned provision of payload data and associated control information to TX PHY. Input to the module are MPDU blocks, which include the MAC header and frame body. These blocks are then extended by the FCS field to form a complete MPDU. For transmissions in 802.11ac format, the module generates A-MPDUs by adding delimiter fields and padding to the MPDUs. Version 1.1 of the 802.11 Application Framework only supports A-MPDUs with a single MPDU. The module supports backoff counting and handles Clear Channel Assessment (CCA) information provided from MAC RX. Furthermore, it ensures correct interframe spacing for the following scenarios:
- SIFS before ACK packet transmission
- DIFS/EIFS before Data packet transmission
The module implements two transmission types:
- Transmissions after backoff—Transmission in normal slots using CCA evaluation and backoff counting, with application of DIFS or EIFS as needed.
- Transmissions after SIFS—Transmission at the end of SIFS period without any CCA evaluation. It is, for instance, used for the ACK frame transmission.
As shown in Figure 27, the module consists of three major submodules:
- Timing Control
- Backoff
- Data Pump
All three modules are described in the following sections in more detail. Notice that the overall internal structure of MAC TX follows the concept of the IEEE 802.11 SDL specifications. Refer to Section J.5 of  for more information about the SDL specifications.
Figure 27 MAC TX Block Diagram
The module Timing Control implements the actual generation of timing information (timing signals) for the transmission part. At startup, a regular slot timing pattern is generated. The pattern consists of two signals, which are also shown in Figure 27:
- Signal slot M2 start—Provided to the module Backoff, refers to timing instant of CCA check. The name M2 start and other signal names described below are derived from the identifiers used in Figure 9-14 of .
- Signal slot M2 end—Provided to the module Data Pump, refers to timing instant of issuing a packet transmission to TX PHY taking into account the processing delay of the TX PHY and RF components.
The Timing Control module consumes timing information from the MAC RX. For frames received with valid length information independent of the actual FCS result, MAC RX provides a reference to the frame end timing to MAC TX. Based on this information, the Timing Control determines the correct interframe spacing and generates timing signals for transmission after SIFS (signal slot M1 end) and transmissions in regular slots after DIFS or EIFS depending on whether the frame reception was completed with FCS pass or fail.
Figure 28 Interframe Timing Relationships
The module Backoff performs the backoff procedure as defined in  Section 9.3.3. The backoff counter is initialized with the desired backoff value, provided through the TX after backoff request. At the appropriate timing instant within relevant slots, which is indicated to the module with the slot M2 start signal, the CCA information is checked and if the channel is idle, the backoff counter is decremented. If the counter reaches zero, this condition is indicated via the signal backoff done to the module Data Pump.
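The counting behavior can be modeled as follows. This is a simplified sketch that assumes one call per slot M2 start event and ignores DIFS/EIFS handling and active transmissions; the class name `BackoffCounter` and its methods are hypothetical.

```python
class BackoffCounter:
    """Simplified model of backoff counting at slot M2 start."""

    def __init__(self):
        self.count = None  # None means no TX after backoff request is pending

    def request(self, backoff_value: int) -> None:
        """Initialize the counter from a TX after backoff request."""
        self.count = backoff_value

    def on_slot_m2_start(self, channel_idle: bool) -> bool:
        """Evaluate CCA for one slot; return True when backoff is done."""
        if self.count is None or not channel_idle:
            return False  # nothing pending, or channel busy: counter is frozen
        if self.count > 0:
            self.count -= 1
        if self.count == 0:
            self.count = None  # request is handed over to the Data Pump
            return True
        return False
```

For a backoff value of 2, a busy slot leaves the counter untouched, and the backoff done condition is signaled in the second idle slot.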
The module Data Pump coordinates the actual data transmission functionality. The module accepts requests for transmission with or without application of the 802.11 backoff procedure, referred to as transmission after backoff and transmission after SIFS respectively. Those requests are processed based on the timing and backoff information described earlier in the section. The module adds the FCS field to the MPDU blocks to generate complete MPDUs. FCS calculation is done based on IEEE 32-bit CRC as specified in  Section 184.108.40.206. In the 802.11ac mode, the module generates A-MPDUs by adding delimiter fields and padding. The MPDUs or A-MPDUs together with associated control information are sent to the PHY SAP TX. The module also provides information about active transmissions to the Backoff module to ensure that this is taken into account during backoff counting. At a given instant of time, only one pending or active transmission request is allowed for transmission after backoff and one for transmission after SIFS. To enable higher level MAC entities to control the data flow to MAC TX accordingly, status information is provided by the MAC TX module.
In addition to the functions described previously, the module MAC TX also provides statistics information. Version 1.1 of the 802.11 Application Framework provides the following statistics:
- Number of TX after SIFS requests detected
- Number of TX after SIFS requests completed
- Number of TX after backoff requests detected
- Number of TX after backoff requests completed
220.127.116.11 TX Bit Processing
The purpose of the TX Bit Processing module is to generate the signal fields and enqueue the PSDU into the data stream. This stream is then serialized, scrambled, encoded, punctured, and interleaved before it is passed to TX I/Q Processing. Its block diagram is shown in Figure 29. The types of the data path and the elements of the control path are listed in Table 10.
Figure 29 TX Bit Processing Block Diagram
|Module||Output Data Type||Output Control Information|
|MAC TX||U8|| |
|TX PHY State Machine||U32||TX bit processing parameter length|
|Bit Serializer||Boolean||enable scrambler|
|Convolutional Encoder||Boolean array (2 elements)|| |
Table 10 TX Bit Processing Data Types and Control Information
The first module in TX Bit Processing is the TX PHY State Machine, which encodes the signal fields according to 18.3.4 of  (L-SIG), 18.104.22.168.3 of  (VHT-SIG-A), and 22.214.171.124 of  (VHT-SIG-B). The Data Generator VI furthermore turns PSDU into a sequence of SERVICE field, PSDU data, TAIL, and PADDING for the 802.11a format according to 18.3.5 in  or SERVICE field, PSDU data, PADDING, and TAIL for 802.11ac format according to 126.96.36.199 in  respectively. The Generator outputs are combined using the EDSC pattern (see Section 6.1.1). The TX Bit Processing has a head start of two OFDM symbols to have the first bits available when needed by the TX I/Q Processing.
Each signal field is generated in one burst. The data bits are generated as one continuous burst per OFDM symbol in dependence on NDBPS. A small FIFO with a four-wire handshake ensures that bytes for at least one OFDM symbol are available. Furthermore TAIL and PADDING bits are also generated as part of the corresponding burst. So in worst case, the bit processing chain generates bits for up to two OFDM symbols in one burst, which is compensated in a FIFO of the TX IQ Processing Data Assembler.
The next downstream module is the Bit Serializer module, which converts data fields and PSDU data into one bit per cycle. A FIFO at the start of the module ensures the module can process the incoming data rate. The maximum number of data bits per OFDM symbol, NDBPS, is 720 (802.11ac, 40 MHz, MCS 9). Because PSDU data is given in bytes, the FIFO must store at least 90 samples (720 bits / 8 bits per byte).
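The value of 720 follows from the number of data subcarriers, the bits per subcarrier, and the code rate. The short calculation below is a sketch; the subcarrier counts (48 for 802.11a, 108 for 802.11ac at 40 MHz) and code rates are taken from the IEEE 802.11 MCS definitions, and the function name `n_dbps` is hypothetical.

```python
from fractions import Fraction


def n_dbps(n_sd: int, n_bpscs: int, code_rate: Fraction) -> int:
    """Data bits per OFDM symbol for one spatial stream.

    n_sd: number of data subcarriers
    n_bpscs: coded bits per subcarrier (1 = BPSK, 6 = 64-QAM, 8 = 256-QAM)
    code_rate: convolutional code rate after puncturing
    """
    n_cbps = n_sd * n_bpscs            # coded bits per OFDM symbol
    result = n_cbps * code_rate
    assert result.denominator == 1, "NDBPS must be an integer"
    return int(result)


# 802.11ac, 40 MHz, MCS 9: 108 data subcarriers, 256-QAM, rate 5/6
ndbps_max = n_dbps(108, 8, Fraction(5, 6))
fifo_bytes = ndbps_max // 8            # PSDU data arrives as bytes
```

The same function gives 24 data bits for the BPSK, rate 1/2 case used by L-SIG in 802.11a.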
After bit serialization, scrambling, convolutional encoding, puncturing, and interleaving are applied as described in 188.8.131.52–184.108.40.206 of  and 220.127.116.11–18.104.22.168 of . The scrambler and the encoder must be reset before the first bit of the data field is processed. The Scrambler module is bypassed for signal fields. The Puncturer module serializes the two streams of the convolutional encoder using the puncturing patterns of Figure 18-9 and 20-11 of . A FIFO is used on the input of the module since the data rate is higher on the input. The Interleaver applies the BCC interleaver operations defined in section 22.214.171.124 of  and section 126.96.36.199 of . The write operation into the memory is based on Equation 22-77 of , which applies the first permutation. The read operation is based on Equation 22-82 of , which applies the second permutation. Reading is started as soon as all bits of the current OFDM symbol are saved to memory. A double page memory is used, which enables reading and writing at the same time. The output of the Interleaver and the modulation scheme that is used are provided on the output of the TX Bit Processing module.
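The two interleaver permutations can be illustrated with a software model. The sketch below uses the well-known BCC interleaver equations for a single spatial stream with a configurable number of columns (16 columns for 802.11a is assumed as the default). It combines both permutations into one index map, whereas the FPGA design applies the first on write and the second on read; the function names are hypothetical.

```python
def interleaver_map(n_cbps: int, n_bpsc: int, n_col: int = 16) -> list:
    """Return the output position j for every input bit index k.

    n_cbps: coded bits per OFDM symbol
    n_bpsc: coded bits per subcarrier
    n_col:  number of interleaver columns (16 assumed for 802.11a)
    """
    n_row = n_cbps // n_col
    s = max(n_bpsc // 2, 1)
    mapping = []
    for k in range(n_cbps):
        # First permutation: adjacent coded bits go to nonadjacent subcarriers.
        i = n_row * (k % n_col) + k // n_col
        # Second permutation: alternate between more and less significant bits.
        j = s * (i // s) + (i + n_cbps - (n_col * i) // n_cbps) % s
        mapping.append(j)
    return mapping


def interleave(bits: list, n_bpsc: int, n_col: int = 16) -> list:
    """Interleave the coded bits of one OFDM symbol."""
    mapping = interleaver_map(len(bits), n_bpsc, n_col)
    out = [0] * len(bits)
    for k, j in enumerate(mapping):
        out[j] = bits[k]
    return out
```

Because the map is a permutation of the bit positions, every input bit appears exactly once in the output, which is what allows the receiver to reverse the operation.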
|L-SIG Generator||U32 on start of L-SIG processing|
|VHT-SIG-A Generator||2 U24 on start of VHT-SIG-A processing burstwise|
|VHT-SIG-B Generator||U32 on start of VHT-SIG-B processing|
|Data Generator||NDBPS/8 U32 / OFDM symbol burstwise|
|Bit Serializer||NDBPS bits or array of bits / OFDM symbol burstwise|
|Puncturer||NCBPS bits / OFDM symbol|
Table 11 TX Bit Processing Transfer Timing
The output timing of the submodules is given in Table 11. All submodules of the TX PHY state machine generate data on the asserted enable signal from the OFDM symbol trigger type module. The modules need up to two U32 words. Starting with the OFDM symbol for the data field, the Data Generator provides the required number of bytes. After the Bit Serializer, NDBPS clock cycles are needed to complete the transfer. The convolutional encoder doubles the number of bits but due to the transfer of an array, the number of transfers is not changed. After puncturing, NCBPS bits remain.
Figure 30 TX Bit Processing Latency
The latency of the bit processing chain depends on the format, bandwidth, and MCS. Similar to Table 11, Figure 30 refers to the two corner cases of Non-HT mode with MCS 0 and highest MCS at highest bandwidth. The latency is given for the first subcarrier of the packet. So this is the time the TX bit processing chain needs from start trigger until the first valid bit is provided. Most of the modules have a fixed latency.
The Interleaver must store bits from one complete OFDM symbol. The read operation starts as soon as the last value arrives. NCBPS bits must be read before the last sample is available on the output of the Interleaver. An additional latency of 11 comes from the pipeline stages.
The latencies of the FIFOs in the Bit Serializer and Puncturer modules are unknown.
188.8.131.52 TX IQ Processing
The purpose of the TX IQ Processing module is to add the training fields and to convert the bits from TX Bit Processing into baseband I/Q samples. The OFDM symbol trigger of the RF loop is used to clock the generation of the OFDM symbols (see section 2.3.1). The block diagram is illustrated in Figure 31. The data types and control information are listed in Table 12.
Figure 31 TX IQ Processing Block Diagram
|Module||Output Data Type||Output Control Information|
|TX Bit Processing||Boolean|| |
|Create Packet Structure|| ||Field Map, TX IQ Processing Parameter|
|L-STF Assembler||CFX 3.13|| |
|Assembler Modules||CFX 2.14|| |
|Channel Duplication||CFX 2.14|| |
|Channel Rotation||CFX 2.14|| |
|IFFT Prescale||CFX 0.16|| |
|Xilinx IFFT||CFX 3.13|| |
Table 12 TX IQ Processing Data Types and Control Information
The Create Packet Structure modules create a timing structure, a field map, and the processing parameters that stay constant along the OFDM symbol. These parameters can include bandwidth, CP length, channel duplication, channel rotation, and tone scaling factor.
The field map controls which field is generated for the current OFDM symbol. Using the enable driven stream combiner (EDSC) pattern (see section 6.1.1) for field generation reduces the latency because the field assemblers execute in parallel. There is one assembler for each training field, one for the pilots, and one for the bits taken from the TX Bit Processing module.
TX IQ Processing has to start delivery of IQ data as soon as possible after the TX start request is triggered. Because the IFFT takes about half an OFDM symbol (as described in Section 184.108.40.206), the L-STF is pregenerated in time domain, and its I/Q data is stored in a block RAM. For each combination of bandwidth and primary subband, a bank is reserved in the memory. Because L-STF is a Non-HT field, there is no need to distinguish between 802.11a and 802.11ac. For 802.11a, an additional bank for DC centered signal exists. Because L-STF is a repeating sequence in time domain with a period of 0.8 µs, you need to store only 64 samples.
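The memory size follows directly from the sample rate and the L-STF period. The sketch below assumes a TX sample rate of 80 MS/s (the rate stated for the RX stream) and the 8 µs L-STF duration defined by IEEE 802.11. The bank addressing in `l_stf_read_addresses` is a hypothetical layout and not necessarily the one used in the FPGA.

```python
from fractions import Fraction

SAMPLE_RATE_HZ = 80_000_000              # assumed baseband sample rate
L_STF_PERIOD_S = Fraction(8, 10**7)      # 0.8 us repetition period
L_STF_DURATION_S = Fraction(8, 10**6)    # 8 us total L-STF duration

SAMPLES_PER_PERIOD = int(SAMPLE_RATE_HZ * L_STF_PERIOD_S)
L_STF_SAMPLES = int(SAMPLE_RATE_HZ * L_STF_DURATION_S)


def l_stf_read_addresses(bank: int, bank_size: int = SAMPLES_PER_PERIOD) -> list:
    """Read addresses that replay one stored period for the whole L-STF."""
    base = bank * bank_size
    return [base + (n % bank_size) for n in range(L_STF_SAMPLES)]
```

One stored period of 64 samples is read ten times in a row to produce the 640 samples of the complete L-STF.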
The remaining training fields are generated according to the sections 220.127.116.11 (L-LTF), 18.104.22.168 (VHT-STF), and 22.214.171.124 (VHT-LTF) of .
The L-DATA and VHT-DATA are built from bits generated by TX Bit Processing modules. The bit stream is buffered in a FIFO that is laid out to buffer up to three OFDM symbols, which are the bit processing head start, the current OFDM symbol, and the last OFDM symbol, if filled with padding. Besides applying the correct modulation, the module also rotates VHT-SIG-A2 according to 126.96.36.199.3 in .
The pilot tones are inserted according to 188.8.131.52 of  using the information from the field map.
After the assembler modules, the channels are duplicated and rotated according to sections 22.3.4.x and 184.108.40.206 of . Here the Create Packet Structure modules ensure correct settings for handling channel duplication and rotation.
The next downstream module is the IFFT prescale. This module applies tone field scaling according to 220.127.116.11 of  to ensure the time domain power of VHT modulated fields does not exceed the time domain power of pre-VHT modulated fields (each summed over all transmit channels). The scaling factor is determined in the Create Packet Structure module and depends on bandwidth and field type (refer to table 22-8 of ).
The last module is the IFFT, which is a wrapper around the Xilinx FFT core. It contains a 256-point IFFT operation using “Radix 4, Burst I/O” architecture similar to the RX IQ processing. In addition, the configuration input is used to enable cyclic prefix insertion by the core. The core configuration settings, such as this input, are dynamic and are provided parallel to the first sample of each OFDM symbol. A small FIFO is placed before the FFT input to compensate the longer execution time due to guard interval GI2 of L-LTF-1. The FFT output is shifted in frequency using a toggling negation. The fixed point format on the output is CFX 3.13 based on the requirements of section 18.104.22.168.
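The frequency shift by toggling negation relies on the fact that multiplying every second time domain sample by -1 moves the spectrum by half the transform size. The following sketch demonstrates this property with a small, direct DFT in plain Python; it uses a 16-point transform instead of the 256-point core only to keep the example short.

```python
import cmath


def dft(x: list) -> list:
    """Direct discrete Fourier transform (O(N^2), for illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def toggle_negate(x: list) -> list:
    """Negate every odd-indexed sample, i.e. multiply by (-1)^t."""
    return [v if t % 2 == 0 else -v for t, v in enumerate(x)]


N = 16
tone_bin = 3
tone = [cmath.exp(2j * cmath.pi * tone_bin * t / N) for t in range(N)]
shifted_spectrum = dft(toggle_negate(tone))
```

A tone that originally sits in bin 3 appears in bin 3 + N/2 = 11 after the toggling negation, so the operation swaps the lower and upper halves of the spectrum without any multiplier.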
Figure 32 TX IQ Processing Latency
There are two paths for latency in TX I/Q processing. One goes for the precalculated samples of the L-STF in time domain. This path has only a latency of five cycles, which allows the PHY to ensure a packet starts at the interface to the RF when it is triggered inside the PHY. The second path goes for all the other symbols that are created using the path shown in Figure 32. The FFT latency is smaller than reported by the Xilinx core because the reported value includes loading of all 256 samples. During the packet, the FFT executes and unloads samples in 616 clock cycles after the last sample arrived. The remaining clock cycles per OFDM symbol are used to transfer data from the input FIFO to the FFT core. By the time the last sample is available on the input, the FIFO is empty and it is passed to the core as fast as possible. The delay of the FIFO is unknown. All other modules have a fixed latency.
The overall timing of the TX chain including TX Bit Processing and TX IQ Processing is shown in Figure 33 for packets in 802.11ac format. The timing works similarly for 802.11a packets. Time is represented on the horizontal axis. On the vertical axis, only a few module outputs that are important or that change the transfer timing are shown. The colored rectangles correspond to the data values of one OFDM symbol. Their size and placement along the time axis are related to the latencies and transfer timings of the modules. The black arrows show important control signals inside the processing chain and between PHY and MAC. The arrows are based on the timing information.
Figure 33 TX PHY Timing for 802.11ac Packets
Figure 33 shows the timing of the transmitter for an 802.11ac packet. The RF transmission starts with the start of packet trigger that is given with the start of the first OFDM symbol. This is possible because the L-STF is unloaded from memory in time domain. The latency is only a few cycles. In parallel, the bit processing starts with the encoding of bits for the L-SIG and the I/Q processing starts assembling the L-LTF. The head start of four symbols for the bit processing and two symbols for the IQ processing is kept during the packet generation. This design ensures that all samples arrive in time for I/Q processing and RF.
Assembling the bits takes only a few cycles. Bit insertion when the code rate is applied in convolutional encoding and puncturing causes the chain length of valid samples to increase at the end of TX Bit Processing. The samples leave the bit processing as a burst because the interleaver handles data in groups of NCBPS bits.
The assemblers in the TX IQ Processing module append training fields and add pilots to the fields generated by the TX Bit Processing module. Each OFDM symbol contains 256 I/Q samples. The IFFT transforms this I/Q data into the time domain and adds the guard interval. This transformation takes 360 cycles plus FIFO delay.
In case of an invalid TX Request, there is no packet generation at all and the PHY TX Request handler generates the TX end indication immediately.
The digital downconversion (DDC) and digital upconversion (DUC) modules are based on the PXIe Streaming project templates of USRP and NI 579X. Their block diagrams are shown in Figure 34 and Figure 35. In the downconversion path, a DC Offset Correction module is present to estimate and compensate the residual DC offset from the RX LO. This module mitigates the impact on the autocorrelation computation within the Synchronization block. The DC Offset Correction module estimates the offset using an average over 512 samples. After each averaging window, the correction value is increased or decreased by one LSB. Over time, the correction value approaches the DC offset iteratively.
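The iterative behavior of the DC offset correction can be modeled as a sign-driven tracking loop. The sketch below makes several assumptions that are not stated in the text: it operates on one real-valued rail with integer samples, estimates the residual offset on the already corrected samples, and leaves the correction unchanged when the window sum is exactly zero. The function name `dc_offset_tracker` is hypothetical.

```python
def dc_offset_tracker(samples, window: int = 512, lsb: int = 1):
    """Track a DC offset by stepping the correction one LSB per window.

    Returns (corrected_samples, final_correction).
    """
    correction = 0
    acc = 0
    corrected = []
    for n, s in enumerate(samples, start=1):
        y = s - correction          # apply the current correction value
        corrected.append(y)
        acc += y                    # accumulate the residual offset
        if n % window == 0:         # end of one averaging window
            if acc > 0:
                correction += lsb   # residual positive: correct more
            elif acc < 0:
                correction -= lsb   # residual negative: correct less
            acc = 0
    return corrected, correction
```

With a constant input of 5 LSBs, the correction value reaches 5 after five windows and then stays there, which shows why the loop needs several windows to converge after a large offset change.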
Figure 34 DDC Block Diagram
Figure 35 DUC Block Diagram
The latencies of DDC and DUC are given in Figure 36 and Figure 37. The Fractional Decimator and Interpolator latencies depend on the ratio of clock rate versus sample rate. Since the clock rate is different between the RF devices USRP and FlexRIO, the DDC has a target-specific latency. The latency for the DUC remains the same for both target types.
Figure 36 DDC Latency
Figure 37 DUC Latency
The analog parts of the device and FPGA logic that are not presented on the block diagram add latency to the RF path. Those can be measured using RF loopback and the Streaming project templates for the specific target. The results are listed in Table 13.
| ||USRP RIO 40 MHz BW (Data clock = 120 MHz)||USRP RIO 120 MHz BW (Data clock = 200 MHz)||FlexRIO / FAM (Data clock = 130 MHz)|
|DDC||57 clock cycles ≈ 0.48 µs||72 clock cycles ≈ 0.36 µs||59 clock cycles ≈ 0.45 µs|
|DUC||40 clock cycles ≈ 0.33 µs||50 clock cycles ≈ 0.25 µs||40 clock cycles ≈ 0.31 µs|
|Others (ADC, DAC, … )||100 clock cycles ≈ 0.83 µs||50 clock cycles ≈ 0.25 µs||115 clock cycles ≈ 0.88 µs|
|RF round trip time||197 clock cycles ≈ 1.64 µs||172 clock cycles ≈ 0.86 µs||214 clock cycles ≈ 1.65 µs|
Table 13 RF Latency
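The microsecond values in Table 13 follow from dividing the clock cycle counts by the data clock rate. The sketch below reproduces this conversion for the round trip time; the dictionary layout and function names are hypothetical, while the numbers are taken from the table.

```python
# Data clock in MHz and latency in clock cycles per target (from Table 13).
RF_LATENCY_CYCLES = {
    "USRP RIO 40 MHz BW": (120, {"DDC": 57, "DUC": 40, "Others": 100}),
    "USRP RIO 120 MHz BW": (200, {"DDC": 72, "DUC": 50, "Others": 50}),
    "FlexRIO / FAM": (130, {"DDC": 59, "DUC": 40, "Others": 115}),
}


def cycles_to_us(cycles: int, data_clock_mhz: float) -> float:
    """Convert a number of clock cycles to microseconds."""
    return cycles / data_clock_mhz


def round_trip(target: str):
    """Return (total cycles, total latency in microseconds) for one target."""
    clock_mhz, parts = RF_LATENCY_CYCLES[target]
    total = sum(parts.values())
    return total, cycles_to_us(total, clock_mhz)
```

For example, the USRP RIO with 40 MHz bandwidth accumulates 57 + 40 + 100 = 197 clock cycles, which at 120 MHz corresponds to about 1.64 µs.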
The host is a sample application that covers all important features of the 802.11 Application Framework. This covers configuration of the FPGA target, exchanging payload data, and monitoring the system status.
3.2.1 Host Architecture
The host is split into six loops covering the jobs of configuration, data exchange, and status display (refer to Figure 38). The initialization of the system and the cleanup are done on the upper left and upper right of the block diagram respectively. The system status is passed around with the help of a session cluster that stores all handles to devices and queues used in the host application.
There are a few queues used to buffer and pass payload and status information around the system (refer to Table 14). All queues are part of the session cluster.
|stop||synchronize shutdown of all loops in case an error occurred or the stop button was pressed|
|send||buffer data that should be transmitted to the target|
|receive||buffer payload received from the target|
|receive throughput||store information about received payload size and timestamps (used for throughput graph display)|
Table 14 Message Queues Used in Host Application
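To illustrate how the entries of the receive throughput queue could feed a throughput graph, the following Python sketch computes a throughput value from payload sizes and timestamps. The host itself is implemented in LabVIEW; the entry layout `(timestamp_s, payload_bytes)`, the function name, and the one-second averaging window are assumptions made for this example and are not taken from the host code.

```python
def throughput_mbps(entries, window_s=1.0):
    """Estimate throughput in Mbit/s from (timestamp_s, payload_bytes) entries.

    Only entries that lie within window_s seconds of the newest entry are
    counted. The entry layout and the window length are assumptions.
    """
    entries = list(entries)
    if not entries:
        return 0.0
    newest = entries[-1][0]
    # Sum the payload bits received inside the averaging window.
    bits = sum(size * 8 for stamp, size in entries if newest - stamp < window_s)
    return bits / window_s / 1e6


if __name__ == "__main__":
    # Hypothetical queue content: four received payloads.
    queue_content = [(0.0, 1000), (0.5, 1500), (1.2, 1500), (1.4, 1000)]
    print(f"{throughput_mbps(queue_content):.3f} Mbit/s")
```

Keeping only sizes and timestamps in a separate queue lets the display loop compute this value without touching the payload data itself.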
The loops for data transmission between target and host as well as between host and UDP ports run without throttling to achieve the maximum possible throughput. The loops for configuration and status display run every 100 ms to keep the system responsive. The loop that displays events, constellation, channel estimation, and spectral plots runs at a slower rate (every 250 ms) because it consumes a lot of processing power.
Figure 38 Host Schematic Block Diagram
3.2.2 System Configuration
The parameters for system configuration are split into three groups, which are ordered on the front panel from top to bottom. Parameters that can only be changed at system start are located on the upper right. Parameters that can only be changed while the station is off are located above the gray line in each tab. Parameters that can be changed at any time are located below these lines.
Refer to the HTML documentation included inside the project files tab for a detailed description of each parameter.
The host offers an automatic gain control (AGC) mechanism to ensure that the operating point of the system stays within an optimum range. The main building blocks, including a schematic of the power measurement, are shown in Figure 39. Power measurement is done in the baseband as described in the Power Measurement section. Based on this power measurement, the AGC adjusts the RF gain to meet a target ADC headroom of around -25 dBFS. This target headroom is derived in Section 2.3.3.
Figure 39 Schematic Power Measurement
The host mechanism does not offer packet-by-packet adaptation. Therefore, the AGC works on the signal power measured at the start of the previously detected packet. If this value deviates from the target headroom by more than ±1 dB, the RX gain is adjusted accordingly in steps of 0.5 dB.
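The gain update rule can be sketched in a few lines of Python. This is a minimal illustration, not the host implementation, which is written in LabVIEW. The function name and the gain limits `min_gain_db` and `max_gain_db` are hypothetical. The sketch also assumes that the full deviation is corrected in one update, rounded to the 0.5 dB gain step; the text could equally be read as a single 0.5 dB step per update.

```python
TARGET_DBFS = -25.0   # target ADC level stated in the text
TOLERANCE_DB = 1.0    # accepted deviation stated in the text
GAIN_STEP_DB = 0.5    # gain step size stated in the text


def next_rx_gain(current_gain_db, measured_dbfs,
                 min_gain_db=0.0, max_gain_db=30.0):
    """Return the RX gain to apply after a detected packet.

    measured_dbfs is the signal power at the start of the previously
    detected packet. min_gain_db and max_gain_db are hypothetical limits.
    """
    error_db = TARGET_DBFS - measured_dbfs
    if abs(error_db) <= TOLERANCE_DB:
        # Within the tolerance band: keep the current gain.
        return current_gain_db
    # Assumption: correct the whole error, quantized to the gain step.
    steps = round(error_db / GAIN_STEP_DB)
    new_gain_db = current_gain_db + steps * GAIN_STEP_DB
    # Keep the result inside the (hypothetical) gain range of the device.
    return min(max(new_gain_db, min_gain_db), max_gain_db)


if __name__ == "__main__":
    print(next_rx_gain(20.0, -25.4))  # inside the tolerance band
    print(next_rx_gain(20.0, -20.0))  # signal too strong, gain is reduced
    print(next_rx_gain(20.0, -30.3))  # signal too weak, gain is increased
```

Because the update uses the power of the previously detected packet, a sudden change in received power is only compensated from the following packet onward.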