Project GFS: The GFSK Receiver

This project is a compulsory part of the examination for the Implementation of Digital Signal Processing course at the University of Twente. The goals of this project are:

Preparation

Read the document Some Background on GFSK Modulation carefully. It contains the theory and some practical remarks that you will need in the rest of this project. Consult also the lecture slides on GFSK modulation.

Files and Directories

Go to your home directory and fetch the files for this project with:
get-module gfs gfs
Note: The first argument of get-module is the project name; the second is the name of the subdirectory in your file system. So, by issuing the command several times with a different second argument, you can make multiple copies of the distribution.

Three subdirectories of sch will be created:

A multitude of files are involved in this project. They will be presented gradually, at the moment that they will be needed.

Even though Arx is not used in the first exercises of this project, change to subdirectory arx and run make. This ensures that the database is consistent by generating the C++ and VHDL versions of the Arx source files present in the subdirectory. At this moment, you don't need to understand the contents of the source files.

Now start CCSS. Close the workspace that you used for the previous project and create a new workspace (call it e.g. gfsk) for this project. In this new workspace, use the Add Existing Library menu command to add library sys by selecting sys.ssl in subdirectory sys.

Exercise GFS-1: Familiarization with GFSK

Open model tb_gfsk_ccss. It is a functional GFSK testbench written entirely in CCSS. At the left, there are two bit sources, each providing input to one of two GFSK modulators. The lower one is the wanted one, which uses a carrier of 1 MHz. The upper one is an interferer. Interferers are quite common in radio communication: many standards have multiple channels in some frequency band, and systems should be designed in such a way that communication in adjacent channels does not interfere too much. In this case, the interferer uses a carrier of 3 MHz. The low-pass filter in the receiver should suppress the interferer. This is one of the things that you should prove in this exercise. The strengths of the wanted and interferer signals can be controlled by the parameters wanted_amplitude and interf_amplitude respectively.

Going further to the right in the model, you will see that the wanted and interferer signals are added after modulation and sent to an AWGN channel. The entire testbench uses the floating-point datatype double, but the model quantizer discretizes the signal by going to a fixed-point datatype and back. In this way, an analog-to-digital converter (ADC) is modeled.

Next comes the receiver or demodulator with the four stages as explained in the PDF document mentioned above. The output after each stage is accessible. Some of them are connected to PowerSpectralDensity blocks to visualize the frequency content of a signal, others to WriteSignal blocks to visualize the signals in the time domain. The same happens earlier in the signal path. Feel free to add or remove such blocks if you think it is appropriate.

Descend into both the modulator and demodulator models and briefly explain in your own words the function of each block, referring to the theory given in the PDF document.

Generate code for tb_gfsk_ccss and simulate it with control file sys/tb_gfsk_ccss.scf. Make sure that interf_amplitude is zero.

Use Davis to visualize the spectra before and after the channel, after mixing and after low-pass filtering. Set the Y-axis to a dB scale for better visualization.

Inspect the time signals dam_out, slicer_out and demod_out as well as the input bit stream. Do you recognize the input bit stream in the output? How much latency does the system have (how much "garbage" comes out of the demodulator before the first input bit is output)? How large is the amplitude of dam_out and how does this relate to the theory presented in the PDF document?

Repeat the simulation with an interferer amplitude of 10.0. Comment using the spectra as well as the time-domain plots.

Optional: feel free to vary more parameters such as the signal-to-noise ratio snr and slicer_offset.

Exercise GFS-2: Bit-Error Rate Simulations

Now open model tb_gfsk_ccss_ber. It is a modification of tb_gfsk_ccss. All PowerSpectralDensity and WriteSignal blocks have been removed. Instead the input and output bit streams are led to a block BER that calculates bit-error rates. A Demultiplex2 block takes care of removing the garbage caused by the system latency. Generate code for this model. The interferer is active. Both the wanted and interferer bit streams are random, but initialized with a fixed seed such that repeated runs of the simulation produce identical results.

Use tb_gfsk_ccss_ber.scf as the control script for simulation. Make a copy of this file, because you will change parameter values on various occasions and will need to restore the original values later on. If you inspect it, you will see that two things are happening in the last part of the TCL code. First the model is executed repeatedly in a loop (by calling run_iteration). In this loop the SNR is gradually increased until the BER drops below 1.0e-4 (one error in 10,000 bits). The block BER stops each of the iterations, either when a maximum number of errors (1000) is reached or when a maximum number of bits (10000) has been simulated.
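The bookkeeping done by the BER block and the latency removal can be illustrated with a minimal Python sketch. It is not the CCSS implementation: the function name measure_ber is hypothetical, and it assumes that system_latency is counted in bits and that the first system_latency received bits are simply discarded.

```python
def measure_ber(tx_bits, rx_bits, system_latency, max_errors=1000, max_bits=10000):
    """Estimate the bit-error rate of rx_bits against tx_bits.

    Assumption: the receiver output lags the input by system_latency bits,
    so that many leading output bits are "garbage" and are skipped.
    """
    errors = 0
    compared = 0
    for tx, rx in zip(tx_bits, rx_bits[system_latency:]):
        # Stop on whichever limit is reached first.
        if compared >= max_bits or errors >= max_errors:
            break
        compared += 1
        if tx != rx:
            errors += 1
    return errors / compared if compared else 0.0


tx = [1, 0, 1, 1, 0, 0, 1, 0]
rx = [0] * 5 + tx                 # ideal receiver with 5 bits of latency
ber_ok = measure_ber(tx, rx, system_latency=5)

rx_bad = list(rx)
rx_bad[5 + 2] ^= 1                # flip one received bit
ber_one_error = measure_ber(tx, rx_bad, system_latency=5)
```

With a wrong value of system_latency the two streams are misaligned and the measured BER rises sharply, which is why the synchronization parameters matter.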

The quality criterion throughout the entire project is the so-called sensitivity level. This is the SNR for which the BER becomes 1e-3. The last part of the TCL code estimates this level by taking the two SNRs for which the BER is just above and just below 1e-3 and applying linear interpolation. Run the simulation. On your screen (both in CCSS and in the terminal from which you called ccss) you will see the output of the script. It can take some time (say, a minute) to finish.
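The interpolation step can be sketched as follows in Python. The function name and the example numbers are made up for illustration, and the sketch assumes the interpolation is linear in the BER value itself (not in its logarithm); check the SCF to see what the script actually does.

```python
def sensitivity(snr_above, ber_above, snr_below, ber_below, target=1e-3):
    """Linearly interpolate the SNR (in dB) at which the BER crosses target.

    (snr_above, ber_above): the point with BER just above the target.
    (snr_below, ber_below): the point with BER just below the target.
    """
    fraction = (ber_above - target) / (ber_above - ber_below)
    return snr_above + fraction * (snr_below - snr_above)


# Hypothetical measurements: BER 2e-3 at 10 dB and no errors at 11 dB.
example = sensitivity(10.0, 2e-3, 11.0, 0.0)
```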

If everything went well, the script will report a sensitivity level of about 10.8 dB. Throughout the entire project this value will be a reference for the quality of the design. Any design you make should keep the sensitivity within 0.5 dB of this value, so below 11.3 dB.

Correct synchronization is essential for the BER performance. In the SCF the two parameters that matter are system_latency with a value of 5 and slicer_offset with a value of 2. Try a few neighboring values and write down the sensitivity obtained.

The script reports sensitivity levels in four digits. How accurate is this? Increase parameter max_bits to 20000 and 50000 (and use the optimal values for system_latency and slicer_offset). Which sensitivity levels are reported? You can use a value of 10000 bits for the rest of the project as a compromise between accuracy and simulation time.
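As a rough aid for reasoning about accuracy, the statistical spread of a BER estimate can be computed under the assumption that bit errors are independent (a binomial model). This is a generic statistical sketch, not part of the provided scripts, and the function name is hypothetical.

```python
import math


def ber_relative_std(ber, n_bits):
    """Relative standard deviation of a BER estimate based on n_bits bits.

    Assumes independent bit errors: the error count is binomial, so the
    estimate has standard deviation sqrt(ber * (1 - ber) / n_bits).
    """
    return math.sqrt((1.0 - ber) / (ber * n_bits))


# Around the sensitivity level (BER = 1e-3) for the three values of max_bits.
spread = {n: ber_relative_std(1e-3, n) for n in (10000, 20000, 50000)}
```

At a BER of 1e-3, 10000 bits contain only about ten errors on average, so the estimate is noticeably noisy; compare this with the variation you observe between the reported sensitivity levels.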

Exercise GFS-3: Fixed-Point Optimizations, Part 1

Inspect model quantizer that is used to model the ADC in the two testbenches used until now. It uses the SystemC datatype sc_fix of which the word length (wl_adc) and integer word length (iwl_adc) can be set at run time. So, no recompilation is needed each time a new word length is chosen. The model also has rounding for quantization mode and saturation for overflow mode.
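The behavior of such a quantizer can be mimicked in a few lines of Python. This is only a sketch, not the CCSS model: it assumes a signed format in which iwl counts the bits to the left of the binary point including the sign bit, and it rounds ties upward; the function name quantize is hypothetical.

```python
import math


def quantize(x, wl, iwl):
    """Map x to a signed fixed-point grid with wl bits in total, of which
    iwl are integer bits (sign bit included), then return it as a float."""
    frac_bits = wl - iwl
    lsb = 2.0 ** -frac_bits
    # Rounding: nearest multiple of the LSB (ties go upward).
    q = math.floor(x / lsb + 0.5) * lsb
    # Saturation: clamp to the representable two's-complement range.
    max_val = 2.0 ** (iwl - 1) - lsb
    min_val = -(2.0 ** (iwl - 1))
    return min(max(q, min_val), max_val)


inside = quantize(0.3, wl=4, iwl=1)      # rounded to the 0.125 grid
clipped_hi = quantize(5.0, wl=4, iwl=1)  # saturates at the largest value
clipped_lo = quantize(-5.0, wl=4, iwl=1) # saturates at the smallest value
```

A larger signal amplitude needs more integer bits to avoid saturation, while the fractional bits set the quantization noise floor.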

Consider again testbench tb_gfsk_ccss_ber with the original SCF settings. Increase the interferer amplitude to 20.0 and run a BER simulation. What happens?

Try to more or less recover the original performance by trying new values for the parameters wl_adc and iwl_adc. The goal is, of course, not to make them larger than necessary.

Exercise GFS-4: Fixed-Point Optimizations, Part 2

Open model tb_gfsk_ccss_ber_quant. It is almost identical to tb_gfsk_ccss_ber, the only difference being that the demodulator has been replaced by model gfsk_demod_ccss_quant. If you inspect this version of the demodulator, you will see that quantizers have been introduced between the mixer and the low-pass filter. They are there to investigate the optimal fixed-point format in that stage of the design. The quantizer parameters wl_mix_out and iwl_mix_out are available at the top level for the purpose of finding the optimal value.

For simulation of tb_gfsk_ccss_ber_quant, you can use the SCF tb_gfsk_ccss_ber.scf (make sure to have the version with the original parameter settings; restore especially the interferer amplitude to 10.0 and the synchronization parameters). The fixed-point parameters for the mixer output are commented out. You can uncomment them.

Propose optimal fixed-point data types for each stage of the demodulator: for the mixer output, filter output and DAM output. Use the strategy of working your way from input to output. So, once you have the right values for the mixer output, you modify gfsk_demod_ccss_quant to introduce quantizers at the filter outputs and propagate its parameters to the testbench level. You optimize these and go to the DAM output to do the same. You may use the very first testbench tb_gfsk_ccss to perform simulations and inspect signal amplitudes in the time domain to get an impression of the fixed-point formats needed.

Exercise GFS-5: Demodulator in Arx: C++ simulation

Open model tb_gfsk_arx. This is the counterpart of the first testbench tb_gfsk_ccss where the demodulator has been realized entirely with Arx blocks. Inspect this demodulator gfsk_demod_arx. You should notice that the model is a concatenation of Arx blocks wrapped in PRIM models. Data conversion only takes place at the boundaries, not between the Arx blocks. Model gfsk_demod_arx has parameters for all fixed-point datatypes used. Note that they do not influence the Arx code (fixed-point parameters for Arx need to be supplied in the Arx code itself and require recompilation both of Arx and CCSS testbench); they are only used for the purpose of data conversion at the interfaces.

Study as well the source code of all blocks in Arx. Try to understand especially the mixer that is based on a CORDIC and the low-pass filter that uses a multiplierless direct-form style of implementation.
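For orientation, the generic CORDIC rotation algorithm can be sketched in Python as shown below. This is the textbook algorithm in floating point, not a transcription of the Arx mixer: the iteration count, the gain compensation at the end and the function name are assumptions. In hardware the multiplications by 2^-i are shifts, which is what makes the method multiplierless.

```python
import math


def cordic_rotate(x, y, angle, iterations=16):
    """Rotate the vector (x, y) by angle radians (|angle| <= pi/2)
    using a sequence of micro-rotations by atan(2^-i)."""
    # Elementary angles and the accumulated gain of all micro-rotations.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # remaining angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        # Both updates use the old x and y (tuple assignment).
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    # Compensate for the constant CORDIC gain (about 1.647).
    return x / gain, y / gain


xr, yr = cordic_rotate(1.0, 0.0, math.pi / 4)
```

Mixing a sample with a carrier amounts to rotating it by the current carrier phase, which is why a rotator of this kind can serve as a mixer.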

Generate code and simulate this testbench using SCF tb_gfsk_ccss.scf also used for the CCSS testbench. Plot the dam_out signal of the Arx model in the same Davis window as the dam_out of the CCSS model. You should see two main differences. Mention and explain them.

Exercise GFS-6: Demodulator in Arx: VHDL Synthesis

The distribution of this project comes with support for VHDL. An entity gfsk.vhd is provided that instantiates all blocks coming from Arx as well as a clock generator for the different clock rates necessary. Word lengths are centrally administered in a file pk_gfsk.vhd.

A testbench is provided for a standalone VHDL simulation. You can run a Questasim simulation if you are curious. The compulsory part of the VHDL work concerns synthesis. Synthesize the design using the command srun generate-design and inspect the log file when synthesis is ready. Comment on the sizes of the 4 different blocks in the demodulator. Use the number of flipflops as a measure of complexity. In the Arx code, the keyword register specifies the flipflops. Does the flipflop count computed from the Arx code match the count reported in the log file? In the log file, standard cells whose names start with "DFF" indicate flipflops.

Exercise GFS-7: Fixed-Point Optimization for Arx Models

Consider this as an optional exercise and skip it if you are running out of time.

Perform a BER simulation with testbench tb_gfsk_arx_ber and script tb_gfsk_arx_ber.scf. It should result in a sensitivity comparable to the CCSS-only testbench. Now increase the interferer amplitude from 0 to 10 and repeat the simulation.

Obviously, robustness against an interferer was not taken into account when choosing the fixed-point formats in the Arx code. Redesign the word lengths in the Arx models such that they achieve the sensitivity performance for an interferer with an amplitude of 10. Be aware that you need to run make on the Arx code and then regenerate code in CCSS each time that you have modified the Arx source code.

Hints:

Exercise GFS-8: Polyphase Implementation of the FIR Filters

The demodulator contains downsampling (by a factor 4) directly after low-pass filtering. This means that a polyphase implementation, exchanging the order of filtering and downsampling is possible (see also the lecture slides).
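The equivalence behind the polyphase idea can be checked numerically with the Python sketch below. It uses arbitrary example taps rather than the coefficients of the reference filter, and the function names are hypothetical.

```python
def fir_then_downsample(x, h, m):
    """Filter x with FIR taps h at the input rate, then keep every m-th output."""
    y = []
    for n in range(0, len(x), m):
        acc = 0
        for k, tap in enumerate(h):
            if n - k >= 0:
                acc += tap * x[n - k]
        y.append(acc)
    return y


def polyphase_decimate(x, h, m):
    """Same result, computed with m sub-filters that run at the output rate."""
    # Sub-filter p holds taps h[p], h[p + m], h[p + 2m], ...
    phases = [h[p::m] for p in range(m)]
    y = []
    for j in range((len(x) + m - 1) // m):
        acc = 0
        for p, taps in enumerate(phases):
            for i, tap in enumerate(taps):
                idx = (j - i) * m - p  # equals n - k with n = j*m, k = i*m + p
                if idx >= 0:
                    acc += tap * x[idx]
        y.append(acc)
    return y


x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, -5, 8, 9]
h = [1, 2, 3, 4, 3, 2, 1]  # example taps only
direct = fir_then_downsample(x, h, 4)
poly = polyphase_decimate(x, h, 4)
```

Both functions perform the same multiply-accumulate operations per output sample; the difference in a hardware realization is the rate at which the arithmetic has to run.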

Consider the advantages and disadvantages of a polyphase implementation in this concrete case. You can involve the following elements in your reasoning: one-to-one implementation vs. scheduled solution, filter symmetry, multiplierless design, power-area-time trade-off. Do not write any code at this time. What is your final recommendation regarding the polyphase implementation? Should it be used or not?

Exercise GFS-9: Processor Solution

The reference design is a one-to-one implementation where the sample frequency of 8 MHz is equal to the system clock frequency. Suppose that the technology available to you allows a system clock of 80 MHz and you can do arithmetic operations (add, multiply, etc.) in one clock cycle.

Design on paper an architecture, in VLIW style, that can perform the entire GFSK reference design for the receiver. First estimate the complexity of this design: how many additions, shifts, multiplications, etc. do you need? Based on this, determine how many adders, multipliers, etc. you would need. Involve the fixed-point formats to determine the word lengths that the arithmetic blocks should have. How much memory (e.g. register files) would be needed? Which interconnection structures would the architecture have? What would the control structure look like? Assuming a VLIW architecture, how many bits would an instruction have?
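The first step of such an estimate is a cycle budget: with an 80 MHz clock and an 8 MHz sample rate there are 10 clock cycles available per input sample. The Python sketch below turns an operation count into a lower bound on the number of functional units. The operation count in the example is a made-up placeholder, the function name is hypothetical, and the bound ignores data dependencies and scheduling conflicts.

```python
import math


def units_needed(ops_per_sample, clock_hz=80_000_000, sample_hz=8_000_000):
    """Lower bound on the number of single-cycle functional units of one
    type needed to execute ops_per_sample operations per sample period."""
    cycles_per_sample = clock_hz // sample_hz
    return math.ceil(ops_per_sample / cycles_per_sample)


# Placeholder: 25 additions per input sample (not derived from the design).
adders = units_needed(25)
```

Stages that run after the downsampling by 4 have four times as many cycles per sample, which can be modeled by passing a lower sample_hz.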

Exercise GFS-10: Free Design Assignment

Make, depending on the time left, minor or major modifications to the Arx code for the GFSK receiver. Follow any of the suggestions below, a subset or a combination or just do something completely different.

All Arx code can be imported into CCSS by wrapping it in PRIM models, as is the case in the reference design. In CCSS the wrapped Arx models should be directly interconnected. The only "native" CCSS blocks allowed between Arx blocks are SampleDown and its variants.

As mentioned earlier, the BER performance of your design is not allowed to degrade more than 0.5 dB with respect to the original CCSS model. Be aware that changing the number of registers in the signal path may affect the system latency. If this is the case in your design, you will need to find the optimal synchronization parameters.

Once your design is ready, run synthesis in order to have an impression of the area and timing. You may synthesize the entire receiver, but you may also consider synthesizing each component separately.

Deliverables

Write a short report, always motivating your choices and explaining how you reached your answers. Particular points of attention:

Grading


Last update on: Wed Mar 2 01:20:11 CET 2016 by Sabih Gerez.