get-module sch sch
Three subdirectories of sch will be created:
Every computation is translated into a separate functional unit, every edge in the data-flow graph is a wire bundle and every delay element is a register. The hardware consumes and produces a signal sample every clock cycle. Study the code in sec_par.arx and check for yourself that the code indeed describes a one-to-one mapping of the data-flow graph given in the figure into hardware.
Now run make (type make in the shell). You will see that for both Arx source files, C++ and VHDL will be generated. The warnings are related to the fact that the filter coefficients do not fit the given fixed-point data formats without loss of precision.
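The precision warnings can be illustrated with a minimal sketch. It assumes round-to-nearest and a hypothetical format with 8 fractional bits; the actual formats and rounding mode used by Arx are not stated here and may differ.

```python
def quantize(value, frac_bits):
    """Round value to the nearest multiple of 2**-frac_bits.

    Returns the stored integer, the value it represents, and the error.
    Assumption: round-to-nearest (Python's round() resolves ties to even).
    """
    scale = 1 << frac_bits
    stored = round(value * scale)
    represented = stored / scale
    return stored, represented, represented - value


# 0.75 is 0.11 in binary, so it fits in 8 fractional bits without error.
print(quantize(0.75, 8))
# 0.1 has no finite binary expansion, so some precision is always lost.
print(quantize(0.1, 8))
```

A coefficient such as 0.1 is stored as 26/256, which is the kind of mismatch a precision warning points at.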
Double-click on model arx_sec_par. This is a Prim model that acts as a wrapper for the C++ code generated by Arx. Its content is rather trivial. Arx uses the data type i32, which is not directly understood by CCSS, so the wrapper casts it to and from the data type int, with which the wrapper communicates with the outside world. The wrapper also calls the methods run and reset at the appropriate places.
Now open model tb_sec_soc_par, which is the testbench for arx_sec_par. As in exercise TRA, the testbench feeds the filter with a superposition of a low-frequency and a high-frequency sine wave. As the filter is a high-pass filter, the low-frequency component should be strongly attenuated in the result. Run the testbench and visually verify the expected behavior in Davis. Do not use an scf file; just use the default values for all parameters.
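The expected attenuation can be checked numerically with a small sketch. The coefficients below are hypothetical (a generic second-order high-pass section with cutoff at a quarter of the sample rate), not those in the Arx source, and the two test frequencies are likewise assumptions. Because the filter is linear, each sine of the superposition can be filtered separately.

```python
import math


def highpass_coefficients(cutoff, q=math.sqrt(0.5)):
    """Hypothetical high-pass section; cutoff is a fraction of the sample rate."""
    w0 = 2.0 * math.pi * cutoff
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = (1.0 + math.cos(w0)) / (2.0 * a0)
    b1 = -(1.0 + math.cos(w0)) / a0
    b2 = b0
    # Feedback coefficients are stored negated so that they are added below.
    a1 = 2.0 * math.cos(w0) / a0
    a2 = -(1.0 - alpha) / a0
    return b0, b1, b2, a1, a2


def filter_samples(samples, coeffs):
    """Second-order section with two delay elements z1 and z2."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = z2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + z2
        z2 = b1 * x + z1 + a1 * y   # uses the old z1
        z1 = b2 * x + a2 * y
        out.append(y)
    return out


def amplitude(samples):
    """Amplitude of a sine, estimated from its RMS value."""
    return math.sqrt(2.0 * sum(s * s for s in samples) / len(samples))


def steady_state_gain(freq, coeffs, n=2000):
    """Filter a unit-amplitude sine and measure the second half of the output."""
    sine = [math.sin(2.0 * math.pi * freq * k) for k in range(n)]
    return amplitude(filter_samples(sine, coeffs)[n // 2:])


coeffs = highpass_coefficients(0.25)
low_gain = steady_state_gain(0.01, coeffs)    # low-frequency component
high_gain = steady_state_gain(0.4, coeffs)    # high-frequency component
print(f"low-frequency gain:  {low_gain:.4f}")
print(f"high-frequency gain: {high_gain:.4f}")
```

With these assumed values the low-frequency sine almost disappears while the high-frequency sine passes nearly unchanged, which is the pattern to look for in Davis.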
Different data types are used in the model. List all conversions in the chain from sources to sinks and explain why these conversions are made. Note: the delay element in the chain is not necessary but is there to make the model consistent with the testbench that will perform co-simulation with VHDL.
What are the values of the first 5 samples arriving at the sink? You can use the Browse Data ... (under the right mouse button) in Davis to see the exact values.
The event-driven simulation model for a language such as VHDL is quite different. CCSS provides an interfacing mechanism for an external VHDL simulator such as Questasim. The idea is that the hardware modeled in VHDL is synchronous. At each clock cycle a new token is sent from the data-flow environment to each input of the hardware and a new token is collected from the hardware and sent into the data-flow environment. CCSS also supports schedules where communication only takes place on a specified subset of clock cycles, but this feature will not be used here.
The main advantage of co-simulation is that one can reuse the testbench developed for system-level design for the verification of the RT-level design, possibly after some minor adjustments. In the case of CCSS, all interfacing with the external simulator is automatically generated and the external VHDL code is automatically compiled.
The goal of this exercise is to embed the VHDL code in the CCSS testbench tb_sec_soc_vhdl. Open this model. Do not check this design yet, because the submodel sec_par_std_if is missing. This model is actually tb_sec_soc_par with an additional branch to simulate the VHDL alongside the C++.
Perform the following steps to create the missing submodel:
CCSS will deposit the files that Questasim needs for interfacing in subdirectory tmp/ccss/hdl/ of your home directory.
After you finish the procedure above, you will have the model sec_par_std_if in your library. Instantiate this model in the empty space reserved for it in the testbench. Now, you can check your design. Some VHDL will directly be compiled by Questasim. You can find the diagnostics in the Code Generation tab of the main CCSS window.
The CCSS model bit2bit takes care of the transitions between the fixed-point data types used in the testbench and the SystemC bit vectors of the hardware interface. It simply copies the signals bit by bit.
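As an illustration of what copying bit by bit means, the sketch below converts a signed fixed-point value to a bit vector and back. The word length, the number of fractional bits and the bit ordering are assumptions made for the example; it does not reproduce the bit2bit implementation.

```python
def to_bits(stored, width):
    """Two's-complement bits of a stored integer, least significant bit first."""
    assert -(1 << (width - 1)) <= stored < (1 << (width - 1))
    return [(stored >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits: rebuild the signed stored integer."""
    width = len(bits)
    unsigned = sum(bit << i for i, bit in enumerate(bits))
    # The most significant bit carries the weight -2**(width - 1).
    return unsigned - (1 << width) if bits[-1] else unsigned


# Hypothetical 4-bit format with 3 fractional bits: -0.375 is stored as -3.
FRAC_BITS = 3
stored = round(-0.375 * (1 << FRAC_BITS))
bits = to_bits(stored, 4)
print(bits)                                   # least significant bit first
print(from_bits(bits) / (1 << FRAC_BITS))     # back to the fixed-point value
```

The bit pattern itself never changes; only its interpretation (fixed-point number or plain bit vector) differs on the two sides.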
There is an extra delay at the input side. This model has been inserted to allow the hardware model to execute a reset cycle. Otherwise, the hardware model would lose the first sample of the data stream.
Double-clicking the sec_par_std_if instantiation in the testbench will pop up a window with parameter settings. Change the value of debug to 1. With this setting, the graphical user interface (GUI) of Questasim is launched before the simulation starts. You can then trace waveforms, etc. in the way that you are used to. Be aware that CCSS generates a wrapper around the VHDL code generated by Arx. The Arx model is the submodel with instance name eut (entity under test); the level above is an automatically generated wrapper that takes care of the right interfacing.
For an interactive co-simulation act as follows:
Trace relevant waveforms from Questasim and include them in your report. What are the values of the first 5 output samples that you see in Questasim? How do they compare to the samples of the C++ simulation? Explain possible differences.
If everything went well, the conclusion of this exercise should be that the C++ and VHDL generated from Arx behave exactly the same. This also means that it is not necessary to simulate the VHDL for each design made in Arx. The VHDL will serve primarily as input for synthesis. In practice, it is also wise to perform a post-synthesis simulation. This could be done in the same way as above with a CCSS testbench, but is outside the scope of this course.
If, for some reason, the simulation gets stuck (is in deadlock), terminate the Questasim process from the Linux prompt. First find out the process number by
ps -ef | grep vsim
and then kill the process by typing kill <process number>.
m1: xn * b0
m2: xn * b1
m3: yn * a1
m4: xn * b2
m5: yn * a2
And the additions as:
p1: m1 + z2
p2: m2 + z1
p3: m3 + p2
p4: m4 + m5
Then, an overlapped schedule using 5 clock cycles can be as follows (an @-sign indicates an operation of the previous iteration, a #-sign an operation of the next iteration):
time:  0    1    2    3    4    5 (=0)
*:     m1   m2   m3   m4   m5   m1# ...
+:     p4@  p1   p2   p3        p4
Completing the design requires that the entire data path is specified, including registers, multiplexers, etc. Below, such a data path is shown:
m1: C, x -> R1     p4: R1, R4 -> R4
m2: C, x -> R1     p1: R1, R3 -> R2
m3: C, R2 -> R1    p2: R1, R4 -> R3
m4: C, x -> R1     p3: R1, R3 -> R3
m5: C, R2 -> R4
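The register transfers above can be checked against the data-flow graph with a small simulation. This is a sketch under stated assumptions, not the Arx code: it takes yn to be p1, z2 to be the delayed p3 and z1 to be the delayed p4, assumes that x is held stable during the 5 cycles, and uses hypothetical coefficient values. It does not model the interface timing of the actual design.

```python
def parallel_filter(samples, coeffs):
    """One-to-one evaluation of m1..m5 and p1..p4, one sample per step."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = z2 = 0
    out = []
    for xn in samples:
        m1, m2, m4 = xn * b0, xn * b1, xn * b2
        yn = m1 + z2                 # p1 (assumed to be the output)
        m3, m5 = yn * a1, yn * a2
        p2 = m2 + z1
        p3 = m3 + p2
        p4 = m4 + m5
        z2, z1 = p3, p4              # assumed contents of the delay elements
        out.append(yn)
    return out


class SerialFilter:
    """One multiplier and one adder, following the register transfer table."""

    def __init__(self, coeffs):
        self.b0, self.b1, self.b2, self.a1, self.a2 = coeffs
        self.r1 = self.r2 = self.r3 = self.r4 = 0
        self.t = 0

    def run(self, x):
        """Advance one clock cycle; registers update from their old values."""
        r1, r2, r3, r4 = self.r1, self.r2, self.r3, self.r4
        if self.t == 0:
            self.r1 = self.b0 * x    # m1
            self.r4 = r1 + r4        # p4 of the previous iteration
        elif self.t == 1:
            self.r1 = self.b1 * x    # m2
            self.r2 = r1 + r3        # p1
        elif self.t == 2:
            self.r1 = self.a1 * r2   # m3
            self.r3 = r1 + r4        # p2
        elif self.t == 3:
            self.r1 = self.b2 * x    # m4
            self.r3 = r1 + r3        # p3
        else:
            self.r4 = self.a2 * r2   # m5 (the adder is idle in this cycle)
        self.t = (self.t + 1) % 5
        return self.r2


coeffs = (0.25, -0.5, 0.25, 0.5, -0.25)   # hypothetical b0, b1, b2, a1, a2
samples = [1, 0, 0, 4, -3, 2, 0, 0]
serial = SerialFilter(coeffs)
serial_out = []
for x in samples:
    for _ in range(5):                    # iteration period of 5 clock cycles
        y = serial.run(x)
    serial_out.append(y)
print(serial_out == parallel_filter(samples, coeffs))   # prints True
```

Under these assumptions R2 changes only once per 5 cycles, so reading it after the fifth cycle always returns the sample computed in that iteration.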
The C++ and VHDL for this model were already generated when make was called earlier. The model arx_sec_ser contains the Prim model that interfaces with the C++ coming from Arx. Open it. You will see that the run method of the C++ object is called 5 times in each invocation. This corresponds to the iteration period having value 5. As the design has been made in such a way that the output register only changes once in 5 clock cycles, it does not matter in which of the 5 cycles the value is read and written to the node's output.
Run the simulation and compare the output to the results of the parallel version of the filter. For easy comparison you can either read signals from two simulations in one Davis session or modify the CCSS testbench to run both versions of the filter in parallel.
If everything went well, you should see identical behavior except for a time shift. Explain this time shift.
Directory vhdl contains the generate-design script that you know from the System-on-Chip Design course. Use it to synthesize both the parallel and serial versions of the filter (do not forget to run it via srun). For each design, study the log file and pay special attention to the resource report. It mentions all adders and multipliers to be implemented by Synopsys including word lengths (in the reference report on the other hand, not all adders and multipliers are mentioned as some of them are directly expanded into gates).
For each design explain the information given in the resource report. For each resource, point out from which part of the Arx code it originates.
Now check the areas reported for both designs. Which of the designs is larger? Explain. Synthesize both VHDL descriptions of the filter for a clock period of 10 ns and analyze the results. Which of the two designs is larger? Is that according to expectation? Try to explain the results.
When the Arx code compiles without errors, you can simulate it in CCSS. Use Library -> Add Existing Files ... to make the .h and .cpp files visible in CCSS. Then, create a Prim model that interfaces with the C++ and finally create a testbench for your version of the filter. Simulate and try to make the design produce exactly the same output stream as the two provided versions.
When ready with the design, synthesize the VHDL and discuss the performance figures (area, resources, critical path).
If you have time left, consider one or more other design alternatives. Try especially to reduce the area. Is it a good idea to use one or more multiply-accumulate blocks in the data path? Those of you who work alone should concentrate on a single design version with well-motivated design choices rather than spending time on multiple alternatives for which the design choices are poorly motivated.
vcd2wlf debug.vcd debug.wlf
debug.wlf can now be viewed in Questasim. Call vsim and then open debug.wlf (choose the "log" file type).
This only works when there is a single Arx component in your simulation! Otherwise debug.vcd will get corrupted.