Introduction: Vivado HLS Video IP Block Synthesis
Have you ever wanted to do real-time processing on video without adding much latency, or in an embedded system? FPGAs (Field Programmable Gate Arrays) are sometimes used to do this; however, writing video processing algorithms in hardware description languages like VHDL or Verilog is frustrating at best. Enter Vivado HLS, the Xilinx tool that lets you program in a C++ environment and generate hardware description language code from it.
Required software:
- Vivado HLS
- Vivado
- (If you use the AXI registers) Vivado SDK
(Optional) Download the Xilinx made examples here:
Step 1: What Is Vivado HLS?
Vivado HLS is a tool used to turn C++-like code into hardware structures that can be implemented on an FPGA.
It includes an IDE for doing this development.
Once you have completed your development of the code for HLS you can export your generated IP in a format for use with Vivado.
Download the attached files and put them near where you will be creating your project. (Rename them back to "top.cpp" and "top.h" if they have randomized names.)
Step 2: HLS Video Library
The HLS Video Library has documentation with reference designs in this paper:
XAPP1167
Another good resource is the Xilinx Wiki page about it.
Start Vivado HLS.
Create a new project.
Take the files that you downloaded in the previous step and add them as source files. (Note: the files are not copied into the project, but instead remain where they are)
Then use the Browse button to select the top function.
On the next page, select the Xilinx part you are using.
Step 3: Synthesizing
Solution => Run C Synthesis => Active Solution
After ~227.218 seconds, it should be done. (Note: your actual synthesis time will vary depending on many factors.)
Step 4: Versioning and Other Info for Export
Version numbers determine how Vivado lets you update the IP in a design. A minor version change can be applied in place, while a major version change requires you to manually add the new block and remove the old one.
If your interfaces have not changed and the update is a minor version change, it can be done automatically by pressing the update IP button.
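The upgrade rule described above can be written down as a small decision function. This is only a sketch of the rule as stated in this step, not of anything Vivado exposes: the `IpVersion` struct and the `can_upgrade_in_place` function are hypothetical names made up for illustration.

```cpp
#include <cassert>

// Hypothetical model of an IP version number (major.minor), for illustration only.
struct IpVersion {
    int major_num;
    int minor_num;
};

// Returns true when, per the rule in this step, the block should be
// upgradable in place: same major version and unchanged interfaces.
bool can_upgrade_in_place(IpVersion old_v, IpVersion new_v, bool interfaces_changed) {
    assert(old_v.major_num >= 0 && new_v.major_num >= 0);
    if (interfaces_changed) {
        return false;  // changed ports need manual rewiring
    }
    return new_v.major_num == old_v.major_num;  // only minor changes are in place
}
```

For example, going from version 1.0 to 1.1 with unchanged interfaces returns true, while going from 1.0 to 2.0, or changing a port, returns false.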
You can run "report_ip_status" in the Vivado tcl console to see the status of your IP.
Set the version numbers and other info in Solution => Solution Settings...
Alternatively, these settings can be set during the export.
Step 5: Exporting to a Vivado IP Library
Solution => Export RTL
If you didn't set the IP library details in the previous step, you can do that now.
Step 6: Synthesis and Export Analysis
On this screen we can see the stats about our exported module, showing that it does meet our clock period of 10ns (100MHz) and how much of each resource it uses.
Combining this with our Synthesis Report and our Dataflow analysis, we can estimate the latency as 317338 clock cycles * 10 ns clock period * 14 pipeline stages = 0.04442732 seconds. This means that the total latency added by our image processing is less than one twentieth of a second (when clocked at the targeted 100 MHz).
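To redo this estimate for a different design or clock, the same arithmetic can be wrapped in a small helper. This is a sketch that uses the formula from this step as-is; the function name `estimated_latency_seconds` is made up for illustration, and the inputs are whatever your own reports show.

```cpp
#include <cassert>

// Latency estimate using the same formula as the text:
// clock cycles * clock period * number of pipeline stages.
double estimated_latency_seconds(long long cycles, double clock_period_ns,
                                 int pipeline_stages) {
    assert(cycles >= 0 && clock_period_ns > 0.0 && pipeline_stages > 0);
    const double seconds_per_cycle = clock_period_ns * 1e-9;  // ns to s
    return static_cast<double>(cycles) * seconds_per_cycle * pipeline_stages;
}
```

Calling `estimated_latency_seconds(317338, 10.0, 14)` reproduces the roughly 0.0444 second figure above.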
Step 7: Adding the IP Library in Vivado
To use your synthesized IP block you are going to need to add it to Vivado.
In Vivado, add an IP repository to your project by going to the IP catalog, right-clicking, and selecting "Add Repository..."
Navigate to your Vivado HLS project directory and select your solution directory.
It should report the IP that it found.
Step 8: Doing an Upgrade
Sometimes you need to make changes to your HLS block after including it in a Vivado design.
To do this, you can make the changes and resynthesize and export the IP with a higher version number (see details in earlier step about major/minor version number changes).
After exporting the new version, refresh your IP repositories in Vivado. This happens either when Vivado notices that the IP has changed in the repository, or when you trigger it manually. (Note: if you refresh your IP repositories after starting the export in HLS but before it completes, the IP will temporarily be missing. Wait for the export to finish and refresh again.)
At this point a window should appear telling you that an IP has changed on disk and giving you the option to update it with an "Upgrade Selected" button.
If the change was a minor version change and none of the interfaces changed, then pressing that button will automatically replace the old IP with the new one, otherwise more work may be required.
Step 9: Additional Details and Info
The following steps provide more information on how HLS synthesis works and what you can do with it.
For an example of a project using an HLS synthesized IP block, see this instructable.
Step 10: Output and Input
Outputs and Inputs to the final IP block are determined from an analysis the synthesizer does of the flow of data in and out of the top function.
As in VHDL or Verilog, HLS allows you to specify details about the connections between IP blocks. These lines are examples of this:
void image_filter(AXI_STREAM& video_in, AXI_STREAM& video_out, int& x, int& y) {
#pragma HLS INTERFACE axis port=video_in bundle=INPUT_STREAM
#pragma HLS INTERFACE axis port=video_out bundle=OUTPUT_STREAM
#pragma HLS INTERFACE s_axilite port=x bundle=CONTROL_BUS offset=0x14
#pragma HLS INTERFACE s_axilite port=y bundle=CONTROL_BUS offset=0x1C
You can see how the ports exhibited on the IP block are influenced by these directives.
Step 11: AXI Register Interfacing
A good way to pass values between your IP block and the PS (the processing system) is through an AXI-Lite register interface.
You can specify this in your HLS code, including the offsets used to access the values later, like this:
void image_filter(AXI_STREAM& video_in, AXI_STREAM& video_out, int& x, int& y) {
#pragma HLS INTERFACE s_axilite port=x bundle=CONTROL_BUS offset=0x14
#pragma HLS INTERFACE s_axilite port=y bundle=CONTROL_BUS offset=0x1C
#pragma HLS dataflow

    x = 42;
    y = 0xDEADBEEF;
}
Once connected properly in Vivado, you can access the values using this code in Vivado SDK:
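#include "xparameters.h"  // generated by the SDK; defines the XPAR_* base addresses
#include "xil_io.h"       // declares Xil_In32

#define xregoff 0x14
#define yregoff 0x1c

x = Xil_In32(XPAR_IMAGE_FILTER_0_S_AXI_CONTROL_BUS_BASEADDR + xregoff);
y = Xil_In32(XPAR_IMAGE_FILTER_0_S_AXI_CONTROL_BUS_BASEADDR + yregoff);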
After this runs, x holds 42 and y holds 0xDEADBEEF.
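If you want to check the offset arithmetic on a desktop machine before touching hardware, a register read can be mocked with a plain array. This is a minimal host-side sketch, not Xilinx code: `read_reg32` and `MockControlBus` are hypothetical names, and the sketch assumes that a register read amounts to a 32-bit volatile load from base address plus byte offset. Only the offsets 0x14 and 0x1C come from the pragmas above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Register offsets taken from the HLS interface pragmas in the text.
constexpr std::size_t kXRegOffset = 0x14;
constexpr std::size_t kYRegOffset = 0x1C;

// Hypothetical stand-in for a 32-bit register read: a volatile load from
// base + byte offset. On hardware, base would be the mapped control bus address.
std::uint32_t read_reg32(const volatile std::uint32_t* base, std::size_t byte_offset) {
    assert(byte_offset % 4 == 0);  // 32-bit registers sit on 4-byte boundaries
    return base[byte_offset / 4];
}

// Host-side mock of the control bus: 8 words cover byte offsets 0x00..0x1C.
struct MockControlBus {
    volatile std::uint32_t words[8] = {};
    MockControlBus() {
        words[kXRegOffset / 4] = 42;           // value the IP writes to x
        words[kYRegOffset / 4] = 0xDEADBEEFu;  // value the IP writes to y
    }
};
```

Reading `read_reg32(bus.words, kXRegOffset)` from a `MockControlBus bus` gives 42, and the read at `kYRegOffset` gives 0xDEADBEEF, mirroring what the SDK code reads from the real block.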
Step 12: Dataflow Pragma
Inside a region marked with #pragma HLS dataflow, the code is implemented differently from normal C++. The code is pipelined so that all of the instructions run at all times on different parts of the data. (Think of it like an assembly line in a factory: each station works continuously on one function and passes its result to the next station.)
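The benefit of the assembly-line arrangement can be seen with a little arithmetic. The sketch below is an idealized model, not a description of what the HLS tool actually generates: it assumes every stage takes exactly one cycle per item and never stalls, and both function names are made up for illustration.

```cpp
#include <cassert>

// Cycles needed if each item passes through every stage before the next
// item starts (no overlap between items).
long long sequential_cycles(long long items, int stages) {
    assert(items >= 0 && stages > 0);
    return items * stages;
}

// Cycles for an ideal pipeline: the first item takes `stages` cycles to
// come out, then one more item completes on every following cycle.
long long pipelined_cycles(long long items, int stages) {
    assert(items >= 0 && stages > 0);
    return items == 0 ? 0 : stages + items - 1;
}
```

Under these assumptions, 1000 items through 4 stages take 4000 cycles one at a time but only 1003 cycles when pipelined, which is why keeping every station busy matters.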
from the image you can see that each of the directives
Despite appearing to be normal variables, img objects are actually implemented as small buffers between the commands. Using an image as an input to a function "consumes" it and makes it no longer usable. (Hence the need for the duplicate commands)
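The "consumed once" behaviour can be modelled in plain C++ with a FIFO queue. This is a sketch for intuition only and does not use the HLS video library: `PixelStream`, `duplicate`, and `sum_pixels` are hypothetical names, and a `std::queue<int>` stands in for an image object.

```cpp
#include <cassert>
#include <queue>

// Hypothetical stand-in for an HLS image object: a FIFO of pixel values.
using PixelStream = std::queue<int>;

// Reads every pixel from src (emptying it) and writes a copy to both outputs.
void duplicate(PixelStream& src, PixelStream& dst1, PixelStream& dst2) {
    assert(&src != &dst1 && &src != &dst2);  // writing back into src would never finish
    while (!src.empty()) {
        const int px = src.front();
        src.pop();
        dst1.push(px);
        dst2.push(px);
    }
}

// A consumer stage: sums the pixels and leaves the stream empty afterwards.
long long sum_pixels(PixelStream& src) {
    long long total = 0;
    while (!src.empty()) {
        total += src.front();
        src.pop();
    }
    return total;
}
```

After `sum_pixels(a)` runs once, `a` is empty and a second call returns 0. To feed the same pixels to two stages, the stream has to be split with `duplicate` first.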