Introduction: Vivado HLS Video IP Block Synthesis

Have you ever wanted to do real-time video processing without adding much latency, or in an embedded system? FPGAs (Field Programmable Gate Arrays) are sometimes used for this; however, writing video processing algorithms in hardware description languages like VHDL or Verilog is frustrating at best. Enter Vivado HLS, the Xilinx tool that lets you program in a C++ environment and generates hardware description language code from it.

Required software:

  • Vivado HLS
  • Vivado
  • (If you use the AXI registers) Vivado SDK

(Optional) Download the Xilinx made examples here:

Xilinx HLS video examples

Step 1: What Is Vivado HLS?

Vivado HLS is a tool that turns C++-like code into hardware structures that can be implemented on an FPGA.
It includes an IDE for doing this development.
Once you have finished developing your HLS code, you can export the generated IP in a format for use with Vivado.

Download the attached files and put them near where you will be creating your project. (Rename them back to "top.cpp" and "top.h" if they have randomized names.)

Step 2: HLS Video Library

The HLS Video Library has documentation with reference designs in this paper:
Another good resource is the Xilinx Wiki page about it.

Start Vivado HLS.

Create a new project.

Take the files that you downloaded in the previous step and add them as source files. (Note: the files are not copied into the project, but instead remain where they are)

Then use the Browse button to select the top function.

On the next page, select the Xilinx part you are using.

Step 3: Synthesizing

Solution => Run C Synthesis => Active Solution

After ~227.218 seconds, it should be done. (Note: your actual synthesis time will vary depending on many factors.)

Step 4: Versioning and Other Info for Export

Version numbers determine how Vivado lets you update the IP in a design. A minor version change can be applied in place, while a major version change requires you to manually add the new block and remove the old one.
If your interfaces have not changed and the version update is a minor one, the update can be done completely automatically by pressing the update IP button.
You can run "report_ip_status" in the Vivado Tcl console to see the status of your IP.

Set the version numbers and other info in Solution => Solution Settings...

Alternatively, these settings can be set during the export.

Step 5: Exporting to a Vivado IP Library

Solution => Export RTL

If you didn't set the IP library details in the previous step, you can do that now.

Step 6: Synthesis and Export Analysis

On this screen we can see the stats about our exported module, showing that it does meet our clock period of 10ns (100MHz) and how much of each resource it uses.

Combining this with our Synthesis Report and our Dataflow analysis, we can see that it takes 317338 clock cycles * 10ns clock period * 14 pipeline stages = 0.04442732 seconds. This means that the total latency added by our image processing is less than one twentieth of a second (when clocked at the targeted 100MHz).
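The arithmetic above can be reproduced with a few lines of C++. This is only a sketch of the same product used in the text (cycles * clock period * pipeline stages); the function and constant names are made up for illustration, and the numbers should be replaced with the ones from your own reports.

```cpp
#include <cassert>
#include <cstdint>

// Figures quoted in the text; substitute the values from your own reports.
constexpr std::uint64_t kCycles = 317338;       // clock cycles from the synthesis report
constexpr std::uint64_t kPipelineStages = 14;   // stages from the dataflow analysis
constexpr double kClockPeriodSeconds = 10e-9;   // 10 ns period, i.e. 100 MHz

// Total latency in seconds, using the same product as the text:
// cycles * clock period * pipeline stages.
// (Hypothetical helper, not part of any Xilinx tool or library.)
inline double total_latency_seconds(std::uint64_t cycles,
                                    std::uint64_t stages,
                                    double clock_period_s) {
    assert(clock_period_s > 0.0);
    return static_cast<double>(cycles * stages) * clock_period_s;
}
```

Calling total_latency_seconds(kCycles, kPipelineStages, kClockPeriodSeconds) gives roughly 0.0444 seconds, which is below the 0.05 seconds that one twentieth of a second corresponds to.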

Step 7: Adding the IP Library in Vivado

To use your synthesized IP block you are going to need to add it to Vivado.

In Vivado, add an IP repository to your project by going to the IP catalog, right-clicking, and selecting "Add Repository..."

Navigate to your Vivado HLS project directory and select your solution directory.

It should report the IP that it found.

Step 8: Doing an Upgrade

Sometimes you need to make changes to your HLS block after including it in a Vivado design.

To do this, make the changes, then resynthesize and export the IP with a higher version number (see Step 4 for details about major/minor version number changes).

After exporting the new version, refresh your IP repositories in Vivado. This happens either when Vivado notices that the IP has changed in the repository, or when you trigger it manually. (Note: if you refresh your IP repositories after the export has started but before it completes in HLS, the IP will temporarily not be there. Wait for the export to finish and refresh again.)

At this point a window should appear, informing you that an IP has been changed on disk and giving you the option to update it with an "Upgrade Selected" button.
If the change was a minor version change and none of the interfaces changed, then pressing that button will automatically replace the old IP with the new one; otherwise, more work may be required.

Step 9: Additional Details and Info

The following steps provide more information on how HLS synthesis works and what you can do with it.

For an example of a project using an HLS synthesized IP block, see this instructable.

Step 10: Output and Input

The inputs and outputs of the final IP block are determined from the synthesizer's analysis of the flow of data into and out of the top function.

As in VHDL or Verilog, HLS allows you to specify details about the connections between IP blocks. These lines are examples of this:

void image_filter(AXI_STREAM& video_in, AXI_STREAM& video_out, int& x, int& y) {
#pragma HLS INTERFACE axis port=video_in bundle=INPUT_STREAM
#pragma HLS INTERFACE axis port=video_out bundle=OUTPUT_STREAM
#pragma HLS INTERFACE s_axilite port=x bundle=CONTROL_BUS offset=0x14
#pragma HLS INTERFACE s_axilite port=y bundle=CONTROL_BUS offset=0x1C

You can see how the ports exposed on the IP block are influenced by these directives.

Step 11: AXI Register Interfacing

A good way to pass inputs and outputs between your IP block and the PS (processing system) is through an AXI interface.

You can specify this in your HLS code, including the offsets used to access the values later, like this:

void image_filter(AXI_STREAM& video_in, AXI_STREAM& video_out, int& x, int& y) {
#pragma HLS INTERFACE s_axilite port=x bundle=CONTROL_BUS offset=0x14
#pragma HLS INTERFACE s_axilite port=y bundle=CONTROL_BUS offset=0x1C

#pragma HLS dataflow
x = 42;

Once connected properly in Vivado, you can access the values using this code in Vivado SDK:

#include "parameters.h"
#define xregoff 0x14
#define yregoff 0x1c

This leaves you with 42 in x and 0xdeadbeef in y.
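The SDK listing above stops at the offset definitions, so the actual register access is not shown. As a rough illustration of the idea, here is a software-only model of offset-based register access. Everything in it apart from the two offsets is hypothetical: FakeRegisterSpace, reg_read, reg_write, and simulate_ip_block are invented for this sketch, and it assumes that y is the value written from the processor side, which matches the stated result but is not confirmed by the listing. On real hardware you would instead read and write the memory-mapped registers at the block's base address (which the listing does not show), for example with the Xil_In32 and Xil_Out32 helpers from xil_io.h.

```cpp
#include <cassert>
#include <cstdint>

// Register offsets taken from the s_axilite pragmas in the HLS code.
constexpr std::uint32_t kXRegOffset = 0x14;
constexpr std::uint32_t kYRegOffset = 0x1C;

// Hypothetical stand-in for the IP block's AXI-Lite register space.
// On real hardware this is a memory-mapped region, not an array in RAM.
constexpr std::uint32_t kNumWords = 16;  // 64 bytes, enough to cover offset 0x1C
struct FakeRegisterSpace {
    std::uint32_t words[kNumWords] = {};
};

// Read a 32-bit register at a byte offset from the start of the space.
inline std::uint32_t reg_read(const FakeRegisterSpace& regs, std::uint32_t byte_offset) {
    assert(byte_offset % 4 == 0 && byte_offset / 4 < kNumWords);
    return regs.words[byte_offset / 4];
}

// Write a 32-bit register at a byte offset from the start of the space.
inline void reg_write(FakeRegisterSpace& regs, std::uint32_t byte_offset, std::uint32_t value) {
    assert(byte_offset % 4 == 0 && byte_offset / 4 < kNumWords);
    regs.words[byte_offset / 4] = value;
}

// Models the "x = 42;" line from the HLS excerpt: the block drives x.
inline void simulate_ip_block(FakeRegisterSpace& regs) {
    reg_write(regs, kXRegOffset, 42);
}
```

In this model, writing 0xdeadbeef to kYRegOffset from the processor side, running simulate_ip_block, and then reading both offsets back yields 42 for x and 0xdeadbeef for y.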

Step 12: Dataflow Pragma

Inside a #pragma HLS dataflow region, the way the code is implemented differs from normal C++. The code is pipelined so that all of the instructions are running at all times on different parts of the data. (Think of it like an assembly line in a factory: each station works continuously, doing one function and passing the result to the next station.)
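To make the assembly-line picture concrete, here is a small idealized model in plain C++. It is not HLS code and says nothing about how the tool actually schedules hardware; it simply assumes that every stage takes exactly one tick per item, and all of the function names are made up for this sketch.

```cpp
#include <cassert>

// Idealized model: every stage takes one tick per item.
// This is an assumption for illustration, not how HLS reports timing.

// Index of the item that `stage` works on at `tick`, or -1 if the stage is idle.
inline int item_at(int stage, int tick, int num_items) {
    int item = tick - stage;  // stage s sees item i at tick i + s
    return (item >= 0 && item < num_items) ? item : -1;
}

// Ticks needed when the stages overlap like stations on an assembly line.
inline int pipelined_ticks(int num_items, int num_stages) {
    assert(num_items > 0 && num_stages > 0);
    return num_items + num_stages - 1;
}

// Ticks needed if each item must pass through every stage before the next starts.
inline int sequential_ticks(int num_items, int num_stages) {
    assert(num_items > 0 && num_stages > 0);
    return num_items * num_stages;
}
```

With 4 items and 3 stages, the overlapped version needs 6 ticks under this model while the one-at-a-time version needs 12, and at tick 3 stage 1 is already working on item 2 while stage 0 handles item 3.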

from the image you can see that each of the directives

Despite appearing to be normal variables, img objects are actually implemented as small buffers between the commands. Using an image as an input to a function "consumes" it and makes it no longer usable. (Hence the need for the duplicate commands)
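The "consumed once" behaviour can be imitated in ordinary C++ with a queue. The sketch below is a software analogy only: PixelStream, duplicate, and sum_stage are hypothetical names invented here, and a std::queue is merely standing in for the small buffers that HLS places between stages.

```cpp
#include <cassert>
#include <queue>

// Hypothetical software stand-in for the small buffer between two stages.
using PixelStream = std::queue<int>;

// Reads every pixel out of `in` and copies it into both outputs.
// After the call `in` is empty: reading is what "consumes" it.
inline void duplicate(PixelStream& in, PixelStream& out_a, PixelStream& out_b) {
    while (!in.empty()) {
        int pixel = in.front();
        in.pop();
        out_a.push(pixel);
        out_b.push(pixel);
    }
}

// Example stage: consumes its input and returns the sum of the pixels.
inline long sum_stage(PixelStream& in) {
    long total = 0;
    while (!in.empty()) {
        total += in.front();
        in.pop();
    }
    return total;
}
```

Once sum_stage has run on a stream, that stream is empty and cannot feed a second stage, so two stages that need the same image each get their own copy from duplicate first.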