Introduction: Synchronisers, Clock Domain Crossing, Clock Generators, Edge Detectors, Much More - Essential Tweak Circuits

About: Mitu Raj -- Just a Hobbyist and Learner -- Chip Designer -- Software Developer -- Physics and Mathematics Enthusiast

Please note that this blog is now archived and I have started a new website/blog of my own: Chipmunk Logic. I hope you will follow/subscribe there for free content and knowledge, and continue supporting me. From now on, I will publish all my technical blogs there :)

In this blog, we will look at some circuits which I like to call 'Tweak Circuits'. These micro-circuits come in handy for every RTL designer in numerous scenarios like clock domain crossing, reset signalling, internal clock generation etc. Let's see some of these circuits and the scenarios where you may have to (or should !) use them in your design. We are going to play around a bit with clocks in these designs. I am presenting these circuits from my experience as an RTL designer so far. Kindly leave feedback if you have queries/suggestions :-)

All the code is in VHDL; I will try to add Verilog as well later on. But the ideas are the same in any HDL ;-)

I will be adding more content soon. Do follow me for updates.

Step 1: Multi-flop Synchroniser

Why?:

When you send a single-bit signal from one clock domain to another (asynchronous) clock domain, you SHOULD synchronise it to the destination clock domain to avoid metastability. For this purpose, we have to use flip-flop synchronisers. The traditional way is to use a 2-flop synchroniser. Here I present the VHDL code for a configurable multi-flop synchroniser, in which you can configure more than 2 flops in the synchroniser chain to increase MTBF, i.e. reduce the probability of metastability. I use this synchroniser to pass signals between synchronous clock domains of different clock periods as well, when I don't want to put any multi-cycle path constraints on such paths.
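
As a rough guide to why extra flops help, a commonly used first-order model of synchroniser MTBF is sketched below. The symbols are assumptions that are not defined elsewhere in this blog: t_r is the time available for a metastable flop to resolve, tau and T_W are device-dependent constants, f_clk is the destination clock frequency and f_data is the toggle rate of the asynchronous signal.

```latex
\mathrm{MTBF} = \frac{e^{\,t_r/\tau}}{T_W \cdot f_{clk} \cdot f_{data}}
```

Under this model, each additional flop in the chain adds roughly one destination clock period to t_r, so MTBF grows exponentially with the number of stages. A faster clock both shrinks t_r and raises f_clk, which is one way to see why more than 2 flops are suggested for high-speed designs.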

Application Notes:

After metastability, data may settle to either the correct value or the past value. So, even if you are using synchronisers, you have to make sure that the source signal is at least two destination-clock cycles long, so that the signal is latched correctly at the destination clock domain by at least one clock edge. Generally, hand-written RTL code is not recommended for clock domain crossing in designs (especially for data synchronisation). Specially hardened CDC cells are used to achieve this. For example, one popular RTL method is writing a behavioural design of a mux-based data synchroniser. But it is noted that while this works in RTL behavioural simulation, synthesisers in ASIC/FPGA may give you a glitch-prone netlist which may fail CDC in gate-level simulation, and hence fail in the actual hardware as well ! Note that putting an individual 2-flop synchroniser on each bit of a data bus is NOT a safe CDC either !

VHDL Code, Synthesis Notes:

Please note that you have to add a special placement constraint called ASYNC_REG if you are implementing this using Vivado. Altera should have a similar attribute (which, unfortunately, I am not sure of). This attribute makes sure that all flops in the synchroniser chain are placed close to each other in the FPGA for better MTBF. It is recommended to constrain these paths with a maximum-delay constraint rather than a false-path constraint.

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Single-bit Synchroniser for Clock Domain Crossing   
-- Description    : - To synchronise control signals of one bit between clock domains
--                  - Configurable no. of flip-flops in the synchroniser chain         
-- Date           : 05-07-2019
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : Attributes are important for proper FPGA implementation, cross check synthesised design
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE;
use IEEE.STD_LOGIC_1164.all;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity synchronizer is
    Generic (STAGES : natural := 2)     ;     -- Recommended 2 flip-flops for low speed designs; >2 for high speed
    Port ( 
          clk           : in std_logic  ;     -- Clock
          rstn          : in std_logic  ;     -- Synchronous Reset
          async_sig_i   : in std_logic  ;     -- Asynchronous signal in
          sync_sig_o    : out std_logic       -- Synchronized signal out
          );
end synchronizer;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture Behavioral of synchronizer is

--------------------------------------------------------------------------------------------------------------------
-- Synchronisation Chain of Flip-Flops
--------------------------------------------------------------------------------------------------------------------
signal flipflops : std_logic_vector(STAGES-1 downto 0);
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- These attributes are native to XST and Vivado Synthesisers.
-- They make sure that the synchronisers are not optimised to shift register primitives.
-- They are correctly implemented in the FPGA, by placing them together in the same slice.
-- Maximise MTBF while place and route.
-- Altera has different attributes.
--------------------------------------------------------------------------------------------------------------------
attribute ASYNC_REG : string;
attribute ASYNC_REG of flipflops: signal is "true";
--------------------------------------------------------------------------------------------------------------------

begin   

   -- Synchroniser process
   clk_proc: process(clk)
             begin
                if rising_edge(clk) then
                   if (rstn = '0') then
                      flipflops <= (others => '0') ;
                   else                                                        
                      flipflops <= flipflops(flipflops'high-1 downto 0) & async_sig_i;
                   end if;                      
                end if;
             end process;

   -- Synchronised signal out
   sync_sig_o <= flipflops(flipflops'high);

end Behavioral;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
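
The synchroniser above can be dropped into a design with a direct entity instantiation. The following is only a minimal sketch: the wrapper name button_sync_top and the signals btn_i and btn_sync_o are hypothetical, and it assumes the synchronizer entity has been compiled into library work.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Hypothetical wrapper; button_sync_top, btn_i and btn_sync_o are illustrative names
entity button_sync_top is
   port (
      clk        : in  std_logic ;   -- Destination clock
      rstn       : in  std_logic ;   -- Synchronous active-low reset
      btn_i      : in  std_logic ;   -- Asynchronous input
      btn_sync_o : out std_logic     -- Input synchronised to clk
   ) ;
end entity ;

architecture rtl of button_sync_top is
begin

   -- Three flops in the chain instead of the default two
   sync_inst : entity work.synchronizer
      generic map (STAGES => 3)
      port map (
         clk         => clk,
         rstn        => rstn,
         async_sig_i => btn_i,
         sync_sig_o  => btn_sync_o
      ) ;

end architecture ;
```

The output appears STAGES clock cycles after the input changes, so downstream logic should treat btn_sync_o as a delayed copy of the input.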

Step 2: Asynchronous Reset Synchroniser

Scenario:

This is another circuit which most of you need, yet choose to overlook. Most of the time when we prototype a design on an FPGA, we use synchronous reset in the design. Yet, the reset comes from a push button or switch on the board, which is an asynchronous input that you can press at any moment you like! So, at least for prototyping purposes, you may want to put a synchroniser on the reset path to be on the safer side and avoid possible metastability. This will make sure both assertion and de-assertion of the reset are properly synchronised to the whole design.

VHDL Code:

Below is the code for Asynchronous Reset Synchronizer:

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Asynchronous Reset Synchronizer   
-- Description    : Configurable no. of flip-flops in the synchroniser chain         
-- Date           : 13-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : Attributes are important for proper FPGA implementation, cross check synthesised design
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE;
use IEEE.STD_LOGIC_1164.all;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity areset_sync is
    Generic (STAGES : natural := 2)     ;     -- Recommended 2 flip-flops for low speed designs; >2 for high speed
    Port ( 
          clk           : in std_logic  ;     -- Clock          
          async_rst_i   : in std_logic  ;     -- Asynchronous Reset in
          sync_rst_o    : out std_logic       -- Synchronized Reset out
          );
end areset_sync;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture Behavioral of areset_sync is

--------------------------------------------------------------------------------------------------------------------
-- Synchronisation Chain of Flip-Flops
--------------------------------------------------------------------------------------------------------------------
signal flipflops : std_logic_vector(STAGES-1 downto 0);
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- These attributes are native to XST and Vivado Synthesisers.
-- They make sure that the synchronisers are not optimised to shift register primitives.
-- They are correctly implemented in the FPGA, by placing them together in the same slice.
-- Maximise MTBF while place and route.
-- Altera has different attributes.
--------------------------------------------------------------------------------------------------------------------
attribute ASYNC_REG : string;
attribute ASYNC_REG of flipflops: signal is "true";
--------------------------------------------------------------------------------------------------------------------

begin

   sync_rst_o <= flipflops(flipflops'high);  -- Synchronised Reset out

   -- Synchroniser process
   clk_proc: process(clk)
             begin
                if rising_edge(clk) then                                                                         
                   flipflops <= flipflops(flipflops'high-1 downto 0) & async_rst_i;
                end if;           
             end process;

end Behavioral;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------

Step 3: Asynchronous Reset Synchronised De-assertion

Scenario:

This is another scenario overlooked by many in digital design. Whether to use synchronous or asynchronous reset is one of the hottest and most debated topics in the VLSI design realm. If you are using asynchronous resets, it is probably because you want a faster reset, especially when you have a slow and a fast clock in your design and you don't want to stretch synchronous resets for a long time. Nevertheless, it comes with a catch. Asynchronous resets have to satisfy recovery and removal timing checks. While assertion is fine, de-assertion can violate these timing checks, and the timing analyser can never analyse it because of the asynchronous nature of the signal. So the take-away is: de-assertion has to be synchronised to the clock domain to which the asynchronous reset is applied.

VHDL Code:

Below is the code for the synchroniser for de-assertion of asynchronous reset:

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Synchroniser for de-assertion of Asynchronous Reset   
-- Description    : Configurable no. of flip-flops in the synchroniser chain         
-- Date           : 13-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : Attributes are important for proper FPGA implementation, cross check synthesised design
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE                ;
use IEEE.std_logic_1164.all ;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity areset_deassert_sync is
   
   generic (
              CHAINS      : natural   := 2   ;        -- No. of flip-flops in the synchronization chain; the pulse
                                                      -- width of reset assertion is at least the same
              RST_POL     : std_logic := '1'          -- Polarity of Asynchronous Reset
           ) ;

   port    (
              clk         : in  std_logic    ;        -- Clock
              async_rst_i : in  std_logic    ;        -- Asynchronous Reset
              sync_rst_o  : out std_logic             -- Asynchronous Reset with de-assertion synchronized
           ) ;

end entity ;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture behav of areset_deassert_sync is

--------------------------------------------------------------------------------------------------------------------
-- Synchronisation Chain of Flip-Flops
--------------------------------------------------------------------------------------------------------------------
signal flipflops : std_logic_vector(CHAINS-1 downto 0) ;
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- These attributes are native to XST and Vivado Synthesisers.
-- They make sure that the synchronisers are not optimised to shift register primitives.
-- They are correctly implemented in the FPGA, by placing them together in the same slice.
-- Maximise MTBF while place and route.
-- Altera has different attributes.
--------------------------------------------------------------------------------------------------------------------
attribute ASYNC_REG : string;
attribute ASYNC_REG of flipflops: signal is "true";
--------------------------------------------------------------------------------------------------------------------

begin

-- Synchronizer process
process (clk, async_rst_i)
begin
   
   if (async_rst_i = RST_POL) then
      flipflops <= (others => RST_POL)                                  ;
   elsif (rising_edge (clk)) then
      flipflops <= flipflops(flipflops'high-1 downto 0) & (not RST_POL) ;
   end if ;

end process ;

-- Reset out with synchronized de-assertion
sync_rst_o <= flipflops(flipflops'high) ;

end architecture ;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
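
To show where this block sits in a design, here is a minimal sketch in which the raw reset goes only to the synchroniser and every other flop uses the synchroniser output as its asynchronous reset. The names reset_demo, btn_rst_i, d_i, q_o and rst_int are hypothetical, and it assumes areset_deassert_sync is compiled into library work.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical top level; reset_demo, btn_rst_i, d_i and q_o are illustrative names
entity reset_demo is
   port (
      clk       : in  std_logic ;   -- Clock
      btn_rst_i : in  std_logic ;   -- Raw asynchronous reset, active-high
      d_i       : in  std_logic ;   -- Data in
      q_o       : out std_logic     -- Registered data out
   ) ;
end entity ;

architecture rtl of reset_demo is

   signal rst_int : std_logic ;     -- Reset with synchronised de-assertion

begin

   -- Only the synchroniser sees the raw reset
   rst_sync_inst : entity work.areset_deassert_sync
      generic map (CHAINS => 2, RST_POL => '1')
      port map (
         clk         => clk,
         async_rst_i => btn_rst_i,
         sync_rst_o  => rst_int
      ) ;

   -- Design flops are reset asynchronously, but released in step with clk
   process (clk, rst_int)
   begin
      if rst_int = '1' then
         q_o <= '0' ;
      elsif rising_edge (clk) then
         q_o <= d_i ;
      end if ;
   end process ;

end architecture ;
```

Since rst_int is released by a flop clocked on clk, the timing analyser can check recovery and removal on the reset pins of the design flops like any other synchronous path.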

Step 4: Pulse/Toggle Synchroniser

Challenge:

The next one is a bit tricky: how do you synchronise a pulse from one clock domain to another? A two-flop synchroniser fails miserably when passing a pulse from a fast clock to a slow clock. The native pulse synchroniser works well but may miss pulses that are generated on consecutive cycles. So a handshake-based pulse synchroniser is preferred. The only rule is: a pulse should be generated only when the 'busy' signal is low.

VHDL Code:

Below is the code for Handshake-based Pulse/Toggle Synchroniser for Clock Domain Crossing:

<p>--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Handshake-based Pulse/Toggle Synchroniser for Clock Domain Crossing   
-- Description    : - Synchronises single-cycle pulse from Clock domain A to Clock domain B
--                  - Handshake-based synchroniser for safe and reliable transfers
--                  - Configurable no. of flip-flops in the synchroniser chains         
-- Date           : 13-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : Attributes are important for proper FPGA implementation, cross check synthesised design
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE                ;
use IEEE.STD_LOGIC_1164.all ;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity pulse_sync is

    Generic (   
               STAGES        : natural := 2         -- Recommended = 2 flops; >2 for high speed 
            ) ;     
             
    Port    ( 
               clk_a         : in  std_logic  ;     -- Clock of Domain-A
               rstn_a        : in  std_logic  ;     -- Synchronous Reset of Domain-A
               clk_b         : in  std_logic  ;     -- Clock of Domain-B
               rstn_b        : in  std_logic  ;     -- Synchronous Reset of Domain-B
               pulseA_i      : in  std_logic  ;     -- Pulse originated at Domain-A
               pulseB_o      : out std_logic  ;     -- Synchronized pulse generated at Domain-B
               busy_o        : out std_logic        -- Busy processing the pulse from Domain-A
            );

end Entity;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture Behavioral of pulse_sync is

--------------------------------------------------------------------------------------------------------------------
-- Synchronisation Chain of Flip-Flops
--------------------------------------------------------------------------------------------------------------------
signal flipflops_a   : std_logic_vector (STAGES-1 downto 0) ;        -- At Domain-A
signal flipflops_b   : std_logic_vector (STAGES-1 downto 0) ;        -- At Domain-B
--------------------------------------------------------------------------------------------------------------------

-- Other signals/registers
signal pulseA_regA   : std_logic  ;        -- Pulse sampled at Domain-A
signal busyB         : std_logic  ;        -- Busy signal from Domain-B
signal busyB_delayed : std_logic  ;        -- Busy signal from Domain-B cycle-delayed
signal busyB_syncA   : std_logic  ;        -- Busy signal from Domain-B synchronised to Domain-A

--------------------------------------------------------------------------------------------------------------------
-- These attributes are native to XST and Vivado Synthesisers.
-- They make sure that the synchronisers are not optimised to shift register primitives.
-- They are correctly implemented in the FPGA, by placing them together in the same slice.
-- Maximise MTBF while place and route.
-- Altera has different attributes.
--------------------------------------------------------------------------------------------------------------------
attribute ASYNC_REG : string                          ;
attribute ASYNC_REG of flipflops_a : signal is "true" ;
attribute ASYNC_REG of flipflops_b : signal is "true" ;
--------------------------------------------------------------------------------------------------------------------

begin

-- Synchroniser at Domain-A that synchronises the busy signal generated at Domain-B
process (clk_a)
begin
   
   if rising_edge (clk_a) then 

      if (rstn_a = '0') then
         flipflops_a <= (others => '0')                                   ;
      else
         flipflops_a <= flipflops_a (flipflops_a'high-1 downto 0) & busyB ;              	         
      end if ;

   end if ;

end process ;

-- Pulse sampler at Domain-A, converts the pulse to level based on the busy status from Domain-B
process (clk_a)
begin
   
   if rising_edge (clk_a) then 

      if (rstn_a = '0') then
         pulseA_regA <= '0'                                             ;
      else
         pulseA_regA <= pulseA_i or (pulseA_regA and (not busyB_syncA)) ;              	         
      end if ;

   end if ;

end process ;

-- Synchroniser at Domain-B that synchronises the sampled pulse from Domain-A
process (clk_b)
begin
   
   if rising_edge (clk_b) then 

      if (rstn_b = '0') then
         flipflops_b   <= (others => '0')                                         ;
         busyB_delayed <= '0'                                                     ;
      else
         flipflops_b   <= flipflops_b (flipflops_b'high-1 downto 0) & pulseA_regA ;
         -- Generate the delayed busyB signal
         busyB_delayed <= flipflops_b (flipflops_b'high)                          ;              	         
      end if ;     

   end if ;

end process ;

-- Concurrent assignments
busyB       <= flipflops_b (flipflops_b'high) ;
busyB_syncA <= flipflops_a (flipflops_a'high) ;
busy_o      <= busyB_syncA or pulseA_regA     ;
pulseB_o    <= busyB and (not busyB_delayed)  ;

end Architecture ;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------</p>
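
The rule that a pulse may only be generated while 'busy' is low can be enforced on the sending side with a small gating process. The following is a sketch under assumptions: pulse_sender, req_i, pulse_a and busy are hypothetical names, and pulse_sync is assumed to be compiled into library work. Note the extra (not pulse_a) term: busy_o rises one clk_a cycle after the pulse, so without this term a held request could produce two back-to-back pulses.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Hypothetical wrapper; pulse_sender, req_i, pulse_a and busy are illustrative names
entity pulse_sender is
   port (
      clk_a    : in  std_logic ;   -- Clock of Domain-A
      rstn_a   : in  std_logic ;   -- Synchronous Reset of Domain-A
      clk_b    : in  std_logic ;   -- Clock of Domain-B
      rstn_b   : in  std_logic ;   -- Synchronous Reset of Domain-B
      req_i    : in  std_logic ;   -- Level request in Domain-A
      pulseB_o : out std_logic     -- Pulse delivered in Domain-B
   ) ;
end entity ;

architecture rtl of pulse_sender is

   signal pulse_a : std_logic ;    -- Single-cycle pulse in Domain-A
   signal busy    : std_logic ;    -- Busy flag from the synchroniser

begin

   -- Issue a one-cycle pulse only when the synchroniser is idle
   process (clk_a)
   begin
      if rising_edge (clk_a) then
         if rstn_a = '0' then
            pulse_a <= '0' ;
         else
            pulse_a <= req_i and (not busy) and (not pulse_a) ;
         end if ;
      end if ;
   end process ;

   sync_inst : entity work.pulse_sync
      generic map (STAGES => 2)
      port map (
         clk_a    => clk_a,
         rstn_a   => rstn_a,
         clk_b    => clk_b,
         rstn_b   => rstn_b,
         pulseA_i => pulse_a,
         pulseB_o => pulseB_o,
         busy_o   => busy
      ) ;

end architecture ;
```

If req_i stays high, this sketch sends a new pulse each time the handshake completes; a design that wants exactly one pulse per request would clear the request once pulse_a has fired.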

Step 5: Data Synchroniser

Challenge:

Synchronising a control signal is quite easy, but multi-bit data is not. This is mainly because different bits of the data may settle to different values after metastability. So when you read the data register at the destination clock domain, you are never sure if you are sampling the right data or not.

Techniques:

There are mainly 3 techniques to synchronise data across clock domains:

  • Asynchronous/dual-clock FIFO: putting a FIFO between the clock domains is the most reliable technique and has the highest throughput. FIFOs consume a lot of hardware resources though, hence the next two techniques.
  • Two-way handshaking: a low-bandwidth technique that employs two signals: data-ready and data-acknowledge. It is very reliable, as the source domain can always make sure that the destination domain has captured the data. But it has the lowest throughput due to the latency introduced by handshaking.
  • Mux-based synchroniser: quite popular, with better throughput than handshaking. But it has no handshaking and hence can be unreliable if not used properly. It makes use of a data valid signal which is pulsed along with the valid data. Only this control signal is synchronised to the destination clock domain. The destination clock domain registers the incoming data only if data valid is high; otherwise the past data remains latched. Use this technique only if data comes at low throughput from the source clock domain (data doesn't change every clock cycle). Otherwise, some data will be lost by the destination domain.

VHDL Code:

Use this module only for FPGA synthesis. In ASICs, this works at RTL level, but may synthesise to a glitch-prone netlist, as I discussed in the Application Notes of Step 1 of this blog. Below is the VHDL code for the mux-based data synchroniser:

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Mux-based Data Synchroniser for Clock Domain Crossing   
-- Description    : - To synchronise data between clock domains using data ready synchroniser + mux
--                  - Configurable no. of flip-flops in the synchroniser chain         
-- Date           : 17-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : Attributes are important for proper FPGA implementation, cross check synthesised design
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE                ;
use IEEE.STD_LOGIC_1164.all ;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity data_sync is

   Generic (
              STAGES : natural := 2 ;          -- Recommended 2 flip-flops for low speed designs; >2 for high speed
              DWIDTH : natural := 8            -- Data width
           ) ;     

   Port    ( 
               clk      : in  std_logic                            ;     -- Clock
               rstn     : in  std_logic                            ;     -- Synchronous Reset               
               din      : in  std_logic_vector (DWIDTH-1 downto 0) ;     -- Asynchronous Data in
               dready_i : in  std_logic                            ;     -- Asynchronous Data ready in
               dout     : out std_logic_vector (DWIDTH-1 downto 0) ;     -- Synchronous Data out
               dready_o : out std_logic                                  -- Synchronous Data ready out  
            ) ;

end Entity ;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture Behavioral of data_sync is

--------------------------------------------------------------------------------------------------------------------
-- Synchronisation Chain of Flip-Flops for Data ready
--------------------------------------------------------------------------------------------------------------------
signal flipflops : std_logic_vector (STAGES-1 downto 0) ;
--------------------------------------------------------------------------------------------------------------------

-- Data ready signal synchronised
signal dready_sync : std_logic ;

--------------------------------------------------------------------------------------------------------------------
-- These attributes are native to XST and Vivado Synthesisers.
-- They make sure that the synchronisers are not optimised to shift register primitives.
-- They are correctly implemented in the FPGA, by placing them together in the same slice.
-- Maximise MTBF while place and route.
-- Altera has different attributes.
--------------------------------------------------------------------------------------------------------------------
attribute ASYNC_REG              : string           ;
attribute ASYNC_REG of flipflops : signal is "true" ;
--------------------------------------------------------------------------------------------------------------------

begin  

   -- Synchroniser process for Data ready
   clk_proc: process (clk)
             begin

                if rising_edge (clk) then

                   if rstn = '0' then
                      flipflops <= (others => '0') ;
                   else                                                        
                      flipflops <= flipflops(flipflops'high-1 downto 0) & dready_i ;
                   end if ;  

                end if ;

             end process ;

   -- Register process for Data in
   reg_proc: process (clk)
             begin

                if rising_edge (clk) then

                   if rstn = '0' then

                      dout     <= (others => '0') ;
                      dready_o <= '0'             ;

                   else

                      if dready_sync = '1' then
                         dout  <= din             ;    -- Mux + register logic
                      end if ; 

                      dready_o <= dready_sync     ;

                   end if ;

                end if ;

             end process ;

   -- Synchronised signal out
   dready_sync <= flipflops(flipflops'high) ;

end Behavioral ;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
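
Because the module above copies din into dout on every destination clock edge where the synchronised data ready is high, the source side has to keep din stable for that whole window, including the few destination cycles after dready_i has dropped while the synchroniser chain is still flushing. Below is a minimal source-side sketch of that idea. Everything in it is hypothetical (the entity data_src_hold, its ports and the generics HOLD_CYCLES and GUARD_CYCLES), and suitable generic values depend on the ratio of the two clocks and on STAGES, so they must be chosen per design.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Hypothetical source-side helper; all names and generics are illustrative
entity data_src_hold is
   generic (
      DWIDTH       : natural  := 8 ;   -- Data width
      HOLD_CYCLES  : positive := 4 ;   -- Source cycles for which data ready is high
      GUARD_CYCLES : natural  := 4     -- Extra source cycles data is held after ready drops
   ) ;
   port (
      clk_src  : in  std_logic ;                               -- Source clock
      rstn     : in  std_logic ;                               -- Synchronous active-low reset
      wr_i     : in  std_logic ;                               -- Write strobe, ignored while busy
      wdata_i  : in  std_logic_vector (DWIDTH-1 downto 0) ;    -- Data to send
      busy_o   : out std_logic ;                               -- High while a transfer is in flight
      din_o    : out std_logic_vector (DWIDTH-1 downto 0) ;    -- To din of data_sync
      dready_o : out std_logic                                 -- To dready_i of data_sync
   ) ;
end entity ;

architecture rtl of data_src_hold is

   constant TOTAL  : positive := HOLD_CYCLES + GUARD_CYCLES ;
   signal count    : natural range 0 to TOTAL-1 ;
   signal busy_r   : std_logic ;
   signal dready_r : std_logic ;   -- Registered, so the crossing signal is glitch-free

begin

   process (clk_src)
   begin
      if rising_edge (clk_src) then
         if rstn = '0' then
            count    <= 0 ;
            busy_r   <= '0' ;
            dready_r <= '0' ;
            din_o    <= (others => '0') ;
         elsif busy_r = '0' then
            -- Idle: accept a new word and start the hold window
            if wr_i = '1' then
               din_o    <= wdata_i ;
               dready_r <= '1' ;
               busy_r   <= '1' ;
               count    <= TOTAL - 1 ;
            end if ;
         else
            -- Drop data ready after HOLD_CYCLES, keep data frozen for the guard period
            if count = GUARD_CYCLES then
               dready_r <= '0' ;
            end if ;
            if count = 0 then
               busy_r <= '0' ;
            else
               count <= count - 1 ;
            end if ;
         end if ;
      end if ;
   end process ;

   busy_o   <= busy_r ;
   dready_o <= dready_r ;

end architecture ;
```

In this sketch dready_o stays high for HOLD_CYCLES source clock cycles and din_o cannot change for HOLD_CYCLES + GUARD_CYCLES cycles after a write is accepted. This trades throughput for safety, which matches the low-throughput use case of the mux-based technique.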

Step 6: Edge Detectors

Scenario and VHDL Code:

Edge detectors are another set of useful tweak circuits you may need in your design. They can be used to detect a change in the state of the design and simply send a pulse. One useful application is to convert level-sensitive interrupts to edge-triggered interrupts. Example code for an edge detector in VHDL is given below.

The circuit takes in a signal and outputs:

  • zero-delay rising-edge, falling-edge and either-edge detection.
  • cycle-delayed rising-edge, falling-edge and either-edge detection.

Interestingly, the zero-delay rising edge detector outputs '1' on reset if the signal was '1' during the reset. While this should not be a problem in the design, if you still don't want it, you can tweak it as well, as given in the code.

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Edge Detector   
-- Description    : Rising and Falling edge detecting circuit        
-- Date           : 05-07-2019
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : NIL
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
entity edge_detector is

port ( 
        -- Global signals
        clk               : in  std_logic;      -- Clock
        rst               : in  std_logic;      -- Sync active-low Reset       
              
        sig_in            : in  std_logic;      -- Signal in 
        
        -- Cycle-delayed edge detectors
        sig_out_r         : out std_logic;      -- Rising edge detector
        sig_out_f         : out std_logic;      -- Falling edge detector
        sig_out_rf        : out std_logic;      -- Either edge detector

        -- Zero-cycle-delay edge detectors
        sig_out_r_imm     : out std_logic;      -- Rising edge detector
        sig_out_f_imm     : out std_logic;      -- Falling edge detector
        sig_out_rf_imm    : out std_logic;      -- Either edge detector

        -- Zero-cycle-delay edge detectors which are cycle-glitch-free on reset
        sig_out_r_imm_gl  : out std_logic;      -- Rising edge detector      
        sig_out_rf_imm_gl : out std_logic       -- Either edge detector
     );

end entity;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
architecture archi of edge_detector is

-- Internal signals/registers
signal sig_in_delayed : std_logic;

begin
process (clk) 
begin
   if rising_edge(clk) then
   
      if rst = '0' then
         sig_out_r      <= '0';
         sig_out_f      <= '0';
         sig_out_rf     <= '0';
         sig_in_delayed <= '0';
      else
         
         -- Pulse for only one cycle
         sig_out_r      <= '0';
         sig_out_f      <= '0';
         sig_out_rf     <= '0';  

         -- Generate one cycle delayed version of sig_in       
         sig_in_delayed <= sig_in;
         
         -- Detect rising edge of sig_in      
         if sig_in_delayed = '0' and sig_in = '1' then
            sig_out_r  <= '1';
            sig_out_rf <= '1';
         end if;

         -- Detect falling edge of sig_in
         if sig_in_delayed = '1' and sig_in = '0' then
            sig_out_f  <= '1';
            sig_out_rf <= '1';
         end if;
         
      end if;

   end if;   

end process;

-- Zero-cycle-delay edge detectors
sig_out_r_imm  <= sig_in and (not sig_in_delayed) ;  
sig_out_f_imm  <= (not sig_in) and sig_in_delayed ;
sig_out_rf_imm <= sig_in xor sig_in_delayed       ;

-- Zero-cycle-delay edge detectors cycle-glitch-free on reset
sig_out_r_imm_gl  <= sig_in and (not sig_in_delayed) and rst  ;    
sig_out_rf_imm_gl <=  (sig_in xor sig_in_delayed) and rst     ;

end archi;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
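
A quick way to see the difference between the zero-delay and the cycle-delayed outputs is a small self-checking testbench. The following is only a sketch: the testbench name, the 10 ns clock period, the signal names and the stimulus are assumptions made for illustration and are not part of the original sources. It assumes sig_in changes away from the rising clock edge.

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Edge Detector Testbench (illustrative sketch)
-- Comments       : Names, clock period and stimulus are assumed, not from the original repository.
--------------------------------------------------------------------------------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;

entity edge_detector_tb is
end entity;

architecture sim of edge_detector_tb is

constant CLK_PERIOD : time := 10 ns;         -- Assumed clock period

signal clk      : std_logic := '0';
signal rst      : std_logic := '0';          -- Active-low reset, asserted at start
signal sig_in   : std_logic := '0';
signal done     : boolean   := false;        -- Stops the clock when the stimulus ends
signal r, f, rf             : std_logic;     -- Cycle-delayed outputs
signal r_imm, f_imm, rf_imm : std_logic;     -- Zero-cycle-delay outputs
signal r_gl, rf_gl          : std_logic;     -- Reset-masked zero-cycle-delay outputs

begin

-- Free-running clock until the stimulus is finished
clk <= not clk after CLK_PERIOD/2 when not done else '0';

dut : entity work.edge_detector
   port map (
      clk => clk, rst => rst, sig_in => sig_in,
      sig_out_r => r, sig_out_f => f, sig_out_rf => rf,
      sig_out_r_imm => r_imm, sig_out_f_imm => f_imm, sig_out_rf_imm => rf_imm,
      sig_out_r_imm_gl => r_gl, sig_out_rf_imm_gl => rf_gl
   );

stim : process
begin
   -- Hold reset for a few cycles, then release it away from the rising edge
   for i in 1 to 3 loop
      wait until falling_edge(clk);
   end loop;
   rst <= '1';
   wait until falling_edge(clk);

   -- Rising edge on sig_in: the zero-delay outputs react within the same cycle
   sig_in <= '1';
   wait for 1 ns;
   assert r_imm = '1' and rf_imm = '1' report "zero-delay rising edge missed" severity error;

   -- The registered outputs pulse after the next rising clock edge
   wait until falling_edge(clk);
   assert r = '1' and rf = '1' and r_imm = '0' report "cycle-delayed rising edge missed" severity error;

   -- The registered pulse lasts for one cycle only
   wait until falling_edge(clk);
   assert r = '0' and rf = '0' report "pulse longer than one cycle" severity error;

   -- Falling edge on sig_in
   sig_in <= '0';
   wait for 1 ns;
   assert f_imm = '1' and rf_imm = '1' report "zero-delay falling edge missed" severity error;
   wait until falling_edge(clk);
   assert f = '1' and rf = '1' report "cycle-delayed falling edge missed" severity error;

   report "edge_detector_tb finished" severity note;
   done <= true;
   wait;
end process;

end architecture;
--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------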

Step 7: Reset Stretcher

Need and VHDL Code:

If you have multiple synchronous clock domains in the design running at different clock frequencies, you need to synchronise the external reset and generate a global reset whose minimum pulse width equals the time period of the slowest clock, so that it resets all clock domains. You can use a pulse synchroniser to generate this synchronous reset pulse that will reset your whole design. However, if you want to stretch the reset to multiple clock cycles for architecture-specific reasons, you can achieve this by cascading a Reset Stretcher circuit. Below is the VHDL code for the Reset Stretcher:

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Synchronous Reset Stretcher   
-- Description    : Configurable no. of flip-flops in the stretcher chain.        
-- Date           : 16-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : Attributes make sure the flops are placed close to each other on FPGA.
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE                ;
use IEEE.STD_LOGIC_1164.all ;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity reset_stretcher is

    Generic (
               PERIOD      : natural   := 4   ;     -- How many clock cycles to be stretched by
               RST_POL     : std_logic := '1'       -- Polarity of Synchronous Reset   
            ) ;     

    Port    ( 
               clk         : in  std_logic    ;     -- Clock
               rst_i       : in  std_logic    ;     -- Synchronous Reset in
               rst_o       : out std_logic          -- Stretched Synchronous Reset out
            ) ;

end Entity ;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture Behavioral of reset_stretcher is

--------------------------------------------------------------------------------------------------------------------
-- Stretcher Chain : Synchronous Chain of Flip-Flops
--------------------------------------------------------------------------------------------------------------------
signal flipflops : std_logic_vector (PERIOD-1 downto 0) ;
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- These attributes are native to XST and Vivado Synthesisers.
-- They make sure that the synchronisers are not optimised to shift register primitives.
-- They are correctly implemented in the FPGA, by placing them together in the same slice.
-- Maximise MTBF while place and route.
-- Altera has different attributes.
--------------------------------------------------------------------------------------------------------------------
attribute ASYNC_REG              : string           ;
attribute ASYNC_REG of flipflops : signal is "true" ;
--------------------------------------------------------------------------------------------------------------------

begin

process (clk) 
begin
   
   if rising_edge (clk) then
      
      if rst_i = RST_POL then
         flipflops <= (others => RST_POL)                                  ;
      else
         flipflops <= flipflops(flipflops'high-1 downto 0) & (not RST_POL) ;                 
      end if ;

   end if ;

end process ;

-- Stretched Synchronous Reset out
rst_o <= flipflops (flipflops'high) ;

end Architecture ;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
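
To check how far the stretcher extends a pulse, a small testbench can count the cycles for which rst_o stays asserted. This is a sketch under assumptions: the testbench name, the 10 ns clock period and the one-cycle input pulse are chosen for illustration only. With the default PERIOD of 4 and active-high polarity, a one-cycle pulse on rst_i should come out as a four-cycle pulse on rst_o.

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Reset Stretcher Testbench (illustrative sketch)
-- Comments       : Names, clock period and stimulus are assumed, not from the original repository.
--------------------------------------------------------------------------------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;

entity reset_stretcher_tb is
end entity;

architecture sim of reset_stretcher_tb is

constant CLK_PERIOD : time    := 10 ns;      -- Assumed clock period
constant PERIOD     : natural := 4;          -- Same as the default of the stretcher

signal clk   : std_logic := '0';
signal rst_i : std_logic := '0';
signal rst_o : std_logic;
signal done  : boolean   := false;           -- Stops the clock when the stimulus ends

begin

-- Free-running clock until the stimulus is finished
clk <= not clk after CLK_PERIOD/2 when not done else '0';

dut : entity work.reset_stretcher
   generic map (PERIOD => PERIOD, RST_POL => '1')
   port map (clk => clk, rst_i => rst_i, rst_o => rst_o);

stim : process
   variable high_cycles : natural := 0;
begin
   -- Let the chain flush its power-up 'U' values
   for i in 1 to PERIOD + 2 loop
      wait until falling_edge(clk);
   end loop;

   -- One-cycle reset pulse, sampled by exactly one rising clock edge
   rst_i <= '1';
   wait until falling_edge(clk);
   rst_i <= '0';

   -- Count the cycles for which rst_o stays asserted
   for i in 1 to 3*PERIOD loop
      if rst_o = '1' then
         high_cycles := high_cycles + 1;
      end if;
      wait until falling_edge(clk);
   end loop;

   assert high_cycles = PERIOD
      report "expected " & integer'image(PERIOD) & " cycles, got " & integer'image(high_cycles)
      severity error;

   report "reset_stretcher_tb finished" severity note;
   done <= true;
   wait;
end process;

end architecture;
--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------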

Step 8: Minimum Width Resetter

Need and VHDL Code:

Sometimes a minimum pulse width is required at the reset input. You want to propagate the reset in your design only if the reset is asserted/de-asserted for at least a minimum number of clock cycles. Any reset pulse narrower than that is considered a 'glitch' that should not reset the design. In such cases, you can implement a min. width resetter.

Below is the VHDL code for Minimum Width Resetter:

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Minimum Pulse Width Reset Logic   
-- Description    : Ensures minimum pulse width for proper reset assertion and de-assertion at input and output.
--                  Used for glitch filtering.
--                  Configurable reset polarity, Configurable min. pulse width to be recognized at input.        
-- Date           : 16-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : Attributes make sure that the sync flops are placed close to each other on FPGA.
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE                ;
use IEEE.STD_LOGIC_1164.all ;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity min_width_reset is

    Generic (
               MIN_WIDTH   : natural   := 4   ;     -- Minimum pulse width of reset to be recognized, [2-16] 
               RST_POL     : std_logic := '0'       -- Polarity of Synchronous Reset   
            ) ;     

    Port    ( 
               clk         : in  std_logic    ;     -- Clock
               rst_i       : in  std_logic    ;     -- Synchronous Reset in
               rst_o       : out std_logic          -- Synchronous Reset out with min. pulse width assured  
            ) ;

end Entity ;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture behavioral of min_width_reset is

--------------------------------------------------------------------------------------------------------------------
-- Pulse Width Validator Chain : Synchronous Chain of Flip-Flops
--------------------------------------------------------------------------------------------------------------------
signal sync_chain_rg     : std_logic_vector (MIN_WIDTH-1 downto 0) := (others => not RST_POL) ;
--------------------------------------------------------------------------------------------------------------------

-- Temporary signals
signal temp_level0       : std_logic_vector (MIN_WIDTH-1 downto 0) ;
signal temp_level1       : std_logic_vector (MIN_WIDTH-1 downto 0) ;

-- Muxed resets
signal muxed_sync_rst    : std_logic                               ;
signal muxed_sync_rst_rg : std_logic := RST_POL                    ;

--------------------------------------------------------------------------------------------------------------------
-- These attributes are native to XST and Vivado Synthesisers.
-- They make sure that the synchronisers are not optimised to shift register primitives.
-- They are correctly implemented in the FPGA, by placing them together in the same slice.
-- Maximise MTBF while place and route.
-- Altera has different attributes.
--------------------------------------------------------------------------------------------------------------------
attribute ASYNC_REG                  : string           ;
attribute ASYNC_REG of sync_chain_rg : signal is "true" ;
--------------------------------------------------------------------------------------------------------------------

begin

-- Clocked process
process (clk) 
begin
   
   if rising_edge (clk) then
      sync_chain_rg     <= sync_chain_rg(sync_chain_rg'high-1 downto 0) & rst_i ;
      muxed_sync_rst_rg <= muxed_sync_rst                                       ;                       
   end if ;

end process ;

-- Generate statement to self-OR and self-AND all bits of Synchronizer chain
temp_level0 (0) <= sync_chain_rg (0) ;
temp_level1 (0) <= sync_chain_rg (0) ;

gen: for i in 1 to MIN_WIDTH-1 generate
      
     temp_level1 (i) <= temp_level1 (i-1) and sync_chain_rg (i) ;
     temp_level0 (i) <= temp_level0 (i-1) or sync_chain_rg (i)  ;  

end generate ; 

-- Muxed reset: the output changes level only when all bits in the chain agree on the new level
muxed_sync_rst <= temp_level0 (MIN_WIDTH-1) when muxed_sync_rst_rg = '1' else temp_level1 (MIN_WIDTH-1) ;

-- Reset out
rst_o <= muxed_sync_rst_rg ;

end Architecture ;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
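
The glitch-filtering behaviour can be exercised with a testbench that first applies a pulse shorter than MIN_WIDTH and then one that is exactly MIN_WIDTH cycles long. The sketch below assumes the default generics (MIN_WIDTH = 4, active-low reset); the testbench name, the 10 ns clock period and the pulse lengths are illustrative assumptions, not part of the original sources.

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Minimum Width Resetter Testbench (illustrative sketch)
-- Comments       : Names, clock period and stimulus are assumed, not from the original repository.
--------------------------------------------------------------------------------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;

entity min_width_reset_tb is
end entity;

architecture sim of min_width_reset_tb is

constant CLK_PERIOD : time    := 10 ns;      -- Assumed clock period
constant MIN_WIDTH  : natural := 4;          -- Same as the default of the resetter

signal clk   : std_logic := '0';
signal rst_i : std_logic := '1';             -- Active-low reset, de-asserted at start
signal rst_o : std_logic;
signal done  : boolean   := false;           -- Stops the clock when the stimulus ends

begin

-- Free-running clock until the stimulus is finished
clk <= not clk after CLK_PERIOD/2 when not done else '0';

dut : entity work.min_width_reset
   generic map (MIN_WIDTH => MIN_WIDTH, RST_POL => '0')
   port map (clk => clk, rst_i => rst_i, rst_o => rst_o);

stim : process
begin
   -- Let the output leave its power-up reset state
   for i in 1 to MIN_WIDTH loop
      wait until falling_edge(clk);
   end loop;
   assert rst_o = '1' report "reset not released after start-up" severity error;

   -- Glitch: reset asserted for only two cycles, must be filtered out
   rst_i <= '0';
   wait until falling_edge(clk);
   wait until falling_edge(clk);
   rst_i <= '1';
   for i in 1 to 2*MIN_WIDTH loop
      wait until falling_edge(clk);
      assert rst_o = '1' report "glitch propagated to rst_o" severity error;
   end loop;

   -- Valid reset: asserted for MIN_WIDTH cycles, must propagate
   rst_i <= '0';
   for i in 1 to MIN_WIDTH loop
      wait until falling_edge(clk);
   end loop;
   rst_i <= '1';
   wait until falling_edge(clk);
   assert rst_o = '0' report "valid reset did not propagate" severity error;

   -- The output releases only after MIN_WIDTH cycles of de-asserted input
   for i in 1 to MIN_WIDTH loop
      wait until falling_edge(clk);
   end loop;
   assert rst_o = '1' report "reset was not released" severity error;

   report "min_width_reset_tb finished" severity note;
   done <= true;
   wait;
end process;

end architecture;
--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------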

Step 9: Clock Divider

Why not?:

So, the idea is to divide the system clock down to a lower frequency and use it to clock a module or a big system. It is one of the most popular techniques, and yet not a good thing to do in an FPGA design. During my early days I used to do the same, but with experience I figured out that this is never a good idea, as it routes the divided clock through LUT and switch-matrix paths on FPGAs, resulting in poor skew balancing and slew rate. Also, because a new clock domain is introduced, the crossing timing paths become a headache to constrain as multi-cycle paths to make sure that your design meets timing. Somehow, for frequencies below 25 MHz, this has worked fine for me. But taking this clock outside the FPGA to clock an external module is still a VERY BAD idea. Time to befriend PLLs/MMCMs.

VHDL Code:

Anyway, here is the code for the Clock Divider:

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Clock Divider   
-- Description    : Configurable Clock Divider         
-- Date           : 14-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : -
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE                ;
use IEEE.STD_LOGIC_1164.all ;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity clock_divider is

    Generic (
               DV            : natural := 4         -- Clock division factor > 1, multiples of 2
            ) ;     

    Port    ( 
               clk           : in  std_logic  ;     -- Clock
               rstn          : in  std_logic  ;     -- Synchronous active-low Reset
               clk_o         : out std_logic        -- Divided Clock out
            ) ;

end Entity ;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture Behavioral of clock_divider is

-- Internal signals/registers
signal clk_rg : std_logic                   ;        -- Clock out register
signal count  : integer range 0 to DV/2 - 1 ;        -- Counter

begin

-- Clock divider process
process (clk)
begin
   
   if rising_edge (clk) then
      
      if rstn = '0' then
         
         clk_rg <= '0';
         count  <= 0  ;

      else    

         if (count = DV/2 - 1) then
            count  <= 0          ;
            clk_rg <= not clk_rg ;
         else
            count  <= count + 1  ;
         end if ;

      end if ;
      
   end if ;

end process ;

-- Clock out
clk_o <= clk_rg ;

end Architecture ;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
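
Because the counter toggles clk_o every DV/2 cycles using integer division, an odd DV silently divides by DV-1 instead (for example, DV = 5 gives a divide-by-4). One way to guard against this is a thin wrapper with a static assertion on the generic. The wrapper name below is hypothetical and the sketch is not part of the original sources; how static assertions are reported during synthesis depends on the tool.

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Clock Divider with generic check (illustrative sketch)
-- Comments       : Wrapper name is hypothetical; static assert handling in synthesis is tool-dependent.
--------------------------------------------------------------------------------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;

entity clock_divider_checked is

   generic (
      DV    : natural := 4                  -- Clock division factor, must be even and >= 2
   );

   port (
      clk   : in  std_logic;                -- Clock
      rstn  : in  std_logic;                -- Synchronous active-low Reset
      clk_o : out std_logic                 -- Divided Clock out
   );

end entity;

architecture wrapper of clock_divider_checked is
begin

-- Evaluated once at the start of simulation; stops on an illegal division factor
assert DV >= 2 and DV mod 2 = 0
   report "clock_divider: DV must be an even number >= 2"
   severity failure;

u_div : entity work.clock_divider
   generic map (DV => DV)
   port map (clk => clk, rstn => rstn, clk_o => clk_o);

end architecture;
--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------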

Step 10: Pulse Generator

Why?:

I have already talked about why internal clock generation should be avoided in FPGA designs. However, if you don't want to use a PLL/MMCM and would rather achieve the same functionality with bare RTL code, what you might want to implement is a pulse generator or a clock-enable generator. Let's look at a scenario:

Say you want a 25 MHz clock derived from a 100 MHz system clock, and want to use this slower clock to clock another module X. What you can do is generate a pulse every four cycles of the 100 MHz clock and use this pulse as a global clock-enable for the flip-flops/registers in Module X. With this clock-enable, Module X is still clocked by the 100 MHz system clock, but works as if it were clocked at 25 MHz. Hence, your whole design is still synchronous and in the same clock domain. This makes things much easier in terms of timing, without any of the drawbacks of crude internal clock generators!

VHDL Code:

Below is the code for Pulse/Clock-enable Generator:

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Pulse Generator   
-- Description    : Generates a pulse of one cycle at user defined intervals.     
-- Date           : 14-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : -
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE                ;
use IEEE.STD_LOGIC_1164.all ;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity pulse_generator is

    Generic (
               PERIOD        : natural := 4         -- Interval of how many clock cycles, > 1
            ) ;     

    Port    ( 
               clk           : in  std_logic  ;     -- Clock
               rstn          : in  std_logic  ;     -- Synchronous active-low Reset
               pulse_o       : out std_logic        -- Pulse out
            ) ;

end Entity ;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture Behavioral of pulse_generator is

-- Internal signals/registers
signal pulse_rg : std_logic                   ;        -- Pulse out register
signal count    : integer range 0 to PERIOD-1 ;        -- Counter

begin

-- Pulse generator process
process (clk)
begin
   
   if rising_edge (clk) then
      
      if rstn = '0' then
         
         pulse_rg <= '0';
         count    <= 0  ;

      else    
         
         if (count = PERIOD-1) then
            count    <= 0         ;
            pulse_rg <= '1'       ;
         else
            pulse_rg <= '0'       ;         
            count    <= count + 1 ;
         end if ;

      end if ;
      
   end if ;

end process ;

-- Pulse out
pulse_o <= pulse_rg ;

end Architecture ;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
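
On the consuming side, the 100 MHz / 25 MHz scenario described above could be wired up roughly as follows. This is a minimal sketch: module_x here is a hypothetical 8-bit counter standing in for the real Module X, and the names ce_top and ce_25mhz are assumptions made for illustration. The key point is that every register in Module X is clocked by clk and only updates when the enable pulse is high.

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Clock-enable usage example (illustrative sketch)
-- Comments       : module_x and ce_top are hypothetical names, not from the original repository.
--------------------------------------------------------------------------------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity module_x is

   port (
      clk     : in  std_logic;                        -- System clock (100 MHz in the scenario)
      rstn    : in  std_logic;                        -- Synchronous active-low Reset
      ce      : in  std_logic;                        -- Clock-enable pulse
      count_o : out std_logic_vector(7 downto 0)      -- Advances once per enable pulse
   );

end entity;

architecture rtl of module_x is

signal count_rg : unsigned(7 downto 0);

begin

process (clk)
begin
   if rising_edge (clk) then
      if rstn = '0' then
         count_rg <= (others => '0');
      elsif ce = '1' then
         -- Registers update only on enabled cycles, i.e. at the effective 25 MHz rate
         count_rg <= count_rg + 1;
      end if;
   end if;
end process;

count_o <= std_logic_vector(count_rg);

end architecture;

library IEEE;
use IEEE.std_logic_1164.all;

entity ce_top is

   port (
      clk     : in  std_logic;
      rstn    : in  std_logic;
      count_o : out std_logic_vector(7 downto 0)
   );

end entity;

architecture rtl of ce_top is

signal ce_25mhz : std_logic;                          -- One pulse every four clk cycles

begin

u_pulse : entity work.pulse_generator
   generic map (PERIOD => 4)
   port map (clk => clk, rstn => rstn, pulse_o => ce_25mhz);

u_x : entity work.module_x
   port map (clk => clk, rstn => rstn, ce => ce_25mhz, count_o => count_o);

end architecture;
--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------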

Step 11: Clock Gate

Why?:

Clock Gating circuits are used in power-saving applications to shut down the clock to modules when they are not in use. Clock gating thus significantly reduces dynamic power consumption. However, RTL-based Clock Gates are not recommended for FPGA-based designs, as they infer latches and the synthesiser complains about combinational logic on clock routes. If this warning is ignored, the synthesiser routes the gated clock through LUTs and switch matrices instead of the dedicated global clock routes. This will result in a poor design with unpredictable timing. So instead of RTL, either dedicated Clock Gating cells from the library or the clock-enable on flip-flops should be used for FPGA synthesis. In ASIC-based designs, however, clock tree synthesis is much more flexible and in the designer's hands, and hence RTL-based Clock Gates are fine to use.

VHDL Code:

Below is the code for a latch-based Clock Gate:

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Clock Gate  
-- Description    : Latch-based circuit to gate input clock and generate a gated clock.       
-- Date           : 16-02-2021
-- Designed By    : Mitu Raj, iammituraj@gmail.com
-- Comments       : Not recommended to synthesise on FPGAs.
--------------------------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------------------------------
-- LIBRARIES
--------------------------------------------------------------------------------------------------------------------
Library IEEE                ;
use IEEE.STD_LOGIC_1164.all ;

--------------------------------------------------------------------------------------------------------------------
-- ENTITY DECLARATION
--------------------------------------------------------------------------------------------------------------------
Entity clock_gate is

    Port    ( 
               clk_i    : in  std_logic  ;     -- Clock in
               en_i     : in  std_logic  ;     -- Gate enable 
               clk_o    : out std_logic        -- Gated Clock out
            ) ;

end Entity ;

--------------------------------------------------------------------------------------------------------------------
-- ARCHITECTURE DEFINITION
--------------------------------------------------------------------------------------------------------------------
Architecture Behavioral of clock_gate is

-- Latched Gate Enable signal
signal en_latched : std_logic ;

begin

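-- Transparent-low latch: the enable is captured while the clock is low and held while it is high,
-- so a change on en_i during the high phase cannot glitch the gated clock
en_latched <= en_i when clk_i = '0' ;
-- AND gate generates the gated clock
clk_o      <= clk_i and en_latched  ;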

end Architecture ;

--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------
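
For FPGA targets, the dedicated-cell route mentioned above could look like the sketch below on Xilinx devices, using the BUFGCE global clock buffer primitive from the UNISIM library. This is an assumption-laden illustration: it is Xilinx-specific, the wrapper name clock_gate_fpga is hypothetical, and other vendors provide different primitives. Check the device's library guide before relying on it.

--------------------------------------------------------------------------------------------------------------------
-- Design Name    : Clock Gate using a dedicated clock buffer (illustrative sketch)
-- Comments       : Xilinx-specific; wrapper name is hypothetical, not from the original repository.
--------------------------------------------------------------------------------------------------------------------
library IEEE;
use IEEE.std_logic_1164.all;

library UNISIM;
use UNISIM.vcomponents.all;

entity clock_gate_fpga is

   port (
      clk_i : in  std_logic;     -- Clock in
      en_i  : in  std_logic;     -- Gate enable
      clk_o : out std_logic      -- Gated Clock out, driven onto a global clock route
   );

end entity;

architecture rtl of clock_gate_fpga is
begin

-- Global clock buffer with clock enable: the gating happens inside the dedicated clock resource
u_bufgce : BUFGCE
   port map (
      I  => clk_i,
      CE => en_i,
      O  => clk_o
   );

end architecture;
--------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------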

Step 12: All Source Codes - Github Repository

Please find and download all source codes for free on my GitHub:

my_tweak_circuits

For queries: iammituraj@gmail.com

Regards,

Mitu

Step 13: Bonus

Design of a simple Reset Controller in RTL

Some of the concepts discussed in this blog have been put together to design a simple Reset Controller in RTL. Do check out how it was designed and implemented.