A sorting algorithm is an algorithm that puts elements of a list in a certain order.The most-used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the use of other algorithms (such as search and merge algorithms) that require sorted lists to work correctly; it is also often useful for canonicalizing data and for producing human-readable output. More formally, the output must satisfy two conditions:

1. The output is in non decreasing order (each element is no smaller than the previous element according to the desired total order);

2. The output is a permutation (reordering) of the input.

-Sorting is one of the key functions

required for many applications such as decoders for digital communication, digital signal processing, VLSI CAD etc. As a consequence,there is tremendous interest in speeding up sorting in software as well as hardware.

-The time taken in the sorting depends on number of words in case of software algorithms The improvement we’re trying to make in this project is to make the time dependent upon the number of bits per word k and not the number of words

**Problem Statement :**

Traditional sorting algorithms such as bubble sort, insertion sort, merge sort and many more have a base case which requires the swapping of two data elements at a time. Even while using a combination of two sorting algorithms such as using quick sort and merge sort together to sort an array of numbers, the base case still involves swapping only two data elements at a given time. Hence the motivation to explore sorting algorithms implemented via hardware stems from the fact that through hardware one is able to better exploit parallelism, thereby sorting many numbers simultaneously.

Parallelism not only ensures concurrency but also aims at reducing the time complexity of algorithms. Hence the problem at hand is to devise a new sorting algorithm and implement it in hardware such that it extracts parallelism.

## Step 1: Algorithm Analysis

Sorting based on Parallel Decoding and Subsequent Encoding :

•This algorithm takes advantage of the fact that once a k bit binary number is decoded it will occupy one of the 2k positions. Hence if there are N distinct numbers then each of these N k-bit numbers will occupy a different position out of the 2k available output levels. The outputs of each decoder are bitwise OR-ed with the respective outputs of all the other decoders to obtain a 2k bit OR signal. This OR-ed signal is used to generate all the N numbers in decreasing order. At any given level “i”, there will be two numbers generated in parallel- one which occupies (N- i)th rank and one that occupies the ith rank. A next priority generator circuit is present at each level to make the numbers invalid as they get ranked.

## Step 2: Hardware Description

•The hardware implementation of the aforementioned sorting algorithm comprises of the following parts: decoders, OR-ing circuit, priority encoders (both high –to- low and low- to- high), and next priority generator circuits. These components add up to form a combinational circuit as shown below in the figure.

•The decoder circuits employed are used to decode N (k bit) numbers. For every number only one of the decoder outputs will get activated. For N distinct numbers, N distinct decoder lines will get activated

•The OR-ing circuit comprises of 2k (N-input) OR gates (since k bit numbers are used as inputs to the sorter). The ith bit of all the N decoder is fed to an N input OR gate to get the ith bit of the OR-ing circuit. All 2k bits can be generated in a similar manner. N output lines of this OR-ing circuit will be high (logic 1) corresponding to N distinct numbers.

•Once after obtaining an OR-ed signal, comprising of 2k output lines, out of which N output lines are high, priority generators are employed to generate numbers in a decreasing(increasing) order. The priority encoders employed here are either high-to-low priority or low-to-high priority encoders. The first high-to-low and low-to-high priority encoders are fed directly with the OR-ing circuits output, thereby generating the highest and the lowest numbers simultaneously. Subsequent priority encoders are fed from next priority generator’s output.

•The next priority generator circuit is a circuit that eliminates a number once it has been generated (synonymous with getting “ranked”). This ensures that the remaining numbers get re-prioritized and hence attain a higher priority than before. This is achieved by first decoding the generated number, then bit wise complementing the decoded output. This complemented result is AND-ed with the originally generated OR-ed output. In this manner the number which initially activated the ith bit of the decoder’s output will no longer do so.

•The hardware generates all the numbers in a decreasing order (from highest to lowest) which is similar to providing ranks.

The verilog code for the realization of the circuit design is attached herewith.

## Step 3: Verilog Code

The complete circuit is constructed using the modules discussed in previous steps.The final sorting machine circuit diagram is provided.

The verilog code is also attached herewith.

## Discussions