Our goal was to implement a hardware-software vision system designed for the home care of elderly people or convalescents.
The project is based on visual analysis of a person’s behaviour. The system can detect health- or life-threatening situations such as falls or fainting.
We have prepared a short YouTube clip to demonstrate the results obtained so far.
Step 1: Embedded Vision System Concept
The aim of the project was to implement a hardware-software vision system supporting the home care of elderly people living alone or convalescents. It performs real-time analysis of the behaviour of a person staying in a given room and detects health- or life-threatening situations such as falls or fainting.
The vision system is based on foreground object segmentation and moving object detection. The algorithm used provides correct results in typical indoor conditions, i.e. sudden and gradual illumination changes, moved background objects (e.g. a chair) and stopped foreground objects (people). In the next step, the position of the person is determined, based on centre of gravity or bounding box analysis.
In the future, more types of health- or life-threatening situations will be detected. We are planning to use a microphone as a second source of information, which should allow screams or falls to be detected. Moreover, the detected objects will be classified (on the basis of simple shape features or using shape matching) in order to distinguish human silhouettes from other objects, especially from the equipment inside the room. The problem of shadows will also be addressed.
In the first stage, related work was analysed in order to assess the possible algorithmic solutions and their hardware implementations. The project team then discussed the options and settled the system details.
In the second stage, a software model was created. It was implemented in the MATLAB environment and in C++ using the OpenCV library. The model was used for testing solutions and as a reference for the designed system, especially for the hardware modules. Test sequences for system evaluation were also recorded in this stage.
In the third stage, the vision system was split into a hardware part and a processor part. Image acquisition, preprocessing, filtering, foreground and moving object segmentation and connected component labelling are carried out in hardware. The software part operates only on metadata (e.g. parameters of detected objects).
This part of the project required implementing communication between the PL (programmable logic) and external RAM, data exchange between the PL and the PS (processor system), image acquisition in the HDMI standard and auxiliary visualization on the VGA output.
For now, we are using a bare-metal application on the PS. In the future, we are going to use PetaLinux OS for:
- the second part of the vision system (using data received from PL),
- audio signal processing,
- data logging (a simple database),
- a simple web service (giving authorized persons access to statistics and the current image).
In the last stage, the solution was tested in simulated conditions. A report and a video showing the performance and capabilities of the system were also prepared.
Enclosed is a general scheme of the system.
Step 2: Software Model
In the first step we developed a software model of the algorithm in C++ using the OpenCV image processing library (www.opencv.org). To compile and run the application you therefore need a C++ compiler (e.g. gcc) and an OpenCV installation. As this is highly operating-system dependent, it will not be described here. To run the application, insert the path to the input video in line 19 of the file "HomeCareVS.cpp". Please provide a video recorded with a stationary camera. Furthermore, the current version of the system requires a properly illuminated scene.
How the application works:
First, the input RGB video stream is converted to the YCbCr colour space in order to improve object segmentation by reducing the negative impact of shadows.
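As an illustration of this conversion, here is a minimal per-pixel sketch in plain C++. It assumes the full-range BT.601 coefficients; the original application relies on OpenCV and may use a different variant. The names YCbCr, toByte and rgbToYCbCr are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// One pixel in the YCbCr colour space (hypothetical helper type).
struct YCbCr {
    std::uint8_t y;
    std::uint8_t cb;
    std::uint8_t cr;
};

// Clamp a value to 0..255 and round it to the nearest integer.
static std::uint8_t toByte(double v) {
    const double clamped = std::min(255.0, std::max(0.0, v));
    return static_cast<std::uint8_t>(std::lround(clamped));
}

// Full-range BT.601 conversion (an assumption, not taken from the project).
// Y carries brightness; Cb and Cr carry colour, centred at 128.
YCbCr rgbToYCbCr(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const double y  = 0.299 * r + 0.587 * g + 0.114 * b;
    const double cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b;
    const double cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b;
    return YCbCr{toByte(y), toByte(cb), toByte(cr)};
}
```

Because brightness is isolated in the Y channel, a shadow mostly changes Y while Cb and Cr stay close to their background values, which is why this colour space can help with shadows.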
Second, the foreground object mask is determined. Two binary masks are used. The first is generated by thresholding the difference image between the current input image and a background model (a simple running average method is used). The second is the result of thresholding the difference between consecutive frames. The two masks are combined using the logical OR operator.
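The two-mask idea can be sketched as follows. This is a simplified illustration on a single 8-bit channel, not the code from HomeCareVS.cpp; the function name foregroundMask and both threshold parameters are hypothetical, and all three images are assumed to have the same size.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

using Image = std::vector<std::uint8_t>;  // one 8-bit channel, row-major

// Combine background subtraction and consecutive frame differencing.
// A pixel is marked as foreground (255) if either difference exceeds
// its threshold; otherwise it is 0.
Image foregroundMask(const Image& current, const Image& previous,
                     const Image& background,
                     int backgroundThreshold, int frameThreshold) {
    Image mask(current.size(), 0);
    for (std::size_t i = 0; i < current.size(); ++i) {
        const int bgDiff = std::abs(int(current[i]) - int(background[i]));
        const int frameDiff = std::abs(int(current[i]) - int(previous[i]));
        const bool differsFromBackground = bgDiff > backgroundThreshold;
        const bool isMoving = frameDiff > frameThreshold;
        // Logical OR of the two binary masks.
        mask[i] = (differsFromBackground || isMoving) ? 255 : 0;
    }
    return mask;
}
```

The background difference catches objects that have stopped moving, while the frame difference reacts immediately to motion even before the background model has settled.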
The background model is generated and updated using the following formula:
New_background_model = alfa*Current_frame + (1-alfa)*Previous_background_model
The parameter alfa depends on whether a foreground object was detected at a given location. The binary mask is filtered with a median filter in order to remove small noise.
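A minimal sketch of these two operations is given below. The selective update uses two hypothetical rates, alfaBackground and alfaForeground, chosen per pixel from the foreground mask; the actual values used in the project are not stated here. The 3x3 median on a binary mask is written as a majority vote, and leaving border pixels at 0 is an assumption made for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Running-average background update with a per-pixel learning rate.
// Where the mask marks foreground, alfaForeground is used (typically
// much smaller, so a person is not absorbed into the background).
void updateBackground(std::vector<float>& background,
                      const std::vector<std::uint8_t>& frame,
                      const std::vector<std::uint8_t>& mask,
                      float alfaBackground, float alfaForeground) {
    for (std::size_t i = 0; i < background.size(); ++i) {
        const float alfa = mask[i] ? alfaForeground : alfaBackground;
        background[i] = alfa * frame[i] + (1.0f - alfa) * background[i];
    }
}

// 3x3 median filter for a binary mask: the median of nine 0/255 values
// is 255 exactly when at least five of them are set.
std::vector<std::uint8_t> medianFilter3x3(const std::vector<std::uint8_t>& mask,
                                          int width, int height) {
    std::vector<std::uint8_t> out(mask.size(), 0);
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            int set = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    if (mask[(y + dy) * width + (x + dx)]) ++set;
            out[y * width + x] = (set >= 5) ? 255 : 0;
        }
    }
    return out;
}
```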
Then, connected component labelling is performed: objects consisting of connected groups of pixels are detected and their areas, centroids and bounding boxes are computed. The seven largest objects are tracked (by analysing bounding box overlap), so their behaviour can be analysed in a temporal context.
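The labelling step can be illustrated with a simple flood-fill version. This is a sketch, not the project's implementation (which may use a different labelling algorithm): 4-connectivity is an assumption, and the names Blob, labelBlobs and boxesOverlap are hypothetical. The default limit of seven objects follows the description above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

struct Blob {
    int area = 0;
    double cx = 0.0, cy = 0.0;                   // centroid
    int minX = 0, minY = 0, maxX = 0, maxY = 0;  // bounding box (inclusive)
};

// Label 4-connected components of a binary mask with a breadth-first
// flood fill and return the largest ones, sorted by decreasing area.
std::vector<Blob> labelBlobs(const std::vector<std::uint8_t>& mask,
                             int width, int height, std::size_t maxBlobs = 7) {
    std::vector<char> visited(mask.size(), 0);
    std::vector<Blob> blobs;
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    for (int y0 = 0; y0 < height; ++y0) {
        for (int x0 = 0; x0 < width; ++x0) {
            const int start = y0 * width + x0;
            if (!mask[start] || visited[start]) continue;
            Blob blob;
            blob.minX = blob.maxX = x0;
            blob.minY = blob.maxY = y0;
            long long sumX = 0, sumY = 0;
            std::queue<std::pair<int, int> > pending;
            pending.push(std::make_pair(x0, y0));
            visited[start] = 1;
            while (!pending.empty()) {
                const int x = pending.front().first;
                const int y = pending.front().second;
                pending.pop();
                ++blob.area;
                sumX += x;
                sumY += y;
                blob.minX = std::min(blob.minX, x);
                blob.maxX = std::max(blob.maxX, x);
                blob.minY = std::min(blob.minY, y);
                blob.maxY = std::max(blob.maxY, y);
                for (int k = 0; k < 4; ++k) {
                    const int nx = x + dx[k], ny = y + dy[k];
                    if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                    const int idx = ny * width + nx;
                    if (mask[idx] && !visited[idx]) {
                        visited[idx] = 1;
                        pending.push(std::make_pair(nx, ny));
                    }
                }
            }
            blob.cx = static_cast<double>(sumX) / blob.area;
            blob.cy = static_cast<double>(sumY) / blob.area;
            blobs.push_back(blob);
        }
    }
    std::sort(blobs.begin(), blobs.end(),
              [](const Blob& a, const Blob& b) { return a.area > b.area; });
    if (blobs.size() > maxBlobs) blobs.resize(maxBlobs);
    return blobs;
}

// True if two bounding boxes share at least one pixel; a simple way to
// match an object in the current frame with one from the previous frame.
bool boxesOverlap(const Blob& a, const Blob& b) {
    return a.minX <= b.maxX && b.minX <= a.maxX &&
           a.minY <= b.maxY && b.minY <= a.maxY;
}
```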
In the current version of the system three activities are recognized: lying, sitting and standing. This is done by analysing the bounding box height and width. A fall is detected when a person is lying, the centroid of the object has moved down and the object's size is within a certain range (e.g. when a ball falls, there is no need to worry).
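The rules above could be expressed roughly as follows. The aspect-ratio limits (0.8 and 1.8) and the parameters minDrop, minArea and maxArea are hypothetical placeholders; the thresholds actually used in the project are not given in this text. Image coordinates are assumed to grow downwards, so a falling object has an increasing centroid y value.

```cpp
enum class Posture { Standing, Sitting, Lying };

// Bounding box and centroid data for one tracked object in one frame
// (hypothetical helper type; width is assumed to be greater than zero).
struct Observation {
    int width;
    int height;
    double centroidY;
    int area;
};

// Classify posture from the bounding box aspect ratio.
// The limits 0.8 and 1.8 are illustrative assumptions.
Posture classifyPosture(const Observation& o) {
    const double ratio = static_cast<double>(o.height) / o.width;
    if (ratio < 0.8) return Posture::Lying;     // wider than tall
    if (ratio > 1.8) return Posture::Standing;  // much taller than wide
    return Posture::Sitting;
}

// A fall is reported when the object is lying, its centroid has moved
// down by more than minDrop pixels and its area is person-sized.
bool isFall(const Observation& previous, const Observation& current,
            double minDrop, int minArea, int maxArea) {
    const bool lying = classifyPosture(current) == Posture::Lying;
    const bool movedDown = current.centroidY - previous.centroidY > minDrop;
    const bool personSized = current.area >= minArea && current.area <= maxArea;
    return lying && movedDown && personSized;
}
```

The area check is what filters out small falling objects such as the ball mentioned above.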
In future work the algorithm could be improved in many ways: a better background modelling algorithm could be used, human silhouette detection could be added, and more sophisticated human action recognition could be applied. However, this version fulfils the basic task of detecting falls in an embedded home care video system.
Step 3: The Required Equipment
To run our project you need:
- a Zybo development board from Digilent,
- a camera with HDMI output (the video I/O design currently used supports 720p resolution, i.e. 1280 x 720). In our experiments we used a Sony HDR-CX280 camera,
- an LCD screen with VGA input and support for 1280 x 720 pixel resolution,
- cables: 1x HDMI (camera-Zybo), 1x VGA (Zybo-LCD), 1x USB mini/standard (Zybo programming),
- a power supply for the Zybo board,
- Vivado and SDK software (we used the 2015.4 version).
Step 4: Connecting Hardware
Connect your video camera to the Zybo board with a suitable HDMI cable (in our case mini HDMI to HDMI). The camera needs to have its HDMI output enabled and set to 1280x720 resolution (720p). Connect the VGA cable to the Zybo board and to a screen with VGA input.
The Zynq configuration will be downloaded over USB, so connect the board to your PC with a mini USB to USB cable and make sure the required cable drivers are installed.
Step 5: The Hardware Implementation
We have attached our project to this instructable so that you can test it yourself.
To run the Programmable Logic part of the design you have to:
- Run Vivado (recommended version: 2015.4).
- Download and unzip the project. Open it in Vivado (Open Project). The main project file is in: \Source\Project\ddc\hdmi_zybo_vga_axi_stream_vdma_8\hdmi_vga_zybo.xpr
- You can analyse the block design (Open Block Design in the left toolbar).
- Run Generate Bitstream (this takes a few minutes).
- Open Hardware Manager. Open target->Auto connect. Program the device.
You're halfway there!
Step 6: Running the Software Part
To run the software part, you need to do the following:
- In Vivado, click File->Export->Export Hardware->OK.
- Run Xilinx SDK by clicking File->Launch SDK.
- In the SDK, choose the VDMA application and run it.
- Set the switches on the development board to down-down-up-down (binary 2).
If everything worked correctly, the ZYBO should send a binary mask of the detected objects to the video output. The algorithm needs a minute or two to generate a proper background model.
Step 7: Connecting to the Design Via UART
For now, the information about a possible human fall is sent over the UART interface. To receive it, you will need a serial terminal, e.g. CuteCom.
- Open your serial terminal.
- Choose the interface which is connected to the ZYBO.
- Enter the following settings:
  - Baud rate - 115200
  - Data bits - 8
  - Stop bits - 1
  - Parity - none
  - Flow control - none
- Start the connection.
If you succeeded, you will receive the string "Fall!" whenever our design detects you falling.
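If you prefer to read the port from your own program instead of a terminal, the settings listed above can be applied with the POSIX termios interface. The following is only a sketch for Linux-like systems; the function names configureUart and containsFallMessage are hypothetical, and CRTSCTS is a common extension rather than part of POSIX itself.

```cpp
#include <string>
#include <termios.h>

// Fill a termios structure with the settings listed above:
// 115200 baud, 8 data bits, 1 stop bit, no parity, no flow control.
void configureUart(struct termios& tty) {
    cfsetispeed(&tty, B115200);
    cfsetospeed(&tty, B115200);
    tty.c_cflag &= ~static_cast<tcflag_t>(CSIZE);
    tty.c_cflag |= CS8;                              // 8 data bits
    tty.c_cflag &= ~static_cast<tcflag_t>(PARENB);   // no parity
    tty.c_cflag &= ~static_cast<tcflag_t>(CSTOPB);   // 1 stop bit
    tty.c_cflag &= ~static_cast<tcflag_t>(CRTSCTS);  // no hardware flow control
    tty.c_iflag &= ~static_cast<tcflag_t>(IXON | IXOFF | IXANY);  // no software flow control
    tty.c_cflag |= CREAD | CLOCAL;                   // enable receiver, ignore modem lines
}

// Check whether the text received so far contains the alarm string.
bool containsFallMessage(const std::string& received) {
    return received.find("Fall!") != std::string::npos;
}
```

In a complete program you would open the serial device with open(), read its current settings with tcgetattr(), call configureUart() and apply the result with tcsetattr(). The device name depends on your system.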