DIY 3D Scanner Based on Structured Light and Stereo Vision in Python Language





Introduction: DIY 3D Scanner Based on Structured Light and Stereo Vision in Python Language

About: Teacher that enjoys working with students

This 3D scanner was made using low-cost, conventional items such as a video projector and webcams. A structured-light 3D scanner is a device that measures the three-dimensional shape of an object using projected light patterns and a camera system. The software was developed in Python, based on structured light and stereo vision.

Projecting a narrow band of light onto a three-dimensionally shaped surface produces a line of illumination that appears distorted from perspectives other than that of the projector, and this distortion can be used for an exact geometric reconstruction of the surface shape. Horizontal and vertical light bands are projected onto the object surface and then captured by two webcams.

Step 1: Introduction

Automatic 3D acquisition devices (often called 3D scanners) make it possible to build highly accurate models of real 3D objects in a cost- and time-effective manner. We tested this technology by scanning a toy to demonstrate its performance. The specific needs are: medium-high accuracy, ease of use, affordable cost of the scanning device, self-registered acquisition of shape and color data, and operational safety for both the operator and the scanned objects. According to these requirements, we designed a low-cost 3D scanner based on structured light which adopts a versatile colored stripe pattern approach. We present the scanner architecture, the software technologies adopted, and the first results of its use in a project regarding the 3D acquisition of a toy.

In the design of our low-cost scanner, we chose to implement the emitter unit using a video projector, because of the flexibility of this device (which allows experimenting with any type of light pattern) and its wide availability. The sensor can be a custom device, a standard digital still camera or a webcam. It must support high-quality color capture (i.e. acquisition of high dynamic range), possibly at high resolution.

Step 2: Software

Python was used for three reasons: first, it is easy to learn and implement; second, OpenCV can be used for image-related routines; and third, it is portable across operating systems, so you can use this program on Windows, Mac and Linux. You can also configure the software to work with any kind of camera (webcams, SLRs or industrial cameras) or a projector with a native resolution of 1024x768. It is better to use cameras with more than twice that resolution. I personally tested the performance in three different configurations: the first used two parallel Microsoft LifeCam Cinema webcams and a small portable projector; the second used two LifeCam Cinema webcams rotated 15 degrees toward each other and an Infocus projector; the last used Logitech webcams and the Infocus projector. To capture the point cloud of an object's surface we go through five steps:

1. Project Gray code patterns and capture images from the two cameras

2. Process the 42 images from each camera and extract the code of each point

3. Adjust the threshold to select the mask for the areas to be processed

4. Find and save the matching points in each camera

5. Calculate the X, Y and Z coordinates of the point cloud
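Steps 2 and 3 can be illustrated with a minimal sketch. It is not the author's code: it assumes that each pattern is captured together with its inverse and that a fully white and a fully black reference frame are available, which the article does not state. The function names `decode_bit` and `lit_mask` and the tiny synthetic arrays are hypothetical.

```python
import numpy as np

def decode_bit(pattern_img, inverse_img):
    """Return 1 where the pattern frame is brighter than its inverse, else 0.
    Assumption: every pattern was also captured inverted."""
    return (pattern_img.astype(np.int32) > inverse_img.astype(np.int32)).astype(np.uint8)

def lit_mask(white_img, black_img, threshold):
    """Return True where the projector clearly lights the scene.
    Assumption: white and black reference frames exist; threshold is user-tuned."""
    diff = white_img.astype(np.int32) - black_img.astype(np.int32)
    return diff > threshold

# Tiny synthetic 2x3 grayscale frames standing in for real camera captures.
white = np.array([[200, 210, 30], [190, 205, 25]], dtype=np.uint8)
black = np.array([[20, 25, 22], [18, 30, 20]], dtype=np.uint8)
pattern = np.array([[190, 40, 28], [185, 35, 24]], dtype=np.uint8)
inverse = np.array([[30, 200, 26], [25, 195, 22]], dtype=np.uint8)

mask = lit_mask(white, black, threshold=50)   # last column is shadow/background
bits = decode_bit(pattern, inverse)           # one bit plane of the code
valid_bits = bits[mask]                       # bits kept for further processing
```

Raising the threshold shrinks the mask to well-lit pixels only, which trades coverage for fewer wrongly decoded points.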

The output is a PLY file with the coordinates and color information of the points on the object surface. You can open PLY files with CAD software such as Autodesk products, or with open-source software such as MeshLab.
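For readers unfamiliar with the format, here is a minimal sketch of writing colored points to an ASCII PLY file. The function name `write_ply` and the header layout (float coordinates, 8-bit colors) are assumptions for illustration; the author's own output routine is not shown in the article.

```python
def write_ply(path, points, colors):
    """Write an ASCII PLY file.
    points: list of (x, y, z) floats; colors: list of (r, g, b) ints in 0-255."""
    if len(points) != len(colors):
        raise ValueError("points and colors must have the same length")
    header = [
        "ply",
        "format ascii 1.0",
        "element vertex %d" % len(points),
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "end_header",
    ]
    with open(path, "w") as f:
        f.write("\n".join(header) + "\n")
        # One vertex per line: position followed by color.
        for (x, y, z), (r, g, b) in zip(points, colors):
            f.write("%f %f %f %d %d %d\n" % (x, y, z, r, g, b))
```

Calling `write_ply("cloud.ply", [(0.0, 1.0, 2.0)], [(255, 128, 0)])` produces a one-point file that MeshLab should be able to open.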

Python 2.7, the OpenCV module and NumPy must be installed to run these Python programs. I have also developed a GUI for this software in Tkinter, which you can find in step six together with two sample data sets. You can find additional information on this subject on the following websites:

Step 3: Hardware Setup

The hardware consists of:

1. Two webcams (Logitech C920C)

2. Infocus LP330 projector

3. Camera and projector stand (made from 3 mm Acrylic plates and 6 mm HDF wood cut with a laser cutter)

The two cameras and the projector should be connected to a computer with two video outputs, such as a notebook computer, and the projector screen should be configured as an extension of the main Windows desktop. Here you can see images of the cameras, projector and stand. The drawing files, ready for cutting, are attached in SVG format.

The projector is an Infocus LP330 (native resolution 1024x768) with the following specs:
Brightness: 650 lumens
Color Light Output: not listed
Contrast (Full On/Off): 400:1
Auto Iris: No
Native Resolution: 1024x768
Aspect Ratio: 4:3 (XGA)
Video Modes: not listed
Data Modes: max 1024x768
Max Power: 200 W
Voltage: 100 V - 240 V
Size (cm) (HxWxD): 6 x 22 x 25
Weight: 2.2 kg
Lamp Life (Full Power): 1,000 hours
Lamp Type: UHP
Lamp Wattage: 120 W
Lamp Quantity: 1
Display Type: 2 cm DLP (1)
Standard Zoom Lens: 1.25:1
Focus: Manual
Throw Distance (m): 1.5 - 30.5
Image Size (cm): 76 - 1971

This video projector is used to project structured light patterns onto the object to be scanned. The structured pattern consists of vertical and horizontal white light stripes that are saved in a data file, and the webcams capture those stripes as they are distorted by the object.
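A stripe set of this kind can be generated as in the sketch below, which builds one binary image per Gray code bit for a 1024x768 projector. This is an illustration under assumptions, not the author's pattern files: the function name `gray_code_patterns`, the most-significant-bit-first ordering and the use of plain NumPy arrays are all hypothetical choices.

```python
import numpy as np

def gray_code_patterns(width, height, vertical_stripes=True):
    """Return a list of uint8 images (0 or 255), one per Gray-code bit,
    most significant bit first (ordering is an assumption)."""
    size = width if vertical_stripes else height
    n_bits = max(1, (size - 1).bit_length())   # bits needed to label every column/row
    index = np.arange(size, dtype=np.uint32)
    gray = index ^ (index >> 1)                # binary-reflected Gray code
    patterns = []
    for bit in range(n_bits - 1, -1, -1):
        line = (((gray >> bit) & 1) * 255).astype(np.uint8)
        if vertical_stripes:
            img = np.tile(line, (height, 1))          # stripes vary along x
        else:
            img = np.tile(line[:, None], (1, width))  # stripes vary along y
        patterns.append(img)
    return patterns

column_patterns = gray_code_patterns(1024, 768, vertical_stripes=True)
row_patterns = gray_code_patterns(1024, 768, vertical_stripes=False)
```

Gray code is a common choice here because neighbouring columns differ in only one pattern, so a decoding error at a stripe boundary shifts the result by at most one position.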

Preferably use cameras that can be controlled by software, because you need to adjust focus, brightness, resolution and image quality. It is also possible to use DSLR cameras with the SDKs provided by each brand.

Assembly and tests were conducted in Copenhagen Fablab with its support.

Step 4: Experimenting With Scanner

To test the system, a fish toy was used; you can see the captured image. All captured files and the output point cloud are included in the attached file.

You can open the PLY point cloud file with MeshLab:

Step 5: Some Other Scan Results

Here you can see some human face scans and a 3D scan of a wall. There are always some outlier points, caused by reflections or inaccurate image results.

Step 6: 3D Scanner GUI

To let you test the 3D scan software, this step includes two data sets: one is a scan of a fish and the other is just a plane wall, to check the accuracy. Open the ZIP files and run the program. For installation, check step 2. Send a message to my inbox here for all source code.

To use the 3D scan part you need to install two cameras and a projector, but for the other parts just click on the buttons. To test the sample data, first click on process, then threshold, stereo match and finally point cloud. Install MeshLab to view the point cloud.




35 Discussions

have you considered/tried using a turntable for the object to cover it in its entirety?

Hi, hesamh

Thanks for sharing.

In the step 2, you say that it is better to use cameras with more than two times resolution.

How did you get this result?

What are you comparing?

Best regards.



1 year ago

I am currently comparing photographic, laser and structured light methods of producing point clouds. Please can you explain the advantage of using your method over automatic image correlation as two cameras are being used to give a stereo view? Is it more difficult to match the imaged texture of the object than the

2 replies

By projecting patterns we produce feature points that make the correlation accurate.

Thank you for your reply.

Even if there is good texture on the surface of the object do you think that the projected patterns still provide a more accurate surface?


1 year ago

what about scanning something smaller and little more complex like a ps4 controller shell?

Will the Chinese projector Unic UC 46, with a native resolution of 800x480, work?

1 reply

It works, but you should adjust the projector resolution inside the Python programs and change the binary images that are shown on the projector.
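As a rough guide to what changes with a different projector, the sketch below computes how many Gray code bits (and therefore stripe images per direction) a given resolution needs. The helper names `gray_bits` and `pattern_plan` are hypothetical, and the author's programs may organise this differently.

```python
def gray_bits(n):
    """Number of Gray-code bits needed to label n distinct columns or rows."""
    return max(1, (n - 1).bit_length())

def pattern_plan(width, height):
    """Return (column_bits, row_bits, total_projector_pixels) for a resolution."""
    return gray_bits(width), gray_bits(height), width * height

print(pattern_plan(1024, 768))  # (10, 10, 786432)
print(pattern_plan(800, 480))   # (10, 9, 384000)
```

Under this count, an 800x480 projector would need one fewer row pattern than a 1024x768 one, and the pattern images themselves would have to be regenerated at 800x480.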

What would you estimate the resolution and accuracy to be with the logitech c920s?

3 replies

Hi Paul, my projector can project more than 100,000 different points, so if you capture them with both cameras you can have the same number of points in your point cloud.

Before I respond, I just want to say that this is an awesome project and I love the fact that you've used python for it as it'll make it super easy for me (and everyone else) to adapt it to our needs and develop it further.

The reason I ask about resolution and accuracy is those are the standard metrics used to determine 3d scanner performance and I wanted to be able to compare your setup with alternatives like a line laser based scanner to see if it's worth spending the extra money to do sls.

You say that you're projecting 100k+ points. Where does that figure come from? Presumably you use more than one pixel for each point? I would think these points are over a rather large area, so what sort of point per inch are we talking if you happen to know? How small were the smallest features you were able to capture reliably? Please understand that I'm not trying to bash your project or even imply that you ought to have answers to all these questions, I'm just trying to get at some sort of data to see if the sls approach is worth it for my project.

The resolution of the video projector is 1024x768. Because it is hard to capture each pixel, I used points of 2x2 pixels. The projection area is 60x45 cm, so the distance between points is around 2 mm. If you use a projector with a minimum focus distance of 50 cm, the resolution will be 120 DPI, but you should install the cameras closer.
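The figures in this reply can be checked with a short calculation. It is only a nominal estimate that assumes the projected image exactly fills the 60 cm width and ignores camera resolution and lens effects; the function names are hypothetical.

```python
def point_spacing_mm(area_width_mm, projector_width_px, pixels_per_point):
    """Nominal distance between neighbouring projected points, in millimetres."""
    return area_width_mm / float(projector_width_px) * pixels_per_point

def point_count(width_px, height_px, pixels_per_point):
    """Number of distinct points when each point is a square block of pixels."""
    return (width_px // pixels_per_point) * (height_px // pixels_per_point)

spacing = point_spacing_mm(600.0, 1024, 2)  # 60 cm wide area, 2x2-pixel points
count = point_count(1024, 768, 2)           # 512 x 384 points
```

With these inputs the nominal spacing comes out at roughly 1.2 mm and the point count at 196,608, which is the same order as the "around 2 mm" and "more than 100000 points" quoted in the thread.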

Hi hesamh,
Thank you for sharing your great project.
Could you tell me which method you used to solve the stereo correspondence problem in the stereo match step, and where I can read about it in the references you provided?
I would appreciate it if you could explain that to me.
Thank you in advance.

3 replies

Thanks a lot for your response.
In the camlcoloc file, in the decoding process:
How do you convert gray code to decimal?
Why do you multiply by 2^xx?
Another question, please:
In the calcxyz file, could you explain what's going on in the section after the triangulation function definition, especially the following code:
a = np.array(rightcod)


aaa, right_idx=np.unique(aa[:,0],return_index=True)


print 'Total points from right camera= ',m


for ii in range(0,m-1):

Thanks a lot.


how do you convert gray code to decimal?

Each pixel on the projector screen has a horizontal and a vertical Gray code. I convert it to decimal with grayimg=grayimg+(2**xx)*ff in a loop. The horizontal and vertical decimal codes are then combined to get a unique number for each of the 1024*768 projector pixels.

colocright.append(np.uint32([rightcamcode[jj][ii][0]+rightcamcode[jj][ii][1]*1024 ,ii, jj])) is the line that combines horizontal and vertical, so that the global code equals the horizontal code plus the vertical code multiplied by 1024.

The file "graykod" contains my Gray-to-decimal conversion.
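The decoding described in this reply can be sketched as follows. The accumulation loop mirrors the quoted expression grayimg=grayimg+(2**xx)*ff, with the assumption that bit planes arrive least significant first. The contents of the graykod file are not shown in the thread, so the XOR-based `gray_to_binary` below is the standard conversion swapped in for illustration, and all function names are hypothetical.

```python
import numpy as np

def bits_to_int(bit_planes):
    """Accumulate per-pixel 0/1 bit planes into integers.
    Assumption: bit_planes[xx] carries the weight 2**xx."""
    code = np.zeros(bit_planes[0].shape, dtype=np.uint32)
    for xx, ff in enumerate(bit_planes):
        code = code + (2 ** xx) * ff.astype(np.uint32)
    return code

def gray_to_binary(gray):
    """Convert Gray-coded integers to ordinary binary by XOR-ing right shifts."""
    binary = gray.copy()
    shift = gray >> 1
    while shift.any():
        binary ^= shift
        shift = shift >> 1
    return binary

def global_code(horizontal, vertical, width=1024):
    """Combine column and row indices into one projector-pixel code."""
    return horizontal + vertical * width

planes = [np.array([[1, 0]]), np.array([[1, 1]])]  # two bit planes for two pixels
packed = bits_to_int(planes)                       # [[3, 2]]
columns = gray_to_binary(packed)                   # Gray 3 -> 2, Gray 2 -> 3
```

Multiplying the vertical code by the projector width keeps every (column, row) pair unique as long as the column index is smaller than that width.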

The rightcod and leftcod files have three columns: the first is the decimal code of the projector pixel, the second is the horizontal pixel coordinate in the camera and the third is the vertical pixel coordinate in the camera.

aa=a[a[:,0].argsort(),] sorts by the decimal code of the projector pixels, which makes it easy to find the same projector pixels in the left and right cameras.
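The matching idea can be sketched with NumPy as below. The three-column layout (code, camera x, camera y) follows the description above; the function name `match_codes` and the use of `np.intersect1d` are assumptions, since the author's own loop is only partly quoted in the thread.

```python
import numpy as np

def match_codes(leftcod, rightcod):
    """Pair camera pixels that saw the same projector code.
    Each input row is (projector_code, camera_x, camera_y).
    Returns rows of (code, left_x, left_y, right_x, right_y)."""
    left = np.asarray(leftcod)
    right = np.asarray(rightcod)
    # Sort by projector code, as in aa = a[a[:, 0].argsort(), ]
    left = left[left[:, 0].argsort(kind="stable")]
    right = right[right[:, 0].argsort(kind="stable")]
    # Keep only the first camera pixel found for each code.
    _, li = np.unique(left[:, 0], return_index=True)
    _, ri = np.unique(right[:, 0], return_index=True)
    left, right = left[li], right[ri]
    # Codes seen by both cameras, with their positions in each array.
    common, lpos, rpos = np.intersect1d(
        left[:, 0], right[:, 0], assume_unique=True, return_indices=True)
    return np.column_stack([common, left[lpos, 1:], right[rpos, 1:]])

pairs = match_codes(
    [[5, 10, 20], [3, 11, 21], [9, 12, 22], [5, 99, 99]],
    [[9, 30, 40], [5, 31, 41], [7, 32, 42]])
```

Each row of `pairs` is then one candidate point for triangulation; codes seen by only one camera are dropped.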

If you need more explanation, please ask; also, please read the references carefully before asking more questions.

Later you can send me your code for checking and debugging. Try running the programs with the fish and wall data.


2 years ago

Hi hesamh,

I used your code to make a small homemade scanner. It is very good code. I made some modifications to make it run faster. If you want, I can share the code with you, or if you have it on GitHub I can send it to you there.

Best regards,


3 replies

Hi bdvd,

I noticed this in the code

x1=-(pxpy[0]-960)*fcr #px1
y1=(pxpy[1]-540)*fcr #py1

Do you know where 960 and 540 come from? And what's the physical meaning of x1, pxpy here?

Thanks a lot.

Best regards,


960 is 1920/2 and 540 is 1080/2 (the camera resolution); these are the array indices of the camera's center pixel.
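The centering in the quoted lines can be sketched as below for a 1920x1080 camera. The parameter `scale` stands in for fcr, whose physical meaning is not given in the thread, so treating it as a simple scale factor is an assumption; the function name is hypothetical.

```python
def centered_coords(px, py, width=1920, height=1080, scale=1.0):
    """Shift pixel indices so the image center is the origin, then scale.
    The sign flip on x follows the quoted line x1=-(pxpy[0]-960)*fcr."""
    cx, cy = width // 2, height // 2   # 960 and 540 for a 1920x1080 image
    x = -(px - cx) * scale
    y = (py - cy) * scale
    return x, y

corner = centered_coords(0, 0)        # top-left pixel
center = centered_coords(960, 540)    # center pixel maps to the origin
```

For a camera with a different resolution, only `width` and `height` need to change.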

Hi, could you please recommend some cameras for me? I do not know what to choose. They should be able to build a color 3D model. Thanks.