Introduction: DIY 3D Scanner Based on Structured Light and Stereo Vision in Python Language

This 3D scanner was made using low-cost, off-the-shelf items like a video projector and webcams. A structured-light 3D scanner measures the three-dimensional shape of an object using projected light patterns and a camera system. The software was developed in Python, based on structured light and stereo vision.

Projecting a narrow band of light onto a three-dimensionally shaped surface produces a line of illumination that appears distorted from perspectives other than the projector's, and this distortion can be used for an exact geometric reconstruction of the surface shape. Horizontal and vertical light bands are projected onto the object surface and then captured by two webcams.

Step 1: Introduction

Automatic 3D acquisition devices (often called 3D scanners) allow one to build highly accurate models of real 3D objects in a cost- and time-effective manner. We tested this technology by scanning a toy to prove its performance. The specific requirements were: medium-high accuracy, ease of use, an affordable scanning device, self-registered acquisition of shape and color data, and operational safety for both the operator and the scanned objects. According to these requirements, we designed a low-cost 3D scanner based on structured light that adopts a versatile colored stripe pattern approach. We present the scanner architecture, the software technologies adopted, and the first results of its use in a project involving the 3D acquisition of a toy.

In the design of our low-cost scanner, we chose to implement the emitter unit with a video projector. The reasons were the flexibility of this device (which allows experimenting with any type of light pattern) and its wide availability. The sensor can be a custom device, a standard digital still camera, or a webcam. It must support high-quality color capture (i.e. acquisition of high dynamic range), preferably at high resolution.

Step 2: Software

Python was used for three reasons: first, it is easy to learn and implement; second, OpenCV can be used for image-related routines; and third, it is portable across operating systems, so the program runs on Windows, Mac, and Linux. You can also configure the software to use any kind of camera (webcams, SLRs, or industrial cameras) or any projector with a native 1024x768 resolution. It is better to use cameras with more than twice the projector's resolution. I personally tested the performance in three different configurations: the first with two parallel Microsoft LifeCam Cinema webcams and a small portable projector, the second with two LifeCam Cinema webcams rotated 15 degrees toward each other and an InFocus projector, and the last with Logitech webcams and an InFocus projector. To capture a point cloud of the object surface we go through five steps:

1. Projecting Gray-code patterns and capturing images from the two cameras

2. Processing the 42 images from each camera and extracting point codes

3. Adjusting the threshold to select a mask for the areas to be processed

4. Finding and saving matching points from each camera

5. Calculating the X, Y and Z coordinates of the point cloud

The output is a PLY file with the coordinates and color information of points on the object surface. You can open PLY files with CAD software such as Autodesk products, or with open-source software like MeshLab.
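For reference, the PLY output can be produced with a few lines of Python; this is a minimal sketch of an ASCII PLY writer with per-point color (the function name and exact layout are my own, not necessarily the program's output routine):

```python
def write_ply(path, points, colors):
    """points: list of (x, y, z) floats; colors: list of (r, g, b) 0-255 ints."""
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write("element vertex %d\n" % len(points))
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
        f.write("end_header\n")
        # one line per point: coordinates followed by RGB
        for (x, y, z), (r, g, b) in zip(points, colors):
            f.write("%g %g %g %d %d %d\n" % (x, y, z, r, g, b))
```

MeshLab opens files written this way directly.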

Python 2.7, the OpenCV module, and NumPy should be installed to run these Python programs. I have also developed a Tkinter GUI for this software, which you can find in Step 6 along with two sample data sets. You can find additional information on this subject on the following websites:

Step 3: Hardware Setup

Hardware consists of :

1. Two webcameras (Logitech C920C)

2. Infocus LP330 projector

3. Camera and projector stand (made from 3 mm Acrylic plates and 6 mm HDF wood cut with a laser cutter)

The two cameras and the projector should be connected to a computer with two video outputs, such as a notebook computer, and the projector screen should be configured as an extension of the main Windows desktop. Here you can see images of the cameras, projector, and stand. The drawing files, ready for cutting, are attached in SVG format.

The projector is an InFocus LP330 (native resolution 1024x768) with the following specs:

Brightness: 650 Lumens
Contrast (Full On/Off): 400:1
Auto Iris: No
Native Resolution: 1024x768
Aspect Ratio: 4:3 (XGA)
Data Modes: max 1024x768
Max Power: 200 Watts
Voltage: 100V - 240V
Size (cm, HxWxD): 6 x 22 x 25
Weight: 2.2 kg
Lamp Life (Full Power): 1,000 hours
Lamp Type: UHP
Lamp Wattage: 120 Watts
Lamp Quantity: 1
Display Type: 2 cm DLP
Standard Zoom Lens: 1.25:1
Focus: Manual
Throw Distance (m): 1.5 - 30.5
Image Size (cm): 76 - 1971

This video projector is used to project structured light patterns onto the object being scanned. The patterns consist of vertical and horizontal white light stripes saved as image files; the webcams capture the distorted stripes.
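The stripe images themselves are easy to generate with NumPy; this is a minimal sketch for the vertical stripes, assuming binary-reflected Gray code and the projector's 1024x768 resolution (the function name is mine, not the program's):

```python
import numpy as np

def vertical_gray_patterns(width=1024, height=768, n_bits=10):
    """One black/white stripe image per Gray-code bit of the column index."""
    cols = np.arange(width)
    gray = cols ^ (cols >> 1)                       # binary-reflected Gray code
    patterns = []
    for bit in range(n_bits - 1, -1, -1):           # most significant stripe first
        row = ((gray >> bit) & 1).astype(np.uint8) * 255
        patterns.append(np.tile(row, (height, 1)))  # replicate the row to a full frame
    return patterns
```

Horizontal patterns are the same with rows and columns swapped. Gray code is preferred over plain binary because adjacent stripes differ in only one bit, which makes decoding robust at stripe boundaries.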

Preferably use cameras that are software-controllable, because you need to adjust focus, brightness, resolution, and image quality. It is also possible to use DSLR cameras with the SDKs provided by each brand.

Assembly and tests were conducted at Copenhagen Fablab with its support.

Step 4: Experimenting With Scanner

To test the system, a toy fish was used; you can see the captured images. All the captured files and the output point cloud are included in the attached file.

You can open the PLY point-cloud file with MeshLab:

Step 5: Some Other Scan Results

Here you can see some human face scans and a 3D scan of a wall. There are always some outlier points, due to reflections or inaccurate image decoding.

Step 6: 3D Scanner GUI

For testing the 3D scanning software, in this step I have added two data sets: one is a scan of the toy fish and the other is a flat wall, to show the accuracy of the system. Open the ZIP files and run the program; for installation, check Step 2.

To use the 3D scanning part you need to install the two cameras and the projector, but for the other parts just click the corresponding button. To test the sample data, first click Process, then Threshold, Stereo Match, and finally Point Cloud. Install MeshLab to view the point cloud.


Kai-SiangG (author)2017-05-31

Hi, hesamh

Thanks for sharing.

In Step 2, you say that it is better to use cameras with more than twice the projector's resolution.

How did you get this result?

What are you comparing?

Best regards.


CITD (author)2017-03-03

I am currently comparing photographic, laser, and structured-light methods of producing point clouds. Could you explain the advantage of your method over automatic image correlation, given that two cameras are used to give a stereo view? Is it more difficult to match the imaged texture of the object than the

hesamh (author)CITD2017-03-07

By projecting patterns we produce feature points that make the correlation accurate.

CITD (author)hesamh2017-03-07

Thank you for your reply.

Even if there is good texture on the surface of the object do you think that the projected patterns still provide a more accurate surface?

PatA44 (author)2017-01-07

What about scanning something smaller and a little more complex, like a PS4 controller shell?

RohitG78 (author)2016-12-06

Will a Chinese projector like the Unic UC46, with a native resolution of 800x480, work?

hesamh (author)RohitG782016-12-31

It works, but you should adjust the projector resolution inside the Python programs and change the binary images that are shown on the projector.

paulie_g (author)2016-08-15

What would you estimate the resolution and accuracy to be with the Logitech C920s?

hesamh (author)paulie_g2016-08-23

Hi Paul, my projector can project more than 100,000 distinct points, so if you capture them with both cameras you can have the same number of points in your point cloud.

paulie_g (author)hesamh2016-08-23

Before I respond, I just want to say that this is an awesome project and I love the fact that you've used python for it as it'll make it super easy for me (and everyone else) to adapt it to our needs and develop it further.

The reason I ask about resolution and accuracy is those are the standard metrics used to determine 3d scanner performance and I wanted to be able to compare your setup with alternatives like a line laser based scanner to see if it's worth spending the extra money to do sls.

You say that you're projecting 100k+ points. Where does that figure come from? Presumably you use more than one pixel for each point? I would think these points are over a rather large area, so what sort of point per inch are we talking if you happen to know? How small were the smallest features you were able to capture reliably? Please understand that I'm not trying to bash your project or even imply that you ought to have answers to all these questions, I'm just trying to get at some sort of data to see if the sls approach is worth it for my project.

hesamh (author)paulie_g2016-08-25

The resolution of the video projector is 1024x768. Because it is hard to capture each individual pixel, I used points of 2x2 pixels. The projection area is 60x45 cm, so the distance between points is around 2 mm. If you use a projector with a minimum focus distance of 50 cm, the resolution will be 120 DPI, but you have to install the cameras at a closer distance.

MahmoudK17 (author)2016-04-25

Hi hesamh,
Thank you for sharing your great project.
Could you tell me what method you used to solve the stereo correspondence problem in the stereo-match step, and where I can read about it in the references you provided?
I would appreciate an explanation.
Thank you in advance.

hesamh (author)MahmoudK172016-05-02

check page 12 of this document:

tell me if you need more explanation on specific parts

MahmoudK17 (author)hesamh2016-05-09

Thanks a lot for your response.
In the camlcoloc file, in the decoding process:
how do you convert Gray code to decimal?
why do you multiply by 2^xx?
Another question, please:
In the calcxyz file, could you explain what is going on in the section after the triangulation function definition, especially the following code:

a = np.array(rightcod)
aaa, right_idx=np.unique(aa[:,0],return_index=True)
print 'Total points from right camera= ',m
for ii in range(0,m-1):

Thanks a lot.

hesamh (author)MahmoudK172016-05-10


how do you convert gray code to decimal?

Each pixel on the projector screen has a horizontal and a vertical Gray code; with grayimg=grayimg+(2**xx)*ff in a loop I convert it to decimal. The horizontal and vertical decimal codes are then combined to get a unique number for each pixel among the 1024*768 pixels of the projector.

colocright.append(np.uint32([rightcamcode[jj][ii][0]+rightcamcode[jj][ii][1]*1024 ,ii, jj])) is the line that combines horizontal and vertical, so that the global code equals the horizontal code plus the vertical code multiplied by 1024.

The file "graykod" is my Gray-to-decimal conversion.

The rightcod and leftcod files have three columns: the first is the decimal code of the projector pixel, the second is the horizontal pixel coordinate in the camera, and the third is the vertical pixel coordinate in the camera.

aa=a[a[:,0].argsort(),] sorts by the decimal code of the projector pixels, which makes it easy to find matching projector pixels in the left and right cameras.
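The decoding described above can be sketched like this (a simplified, Python 3 stand-in for the graykod routine and the combining line, not the author's exact code):

```python
def gray_to_decimal(bits):
    """Convert a list of Gray-code bits (most significant first) to a decimal value."""
    value, b = 0, 0
    for bit in bits:
        b ^= bit              # XOR running prefix turns Gray bits into binary bits
        value = value * 2 + b
    return value

def combined_code(h_code, v_code, width=1024):
    """Merge horizontal and vertical decimal codes into one unique pixel id."""
    return h_code + v_code * width
```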

If you need more explanation please ask, but also please read the references carefully before asking more questions.

Later you can send me your code for checking and debugging. Try to run the programs with the fish and wall data.

bdvd (author)2015-10-02

Hi hesamh,

I used your code to make a small homemade scanner. It is very good code. I made some modifications to run it faster. If you want, I can share the code with you, or if you have it on GitHub I can send it to you there.

Best regards,


Salagion (author)bdvd2015-12-16

Hi bdvd,

I noticed this in the code

x1=-(pxpy[0]-960)*fcr #px1
y1=(pxpy[1]-540)*fcr #py1

Do you know where 960 and 540 come from? And what is the physical meaning of x1 and pxpy here?

Thanks a lot.

Best regards,


hesamh (author)Salagion2016-05-02

960 is 1920/2 and 540 is 1080/2 (the resolution of the camera); these are the array indices of the camera's center pixel.

Byte Chen (author)bdvd2015-10-28

Hi, could you please recommend some cameras for me? I don't know which to choose. It should be able to build a colorful 3D model. Thanks!

hesamh (author)Byte Chen2016-05-02

You need a camera with software control; some webcams can be used, like some Microsoft or Logitech models. It is better to use a camera with no compression and a global shutter.

hesamh (author)bdvd2016-05-02

Hi, can you send the code link to me?

shreyaskamathkm (author)bdvd2015-12-11

Hello bdvd,

I was planning to make something similar using hesamh's code. Could you please help me out by providing your code too? I would be very grateful.

Thank you,

Best Regards,


Byte Chen (author)2015-10-28

Hi hesamh,

I am going to build a 3D system based on your code. I wonder if I can upgrade the hardware to higher-quality components, maybe two better cameras? If so, do I need to modify the code? Please forgive my lack of knowledge, I am still a student... Thank you!

Best Regards,

Byte Chen

hesamh (author)Byte Chen2016-05-02

Yes Chen, you can update the code for a better camera; check inside the Python code to see where you need to make changes.

Salagion (author)2015-07-28

Hi hesamh,

About the camera: did you ever tune parameters like focal length, exposure time, etc.? Do you think the parameter values affect the quality of the photos? If so, what is your advice for tuning them?

Best regards,


hesamh (author)Salagion2015-07-29

Hi, yes I tuned the camera parameters: focal length, exposure, and so on. All parameters should be fixed so that you get good contrast for the stripes after thresholding and making the black-and-white image. For LifeCam cameras there is "Cameraprefs", and for Logitech you should use the adjustment software.

Salagion (author)hesamh2015-08-02

Hi hesamh,
Thanks for your reply. As for the exact parameter values, how did you measure them? That is, did you set up the physical layout first and then measure the parameters, or did you calculate the parameters from the algorithm and then set up the layout to match? Also, I didn't see a .py program named 'calibration'; I guess you hard-coded the intrinsic and extrinsic parameters somewhere inside the programs, is that right?

Best regards,


hesamh (author)Salagion2015-08-05

You can set camera parameters in OpenCV, and if you check the "" at the beginning you will see these lines:

video_capture0 = cv2.VideoCapture(0)
Salagion (author)hesamh2015-08-11

Hi hesamh,

Thanks for your reply. I know I can set the shooting parameters by the means you mentioned. Actually, I am curious how you determined the following settings:


fcl=.003 #pixel size for left camera
fcr=.003 #pixel size for right camera
tetl=0.26179938779914943653855361527329 #left camera rotation angle around Y axis (15 deg)
tetr=-0.26179938779914943653855361527329 #right camera rotation angle
phil=0 #left camera rotation angle around X axis
x0l=250 #left camera translation in X direction
y0l=0 #left camera translation in Y direction
ccddr=2.977 #focal length right camera (mm)
ccddl=2.977 #focal length left camera


These are the intrinsic and extrinsic parameters, right? They are very important for the reconstruction. Normally we obtain them through calibration steps, but I didn't see such a process in your code. Could you please share how you acquired these figures?

Best regards,


hesamh (author)Salagion2015-08-19

Yes, they are very important, and a small error in them leads to unacceptable results.

The pixel size and focal length are parameters that you should get from the camera manufacturer. x0l, x0r, y0l, y0r, z0l, z0r, phi, and tet are values that should be set based on your geometrical design (the installation of the cameras).
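To illustrate how these parameters feed into the reconstruction, here is a simplified, hypothetical triangulation sketch using a pinhole model with cameras toed in around the Y axis; it mirrors the parameter names above but is not the author's calcxyz code:

```python
import numpy as np

def camera_ray(u, v, cx, cy, pix, f, tet, x0):
    """Ray (origin, direction) for pixel (u, v): pinhole model rotated tet about Y."""
    d = np.array([(u - cx) * pix, (v - cy) * pix, f])   # direction in camera frame (mm)
    c, s = np.cos(tet), np.sin(tet)
    ry = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])   # rotation matrix about Y axis
    return np.array([x0, 0.0, 0.0]), ry @ d

def triangulate(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two skew rays (least squares)."""
    a = np.stack([d1, -d2], axis=1)
    t1, t2 = np.linalg.lstsq(a, o2 - o1, rcond=None)[0]
    return ((o1 + t1 * d1) + (o2 + t2 * d2)) / 2.0
```

Calling triangulate on the two rays of a matched projector code yields one point of the cloud.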

Salagion made it! (author)hesamh2015-09-23

Hi hesamh,

I am not sure I understand the coordinate system in your geometrical design. Could you please illustrate the parameters (x0l, x0r, y0l, y0r, z0l, z0r, phi, tet) in the uploaded photo? I suppose that is the most visual way to understand the correspondence between these settings and the actual geometry.

Thank you so much.


hesamh (author)Salagion2015-09-26

I added an image in Step 2 so you can see X and Z and also tet. Phi is the rotation around the camera axis, and the origin of the coordinate system is midway between the two cameras.

hesamh (author)Salagion2015-08-19

For calibration you need an accurate calibration board and a good algorithm. I used a calibration board from a commercial 3D scanner and found that my settings are accurate enough for me. But if you need accuracy of a hundredth of a millimeter, you need calibration. A good calibration board costs more than 1000 Euros!
I made the camera base and stands with a laser cutter, so I could install the cameras accurately without calibration.

Salagion (author)hesamh2015-09-23

Hi hesamh, I am very curious about the DIY design. May I have your email address for further discussion?

TuR1 (author)2015-08-06

Hi, hesamh

I am a student at National Central University.

My professor wants me to create a 3D scanner platform.

Fortunately, I found your website; your solution helps me a lot and I really appreciate it :-)

Now I am studying your code, but I cannot understand your algorithm; I don't know why you have to do things this way... for example:


imgbin3[jj][ii][1]= grayimg[jj][ii]%256
imgbin3[jj][ii][2]= 40*grayimg[jj][ii]//256
imgbin3[jj][ii][0]= 4

k = cv2.waitKey(0)
if k == 113:
elif k == 2555904:
elif k == 2424832:

Because I will write a paper in the future, in which I will cite your website, credit you as the original author, and explain your algorithm, I need some help.

Could you please give me a reference list that could help me understand your algorithm? Thanks.

hesamh (author)TuR12015-08-19

Good luck in your project. Here is some explanation:


The part above is for thresholding the images and making a binary black-and-white image; read page six of this document:


imgbin3[jj][ii][1]= grayimg[jj][ii]%256
imgbin3[jj][ii][2]= 40*grayimg[jj][ii]//256
imgbin3[jj][ii][0]= 4

This part is just for displaying the stripes in different colors, so you can omit it.


k = cv2.waitKey(0)
if k == 113:
elif k == 2555904:
elif k == 2424832:

2555904 is keyboard code of right arrow key and 2424832 is for left arrow key

ask if you have further questions :)

TuR1 (author)hesamh2015-09-14

Thanks a lot, hesamh

The pdf file is a good reference document, I'll read it.

Thanks for your help, I really appreciate it. :-)

Salagion (author)2015-05-04

Hi hesamh,

I got the following error when I tested your code (SLS2012v3.2) in VS2010:

1>CanonCamera.obj : error LNK2001: unresolved external symbol __imp__EdsCreateEvfImageRef@8

1>CanonCamera.obj : error LNK2001: unresolved external symbol __imp__EdsDownloadEvfImage@8

I suppose this is due to an incorrect .dll or .lib configuration; however, I checked the additional dependencies, include directories, and library directories, where I have maintained the corresponding paths. And I tested all of this in a 32-bit environment. Could you please give me a hint on resolving this?

hesamh (author)Salagion2015-05-05

Hi, SLS2012 belongs to the Cyprus University of Technology and it is not my code; I just put it here to give more information about 3D scanning. My code is in Python, and you can find it in Steps 3 and 6. As far as I know, you need to install the Canon SDK to run SLS2012.

iamhuskar (author)2015-03-24

Very interesting!

Instructables is a good website. I wanted to make a 3D scanner, so I Googled and ended up here.

Sorry for my poor English, haha.

krummrey (author)2015-03-20

Looks great!
Have you tried it using a Raspberry Pi and its camera?
And have you stitched multiple scans together to get a watertight model?

hesamh (author)krummrey2015-03-22

With MeshLab you can stitch point clouds, and I did that successfully.

hesamh (author)krummrey2015-03-22

Thanks, I have already thought about using a Raspberry Pi, because it would make the scanner much smaller. You need two Raspberry Pis with two cameras, or one Raspberry Pi Compute Module with two cameras. For many reasons it would be a good system:

* It will be much smaller if you use a portable projector

* The Raspberry Pi camera is completely software-controllable and easy to manipulate

* You can easily use Raspbian, OpenCV, and Python

Zach Sousa (author)2015-03-19

Very interesting!
