Introduction: 3D Models From Cheap Digital Images - Proposal

The purpose of this Instructable is to propose an outline for a system that can generate three dimensional models of real world objects given digital images of said objects.  To simplify the system we focus on generating models of faces.  However, given enough time this system could evolve to generate models of arbitrary objects.  The models we aim to produce will be printable on a 3D printer, à la MakerBot.

We made this for the Things That Think class Spring 2011 at CU Boulder. We're Paul Heider, Buck Heroux, and Ian Smith.

Step 1: Things You'll Need

A digital camera or a camera phone
A person whose face you want to turn into a 3D model
Our software, or possibly access to our web interface
(optional) A 3D printer

Step 2: Obtaining the Images

Our system obtains three dimensional information by means of stereo rectification (more on that later).  The process for obtaining images for use in our system is very simple:

1. Find the person whose face you want to turn into a 3D model
2. Have them stand in front of a relatively featureless background, such as a solid color wall or door.  Be careful that no other objects show up in the images, e.g. hangings on the wall.  It is important to note here that you should not fake the background by editing the photograph later.  For instance, do not Photoshop objects out of the images; this will mess up the algorithm.
3.  Tell the subject not to move while the photographs are being taken.
4.  Prepare to take a series of images.  Start by positioning the digital camera horizontally (landscape mode).  The subject's head should take up the majority of the vertical space in the photograph.  For the first picture, the subject's head should occupy the right-center of the photo.
5. Take a series of pictures of the subject, moving the camera horizontally between shots.  For each successive picture, move the camera from left to right while facing the subject.  The subject's face should end up closer to the left side of the photo each time.

Step 3: 3D Rectification

The next step for our system is to perform a 3D rectification using the images taken in the previous step.  To perform this step we enlist the aid of OpenCV, a fully featured and open source computer vision library.

Specifically, we use the Camera Calibration and 3D Reconstruction module (calib3d) to aid us in obtaining 3D information.  This module allows us to take a pair of images and obtain a 3D point in space for each pixel.  Very simply, the algorithm works as follows (a short code sketch follows the list):
1. Find regions of each picture that match each other.  E.g. the subject's left eye in each picture.
2. Calculate the amount of movement, in pixel distance, that each region moved between pictures. 
3. The more a region moved, the closer it is to the camera.  Therefore, by knowing how much each pixel moved between pictures, we can come up with a 3D location for each point relative to all the others in the picture.
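
To make the idea concrete, here is a minimal sketch of this matching step using OpenCV's Python bindings (the modern API; the file names and block-matching parameters are illustrative placeholders, not our exact settings):

```python
# Sketch of the region-matching step using OpenCV block matching.
# File names and parameters are placeholders, not our exact settings.
import cv2

# Two neighboring photos from the series, loaded as grayscale.
left = cv2.imread("face_left.jpg", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("face_right.jpg", cv2.IMREAD_GRAYSCALE)

# Block matcher: for each region of the left image, find the best
# matching region in the right image along the same scanline.
# numDisparities must be a multiple of 16; blockSize must be odd.
matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)

# The disparity map records how far (in pixels) each region moved
# between the photos; larger disparity means closer to the camera.
# (OpenCV returns fixed-point values scaled by 16.)
disparity = matcher.compute(left, right)
```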

There are some things to note about this process.  This algorithm typically depends on knowing a lot about the world in which the pictures were taken in order to come up with good three dimensional information.  Our approach to creating these 3D models is both novel and difficult because we are attempting to do so without much of this knowledge.  Namely, the biggest things we are missing are the intrinsic and extrinsic camera parameters.  Intrinsic parameters adjust for things like distortion in the lens, while extrinsic parameters adjust for things like the exact distance the camera moved between shots.  Our system makes (probably bad) assumptions about both of these and thus cannot produce very accurate point clouds at the moment.  An area of further improvement definitely includes trying to estimate these parameters better.
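
To illustrate the kind of guess we mean, here is a sketch of an assumed intrinsic matrix; the image size and the focal-length heuristic below are illustrative, not values taken from our system:

```python
# Sketch of an assumed (uncalibrated) intrinsic camera matrix.
# All numbers here are guesses, not measured calibration values.
import numpy as np

width, height = 1024, 768        # example image size in pixels

f = float(width)                 # crude focal-length guess, in pixels
K = np.array([[f, 0.0, width / 2.0],    # focal lengths on the diagonal,
              [0.0, f, height / 2.0],   # principal point assumed to be
              [0.0, 0.0, 1.0]])         # the exact image center

dist = np.zeros(5)               # lens distortion simply assumed away
```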

The attached images show a 2D representation of our point cloud, where the whiter a pixel is, the closer it is to the camera.
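
For reference, an image like that can be produced by scaling the disparity map into an 8-bit grayscale picture; this short sketch assumes the disparity array from the earlier sketch:

```python
# Render the disparity map as a grayscale image: larger disparity
# (closer to the camera) maps to whiter pixels. 'disparity' is the
# array computed in the earlier block-matching sketch.
import cv2

vis = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX,
                    dtype=cv2.CV_8U)
cv2.imwrite("depth_visualization.png", vis)
```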

Step 4: Noise Filtering & Other Issues

As mentioned in the last paragraph of the previous step, we cannot produce particularly reliable or accurate point clouds from these images.  Therefore it is necessary to filter out a lot of the noise created in the previous step.  Several types of noise can occur:

1.  Non-subject areas such as textures on the background are sometimes assigned 3D positions much closer to the camera than they are in reality.  This makes them look like a part of the subject when in fact they are not.
2. Some points have error conditions that cause them to be placed essentially on the camera lens, or infinitely far away.
3.  Many areas on the subject receive Z (depth) values not at all close to the points around them.
4.  Some areas of the subject don't get matched between pictures and thus do not receive a depth value at all, resulting in holes.  

Noise of type 2 can be eliminated trivially, by discarding any point whose depth falls outside a plausible range.

Noise of types 1 and 3 is eliminated in much the same way.  For each point we get in 3D space, we compare it to its neighbors.  If too many of its neighbors weren't assigned a depth, or were assigned a depth significantly different from this point's, the point is determined to be noise and eliminated.
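
Here is a minimal sketch of that filter, assuming the depths sit in a 2D array with NaN marking unmatched pixels; the window size and thresholds are illustrative, not our tuned values:

```python
# Sketch of the neighborhood filter for noise types 1 and 3.
# 'depth' is a 2D array of Z values with NaN for unmatched pixels;
# window size and thresholds are illustrative, not tuned values.
import numpy as np

def filter_depth(depth, window=2, max_dev=0.1, min_valid=0.5):
    """Drop depth values that disagree with their neighborhood."""
    h, w = depth.shape
    cleaned = depth.copy()
    for y in range(h):
        for x in range(w):
            if np.isnan(depth[y, x]):
                continue
            patch = depth[max(0, y - window):y + window + 1,
                          max(0, x - window):x + window + 1]
            valid = patch[~np.isnan(patch)]
            # Too few matched neighbors, or too far from their median:
            # call this point noise and eliminate it.
            if (valid.size / patch.size < min_valid or
                    abs(depth[y, x] - np.median(valid)) > max_dev):
                cleaned[y, x] = np.nan
    return cleaned
```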

Type 4 errors are more difficult to deal with and still pose problems.  

Step 5: 3D Model Creation

At this point we have a somewhat de-noised 3D point cloud that needs to be turned into a 3D model which can be printed.  This area, unfortunately, is proving to be rather difficult and in need of further research and improvement.  We have attempted two different approaches to creating models.

1.  Create a 3D model directly from the point cloud.  There are some academic software packages for turning a point cloud into a triangular mesh, and we have investigated and tried using these to turn our point cloud into a model.  These tools have produced valid 3D meshes, however they are in no way suitable for 3D printing.  The first problem with the meshes produced is that they are very jagged and do not have smooth faces.  This is an artifact of the noise remaining in our point cloud.  The second problem is that the meshes produced are in no way watertight and thus not suitable for 3D printing.  By this I mean that, due to the holes in our point cloud (see previous step, error type 4), we have gaps in the mesh.
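
For illustration only, here is how this point-cloud-to-mesh step can look with the open source Open3D library (not necessarily one of the packages we tried; the file names are placeholders):

```python
# Illustrative point-cloud-to-mesh reconstruction with Open3D.
# 'face_points.npy' is a placeholder for an N x 3 array of points.
import numpy as np
import open3d as o3d

points = np.load("face_points.npy")

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)
pcd.estimate_normals()  # Poisson reconstruction needs normals

# Poisson surface reconstruction yields a triangular mesh; holes in
# the cloud still come out as poorly supported, non-watertight areas.
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=8)
mesh.compute_triangle_normals()
o3d.io.write_triangle_mesh("face_mesh.stl", mesh)
```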

2.  Use an existing 3D model of a face and deform it so that it looks like the subject being photographed.  This method for creating a model is promising but needs to be explored further.  Its primary advantage is that we start with a model that already looks roughly like what we want, and that is watertight and thus suitable for 3D printing.  In order to use the point cloud to deform an existing mesh, we first have to align the point cloud to the mesh, which thankfully is not proving to be too difficult.  The next step, which needs to be explored further, is using the point cloud to deform the mesh.
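
As a rough sketch of one possible deformation scheme (an assumption on our part, not a finished method), each template vertex could be pulled part of the way toward its nearest neighbor in the aligned point cloud:

```python
# Rough sketch of one possible mesh deformation: blend each vertex of
# the template mesh toward its nearest point in the aligned cloud.
# The blending weight is illustrative, not a tuned value.
import numpy as np
from scipy.spatial import cKDTree

def deform(vertices, cloud, weight=0.5):
    """Pull template-mesh vertices toward the aligned point cloud."""
    tree = cKDTree(cloud)              # index the cloud for lookups
    _, idx = tree.query(vertices)      # nearest cloud point per vertex
    return (1 - weight) * vertices + weight * cloud[idx]
```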