Copyright (c) 2003, Rama "cc" Hoetzlein
1. Basic Digitization Theory
2. Construction of the 3D LEGO Digitizer
3. The Scanning Process and Input Data
4. Reconstruction Software
6. Future Directions
7. Current Status
I. Basic Digitization Theory
The 3D LEGO Digitizer uses a technique called laser-strip triangulation.
The fundamental goal of any digitizer is to determine
the three-dimensional coordinates (x,y,z) for each point on the surface of a
real object. The problem is that a still or video camera does not capture depth,
but only color and intensity at each point. Given a photograph of an object
(like the ram above), it is impossible to determine the distance from the camera
to each point in the picture. Human beings sense depth from a photograph using
contextual clues (like familiar objects), depth and shading cues (like perspective),
and past experience (we are familiar with the shape of the human head). However,
if we are shown a picture of the night sky for example (constellation Aries,
above right), there is no way for us to determine just by looking at the picture
the distance of each star (astronomers have other methods of determining the
distance of stars, but laser strip triangulation is not one of them).
Laser strip triangulation solves this problem by using
another constraint in addition to the camera. A laser is used to sweep out a
plane in space so that a "strip" of light falls on the object being
scanned at an angle. While this "strip" of light illuminates the object,
we take a picture of it with the camera. Because the position of the laser-plane
is known (we set up the laser so that we always know where the laser-plane will
be), and because the angle of the camera relative to the laser is known, the
laser-plane provides a constraint that can be used to determine the depth
of the object in the picture.
In a single picture, we see the contour of the object
as the laser traces a single "strip" over the object. Because of the
arrangement of camera and laser, each of these illuminated points in the image
must be in the plane swept out by the laser. Thus the depth of
the strip is known. This information allows us to determine the distance
of each point from the camera. By moving the laser plane and taking a sequence
of pictures, we can capture not just a single strip of the object but the geometry
of the entire object.
With the lights out, the camera captures just the 'strip' which can be analyzed by a computer.
This is the basic theory behind laser strip triangulation.
However, several specific mathematical details have been left out here. The
problem of laser strip triangulation can be stated more precisely as follows:
"Knowing the angle of the camera to the laser plane, the distance from
the camera to the laser, and given a laser-strip picture like the one above,
how would one actually compute the three-dimensional coordinates (x,y,z) of
each of the points in this image?" Feel free to get out some graph paper;
the answer is provided in section IV on Reconstruction Software below.
Or continue to the next section for Construction of the LEGO Digitizer.
II. Construction of the 3D LEGO Digitizer
The theory of laser-strip triangulation in part I describes how a rotating laser beam, set up to sweep out a laser-plane that moves over an object, together with a camera that captures "strips" of the object geometry as the plane moves, can be used (with the proper design and calculations) to reconstruct the geometry of a real object. The construction of the LEGO Digitizer is as follows.
The parts used in this project are:
- The LEGO Mindstorms Kit ($200)
- Helium-Neon Laser (inexpensive ~$150, low-power 0.5 mW, harmless-to-touch, red-light laser from Meredith Instruments laser surplus)
- Webcam (Intel Create & Share camera $100)
- Five small 3/4" x 3/4" square Mirrors ($5)
- Additional LEGOs
- Computer with capture capabilities and a programming environment
Visual Studio C++ was used for the programming environment, and Adobe Premiere was used to capture webcam images. However, neither of these is necessary. Most webcams ship with basic capture software, and any programming language that can load a sequence of images will do.
The main problem in construction, as defined in part I, is to provide a plane of laser light that not only sweeps out a "strip" across the object, but also moves through space to capture the many "strips" that will be used to reconstruct the complete object. This gives us two problems: 1) Given a beam of light (from a laser point source), how do we spread the beam into a plane? 2) Given a plane of laser light, how do we move this plane through space and over the object?
The overall design of the LEGO Digitizer consists of
a carriage, a track, and mounting points for the laser and camera. The carriage
sits in the track and moves along it horizontally. Parallel to the track, and
behind it, the laser tube is mounted. Two mirrors at 45 degs. are used to point
the laser beam directly along the track and into the carriage. When the beam
enters the carriage, a third 45 deg mirror shoots the beam vertically. Immediately
above this mirror, a fourth mirror is mounted on a rotating shaft and aligned
so that when the beam enters from below it directs the beam radially in a vertical
plane perpendicular to the track. This whole carriage assembly (with rotating
mirror) moves along the track, thus providing a way to move the laser plane
through space (ie. along the track). Because the beam is directed so that it
goes right down the center of the track before entering the carriage, we can
guarantee that as the carriage moves the laser beam will always enter it from
the same direction. This provides a consistent, moving plane of light such that
each light-plane is parallel yet slightly offset from the previous one.
These pictures show the carriage and the rotating mirror,
which casts the laser beam (entering from below) into a vertical plane. A motor
turns the mirror at the shaft using a rubber band attached to Lego discs which
go from large-to-small. I tried gears at first, but found that the motion was
not smooth enough. The discs go from large-to-small so that the rotating mirror
sweeps out a full plane very quickly. This is necessary because each strip must
be swept out in less than 1/30 sec. so that the camera captures only a single
strip per frame.
The next picture shows the two 45 deg. mirrors used
to transmit the beam down the track. The mirrors are mounted on what I call
LEGO "alligator" pieces (large 4x4 hinges). This provides accurate
adjustment of the beam (well, as accurate as you can achieve with LEGOs anyway!).
One problem with this design is that the hinges provide adjustment horizontally
only, while the beam could still be misaligned up-and-down. I found that just
by "pressing" on the LEGOs (ie. making them tighter) in the right
places I could adjust vertical alignment of the optics very easily.
A gear box with a coil drive and gear-down drive is
used to move the carriage smoothly along the track. The motor is geared way
down (ie. made very slow) using both a coil drive and small-to-large gears so
that the carriage moves very slowly. Unlike the rotating mirror, which must
sweep out a strip in <1/30 sec, we want the carriage to move slowly enough
that "strips" on the object are spaced fairly closely together. This
gives a more accurate scan. Thread spools go to the carriage from either side
and are wound in opposite directions, much like how a printer head works. As
one side is pulled in, the other side is let out. A rubber band on the 'let out'
side provides tension so that the carriage motion is smooth. Smooth motion of
the carriage turned out to be one of the most important factors in getting a
good scan. I found that lubricating the track with a bit of water before each
scan was the best solution. Although it dries quickly, it's a lot better than
getting grease or some other oil all over my Legos.
Finally, the keen observer will notice that LEGO does
not produce Lego optics (why not, I ask?! :). To build the four mirrors needed
for this project, I cut five 3/4" x 3/4" mirrors from a larger mirror
and, using a bit of glue, attached them to standard Lego
pieces. Three of these involved simply attaching a mirror to a 2x2 flat Lego
piece. The rotating mirror was created by taking a LEGO drive shaft, sanding
down the sides of it on 3/4" of one end, and gluing two mirrors back-to-back
with a bit of cardboard to fill the shaft space.
The entire thing (track, carriage, and RCX unit) is attached to a solid base with space-age stanchions so that the digitizer stands up off the table a bit. The only other pieces are the laser itself and the camera.
The laser is an old Helium-Neon laser
I bought while I was in high school. It is an inexpensive (~$150), low-power
(0.5 mW), harmless-to-touch laser that provides a bright, confined beam. These lasers
are relatively easy to acquire on-line (Meredith Instruments, http://mi-lasers.com).
Look at complete HeNe systems (~$250), or put together a laser from a HeNe tube
and power supply like I did (~$150). The only safety concern is that you should
not look into the beam (!). But there is no danger in touching it. For the purposes
of the digitizer, a pen laser-pointer may even work if it's bright enough.
A video camera is needed so that a sequence of images can be obtained (a still digital camera won't work). For this project, an Intel Create & Share Webcam was used ($100). I could have used the Lego camera that comes with the LEGO Mindstorms kit, but I found that interfacing my computer to the Intel Webcam was much easier (using Adobe Premiere). Today, inexpensive webcams often support resolutions of 640x480 which is enough to provide a fairly good scan. I've found that the limiting factor in scan quality is not the resolution of the camera, but the accuracy and smoothness of the motion of the Legos.
III. The Scanning Process and Input Data
Here is an image of the 3D LEGO Digitizer in the process
of scanning a clay model head (with the lights out).
The steps in digitizing an object are these:
1) Lubricate the track with water so the carriage moves smoothly.
2) Place the carriage at the 'starting' position.
3) Point the camera at the object and zoom in until the field of view covers the object.
4) Measure the distance from the carriage to the camera (using a tape measure). (This is needed for the mathematics described in the next section.)
5) Measure the distance from the camera to the object, and measure the distance across the object that the camera sees. (Also needed for the math.)
6) Turn on the laser, making sure the beam is properly aligned with the carriage.
7) Turn on the Lego RCX.
8) Start the camera capture using Adobe Premiere.
9) Press the 'Run' button on the RCX. This starts both the rotating mirror and the carriage-drive motor turning.
10) Wait 40 secs for the scan to complete, then turn everything off.
That's it: the LEGO digitizer scans in about a minute. I found that the last step is important. If I don't turn the RCX off in time, the carriage hits the end of the track and the motor starts to pull the pieces apart (it's geared way down to make it move slowly, so it has enough power to do this).
A close-up of the model shows how the digitizer traces
a vertical "strip" on the object. I first started making a 3D LEGO digitizer
early in 2002 with a horizontal-scan digitizer that had two vertical 'columns'
and a 'bridge' that was raised using a set of motors. I did a few scans of my
head with this device with varying degrees of success (I hope to dig out some
older photos of this project and post them here soon). However, this design was
very inaccurate because it was too
difficult to make the motion against gravity even on both ends of the bridge.
The solution was to move the carriage horizontally and have the strips
traced out vertically.
The digitizer scans an object in roughly 40 seconds (this is the time it takes the carriage to move the full length of the track). Every 1/30th of a second, the camera captures an image of the laser 'strip' on the object. Based on the distance the carriage travels (8.2 inches in 40 secs), we can calculate that the carriage is moving at roughly 0.205" per sec. Because the camera captures 30 frames per second, in theory each 'strip' will be about .00683" (.205" per sec / 30 frames per sec) apart. This is a horizontal spacing of about 0.17 mm, i.e. on the order of 0.1 mm. However...
The accuracy is also limited by the camera resolution. The camera is set up so that the object to be scanned fills as much of the view as possible. This increases the resolution by providing more pixel coverage of the object. Thus, with a camera resolution of 640x480, and a field of view that covers 8" of the object (from one side of the frame to the other), we can calculate the accuracy as .0125" (8" / 640 pixels). This is an accuracy of about 0.3 mm, which is coarser than the strip spacing. However...
Unfortunately, there is another limitation. Looking at the object while scanning, I noticed that the laser-plane does not scan smoothly across the object but shifts back-and-forth with a very slight vibration. This was traced back to the rotating mirror. In gluing the mirrors back-to-back on the Lego axle, it was impossible to have them perfectly centered. Since the mirror is set rotating very quickly (faster than 1 rotation every 1/30 sec), this slight misalignment sets up a small vibration in the carriage. Since the beam is reflected out to the object, any inaccuracies are amplified over the distance the beam travels. The result is a very minor vibrational shifting of the light-plane as the carriage moves. It should be noted that these vibrations are undetectable except by watching the laser strip shift back-and-forth ever so slightly. I minimized this by using a rubber band to smooth out the motion, and by keeping the mirrors as close to the rubber band as possible. The actual accuracy appears to be about 1 mm. This is enough to capture fine facial details. Not too bad considering it scans in 40 secs and is made out of LEGOs!!
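As a quick check of the two theoretical limits worked out above, here is a small C++ sketch. The function names are illustrative assumptions and are not taken from the digitizer software; the inputs are the figures quoted in the text (8.2" of track in 40 secs at 30 frames per second, and 8" of view across 640 pixels).

```cpp
#include <cassert>

// Theoretical spacing between successive laser strips, in inches:
// carriage speed (track length / scan time) divided by the frame rate.
double stripSpacingInches(double trackInches, double scanSeconds, double fps) {
    assert(scanSeconds > 0.0 && fps > 0.0);
    return (trackInches / scanSeconds) / fps;
}

// Size of one pixel on the object, in inches, when the image width
// spans viewInches of the scene.
double pixelSizeInches(double viewInches, int imageWidthPixels) {
    assert(imageWidthPixels > 0);
    return viewInches / imageWidthPixels;
}

// Convert inches to millimeters (1 inch = 25.4 mm exactly).
double inchesToMm(double inches) {
    return inches * 25.4;
}
```

With the figures above, stripSpacingInches(8.2, 40, 30) comes to about .0068" (roughly 0.17 mm) and pixelSizeInches(8, 640) to .0125" (roughly 0.32 mm). Both are well below the ~1 mm observed in practice, which fits the observation that the mechanics, not the camera, limit the scan.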
The following sequence shows what the camera actually captures:
Each image contains a single 'strip' contour of the
object at a specific time. Another problem I discovered is that some frames
are completely blank. This is because the camera capture rate and rotating mirror
are slightly out of sync. The camera captures at 1/30th sec, and the beam traces
a strip a bit faster than this. However, because these two devices have different
frequencies, and because they are not synchronized by anything, there are moments
when the beam has not yet "stripped" the object, but the camera captures
an image anyway. I found I could almost completely eliminate this problem by
making the laser beam rotate as-quickly-as-possible by gearing up (down?) the
rotating mirror axle. A few frames still come in blank when the timing is just
right - these frames are just ignored.
Here is a single, full frame from the camera: (original image except for being reduced in size)
It is important to have the lights off during digitization
(so the only thing visible to the camera is the laser).
I've discovered that I can also scan live people, so long as they hold very still, and so long as they keep their eyes closed! Eyelids will generally provide enough eye protection from a 0.5 mW HeNe laser, but please don't try this at home as it depends on the power of the laser and the thickness of your eyelids. I've also used black or white tape over the eyelids to provide additional protection.
DISCLAIMER: This should go without saying, but please don't point lasers at your eyes. Low power (<2 mW) HeNe lasers are almost always harmless to touch, but may cause eye damage depending on the type and power of the laser. I am not responsible for how you use your laser. If you purchase a laser, you must take the necessary precautions and be responsible with it. Do not look into the beam.
IV. Reconstruction Software
Given the sequence of frames captured in the last section, the final step is to take these images into a computer program that calculates the three-dimensional coordinates (x,y,z) of all the points of the object. This is the problem as described above: "Knowing the angle of the camera to the laser plane, the distance from the camera to the laser, and given a laser-strip picture like the one above, how would one actually compute the three-dimensional coordinates (x,y,z) of each of the points in this image?"
Here is how we set out to solve the problem (this first stuff is just setting up what we need):
* To know the angle of the camera to the laser plane, we just have to measure this angle (in the real world).
* To know the distance from the camera to the laser,
we have to keep in mind that there are several frames coming in and each frame
has a different distance from the camera to the laser. Thus, for this piece,
we should really be asking: "Given a frame that was captured at time t,
at what position was the laser at this time?". To determine this, we simply
note that the laser carriage is moving at a certain rate (.2051" per sec
as computed in the previous section), from a known starting location. We call the
starting position Lx(0) - laser position on the x-axis at time 0. We call the
speed of the laser Lx' - laser speed along the x-axis.
We can compute the laser position at some time t using
the starting position, the speed of the carriage, and the amount of time that
has elapsed:
Lx (t) = Lx (0) + Lx' * t
The position of the laser at time t is calculated from
its starting position, and its motion over a given time. Since we want to know
what the laser position is for each frame of the camera, the times t that we're
interested in are based on the frame number f, and the frame rate (FPS) of the
camera:
t (time in seconds) = f (frame number) / FPS (frames per second)
Plugging this in, a more useful formula gives us the position of the laser (in inches from the start location) for any frame number f.
Lx (f) = Lx(0) + Lx' * f / FPS
* One other thing we will need is the field of view (FOV) of the camera. This will become necessary for the reconstruction calculation itself. The FOV of a camera can easily be measured by pointing the camera at an object, measuring the distance from the camera to the object, adjusting the zoom of the camera as necessary (this changes the FOV), and then measuring across the object the amount of tape measure that is visible in the camera view. This sets up a right triangle from which we can compute the field-of-view (angle) of the lens.
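The two setup quantities above, the laser position for a given frame and the camera field of view, can be sketched in C++ as follows. This is a minimal illustration with hypothetical function names, not the code of the Laser or Camera classes described later. It assumes the carriage moves at constant speed, and that the tape measure is centered in the view and perpendicular to the camera axis, so that half the visible width and the camera distance form the legs of the right triangle.

```cpp
#include <cassert>
#include <cmath>

// Laser-plane position along the track (inches from the start location)
// for a given frame number: Lx(f) = Lx(0) + Lx' * f / FPS.
double laserPositionInches(double startInches, double speedInchesPerSec,
                           int frame, double fps) {
    assert(fps > 0.0 && frame >= 0);
    double t = frame / fps;  // time in seconds at which the frame was captured
    return startInches + speedInchesPerSec * t;
}

// Full field-of-view angle (radians) of a camera that sees visibleWidth
// units of tape measure at the given distance (same units for both).
double fieldOfViewRadians(double visibleWidth, double distance) {
    assert(visibleWidth > 0.0 && distance > 0.0);
    return 2.0 * std::atan((visibleWidth / 2.0) / distance);
}

// Convert an angle from radians to degrees.
double radiansToDegrees(double radians) {
    return radians * 180.0 / std::acos(-1.0);  // acos(-1) is pi
}
```

For example, at .205" per sec and 30 frames per second, frame 1200 (the end of a 40 sec scan) puts the laser about 8.2" from the start, matching the track length quoted earlier.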
Now that everything is set up, we have preliminary calculations,
and we have an image - how do we actually compute the points?
First, we scan through the image to find the laser 'strip'. As the strip is vertical, it makes sense to march through the image vertically, then scan across each row until we find the boundary of the strip. We can say that we have "found" a point as soon as the intensity of the pixel goes over some threshold. Once we have a point on the image, we can say the following: "I know that this point lies on the object (because the laser hit the object there), I know that this point lies in the plane of the laser (because the laser swept out this strip), and I know this point lies along a ray from the camera that also passes through this x,y coordinate on the image (because this is where I found the point in the image)."
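The strip search described above might look something like the following C++ sketch. It is not the original program: the function name is hypothetical, and the frame is assumed to be an 8-bit grayscale image stored row by row, with the threshold chosen by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// For each image row, return the column of the first pixel whose intensity
// exceeds the threshold, or -1 if the row contains no laser light.
// The image is assumed to be 8-bit grayscale stored row by row.
std::vector<int> findStripColumns(const std::vector<std::uint8_t>& pixels,
                                  int width, int height, int threshold) {
    assert(width > 0 && height > 0);
    assert(pixels.size() == static_cast<std::size_t>(width) * height);
    std::vector<int> columns(height, -1);
    for (int y = 0; y < height; ++y) {           // march down the image
        for (int x = 0; x < width; ++x) {        // scan across the row
            if (pixels[static_cast<std::size_t>(y) * width + x] > threshold) {
                columns[y] = x;
                break;  // stop at the boundary of the strip
            }
        }
    }
    return columns;
}
```

A frame in which every entry comes back as -1 contains no strip at all, which is one simple way to recognize the blank frames mentioned in section III and skip them.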
We set up a virtual camera, a virtual laser and a virtual object. We set the field-of-view of the virtual camera, the angle of the camera, and the position and orientation of the laser to match the conditions in the real world.
Then, once we find an x,y point that has been illuminated by the laser, we create a line which starts at the virtual camera position and passes through the x,y point on the camera plane (imagine the camera plane as a tiny image sitting a small distance away from the center of the camera). This process is called 'inverse projection'. Finally, we intersect this line with the plane of the laser for the frame we are working with. This gives us a virtual point (x,y,z) which must lie on the digitized object. Repeat this process for each frame and we can reconstruct the entire object.
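The final intersection step can be sketched in C++ as below. This is an illustration under stated assumptions rather than the original code: the x-axis is taken to run along the track, so the laser plane for a frame is simply the vertical plane x = Lx(f), and the names Vec3 and intersectLaserPlane are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { double x, y, z; };

// Intersect the ray origin + s * dir (s >= 0) with the vertical laser plane
// x = planeX. Returns false when the ray is parallel to the plane or the
// plane lies behind the camera; otherwise writes the 3D point to hit.
bool intersectLaserPlane(const Vec3& origin, const Vec3& dir,
                         double planeX, Vec3& hit) {
    assert(dir.x != 0.0 || dir.y != 0.0 || dir.z != 0.0);  // need a direction
    if (std::fabs(dir.x) < 1e-12) return false;   // ray parallel to the plane
    double s = (planeX - origin.x) / dir.x;       // parameter along the ray
    if (s < 0.0) return false;                    // intersection behind camera
    hit.x = planeX;
    hit.y = origin.y + s * dir.y;
    hit.z = origin.z + s * dir.z;
    return true;
}
```

Here origin is the virtual camera position and dir is the inverse-projected ray for one illuminated pixel. Calling this once per illuminated pixel per frame, with planeX set to the laser position for that frame, yields the reconstructed points.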
Pseudo-code (program steps) for the entire process is relatively simple. Here it is:
One step that I don't elaborate on here is the inverse-projection
calculation. The problem is: Given a camera with a specific orientation in space,
and a point in an image created by that camera, find the ray in space that goes
through the center of the camera and also through that point on the camera plane.
(Note: OpenGL's GLU library provides a function that will do this calculation for you - it's
called gluUnProject.) There are several good computer graphics books that cover
camera projection and inverse projection (Computer Graphics: Principles and
Practice, 3D Computer Graphics, etc.).
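For readers who want a starting point, here is a simplified C++ sketch of inverse projection for an idealized pinhole camera. It is not the Camera class from this project and not a replacement for gluUnProject. It assumes a camera at the origin looking down the -z axis with +y up (the OpenGL convention), square pixels, and a known horizontal field of view; the names Ray3 and pixelToCameraRay are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Ray3 { double dx, dy, dz; };

// Direction of the ray through image point (px, py) for a pinhole camera
// at the origin looking down the -z axis with +y up. px runs 0..width left
// to right, py runs 0..height top to bottom, and fovX is the horizontal
// field of view in radians.
Ray3 pixelToCameraRay(double px, double py, int width, int height,
                      double fovX) {
    assert(width > 0 && height > 0 && fovX > 0.0);
    double halfW = std::tan(fovX / 2.0);      // half-width of image plane at z = -1
    double halfH = halfW * height / width;    // square pixels assumed
    Ray3 r;
    r.dx = (2.0 * px / width - 1.0) * halfW;  // -halfW .. +halfW
    r.dy = (1.0 - 2.0 * py / height) * halfH; // image y grows downward
    r.dz = -1.0;
    return r;
}
```

The resulting direction is in the camera's own coordinate frame. For a camera that is turned toward the object, the direction would still need to be rotated by the camera orientation before being intersected with the laser plane.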
The code to reconstruct the object was written in C++ using OpenGL for output. It consists of several classes which work together to achieve the tasks described above. Here are the important pieces of the program:
- Images: An Image and Movie class were written to load
a series of TIF images from files on disk. Once a movie is captured in Adobe
Premiere, I export it as a TIF sequence, and then use the Image and Movie classes
to load each image into my program.
- Camera: A Camera class was written that does both projection and inverse-projection (I wrote this myself just for the heck of it). The camera class sets up the virtual camera, and allows you to specify its position, orientation and field-of-view.
- Laser: A Laser class computes the position of the laser plane given a frame f and the frame rate (FPS) of the camera.
- Digitizer: The Digitizer class takes an Image, Camera and Laser class and performs the pseudo-code process described above. The result is a set of Points which are the digitally reconstructed object.
- Points: The points are stored as a vector (ie. list) of 3D points. (for those of you who know STL.. this is just std::vector<Vector3DF>, where Vector3DF is a struct that holds three x,y,z doubles)
The result is what is called a 'point cloud' - a set of 3D points in space. It is important to note that this 'point cloud' does not contain any surface geometry. The object surface is present in the location of the points, but there are no "polygons" which connect the points.
The 3D LEGO Digitizer was tested on a smaller-than-lifesize clay model (about 10" high including hair).
In the following images, you can see the digitally reconstructed model (point cloud) from several different angles:
In these images you can clearly see the vertical strips
of the laser. There are occasional gaps in these strips because some captured frames
were blank (due to a sync problem between the laser and the camera capture rate).
The scanner did a pretty good job of capturing the hair. One important thing
to note is that there is no data wherever the camera cannot "see"
the laser. For example, on the left side of the nose, the laser beam hits the
nose but the tip of the nose gets in the way so the camera cannot see the laser
from this angle.
Here is an image of the software actually performing the digitization calculation:
To compare the original model to the reconstruction,
I took the reconstructed points, viewed them from the same angle as the original
camera, and superimposed these with a picture of the clay model from the same
angle. The results show that the real object and the digitized object match
up very well!
I have recently scanned a live human head with similar
results (and eyes closed). More results to be posted shortly..
VI. Future Directions
The 3D LEGO Digitizer gives accurate (1 mm), fast (40 second) digitizations of real world objects at very low cost.
Several improvements have come up, either from myself trying to improve the quality of the scans, or through friends offering suggestions for various problems. Some of these improvements have been implemented, others are in the works. Here is the list so far:
- Data Averaging - When scanning horizontally across an image, there may be several points in the strip that are above the threshold. This is because the laser as it appears on the object frequently covers more than one pixel. This improvement involves detecting all the points in a single line that are above the threshold, and taking the average of their positions in the x direction. The results above include this improvement, which gives a much more accurate reconstruction.
- Two Cameras - Two cameras could be used to capture the object without any missing "shadow" gaps in the data. This involves placing a camera at +14" and -14" both aimed at the object. The problem is how to capture images from both cameras simultaneously.
- Rotating the Object - Another suggestion for capturing the >entire< object, without any "shadow" gaps, is to rotate the object to specific angles and rescan. The resulting data sets would then be superimposed to give a reconstruction that is a complete 360 deg. model.
- Real-time Capture - At present, I first do a scan with Adobe Premiere (40 seconds), then convert the data from an AVI to a TIF sequence (12 minutes), and finally construct the digital model using the digitizer software (1 minute). With Microsoft's Multimedia API interface, it may be possible to retrieve the data and construct the digital model in real-time.
- Polygon-Relaxation - Unfortunately, the laser-strip triangulation technique does not output a 3D polygonal mesh, but a point-cloud. The drawback of a point-cloud is that there is no real "surface". Polygon-relaxation would use a mesh cloth model to essentially form a polygonal mesh over the point cloud. This would allow shading and lighting, and would make the digital model much more realistic.
- Surface Texturing & Bump Mapping - Once a polygonal mesh is constructed (by polygon-relaxation for example), it will be possible to take a front-on photograph of the object and apply the photograph as a texture to the reconstructed model. By placing a light source at 90 degs. from the surface of the real object, another photograph could be taken that would be a good bump map. These two techniques should produce the most realistic reconstructed object yet.
- 3D Max Converter - Once a polygonal mesh is constructed, the possibility of saving this data in ascii format and loading the data in 3D Studio MAX using a MaxScript becomes available. This would allow the mesh to be imported into 3D Studio Max and then used for any number of modeling and artistic projects. (this is my final goal in this project)
- Track-End Sensors - Track end sensors could be placed at the end of the track so that, when the carriage arrives at the end of the track, it trips a LEGO switch, which causes the RCX program to stop or reverse. This would allow the digitizer to continuously scan over the object repeatedly without pulling apart the Legos when the end of the track is reached.
- Fully-Automated Scanning - In connection with Track-End Sensors, fully automated scanning is the idea of writing a C++ program to interface with the infrared port to control the RCX while scanning. Several web sites on-line demonstrate how to control the RCX from C++ via the COM port and the infrared interface. By combining this with Real-time capture and Track-End sensors, the entire digitization process could be done automatically. Just place the object in front of the LEGO Digitizer, press the 'Digitizer' button, and the computer would automatically start the RCX, start capturing images from the camera, rotate the object to get all angles of the object, and convert the captured images into a reconstructed digital model - all in real-time.
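The Data Averaging improvement in the list above is easy to illustrate. The following C++ sketch is not the original implementation: the function name is hypothetical, and it assumes one image row is given as 8-bit grayscale values. It averages the x positions of all pixels above the threshold instead of stopping at the first one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average column of all pixels in one image row that exceed the threshold.
// Returns false (leaving center untouched) when no pixel qualifies.
bool stripCenter(const std::vector<std::uint8_t>& row, int threshold,
                 double& center) {
    assert(threshold >= 0);
    double sum = 0.0;
    int count = 0;
    for (std::size_t x = 0; x < row.size(); ++x) {
        if (row[x] > threshold) {
            sum += static_cast<double>(x);  // accumulate column positions
            ++count;
        }
    }
    if (count == 0) return false;
    center = sum / count;  // sub-pixel estimate of the strip position
    return true;
}
```

Because the result is a fractional column, the strip position is no longer limited to whole pixels, which is one plausible reason the averaged results look more accurate.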
VII. Current Status
Thanks to my geek friends in Ithaca, NY who made this project possible. Take a look at our new Ithaca hackerspace: http://www.ithacagenerator.org
I am currently in San Francisco working in computer graphics.
If you have any questions or comments about this project please feel free to contact me at firstname.lastname@example.org