Insight VR

Laser Tracking with NumPy and PySight (and V4L too!)

by john on Apr.06, 2008, under Lasers, Python, pySight

In my PyCon presentation (the video is up!) I mentioned the difficulty of processing the pixels fast enough. In a 640×480 bitmap there are 307,200 pixels and the iSight is trying to pump them out at 30 frames per second. I did a few things to speed things up. For instance, I limited the area of the bitmap that I scanned based on calibration data, I scanned only every fourth pixel (every other pixel of every other row), and I processed only every fifth frame. While this worked well enough, clearly it was not ideal.
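As a rough illustration of that sampling pattern, here is a minimal sketch written with NumPy slicing. It is not the original pre-NumPy loop; the function name `subsample_region`, the region bounds, and the synthetic frame are all assumptions made for the example.

```python
import numpy as np

def subsample_region(frame, top, bottom, left, right, step=2):
    """Return every `step`-th pixel of every `step`-th row inside a region.

    `frame` is assumed to be a (rows, cols, channels) array; the bounds
    stand in for limits that would come from calibration.
    """
    return frame[top:bottom:step, left:right:step]

# Synthetic 480x640 RGBA frame standing in for a camera frame.
frame = np.zeros((480, 640, 4), dtype=np.uint8)
region = subsample_region(frame, top=100, bottom=300, left=200, right=500)

# 200 rows x 300 columns shrink to 100 x 150, a quarter of the pixels.
print(region.shape)
```

With a step of 2 in both directions only one pixel in four is visited, which is the "every other pixel of every other row" pattern described above.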

The day before my presentation I was speaking with Travis Vaught at the Enthought booth, asking how they did their multi-touch stuff in Python. Well, it turns out that they don’t do all of it in Python, but Travis suggested that I try NumPy for the image processing. I had some concept of what NumPy could do because of some brief use of it at work, but I hadn’t really delved into it.

So during the lightning talks on Friday I was feverishly writing NumPy code with the hope of getting it running for my demo at 9am on Saturday. At about 1 am Matt and I were finally convinced that it worked and went back to the hotel room, where I built another pair of IR glasses until 2 am.

This did not leave time for altering my presentation, so I left it out. I did mention it briefly after demoing the 3D game, but I would guess that many listeners missed it.

In any case, I basically reimplemented the method I was using to filter for red pixels in NumPy, but now the system scans every pixel of every frame. Also, rather than returning a single point where the laser is “seen” it returns a list of points if there are multiple lasers.

The basic process is that I first look at just the R value of the RGB tuple. You can filter out pixels that don’t have a high enough R value very quickly.
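A minimal sketch of this first pass, assuming the frame has already been reshaped into an n x 4 array of RGBA rows. The helper name `candidate_indices` and the sample pixels are made up for illustration; the cutoff of 180 is the one used in the listing further down.

```python
import numpy as np

RED_THRESHOLD = 180  # same cutoff as the full listing in this post

def candidate_indices(pixels, red_column=0, threshold=RED_THRESHOLD):
    """Indices of rows in an (n, 4) pixel array whose red value exceeds threshold."""
    reds = pixels[:, red_column]          # one column, no arithmetic
    return np.flatnonzero(reds > threshold)

pixels = np.array([
    [255, 255, 255, 255],  # white: passes, red channel is high
    [250,  10,  10, 255],  # laser-like red: passes
    [ 20, 200,  30, 255],  # green: rejected
    [181,   0,   0, 255],  # just over the cutoff: passes
], dtype=np.uint8)

idx = candidate_indices(pixels)
print(idx)
```

Note that white pixels survive this step because their red channel is high too; they are only weeded out by the redness calculation that follows.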

Once I have a list of candidate pixels I look at them in more detail, calculating their redness.

I define redness as:

redness = R*2 - G - B

Thus a white pixel would have a redness of zero. A pure red pixel would have a redness of 510 (255*2-0-0). Again, NumPy makes this process fast, but it isn’t nearly as fast as the first filter due to the fact that it includes a multiply and two adds. In fact, I tried to run it without the first filter (which doesn’t affect results, only speed) and it slowed things down a lot.
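The arithmetic can be checked with a small sketch. The names `REDNESS_WEIGHTS` and `redness` are assumptions for this example. The cast to a signed integer type matters because camera bytes are unsigned 8-bit values, and R*2 can reach 510, which does not fit in 8 bits.

```python
import numpy as np

# Weights for RGBA-ordered pixels: 2*R - G - B + 0*A
REDNESS_WEIGHTS = np.array([2, -1, -1, 0])

def redness(pixels):
    """Redness of each row in an (n, 4) uint8 RGBA array."""
    # Cast first so the multiply cannot wrap around in uint8.
    return pixels.astype(np.int32) @ REDNESS_WEIGHTS

sample = np.array([
    [255, 255, 255, 255],  # white
    [255,   0,   0, 255],  # pure red
    [200,  50,  30, 255],  # reddish
], dtype=np.uint8)

print(redness(sample))
```

The white pixel scores 0, pure red scores 510, and the reddish pixel scores 2*200 - 50 - 30 = 320.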

Finally I sort the list of pixels by redness and then begin putting those pixels in a results list. But if a pixel is within about 15 pixels of a pixel already in the results list it is assumed to be part of the same laser and excluded. Currently I am also capping the return list to a length of 5, which also helps with performance in some instances.
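Here is a minimal sketch of that selection step in isolation, assuming the redness values and pixel coordinates are already available as parallel arrays. The function name `pick_laser_points` and the sample data are hypothetical; the squared distance of 200 (about 14 pixels) and the cap of 5 follow the description above.

```python
import numpy as np

def pick_laser_points(redness, xs, ys, min_dist_sq=200, max_points=5):
    """Greedily keep the reddest pixels that are not near an already kept one."""
    order = np.argsort(redness)[::-1]     # indices from reddest to least red
    picked = []
    for i in order:
        if len(picked) >= max_points:     # cap the number of tracked dots
            break
        x, y = int(xs[i]), int(ys[i])
        # Compare squared distances so no square root is needed.
        if all((x - px) ** 2 + (y - py) ** 2 >= min_dist_sq for px, py in picked):
            picked.append((x, y))
    return picked

redness = np.array([500, 480, 450, 400])
xs = np.array([100, 103, 300, 305])
ys = np.array([100, 102, 200, 200])

points = pick_laser_points(redness, xs, ys)
print(points)
```

The second and fourth pixels sit within a few pixels of a redder neighbor, so they are treated as part of the same laser dot and only two points come back.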

Here’s the code as it currently stands, with what I hope are sufficiently copious comments.

#! /usr/bin/env python -t
'''
isightlaser

scans for laser and manages list of laser hits
This is a poorly named class as it works with a variety
of webcam inputs, not just the iSight

'''

import numpy

# The cameras we've tested so far all have 640x480 resolution
CAMERA_MAX_X = 640
CAMERA_MAX_Y = 480
NUM_PIXELS = CAMERA_MAX_X * CAMERA_MAX_Y

# for RGBA data this is the matrix we multiply by to get redness
MUL_ARRAY = numpy.array([2,-1,-1,0])

# some libraries return bitmaps as BGRA ordered data, so
# this is the matrix we use to get redness
BGRA_MUL_ARRAY = numpy.array([-1,-1,2,0])

class IsightLaser(object):

    def __init__(self, top=0, bottom=480, left=640, right=0, game_xy=(1024,768)):
        self.queue = []
        self.top = top
        self.bottom = bottom
        self.left = left
        self.right = right
        self.game_xy = game_xy
        self.calibrate = True

    # called when game surface is sized or resized
    def set_game_xy(self, xy_pair):
        self.game_xy = xy_pair

    def pop(self):
        if self.queue:
            return self.queue.pop()
        else:
            return None

    def has_elements(self):
        return self.queue

    def push(self, xy_pair, flip_y=False):
        # any pixels seen during calibration are used to define
        # the limits of what the camera can see
        if self.calibrate:
            print("calibrating")
            x,y = xy_pair
            if y > self.top:
                self.top = y
            if y < self.bottom:
                self.bottom = y
            if x > self.right:
                self.right = x
            if x < self.left:
                self.left = x

        else:
            trans_pair = self.translate(xy_pair, flip_y)

            self.queue.append(trans_pair)
            print("%s %s %s %s %s" % (self.top, self.bottom, self.left, self.right, self.game_xy))

    def translate(self, xy_pair, flip_y=False):
        # translate a point from camera coordinates to game world coordinates
        # (integer division keeps the results as whole pixels)
        x = self.game_xy[0] * (xy_pair[0] - self.left)//(self.right-self.left)

        # some cameras have their origin in the upper left, others in the lower left
        if flip_y:
            y = self.game_xy[1] * (xy_pair[1] - self.top)//(self.bottom - self.top)
        else:
            y = self.game_xy[1] - self.game_xy[1] * (xy_pair[1] - self.top)//(self.bottom - self.top)
        #print xy_pair , x,y
        return (x,y)

    def start_calibrate(self):
        self.calibrate = True
        self.top = 0
        self.bottom = 480
        self.left = 640
        self.right = 0

    def stop_calibrate(self):
        print("STOP CAL")
        self.calibrate = False

    def process_buffer(self, image_buffer, rgba=4, flip_y=False, bgra=False):
        # convert from buffer type to numpy array
        flat_array = numpy.frombuffer(image_buffer,numpy.uint8)

        # convert from a (640*480*4) x 1 matrix to a (640*480) x 4 matrix
        image_array = numpy.reshape(flat_array,(NUM_PIXELS,rgba))

        # filter out one column of the matrix in order to just get red values,
        # and pick the matching weights; a local name is used so that a BGRA
        # call does not overwrite the module-level MUL_ARRAY
        if bgra:
            reds = image_array[:,2]
            mul_array = BGRA_MUL_ARRAY
        else:
            reds = image_array[:,0]
            mul_array = MUL_ARRAY
        #print  "image array:", image_array
        #print "reds", reds

        # Filter out any red value of 180 or less (should be adaptive later)
        mask = numpy.greater(reds,180)
        #print mask

        # get indices of pixels with a red value over 180
        original_indices = numpy.array(mask.nonzero())[0]
        #print "original_indices", original_indices

        # pixel_list is an n x 4 matrix of candidate pixels
        pixel_list = numpy.array(image_array[original_indices])
        #print "pixel list: ", pixel_list

        # next two lines multiply and then sum in order to get
        # redness = 2*R - G - B + 0*A
        pixel_list = numpy.multiply(pixel_list,mul_array)
        #print "after mul", pixel_list
        pixel_list = pixel_list.sum(axis=1)
        #print "after sum", pixel_list

        # filter for redness over 300
        pixel_indices = numpy.greater(pixel_list,300).nonzero()
        #print "pixel ind", pixel_indices
        red_pixels = numpy.array(pixel_list[pixel_indices])
        #print "red pixels", red_pixels

        # Work back to get the index of pixels that pass both
        # filters
        original_indices = original_indices[pixel_indices]
        #print "original_indices", original_indices

        # create an n x 2 matrix with redness values and indices
        combined_array = numpy.column_stack((red_pixels,original_indices))
        #print "ca", combined_array

        # sort rows by redness (column 0); sorting with axis=0 would sort
        # each column separately and detach indices from their redness
        combined_array = combined_array[combined_array[:,0].argsort()]
        #print "sorted ca", combined_array
        clean_list = []

        # take reddest pixel first, put it in the list
        # then take next reddest pixel and put it
        # in the list if it isn't too close (sqrt(200) pixels)
        # to a pixel already in the list
        for pixel in combined_array[::-1]:  # this reverses the order from the sort
            #print "pixel:", pixel
            if len(clean_list) >= 5: #limits total number of dots tracked
                break

            x = pixel[1] % CAMERA_MAX_X
            y = pixel[1] // CAMERA_MAX_X

            add_elem = True
            for clean_elem in clean_list:
                if ((x-clean_elem[0])**2 + (y-clean_elem[1])**2) < 200:
                    add_elem = False
                    break
            if add_elem:
                clean_list.append((x,y))

        #print clean_list
        for elem in clean_list:
            self.push((elem[0],elem[1]), flip_y)

laser_singleton = IsightLaser()
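
The listing notes that the 180 cutoff "should be adaptive later". A minimal sketch of one way that might work is shown here, under the assumption that the goal is simply to keep the number of candidate pixels below some budget. The function name `adaptive_red_threshold` and the step, ceiling, and budget values are all hypothetical and are not part of the code above.

```python
import numpy as np

def adaptive_red_threshold(reds, start=180, step=10,
                           max_candidates=2000, ceiling=250):
    """Raise the red cutoff until few enough pixels pass the first filter."""
    threshold = start
    while (threshold < ceiling
           and np.count_nonzero(reds > threshold) > max_candidates):
        threshold += step
    return threshold

# Dark room: only a handful of bright pixels, so the default cutoff is kept.
dark = np.full(640 * 480, 50, dtype=np.uint8)
dark[:10] = 255
print(adaptive_red_threshold(dark))

# Well-lit room: nearly every pixel has red = 200, so the cutoff climbs.
bright = np.full(640 * 480, 200, dtype=np.uint8)
bright[:10] = 255
print(adaptive_red_threshold(bright))
```

Each pass over the red channel is cheap compared with the redness calculation, so a few extra comparisons in a bright room may still cost less than pushing every pixel through the second filter.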
15 comments for this entry:
  1. Christian Muise

    Do you have the luxury of the assumption that the laser is likely to be close to where you saw it last time?

    If you do, then why not set a threshold of redness (which you will assume means the laser), and start your search of pixels radially from where you last saw the point?

    It could save loads of time…

    The problem we faced with MolViz wasn’t any of the algorithms, but the actual iSight framerate. How did you pump that up to 30? We only get about 17 under normal lighting conditions and it falls to 16 or 15 with the face tracking algorithm working on each frame. (ya, we could put them in separate threads, but it’s more effort than it’s worth if the iSight is only spitting out images at 17 fps)

  2. john

    I just did some research and found this:

    http://discussions.apple.com/thread.jspa?messageID=4986308

    It seems that the iSight varies its frame rate depending on the brightness of the image it sees. A dimmer image will lead to a lower frame rate. I just verified this experimentally by uncommenting the framerate tracking and pointing my iSight at brighter and dimmer areas of the room.

    It seems that if you want 30 FPS you need to have a well-lit subject. I need to do some more testing to find out what frame rate my system gets in practice. Since my game screens are generally pretty dark it might be that the frame rate is much lower than 30 fps. Too much brightness in the red part of the spectrum would of course defeat the purpose of my first filter and lower my frame rate by increasing my processing burden.

    As for assuming where the point will be next, I haven’t worried about that. I generally try not to worry about where a laser might show up or even how many might show up. By keeping things general I enable games such as Laser Missile Command.

  3. Christian Muise

    Ya, I’ve known about the brightness and frame rate link for a while now – that’s why I asked when you said you could get 30fps :p. The lab I work in with the Mac gives about 17fps at normal conditions. The camera hits a slower frame rate because it leaves the aperture open longer to get a longer exposure (it’s just a regular camera with a fast shutter anyways).

    If you’re tracking something, and you can assume they won’t teleport to the other side of the screen, you can make things loads easier by remembering where they were (iterative approach).

    But I guess for the laser thing, you could just have a bunch of randoms firing at will and don’t want to start making those assumptions. One speed up you may want to try is image rescaling. Typically packages that are meant to do it, can do it really fast…and I think it would give you better results than the “every 4th pixel” technique.

    Cheers

  4. john

    You are right that if I was more interested in tracking and movement then there would be additional tricks I could do. Right now I don’t worry about keeping track of a laser from one frame to the next. There are a lot of cool things you could do with that.

    My old iterative method required me to do every 4th pixel and other tricks. The NumPy method processes every pixel of every frame without problems. I think that the Laser Missile Command video demonstrates that it registers pretty quickly and registers multiple points at once.

  5. Brian Hammond

    You might consider one of the GPU-based CV libraries to speed up things…. GPUCV, OpenVidia, etc. I wrote a very CV intensive app for a masters thesis last year (getting published! w00t! mixedrealitybilliards.com) and am in the midst of using GPUCV to speed it up. You might want to perhaps use ctypes or swig to wrap GPUCV for use in Python.

  6. Brian Hammond

    Oh, you’re not even using OpenCV. You might want to try that first… It’s easy to use and very efficient.

  7. john

    Brian,

    I do need to look at OpenCV. I’ve had good results with NumPy so far, but perhaps OpenCV is the better tool for the job.

  8. Brad Montgomery

    Very Cool Stuff (eagerly awaiting the video!) OpenCV will do a lot for you, and it is quite efficient. You really need a good background in Computer Vision to understand how to use it correctly, though. I’ve been using it for about a year on various projects, and I still feel like an OpenCV noob… There’s an O’Reilly book coming out in June (hopefully) that may make it more accessible.

    I’ve also had a lot of luck with PIL for pixel processing. It uses numpy underneath, so it’s fast, too.

    I’ve also got to put a plug in for pygame… It’s not the best when it comes to actual image processing, but it’s a great framework for user input, sound, and some additional 2D graphics. I’ve successfully built projects using OpenCV+PIL+Pygame.

    Great Work and Good Luck!

  9. john

    Brad,

    The video is actually up as of today. I’ve edited the post to contain the link.

    Perhaps I’ll wait for the OpenCV book before getting into it if it takes over a year to master it. I had my NumPy implementation working within hours of the NumPy people telling me to check it out.

    Two of the projects that I’ve done (the original 2D Marshie game and Laser Missile Command) use pygame and I like it a lot. The 3D game uses pyglet, as I am stealing some of my own C++ OpenGL code from another project and the pyglet port was very simple.

  10. Tor Arne Pedersen

    Hi!

    I’ve been checking out different source code around the net, and I believe this is one of the fastest I have found. Lots of the other code examples out there do a lot of useless stuff, like converting to HSV, or filtering the image several times. I think the easiest way to go is:
    Compare the image pixel array with the laser colour and a tight distance, get the new pixel array returned, and sort it. That should be it, and it would work for an IR laser as well. Some webcams have an IR filter inside, but that can be removed. I’ll see what I can make of it some day..
    – Tor Arne Pedersen

  11. john

    Tor,

    I have no idea how fast this method is relative to those used by others. I do think that if you can apply some very simple preliminary filters that you’ll be faster overall because it means that there are many fewer pixels to do the more complex calculations on. That is why I run an initial filter on just the red value with basically no calculations at all. In a dark environment it eliminates nearly all the pixels very quickly.

    The problem with my method is that it doesn’t do as well when there are lots of white or red pixels, since that means there is more data to process. So I get a slowdown if I run in a well-lit room. I think that I could adjust by making the algorithm more adaptive and raising the value for the first filter dynamically. It used to do that before I made my NumPy version, but I dropped that when I did NumPy and haven’t added it back in.

  12. bdot

    this is great!!!

  13. Dave

    Any chance of getting the code from the article, but formatted correctly? Possibly as a download?

  14. john

    Dave,

    I’ve got that code up in several places included in complete games. I believe it is isightlaser.py in all of them. It is in the missile command game at:
    http://insightvr.com/download/LaserMissile0_02.zip

    and it is present (and easily activated with some flags) in all my OSCON demos:
    http://blog.insightvr.com/?p=123
