The Nervous System
The choice of components for the system was driven mostly by the need to run a computer vision algorithm to track people while keeping the system autonomous (it's supposed to sit in the middle of a field!). For this reason, we used a Raspberry Pi Model A to do all of our number crunching. We were drawn to its low power consumption and its surprisingly powerful GPU, useful for computer vision. It turned out in the end that we weren't able to harness the GPU, but the Pi's computing power was still enough to process 3-4 frames a second without it.
To get the Pi up and running off the grid, we powered it from a 6V 12Ah sealed lead-acid battery and bought a USB wifi module. The Pi by itself draws anywhere from 100 to 500mA at 5V, depending on who you ask. This means we can definitely run the Pi and its wifi module for several hours at least, and possibly even a full day. We also bought a Raspberry Pi camera board, which is easily set up with OpenCV on the Raspberry Pi.
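For a back-of-the-envelope sanity check on that runtime claim, here is a rough calculation using the 100-500mA draw range above. The 85% regulator efficiency is an assumption, not a measured number:

```python
# Rough runtime estimate for the Pi + wifi on the 6V 12Ah battery.
# The 100-500mA range is from the text; the 85% efficiency for the
# 6V -> 5V regulator is an assumption, not something we measured.
battery_voltage = 6.0        # V
battery_capacity_ah = 12.0   # Ah
regulator_efficiency = 0.85  # assumed

for draw_a in (0.1, 0.5):
    power_w = 5.0 * draw_a / regulator_efficiency   # watts pulled from the battery
    battery_current_a = power_w / battery_voltage   # amps out of the battery
    hours = battery_capacity_ah / battery_current_a
    print(f"{int(draw_a * 1000)} mA draw -> roughly {hours:.0f} hours")
```

Even at the worst-case 500mA draw this comes out to about a day of runtime, which is where the "possibly even a full day" estimate comes from.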
Finally, we needed to be able to control the sprinkler's range. Because we wanted to be able to control both the direction and size of the sprinkler's range, we needed two servos. Additionally, they each needed to be able to sweep over at least one full revolution and be at least reasonably waterproof. We bought two 8-turn high torque servos to do the actuation, and waterproofed them ourselves. The servos were also powered at 6V from the sealed lead acid battery, pulling a peak current of 2 amps apiece.
The Bleakly Inhuman Visual Cortex
While writing the image processing code I read 134 Stack Overflow questions. I guess that means two things: 1) OpenCV's Python bindings are not exactly straightforward, and 2) I had no idea what I was doing. I now have at least a medium understanding of what I did, and we only had to go through the (10-hour) OpenCV Raspberry Pi install once, so I did something right!
The logic behind our person detection is fairly simple. You take a video stream as an input. You have an average of all the frames ever, which is your background. You compare this to the current frame. Anywhere they are different is motion. If the motion is person shaped, BAM, target acquired.
Sounds good, right? Here comes the nitty-gritty. First, you need that video as an input, so OpenCV has to start talking to a camera. cv2.VideoCapture will initiate contact with the first camera it finds. We named ours cap. Pro tip: you only do this once. If you put this inside your loop, your code will be slow. However, each time through your loop, you need to read the most recent frame from the camera using the read method on cap. You now have a frame to work with. It is a numpy ndarray. The depth is three, corresponding to the three color channels. RGB, right? Wrong! BGR. Thanks, OpenCV. The height and width are the resolution of the image, probably 480x640. What if you want a different image size? Great question! Do you want to spend another 10 hours of your life reinstalling OpenCV with different dependencies? If so, install uv4l, read up on how to make this witchcraft work, and let me know what happens. We just opted to use cv2.resize() to shrink the image in code instead.
Now we have a properly sized frame. My next step was to convert it to the HSV colorspace. Remember how we are targeting people with water? When the water enters the camera's field of vision, we don't want it to be detected as motion. That might cause the sprinkler to track and fire at itself, like so: http://one-gif-each-day.blogspot.com/2012/02/funny-cat-chasing-its-own-tail.html . But since water is clear, it changes the saturation and value of a pixel, but not so much its hue, so we figured it would be easier to filter out water in the HSV space. I then used the cv2.accumulateWeighted function to build the background. I guess most people who use that function want super-accurate averages, so it outputs 32-bit floats. I then used cv2.convertScaleAbs() to bring it back to 8-bit.
Now that we have a background, we need to split both the background and the current frame into their hue, saturation, and value channels with cv2.split(). For each channel, we then use cv2.absdiff() to figure out, well, the absolute difference between the background and the current frame. It looks like this.
So you can mostly see me. However, there is still a fair amount of noisy tree stuff going on in the background. I blurred the image to get rid of some of that. I also blurred the image more in the vertical direction, because (as you can see with my pants) some types of clothing are harder to recognize. Blurring vertically smudges the recognized parts (my jacket and boots) together more.
Now comes the most finicky part: calling the threshold function. This takes a grayscale image and a threshold, and returns a binary (black-or-white) image, where everything above the threshold in the original image is white and everything below it is black. I found these thresholds purely by trial and error. The mask (as the cool nerdy kids call it) looks like this for the same value frame. We then add all three of the mask channels together, as each channel is best at recognizing certain colors of clothing. This actually requires two adds, because add can't do three things at once.
Now we use findContours() with the external option, and boundingRect(), to find the outermost edges of white blobs and get the coordinates of the rectangle around each blob. At this point, we are pretty much done with OpenCV and move on to finding blobs that are people. The first thing to do is take each contour OpenCV found, find its rectangle's area and coordinates, and make a list of these coordinates sorted by area. We don't want to have to iterate through all the contours. Ain't no one got time for that shit. Instead, we just take the biggest 10. If any of these rectangles is big and has a width/height ratio smaller than one, it gets added to the people list. However, sometimes a person ends up as two blobs that are close together, which show up as two separate rectangles. So I also iterate through each possible pair of rectangles, and if they are close enough, I merge them into one big rectangle. If this rectangle is person shaped, it gets added to the list as well. For debugging this process, I used cv2.rectangle to draw boxes around people, like so.
The Fine Motor Control
Once a target has been acquired, we must translate this position into a millisecond pulse width. This is simply a linear function, found by trial and error, such that when a victim appears at some place in the image, the servos turn the sprinkler to face the hapless soul. We started with both servos aligned, then added 2ms to one while subtracting 2ms from the other so they sat slightly offset. This gap is where the mechanical trigger sits. Normally, the sprinkler turns toward the target nearest the last victim it hit; however, if no targets have been found for 20 cycles, the servos revert to an open position 180 degrees apart to let the sprinkler sprinkle normally. Once we have the desired millisecond position for each servo, we put it into that servo's queue. The servo sub-processes then move the servos using the servoblaster program.
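The pixel-to-pulse mapping can be sketched as below. The slope, offsets, and 1000-2000µs pulse range are placeholders (ours came from trial and error), and the multi-turn servos would need a wider range in practice; servoblaster is driven by writing to its /dev/servoblaster device file in 10µs steps:

```python
IMG_WIDTH = 320                    # assumed processed frame width
MIN_US, MAX_US = 1000.0, 2000.0    # assumed pulse range, in microseconds

def target_to_pulse(x_pixel):
    """Linear map from a target's x coordinate to a servo pulse width."""
    frac = x_pixel / IMG_WIDTH
    return MIN_US + frac * (MAX_US - MIN_US)

def set_servo(channel, pulse_us):
    # servoblaster takes pulse widths as counts of 10us steps,
    # written as "channel=count" lines to its device file.
    with open("/dev/servoblaster", "w") as f:
        f.write(f"{channel}={int(pulse_us // 10)}\n")
```

In the real system the computed pulse goes into a per-servo queue rather than being written directly, and the servo sub-process does the write.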