"After revealing last week that a pilot installation of controversial, buggy border-security
scanner towers had finally been accepted into service, the US government has now
admitted that the project is a technical failure..."
- Boeing’s Border Watchtowers Can’t See Straight, Wired Magazine
I worked on the project two years ago, very briefly. I knew it was a basket case the moment I saw it. Not only do I know exactly how it happened, and why it ended up like it did, I can explain the whole thing in three words... North American Union.
This instructable will show you how to fix the failed $1B border security system in three easy steps. The goal is to create a system using only what is already there, that is fully autonomous, and which operates as well as, or better than, a system that relies on having one or more human operators present at all times.
Step 1: Fix the Filter
Two years ago I got hired as a consultant by Boeing with the job description of making sure everything worked. On my first day on the job, I read some documentation that said the system used "Kalman filtering." It also said it used "particle physics?" Well, I don't know much about particle physics, but I do know my Kalman filtering. Later that day I asked my manager if he could tell me how the Kalman filter was implemented. He told me that I should only concern myself with "intelligent" questions. I knew right then and there that the system would be a complete and utter failure. I was let go after 30 days for "incompetence."
One of the biggest complaints about the Boeing system is that it's constantly giving false alarms. Well, one thing a Kalman filter is good for is eliminating false alarms. A Kalman filter is a software algorithm that is used for all kinds of things. Most often it's used for guidance and navigation, but it can be used for everything from economic projections to population growth studies.
The Kalman filter is my favorite filter in the world. If you walked up to me in a bar and asked what my favorite filter was, I'd say the Kalman, by far. One reason it's my favorite is because, if you read the description on Wikipedia, it looks like a big ugly hairball of mathematical equations! Go here and scroll down (actually, look at the screen shot above.) Beautiful, ain't it! BTW, that's one problem with Wikipedia. It sometimes ends up being a giant sandbox for grad students working on PhD theses, which, in my humble opinion, is where that description belongs, not on Wikipedia. The average reader isn't going to be able to understand a description like that.
Fortunately, like most complicated things, if you boil it down, it's just a tiny little common sense equation. It is basically just a weighted filter. For example, say you have a pile of watermelons with a sign saying "10 lb. watermelons." Now, to estimate how much each watermelon actually weighs, you start by saying your initial estimate of the weight is 10 lbs, because that's the only information you have. This is called the prediction. Then you pick up a watermelon from the pile and put it on the scale and it reads 9 lbs. But the scale says it's accurate to plus or minus 5%, so, loosely speaking, you trust the scale 95% and give its reading a weight of 0.95, which yields 9 * .95 = 8.55 lbs. To account for the other 5%, you take the previous estimate, your initial estimate, times 5%, which yields 10 lbs * .05 = 0.5 lbs, and add that to the intermediate estimate. Then, your overall new estimate, called the update, is 8.55 + 0.5 = 9.05 pounds. And you just keep doing that for each watermelon, resulting in a recursively filtered estimate. (A real Kalman filter doesn't use a fixed weight; it recomputes the weight at every step from how uncertain the prediction and the measurement are.) Regarding the covariance matrix, that is where the filter keeps track of those uncertainties, and of how one quantity affects another, such as very high humidity causing the watermelons to be a tiny bit heavier, in which case you would add humidity to the filter and to its covariance matrix.
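Here is the watermelon arithmetic as a minimal Python sketch. The function names blend and kalman_gain are made up for illustration. The second function is an assumption about how a one-dimensional Kalman filter would pick the weight from the two variances, not something taken from the Boeing system.

```python
def blend(prior, measurement, trust):
    """Weighted update: 'trust' (0..1) is the weight given to the measurement."""
    return trust * measurement + (1.0 - trust) * prior


def kalman_gain(prior_variance, measurement_variance):
    """Assumed scalar Kalman gain: the weight the filter would compute itself.

    A noisy scale (large measurement_variance) pushes the weight toward 0,
    an uncertain prior (large prior_variance) pushes it toward 1.
    """
    return prior_variance / (prior_variance + measurement_variance)


estimate = 10.0                         # the sign on the pile (the prediction)
estimate = blend(estimate, 9.0, 0.95)   # the first watermelon reads 9 lbs
print(f"{estimate:.2f}")                # 9.05
```

Feeding each new scale reading back through blend, with the previous result as the prior, gives the recursive estimate described above.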
With the Boeing system, a specific complaint was that, on windy days, there would be even more false alarms due to things like tree branches swaying in the wind, not to mention tumbleweeds rolling by. To correct for this, one item for the covariance matrix would be current wind conditions. Maybe even use "tumbleweed recognition," which, by the way, is entirely feasible considering that the way I do pattern recognition is by closing my eyes and visualizing the video data (the ones and zeros) and "seeing" what tumbleweeds look like over time as ones and zeros. For one thing they are almost perfectly round. And they move fast, about as fast as the wind, which I already know because it's in my Kalman filter. Are you starting to see how the ability to inject simple common sense into the Kalman filter is what makes the Kalman filter so elegant, and useful? More detail on that in the next section.
Step 2: Sensor Mashup
The Boeing-designed system relies entirely on the Doppler radar for detection of moving targets. Frankly, I think the infrared sensor would have been a better choice. The most obvious choice, however, is to use all three sensors (the Doppler radar, the infrared camera, and the video camera) and combine them using sensor fusion. This gives us up to a three fold improvement in the signal to noise ratio, effectively extending the detection range of the system. Furthermore, the way to implement sensor data fusion is, once again, through the use of the Kalman filter.
In the case of the video camera, one complaint with the Boeing system was that the video quality was often very poor. That's because not one person on the project knew the basics of standard image processing techniques. Therefore, this needs to be added, and will significantly improve the video quality.
A particular area for improvement is the filtering out of noise, as well as reducing artifacts due to atmospheric conditions such as scintillations and shimmering caused by heat rising off the desert floor. This involves techniques such as scene integration where each frame is blended with the previous frame giving significant noise reduction.
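Scene integration can be sketched in a few lines of Python. This assumes each frame is a plain list of rows of grey levels. The name integrate_frame and the blending weight alpha are illustrative choices, not anything from the actual system.

```python
def integrate_frame(running, new_frame, alpha=0.25):
    """Blend new_frame into the running average.

    alpha is the weight of the new frame; a smaller alpha means more
    smoothing (less noise) but slower response to real changes.
    """
    return [
        [(1.0 - alpha) * r + alpha * n for r, n in zip(run_row, new_row)]
        for run_row, new_row in zip(running, new_frame)
    ]


# Start the running average from the first frame, then fold in each new one.
running = [[100.0, 100.0]]
running = integrate_frame(running, [[200, 0]])
print(running)   # [[125.0, 75.0]]
```

Random noise changes from frame to frame and mostly averages out, while the scene itself stays put and survives the blend.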
Another problem, at least at the Huntsville location, was that there was no scene stabilization, which means that when it was windy, the video was very jittery. That is easily correctable using a technique known as scene registration, otherwise known as "anti-shake" in the commercial world.
Yet another problem I observed at the Huntsville location was that the infrared imagery suffered due to a lack of histogram equalization. Adding that feature will ensure that the infrared imagery has the optimal dynamic range.
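For anyone who hasn't seen histogram equalization, here is a minimal textbook version in Python, assuming an 8-bit image stored as a list of rows. The function name equalize is just for this sketch.

```python
def equalize(image, levels=256):
    """Histogram-equalize a 2-D list of integer grey levels in 0..levels-1."""
    flat = [p for row in image for p in row]

    # Count how many pixels sit at each grey level.
    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    # Running total of the histogram (the cumulative distribution).
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)

    cdf_min = next(c for c in cdf if c > 0)
    n = len(flat)
    if n == cdf_min:                 # every pixel is the same: nothing to stretch
        return [row[:] for row in image]

    # Lookup table that spreads the occupied grey levels across the full range.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (n - cdf_min))) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# A washed-out image using only levels 100..103 gets stretched to 0..255.
print(equalize([[100, 101], [102, 103]]))   # [[0, 85], [170, 255]]
```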
Finally, video quality can be further improved by sharpening the imagery in the same way you sharpen an image in Photoshop. It's just another filter.
Regarding the pictures shown here, amazingly, I stumbled across a thumb drive I had when I worked at Boeing, and was doing some analysis using Paint Shop Pro.
Step 3: Add Artificial Intelligence
What Homeland Security really wanted was a system that was fully autonomous, which means not requiring an operator sitting there babysitting it. And that is what Artificial Intelligence is all about.
Artificial Intelligence (AI) is something that encompasses a whole range of complicated-sounding terminology such as neural networks, computer vision, situational awareness, behavioral profiling, automatic target recognition, cognitive behavior, pattern recognition, and so on and so forth. The experts would probably tell you that it's all very, very complicated. In fact, it is, and would be too complicated for me, if I had to do it by standard textbook methods, as opposed to creating my own methodology. The only way I can ever do anything at all is by using my own algorithms, developed over many years. I don't know how other people do it, all I know is that my way seems to work very well.
Again, as demonstrated with the Kalman filter, when it's boiled down, it's all very simple, almost to the point of being trivial. Many times I have to implement something but it's so complicated that I just give up. But I often find that if I spend the time to really boil it down, sometimes it's something I invented on my own maybe 25 or 30 years ago. This happened a while back when a co-worker and I decided to make a Fortran to C++ translator. To fully optimize the program, we decided to use something my co-worker saw in C++ Magazine called Simulated Annealing. It seemed incredibly complicated, and I chose to let someone else do it. Eventually though, I spent the time to figure it out, and sure enough, it was something I had developed on my own about 27 years earlier. BTW, this is the patent for the language translator. Regarding AI, I would say the translator was an "Expert System," in that it performed the exact same tasks as would be employed by an expert in computer programming.
I should mention that I used the simulated annealing algorithm to solve a huge puzzle in the L.A. Times, which was a contest called "Tangle Towns." This was in the early 80's. You had weekly puzzles that you had to solve and submit for several months, which got increasingly more difficult, culminating with the huge puzzle for the final contest. The prizes were nice, like a new Cadillac, or $50 grand, and a whole bunch of other prizes. For the final puzzle, you had a list of every town in California, about 650, and a pool of letters, so many of each letter, and each letter was worth a different number of points, and you got points for how many towns you spelled, and maybe other points for something else. Anyway, I didn't win the big big prize, but I did win a fancy looking stereo system, that was really a piece of crap, and which I sold just to pay the taxes on it. Again, simulated annealing is an AI concept, and could theoretically be of some use in the border system.
Anyway, let's demystify artificial intelligence by breaking it down to what it really means (to me.)
artificial = software
intelligence = common sense
The way it applies to the border system is, for example, making use of situational awareness. That means where am I, how high am I, what is my vantage point, what should I be looking for, etc. Clearly, this is just common sense. The situational awareness for us is that we are sitting on top of a tower looking out over the desert. We want to detect moving objects. Although, when we detect something, before we report an alarm, we will use another bit of common sense to determine whether or not to report the alarm. That logic would look like this:
IF alarm = true THEN
    IF target_direction = towards_the_border THEN
        cancel_alarm = true
    END IF
END IF
So, we simply avoid reporting alarms for things moving towards the border because it's probably a border patrol vehicle, or whatever. This is called "exploitation of situational awareness," at least that's what I call it when I want to sound like I know what I'm talking about, otherwise I just call it common sense. And if you thought that was slick, then stand back ladies and gentlemen while I show you a total AI mash-up using a 3-way combo of pattern recognition, situational awareness, and cognitive behavior to further reduce false alarms. Ready? Here's the equation:
IF (numb_targets = 1) AND (persistence < 50%) AND (target_distance > 1000 yards) THEN
    CALL forget_about_it (target_ID);
END IF
Bam!!! Probably got rid of 25 false alarms/minute, going off all day long, with just that one little trick. For some God unknown reason, the Boeing logic was, if I get one alarm signal, put up a yellow symbol, and if I get three in a row, put up a red symbol. That means they were only spending about 2 seconds looking at a target before they made a decision to indicate an alarm. And what I just did with that equation was to say, if I only see one object, and it's only there less than half of the time when I look at it, and it's really far away, like half a mile, then I don't need to worry about it. I'll keep it on file, but if it's somebody coming across the border, and they are on foot, then they are going to be coming my direction eventually, and I can spend the next 30 minutes tracking them (or it) to see if it's gotten closer in that time. I don't need anybody to drive out there to check it out, because it has no place to go where I can't see it. And how many times do people come across the border all by themselves? Never! People always come across the border in droves, or however many a pickup truck holds. So it's probably just a coyote. By the way, I got the distance to the target from the radar, although I can calculate it directly from the video just as well, and I can also calculate the size from the video. And, I feel like I'm rambling on here, while I'm beating a dead horse at the same time, about something that is just simple common sense. Then again ladies and gentlemen, we're talking about $1 billion of taxpayer money here. (Somebody here is definitely crazy, and it sure ain't me. Although, it is a whole lot of people who agreed on something, and I'm the only one disagreeing...it does kinda freak me out just a little bit.)
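The two rules above could be sketched in Python like this. The Detection record, its field names, and the three return labels are all made up for illustration. The thresholds (one target, under 50% persistence, over 1000 yards) are the ones from the equation.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    heading: str            # "towards_border" or anything else (hypothetical labels)
    num_targets: int        # how many objects were seen together
    persistence: float      # fraction of looks in which the target was there, 0.0 to 1.0
    distance_yards: float   # range from the radar


def decide(d):
    """Return "cancel", "keep_on_file" or "alarm" for one detection."""
    # Rule 1: things heading towards the border are probably our own patrol.
    if d.heading == "towards_border":
        return "cancel"
    # Rule 2: a lone, flickering, far-away object gets tracked, not reported.
    if d.num_targets == 1 and d.persistence < 0.5 and d.distance_yards > 1000:
        return "keep_on_file"
    return "alarm"


print(decide(Detection("away_from_border", 1, 0.3, 1500)))   # keep_on_file
print(decide(Detection("away_from_border", 4, 0.9, 400)))    # alarm
```

The second rule returns "keep_on_file" rather than dropping the target, because the idea is to keep tracking it and see whether it gets closer.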
Well, continuing on, regarding computer vision, I just model everything to work exactly the way the human eye works, along with how the information is processed in the brain. A trivial example is how the iris controls the pupil to adjust the amount of light coming into the eye. This is an automatic function, meaning involuntary; you don't have to think about it, but you need it. This feature is something you normally get for free because it's built into the camera. The pupil is represented by the camera's aperture. Another function built into most cameras is automatic level and gain.
For some interesting reading on computer vision, check out this army proposal I wrote several years ago. (Scroll down to 18.104.22.168 Generalized Balanced Ternary.) In this proposal I discussed how visual stimuli are interpreted in the brain, including how objects are recognized using a clustering algorithm, and how relative size is estimated with only a few simple calculations. Also discussed is the use of a base-7 mathematical system providing a convenient method for summation of statistical data over a hexagonal grid using pyramid algorithms.
Probably the most important thing when implementing computer vision is to emulate the way the brain filters out things that are of no interest. For example, if you are looking out over the desert looking for moving objects, one thing the brain does is ignore things moving in a natural way due to wind. That means that the swaying branches and rustling of leaves on a tree are ignored without even thinking. Many years ago I was working on automatic target recognition for a targeting system that my group developed on a DARPA project, and which is currently used in an Army helicopter called the OH-58D Kiowa Warrior. One algorithm I developed was for emulating the way the brain cancels out motion at a particular area in the field of view, and I called that "adaptive localized filtering." That means if I look at a scene, and there is motion which is ambient, and a natural occurrence, then I can filter that out by changing the gains and thresholds for the pixels in that particular area. This means I can now adapt to whatever the current conditions are. This is a form of artificial intelligence in that I do not require an operator to adjust any parameters for me to adapt my filters for the current conditions. This can be augmented by feeding me the instantaneous wind velocity (wind speed plus direction.) By putting this wind velocity into the Kalman filter, I know that if the wind speed is high, I can make certain assumptions when initializing my filters. Note, I use lots and lots of tiny filters to emulate the individual cells of the retina, or rather, groups of cells, because I use "super pixels" to cut down on processor bandwidth.
BTW, I just realized that I have now substituted "myself" in the place of "the software." I didn't do it intentionally, but it is in fact exactly how I develop all of my algorithms related to artificial intelligence, so I'll leave it in. To come up with algorithms that I know I need in order to make a system work perfectly, (i.e., work like I was there controlling it, even though I'm not going to be there controlling it,) I put myself in the place of the computer, and I close my eyes, because a computer does not have eyes, and I do thought experiments, which I learned from reading about Albert Einstein. From there, I only have my brain to work with, which allows me to do the thought experiments, such that my brain is actually the computer, and I close my eyes to simulate computer vision, where I only have video data, an array of pixels whose values may be only shades of grey from 0 to 255. And then I think about how do I process the data in my head, with data coming in from multiple sensors. In the end I develop an algorithm, to solve a problem, using a biological model, along with my own thought processes. The final result is an "ad hoc" algorithm, based on a combination of common sense, and intuition, along with lots of laboratory trial and error.
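To give a rough feel for adaptive localized filtering, here is a toy Python sketch of a single "super pixel" that raises its own threshold when its patch of the scene is busy. This is only an illustration of the idea, not the algorithm from the DARPA project. The class name AdaptiveCell and the parameters floor, k and rate are assumptions.

```python
class AdaptiveCell:
    """One 'super pixel' with its own self-adjusting motion threshold."""

    def __init__(self, floor=4.0, k=3.0, rate=0.1):
        self.ambient = 0.0   # running average of recent frame-to-frame change
        self.floor = floor   # minimum threshold when the area is dead calm
        self.k = k           # how far above ambient a change must stand out
        self.rate = rate     # how quickly the cell adapts to new conditions

    def update(self, change):
        """Feed in this frame's change; return True if it stands out locally."""
        threshold = self.floor + self.k * self.ambient
        alarm = change > threshold
        # Keep learning what "normal" looks like for this patch of the scene.
        self.ambient += self.rate * (change - self.ambient)
        return alarm


tree = AdaptiveCell()
# A branch swaying in the wind: alarms at first, then gets tuned out.
print([tree.update(10) for _ in range(5)])   # [True, True, True, False, False]
```

A cell looking at calm desert keeps its low threshold, so the same change of 10 there would still trip an alarm. The wind velocity could be used to pick a starting value for ambient instead of zero.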
The way to implement sensor fusion, given that the Doppler radar only does motion detection, is by also doing motion detection with the two cameras. The basic technique for video motion detection is by scene subtraction, in which, one frame of data is subtracted from a frame of data taken maybe 1/10 of a second later, from which any motion becomes clearly evident. For this system, there would be a high speed motion detection function performed while at any of the staring positions, as well as a low speed function performed whenever the system returns to a particular staring location. In other words, you have motion detection for things moving quickly, as well as motion detection for things that may require 30 seconds between frames in order to be clearly detectable, as in the case of objects at long distance. This provides two levels of resolution. Note that the radar only does instantaneous motion detection, meaning it does not compare data from the previous time at a particular staring location.
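Scene subtraction at the two time scales might look something like this in Python. Frames are plain lists of rows of grey levels. The names moving_pixels and StarePosition, and the threshold of 20 grey levels, are illustrative assumptions.

```python
def moving_pixels(earlier, later, threshold=20):
    """Scene subtraction: (row, col) of every pixel that changed by more than threshold."""
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(earlier, later))
        for c, (a, b) in enumerate(zip(row_a, row_b))
        if abs(b - a) > threshold
    ]


class StarePosition:
    """Keeps the frames needed for fast and slow motion detection at one staring position."""

    def __init__(self):
        self.previous_frame = None    # a fraction of a second old (fast detection)
        self.reference_frame = None   # saved on the last visit (slow detection)

    def arrive(self, frame):
        """First frame after returning here: compare against the last visit."""
        hits = []
        if self.reference_frame is not None:
            hits = moving_pixels(self.reference_frame, frame)
        self.reference_frame = frame
        self.previous_frame = frame
        return hits

    def stare(self, frame):
        """Each later frame while dwelling here: compare against the previous frame."""
        hits = []
        if self.previous_frame is not None:
            hits = moving_pixels(self.previous_frame, frame)
        self.previous_frame = frame
        return hits
```

The slow comparison is the part the radar cannot do, since it has no memory of what this staring position looked like on the previous visit.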
Regarding the Kalman filter, for each individual target I would maintain a state vector consisting of [position, velocity, acceleration, size, range, persistence].
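As a sketch, the state vector could be carried around as a small Python record, with the filter's prediction step moving it forward in time. Position is one-dimensional here to keep it short. The class name, field units, and the constant-acceleration motion model are assumptions for illustration only.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TargetState:
    position: float       # yards along one axis (1-D for brevity)
    velocity: float       # yards per second
    acceleration: float   # yards per second squared
    size: float           # apparent size of the target
    range_yards: float    # distance from the tower
    persistence: float    # fraction of looks in which the target was seen


def predict(state, dt):
    """Prediction step: where the target should be dt seconds from now."""
    return replace(
        state,
        position=state.position + state.velocity * dt + 0.5 * state.acceleration * dt * dt,
        velocity=state.velocity + state.acceleration * dt,
    )


now = TargetState(0.0, 2.0, 1.0, 1.5, 800.0, 0.6)
later = predict(now, 2.0)
print(later.position, later.velocity)   # 6.0 4.0
```

The update step would then blend this prediction with the next measurement, weighted by how much each one is trusted.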
For fully autonomous operation, and the ability to run continuously, it is merely necessary to run the software in an infinite loop.
At this point, the most important, and most interesting, feature to incorporate, and what truly gives the system human-like characteristics, is the capacity for cognitive behavior. Cognitive behavior allows the system to do things like having the ability to learn from experience, and the ability to become more proficient in performance over time, exactly as to be expected from a human operator. Similarly, since we have a neural network, and using pattern recognition, the system can detect trends, such as most intrusions occur within a particular corridor, and at a certain time at night, and that the intruders tend to follow a similar path each time. With this information the system can modify its search pattern to spend more time searching that particular corridor, and less time at areas that historically exhibit little or no activity. Again, this is the same exact behavior one would expect from a human operator.
For a final note, the example just given is a very good description of a neural network because, to me, a neural network is just a database along with some rule-based logic. In the above case, using pattern recognition to detect a trend, results in that tidbit of information to be stored in the database as a new node, much in the same way the brain generates neurons to permanently save information in long term memory. Again, I do everything using a biological approach. In fact I can't even think of any other way I could possibly do it. Furthermore, the biological approach uses a 1 to 1 analogy. For example, with the human brain, you have short term memory (RAM,) long term memory (disk or FLASH,) and neurons (brain cell clusters, or something like that) and synapses (nodes in a database,) and it all parallels the way in which a computer works. Honestly, I don't see even the slightest difference between a human brain and a computer, including having various sensors, and vision being like computer vision, and so on. You might also be interested to know, that the way a high end auto-focus camera works, or used to anyway, (I've actually programmed this function before,) is by dithering the lens back and forth until you get the best focus (technically, finding the point which maximizes the pixel standard deviation.) And this is done, not using a motor and gears, but by using a piezoelectric membrane, which sounds to me just like how the human eye uses muscles and membranes as actuators. Biologically speaking, everything just corresponds so well using a biological model. Except, of course, for human feelings, and thank God for that. (Oh crap, my computer isn't talking to me right now...must have been something I said!!!)
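One simple way to turn that history into a search pattern is to split the scan time across corridors in proportion to past intrusions. Here is a minimal Python sketch. The function name dwell_times and the minimum_share floor (so that no corridor is ever ignored completely) are assumptions.

```python
def dwell_times(intrusion_counts, total_seconds, minimum_share=0.05):
    """Split scan time across corridors according to past intrusions.

    intrusion_counts maps a corridor name to how many intrusions were logged there.
    Every corridor gets at least minimum_share of the total; the rest is
    divided in proportion to the counts.
    """
    n = len(intrusion_counts)
    if n * minimum_share > 1.0:
        raise ValueError("minimum_share is too large for this many corridors")

    base = minimum_share * total_seconds
    remaining = total_seconds - base * n
    total_hits = sum(intrusion_counts.values())

    times = {}
    for corridor, hits in intrusion_counts.items():
        # With no history at all, fall back to an even split.
        share = hits / total_hits if total_hits else 1.0 / n
        times[corridor] = base + remaining * share
    return times


history = {"north wash": 9, "ridge": 1, "flats": 0}
print(dwell_times(history, total_seconds=100, minimum_share=0.1))
```

With the history above, the busiest corridor ends up with about 73 of every 100 seconds, and the empty flats still get a 10 second look. Re-running this as the counts change is what lets the search pattern shift over time.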
Step 4: Q.E.D. (Quod Erat Demonstrandum) "What Was to Be Demonstrated"
I have now demonstrated the steps required to create a border system with the following features:
- fully autonomous, needs no operator intervention
- continuous 24/7 operation
- 100% detection rate
- zero false alarm rate
- extended detection range
- self adapting to current conditions
- filters out ambient noise
- ability to ignore spurious motion caused by wind
- improved video quality
- learns from experience
- recognizes trends and patterns and adapts accordingly
- improves performance over time
Clearly we now have a system with all of the attributes of a human operator, but which is immune to degradation of performance due to the ill effects of being tired, bored, stressed, cranky, crampy, underpaid, having a bad hair day, drunk, or hung over from the night before.
If I were in charge of the border system, I wouldn't be building towers. They are too expensive, not to mention a maintenance nightmare. Furthermore, the towers can be seen from far away, and people sneaking across the border will just avoid going near the towers. Towers are only good as a deterrent, and therefore it would make sense to have a tower placed near a point of entry where the only place to cross the border, because of the terrain, is a narrow corridor. Of course that would make it a good target, so I would use dummy cameras, as real ones would be getting shot out every other week.
The best way to patrol the border is with robotic ground vehicles. Israel has one called Guardium, which was at least tested. I would use 4x4 ATVs as the frame, with hybrid engines so that when running on battery only, they would be very quiet. There would be solar powered recharging stations that the vehicles could dock to in order to recharge the battery. The vehicles could carry water, food bars, and first aid. The best thing would be to build a series of very straight dirt roads to simplify autonomous navigation. Navigation could be done using an Arduino-based system (ArduPilot.)
South Korea also built a robotic sentry system, although it was also a failure. I bet if I had a look at it I could make it work. Gonna look into that.
In summary, although there are probably a thousand analysts looking for a clue as to why the border system failed, even though a billion dollars was spent on equipment that worked right out of the box, it is only necessary to understand how things work at the government level. To understand how the $1 billion border failure really happened, one must boil down a massive hairball of information to the least common denominator (LCD.) When I need to boil down a large amount of data, I process it until I get an LCD of 1, then I know I'm done. The answer as to how the billion dollar failure happened is simple: "interagency rivalry." There's simply no communication between government agencies. It's not in their interest. And check out the compression ratio I ended up with here:
Compression Ratio = US $1 Billion / 7 syllables = $142.8 Million : 1
That's pretty good compression. I wish I had a patent on that algorithm.