May 13, 2009
Artificial Intelligence & Robotics
Artificial Intelligence and Socks in a Galaxy Far, Far Away

Dook Skywacker looked up from the puddle he'd just fallen headlong into. "R3D4—I need you to bring me some dry socks from the spaceship," the Heroic Cosmo Knight said. And with that, the doughty robot set out to perform its appointed task.
Would that it were that easy! In the real world of artificial intelligence, even a simple task like bringing a pair of dry socks from the hold of a spaceship is fraught with unintended peril. Beyond the confines of space opera, real robots stumble over rules that a two-year-old would find laughably easy to follow. The nature of artificial intelligence is that it's not really intelligence as we know it.
The human brain is not only smart—it's smart in a very specific way. Although a computer can perform millions of calculations per second, the human brain can make inferences and fill in gaps no computer can. This is because real intelligence is designed to make leaps of connection and intuition that are very difficult to write into the kinds of algorithms a computer can understand. And one of the best ways to understand this is to look at a simple problem, like R3 getting Dook Skywacker a pair of dry socks.
The Intelligent Agent
One of the most important elements of an artificial intelligence is the ability to be an intelligent agent. An intelligent agent is an artificial intelligence that is capable of perceiving its environment and determining what activities it must perform within that environment to achieve its goal. This is much harder than it seems at first. For one thing, the very act of perceiving the environment is far harder for a computer than it is for a living organism, even when the only sense used is sight.
Let's say R3D4 needs to find the socks in Skywacker's spaceship. Its first problem is that its cameras don't operate in the same manner as Skywacker's eyes. A computer's "eyes" (also known as a machine vision system) require:
1) Cameras (digital or analog), with the ability to store images. One camera is usually enough; adding a second for binocular vision only complicates matters.
2) Specialized light sources. This is because machine vision systems are very picky about how they perceive light. If, for example, dim light makes a red sock look black, a machine vision system may suffer a breakdown on the spot.
3) Programs to process relevant features of the perceived object. This is a lot harder than it sounds; a computer has to store a matching image of the object being observed or it won't recognize it, since a basic vision program cannot infer an object's class from a general impression of it.
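To see why requirement number three is so brittle, here is a minimal sketch of the exact-match approach described above. The pixel values and image format are invented for illustration; a real image would be a large two-dimensional grid, but the failure mode is the same.

```python
# A toy sketch of exact-match recognition: images are flattened into lists of
# (R, G, B) pixel tuples, and the "program" only recognizes an object if every
# pixel matches its stored template exactly.

def exact_match(stored_template, camera_image):
    """Return True only if the observed pixels match the template pixel-for-pixel."""
    return stored_template == camera_image

BLUE_SOCK = [(0, 0, 200)] * 4   # the one picture in R3's memory bank
RED_SOCK  = [(200, 0, 0)] * 4   # the sock actually lying on the floor

print(exact_match(BLUE_SOCK, BLUE_SOCK))  # True: identical image recognized
print(exact_match(BLUE_SOCK, RED_SOCK))   # False: same sock shape, wrong color
```

Any change in color, lighting, or orientation breaks the match, which is exactly the trap R3 falls into below.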
So far, so good. But let's stop a second and talk about requirement number three. Where Dook's eyes see an entire image and automatically integrate it as an object, SOCK, R3's camera sees the sock as a series of pixels. It must then assemble those pixels into a single image, and then translate that arrangement of pixels into something it recognizes as a sock. Sound simple? Not really.
R3 may have a picture of a sock in its memory banks. But that's just one picture. What if the memory bank sock is red instead of blue? R3 will miss the sock because it doesn't match the picture. What if the sock in the memory bank is unfolded and Dook's sock has been neatly rolled up by loyal Princess Cinabuns? Again, R3 will miss the sock because it doesn't match the picture.
How does Dook manage to recognize the sock no matter what it looks like? It's because Dook is able to make inferences about the sock. He has a mental map that defines "sockness" in his mind. All he has to do is look for certain cues: an L shape, a tubular form, cloth construction, one open end; and he can assign even his rolled-up red socks to the "sock" category. This is called "pattern recognition": the ability to assemble that mental map of an object or environment from an extensive series of cues the object provides. Pattern recognition is one of the biggest obstacles to building truly intelligent machines.
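The cue-based idea above can be sketched as a checklist: instead of matching one stored picture, score an object by how many "sockness" cues it exhibits. The cue names and the threshold here are invented for illustration.

```python
# A toy sketch of cue-based pattern recognition: an object is a sock if it
# shows enough of the abstract cues, regardless of color or folding.

SOCK_CUES = {"l_shape", "tubular", "cloth_texture", "one_open_end"}

def sockness(observed_cues, threshold=0.75):
    """Return (score, is_sock): the fraction of sock cues present,
    and whether that fraction clears the classification threshold."""
    score = len(SOCK_CUES & observed_cues) / len(SOCK_CUES)
    return score, score >= threshold

# A rolled-up sock loses its L shape but keeps the other cues.
rolled_red_sock = {"tubular", "cloth_texture", "one_open_end"}
print(sockness(rolled_red_sock))  # (0.75, True): still recognized as a sock
```

Unlike the exact-match program, this one survives Princess Cinabuns's tidying, because no single cue is make-or-break.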
Pattern recognition is very hard to program because every single pattern has a cascade of possible elements that will change the pattern's cues. To the literal mind of a computer, the fact that one part of the sock is in shadow can totally change the nature of the sock's pattern. But to a human mind, the difference in shading simply gets reassigned to the category of "shade," with the inherent concept that objects in shade are darker than objects in light. (As an amusing side note, this kind of graded, approximate reasoning is related to what computer scientists call "fuzzy logic.")
To make R3 capable of finding Dook's sock is going to take a special kind of programming: a layered control system called a cognitive architecture. A subset of the idea of intelligent agents, a cognitive architecture is created by nesting several types of "thinking" programs into one larger control program. For example, one part of R3's cognitive architecture is going to have to involve the idea of visual states, like the fact that light can affect the visual pattern of an object.
This program would look at the overall light in an area and be able to determine whether a shadow was present. Another part of the CA program would need to determine what abstract qualities define "sockness." This part of the CA would look for qualities like cloth texture, a tubular form, an open end, and a general L shape. If the object being observed has enough of those qualities, R3's control program would be able to say, "Hmmm. That matches most of what determines a sock," and then categorize that object as a sock.
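The nesting described above can be sketched as small specialist modules feeding one top-level control routine. The module names, the brightness cutoff, and the thresholds are all invented for illustration; the point is only the structure: sub-programs inside a larger control program.

```python
# A toy sketch of a nested cognitive architecture: a lighting module and a
# shape module each answer one narrow question, and a control program
# combines their answers into a single classification.

def lighting_module(scene_brightness):
    """Decide whether the scene is shadowed, so color cues can be discounted."""
    return "shadowed" if scene_brightness < 0.4 else "lit"

def shape_module(cues):
    """Score how many abstract sock qualities the object shows."""
    sock_cues = {"cloth_texture", "tubular", "open_end", "l_shape"}
    return len(sock_cues & set(cues)) / len(sock_cues)

def control_program(scene_brightness, cues):
    """Top-level agent: nest the sub-modules and make the final call."""
    light = lighting_module(scene_brightness)
    score = shape_module(cues)
    # In shadow, some cues will be obscured, so demand less evidence.
    needed = 0.5 if light == "shadowed" else 0.75
    return "sock" if score >= needed else "not a sock"

print(control_program(0.2, ["cloth_texture", "tubular"]))  # prints "sock"
```

Each module stays simple and testable on its own; the intelligence, such as it is, lives in how the control program weighs their outputs against each other.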
Lastly, R3's cognitive architecture will need to know how to test that hypothetical "sock" against the information that it has stored on socks. For example, the same basic qualities of sockness (rough texture, a tube with an L shape, an open mouth) could also be used to define a snake. Imagine Dook's surprise when R3 brings a ten-foot python back to the puddle. As a human, Dook knows ways to test his hypothesis; if the "sock" slithers and hisses, it's probably not a sock.
R3 will also need a part of its intelligent agent program that knows how to test objects, gathering additional information that can clarify whether it's holding a boa constrictor or an argyle. Much like the way a baby puts things in its mouth, R3 will need ways of experimenting with objects in its environment, learning from experience that any "sock" that has fangs and hisses is probably not a good sartorial choice.
Will Dook Pull On A Comfy Pair of Pythons?
Probably not, although he may want his handy laser sword around when R3 brings back the dry pair he requested. But as we've demonstrated here, even a simple task like grabbing a pair of socks can be fraught with peril when artificial intelligence and robotics combine.
For the foreseeable future, it looks like the role of our trusty robot friend will be limited to the factory floor and the laboratory, where the elements of machine vision, cognitive architecture, and pattern recognition are more under control. One day, perhaps, R3 will be smart enough to tell a pair of socks from a snake, but for right now, Dook Skywacker is going to have to slog back to the ship and get his own dry socks.
Written by: John