Behavioral, autonomic, mechanical: a model for building badass robots

[Update: since I started writing this, a Twitter friend helpfully pointed me at Marr’s levels of analysis, which upon quick study appear to be pretty much identical to this idea, so I’ll be framing this in his terminology at some point.]

This rides on the tail of the previous post. I’m just trying to get this sorted, so bear with me.

A cybernetic system consists of inputs, outputs, and logic to connect them via feedback — if this, then that. This is true for Web servers and 747 autopilots alike.

It occurs to me that you can broadly organize a cybernetic system into three levels of interaction: behavioral, autonomic and mechanical. So let’s look at these in reverse order, from the bottom up.

  1. Mechanical: this is the lowest level, below which it’s impossible to alter component behavior without external intervention. Think of, for example, a motor. A motor turns, in one direction or another. You can’t make it do anything else without actually going in and fucking with its physical properties. Same with a photovoltaic sensor, or a human muscle, which can only contract when sent an electric/chemical signal.
  2. Autonomic: This is the next level up, in which you can connect up inputs and outputs to perform basic logic without the need for complex modeling. Imagine a robot with a touch sensor and a motor. You can program the robot to reverse the direction of the motor when the touch sensor is triggered. Or in a biological model, think of your heartbeat. It requires no thought, no interaction: it just beats. You can also think of the BIOS of a computer: it handles the simple, low-level switching of signals between a CPU, RAM, a hard drive, etc.
  3. Behavioral: This is when you hook a bunch of inputs and outputs together and create complex behavior based on their interaction. In computers, this would be the software level of things.

To give a concrete example of this, think of a Belkin WeMo switch. This is a network-enabled power switch. It has a simple WiFi receiver in it and a relay that can turn power on and off to an electrical socket.

The mechanical level of the WeMo is the power socket switch itself. It does one thing: flip a relay. It doesn’t “know” anything else at all, doesn’t do anything else.

But the WiFi adds the autonomic level: there’s basic logic in the WeMo that when it receives a specific signal over WiFi, it flips that relay. That’s all it does (aside from the ability to connect to a WiFi network in the first place). Slightly more complex than the switch itself, but still not complex at all.

But then there’s the behavioral level of the system. Belkin makes a mobile app for your phone that lets you turn on the switch from wherever you are. In this case, the behavioral level is provided by your own brain: you can turn the light on or off based on a complex system of feedback inside your skull, which takes in a varied set of inputs, conditions and variables to decide “Do I want this light on or off?” It might be overcast outside, or it might be nighttime, and you might want to turn it on; it might be daylight, and you want to turn it off; or it might be dark but you’re not home and don’t want to waste electricity. Whatever.

But here’s where it gets interesting: you can use IFTTT to create a “channel” for your WeMo, which can be connected up to any other IFTTT channel, allowing for complex interaction without human intervention. For example, I have the WeMo in my living room set to turn on and off based upon Yahoo Weather’s API; it turns off when the API says the sun has risen, and turns on when it says the sun has set.
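
For flavor, here’s a minimal sketch of what that recipe boils down to, in Python. The functions sun_is_up() and set_wemo() are hypothetical stubs standing in for the Yahoo Weather call and the WeMo’s network command, both of which IFTTT actually handles for you:

```python
import time

def sun_is_up():
    return False  # stub: would query the weather API

def set_wemo(on):
    print("WeMo relay:", "on" if on else "off")  # stub: would send the WiFi command

def run_sun_rule():
    while True:
        set_wemo(on=not sun_is_up())  # on after sunset, off after sunrise
        time.sleep(15 * 60)           # re-check every fifteen minutes
```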

This is different from a light controlled by a photovoltaic switch, which is an example of autonomic behavior. The PV switch doesn’t “know” if the sun has gone down, or if someone is standing in front of it, casting a shadow; all it knows is that its sensor has been blocked, which turns on the light. While this is somewhat useful, it’s not nearly as useful as a system with a behavioral level.

Make sense?

Okay, so let’s get back to robots, which was what I was going on about in the last post. A robot is a cybernetic system, and so it has these three potential levels: behavioral, autonomic and mechanical. In the case of a robot, it looks like this:

1) Mechanical: motors, sensors. A Roomba, to use the example from the last post, has three motors (left wheel, right wheel, vacuum) and a set of touch sensors. All these can do is either receive or send electrical current: when a touch sensor is touched, it sends an electrical signal (or stops sending it, whatever, doesn’t matter). Send a motor current in one direction and it turns one way; send it current in the other direction and it turns the other way.

2) Autonomic: In our Roomba, this is the hardware logic (probably in a microprocessor) that figures out what to do with the change in current from the touch sensor, and how much current to send to each motor. For example, if the motor is a 100 amp motor and you send 1000 amps through it, you can literally melt it, so make sure it only gets 100 amps no matter what. Very straightforward.
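
As a sketch, that autonomic safeguard is nearly a one-liner; the 100 amp figure is just the rating from the example above:

```python
MOTOR_MAX_AMPS = 100.0  # the motor's rating, per the example

def safe_current(requested_amps):
    # Clamp whatever the behavioral level asks for to what the motor survives.
    return max(0.0, min(requested_amps, MOTOR_MAX_AMPS))

print(safe_current(1000.0))  # 100.0 -- no melted motor
```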

3) Behavioral: In our Roomba, this is deceptively simple: roll around a room randomly until you’ve covered all of it, and then stop. In actuality, this requires a pretty serious amount of computation, based upon interaction with the autonomic level: a sensor has been tripped, a motor has been turned on. I don’t know the precise behavioral modeling in a Roomba, but I suspect it’s conceptually similar to something like Craig Reynolds’ boids algorithm: move around until you hit a barrier, figure out where that barrier is (based upon something like number of revolutions of the motor), move away from it until you hit another one, etc.

In a Roomba — and indeed, in most robots — the autonomic and behavioral levels are hard-coded and stored within the robot itself. A Roomba can’t follow any instructions, save the ones that are hardcoded into the firmware in its processor.

Fine. But what if we thought about this in another way?

Let’s remove the Roomba’s behavioral subsystem entirely. Let’s replace it with a black box that takes wireless signals from a WiFi or cellular network; doesn’t matter which. This black box receives these signals and converts them to signals the autonomic subsystem can understand: turn this motor this fast for this long, turn that motor off. And let’s even add some simple autonomic functions: if no signals have been received for X milliseconds, switch to standby mode.

Our Roomba is suddenly much more interesting. Let’s imagine a Roomba “channel” on IFTTT. If I send a Tweet to an account I’ve set up for my Roomba, I can turn it on and off remotely. Cool, but not that cool, right?

But what if we add the following behavior: let’s make our Roomba play Marco Polo. Let’s give it a basic GPS unit, so it can tell us where it is. Then let’s give it the following instructions:

1) Here’s a set of GPS coordinates, defined by two values. Compare them to your own GPS coordinates.

2) Roll around for a minute in different directions, until you can figure out which direction decreases the difference between these two coordinate points.

3) Roll in that direction.

4) When you encounter an obstacle, try rolling away from it, generally in the direction you know will decrease the difference between your coordinate and your target coordinate. If you have to roll in another direction, fine, but keep bumping into things until you’ve found a route that decreases the difference rather than increasing it.

This is a very simple and relatively easy set of instructions to implement. And when we do so, we’ve got a Roomba that will come and find us, bouncing around by trial and error until it does so. It might take thirty seconds, it might take hours, but the Roomba will eventually find us.
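
Here’s a minimal sketch of that loop in Python, assuming hypothetical read_gps(), drive() and bumped() hooks for the GPS unit, the motors and the touch sensors (and treating coordinates as a flat plane, which is fine at room scale):

```python
import math
import random

def read_gps():
    return (0.0, 0.0)  # stub: would return our current (lat, lon)

def drive(heading_degrees):
    pass  # stub: would roll briefly in that direction

def bumped():
    return False  # stub: would read the touch sensors

def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def marco_polo(target, arrival=0.0001):
    heading = random.uniform(0, 360)
    best = distance(read_gps(), target)
    while best > arrival:
        drive(heading)
        here = distance(read_gps(), target)
        if bumped() or here >= best:
            heading = random.uniform(0, 360)  # that way didn't help; try another
        best = min(best, here)
```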

Now, if we equip the Roomba with more complex sensors like a range finder or a Leap Motion, this all becomes much more efficient: the Roomba can “scan” the room and determine the quickest, least obstacle-filled path. In fact, the Roomba itself, the hardware, doesn’t have to do this at all: it can send the data from its sensors over its wireless connection to a much more complicated computer which can calculate all of this stuff for it, much faster, and issue commands to it.

But what happens if that network connection breaks down? In this case, we can give the Roomba a very simple autonomic routine to follow: if no instructions are coming in, either stop and wait until a connection is reestablished, or resort to the initial behavior: bump around trying to reduce the difference between your own GPS coordinate and the one you’ve got stored in your memory. Once a connection is reestablished, start listening to it instead.
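
A sketch of that watchdog, with poll_network(), execute() and fallback_step() as hypothetical hooks into the rest of the system:

```python
import time

NETWORK_TIMEOUT = 0.5  # seconds of silence before falling back (an assumed value)

def autonomic_loop(poll_network, execute, fallback_step):
    # poll_network() returns a command or None; fallback_step() is, say,
    # one iteration of the Marco Polo bumping routine above.
    last_heard = time.monotonic()
    while True:
        cmd = poll_network()
        if cmd is not None:
            execute(cmd)                   # the network is in charge again
            last_heard = time.monotonic()
        elif time.monotonic() - last_heard > NETWORK_TIMEOUT:
            fallback_step()                # stop and wait, or keep bumping along
```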

If this sounds dumb, well, imagine this: you’re in an unfamiliar city. You’re relying on your car’s GPS to navigate from your hotel to your meeting. When you’re halfway there, your GPS stops working (for whatever reason). You know your meeting is at 270 34th Street, and you know that you’re at 1800 57th Street. (The numbered streets in this imaginary city run east-west.) So you know you need to go east for fifteen blocks or so, and north for twenty-three blocks. So you turn left and go north on Oak Street, but Oak Street dead-ends at 45th Street. So you turn right onto 45th until you find Elm Street, the next north-south street, and you turn left and continue to 34th Street, where you turn right and keep going until you reach the 200 block.

Do you see where I’m going with this? You’re doing exactly what our imaginary Roomba is doing: you’re “bumping” into obstacles while reducing the difference between your Cartesian coordinate and the coordinate of your destination. The difference is that you’re not literally bumping into things (at least hopefully), but if our Roomba has sophisticated enough range finders and such, neither is it.

But this is even more interesting, because we can break your behavior down into the same three levels.

  1. Behavioral: I want to go to 270 34th Street. My brain is converting this idea into a set of complex behaviors that mainly involve turning a wheel with my arms and pushing pedals with my feet. And hopefully also paying attention to the environment around me.
  2. Autonomic: I think “I need to turn left”, and my brain automatically converts this to a series of actions: rotate my arms at such an angle, move my knee up and down at a certain speed and pressure. As Julian Jaynes points out in The Origin Of Consciousness In The Breakdown Of The Bicameral Mind, these are not conscious actions. If you actually sat down and thought about every physical action you needed to do to drive a car, you couldn’t get more than a block.
  3. Mechanical: Your nervous system sends electricity to your muscles, which do things.

Your muscles don’t need to “know” where you’re going, why you’re going there, or even how to drive a car. Your higher mental functions (I need to turn left, ooh, there’s a Starbucks, I could use some coffee, shit, I’m already late though) don’t deal with applying signals to your muscles. The autonomic systems are the go-between.

But then, something happens: a dumbass in an SUV whips out in front of you. At that point, your behavioral system suspends and your autonomic system kicks in: hit the brakes! You don’t have to consciously think about it, and if you did, you’d be dead. It just sort of happens. (There are actually lots of these direct-action triggers wired into human mental systems. Flinching is another example. It is almost impossible not to flinch if something comes into your vision from the periphery unexpectedly, moving very fast.)

So let’s turn this into a brilliant architecture for robotics. (He said, modestly and not confusingly at all.)

Our architecture consists of our three levels: behavioral, autonomic and mechanical. However, because we’re building modular robots and not monolithic people, what each of these actually means can be swapped out and changed. Again, let’s look at this from the bottom up.

1) Mechanical. This can be pretty much any set of sensors and actuators: a potentiometer, a button, a touch sensor, a photovoltaic sensor. Doesn’t matter. To an electron, a motor and a speaker look exactly the same. You can simply imagine this as a whole bunch of Molex connectors on a circuit board with a basic BIOS built in. What we hook into them is kind of irrelevant, as our autonomic system will handle this.

2) Autonomic. This is a combination of hardware and updatable firmware. Think of a reprogrammable microprocessor, perhaps with a small bit of RAM or SSD storage attached to it. The hardware simply interprets signals from the behavioral level and sends them to our mechanical level; the firmware handles the specific details. So let’s imagine we’ve plugged two motors and a heat sensor into our circuit board. We then tell the firmware how much voltage to send to the motors, and what range of voltage we expect from the heat sensor. It then normalizes these values by mapping them to a floating point number between 0 and 1. (This is just an example of how you could do this.)

So let’s say our heat sensor sends temperature in degrees Celsius, with a maximum of 200 and a minimum of -50. Our autonomic system converts that to a 0-1 range, where 0 is -50 and 1 is 200. Therefore, if the temperature is 125 degrees Celsius, it sends a value of 0.7. Make sense?

Same with the motors. If the motor’s maximum speed is 2500 RPM (and its minimum is obviously zero), then when we send a message like “rotateMotor(0.5)” to our autonomic level, it “knows” to send the amount of current that will turn the motor at 1250 RPM. (This can get a bit more complicated, but for our purposes, this is a basic example.)
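
A minimal sketch of both mappings, using the numbers assumed above (a -50 to 200º C sensor, a 2500 RPM motor); drive_motor_at() is a stand-in for the actual current-pushing code:

```python
SENSOR_MIN, SENSOR_MAX = -50.0, 200.0  # the heat sensor's range, in Celsius
MOTOR_MAX_RPM = 2500.0                 # the motor's top speed

def drive_motor_at(rpm):
    pass  # stub: would set the actual current

def normalize_temperature(celsius):
    return (celsius - SENSOR_MIN) / (SENSOR_MAX - SENSOR_MIN)

def rotate_motor(amount):
    drive_motor_at(amount * MOTOR_MAX_RPM)  # rotate_motor(0.5) -> 1250 RPM

print(normalize_temperature(125.0))  # 0.7
print(normalize_temperature(62.5))   # 0.45
```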

The point is, the actual physical operating ranges of our components don’t matter at all; that’s easily mappable to standardized value ranges by our autonomic system.

We can program the firmware based upon the mechanical stuff we have connected, so we can swap our components out at any time. We can also create simple programmable autonomic “behaviors”, which are preprogrammed instruction sets. One might be: if the heat sensor (which we’ve mounted at the front of our robot) gets above 0.7, turn both motors counterclockwise at amount 1 until the sensor’s value goes down to 0.45. This means that when our robot senses temperatures at 125º C or higher, it will run away until the temperature goes down to 62.5º C. This allows us to not worry about basic things like self-preservation. We can even make this behavior slightly more complex: for example, we can use motors that can send back the amount of torque to the autonomic level. If the torque is too high, the motor stops doing what it’s doing.
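
That self-preservation behavior, sketched with hypothetical read_heat() and run_motors() hooks (the thresholds match the example above):

```python
def heat_reflex(read_heat, run_motors):
    # read_heat() returns the normalized 0-1 sensor value; run_motors()
    # stands in for "both motors counterclockwise at amount 1".
    if read_heat() > 0.7:           # 125º C or hotter: too hot
        while read_heat() > 0.45:   # retreat until we're down to 62.5º C
            run_motors()
```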

We can also create simple shortcuts, like “turn left” or “go forward by 500 feet”. These shortcuts can be translated by the autonomic level into hardware-specific commands. For example, if we know our motor turns at 2500 RPM and we know that 5 revolutions will move the robot one foot, then when our autonomic system receives the command “go forward by 500 feet”, it translates that into the command “turn on for 60000 milliseconds, or one minute”, which is sent to the motor.
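
The arithmetic behind that translation, with the numbers assumed above:

```python
MOTOR_RPM = 2500.0   # revolutions per minute
REVS_PER_FOOT = 5.0  # revolutions needed to travel one foot

def forward_duration_ms(feet):
    revolutions = feet * REVS_PER_FOOT
    minutes = revolutions / MOTOR_RPM
    return minutes * 60_000  # convert minutes to milliseconds

print(forward_duration_ms(500))  # 60000.0 -- run the motor for one minute
```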

In other words, the autonomic level acts like our limbic system, freeing our robot’s “higher brain” from having to worry about any of the tedious hardware interfacing shit.

And again, this doesn’t have to be oriented towards robotics. We can make an autonomic level that sends electricity through a speaker at a certain frequency when a certain button is pushed, which becomes a very simple musical synthesizer. It’s all just input and output. Just current.

If we’ve done our job correctly, we can now move on to the behavioral level of our device.

3) Behavioral. The behavioral level, in hardware terms, is a black box: it can either be an onboard CPU (like a Raspberry Pi, for example) or a network connection, like in our imaginary Roomba. Doesn’t matter, as far as the rest of the system is concerned, as long as it sends commands that our autonomic system can understand. These can either be higher-level (“turn left”) or lower-level (“turn on motor #3 for eight milliseconds, pause for fifteen milliseconds, then turn on motor #5 for one hundred milliseconds, or until sensor #5 trips, in which case start the whole thing over again”). The logic for our behavioral system can be anything we like, provided we have a complex enough processor onboard or in the cloud. In fact, it doesn’t have to be either/or: we can build a behavioral center with half of its behaviors onboard and half in the cloud, or any variation thereof — like our Roomba, which stumbles around blindly until it’s given commands by the cloud. It depends upon the requirements of the tasks our device is made to carry out.
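
For illustration only (no such command standard exists, which is the point of the next section), those two granularities might be serialized something like this:

```python
# Invented formats, purely for illustration.
high_level_command = {"cmd": "turn_left"}

low_level_commands = [
    {"cmd": "motor_on", "motor": 3, "ms": 8},
    {"cmd": "pause", "ms": 15},
    {"cmd": "motor_on", "motor": 5, "ms": 100, "restart_if": "sensor_5_trips"},
]
```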

With a structure like this, we can easily build a simple “brain” for a robot that can essentially be connected to damn near any set of sensors and actuators and perform an infinite number of tasks, so long as the right sensors and actuators are connected to it. Such a robot could be anything from a simple Romotive-style consumer toy to a drone tank in a war zone to a telemedical surgical robot, performing neurosurgery while controlled by a doctor miles away. It doesn’t even have to be a robot, actually: it can be a synthesizer or a video game controller or an interface for driving the drone tank or performing the neurosurgery. I cannot stress this enough: it’s all just electricity, going to and from mechanical bits.

And herein lies the difficult part, which is not technical but organizational: this relies upon software and hardware standards, two things which the engineering industry seems simply incapable of deciding upon until forced at gunpoint. There is no standard way of connecting motors to sensors, no universal format for describing an actuator’s mechanical behavior (voltage, amperage, torque, maximum speed, operational temperature range, etc.). Nor is there any standard API or language protocol that can be implemented between the behavioral and autonomic layers. There are existing analogies in hardware/software interfacing: the first two that pop into mind are the USB Human Interface Device standard and MIDI, the Musical Instrument Digital Interface protocol which allows interoperability between synthesizers. (In point of fact, a number of non-musical devices like 3D motion capture systems incorporate MIDI as their input/output system, which is a square peg banged into a round hole, but which suggests that such a standard is probably about thirty years overdue.)
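
To make the gap concrete, here’s what such a universal descriptor might look like for a motor. The fields come from the list above, but the format is entirely made up, which is exactly the problem:

```python
# A made-up descriptor; no standard like this actually exists.
motor_descriptor = {
    "type": "dc_motor",
    "voltage_v": 12.0,
    "max_current_a": 100.0,
    "max_speed_rpm": 2500.0,
    "stall_torque_nm": 1.2,
    "operating_temp_c": (-20, 85),
}
```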

Think of your computer’s mouse, or perhaps your trackpad, if you’re using a laptop. There are a few different ways of building a mouse: mechanical, optical, or, in the case of touchpads, capacitive. A mouse can move around, or it can be stationary (as with a trackball). And when I was a kid, a mouse required a software driver that came on a floppy disk when you bought it.

But at some point, somebody realized that the actual mechanics of any given mouse were just completely goddamn irrelevant from a software perspective, because every mouse — no matter how it works — just sends back two pieces of information: X and Y position. So the people who make mice settled on a standard, in which a mouse sends that positional coordinate over USB in a standard way, which is called “class compliance”. How it converts motion into that coordinate — whether it uses two rollers or a laser — is handled at the autonomic level, by the tiny chip inside the mouse.

So now, when you buy a mouse, you plug it in and it works. Any mouse manufacturer who attempted to build a mouse that wasn’t USB class compliant would very quickly go out of business. It would be pointless. There are lots of wonderful improvements in mouse design, I guess, and probably entire conventions full of engineering nerds who get together and get drunk in hotels and talk animatedly about lasers versus capacitance. But nobody else gives a shit. We’ve sorted the irritating part out.

And yet, the people who make robots are still reinventing the wheel, every single time, despite the fact that no robot is anything more than a collection of sensors and actuators, held together in ways that are really fascinating if you’re a structural engineer and completely irrelevant if you’re just trying to write software that controls robots. It’s all just motors, even if there are lots and lots of them and they’re connected in extremely intricate ways. You’re just sending and receiving current.

Imagine a standardization scheme where you, the aspiring roboticist, could purchase a set of motors and sensors and bring them home. Each one might have a QR code printed on it or an RFID attached to it; you could scan the code, and your computer would retrieve all of the pertinent information about the mechanism. You could then plug it into your autonomic interface, tell your computer which mechanism was at which port, and your computer would then prepare the firmware and dump it into the system. You could then attach a CPU or network interface to the autonomic board, and within minutes your robot would be active and alive, behaving in any fashion you liked.

Commercially-sold robots — with perhaps complex and delicate assemblies that would be difficult for you to make at home — would have pre-existing complex autonomic systems, with software that allowed you to “train” them, or purchase downloadable “personalities”, which would simply be pre-existing behavioral methods. Tinkerers could modify and customize the behavior of their robots using standard APIs, which could even have safety limits set in place so that you couldn’t accidentally short your robot or blow out its motor, unless you were sophisticated enough to bypass the API (and the autonomic system) and control the mechanical bits directly.

If robot manufacturers adopted this model, we would begin to see a true Golden Age of robotics, I think. We would begin to see emergent complexity far faster than we do now, because anybody could build and train robots, and link them together, and let them not only act but interact, learn from each other, and contribute to and benefit from collective knowledge and action.

Now, if only we could convince engineers to get their shit together.
