A real-world example of BAP robot architecture

You’ll have to excuse me — I’ve got the flu today, and I might not be entirely coherent. But here’s a concrete example of my BAP robotic framework. This is a simple robot with two motors (one on each side) and a GPS unit.

[Figure: robot layout]

The first thing we notice is that each of the motors is controlled by an H-bridge, which consists of four switches that allow current to flow in either direction through the motor itself, or through a relay switch which turns on the motor, if the motor is high-voltage. (For our purposes, we’ll assume it is, as a 5V robot is small and boring.)

The H-bridge controls the relay, which controls current coming from a power supply to the motor. So with our system, we can turn each motor forward or backward. We can’t control speed — the motor is either on or off — but that might be enough for our purposes.

We’re using OSC, so let’s call our robot robot1. The robot has two top-level endpoints, sensors and motors. These are arbitrarily named, and in point of fact we don’t even need them — we can assign top-level endpoints to every component in the robot. But let’s leave it like this, as it’s nice and tidy.

We see that the motors endpoint has two sub-endpoints, leftMotor and rightMotor. Our behavior system sends commands to these motors by issuing methods to the endpoint, like so:

/robot1/motors/leftMotor/goForward
/robot1/motors/leftMotor/goBackward
/robot1/motors/leftMotor/stop

These three commands are all we can really do with these motors…but as we’ll see, they can be used to instantiate a pretty sophisticated set of behaviors.
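To make the addressing concrete, here’s a minimal sketch of how an autonomic layer might route those addresses to handlers. It’s plain JavaScript with no real OSC library behind it; `makeRouter`, `on` and `dispatch` are names I made up for illustration, and the handlers only record what they were asked to do.

```javascript
// Hypothetical sketch: route OSC-style addresses to handler functions.
// No real OSC library is used; makeRouter/on/dispatch are illustrative names.
function makeRouter() {
    var handlers = {};
    return {
        // register a handler for one exact address
        on: function (address, handler) {
            handlers[address] = handler;
        },
        // run the handler for an address; returns false if none is registered
        dispatch: function (address) {
            if (!Object.prototype.hasOwnProperty.call(handlers, address)) {
                return false;
            }
            handlers[address]();
            return true;
        }
    };
}

var log = [];
var router = makeRouter();

// register the three commands for each of the two motors
["leftMotor", "rightMotor"].forEach(function (motor) {
    ["goForward", "goBackward", "stop"].forEach(function (command) {
        router.on("/robot1/motors/" + motor + "/" + command, function () {
            // real firmware would set the H-bridge switches here
            log.push(motor + ":" + command);
        });
    });
});

router.dispatch("/robot1/motors/leftMotor/goForward");
router.dispatch("/robot1/motors/rightMotor/stop");
```

An address nobody registered simply returns false, so the behavior layer can tell “the robot doesn’t do that” apart from “the robot did it”.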

Our autonomic layer “knows” what these commands mean and how to actually make them happen. For example, pseudocode for our goForward function might look like this:

function goForward(){
    // switches 0 and 1 on, switches 2 and 3 off
    setSwitch(0, HIGH);
    setSwitch(1, HIGH);
    setSwitch(2, LOW);
    setSwitch(3, LOW);
}

The setSwitch commands simply set the switches on the H-bridge (which drives the relay connected to leftMotor) so that current flows in the direction that rotates the motor counterclockwise (which, since it’s on the left side, turns it “forward”). The goBackward function might set switches 2 and 3 to high and switches 0 and 1 to low; and the stop command might simply set all of them to low, or disconnect them. (We can get more funky by having the stop command reverse polarity for 10 milliseconds before disconnecting, which would work like an engine brake.)
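The three switch patterns just described fit in a small lookup table. This is only a sketch: the switch numbering follows the goForward pseudocode, `HIGH` and `LOW` are plain labels rather than real pin writes, and `switchesFor` and `brakeSequence` are hypothetical helper names.

```javascript
// Hypothetical sketch: the H-bridge states described above, as data.
// HIGH/LOW are just labels here; nothing touches real hardware.
var HIGH = 1;
var LOW = 0;

var BRIDGE_STATES = {
    goForward:  [HIGH, HIGH, LOW, LOW],
    goBackward: [LOW, LOW, HIGH, HIGH],
    stop:       [LOW, LOW, LOW, LOW]
};

// Returns the four switch settings for a command; throws on an unknown command.
function switchesFor(command) {
    if (!Object.prototype.hasOwnProperty.call(BRIDGE_STATES, command)) {
        throw new Error("unknown motor command: " + command);
    }
    return BRIDGE_STATES[command].slice(); // copy so callers can't alter the table
}

// The "engine brake" stop as a list of [switches, holdMs] steps:
// briefly drive the opposite way, then turn everything off.
// Assumes we know which way the motor was last driven.
function brakeSequence(lastCommand, brakeMs) {
    var reverse = lastCommand === "goForward" ? "goBackward" : "goForward";
    return [
        [switchesFor(reverse), brakeMs],
        [switchesFor("stop"), 0]
    ];
}
```

Calling `brakeSequence("goForward", 10)` gives the reverse pattern held for 10 milliseconds, followed by all switches low.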

Now, we’ve also got a sensor: namely the GPS unit on top of the robot. This provides two endpoints in our current scheme, lat and lng, though it could also provide a single endpoint which when queried returns a formatted string from the GPS like “-37.3457532,30.12930495” that our behavioral layer can decipher.
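If we went with the single-endpoint version, the behavioral layer would need to pull that string apart. Here’s a rough sketch of that step, assuming the format is exactly “lat,lng” as in the example string; `parseGpsString` is a made-up name, not part of any GPS library.

```javascript
// Hypothetical sketch: parse a "lat,lng" string such as "-37.3457532,30.12930495".
// Returns { lat, lng }, or null if the text isn't two in-range numbers.
function parseGpsString(text) {
    var parts = String(text).split(",");
    if (parts.length !== 2) {
        return null;
    }
    var latText = parts[0].trim();
    var lngText = parts[1].trim();
    // Number("") is 0, so reject empty fields explicitly
    if (latText === "" || lngText === "") {
        return null;
    }
    var lat = Number(latText);
    var lng = Number(lngText);
    if (!Number.isFinite(lat) || !Number.isFinite(lng)) {
        return null;
    }
    // latitude runs -90..90, longitude -180..180
    if (Math.abs(lat) > 90 || Math.abs(lng) > 180) {
        return null;
    }
    return { lat: lat, lng: lng };
}

var fix = parseGpsString("-37.3457532,30.12930495");
```

Returning null for anything malformed lets the caller decide whether to retry the query or keep using the last good fix.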

We’d query it by issuing a command like this (again, this is pseudocode):

var robotLat = osc.retrieve("/robot1/sensors/gps/lat");
var robotLng = osc.retrieve("/robot1/sensors/gps/lng");

Pretty straightforward, right? This way, our software can figure out where the robot is at any given time.

Now, you’re maybe asking yourself: how does our software connect to the robot? How does the GPS data get translated and piped out to our endpoints? For that matter, how are our motors connected to the power supply? Does our control circuitry hook into the same power supply?

The answer is simple: who gives a shit? Really? The whole point of this is abstraction. The hardware engineer sorts out the physical layer, the firmware geek sorts out the autonomic layer, the programmer works out the behavioral level. None of them give a shit about any of the other layers, except insofar as their bit needs to hook into the other guy’s bit.

The autonomic engineer spends months designing a really efficient cellular modem that receives commands from the Internet and turns them into hardware-specific commands. The design is brilliant. Other engineers buy him endless rounds at conventions for being a badass.

And the programmer neither knows nor cares about any of this, because all that matters is that the system responds to commands. Period.

Look, do you know anything about JavaScript? JavaScript lets you make web pages interactive. (Yes, it can do much more, blah blah Node.js blah blah blah. Shut up. Pay attention.) One of the things you can interact with using JavaScript is the mouse: you can get the mouse’s X and Y position relative to the screen or even the DOM object you’re attaching the code to.

Doing this requires the browser to be able to retrieve relatively low-level data about the mouse’s current status. In turn, it does this by accessing the operating system’s mouse driver, which uses a standard set of protocols to interface with the mouse’s hardware, which might be optical or mechanical or a touchpad or motion sensors in the air.

But as far as the Web developer is concerned, who gives a shit how all of that works? So long as the clientX and clientY values the browser hands to a mouse event are accurate, the rest of the process of turning physical movements into Cartesian coordinates is pretty much irrelevant.

Same thing here. There might be a more efficient way of controlling the motor than with an H-bridge and a relay switch, but as long as the motor can take in our three commands and turn them into motion, who gives a shit?

We have everything we need now to begin building complex logic into our robot. Let’s use an example of a behavior I mentioned in a previous post: let’s make our robot come and find us.

We have a GPS, so the robot knows where it is; and we have motors. What we don’t have is a digital compass to help our robot orient itself…but we can still pull this off, with some code like so (again, this is pseudocode):

var myLat = 36.1515;
var myLng = -115.2048;

var robotLat = osc.retrieve("/robot1/sensors/gps/lat");
var robotLng = osc.retrieve("/robot1/sensors/gps/lng");

// keep trying as long as either coordinate is off
if(robotLat != myLat || robotLng != myLng){

    // remember where we started
    var oldLat = robotLat;
    var oldLng = robotLng;

    osc.transmit("/robot1/motors/leftMotor/goForward");
    osc.transmit("/robot1/motors/rightMotor/goForward");
    wait(10000); // this is milliseconds

    robotLat = osc.retrieve("/robot1/sensors/gps/lat");
    robotLng = osc.retrieve("/robot1/sensors/gps/lng");

    // difference() stands in for any distance function between two lat/lng points
    if(difference(robotLat, robotLng, myLat, myLng) > difference(oldLat, oldLng, myLat, myLng)){
        // we ended up farther away, so stop and back up
        osc.transmit("/robot1/motors/leftMotor/stop");
        osc.transmit("/robot1/motors/rightMotor/stop");
        osc.transmit("/robot1/motors/leftMotor/goBackward");
        osc.transmit("/robot1/motors/rightMotor/goBackward");
    }

    // and so on

}

What this code does is make the robot “try” different directions by moving in a direction for ten seconds and seeing if that decreases the distance between its location and my location. Once it finds a direction that does so, it goes in that direction until its coordinates match my coordinates (in practice, until it’s within a few meters, since two GPS readings will almost never match exactly).
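The “did that get me closer?” test needs a real distance function. One common choice is the haversine formula, sketched below. The Earth radius is the usual mean-radius approximation, and `movedAway` and `hasArrived` are helper names I’ve invented for this example.

```javascript
// Great-circle distance in meters between two lat/lng points (haversine formula).
// Uses a mean Earth radius of 6,371,000 m, which is an approximation.
function distanceMeters(lat1, lng1, lat2, lng2) {
    var R = 6371000;
    var toRad = Math.PI / 180;
    var dLat = (lat2 - lat1) * toRad;
    var dLng = (lng2 - lng1) * toRad;
    var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
            Math.cos(lat1 * toRad) * Math.cos(lat2 * toRad) *
            Math.sin(dLng / 2) * Math.sin(dLng / 2);
    // clamp to 1 to guard against rounding error before asin
    return 2 * R * Math.asin(Math.min(1, Math.sqrt(a)));
}

// Hypothetical helper: true if the last move took the robot farther from the target.
function movedAway(oldPos, newPos, target) {
    return distanceMeters(newPos.lat, newPos.lng, target.lat, target.lng) >
           distanceMeters(oldPos.lat, oldPos.lng, target.lat, target.lng);
}

// Hypothetical helper: "close enough" check; the tolerance is whatever we decide is good enough.
function hasArrived(pos, target, toleranceMeters) {
    return distanceMeters(pos.lat, pos.lng, target.lat, target.lng) <= toleranceMeters;
}
```

With something like `hasArrived(robot, me, 5)` standing in for the exact coordinate comparison, the robot can actually stop when it gets to us instead of chasing the last decimal place forever.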

What if we want to make it more complex, “smarter”? We can add in a digital compass, which returns a heading either in degrees (0-360) or as a normalized value (0-1, where 1 maps to 360 degrees). Then our robot knows where it is and where it’s pointed. It doesn’t need to orient itself before it starts to move.
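With a compass reading in hand, the behavior layer can work out which way to turn before moving at all. The sketch below uses the standard initial-bearing formula; `headingFromNormalized`, `turnDegrees` and `spinCommands` are hypothetical names, and spinning in place by running the two motors in opposite directions is my assumption about how this two-motor robot would turn.

```javascript
// Initial compass bearing (0-360, 0 = north) from point 1 to point 2.
function bearingDegrees(lat1, lng1, lat2, lng2) {
    var toRad = Math.PI / 180;
    var phi1 = lat1 * toRad;
    var phi2 = lat2 * toRad;
    var dLng = (lng2 - lng1) * toRad;
    var y = Math.sin(dLng) * Math.cos(phi2);
    var x = Math.cos(phi1) * Math.sin(phi2) -
            Math.sin(phi1) * Math.cos(phi2) * Math.cos(dLng);
    return (Math.atan2(y, x) / toRad + 360) % 360;
}

// Convert a 0-1 compass reading to degrees (1 wraps around to 0, i.e. north).
function headingFromNormalized(value) {
    return (value * 360) % 360;
}

// Signed turn from current heading to target bearing, in the range -180 to 180.
// Positive means turn clockwise (right).
function turnDegrees(heading, target) {
    return ((target - heading + 540) % 360) - 180;
}

// Pick motor commands: spin in place if we're off by more than the deadband,
// otherwise drive straight. The deadband value is up to the caller.
function spinCommands(turn, deadbandDegrees) {
    if (Math.abs(turn) <= deadbandDegrees) {
        return { left: "goForward", right: "goForward" };
    }
    return turn > 0
        ? { left: "goForward", right: "goBackward" }   // clockwise
        : { left: "goBackward", right: "goForward" };  // counterclockwise
}
```

For example, a robot heading 350 degrees with a target bearing of 10 degrees gets a turn of 20, so it spins clockwise rather than going the long way round.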

Our behavioral software can connect to Google Maps and get a series of directions, following roads, between the robot and us. We can then direct the robot using these directions, simply by issuing commands to our two motors. It doesn’t matter that our robot doesn’t know anything about Maps, or that Maps doesn’t have robot navigation built in. Our behavioral layer handles that for us.

Is this complex? Sure it is. So it’s a good thing we’re not trying to do this with instructions hardwired into our robot. All the complexity of translating road directions into commands to turn motors for certain time periods, in certain directions, is offloaded to our behavioral layer, which can be on-board the robot or — more likely — connected over a network.
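Here’s roughly what that translation could look like for a single step of a route. Everything specific is an assumption: the step shape (`turnDegrees`, `distanceMeters`) is my own simplification rather than any mapping service’s format, and the speed and turn-rate numbers would have to be measured on the actual robot, since the motors are only on or off.

```javascript
// Hypothetical sketch: turn one route step into timed motor commands.
// step:        { turnDegrees, distanceMeters } (positive turn = clockwise)
// speedMps:    how fast the robot drives, in meters per second (measured, assumed constant)
// turnRateDps: how fast it spins in place, in degrees per second (measured, assumed constant)
function planStep(step, speedMps, turnRateDps) {
    var plan = [];

    if (step.turnDegrees !== 0) {
        var clockwise = step.turnDegrees > 0;
        plan.push({
            left: clockwise ? "goForward" : "goBackward",
            right: clockwise ? "goBackward" : "goForward",
            ms: Math.round(Math.abs(step.turnDegrees) / turnRateDps * 1000)
        });
    }

    if (step.distanceMeters > 0) {
        plan.push({
            left: "goForward",
            right: "goForward",
            ms: Math.round(step.distanceMeters / speedMps * 1000)
        });
    }

    // always finish the step with both motors stopped
    plan.push({ left: "stop", right: "stop", ms: 0 });
    return plan;
}

// "turn right 90 degrees, then go 10 meters" at 0.5 m/s and 45 degrees/s
var plan = planStep({ turnDegrees: 90, distanceMeters: 10 }, 0.5, 45);
```

Dead reckoning like this drifts, which is why the GPS and compass checks still matter: the plan gets the robot roughly there, and the sensors correct it.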

Of course, our robot still has to avoid cars as it trundles along public roads, and maybe it has to worry about recharging itself. But all we have to do is add more sensors: touch sensors, range sensors, a battery level sensor. We can download the low-level stuff directly to the robot so there’s no network latency between a command being issued and being carried out. For example, we can tell the robot “if something triggers your range sensor, move away from it until it’s not there anymore, then recalculate your travel direction”. Or “if your battery gets below 25%, here are the GPS coordinates of a filling station where a nice person can charge you again. Go there instead of your original destination.” Once the battery level is 100%, we simply resume our initial set of commands to reduce distance between the robot and us.
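Those rules boil down to a small priority list: obstacles first, battery second, the original errand last. A minimal sketch, assuming the sensors have already been read into a plain `state` object; the field names and the `nextAction` function are invented for this example.

```javascript
// Hypothetical sketch: pick the robot's next action from its sensor state.
// state: {
//   obstacleInRange: boolean,  // range sensor triggered
//   batteryPercent:  number,   // 0-100
//   recharging:      boolean,  // true once we've diverted to charge
//   destination:     { lat, lng },
//   chargingStation: { lat, lng }
// }
function nextAction(state) {
    // 1. Safety first: get away from whatever tripped the range sensor.
    if (state.obstacleInRange) {
        return { action: "moveAway" };
    }
    // 2. Low battery, or still topping up after diverting: head for the charger.
    if (state.batteryPercent < 25 ||
        (state.recharging && state.batteryPercent < 100)) {
        return { action: "goTo", target: state.chargingStation };
    }
    // 3. Otherwise carry on with the original errand.
    return { action: "goTo", target: state.destination };
}

var home = { lat: 36.1515, lng: -115.2048 };
var charger = { lat: 36.1600, lng: -115.2100 };
```

The `recharging` flag is what stops the robot from leaving the charger at 26%: once it has diverted, it stays on that goal until the battery reads 100.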

Simple, right? Well, not simple, not trivial, but not so complex we can’t make it happen.

And here’s the great part: we can have our robot perform this behavior until we get bored, and then we can totally reconfigure it. We can build a ground-based version of John Robb’s DroneNet.

Imagine Buffy was at Angel’s house last night, and left her keys there. So we give our robot two GPS coordinates, and we add a lockbox with a combination to the top of it. We tell it to go from its current location to the first GPS coordinate, which is Angel’s moody pad. Angel puts Buffy’s keys in the lockbox, locks it, and hits a “Deliver” button on his iPhone. Our software receives that button press, and tells the robot to go to the second GPS coordinate — Buffy’s house — using all of these sophisticated behaviors we’ve made as reusable software objects. It makes the delivery, Buffy uses the combination her iPhone told her, unlocks the box, gets her keys, hits “Received” on her iPhone, and the robot returns to its home, waiting for its next customer.
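The delivery run is really just a handful of states with button presses and arrivals moving it along. Here’s one way to sketch it; the state and event names are mine, and a real version would hang the “go to this coordinate” behavior off each travelling state.

```javascript
// Hypothetical sketch of the key-delivery run as a state machine.
// State and event names are illustrative, not part of any real API.
var TRANSITIONS = {
    idle:               { jobAssigned: "goingToPickup" },
    goingToPickup:      { arrived: "waitingForDeliver" },
    waitingForDeliver:  { deliverPressed: "goingToDropoff" },
    goingToDropoff:     { arrived: "waitingForReceived" },
    waitingForReceived: { receivedPressed: "returningHome" },
    returningHome:      { arrived: "idle" }
};

// Returns the next state, or the current state if the event doesn't apply.
function nextState(state, event) {
    var row = TRANSITIONS[state] || {};
    return row[event] || state;
}

// Walk through the whole Angel-to-Buffy run.
var events = ["jobAssigned", "arrived", "deliverPressed",
              "arrived", "receivedPressed", "arrived"];
var visited = events.reduce(function (path, event) {
    path.push(nextState(path[path.length - 1], event));
    return path;
}, ["idle"]);
```

Events that don’t apply are ignored, so Buffy mashing “Received” while the robot is still on its way to Angel’s place does nothing.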

Same robot, massively different functionality. As long as our robot’s motors and sensors and interface — its physical and autonomic layers — are in working order, we can keep coming up with cool ways to use it. We can add sensors to it as necessary. We can also reconfigure its hardware so that it moves in more optimized ways — we can make it a flying drone instead of a rolling drone, for example. But we can still use our GPS “find me” behavior, almost independent of how the robot moves from point A to point B. It will require some reconfiguration, but if we’ve made it modular enough in the first place, that might not be such a big deal to accomplish.

We can even allow third-party developers and behavioral networks to access our robot, maybe using something like OAuth. Instead of writing our own behaviors, we entrust our robot to Google’s behaviors, because their programmers are smarter than us and they also have a vast network of supercomputers to process our robot’s sensor data and connect it to all the other robots in Google’s network. Our robot isn’t very smart on its own, but when connected to a behavioral network, it’s much smarter…and when its behavior is synced with a lot of other robots, all gathering information and doing things, it’s scary smart.
