"But does it have Heart, Watson; Does it have Heart?"
-- probably inaccurate Alexander Graham Bell quote
I'm not quite sure why I am writing on the topic of artificial intelligence (AI) at all, especially in a blog ostensibly devoted to computer languages. I think a lot of programmers give thought from time to time to machine intelligence, and as I develop the Flexible System Architecture, and consider the sorts of problems it may be able to tackle, I too find myself pondering this faintly absurd subject.
First though, here's a really elementary paragraph on neurons just to define some terms so you won't have to constantly bounce back and forth to Neuron in Wikipedia. A brain cell consists of a cell body (soma) with one long, maybe very long, axon and a shaggy bunch of dendrites hanging off it. The axon eventually branches into its own shagginess. Neurons send signals out their axons and receive them via their dendrites, across gaps known as synapses. The signals cross the gaps by way of chemical messengers called neurotransmitters. This is WAY simplified, but good enough to get us started.
So let's dig in. Assuming the standard for intelligence is the human brain (and avoiding any snarkiness about the general state of human society), we're looking at approximately 100 trillion connections among 100 billion neurons. As of this date we're getting close to putting 10 billion transistors on a fingernail-sized piece of silicon, but their number of interconnections is only a single-digit multiple of that, and one transistor definitely does not equal the processing capability of one neuron.
On the other hand, the ace in the hole for transistorized information processing is time: computer chips run WAY faster than neurons. I admit that's a big unoriginal "Well, Duh!" I'm sure AI pioneer Marvin Minsky had a similar thought at MIT way back in the 1960's. Still, this ratio is worth exploring in some detail.
The electrochemical signals that travel along an axon well-sheathed in myelin (a white, fatty insulation that increases nerve propagation speed) max out at about 100 meters per second. The electrical signals on a chip travel at an appreciable percentage of light's 300,000,000 meters per second, roughly a million times faster. Advantage silicon. Also, the neuronal signals are trains of pulses, up to 100 per second. This is a seemingly inefficient form of serial communication compared to a modern commercial computer's 64-wide parallel paths. Again, advantage silicon.
Not so fast! I wrote 'seemingly inefficient' because each pulse in the train is releasing neurotransmitters at the axon's synapses - ALL its synapses, of which there may be hundreds or even thousands, each connecting to another neuron. Serial becomes parallel, in a big way. Advantage grey and white matter after all? Perhaps so.
Nevertheless, the gap in signal speed would seem to give silicon some real leverage in being able to catch up to, and maybe surpass, protoplasmic intelligence. What other advantages might it have? How about economy of design?
Some of what passes for processing in human and animal brains seems a bit ad hoc, thrown together by random evolution. As one example, in the human retina, neurons sit in front of the photoreceptor cells, partially blocking the very light the receptors are trying to sense. An imperfect design that nevertheless gets the job done. The eyes just happened to evolve that way.
All the various nodes for processing information in the brain are products of a similar evolution. No doubt we'll find some of them are a bit quirky in their levels of efficiency. Perhaps we already have - I'm not killing myself with scholarly research here; I've been in over my head in some of my previous essays, so I'm not too worried about finding the deep end here. This is all speculation - hopefully I'll learn something, even if you don't! The point is, an artificial brain built from computer components can have the advantage of a more directed, targeted design. Intelligent design versus evolution. Oh no, did I just write what I think I wrote?
Enough clowning around. Programmers and systems engineers tend to design either from the top-down or from the bottom-up; that is, general to particular or vice-versa. The next few paragraphs are going to abstract Carl Zimmer's article, 100 Trillion Connections, from the January 2011 Scientific American, which has a top-down emphasis.
When for whatever reason a single neuron fires, it tends to cause an 'avalanche' of other neurons firing as well. Most of these avalanches are confined to nearby neurons but sometimes they can spread far and wide. Taken altogether, these avalanches follow a power law of many local ones, some intermediate ones, and a few widespread. What is interesting is that these effects can be duplicated in a computer by setting up a "small-world" network. The closest simulation comes from 60 clusters of neurons (nicely near the binary 'round number' of 64!), each connected to an average of 10 other clusters. The average is 10, yet some hubs are well-connected, while others are on the periphery - just like in the human brain. To quote Mr. Zimmer, "The architecture of the network itself shapes its pattern of activity."
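The small-world idea is easy to play with in code. Below is a toy Python sketch, entirely my own and not the simulation from Zimmer's article: 60 clusters, each linked to about 10 others, a fraction of links rewired at random to create shortcuts and hubs, and a simple probabilistic "avalanche" spreading over the result. The spread probability and rewiring fraction are arbitrary illustrative choices.

```python
import random

random.seed(42)

N_CLUSTERS = 60   # clusters of neurons, as in the simulation described
K = 10            # average links per cluster
REWIRE_P = 0.1    # fraction of links redirected at random ("small-world" shortcuts)

# Start with a ring lattice: each cluster linked to its K nearest neighbors.
neighbors = {i: set() for i in range(N_CLUSTERS)}
for i in range(N_CLUSTERS):
    for d in range(1, K // 2 + 1):
        neighbors[i].add((i + d) % N_CLUSTERS)
        neighbors[i].add((i - d) % N_CLUSTERS)

# Rewire some links to random targets; a few clusters become well-connected hubs.
for i in range(N_CLUSTERS):
    for j in list(neighbors[i]):
        if random.random() < REWIRE_P:
            neighbors[i].discard(j)
            neighbors[i].add(random.randrange(N_CLUSTERS))

def avalanche(start, p_spread=0.09):
    """One cluster fires; each neighbor fires with probability p_spread, and so on."""
    fired = {start}
    frontier = [start]
    while frontier:
        nxt = []
        for c in frontier:
            for nb in neighbors[c]:
                if nb not in fired and random.random() < p_spread:
                    fired.add(nb)
                    nxt.append(nb)
        frontier = nxt
    return len(fired)

sizes = [avalanche(random.randrange(N_CLUSTERS)) for _ in range(5000)]
small = sum(s <= 2 for s in sizes)   # local avalanches
large = sum(s >= 10 for s in sizes)  # widespread ones
print(small, large)
```

Run it and the counts come out heavily skewed: lots of local avalanches, progressively fewer intermediate and widespread ones, which is the power-law flavor described above.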
Of course, these findings don't even begin to answer all the questions about how the brain is connected to consciousness, but they do show that evolution has come up with the most practical organization to allow neural signals to travel a far distance when necessary, yet keep the pathways for carrying those signals to a minimum, thereby saving valuable 3D real estate.
But this research is merely a start. The human brain is not a static network, but rather one that is constantly dynamically reorganizing itself. And there seems to be just the right amount of background noise - chaotic nerve firings - to keep the network maximally sensitive to incoming sense data.
With this overview, we can get back to the possible advantages electronic logic might have in trying to, well, outsmart the protoplasmic thinking machine.
For almost as long as I've been working on my computer design, I've had in mind this generalized architecture for the basic FSA "Envelope":
The Envelope is a field of 4096 control bits, which in this case is carved into equal quarters. In the top right corner, most of the control bit real estate is shoved aside to make room for a "four-pack" of fast, agile sequencers and their support circuitry. In each of the other three quarters, imagine a thousand control bits being orchestrated from programs running in the top right.
The top half can be considered analogous to the logical "left brain" while the bottom, especially the lower right, can act as the more creative "right brain." In truth, the whole left brain / right brain paradigm has been de-emphasized by recent research: the two halves of the brain really cooperate more than had been previously thought.
Nevertheless, I like the idea of having two very different kinds of processing available. I've always thought neural network fabrics should be included in more standard ICs than is actually the case. This is kind of ironic since so far I've spent about 97% of my creative energies developing the top right, with another 3% going to the top left, leaving about .0002% having been directed to the neural net area. Oh well.
People have been simulating neural networks with computer programs for decades now, but more and more effort is going into developing analog neural circuits that don't have to depend on the ones and zeros of binary computing. Whether these designs run any faster than the digital simulations is maybe not as important as the fact that they should ultimately take up much less space.
Remember, we're competing with about 3 pounds of 'jello' taking up not much more than 70 cubic inches, and using, give or take, 20 watts. If we do finally come up with something just as smart, yet it has to 'live' in an air conditioned room full of rack-mounted processors burning up a megawatt or so, well, it seems to me the brain still wins that matchup. Anything that can shrink the silicon wiring is a good thing. Yes, modern microelectronics are tiny, but so are nerves: there are 100 miles of axons packed into that 70 cubic inches. 100 miles, inside our heads! I wonder how many miles of wire there are in IBM's Jeopardy-winning Watson computer (including all the submicroscopic buses on all its chips)? It would be an interesting comparison.
It is not merely size we are looking for; we want efficiency as well. Evolution has built up layered systems in the brain in order to process sense input. Take vision for instance. Starting at the eye, which is now pretty much considered to be an extension of the brain, visual information flows through a hierarchy of processing stages, mostly in the visual cortex at the back of the head. There are areas in the eye/brain system that process very specific "primitives," relating to such items as contrast, shape, color and motion.
If a given primitive can be emulated in the green neural network section above, well and good. Yet computer technology has made great strides in digital signal processing. If some IP ("intellectual property," a general patenting term that has been hijacked to also apply to algorithms that can be loaded as programs into computers, or to proprietary circuits that can be burned onto silicon) can perfectly well perform the same function as one of these primitives, why then, download the IP into the Damn Fast Math section and let the processing take place there rather than in some quirky, incomprehensible bundle of spaghetti in the green area. It might be faster as well.
Maybe quite a bit of processing can be handled by software running algorithms that are nothing like their counterparts in the brain - yet produce the same results.
Even though this essay isn't about computer languages per se (maybe BECAUSE this essay isn't about computer languages), I will throw in a little bit of programming here.
I have some ideas about how to most efficiently simulate neurons in the FSA. One of the unusual features in the architecture is the inclusion of dedicated Test Buses. These are 16 bit wide buses having the data format aaaaaaaabbbbhhhh,
where the top byte is an 8 bit address, followed by 4 test bits, and ending with 4 "handshake bits" that help the receiving sequencer decide how to handle the incoming test data (by the way, just as 8 bits is called a "byte", 4 bits are often given the name "nibble", sometimes spelled "nybble").
The address is always that of the sourcing sequencer. This is important because any given test bus is connected to the inputs of a large number of sequencers, and it is the programs running in those seqs that decide whether to either accept or ignore the data streaming by on the bus. For control purposes, a sequencer usually only needs to be set up to be receiving test data from one, or at most a very few, other processes.
However, to act as a neuron it needs to be connected to many other neurons. So the trick is to set up an emulating sequencer to accept ALL the test data going by and immediately filter out whatever neurons it is not supposed to be connected to.
It can use the address in the test word to do the filtering, and the easiest way to do that is by setting up a size 256 jump table array. To do so it has to shift the address 8 bits to the right. For efficiency [low level detail alert], the sequencer's input buffer will do this with hardwired logic rather than having to send the word to an ALU to perform the shift.
The simplest kind of jump table will contain only two branches. If this neuron isn't supposed to be receiving info from the neuron that sent the word just read from the bus, the jump will go right back to reading another word. If the word came from a 'friendly' neuron then the jump will go to a part of the program that can further process it. Time to look at those test bits.
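Since the FSA exists only as a paper design, here is a rough Python model of the receive-and-filter step just described. The unpacking, the boolean stand-in for the 256-entry jump table, and the specific addresses are all my own illustrative choices, not actual FSA code.

```python
# One 16-bit test-bus word: 8-bit source address, 4 test bits, 4 handshake bits.
def unpack(word):
    addr = word >> 8           # the hardwired right-shift in the input buffer
    test = (word >> 4) & 0xF   # the 4-bit "excitement" value
    hand = word & 0xF          # handshake bits (ignored in this sketch)
    return addr, test, hand

# Stand-in for the size-256 jump table: True means "process", False means
# "jump straight back to reading the next word off the bus."
accept = [False] * 256
for friendly in (3, 17, 200):   # hypothetical approved upstream neurons
    accept[friendly] = True

def receive(word):
    addr, test, _ = unpack(word)
    if not accept[addr]:
        return None             # unfriendly sender: go read another word
    return addr, test           # friendly sender: hand off the test nibble

print(receive((17 << 8) | (9 << 4) | 0b0001))  # (17, 9)
print(receive((42 << 8) | (9 << 4) | 0b0001))  # None
```

Note how the filtering is one table lookup per word, regardless of how many senders share the bus; that is the whole point of paying for the 256-entry table.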
Again, in a conventional control situation the test bits would typically be examined one at a time. But for this simulated nervous system it might be better to simply treat them as raw numeric data. Admittedly, four test bits yield a tiny numeric range (0 thru 15), but let's go back up the 'axon' to the sending process to see why I think this could be adequate.
A sending sequencer's job is to let the network know how 'excited' a neuron it is. When it puts a word on the test bus, it can load a 15 into the test nibble ("I'm really excited"), a 1 ("I'm barely awake"), or anything in between (it probably would not bother to send out a 0 ("I'm bored")). That's a pretty wide range, actually, between "Yahoo" and "blah." If it turns out not to be enough, the nibble value could always stand for an exponent. Then the range would go from 2^0 thru 2^15, or 1 thru 32768.
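The exponent trick costs nothing but a power-of-two table (or shift) at the receiving end. A two-line Python illustration, my own and not FSA code:

```python
# The same 4-bit nibble, read two ways: as a count, or as an exponent.
def decode_linear(nibble):
    return nibble          # range 0 .. 15, fine resolution

def decode_exponential(nibble):
    return 2 ** nibble     # range 1 .. 32768, coarse but wide

print(decode_exponential(0), decode_exponential(15))  # 1 32768
```

The trade is resolution for range; for sloppy neuron-style signaling that's probably an acceptable bargain.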
Meanwhile, back at the receiving neuron, its program would keep a running total of all the nibbles received from all the neurons on its approved input list. After a certain time interval it could also subtract a constant from this total ("I haven't heard much from you guys lately"). Also at certain times it would look at the total and translate that into a nibble value that it would then send out, adding to the flow on the test bus. The larger the total, the bigger nibble it sends to those neurons that have it on their respective input lists. Thus each simulated neuron is both a receiver and sender and therefore a full member of the network.
In a real nervous system, some neurons inhibit the ones downstream from them rather than excite them. That's fine, make the standard test nibble a signed value so its range is now from -8 thru +7. This is still the same resolution of data, just moved on the number line so subtractions can now easily affect the running totals.
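Pulling the last few paragraphs together, here's a minimal Python sketch of such a receiving neuron: it sums signed nibbles from approved senders, leaks a constant per time interval, and clamps its state back into the signed nibble range for resending. The class name, leak constant, and clamping rule are my own illustrative choices, not part of the FSA.

```python
class SimNeuron:
    """Toy model of one receiving/sending sequencer-neuron."""

    def __init__(self, inputs, leak=2):
        self.inputs = set(inputs)   # approved upstream addresses (the jump table)
        self.total = 0              # running excitement total
        self.leak = leak            # subtracted each interval: decay toward rest

    def receive(self, addr, nibble):
        # nibble is signed, -8 .. +7: inhibitory senders simply send negatives.
        if addr in self.inputs:
            self.total += nibble

    def tick(self):
        # Periodic decay: "I haven't heard much from you guys lately."
        self.total = max(0, self.total - self.leak)
        # Translate the total into the signed nibble this neuron sends onward.
        return max(-8, min(7, self.total))

n = SimNeuron(inputs=[3, 17])
n.receive(3, 7)      # excited friend
n.receive(99, 7)     # stranger: filtered out
n.receive(17, -4)    # inhibitory friend
print(n.tick())      # 7 - 4 = 3, minus leak of 2 -> sends a 1
```

Every knob here (leak rate, clamp, tick interval) is one of the "fiddle factors" discussed below.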
It would be easier [changing the design as I write about it alert] to update the internal totals if the test bus format were aaaaaaaahhhhbbbb. Maybe I'll do that. No, I won't - I just rechecked the FSA instruction set and this would not fit [changing the design back as I write about it]. Actually, the rightmost hhhh bits, even if not masked away, merely add a slight rounding-up bias which probably (hopefully) can be ignored. Neuron communication is sloppy anyway...
You might guess that it would have been even better to move the address all the way to the right and lose the shifting needed to address the jump tables. Sorry, it needs to stay left for when or if the architecture expands beyond 16 bits. The address cannot easily grow if it is bumping into h's & b's on its left [end of designing as I write for now].
There's no end of fiddle factors built in to this quite simple system. All the timing and weighting parameters can be adjusted to tune it in different ways. Let's add some random test bus activity just to get the communication started. When feedback is thrown into the mix - one neuron's output looping through the network to arrive back at some of its upstream neurons, a delightful complexity ensues.
This jump table design also provides flexibility for the network to reconstruct itself on the fly. A neuron can grow new connections simply by getting its address added to more jump tables. Connections can be dropped in a similar manner, by removing addresses.
Personally, I'm less interested in emulating animal nervous systems than merely seeing what sort of unexpected emergent behavior might pop out of this cross-connected, feedback-rich network as various parameters are tweaked. Still, here's a platform for many different research directions. And all this is constructed without even yet involving the green, 'official' neural net section in the graphic above.
At this point, the technically minded might be thinking that only a small neural system can be squeezed into however many sequencers can fit into the 4096 bit envelope. Valid point, but remember, those bits are control bits. Take a reasonably sized group of those (probably on the order of 16 - 64 bits) and use them to communicate with another envelope hiding underneath! That would allow a minimum of 16 envelopes to reside below just the I/O section. And those envelopes could beget other envelopes, and so on and so on. Remember: the small-world network paradigm consists of interconnected clusters.
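The nesting arithmetic is worth making concrete. A quick sketch, assuming the 64-bit link width from the parenthetical above and using one quarter (1024 bits) of each envelope for downward links:

```python
# 1024 control bits in the I/O quarter, carved into 64-bit links,
# gives each envelope 16 child envelopes; each child can do the same.
FANOUT = 1024 // 64   # 16

def envelopes_at_depth(d, fanout=FANOUT):
    """Number of envelopes at nesting depth d (depth 0 = the top envelope)."""
    return fanout ** d

print([envelopes_at_depth(d) for d in range(4)])  # [1, 16, 256, 4096]
```

Three levels down you already have thousands of envelopes, each a cluster of sequencer-neurons: exactly the interconnected-clusters shape the small-world paradigm calls for.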
Admittedly, bus contention and other limitations of highly parallel systems would catch up with this nested dolls approach for applying the FSA to neural simulation. Then again, any other attempted implementation of a silicon brain faces similar problems.
In my opening paragraph, I called artificial intelligence a 'faintly absurd subject.' Actually, I find it faintly distasteful. In reading about the Blue Brain Project on Wikipedia, I came across this quote by Henry Markram, the project's head honcho: "It is not impossible to build a human brain and we can do it in 10 years." Such arrogance!
Even granting that their neuronal simulations go well beyond anything I've described so far - their neurons are "biologically realistic" down to the "molecular level" - who says neurons are all the brain is about?
For every neuron in the human brain, there is also roughly one glial cell. These cells used to be considered just structural scaffolding for neurons, the real workhorses. But now researchers have begun finding that glial cells play a role, not yet well understood, in synaptic transmission. They are beginning to be considered partners of neurons, rather than merely glue. Glial cells make up about 90% of the brain's volume (probably the basis for the incorrect meme of "we only use 10% of our brain"), and apparently some definite percentage of its processing.
Speaking of processing, some scientists have speculated that there may be quantum effects taking place in the brain. This is unproven, and difficult to test, but if it turns out to be true, then simulating neurons alone definitely won't tell the whole story.
I finished my summary of Zimmer's article by writing: "And there seems to be just the right amount of background noise - chaotic nerve firings - to keep the network maximally sensitive to incoming sense data." I wonder how hard it will be to get that underlying crackle right? Will the Blue Brainers have a hard time 'priming the pump' to get their magnum opus to wake up?
Speaking of sense data, I'd guess they have some plans for sense input. Assuming they want to wire up transducers that connect to the physical world, hearing should be easiest, and an approximation of smell should be possible if they want to take the trouble. Taste is not so easy, and touch will be really hard. That leaves sight: hook up some cameras and they're good to go, right?
For an answer to that question, let's turn to the prolific Carl Zimmer, this time abstracting from his "The Brain" column in the September 2011 Discover Magazine. He writes about the heroic efforts to restore sight to the blind, with an emphasis on bionic implants, basically "by plugging cameras into people's eyes." This is starting to happen in a very crude way, but the technology should now both improve and drop in price fairly quickly.
There are some sticking points, naturally. One is that the photodiodes of current camera technology are laid out in a regular grid pattern. On the other hand, "The network of neurons in the retina ... looks less like a grid than a set of psychedelic snowflakes, with branches upon branches filling the retina in swirling patterns." This is not a good match at all. When they can finally make implants that can approximate the density of retinal neurons (the current ratio: The human retina has 127 million photoreceptors within 1,100 square millimeters, the best camera has about 12 million light sensors within the same space), the mismatch of grid to 'snowflakes' will mean a large percentage of the camera elements will go to waste.
They're working on it of course, but if they don't have that problem solved in time for Blue Brain, BB's camera input won't match what the brain is used to seeing.
Another problem: the retina's photo-cells are particularly concentrated in one place, the fovea, which effectively gives us humans a form of tunnel vision. We compensate by moving the eyes, always jumping the eyes' aiming point from place to place: "The frequency of a jump goes up as distance gets shorter. In other words, we make big jumps from time to time, but we make more smaller jumps, and far more even smaller jumps." I find it interesting that this "fractal" patterning is quite similar to the power law underlying the avalanches of neuron firing we looked at above.
Be that as it may, the point here is that if they don't give BB cameras that can move in a similar way, the camera input, once again, won't match what the brain is used to seeing.
Here's another gotcha: our brains exist in a physical body. With muscles, and movement. A major part of why we even have a brain is related to this fact. Le Anne Schreiber writes about "mirror neurons" in, of all places, the sports and pop culture site Grantland (This Is Your Brain on Sports).
Suppose you're watching a football game, and the camera zooms in on the quarterback dropping back to throw a pass. Obviously, a whole slew of neurons in his brain are orchestrating the muscle movements needed to perform this task. But guess what? The same set of neurons are firing in your premotor cortex at the exact same time! Not all of them, only about 20%. And your muscles themselves remain in couch potato mode, which is good - your fragile furnishings are safe.
It runs on a sliding scale: if a professional quarterback is watching the game, his premotor neurons will perfectly mimic the action, and his muscles will almost start to contract. A sports announcer's mirroring will also be nearly perfect, yet without the "motor evoked potentials" of the professional athlete. A general spectator's brain reaction will be more generalized, but still present. Even if you've never thrown a football in your life, as long as you somewhere, sometime threw something AT something, mirroring will take place.
The article ends by quoting one of the researchers interviewed, the UCLA neuroscientist Marco Iacoboni: "If this system is important for the things we believe it is, which is social cognition, understanding the mental states of others, empathizing with others, and how to learn things just by watching others, evolution must have devised something that makes us feel good when we activate these cells, which makes us do it more and more, because that is an adaptive mechanism."
Which brings me to my final point. I would suppose that the Blue Brain researchers intend to simulate an adult human brain. But a real adult brain isn't a static snapshot taken at a specific time. It's the result of a long and winding road of socialization that begins at birth and continues for a lifetime.
The BB Wikipedia article says, "It is hoped that it will eventually shed light on the nature of consciousness." I sincerely hope that when they flip the switch, poor Little Brain Blue does not actually awaken to some level of consciousness. This crude, first-pass conglomeration of computer parts will have no connection to the human-class brain "living" within it. Imagine the confusion, the angst ... the lonely pain.