GENERAL APPLICATION PROCESSING



Welcome to the home page of General Application Processing™. Read about the creation of a Flexible System Architecture™ for the next generation of computer, System-On-Chip, and multicore processor design.



Screenshot of the FSA™ Simulator thirteen cycles into a CORDIC computation of sine and cosine. Input: 45 degrees. Output: the cosine is at the top of P1 at level C3, and the sine is just below it. Both values are 0.70703125 in a fixed-point format with thirteen bits to the right of the binary point.
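For readers unfamiliar with CORDIC, here is a minimal sketch of the rotation-mode algorithm in Python, using the same thirteen fractional bits as the caption. It is a generic textbook version, not the FSA Simulator's code. The names (`FRAC_BITS`, `cordic_sin_cos`), the choice of one iteration per cycle, and the rounding of the tables are all assumptions made for illustration.

```python
import math

FRAC_BITS = 13             # bits to the right of the binary point (from the caption)
ONE = 1 << FRAC_BITS       # fixed-point representation of 1.0
ITERATIONS = 13            # assumption: one CORDIC step per simulator cycle

# atan(2^-i) in fixed-point radians, one entry per iteration.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# Each step stretches the vector slightly; start from 1/gain to compensate.
_gain = 1.0
for _i in range(ITERATIONS):
    _gain *= math.sqrt(1.0 + 2.0 ** (-2 * _i))
K = round(ONE / _gain)


def cordic_sin_cos(angle_deg):
    """Return (cos, sin) as fixed-point integers for an angle in [-90, 90] degrees."""
    z = round(math.radians(angle_deg) * ONE)   # remaining angle to rotate through
    x, y = K, 0                                # start on the x axis, pre-scaled
    for i in range(ITERATIONS):
        if z >= 0:   # rotate counter-clockwise by atan(2^-i)
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:        # rotate clockwise by atan(2^-i)
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return x, y
```

Calling `cordic_sin_cos(45)` and dividing each result by `ONE` should give values close to 0.7071. The low-order bits depend on rounding choices the caption does not describe, so they need not match the simulator's 0.70703125 exactly.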


Updated — 2008




Guest Pages

Brian's CNC Site | Brian's Tesla Lab

GenAPro continues to welcome two additions to the site. These areas demonstrate the hard work of Brian Foley and his active mind and hands.





Prior Homepage | Vision & Mission | Executive Summary | Technical Stuff | Sitemap


Distinguished Visitor —

I am changing the format of this homepage to a simpler, hopefully more readable question and answer format, similar to a column you might see in a technical magazine. Of course in this case, the 'editor' and the 'interviewee' are both the same person. But why not?

I've also reduced the links to the bar above. The original home page is there, with its greater detail, and any of the pages on the bar contain a fuller set of links to the site.

So, let's get right into the "interview":


In 25 words or less ...

The FSA provides extremely low-level control of things that run extremely fast.

Bit-twiddling, in other words.

Yes. On things that run the fastest: fast transistors, fast gates, fast logic.

And what makes the FSA able to keep up?

And hopefully, stay ahead! A couple of things:

Its "Standard Sequencer" is extremely small and simple. It has no microcode; indeed, it has very little control logic at all, which keeps instruction cycle times short. The instructions themselves are minimal, and need no pipelining.

About half the instructions (and the most commonly used) operate directly out of memory. It takes one gate delay to recognize these instructions, and then BOOM, they're on their way; or, boom, they've checked a test bit and are already looking up a response. The rest fall into a decoder that's simple enough to keep up with the clock rate.
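The page does not give the actual instruction encoding, so the following Python sketch is purely hypothetical. It assumes a 16-bit instruction word whose top bit separates the "directly out of memory" class from the decoded class (the single check standing in for the "one gate delay"), and whose second bit separates plain sends from test-bit instructions. The names `Kind` and `classify` and the bit positions are invented for illustration.

```python
from enum import Enum


class Kind(Enum):
    SEND = "send"      # word goes straight out to a subsystem
    TEST = "test"      # check a test bit, then look up a response
    DECODE = "decode"  # everything else falls into the decoder


def classify(word):
    """Classify a 16-bit instruction word by its top two bits (hypothetical layout)."""
    word &= 0xFFFF
    if word & 0x8000 == 0:
        # Top bit clear: recognized with a single-bit check, no decoder involved.
        return Kind.TEST if word & 0x4000 else Kind.SEND
    # Top bit set: handed to the (simple) decoder.
    return Kind.DECODE
```

Under this assumed layout, half of the encoding space bypasses the decoder entirely, which is the property the answer above describes. The real FSA may divide its instructions quite differently.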

That covers speed. What about power?

If you're talking about power dissipation, I would expect the FSA to be used primarily in applications that have the circuitry running pretty hot anyway. Still, a 16-bit core should certainly use less power than a 32-bit core, if that's a requirement.

On the other hand, if you're asking about processing power, the FSA has a multi-path ALU available to crunch control data. One path is based on the venerable TTL 181, yielding common add, subtract, compare, and bitwise logic functionality. But there's more: other paths perform proprietary operations that a standard ALU can't do, at least not without multiple clock cycles. So both time and code size can be reduced.
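As a rough illustration of the conventional path only, here is a Python sketch of a 16-bit ALU offering the add, subtract, compare, and bitwise logic functions mentioned above. It does not reproduce the 181's function-select codes, and it says nothing about the proprietary paths. The operation names and the `alu_181_path` function are hypothetical.

```python
MASK16 = 0xFFFF  # 16-bit word size, matching the FSA core width


def alu_181_path(op, a, b):
    """Return (result, carry_out, equal) for one 16-bit operation.

    The string operation names are illustrative, not real select codes.
    """
    a &= MASK16
    b &= MASK16
    if op == "add":
        full = a + b
    elif op == "sub":
        full = a + ((~b) & MASK16) + 1   # two's-complement subtract
    elif op == "and":
        full = a & b
    elif op == "or":
        full = a | b
    elif op == "xor":
        full = a ^ b
    else:
        raise ValueError(f"unknown operation: {op}")
    result = full & MASK16
    carry_out = full > MASK16            # for "sub", True means no borrow
    equal = a == b                       # comparator output, independent of op
    return result, carry_out, equal
```

For example, `alu_181_path("sub", 5, 3)` returns `(2, True, False)`: the difference, a carry-out indicating no borrow, and an "equal" flag that is false.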

But it doesn't stop there. Even a multi-path ALU can still be limited by bus contention issues, something that in a multicore system can bog things down very easily. So the FSA has an overlaid communication system that bypasses the ALU altogether. I can't really go into that any further here, except to say that two hands are usually better than one.

[Sidebar]

Imagine the dealer in a game of blackjack. The dealer has to pass out cards to the players around the table, and also has to be responsive to input from those players ("hit me"), which can affect the order of the card distribution.

The FSA is obviously the dealer in this analogy. What makes it different from the zillion other cores that are out there? First, the FSA is optimized to be the fastest dealer in the business.

One way to take advantage of its speed is to have a game with only smart players. In a friendly game, the dealer might offer some advice from time to time when a player asks ("Should I stay on 16?"), but this slows down play while the dealer responds. When the FSA is the 'dealer', the 'players' should definitely have the primary responsibility for their own play.

To stretch the blackjack analogy a bit, the FSA is also the dealer with the most cards. Within its instruction set's addressing range, the architecture can pass a plethora of information to IP subsystems over multiple paths. Again, it all comes down to what a computer design is optimized for. Not only is the FSA a Control Architecture, but the control is fast and ... dense.

You wrote in your original homepage that "the FSA machine language is its own microcode," yet you said above that there was no microcoding of the sequencer.

If you want to fit a general descriptive category, you could say it is closest to vertical microcode. The difference is, where is that microcode aimed? It's aimed at application logic rather than at its own hardware.

When I was first developing the instruction set, I had a type of indirect instruction that I assumed I would have to "microcode," that is, have two standard sequencers side by side and use one of them to step the other through the completion of the instruction. That turned out to be unnecessary. A closer look at the problem revealed that the target sequencer had enough control built in to handle the instruction by itself.

Yet the facility is still there to put one sequencer beside another to handle more complex tasks. In fact, I like to bundle a "four-pack" of them together to allow for easy context switching and various kinds of threaded code. But so far, no need for one to take over and run another.

So, the microinstructions are primarily meant to be sent directly to IP subsystems, the sequencer acting like the dealer in a game of cards. [see sidebar]

Of course, sometimes the IP wants to talk back, so a full one quarter of the instruction set is dedicated to responding to what comes over an internal test bus.
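A test-and-respond instruction of the kind described above might look something like the following Python sketch. It assumes the instruction names one bit of the test bus and carries two possible next addresses. The names `HIT_REQUEST_BIT` and `next_address`, and the two-address form itself, are hypothetical and not taken from the FSA instruction set.

```python
HIT_REQUEST_BIT = 3  # hypothetical: bit raised by a subsystem that wants attention


def next_address(test_bus, bit_index, addr_if_clear, addr_if_set):
    """Pick the sequencer's next address from one bit of the test bus."""
    bit = (test_bus >> bit_index) & 1
    return addr_if_set if bit else addr_if_clear
```

With these assumptions, `next_address(0b1000, HIT_REQUEST_BIT, 0x0100, 0x0200)` selects `0x0200`, the "player said hit me" path, while an all-zero test bus selects `0x0100`.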

And if that test bus itself gets overloaded?

Why, then just drop in more sequencers. They're small, and some IP may not need the full ALU capability, so localize the control resources, and buses, where they're most needed.

A lot of people may have trouble with the 16-bit size ...

I understand. "The '80s have called and they want their architecture back." Look, this FSA core doesn't care if your data paths are 64 bits or wider, or how much memory your application is addressing. Its job is to act as a traffic cop directing the bigger, wider bustle going on around it. I mean, your head is smaller than your body, right?

I have held for a long time that sixteen bits is the very best word size for this kind of low level control; I'm absolutely convinced of this. Yes, if you're doing long computations or need floating point, you'll want more bits, but to control something? If you need more than 16 bits, you aren't thinking the problem through!

So, what applications do you see for the FSA?

One that I added to the Executive Summary a while back was network processing. Expensive custom logic ruled the day at the time, and the FSA seemed a perfect fit. By now there are several commercial NPUs out there, and creating software for them is typically the biggest bottleneck - a game in which the FSA would now be playing catch-up.

But you get the idea. Potential niches are always coming along. The core features of fast processing, inherent parallelism, and a small silicon (or GaAs?) footprint will always apply to high performance and multicore design.

How about standalone or embedded applications?

I remember when the first Palm Pilot came out. I was working on an earlier version of this architecture, and I thought, "Damn, that should be MY chip in there!" But it wasn't yet ready for prime time.

Nowadays, while you could certainly make one heck of a single board computer out of the FSA, how many such systems would it be competing against? That niche seems pretty full to me.

I suppose you could say using an architecture as an on-chip core is a form of "deep embedding," so maybe after getting its instruction set entrenched in several such designs, the FSA could begin to gain popularity in standalone development. That will be several years down the road if it happens.

Any final words?

Well, this may be merely an inventor's conceit, but the FSA is a beautiful architecture. Elegance may not always win in the marketplace, but it is certainly more enjoyable to develop within.


— Bob Loy, Founder