Why Forth?
I've reached the point in this blog where I can start getting away from reviewing elementary concepts and start waxing philosophical.
I do have to talk about my Flexible System Architecture a little bit, because it, its simulator, and the Forth language will be the platform on which I perform my language "experiments."
Yet my original choice of Forth was not driven by such noble aims, but rather from feeling the need for a higher level, more structured approach than a 'standard' assembly language in order to tame the FSA.
A general discussion of the FSA is at the genapro.com home page, but here's a quick overview of features relevant to this essay:
Roominess - One reason I took to programming in higher level languages, particularly C, is that I always felt cramped when programming in typical machine languages. They just seemed to somehow get in their own way. It was futzy dealing with them. I started asking myself, "Do things always have to be so sequential? Moore's Law is doubling the size of computers every 18 months anyway, so can't things be spread out more in space?" Personally, I visualize better in 2D than 1D.
A massive ALU - The Central Processing Unit of the average computer is a powerful general purpose collection of logic, but its development has remained pretty static for decades. I thought I could do better, so I created "Bus Path Processing" which consists of eight separate logic paths available on the data bus. The venerable TTL '181' was one path, providing the usual well known arithmetic and logic functions.
With the other paths, I tried to expand the low level 'bit twiddling' that could be performed. For one extra path I named two instructions DUNK and SLAM, that could in one cycle change bit patterns in a way that would take at least two passes through a 181 or similar ALU. Another path is a complete shift matrix (plus rotate) that can shift any number of bits any direction in one cycle. This isn't the place to detail all the paths, but the idea is to ultimately end up with more condensed, powerful machine code. After all, the FSA is first and foremost a "Control Architecture."
The FSA has a very rich instruction set anyway, but adding these multiple processing paths makes it even richer. A standard mnemonic based assembly language approach seems like it might grow too large to keep track of, let alone work with easily.
Parallelism - This harkens back to the roominess concept. At a company long ago and far away I was once given the task of finding a bug in some machine code using an ICE (In Circuit Emulator) for the Motorola 68020 chip. I found the bug and then took some time to poke around, since it was the first time I had a chance to look at compiled C. We had calls to many of the floating point library routines, and I was pretty amazed to see how little space these very involved mathematical functions took. Once again thinking of Moore's Law, I thought, "Someday every function could have its own processor to run out of." That's ambitious, and maybe a bit naive, but it's not a bad concept to try to run with. So it leads to ...
Small, agile sequencers - Maybe because my first real professional engineering task was to translate a handful of little ECL (Emitter Coupled Logic) transistor circuits into a logic diagram of a push-the-envelope microsequencer, I developed a lifelong interest in microcode, nanocode and firmware. The heavily pipelined CPUs of modern computer chips seem overwrought. The FSA "standard sequencer" is a very tiny system consisting of two counters, a memory, and a decoder. Fast, compact, and simple. And you can put a lot of them on a chip.
Some quotes from the genapro.com home page: "... I like to bundle a "four-pack" of [sequencers] together to allow for easy context switching and various kinds of threaded code. ... Speaking of threaded code, I recently wrote a Forth interpreter / compiler for the FSA. It almost seemed like cheating to have four interoperating sequencers available to spread the dictionary, stacks, and everything else into. I have to say I gained a lot of respect for Chuck Moore and other Forth pioneers who did their development on single micros with only one memory space and limited scratch registers."
Here's a screen shot of a Forth session where I compiled
: ott 1 2 3 + + . ;
and then ran it.
It shows parts of four memories associated with four sequencers:
From left to right are the Forth dictionary, kernel, stacks and input buffer, associated with standard sequencers 3, 2, 1 and 0.
In the dictionary is the compiled form of "ott" with three embedded unnamed constants at the bottom of the word. Just above them is the "thread list" of the addresses of those constants along with pointers to '+' '+' and '.' . And at the top is the link to the word above, the LRS, character count, and ascii string.
Typing in "ott" transferred the thread list to the call stack in the middle of sequencer 1, and when it ran, it made use of the parameter stack at the top of seq 1. At first, the parameter stack was loaded with 1, 2, 3, and as processing progressed, it left behind 1, 5, 3 and finally 6, 5, 3. You may notice the stacks are upside down. Even though the sequencer counters can count either up or down, as a programmer I've been used to incrementing rather than decrementing, so I designed the stacks to work under 'anti-gravity'. Sorry.
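The stack behavior above can be sketched in a few lines. This is not the FSA simulator, just a minimal Python model of an upward-growing ("anti-gravity") parameter stack where a pop only moves the stack pointer, so popped values are left behind in memory - which is exactly why the screen shot shows 1, 5, 3 and then 6, 5, 3:

```python
# Minimal sketch (not FSA code) of the seq 1 parameter stack running
# : ott 1 2 3 + + . ;
# Pops just move the stack pointer; old values stay behind in memory.

class ParamStack:
    def __init__(self, size=8):
        self.mem = [0] * size   # a small slice of sequencer 1's memory
        self.sp = 0             # next free slot; grows upward

    def push(self, value):
        self.mem[self.sp] = value
        self.sp += 1

    def pop(self):
        self.sp -= 1
        return self.mem[self.sp]  # the value is left behind in self.mem

def run_ott(stack):
    for n in (1, 2, 3):
        stack.push(n)                          # memory holds 1 2 3
    stack.push(stack.pop() + stack.pop())      # 2+3 -> memory holds 1 5 3
    stack.push(stack.pop() + stack.pop())      # 1+5 -> memory holds 6 5 3
    return stack.pop()                         # '.' prints this

s = ParamStack()
print(run_ott(s))   # 6
print(s.mem[:3])    # [6, 5, 3] -- the residue visible in the screen shot
```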
Now we can get into the language development side of things.
My personal relationship with Forth has had its downs and ups. Forth and I were both young, professionally, at about the same time. Unfortunately, there were a lot of Forth Fanatics around then, some of them writing books, all of them claiming what a far better language Forth was than anything anyone else could possibly conceive.
Since I am a fanatical anti-fanatic, their claims (and books) turned me off.
Some years later, I obtained ("obtained" - meaning I can't now remember whether I begged, borrowed, bought or was gifted) a copy of Alan Winfield's "The Complete Forth", a very thin tome filled with a minimum of fanaticism and a maximum of interesting things to think about.
Even though it would be years before I actually did anything with it, it became one of those books which always moved with me whenever I did. It is the source material I dug out when it came time to start adding Forth to the FSA.
The book is so old that it covers the FORTH-79 standard. That's right, 1979. Don't worry, I've already started not implementing that particular version of Forth.
And that's apparently just fine. Here's a quote from Doug Hoyte (http://www.hcsw.org/reading/forth.txt): "However, forth is the only programming language that has a strong, vocal user community that is actively against the ANSI standard of its language, and chooses not to support it." And as far as I can tell from my reading, Charles Moore, the original inventor of Forth, barely chooses to even notice it.
Everybody, it seems, 'rolls their own' Forths. When I get done with this thing, whatever it turns out to be, I probably won't even be able to name it "Forth," partly because it will be too tightly coupled to the quirks of the FSA, and partly because some things about Forth, as marvelous as it is, annoy me.
Jargon is part of it. From this point on in these essays, "word" and "dictionary" are gone. Kaput. I've previously had to use "slot" and "cell" to describe a "memory word" to avoid confusion with the Forth WORD. But my hardware background says the FSA is a 16 bit architecture which means it handles 16 bit 'words'. It addresses 16 bit 'words' in memory. To everybody who isn't a Forth programmer, that's what a 'word' is.
As for "dictionary": to me, it's just a clunky description. If I were teaching computer science to elementary school kids ... well, enough flaming, you get the idea. The dictionary is considered a "symbol table", or to be really modern, let's call it a "symbol collection". Yeah, that's it. And Forth words are now "tokens".
About those wor... tokens: A lot of the standard tokens in the FORTH-79 standard were pretty well named (IMHO). Some of them, maybe not so much. In fact, the various Forth 'standards' have renamed some of them as time passed. Forth, like C, is very terse, some would say cryptic. But programmers hate typing. We do! Give us cryptic over verbose any day of the week. And yet, the longer Forth names were always ALL CAPITALS. Are you kidding me!? (fume, froth). This C programmer put up with DUP for a few weeks when I started Forth-ing the FSA, but that dict... symbol collection entry got changed to "dup", permanently. Seems like a lot of the modern Forths have changed to lower-case as well.
Well, I don't want to come off like one of those lab coated "Far Side" scientists fighting over some unbelievably trivial thing (too late), so I'll move on.
But not too far. I still need to discuss the parlance of some of the tokens that will be standard to FSA Forth (should I call it "North", as in "Not Forth"?).
Remember, I'm replacing / augmenting assembly language for my own architecture. I've decided, for better or worse, to jump directly from the base four FSA native code, e.g.,
# add two integers and store result in reg # 4
03101013   # direct set choosing alu181 path
00103203   # IMMED into register # 4
00000003   # data - binary 3
03233203   # IMMED to BPP_ALU_REG
00000011   # data - binary 5
03133203   # IMMED to ALU_CTRL_REG
00002100   # instruction - load PLUS-no-carry-in encoding
00103303   # CYCLE from-to register # 4
01000030   # reset alu181 path (load 'straight' path)
that the simulator now reads and processes (and of course, always will) to a Forth-like higher level system.
My very first Forth symbol collection ("symcollection") contained the token
sl0f
which stands for "shift left with 0 fill". This is a very standard ALU operation, shifting a bit pattern to the left, pushing the MSB off the end and pushing in a Zero to be the new LSB. However, very few ALUs let you push a One into the LSB. Most of the time you don't want that anyway, but if you do, you have to follow the shift by ORing with 000...001 to get that One in there. The FSA Shift Path allows either kind of fill however, so that implies the need for a sl1f token. And naturally, you can also shift right, which would demand tokens for sr0f and sr1f.
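The four fill variants are easy to pin down precisely. Here's a hedged sketch in Python of what these single-bit shifts do to a 16 bit FSA word (the function names match my tokens; the implementations are just the textbook operations, not FSA microcode):

```python
# Single-bit shifts on a 16-bit word, one function per proposed token.
MASK = 0xFFFF

def sl0f(x):
    return (x << 1) & MASK            # shift left, 0 into the LSB

def sl1f(x):
    return ((x << 1) | 1) & MASK      # shift left, 1 into the LSB

def sr0f(x):
    return (x & MASK) >> 1            # shift right, 0 into the MSB

def sr1f(x):
    return ((x & MASK) >> 1) | 0x8000  # shift right, 1 into the MSB

# The usual ALU workaround for a 1-fill is a shift followed by an OR:
assert sl1f(0x8001) == sl0f(0x8001) | 1
print(hex(sl0f(0x8001)))   # 0x2 -- the MSB fell off the end
print(hex(sl1f(0x8001)))   # 0x3
```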
What we'll end up with is lots more mnemonics than the typical assembly language, and they need to be tamed. One subject I'll get to later is the idea of weeding the operations that aren't used for a given program out of the symcollection for that program.
But mnemonics and abbreviations in general tend to make assembly languages hard to use. They are certainly better than the alternative of dealing with the hexadecimal (or base 4) representations of machine words, but for a big chip with a lot of instructions, the assembly language is a lot to hold in your head. And the FSA, for all its simplicity, has a big native language.
I prefer using symbols over mnemonics. I think I'll use the following substitutions, for example:
sl0f == <<0
sl1f == <<1
sr0f == >>0
sr1f == >>1
making them sort of C-like (and for the same Bus Processing Path I'll use <<< and >>> for Rotate). Of course, some programmers might prefer the mnemonic versions. The nice thing about a Forth style symcollection is that it can contain both versions. To save space, maybe I'll come up with a "synonym" symcol entry type that consists only of a pointer to the active code stored under a sibling name. The nice thing about Forth is that you can do things like that.
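To make the "synonym" idea concrete, here's a minimal sketch, in Python rather than Forth, of a symbol collection where a synonym entry holds only a pointer to the code stored under its sibling name. The names (`define`, `synonym`, `find_code`) are illustrative, not actual FSA Forth internals:

```python
# A toy symbol collection with two entry kinds: "code" entries hold
# the active code, "synonym" entries hold only a pointer to a sibling.
defs = {}

def define(name, code):
    defs[name] = {"kind": "code", "code": code}

def synonym(name, sibling):
    defs[name] = {"kind": "synonym", "ref": sibling}  # pointer only

def find_code(name):
    entry = defs[name]
    while entry["kind"] == "synonym":   # chase the pointer to the code
        entry = defs[entry["ref"]]
    return entry["code"]

define("sl0f", lambda x: (x << 1) & 0xFFFF)
synonym("<<0", "sl0f")                  # symbolic alias, costs one pointer

print(find_code("<<0")(0x0001))   # 2 -- same code under either name
```

Either spelling dispatches to the same stored code, so the mnemonic and the C-like symbol can coexist for the price of a pointer.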
You know what? "symcol" isn't working for me either. One more nerdy word for people to remember. Let's just replace "dictionary" with "definitions". Everything in the symbol collection is a definition. Even a true primitive like '+' shares space in the collection with the definitions compiled by a programmer. There's no:
: + means ... ;
it simply is. And yet it is still basically a definition, typed in as a token.
One final thing before moving on: the standard Forth source comments go inside parentheses. Man, I am so used to seeing function parameters and control expressions go into parentheses, not comments. But then, Forth parameters are prefix tokens surrounded by whitespace anyhow, so why not let comments use them?
No biggie, except I'd ultimately like to add a Lisp-like lambda functionality to my "perfect language" and everybody's used to looking at parentheses in that context.
Now to tackle a subject I know almost nothing about: Lisp, which stands for List Processing, and which (big surprise) is yet another computer language. In my introduction, I named Paul Graham and Charles Moore as two sources of inspiration. So far, this blog has devoted a lot of space to Chuck, but where has Paul been?
Unlike Mr. Moore with Forth, Paul Graham isn't the inventor of Lisp, but he is one of its biggest proponents, and just as important to my practical mind, he's used it with success commercially.
Forth and Lisp are very different in many ways, but what they have in common is they are languages that can write their own code in their own language and then run that code. They can self-referentially extend themselves.
Machine languages can do this, because after all, that is the very language they speak. But for higher level English-like languages to be able to perform the same feat - that is rare.
I suppose any full featured HLL could do it if you 'torture' it enough, but Forth & Lisp can do it naturally. I don't know if Forth kind of backed into this ability by being so powerful and low level, but Lisp was designed from the start to have this degree of abstraction.
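Neither Forth nor Lisp, but here's the trick in miniature, in Python: a program that writes source text in its own language and then runs it. This is only an analogy for the self-extension described above, with made-up names throughout:

```python
# A program generating and executing code in its own language --
# a crude stand-in for Forth defining tokens or Lisp macros.

def make_adder_source(n):
    # Write new source text in the host language itself.
    return f"def add{n}(x):\n    return x + {n}\n"

namespace = {}
for n in (1, 5):
    exec(make_adder_source(n), namespace)   # the machine does the typing

print(namespace["add5"](10))   # 15 -- a function the program wrote itself
```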
As computer languages, I've grown to respect these two more even than C. And it is painful for me to write that. I've LOVED C. I've written more code in C than all other languages put together. Way more. I've got to be in six figures when counting up lines of C I've produced. Maybe seven figures when I count rewrites.
I always wanted to create my own computer language, but when I discovered C, I realized that it was the language I would have invented if I had known as much about computers as Dennis Ritchie, its author. At the time, I didn't know one tenth as much as D.R., so I gave up and just accepted C as the language I would live in from then on. And I have, for more years than I will admit here.
In my introduction, I asked how one comes up with a computer language that is truly powerful. And Moore and Graham seemed to tell me it comes down to two things:
One - is it powerful as an efficient consumer of machine resources?
Two - is it powerful as an abstracter of a programmer's ideas?
Forth definitely has an edge on the first. Indeed, I'm not sure Lisp programmers even care about how they are using resources. But maybe I'm misunderstanding their approach.
For me, number one is extremely important. Even in my C days, I would sometimes wonder something like, "If I used a while-loop instead of a for-loop, would that compile into more compact code?" I never got around to examining this because it wasn't part of my job description, and besides, I was having too much fun writing C to worry overmuch about it. Yet the nagging question was there in the back of my mind.
Now that I have a simulator that can produce images like the screen shots above and on the genapro home page, and provide many other monitoring functions, I have the tool to answer this type of question for every modification I add to my language. You're darn right efficiency is still my main interest! I can run experiments in my digital lab, and know exactly what resources are being used, and why.
On the second question, Lisp probably has a slight edge over Forth as an abstracter of a programmer's ideas. Here's a quote from Graham (http://paulgraham.com/progbot.html): "Experienced Lisp programmers divide up their programs differently. As well as top-down design, they follow a principle which could be called bottom-up design-- changing the language to suit the problem. In Lisp, you don't just write your program down toward the language, you also build the language up toward your program."
"Changing the language to suit the problem" - what a concept!
And it's not just an intellectual concept. In various essays, including (http://paulgraham.com/avg.html), he has a recurring theme about startups getting ahead by hiring really smart programmers using a really powerful language to develop whatever they're developing. His first startup, Viaweb (later the Yahoo Store) was created by a very small group of hackers writing in Lisp, and having Lisp extend its reach by way of "macros", which is what Lisp uses to write more Lisp. They were able to stay ahead of competitors because they could provide new features faster than anyone else. And why not, if you can have the machine do the typing for you, so to speak.
Sound too good to be true? Then don't even bother to go online and find some of Chuck Moore's articles and lectures. Now, there's a radical. He developed Forth over many years on a series of computers that are antiquated by today's standards, and when he finally got tired of being limited, he moved from Forth to cmForth, a "Machine Forth" that ran on a chip he helped design. That was the first of many iterations - over the next several years - of tightly coupling the language to the logic.
As Graham points out (also in avg.html), "... since the 1980s, instruction sets have been designed for compilers rather than human programmers", but Moore has taken things to a hyper-level.
Struggling with other people's expensive and complicated software for modeling and laying out chips, he developed OKAD, an experiment in 'sourceless programming', and a much faster and self-contained system for creating ICs. One chip was named the S40 because it contained 40 individual Forth computers.
I'd accuse him of stealing my ideas but he sort of got there first. Of course, I designed the machine first and am now tossing a language at it, while he went the other direction.
Lest you think Moore is just a hardware jockey who writes a program now and then, here's something he had to say in a talk he gave in 1999 (http://www.ultratechnology.com/1xforth.htm): "The whole point of Forth was that you didn't write programs in Forth you wrote vocabularies in Forth. When you devised an application you wrote a hundred words or so that discussed the application and you used those hundred words to write a one line definition to solve the application. It is not easy to find those hundred words, but they exist, they always exist."
In other words, "Changing the language to suit the problem."
Is the converse true? Can we say if the language isn't part of the solution, it's part of the problem? That's pretty drastic. And yet ... Viaweb's hackers (a term Graham treats with respect) won in the marketplace because they were competing against bureaucratic corporations with too many programmers using less powerful languages to maintain bloatware running on buggy operating systems, while the Viaweb developers were doing everything in Lisp. The web server was in Lisp, the applications were in Lisp - and Lisp was in Lisp.
And Forth can be in Forth. There's a traditional story that in the old days of floppy disks, you could start to copy the language from the floppy, and Forth would metacompile itself into system memory before the floppy had even spun up to full speed. Hey, I've never seen it happen, but then I've never not seen it happen, either.