lavandeira.net

Relearning MSX #15: MSX-C commands: CG.COM

Posted by Javi Lavandeira in How-to, MSX, Retro, Technology | February 18, 2015

In the previous two posts we’ve seen a detailed description of CF (MSX-C’s parser) and FPC (MSX-C’s function parameter checker). Today we’ll see the last one: CG.COM, MSX-C’s code generator.

What I said about the previous two articles applies to this one as well: this explanation is for experienced users who want to have more control over the compile process. If you’re a beginner then you don’t need to pay too much attention to this (yet). However, it won’t hurt to at least skim it and find out what options are available.

MSX-C’s code generator (CG.COM)

CG.COM : Code Generator

CG’s purpose is to take the intermediate code generated by CF (the TCO file we saw in the previous posts), and create the assembler source code for our program. This code will be a file with the same base name as the original .C source code, but with the .MAC extension. At this point MSX-C’s job is done, and the two remaining steps (assembling and linking the executable program) will be handled by M80 and L80, both part of the MSX-DOS2 TOOLS package. We won’t see these in detail for now.

Let’s see a simple example:

Example program 1 for CG.COM

The program above is a very simple one: it adds two integers and prints the result on the screen.

Generating the Z80 assembler code is very simple: first, run CF as we saw two posts ago, and second, run CG with exactly the same file name:

CG generates assembler code

After the copyright message CG will print the name of every function it finds in the program, followed by a progress indicator. If there are no errors, CG will output “complete” and save the resulting .MAC file to disk:

1.MAC file generated by CG

And now we can open the 1.MAC file in AKID in case we want to see the assembler source code:

Assembler source code generated by CG

At this point we can feed this assembler code to any M80-compatible assembler and linker to create the program executable. The C.BAT batch file that we created in chapter #12 already does this, so we don’t need to touch anything unless we want to do the assembly process manually.

How to read the status indicator

In order to understand the meaning of the symbols in the output, let’s see a longer program (NUMS.C):

NUMS.C : Simple C program that adds and subtracts

This is a very simple program that takes two integer parameters from the command line and prints their sum and their difference. It checks that there are enough parameters on the command line and prints a usage string if the parameters aren’t correct.

After generating the TCO file with CF we run CG on the result:

CG processing the NUMS program

The first thing we see is that CG outputs the name of each function it finds in the program: sum(), subs(), usage() and main().

The meanings of the symbols after each function name are as follows:

Each dot (.) represents an instruction inside the function.
A colon (:) indicates that CG is performing optimization on this function.
A semicolon (;) indicates the end of the function.
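To make the legend concrete, here is a minimal sketch that tallies the three symbols in a status string. It is written in standard C for a modern compiler, not for MSX-C, and the names cg_progress and read_progress are made up for this illustration. It assumes nothing about the order in which CG prints the symbols.

```c
#include <stddef.h>

/* Hypothetical summary of one function's status indicator. */
struct cg_progress {
    int instructions;   /* number of '.' characters seen */
    int optimized;      /* 1 if at least one ':' was seen */
    int finished;       /* 1 if a ';' was seen */
};

/* Walk the status string and count the symbols described above.
   Any other character is ignored. */
struct cg_progress read_progress(const char *marks)
{
    struct cg_progress p = {0, 0, 0};

    for (; *marks != '\0'; marks++) {
        if (*marks == '.')
            p.instructions++;
        else if (*marks == ':')
            p.optimized = 1;
        else if (*marks == ';')
            p.finished = 1;
    }
    return p;
}
```

For an invented status string such as "....:;", read_progress() reports four instructions, an optimization pass, and a finished function.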

Command line parameters

CG has five command line parameters: -k, -l, -o, -r and -u:

-k

This option tells CG to delete the TCO file after successfully creating the assembler source code.

-l

CG’s default behaviour is to truncate symbol names (functions, constants, etc.) to six characters. The reason is a limitation in M80/L80: the L80 linker doesn’t support symbols longer than six characters. This option tells CG to output long symbol names of up to sixteen characters.

Consider the following example program:

3.c: Symbols longer than 6 characters

This program contains two functions, whatever() and whatevah(), whose names are eight characters long. CG will generate assembler code for this program just fine, but because the function names are truncated to six characters, both will end up having the same name and the assembler source code won’t work:

Two functions with the same name after truncating them to six characters
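The clash can be checked with a small sketch. It is written in standard C for a modern compiler, not for MSX-C, and the helper names truncate_symbol and symbols_collide are invented for this illustration. It models truncation as simply keeping the first six characters, which is an assumption: the labels CG actually writes to the .MAC file may be decorated differently.

```c
#include <string.h>

#define SHORT_LEN 6    /* CG's default symbol length, as described above */
#define LONG_LEN  16   /* symbol length when the -l option is used */

/* Copy at most `limit` characters of `name` into `out` and terminate it.
   `out` must have room for limit + 1 bytes. */
void truncate_symbol(const char *name, size_t limit, char *out)
{
    size_t n = strlen(name);

    if (n > limit)
        n = limit;
    memcpy(out, name, n);
    out[n] = '\0';
}

/* Return 1 if the two names become identical once each is cut to
   `limit` characters, 0 otherwise. */
int symbols_collide(const char *a, const char *b, size_t limit)
{
    return strncmp(a, b, limit) == 0;
}
```

With these helpers, symbols_collide("whatever", "whatevah", SHORT_LEN) returns 1, because the two names only differ from the seventh character on, while the same call with LONG_LEN returns 0.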

There are two solutions for this situation:

Change the function names so that each one is at most six characters long. This will work fine with Microsoft’s M80/L80, but will make our code a bit more difficult to read.
Add the -l command line option, and then use an assembler/linker package that’s compatible with the M80/L80 format but supports longer symbol names.

For now we will make sure that all our symbols are at most six characters long, but in another post we’ll start using the ASLD assembler package by Egor Voznessenski, which doesn’t have this limitation.

Very sad note: I learned while writing this post that Egor passed away last year. He was a very good developer and did lots of useful stuff, both software and hardware, for MSX and Sinclair Spectrum computers. He was only 43 years old.

-o [drive/directory/filename]

Tells CG the location where it has to save the resulting assembler code file. This option accepts a drive name, a directory, or a filename (either absolute or relative).

-rN

Tells CG to reserve N bytes of memory for its internal symbol table. Use this option only if CG runs out of memory. Most often we won’t have to bother with this.

-u

This option disables the progress status during code generation. Use it when you don’t want to see your screen fill with dots and colons.

In the next post…

The next post will bring us back to learning C programming from the beginning. If you’re new to C and these three posts made you fall asleep, then the next chapters will be for you.

This series of articles is supported by your donations. If you’re willing and able to donate, please visit the link below to register a small pledge. Every little amount helps.

Javi Lavandeira’s Patreon page

2 comments on “Relearning MSX #15: MSX-C commands: CG.COM”

Roger Sen on February 19, 2015 at 8:31 pm said:

Ohhh! The past is truly a foreign country (that we visited a lot).

I was never lucky to be a MSX user (loved the Z80 but from a less gifted platform: Spectrum), but the linker limitations were pretty common in older UNIX. Not a huge problem if you knew this in advance, but a mess if you wanted to migrate code between different vendors.

Anyway, keep publishing these posts, I’m having a lot of fun reading them!
