Commodore BASIC

Commodore BASIC
First appeared 1977 (1977)
Platform PET to the C128

Commodore BASIC, also known as PET BASIC, is the dialect of the BASIC programming language used in Commodore International's 8-bit home computer line, stretching from the PET of 1977 to the C128 of 1985. The core was based on 6502 Microsoft BASIC, and as such it shares many characteristics with other 6502 BASICs of the time, such as Applesoft BASIC. Commodore licensed BASIC from Microsoft on a "pay once, no royalties" basis after Jack Tramiel turned down Bill Gates' offer of a $3 per unit fee, stating, "I'm already married," and would pay no more than $25,000 for a perpetual license.[1]

History

Commodore took the source code of the flat-fee BASIC and further developed it internally for all their other 8-bit home computers. It was not until the Commodore 128 (with V7.0) that a Microsoft copyright notice was displayed. However, Microsoft had built an easter egg into the version 2 or "upgrade" Commodore Basic that proved its provenance: typing the (obscure) command WAIT 6502, 1 would result in Microsoft! appearing on the screen. (The easter egg was well concealed—the message did not show up in any disassembly of the interpreter.)[2]

The popular Commodore 64 came with BASIC v2.0 in ROM despite the computer being released after the PET/CBM series that had version 4.0 because the 64 was intended as a home computer, while the PET/CBM series were targeted at business and educational use where their built-in programming language was presumed to be more heavily used. This saved manufacturing costs, as the V2 fit into smaller ROMs.

Technical details

A convenient feature of Commodore's ROM-resident BASIC interpreter and KERNAL was the full-screen editor.[3][4] Although Commodore keyboards only featured two cursor keys which alternated direction when the shift key was held, the screen editor allowed users to enter direct commands or to input and edit program lines from anywhere on the screen. If a line was prefixed with a line number, it was tokenized and stored in program memory. Lines not beginning with a number were executed by pressing the RETURN key whenever the cursor happened to be on the line. This marked a significant upgrade in program entry interfaces compared to other common home computer BASICs at the time, which typically used line editors, invoked by a separate EDIT command, or a "copy cursor" that truncated the line at the cursor's position.

It also had the capability of saving named files to any device, including the cassette – a popular storage device in the days of the PET, and one that remained in use throughout the lifespan of the 8-bit Commodores as an inexpensive form of mass storage. Due to their use as an analog audio medium, cassettes were as ubiquitous during the era as CDs are today. Most systems only supported filenames on diskette, which made saving multiple files on other devices more difficult. The user of one of these other systems had to note the recorder's counter display at the location of the file, but this was inaccurate and prone to error. With the PET (and BASIC 2.0), files from cassettes could be requested by name. The device would search for the filename by reading data sequentially, ignoring any non-matching filenames. The file system was also supported by a powerful record structure that could be loaded or saved to files. Commodore cassette data was recorded digitally, rather than less expensive (and less reliable) analog methods used by other manufacturers. Therefore, the specialized Datasette was required rather than a standard tape recorder. Adapters were available that used an analog to digital converter to allow use of a standard recorder, but these cost only a little less than the Datassette.

Due to Commodore's use of the same BASIC on multiple hardware architectures, the LOAD command was provided with an additional parameter to indicate the memory address where the program should start. A command such as LOAD "*",8 would place the program data at the start of BASIC's area, while LOAD "*",8,1 would place the file at the address from which it was saved. The former was usually used for BASIC programs, since BASIC's location varied between different models. Some Commodore BASIC variants supplied BLOAD and BSAVE commands that worked like their counterparts in Applesoft BASIC, loading or saving bitmaps from specified memory locations.

Like the original Microsoft BASIC interpreter, on which it is based, Commodore BASIC is slower than native machine code. Test results have shown that copying 16 kilobytes from ROM to RAM takes less than a second in machine code, compared to over a minute in BASIC. To execute faster than the interpreter, programmers started using various techniques to speed up execution. One was to store often-used floating point values in variables rather than using literal values, as interpreting a variable name was faster than interpreting a literal number. Since floating point is default type for all commands, it's faster to use floating point numbers, rather than integers. When speed was important, some programmers converted sections of BASIC programs to 6502 or 6510 assembly language which were loaded separately from a file, or POKEed into memory from DATA statements at the end of the BASIC program, and executed from BASIC using the SYS command either from direct mode or from the program itself. When the execution speed of machine language was too great, such as for a game or when waiting for user input, programmers could poll by PEEKing selected memory locations (such as $A6 for the C-64, or $D0 for the C-128, denoting size of the keyboard queue) to delay or halt execution.

Commodore BASIC keywords could be abbreviated by entering first an unshifted keypress, and then a shifted keypress of the next letter. This set the high bit, causing the interpreter to stop reading and parse the statement according to a lookup table. This meant the statement up to where the high bit was set was accepted as a substitute for typing the entire command out. However, since all BASIC keywords were stored in memory as single byte tokens, this was a convenience for statement entry rather than an optimization.

In the default uppercase-only character set, shifted characters appear as a graphics symbol; e.g. the command, GOTO, could be abbreviated G{Shift-O} (which resembled GΓ onscreen). Most such commands were two letters long, but in some cases they were longer. In cases like this, there was an ambiguity, so more unshifted letters of the command were needed, such as GO{Shift-S} (GO♥) being required for GOSUB. Some commands had no abbreviated form, either due to brevity or ambiguity with other commands. For example, the command, INPUT had no abbreviation because its spelling collided with the separate INPUT# keyword, which was located nearer to the beginning of the keyword lookup table. The heavily used PRINT command had a single ? shortcut, as was common in most BASIC dialects. Abbreviating commands with shifted letters is unique to Commodore BASIC.

By abbreviating keywords, it was possible to fit more code on a single line (line lengths were usually limited to 2 or 4 screen lines, depending on the specific machine). This allowed for a slight saving on the overhead to store otherwise necessary extra program lines, but nothing more. All BASIC commands were tokenized and took up 1 byte (or two, in the case of several commands of BASIC 7 or BASIC 10) in memory no matter which way they were entered. And, such long lines could be difficult to edit. The LIST command displayed the entire command keyword - extending the program line beyond the 2 or 4 screen lines which could be entered into program memory.

Program lines in Commodore BASIC do not require spaces anywhere (but the LIST command will always display one between the line number and the statement), e.g., 100 IFA=5THENPRINT"YES":GOTO160, and it was common to write programs with no spacing. This feature was added to conserve memory since the tokenizer never removes any space inserted between keywords: the presence of spaces results in extra 0x20 bytes in the tokenized program which are merely skipped during execution.

The order of execution of Commodore BASIC lines was not determined by line numbering; instead, it followed the order in which the lines were linked in memory.[5] Program lines were stored in memory as a singly linked list with a line number, a pointer (containing the address of the beginning of the next program line), and then the tokenized code for the line. While a program was being entered, BASIC would constantly reorder program lines in memory so that the line numbers and pointers were all in ascending order. However, after a program was entered, manually altering the line numbers and pointers with the POKE commands could allow for out-of-order execution or even give each line the same line number. In the early days, when BASIC was used commercially, this was a software protection technique to discourage casual modification of the program.

Variable names were only significant to 2 characters; thus the variable names VARIABLE1, VARIABLE2 and VA all referred to the same variable.

The native number format of Commodore BASIC, like that of its parent MS BASIC, was floating point. Most contemporary BASIC implementations used one byte for the characteristic (exponent) and three bytes for the mantissa. The accuracy of a floating point number using a three-byte mantissa is only about 6.5 decimal digits, and round-off error is common. Commodore, however, used MS BASIC's four-byte mantissa, which made their BASIC much more useful for business.

Although Commodore BASIC supports signed integer variables (denoted with a percent sign) in the range -32768 to 32767, in practice they only work on array variables and serve the function of conserving memory by limiting array elements to two bytes each. Denoting a normal variable as integer simply causes BASIC to convert it back to floating point, slowing down program execution and wasting memory as each percent sign takes one additional byte to store. Also, it's not possible to POKE or PEEK memory locations above 32767 with address defined as a signed integer.

String variables were represented by postfixing the variable name with a dollar sign. Thus, the variables AA$, AA, and AA% would each be understood as distinct.

The Simons' BASIC start-up screen

Many BASIC extensions were released for the Commodore 64, due to the relatively limited capabilities of its native BASIC 2.0. One of the most popular extensions was the DOS Wedge, which was included on the Commodore 1541 Test/Demo Disk. This 1 KB extension to BASIC added a number of disk-related commands, including the ability to read a disk directory without destroying the program in memory. Its features were subsequently incorporated in various third-party extensions, such as the popular Epyx FastLoad cartridge. Other BASIC extensions added additional keywords to make it easier to code sprites, sound, and high-resolution graphics like Simons' BASIC.

Although BASIC 2.0's lack of sound or graphics features was frustrating to many users, some critics argued that it was ultimately beneficial since it forced the user to learn machine language.

The limitations of BASIC 2.0 on the C64 led to use of built-in ROM machine language from BASIC. To load a file to a designated memory location, the filename, drive, and device number would be read by a call: SYS57812"filename",8

The location would be specified in two locations: POKE780,0:POKE781,0:POKE782,192

And the load routine would be called: SYS65493

The disk magazine for the C64, Loadstar was a venue for hobbyist programmers, who shared collections of proto-commands for BASIC, called with the SYS address + offset command.

From a modern programming point of view, the earlier versions of Commodore BASIC presented a host of bad programming traps for the programmer. As most of these issues derived from Microsoft BASIC, virtually every home computer BASIC of the era suffered from similar deficiencies.[6] Every line of a Microsoft BASIC program was assigned a line number by the programmer. It was common practice to increment numbers by some value (5, 10 or 100) to make inserting lines during program editing or debugging easier, but bad planning meant that inserting large sections into a program often required restructuring the entire code. A common technique was to start a program at some low line number with an ON...GOSUB jump table, with the body of the program structured into sections starting at a designated line number like 1000, 2000, and so on. If a large section needed to be added, it could just be assigned the next available major line number and inserted to the jump table.

Later BASIC versions on Commodore and other platforms included a DELETE and RENUMBER command, as well as an AUTO line numbering command that would automatically select and insert line numbers according to a selected increment. In addition, all variables are treated as global variables. Clearly defined loops are hard to create, often causing the programmer to rely on the GOTO command (this was later rectified in BASIC 3.5 with the addition of the DO, LOOP, WHILE, UNTIL, and EXIT commands). Flag variables often needed to be created to perform certain tasks. Earlier BASICs from Commodore also lack debugging commands, meaning that bugs and unused variables are hard to trap.

Use as user interface

In common with other home computers, Commodore's models booted directly into the BASIC interpreter. BASIC's file and programming commands could be entered in direct mode to load and execute software. If program execution was halted using the RUN/STOP key, variable values would be preserved in RAM and could be PRINTed for debugging. This, along with the advanced screen editor included with Commodore BASIC gave the programming environment a REPL-like feel; programmers could insert and edit program lines at any screen location, interactively building the program.[7] This is in contrast to business-oriented operating systems of the time like CP/M or MS-DOS, which typically booted into a command line interface. If a programming language was required on these platforms, it had to be loaded separately.

While some versions of Commodore BASIC included disk-specific DLOAD and DSAVE commands, the version built into the popular Commodore 64 lacked these, requiring the user to specify the disk drive's device number (typically 8 or 9) to the standard LOAD command, which otherwise defaulted to tape. Another omission from the Commodore 64s BASIC 2.0 was a DIRECTORY command to load a disk's contents into screen memory without clearing main memory. On the 64, viewing a disk's contents was implemented as loading a "program" which when listed showed the directory. This had the effect of overwriting the currently loaded program. Addons like the DOS Wedge overcame this by rendering the directory listing direct to the screen.

Versions and features

A list of CBM BASIC versions in chronological order, with successively added features:

Released versions

Unreleased versions

Notable extension packages

References

Notes
BASIC 2.0
  • Angerhausen et al. (1983). The Anatomy of the Commodore 64 (for the full reference, see the C64 article).
BASIC 3.5
  • Gerrard, Peter; Bergin, Kevin (1985). The Complete COMMODORE 16 ROM Disassembly. Gerald Duckworth & Co. Ltd. ISBN 0-7156-2004-5.
BASIC 7.0
  • Jarvis, Dennis; Springer, Jim D. (1987). BASIC 7.0 Internals. Grand Rapids, Michigan: Abacus Software, Inc. ISBN 0-916439-71-2.
BASIC 10.0
  • Commodore 65 preliminary documentation (March 1991), with addendum for ROM version 910501. c65manual.txt
This article is issued from Wikipedia - version of the 11/27/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.