@Kroc
Last active September 11, 2018 20:50
A proposed image format specification for 8-bit Commodore computers.

Introduction

Version 1.0; 10-SEP-2018.

With the introduction of new tools for developing Commodore 64 art on modern systems, there is a need for a universal, standard file format for storing and transferring Commodore images between programs, including those running on original hardware.

This specification proposes just such a format that is simple to understand, simple to parse -- even on original hardware -- and simple to implement.

What Exactly is Commodore "art"?

Unlike modern image formats, where the image is composed of pixels, a Commodore image can consist of multiple separate layers in different formats, including text characters (PETSCII), separate colour data, and global colours such as the foreground / background colour.

Therefore a format is needed that can store the individual parts of an image, but also leave irrelevant parts out.

The Specification

The file extension is ".pet"

Sectors

The PET file format consists of blocks of data split into 254-byte "sectors". This is done to make reading on original hardware very simple, where a disk sector holds exactly 254 bytes of data (256 bytes less the two-byte link to the next sector). A parser can read and process the file a sector at a time and not have to seek backward or forward.

The last sector in a PET file can be truncated; that is, it need not be padded out to a full sector. All sectors other than the last must, of course, be a full 254 bytes. Unused space within a sector should be zero-filled.
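The sector rules above can be sketched in a few lines of Python. This is only an illustration for tools on modern systems, not part of the specification; the function name `split_sectors` is hypothetical, and zero-padding a truncated last sector in memory is an assumption made for the caller's convenience.

```python
SECTOR_SIZE = 254  # bytes per sector, as stated in the specification


def split_sectors(data: bytes) -> list[bytes]:
    """Split raw file bytes into 254-byte sectors.

    Only the last sector may be shorter than 254 bytes on disk; it is
    zero-padded here (an assumption) so every returned sector is full-size.
    """
    sectors = []
    for start in range(0, len(data), SECTOR_SIZE):
        sector = data[start:start + SECTOR_SIZE]
        sectors.append(sector.ljust(SECTOR_SIZE, b"\x00"))
    return sectors
```

A reader built this way never needs to seek: it consumes the sectors in order, one at a time.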

The Meta-Data Block

The first block of data is the meta-data block and consists of 1 sector (254 bytes). It stores various image meta-data and properties.

The first four bytes of the meta-data block (and therefore, the file itself) form the "magic number" used to identify a PET file. The four bytes are the characters "PET" and a version number character all in PETSCII; that is, the first four bytes of a file (in hexadecimal) will be:

$50 ("P"), $45 ("E"), $54 ("T"), $31 ("1")

This version number of "1" will only be changed should the file format change to an encoding that would be incompatible with the "version 1" specification.

If your program encounters a version number that is not "1" (in PETSCII), it should not parse the file any further!
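As a minimal sketch of the check described above, assuming the first bytes of the file are already in memory (the function name `read_version` and the error messages are hypothetical):

```python
def read_version(header: bytes) -> str:
    """Validate the magic number and return the version character.

    Raises ValueError if the bytes are not "PET" followed by a supported
    version, in which case the caller should not parse any further.
    """
    if len(header) < 4 or header[:3] != b"PET":  # $50, $45, $54
        raise ValueError("not a PET file")
    version = chr(header[3])
    if version != "1":  # $31 is the only version defined so far
        raise ValueError("unsupported PET version")
    return version
```

The letters "P", "E", "T" and the digit "1" have the same byte values in PETSCII as in ASCII, which is why plain byte literals work here.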

Immediately following the "PET1" magic number is the meta-data table. This table consists of a set of names and values; each entry in the table is 8 bytes long:

  • The first four bytes are a name; see the heading for each name below

  • The next four bytes depend upon the name; see below

The table ends when you come across a name consisting of four nulls (0, 0, 0, 0). Like any other entry, the terminator is followed by four further bytes; these are reserved for future use and have no defined value. Your parser should skip over them when reading, and write four zeroes when writing a file. I.e., you should not preserve these bytes if your parser does not understand them.

If the meta-data table contains no entries other than the terminator (0, 0, 0, 0), the file is still considered "valid", but you should stop parsing and inform the user that the file "contains no image data".
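Reading the table could look something like the sketch below. It assumes the whole 254-byte meta-data sector is in memory; the function name `parse_meta_table` is hypothetical, and the four value bytes are returned uninterpreted because their meaning depends on the name. Stopping quietly at the end of the sector when no terminator is found is also an assumption, as the specification does not cover that case.

```python
ENTRY_SIZE = 8                    # 4 name bytes + 4 value bytes
TERMINATOR = b"\x00\x00\x00\x00"  # a name of four nulls ends the table


def parse_meta_table(sector: bytes) -> list[tuple[str, bytes]]:
    """Return (name, raw value bytes) pairs from the meta-data sector."""
    entries = []
    pos = 4  # skip the "PET1" magic number
    while pos + ENTRY_SIZE <= len(sector):
        name = sector[pos:pos + 4]
        value = sector[pos + 4:pos + ENTRY_SIZE]
        if name == TERMINATOR:
            break  # the terminator's four reserved bytes are skipped
        # The defined names use only letters that match ASCII byte values.
        entries.append((name.decode("ascii", errors="replace"), value))
        pos += ENTRY_SIZE
    return entries
```

An empty list from this function corresponds to the "contains no image data" case described above.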

The meta-data names allowed are as follows:

Author

Allows embedding an author's name.

The four name bytes used are "AUTH" ($41, $55, $54, $48). The next byte gives an offset, in bytes from the beginning of the sector, to some PETSCII text stored anywhere within the sector.

If multiple "AUTH" entries exist, consider this to mean more than one author.

Title

"TITL" ($54, $49, $54, $4C): Title; PETSCII text. A title for the image

Date

"DATE" ($44, $41, $54, $45): Date-time for the image; PETSCII numerals:

...

Description

"DESC" ($44, $45, $53, $43): Description; PETSCII text. A long form description of the image

Editor

"EDIT" ($45, $44, $49, $54): Editor; PETSCII text. The editor used to produce the image, e.g. "PETMATE"

The "SRAM" Chunk -- Screen RAM

...

The "CRAM" Chunk -- Colour RAM

...

@gnacu

gnacu commented Sep 11, 2018

First, I also wrote a primitive viewer for this format. Granted, it's calling C64 OS routines, but that's the point. The routines decrease the amount of code necessary to get something done. To write a similar viewer for the bare KERNAL ROM you'd have to actually write out the 16-bit loops for reading in the two blocks of 1000 bytes.

https://gist.github.com/gnacu/c5ad52836290c925a93a707a77c7662e

Ignoring the meta data, because, that's a valid thing to do, this viewer program is just 89 bytes, including validation of the magic and version number.

Next, to answer your question: "the format you describe leaves absolutely no room for expansion at all"

PETSCII images have been structured exactly the same way for almost 40 years. The machine is small. The world is simple. That's half the fun. And, if you're looking for future expandability, that's the point of the version number. If at some future date a significant interest in a few additional fields (or the ability to specify different screen resolutions, etc) comes about, then release a version 2 of the spec.

Look at this page: http://codebase64.org/doku.php?id=base:c64_grafix_files_specs_list_v0.03

It lists ~41 (I may have miscounted) C64 bitmapped image formats. (PETSCII art is not among them.) They are all as simple, perhaps simpler, than the format I propose. A PETSCII image file format should look at home on that page, alongside those other formats.

I was able to write both a creator and a viewer, in a matter of an hour, for my proposed format. You do the same, and then if it's easy and simple to implement with a reasonably small code footprint, then at least you have an argument that it's a good and suitable format for the platform.

Oh, before I forget. Thinking about sectors, and how they're 254 byte chunks, is not useful in my opinion. The KERNAL has no special support for loading in 254 byte chunks, nor for skipping over unnecessary sectors. If you take a 16 byte string field, and align it but ultimately let it sit inside its own entire 254 byte sector, you'll waste a huge amount of space on disk, and you'll force the user to load in gobs of empty space from the disk, over a very slow bus. You can only profit from sector layout tricks (like GEOS does with its VLIR format) if you marry yourself to the 1541 and write your code to send special commands to its DOS. It's 2018, SD2IEC is very popular. So, that's a bad idea.

@Kroc
Author

Kroc commented Sep 11, 2018

In the case of C64 OS, how do you handle taking screenshots with custom characters that will vary from one app / utility to another? For true portability to other systems, including the web, you'd also want a way to include the custom character definitions.

@gnacu

gnacu commented Sep 11, 2018

By the way. I'm not trying to be a jerk. But if one proposes a format and hasn't tried to implement a parser/creator for it, in 6502/10 assembly, then they don't really know how tricky that format will be to deal with. Whenever I write a format (such as the human readable/editable menu file format for an application's menus in C64 OS, or the desktop application link files, or a fileref serialization, etc), I write the format and the code needed to deal with it at the same time. The writing of the code almost always exposes a weakness in the data format, and the two negotiate with each other until some happy medium is reached: Small format, easy to read, easy to write, easy to allocate memory for, doesn't require much code to deal with, meets the essential needs of the solution, and sometimes is human readable/editable.

@gnacu

gnacu commented Sep 11, 2018

That's a very good point. And that's exactly the sort of point that I'm glad is made and the reason for having discussions with others at all.

C64 OS supports loadable character sets. But the character sets are separate files. PETSCII "art" is usually made with the default character rom. But even then, there should be at least one byte (we discussed this with nupax) for specifying upper/lower or upper/graphics character sets. For C64 OS screenshots, they would look broken if the default character set were used. I'm not sure of the best way to handle that. I'm open for discussion.

One way would be to pack 2K of bitmap data at the END of the file, as a custom character set. Plus add one byte in the header to specify if it should be upper/lower, upper/graphics, or custom.

An alternative would be to ship the character set separately from the data file, and put the character set file name in the header.

Another alternative would be to publish character set byte values for popular character sets. One of these could be reserved for C64 OS's charset and two for the default character rom sets, leaving the other 253 values to be defined by the community for other popular character sets.

@Kroc
Author

Kroc commented Sep 11, 2018

The data block at the end for custom characters could be 1 byte to specify which char is being defined, then the 8 bytes for the graphic. This way you could include only the characters that are actually redefined. This would also bind the definitions to the screen codes used in the screen data, so that the screenshot would be preserved accurately in the future and on other systems too.

@gnacu

gnacu commented Sep 11, 2018

Okay, I have two more thoughts on my own last comment.

  1. I didn't take into account what happens when an app customizes some small available portion of the characterset, for example, to draw an icon, or a logo. When a different app is loaded, that app may change just those 9 or 12 characters for its own little graphical flourishes.

I don't know how to handle that. But, it's a good time to think about it.

  2. To clarify what I meant by the published list, I mean, in a common place, like codebase64.org, or c64-wiki.org, the community could allocate single byte values to specify whole character sets. i.e.

0 = default upper/lower
1 = default upper/graphics
2 = Contiki
3 = C64 OS
4 = LUnix
5 = GeckOS
... etc.

This would not support truly custom character sets, it would just allow the format to support a wide variety (up to 256) of common pre-existing character sets for different platforms. It does not however address the issue of point 1.

@gnacu

gnacu commented Sep 11, 2018

"The data block at the end for custom characters could be 1 byte to specify which char is being defined, then the 8 bytes for the graphic. This way you could include only the characters that are actually redefined. This would also bind the definitions to the screen codes used in the screen data, so that the screenshot would be preserved accurately in the future and on other systems too."

Actually, I like that a lot. But, perhaps one byte in the header to specify the rom character set that's being modified. A stand alone viewer could then copy the correct rom charset into ram, and modify it with the data at the end of the file.

C64 OS would just need to encode the characters it knows are custom AND which are in use in the screen data for that particular capture.
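For anyone wanting to try the idea on a modern system, here is a rough sketch of the patching step discussed above. Nothing here is settled: the 9-byte record layout (1 byte screen code, 8 bytes bitmap) is just the proposal in this thread, the name `apply_custom_chars` is made up, and the base charset is assumed to be a 2K (256 x 8 byte) copy of whichever rom set the header byte would select.

```python
RECORD_SIZE = 9        # 1 byte screen code + 8 bytes of glyph bitmap
CHARSET_SIZE = 256 * 8  # a full 2K character set


def apply_custom_chars(base_charset: bytes, records: bytes) -> bytearray:
    """Copy the base charset and overwrite only the redefined glyphs."""
    if len(base_charset) != CHARSET_SIZE:
        raise ValueError("base charset must be 2048 bytes")
    if len(records) % RECORD_SIZE != 0:
        raise ValueError("custom character data must be whole 9-byte records")
    charset = bytearray(base_charset)  # work on a copy, like copying rom to ram
    for pos in range(0, len(records), RECORD_SIZE):
        code = records[pos]
        glyph = records[pos + 1:pos + RECORD_SIZE]
        charset[code * 8:code * 8 + 8] = glyph
    return charset
```

Characters that are not listed keep their rom definitions, so a capture only has to carry the glyphs it actually changes.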

@Kroc
Author

Kroc commented Sep 11, 2018

"... one byte in the header to specify the rom character set that's being modified"

I forgot to mention that, but yes, I really like the use of the PETSCII code for upper/lower case to mark that.
