Version 1.0; 10-SEP-2018.
With the introduction of new tools for developing Commodore 64 art on modern systems, a need has arisen for a universal standard image format for storing and transferring Commodore images between programs, including on original hardware.
This specification proposes just such a format that is simple to understand, simple to parse -- even on original hardware -- and simple to implement.
Unlike modern image formats where the image is composed of pixels, a Commodore image can consist of multiple separate layers of different format, including text-characters (PETSCII), separate colour data and global colour such as the foreground / background colour.
Therefore a format is needed that can store the individual parts of an image, but also leave irrelevant parts out.
The file extension is ".pet".
The PET file format consists of blocks of data split into 254-byte "sectors". This is done to make reading on original hardware very simple, where a disk sector holds exactly 254 bytes of data (256 bytes, less the two-byte link to the next sector). A parser can read and process the file a sector at a time and never has to seek backward or forward.
The last sector in a PET file may be truncated; that is, it need not be padded out to a full sector. All sectors other than the last must, of course, be a full 254 bytes. Unused space within a sector should be zero-filled.
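As a rough illustration for tools on the modern side, sector-at-a-time reading could be sketched in Python as below. The names `SECTOR_SIZE` and `iter_sectors` are hypothetical and not part of this specification; the sketch only assumes what is stated above, namely that every sector is 254 bytes except possibly the last.

```python
import io

SECTOR_SIZE = 254  # data bytes per sector, as stated in the specification


def iter_sectors(stream):
    """Yield sectors in file order, reading forward only (no seeking)."""
    while True:
        sector = stream.read(SECTOR_SIZE)
        if not sector:
            break  # end of file reached exactly on a sector boundary
        yield sector
        if len(sector) < SECTOR_SIZE:
            break  # a short sector can only be the last one


# Example: two full sectors followed by a truncated 10-byte final sector.
data = bytes(SECTOR_SIZE * 2 + 10)
sizes = [len(s) for s in iter_sectors(io.BytesIO(data))]
```

Because the generator only ever reads forward, the same loop shape maps onto a drive that delivers one sector at a time.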
The first block of data is the meta-data block and consists of 1 sector (254 bytes). It stores various image meta-data and properties.
The first four bytes of the meta-data block (and therefore, the file itself) form the "magic number" used to identify a PET file. The four bytes are the characters "PET" and a version number character all in PETSCII; that is, the first four bytes of a file (in hexadecimal) will be:
$50 ("P"), $45 ("E"), $54 ("T"), $31 ("1")
This version number of "1" will only be changed should the file format change to an encoding that would be incompatible with the "version 1" specification.
If your program encounters a version number that is not "1" (in PETSCII), it should not parse the file any further!
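A minimal sketch of this check in Python follows. The function name `check_magic` is an assumption made for illustration; only the four byte values come from the specification.

```python
MAGIC = bytes([0x50, 0x45, 0x54])  # "PET" in PETSCII
SUPPORTED_VERSION = 0x31           # "1" in PETSCII


def check_magic(header: bytes) -> bool:
    """Return True only for a 'PET' magic followed by version '1'."""
    if len(header) < 4:
        return False  # too short to hold the magic number at all
    return header[:3] == MAGIC and header[3] == SUPPORTED_VERSION


ok = check_magic(bytes([0x50, 0x45, 0x54, 0x31]))
newer = check_magic(bytes([0x50, 0x45, 0x54, 0x32]))  # version "2"
```

A caller would stop parsing as soon as the check returns False, as the text above requires for unknown versions.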
Immediately following the "PET1" magic number is the meta-data table. This table consists of a set of names and values; each entry in the meta-data table consists of 8 bytes.
The table ends when you come across a name consisting of four nulls (0, 0, 0, 0). A further four bytes exist (reserved for future use), like any other entry, but these have no defined value. Your parser should skip over these when reading, and write four zeroes when writing a file. I.e., you should not preserve these bytes if your parser does not understand them.
- The first four bytes are a name; see the heading for each name below
- The next four bytes depend upon the name; see below
If the meta-data table contains no entries other than the terminator (0, 0, 0, 0), the file is still considered "valid", but you should stop parsing and inform the user that the file "contains no image data".
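The table walk described above could be sketched in Python as follows. The function name `parse_meta_table` and the sample offsets `$40` and `$50` are hypothetical. The specification does not say what to do if the sector ends before a terminator is found; this sketch simply stops there, which is an assumption.

```python
TERMINATOR = bytes(4)  # a name of four nulls ends the table
ENTRY_SIZE = 8         # 4 name bytes + 4 value bytes
TABLE_START = 4        # the table follows the 4-byte magic number


def parse_meta_table(sector: bytes):
    """Return the (name, value) pairs found before the terminator."""
    entries = []
    offset = TABLE_START
    while offset + ENTRY_SIZE <= len(sector):
        name = sector[offset:offset + 4]
        value = sector[offset + 4:offset + 8]
        if name == TERMINATOR:
            break  # the terminator's 4 reserved bytes are skipped
        entries.append((name, value))
        offset += ENTRY_SIZE
    return entries


# Example sector: two entries, the terminator, then zero fill to 254 bytes.
sector = (
    b"PET1"
    + b"AUTH" + bytes([0x40, 0, 0, 0])
    + b"TITL" + bytes([0x50, 0, 0, 0])
    + bytes(8)
).ljust(254, b"\x00")
entries = parse_meta_table(sector)
empty = parse_meta_table(b"PET1".ljust(254, b"\x00"))
```

An empty result corresponds to the "contains no image data" case above.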
The meta-data names allowed are as follows:
Allows embedding an author's name.
The 4 name bytes used are "AUTH" ($41, $55, $54, $48). The next byte gives an offset, in bytes from the beginning of the sector, to some PETSCII text stored anywhere within the sector.
If multiple such meta-data entries exist, consider this to mean more than one author.
"TITL" ($54, $49, $54, $4C): Title; PETSCII text. A title for the image.
"DATE" ($44, $41, $54, $45): Date-time for the image; PETSCII numerals:
...
"DESC" ($44, $45, $53, $43): Description; PETSCII text. A long-form description of the image.
"EDIT" ($45, $44, $49, $54): Editor; PETSCII text. The editor used to produce the image, e.g. "PETMATE".
...
...
Okay, this is following up on what I said on Twitter.
It's way too complicated. I don't know how long it's been since you wrote a serious amount of 6502 code, but 64K of RAM is frighteningly little. The C64 OS KERNAL is already 10K, I've missed my target twice, I have barely even started the Toolkit, and I have no networking code at all.
I want screen grabs to be built in, because they're so handy. But I'm literally counting the bytes and hand-tuning the loops, trying to strip away everything that isn't absolutely essential. I have no room whatsoever for a parser that has to interpret the header and read in variable header lengths, optional fields, etc.
Since the goal is for this to be usable by the C64 itself, you have to think much simpler. It is so easy to be carried away by the excessive RAM and CPU power of a modern computer. I understand that it's hard to stay constrained to the limits of the C64.
In my opinion, the fields have to be fixed length. And there has to be a fixed number of them. I'm open to discussing what those fields are and how big they are. But they cannot be variable, require searching for patterns, or leave the parser not knowing ahead of time how much metadata it's going to encounter. On a PC/Mac, even in a scripting language like JavaScript or PHP, this stuff is easy peasy. But on a C64, there simply isn't the space.
Let me rewrite my suggestions from the comments on PETMATE, given the ideas you have above. Pardon the formatting, it's just easier for me to think in terms of code.
.byte $50, $45, $54, $31 ; PET1 (magic and version number)
.buf 17                  ; 16 bytes for a title, null padded, with trailing null
.buf 17                  ; 16 bytes for an author, null padded, with trailing null
.buf 17                  ; 16 bytes for release info; first 4 should be a year; null padded, with trailing null
.buf 1000                ; 1000 bytes of screen codes
.buf 1000                ; 1000 bytes of color memory
.buf 1                   ; background color
.buf 1                   ; border color
"Parsing" thus, consists of a preallocated block of memory 55 bytes big ((17*3)+4). Like a C struct. Loading that from disk into memory is a single loop that reads in 55 bytes. The title, author and release info strings end up in memory pre-null-terminated, like C-strings, ready to be drawn to screen with a pre-existing routine that knows to interpret null as the end of the string.
After that, it needs a loop to read 1000 bytes and write them directly to wherever you want screen memory to be. Then a loop to read 1000 bytes and write them either to color memory or a color memory buffer.
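For anyone on the modern-tool side, here is a rough Python sketch of the same fixed layout, assuming exactly the field order and sizes listed above. The names `HEADER`, `c_string`, `read_exact` and `parse_fixed` are made up for illustration, and the strings are left as raw PETSCII bytes rather than decoded.

```python
import io
import struct

# Magic and version, then three 17-byte null-terminated strings.
HEADER = struct.Struct("<4s17s17s17s")  # 4 + (17 * 3) = 55 bytes
SCREEN_SIZE = 1000                      # one byte per cell, 1000 cells


def c_string(field: bytes) -> bytes:
    """Return the bytes before the first null, like a C string."""
    return field.split(b"\x00", 1)[0]


def read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes or fail if the file is too short."""
    data = stream.read(count)
    if len(data) != count:
        raise ValueError("file is truncated")
    return data


def parse_fixed(stream) -> dict:
    """Read the 55-byte header, both 1000-byte blocks and the two colors."""
    magic, title, author, release = HEADER.unpack(
        read_exact(stream, HEADER.size))
    if magic != b"PET1":
        raise ValueError("not a PET1 file")
    screen = read_exact(stream, SCREEN_SIZE)
    color = read_exact(stream, SCREEN_SIZE)
    background, border = read_exact(stream, 2)
    return {
        "title": c_string(title), "author": c_string(author),
        "release": c_string(release), "screen": screen, "color": color,
        "background": background, "border": border,
    }


# Example file built in memory; pack() null-pads the short strings.
sample = (HEADER.pack(b"PET1", b"DEMO", b"SOMEONE", b"2018")
          + bytes([32]) * 1000 + bytes([1]) * 1000 + bytes([6, 14]))
image = parse_fixed(io.BytesIO(sample))
```

Every read has a size known in advance, which is the property that keeps the 6502 version down to a few fixed-count loops.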
If the program doesn't care about the metadata, it can reserve just 4 bytes for the magic and version number. Read in the first 4 bytes to populate that. Confirm that the magic and version are correct. If they are, you could then read and throw away exactly 51 bytes, in a loop with 51 iterations. And then know exactly where the screen codes will begin.
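A minimal Python sketch of that metadata-skipping reader, again assuming the fixed layout proposed above; the name `read_screen_only` is hypothetical.

```python
import io

METADATA_SIZE = 51  # three 17-byte strings: title, author, release info
BLOCK_SIZE = 1000   # screen codes, then color memory


def read_screen_only(stream):
    """Check the magic, discard the metadata, return (screen, color)."""
    if stream.read(4) != b"PET1":
        raise ValueError("not a PET1 file")
    stream.read(METADATA_SIZE)  # read and throw away exactly 51 bytes
    screen = stream.read(BLOCK_SIZE)
    color = stream.read(BLOCK_SIZE)
    if len(screen) != BLOCK_SIZE or len(color) != BLOCK_SIZE:
        raise ValueError("file is truncated")
    return screen, color


# Example: the metadata content is irrelevant, so it is all zeroes here.
sample = (b"PET1" + bytes(METADATA_SIZE)
          + bytes([32]) * BLOCK_SIZE + bytes([1]) * BLOCK_SIZE
          + bytes([6, 14]))
screen, color = read_screen_only(io.BytesIO(sample))
```

The screen codes always begin at byte 55 of the file, so nothing in the metadata needs to be interpreted to find them.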
A parser for the above can be written in just a tiny handful of bytes, total. I'd have to count them, but I can imagine. Maybe I'll go write a quick example of the code that can use this to show you how small it can be.