@akingdom
Last active May 31, 2026 14:28
A very simple human-friendly parsable data format. I use this in a number of projects.

Line Record Format (LRF) Specification

1. Introduction

The Line Record Format (LRF) is a human-friendly, line-oriented data format designed for simplicity of both human editing and machine parsing. Every LRF document is also a valid Markdown document, making it suitable for embedding structured data in plain-text documentation.

2. Format Objectives

  1. Human editable – Accessible to users with minimal technical skills, using any text editor.
  2. Machine readable – Deterministic, unambiguous grammar that can be parsed with simple, line-by-line logic.
  3. Readability over size – Whitespace and clarity are preferred; minimalism in storage is not a goal.
  4. Markdown subset – Any LRF document is also a valid Markdown document. This allows structured data to live inside README files, notes, or any Markdown rendering environment without breaking the document flow.

3. Basic Format

Each non-blank line contains a field name, optionally followed by one or more whitespace characters and a field value:

field-name       field value
  • Record separator – A new record begins with the keyword RECORD or the hash character #, optionally followed by an identifier.
  • List items – Unordered lists use - or * as field names; numbered lists use a numeric field name (with or without a trailing period).
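
As a rough illustration of the basic line shape, the Python sketch below splits one line into its field name and value. The helper name `split_field` is hypothetical and not part of the specification. It also relies on Python's own notion of whitespace in `str.split`, which is slightly broader than the definition in section 5.1.

```python
def split_field(line: str) -> tuple:
    """Split one LRF line into (field name, field value)."""
    parts = line.split(None, 1)  # split on the first run of whitespace
    if not parts:
        return ("", "")          # blank line: nothing to extract
    name = parts[0]
    # The remainder keeps its trailing whitespace, so strip it explicitly.
    value = parts[1].rstrip() if len(parts) > 1 else ""
    return (name, value)


print(split_field("customer-name       Fred Smith"))
print(split_field("# Fruit Example"))
print(split_field("  - Peaches    "))
```

Leading whitespace is dropped by `split` itself; only the trailing whitespace of the value needs separate handling.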

4. Complete Example

Input (LRF text):

RECORD            Customer Example
customer-name       Fred Smith
customer-email      fsmith@example.com
customer-phone      +1 555 123 4567

RECORD Fruit Example
  1 Grapes
  2 Oranges
  - Peaches    
 * Mandarines
* Strawberries
* Raspberries

Parsed output (shown as # for records, matching the algorithm’s normalisation):

# : Customer Example
customer-name : Fred Smith
customer-email : fsmith@example.com
customer-phone : +1 555 123 4567
# : Fruit Example
1 : Grapes
2 : Oranges
- : Peaches
* : Mandarines
* : Strawberries
* : Raspberries

5. Detailed Specification

5.1 Whitespace

Whitespace characters are defined as:

  • Horizontal tab (\t, U+0009)
  • Any character in Unicode General Category Zs (Space Separator), which includes the common space (U+0020), non-breaking space (U+00A0), and others.

All leading and trailing whitespace on a line is ignored (trimmed). The first internal whitespace character after the field name acts as the separator; any immediately following whitespace is consumed and does not become part of the value.
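
The whitespace definition above can be checked against Unicode data directly. The following Python sketch assumes the standard `unicodedata` module is an acceptable source for the General Category; the function names `is_lrf_whitespace` and `trim_lrf` are hypothetical.

```python
import unicodedata


def is_lrf_whitespace(ch: str) -> bool:
    """True for a horizontal tab or any Unicode Zs (Space Separator) character."""
    return ch == "\t" or unicodedata.category(ch) == "Zs"


def trim_lrf(line: str) -> str:
    """Remove leading and trailing LRF whitespace from a line."""
    start, end = 0, len(line)
    while start < end and is_lrf_whitespace(line[start]):
        start += 1
    while end > start and is_lrf_whitespace(line[end - 1]):
        end -= 1
    return line[start:end]


print(is_lrf_whitespace("\u00a0"))  # non-breaking space
print(repr(trim_lrf("\t customer-name  Fred Smith \u00a0")))
```

Built-in trimming functions in most languages use a wider whitespace set (for example, they also strip form feeds), so a strict parser may prefer an explicit test like this one.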

5.2 Line Terminators

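Lines are terminated by a line feed (\n, U+000A). Carriage return characters (\r, U+000D) immediately preceding the line feed are discarded; the algorithm in section 6 takes the simpler route of discarding every carriage return before splitting into lines.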
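
A minimal Python sketch of this line-splitting rule follows. The function name `split_lines` is hypothetical. One simplification to note: it also strips a carriage return at the very end of the input, where no line feed follows.

```python
def split_lines(text: str) -> list:
    """Split on line feeds, dropping carriage returns that end each line."""
    return [line.rstrip("\r") for line in text.split("\n")]


print(split_lines("RECORD A\r\nname Fred\nemail f@example.com"))
```

This lets files with Windows (CRLF) and Unix (LF) line endings produce the same lines.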

5.3 Record Structure

[record marker line]          ← begins a record
[field data line]*
[blank line (ignored)]*
[record marker line] ...

A record marker line is a line whose field name is RECORD (case‑sensitive, uppercase) or #. The field value on that line serves as the record identifier or type. # and RECORD are completely equivalent; a parser may normalise RECORD to # internally.

A field data line consists of:

  1. Field name (key) – any sequence of characters that does not contain whitespace. If a space must be represented in the name, use one of the encodings: %20, +, -, or _.
  2. Separator – one or more whitespace characters (the exact amount is arbitrary and need not be consistent across lines).
  3. Field value – the remainder of the line after the separator, with trailing whitespace removed.

Example field data lines:

customer-name       Fred Smith
field-example-2                  Many, many spaces in-between.
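
The specification's parsed output is a flat list in which a "#" entry marks the start of each record. If a nested structure is more convenient, the pairs can be grouped afterwards. The Python sketch below is one possible way to do that; the function name `group_records` and the `id` and `fields` keys are assumptions made for illustration, not part of the format.

```python
def group_records(pairs: list) -> list:
    """Group a flat list of single-key dicts into records.

    Each record is a dict with an "id" (the marker's value) and a list of
    (name, value) field tuples in their original order.
    """
    records = []
    current = None
    for pair in pairs:
        (name, value), = pair.items()  # each dict holds exactly one entry
        if name == "#":
            current = {"id": value, "fields": []}
            records.append(current)
        else:
            if current is None:
                # Assumption: fields before any marker go into an unnamed record.
                current = {"id": "", "fields": []}
                records.append(current)
            current["fields"].append((name, value))
    return records


pairs = [
    {"#": "Customer Example"},
    {"customer-name": "Fred Smith"},
    {"#": "Fruit Example"},
    {"1": "Grapes"},
    {"-": "Peaches"},
]
for record in group_records(pairs):
    print(record["id"], record["fields"])
```

Fields are kept as a list of tuples rather than a dictionary because list markers such as - and * can repeat within one record.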

5.4 Field Types

  • Named fields – Any field name not matching a special keyword. Parsers should match these against a whitelist of known field names for the current record type when validation is desired.
  • Unordered lists – Field names - or *.
  • Ordered lists – Field names consisting entirely of decimal digits, optionally followed by a period (e.g., 1, 2., 42).
  • Title (optional) – Field name TITLE (treated as a named field by default; parsers may assign it special meaning).
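
The field types above could be distinguished with a small classifier. The Python sketch below is only an illustration: the function name `classify_field` and the returned labels are hypothetical, and it restricts ordered-list names to ASCII digits.

```python
import re


def classify_field(name: str) -> str:
    """Return a label describing which kind of field a name represents."""
    if name in ("RECORD", "#"):
        return "record"
    if name in ("-", "*"):
        return "unordered"
    if re.fullmatch(r"[0-9]+\.?", name):  # digits, optional trailing period
        return "ordered"
    if name == "TITLE":
        return "title"
    return "named"


for name in ("RECORD", "*", "2.", "42", "TITLE", "customer-name"):
    print(name, "->", classify_field(name))
```

Matching is case-sensitive, so a lowercase `record` is classified as an ordinary named field.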

5.5 Blank Lines

Lines that are empty after trimming are ignored. They may appear anywhere between records or fields without affecting the parsed structure.

6. Parsing Algorithm

The following algorithm processes a raw LRF string into an ordered list of key‑value pairs (represented as single‑key dictionaries or similar structures):

  1. Escape single quotes (replace each ' with \') and discard any \r characters.
  2. Split the text by \n into individual lines.
  3. For each line:
    1. Trim leading and trailing whitespace.
    2. If the line is now empty, skip it.
    3. Locate the first whitespace character.
      • If none exists: field name = entire trimmed line; field value = empty string.
      • If found: field name = text before that whitespace (trimmed); field value = text after that whitespace (trimmed), consuming any immediately following whitespace characters.
    4. Categorise by field name:
      • If field name is in a provided list of recognised fields → output {name: value}.
      • If field name is RECORD or # → output {"#": value}.
      • If field name is TITLE, -, *, or a numeric string (decimal digits, optionally followed by a period, as in section 5.4) → output {name: value}.
      • Otherwise, skip the line (unrecognised).

Reference implementations in JavaScript, Objective‑C, Python, and Swift are included with this specification.

7. Relation to Markdown

LRF is intentionally designed to be a subset of Markdown. An LRF document is always valid Markdown text. This is achieved as follows:

  • The field name–value separator (whitespace) causes Markdown renderers to treat each line as paragraph text (consecutive lines with no blank line between them are joined into one paragraph) or, when the field name is a list marker, as a list item.
  • A record marker written as # renders as a heading; one written as RECORD renders as plain paragraph text.
  • Unordered lists (-, *) render as Markdown bullet lists.
  • Numbered lists render as ordered lists when the number is followed by a period (1. Grapes); a bare number such as 1 Grapes renders as plain paragraph text.

Because LRF uses only a small subset of Markdown, it can be embedded inside existing Markdown documents without conflict. This property also means that a reader seeing an LRF snippet in a web‑rendered README will perceive it as well‑formatted plain text, while a machine parser can extract the exact structured data.
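
To illustrate how embedded data and ordinary prose can coexist, the Python sketch below runs a simplified version of the section 6 algorithm over a small Markdown document. It is a sketch under assumptions: the function name `extract_lrf` is hypothetical, it returns (name, value) tuples rather than single-key dictionaries, it omits the single-quote escaping step, and it relies on Python's built-in whitespace handling.

```python
import re


def extract_lrf(markdown_text: str, known_fields: set) -> list:
    """Collect recognised LRF lines from a Markdown document, skipping prose."""
    pairs = []
    for raw in markdown_text.replace("\r", "").split("\n"):
        parts = raw.split(None, 1)
        if not parts:
            continue  # blank line
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if name in known_fields:
            pairs.append((name, value))
        elif name in ("RECORD", "#"):
            pairs.append(("#", value))
        elif name in ("TITLE", "-", "*") or re.fullmatch(r"[0-9]+\.?", name):
            pairs.append((name, value))
        # any other line is treated as ordinary prose and skipped
    return pairs


doc = """# Order 1001

This paragraph is ordinary prose and is skipped by the parser.

customer-name   Fred Smith
customer-email  fsmith@example.com

- Grapes
- Oranges
"""
for name, value in extract_lrf(doc, {"customer-name", "customer-email"}):
    print(f"{name} : {value}")
```

Prose is skipped only because its first word is not a recognised field name. A prose line that happens to start with #, -, *, a number, or a known field name would be picked up as data, so documents mixing the two need some care.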

Note: The text/line-record media type and file extensions .txt or .rl are suggested when LRF is used independently. When embedded in Markdown, the standard .md extension suffices.

8. Metadata

The format itself is described by the following fields, which can be included as an LRF record for self‑documentation:

Field              Value
format-author      Andrew Kingdom
format-copyright   2023, all rights reserved by the author
format-name        Line Record Format
format-license     CC‑BY (Use freely, retain copyright notices)
media-MIME-type    text/line-record
file-extension     .txt or .rl (whitespace‑separated record lines)

9. License

This specification is licensed under the Creative Commons Attribution 4.0 International (CC‑BY 4.0). You are free to share and adapt the material as long as appropriate credit is given. See the format-license field above.

RECORD Customer Example
customer-name Fred Smith
customer-email fsmith@example.com
customer-phone +1 555 123 4567
# Messy Fruit Example
- Grapes
- Oranges
- Peaches
* Mandarines
* Strawberries
* Raspberries
# Neat Fruit Example
- Grapes
- Oranges
- Peaches
- Mandarines
- Strawberries
- Raspberries
/**
 * Parses a line-record string.
 * @param {string} source - The input string.
 * @param {string[]} [knownFields=[]] - Array of recognised field names to include directly.
 * @returns {Object[]} Array of single-key objects representing the parsed records.
 */
function parseLineRecordString(source, knownFields = []) {
  if (!source || source.length === 0) return [];
  const keyNewrecord = "RECORD";
  const keyNewrec = "#";
  const keyTitle = "TITLE";
  const keyList1 = "-";
  const keyList2 = "*";
  // Escape single quotes and remove carriage returns (as in original)
  const processed = source.replace(/'/g, "\\'").replace(/\r/g, "");
  const lines = processed.split("\n");
  const build = [];
  for (const line of lines) {
    const trimline = line.trim();
    if (trimline.length === 0) continue;
    // Find first whitespace character
    const wsMatch = trimline.match(/\s/);
    let field1, field2;
    if (!wsMatch) {
      field1 = trimline;
      field2 = "";
    } else {
      const idx = wsMatch.index;
      field1 = trimline.substring(0, idx).trim(); // no whitespace, trim is safe
      // Skip the separator character; trim() consumes any further whitespace
      field2 = trimline.substring(idx + 1).trim();
    }
    // Categorise and append
    if (knownFields.includes(field1)) {
      build.push({ [field1]: field2 });
    } else if (field1 === keyNewrecord || field1 === keyNewrec) {
      build.push({ [keyNewrec]: field2 }); // always use the short "#" key
    } else if (field1 === keyTitle) {
      build.push({ [field1]: field2 });
    } else if (field1 === keyList1 || field1 === keyList2) {
      build.push({ [field1]: field2 });
    } else if (/^\d+\.?$/.test(field1)) { // digits, optional trailing period (section 5.4)
      build.push({ [field1]: field2 });
    }
    // else skip the line
  }
  return build;
}
// --- Test (matching the original testLRS) ---
function testLRS() {
  const test = "\n RECORD Customer Example\n customer-name Fred Smith\n customer-email fsmith@example.com\n customer-phone +1 555 123 4567\n \n RECORD Fruit Example\n 1 Grapes\n 2 Oranges\n - Peaches \n * Mandarines\n * Strawberries\n * Raspberries";
  console.log("Input:");
  test.split("\n").forEach(line => console.log(`"${line}"`));
  console.log("Output:");
  const knownFields = ["customer-name", "customer-email", "customer-phone"];
  const arr = parseLineRecordString(test, knownFields);
  arr.forEach(item => {
    const key = Object.keys(item)[0];
    const value = item[key];
    console.log(`${key} : ${value}`);
  });
}
// Run test
testLRS();
// Example code in Objective-C
/* Displays...
Input:
""
" RECORD Customer Example"
" customer-name Fred Smith"
" customer-email fsmith@example.com"
" customer-phone +1 555 123 4567"
" "
" RECORD Fruit Example"
" 1 Grapes"
" 2 Oranges"
" - Peaches "
" * Mandarines"
" * Strawberries"
" * Raspberries"
Output:
# : Customer Example
customer-name : Fred Smith
customer-email : fsmith@example.com
customer-phone : +1 555 123 4567
# : Fruit Example
1 : Grapes
2 : Oranges
- : Peaches
* : Mandarines
* : Strawberries
* : Raspberries
*/
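#import <Foundation/Foundation.h>
// Note: the `auto` declarations below assume Objective-C++ (a .mm file) or another compiler mode with type inference.
@interface LineRecordString : NSObject
- (void)testLRS;
- (NSArray<NSDictionary<NSString*,NSString*>*>*) parseLineRecordString:(NSString*)source recognisedFieldNames:(NSArray<NSString*>*)knownFields;
@end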
@implementation LineRecordString
- (void)testLRS
{
auto test = @"\n RECORD Customer Example\n customer-name Fred Smith\n customer-email fsmith@example.com\n customer-phone +1 555 123 4567\n \n RECORD Fruit Example\n 1 Grapes\n 2 Oranges\n - Peaches \n * Mandarines\n * Strawberries\n * Raspberries";
NSLog(@"Input:");
for (NSString * line in [test componentsSeparatedByString:@"\n"]) {
NSLog(@"\"%@\"",line);
}
NSLog(@"Output:");
auto knownFields = @[@"customer-name",@"customer-email",@"customer-phone"]; // specific field names to include
NSArray<NSDictionary<NSString*,NSString*>*>* arr = [self parseLineRecordString:test recognisedFieldNames:knownFields];
for (NSDictionary<NSString*,NSString*>* item in arr) {
auto key = [item allKeys][0];
auto value = [item allValues][0];
NSLog(@"%@ : %@",key,value);
}
return;
}
// Objective-C example parser
- (NSArray<NSDictionary<NSString*,NSString*>*>*) parseLineRecordString:(NSString*)source recognisedFieldNames:(NSArray<NSString*>*)knownFields
{
auto keyNewrecord = @"RECORD";
auto keyNewrec = @"#";
auto keyTitle = @"TITLE";
auto keyList1 = @"-";
auto keyList2 = @"*";
if(source == nil || source.length == 0) {return nil;}
// Parse the file...
// This is an array of single-record-dictionary key-value pairs, to allow flexibility. This could also be a dictionary using the TITLE field as a key, adding other valid fields to the current record:
NSMutableArray<NSDictionary<NSString*,NSString*>*>* build = [[NSMutableArray<NSDictionary<NSString*,NSString*>*> alloc]init];
auto lines = [[[source
stringByReplacingOccurrencesOfString:@"'" withString:@"\\'"] // backslash plus quote; @"\'" alone is just a plain quote
stringByReplacingOccurrencesOfString:@"\r" withString:@""]
componentsSeparatedByString:@"\n"];
for (NSString * line in lines) {
// two fields separated by any amount of whitespace on the same line.
NSString * trimline = [line stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]; // WWWWWabcdefh ijklmnoWWWWWWW -- remove extreme whitespace (shown as W)
if(trimline.length != 0) {
// Separate the fields (1 or 2) -- field 1 contains no whitespace.
NSString* field1; // key
NSString* field2; // data
NSRange pos = [trimline rangeOfCharacterFromSet:[NSCharacterSet whitespaceCharacterSet]]; // abcdefghWWWWijklmno -- find next whitespace, shown as W
if(pos.location == NSNotFound) {
field1 = trimline;
field2 = @"";
} else {
field1 = [[trimline substringToIndex:pos.location]stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]; // extract field 1
field2 = [[trimline substringFromIndex:pos.location + pos.length] stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]; // extract field 2
}
// ...optionally do extra processing here...
// recognised field name...
if([knownFields containsObject: field1]) {
[build addObject:@{field1:field2}];
}
// record field type...
else if(
[field1 isEqualToString:keyNewrecord] ||
[field1 isEqualToString:keyNewrec]) {
[build addObject:@{keyNewrec:field2}]; // use the short version
}
// key field type
else if(
[field1 isEqualToString:keyTitle]) {
[build addObject:@{field1:field2}];
}
// other special field names
else if(
[field1 isEqualToString:keyList1] ||
[field1 isEqualToString:keyList2]
) {
[build addObject:@{field1:field2}];
}
// it's a numbered list
else if( [field1 rangeOfString:@"^\\d+\\.?$" options:NSRegularExpressionSearch].location != NSNotFound) { // digits, optional trailing period (section 5.4)
[build addObject:@{field1:field2}];
}
// else skip the line.
}
}
return build;
}
@end
import re


def parse_line_record_string(source: str, known_fields: list) -> list:
    """
    Parses a line-record string.
    :param source: The input string.
    :param known_fields: List of recognised field names to include directly.
    :return: List of single-key dictionaries representing the parsed records.
    """
    if not source:
        return []
    key_newrecord = "RECORD"
    key_newrec = "#"
    key_title = "TITLE"
    key_list1 = "-"
    key_list2 = "*"
    # Escape single quotes and remove carriage returns (as in original)
    processed = source.replace("'", "\\'").replace("\r", "")
    lines = processed.split("\n")
    build = []
    for line in lines:
        trimline = line.strip()
        if not trimline:
            continue
        # Find index of first whitespace character
        ws_index = None
        for i, ch in enumerate(trimline):
            if ch.isspace():
                ws_index = i
                break
        if ws_index is None:
            field1 = trimline
            field2 = ""
        else:
            field1 = trimline[:ws_index].strip()  # no whitespace, strip for safety
            # Skip the separator character; strip() consumes any further whitespace
            field2 = trimline[ws_index + 1:].strip()
        # Categorise and append
        if field1 in known_fields:
            build.append({field1: field2})
        elif field1 in (key_newrecord, key_newrec):
            build.append({key_newrec: field2})  # always use the short "#" key
        elif field1 == key_title:
            build.append({field1: field2})
        elif field1 in (key_list1, key_list2):
            build.append({field1: field2})
        elif re.fullmatch(r"\d+\.?", field1):  # digits, optional trailing period (section 5.4)
            build.append({field1: field2})
        # else skip the line
    return build
# --- Test (matching the original testLRS) ---
def test_lrs():
    test = (
        "\n RECORD Customer Example\n"
        " customer-name Fred Smith\n"
        " customer-email fsmith@example.com\n"
        " customer-phone +1 555 123 4567\n"
        " \n"
        " RECORD Fruit Example\n"
        " 1 Grapes\n"
        " 2 Oranges\n"
        " - Peaches \n"
        " * Mandarines\n"
        " * Strawberries\n"
        " * Raspberries"
    )
    print("Input:")
    for line in test.split("\n"):
        print(f'"{line}"')
    print("Output:")
    known_fields = ["customer-name", "customer-email", "customer-phone"]
    arr = parse_line_record_string(test, known_fields)
    for item in arr:
        key = next(iter(item))  # first (and only) key
        value = item[key]
        print(f"{key} : {value}")


# Run test
if __name__ == "__main__":
    test_lrs()
import Foundation
/// Parses a line-record string.
/// - Parameters:
/// - source: The raw input string.
/// - knownFields: An array of recognised field names to include verbatim.
/// - Returns: An array of single-key dictionaries representing parsed records.
func parseLineRecordString(source: String?, knownFields: [String]) -> [[String: String]] {
guard let source = source, !source.isEmpty else { return [] }
let keyNewrecord = "RECORD"
let keyNewrec = "#"
let keyTitle = "TITLE"
let keyList1 = "-"
let keyList2 = "*"
// Escape single quotes and strip carriage returns (mirrors original behaviour)
let processed = source
.replacingOccurrences(of: "'", with: "\\'")
.replacingOccurrences(of: "\r", with: "")
let lines = processed.components(separatedBy: "\n")
var build = [[String: String]]()
for line in lines {
let trimline = line.trimmingCharacters(in: .whitespaces)
if trimline.isEmpty { continue }
// Find index of first whitespace character
let wsIndex = trimline.firstIndex(where: { $0.isWhitespace })
let field1: String
let field2: String
if let idx = wsIndex {
// Field 1: up to (but not including) the whitespace, trimmed for safety
field1 = String(trimline[..<idx]).trimmingCharacters(in: .whitespaces)
// Field 2: after the whitespace character, trimmed
let afterIdx = trimline.index(after: idx)
field2 = String(trimline[afterIdx...]).trimmingCharacters(in: .whitespaces)
} else {
// No whitespace – entire line is field1, field2 is empty
field1 = trimline
field2 = ""
}
// Categorise and store the pair
if knownFields.contains(field1) {
build.append([field1: field2])
} else if field1 == keyNewrecord || field1 == keyNewrec {
build.append([keyNewrec: field2]) // always use the short "#" key
} else if field1 == keyTitle {
build.append([field1: field2])
} else if field1 == keyList1 || field1 == keyList2 {
build.append([field1: field2])
} else if field1.range(of: "^\\d+\\.?$", options: .regularExpression) != nil { // digits, optional trailing period (section 5.4)
build.append([field1: field2]) // numbered list entries
}
// else skip the line entirely
}
return build
}
// MARK: - Test (matches original testLRS)
func testLRS() {
let test = """
\n RECORD Customer Example
customer-name Fred Smith
customer-email fsmith@example.com
customer-phone +1 555 123 4567
RECORD Fruit Example
1 Grapes
2 Oranges
- Peaches
* Mandarines
* Strawberries
* Raspberries
"""
print("Input:")
for line in test.components(separatedBy: "\n") {
print("\"\(line)\"")
}
print("Output:")
let knownFields = ["customer-name", "customer-email", "customer-phone"]
let arr = parseLineRecordString(source: test, knownFields: knownFields)
for item in arr {
if let key = item.keys.first, let value = item[key] {
print("\(key) : \(value)")
}
}
}
// Run the test
testLRS()
akingdom commented May 8, 2023

Sharing this as a rough reference for clients and developers.