@wware
Last active February 3, 2025 01:42

Machine readable scientific literature

First a bit of background. Around 2010 I had the idea that computers could do science, not simply run lab equipment but iterate the scientific method on their own. I was thinking about how to make that happen, and within a month I stumbled across a brilliant piece of work done by Ross King, then at Aberystwyth University. He built a robot that could make observations, reason about hypotheses and the predictions implied by those hypotheses, and design and perform experiments. His robot produced publishable scientific papers, and advanced the state of biology in the area of yeast genomics. This was the full expression of the ideas I'd been kicking around for the previous month or two, and I tried to think of what was left to do.

At that time the semantic web was still a thing, and I thought that a good idea would be a way to express scientific ideas in a way that could be rendered into a graph database. The notion was to embed the RDF/Turtle in the markdown document, and use the markdown to render the document and the RDF/Turtle to render the knowledge graph. I was aware that researchers would not be eager to learn yet another syntax, so I thought about how to ease the learning curve. I called this idea "machine readable scientific literature" or MRSL for short.

This is a hodgepodge of ideas for a possible MRSL syntax, and the beginnings of a Pandoc parser for it.

How to build this document

pandoc --toc --toc-depth=5 \
    --from markdown+fenced_code_blocks+backtick_code_blocks \
    --highlight-style=pygments \
    -V geometry:margin=1in \
    -V fontsize=11pt \
    --number-sections \
    --listings mrsl.md -o mrsl.pdf

Basic syntax

An obvious approach to a machine-readable scientific literature is to embed pieces of RDF/Turtle in the document. Together these form a knowledge graph that lets a machine understand the document's contents, hopefully as well as a human does. Nothing prevents the prose and the embedded graph from diverging, so you need to think about how you'll detect and address that.

An example, taken from RDF 1.2 Turtle: Terse RDF Triple Language, looks like this:

PREFIX :    <http://www.example.org/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

:employee38 :familyName "Smith" .
_:id rdf:reifies <<( :employee38 :jobTitle "Assistant Designer" )>> .
_:id :accordingTo :employee22 .

But the syntax is a bit cumbersome.

Can we do better?

Turtle can be verbose, and we're going to want to be able to express more complex ideas than this.

Each MRSL document can begin with namespace declarations in Turtle:

@base   <http://example.org/> .
@prefix person: <http://example.org/person/> .
@prefix loc:    <http://example.org/location/> .
@prefix org:    <http://www.w3.org/ns/org#> .
@prefix time:   <http://www.w3.org/2006/time#> .
@prefix geo:    <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix schema: <http://schema.org/> .
@prefix dc:     <http://purl.org/dc/terms/> .
@prefix doc:    <http://example.org/document/> .

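This allows a shorthand syntax to map directly to URIs. A bare name resolves in the person namespace. The organization and event mappings below assume entity namespaces that are not among the declarations above: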

  • @JohnDoe maps to <http://example.org/person/JohnDoe>
  • @loc/NewYork maps to <http://example.org/location/NewYork>
  • @org/AcmeInc maps to <http://example.org/organization/AcmeInc>
  • @event/Conference2023 maps to <http://example.org/event/Conference2023>
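
The shorthand expansion can be sketched in a few lines of Python. This is only an illustration, not part of the parser module: the function names `read_prefixes` and `expand` are made up for this example, and the rule that a bare name falls back to the `person` prefix is an assumption read off the first mapping in the list.

```python
import re

# Two of the Turtle-style declarations from the header above.
HEADER = """\
@prefix person: <http://example.org/person/> .
@prefix loc:    <http://example.org/location/> .
"""

PREFIX_RE = re.compile(r"@prefix\s+(\w+):\s*<([^>]*)>\s*\.")


def read_prefixes(header: str) -> dict[str, str]:
    """Collect prefix -> namespace URI pairs (hypothetical helper)."""
    return {name: uri for name, uri in PREFIX_RE.findall(header)}


def expand(shorthand: str, prefixes: dict[str, str], default: str = "person") -> str:
    """Expand @Name or @prefix/Name to a full URI.

    Assumption: a bare name uses the `default` prefix.
    """
    body = shorthand.lstrip("@")
    prefix, sep, local = body.partition("/")
    if not sep:  # no slash, so fall back to the default namespace
        prefix, local = default, body
    return f"<{prefixes[prefix]}{local}>"


prefixes = read_prefixes(HEADER)
print(expand("@JohnDoe", prefixes))      # <http://example.org/person/JohnDoe>
print(expand("@loc/NewYork", prefixes))  # <http://example.org/location/NewYork>
```

An unknown prefix raises `KeyError` here, which is probably what you want while the syntax is still in flux.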

To express more complex relationships and ideas:

  1. Properties and Attributes

    • Use square brackets for properties: @JohnDoe[org:role=org:CEO, dc:valid=2020]
    • Nested properties are allowed:
      @org/AcmeInc[
        org:hasSite=@loc/NewYork[
          geo:location[
            schema:addressRegion=Manhattan
          ]
        ]
      ]
      
  2. Immutable Variables

    • Define named references using equality:
      acme = @org/AcmeInc[
        org:hasSite=@loc/NewYork[
          geo:location[
            schema:addressRegion=Manhattan
          ]
        ]
      ]
      
      Variables can then be used anywhere their value could be used, making complex expressions more readable. The equal sign is a statement of equality, not an assignment. The value of a variable never changes; this is the "immutable data" idea from functional programming.
  3. Relationships

    • Use arrow syntax for relationships:
      @JohnDoe -> org:headOf -> @org/AcmeInc
      
    • Chain relationships:
      @AliceSmith -> org:reportsTo -> @BobJones 
        -> org:reportsTo -> @CarolWhite
      
    • Add properties to relationships:
      @JohnDoe -[dc:valid=2020]-> org:headOf -> @org/AcmeInc
      
  4. Temporal Context

    • Use time: prefix for temporal markers:
      time:Interval[
        time:hasBeginning=2023-10-01;
        time:hasEnd=2023-12-31
      ]
      
    • Attach time context:
      (@JohnDoe -> org:memberOf -> @org/AcmeInc)[
        time:hasTime=time:Interval[
          time:hasBeginning=2020-01-01;
          time:hasEnd=2023-12-31
        ]
      ]
      
  5. Groups and Sets

    • Use curly braces for groups:
      {@JohnDoe, @AliceSmith, @BobJones}
      
    • Apply relationships to groups:
      {@JohnDoe, @AliceSmith} -> schema:attendee -> @event/Conference2023
      
  6. Citations and Sources

    • Use parentheses with source:
      (@JohnDoe -> org:headOf -> @org/AcmeInc)[
        dc:source=@doc/AnnualReport2023
      ]
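
One possible reading of the relationship and group syntax is that a group distributes over the arrow, and a chain is a sequence of overlapping triples. The sketch below spells out that interpretation. It is an assumption for illustration only (the parser module further down represents a group as a single blank node instead), and `expand_relation` and `expand_chain` are hypothetical names.

```python
from itertools import product


def as_members(term):
    """Treat a bare term as a one-element group; sort sets for stable output."""
    return sorted(term) if isinstance(term, (set, frozenset)) else [term]


def expand_relation(subject, predicate, obj):
    """{a, b} -> p -> c becomes one triple per (subject, object) pair."""
    return [(s, predicate, o)
            for s, o in product(as_members(subject), as_members(obj))]


def expand_chain(*terms):
    """a -> p -> b -> q -> c becomes (a, p, b) and (b, q, c).

    `terms` alternates node, predicate, node, predicate, node, ...
    """
    triples = []
    for i in range(0, len(terms) - 2, 2):
        triples.extend(expand_relation(terms[i], terms[i + 1], terms[i + 2]))
    return triples


group = expand_relation({"@JohnDoe", "@AliceSmith"},
                        "schema:attendee", "@event/Conference2023")
chain = expand_chain("@AliceSmith", "org:reportsTo", "@BobJones",
                     "org:reportsTo", "@CarolWhite")
for triple in group + chain:
    print(" ".join(triple), ".")
```

Properties attached to a relationship, such as a time interval or a source, do not fit into plain triples like these. They need some form of statement-level annotation, which is what the reification in the Turtle example near the top is for.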
      

MRSL parser module

Can we preserve that indentation when the markdown is rendered? Maybe it's time to start work on an MRSL parser module to help the markdown parser. This module should handle both rendering to HTML or PDF and generating full RDF/Turtle from the MRSL syntax.

To preserve the indentation in rendered markdown, we'll need to switch from inline code spans (single backticks) to fenced code blocks (triple backticks) with a custom language identifier for MRSL. This would also set us up nicely for syntax highlighting.
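
To see what the fenced form buys us, here is a small standalone sketch that pulls `mrsl` blocks out of markdown text with their indentation intact. It is not how the filter below works (that relies on pandoc's own parsing); the regular expression and the name `mrsl_blocks` are assumptions made for this illustration.

```python
import re

# Built at runtime so this sample can itself sit inside a fenced block.
FENCE = "`" * 3

FENCE_RE = re.compile(
    rf"^{FENCE}mrsl[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)

SAMPLE = "\n".join([
    "Some prose.",
    FENCE + "mrsl",
    "@org/AcmeInc[",
    "  org:hasSite=@loc/NewYork",
    "]",
    FENCE,
    "More prose.",
])


def mrsl_blocks(markdown: str) -> list[str]:
    """Return the body of every fenced block tagged `mrsl`."""
    return [m.group(1) for m in FENCE_RE.finditer(markdown)]


blocks = mrsl_blocks(SAMPLE)
print(blocks[0])
```

The leading spaces on the `org:hasSite` line survive, which is the point of the switch.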

The parser module contains these files:

  • mrsl_parser.py - The parser module
  • mrsl_filter.py - A filter for the markdown parser
  • mrsl.xml - A Kate-style (KDE XML) syntax highlighting definition for MRSL code blocks
  • test_mrsl_parser.py - Some pytest cases for the parser

The parser module is a work in progress, and the syntax is not yet complete. In particular, it is producing HTML but LaTeX hasn't been tested, nor has the RDF/Turtle generation been validated against a schema.

mrsl.xml

<?xml version="1.0" encoding="UTF-8"?>
<language name="mrsl" version="1.0" kateversion="5.0"
          section="Markup" extensions="*.mrsl">
  <highlighting>
    <list name="keywords">
      <item>PREFIX</item>
      <item>base</item>
    </list>
    <contexts>
      <context name="Normal" attribute="Normal Text" lineEndContext="#stay">
        <!-- Variables (orange) -->
        <RegExpr attribute="Variable" String="[a-zA-Z_]\w*\s*=" />
        <!-- Entities (blue) -->
        <RegExpr attribute="Entity" String="@[\w/]+" />
        <!-- Properties (green) -->
        <RegExpr attribute="Property" String="[\w]+:[\w]+" />
        <!-- Operators (red) -->
        <RegExpr attribute="Arrow" String="->" />
        <DetectChar attribute="Bracket" char="[" />
        <DetectChar attribute="Bracket" char="]" />
        <DetectChar attribute="Bracket" char="{" />
        <DetectChar attribute="Bracket" char="}" />
        <!-- Values (purple) -->
        <RegExpr attribute="Value" String="&quot;[^&quot;]*&quot;" />
        <RegExpr attribute="Value" String="\d{4}-\d{2}-\d{2}" />
      </context>
    </contexts>
    <itemDatas>
      <itemData name="Normal Text" defStyleNum="dsNormal"/>
      <itemData name="Variable" defStyleNum="dsVariable" color="#FFA500"/>
      <itemData name="Entity" defStyleNum="dsKeyword" color="#0000FF"/>
      <itemData name="Property" defStyleNum="dsDataType" color="#008000"/>
      <itemData name="Arrow" defStyleNum="dsOperator" color="#FF0000"/>
      <itemData name="Bracket" defStyleNum="dsOperator" color="#FF0000"/>
      <itemData name="Value" defStyleNum="dsString" color="#800080"/>
    </itemDatas>
  </highlighting>
</language>

mrsl_filter.py

#!/usr/bin/env python3
from mrsl_parser import MRSLParser
from panflute import RawBlock, Div, CodeBlock, Para, Str, run_filter


def prepare(doc):
    """Initialize the MRSL parser when document processing begins"""
    doc.mrsl_parser = MRSLParser()


def handle_code_block(elem, doc):
    """Process code blocks marked as MRSL"""
    # Only process code blocks with MRSL language identifier
    if not isinstance(elem, CodeBlock):
        return None
    if not elem.classes:
        return None
    if elem.classes[0] != 'mrsl':
        return None
    try:
        # Parse the MRSL content
        ast = doc.mrsl_parser.parse(elem.text)
        # Convert to appropriate output format based on doc.format
        if doc.format == 'html':
            highlighted = doc.mrsl_parser.to_html(ast)
            # Return raw HTML block
            return Div(RawBlock(highlighted, format='html'))
        elif doc.format == 'latex':
            highlighted = doc.mrsl_parser.to_latex(ast)
            # Return raw LaTeX block
            return RawBlock(highlighted, format='latex')
        elif doc.format == 'rdf':
            # Generate RDF/Turtle output
            rdf = doc.mrsl_parser.to_rdf(ast)
            return CodeBlock(rdf, classes=['turtle'])
        # For other formats, return the original code block
        return elem
    except Exception as e:
        # Handle parsing errors by adding error message as comment
        error_msg = f"MRSL parsing error: {str(e)}"
        if doc.format == 'html':
            return RawBlock(f"<!-- {error_msg} -->", format='html')
        elif doc.format == 'latex':
            return RawBlock(f"% {error_msg}", format='latex')
        else:
            return Para(Str(error_msg))


def main(doc=None):
    """Main entry point for the filter"""
    return run_filter(handle_code_block, prepare=prepare, doc=doc)


if __name__ == '__main__':
    main()

mrsl_parser.py

from enum import Enum, auto
import re
import sys
from typing import List, Optional, Union
from pydantic import BaseModel


class TokenType(Enum):
    ENTITY = 'ENTITY'
    PROPERTY = 'PROPERTY'
    ARROW = 'ARROW'
    VARIABLE = 'VARIABLE'
    EQUALS = 'EQUALS'
    LBRACKET = 'LBRACKET'
    RBRACKET = 'RBRACKET'
    LBRACE = 'LBRACE'
    RBRACE = 'RBRACE'
    LPAREN = 'LPAREN'
    RPAREN = 'RPAREN'
    SEMICOLON = 'SEMICOLON'
    COMMA = 'COMMA'
    VALUE = 'VALUE'
    WHITESPACE = 'WHITESPACE'


class Property(BaseModel):
    name: str
    value: Union[str, 'Entity']


class Entity(BaseModel):
    name: str
    properties: List[Property]


class Group(BaseModel):
    entities: List[Entity]


class Context(BaseModel):
    temporal: Optional['TimeInterval'] = None
    source: Optional[Entity] = None


class TimeInterval(BaseModel):
    start: str  # ISO date format
    end: Optional[str] = None


class Relation(BaseModel):
    """A relation between entities or groups"""
    subject: Union[Entity, Group]
    predicate: Optional[str] = None
    object: Union[Entity, Group]
    context: Optional[Context] = None


class AST(BaseModel):
    entities: List[Entity]
    relations: List[Relation] = []
    variables: dict = {}


class MRSLParser:
    def __init__(self):
        # Order matters: the first pattern that matches at the current position wins
        self.token_patterns = [
            (TokenType.WHITESPACE, r'[ \t\n\r]+'),
            (TokenType.PROPERTY, r'[a-zA-Z][\w]*:[a-zA-Z][\w-]*'),
            (TokenType.ENTITY, r'@[a-zA-Z_][\w/]*'),
            (TokenType.VARIABLE, r'[a-zA-Z_]\w*(?=\s*=)'),
            (TokenType.VALUE, r'"[^"]*"|\d{4}-\d{2}-\d{2}|\d+|[A-Za-z]+'),
            (TokenType.EQUALS, r'='),
            (TokenType.ARROW, r'-\[([^\]]+)\]->|->'),
            (TokenType.LBRACKET, r'\['),
            (TokenType.RBRACKET, r'\]'),
            (TokenType.LBRACE, r'\{'),
            (TokenType.RBRACE, r'\}'),
            (TokenType.LPAREN, r'\('),
            (TokenType.RPAREN, r'\)'),
            (TokenType.SEMICOLON, r';'),
            (TokenType.COMMA, r','),
        ]
        self.variables = {}  # Store variable definitions

    def tokenize(self, text):
        tokens = []
        position = 0
        while position < len(text):
            match = None
            remaining = text[position:]
            for token_type, pattern in self.token_patterns:
                regex = re.compile(pattern)
                match = regex.match(remaining)
                if match:
                    value = match.group(0)
                    if token_type == TokenType.ARROW and '[' in value:
                        # Extract predicate from -[predicate]->
                        predicate = match.group(1)  # Get the captured group
                        tokens.append((TokenType.ARROW, value, predicate))  # Store predicate in token
                    elif token_type != TokenType.WHITESPACE:
                        # Whitespace is matched but never emitted as a token
                        tokens.append((token_type, value))
                    position += len(value)
                    break
            if not match:
                context = text[max(0, position-10):position] + "👉" + text[position:position+10]
                raise SyntaxError(f"Invalid syntax at position {position}\nContext: {context}")
            assert len(tokens) < 1000, "Tokens length is too long"
        return tokens

    # Apply CSS classes to the AST
    # .mrsl-entity { color: blue; }
    # .mrsl-property { color: green; }
    # .mrsl-operator { color: red; }
    # .mrsl-value { color: purple; }
    # .mrsl-variable { color: orange; }

    def parse(self, text: str) -> Union[Entity, AST]:
        """Parse MRSL text into an AST"""
        # import traceback
        # print(''.join(traceback.format_stack()), file=sys.stderr)
        tokens = self.tokenize(text)
        ast = AST(entities=[], relations=[], variables={})
        while tokens:
            # Skip whitespace
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            if not tokens:
                break
            # Parse next statement
            if tokens[0][0] == TokenType.VARIABLE:
                var_name = tokens[0][1]
                tokens = tokens[1:]  # Consume variable name
                # Skip whitespace before equals
                while tokens and tokens[0][0] == TokenType.WHITESPACE:
                    tokens = tokens[1:]
                if not tokens or tokens[0][0] != TokenType.EQUALS:
                    raise SyntaxError("Expected = after variable name")
                tokens = tokens[1:]  # Consume equals
                # Parse the value (entity or relation)
                value, tokens = self.parse_value(tokens)
                ast.variables[var_name] = value
            elif tokens[0][0] == TokenType.LPAREN:
                relation, tokens = self.parse_relation(tokens)
                ast.relations.append(relation)
            elif tokens[0][0] == TokenType.ENTITY:
                entity, tokens = self.parse_entity(tokens)
                ast.entities.append(entity)
            else:
                raise SyntaxError(f"Expected entity, relation, or variable, got {tokens[0][0]}")
        return ast

    def parse_entity(self, tokens: List[tuple]) -> tuple[Entity, List[tuple]]:
        """Parse an entity definition, return (Entity, remaining_tokens)"""
        if not tokens or tokens[0][0] != TokenType.ENTITY:
            raise SyntaxError("Expected entity starting with @")
        entity_name = tokens[0][1]
        tokens = tokens[1:]  # Consume entity token
        properties = []
        # Skip whitespace before properties
        while tokens and tokens[0][0] == TokenType.WHITESPACE:
            tokens = tokens[1:]
        # Check if we have properties
        if tokens and tokens[0][0] == TokenType.LBRACKET:
            tokens = tokens[1:]  # Consume [
            while tokens:
                # Skip whitespace
                while tokens and tokens[0][0] == TokenType.WHITESPACE:
                    tokens = tokens[1:]
                if not tokens:
                    raise SyntaxError("Unexpected end of input in properties")
                if tokens[0][0] == TokenType.RBRACKET:
                    tokens = tokens[1:]  # Consume ]
                    break
                # Parse property
                if tokens[0][0] != TokenType.PROPERTY:
                    raise SyntaxError(f"Expected property name, got {tokens[0][0]}")
                prop, tokens = self.parse_property(tokens)
                properties.append(prop)
                # Skip whitespace after property
                while tokens and tokens[0][0] == TokenType.WHITESPACE:
                    tokens = tokens[1:]
                # Check for comma or closing bracket
                if tokens and tokens[0][0] == TokenType.COMMA:
                    tokens = tokens[1:]  # Consume comma
                    continue
                elif tokens and tokens[0][0] == TokenType.RBRACKET:
                    tokens = tokens[1:]  # Consume ]
                    break
                else:
                    raise SyntaxError("Expected , or ] after property")
        return Entity(name=entity_name, properties=properties), tokens

    def parse_property(self, tokens: List[tuple]) -> tuple[Property, List[tuple]]:
        """Parse a property definition, return (Property, remaining_tokens)"""
        if not tokens or tokens[0][0] != TokenType.PROPERTY:
            raise SyntaxError(f"Expected property name, got {tokens[0][0] if tokens else 'EOF'}")
        prop_name = tokens[0][1]
        tokens = tokens[1:]  # Consume property name
        # Skip whitespace
        while tokens and tokens[0][0] == TokenType.WHITESPACE:
            tokens = tokens[1:]
        if not tokens:
            raise SyntaxError("Unexpected end of input after property name")
        # Handle both direct nested structures and equals assignments
        if tokens[0][0] == TokenType.LBRACKET:
            # Direct nested structure
            nested_entity, tokens = self.parse_nested_structure(tokens)
            return Property(name=prop_name, value=nested_entity), tokens
        elif tokens[0][0] == TokenType.EQUALS:
            # Equals assignment
            tokens = tokens[1:]  # Consume equals
            # Skip whitespace after equals
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            if not tokens:
                raise SyntaxError("Expected value after =")
            if tokens[0][0] == TokenType.ENTITY:
                # Value is a nested entity
                entity, tokens = self.parse_entity(tokens)
                value = entity
            elif tokens[0][0] == TokenType.VALUE:
                # Value is a simple value
                value = tokens[0][1]
                tokens = tokens[1:]  # Consume value
            elif tokens[0][0] == TokenType.LBRACKET:
                # Handle nested structure after equals
                nested_entity, tokens = self.parse_nested_structure(tokens)
                value = nested_entity
            else:
                raise SyntaxError(f"Expected value, entity, or nested structure, got {tokens[0][0]}")
            return Property(name=prop_name, value=value), tokens
        else:
            raise SyntaxError(f"Expected = or [ after property name, got {tokens[0][0]}")

    def parse_nested_structure(self, tokens: List[tuple]) -> tuple[Entity, List[tuple]]:
        """Parse a nested property structure"""
        if not tokens or tokens[0][0] != TokenType.LBRACKET:
            raise SyntaxError("Expected [ at start of nested structure")
        tokens = tokens[1:]  # Consume [
        properties = []
        while tokens:
            # Skip whitespace
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            if not tokens:
                raise SyntaxError("Unexpected end of input in nested structure")
            if tokens[0][0] == TokenType.RBRACKET:
                tokens = tokens[1:]  # Consume ]
                break
            # Parse property
            prop, tokens = self.parse_property(tokens)
            properties.append(prop)
            # Skip whitespace
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            # Check for comma or closing bracket
            if tokens and tokens[0][0] == TokenType.COMMA:
                tokens = tokens[1:]  # Consume comma
                continue
            elif tokens and tokens[0][0] == TokenType.RBRACKET:
                tokens = tokens[1:]  # Consume ]
                break
            else:
                raise SyntaxError("Expected , or ] in nested structure")
        return Entity(name="", properties=properties), tokens

    def parse_relation(self, tokens: List[tuple]) -> tuple[Relation, List[tuple]]:
        """Parse a relation definition, return (Relation, remaining_tokens)"""
        # Skip opening parenthesis if present
        if tokens and tokens[0][0] == TokenType.LPAREN:
            tokens = tokens[1:]
        # Parse subject
        if not tokens:
            raise SyntaxError("Unexpected end of input in relation")
        if tokens[0][0] == TokenType.LBRACE:
            subject, tokens = self.parse_group(tokens)
        elif tokens[0][0] == TokenType.ENTITY:
            subject, tokens = self.parse_entity(tokens)
        else:
            raise SyntaxError(f"Expected entity or group at start of relation, got {tokens[0][0]}")
        # Skip whitespace before first arrow
        while tokens and tokens[0][0] == TokenType.WHITESPACE:
            tokens = tokens[1:]
        # Parse first arrow
        if not tokens or tokens[0][0] != TokenType.ARROW:
            raise SyntaxError("Expected -> in relation")
        arrow_token = tokens[0]
        predicate = None
        # Check if the arrow token contains a predicate
        if len(arrow_token) > 2 and arrow_token[2]:
            predicate = arrow_token[2]  # Extract predicate from the third element
            # print(f"DEBUG: Found predicate in arrow: {predicate}")
            tokens = tokens[1:]  # Consume arrow
        else:
            tokens = tokens[1:]  # Consume arrow
        # Skip whitespace after first arrow
        while tokens and tokens[0][0] == TokenType.WHITESPACE:
            tokens = tokens[1:]
        # Check for predicate as a separate property
        if tokens and tokens[0][0] == TokenType.PROPERTY:
            predicate = tokens[0][1]  # Get predicate from property token
            # print(f"DEBUG: Found predicate as property: {predicate}")
            tokens = tokens[1:]  # Consume predicate
            # Skip whitespace
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            # Expect second arrow
            if not tokens or tokens[0][0] != TokenType.ARROW:
                raise SyntaxError("Expected -> after predicate")
            tokens = tokens[1:]  # Consume second arrow
        # Skip whitespace before object
        while tokens and tokens[0][0] == TokenType.WHITESPACE:
            tokens = tokens[1:]
        # Parse object
        if not tokens:
            raise SyntaxError("Unexpected end of input in relation")
        if tokens[0][0] == TokenType.LBRACE:
            obj, tokens = self.parse_group(tokens)
        elif tokens[0][0] == TokenType.ENTITY:
            obj, tokens = self.parse_entity(tokens)
        else:
            raise SyntaxError(f"Expected entity or group after arrow, got {tokens[0][0]}")
        # Handle optional closing parenthesis
        if tokens and tokens[0][0] == TokenType.RPAREN:
            tokens = tokens[1:]
        # Parse context if present
        if tokens and tokens[0][0] == TokenType.LBRACKET:
            context, tokens = self.parse_context(tokens)
        else:
            context = None
        relation = Relation(subject=subject, predicate=predicate, object=obj, context=context)
        return relation, tokens

    def parse_group(self, tokens: List[tuple]) -> tuple[Group, List[tuple]]:
        """Parse a group of entities, return (Group, remaining_tokens)"""
        # Skip opening brace
        if not tokens or tokens[0][0] != TokenType.LBRACE:
            raise SyntaxError("Expected { at start of group")
        tokens = tokens[1:]
        entities = []
        while tokens:
            # Skip whitespace
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            if not tokens:
                raise SyntaxError("Unexpected end of input in group")
            if tokens[0][0] == TokenType.RBRACE:
                tokens = tokens[1:]  # Consume closing brace
                break
            # Parse entity
            if tokens[0][0] != TokenType.ENTITY:
                raise SyntaxError(f"Expected entity in group, got {tokens[0][0]}")
            entity, tokens = self.parse_entity(tokens)
            entities.append(entity)
            # Skip whitespace
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            # Check for comma or closing brace
            if tokens and tokens[0][0] == TokenType.COMMA:
                tokens = tokens[1:]  # Consume comma
                continue
            elif tokens and tokens[0][0] == TokenType.RBRACE:
                tokens = tokens[1:]  # Consume closing brace
                break
            else:
                raise SyntaxError("Expected , or } in group")
        return Group(entities=entities), tokens

    def parse_value(self, tokens: List[tuple]) -> tuple[Union[Entity, Relation], List[tuple]]:
        """Parse a value (entity or relation) after a variable assignment"""
        # Skip whitespace
        while tokens and tokens[0][0] == TokenType.WHITESPACE:
            tokens = tokens[1:]
        if not tokens:
            raise SyntaxError("Expected value after =")
        # Check if it's a relation (starts with parenthesis)
        if tokens[0][0] == TokenType.LPAREN:
            return self.parse_relation(tokens)
        # Or if it's an entity
        elif tokens[0][0] == TokenType.ENTITY:
            return self.parse_entity(tokens)
        else:
            raise SyntaxError(f"Expected entity or relation, got {tokens[0][0]}")

    def parse_context(self, tokens: List[tuple]) -> tuple[Context, List[tuple]]:
        """Parse a context definition, return (Context, remaining_tokens)"""
        if not tokens or tokens[0][0] != TokenType.LBRACKET:
            raise SyntaxError("Expected [ at start of context")
        tokens = tokens[1:]  # Consume [
        temporal = None
        source = None
        while tokens:
            # Skip whitespace
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            if not tokens:
                raise SyntaxError("Unexpected end of input in context")
            if tokens[0][0] == TokenType.RBRACKET:
                tokens = tokens[1:]  # Consume ]
                break
            # Parse property
            if tokens[0][0] != TokenType.PROPERTY:
                raise SyntaxError(f"Expected property in context, got {tokens[0][0]}")
            prop_name = tokens[0][1]
            tokens = tokens[1:]  # Consume property name
            # Skip whitespace before equals
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            if not tokens or tokens[0][0] != TokenType.EQUALS:
                raise SyntaxError("Expected = in context property")
            tokens = tokens[1:]  # Consume equals
            # Skip whitespace after equals
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            if not tokens:
                raise SyntaxError("Expected value in context property")
            if prop_name in ['time:hasTime', 'time:start']:  # Added time:start
                if tokens[0][0] == TokenType.PROPERTY and tokens[0][1] == 'time:Interval':
                    tokens = tokens[1:]  # Consume time:Interval
                    interval_entity, tokens = self.parse_nested_structure(tokens)
                    # Extract start/end from interval properties
                    start = end = None
                    for prop in interval_entity.properties:
                        if prop.name == 'time:hasBeginning':
                            start = prop.value.strip('"')
                        elif prop.name == 'time:hasEnd':
                            end = prop.value.strip('"')
                    temporal = TimeInterval(start=start, end=end)
                else:
                    temporal = TimeInterval(start=tokens[0][1].strip('"'))
                    tokens = tokens[1:]
            elif prop_name == 'dc:source':
                if tokens[0][0] != TokenType.ENTITY:
                    raise SyntaxError(f"Expected entity as source, got {tokens[0][0]}")
                source, tokens = self.parse_entity(tokens)
            else:
                raise SyntaxError(f"Unknown context property: {prop_name}")
            # Skip whitespace after value
            while tokens and tokens[0][0] == TokenType.WHITESPACE:
                tokens = tokens[1:]
            # Check for comma or closing bracket
            if tokens and tokens[0][0] == TokenType.COMMA:
                tokens = tokens[1:]  # Continue to next property
            elif tokens and tokens[0][0] == TokenType.RBRACKET:
                tokens = tokens[1:]  # End of context
                break
            else:
                raise SyntaxError("Expected , or ] in context")
        return Context(temporal=temporal, source=source), tokens

    def to_rdf(self, ast: Union[AST, Entity]) -> str:
        """Convert AST or Entity to RDF (Turtle format)"""
        if isinstance(ast, Entity):
            return self._entity_to_rdf(ast)
        else:
            # Handle AST with multiple entities and relations
            lines = []
            for entity in ast.entities:
                lines.append(self._entity_to_rdf(entity))
            for relation in ast.relations:
                lines.append(self._relation_to_rdf(relation))
            return '\n'.join(lines)

    def to_html(self, ast: Union[AST, Entity]) -> str:
        """Convert AST or Entity to HTML with syntax highlighting"""
        if isinstance(ast, Entity):
            return self._entity_to_html(ast)
        else:
            # Handle AST with multiple entities and relations
            lines = ['<div class="mrsl-block">']
            for entity in ast.entities:
                lines.append(self._entity_to_html(entity))
            for relation in ast.relations:
                lines.append(self._relation_to_html(relation))
            lines.append('</div>')
            return '\n'.join(lines)

    def to_latex(self, ast: Union[AST, Entity]) -> str:
        """Convert AST or Entity to LaTeX"""
        if isinstance(ast, Entity):
            return self._entity_to_latex(ast)
        else:
            # Handle AST with multiple entities and relations
            lines = []
            for entity in ast.entities:
                lines.append(self._entity_to_latex(entity))
            for relation in ast.relations:
                lines.append(self._relation_to_latex(relation))
            return '\n'.join(lines)

    def _entity_to_rdf(self, entity: Entity) -> str:
        """Convert an entity to RDF (Turtle format)"""
        # Define prefixes in the order they should appear
        prefixes = [
            ('pers', 'http://example.org/person#'),
            ('org', 'http://example.org/org#'),
            ('geo', 'http://example.org/geo#'),
            ('schema', 'http://schema.org/'),
            ('loc', 'http://example.org/loc#'),
            ('name', 'http://example.org/name#'),
            ('time', 'http://example.org/time#'),
            ('dc', 'http://purl.org/dc/terms/')
        ]
        # Output prefixes in defined order
        lines = []
        for prefix, uri in prefixes:
            lines.append(f'@prefix {prefix}: <{uri}> .')
        lines.append('')  # Empty line after prefixes

        def entity_to_rdf(entity: Entity, depth: int = 0) -> List[str]:
            """Convert an entity and its properties to RDF triples"""
            indent = " " * depth
            triples = []
            # Convert @org/AcmeInc to org:AcmeInc
            if '/' in entity.name:
                prefix, name = entity.name.replace('@', '').split('/')
                subject = f"{prefix}:{name}"
            else:
                subject = entity.name.replace('@', '')
                if ':' not in subject:
                    subject = f"pers:{subject}"
            for prop in entity.properties:
                predicate = prop.name
                if isinstance(prop.value, Entity):
                    # NOTE: the nested entity is written as full triples inside
                    # [ ... ], which is not valid Turtle; this output has not
                    # been validated yet.
                    triples.append(f"{indent}{subject} {predicate} [")
                    triples.extend(entity_to_rdf(prop.value, depth + 1))
                    triples.append(f"{indent}] .")
                else:
                    # VALUE tokens keep their surrounding quotes, so strip them
                    # before quoting to avoid emitting ""Manhattan""
                    literal = str(prop.value).strip('"')
                    triples.append(f'{indent}{subject} {predicate} "{literal}" .')
            return triples

        lines.extend(entity_to_rdf(entity))
        return '\n'.join(lines).rstrip()  # Remove trailing whitespace

    def _relation_to_rdf(self, relation: Relation) -> str:
        """Convert a relation to RDF format"""

        def curie(name: str) -> str:
            """Convert @org/AcmeInc to org:AcmeInc; bare names such as @JohnDoe
            default to the pers: prefix, matching _entity_to_rdf"""
            body = name.replace('@', '')
            if '/' in body:
                ns, local = body.split('/', 1)
                return f"{ns}:{local}"
            return f"pers:{body}"

        lines = []
        # Convert subject
        if isinstance(relation.subject, Entity):
            subject = curie(relation.subject.name)
        else:  # Group
            subject = "_:group1"  # TODO: Generate unique blank node IDs
        # Convert predicate
        predicate = relation.predicate if relation.predicate else "relates"
        # Convert object
        if isinstance(relation.object, Entity):
            obj = curie(relation.object.name)
        else:  # Group
            obj = "_:group2"  # TODO: Generate unique blank node IDs
        # Basic relation triple
        lines.append(f"{subject} {predicate} {obj} .")
        # Add context if present (attached to the subject, not to the statement)
        if relation.context:
            if relation.context.temporal:
                lines.append(f"{subject} time:start \"{relation.context.temporal.start}\" .")
                if relation.context.temporal.end:
                    lines.append(f"{subject} time:end \"{relation.context.temporal.end}\" .")
            if relation.context.source:
                lines.append(f"{subject} dc:source {curie(relation.context.source.name)} .")
        return '\n'.join(lines)

    def _entity_to_html(self, entity: Entity) -> str:
        """Convert an entity to HTML with syntax highlighting"""
        parts = ['<div class="mrsl-block">']
        # Entity name
        parts.append(f'<span class="mrsl-entity">{entity.name}</span>')
        if entity.properties:
            parts.append('<span class="mrsl-bracket">[</span>')
            # Properties
            for i, prop in enumerate(entity.properties):
                parts.append('\n ')  # Indent properties
                # Property name
                parts.append(f'<span class="mrsl-property">{prop.name}</span>')
                parts.append('<span class="mrsl-equals"> = </span>')
                # Property value
                if isinstance(prop.value, Entity):
                    parts.append(self._entity_to_html(prop.value))
                else:
                    parts.append(f'<span class="mrsl-value">{str(prop.value)}</span>')
                if i < len(entity.properties) - 1:
                    parts.append('<span class="mrsl-comma">,</span>')
            parts.append('\n')  # Newline before closing bracket
            parts.append('<span class="mrsl-bracket">]</span>')
        parts.append('</div>')
        return ''.join(parts)

    def _relation_to_html(self, relation: Relation) -> str:
        """Convert a relation to HTML with syntax highlighting"""
        lines = []
        lines.append('<span class="mrsl-paren">(</span>')
        # Subject
        if isinstance(relation.subject, Entity):
            lines.append(self._entity_to_html(relation.subject))
        else:  # Group
            lines.append('<span class="mrsl-bracket">{</span>')
            for i, entity in enumerate(relation.subject.entities):
                lines.append(self._entity_to_html(entity))
                if i < len(relation.subject.entities) - 1:
                    lines.append('<span class="mrsl-comma">, </span>')
            lines.append('<span class="mrsl-bracket">}</span>')
        # Arrow and predicate
        if relation.predicate:
            lines.append(f'<span class="mrsl-arrow">-[{relation.predicate}]-></span>')
        else:
            lines.append('<span class="mrsl-arrow">-></span>')
        # Object
        if isinstance(relation.object, Entity):
            lines.append(self._entity_to_html(relation.object))
        else:  # Group
            lines.append('<span class="mrsl-bracket">{</span>')
            for i, entity in enumerate(relation.object.entities):
                lines.append(self._entity_to_html(entity))
                if i < len(relation.object.entities) - 1:
                    lines.append('<span class="mrsl-comma">, </span>')
            lines.append('<span class="mrsl-bracket">}</span>')
        # Context
        if relation.context:
            lines.append('<span class="mrsl-bracket">[</span>')
            if relation.context.temporal:
                lines.append('<span class="mrsl-property">time:start</span>')
                lines.append('<span class="mrsl-equals"> = </span>')
                lines.append(f'<span class="mrsl-value">"{relation.context.temporal.start}"</span>')
            if relation.context.source:
                if relation.context.temporal:
                    lines.append('<span class="mrsl-comma">, </span>')
                lines.append('<span class="mrsl-property">dc:source</span>')
                lines.append('<span class="mrsl-equals"> = </span>')
                lines.append(self._entity_to_html(relation.context.source))
            lines.append('<span class="mrsl-bracket">]</span>')
        lines.append('<span class="mrsl-paren">)</span>')
        return ''.join(lines)

    def _entity_to_latex(self, entity: Entity) -> str:
        """Convert an entity to LaTeX"""
        replacements = {
            '\\': '\\textbackslash{}',
            '_': '\\_',
            '$': '\\$',
            '%': '\\%',
            '&': '\\&',
            '{': '\\{',
            '}': '\\}',
            '~': '\\textasciitilde{}',
            '^': '\\textasciicircum{}',
            '#': '\\#'
        }

        def escape_latex(text: str) -> str:
            """Escape special LaTeX characters"""
            # Substitute in a single pass. Chained str.replace calls would
            # re-escape the backslashes and braces inserted by earlier
            # replacements (for example the {} in \textbackslash{}).
            return re.sub(r'[\\_$%&{}~^#]',
                          lambda m: replacements[m.group(0)],
                          str(text))

        content = []
        # Entity name
        content.append(escape_latex(entity.name))
        if entity.properties:
            content.append('[')
            prop_parts = []
            for prop in entity.properties:
                if isinstance(prop.value, Entity):
                    value = self._entity_to_latex(prop.value)
                else:
                    value = escape_latex(str(prop.value))
                prop_parts.append(f"{escape_latex(prop.name)} = {value}")
            content.append(', '.join(prop_parts))
            content.append(']')
        # Wrap in lstlisting environment
        latex = [
            '\\begin{lstlisting}[language=MRSL,',
            'basicstyle=\\ttfamily,',
            'keywordstyle=\\color{blue},',
            'stringstyle=\\color{purple},',
            'commentstyle=\\color{green},',
            'breaklines=true,',
            'showstringspaces=false]',
            ''.join(content),
            '\\end{lstlisting}'
        ]
        return '\n'.join(latex)
    def _relation_to_latex(self, relation: Relation) -> str:
        """Convert a relation to LaTeX"""
        parts = []
        # Convert subject
        if isinstance(relation.subject, Entity):
            parts.append(self._entity_to_latex(relation.subject))
        else:  # Group
            parts.append('\\{')
            for i, entity in enumerate(relation.subject.entities):
                parts.append(self._entity_to_latex(entity))
                if i < len(relation.subject.entities) - 1:
                    parts.append(', ')
            parts.append('\\}')
        # Add arrow, with the predicate when one is present
        if relation.predicate:
            parts.append(f' -[{relation.predicate}]-> ')
        else:
            parts.append(' -> ')
        # Convert object
        if isinstance(relation.object, Entity):
            parts.append(self._entity_to_latex(relation.object))
        else:  # Group
            parts.append('\\{')
            for i, entity in enumerate(relation.object.entities):
                parts.append(self._entity_to_latex(entity))
                if i < len(relation.object.entities) - 1:
                    parts.append(', ')
            parts.append('\\}')
        return ''.join(parts)

    def _context_to_latex(self, context: Optional[Context]) -> str:
        """Convert a context to LaTeX"""
        if context is None:
            return ''
        lines = []
        if context.temporal:
            lines.append(f" time: {context.temporal.start} {context.temporal.end} .")
        if context.source:
            lines.append(f" dc:source {context.source.name} .")
        return '\n'.join(lines)

import pytest
from mrsl_parser import MRSLParser, TokenType, Entity, Property, Relation, Group, AST


@pytest.fixture
def parser():
    return MRSLParser()


def test_nested_properties(parser):
    mrsl = """@org/AcmeInc[
        org:hasSite=@loc/NewYork[
            geo:location[
                schema:addressRegion="Manhattan"
            ]
        ]
    ]"""
    result = parser.parse(mrsl)
    assert isinstance(result, AST)
    assert len(result.entities) == 1
    entity = result.entities[0]
    assert entity.name == '@org/AcmeInc'
    assert len(entity.properties) == 1
    site = entity.properties[0]
    assert site.name == 'org:hasSite'
    assert isinstance(site.value, Entity)
    assert site.value.name == '@loc/NewYork'
    location = site.value.properties[0]
    assert location.name == 'geo:location'
    assert isinstance(location.value, Entity)
    assert location.value.properties[0].name == 'schema:addressRegion'
    assert location.value.properties[0].value == '"Manhattan"'


def test_variable_assignment(parser):
    mrsl = """acme = @org/AcmeInc[
        org:hasSite=@loc/NewYork[
            geo:location[
                schema:addressRegion="Manhattan"
            ]
        ]
    ]"""
    result = parser.parse(mrsl)
    assert isinstance(result, AST)
    assert 'acme' in result.variables
    assert isinstance(result.variables['acme'], Entity)
    assert result.variables['acme'].name == '@org/AcmeInc'


def test_simple_relationship(parser):
    mrsl = "@JohnDoe -> org:headOf -> @org/AcmeInc"
    tokens = parser.tokenize(mrsl)
    assert tokens is not None


def test_chained_relationship(parser):
    mrsl = """@AliceSmith -> org:reportsTo -> @BobJones
        -> org:reportsTo -> @CarolWhite"""
    tokens = parser.tokenize(mrsl)
    assert tokens is not None


def test_relationship_with_properties(parser):
    mrsl = "@JohnDoe -[dc:valid=2020]-> org:headOf -> @org/AcmeInc"
    tokens = parser.tokenize(mrsl)
    assert tokens is not None


def test_time_interval(parser):
    mrsl = """time:Interval[
        time:hasBeginning=2023-10-01;
        time:hasEnd=2023-12-31
    ]"""
    tokens = parser.tokenize(mrsl)
    assert tokens is not None


def test_complex_temporal_context(parser):
    mrsl = """(@JohnDoe -> org:memberOf -> @org/AcmeInc)[
        time:hasTime=time:Interval[
            time:hasBeginning="2020-01-01",
            time:hasEnd="2023-12-31"
        ]
    ]"""
    result = parser.parse(mrsl)
    assert isinstance(result, AST)
    assert len(result.relations) == 1
    relation = result.relations[0]
    assert relation.subject.name == '@JohnDoe'
    assert relation.predicate == 'org:memberOf'
    assert relation.object.name == '@org/AcmeInc'
    assert relation.context is not None
    assert relation.context.temporal is not None
    assert relation.context.temporal.start == '2020-01-01'
    assert relation.context.temporal.end == '2023-12-31'


def test_simple_group(parser):
    mrsl = "{@JohnDoe, @AliceSmith, @BobJones}"
    tokens = parser.tokenize(mrsl)
    assert tokens is not None


def test_group_with_relationship(parser):
    mrsl = "{@JohnDoe, @AliceSmith} -> schema:attendee -> @event/Conference2023"
    tokens = parser.tokenize(mrsl)
    assert tokens is not None


def test_relationship_with_source(parser):
    mrsl = """(@JohnDoe -> org:headOf -> @org/AcmeInc)[
        dc:source=@doc/AnnualReport2023
    ]"""
    result = parser.parse(mrsl)
    assert isinstance(result, AST)
    assert len(result.relations) == 1
    relation = result.relations[0]
    assert relation.subject.name == '@JohnDoe'
    assert relation.predicate == 'org:headOf'
    assert relation.object.name == '@org/AcmeInc'
    assert relation.context is not None
    assert relation.context.source is not None
    assert relation.context.source.name == '@doc/AnnualReport2023'


def test_token_types(parser):
    """Verify that specific tokens are correctly identified"""
    mrsl = "@JohnDoe -> org:headOf -> @org/AcmeInc"
    tokens = parser.tokenize(mrsl)
    # Check first three tokens
    assert tokens[0] == (TokenType.ENTITY, "@JohnDoe")
    assert tokens[1] == (TokenType.ARROW, "->")
    assert tokens[2] == (TokenType.PROPERTY, "org:headOf")


def test_error_handling(parser):
    """Test that invalid syntax raises an exception"""
    invalid_mrsl = "@JohnDoe -> $$invalid$$ -> @org/AcmeInc"
    with pytest.raises(SyntaxError):
        parser.tokenize(invalid_mrsl)


def test_basic_entity():
    parser = MRSLParser()
    result = parser.parse('@person/JohnDoe[name:full = "John Doe"]')
    assert isinstance(result, AST)
    assert len(result.entities) == 1
    entity = result.entities[0]
    assert entity.name == '@person/JohnDoe'
    assert len(entity.properties) == 1
    assert entity.properties[0].name == 'name:full'
    assert entity.properties[0].value == '"John Doe"'


def test_nested_entity():
    parser = MRSLParser()
    result = parser.parse('''
        @person/JohnDoe[
            name:full = "John Doe",
            works:at = @org/AcmeInc[
                name:legal = "Acme Inc",
                type:org = "Corporation"
            ]
        ]
    ''')
    assert isinstance(result, AST)
    assert len(result.entities) == 1
    entity = result.entities[0]
    assert entity.name == '@person/JohnDoe'
    assert len(entity.properties) == 2
    work_prop = entity.properties[1]
    assert work_prop.name == 'works:at'
    assert isinstance(work_prop.value, Entity)
    assert work_prop.value.name == '@org/AcmeInc'
    assert len(work_prop.value.properties) == 2


def test_relation():
    parser = MRSLParser()
    result = parser.parse('(@person/JohnDoe -[manages]-> @org/AcmeInc)')
    assert isinstance(result, AST)
    assert len(result.relations) == 1
    relation = result.relations[0]
    assert relation.predicate == 'manages'
    assert relation.subject.name == '@person/JohnDoe'
    assert relation.object.name == '@org/AcmeInc'


def test_group_relation():
    parser = MRSLParser()
    result = parser.parse('''
        ({@person/JohnDoe, @person/AliceSmith}
         -[attended]->
         @event/Conference2023)
    ''')
    assert isinstance(result, AST)
    assert len(result.relations) == 1
    relation = result.relations[0]
    assert relation.predicate == 'attended'
    assert isinstance(relation.subject, Group)
    assert len(relation.subject.entities) == 2
    assert relation.subject.entities[0].name == '@person/JohnDoe'
    assert relation.subject.entities[1].name == '@person/AliceSmith'
    assert relation.object.name == '@event/Conference2023'


def test_multiple_statements():
    parser = MRSLParser()
    result = parser.parse('''
        @person/JohnDoe[name:full = "John Doe"]
        (@person/JohnDoe -> org:memberOf -> @org/AcmeInc)
        @org/AcmeInc[name:legal = "Acme Inc"]
    ''')
    assert isinstance(result, AST)
    assert len(result.entities) == 2
    assert len(result.relations) == 1
    # Check entities
    assert result.entities[0].name == '@person/JohnDoe'
    assert result.entities[1].name == '@org/AcmeInc'
    # Check relation
    assert result.relations[0].predicate == 'org:memberOf'
    assert result.relations[0].subject.name == '@person/JohnDoe'
    assert result.relations[0].object.name == '@org/AcmeInc'


def test_invalid_syntax():
    parser = MRSLParser()
    with pytest.raises(SyntaxError):
        parser.parse('@person/JohnDoe[name:full]')  # Missing equals and value
    with pytest.raises(SyntaxError):
        parser.parse('@person/JohnDoe[name:full = ]')  # Missing value
    with pytest.raises(SyntaxError):
        parser.parse('(@person/JohnDoe @org/AcmeInc)')  # Missing arrow


def test_whitespace_handling():
    parser = MRSLParser()
    compact = '@person/JohnDoe[name:full="John Doe"]'
    spaced = '''
        @person/JohnDoe [
            name:full = "John Doe"
        ]
    '''
    result1 = parser.parse(compact)
    result2 = parser.parse(spaced)
    assert isinstance(result1, AST)
    assert isinstance(result2, AST)
    assert result1.entities[0].name == result2.entities[0].name
    assert result1.entities[0].properties[0].name == result2.entities[0].properties[0].name
    assert result1.entities[0].properties[0].value == result2.entities[0].properties[0].value


def test_property_types():
    parser = MRSLParser()
    result = parser.parse('''
        @person/JohnDoe[
            name:full = "John Doe",
            age:years = 30,
            joined:date = 2023-01-15,
            active:flag = true
        ]
    ''')
    assert isinstance(result, AST)
    assert len(result.entities) == 1
    entity = result.entities[0]
    assert len(entity.properties) == 4
    assert entity.properties[0].value == '"John Doe"'  # String
    assert entity.properties[1].value == '30'  # Number
    assert entity.properties[2].value == '2023-01-15'  # Date
    assert entity.properties[3].value == 'true'  # Boolean


def test_to_rdf_simple_entity(parser):
    mrsl = """@person/JohnDoe[
        name:full = "John Doe",
        age:years = 30
    ]"""
    result = parser.parse(mrsl)
    rdf = parser.to_rdf(result)
    # Normalize line endings in both strings before comparing
    expected = """@prefix pers: <http://example.org/person#> .
@prefix org: <http://example.org/org#> .
@prefix geo: <http://example.org/geo#> .
@prefix schema: <http://schema.org/> .
@prefix loc: <http://example.org/loc#> .
@prefix name: <http://example.org/name#> .
@prefix time: <http://example.org/time#> .
@prefix dc: <http://purl.org/dc/terms/> .
person:JohnDoe name:full ""John Doe"" .
person:JohnDoe age:years "30" .""".strip().replace('\r\n', '\n')
    # Compute the normalized value outside any f-string: backslashes inside
    # f-string expressions are a SyntaxError before Python 3.12.
    actual = rdf.strip().replace('\r\n', '\n')
    assert actual == expected


def test_to_rdf_nested_entity(parser):
    mrsl = """@org/AcmeInc[
        org:name = "Acme Inc",
        org:hasSite = @loc/NewYork[
            geo:location = "Manhattan"
        ]
    ]"""
    result = parser.parse(mrsl)
    rdf = parser.to_rdf(result)
    # Normalize line endings in both strings before comparing
    expected = """@prefix pers: <http://example.org/person#> .
@prefix org: <http://example.org/org#> .
@prefix geo: <http://example.org/geo#> .
@prefix schema: <http://schema.org/> .
@prefix loc: <http://example.org/loc#> .
@prefix name: <http://example.org/name#> .
@prefix time: <http://example.org/time#> .
@prefix dc: <http://purl.org/dc/terms/> .
org:AcmeInc org:name ""Acme Inc"" .
org:AcmeInc org:hasSite [
loc:NewYork geo:location ""Manhattan"" .
] .""".strip().replace('\r\n', '\n')
    assert rdf.strip().replace('\r\n', '\n') == expected


def test_to_html_simple_entity(parser):
    mrsl = """@person/JohnDoe[
        name:full = "John Doe",
        age:years = 30
    ]"""
    result = parser.parse(mrsl)
    html = parser.to_html(result)
    # Check for key HTML elements and classes
    assert '<div class="mrsl-block">' in html
    assert '<span class="mrsl-entity">@person/JohnDoe</span>' in html
    assert '<span class="mrsl-property">name:full</span>' in html
    assert '<span class="mrsl-value">"John Doe"</span>' in html
    assert '<span class="mrsl-property">age:years</span>' in html
    assert '<span class="mrsl-value">30</span>' in html
    assert '<span class="mrsl-bracket">[</span>' in html
    assert '<span class="mrsl-bracket">]</span>' in html


def test_to_html_nested_entity(parser):
    mrsl = """@org/AcmeInc[
        org:name = "Acme Inc",
        org:hasSite = @loc/NewYork[
            geo:location = "Manhattan"
        ]
    ]"""
    result = parser.parse(mrsl)
    html = parser.to_html(result)
    # Check structure and nesting
    assert '<span class="mrsl-entity">@org/AcmeInc</span>' in html
    assert '<span class="mrsl-entity">@loc/NewYork</span>' in html
    assert '<span class="mrsl-property">org:name</span>' in html
    assert '<span class="mrsl-property">geo:location</span>' in html
    # Check indentation
    assert '\n <span class="mrsl-property">' in html


def test_to_latex_simple_entity(parser):
    mrsl = """@person/JohnDoe[
        name:full = "John Doe",
        age:years = 30
    ]"""
    result = parser.parse(mrsl)
    latex = parser.to_latex(result)
    # Check LaTeX environment and escaping
    assert '\\begin{lstlisting}[language=MRSL,' in latex
    assert 'basicstyle=\\ttfamily,' in latex
    assert '@person/JohnDoe' in latex
    assert 'name:full = "John Doe"' in latex
    assert 'age:years = 30' in latex
    assert '\\end{lstlisting}' in latex


def test_to_latex_escaped_characters(parser):
    mrsl = """@org/Acme_Inc[
        org:name = "Acme_Inc #1",
        org:type = "B2B_Company"
    ]"""
    result = parser.parse(mrsl)
    latex = parser.to_latex(result)
    # Check proper escaping of special characters
    assert 'Acme\\_Inc' in latex
    assert 'B2B\\_Company' in latex
    assert '\\#1' in latex  # '#1' is present, with the '#' escaped


def test_output_methods_with_relation(parser):
    mrsl = """(@person/JohnDoe -[manages]-> @org/AcmeInc)[
        time:start = "2023-01-01"
    ]"""
    result = parser.parse(mrsl)
    # Test RDF output
    rdf = parser.to_rdf(result)
    assert 'person:JohnDoe manages org:AcmeInc' in rdf
    assert 'time:start "2023-01-01"' in rdf
    # Test HTML output
    html = parser.to_html(result)
    assert '<span class="mrsl-arrow">-[manages]-></span>' in html
    assert '<span class="mrsl-entity">@person/JohnDoe</span>' in html
    # Test LaTeX output
    latex = parser.to_latex(result)
    assert '-[manages]->' in latex
    assert '@person/JohnDoe' in latex
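
The RDF tests above compare serialized Turtle as whole strings, normalizing only line endings, so a change in indentation or blank lines in the serializer would break them. Below is a minimal sketch of a more forgiving comparison. The helpers `normalize_turtle` and `assert_turtle_equal` are hypothetical and not part of `mrsl_parser`. The approach is purely textual and does not parse RDF, so it is only meant for comparing output from the same serializer.

```python
import re
from typing import List


def normalize_turtle(text: str) -> List[str]:
    """Return the non-empty lines of text with layout differences removed.

    Line endings are unified and runs of whitespace are collapsed to a
    single space. Whitespace inside quoted literals is collapsed too,
    which is acceptable for a layout-insensitive test comparison only.
    """
    lines = text.replace('\r\n', '\n').split('\n')
    collapsed = (re.sub(r'\s+', ' ', line).strip() for line in lines)
    return [line for line in collapsed if line]


def assert_turtle_equal(actual: str, expected: str) -> None:
    """Compare two Turtle documents line by line, ignoring layout."""
    assert normalize_turtle(actual) == normalize_turtle(expected)


# Same statements, different layout and line endings
a = '@prefix dc: <http://purl.org/dc/terms/> .\r\n\r\n  ex:a   dc:source  ex:b .\r\n'
b = '@prefix dc: <http://purl.org/dc/terms/> .\nex:a dc:source ex:b .'
assert_turtle_equal(a, b)
```

A test such as `test_to_rdf_simple_entity` could then call `assert_turtle_equal(rdf, expected)` instead of comparing stripped strings directly.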