Data formats are used to serialize data into common formats that can be manipulated and transmitted between systems. The most common data formats are plain-text formats, which combine human readability with machine readability.
The most common plain text data formats are:
- JavaScript Object Notation (JSON)
- YAML Ain't Markup Language (YAML), originally "Yet Another Markup Language"
- eXtensible Markup Language (XML)
The key benefits of these data formats are that they are:
- Structured for computers: the specifications build in machine readability.
- Annotated for humans: laid out logically and interpretable by humans.
- Open and extensible: open-source and open-ended.
- Self-describing: they can be read and easily understood.
- Platform agnostic: they can be used between different systems with minimal requirements.
- Long-lived: because of all of the above, they are likely to remain in use for a long time.
XML is an old language and shares many commonalities with HTML. It is primarily used by legacy programs or programs that use SOAP to communicate. It uses tags to structure data in a tree-like fashion. An element is written as `<TAG>DATA</TAG>`, with matching names in the opening and closing tags, and elements can be nested within each other.
The main issue with XML is that it is very verbose, which makes data files long and can have an impact on performance.
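
As a rough illustration of the verbosity point, the sketch below serializes one small record as both XML and JSON using Python's standard library and compares the lengths. The record and its field names are hypothetical, and the size gap will vary with real data.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, used only for this comparison.
record = {"FirstName": "Ada", "LastName": "Lovelace"}

# Build <Person> with one child element per field.
person = ET.Element("Person")
for key, value in record.items():
    ET.SubElement(person, key).text = value
xml_text = ET.tostring(person, encoding="unicode")

# Serialize the same record as compact JSON (no optional whitespace).
json_text = json.dumps(record, separators=(",", ":"))

print(len(xml_text), xml_text)
print(len(json_text), json_text)
```

Every XML field name appears twice (once in the opening tag and once in the closing tag), which is where most of the extra length comes from.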
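Common components:
- The prolog declares which version of XML is used and how the document is encoded.
- An element is defined by a pair of opening and closing tags.
- Attributes are `name="value"` pairs placed inside an opening tag.
- Empty elements have no content and can be written as a single self-closing tag, `<TAG/>`.
- Parents are elements that contain other elements.
- Children are elements nested inside a parent.
- Siblings are children of the same parent.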
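
The components listed above can be seen by walking a small document with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch; the `Team`, `Member`, `Name`, and `Retired` names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the names and values are placeholders.
doc = """<?xml version="1.0" encoding="UTF-8"?>
<Team>
  <Member Id="1">
    <Name>Ada</Name>
    <Retired/>
  </Member>
  <Member Id="2">
    <Name>Grace</Name>
  </Member>
</Team>"""

# Parse from bytes so the encoding named in the prolog is honoured.
root = ET.fromstring(doc.encode("utf-8"))

members = list(root)               # children of <Team>, siblings of each other
first = members[0]
retired = first.find("Retired")    # an empty element

print(root.tag)                    # the parent element's name
print(first.tag, first.attrib)     # element name and its attributes
print(first.find("Name").text)     # text held by a child element
print(retired.text, len(retired))  # empty element: no text, no children
```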
Example of Code:
<?xml version="1.0" encoding="UTF-8"?>
<People>
  <Person Id="1">
    <FirstName>Matt</FirstName>
    <LastName>Steele</LastName>
    <Email>[email protected]</Email>
  </Person>
  <Person Id="2">
    <FirstName>Steve</FirstName>
    <LastName>Jobs</LastName>
    <Email>[email protected]</Email>
  </Person>
</People>
JSON is the most popular data format today, largely because it derives from the popular JavaScript programming language and is widely supported as a result. It is also lightweight and does not require whitespace.
JSON uses `key: value` pairs and is structured using arrays `[ARRAY]` and objects `{OBJECT}`. Values can be strings, numbers (integers or floating point), booleans, or null.
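
To make the whitespace and value-type points concrete, here is a small sketch using Python's standard `json` module. The record is hypothetical. Note that JSON has a single number type (which Python splits into `int` and `float`) and also a `null` value.

```python
import json

# Hypothetical record covering the basic JSON value types.
compact = '{"name":"Ada","age":36,"score":9.5,"active":true,"email":null}'
pretty = """
{
    "name": "Ada",
    "age": 36,
    "score": 9.5,
    "active": true,
    "email": null
}
"""

a = json.loads(compact)
b = json.loads(pretty)

# Whitespace between tokens does not change the parsed data.
print(a == b)

# Show which Python type each JSON value was parsed into.
for key, value in a.items():
    print(key, type(value).__name__)
```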
Rules of JSON
- Each `{OBJECT}` is made up of `key:value` pairs.
- After each `{OBJECT}` in an `[ARRAY]` there must be a comma, EXCEPT after the last `{OBJECT}`.
- Each `key:value` pair must be followed by a comma, EXCEPT for the last `key:value` pair.
- `{OBJECT}` and `[ARRAY]` can be nested inside each other.
- To reference a value, follow its parent's notation: use `.` for a key in an `{OBJECT}` and `[NUM]` for an item in an `[ARRAY]`. These can be combined to reference further down an object tree.
To reference an object in an array use `KEY[NUMBER]`, and to call out a specific `key:value` in that object use `KEY[NUMBER].KEY`.
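
The referencing rules can be sketched in Python, where a parsed object becomes a `dict` and an array becomes a `list`. Python writes `["KEY"]` where the dotted notation writes `.KEY`, and list indexes start at 0. The data below is hypothetical.

```python
import json

# Hypothetical data: an array of objects stored under one key.
text = """
{"People": [
    {"id": "1", "Name": {"FN": "Ada", "LN": "Lovelace"}},
    {"id": "2", "Name": {"FN": "Grace", "LN": "Hopper"}}
]}
"""
data = json.loads(text)

people = data["People"]                      # KEY: the whole array
second = data["People"][1]                   # KEY[NUMBER]: the second object
last_name = data["People"][1]["Name"]["LN"]  # KEY[NUMBER].KEY.KEY: a nested value

print(len(people))
print(second["id"])
print(last_name)
```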
JSON Example
{"People":
  [
    {
      "id": "1",
      "FirstName": "Matt",
      "LastName": "Steele",
      "Email": "[email protected]",
      "Active": true
    },
    {
      "id": "2",
      "FirstName": "Steve",
      "LastName": "Jobs",
      "Email": "[email protected]",
      "Active": true
    },
    {
      "id": "3",
      "Name": {"FN": "Steve", "LN": "Jobs"},
      "Email": "None",
      "Active": false
    }
  ]
}
YAML is the most lightweight, succinct, and legible plain-text data format, and it is often used for configuration files. YAML relies on indentation and key-value pairs, and it supports strings, booleans, integers, and floating-point numbers.
A single file can hold multiple documents; each document starts with `---` and can be ended with `...`.
A collection is a group of objects, with each item marked by `-`.
Comments are supported and are marked with `#`.
The number one rule is to be human readable.
---
ANumber: 145
AFloat: 19.8075
AString: Hello this is a piece of string!
  this is valid string inside of YAML.
AQuotedString: "Quotes are perfectly acceptable."
Active: FALSE
ON: True
...
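
The `---` and `...` document markers and `#` comments can be illustrated with plain string handling. Python's standard library has no YAML parser, so the sketch below is not one: the hypothetical `split_documents` helper only separates a stream into raw document text and drops full-line comments. Real code would normally hand the file to a YAML library instead.

```python
def split_documents(stream):
    """Split a YAML stream into raw document strings.

    Simplified sketch: only bare '---' and '...' lines are treated as
    markers, and any line starting with '#' is dropped as a comment.
    """
    documents = []
    current = []
    for line in stream.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # full-line comment
        if stripped in ("---", "..."):
            # A marker closes whatever document has been collected so far.
            if current:
                documents.append("\n".join(current))
                current = []
        else:
            current.append(line)
    if current:
        documents.append("\n".join(current))
    return documents


# Hypothetical two-document stream.
stream = """---
# first document
Name: first
Port: 8080
...
---
Name: second
Port: 9090
...
"""

docs = split_documents(stream)
for number, doc in enumerate(docs, start=1):
    print(f"document {number}:")
    print(doc)
```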