@MichaelCurrin
Last active August 25, 2024 22:29
Jekyll - how to build a REST API

Serve your data as static JSON

How to make a read-only JSON REST API using Jekyll.

This doesn't need any Ruby plugins - you just use some built-in templating features in Jekyll 3 or 4.

You will end up with a single JSON file containing data for all pages on the site, and another JSON file of just posts. Alternatively, you can replace every HTML page and post with a JSON version.

Notes

Data source

You might have data stored in your Jekyll project, typically in a YAML, CSV or JSON file in the _data/ directory, or in the frontmatter of your pages. This tutorial shows you how to render that data as JSON pages.

The data files could be:

  • in version control (visible in Git and on GitHub)
  • dynamic (ignored by Git) - for example if you have a script to fetch API data and write to _data every time you deploy your site.
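For the dynamic case, the fetch script can be quite small. Here is a minimal Python sketch, assuming the upstream API returns JSON. The function names, the file name remote.json and the example URL are all made up for illustration and are not part of Jekyll.

```python
import json
import urllib.request
from pathlib import Path


def fetch_json(url: str):
    """Download and decode a JSON document (needs network access)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def write_data_file(records, data_dir: str = "_data", name: str = "remote.json") -> Path:
    """Write records to the data directory so Jekyll can read them at build time."""
    target = Path(data_dir) / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(records, indent=2) + "\n", encoding="utf-8")
    return target


# Typical use in a deploy script, before running the Jekyll build
# (hypothetical endpoint, replace with the API you actually consume):
#   write_data_file(fetch_json("https://example.com/api/items"))
```

With the default file name, the records would then be available in templates as site.data.remote.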

Why

This is useful if you want to make your data available for another service or for others to consume (like GitHub, Twitter and Facebook make their data available on APIs). Or perhaps you have a JavaScript single-page application that reads from your API backend to serve the app or build a search index.

Example

Given data in page frontmatter or a YAML data file:

---
my_data:
  - a: 1
    b: 2
  - a: 100
    b: 200
---

Or CSV data as:

a,b
1,2
100,200

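The rendered JSON page will look like this. The values are strings when the source is CSV, as shown here. The YAML integers above would come out as unquoted numbers instead.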

  • localhost:4000/foo.json
    [
      { "a": "1", "b": "2" },
      { "a": "100", "b": "200" }
    ]
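To see why CSV values end up quoted, here is a rough Python analogue of reading CSV rows and dumping them as JSON. This is only an illustration of the data shape, not how Jekyll implements it: every cell is read as text, so it is serialized as a string.

```python
import csv
import io
import json

CSV_TEXT = "a,b\n1,2\n100,200\n"


def csv_to_json(text: str) -> str:
    """Read CSV rows as dictionaries keyed by the header row and dump them as JSON."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(rows)


print(csv_to_json(CSV_TEXT))
```

If consumers need real numbers, convert on the consumer side or keep the source data in YAML or JSON.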

Static site note

Note that Jekyll is a static site generator. So your JSON API data will only be updated whenever you make a commit and the site rebuilds. If your API needs to have data which changes based on many users or needs to change in realtime, then you're better off using Node.js or Python for your API.

Summary of approach

Here we will use .json as the file extension, instead of the usual .md or .html. This means your HTTP header will be set correctly:

Content-Type	application/json
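Static servers generally pick the Content-Type from the file extension. As an illustration only, Python's standard library has such a lookup table. Your host's table may differ, and guess_content_type is a helper name I made up for this sketch.

```python
import mimetypes


def guess_content_type(filename: str) -> str:
    """Return the MIME type a typical static server would associate with this file name."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or "application/octet-stream"


for name in ("foo.json", "index.html"):
    print(name, "->", guess_content_type(name))
```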

We'll also make sure to set the layout to none, so that we get a pure JSON page without any HTML wrapped around it by a layout.

We'll use our YAML data to build the response. If you use the jsonify filter in Jekyll, you can convert data straight to JSON without worrying about for loops or JSON syntax.

I recommend using a Jekyll extension for your IDE, to handle Liquid syntax highlighting and recognizing frontmatter. Then make sure you choose jekyll as your formatter for your .json files.

Sitemap note

It looks like JSON files are excluded from a sitemap file by default. The plugin recognizes just HTML and Markdown files.

Resources

Taking it further

See also Data files in the Jekyll docs. You might iterate over data in a JSON or YAML file or files to build up output on a page. The data files are your database and then Jekyll builds your API. This is also a way to build a database JSON file to be consumed on the frontend, such as for search functionality.

Or you might store your data in the frontmatter of Jekyll Collections, which get outputted as pages similar to the Page example below. That way you can group your JSON data into collections and iterate over them easily.

All pages endpoint

Make a single JSON file containing data for all pages.

See also Post listing section.

Use site.pages if you want to list pages. This will exclude posts and collections.

  • api/pages.json
    ---
    layout: none
    ---
    [
      {%- for page in site.pages %}
        {
          "title":      {{- page.title | jsonify }},
          "url":        {{- page.url | relative_url | jsonify }}
        }
        {% unless forloop.last %},{% endunless %}
      {% endfor -%}
    ]
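Because the commas and brackets in this template are written by hand, it is worth checking that the built file parses. A small Python sketch of such a check follows. The sample string stands in for the rendered file and its pages are made up. In a real project you would read the file from the build output, by default _site/api/pages.json.

```python
import json


def check_json(text: str) -> list:
    """Parse a rendered endpoint; json.JSONDecodeError is raised if the output is not valid JSON."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array at the top level")
    return data


# Made-up sample of rendered output; a page without a title renders as null.
sample = '[{"title":"About","url":"/about/"},{"title":null,"url":"/assets/main.css"}]'
pages = check_json(sample)
print(len(pages), "entries")
```

A stray trailing comma, for example, makes json.loads raise, so this catches template mistakes before consumers do.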

You can do something similar with items in site.my_collection or all collections with site.collections.

Mix formats

If you mix your pages as HTML and JSON, you can add a filter to get only JSON files.

{% if post.ext == '.json' %}

This field works for posts only.

For pages, you'll have to use item.name (e.g. foo.md) and check if it ends with .json.
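A consumer can apply the same extension check to the URLs it gets back from the pages endpoint. A tiny Python sketch, with made-up sample entries and a helper name of my own:

```python
from pathlib import PurePosixPath


def is_json_url(url: str) -> bool:
    """True when the last path segment of the URL ends with .json."""
    return PurePosixPath(url).suffix == ".json"


# Made-up entries in the shape produced by the pages endpoint.
pages = [
    {"title": "Foo", "url": "/foo.json"},
    {"title": "About", "url": "/about/"},
]
json_pages = [page for page in pages if is_json_url(page["url"])]
print(json_pages)
```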

Pages as JSON files

Render each Markdown page as a JSON file, using frontmatter for data. Note that no HTML file is generated here.

  • _layouts/page.html - build up the JSON object, quoting all values and in the case of the body text we convert from markdown to HTML.
    ---
    layout: none
    ---
    {"title":{{ page.title | jsonify }},"content":{{ page.body | markdownify | jsonify }},"links":{{ page.links | jsonify }}}
  • foo.json - a Jekyll HTML/markdown page with a .json extension.
    ---
    layout: page
    title: Foo
    body: |
      Hello, **world**.
      I am here.
      
    links:
      - title: Go track on Exercism
        url: https://exercism.io/tracks/go
    
      - title: Go tour  welcome page
        url: https://tour.golang.org/welcome/1
    
      - title: Learn Go with Tests
        url: https://quii.gitbook.io/learn-go-with-tests/
    
      - title: Go for Python programmers
        url: https://golang-for-python-programmers.readthedocs.io/en/latest/index.html
    ---

Output file

Visit at http://localhost:4000/foo.json

{"title":"Foo","content":"<p>Hello, <strong>world</strong>.\nI am here.</p>\n","links":[{"title":"Go track on Exercism","url":"https://exercism.io/tracks/go"},{"title":"Go tour  welcome page","url":"https://tour.golang.org/welcome/1"},{"title":"Learn Go with Tests","url":"https://quii.gitbook.io/learn-go-with-tests/"},{"title":"Go for Python programmers","url":"https://golang-for-python-programmers.readthedocs.io/en/latest/index.html"}]}

Here it is in a more readable format:

{
  "title":"Foo",
  "content":"<p>Hello, <strong>world</strong>.\nI am here.</p>\n",
  "links": [
    {"title":"Go track on Exercism","url":"https://exercism.io/tracks/go"},
    {"title":"Go tour  welcome page","url":"https://tour.golang.org/welcome/1"},
    {"title":"Learn Go with Tests","url":"https://quii.gitbook.io/learn-go-with-tests/"},
    {"title":"Go for Python programmers","url":"https://golang-for-python-programmers.readthedocs.io/en/latest/index.html"}
  ]
}
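On the consuming side, a page like this is just a JSON object. Here is a minimal Python sketch that parses a shortened copy of the output and lists its links. The RAW string is pasted in so the example runs offline; a real client would fetch /foo.json instead. link_lines is a helper name of my own.

```python
import json

# Shortened copy of the rendered page, kept as a raw string so the \n escapes stay JSON escapes.
RAW = r'{"title":"Foo","content":"<p>Hello, <strong>world</strong>.\nI am here.</p>\n","links":[{"title":"Go track on Exercism","url":"https://exercism.io/tracks/go"}]}'


def link_lines(page: dict) -> list:
    """Format each link as 'title: url'; a page without links (null) gives an empty list."""
    return [f'{link["title"]}: {link["url"]}' for link in page.get("links") or []]


page = json.loads(RAW)
print(page["title"])
for line in link_lines(page):
    print(line)
```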

All posts endpoint

Make a single JSON file containing data for all posts.

Input file

  • api/posts.json
    ---
    layout: none
    ---
    [
      {%- for post in site.posts %}
        {
          "id":         {{- post.id | jsonify -}},
          "title":      {{- post.title | jsonify }},
          "date":       {{- post.date | jsonify }},
          "url":        {{- post.url | relative_url | jsonify }},
          "tags":       {{- post.tags | jsonify }},
          "categories": {{- post.categories | jsonify }}
        }
        {% unless forloop.last %},{% endunless %}
      {% endfor -%}
    ]

Add content too if you need that, but it will make the output a lot longer and the JSON file a lot bigger to download.

"content": {{- post.content | jsonify }}

You might want to put that through a Markdown to HTML converter filter if you want HTML.

Also note that Liquid code in the content won't get rendered - you'll get the raw code.

Output file

  • /api/posts.json
    [
        {
          "id":"/2020/05/15/meaning-and-recognition",
          "title":"Meaning and recognition",
          "date":"2020-05-15 00:00:00 +0200",
          "url":"/my-repo-name/2020/05/15/meaning-and-recognition.html",
          "tags":["reflection","motivation"],
          "categories":[]
        }
        ,
        // more posts...
    ]
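A client can then sort or index the listing itself. A short Python sketch, assuming dates stay in the format shown above. The second post is made up for the example, and the helper names are my own.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # matches "2020-05-15 00:00:00 +0200"

posts = [
    {
        "title": "Meaning and recognition",
        "date": "2020-05-15 00:00:00 +0200",
        "url": "/my-repo-name/2020/05/15/meaning-and-recognition.html",
        "tags": ["reflection", "motivation"],
    },
    {
        # Made-up second post, only here to make the sorting visible.
        "title": "Another post",
        "date": "2021-01-02 00:00:00 +0200",
        "url": "/my-repo-name/2021/01/02/another-post.html",
        "tags": ["reflection"],
    },
]


def newest_first(items):
    """Sort posts by their parsed date, most recent first."""
    return sorted(items, key=lambda p: datetime.strptime(p["date"], DATE_FORMAT), reverse=True)


def by_tag(items):
    """Map each tag to the URLs of the posts that carry it."""
    index = {}
    for post in items:
        for tag in post.get("tags") or []:
            index.setdefault(tag, []).append(post["url"])
    return index


print([p["title"] for p in newest_first(posts)])
print(by_tag(posts))
```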

Permalink

Optionally set a permalink as:

permalink: /api/posts/

Then you can access it as /api/posts/ instead of /api/posts.json.

Posts as JSON files

How to output each Markdown post into a JSON file.

Input file

Here we make a post in the usual _posts directory.

Compared with the page example above, the layout is much lighter and more flexible - it passes a single data object to jsonify without caring about individual keys, values and formatting.

The downside is that you have to be more verbose in your post frontmatter: you define each value once as a YAML anchor with & and then reuse it as an alias with *. Based on this YAML tutorial.

Also note that, unlike in the page example above, we don't convert the body content from Markdown to HTML.

  • _layouts/post.html - a layout which uses frontmatter and converts it to JSON output.
    ---
    layout: none
    ---
    {{ page.data | jsonify }}
  • _posts/2020-12-12-hello.json - a post with a .json extension.
    ---
    layout: post
    
    title: &title Hello world
    categories: &categories
      - Go
      - Python
    content: &content |
      Hello, **world**
      I am here
    
    data:
      title: *title
      categories: *categories
      content: *content
    ---

Output file

Visit at http://localhost:4000/go/python/2020/12/12/hello.json

{"title":"Hello world","categories":["Go","Python"],"content":"Hello, **world**\nI am here\n"}
@rudSarkar

Wow! Thanks, This is what I am looking for.

@RyanTG

RyanTG commented Mar 27, 2022

Thanks for this.

I have a question: We have a blog with 90 posts on it. We want to retain the blog on the website, but also have json files so we can serve the posts on our app. What is the correct process in this case?

It seems we would need to retain the .md post files, and then create new .json files for each post. And... put the .json post files in a different folder than _posts, so the urls don't conflict?

EDIT: My solution was to use Collections, putting the json posts in a collection. ok, now I have to convert 90 posts to json!!

@oDinZu

oDinZu commented Apr 25, 2022

Thank You!

Now to make that API public facing with nginx :)

@MichaelCurrin
Author

@RyanTG maybe not relevant any more for you, but instead of 90 md files and 90 JSON files, just make a single JSON file which contains data for all posts. See my all-posts endpoint.

Using Collections means posts are no longer posts so then you lose the posts functionality and rules (like putting a date in the title) and treat it as a collection which is okay because posts are a collection anyway in Jekyll.

@MichaelCurrin
Author

@csharpee if you use modern workflows, you can skip Nginx.

Whether you have a standard HTML Jekyll site, or JSON API files generated by Jekyll, or a mix of both, the principle is the same.

You can run a build yourself and serve content with Nginx as a static server. Or use a service like GitHub Pages or Netlify which will build your site and then serve the content for you. They probably use Nginx internally but no need to manage that configuration or server yourself.

So just add a .json file to your GitHub repo and push then wait for your JSON API to be available publicly.

@oDinZu

oDinZu commented Apr 26, 2022

@MichaelCurrin thanks for the heads up.

I want to use the API as a separate authenticated Jekyll admin dashboard...Jekyll Admin is a plugin that piggybacks off of Jekyll serve and I don't want to have Jekyll public facing.

I don't use GitHub (only a mirror) or Netlify like headless CMS. I have my own git server that auto builds jekyll site and generates a static website, then serves that static website with nginx.

Managing the nginx myself isn't all that difficult, but it is an extra step, plus, once you have a solid image, we can easily duplicate that server and update the nginx conf ssl certs and domain details in nginx conf. This allows different scalable hosting options for the clients, but I do have to do that extra step and also pay for the hosting myself.

@MichaelCurrin
Author

A weakness of Jekyll Admin is that it is built for local use and does not have a user authentication or management aspect, so then you have to add security yourself to the admin endpoints and have Jekyll server running on the remote.

Also the admin does not commit file changes, it only changes them on disk, so it is not a good flow I think.

I did enjoy using Forestry as a CMS for Jekyll (and Hugo and Gatsby etc.) that feels similar to Jekyll Admin or WordPress and lets you add a couple of users on the free tier.
Any change you make will be committed to the repo and then your usual service (if you used Netlify etc.) would handle the static site.

There's Netlify CMS which is similar but requires you to write your configuration as code rather than using a UI.

@RyanTG

RyanTG commented May 12, 2022

just make a single JSON file which contains data for all posts. See my all-posts endpoint.

I did do that, as well. My goal was to have a list of all posts in a single file, and then you click a post title and it loads that individual post. For that single file of all posts, I'd rather not include the post content, but just title, date, tags. But yeah, it's cool to have a full list.

At any rate, I do have 90 json files now. The issue I'm finding is that these individual posts don't have the correct asset paths. As an example, the json contains ![Screenshot](/assets/img/Screenshot-1.jpg#500) rather than the full remote path. I'm not sure if I did something wrong, or if this is expected.

@MichaelCurrin
Author

MichaelCurrin commented May 13, 2022

@RyanTG

You can do all posts and then pick the fields you want instead of the full post object.

Can you share your repo or a snippet in a gist?

If your HTML page has the full path with | relative_url to get /my-repo/assets/img/Screenshot-1.jpg then that content is used for the JSON too in "content".

@RyanTG

RyanTG commented May 13, 2022

@MichaelCurrin Thanks. I do have the all posts json just showing what I want:

For the asset paths issue, maybe I'm just approaching this wrong. I'm following the "Posts as JSON Files" section above. However, I don't want to mess with the .md files in my _posts folder, because then the json outputs are displayed on the website index files (https://blog.pinballmap.com/). So, instead I created a separate _json collection containing json versions of the md files (which I made manually, for the most part) and then they get stored safely in here: https://github.com/RyanTG/pbm-jekyll/tree/master/_site/json

I don't have relative_url set, so hopefully that's just my issue!

@MichaelCurrin
Author

MichaelCurrin commented May 22, 2022

Yes, add relative_url to the url like this. And if you treat id as a URL too, then make it relative as well.

    {
      "id":         {{- post.id | jsonify -}},
      "title":      {{- post.title | jsonify }},
      "date":       {{- post.date | jsonify }},
      "url":        {{- post.url | relative_url | jsonify }},
      "categories": {{- post.categories | jsonify }}
    }

I've updated my gist code too.

@RyanTG

RyanTG commented May 23, 2022

Thanks.

My issue is with image links in the body of my posts when I use the Posts as JSON Files endpoint. Maybe I'm just not understanding this sentence

If your HTML page has the full path with | relative_url to get /my-repo/assets/img/Screenshot-1.jpg then that content is used for the JSON too in "content".

Are you saying to use the generated html page as my content for the json file? I think I'd rather have my json be based on the markdown. Plus, my html pages are currently not showing the full image paths, even with relative_url in the image markdown. I could just hardcode the full paths into my md files. But then I wouldn't be able to see the images while drafting posts and viewing them locally. I think that has to be my solution, though, since {{ page.data | jsonify }} is literally just grabbing the markdown file's content. Sorry to hijack.

@MichaelCurrin
Author

Oh sorry forgot about the image link problem.

My solution won't help with that unfortunately.

When it comes to getting the content of a Markdown page or HTML page with {{ content }} or {{ post.content }} or {{ page.data }} for your JSON data, there is no way to get the Liquid code to render. You can use a Markdown to HTML filter if you want that, but any Liquid like | relative_url will stay as literal code and not get rendered. It's a limitation of Jekyll I've come up against before.

So you'd have to hardcode your image URLs to have the base URL in them.

Or do something on your frontend with JS once the page has loaded to go through each img element and prepend the base url.
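The same prepend step could also run outside the browser, for example in a script that post-processes the content before it is displayed. Here is a rough Python sketch of the idea, assuming root-relative paths like /assets/... in either HTML attributes or Markdown links. The function name and the example base URLs are made up.

```python
import re

# src="/..." or href="/...", but not protocol-relative "//..." URLs.
HTML_ATTR = re.compile(r'\b(src|href)="/(?!/)')
# Markdown link or image target that starts with a single slash.
MARKDOWN_LINK = re.compile(r"\]\(/(?!/)")


def prepend_base_url(content: str, base_url: str) -> str:
    """Prefix root-relative URLs in HTML or Markdown content with base_url."""
    base = base_url.rstrip("/")
    content = HTML_ATTR.sub(lambda m: f'{m.group(1)}="{base}/', content)
    return MARKDOWN_LINK.sub(lambda m: f"]({base}/", content)


print(prepend_base_url('<img src="/assets/img/Screenshot-1.jpg">', "/my-repo"))
print(prepend_base_url("![Screenshot](/assets/img/Screenshot-1.jpg#500)", "https://example.com/blog"))
```

Note this is not idempotent when the base is itself a path like /my-repo, so run it only once per piece of content.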

@oDinZu

oDinZu commented May 23, 2022

Hi 👋 @RyanTG

I hope I understand your question.

I haven't had any trouble with img link paths.

My Jekyll generates an index.json for pages, posts and products, but I ain't using that JSON as an endpoint with authentication.

It is only "read only" permissions and I am NOT using Jekyll as a server. Jekyll is only a SSG for my stack. I would like to have a headless cms that is specific to Jekyll and not JavaScript pancakes like all these insane complex work environments jumping on the bus with FB react.

I don't know why Jekyll team would create a Jekyll Admin that is dependent on Jekyll public facing 24/7; all that is needed for a headless cms api to work is simple secret key with oauth for the Git server. The database would then become the git server i.e. GitHub, Gitea, etc... I was going to call it Hyden so we would have Hyde and Jekyll...and Hyde would be hiding Jekyll cause Hyde is front-end...plus a play on Dr. Jekyll and Mr. Hyde.

I started to do this myself, but Strapi has a huge community and has built something more complicated than I personally need, but it is what I was envisioning when it comes to hosting our own headless cms with Jekyll.

The Liquid jsonify command is super powerful and makes creating .json files easy, the difficult part is the authentication.

My headless cms admin dashboard pushes to the Git server, then my CI/CD server builds the Jekyll site from Gitea, pushes the www-data to www-data branch, compresses the www-data for download link on hosting server and then rsyncs the www-data to my hosting server with HTTPS via CI/CD.

---
layout: none
permalink: /api/posts/
---
[
  {%- for post in site.posts %}
    {
      "id":         {{- post.id | jsonify -}},
      "url":        {{- post.url | jsonify }},
      "image":      {{- post.banner_image | jsonify -}},
      "image-alt":  {{- post.banner_image_alt | jsonify -}},
      "sub_heading":{{- post.sub_heading | jsonify -}},
      "title":      {{- post.title | jsonify -}},
      "author":     {{- post.author | jsonify -}},
      "date":       {{- post.date | jsonify -}},
      "tags":       {{- post.tags | jsonify -}},
      "categories": {{- post.categories | jsonify -}},
      "content":    {{- post.content | jsonify -}}
    }
    {% unless forloop.last %},{% endunless %}
  {% endfor -%}
]

The generated index.json from posts api.

[
    {
      "id":"/blog/tutorials/2021/12/27/setup-nginx-https-web-server-with-lets-encrypt-plus-strapi-4.0-headless-cms",
      "url":"/blog/tutorials/2021/12/27/setup-nginx-https-web-server-with-lets-encrypt-plus-strapi-4.0-headless-cms/",
      "image":"/uploads/2021/santa-rudolph-unsplash.webp",
      "image-alt":"Qt5 Compile",
      "sub_heading":"Static Websites with CMS",
      "title":"Setup a Secure NGINX HTTPS Web Server with Let's Encrypt + Strapi 4.0 Headless CMS",
      "author":"Charles",
      "date":"2021-12-27 00:00:00 +0000",
      "tags":["Linux, Strapi, Nginx, JAMstack"],
      "categories":["Tutorials"],
      "content":"<h2 id=\
...

@MichaelCurrin
Author

Thanks for the additional info - glad this gist is providing a discussion and sharing.

The problem I understand was with image paths within the page content - if the site is on a subpath like abc.github.io/my-repo/ and you use relative_url then I think that gets lost.

@RyanTG

RyanTG commented May 24, 2022

Correct - yes, I was focused on content. Your solutions seem the best (hard code or prepend later). Thanks!
