@jeremyboggs
Created November 7, 2018 14:35
require 'json'
require 'yaml'
require 'front_matter_parser'
require 'ruby-progressbar'

desc "Create corpus for search"
file './corpus.json' => ['./', *Rake::FileList['collections/**/*.md'].exclude('./ISSUE_TEMPLATE.md', './PULL_REQUEST_TEMPLATE.md', './README.md', './index.md', './code_of_conduct.md')] do |task|
  # Only the Markdown prerequisites are indexed; the './' prerequisite is skipped.
  sources = task.sources.grep(/\.md$/)

  # Size the bar to the real workload so it advances once per parsed file.
  progressbar = ProgressBar.create(
    title: "creating corpus",
    format: "\e[0;35m%t: |%B|\e[0m",
    total: sources.size
  )

  # Plain YAML.load is used so front matter values are not limited to the
  # safe-load whitelist; only run this on trusted files. Under Psych 4 and
  # later, YAML.load itself is restricted to safe types.
  unsafe_loader = ->(string) { YAML.load(string) }

  corpus = sources.map do |path|
    parsed = FrontMatterParser::Parser.parse_file('./' + path, loader: unsafe_loader)
    progressbar.increment
    {
      id: path.pathmap('%n'), # file name without directory or extension
      title: parsed.front_matter["title"],
      name: parsed.front_matter["name"],
      author: parsed.front_matter["author"],
      date: parsed.front_matter["date"],
      categories: parsed.front_matter["categories"],
      url: parsed.front_matter["slug"],
      layout: parsed.front_matter["layout"],
      content: parsed.content,
    }
  end

  # Write the whole corpus as a single JSON array.
  File.write(task.name, JSON.generate(corpus))
  progressbar.finish
end
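
The task above only produces `corpus.json`; how it is consumed is not shown in the gist. As a rough sketch of one possible consumer, the snippet below round-trips a small sample through JSON and runs a naive substring match over it. The sample records and the `search_corpus` helper are hypothetical and only mirror the record shape the task emits (`id`, `title`, `categories`, `url`, `content`); a real site would more likely hand the file to a client-side search library.

```ruby
require 'json'

# Hypothetical sample shaped like the records the task writes to corpus.json.
sample_corpus = [
  { id: 'first-post', title: 'First Post', categories: ['news'],
    url: 'first-post', content: 'Hello search world.' },
  { id: 'second-post', title: 'Second Post', categories: ['notes'],
    url: 'second-post', content: 'Nothing to see here.' }
]

# Round-trip through JSON, as a consumer reading corpus.json would.
# After parsing, the hash keys are strings rather than symbols.
records = JSON.parse(JSON.generate(sample_corpus))

# Naive case-insensitive substring match over title and content
# (an illustrative helper, not part of the original task).
def search_corpus(records, term)
  needle = term.downcase
  records.select do |record|
    [record['title'], record['content']].compact.any? do |text|
      text.to_s.downcase.include?(needle)
    end
  end
end

matches = search_corpus(records, 'search')
puts matches.map { |record| record['id'] }.inspect
```

To run it against the real output, replace the sample with `JSON.parse(File.read('corpus.json'))` once `rake corpus.json` has been run.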