@ebuildy
Created July 25, 2025 13:39
Meilisearch docs scraper
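services:
  # Meilisearch instance; the master key must match MEILISEARCH_API_KEY below.
  engine:
    image: getmeili/meilisearch:v1.15.0
    ports:
      - "7700:7700"
    environment:
      MEILI_MASTER_KEY: myMasterKey
  # Scraper that crawls the documentation site and pushes it to the engine.
  scraper:
    image: getmeili/docs-scraper:v0.12.12
    command:
      - sh
      - -c
      - |
        echo "Start scraper"
        pipenv run ./docs_scraper /opt/doc-admin.json
    environment:
      MEILISEARCH_HOST_URL: http://engine:7700
      MEILISEARCH_API_KEY: myMasterKey
    volumes:
      # Mount the current directory so doc-admin.json is readable at /opt/doc-admin.json.
      - ./:/opt/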
{
  "index_uid": "docs-admin",
  "start_urls": ["https://my-doc"],
  "sitemap_urls": ["https://my-doc/sitemap.xml"],
  "stop_urls": [],
  "selectors": {
    "lvl0": {
      "selector": "article h1",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": {
      "selector": "article h1",
      "global": true,
      "default_value": "Chapter"
    },
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "lvl5": "article h5",
    "lvl6": "article h6",
    "text": "article"
  }
}
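
Once the scraper has finished, one way to check that the `docs-admin` index was populated is to send a search query to the engine. The following is a minimal sketch, not part of the original gist: it assumes it runs on the Docker host (so the engine is reachable through the published port at `http://localhost:7700`) and reuses the `myMasterKey` value from the Compose file. The helper names `build_search_request` and `search_docs` are hypothetical; the only Meilisearch call used is the `POST /indexes/{index_uid}/search` route.

```python
import json
import urllib.request


def build_search_request(host, api_key, index_uid, query, limit=5):
    """Build a POST request for Meilisearch's /indexes/{index_uid}/search route."""
    url = f"{host.rstrip('/')}/indexes/{index_uid}/search"
    body = json.dumps({"q": query, "limit": limit}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def search_docs(host, api_key, index_uid, query, limit=5):
    """Send the search request and return the list of hits from the response."""
    request = build_search_request(host, api_key, index_uid, query, limit)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["hits"]
```

For example, `search_docs("http://localhost:7700", "myMasterKey", "docs-admin", "install")` should return a non-empty list if the scrape indexed pages matching "install". An empty list means either nothing matched or the scraper has not indexed anything yet.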