Andreas Motl amotl

@amotl
amotl / cratedb-cloud-mongodb-cdc.md
Created May 16, 2025 18:02
How do I optimally synchronize data between MongoDB and CrateDB?

To keep MongoDB data synchronized with CrateDB, use a Change Data Capture (CDC) integration, which is available as a managed feature in CrateDB Cloud. It keeps your MongoDB data continuously and efficiently synchronized with a table in CrateDB. Here’s a concise guide:


1. Use CrateDB Cloud’s MongoDB CDC Integration

CrateDB Cloud (preview feature, see docs) can continuously import and sync data from MongoDB (e.g., MongoDB Atlas) using Change Streams.
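
To illustrate the idea behind change-stream based synchronization, here is a minimal sketch in plain Python. It is not the CrateDB Cloud implementation. It assumes events shaped like MongoDB change events (`operationType`, `documentKey`, `fullDocument`, `updateDescription`) and uses an in-memory dictionary as a stand-in for the target table. The function name `apply_change_event` and the sample data are hypothetical, and only top-level fields are handled.

```python
from typing import Any

Row = dict[str, Any]


def apply_change_event(table: dict[Any, Row], event: dict[str, Any]) -> None:
    """Apply one change event to an in-memory stand-in for the target table."""
    op = event["operationType"]
    key = event["documentKey"]["_id"]
    if op in ("insert", "replace"):
        # The full document replaces whatever was stored under this key.
        table[key] = dict(event["fullDocument"])
    elif op == "update":
        # Updates carry only the delta: changed fields and removed fields.
        row = table.setdefault(key, {"_id": key})
        delta = event["updateDescription"]
        row.update(delta.get("updatedFields", {}))
        for field in delta.get("removedFields", []):
            row.pop(field, None)
    elif op == "delete":
        table.pop(key, None)
    # Other event types (e.g. "drop", "invalidate") are ignored in this sketch.


# Hypothetical sample events, in the order a change stream would deliver them.
events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "name": "sensor-a", "temp": 21.5}},
    {"operationType": "insert", "documentKey": {"_id": 2},
     "fullDocument": {"_id": 2, "name": "sensor-b", "temp": 19.0}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "updateDescription": {"updatedFields": {"temp": 22.0}, "removedFields": []}},
    {"operationType": "delete", "documentKey": {"_id": 2}},
]

table: dict[Any, Row] = {}
for event in events:
    apply_change_event(table, event)

print(table)
```

Replaying events in order keeps the target consistent with the source without re-reading whole collections, which is the main reason to prefer CDC over periodic full exports.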

Key Features:

amotl / compose.yml
Created May 6, 2025 18:21
Miniature rig for evaluating cratedb-cockpit on a non-root URL
services:
  nginx:
    image: nginx:1.27
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "8080:80"
    restart: unless-stopped
amotl / minigeocode.py
Created May 4, 2025 19:00
Miniature geocoder using a dedicated instance of Nominatim.
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.9"
# dependencies = [
#   "click",
#   "geopandas",
#   "geopy",
# ]
# ///
"""
amotl / uv_run_stuck_mcp.py
Last active April 3, 2025 19:02
Problem with `uv run` not running to completion.
#!/usr/bin/env python3
"""
Prerequisite:

  docker run --rm --name=cratedb \
    --publish=4200:4200 --publish=5432:5432 \
    --env=CRATE_HEAP_SIZE=2g crate/crate:nightly \
    -Cdiscovery.type=single-node

Variants:
amotl / pandas_cratedb_date_type.py
Created March 31, 2025 18:10
Investigate anomaly with pandas and CrateDB, re. storing `DATE` types that naturally do not use time zones.
"""
Investigate anomaly with pandas and CrateDB, re. storing `DATE` types that naturally do not use time zones.
https://github.com/crate/sqlalchemy-cratedb/issues/216
Setup: Please install dependency packages enumerated below.
Usage: Please toggle database connection URI per `dburi = this or that`.
Invoke: Just type `uv run pandas_cratedb_date_type.py`.
docker run --rm -it --name=cratedb --publish=4200:4200 --publish=5432:5432 --env=CRATE_HEAP_SIZE=2g crate/crate:nightly -Cdiscovery.type=single-node
docker run --rm -it --name=postgresql --publish=5433:5432 --env "POSTGRES_HOST_AUTH_METHOD=trust" postgres:17 postgres -c log_statement=all
amotl / bug1000.py
Created March 27, 2025 21:15
Limit of total columns [1000] in table [doc.total1000] exceeded
#!/usr/bin/env python3
# /// script
# requires-python = ">=3.9"
# dependencies = [
#   "sqlalchemy-cratedb",
# ]
# ///
"""
Usage:
amotl / cratedb_sqlalchemy_connect.py
Last active February 13, 2025 20:19
Validate CrateDB SQLAlchemy connect timeout
import sqlalchemy as sa

TIMEOUT = 0.00001

if __name__ == "__main__":
    engine = sa.create_engine('crate://localhost/', connect_args={"timeout": TIMEOUT})
    c = engine.connect()
    result = c.execute(sa.text("SELECT 42;"))
    print(result.all())
amotl / transfer_pull_requests.py
Last active February 10, 2025 23:50
Transfer pull requests from one repository to another
"""
# Transfer GitHub Pull Requests
## About
Transfer pull requests on GitHub from one repository to another.
## Details
Here: Transfer PRs closed by stale bot on the PyCaret repository,
modulo updates submitted by Dependabot, to the fork at sktime.
amotl / acquire_dataset_cached.py
Last active January 19, 2025 03:09
Concisely fetch data from remote resources in Python, with caching
#!/usr/bin/env python
"""
## About
Concisely fetch data from remote resources in Python, with caching.
## Synopsis
```
uv run acquire_dataset_cached.py
```
amotl / Universal_Declaration_of_Human_Rights.md
Created December 24, 2024 23:39
Universal Declaration of Human Rights