# syntax=docker/dockerfile:1
FROM python:3.8-slim-buster
WORKDIR /app
ENV ACCEPT_EULA=Y
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl gcc g++ gnupg unixodbc-dev
def dataframe_difference(df1, df2, which=None):
    """Find rows which are different."""
    comparison_df = df1.merge(df2,
                              indicator=True,
                              how='outer')
    if which is None:
        diff_df = comparison_df[comparison_df['_merge'] != 'both']
    else:
        diff_df = comparison_df[comparison_df['_merge'] == which]
    diff_df.to_csv('data/diff.csv')
This small subclass of pandas' SQLAlchemy-based SQL support for reading and storing tables uses the Postgres-specific COPY FROM method to insert large amounts of data into the database. It is much faster than using INSERT. To achieve this, the table is created in the normal way using SQLAlchemy, but no data is inserted. Instead, the data is saved to a temporary CSV file (using pandas' mature CSV support) and then read back into Postgres using psycopg2's support for COPY FROM STDIN.
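The core of the COPY step described above might look roughly like the sketch below. It is not the gist's actual subclass: the function name `copy_dataframe` is hypothetical, an in-memory `io.StringIO` buffer stands in for the temporary CSV file, and the target table is assumed to exist already with columns matching the DataFrame. The `connection` argument is assumed to be a psycopg2 connection.

```python
import io


def copy_dataframe(df, table_name, connection):
    """Bulk-load a DataFrame into an existing table via COPY ... FROM STDIN.

    Assumptions (not from the original gist): `connection` is a psycopg2
    connection, and `table_name` already exists with matching columns.
    """
    # Write the rows without index or header so each CSV line maps
    # directly onto the listed table columns.
    buffer = io.StringIO()
    df.to_csv(buffer, index=False, header=False)
    buffer.seek(0)

    # Quote identifiers so mixed-case column names survive.
    columns = ", ".join('"{}"'.format(col) for col in df.columns)
    sql = 'COPY "{}" ({}) FROM STDIN WITH (FORMAT csv)'.format(table_name, columns)

    # psycopg2's cursor.copy_expert streams the file-like object to Postgres.
    with connection.cursor() as cursor:
        cursor.copy_expert(sql, buffer)
    connection.commit()
    return sql
```

Because all rows travel in a single COPY statement rather than one INSERT per row or batch, the load avoids most per-statement overhead, which is where the speed-up comes from.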
#!/usr/bin/env python
import sys
import pandas as pd
import pymongo
import json


def import_content(filepath):
    mng_client = pymongo.MongoClient('localhost', 27017)