export oracle tables to portable raw bytes via named pipe, delimiter separated format (tsv|csv|tabnull|etc), compressed and protected/encrypted by gpg
-
protect yourself and your system, do not run as
root
. -
curl -LO https://gist.github.com/kyle0r/10501a2078a0b1cd198891e971db5dd9/raw/run-export.sh
-
run
run-export.sh
overriding the env vars as needed e.g.:EDITOR=vim GPG_RECIPIENT_KEY=FFFFFF WORKING_DIR_PATH=/var/tmp/your-preferred-path /bin/bash run-export.sh
-
in theory the script should do the rest for you, or show you what is missing/wrong.
- follow the guidance in the script. There are multiple opportunities to abort the process and/or correct issues.
-
the script expects things to go wrong and is designed to be run multiple times until success is reached.
-
you can debug the script with
/bin/bash -x
-
bear in mind that the script spawns’ processes waiting to read on a named pipe, if nothing is written they will stall and wait in the background. In between invocations of
run-export.sh
Keep an eye onps xuf
for detached/zombie processes, and kill and clean them up.
- support specifying a specific oracle db hostname and port
- support selecting a specific database
- checksums and ways to verify the table contents are correctly exported. i.e. data quality assertions.
- e.g.
tee
the read from thefifo
, one output to the existing flow, and another for checksums/assertions.
- e.g.
looks like its related to what comes after the @
in the sqlplus
arguments. Here is a good DBA post with infos.
create-export-proc.sql->dump_dsv_fast
is a standalone generic oracle proc for exporting tables to raw delimiter separated files.
run-export.sh
is a helper/runner script that utilises dump_dsv_fast
, pigz|gzip
, sqlplus
, gpg
.
run-export.sh
auto-fetches the latest revision of create-export-proc.sql
if a copy doesn't exist in WORKING_DIR_PATH
oracle exp
and expdp
(data pump) tools export data in a propriety/binary format which is basically useless and non-interoperable with non-oracle systems, and often non-interoperable between varying oracle versions.
I had some valuable multi-billion record datasets in oracle systems that were being decommissioned, so I naturally wanted a backup for later use.
I wanted to export tables out of oracle in a platform independent (portable) way. i.e. raw bytes. I wanted to bypass issues and limitations with propriety/binary formats. The results could be compressed and encrypted with standard tools and used later in any system/platform. e.g. ingestion to druid.
directly, a Saturday afternoon+evening on 2021-01-23.
export oracle tables to portable raw bytes via named pipe, delimiter separated format (tsv|csv|tabnull|etc), compressed and protected/encrypted by gpg.
- this was created for a PCI Level 1 env processing card holder data (CHD).
- all written bytes should be encrypted (with a PCI DEK) and compressed with standard tools.
- there should be no temp/intermediate/transient files, written bytes should always be protected.
- an oracle proc (
create-export-proc.sql->dump_dsv_fast
) is created for exporting tables to raw bytes. - a named pipe is created.
- for each table object listed in
dbObjectList.txt
- a compression proc is started to read from the pipe, and waits for writes, it pipes out to the encryption proc.
- a encryption proc is started to read from the compression pipe.
- the encryption proc outputs to a table specific archive file.
- export raw bytes from oracle to the named pipe
oracle-exp.fifo
inspired by James Abley How to convert an Oracle .dmp into a more portable format (related gist: https://gist.github.com/jabley/3713603)
and jareeq oracle utl_file encoding from utf8
useful info tom post: https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:235814350980
useful info on oracle UTF8 https://community.oracle.com/tech/apps-infra/discussion/3514820/difference-between-al32utf8-and-utf8
useful info on sqlplus
shell scripting http://www.dbatoolz.com/t/sqlplus-s-shell-scripting.html
- read the scripts carefully and understand what is going on.
sqlplus
assumes that oracle is running locally on default ports.- note that if oracle starts writing to the fifo before a proc is reading the fifo, then oracle could present blocking IO issues.
- ensure you have enough storage space to dump the data in the
WORKING_DIR_PATH
. - ensure that the terminal locale/encoding matches that of the oracle raw output to avoid encoding issues.
- ensure that the output encoding for the proc matches that of the shell to avoid encoding issues. Controlled by
p_characterset
- ensure
sqlplus
is available in the shells$PATH
.