@crtag
Created February 20, 2025 02:41
OCI Data Pump Export File Reconstruction Script

Purpose: This script reconstructs Oracle Data Pump export files that were written to Object Storage in Swift format, where each file is stored as a series of segment objects. It combines the segments from a source bucket (accessed via PAR) into the original file, suitable for import with Oracle Data Pump (impdp). It can be adapted to other transfers that use the same segmented layout.

Prerequisites

Environment

  • OCI Cloud Shell or equivalent Bash shell environment with configured OCI CLI
  • Access to both buckets:
    • Read access to the source bucket (via PAR)
    • Write access to the target bucket (via IAM permissions)
  • curl utility installed

Required Permissions

  • Source bucket: Valid Pre-Authenticated Request (PAR) with read access to all segment objects
  • Target bucket: Write permissions for the authenticated OCI CLI user

Script Configuration

Required Parameters

  • SOURCE_URL: PAR URL of source bucket (no trailing slash)
  • TARGET_BUCKET: Name of the destination bucket
  • <index>: Command-line argument identifying which export file to reconstruct (Example: "01")

File Naming Convention

  • Source segments (OCI Swift format): export<index>.dmp_segments/aaaaaa, export<index>.dmp_segments/aaaaab, etc.
  • Target file format: export<index>.dmp
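
The naming convention above can be sketched as a small helper that builds the URL of a single segment. The function name `segment_url` and the `example.invalid` address are illustrative assumptions, not part of the script; substitute your real PAR URL.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): build the URL of one
# segment from the PAR base URL, the export index, and the segment name.
segment_url() {
    local base_url="$1" index="$2" segment="$3"
    printf '%s/export%s.dmp_segments/%s\n' "$base_url" "$index" "$segment"
}

# Placeholder base URL for illustration only (note: no trailing slash).
segment_url "https://example.invalid/par/o" 01 aaaaaa
segment_url "https://example.invalid/par/o" 01 aaaaab
```

The target object name for the same index would be export01.dmp.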

Usage

  1. Update the script with your specific values:

    TARGET_BUCKET="<Target bucket name>"
    SOURCE_URL="<Source PAR url including bucket name with no trailing slash>"
  2. Run the script with an index parameter:

    ./script.sh 01

Operation Overview

  1. Segment Discovery

    • Starts with segment "aaaaaa"
    • Performs HEAD requests to detect available segments
    • Collects metadata including file sizes
    • Progress reported every 100 segments
  2. File Transfer

    • Downloads each segment sequentially
    • Retries transient download failures (up to 3 retries with a 2-second delay)
    • Streams concatenated segments directly to target bucket
    • No intermediate file storage required
  3. Completion

    • Reports total segments processed
    • Shows total size transferred
    • Indicates success or failure status
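
The File Transfer step can be tried locally with a minimal sketch in which plain files stand in for segment downloads and `wc -c` stands in for the upload command. The helper name `stream_parts` and the demo files are assumptions made for illustration. The sketch also shows why `set -o pipefail` matters in Bash: without it, a pipeline reports only the exit status of its last command.

```shell
#!/bin/bash
# Sketch only: local files stand in for segment downloads, and the consumer
# at the end of the pipe stands in for the upload command.
set -o pipefail   # a failed producer now fails the whole pipeline

# Hypothetical helper: write each part to stdout in the order given and
# stop at the first part that cannot be read.
stream_parts() {
    local part
    for part in "$@"; do
        cat -- "$part" || return 1
    done
}

# Demo with two temporary parts named like segments.
demo_dir="$(mktemp -d)"
printf 'first-' > "${demo_dir}/aaaaaa"
printf 'second' > "${demo_dir}/aaaaab"

# The consumer sees one continuous byte stream; nothing is staged on disk
# between the producer and the consumer.
stream_parts "${demo_dir}/aaaaaa" "${demo_dir}/aaaaab" | wc -c
echo "pipeline status: $?" >&2

rm -rf "${demo_dir}"
```

If one of the parts were missing, `stream_parts` would return 1 and, because of `pipefail`, the pipeline status would be non-zero even though the consumer itself succeeded.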

Error Handling

  • Validates command-line arguments
  • Checks HTTP response codes
  • Checks that each reported segment size is numeric (a missing or non-numeric Content-Length is counted as 0)
  • Reports transfer status
  • Exits with appropriate status codes on failures
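
The response-code and size checks can be sketched as two small parsers that work on the header text printed by `curl -s -I`. The function names `http_status` and `content_length` are hypothetical, and the sample headers are made up for the demonstration. Matching on the status code rather than a full status line keeps the check independent of the HTTP version in the response.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the original script). Both take the raw
# header text of a HEAD response as their only argument.

# Print the status code of the last status line, e.g. "HTTP/1.1 200 OK"
# or "HTTP/2 200" both yield 200.
http_status() {
    printf '%s\n' "$1" | tr -d '\r' |
        awk '/^HTTP\// { code = $2 } END { print code }'
}

# Print the Content-Length value, or 0 when the header is absent.
content_length() {
    printf '%s\n' "$1" | tr -d '\r' |
        awk 'tolower($1) == "content-length:" { len = $2 }
             END { print (len == "" ? 0 : len) }'
}

# Made-up sample headers for illustration.
sample="$(printf 'HTTP/2 200\r\ncontent-type: application/octet-stream\r\ncontent-length: 10485760\r\n')"
echo "status: $(http_status "$sample")"
echo "size:   $(content_length "$sample")"
```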

Performance

  • Streaming transfer nearly eliminates disk usage, which matters in the OCI Cloud Shell environment
  • Network retry logic for resilience
  • Progress monitoring with timestamps
  • Memory-efficient segment tracking

Limitations

  • Assumes sequential segment naming (aaaaaa, aaaaab, etc.)
  • Requires valid PAR URL with proper access
  • TARGET_BUCKET must exist before script execution and the user must have access to write into it via oci CLI
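
The sequential naming assumption can be illustrated with a compact successor function. This is a sketch of the same idea as the script's `next_segment`, written differently; the name `next_name` is hypothetical, and it assumes segment names consist only of lowercase letters a to z.

```shell
#!/bin/bash
# Hypothetical helper: return the name that follows the given one, treating
# the name as a base-26 counter over the letters a..z.
next_name() {
    local name="$1" suffix="" last head
    while [ -n "$name" ]; do
        last="${name#"${name%?}"}"   # last character of the remaining name
        head="${name%?}"             # everything before it
        if [ "$last" = "z" ]; then
            # Wrap z to a and carry into the next position to the left.
            suffix="a${suffix}"
            name="$head"
        else
            # Advance this letter by one and finish.
            printf '%s%s%s\n' "$head" "$(printf '%s' "$last" | tr 'a-y' 'b-z')" "$suffix"
            return 0
        fi
    done
    # Every position wrapped, so the name grows by one character.
    printf 'a%s\n' "$suffix"
}

next_name aaaaaa
next_name aaaaaz
```

Whether real segment names ever grow past six characters is not stated in this document; the growth case is included only so the function is total.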

Example Output

$ ./script.sh 01
Collecting segments...
Checking segment: aaaaaa
2024-02-20 10:15:30 Milestone: 100 segments found, Total size: 1048576000 bytes
...
Transfer completed successfully.
Total segments: 150
Total size: 1572864000 bytes

License

MIT License

Copyright (c) 2025 [Alexey Novikov]

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

#!/bin/bash
# Require exactly one argument: the export file index
if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <index>" >&2
    echo "Example: $0 01" >&2
    exit 1
fi
INDEX="$1"
PREFIX="export${INDEX}.dmp"
SEGMENTS_PREFIX="${PREFIX}_segments"
TARGET_BUCKET="<Target bucket name>"
# Source PAR URL base
SOURCE_URL="<Source PAR url including bucket name with no trailing slash>"
# Return the name that follows the given segment name (aaaaaa -> aaaaab, aaaaaz -> aaaaba)
next_segment() {
    local curr=$1
    local len=${#curr}
    local i=$((len - 1))
    local new_segment="$curr"
    while [ $i -ge 0 ]; do
        local char="${curr:$i:1}"
        if [ "$char" = "z" ]; then
            # Wrap 'z' to 'a' and carry into the position to the left
            new_segment="${new_segment:0:$i}a${new_segment:$((i + 1))}"
            ((i--))
        else
            # Advance this character by one (via its ASCII code) and stop
            local next_char=$(printf "\\$(printf '%03o' $(( $(printf '%d' "'$char") + 1 )))")
            new_segment="${new_segment:0:$i}$next_char${new_segment:$((i + 1))}"
            break
        fi
    done
    # Every position wrapped: grow the name by one character
    if [ "$i" -lt 0 ]; then
        new_segment="a$new_segment"
    fi
    echo "$new_segment"
}
# Start with first segment
current_segment="aaaaaa"
total_size=0
segment_count=0
# User agent for curl requests
CURL_UA="Mozilla/5.0 (OCI-Migration-Script; ref=68747470733a2f2f6769746875622e636f6d2f6372746167)"
# Collect all segments first
echo "Collecting segments..." >&2
segments=()
while true; do
    echo "Checking segment: ${current_segment}" >&2
    # Probe the segment with a HEAD request before downloading anything
    SEGMENT_HEADERS=$(curl -s -I -A "${CURL_UA}" "${SOURCE_URL}/${SEGMENTS_PREFIX}/${current_segment}")
    # Accept a 200 status for any HTTP version (e.g. "HTTP/1.1 200 OK" or "HTTP/2 200")
    if ! echo "$SEGMENT_HEADERS" | grep -qE '^HTTP/[0-9.]+ 200'; then
        echo "No more segments found after ${current_segment}" >&2
        break
    fi
    # Get segment size from headers; fall back to 0 if it is missing or not numeric
    segment_size=$(echo "$SEGMENT_HEADERS" | grep -i "content-length:" | awk '{print $2}' | tr -d '\r')
    if [[ ! "$segment_size" =~ ^[0-9]+$ ]]; then
        segment_size=0
    fi
    segments+=("$current_segment")
    total_size=$((total_size + segment_size))
    segment_count=$((segment_count + 1))
    # Show milestone every 100 segments
    if [ $((segment_count % 100)) -eq 0 ]; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') Milestone: ${segment_count} segments found, Total size: ${total_size} bytes" >&2
    fi
    # Generate next segment name
    current_segment=$(next_segment "$current_segment")
done
# Stop here if nothing was found, so an empty object is never uploaded
if [ "${#segments[@]}" -eq 0 ]; then
    echo "No segments found under ${SEGMENTS_PREFIX}. Nothing to transfer." >&2
    exit 1
fi
echo "Found ${#segments[@]} segments. Starting transfer..." >&2
# Stream all segments, concatenated in order, into a single OCI upload.
# pipefail makes the pipeline report a failed download, not only the upload's exit status.
set -o pipefail
(
    for segment in "${segments[@]}"; do
        echo "Downloading segment: ${segment}" >&2
        # -f turns an HTTP error into a non-zero exit, so an error page is never written into the dump
        curl -sf --retry 3 --retry-delay 2 -A "${CURL_UA}" "${SOURCE_URL}/${SEGMENTS_PREFIX}/${segment}" || exit 1
    done
) | oci os object put -bn "${TARGET_BUCKET}" --name "${PREFIX}" --file -
status=$?
if [ $status -eq 0 ]; then
    echo "Transfer completed successfully." >&2
    echo "Total segments: ${segment_count}" >&2
    echo "Total size: ${total_size} bytes" >&2
else
    echo "Transfer failed with status: $status!" >&2
    exit 1
fi