Mark Brooks onefoursix

@onefoursix
onefoursix / dataops-sdc-svc-ingress.yaml
Created April 21, 2023 23:27
StreamSets DataOps Platform Kubernetes manifest for SDC + Service + Ingress with TLS all the way down
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: streamsets-deployment-cfa81f1d-baf1-4d7d-9136-e0114a083bc9
  name: streamsets-deployment-cfa81f1d-baf1-4d7d-9136-e0114a083bc9
  namespace: ns1
spec:
  replicas: 1
  selector:
onefoursix / dataops-backup.py
Last active March 11, 2023 03:01
A Python script to back up objects from StreamSets DataOps Platform
#!/usr/bin/env python
'''
This script exports Fragments, Pipelines, Jobs, and Job Templates from StreamSets DataOps Platform
The current version of this script does not export Connections, Tasks, or Topologies
Prerequisites:
- Python 3.6+; Python 3.9+ preferred
onefoursix / dataops_change_object_owners.py
Last active February 24, 2023 03:49
StreamSets DataOps Platform SDK script to update object owners
#!/usr/bin/env python
'''This script changes ownership of objects from an 'old' user to a 'new' user
in StreamSets DataOps Platform
Set DRY_RUN to True to generate a list of objects owned by the old user without making any changes.
Set DRY_RUN to False to actually change the ownership of objects from the 'old' to the 'new' user
Objects include:
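The DRY_RUN switch described in this docstring can be illustrated in plain Python. This is a minimal sketch, not the gist's code: the `change_owner` function and the dict-based objects are hypothetical stand-ins for Control Hub objects, and no StreamSets SDK calls are made.

```python
DRY_RUN = True  # True: only list matching objects; False: apply the change


def change_owner(objects, old_user, new_user, dry_run=DRY_RUN):
    """Return the objects owned by old_user, reassigning them unless dry_run.

    `objects` is a list of dicts with 'name' and 'owner' keys
    (a hypothetical stand-in for pipelines, jobs, and so on).
    """
    owned = [obj for obj in objects if obj["owner"] == old_user]
    for obj in owned:
        if dry_run:
            print(f"[dry run] would change owner of {obj['name']} to {new_user}")
        else:
            obj["owner"] = new_user
            print(f"changed owner of {obj['name']} to {new_user}")
    return owned
```

Running once with `dry_run=True` and reviewing the printed list before switching to `dry_run=False` keeps the destructive step explicit.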
onefoursix / start_or_restart_job_on_primary_sdc.py
Created February 4, 2023 07:36
A script that uses the StreamSets SDK for Python to explicitly place Jobs on specific SDCs even when HA is enabled
#!/usr/bin/env python
'''This Python script stops a Control Hub Job, removes a dynamic engine label from a "backup"
instance of SDC that the user does not want the Job placed on, starts the Job again,
and then restores the dynamic engine label to the same "backup" SDC. This process
allows the user to start or restart a Job while preventing it from starting up on the "backup"
SDC.
Prerequisites:
- A Python 3.4 or higher environment to run this script
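The stop, remove-label, start, restore-label sequence described in this docstring can be simulated without a Control Hub. The sketch below is an assumption-laden illustration: `FakeEngine`, `FakeJob`, and `restart_job_avoiding_backup` are hypothetical names, label matching is simplified to a list lookup, and none of this is the StreamSets SDK API.

```python
class FakeEngine:
    """Hypothetical stand-in for an SDC engine with a list of labels."""

    def __init__(self, name, labels):
        self.name = name
        self.labels = list(labels)


class FakeJob:
    """Hypothetical Job that starts on every engine carrying its label."""

    def __init__(self, label, engines):
        self.label = label
        self.engines = engines
        self.running_on = []

    def start(self):
        self.running_on = [e.name for e in self.engines if self.label in e.labels]

    def stop(self):
        self.running_on = []


def restart_job_avoiding_backup(job, backup_engine):
    """Restart job so that it cannot be placed on backup_engine."""
    job.stop()
    backup_engine.labels.remove(job.label)  # backup no longer matches the Job
    try:
        job.start()  # placement now excludes the backup engine
    finally:
        # Restore the label even if the start fails, so the backup
        # engine stays available for later failover.
        backup_engine.labels.append(job.label)
```

The `try`/`finally` is a design choice of this sketch: it guarantees the backup engine gets its label back whether or not the Job starts.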
onefoursix / run-job-on-dynamic-engine-aws.py
Last active October 27, 2022 23:00
Python Script to run a StreamSets DataOps Platform Job on a dynamically deployed Engine on AWS
#!/usr/bin/env python
'''This Python script creates and starts a DataOps Platform AWS Deployment of a single
SDC Engine, starts a Job that runs on that Engine, waits for the Job to complete, then
stops and deletes the SDC Engine and Deployment.
Prerequisites:
- An AWS Environment configured in DataOps Platform in an Active state
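The "waits for the Job to complete" step in this docstring is typically a polling loop with a timeout. Here is a minimal sketch under stated assumptions: `wait_for_completion` and `get_status` are hypothetical names, and the status strings are placeholders rather than the statuses Control Hub actually reports.

```python
import time


def wait_for_completion(get_status, done=("FINISHED", "FAILED"),
                        timeout_s=600, poll_s=5, sleep=time.sleep):
    """Poll get_status() until it returns a status in `done`.

    get_status is any zero-argument callable returning a status string.
    The status names in `done` are placeholders, not real Control Hub values.
    Raises TimeoutError if no terminal status is seen within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in done:
            return status
        sleep(poll_s)
    raise TimeoutError(f"job did not complete within {timeout_s} seconds")
```

Calling the teardown of the Engine and Deployment in a `finally` block around this wait would ensure the cloud resources are removed even when the wait times out.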
onefoursix / run_job_on_ephemeral_sdc-sch-3.x.py
Last active May 27, 2024 15:41
Python script to spin up an SDC instance on K8s, run a Job on that SDC, wait for the Job to complete, and then tear down the deployed SDC
#!/usr/bin/env python
'''This Python script deploys an instance of SDC on Kubernetes using a Control Agent,
starts a Job that runs on that instance, waits for the Job to complete, then deletes the
Control Hub Deployment, which tears down the deployed SDC.
This version of the script runs against Control Hub 3.x
Prerequisites:
onefoursix / sdc-http-server-jmeter.jmx
Created September 6, 2022 18:31
JMeter script to test SDC HTTP Server only
<?xml version="1.0" encoding="UTF-8"?>
<jmeterTestPlan version="1.2" properties="5.0" jmeter="5.5">
<hashTree>
<TestPlan guiclass="TestPlanGui" testclass="TestPlan" testname="Test Plan" enabled="true">
<stringProp name="TestPlan.comments"></stringProp>
<boolProp name="TestPlan.functional_mode">false</boolProp>
<boolProp name="TestPlan.tearDown_on_shutdown">true</boolProp>
<boolProp name="TestPlan.serialize_threadgroups">false</boolProp>
<elementProp name="TestPlan.user_defined_variables" elementType="Arguments" guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
<collectionProp name="Arguments.arguments"/>
onefoursix / sdc-custom-http-client.jython
Created July 28, 2022 17:01
Jython script for a custom HTTP client origin that handles offsets and pagination
## Example Scripting Origin to call a REST API with pagination and offset handling
# Imports
try:
    sdc.importLock()
    import sys
    ## Set this path to where we can load the Requests module
    sys.path.append('/usr/local/lib/python3.9/site-packages')
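Offset-based pagination of the kind this gist's description mentions can be sketched in plain Python. This is an illustration only, not the gist's logic: `read_pages` and `fake_fetch` are hypothetical, the in-memory `DATA` list stands in for a REST API, and no SDC scripting objects or the Requests module are used.

```python
def read_pages(fetch_page, start_offset=0, page_size=2):
    """Yield (records, next_offset) for each page until a page comes back empty.

    fetch_page(offset, limit) is any callable returning a list of records.
    Persisting next_offset after each page lets a restarted reader resume
    where it left off instead of re-reading from the beginning.
    """
    offset = start_offset
    while True:
        records = fetch_page(offset, page_size)
        if not records:
            break
        offset += len(records)
        yield records, offset


# Hypothetical in-memory "API" used only to exercise the loop.
DATA = list(range(5))


def fake_fetch(offset, limit):
    return DATA[offset:offset + limit]
```

For example, `list(read_pages(fake_fetch, start_offset=4))` resumes at the last saved offset and returns only the final record.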
onefoursix / get-oracle-cdc-metrics-dataops.sh
Last active September 30, 2022 04:01
Bash script to monitor Oracle CDC Lag Time on StreamSets DataOps Platform
#!/usr/bin/env bash
# DataOps Platform URL
export SCH_URL=https://na01.hub.streamsets.com
# SDC URL - The SDC where the Job is running
export SDC_URL=http://<host>:<port>
# CRED_ID -- Your API Credential CRED_ID.
export CRED_ID=<redacted>
onefoursix / get-oracle-cdc-metrics-sch3.sh
Last active September 30, 2022 04:00
Bash script to monitor Oracle CDC Lag Time on StreamSets Control Hub 3.x
#!/usr/bin/env bash
# Control Hub 3.x URL
export SCH_URL=https://cloud.streamsets.com
# SDC URL - The SDC where the Job is running
export SDC_URL=https://<sdc-host>:<sdc-port>
# Control Hub User in the form <user>@<org>
export SCH_USER=