🔏
Focusing on NiFi security...

Andy LoPresto alopresto

@alopresto
alopresto / Multiline_split.xml
Created February 17, 2017 06:15
Generates a flowfile with example CSV multiline text with newlines inside quoted strings. Replaces the newlines inside quotes with an escaped character, then splits each line. Includes LogAttribute processors for debugging.
<?xml version="1.0" ?>
<template encoding-version="1.0">
<description>Generates a flowfile with example CSV multiline text with newlines inside quoted strings. Replaces the newlines inside quotes with an escaped character, then splits each line. Includes LogAttributes processors to debug. </description>
<groupId>4aa66ece-015a-1000-4348-efd94f63c4df</groupId>
<name>ReplaceText and SplitText</name>
<snippet>
<connections>
<id>015a1001-919e-1aa8-0000-000000000000</id>
<parentGroupId>4aa66ece-015a-1000-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
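
The replace-then-split idea described for this template can be illustrated outside NiFi. The following is a minimal Python sketch, not the template's actual processor configuration: the sample text, the function names, and the choice of a literal `\n` as the placeholder are all assumptions made for illustration, and the regex assumes fields do not contain escaped quotes.

```python
import re

# Hypothetical sample input: the first record has a newline inside a quoted field.
SAMPLE = 'id,comment\n1,"first line\nsecond line"\n2,"single line"\n'

# Matches one double-quoted field (assumes no escaped quotes inside the field).
QUOTED_FIELD = re.compile(r'"[^"]*"')


def escape_quoted_newlines(text, placeholder="\\n"):
    """Replace real newlines inside quoted fields with a two-character placeholder."""
    return QUOTED_FIELD.sub(
        lambda match: match.group(0).replace("\n", placeholder), text
    )


def split_records(text):
    """Escape newlines inside quotes first, then split on the remaining newlines."""
    return escape_quoted_newlines(text).splitlines()


records = split_records(SAMPLE)
for record in records:
    print(record)
```

Once the newlines inside quotes are escaped, every remaining newline is a record boundary, so a plain line split is safe.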
@alopresto
alopresto / ListFiles.xml
Created February 18, 2017 00:08
Lists files from a directory, then updates the attributes so a processor can run a command against "pairs" of files (e.g. sam.txt and sam.txt.gz) from a single flowfile.
<?xml version="1.0" ?>
<template encoding-version="1.0">
<description>Lists files from a directory, then updates the attributes so a processor can run a command against "pairs" of files (i.e. sam.txt and sam.txt.gz) from a single flowfile. </description>
<groupId>4e3e7f99-015a-1000-b05e-27a45fe36d70</groupId>
<name>ListFiles</name>
<snippet>
<connections>
<id>015a1012-dfe4-1e6f-0000-000000000000</id>
<parentGroupId>4e3e7f99-015a-1000-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
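
The notion of file "pairs" in the description above can be sketched in a few lines of Python. This is only an illustration of the pairing rule (a file plus the same name with `.gz` appended), not how the template itself does it; the function name, the `suffix` parameter, and the sample listing are hypothetical.

```python
def pair_files(filenames, suffix=".gz"):
    """Return (name, name + suffix) for every file whose suffixed sibling is also listed."""
    names = set(filenames)
    return sorted((name, name + suffix) for name in names if name + suffix in names)


# Hypothetical directory listing; only sam.txt has a matching .gz sibling.
listing = ["sam.txt", "sam.txt.gz", "other.txt"]
pairs = pair_files(listing)
print(pairs)
```

Files without a matching sibling are simply left out of the result, so a downstream command only ever sees complete pairs.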
@alopresto
alopresto / Split_multiple_CSV.xml
Created February 20, 2017 20:17
Generates a flowfile with content provided in StackOverflow question and splits into multiple CSV files. Split regexes are specific to prescribed input but can be modified to be generic.
<?xml version="1.0" ?>
<template encoding-version="1.0">
<description>Generates a flowfile with content provided in StackOverflow question and splits into multiple CSV files. Split regexes are specific to prescribed input but can be modified to be generic. </description>
<groupId>5d1d8d50-015a-1000-6eef-5c6f05d60cb7</groupId>
<name>Split multiple files from CSV</name>
<snippet>
<connections>
<id>015a1001-fcfd-1d1e-0000-000000000000</id>
<parentGroupId>5d1d8d50-015a-1000-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
@alopresto
alopresto / ExecuteStreamCommand_example.xml
Created February 27, 2017 20:28
Example of using rev and sed against flowfile content.
<?xml version="1.0" ?>
<template encoding-version="1.0">
<description>Example of using rev and sed against flowfile content. </description>
<groupId>81363c71-015a-1000-f0a6-b32176dacbe9</groupId>
<name>ExecuteStreamCommand Examples</name>
<snippet>
<connections>
<id>8138777c-015a-1000-0000-000000000000</id>
<parentGroupId>81363c71-015a-1000-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
@alopresto
alopresto / routehl7.xml
Created April 6, 2017 18:47
An Apache NiFi template which generates HL7 data and routes it based on a simple comparison.
<?xml version="1.0" ?>
<template encoding-version="1.1">
<description></description>
<groupId>3b737254-015b-1000-aee2-fe3d19b02179</groupId>
<name>RouteHL7</name>
<snippet>
<connections>
<id>015b1002-e563-1455-0000-000000000000</id>
<parentGroupId>3b737254-015b-1000-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
@alopresto
alopresto / Provenance Exhauster.xml
Created April 20, 2017 22:41
A flow which consists of 3 components -- a GenerateFlowFile with run schedule `0s`, an UpdateAttribute, and a LogAttribute. This flow generates tens of thousands of provenance events per second to stress test the provenance repository implementation.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<flowController encoding-version="1.1">
<maxTimerDrivenThreadCount>10</maxTimerDrivenThreadCount>
<maxEventDrivenThreadCount>5</maxEventDrivenThreadCount>
<rootGroup>
<id>7dfee5cb-015b-1000-a99c-3d5ff830eb26</id>
<name>NiFi Flow</name>
<position x="0.0" y="0.0"/>
<comment/>
<processor>
@alopresto
alopresto / flow.xml
Created May 22, 2017 22:11
An Apache NiFi flow definition which uses a `GetHTTP` and an `InvokeHTTP` processor to verify that the `SSLPeerUnverifiedException` issue is resolved when connecting to `https://googleapis.com`.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<flowController encoding-version="1.1">
<maxTimerDrivenThreadCount>10</maxTimerDrivenThreadCount>
<maxEventDrivenThreadCount>5</maxEventDrivenThreadCount>
<rootGroup>
<id>31d11ade-015c-1000-9dc6-0d21547d5a8d</id>
<name>NiFi Flow</name>
<position x="0.0" y="0.0"/>
<comment/>
<processor>
@alopresto
alopresto / scripted_lookup_record.xml
Created May 22, 2017 22:15
An Apache NiFi flow used to test the addition of a `ScriptedLookupRecord` component.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<flowController encoding-version="1.1">
<maxTimerDrivenThreadCount>10</maxTimerDrivenThreadCount>
<maxEventDrivenThreadCount>5</maxEventDrivenThreadCount>
<rootGroup>
<id>21c4d984-015c-1000-7c5f-95366fc0810e</id>
<name>NiFi Flow</name>
<position x="0.0" y="0.0"/>
<comment/>
<processor>
@alopresto
alopresto / inline_lookup.groovy
Created May 22, 2017 22:18
A Groovy script to implement the `ScriptedLookupRecord` functionality in Apache NiFi.
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
@alopresto
alopresto / time_format.xml
Last active May 25, 2017 18:53
Apache NiFi template that trims the extra characters from the input string to allow the Java SimpleDateFormat parser to convert the source to a date in GMT.
<?xml version="1.0" ?>
<template encoding-version="1.1">
<description>Trims the extra characters from the input string to allow the Java SimpleDateFormat parser to convert the source to a date in GMT. </description>
<groupId>40c4c83b-015c-1000-9023-21cd583de301</groupId>
<name>Time formatting</name>
<snippet>
<connections>
<id>d33d0164-1f51-3d15-0000-000000000000</id>
<parentGroupId>23c9f5f5-9b7a-372c-0000-000000000000</parentGroupId>
<backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>