@prashant-shahi
Last active February 11, 2019 14:07

Regular Expressions (RegEx) to obtain the Dimensions Spec and Flatten Spec of the ingestionSpec for Kafka in Druid

Dimension Spec

Note: This assumes that you start with a flattened data sample.

For string

Find:

"([a-zA-Z\_]+)": ".+",

Replace:

"$1",

For boolean (treated as string)

Find:

"([a-zA-Z\_]+)": (false|true),

Replace:

"$1",

For int/long (treated as long)

Find:

"([a-zA-Z\_]+)": [0-9]+,

Replace:

{
	"name": "$1",
	"type": "long"
},

For double/float (treated as double)

Find:

"([a-zA-Z\_]+)": [0-9\.]+,

Replace:

{
	"name": "$1",
	"type": "double"
},
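The four find/replace steps above can also be scripted. The following is a minimal Python sketch, not part of the original workflow: the sample data and the names `sample`, `typed`, `STEPS`, and `to_dimensions` are made up for illustration. Python's `re.sub` writes the first capture group as `\1` where most editors use `$1`. The steps run in the order listed above, which matters because the double pattern `[0-9\.]+` also matches plain integers, so the long step has to run first.

```python
import re

# Hypothetical flattened sample: one "key": value pair per line.
sample = (
    '"user_name": "alice",\n'
    '"is_active": true,\n'
    '"login_count": 42,\n'
    '"avg_score": 3.75,\n'
)


def typed(type_name):
    # Replacement text for a typed dimension; \1 is the captured key.
    return '{\n\t"name": "\\1",\n\t"type": "%s"\n},' % type_name


# (pattern, replacement) pairs, in the same order as the steps above.
STEPS = [
    (r'"([a-zA-Z\_]+)": ".+",', r'"\1",'),          # string
    (r'"([a-zA-Z\_]+)": (false|true),', r'"\1",'),  # boolean, kept as string
    (r'"([a-zA-Z\_]+)": [0-9]+,', typed("long")),   # int/long
    (r'"([a-zA-Z\_]+)": [0-9\.]+,', typed("double")),  # double/float
]


def to_dimensions(text):
    # Apply every substitution in turn to the whole text.
    for pattern, replacement in STEPS:
        text = re.sub(pattern, replacement, text)
    return text


result = to_dimensions(sample)
print(result)
```

String and boolean keys come out as bare quoted names, while numeric keys become `name`/`type` objects, matching the replacements shown above.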

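Flattening Spec

For strings

Note: Apply this only to the bare string entries; the pattern would also match the quoted name inside the typed entries.

Find:

"([A-Za-z0-9_]+)",

Replace:

{
	"name": "$1",
	"type": "jq",
	"expr": ".$1"
},

For all other types

Find:

{
	"name": "([A-Za-z0-9_]+)",
	"type": "(float|double|long|string)"
},

Replace:

{
	"name": "$1",
	"type": "jq",
	"expr": ".$1"
},

Final adjustment for all fields

Find:

"expr": "\.([A-Za-z0-9]+)_([A-Za-z0-9]+)_([A-Za-z0-9]+)_([A-Za-z0-9]+)"

Replace:

"expr": ".$1.$2.$3.$4"

Note: This pattern handles names made of four underscore-separated parts. Add or remove one capture group and its matching $N at a time to cover other nesting depths.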
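The flattening steps can be sketched in Python as well. This is an illustrative alternative rather than the method described above: instead of repeating the find/replace once per nesting depth, it uses a replacement function that turns every underscore in the name into a dot. It assumes that names may contain underscores and that each underscore marks one nesting level, which will not hold for fields whose real names contain underscores. The sample input and the names `dimensions`, `to_field`, `TYPED`, `PLAIN`, and `to_flatten_fields` are hypothetical.

```python
import re

# Hypothetical dimension list, in the shape produced by the Dimension Spec steps.
dimensions = (
    '"user_address_city",\n'
    '{\n\t"name": "order_total",\n\t"type": "double"\n},\n'
    '"status",\n'
)


def to_field(match):
    # Build one jq flatten field; each underscore becomes a nesting level.
    name = match.group(1)
    expr = "." + name.replace("_", ".")
    return '{\n\t"name": "%s",\n\t"type": "jq",\n\t"expr": "%s"\n},' % (name, expr)


# Typed entries: { "name": ..., "type": float|double|long|string },
TYPED = re.compile(
    r'\{\s*"name": "([A-Za-z0-9_]+)",\s*"type": "(?:float|double|long|string)"\s*\},'
)
# Bare string entries; anchoring to the whole line keeps the pattern from
# matching the indented "name" line inside a typed entry.
PLAIN = re.compile(r'^"([A-Za-z0-9_]+)",$', re.MULTILINE)


def to_flatten_fields(text):
    text = TYPED.sub(to_field, text)
    return PLAIN.sub(to_field, text)


flattened = to_flatten_fields(dimensions)
print(flattened)
```

Each generated object omits the trailing comma after `"expr"`, since a trailing comma inside an object is not valid JSON.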
