@NaotoKumagai
Last active November 19, 2016 10:39
Helper for creating schemas for Embulk input and output
#!/usr/bin/perl
my (@list, %seen);
for my $file (@ARGV) {
    open(my $in, '<', $file) or die "$file: $!";
    while (my $line = <$in>) {
        # For ordinary JSON, m/"(\w*?)":(?:"|\d)/g is enough.
        # The optional \$ also picks up keys that start with "$".
        for my $match ( $line =~ m/"(\$?\w*?)":\s*(?:"|\d|true|false)/g ) {
            # Record and print each key only the first time it appears.
            next if $seen{$match}++;
            push(@list, $match);
            print $match."\n";
        }
    }
    close($in);
    # Write the key list (for checking by eye).
    open(my $keys_out, '>', 'key_list.log') or die "$!";
    for my $key (@list) {
        print $keys_out $key."\n";
    }
    close($keys_out);
    # Generate the schema for Embulk input (helper).
    open(my $in_schema, '>', 'mixpanel_schema.log') or die "$!";
    my $i = 0;
    for my $key (@list) {
        print $in_schema " - {index: $i, name: $key, type: string}\n";
        $i++;
    }
    close($in_schema);
    # Generate the schema for output (BigQuery). Entries are joined with
    # commas so that no trailing comma follows the last one.
    open(my $out_schema, '>', 'bigquery_schema.log') or die "$!";
    print $out_schema "[\n";
    print $out_schema join(",\n", map {
        " {\n \"name\": \"$_\",\n \"type\": \"string\"\n }"
    } @list);
    print $out_schema "\n]";
    close($out_schema);
}
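
The script above finds keys with a regular expression. As an alternative, here is a minimal sketch that collects the same kind of key list with a real JSON parser. It assumes the input is one JSON object per line and uses the core module JSON::PP. The names `collect_keys`, `keys_from_lines`, and the sample records are hypothetical and exist only for illustration. Unlike the regex version, keys are visited in sorted order within each object (Perl hashes do not preserve document order), and keys with `null` values are included as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;    # core module since Perl 5.14

# Walk a decoded JSON value and record every hash key whose value is a
# scalar (string, number, boolean, or null). Nested objects and arrays
# are descended into instead of being recorded themselves.
sub collect_keys {
    my ($node, $seen, $list) = @_;
    if (ref $node eq 'HASH') {
        for my $key (sort keys %$node) {
            my $value = $node->{$key};
            if (ref $value eq 'HASH' || ref $value eq 'ARRAY') {
                collect_keys($value, $seen, $list);
            } else {
                # Keep only the first occurrence of each key.
                push @$list, $key unless $seen->{$key}++;
            }
        }
    } elsif (ref $node eq 'ARRAY') {
        collect_keys($_, $seen, $list) for @$node;
    }
}

# Decode each line as JSON and return the de-duplicated key list.
# Blank lines and lines that do not decode to an object or array are skipped.
sub keys_from_lines {
    my @lines = @_;
    my (%seen, @list);
    my $json = JSON::PP->new;
    for my $line (@lines) {
        next unless $line =~ /\S/;
        my $record = eval { $json->decode($line) };
        next unless ref $record;
        collect_keys($record, \%seen, \@list);
    }
    return @list;
}

# Hypothetical sample records (single-quoted so "$browser" is not interpolated).
my @sample = (
    '{"event":"view","properties":{"time":1,"$browser":"Chrome","paid":true}}',
    '{"event":"buy","properties":{"time":2,"amount":3}}',
);

# Prints, one per line: event, $browser, paid, time, amount
print "$_\n" for keys_from_lines(@sample);
```

To run this against real files instead of the sample, `keys_from_lines(<>)` reads every line of the files named on the command line.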