Last active
August 29, 2015 14:05
To use, run `ruby align.rb "path_to/audio.wav" "For J and C, with all my love"`
require 'shellwords'

# Matches the parenthesized timing group attached to each aligned word
timing_pattern = /\(([^)]+)\)/

if ARGV.length < 2
  puts "Needs 2 arguments: 1. audio path and 2. text that matches the audio"
else
  # Make variables from parameters
  audio_path = ARGV[0]
  text = ARGV[1]

  # Save the text to a file so Sphinx can read it later
  File.write("input.txt", text)

  # Run the align script; escape the path so spaces and shell characters are safe
  output = `java -ms400m -mx1500m -jar bin/aligner.jar #{Shellwords.escape(audio_path)} input.txt`

  # The relevant output is the fifth line printed by the aligner
  alignment = output.split("\n")[4]
  abort "No alignment line found in the aligner output" if alignment.nil?

  # Collect the start time of each word (the first value in its parenthesized group),
  # skipping any token that carries no timing
  times = alignment.split(' ').map do |word|
    match = word.match(timing_pattern)
    match && match[1].split(',')[0]
  end.compact

  # Output the start times as a bracketed, comma-separated string "array"
  puts "\n\nTimes: [#{times.join(', ')}]\n\n\n"
end
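
The parsing step can be tried on its own, without running the aligner. The sketch below is not part of the original script. `start_times` is a hypothetical helper, and the sample line is made up on the assumption that each aligned word looks like `word(start,end)`; the real aligner output may be formatted differently.

```ruby
# Hypothetical helper: returns the first comma-separated value from every
# parenthesized group in an alignment line. Tokens without parentheses
# contribute nothing to the result.
def start_times(line)
  line.scan(/\(([^)]+)\)/).flatten.map { |group| group.split(',').first.strip }
end

# Made-up sample line (an assumption, not real aligner output)
sample = 'for(0.52,0.71) j(0.71,0.93) and(0.93,1.10)'
puts "Times: [#{start_times(sample).join(', ')}]"
```

Using `scan` on the whole line, instead of matching token by token, means a token with no timing group is simply skipped rather than raising an error on a `nil` match.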