Created August 29, 2014 08:33
Currently, our skeleton doesn't instantiate everything
import datetime
import luigi
from spotify.luigi.crunch import ScrubJobTask, load_avsc
from spotify.luigi import HdfsTarget
from spotify.luigi.external_shrek_anonym import CreateEndSongCleaned


class SampleEndSongSubset(luigi.ExternalTask):

    def output(self):
        return HdfsTarget("/user/spotify-analytics-data/examples/data_pipeline_crunch/stream_count_anonym")


class Example1StreamCountJob(ScrubJobTask):
    """
    You can run this example from maven artifact:
    > greaserun --runner luigi com.spotify.data:spotify-data-crunch:LATEST --module stream_count --task Example1StreamCountJob
    or using your local build (uploaded to your edgenode):
    > greaserun --runner luigi myartifaaaaaact-0.1.2.3.4.5-jar-with-dependencies.jar --module stream_count --task Example1StreamCountJob
    """

    def main_class(self):
        return "mygrooooooooooooooooup.pipeline.Example1StreamCountJob"

    def requires(self):
        return {
            "input": SampleEndSongSubset()
        }

    def output(self):
        return HdfsTarget('stream_count', schema=load_avsc("ExamplePlaysByCountry.avsc"))
So this is the file that gets instantiated. I picked extra-silly artifact and group IDs to make it clear :)
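To make the gap concrete, here is a rough sketch of what "instantiating" the skeleton could look like. It is only an illustration: the `${artifact_id}`, `${group_id}` and `${version}` placeholder names, the use of Python's `string.Template`, and the `instantiate` and `find_hardcoded` helpers are all assumptions, not how the real skeleton works. It just shows that the substituted spots pick up the silly IDs while the first `greaserun` line stays hard-coded.

```python
from string import Template

# Hypothetical skeleton fragment. The ${...} placeholder names are assumptions
# made for this sketch; the real skeleton mechanism is not shown in the gist.
SKELETON = Template('''\
class Example1StreamCountJob(ScrubJobTask):
    """
    You can run this example from maven artifact:
    > greaserun --runner luigi com.spotify.data:spotify-data-crunch:LATEST --module stream_count --task Example1StreamCountJob
    or using your local build (uploaded to your edgenode):
    > greaserun --runner luigi ${artifact_id}-${version}-jar-with-dependencies.jar --module stream_count --task Example1StreamCountJob
    """

    def main_class(self):
        return "${group_id}.pipeline.Example1StreamCountJob"
''')


def instantiate(artifact_id, group_id, version):
    """Fill in the placeholders; anything without a placeholder is left as-is."""
    return SKELETON.substitute(
        artifact_id=artifact_id, group_id=group_id, version=version
    )


def find_hardcoded(rendered, markers):
    """Return the marker strings that still appear verbatim after instantiation."""
    return [marker for marker in markers if marker in rendered]


rendered = instantiate("myartifaaaaaact", "mygrooooooooooooooooup", "0.1.2.3.4.5")

# Strings we would expect a complete instantiation to have replaced.
leftovers = find_hardcoded(rendered, ["com.spotify.data:spotify-data-crunch:LATEST"])

print(rendered)
print("Not instantiated:", leftovers)
```

A check like `find_hardcoded` could run over the generated project to list whatever the skeleton still leaves untouched.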