@alexanderdean
Created February 27, 2013 08:42
Specs2 parallel test issues with Scalding 0.8.3
╭─alex@nasqueron ~/Development/SnowPlow/snowplow/3-etl/hadoop-etl ‹feature/scalding-etl›
╰─$ sbt
Detected sbt version 0.12.1
Starting sbt: invoke with -help for other options
[info] Loading global plugins from /home/alex/.sbt/plugins
[info] Loading project definition from /home/alex/Development/SnowPlow/snowplow/3-etl/hadoop-etl/project
[info] Set current project to snowplow-hadoop-etl (in build file:/home/alex/Development/SnowPlow/snowplow/3-etl/hadoop-etl/)
snowplow-hadoop-etl > test-only com.snowplowanalytics.snowplow.hadoop.etl.jobs.CorruptedCfLinesTest
13/02/27 08:40:20 INFO property.AppProps: using app.id: 83265FFB1D12AEF2BE02B7B711912163
13/02/27 08:40:20 INFO util.Version: Concurrent, Inc - Cascading 2.0.7
13/02/27 08:40:20 INFO flow.Flow: [com.snowplowanalytics....] starting
13/02/27 08:40:20 INFO flow.Flow: [com.snowplowanalytics....] source: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.0804209113563844"]"]
13/02/27 08:40:20 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.5725539372041639"]"]
13/02/27 08:40:20 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.13081274806378096"]"]
13/02/27 08:40:20 INFO flow.Flow: [com.snowplowanalytics....] parallel execution is enabled: true
13/02/27 08:40:20 INFO flow.Flow: [com.snowplowanalytics....] starting jobs: 1
13/02/27 08:40:20 INFO flow.Flow: [com.snowplowanalytics....] allocating threads: 1
13/02/27 08:40:20 INFO flow.FlowStep: [com.snowplowanalytics....] starting step: local
[info] CorruptedCfLinesTest
[info]
[info] A job which processes a corrupted input line should
[info] + write an error JSON containing the input line and all errors
[info]
[info]
[info] Total for specification CorruptedCfLinesTest
[info] Finished in 13 ms
[info] 1 example, 0 failure, 0 error
[info]
[info] Passed: : Total 1, Failed 0, Errors 0, Passed 1, Skipped 0
[success] Total time: 3 s, completed 27-Feb-2013 08:40:21
snowplow-hadoop-etl > test-only com.snowplowanalytics.snowplow.hadoop.etl.jobs.DiscardableCfLinesTest
13/02/27 08:40:34 INFO property.AppProps: using app.id: B5C1A20EF5E2D5875F692B2EB7DD8601
13/02/27 08:40:34 INFO util.Version: Concurrent, Inc - Cascading 2.0.7
13/02/27 08:40:34 INFO flow.Flow: [com.snowplowanalytics....] starting
13/02/27 08:40:34 INFO flow.Flow: [com.snowplowanalytics....] source: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.23493547701688933"]"]
13/02/27 08:40:34 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.2996217903309317"]"]
13/02/27 08:40:34 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.5718465172114834"]"]
13/02/27 08:40:34 INFO flow.Flow: [com.snowplowanalytics....] parallel execution is enabled: true
13/02/27 08:40:34 INFO flow.Flow: [com.snowplowanalytics....] starting jobs: 1
13/02/27 08:40:34 INFO flow.Flow: [com.snowplowanalytics....] allocating threads: 1
13/02/27 08:40:34 INFO flow.FlowStep: [com.snowplowanalytics....] starting step: local
[info] DiscardableCfLinesTest
[info]
[info] A job which processes expected but discardable CloudFront input lines should
[info] + silently discard those input lines
[info]
[info]
[info] Total for specification DiscardableCfLinesTest
[info] Finished in 38 ms
[info] 1 example, 0 failure, 0 error
[info]
[info] Passed: : Total 1, Failed 0, Errors 0, Passed 1, Skipped 0
[success] Total time: 2 s, completed 27-Feb-2013 08:40:35
snowplow-hadoop-etl > test-only com.snowplowanalytics.snowplow.hadoop.etl.jobs.InvalidLinesTest
13/02/27 08:40:41 INFO property.AppProps: using app.id: 45D8C97D97A5D1FC2A748416A9B5889A
13/02/27 08:40:41 INFO util.Version: Concurrent, Inc - Cascading 2.0.7
13/02/27 08:40:41 INFO flow.Flow: [com.snowplowanalytics....] starting
13/02/27 08:40:41 INFO flow.Flow: [com.snowplowanalytics....] source: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.042205460646965176"]"]
13/02/27 08:40:41 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.4219761887323268"]"]
13/02/27 08:40:41 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.1987333402149215"]"]
13/02/27 08:40:41 INFO flow.Flow: [com.snowplowanalytics....] parallel execution is enabled: true
13/02/27 08:40:41 INFO flow.Flow: [com.snowplowanalytics....] starting jobs: 1
13/02/27 08:40:41 INFO flow.Flow: [com.snowplowanalytics....] allocating threads: 1
13/02/27 08:40:41 INFO flow.FlowStep: [com.snowplowanalytics....] starting step: local
[info] InvalidLinesTest
[info]
[info] A job which processes input lines not in CloudFront format should
[info] + write an error JSON with input line and error message for each input line
[info]
[info]
[info] Total for specification InvalidLinesTest
[info] Finished in 13 ms
[info] 1 example, 0 failure, 0 error
[info]
[info] Passed: : Total 1, Failed 0, Errors 0, Passed 1, Skipped 0
[success] Total time: 2 s, completed 27-Feb-2013 08:40:42
snowplow-hadoop-etl > test-only com.snowplowanalytics.snowplow.hadoop.etl.jobs.BadTrackerLinesTest
13/02/27 08:40:56 INFO property.AppProps: using app.id: DDDA10FB587B549B3E443CEB5508FBA0
13/02/27 08:40:56 INFO util.Version: Concurrent, Inc - Cascading 2.0.7
13/02/27 08:40:56 INFO flow.Flow: [com.snowplowanalytics....] starting
13/02/27 08:40:56 INFO flow.Flow: [com.snowplowanalytics....] source: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.4110761072822968"]"]
13/02/27 08:40:56 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.6017258606821916"]"]
13/02/27 08:40:56 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.08296774212717806"]"]
13/02/27 08:40:56 INFO flow.Flow: [com.snowplowanalytics....] parallel execution is enabled: true
13/02/27 08:40:56 INFO flow.Flow: [com.snowplowanalytics....] starting jobs: 1
13/02/27 08:40:56 INFO flow.Flow: [com.snowplowanalytics....] allocating threads: 1
13/02/27 08:40:56 INFO flow.FlowStep: [com.snowplowanalytics....] starting step: local
[info] BadTrackerLinesTest
[info]
[info] A job which processes input lines containing corrupted data from the tracker should
[info] + write error JSONs, each containing an input line and the errors
[info]
[info]
[info] Total for specification BadTrackerLinesTest
[info] Finished in 12 ms
[info] 1 example, 0 failure, 0 error
[info]
[info] Passed: : Total 1, Failed 0, Errors 0, Passed 1, Skipped 0
[success] Total time: 3 s, completed 27-Feb-2013 08:40:57
snowplow-hadoop-etl > test
13/02/27 08:41:03 INFO property.AppProps: using app.id: E1C4AFAAFFB1EA6FE1813653537F4865
13/02/27 08:41:04 INFO util.Version: Concurrent, Inc - Cascading 2.0.7
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] starting
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] source: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.23481012710242333"]"]
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.7506051133380082"]"]
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.13412958998903823"]"]
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] parallel execution is enabled: true
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] starting jobs: 1
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] allocating threads: 1
13/02/27 08:41:04 INFO flow.FlowStep: [com.snowplowanalytics....] starting step: local
[info] CorruptedCfLinesTest
[info]
[info] A job which processes a corrupted input line should
[info] + write an error JSON containing the input line and all errors
[info]
[info]
[info] Total for specification CorruptedCfLinesTest
[info] Finished in 12 ms
[info] 1 example, 0 failure, 0 error
[info]
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] starting
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] source: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.06380578617908528"]"]
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.5920083860088079"]"]
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.29905151522078866"]"]
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] parallel execution is enabled: true
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] starting jobs: 1
13/02/27 08:41:04 INFO flow.Flow: [com.snowplowanalytics....] allocating threads: 1
13/02/27 08:41:04 INFO flow.FlowStep: [com.snowplowanalytics....] starting step: local
[info] InvalidLinesTest
[info]
[info] A job which processes input lines not in CloudFront format should
[error] x write an error JSON with input line and error message for each input line
[error] '{"line":"","errors":["Line does not match CloudFront header or data row formats"]}'
[error] is not equal to
[error] '{"line":"20yy-05-24 00:08:40 LHR5 3397 74.125.17.210 GET d3gs014xn8p70.cloudfront.net /ice.png 200 http://www.psychicbazaar.com/oracles/119-psycards-book-and-deck-starter-pack.html Mozilla/5.0%20(Linux;%20U;%20Android%202.3.4;%20generic)%20AppleWebKit/535.1%20(KHTML,%20like%20Gecko;%20Google%20Web%20Preview)%20Version/4.0%20Mobile%20Safari/535.1 -","errors":["Unexpected exception converting date [20yy-05-24] and time [00:08:40] to timestamp: [Invalid format: \"20yy-05-24T00:08:40\" is malformed at \"yy-05-24T00:08:40\"]","Querystring is empty, cannot extract GET payload"]}' (CorruptedCfLinesTest.scala:46)
[error] Expected: ...ne":"[20yy-05-24 00]:[08:40 ]L[HR5] [ 3397] [ 74.125.17.210] [GET d3gs014xn8p70.c]loud[f]ront[.net] [ /ice.png 200 ]h[ttp://www.psychicb]a[zaa]r[.c]o[m/o]r[acles/119-psycar]d[s-book-]a[nd-deck-s]ta[rte]r[-pack.html] [M]o[zill]a[/5.0%2]
[error] Actual: ...ne":"[","errors"]:["]L[ine] [does] [not] [match C]loud[F]ront[] []h[e]a[de]r[ ]o[]r[ ]d[]a[]ta[ ]r[ow] [f]o[rm]a[ts"]}
[info]
[info]
[info]
[info] Total for specification InvalidLinesTest
[info] Finished in 0 ms
[info] 1 example, 1 failure, 0 error
[info]
[info] TransformMapTest
[info]
[info] Executing a TransformMap against a SourceMap should
[info] + successfully set each of the target fields
[info]
[info]
[info] Total for specification TransformMapTest
[info] Finished in 0 ms
[info] 1 example, 0 failure, 0 error
[info]
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] starting
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] source: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.12188519264945952"]"]
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.4167993684661703"]"]
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.07419567520006687"]"]
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] parallel execution is enabled: true
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] starting jobs: 1
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] allocating threads: 1
13/02/27 08:41:05 INFO flow.FlowStep: [com.snowplowanalytics....] starting step: local
[info] BadTrackerLinesTest
[info]
[info] A job which processes input lines containing corrupted data from the tracker should
[error] x write error JSONs, each containing an input line and the errors
[error] '{"line":"2012-05-24 00:08:40 LHR5 3397 74.125.17.210 GET d3gs014xn8p70.cloudfront.net /ice.png 200 http://www.psychicbazaar.com/oracles/119-psycards-book-and-deck-starter-pack.html Mozilla/5.0%20(Linux;%20U;%20Android%202.3.4;%20generic)%20AppleWebKit/535.1%20(KHTML,%20like%20Gecko;%20Google%20Web%20Preview)%20Version/4.0%20Mobile%20Safari/535.1 e=pv&p=mobile&page=Psycards%2520book%2520and%2520deck%2520starter%2520pack%2520-%2520Psychic%2520Bazaar&tid=721410&uid=3798cdce0493133e&vid=1&lang=en&refr=http%253A%252F%252Fwww.google.com%252Fm%252Fsearch&res=640x960&cookie=1","errors":["Field [p]: [mobile] is not a support tracking platform"]}'
[error] is not equal to
[error] '{"line":"20yy-05-24 00:08:40 LHR5 3397 74.125.17.210 GET d3gs014xn8p70.cloudfront.net /ice.png 200 http://www.psychicbazaar.com/oracles/119-psycards-book-and-deck-starter-pack.html Mozilla/5.0%20(Linux;%20U;%20Android%202.3.4;%20generic)%20AppleWebKit/535.1%20(KHTML,%20like%20Gecko;%20Google%20Web%20Preview)%20Version/4.0%20Mobile%20Safari/535.1 -","errors":["Unexpected exception converting date [20yy-05-24] and time [00:08:40] to timestamp: [Invalid format: \"20yy-05-24T00:08:40\" is malformed at \"yy-05-24T00:08:40\"]","Querystring is empty, cannot extract GET payload"]}' (CorruptedCfLinesTest.scala:46)
[error] Expected: ...":"20[yy]-05-2...
[error] ...5.1 [-","err]ors":["Un]e[xpe]c[te]d[ excepti]o[n converting]
[error] [ dat]e []2[]0[yy-05-24] a...e []0[0:08:4]0[] to tme...: [Inval]id[ format: \"20yy-05-2]4[T00:08:40\" is malfo]r[med a]t[ \"yy-0]5[-]2[4T00:08:40\"]",...ry[t]r[ing i]s[ empty, cann]o[t ]e[xtract GET payload"]}
[error] Actual: ...":"20[12]-05-2...
[error] ...5.1 [e=pv&p=m]o[bile&pag]e[=Psy]c[ar]d[s%2520b]o[ok%2520and%]
[error] [2520d]e[ck%]2[52]0[starter%2520pack%252]0[-%252]0[Psych]i[c%2520Bazaar&t]id[=721410&uid=3798cdce0]4[93133e&vid=1&lang=en&]r[efr=h]t[tp%253A%2]5[2F%]2[52Fwww.google.com%252Fm%252F]s[ea]r[ch&re]s[=640x960&co]o[ki]e=1...":["Field ]
[info]
[info]
[info]
[info] Total for specification BadTrackerLinesTest
[info] Finished in 2 ms
[info] 1 example, 1 failure, 0 error
[info]
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] starting
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] source: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.0030329153867624248"]"]
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextLine[['offset', 'line']->[ALL]]"]["0.754796493264088"]"]
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] sink: MemoryTap["TextDelimited[[UNKNOWN]->[ALL]]"]["0.31432243337075416"]"]
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] parallel execution is enabled: true
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] starting jobs: 1
13/02/27 08:41:05 INFO flow.Flow: [com.snowplowanalytics....] allocating threads: 1
13/02/27 08:41:05 INFO flow.FlowStep: [com.snowplowanalytics....] starting step: local
[info] DiscardableCfLinesTest
[info]
[info] A job which processes expected but discardable CloudFront input lines should
[error] ! silently discard those input lines
[error] NoSuchElementException: head of empty list (List.scala:399)
[error] com.snowplowanalytics.snowplow.hadoop.etl.jobs.CorruptedCfLinesTest$$anonfun$1$$anonfun$apply$1$$anonfun$apply$5.apply(CorruptedCfLinesTest.scala:45)
[error] com.snowplowanalytics.snowplow.hadoop.etl.jobs.CorruptedCfLinesTest$$anonfun$1$$anonfun$apply$1$$anonfun$apply$5.apply(CorruptedCfLinesTest.scala:44)
[error] com.twitter.scalding.JobTest$$anonfun$sink$1.apply$mcV$sp(JobTest.scala:66)
[error] com.twitter.scalding.JobTest$$anonfun$runJob$2.apply(JobTest.scala:132)
[error] com.twitter.scalding.JobTest$$anonfun$runJob$2.apply(JobTest.scala:132)
[error] com.twitter.scalding.JobTest.runJob(JobTest.scala:132)
[error] com.twitter.scalding.JobTest.run(JobTest.scala:80)
[error] com.snowplowanalytics.snowplow.hadoop.etl.jobs.DiscardableCfLinesTest$$anonfun$1$$anonfun$apply$1.apply(DiscardableCfLinesTest.scala:46)
[error] com.snowplowanalytics.snowplow.hadoop.etl.jobs.DiscardableCfLinesTest$$anonfun$1$$anonfun$apply$1.apply(DiscardableCfLinesTest.scala:34)
[info]
[info]
[info] Total for specification DiscardableCfLinesTest
[info] Finished in 0 ms
[info] 1 example, 0 failure, 1 error
[info]
[error] Error: Total 5, Failed 2, Errors 1, Passed 2, Skipped 0
[error] Failed tests:
[error] com.snowplowanalytics.snowplow.hadoop.etl.jobs.BadTrackerLinesTest
[error] com.snowplowanalytics.snowplow.hadoop.etl.jobs.InvalidLinesTest
[error] Error during tests:
[error] com.snowplowanalytics.snowplow.hadoop.etl.jobs.DiscardableCfLinesTest
[trace] Stack trace suppressed: run last test:test for the full output.
[error] (test:test) Tests unsuccessful
[error] Total time: 4 s, completed 27-Feb-2013 08:41:06