Created
June 14, 2011 14:46
I want pig to return "2011-06-14" instead of 2011 minus 6 minus 14
Problem: I'm trying to set a pig variable to the current date in YYYY-MM-DD format but pig interprets the YYYY-MM-DD as an expression and then solves it.
How can I coerce pig into accepting YYYY-MM-DD as a chararray? The cast operator isn't helping here.
watch:
-bash-3.1$ date +%Y\-%m\-%d
2011-06-14
# 2011 minus 6 minus 14
-bash-3.1$ python -c 'print 2011-6-14'
1991
-bash-3.1$ cat test.pig
raw = LOAD 'hello.txt' as (txt:chararray);
%declare thedate `date +%Y\-%m\-%d`;
tst = foreach raw generate txt, $thedate;
dump tst;
-bash-3.1$ cat hello.txt
hello
world
test.pig produces:
(hello,1991)
(world,1991)
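Why 1991: Pig's preprocessor pastes the value of $thedate into the script as plain text before the script is parsed, so the parser sees a bare 2011-06-14 and treats it as subtraction. Here is a rough sketch of that in plain shell. It only emulates the substitution with sed and does not run Pig; the fixed date and the variable names `line`, `substituted`, and `result` are made up for the illustration.

```shell
#!/bin/sh
# Emulation only: mimics the text substitution, it does not invoke Pig.
thedate='2011-06-14'   # fixed here so the result is reproducible; the gist uses `date`
line='tst = foreach raw generate txt, $thedate;'

# Swap the literal text "$thedate" for its value, the way a text preprocessor would.
substituted=$(printf '%s\n' "$line" | sed 's/[$]thedate/'"$thedate"'/')
printf '%s\n' "$substituted"

# With no quotes around it, the pasted value is just an arithmetic expression.
result=$((2011 - 6 - 14))
printf '%s\n' "$result"
```

The first line printed is the statement as the parser would see it, with nothing marking the date as a string.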
full pig output:
-bash-3.1$ pig -x local test.pig
2011-06-14 07:41:18,718 [main] INFO org.apache.pig.Main - Logging error messages to: /home/nkodner/wip/pig_1308062478717.log
2011-06-14 07:41:18,759 [main] INFO org.apache.pig.tools.parameters.PreprocessorContext - Executing command : date +%Y\-%m\-%d
2011-06-14 07:41:18,899 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
2011-06-14 07:41:19,360 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2011-06-14 07:41:19,360 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-06-14 07:41:19,702 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name: tst: Store(file:/tmp/temp-2067247496/tmp195828621:org.apache.pig.impl.io.InterStorage) - scope-10 Operator Key: scope-10)
2011-06-14 07:41:19,725 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2011-06-14 07:41:19,780 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2011-06-14 07:41:19,780 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2011-06-14 07:41:19,803 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2011-06-14 07:41:19,823 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2011-06-14 07:41:19,850 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-06-14 07:41:22,198 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2011-06-14 07:41:22,284 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2011-06-14 07:41:22,285 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2011-06-14 07:41:22,297 [Thread-3] INFO org.apache.hadoop.util.NativeCodeLoader - Loaded the native-hadoop library
2011-06-14 07:41:22,507 [Thread-3] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-06-14 07:41:22,507 [Thread-3] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2011-06-14 07:41:22,520 [Thread-3] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2011-06-14 07:41:22,794 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2011-06-14 07:41:22,987 [Thread-3] WARN org.apache.hadoop.conf.Configuration - file:/tmp/hadoop-nkodner/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: mapred.child.java.opts; Ignoring.
2011-06-14 07:41:23,006 [Thread-3] WARN org.apache.hadoop.conf.Configuration - file:/tmp/hadoop-nkodner/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: mapred.jobtracker.maxtasks.per.job; Ignoring.
2011-06-14 07:41:23,007 [Thread-3] WARN org.apache.hadoop.conf.Configuration - file:/tmp/hadoop-nkodner/mapred/local/localRunner/job_local_0001.xml:a attempt to override final parameter: mapred.job.reuse.jvm.num.tasks; Ignoring.
2011-06-14 07:41:23,148 [Thread-4] INFO org.apache.hadoop.mapred.Task - Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
2011-06-14 07:41:23,155 [Thread-4] INFO org.apache.hadoop.mapred.LocalJobRunner -
2011-06-14 07:41:23,155 [Thread-4] INFO org.apache.hadoop.mapred.Task - Task attempt_local_0001_m_000000_0 is allowed to commit now
2011-06-14 07:41:23,168 [Thread-4] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local_0001_m_000000_0' to file:/tmp/temp-2067247496/tmp195828621
2011-06-14 07:41:23,168 [Thread-4] INFO org.apache.hadoop.mapred.LocalJobRunner -
2011-06-14 07:41:23,168 [Thread-4] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local_0001_m_000000_0' done.
2011-06-14 07:41:23,516 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local_0001
2011-06-14 07:41:28,038 [main] WARN org.apache.pig.tools.pigstats.PigStatsUtil - Failed to get RunningJob for job job_local_0001
2011-06-14 07:41:28,042 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2011-06-14 07:41:28,042 [main] INFO org.apache.pig.tools.pigstats.PigStats - Detected Local mode. Stats reported below may be incomplete
2011-06-14 07:41:28,046 [main] INFO org.apache.pig.tools.pigstats.PigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
0.20.2-cdh3u0 0.8.0-cdh3u0 nkodner 2011-06-14 07:41:19 2011-06-14 07:41:28 UNKNOWN
Success!
Job Stats (time in seconds):
JobId Alias Feature Outputs
job_local_0001 raw,tst MAP_ONLY file:/tmp/temp-2067247496/tmp195828621,
Input(s):
Successfully read records from: "file:///home/nkodner/wip/hello.txt"
Output(s):
Successfully stored records in: "file:/tmp/temp-2067247496/tmp195828621"
Job DAG:
job_local_0001
2011-06-14 07:41:28,048 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2011-06-14 07:41:28,056 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2011-06-14 07:41:28,056 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(hello,1991)
(world,1991)
-bash-3.1$
heh that's an old paste....i figured this out long ago but never bothered to update the gist...but thanks!!!
I didn't see the date on the post. I'm new to Pig, and was looking at how to get the current date inside a pig script, so this post helped me with that. Thanks!!
that was a funny issue. It makes sense but I didn't expect pig to evaluate the date as an expression.
Surround the date variable with single quotes so that Pig takes the whole thing as a string (a chararray) instead of trying to perform the "-" operation.
tst = foreach raw generate txt, '$thedate';
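To see why the quotes help, here is the same kind of substitution emulated in plain shell with sed. Again this is only a sketch that does not run Pig; the fixed date and the names `quoted_line` and `fixed` are made up for the illustration.

```shell
#!/bin/sh
# Emulation only: mimics the text substitution, it does not invoke Pig.
thedate='2011-06-14'   # fixed here so the result is reproducible; the gist uses `date`
quoted_line="tst = foreach raw generate txt, '\$thedate';"

# The value is pasted between the single quotes that are already in the script.
fixed=$(printf '%s\n' "$quoted_line" | sed 's/[$]thedate/'"$thedate"'/')
printf '%s\n' "$fixed"
```

After substitution the statement contains the quoted literal '2011-06-14', so the dump should presumably show tuples like (hello,2011-06-14) rather than (hello,1991). I have not re-run the gist's script to confirm the exact output.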