
@muga
Created August 26, 2013 06:18
org.apache.hadoop.hive.ql.Driver#compile(..)
...
ParseDriver pd = new ParseDriver();
// parser: parses the SQL command with the Hive lexer and parser and converts it into an AST.
ASTNode tree = pd.parse(command, ctx);
...
// semantic analyzer: analyzes the AST semantically, i.e. converts it into a block-based
// internal query representation.
BaseSemanticAnalyzer sem = SemanticAnalyzerFactory.get(conf, tree);
sem.analyze(tree, ctx);
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer#analyzeInternal(ASTNode)
...
if (!doPhase1(child, qb, initPhase1Ctx())) {
...
getMetaData(qb);
...
// logical plan gen: converts the internal query representation into an operator tree,
// creating each Operator from its plan descriptor and the AST nodes.
// e.g. SelectDesc and org.apache.hadoop.hive.ql.exec.SelectOperator,
// TableScanDesc and org.apache.hadoop.hive.ql.exec.TableScanOperator
Operator sinkOp = genPlan(qb);
...
ParseContext pCtx = new ParseContext(conf, qb, child, opToPartPruner,
opToPartList, topOps, topSelOps, opParseCtx, joinContext, topToTable,
loadTableWork, loadFileWork, ctx, idToTableNameMap, destTableId, uCtx,
listMapJoinOpsNoReducer, groupOpToInputTables, prunedPartitions,
opToSamplePruner, globalLimitCtx, nameToSplitSample, inputs, rootTasks,
opToPartToSkewedPruner);
...
// logical optimizer: rewrites the logical plan into a more efficient one
Optimizer optm = new Optimizer(); // org.apache.hadoop.hive.ql.optimizer.Optimizer
optm.setPctx(pCtx); // hand the ParseContext built above to the optimizer
optm.initialize(conf);
pCtx = optm.optimize();
// physical plan gen: converts the optimized operator tree into physical plans (MR jobs)
genMapRedTasks(pCtx);
// inside genMapRedTasks(ParseContext pCtx):
...
Dispatcher disp = new DefaultRuleDispatcher(new GenMROperator(), opRules, procCtx);
GraphWalker ogw = new GenMapRedWalker(disp);
ArrayList<Node> topNodes = new ArrayList<Node>();
topNodes.addAll(topOps.values());
ogw.startWalking(topNodes, null);
...
...
...
plan = new QueryPlan(command, sem, perfLogger.getStartTime(PerfLogger.DRIVER_RUN),
SessionState.get().getCommandType());
...
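
The dispatcher and walker lines in genMapRedTasks follow a rule-dispatch pattern: a walker traverses the operator graph starting from the top operators, and a dispatcher picks a processor for each node, falling back to a default when no rule matches. Below is a minimal, self-contained sketch of that pattern in plain Java. It does not use Hive's classes: `WalkerSketch`, `Op`, the rule names, the pre-order traversal, and matching on a single operator kind are all assumptions made for illustration, and Hive's real rules and processors do considerably more.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the dispatcher + graph walker pair; not Hive code.
class WalkerSketch {

    /** Toy operator node (hypothetical); "kind" plays the role of the operator type. */
    static final class Op {
        final String kind;
        final String name;
        final List<Op> children = new ArrayList<>();

        Op(String kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    /**
     * Walks the graph depth-first from the given top nodes, visiting each node once.
     * For every node, the rule registered for its kind fires; otherwise the default does.
     * Returns "processor(node)" strings in visit order.
     */
    static List<String> walk(List<Op> topNodes, Map<String, String> rules, String defaultProc) {
        List<String> fired = new ArrayList<>();
        Set<Op> seen = new HashSet<>(); // identity-based, since Op does not override equals
        for (Op top : topNodes) {
            visit(top, rules, defaultProc, seen, fired);
        }
        return fired;
    }

    private static void visit(Op op, Map<String, String> rules, String defaultProc,
                              Set<Op> seen, List<String> fired) {
        if (!seen.add(op)) {
            return; // already dispatched via another parent
        }
        String proc = rules.getOrDefault(op.kind, defaultProc); // the "dispatch" step
        fired.add(proc + "(" + op.name + ")");
        for (Op child : op.children) {
            visit(child, rules, defaultProc, seen, fired);
        }
    }

    /** Builds a tiny TS -> SEL -> RS -> FS chain and walks it. */
    static List<String> demo() {
        Op ts = new Op("TS", "TS_0");
        Op sel = new Op("SEL", "SEL_1");
        Op rs = new Op("RS", "RS_2");
        Op fs = new Op("FS", "FS_3");
        ts.children.add(sel);
        sel.children.add(rs);
        rs.children.add(fs);

        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("TS", "tableScanRule");
        rules.put("RS", "reduceSinkRule");
        rules.put("FS", "fileSinkRule");
        return walk(Arrays.asList(ts), rules, "defaultProc");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the SEL node has no rule of its own, so it is handled by the default processor, which mirrors the role of the first argument passed to DefaultRuleDispatcher above.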