Shrijeet shrijeet

  • Redwood City, CA
@shrijeet
shrijeet / hive_mail.txt
Created March 12, 2012 23:37
Hive merge file error description
Hive Version: Hive 0.8 (last commit SHA b581a6192b8d4c544092679d05f45b2e50d42b45 )
Hadoop version: cdh3u0
I am trying to use the Hive small-file merge feature by setting all the necessary parameters.
I am disabling CombineHiveInputFormat since my input is compressed text.
hive> set mapred.min.split.size.per.node=1000000000;
hive> set mapred.min.split.size.per.rack=1000000000;
hive> set mapred.max.split.size=1000000000;
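For reference, the merge step itself is governed by a separate group of settings; the combination below is a sketch of a typical setup (the size values are illustrative, not the ones used in this session), with hive.input.format pinned to HiveInputFormat to keep CombineHiveInputFormat out of the picture:

```
hive> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
hive> set hive.merge.mapfiles=true;
hive> set hive.merge.mapredfiles=true;
hive> set hive.merge.smallfiles.avgsize=16000000;
hive> set hive.merge.size.per.task=256000000;
```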
shrijeet / hive_merge_small_files.java
Created March 12, 2012 23:09
Hive merge small files bug (?) (when using HiveInputFormat and not CombineHiveInputFormat)
diff --git ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
index a3e40f7..7674af4 100644
--- ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
+++ ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
@@ -381,7 +381,7 @@ public class MapRedTask extends ExecDriver implements Serializable {
.printInfo("Number of reduce tasks is set to 0 since there's no reduce operator");
work.setNumReduceTasks(Integer.valueOf(0));
} else {
- if (numReducersFromWork >= 0) {
+ if (numReducersFromWork > 0) {
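As I read the patch, the one-character change matters because with `>= 0` a zero reducer count carried on the work object is taken literally, while with `> 0` it falls through to Hive's own estimation. A minimal Java sketch of that decision (names hypothetical, distilled from the diff rather than copied from Hive's source):

```java
public class ReducerCountSketch {
    // Hypothetical distillation of the MapRedTask branch above: a
    // reducer count from the work object should only override the
    // estimate when it is strictly positive.
    static int chooseReducers(int numReducersFromWork, int estimated) {
        if (numReducersFromWork > 0) {
            // Explicitly requested positive count wins.
            return numReducersFromWork;
        }
        // Zero or negative: fall back to the estimated count.
        return estimated;
    }

    public static void main(String[] args) {
        System.out.println(chooseReducers(4, 10));  // explicit count wins
        System.out.println(chooseReducers(0, 10));  // falls back to estimate
    }
}
```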
shrijeet / gist:1597563
Created January 11, 2012 23:56
Query NPE
FAILED: Hive Internal Error: java.lang.NullPointerException(null)
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:214)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:684)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:805)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:161)
shrijeet / gist:1597560
Created January 11, 2012 23:55
org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance NPE
select pid as pid,
sum(if(bid is not null and bid <> '', 1, 0)) as bids,
sum(1) as requests
from table
where data_date = 20120110
and (pid = 15368 or pid = 15369 or pid = 15370)
group by pid,
sum(if(bid is not null and bid <> '', 1, 0)),
sum(1)
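The GROUP BY clause above repeats the aggregate expressions themselves, which appears to be what trips the NullPointerException in ExprNodeGenericFuncDesc.newInstance. In standard HiveQL one groups only by the non-aggregated columns; a sketch of the presumably intended form (assuming per-pid totals are the goal):

```sql
select pid as pid,
       sum(if(bid is not null and bid <> '', 1, 0)) as bids,
       sum(1) as requests
from table
where data_date = 20120110
  and (pid = 15368 or pid = 15369 or pid = 15370)
group by pid
```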
shrijeet / HConnectionManager.java
Created December 14, 2011 19:27
HConnectionManager throwing runtime exceptions
/**
* Copyright 2010 The Apache Software Foundation
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
shrijeet / clean_calls_regularly.patch
Created October 14, 2011 19:11
RPC timeout issue
diff --git a/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java b/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
index 2cc1b04..c08a55e 100644
--- a/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
+++ b/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java
@@ -209,6 +209,7 @@ public class HBaseClient {
* socket connected to a remote address. Calls are multiplexed through this
* socket: responses may be delivered out of order. */
private class Connection extends Thread {
+ protected static final long DEFAULT_CLEAN_INTERVAL = -1; // disabled by default
private ConnectionId remoteId;