Common Hive Settings Grouped by Category

Dynamic Partition

Enable/disable dynamic partition inserts

hive.exec.dynamic.partition=true

==> Whether to allow dynamic partitions in DML/DDL.

Use strict mode when in doubt

hive.exec.dynamic.partition.mode=strict

==> In strict mode, the user must specify at least one static partition, which guards against accidentally overwriting all partitions.

==> In nonstrict mode, all partitions are allowed to be dynamic.
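The difference between the two modes is easiest to see on an insert. The following is a minimal sketch, assuming a hypothetical staging table `sales_staging(order_id, amount, country, dt)` and a hypothetical target table `sales` partitioned by `(country, dt)`; neither table comes from these notes.

```sql
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=strict;

-- Accepted in strict mode: country is a static partition, dt is dynamic.
-- The dynamic partition column (dt) must come last in the SELECT list.
INSERT OVERWRITE TABLE sales PARTITION (country='US', dt)
SELECT order_id, amount, dt
FROM sales_staging
WHERE country = 'US';

-- Fully dynamic insert: rejected in strict mode, so switch to nonstrict first.
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE sales PARTITION (country, dt)
SELECT order_id, amount, country, dt
FROM sales_staging;
```

In both statements the dynamic partition columns appear at the end of the SELECT list, in the same order as in the PARTITION clause.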

Limit the number of dynamic partitions

hive.exec.max.dynamic.partitions (default 1000)

==> Maximum number of dynamic partitions allowed to be created in total.

hive.exec.max.dynamic.partitions.pernode (default 100)

==> Maximum number of dynamic partitions allowed to be created in each mapper/reducer node.
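When a load legitimately produces more partitions than these limits allow, the limits can be raised for the session. The sketch below uses illustrative values and hypothetical tables (`events_staging` feeding `events`, partitioned by `dt`); the `DISTRIBUTE BY` clause is an optional, commonly used way to route each partition's rows to a single reducer so that fewer partitions are created per node.

```sql
-- Illustrative values only; tune them to the cluster and the data.
SET hive.exec.max.dynamic.partitions=5000;
SET hive.exec.max.dynamic.partitions.pernode=500;

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical tables: events_staging(event_id, payload, dt) and
-- events(event_id, payload) PARTITIONED BY (dt).
INSERT OVERWRITE TABLE events PARTITION (dt)
SELECT event_id, payload, dt
FROM events_staging
DISTRIBUTE BY dt;  -- rows for one dt value all go to the same reducer
```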

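Increase the maximum number of files a DataNode can serve concurrently (in hdfs-site.xml)

dfs.datanode.max.xcievers=4096

==> The misspelling "xcievers" is the actual property name. Newer Hadoop releases rename it to dfs.datanode.max.transfer.threads.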

Hive Join Configuration

Map Join

A map join speeds up Hive queries by loading the smaller table into memory and performing the join in the map phase of the MapReduce job, so no reduce step is needed. Because no reducers are involved, map joins are considerably faster than regular joins. If queries frequently join against small tables, enabling map joins shortens their execution time.
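One way to check whether a query is actually executed as a map join is to inspect its plan. This is a sketch only, assuming a hypothetical large table `orders` and a hypothetical small lookup table `country_dim`.

```sql
SET hive.auto.convert.join=true;

-- Hypothetical tables: orders (large) and country_dim (small lookup table).
EXPLAIN
SELECT o.order_id, d.country_name
FROM orders o
JOIN country_dim d
  ON o.country_code = d.country_code;
```

If the conversion applies, the plan should show a Map Join Operator in a map stage rather than a Join Operator in a reduce stage.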

Setting	Description
hive.auto.convert.join=true	When enabled, a join is automatically converted to a map join if one of its tables is smaller than 25 MB (the threshold set by hive.mapjoin.smalltable.filesize).
hive.auto.convert.join.noconditionaltask=true hive.auto.convert.join.noconditionaltask.size=10000000	Applies when three or more tables are involved in the join. With hive.auto.convert.join alone, Hive generates multiple map-side joins, assuming that all of the tables are small. With hive.auto.convert.join.noconditionaltask, these map-side joins are combined into a single map-side join if the combined size of n-1 of the tables is less than 10 MB. (This threshold is set by hive.auto.convert.join.noconditionaltask.size, in bytes.)
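Put together, the settings in the table might be applied to a three-table join as follows. This is a sketch under assumptions: `orders`, `country_dim`, and `product_dim` are hypothetical tables, and the two dimension tables are assumed to total less than the configured threshold.

```sql
SET hive.auto.convert.join=true;
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=10000000;  -- bytes, about 10 MB

-- Hypothetical tables: orders (large fact table), country_dim and
-- product_dim (small dimension tables whose sizes sum to under 10 MB).
SELECT o.order_id, c.country_name, p.product_name
FROM orders o
JOIN country_dim c ON o.country_code = c.country_code
JOIN product_dim p ON o.product_id = p.product_id;
```

Under these assumptions both small tables are loaded into memory and the two joins are expected to run as a single map-side join.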

kzhangkzhang/BIGDATA_HIVE_setting.md