Read from HBase using SHC with predicate pushdown #shc #predicate #pushdown #predicatepushdown #filter #spark2 #spark
import org.apache.spark.sql.execution.datasources.hbase.{HBaseRelation, HBaseTableCatalog}
import sqlContext.implicits._

// HBase connection settings, passed to SHC as a JSON string
val configuration = """{"hbase.zookeeper.quorum":"127.0.0.1","hbase.zookeeper.property.clientPort":"2181"}"""

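// Catalog mapping HBase columns to DataFrame columns; `namespace` and `tableName` must already be defined as Strings
val tableCatalog = s"""{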
                 |"table":{"namespace":"$namespace", "name":"$tableName"},
                 |"rowkey":"key",
                 |"columns":{
                 |"rowkey":{"cf":"rowkey", "col":"key", "type":"string"},
                 |"col1":{"cf":"cf1", "col":"col1", "type":"string"},
                 |"col2":{"cf":"cf2", "col":"col2", "type":"string"}
                 |}
                 |}""".stripMargin
                 
sqlContext.read
	.options(Map(HBaseTableCatalog.tableCatalog -> tableCatalog, HBaseRelation.HBASE_CONFIGURATION -> configuration))
	.format("org.apache.spark.sql.execution.datasources.hbase")
	.load()
	.filter($"col1".eqNullSafe("value1") || $"col2".eqNullSafe("value2"))     

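Important: use eqNullSafe (<=>) instead of equalTo (===) to avoid issues when comparing with null. equalTo evaluates to null when either side is null, whereas eqNullSafe always returns true or false.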
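
To make the difference concrete, here is a minimal plain-Scala sketch of the two comparison semantics. It is only a model, not Spark's or SHC's implementation: `None` stands in for a SQL null, and the object and function names (`NullSafeSketch`, `sqlEqualTo`, `sqlEqNullSafe`, `sqlNot`, `keeps`) are hypothetical.

```scala
// Simplified model of SQL null comparison semantics; None stands for SQL NULL.
// Illustration only: these helpers are hypothetical, not part of Spark or SHC.
object NullSafeSketch {
  // Models equalTo / ===: the result is NULL (None) if either side is NULL.
  def sqlEqualTo(a: Option[String], b: Option[String]): Option[Boolean] =
    for (x <- a; y <- b) yield x == y

  // Models eqNullSafe / <=>: always true or false, and NULL <=> NULL is true.
  def sqlEqNullSafe(a: Option[String], b: Option[String]): Boolean = a == b

  // Models SQL NOT: negating NULL still gives NULL.
  def sqlNot(p: Option[Boolean]): Option[Boolean] = p.map(!_)

  // A filter keeps a row only when the predicate is exactly true.
  def keeps(p: Option[Boolean]): Boolean = p.contains(true)
}
```

In this model, a row whose col1 is null is dropped by both `col1 === "value1"` and its negation, because the predicate is null either way. With the null-safe comparison the predicate is false, so its negation is true and the row is kept.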
