@ishassan
Last active June 8, 2023 09:08
A hello world example about connecting Scala to HBase
name := "ScalaHBase"
version := "1.0"
scalaVersion := "2.11.8"

resolvers ++= Seq(
  "Hadoop Releases" at "https://repository.cloudera.com/content/repositories/releases/"
)

libraryDependencies ++= Seq(
  "com.google.guava" % "guava" % "15.0",
  "org.apache.hadoop" % "hadoop-common" % "2.6.0",
  "org.apache.hadoop" % "hadoop-mapred" % "0.22.0",
  "org.apache.hbase" % "hbase-common" % "1.0.0",
  "org.apache.hbase" % "hbase-client" % "1.0.0"
)

dependencyOverrides += "com.google.guava" % "guava" % "15.0"

assemblyMergeStrategy in assembly := {
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
  case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
  case PathList("org", "apache", xs @ _*) => MergeStrategy.last
  case PathList("com", "google", xs @ _*) => MergeStrategy.first
  case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
  case "about.html" => MergeStrategy.rename
  case "plugin.properties" => MergeStrategy.last
  case "log4j.properties" => MergeStrategy.last
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
//CHECK ishassan/build.sbt as well
import org.apache.hadoop.hbase.client._
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.{CellUtil, HBaseConfiguration, TableName}
import org.apache.hadoop.conf.Configuration
import scala.collection.JavaConverters._
object ScalaHBaseExample extends App {

  // Prints the row key followed by every (qualifier, value) pair in the row.
  def printRow(result: Result): Unit = {
    val cells = result.rawCells()
    print(Bytes.toString(result.getRow) + " : ")
    for (cell <- cells) {
      val col_name = Bytes.toString(CellUtil.cloneQualifier(cell))
      val col_value = Bytes.toString(CellUtil.cloneValue(cell))
      print("(%s,%s) ".format(col_name, col_value))
    }
    println()
  }

  val conf: Configuration = HBaseConfiguration.create()

  /*
  From http://hbase.apache.org/0.94/book/zookeeper.html
  A distributed Apache HBase (TM) installation depends on a running ZooKeeper cluster. All participating nodes and clients
  need to be able to access the running ZooKeeper ensemble. Apache HBase by default manages a ZooKeeper "cluster" for you.
  It will start and stop the ZooKeeper ensemble as part of the HBase start/stop process. You can also manage the ZooKeeper
  ensemble independent of HBase and just point HBase at the cluster it should use. To toggle HBase management of ZooKeeper,
  use the HBASE_MANAGES_ZK variable in conf/hbase-env.sh. This variable, which defaults to true, tells HBase whether to
  start/stop the ZooKeeper ensemble servers as part of HBase start/stop.
  */
  val ZOOKEEPER_QUORUM = "WRITE THE ZOOKEEPER CLUSTER THAT HBASE SHOULD USE"
  conf.set("hbase.zookeeper.quorum", ZOOKEEPER_QUORUM)
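  // Optional sketch, not part of the original example; the values below are
  // assumptions to adjust for your cluster. If the quorum above is wrong or
  // unreachable, the first table operation may appear to hang while the client
  // keeps retrying. Lowering the retry settings should make the failure surface sooner.
  conf.set("hbase.zookeeper.property.clientPort", "2181") // 2181 is ZooKeeper's default client port
  conf.set("hbase.client.retries.number", "3")            // fewer client-side retries
  conf.set("zookeeper.recovery.retry", "1")               // fewer ZooKeeper reconnect attempts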

  val connection = ConnectionFactory.createConnection(conf)
  val table = connection.getTable(TableName.valueOf("emostafa:test_table"))

  try {
    // Put example: write two columns in family "d" of row "row1"
    val put = new Put(Bytes.toBytes("row1"))
    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("test_column_name"), Bytes.toBytes("test_value"))
    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("test_column_name2"), Bytes.toBytes("test_value2"))
    table.put(put)

    // Get example: read back the row just written
    println("Get Example:")
    val get = new Get(Bytes.toBytes("row1"))
    val result = table.get(get)
    printRow(result)

    // Scan example: iterate over every row in the table
    println("\nScan Example:")
    val scanner = table.getScanner(new Scan())
    try scanner.asScala.foreach(printRow)
    finally scanner.close() // release the scanner's server-side resources
  } finally {
    // Close the table and connection even if an operation above throws
    table.close()
    connection.close()
  }
}
@alvarodecastro74

Stuck at Bytes.toBytes. Does nothing, just sits there....
