Sqoop makes it easy to transfer data between Hadoop and relational databases.
Versions 1.4.x are Sqoop1; versions 1.99.x are Sqoop2.
Environment:
CentOS 7.4
hadoop-2.7.4
hbase-1.2.6
Download:
wget http://archive.apache.org/dist/sqoop/1.99.5/sqoop-1.99.5-bin-hadoop200.tar.gz
tar xvf sqoop-1.99.5-bin-hadoop200.tar.gz
cd sqoop-1.99.5-bin-hadoop200/
Set the environment variable (run this from inside the extracted directory):
echo "export PATH=`pwd`/bin:\$PATH" | sudo tee /etc/profile.d/sqoop.sh
source /etc/profile.d/sqoop.sh
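Edit server/conf/catalina.properties and append the following to the end of the common.loader value. These are the Hadoop library paths: common, hdfs, mapreduce, tools, and yarn, plus the lib directory under each, 10 directories in total.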
/home/vagrant/apps/hadoop-2.7.4/share/hadoop/common/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/common/lib/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/hdfs/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/hdfs/lib/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/mapreduce/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/mapreduce/lib/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/tools/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/tools/lib/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/yarn/*.jar,/home/vagrant/apps/hadoop-2.7.4/share/hadoop/yarn/lib/*.jar
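The path list above is long and easy to mistype, so it can be generated instead. A minimal sketch, assuming Hadoop is unpacked at the location used above; the HADOOP_HOME default and the hadoop_loader_paths function name are illustrative and not part of Sqoop:

```shell
# Assumption: Hadoop install location; adjust to your own layout.
HADOOP_HOME="${HADOOP_HOME:-/home/vagrant/apps/hadoop-2.7.4}"

# Print the 10 jar globs (5 components, each with its lib/ directory)
# as a single comma-separated line, ready to append to common.loader.
hadoop_loader_paths() {
    base="$1/share/hadoop"
    out=""
    for d in common hdfs mapreduce tools yarn; do
        # The globs stay literal because they are inside double quotes.
        out="${out:+$out,}$base/$d/*.jar,$base/$d/lib/*.jar"
    done
    printf '%s\n' "$out"
}

hadoop_loader_paths "$HADOOP_HOME"
```

Paste the printed line after the existing common.loader entries, separated by a comma.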
Verify the configuration:
sqoop2-tool verify
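Copy any extra libraries (for example the MySQL driver) into the lib/ directory, creating it yourself if it does not exist.
Edit server/conf/sqoop.properties:
# Directory containing the Hadoop configuration files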
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/home/vagrant/apps/hadoop-2.7.4/etc/hadoop/
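Before starting the server, it may help to confirm that this directory really holds the Hadoop configuration. A rough sketch, assuming the standard Hadoop 2.x file names; check_hadoop_conf_dir is a hypothetical helper, not a Sqoop or Hadoop command:

```shell
# Hypothetical helper: succeeds only if the directory contains the
# core Hadoop site files that Sqoop's MapReduce submission relies on.
check_hadoop_conf_dir() {
    dir="$1"
    for f in core-site.xml hdfs-site.xml yarn-site.xml; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    return 0
}

# Assumption: the same path as configured in sqoop.properties.
if check_hadoop_conf_dir /home/vagrant/apps/hadoop-2.7.4/etc/hadoop; then
    echo "Hadoop configuration directory looks usable"
fi
```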
Start the server:
$ sqoop.sh server start # start|stop
$ jps # check the running processes
... Bootstrap
Client:
sqoop2-shell
> set server --host localhost --port 12000 --webapp sqoop
> show version --all
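If sqoop2-tool verify reports an error, check common.loader: the Hadoop paths should be appended to the end of the existing value.
That is fine as long as verify itself succeeds.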