| Notes: | |
| ------------- | |
| external table: | |
| ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe' | |
| WITH SERDEPROPERTIES ("input.regex" = "(.{2})(.{10})(.{30})(.{10})(.{10}).*" ) | |
| LOCATION '${hiveconf:path}'; | |
| location:maprfs:/externalpath | |
| inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, | |
| serializationLib:org.apache.hadoop.hive.serde2.RegexSerDe, |
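The `input.regex` above carves each line into five fixed-width fields of 2, 10, 30, 10 and 10 characters and discards whatever follows. As a rough way to check that grouping outside Hive, the same expression can be run through `sed -E`. This is only a sketch: the helper name `split_fixed_width` and the sample record are made up for illustration, and the `^`/`$` anchors assume the SerDe matches the whole line.

```shell
# Hypothetical helper: split one fixed-width record using the same groups
# as the input.regex above, printing the five fields separated by '|'.
# Lines shorter than 62 characters do not match and are printed unchanged.
split_fixed_width() {
  printf '%s\n' "$1" |
    sed -E 's/^(.{2})(.{10})(.{30})(.{10})(.{10}).*$/\1|\2|\3|\4|\5/'
}

# Illustrative record (not from the original data): fields are padded to
# widths 2, 10, 30, 10 and 10, followed by a tail that the regex ignores.
record=$(printf '%-2s%-10s%-30s%-10s%-10s%s' \
  US 0000012345 'John Smith' 2017-03-20 0000099.50 'trailing text')

split_fixed_width "$record"
```

The padding inside each field is preserved, so a real table would typically trim the columns in a later query.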
| 1. To see hostname and fully qualified domain name (FQDN), use: | |
| thanooj@thanooj-VirtualBox:~$ hostname | |
| thanooj-VirtualBox | |
| thanooj@thanooj-VirtualBox:~$ hostname -f | |
| thanooj-VirtualBox | |
| 2. Update your system: | |
| thanooj@thanooj-VirtualBox:~$ sudo apt-get update | |
| 3. Install MySQL |
| hive> CREATE TABLE emp_sal(id INT, salary DOUBLE) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' stored as textfile; | |
| OK | |
| Time taken: 0.261 seconds | |
| hive> LOAD DATA INPATH 'maprfs:/home/thanooj/emp_sal.txt' INTO TABLE emp_sal; | |
| Loading data to table thanooj.emp_sal | |
| Table thanooj.emp_sal stats: [numFiles=1, numRows=0, totalSize=139, rawDataSize=0] | |
| OK | |
| Time taken: 0.504 seconds | |
| hive> select * from emp_sal; | |
| OK |
| $ ls -ltr | |
| total 4 | |
| -rw-r--r-- 1 thanooj users 3221 Mar 20 05:36 words_count.txt | |
| $ vi words_count.txt | |
| $ cat words_count.txt | |
| Apache Sqoop is a tool designed for efficiently transferring data between structured, semi-structured and unstructured data sources. | |
| Relational databases are examples of structured data sources with well defined schema for the data they store. | |
| Cassandra, Hbase are examples of semi-structured data sources. | |
| HDFS is an example of unstructured data source that Sqoop can support. | |
| With Sqoop, you can import data from a relational database system or a mainframe into HDFS |
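The file name suggests this text is the input for a word-count exercise. Before handing it to a cluster, the expected counts can be approximated with standard Unix tools. The following is a minimal sketch, not the author's method: the function name `count_words` and the temporary sample file are illustrative, and hyphenated words such as `semi-structured` are split into two tokens.

```shell
# Hypothetical helper: print "word<TAB>count" for every word in a file.
# Words are runs of letters and digits, compared case-insensitively.
count_words() {
  tr -cs '[:alnum:]' '\n' < "$1" |
    tr '[:upper:]' '[:lower:]' |
    sort | uniq -c |
    awk 'NF == 2 { print $2 "\t" $1 }'
}

# Small sample taken from two lines of the text above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Cassandra, Hbase are examples of semi-structured data sources.
HDFS is an example of unstructured data source that Sqoop can support.
EOF

count_words "$sample"
rm -f "$sample"
```

Running `count_words words_count.txt` in the directory listed above would apply the same pipeline to the full file.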
| package com; | |
| import java.util.ArrayList; | |
| import java.util.Arrays; | |
| import java.util.Collections; | |
| import java.util.List; | |
| import java.util.Random; | |
| public class Deck { |
| pwd | |
| current_pwd=$(pwd) | |
| cd /home/thanooj/work | |
| pwd | |
| ls -ltr | |
| # Read the list one path per line so names containing spaces are handled correctly. | |
| while IFS= read -r f; do | |
| rm -- "$f" | |
| done < /home/thanooj/files/symlink_localized_file_list.txt | |
| ls -ltr | |
| cd "$current_pwd" |
| Apache Hive - Installation and Configuration | |
| UBUNTU 14.04 LTS | |
| JAVA - Oracle JDK 8 | |
| HADOOP 2.7.3 | |
| HIVE 2.1.1 | |
| MySQL 5.5 server | |
| 1. | |
| https://hive.apache.org/downloads.html |
| sudo apt-get install ssh | |
| sudo apt-get install rsync | |
| sudo apt install openssh-client | |
| sudo apt install openssh-server | |
| ssh localhost | |
| ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa | |
| cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys |
| Spark SQL is a Spark module for structured data processing. Unlike the basic Spark RDD API, the Spark SQL interfaces give Spark more information about the structure of both the data and the computation being performed. Internally, Spark SQL uses this extra information to perform additional optimizations. There are several ways to interact with Spark SQL, including SQL and the Dataset API. When computing a result, the same execution engine is used regardless of which API or language expresses the computation. This unification means that developers can easily switch back and forth between APIs, choosing whichever provides the most natural way to express a given transformation. |
| PS C:\Users\thanooj> docker pull mysql | |
| PS C:\Users\thanooj> docker images | |
| REPOSITORY TAG IMAGE ID CREATED SIZE | |
| springio/gs-spring-boot-docker latest c8778cb72ef5 5 days ago 527MB | |
| openjdk 8 b190ad78b520 10 days ago 510MB | |
| mysql latest be0dbf01a0f3 11 days ago 541MB | |
| hello-world latest bf756fb1ae65 5 months ago 13.3kB | |
| PS C:\Users\thanooj> | |
| PS C:\Users\thanooj> docker container ls -a | |
| CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES |