Jeongho Park jeongho

  • Deception Island, Antarctica
@jeongho
jeongho / run_testdfsio.sh
Created February 4, 2016 18:04
Hadoop benchmark 3. run testdfsio
#!/bin/bash
# TestDFSIO runs with a total file size of 1TB across different dfs.block.size values.
# Usage: TestDFSIO [genericOptions] -read | -write | -append | -clean [-nrFiles N] [-fileSize Size[B|KB|MB|GB|TB]] [-resFile resultFileName] [-bufferSize Bytes] [-rootDir]
#
# The test is designed with two variables
# 1) file_sizes_mb: file size variation with 1GB file x 1,000 = 1TB and 100MB file x 10,000 = 1TB
# this is to test large file and small file impact on HDFS
# 2) dfs.block.size (MB) variation: 512, 256, 128, 50, 10
# this is to test impact of different block sizes.
#
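The two variables described in the header comments form a small test matrix. The following is a minimal dry-run sketch of that matrix, not the author's script: it only prints the commands it would run. The jar file name, the helper names `nr_files_for` and `build_cmd`, and the choice to treat 1GB as 1,000 MB are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical jar name; the real tests jar path depends on the distribution.
JAR="hadoop-mapreduce-client-jobclient-tests.jar"

# nr_files_for <file_size_mb>: number of files needed to total 1,000,000 MB (~1TB).
nr_files_for() {
  echo $(( 1000000 / $1 ))
}

# build_cmd <mode> <file_size_mb> <block_size_mb>: print one TestDFSIO command.
# dfs.block.size is passed in bytes as a generic -D option.
build_cmd() {
  local mode=$1 file_mb=$2 block_mb=$3
  echo "hadoop jar $JAR TestDFSIO -D dfs.block.size=$(( block_mb * 1024 * 1024 ))" \
       "-$mode -nrFiles $(nr_files_for "$file_mb") -fileSize ${file_mb}MB"
}

# Dry run: print the write command for every file size / block size pair.
for file_mb in 1000 100; do
  for block_mb in 512 256 128 50 10; do
    build_cmd write "$file_mb" "$block_mb"
  done
done
```

Printing the commands first makes it easy to review the ten combinations before committing cluster time to them.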
jeongho / graphite_collectd.txt
Created February 4, 2016 18:08
graphite+collectd on CentOS
yum list installed | grep -i "graphite\|carbon\|whisper"
graphite-web.noarch 0.9.12-5.el6 @epel
graphite-web-selinux.noarch 0.9.12-5.el6 @epel
python-carbon.noarch 0.9.12-3.el6.1 @epel
python-whisper.noarch 0.9.12-1.el6 @epel
Graphite Install
1. Install dependencies
ansible-playbook -i hosts update_yum.yml
jeongho / haproxy_cloudera.cfg
Last active October 19, 2020 02:30
haproxy config for Cloudera
#yum install haproxy
#configure haproxy-cloudera.cfg
#haproxy -f /etc/haproxy/haproxy-cloudera.cfg
#http://seawolf-3.vpc.wonderland.com:1936/
#https://cbonte.github.io/haproxy-dconv/
global
daemon
nbproc 1
maxconn 100000
jeongho / solr_tips.txt
Created February 25, 2016 17:34
Solr CLI 1) add offline shard 2) add replica
------------------------------
# solr search string with NULL values
*:* -field_name:[* TO *]
------------------------------
# solr add offline shard
solrctl --solr http://<target_solr_server>:8983/solr \
core --create <core_name> \
-p dataDir=<index_hdfs_path> \
-p collection.configName=<config_name> \
jeongho / mysql_backup.sh
Last active March 8, 2017 22:56
mysql backup cron job
#!/usr/bin/env bash
#
# Schedule in cron for 11:00 PM local time (e.g. for PST, 07:00 UTC: '00 07 * * * bash /home/antarctica/backup_cloudera_db.sh')
user=root
password=root_password
database=scm
archive_days=7
# Backing Up MySQL Databases
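The preview ends before the backup logic, so what follows is only a sketch of how `database` and `archive_days` could drive a dump-and-prune cycle. The backup directory, the file naming scheme, and the helper names `backup_file`, `run_backup`, and `prune_old` are assumptions, not the author's code.

```shell
#!/usr/bin/env bash
user=root
password=root_password
database=scm
archive_days=7
# Hypothetical backup location; not taken from the original script.
backup_dir=/tmp/scm_backup_sketch

# backup_file <date_stamp>: path of the compressed dump for a given day.
backup_file() {
  echo "$backup_dir/${database}_$1.sql.gz"
}

# run_backup: dump the database and compress it into today's file.
run_backup() {
  mkdir -p "$backup_dir"
  mysqldump --user="$user" --password="$password" --single-transaction "$database" \
    | gzip > "$(backup_file "$(date +%Y%m%d)")"
}

# prune_old: delete dumps whose modification time is more than archive_days days old.
prune_old() {
  find "$backup_dir" -name "${database}_*.sql.gz" -mtime +"$archive_days" -print -delete
}
```

A cron entry would then call `run_backup` followed by `prune_old`; neither is invoked here, so the block has no side effects when sourced.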
jeongho / check_public_ip.sh
Created February 25, 2016 17:40
check public ip
#!/usr/bin/env bash
curl -s checkip.dyndns.org|sed -e 's/.*Current IP Address: //' -e 's/<.*$//'
#wget -qO- http://ipecho.net/plain ; echo
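To see what the two sed expressions in the curl one-liner do without making a network call, they can be run against a canned response. The sample body below is an assumed shape of the service's reply, the address is a documentation-only address, and `extract_ip` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# extract_ip: the same two sed expressions as the curl pipeline above.
# 1) drop everything up to and including "Current IP Address: "
# 2) drop everything from the first remaining "<" to the end of the line
extract_ip() {
  sed -e 's/.*Current IP Address: //' -e 's/<.*$//'
}

# Assumed response shape; 203.0.113.7 is a reserved documentation address.
sample='<html><head><title>Current IP Check</title></head><body>Current IP Address: 203.0.113.7</body></html>'

printf '%s\n' "$sample" | extract_ip   # prints 203.0.113.7
```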
jeongho / hadoop_client_setup.sh
Created February 25, 2016 17:43
Using the CLI to access the cluster from your own host
#!/usr/bin/env bash
#Using the CLI to access the cluster from your own host
#Step 1. Set up your Hadoop config
#Cloudera Manager UI, Services>All Services>Client Configuration URLs
#Step 2. Download CDH4 and set up your environment
#1. Point your browser at CDH Tarballs
#2. Click on CDH4 tarballs and download hadoop-2-x
#3. Update your environment (~/.bash_profile is a good bet)
jeongho / hdfs_tmp_cleanup.sh
Last active November 2, 2021 22:55
hdfs tmp folder cleanup
#!/usr/bin/env bash
#remove files older than X days:
#based off the hadoop fs -ls
#days=5; for f in $(cutoff=$(echo $(date +%s)"-$days*24*60*60" | bc); hadoop fs -ls -R /tmp 2>/dev/null|grep ^- |awk '{ print "echo $(date -d \""$6,$7"\" +%s)" , $8}'| bash | awk -v cutoff=$cutoff '$1 < cutoff'| sort -n | cut -f2 -d" "|grep ^$d); do hadoop fs -rm $f; done
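The one-liner above computes a cutoff timestamp and keeps only regular files whose modification time is earlier. The sketch below isolates that filter so it can be tried on sample `hadoop fs -ls -R` output without a cluster. The function name `older_than`, its optional second argument, and the sample listing format are assumptions; it relies on GNU `date -d`.

```shell
#!/usr/bin/env bash
# older_than <days> [now_epoch]: read `hadoop fs -ls -R` style lines on stdin
# and print the paths of regular files modified more than <days> days ago.
older_than() {
  local days=$1 now=${2:-$(date +%s)}
  local cutoff=$(( now - days * 24 * 60 * 60 ))
  # Lines starting with "-" are regular files; "d" lines are directories.
  grep '^-' | while read -r perms repl owner group size day time path; do
    ts=$(date -d "$day $time" +%s) || continue
    if [ "$ts" -lt "$cutoff" ]; then
      echo "$path"
    fi
  done
}

# Possible use against a live cluster (not run here):
#   hadoop fs -ls -R /tmp 2>/dev/null | older_than 5 | while read -r f; do hadoop fs -rm "$f"; done
```

Reading the path as the last `read` variable keeps paths that contain spaces intact, which a fixed `$8` field in awk would truncate.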
#remove files older than X days:
days=5;
for f in $(cutoff=$(echo $(date +%s)"-$days*24*60*60" | bc);
hadoop fs -ls -R /tmp 2>/dev/null | grep ^- | \
jeongho / cgroup_config.txt
Last active August 3, 2016 17:36
cgroup configuration
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide
sudo yum install libcgroup
sudo service cgconfig start
sudo chkconfig cgconfig on
lscgroup
cat /etc/cgconfig.d/antarcticatec-server
group antarcticatec-server {
cpu {
jeongho / pin_centos6.7.txt
Created August 3, 2016 16:05
Pin CentOS repository to 6.7 to prevent yum update from moving to 6.8
1. disable Base repo
sed -i.bak '/^gpgcheck=1/ a enabled=0 ' /etc/yum.repos.d/CentOS-Base.repo
2. append Vault repo with CentOS 6.7
#-----------------
[C6.7-base]
name=CentOS-6.7 - Base
baseurl=http://vault.centos.org/6.7/os/$basearch/
gpgcheck=1
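
Before editing /etc/yum.repos.d/CentOS-Base.repo, the sed command from step 1 can be tried on a scratch file to confirm that it adds `enabled=0` to every section. This is a sketch assuming GNU sed; the scratch file contents are a simplified stand-in for the real repo file.

```shell
#!/usr/bin/env bash
# Scratch copy with two sections, mimicking the layout of a .repo file.
repo=$(mktemp)
cat > "$repo" <<'EOF'
[base]
name=CentOS-$releasever - Base
gpgcheck=1

[updates]
name=CentOS-$releasever - Updates
gpgcheck=1
EOF

# Append "enabled=0" after each gpgcheck=1 line, keeping the original as <file>.bak.
sed -i.bak '/^gpgcheck=1/ a enabled=0' "$repo"

cat "$repo"
```

Because the match is on `gpgcheck=1`, any section that sets `gpgcheck=0` or omits the key would stay enabled and needs a manual check.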