Steve McLaughlin stevemclaugh

Install FFmpeg with MP3 support on macOS

Initial Setup

If Homebrew is already installed, skip to "Install FFmpeg" below.

Launch Terminal in macOS, located at /Applications/Utilities/Terminal.app. To install a bundle of command-line tools provided by Apple, paste the following line into the terminal window and press return. When prompted, click "Install," then "Agree."

xcode-select --install
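
Before working through the setup, it can help to see which tools are already on the PATH, since the Homebrew step can be skipped if `brew` is present. What follows is a minimal Python sketch, not part of the original instructions. The helper name `missing_tools` is hypothetical, and the check only looks for executables by name.

```python
import shutil


def missing_tools(names):
    """Return the names that cannot be found as executables on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # brew is Homebrew's command; ffmpeg is the tool this guide installs.
    for name in missing_tools(["brew", "ffmpeg"]):
        print(f"{name} not found on PATH")
```

If nothing is printed, both commands were found and the matching install steps can likely be skipped.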

Common youtube-dl commands

Download highest-quality copy of a video to the current directory:

youtube-dl http://some/video

Download video and convert to MP4 (also supports flv, ogg, webm, mkv, avi):
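
The command for this step is not shown here. As a sketch only, assuming the step relies on youtube-dl's `--recode-video` option (which accepts the formats listed above), the two commands could be wrapped in Python as follows. The function names `youtube_dl_command` and `download` are hypothetical.

```python
import subprocess

# Formats listed in the text above.
RECODE_FORMATS = ("mp4", "flv", "ogg", "webm", "mkv", "avi")


def youtube_dl_command(url, recode_format=None):
    """Build the youtube-dl argument list, optionally converting the result."""
    command = ["youtube-dl"]
    if recode_format is not None:
        if recode_format not in RECODE_FORMATS:
            raise ValueError(f"unsupported format: {recode_format}")
        # Assumption: --recode-video is the option the missing command used.
        command += ["--recode-video", recode_format]
    command.append(url)
    return command


def download(url, recode_format=None):
    """Run youtube-dl in the current directory; raises if it exits non-zero."""
    subprocess.run(youtube_dl_command(url, recode_format), check=True)
```

Calling `download("http://some/video")` would mirror the plain command above, and `download("http://some/video", "mp4")` would request the MP4 conversion. Both need youtube-dl on the PATH, and the conversion also needs FFmpeg.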

Command Line Setup Steps for macOS

Launch Terminal in macOS, located at /Applications/Utilities/Terminal.app

If your current account doesn't have admin privileges, switch users by entering the following command (substituting the username of the admin account). Press return and enter your password at the prompt.

su your_admin_name

If you don't remember your admin username, the following command will list the users on your system.

Install MongoDB and Python wrapper

sudo apt-get install -y mongodb-org
python3 -m pip install -U pymongo

Start MongoDB daemon

Scraping a page with a headless browser in Python: Selenium WebDriver + PhantomJS

Install dependencies in the bash shell

pip3 install -U selenium

# macOS
brew install phantomjs

Download a list of URLs

wget --wait=0.2 --random-wait --no-check-certificate --page-requisites -erobots=off --tries="inf" -c --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0" -i /path/to/list_of_urls.txt
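
For readers who would rather script this step, here is a rough standard-library Python sketch of the same idea: read a list of URLs, fetch each one with the same user-agent string, and pause a randomized interval between requests. It is a simplification, not an equivalent of the wget command. It does not fetch page requisites, resume partial downloads, retry on failure, or skip certificate checks. All function names are hypothetical.

```python
import random
import time
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

USER_AGENT = ("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) "
              "Gecko/20100101 Firefox/41.0")


def read_url_list(path):
    """Return the non-empty lines of a URL list file, one URL per line."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def random_wait(base=0.2):
    """Pick a delay between 0.5x and 1.5x the base wait, like --random-wait."""
    return random.uniform(0.5 * base, 1.5 * base)


def local_name(url):
    """Derive a file name from the URL path, falling back to index.html."""
    name = Path(urlparse(url).path).name
    return name or "index.html"


def download_all(list_path, dest_dir=".", base_wait=0.2):
    """Fetch every URL in the list into dest_dir, pausing between requests."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for url in read_url_list(list_path):
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(request) as response:
            (dest / local_name(url)).write_bytes(response.read())
        time.sleep(random_wait(base_wait))
```

A call such as `download_all("/path/to/list_of_urls.txt", "downloads")` would save each file under `downloads/`. Files that share a name overwrite one another, which wget avoids by recreating the directory structure.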

Recursively download a full website

wget -r --wait=0.2 --random-wait --no-check-certificate --page-requisites -erobots=off --tries="inf" -c --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0" http://principalhand.org

Install MongoDB and Python wrapper (Ubuntu Linux)

sudo apt-get install -y mongodb-org
pip install pymongo
pip3 install pymongo

	#!/usr/bin/python3

	from selenium import webdriver
	import time
	import random
	import os
	import csv

	url = 'http://egazette.bih.nic.in/SearchAdvanceGazette.aspx'