Steve McLaughlin stevemclaugh

stevemclaugh / Extract_MFCCs.ipynb
Last active April 12, 2024 19:49
Using LibRosa to extract MFCCs from audio and visualize the results
stevemclaugh / Lightweight Speech to Text with PocketSphinx.ipynb
Created December 20, 2016 05:17
Simple, fast, reasonably accurate speech-to-text processing for audio recordings of speech.
stevemclaugh / FFmpeg_with_mp3_support_macOS.md
Last active April 4, 2024 16:06
Instructions for installing the command-line media conversion program FFmpeg with MP3 support on macOS.

Install FFmpeg with MP3 support on macOS

Initial Setup

If Homebrew is already installed, skip to "Install FFmpeg" below.

Launch Terminal in macOS, located at /Applications/Utilities/Terminal.app. To install a bundle of command-line tools provided by Apple, paste the following line into the terminal window and press return. When prompted, click "Install," then "Agree."

xcode-select --install
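
Before running any installers, it can help to see which tools are already on the PATH. The following is a minimal sketch, not part of the original instructions; the helper name `missing_tools` and the default tool list are assumptions chosen for illustration.

```python
import shutil


def missing_tools(tools=("xcode-select", "brew", "ffmpeg")):
    """Return the names from `tools` that are not found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found; the setup steps can likely be skipped.")
```

If `brew` is not in the reported list, Homebrew is already available and the initial setup can be skipped.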

Common youtube-dl commands

Download highest-quality copy of a video to the current directory:

youtube-dl http://some/video

Download video and convert to MP4 (also supports flv, ogg, webm, mkv, avi):
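
The conversion command itself is cut off in this listing. As a hedged sketch, the two cases above could be wrapped in Python as follows, assuming youtube-dl's `--recode-video` option is the one intended (its supported formats match the list above). The function names `build_command` and `download` are hypothetical.

```python
import subprocess

# Formats accepted by youtube-dl's --recode-video option.
RECODE_FORMATS = ("mp4", "flv", "ogg", "webm", "mkv", "avi")


def build_command(url, recode_to=None):
    """Build the youtube-dl argument list for a plain or recoded download."""
    cmd = ["youtube-dl"]
    if recode_to is not None:
        if recode_to not in RECODE_FORMATS:
            raise ValueError(f"unsupported format: {recode_to}")
        cmd += ["--recode-video", recode_to]
    cmd.append(url)
    return cmd


def download(url, recode_to=None):
    """Run youtube-dl in the current directory; raises if it exits non-zero."""
    subprocess.run(build_command(url, recode_to), check=True)
```

Calling `download("http://some/video", recode_to="mp4")` would run the recoding variant; recoding relies on FFmpeg being installed.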

Command Line Setup Steps for macOS

Launch Terminal in macOS, located at /Applications/Utilities/Terminal.app

If your current account doesn't have admin privileges, switch users by entering the following command (substituting the username of the admin account). Press return and enter your password at the prompt.

su your_admin_name

If you don't remember your admin username, the following command will list the users on your system.

Install MongoDB and Python wrapper

sudo apt-get install -y mongodb-org
python3 -m pip install -U pymongo

Start MongoDB daemon
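
Once the daemon has been started, a quick way to confirm it is accepting connections is to probe its port. This is a minimal standard-library sketch, assuming MongoDB is listening on its default port 27017 on the local machine; the helper name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("mongod reachable:", port_is_open())
```

A `True` result only shows that something is listening on the port; connecting with `pymongo.MongoClient()` is the fuller check.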

Scraping a page with a headless browser in Python: Selenium WebDriver + PhantomJS

Install dependencies in the bash shell

pip3 install -U selenium

# macOS
brew install phantomjs

Download a list of URLs

wget --wait=0.2 --random-wait --no-check-certificate --page-requisites -erobots=off --tries="inf" -c --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0" -i /path/to/list_of_urls.txt

Recursively download a full website

wget -r --wait=0.2 --random-wait --no-check-certificate --page-requisites -erobots=off --tries="inf" -c --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0" http://principalhand.org
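
The two wget invocations above differ only in the `-r` flag and in whether the target is a URL or a file of URLs. As a sketch of that shared structure, the argument list could be assembled in Python like this; `build_wget_command` and its parameters are hypothetical names, while the flags are copied from the commands above.

```python
import subprocess

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) "
    "Gecko/20100101 Firefox/41.0"
)


def build_wget_command(target, from_file=False, recursive=False, wait=0.2):
    """Build a wget argument list for a URL or a text file of URLs."""
    cmd = ["wget"]
    if recursive:
        cmd.append("-r")  # follow links and mirror the site
    cmd += [
        f"--wait={wait}",          # pause between requests
        "--random-wait",           # vary the pause
        "--no-check-certificate",  # skip TLS certificate validation
        "--page-requisites",       # also fetch images, CSS, and scripts
        "-e", "robots=off",        # do not honor robots.txt
        "--tries=inf",             # retry indefinitely
        "-c",                      # resume partial downloads
        f"--user-agent={USER_AGENT}",
    ]
    # -i reads URLs from a file; otherwise the target is a single URL.
    cmd += ["-i", target] if from_file else [target]
    return cmd


def run_wget(target, **options):
    """Run wget with the flags above; raises if wget exits non-zero."""
    subprocess.run(build_wget_command(target, **options), check=True)
```

Passing the arguments as a list avoids shell quoting issues with the user-agent string.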

Install MongoDB and Python wrapper (Ubuntu Linux)

sudo apt-get install -y mongodb-org
pip install pymongo
pip3 install pymongo

Start MongoDB daemon

stevemclaugh / Gazette_of_India_scrape.py
Last active September 24, 2024 18:39
Scraping The Gazette of India with Selenium + ChromeDriver in Python
#!/usr/bin/python3
from selenium import webdriver
import time
import random
import os
import csv
url = 'http://egazette.bih.nic.in/SearchAdvanceGazette.aspx'