You can run the script above under GNU parallel to speed things up, but you need to do some prep work first.
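- Split the large file into 100 smaller files. Use the l/ form (GNU split) so no line, and therefore no domain, is cut in half at a chunk boundary:
split -n l/100 domains.txt domains_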
- Make a list of the smaller files and save it (ls prints one name per line when its output is redirected, so there is no need to parse ls -l):
ls domains_* > dom_files.txt
- Run the script with parallel:
parallel -a dom_files.txt -j 10 ./strip.py
- Cat all of the domains_*_strip.txt files together:
cat domains_*_strip.txt > domains_stripped.txt
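If GNU parallel isn't available, the same split / run / merge steps can be sketched in Python. This is only a rough equivalent, not the method above: the helper names (split_by_lines, run_parallel, merge) are made up for illustration, and it assumes the script takes one file name as its argument and writes its result to <input>_strip.txt, as the step names above suggest.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def split_by_lines(src, prefix, n_chunks):
    """Write src into up to n_chunks files without cutting any line in half."""
    # Reads the whole file into memory; fine for a sketch, stream it if it is huge.
    lines = Path(src).read_text().splitlines(keepends=True)
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    chunks = []
    for i in range(0, len(lines), size):
        chunk = Path(f"{prefix}{i // size:03d}")
        chunk.write_text("".join(lines[i:i + size]))
        chunks.append(chunk)
    return chunks


def run_parallel(command, chunks, jobs=10):
    """Run `command <chunk>` for every chunk, at most `jobs` at a time."""
    def run_one(chunk):
        # Each thread only waits on a child process, so threads are enough here.
        subprocess.run([*command, str(chunk)], check=True)

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        list(pool.map(run_one, chunks))  # list() surfaces any failed run


def merge(chunks, dest, suffix="_strip.txt"):
    """Concatenate the per-chunk outputs in chunk order (assumed output naming)."""
    with open(dest, "w") as out:
        for chunk in chunks:
            out.write(Path(f"{chunk}{suffix}").read_text())


# Usage, mirroring the steps above:
#   parts = split_by_lines("domains.txt", "domains_", 100)
#   run_parallel(["./strip.py"], parts, jobs=10)
#   merge(parts, "domains_stripped.txt")
```

Merging from the list of chunks rather than a glob keeps the output in the original order and avoids picking up stray *_strip.txt files left over from an earlier run.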