@poros
Created July 21, 2015 22:46
Extract ALL status updates from wall.htm in your downloaded copy of your Facebook data
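import sys

from bs4 import BeautifulSoup

# Usage: python <script> wall.htm output.txt
# Open in binary mode so BeautifulSoup can detect the file's encoding itself.
with open(sys.argv[1], "rb") as wall_file:
    wall = BeautifulSoup(wall_file, "html.parser")

# Each status update sits in an element with class "comment".
# get_text() also covers elements with nested tags, where .string would be None.
comments = [div.get_text() for div in wall.find_all(class_="comment")]

# Write the updates as UTF-8, each followed by a separator line of hashes.
with open(sys.argv[2], "w", encoding="utf-8") as out_file:
    for comment in comments:
        out_file.write("%s\n#######\n" % comment)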
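If installing Beautiful Soup is not an option, the same idea can be sketched with only the standard library's `html.parser`. This is a rough alternative, not the script's own method: `CommentExtractor`, `extract_comments`, and the sample markup are hypothetical names and data made up for illustration, and the depth counting assumes reasonably well-formed HTML in which every non-void tag is closed.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class CommentExtractor(HTMLParser):
    """Collect the text of every element whose class list contains "comment"."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.comments = []   # finished comment texts
        self._depth = 0      # nesting depth inside the current comment element
        self._parts = []     # text fragments of the current comment

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            # Already inside a comment element: just track nesting.
            self._depth += 1
        elif "comment" in (dict(attrs).get("class") or "").split():
            # Entering a new comment element.
            self._depth = 1
            self._parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # The comment element just closed: store its collected text.
            self.comments.append("".join(self._parts))

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)


def extract_comments(html):
    """Return the text of each class="comment" element found in html."""
    parser = CommentExtractor()
    parser.feed(html)
    parser.close()
    return parser.comments


# Made-up sample markup, not copied from a real Facebook export.
sample = ('<div class="meta">Monday</div>'
          '<div class="comment">Hello <b>world</b></div>'
          '<p class="comment">Second</p>')
print(extract_comments(sample))
```

Unlike `find_all`, this sketch folds a comment element nested inside another comment element into the outer one, so Beautiful Soup remains the more forgiving choice for messy markup.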