@roycewilliams
Created June 22, 2021 02:31
strip-mbox-attachments.py
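#!/usr/bin/env python
# Strips attachments from mbox, writes to new mbox.
# Does not load entire old mbox into memory first - useful for very large mboxes.
# Ref: https://ask.slashdot.org/story/11/12/04/1754257/ask-slashdot-handling-and-cleaning-up-a-large-personal-email-archive
# Ref: https://ask.slashdot.org/comments.pl?sid=2557794&cid=38260428
from __future__ import print_function
from mailbox import mbox, mboxMessage

print("Reading mailbox ...\n")
orig_mb = mbox('old_mbox_filename')  # placeholder: path to the existing mbox
new_mb = mbox('new_mbox_filename')   # placeholder: path to the mbox to write

print("Analyzing ...\n")
# iteritems() yields one message at a time instead of loading the whole mbox.
for key, msg in orig_mb.iteritems():
    print('.', end='')  # one dot per message processed
    new_msg = mboxMessage()
    payload = msg.get_payload()
    if msg.is_multipart():
        # Keep only the first subpart (usually the body); later parts,
        # including attachments, are dropped.
        payload = payload[0].get_payload()
        print('x', end='')  # mark messages that were stripped
    # Copy headers pair by pair so repeated headers (e.g. Received) keep
    # every value; msg[header] would return only the first one each time.
    for name, value in msg.items():
        new_msg[name] = value
    new_msg.set_payload(payload)
    new_mb.add(new_msg)
new_mb.flush()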
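
The script keeps the payload of the first subpart but copies the original headers, so a stripped message can still carry a `multipart/...` Content-Type while holding a single body. A possible variant is sketched below. It is not part of the original gist: the helper name `strip_attachments` and the `BODY_HEADERS` tuple are hypothetical, and it assumes you want the first `text/plain` part that has no filename, with that part's own Content-Type and Content-Transfer-Encoding.

```python
from mailbox import mboxMessage

# Headers that describe the body; taken from the kept part, not the container.
# (Hypothetical name, chosen for this sketch.)
BODY_HEADERS = ('content-type', 'content-transfer-encoding')


def strip_attachments(msg):
    """Return a new mboxMessage holding only the first text/plain body of msg.

    Assumption: a part with a filename is an attachment and is skipped.
    If no text/plain part exists, the message is copied unchanged.
    """
    body = None
    for part in msg.walk():  # walk() also descends into nested multiparts
        if part.is_multipart():
            continue
        if part.get_content_type() == 'text/plain' and part.get_filename() is None:
            body = part
            break
    if body is None:
        return mboxMessage(msg)

    new_msg = mboxMessage()
    # Copy every non-body header, preserving repeated headers such as Received.
    for name, value in msg.items():
        if name.lower() not in BODY_HEADERS:
            new_msg[name] = value
    # Take the body-describing headers from the part that is kept.
    for name, value in body.items():
        if name.lower() in BODY_HEADERS:
            new_msg[name] = value
    # The payload is left in its transfer encoding, matching the copied header.
    new_msg.set_payload(body.get_payload())
    return new_msg
```

In the loop above, `new_mb.add(strip_attachments(msg))` would take the place of building `new_msg` by hand. Note that `mboxMessage()` starts with a default "From " line, so the original envelope line is not preserved by this sketch.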