@SKaplanOfficial
Last active January 18, 2023 21:56
PyXA script to classify email subject lines as junk or other using Latent Semantic Mapping
import PyXA  # Version 0.2.0

app = PyXA.Application("Mail")

# Training data: subject lines from the first account's Junk and INBOX mailboxes
dataset = {
    "junk": app.accounts()[0].mailboxes().by_name("Junk").messages().subject(),
    "other": app.accounts()[0].mailboxes().by_name("INBOX").messages().subject(),
}

queries = [
    "Amazon Web Services Billing Statement Available",
    "Complete my registration form asap receive your rewards",
]

lsm = PyXA.XALSM(dataset)
labels = list(dataset.keys())

for query in queries:
    # Category numbers are treated as 1-based here, hence the "- 1" when indexing labels
    category = labels[lsm.categorize_query(query)[0][0] - 1]
    print(query, "- category:", category)

# Example output (depends on the contents of your mailboxes):
# Amazon Web Services Billing Statement Available - category: other
# Complete my registration form asap receive your rewards - category: junk
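The `[0][0] - 1` indexing in the script is easy to misread, so here is a minimal sketch of that lookup pulled out into a helper. It assumes the result is a list of `(category_number, score)` pairs, best match first, with 1-based category numbers, which is what the script implies but is not confirmed here. `label_for` and the mock result are hypothetical, written for illustration only, and are not part of PyXA.

```python
def label_for(result, labels):
    """Return the label for the top-ranked category, or None if unavailable.

    Assumption: `result` is a list of (category_number, score) pairs with
    1-based category numbers, as the "- 1" in the script implies.
    """
    if not result:
        return None
    index = result[0][0] - 1  # convert the 1-based category number to a list index
    if not 0 <= index < len(labels):
        return None
    return labels[index]


labels = ["junk", "other"]
mock_result = [(2, 0.83), (1, 0.17)]  # made-up values, not real LSM output
print(label_for(mock_result, labels))  # prints "other"
```

Returning `None` for an empty or out-of-range result avoids an `IndexError` when a query cannot be categorized, at the cost of requiring the caller to handle that case.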