@theikkila
Last active December 18, 2017 20:29
email_backend.md

Enron Mail Server - GraphQL server exercise

The Enron scandal, publicized in October 2001, eventually led to the bankruptcy of the Enron Corporation, an American energy company based in Houston, Texas, and the de facto dissolution of Arthur Andersen, which was one of the five largest audit and accountancy partnerships in the world. In addition to being the largest bankruptcy reorganization in American history at that time, Enron was cited as the biggest audit failure. (Source: Wikipedia, https://en.wikipedia.org/wiki/Enron_scandal)

In this exercise, your task is to build an email server for storing the Enron emails.

You can choose your tooling fairly freely, but the API should expose the following methods, or equivalent ways to accomplish these tasks, at the root level:

  • Show all emails received by a single user, identified by email address, e.g. mailboxOf([email protected]) { .. }
  • Show all emails sent by a single user, identified by email address, e.g. sentMailsOf([email protected]) { .. }
  • Full-text search of the emails over the subject, cc, recipients, sender, bcc or text fields, e.g. search(Rosalee Fleming) { .. }

Mail and User should be distinct models inside the API.

User

Should have the following fields/resolvables:

  • address, which equals the fully qualified email address of the User
  • received (relation), a list of the Mails the User received, including those where the User appears in the cc or bcc fields
  • sent (relation), a list of the Mails the User sent
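The two relations above can be sketched as a pair of in-memory lookups. This is only an illustration, not a prescribed design: the function name `build_user_index`, the dict-shaped mails, and the `example.com` addresses are assumptions made for the example.

```python
from collections import defaultdict


def build_user_index(mails):
    """Group mail ids by address for the `received` and `sent` relations.

    `mails` is assumed to be an iterable of dicts with the keys
    id, sender, recipients, cc and bcc (hypothetical in-memory shape).
    """
    received = defaultdict(list)
    sent = defaultdict(list)
    for mail in mails:
        sent[mail["sender"]].append(mail["id"])
        # A User receives a Mail when listed in recipients, cc or bcc.
        # The set keeps an address that appears in several of those
        # fields from getting the same Mail twice.
        addressees = set(mail["recipients"]) | set(mail["cc"]) | set(mail["bcc"])
        for address in addressees:
            received[address].append(mail["id"])
    return received, sent


# Hypothetical sample data, not taken from the dataset.
sample_mails = [
    {"id": "1", "sender": "a@example.com", "recipients": ["b@example.com"],
     "cc": ["b@example.com", "c@example.com"], "bcc": []},
    {"id": "2", "sender": "b@example.com", "recipients": ["a@example.com"],
     "cc": [], "bcc": ["c@example.com"]},
]
received, sent = build_user_index(sample_mails)
```

A resolver for `received` or `sent` could then look up the User's address in the matching dict and load the Mails by id. A real server would more likely push this lookup into the storage layer.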

Mail

Should have the following fields/resolvables:

  • id, a scalar, the unique identifier of the Mail
  • text, mail content
  • sender, a relation to User who sent the Mail
  • recipients, a relation to Users that are direct recipients of the message
  • cc, a relation to Users that are carbon copy recipients of the message
  • bcc, a relation to Users that are blind carbon copy recipients of the message
  • date, a strictly defined scalar
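One possible reading of "strictly defined scalar" for date is a custom scalar that accepts only timestamps with an explicit UTC offset and always serializes them as ISO 8601. The sketch below assumes that reading; the function names `parse_date` and `serialize_date` are hypothetical and not tied to any particular GraphQL library.

```python
from datetime import datetime


def parse_date(value: str) -> datetime:
    """Parse a dataset timestamp such as '2000-08-10 03:27:00-07:00'.

    Raises ValueError for malformed input or for timestamps that
    carry no UTC offset (the assumed strictness rule).
    """
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError(f"date must include a UTC offset: {value!r}")
    return parsed


def serialize_date(value: datetime) -> str:
    """Serialize as ISO 8601 with a 'T' separator and the UTC offset."""
    return value.isoformat()


example = serialize_date(parse_date("2000-08-10 03:27:00-07:00"))
```

Most GraphQL server libraries let you register a pair of functions like these as the parse and serialize hooks of a custom scalar; how that registration looks depends on the library you choose.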

Dataset

http://jsonstudio.com/wp-content/uploads/2014/02/enron.zip

Notes about the data

{
  "_id": {
    "$oid": "52af48b5d55148fa0c19964c"
  },
  "sender": "[email protected]",
  "recipients": [
    "[email protected]"
  ],
  "cc": [],
  "text": "Liz, I don't know how the address shows up when sent, but they tell us it's \n[email protected].\n\nTalk to you soon, I hope.\n\nRosie",
  "mid": "32285792.1075840285818.JavaMail.evans@thyme",
  "fpath": "enron_mail_20110402/maildir/lay-k/_sent/108.",
  "bcc": [],
  "to": [
    "[email protected]"
  ],
  "replyto": null,
  "ctype": "text/plain; charset=us-ascii",
  "fname": "108.",
  "date": "2000-08-10 03:27:00-07:00",
  "folder": "_sent",
  "subject": "KLL's e-mail address"
}

You can ignore the folder, ctype, replyto, fname, fpath and mid fields.

Map _id.$oid to id; that should be sufficient to form a unique identifier for each Mail.

Beware of duplicating data.
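The notes above could translate into a small import step like the following. It is a sketch under assumptions: the records are taken to be already parsed into dicts, and the helper names `normalize` and `load_unique` are made up for the example. The `to` field is left untouched because the notes do not say how it relates to `recipients`.

```python
IGNORED_FIELDS = {"folder", "ctype", "replyto", "fname", "fpath", "mid"}


def normalize(raw):
    """Drop the ignorable fields and map _id.$oid to id."""
    mail = {
        key: value
        for key, value in raw.items()
        if key not in IGNORED_FIELDS and key != "_id"
    }
    mail["id"] = raw["_id"]["$oid"]
    return mail


def load_unique(raw_records):
    """Normalize records, keeping only the first occurrence of each id."""
    by_id = {}
    for raw in raw_records:
        mail = normalize(raw)
        by_id.setdefault(mail["id"], mail)
    return list(by_id.values())


# Record shaped like the sample above, with placeholder addresses.
raw = {
    "_id": {"$oid": "52af48b5d55148fa0c19964c"},
    "sender": "a@example.com",
    "recipients": ["b@example.com"],
    "cc": [],
    "bcc": [],
    "text": "hello",
    "mid": "ignored",
    "fpath": "ignored",
    "fname": "108.",
    "folder": "_sent",
    "ctype": "text/plain; charset=us-ascii",
    "replyto": None,
    "date": "2000-08-10 03:27:00-07:00",
    "subject": "example",
}
mails = load_unique([raw, dict(raw)])
```

Deduplicating by id only guards against the same record being imported twice; it does not detect two records with different ids but identical content.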

Technology recommendations

Full-text search is very easy to implement using Elasticsearch.
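As a rough illustration of the Elasticsearch route, the search over the listed fields can be expressed as a `multi_match` query. The sketch below only builds the request body as a plain dict; the function name `build_search_body` is hypothetical, and the index name, mappings and client library are left open because the exercise does not specify them.

```python
import json

# Fields named in the full-text search requirement.
SEARCH_FIELDS = ["subject", "cc", "recipients", "sender", "bcc", "text"]


def build_search_body(query: str) -> dict:
    """Build an Elasticsearch search body matching `query` in any field."""
    return {
        "query": {
            "multi_match": {
                "query": query,
                "fields": SEARCH_FIELDS,
            }
        }
    }


body = build_search_body("Rosalee Fleming")
payload = json.dumps(body)  # JSON to send to the index's _search endpoint
```

The ids of the returned hits can then be resolved to Mail objects by the `search` root field.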
