@quizzicol
Created November 4, 2016 06:31
Authorship attribution of ghost written novels.
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Can a ghost writer mimic another author?\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In fiction, it is not uncommon for authors to write under more than one name or to write under another author's name. Some examples are:\n",
"* 'Tom Clancy' books written by Tom Clancy as well as other authors, including Mark Greaney and Grant Blackwood\n",
"\n",
"| | |\n",
"|-|-|\n",
"|<img src=http://www.tomclancy.com/books/true-faith-and-allegiance-hc300.jpg width='100px'>|<img src=http://www.tomclancy.com/books/tom-clancy-duty-and-honor-hc300.jpg width='100px'>|\n",
"\n",
"* 'Clive Cussler' books written by Clive Cussler as well as other authors, including Boyd Morrison and Justin Scott\n",
"\n",
"| | |\n",
"|-|-|\n",
"|<img src=http://clive-cussler-books.com/wp-content/uploads/2016/05/053116_Emperors-Revenge-Oregon-Files-Clive-Cussler-Novels_199x300.jpg width='100px'>|<img src=http://clive-cussler-books.com/wp-content/uploads/2016/05/030116_The-Gangster-Isaac-Bell-Clive-Cussler-Adventure-Novels_199x300.jpg width='100px'>|\n",
"\n",
"* J.K. Rowling writing under the pseudonym Robert Galbraith\n",
"\n",
"| | |\n",
"|-|-|\n",
"|<img src=http://vignette3.wikia.nocookie.net/harrypotter/images/7/7b/Harry01english.jpg/revision/latest?cb=20150208225304 width='100px'>|<img src=https://static.independent.co.uk/s3fs-public/styles/story_medium/public/thumbnails/image/2013/07/14/22/9-calling.jpg width='112px'>|\n",
"\n",
"* Stephen King writing under the pseudonym Richard Bachman\n",
"\n",
"| | |\n",
"|-|-|\n",
"|<img src=https://images-na.ssl-images-amazon.com/images/I/513H0tH7t-L._SX301_BO1,204,203,200_.jpg width='100px'>|<img src=https://room435.files.wordpress.com/2011/03/2running-man.jpg width='100px'>|\n",
"\n",
"\n",
"This notebook contains the code used to test whether an author writing under another author's name (e.g. Mark Greaney writing as Tom Clancy) can write in that author's style, or whether they retain their own language style."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Authorship attribution is the task of identifying the author of a document. Authors typically have their own writing style, and their works can often be characterised by identifying elements or features of that style. Numerous features can be used to help classify a text, including [1](http://www.icsd.aegean.gr/lecturers/stamatatos/papers/survey.pdf):\n",
"\n",
"* Lexical features (e.g. average word length, sentence length, vocabulary richness, word frequencies, common errors)\n",
"* Character features (e.g. character types, character n-grams)\n",
"* Syntactic features (e.g. parts of speech, sentence structure)\n",
"* Semantic features (e.g. function words, synonym use)\n",
"\n",
"A well-known example of authorship attribution is determining the authorship of the [Federalist Papers](https://en.wikipedia.org/wiki/The_Federalist_Papers), a collection of 85 articles and essays written under the pseudonym Publius by Alexander Hamilton, James Madison, and John Jay to promote the ratification of the United States Constitution.\n",
"\n",
"More recently authorship attribution is often used to identify who wrote blog posts or social media comments, and whether one person is writing