Skip to content

Instantly share code, notes, and snippets.

@jasonsalas
Last active November 5, 2019 12:38
Show Gist options
  • Save jasonsalas/da38903498c4d8fbe340845fe0acf592 to your computer and use it in GitHub Desktop.
Save jasonsalas/da38903498c4d8fbe340845fe0acf592 to your computer and use it in GitHub Desktop.
NLP: predict the MPAA rating of a movie based solely on its plot summary
# Make predictions on unseen data
#
# follows my full demo/repo at
# https://github.com/jasonsalas/nlp_predict_movie_rating_via_description/
import pickle
from keras.preprocessing.text import Tokenizer
from keras.models import load_model
''' Cabin in the Woods (R) '''
synopsis = 'Five friends go for a break at a remote cabin, where they get more than they bargained for, discovering the truth behind the cabin in the woods.'
''' Walking Tall (PG-13) '''
# synopsis = 'Chris Vaughn is a retired soldier who returns to his hometown to make a new life for himself, only to discover his wealthy high school rival, Jay Hamilton, has closed the once-prosperous lumber mill to turn the town resources towards his own criminal gains. The town is now overrun with crime, drugs and violence. Enlisting the help of his old pal Ray Templeton, Chris gets elected sheriff and vows to shut down Hamilton operations. His actions endanger his family and threaten his own life, but Chris refuses to back down until his hometown once again feels like home.'
''' The Hurt Locker (R) '''
# synopsis = 'Based on the personal wartime experiences of journalist Mark Boal (who adapted his experiences with a bomb squad into a fact-based, yet fictional story), director Kathryn Bigelow Iraq War-set action thriller The Hurt Locker presents the conflict in the Middle East from the perspective of those who witnessed the fighting firsthand -- the soldiers. As an elite Army Explosive Ordnance Disposal team tactfully navigates the streets of present-day Iraq, they face the constant threat of death from incoming bombs and sharp-shooting snipers. In Baghdad, roadside bombs are a common danger. The Army is working to make the city a safer place for Americans and Iraqis, so when it comes to dismantling IEDs (improvised explosive devices) the Explosive Ordnance Disposal (EOD) crew is always on their game. But protecting the public easy when there no room for error, and every second spent dismantling a bomb is another second spent flirting with death. Now, as three fearless bomb technicians take on the most dangerous job in Baghdad, it only a matter of time before one of them gets sent to the hurt locker. Jeremy Renner, Guy Pearce, and Ralph Fiennes star.'
''' Revenge of the Ninja (R) '''
# synopsis = 'After his family is killed in Japan by ninjas, Cho and his son Kane come to America to start a new life. He opens a doll shop but is unwittingly importing heroin in the dolls. When he finds out that his friend has betrayed him, Cho must prepare for the ultimate battle he has ever been involved in.'
''' The Lion King (G) '''
''' NOTE: the two summaries are combined for the 1994 and 2019 films '''
# synopsis = 'A Lion cub crown prince is tricked by a treacherous uncle into thinking he caused his father death and flees into exile in despair, only to learn in adulthood his identity and his responsibilities. After the murder of his father, a young lion prince flees his kingdom only to learn the true meaning of responsibility and bravery.'
''' Scarface (R) '''
synopsis = X[1974]
print(synopsis)
# key-value mappings for MPAA ratings
mpaa_ratings = { 0:'G', 1:'NC-17', 2:'NR', 3:'PG', 4:'PG-13', 5:'R' }
saved_tokenizer = pickle.load(open('tokenizer.pkl', 'rb'))
synopsis = saved_tokenizer.texts_to_sequences(synopsis)
saved_model = load_model('movie_description_classifier.h5')
flat_list = []
for sublist in synopsis:
for item in sublist:
flat_list.append(item)
synopsis = pad_sequences([flat_list], maxlen=400)
prediction = saved_model.predict_classes(synopsis)
print('\nI predict that the MPAA would give your movie a {} rating'.format(mpaa_ratings[prediction[0]]))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment