Created
November 27, 2014 14:20
-
-
Save rhizoome/9f5d49931b57acabfb4c to your computer and use it in GitHub Desktop.
Card Complete with noise
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"metadata": { | |
"name": "", | |
"signature": "sha256:9e533461c3971d0ca543d09b07e7430683b57758b8768a2c2718424791da3e54" | |
}, | |
"nbformat": 3, | |
"nbformat_minor": 0, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"An algorithm to complete sets by observing sub-sets of these sets." | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Basic idea\n", | |
"\n", | |
"There are common hearthstone decks out there. As we track games we observe parts of these decks. We want to reconstruct these decks and generate a list of the most common decks.\n", | |
"\n", | |
"Hearthstone has about 600 cards." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"cards = range(600)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 15 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We create 5 decks from these 600 cards" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import random\n", | |
"decks = []\n", | |
"for _ in range(5):\n", | |
" decks.append(random.sample(cards, 30))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 16 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"No we create games. The first deck is played 5 times, the second 20 times, 45, 80, 125" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"games = []\n", | |
"for x in range(5):\n", | |
" for _ in range((x + 1)**2 * 5 * x):\n", | |
" games.append(set(random.sample(decks[x], 12)))\n", | |
" \n", | |
" \n", | |
"# Add some noise\n", | |
"for _ in range(100000):\n", | |
" games.append(set(random.sample(cards, 12)))\n", | |
"\n", | |
"random.shuffle(games)" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 17 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Next we find the most commonly played cards, we use a pandas DataFrame.\n", | |
"\n", | |
"We call this card_game since it is a like a SQL relation between cards and games." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"import pandas\n", | |
"\n", | |
"card_game = pandas.DataFrame([[card in game or None for card in cards] for game in games])" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 18 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"Find the most common cards, restrict the set to games playing this card, repeat until the resulting deck is 30 cards or smaller." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"filter_set = card_game\n", | |
"cards = []\n", | |
"x = 0\n", | |
"while True:\n", | |
" game_count = filter_set.count()\n", | |
" # Find the card with the most games\n", | |
" game_count.sort(ascending=False)\n", | |
" most_games = game_count.index[x]\n", | |
" x += 1\n", | |
" cards.append(most_games)\n", | |
" # Restrict to games containing this card\n", | |
" filter_set = filter_set[filter_set[int(most_games)] == True]\n", | |
" deck_sum = filter_set.sum()\n", | |
" deck = deck_sum[deck_sum.notnull()].index.values\n", | |
" if len(deck) < 31:\n", | |
" break" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 19 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"deck = set(deck)\n", | |
"deck.symmetric_difference(set(decks[4]))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 20, | |
"text": [ | |
"set()" | |
] | |
} | |
], | |
"prompt_number": 20 | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"We have found the most common deck. How can we find the next deck??\n", | |
"\n", | |
"-> We remove the games using this deck and repeat." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"def deck_not_matches(x):\n", | |
" return not all([not x[v] or v in deck for v in x.index])\n", | |
"new_card_game = card_game[card_game.apply(deck_not_matches, axis=1)]" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 21 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"filter_set = new_card_game\n", | |
"cards = []\n", | |
"x = 0\n", | |
"while True:\n", | |
" game_count = filter_set.count()\n", | |
" # Find the card with the most games\n", | |
" game_count.sort(ascending=False)\n", | |
" most_games = game_count.index[x]\n", | |
" x += 1\n", | |
" cards.append(most_games)\n", | |
" # Restrict to games containing this card\n", | |
" filter_set = filter_set[filter_set[int(most_games)] == True]\n", | |
" deck_sum = filter_set.sum()\n", | |
" deck = deck_sum[deck_sum.notnull()].index.values\n", | |
" if len(deck) < 31:\n", | |
" break" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [], | |
"prompt_number": 22 | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"deck = set(deck)\n", | |
"deck.symmetric_difference(set(decks[3]))" | |
], | |
"language": "python", | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"metadata": {}, | |
"output_type": "pyout", | |
"prompt_number": 23, | |
"text": [ | |
"{172, 183}" | |
] | |
} | |
], | |
"prompt_number": 23 | |
} | |
], | |
"metadata": {} | |
} | |
] | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment