@ayushthakur
Last active February 23, 2025 23:10
Understanding roll py.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyP/c+4QvppTbOiwXMvFtH2i",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/ayushthakur/7765ba1381075f6a4ca34ac87e53709c/understanding-roll-py.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"## Background\n",
"\n",
"- I want to see how the mnemonic is generated in BIP-39.\n",
"- Try to generate a mnemonic that validates when checked with standard BIP-39 tools.\n",
"\n"
],
"metadata": {
"id": "38_Pnkkw71os"
}
},
{
"cell_type": "markdown",
"source": [
"# Let's first get the English wordlist from BIP-39"
],
"metadata": {
"id": "AHWzpyYeGUv3"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "sjKaPOtZ7Wxy"
},
"outputs": [],
"source": [
"import requests\n",
"\n",
"# URL of the raw English wordlist in the BIP-39 repository on GitHub\n",
"url = \"https://raw.githubusercontent.com/bitcoin/bips/master/bip-0039/english.txt\"\n",
"\n",
"# Fetch the file content\n",
"response = requests.get(url)\n",
"\n",
"# Check if the request was successful\n",
"if response.status_code == 200:\n",
"    # Split the text into a list of words (one word per line)\n",
"    word_list = response.text.split()\n",
"    # A BIP-39 wordlist always has exactly 2048 words (11 bits per word)\n",
"    assert len(word_list) == 2048\n",
"else:\n",
"    print(f\"Failed to retrieve the file. Status code: {response.status_code}\")\n",
"\n"
]
},
{
"cell_type": "markdown",
"source": [
"# Now let's create our own entropy.\n",
"We assume each coin toss yields either 0 or 1. Tossing the coin 128 times gives us 128 bits of entropy."
],
"metadata": {
"id": "mhFCqhBNGXTI"
}
},
{
"cell_type": "code",
"source": [
"import random\n",
"\n",
"# Initialize an empty list to store the coin flips\n",
"bit_list = []\n",
"\n",
"# Flip the (virtual) coin 128 times\n",
"# Note: random is not cryptographically secure; use the secrets module for a real wallet\n",
"for _ in range(128):\n",
"    # Randomly select either 0 or 1\n",
"    coin_flip = random.choice([0, 1])\n",
"    # Append the result to the list\n",
"    bit_list.append(str(coin_flip))"
],
"metadata": {
"id": "2o1hg_DLGlSj"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Join the list into a single string to get the 128-bit long string\n",
"bit_string = ''.join(bit_list)\n",
"\n",
"# Print the 128-bit long string\n",
"print(bit_string) #this is our entropy"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "mwS46DLhG0F4",
"outputId": "55d74c75-2289-41c0-a86f-ca3bf64b83fc"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"10100110000011011111001011111101100001100101000000010011101000110000100100101110111100010101010100100010101100111110001010101111\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# Convert binary to hexadecimal, padding to 32 hex digits so leading zero bits are not lost\n",
"hex_str = hex(int(bit_string, 2))[2:].zfill(32).upper()\n",
"\n",
"print(hex_str)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "h9N7WRrOIsSv",
"outputId": "bbfac237-4c7e-4cf6-ec70-49e78cbbe77d"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"A60DF2FD865013A3092EF15522B3E2AF\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"import hashlib\n",
"\n",
"# Convert the hex string into bytes\n",
"byte_data = bytes.fromhex(hex_str)\n",
"\n",
"# Apply SHA-256 hash\n",
"sha256_hash = hashlib.sha256(byte_data).hexdigest()"
],
"metadata": {
"id": "bxM7nfNCIsMd"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Print the SHA-256 hash; the checksum is taken from its leading bits\n",
"print(\"SHA-256 Checksum:\", sha256_hash)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "fm8jx4_UIrXk",
"outputId": "3a8a8bfb-a809-44d8-b52f-d09f0db88c2e"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"SHA-256 Checksum: 5b4d8e235f7d2297ad16e8a52bdeebc45cc0727e20ef7b26c8dfe1f4456699e3\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# Take the first 4 bits of the SHA-256 hash as the checksum (4 bits for 128-bit entropy)\n",
"checksum_binary = bin(int(sha256_hash, 16))[2:].zfill(256)[:4]"
],
"metadata": {
"id": "9AWe5TvyKttV"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"Why is the checksum 4 bits? Because its length is the entropy length divided by 32, and 128 / 32 = 4. With 256 bits of entropy (the 24-word case), the checksum would be 8 bits."
],
"metadata": {
"id": "NLz4kSHrRX4R"
}
},
{
"cell_type": "code",
"source": [
"checksum_binary"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "v1v8KeYIKs4n",
"outputId": "01f335f0-2eda-42cf-af74-dbae7964c6f4"
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"'0101'"
],
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
}
},
"metadata": {},
"execution_count": 56
}
]
},
{
"cell_type": "code",
"source": [
"# Append the checksum to the original entropy (128 bits + 4 bits)\n",
"entropy_with_checksum = bit_string + checksum_binary"
],
"metadata": {
"id": "D6yn9cVlLaqo"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Now, split this 132-bit string into 11-bit chunks\n",
"chunks = [entropy_with_checksum[i:i+11] for i in range(0, len(entropy_with_checksum), 11)]"
],
"metadata": {
"id": "6hHh6HWPLaTe"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Convert each 11-bit chunk to an integer (this gives the index for the word list)\n",
"indexes = [int(chunk, 2) for chunk in chunks]"
],
"metadata": {
"id": "03MDcOx6Llf0"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Print the indexes\n",
"print(\"Indexes:\", indexes)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "5h-WdKvaLwOX",
"outputId": "07485d3c-0090-430d-94da-36f60eaf4531"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Indexes: [1328, 892, 1531, 101, 9, 1676, 293, 1777, 681, 172, 1989, 757]\n"
]
}
]
},
{
"cell_type": "code",
"source": [
"# Map the indexes to words\n",
"mnemonic_words = [word_list[i] for i in indexes]\n",
"\n",
"# Join the words into a single mnemonic phrase\n",
"mnemonic_phrase = \" \".join(mnemonic_words)\n",
"\n",
"# Print the final mnemonic phrase\n",
"print(\"Mnemonic Phrase:\", mnemonic_phrase)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "4BsvPwdxMBnb",
"outputId": "75b76304-5482-4944-d525-5e069050f145"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Mnemonic Phrase: plastic hurdle satoshi arrow abuse spice caution taste festival better weather gadget\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"# We have derived a mnemonic using the wordlist and our (virtual) coin.\n",
"\n",
"The next step is to try deriving the 12th word when you only have the first 11 words of a valid phrase. The question: if you lose the last word of your key, can you get it back?\n",
"\n",
"I don't think so, because that would make the key insecure (tbh).\n",
"\n",
"The first 11 words carry only 121 of the 128 entropy bits. The 12th word holds the remaining 7 entropy bits followed by the 4-bit checksum, and the checksum is computed over all 128 bits. So the 11 words alone cannot pin down a single last word; they only narrow it to 2^7 = 128 candidates. Let's check it using code."
],
"metadata": {
"id": "UvhbvmvhMygG"
}
},
{
"cell_type": "code",
"source": [
"first_11_words = ['plastic', 'hurdle', 'satoshi', 'arrow', 'abuse', 'spice', 'caution', 'taste', 'festival', 'better', 'weather']"
],
"metadata": {
"id": "t1NQMdb5G8pD"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Convert the words to their corresponding indexes\n",
"indexes = [word_list.index(word) for word in first_11_words]"
],
"metadata": {
"id": "x7woFi7mNdfl"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Rebuild the bit string from the indexes (11 words x 11 bits = 121 bits, 7 short of the full entropy)\n",
"entropy_binary = ''.join([bin(index)[2:].zfill(11) for index in indexes])\n"
],
"metadata": {
"id": "X0e-CvTuNjlX"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"entropy_binary"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "jvb7WWs9N1kT",
"outputId": "2c0c48af-4f0a-4a55-d47b-b9c3a4f1eb1d"
},
"execution_count": null,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"'1010011000001101111100101111110110000110010100000001001110100011000010010010111011110001010101010010001010110011111000101'"
],
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
}
},
"metadata": {},
"execution_count": 65
}
]
},
{
"cell_type": "code",
"source": [
"# Pack the 121 bits into bytes, 8 bits at a time.\n",
"# Caveat: the final chunk is a single leftover bit that becomes its own byte,\n",
"# so the result is not the original 128-bit entropy that BIP-39 hashes.\n",
"entropy_bytes = bytearray()\n",
"for i in range(0, len(entropy_binary), 8):\n",
"    byte = entropy_binary[i:i+8]\n",
"    entropy_bytes.append(int(byte, 2))\n",
"\n",
"# Step 4: Apply SHA-256 to the packed bytes\n",
"sha256_hash = hashlib.sha256(entropy_bytes).hexdigest()\n",
"\n",
"# Print the SHA-256 hash (it should be a 64-character hexadecimal string)\n",
"print(\"SHA-256 Hash:\", sha256_hash)\n",
"\n",
"# Step 5: Extract the first 4 bits of the SHA-256 hash as a trial checksum\n",
"checksum = bin(int(sha256_hash, 16))[2:].zfill(256)[:4]\n",
"\n",
"# Print the checksum bits\n",
"print(\"Checksum Bits:\", checksum)\n"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "G70blEWbNu3l",
"outputId": "ae772de6-aedc-459c-a9b2-a73940ce2487"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"SHA-256 Hash: 539a3d545ea8e66044d0b8b7fedb27c1fbb388ca4d1ef81367b679a04627d6bd\n",
"Checksum Bits: 0101\n"
]
}
]
},
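{
"cell_type": "code",
"source": [
"# Step 6: Use the 4 checksum bits directly as a word index.\n",
"# Caveat: a real 12th-word index is 11 bits (7 entropy bits followed by the 4 checksum bits),\n",
"# so this lookup treats the 7 missing entropy bits as zeros.\n",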
"checksum_index = int(checksum, 2)\n",
"\n",
"# Print the checksum index\n",
"print(\"Checksum Index:\", checksum_index)\n",
"\n",
"# Step 7: Find the 12th word from the word list\n",
"twelfth_word = word_list[checksum_index]\n",
"\n",
"# Print the 12th word\n",
"print(\"The 12th word is:\", twelfth_word)"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "FcYPfmy4OUkp",
"outputId": "5bb98f15-361a-4b7f-dfd9-e8dd12885771"
},
"execution_count": null,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Checksum Index: 5\n",
"The 12th word is: absent\n"
]
}
]
}
]
}
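
The notebook's last section can also be approached by enumerating every possible 12th word instead of looking for a single one. The following is a minimal sketch, not the author's method: it works on wordlist indexes rather than words so that it needs no network access, and the function name `last_word_candidates` is hypothetical. It assumes 128-bit entropy (12 words) and the standard BIP-39 rule that the checksum is the first 4 bits of SHA-256 over the 16 entropy bytes.

```python
import hashlib


def last_word_candidates(first_11_indexes):
    """Return every 12th-word index that completes a valid 12-word phrase."""
    if len(first_11_indexes) != 11:
        raise ValueError("expected exactly 11 word indexes")
    if any(not 0 <= i < 2048 for i in first_11_indexes):
        raise ValueError("word indexes must be in the range 0..2047")

    # 11 words x 11 bits = 121 known entropy bits
    known_bits = "".join(format(i, "011b") for i in first_11_indexes)

    candidates = []
    for tail in range(2 ** 7):  # the 7 entropy bits carried by the 12th word
        entropy_bits = known_bits + format(tail, "07b")
        entropy = int(entropy_bits, 2).to_bytes(16, "big")
        # Checksum: first 4 bits of SHA-256 over the full 16-byte entropy
        checksum = hashlib.sha256(entropy).digest()[0] >> 4
        # 12th-word index: 7 entropy bits followed by the 4 checksum bits
        candidates.append((tail << 4) | checksum)
    return candidates


# Indexes of the notebook's first 11 words, copied from its "Indexes:" output
notebook_first_11 = [1328, 892, 1531, 101, 9, 1676, 293, 1777, 681, 172, 1989]
candidates = last_word_candidates(notebook_first_11)

print(len(candidates))      # 128 candidates, one per 7-bit tail
print(757 in candidates)    # 757 is the 12th index the notebook printed
```

To turn the candidates back into words, each index can be looked up in the notebook's `word_list`. Picking the right one out of the 128 still needs outside information, such as a known address derived from the original phrase.
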
@lvnilesh

Thanks

@ayushthakur
Author

@lvnilesh please ignore this one, the 12th word isn't validating. I will update it and share a new version tomorrow. Thanks.
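
One way to check whether a 12-word phrase validates without an external tool is to recompute its checksum. This is a minimal sketch under stated assumptions: the phrase has already been converted to wordlist indexes (for example with `word_list.index(word)`, as in the notebook), the entropy is 128 bits, and the function name `is_valid_12_word_indexes` is hypothetical.

```python
import hashlib


def is_valid_12_word_indexes(indexes):
    """Check the BIP-39 checksum of a 12-word phrase given as wordlist indexes."""
    if len(indexes) != 12 or any(not 0 <= i < 2048 for i in indexes):
        return False

    # 12 words x 11 bits = 132 bits: 128 entropy bits + 4 checksum bits
    bits = "".join(format(i, "011b") for i in indexes)
    entropy_bits, checksum_bits = bits[:128], bits[128:]

    # Recompute the checksum from the 16 entropy bytes
    entropy = int(entropy_bits, 2).to_bytes(16, "big")
    expected = format(hashlib.sha256(entropy).digest()[0] >> 4, "04b")
    return checksum_bits == expected


# "abandon" x 11 + "about" is a published BIP-39 test vector: index 0 x 11, then 3
print(is_valid_12_word_indexes([0] * 11 + [3]))   # True
# Swapping in index 5 for the last word breaks the checksum
print(is_valid_12_word_indexes([0] * 11 + [5]))   # False
```
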

@lvnilesh

Did you update? I probably missed it.
