First off, let's talk about how I got the motivation to write this guide in the form of a "gist".
So one day, god (literally 😂) sent me a message: "Hey Rishad, go and build your own torrent client (an application that downloads torrents)". I took it as an order from god, and I started to explore possible ways to build it. After many !hours but minutes of Google searches, I got stuck reading a blog/article/guide written by Allen Kim, along with the reference blogs mentioned there, which were written by other developers.
Link : https://allenkim67.github.io/programming/2016/05/04/how-to-make-your-own-bittorrent-client.html
I was reading it thoroughly, and on the way I found that in order to get the "peer list", I must make a "GET" request to a tracker URL, which is stored in the "announce" field of the bencoded torrent file. I found that the query parameters of the GET request's URL (/?xyz=abc&mno=pqr) must include one named "info_hash", and its value should be the URL-encoded SHA1 hash of the "info" field of the bencoded torrent file.
i.e. the parameters in the GET request must be in this form:
?info_hash=%02%40%2C%3Cq%BC%CDp%8DM%9F%94%01d%AE%12E%82%0A%10&peer_id=TIX0284-c6g0j1f9f7f7&port=24523&uploaded=0&downloaded=494006386&left=0&corrupt=0&key=0T3M6T7H&numwant=100&compact=1&no_peer_id=1 HTTP/1.1\r\n
This is an example of a GET request, captured with Wireshark, from a torrent client asking a tracker for peer list information.
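To get a feel for what that info_hash parameter actually carries, here is a minimal sketch that decodes the percent-encoded value from the captured request back into raw bytes. The helper name `percentDecodeToBuffer` is made up for this illustration; it is not part of Node.js or any torrent library.

```javascript
// Percent-encoded info_hash copied from the captured request above.
const captured = '%02%40%2C%3Cq%BC%CDp%8DM%9F%94%01d%AE%12E%82%0A%10';

// Hypothetical helper: turn a percent-encoded string back into raw bytes.
function percentDecodeToBuffer(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    if (str[i] === '%') {
      // "%XY" stands for one byte whose hex value is XY
      bytes.push(parseInt(str.slice(i + 1, i + 3), 16));
      i += 2;
    } else {
      // A literal character stands for its own ASCII byte
      bytes.push(str.charCodeAt(i));
    }
  }
  return Buffer.from(bytes);
}

const decoded = percentDecodeToBuffer(captured);
console.log(decoded.length); // 20
console.log(decoded.toString('hex').toUpperCase()); // 02402C3C71BCCD708D4D9F940164AE1245820A10
```

So the parameter holds 20 bytes, which is exactly the size of a SHA1 hash. Some of those bytes show up as plain characters (q, p, M, d, E) and the rest as %XY.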
In order to get the percent-encoded "info_hash":
First, we must get the SHA1 hash, in "HEX" form, of the "info" field of the bencoded torrent file. In Node.js we can do that with:
const fs = require('fs') // Needed to read from torrent file
const bencode = require('bencode') // Needed to decode and encode bencode
const crypto = require('crypto') // Required to create SHA1 hash
// Decodes the torrent file into a JavaScript object of keys and values, where byte-string values are Buffer objects
const torrent_bencode_decoded = bencode.decode(fs.readFileSync("test.torrent"));
// Generates the info hash, in the form of a Buffer object, from the bencode-decoded torrent file
const generate_info_hash_from_torrent_bencode = (torrentBencode) => {
  const info = torrentBencode.info;
  // Re-encode the "info" dictionary to get the exact bytes that must be hashed
  const encoded_info = bencode.encode(info);
  const info_hash = crypto.createHash("sha1").update(encoded_info).digest();
  return info_hash;
}
// Stores the info hash in "HEX" form (uppercase)
const torrent_info_hash = generate_info_hash_from_torrent_bencode(torrent_bencode_decoded).toString("hex").toUpperCase();
If we console.log(torrent_info_hash), then we will see a SHA1 hash in hex form that looks like gibberish.
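If you don't have a torrent file at hand, here is a small standalone sketch of the same hashing step. It only uses Node's built-in crypto module, and the input is a stand-in string (my assumption for the demo), not a real bencoded "info" dictionary. It just shows the sizes you should expect.

```javascript
const crypto = require('crypto');

// Stand-in bytes (an assumption for this sketch); in the real flow this
// would be the bencoded "info" dictionary from the torrent file.
const standInInfo = Buffer.from('not a real info dictionary');

// digest() with no argument returns the raw hash as a Buffer
const digest = crypto.createHash('sha1').update(standInInfo).digest();
const hex = digest.toString('hex').toUpperCase();

console.log(digest.length); // 20 (raw bytes)
console.log(hex.length);    // 40 (two hex characters per byte)
```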
Phew! We're almost there now! (To finish writing this gist faster, I'm gonna assume you know how SHA1 works.)
So now all that is left is to URL encode this HEX form of the SHA1 hash, which is 40 characters long. You might say, "Hey Rishad, it's simple. Just put a % sign in front of every pair of characters, as in the GET params mentioned above". 😫 "Wish it was that easy" 😥😥.
In order to understand how to URL encode this HEX form of SHA1, first of all I want you to understand why we even need to URL encode (percent encode) "this piece of gibberish-looking SHA1 shit". For that, you need to learn what URL encoding is, and you can learn it from the link below:
Link : https://en.wikipedia.org/wiki/Percent-encoding
So from the above Wikipedia article, I'll assume you understood a few things:
- In a URL, you can only use the following characters without any percent encoding:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 - _ . ~
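The list above can be turned into a tiny check. This is only a sketch, and the name `isUnreserved` is made up for this gist; it is not a built-in function.

```javascript
// Unreserved characters from the list above: letters, digits, and - _ . ~
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;

// Hypothetical helper: true if a single character can appear in a URL as-is.
function isUnreserved(ch) {
  return UNRESERVED.test(ch);
}

console.log(isUnreserved('q')); // true
console.log(isUnreserved('~')); // true
console.log(isUnreserved('@')); // false
console.log(isUnreserved(',')); // false
```

Any character that fails this check has to be written in its %XY form inside the URL.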
Seeing this list, you might ask me, "Hey Rishad, but all the characters in the SHA1 hex string that I got earlier in Node.js are in the list above. Does that mean this URL encoding (percent encoding) does not need to be done?" 😫😫 "Fat Noooooooooo"