Disclaimer 1: Don't know much about tftp protocol.
Disclaimer 2: Don't know much about Golang.
$ apt-cache search tftp | egrep -i tftp | egrep -i client
tftp-hpa - HPA's tftp client
atftp - advanced TFTP client
tftp - Trivial file transfer protocol client
$ sudo apt-get install tftp
$ man tftp | egrep Peter
This version of tftp is maintained by H. Peter Anvin. It was derived from, but has substantially diverged from, an OpenBSD source base, with added patches by Markus Gutschke and Gero Kulhman.
$ # see https://github.com/firmero/tftp
$ git clone https://github.com/firmero/tftp.git
$ cd tftp/
$ make simple
gcc -Wall -std=gnu99 -pthread server.c -c -o server.o
gcc -Wall -std=gnu99 -pthread flist.c -c -o flist.o
gcc -Wall -std=gnu99 -pthread workers_data.c -c -o workers_data.o
gcc -Wall -std=gnu99 -pthread workers.c -c -o workers.o
gcc server.o workers.o workers_data.o flist.o -o tftp_server -std=gnu99 -lpthread
./tftp_server --port 12345 --dir /tmp &
./tests/simple.sh
IP: :: Port: 12345
[========= gen client_file 2 MB =========]
[========= DONE gen client_file 2 MB =========]
[========= Put file =========]
[========= End of transmission =========]
[ OK: No diffrence found ]
[========= gen server_file 2 MB =========]
[========= DONE gen server_file 2 MB =========]
[========= Get file =========]
[========= End of transmission =========]
[ OK: No diffrence found ]
[========= gen client_file 31 MB =========]
[========= DONE gen client_file 31 MB =========]
[========= Put file =========]
[========= End of transmission =========]
[ OK: No diffrence found ]
[========= gen server_file 31 MB =========]
[========= DONE gen server_file 31 MB =========]
[========= Get file =========]
[========= End of transmission =========]
[ OK: No diffrence found ]
pkill tftp_server
tftp_server: recvfrom : Interrupted system call
rm -f client_file* server_file* /tmp/client_file* /tmp/server_file*
- We can see the client send 4 request packets to port 12345: 2 write requests (WRQ, opcode 0002) and 2 read requests (RRQ, opcode 0001):
$ sudo tcpdump -i lo -tt -nn -s 0 -XX port 12345
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
1564967337.437473 IP 127.0.0.1.42509 > 127.0.0.1.12345: UDP, length 24
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0034 70b2 4000 4011 cc04 7f00 0001 7f00 .4p.@.@.........
0x0020: 0001 a60d 3039 0020 fe33 0002 636c 6965 ....09...3..clie
0x0030: 6e74 5f66 696c 655f 326d 6200 6f63 7465 nt_file_2mb.octe
0x0040: 7400 t.
1564967337.507118 IP 127.0.0.1.38549 > 127.0.0.1.12345: UDP, length 24
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0034 90b8 4000 4011 abfe 7f00 0001 7f00 .4..@.@.........
0x0020: 0001 9695 3039 0020 fe33 0001 7365 7276 ....09...3..serv
0x0030: 6572 5f66 696c 655f 326d 6200 6f63 7465 er_file_2mb.octe
0x0040: 7400 t.
1564967337.805815 IP 127.0.0.1.47940 > 127.0.0.1.12345: UDP, length 25
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0035 b0df 4000 4011 8bd6 7f00 0001 7f00 .5..@.@.........
0x0020: 0001 bb44 3039 0021 fe34 0002 636c 6965 ...D09.!.4..clie
0x0030: 6e74 5f66 696c 655f 3331 6d62 006f 6374 nt_file_31mb.oct
0x0040: 6574 00 et.
1564967338.668186 IP 127.0.0.1.58327 > 127.0.0.1.12345: UDP, length 25
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0035 a101 4000 4011 9bb4 7f00 0001 7f00 .5..@.@.........
0x0020: 0001 e3d7 3039 0021 fe34 0001 7365 7276 ....09.!.4..serv
0x0030: 6572 5f66 696c 655f 3331 6d62 006f 6374 er_file_31mb.oct
0x0040: 6574 00 et.
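The request packets in the capture above can be decoded by hand: a 2-byte big-endian opcode, then the filename and the mode, each terminated by a zero byte. Below is a minimal Go sketch of that decoding. The function name `parseRequest` and its error messages are made up for illustration and are not taken from either server discussed here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// Request opcodes as seen in the capture (RFC 1350).
const (
	opRRQ uint16 = 1 // read request
	opWRQ uint16 = 2 // write request
)

// parseRequest splits an RRQ/WRQ UDP payload into opcode, filename and mode.
// Layout: 2-byte opcode | filename | 0x00 | mode | 0x00
func parseRequest(p []byte) (op uint16, filename, mode string, err error) {
	if len(p) < 4 {
		return 0, "", "", errors.New("request too short")
	}
	op = binary.BigEndian.Uint16(p[:2])
	if op != opRRQ && op != opWRQ {
		return 0, "", "", fmt.Errorf("not a request opcode: %d", op)
	}
	// A well-formed request splits into filename, mode and an empty tail.
	fields := bytes.Split(p[2:], []byte{0})
	if len(fields) < 3 || len(fields[0]) == 0 || len(fields[1]) == 0 {
		return 0, "", "", errors.New("malformed request")
	}
	return op, string(fields[0]), string(fields[1]), nil
}

func main() {
	// UDP payload of the first captured packet (24 bytes).
	pkt := append([]byte{0x00, 0x02}, []byte("client_file_2mb\x00octet\x00")...)
	op, name, mode, err := parseRequest(pkt)
	fmt.Println(op, name, mode, err) // prints: 2 client_file_2mb octet <nil>
}
```

The 24-byte payload matches the "UDP, length 24" reported by tcpdump: 2 bytes of opcode, 15 of filename, 5 of mode, and 2 zero terminators.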
- The same kind of capture for a single 2 MB write transfer, with comments added to explain what's happening:
$ sudo tcpdump -i lo -tt -nn -s 0 -XX | tee tcpdump.log
1564968684.004976 IP 127.0.0.1.41773 > 127.0.0.1.12345: UDP, length 24 <-- client on port 41773 sends request
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0034 70a2 4000 4011 cc14 7f00 0001 7f00 .4p.@.@.........
0x0020: 0001 a32d 3039 0020 fe33 0002 636c 6965 ...-09...3..clie <-- 0002 means WRQ Write Request; mode is 'octet'
0x0030: 6e74 5f66 696c 655f 326d 6200 6f63 7465 nt_file_2mb.octe
0x0040: 7400 t.
1564968684.005153 IP 127.0.0.1.56558 > 127.0.0.1.41773: UDP, length 4
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0020 70a3 4000 4011 cc27 7f00 0001 7f00 ..p.@.@..'......
0x0020: 0001 dcee a32d 000c fe1f 0004 0000 .....-........ <-- server replies from its new port 56558 to client port 41773; 0004 means ACK Acknowledgement; Block # 0000
1564968684.005170 IP 127.0.0.1.41773 > 127.0.0.1.56558: UDP, length 516
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0220 70a4 4000 4011 ca26 7f00 0001 7f00 ..p.@.@..&......
0x0020: 0001 a32d dcee 020c 0020 0003 0001 9481 ...-............ <-- client: 0003 means DATA; Block # 0001; data ... <-- "fixed length blocks of 512 bytes"
0x0030: 9dbc 493e e418 e391 6555 72a7 875c 582a ..I>....eUr..\X*
0x0040: b636 58f3 e37d a364 21e1 9769 2fad 055d .6X..}.d!..i/..]
0x0050: 9ff9 0c7c 66b9 2105 746f fc7e e5d6 c6f9 ...|f.!.to.~....
0x0060: d69d d1ad 8a2d a6bd 2ea8 69b7 90e5 bc7c .....-....i....|
0x0070: a6ba f886 1381 bf0d af5c aee3 ed3f 0cf3 .........\...?..
0x0080: e28f 88c1 4644 11e5 3694 e505 8eff f5a2 ....FD..6.......
0x0090: 090d be2f 727a 80c1 1bdc 8609 5150 71e9 .../rz......QPq.
0x00a0: c69b 8c1e e5ba 249b 3c5f e6de 1faa 0c0b ......$.<_......
0x00b0: cb76 2438 8b1e a4eb 48d9 f720 d301 54dd .v$8....H.....T.
0x00c0: 3224 c786 b0ef 86be 2d76 d416 97e3 2603 2$......-v....&.
0x00d0: 4669 3604 2729 f0a0 d532 f18a bc4b 471a Fi6.')...2...KG.
0x00e0: 6ecf 9aa6 c042 19b2 a1e9 e5fe 4b50 d67f n....B......KP..
0x00f0: fb49 fa27 af3b 6293 db48 e3db 0e42 59a4 .I.'.;b..H...BY.
0x0100: 27a3 d29f 92da 24d2 0dd0 7326 0312 6c82 '.....$...s&..l.
0x0110: 5f3d 812a 23e3 7156 96cc 6b27 669d 505f _=.*#.qV..k'f.P_
0x0120: 360e 54c5 6584 d595 f13a 958f 08fd e68e 6.T.e....:......
0x0130: cfb5 59f3 5ac0 79ca 8b85 c0d6 2968 2918 ..Y.Z.y.....)h).
0x0140: b509 708f 0684 e924 2ab2 3092 5c37 f8ca ..p....$*.0.\7..
0x0150: 8eda 6486 0d8d 3185 0708 07e4 255e cded ..d...1.....%^..
0x0160: e12f 21a2 b70c ce6c 1baa c841 a6e0 b85e ./!....l...A...^
0x0170: 36a5 eeb8 2564 2629 816a 92b5 d4b3 c434 6...%d&).j.....4
0x0180: 9f1f 3621 0506 43ae 5ee8 148b a868 75eb ..6!..C.^....hu.
0x0190: 42dc 2d02 f2de 8161 caa3 7148 059d aa18 B.-....a..qH....
0x01a0: 88b6 3e65 7d95 88f6 f5b1 341a c46e ce99 ..>e}.....4..n..
0x01b0: 9b74 a6e2 6641 2218 19af 36d7 017e cfec .t..fA"...6..~..
0x01c0: f69e 39f8 74a4 3934 d2eb d5a8 9dc6 cf86 ..9.t.94........
0x01d0: 7404 5486 4e75 7f4b 5b52 3db3 b68d 4336 t.T.Nu.K[R=...C6
0x01e0: bede e1e8 d983 a110 b997 a516 9b16 81aa ................
0x01f0: 3705 3b9a 37e9 e1b0 9f6d 4647 8703 cb37 7.;.7....mFG...7
0x0200: 9422 2f13 7058 93dc 8049 2847 bcf6 89dc ."/.pX...I(G....
0x0210: 0bbe 16fc d3eb a9ea 7fbb ef83 65e2 386e ............e.8n
0x0220: 5496 2120 0876 f051 78a6 b5be 91a0 T.!..v.Qx.....
1564968684.005196 IP 127.0.0.1.56558 > 127.0.0.1.41773: UDP, length 4
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0020 70a5 4000 4011 cc25 7f00 0001 7f00 ..p.@.@..%......
0x0020: 0001 dcee a32d 000c fe1f 0004 0001 .....-........ <-- server replies from port 56558 to client port 41773; 0004 means ACK Acknowledgement; Block # 0001
1564968684.005205 IP 127.0.0.1.41773 > 127.0.0.1.56558: UDP, length 516
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0220 70a6 4000 4011 ca24 7f00 0001 7f00 ..p.@.@..$......
0x0020: 0001 a32d dcee 020c 0020 0003 0002 368e ...-..........6. <-- client: 0003 means DATA; Block # 0002; data ...
0x0030: b692 510b 817c 8e20 772d 4e43 9189 5606 ..Q..|..w-NC..V.
0x0040: a3eb 4938 3739 4c62 1b27 9db4 8711 5725 ..I879Lb.'....W%
0x0050: 4f1d 5652 8e35 6432 1e91 06ef a5f8 7d98 O.VR.5d2......}.
0x0060: 3373 f81d 14ef 7d13 dc50 897a b98e 1860 3s....}..P.z...`
0x0070: 14a1 7703 b530 eaca 03d9 158c 5e52 3baf ..w..0......^R;.
0x0080: a6e3 62d0 3fe5 eb56 591d dcad 746b ec96 ..b.?..VY...tk..
0x0090: 8e1e c7c6 c6ba 0c8b 0f86 3d14 7028 8831 ..........=.p(.1
0x00a0: 8527 d66e 5487 8e48 7d6b 916d 654e d194 .'.nT..H}k.meN..
0x00b0: 6853 43bc dbfe 6af1 f3bf 8b7c ee23 c8b3 hSC...j....|.#..
0x00c0: 5ca4 f0f7 08e0 271e ad12 d28c 6d5d 8ae8 \.....'.....m]..
0x00d0: af2b 5983 c881 ddd8 5f2b 2fab 1c57 7521 .+Y....._+/..Wu!
0x00e0: 7ba9 dadd cd03 13a9 b2e0 552d d263 00dc {.........U-.c..
0x00f0: 8dfc f12a c0a6 88b5 859c 71f1 ae0d a5cb ...*......q.....
0x0100: 6ece ab10 85db 2b1e 09a1 b9c0 df19 2963 n.....+.......)c
0x0110: 7ef9 93ae 6b5e a1bc 46df de79 2194 3333 ~...k^..F..y!.33
0x0120: cb70 c6d1 5b0e 1fc1 6af9 4f74 07c6 04eb .p..[...j.Ot....
0x0130: 726c f99c 3979 afc7 3109 930e 4c9f abff rl..9y..1...L...
0x0140: 80cc 3c3f c91b ef72 1247 1956 a228 7bf5 ..<?...r.G.V.({.
0x0150: a9ff d2e4 5624 5f8d 5cb3 3419 7e04 31dc ....V$_.\.4.~.1.
0x0160: b9b2 c321 74ef df10 82ab fc0b 313e e1e0 ...!t.......1>..
0x0170: e7c3 3e5f 81f9 e864 b20b bd25 6d23 ff7d ..>_...d...%m#.}
0x0180: b842 f406 029a 79c5 cf79 1579 19ce dc06 .B....y..y.y....
0x0190: cc9e cb40 71a8 5394 0750 0a2b 5c7b 8259 ...@q.S..P.+\{.Y
0x01a0: 1824 f0e3 7ae4 b8b6 9697 c300 2462 7b18 .$..z.......$b{.
0x01b0: 2b0c ce1f 846d 1362 98d4 1da1 ecbf b6b9 +....m.b........
0x01c0: db6a 4513 c70d 750f e7e3 8ade 9104 b361 .jE...u........a
0x01d0: 5510 a04e 32e3 d47a ce26 0dbc e874 407e U..N2..z.&...t@~
0x01e0: 270a 5edd 3e20 018e 4a04 2839 8f41 63f3 '.^.>...J.(9.Ac.
0x01f0: f437 5f3f 1ccc f8df 7cba 196b 1fc3 84bc .7_?....|..k....
0x0200: 2df1 628b a6a6 af7a de30 fcbd c14a f409 -.b....z.0...J..
0x0210: f75d 7028 2b64 a1fa d4da 8113 8316 09cf .]p(+d..........
0x0220: 0009 d8fc eb7f 86be fe16 db48 b406 ...........H..
...
1564968684.056062 IP 127.0.0.1.41773 > 127.0.0.1.56558: UDP, length 516
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0220 90a2 4000 4011 aa28 7f00 0001 7f00 ....@.@..(......
0x0020: 0001 a32d dcee 020c 0020 0003 1000 d042 ...-...........B <-- client: 0003 means DATA; Block # 0x1000; data ...
0x0030: ee64 ee06 ea55 56c0 9a1d 11bf 1d46 27d0 .d...UV......F'.
0x0040: 3a39 f692 7688 fa9b c172 8dbb 4aad 2a49 :9..v....r..J.*I
0x0050: 12f8 e7da 2e57 dc58 23c1 46bb 42a9 b56a .....W.X#.F.B..j
0x0060: afc9 7955 462d 6024 826c b80d 91f9 8eca ..yUF-`$.l......
0x0070: 0f4c afd3 cb90 836c 7017 213b a202 e237 .L.....lp.!;...7
0x0080: 82e6 2729 062f 43aa 898e fea9 5db7 6df4 ..')./C.....].m.
0x0090: 5b35 6d22 13d2 a7ac ecb4 8cf1 8648 8195 [5m".........H..
0x00a0: 3daf b889 2401 c8c4 35b5 7c1f d05f bdfd =...$...5.|.._..
0x00b0: c4db ee02 8af9 3665 9f76 002f 3089 c8d4 ......6e.v./0...
0x00c0: 86c8 e0a4 10bc 8674 3109 4fae 0538 4b51 .......t1.O..8KQ
0x00d0: 0920 962c 3cef dcb3 daa3 3d50 e43e 8311 ...,<.....=P.>..
0x00e0: a1c5 cb77 ed32 cf52 6174 efe8 ebfa 03b3 ...w.2.Rat......
0x00f0: f3fc 6cc3 4162 2fb8 d8af eb18 61fe df61 ..l.Ab/.....a..a
0x0100: 56cd 79ac ecad 7fe2 e87e 8162 9a8f ec83 V.y......~.b....
0x0110: 7953 a496 5f87 50a3 473b 2f2c 6cf9 f248 yS.._.P.G;/,l..H
0x0120: d079 81a1 4743 4482 2a74 adf2 ec54 fe55 .y..GCD.*t...T.U
0x0130: 5f33 7cf7 b449 ad9f 57b2 33b2 bb88 9bd9 _3|..I..W.3.....
0x0140: feaf 5eb3 c6c7 59f4 1b01 c6a6 44f5 51ce ..^...Y.....D.Q.
0x0150: a306 4947 4f62 219f a9f6 642d e249 2161 ..IGOb!...d-.I!a
0x0160: ea52 53e7 949e 92a7 3cfa d7b4 4ad4 c4d3 .RS.....<...J...
0x0170: add2 8f89 e19e 5aa6 f758 1a28 3911 2f17 ......Z..X.(9./.
0x0180: 010c ab1c 158e c541 85ba f1a2 e07e ceb6 .......A.....~..
0x0190: 6ab7 00a9 abca c508 6ff1 7fc0 3fa6 f840 j.......o...?..@
0x01a0: 9318 db9c ee2a 804f d0ed 05c6 637d a287 .....*.O....c}..
0x01b0: 7047 894c 9aa1 cc34 b1ee 51fc cf1c c87a pG.L...4..Q....z
0x01c0: ee38 d797 167a 8505 9ccf 4161 2b67 086f .8...z....Aa+g.o
0x01d0: 7d21 9283 ddec 6a5a 066f 2289 7780 41b5 }!....jZ.o".w.A.
0x01e0: 9087 20b8 5fae 438d 7991 d277 4e92 7faa ...._.C.y..wN...
0x01f0: 65a3 fd1f 637d 0e6b 8da3 f712 be5a 0523 e...c}.k.....Z.#
0x0200: 4282 41d7 b25e 0ae9 25be 3297 3ac9 d44f B.A..^..%.2.:..O
0x0210: c3a4 80aa 6055 e5e1 446a 5914 0802 2928 ....`U..DjY...)(
0x0220: 7521 65cb dbc8 b007 b5f3 c11c af9a u!e...........
1564968684.056069 IP 127.0.0.1.56558 > 127.0.0.1.41773: UDP, length 4
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0020 90a3 4000 4011 ac27 7f00 0001 7f00 ....@.@..'......
0x0020: 0001 dcee a32d 000c fe1f 0004 1000 .....-........ <-- server replies from port 56558 to client port 41773; 0004 means ACK Acknowledgement; Block # 0x1000
1564968684.056078 IP 127.0.0.1.41773 > 127.0.0.1.56558: UDP, length 4
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0020 90a4 4000 4011 ac26 7f00 0001 7f00 ....@.@..&......
0x0020: 0001 a32d dcee 000c fe1f 0003 1001 ...-.......... <-- client: 0003 means DATA; Block # 0x1001; data is zero length! <-- "A data packet of less than 512 bytes signals termination of a transfer"
1564968684.056084 IP 127.0.0.1.56558 > 127.0.0.1.41773: UDP, length 4
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 0020 90a5 4000 4011 ac25 7f00 0001 7f00 ....@.@..%......
0x0020: 0001 dcee a32d 000c fe1f 0004 1001 .....-........ <-- server replies from port 56558 to client port 41773; 0004 means ACK Acknowledgement; Block # 0x1001
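The DATA/ACK exchange annotated above boils down to three rules: every DATA packet carries a 16-bit block number, the receiver answers with a 4-byte ACK for that block, and a payload shorter than 512 bytes ends the transfer. The Go sketch below illustrates those rules only. The helper names `ack`, `parseData` and `dataPackets` are invented for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	opDATA    uint16 = 3
	opACK     uint16 = 4
	blockSize        = 512 // fixed payload size of every block except the last
)

// ack builds the 4-byte ACK packet for a block number.
func ack(block uint16) []byte {
	b := make([]byte, 4)
	binary.BigEndian.PutUint16(b[0:2], opACK)
	binary.BigEndian.PutUint16(b[2:4], block)
	return b
}

// parseData returns the block number and payload of a DATA packet, and
// whether it is the last one (payload shorter than 512 bytes).
func parseData(p []byte) (block uint16, payload []byte, last, ok bool) {
	if len(p) < 4 || binary.BigEndian.Uint16(p[0:2]) != opDATA {
		return 0, nil, false, false
	}
	payload = p[4:]
	return binary.BigEndian.Uint16(p[2:4]), payload, len(payload) < blockSize, true
}

// dataPackets returns how many DATA packets a file of size bytes needs:
// all full blocks plus one short (possibly empty) final block.
func dataPackets(size int) int {
	return size/blockSize + 1
}

func main() {
	final := []byte{0x00, 0x03, 0x10, 0x01} // last DATA packet in the capture
	block, payload, last, ok := parseData(final)
	// prints: block=0x1001 payload=0 last=true ok=true
	fmt.Printf("block=0x%04x payload=%d last=%v ok=%v\n", block, len(payload), last, ok)
	fmt.Printf("ack=% x\n", ack(block)) // prints: ack=00 04 10 01
	// prints: 2 MB file: 0x1001 packets
	fmt.Printf("2 MB file: 0x%04x packets\n", dataPackets(2*1024*1024))
}
```

This also explains the zero-length final packet: 2 MB is exactly 4096 (0x1000) full blocks, so an empty block 0x1001 is needed to signal the end. The 31 MB test file needs 63,488 full blocks, which still fits in the 16-bit block number.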
$ git clone https://github.com/maranellored/tftpd-go.git
$ cd tftpd-go
$ cp tftp_server.go tftp_server.go.orig
$ diff --ignore-all-space tftp_server.go.orig tftp_server.go
6a7,8
> "time"
> "syscall"
180a183,184
> var start time.Time = time.Now()
>
249c253,254
< fmt.Println("TFTP WRITE: File written to memory- " + filename)
---
> //fmt.Println("TFTP WRITE: File written to memory- " + filename)
> fmt.Printf("%6d=tid TFTP WRITE: File written to memory- %s in %f seconds\n", syscall.Gettid(), filename, time.Now().Sub(start).Seconds())
$ dd bs=1M count=10 </dev/urandom > dummy_file_1_10mb 2>/dev/null
$ dd bs=1M count=10 </dev/urandom > dummy_file_2_10mb 2>/dev/null
$ dd bs=1M count=10 </dev/urandom > dummy_file_3_10mb 2>/dev/null
$ ls -alh dummy_file_*
-rw-rw-r-- 1 simon simon 10M Sep 8 11:11 dummy_file_1_10mb
-rw-rw-r-- 1 simon simon 10M Sep 8 11:11 dummy_file_2_10mb
-rw-rw-r-- 1 simon simon 10M Sep 8 11:11 dummy_file_3_10mb
- Note: The files are transferred in parallel and each 10 MB file takes about 0.36 seconds.
- Note: The tftp client used seems buggy: it reports a negative transfer time of about -5.4 seconds!
- Note: We transfer 3 files using 3 client processes; together with the Golang server that makes 4 processes, which suits the 4 physical cores available without relying on the 4 slower hyper-threads.
$ go build
$ ./tftpd-go --port 6969 --threads 32 --timeout 10
Started TFTP server at 127.0.0.1:6969
Processing TFTP Write Request for file dummy_file_1_10mb, in mode octet, from host 127.0.0.1:34046
Processing TFTP Write Request for file dummy_file_2_10mb, in mode octet, from host 127.0.0.1:34059
Processing TFTP Write Request for file dummy_file_3_10mb, in mode octet, from host 127.0.0.1:46400
28714=tid TFTP WRITE: File written to memory- dummy_file_2_10mb in 0.361195 seconds
28716=tid TFTP WRITE: File written to memory- dummy_file_3_10mb in 0.361078 seconds
28716=tid TFTP WRITE: File written to memory- dummy_file_1_10mb in 0.365685 seconds
^C
$ tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_1_10mb & tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_2_10mb & tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_3_10mb
...
Sent 10485760 bytes in -5.4 seconds [-15579293 bit/s]
Sent 10485760 bytes in -5.4 seconds [-15569721 bit/s]
Sent 10485760 bytes in -5.3 seconds [-15711946 bit/s]
We don't know how fast or well implemented the tftp client is, but we can look at the source code for the Golang tftp server and make some observations:
- The github page [1] for the tftp server says "You can configure the maximum number of threads (go routines) that can be spawned by the server."
- Possible performance issue: It looks like the tftp server spawns a separate thread (a goroutine, in Golang terms) for each file being received, which probably scales badly and would hurt performance when receiving very many files concurrently.
- Possible performance issue: There seems to be some kind of IPC happening [2] (read: Golang channels) between the main thread and the worker threads, which seems unnecessary in a world of epoll.
- Possible performance issue: Each new handler thread starts its own listening socket [3], which doesn't seem strictly necessary, and might be an unneeded run-time overhead.
- Possible performance issue: Packet parsing seems to unnecessarily duplicate parts of the received packet [4], causing Golang to spin its wheels copying memory?
[1] https://github.com/maranellored/tftpd-go [2] https://github.com/maranellored/tftpd-go/blob/master/tftp_server.go#L60 [3] https://github.com/maranellored/tftpd-go/blob/master/tftp_server.go#L77 [4] https://github.com/maranellored/tftpd-go/blob/master/tftp_server.go#L229
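To make the last observation concrete, here is a small Go sketch of the difference between duplicating a packet's payload and taking a view of it. This is a generic illustration, not the code at [4]. The function names `payloadCopy`, `payloadView` and `appendPayload` are made up for this example.

```go
package main

import "fmt"

const headerLen = 4 // 2-byte opcode + 2-byte block number

// payloadCopy duplicates the data portion of a DATA packet: one allocation
// and one memory copy per packet.
func payloadCopy(pkt []byte) []byte {
	if len(pkt) < headerLen {
		return nil
	}
	out := make([]byte, len(pkt)-headerLen)
	copy(out, pkt[headerLen:])
	return out
}

// payloadView returns the data portion as a sub-slice sharing pkt's backing
// array: no allocation, but only valid until the receive buffer is reused.
func payloadView(pkt []byte) []byte {
	if len(pkt) < headerLen {
		return nil
	}
	return pkt[headerLen:]
}

// appendPayload appends the data portion straight onto the file buffer, so
// each payload byte is copied exactly once.
func appendPayload(file, pkt []byte) []byte {
	return append(file, payloadView(pkt)...)
}

func main() {
	pkt := []byte{0, 3, 0, 1, 'a', 'b', 'c'}
	c, v := payloadCopy(pkt), payloadView(pkt)
	pkt[4] = 'X'                            // simulate the receive buffer being reused
	fmt.Printf("copy=%s view=%s\n", c, v)   // prints: copy=abc view=Xbc
	fmt.Printf("file=%s\n", appendPayload(nil, pkt)) // prints: file=Xbc
}
```

The view is cheaper but aliases the receive buffer, so it has to be consumed (for example appended to the file buffer) before the next read overwrites it.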
- Use epoll so that one thread per tftp file transfer is not necessary; 1,000s of file transfers could be done by a single thread.
- Using epoll, use greedy UDP reads on incoming packets, which increases performance because fewer notifications are needed.
- Using epoll, pre-create all listening sockets ever needed at server initialization, which saves valuable CPU time at run-time.
- Use epoll_data [1] to maintain socket related data structures and protocol state, again only allocated and initialized once upon server initialization.
- Use private in-memory mmap() to append incoming file chunks because mremap() [2] can efficiently grow the memory without copying the existing memory.
[1] http://man7.org/linux/man-pages/man2/epoll_ctl.2.html [2] http://man7.org/linux/man-pages/man2/mremap.2.html
Unfortunately no Golang network modules appear to support the power of epoll, at least for the purposes of this weekend Golang tftp server project:
- This example [1] uses epoll but is not a reusable module, and it goes through the syscall interface, which probably has a performance disadvantage.
- This Golang evio framework [2] was designed to use epoll, but contains a bad UDP writing abstraction [3] which unfortunately is needed by the tftp server!
- The Golang evio framework also seems controversial [4]: "One of my favorite things about Go is that it cuts through the "threads vs. events" debate by offering thread-style programming with event-style scaling using what you might call green threads (compiler assisted cooperative multitasking that has the programming semantics of preemptive multitasking). That is, I can write simple blocking code, and my server still scales. Using event loop programming in Go would take away one of my favorite things about the language, so I won't be using this. However I do appreciate the work, as it makes an excellent bug report against the Go runtime. It gives us a standard to hold the standard library's net package to."
- "The Go network stack already makes use of epoll" [5] but it seems like the Golang developer just cannot access it and enjoy highly performing event driven programming: "When I started using Go more and more, I really enjoyed the different I/O-model using goroutines and blocking function calls. It also has a few drawbacks but the mental model is a lot easier to reason about." [5]
[1] https://gist.github.com/tevino/3a4f4ec4ea9d0ca66d4f [2] https://github.com/tidwall/evio [3] tidwall/evio#19 [4] https://news.ycombinator.com/item?id=15624432 [5] https://news.ycombinator.com/item?id=15624586
- Implement tftp server as a Golang cgo project in order to compare run-time performance against the existing Golang tftp server using the highly threaded default Golang networking.
- Figure out how to move the transferred files at run-time between C and Golang.
[1] https://github.com/maranellored/tftpd-go
$ sudo tcpdump -i lo -nn -tt -s 0 -XX host 127.0.0.2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
1567821958.286256 IP 127.0.0.2.12345 > 127.0.0.1.6969: UDP, length 30
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 003a a6e7 4000 4011 95c8 7f00 0002 7f00 .:..@.@.........
0x0020: 0001 3039 1b39 0026 fe3a 6865 6c6c 6f20 ..09.9.&.:hello.
0x0030: 776f 726c 6420 616e 6420 676f 6f64 6279 world.and.goodby
0x0040: 6520 776f 726c 640a e.world.
1567821958.286430 IP 127.0.0.1.34782 > 127.0.0.2.12345: UDP, length 30
0x0000: 0000 0000 0000 0000 0000 0000 0800 4500 ..............E.
0x0010: 003a 0954 4000 4011 335c 7f00 0001 7f00 .:.T@.@.3\......
0x0020: 0002 87de 3039 0026 fe3a 6865 6c6c 6f20 ....09.&.:hello.
0x0030: 776f 726c 6420 616e 6420 676f 6f64 6279 world.and.goodby
0x0040: 6520 776f 726c 640a e.world.
$ echo 'hello world and goodbye world' | socat -d -d -v - udp4-sendto:127.0.0.1:6969,bind=127.0.0.2,reuseaddr,sourceport=12345
2019/09/06 19:05:58 socat[30857] N reading from and writing to stdio
2019/09/06 19:05:58 socat[30857] N successfully prepared local socket AF=2 127.0.0.2:12345
2019/09/06 19:05:58 socat[30857] N starting data transfer loop with FDs [0,1] and [5,5]
> 2019/09/06 19:05:58.286137 length=30 from=0 to=29
hello world and goodbye world
2019/09/06 19:05:58 socat[30857] N local address: AF=2 127.0.0.2:12345
2019/09/06 19:05:58 socat[30857] N socket 1 (fd 0) is at EOF
2019/09/06 19:05:58 socat[30857] N received packet with 30 bytes from AF=2 127.0.0.1:34782
2019/09/06 19:05:58 socat[30857] N socket 1 (fd 0) is at EOF
2019/09/06 19:05:58 socat[30857] N exiting with status 0
$ gcc -DSHF_DEBUG_VERSION=1 -O2 -o shftftpd shftftpd-main.c -lpthread && ./shftftpd
1567967060.621895 6837=tid tftp epoll thread created
1567967060.621943 6838=tid socket_id #1 of 4 listening on port 6969 with fd 4 <- tfpt main port
1567967060.621956 6838=tid socket_id #2 of 4 listening on port 53661 with fd 5
1567967060.621963 6838=tid socket_id #3 of 4 listening on port 33962 with fd 6
1567967060.621969 6838=tid socket_id #4 of 4 listening on port 44900 with fd 7
1567967060.621972 6838=tid entering epoll event loop
...
1567967130.229333 6838=tid event 1 of 1: read udp packet on listening socket_id #4 of 4 on port 44900 with fd 7; read 516 bytes from peer 127.0.0.1:42495
1567967130.229336 6838=tid hex dump 0000 00 03 50 00 88 bd f1 c8 79 92 0d 35 2a 1b 49 64 |..P.....y..5*.Id| <- udp_buf[516]
1567967130.229340 6838=tid hex dump 0010 28 06 46 6c 62 52 86 77 2d 7e b0 98 26 41 c6 00 |(.FlbR.w-~..&A..|
1567967130.229343 6838=tid hex dump 0020 47 ac 74 53 c5 b2 d6 6b f0 30 c1 f2 6e ba 47 24 |G.tS...k.0..n.G$|
1567967130.229347 6838=tid hex dump 0030 07 7b ef b9 6d 37 9b 50 f8 3c 3e 26 e7 01 7e 59 |.{..m7.P.<>&..~Y|
1567967130.229351 6838=tid hex dump 0040 51 a2 52 3c 84 3f 0e c2 ab 5f d9 93 70 16 af 64 |Q.R<.?..._..p..d|
1567967130.229354 6838=tid hex dump 0050 ab 04 d2 14 65 30 28 0b fb e7 ac 1c c2 7d 04 42 |....e0(......}.B|
1567967130.229358 6838=tid hex dump 0060 04 47 cc a2 dc d0 84 57 49 30 ae aa 91 cb 8e fb |.G.....WI0......|
1567967130.229469 6838=tid hex dump 0070 c0 41 f8 60 53 15 77 e3 8d c1 1c 33 61 52 b8 49 |.A.`S.w....3aR.I|
1567967130.229473 6838=tid hex dump 0080 6c 00 2a 9f fc 91 ab fc 49 dd 4f af f4 fa ce 21 |l.*.....I.O....!|
1567967130.229476 6838=tid hex dump 0090 aa 97 01 77 0c e4 77 c7 4e 58 66 40 08 2d 63 a2 |...w..w.NXf@.-c.|
1567967130.229480 6838=tid hex dump 00a0 98 74 a3 d9 74 ef ed 11 1e 6f 3a 64 37 f2 5d f2 |.t..t....o:d7.].|
1567967130.229484 6838=tid hex dump 00b0 eb 9a f2 32 0d fc a3 f4 c4 47 40 5e 23 98 46 77 |...2.....G@^#.Fw|
1567967130.229487 6838=tid hex dump 00c0 55 66 5c 9d 28 2c a9 63 4d 43 c9 ca ee ca ac d8 |Uf\.(,.cMC......|
1567967130.229491 6838=tid hex dump 00d0 45 bf aa f9 16 cb de 5d 6c 69 20 b8 16 f3 89 0f |E......]li .....|
1567967130.229494 6838=tid hex dump 00e0 48 f0 46 9a 42 ca 8d 09 25 af 1f 16 74 5d 8c 7f |H.F.B...%...t]..|
1567967130.229503 6838=tid hex dump 00f0 dd bc 8d d0 e9 aa 74 2d 87 ad 67 18 1c 1a 1b bc |......t-..g.....|
1567967130.229506 6838=tid hex dump 0100 11 e3 ec 49 e2 6f 68 f9 03 24 db e8 0f 48 7b 90 |...I.oh..$...H{.|
1567967130.229510 6838=tid hex dump 0110 83 ab e5 ad 60 de 8c 84 7c 16 d7 0b 2e 92 bc 22 |....`...|......"|
1567967130.229514 6838=tid hex dump 0120 63 bf 23 0b cf 11 e5 a5 75 7d 52 68 cc 58 24 1b |c.#.....u}Rh.X$.|
1567967130.229517 6838=tid hex dump 0130 56 39 49 e3 3c eb 6d 5a cc c8 96 2a 29 e2 68 de |V9I.<.mZ...*).h.|
1567967130.229521 6838=tid hex dump 0140 69 2a bc 3b 41 f9 1b 38 15 64 a2 5a c6 52 e0 98 |i*.;A..8.d.Z.R..|
1567967130.229525 6838=tid hex dump 0150 23 96 77 9b 91 21 df 49 92 18 16 67 81 aa 38 22 |#.w..!.I...g..8"|
1567967130.229528 6838=tid hex dump 0160 5d 76 5a a7 0d a7 15 ba 74 1d 8d dd 55 d3 d6 21 |]vZ.....t...U..!|
1567967130.229532 6838=tid hex dump 0170 21 54 26 2e c0 06 a0 74 56 65 3e 99 5f 9f f0 a9 |!T&....tVe>._...|
1567967130.229535 6838=tid hex dump 0180 a0 80 40 b2 eb bc f8 db bb a6 7a b0 92 17 23 dc |..@.......z...#.|
1567967130.229539 6838=tid hex dump 0190 ac 83 cc ef 73 d1 65 a0 87 e8 91 78 e0 3a 74 66 |....s.e....x.:tf|
1567967130.229543 6838=tid hex dump 01a0 51 6a 5c 98 54 d5 53 2a 63 92 d6 6f c0 11 05 a1 |Qj\.T.S*c..o....|
1567967130.229546 6838=tid hex dump 01b0 0d 1e 67 bb 10 93 ad 34 3f 90 f5 39 12 63 dd 0d |..g....4?..9.c..|
1567967130.229550 6838=tid hex dump 01c0 01 f8 7a 22 db 7c 27 7d b3 f5 04 a8 1a 5a f2 31 |..z".|'}.....Z.1|
1567967130.229554 6838=tid hex dump 01d0 76 da 6a 80 aa 4d 60 a8 22 95 e7 23 6f 78 46 be |v.j..M`."..#oxF.|
1567967130.229557 6838=tid hex dump 01e0 b1 94 4b 7b 3a f0 7e ed c8 5d 2f 51 28 70 ac 51 |..K{:.~..]/Q(p.Q|
1567967130.229561 6838=tid hex dump 01f0 21 73 f2 ba 17 f1 ea 12 c7 dd f4 b0 dc 89 8b 9b |!s..............|
1567967130.229565 6838=tid hex dump 0200 ac 58 87 fe |.X..|
1567967130.229568 6838=tid send ack udp packet on listening socket_id #4 of 4 on port 44900 with fd 7; sent 4 bytes to peer 127.0.0.1:42495 for file 'dummy_file_1_10mb'
1567967130.229573 6838=tid socket_id #4 of 4 packet from expected peer 127.0.0.1:42495 with tftp type 0x0003 and block 0x5000; appending to mmap
1567967130.229576 6838=tid socket_id #4 of 4 mmap_buffer[+0 = 10485760 bytes] for appending +512 = 10485760 bytes
1567967130.229580 6838=tid looping through 1 epoll event(s)
1567967130.229583 6838=tid event 1 of 1: read udp packet on listening socket_id #4 of 4 on port 44900 with fd 7; read 4 bytes from peer 127.0.0.1:42495
1567967130.229586 6838=tid hex dump 0000 00 03 50 01 |..P.|
1567967130.229589 6838=tid send ack udp packet on listening socket_id #4 of 4 on port 44900 with fd 7; sent 4 bytes to peer 127.0.0.1:42495 for file 'dummy_file_1_10mb'
1567967130.229594 6838=tid socket_id #4 of 4 packet from expected peer 127.0.0.1:42495 with tftp type 0x0003 and block 0x5001; transfer complete: 10,485,760 bytes in 8.745089 seconds or 2,342.0 PPS for file 'dummy_file_1_10mb'
1567967130.230062 6837=tid socket_id #4 of 4 has transferred a 10,485,760 byte file 'dummy_file_1_10mb'
^C
$ tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_1_10mb
...
Sent 10485760 bytes in 6.5 seconds [12998641 bit/s]
$ gcc -DSHF_DEBUG_VERSION=0 -O2 -o shftftpd shftftpd-main.c -lpthread && time ./shftftpd
1567967385.357182 6972=tid tftp epoll thread created
1567967385.357235 6973=tid socket_id #1 of 4 listening on port 6969 with fd 4 <- tfpt main port
1567967385.357251 6973=tid socket_id #2 of 4 listening on port 56832 with fd 5
1567967385.357257 6973=tid socket_id #3 of 4 listening on port 47068 with fd 6
1567967385.357263 6973=tid socket_id #4 of 4 listening on port 53963 with fd 7
1567967385.357266 6973=tid entering epoll event loop
1567967389.897385 6973=tid socket_id #3 of 4 packet from expected peer 127.0.0.1:57698 with tftp type 0x0003 and block 0x5001; transfer complete: 10,485,760 bytes in 0.221628 seconds or 92,411.6 PPS for file 'dummy_file_1_10mb'
1567967389.897648 6973=tid socket_id #2 of 4 packet from expected peer 127.0.0.1:38354 with tftp type 0x0003 and block 0x5001; transfer complete: 10,485,760 bytes in 0.221303 seconds or 92,547.3 PPS for file 'dummy_file_3_10mb'
1567967389.898258 6972=tid socket_id #2 of 4 has transferred a 10,485,760 byte file 'dummy_file_3_10mb'
1567967389.898266 6972=tid socket_id #3 of 4 has transferred a 10,485,760 byte file 'dummy_file_1_10mb'
1567967389.899457 6973=tid socket_id #4 of 4 packet from expected peer 127.0.0.1:51177 with tftp type 0x0003 and block 0x5001; transfer complete: 10,485,760 bytes in 0.223760 seconds or 91,531.1 PPS for file 'dummy_file_2_10mb'
1567967389.900336 6972=tid socket_id #4 of 4 has transferred a 10,485,760 byte file 'dummy_file_2_10mb'
^C
real 0m13.884s
user 0m0.048s
sys 0m0.296s
$ tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_1_10mb & tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_2_10mb & tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_3_10mb
...
Sent 10485760 bytes in 2.2 seconds [37833390 bit/s]
Sent 10485760 bytes in 2.2 seconds [37894230 bit/s]
Sent 10485760 bytes in 2.2 seconds [37467874 bit/s]
- Note how the TIDs on the "go:" lines change: even simple Golang code jumps from one Linux thread to another.
$ go build -x shftftpd.go && ./shftftpd
...
1567987280.009375 25036=tid go: start c tftpd
1567987280.009480 25036=tid tftp epoll thread created
1567987280.009512 25040=tid socket_id #1 of 4 listening on port 6969 with fd 4 <- tfpt main port
1567987280.009534 25040=tid socket_id #2 of 4 listening on port 58823 with fd 5
1567987280.009543 25040=tid socket_id #3 of 4 listening on port 43425 with fd 6
1567987280.009553 25040=tid socket_id #4 of 4 listening on port 50292 with fd 7
1567987280.009563 25040=tid entering epoll event loop
1567987291.552442 25040=tid socket_id #4 of 4 packet from expected peer 127.0.0.1:46103 with tftp type 0x0003 and block 0x5001; transfer complete: 10,485,760 bytes in 0.214158 seconds or 95,635.1 PPS for file 'dummy_file_1_10mb'
1567987291.554942 25040=tid socket_id #2 of 4 packet from expected peer 127.0.0.1:54647 with tftp type 0x0003 and block 0x5001; transfer complete: 10,485,760 bytes in 0.216230 seconds or 94,718.6 PPS for file 'dummy_file_2_10mb'
1567987291.555125 25040=tid socket_id #3 of 4 packet from expected peer 127.0.0.1:53427 with tftp type 0x0003 and block 0x5001; transfer complete: 10,485,760 bytes in 0.216564 seconds or 94,572.5 PPS for file 'dummy_file_3_10mb'
1567987291.556646 25039=tid go: discovered file with 10485760 bytes ready 'dummy_file_1_10mb', 1 unique files so far (copied from c to go map in 4.022162 ms)
1567987291.560123 25036=tid go: discovered file with 10485760 bytes ready 'dummy_file_3_10mb', 2 unique files so far (copied from c to go map in 3.273408 ms)
1567987291.564358 25041=tid go: discovered file with 10485760 bytes ready 'dummy_file_2_10mb', 3 unique files so far (copied from c to go map in 3.968911 ms)
^C
$ tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_1_10mb & tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_2_10mb & tftp -v -m octet 127.0.0.1 6969 -c put dummy_file_3_10mb
...
Sent 10485760 bytes in 2.1 seconds [39145689 bit/s]
Sent 10485760 bytes in 2.2 seconds [38769915 bit/s]
Sent 10485760 bytes in 2.2 seconds [38722130 bit/s]
- The performance test was very simple: only 3 files were transferred concurrently.
- The Golang tftp server without explicit epoll using multiple threads took about 0.36 seconds to transfer each 10 MB file using up to 3 threads.
- The Golang tftp server with explicit epoll using a single thread took about 0.22 seconds to transfer each 10 MB file using only 1 thread.
- This means the explicit epoll version of the tftp server transferred the same files in only 61% of the time taken by the tftp server using regular Golang networking.
- So the explicit epoll tftp server is about 1.6 times as fast (0.36 / 0.22) for this simple test while using fewer threads.
- It would be interesting to push each tftp server to the max to discover what the respective box capacity is with each implementation method.
$ find . -type f | egrep -v test | egrep "\.go" | xargs wc -l # tftp server using traditional Golang networking
154 ./tftp_util.go
262 ./tftp_server.go
24 ./main.go
32 ./tftp_cache.go
472 total
$ cat shftftpd.c shftftpd.go | wc -l # tftp server using epoll networking in C
468
- The tftp server using traditional Golang networking comes in at 472 LOC excluding the tests.
- The tftp server using epoll networking comes in at 468 LOC excluding the tests, and excluding the tftp read functionality.
- However, about 134 LOC of the 468 LOC is boilerplate code which has nothing specifically to do with tftp, e.g. gettid(), C to Golang function API, sleep_ms(), get_time_in_seconds(), hex_dump(), mmap_buffer_append().
- Also, the 468 LOC is instrumented with debug lines.
- The tftp server using traditional Golang networking has a single block of code for write file functionality from line 180 [1] to 250 [2] or 70 LOC in total.
- The tftp server using epoll networking also has a single block of code for write file functionality which is only 32 LOC in total.
- Presumably if the epoll functionality was abstracted (e.g. using an epoll framework similar to [3]) from the tftp business logic, then the epoll LOC would be even less.
[1] https://github.com/maranellored/tftpd-go/blob/master/tftp_server.go#L180 [2] https://github.com/maranellored/tftpd-go/blob/master/tftp_server.go#L250 [3] https://github.com/tidwall/evio
- Only had time to implement the tftp protocol write mode.
- Implemented many protocol sanity checks, but no automated tests or code coverage yet; these would eventually be added as in this previous project [1].
- It turns out that copying the large transferred files in-memory from C to Golang takes a few ms (about 3 to 4 ms) per 10 MB file.
- In the end it might be easier not to transfer them using the C.GoBytes mechanism, but instead actually store the mmap()ed files in /dev/shm which would also act as a hash map for the transferred files.
- In this way, the Golang tftp server can simply mmap() a file which is already in memory and use it directly without copying it.
- However, /dev/shm is Linux specific, so this wouldn't work on other OSs like Windows etc.
[1] https://www.youtube.com/watch?v=cRfaaSKwYAY
- Consider implementing an epoll framework similar to [1] in order to allow the tftp server business logic to be abstracted away from the epoll business logic, resulting in even less LOC while maintaining excellent run-time performance.
- Consider implementing the epoll algorithm using native Golang code and syscall interface. Would be interesting to know the performance hit, if any?
- Implement a high performance epoll tftp client, since we are not sure about how optimal the implementation is for the off-the-shelf client we tested with.
- Test performance while maxing out tftp server CPU. Unfortunately this is difficult to do on a single laptop over the weekend.
- Test performance with increasing numbers of concurrent files and file sizes, e.g. 10, 100, 1000, 10000.
- In theory the traditional Golang networking should do particularly badly as the number of concurrent files grows, due to the overhead of socket handling, thread handling, and IPC between the threads? This remains to be tested.