After the interesting talk by my friend Luca Deri at SharkFest EU 2018, I was curious to see how secure my D-Link security camera was.
So I used my computer to share its WiFi internet connection with the camera, which was connected via Ethernet. Then I started sniffing on the Ethernet interface with Wireshark and turned on the camera.
Then I opened the "My Dlink" iPhone app to actually watch the video from my camera.
Surprise: the camera sends the h264 video over HTTP in clear text.
These are the steps I followed.
A quick look at the traffic immediately revealed a very large stream from the camera to a certain IP. A closer look at that stream allowed me to identify some HTTP content.
In particular, the camera responds over HTTP with Content-Type multipart/x-mixed-replace, and the part that follows is declared as Content-type video/h264:
HTTP/1.0 200 OK
Server: alphapd/2.1.8
Date: Sat Nov 10 20:22:04 2018
...6........p...Pragma: no-cache
Cache-Control: no-cache
Content-Type: multipart/x-mixed-replace;boundary=video boundary--
...6.............^..............................................................................................................................--video boundary--
Content-length: 36144
Date: 11-10-2018 08:22:04 PM IO_00010000_PT_000_000
Content-type: video/h264
....gB..
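The response above has the usual multipart layout: a boundary line, a few part headers including Content-length, a blank line, then the payload. Here is a minimal Python sketch of how such parts could be pulled out of a raw dump. The function name extract_parts is hypothetical, the boundary bytes are copied from the capture, and the sketch assumes that every part carries an accurate Content-length, which I have not verified for this camera.

```python
BOUNDARY = b"--video boundary--\r\n"  # delimiter as it appears in the capture


def extract_parts(raw: bytes, boundary: bytes = BOUNDARY) -> list:
    """Return the payload of every multipart part found in raw."""
    payloads = []
    pos = 0
    while True:
        start = raw.find(boundary, pos)
        if start == -1:
            break
        header_start = start + len(boundary)
        header_end = raw.find(b"\r\n\r\n", header_start)
        if header_end == -1:
            break  # part headers are truncated, stop here
        length = None
        for line in raw[header_start:header_end].split(b"\r\n"):
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip().decode("ascii"))
        body_start = header_end + 4
        if length is None:
            pos = body_start  # no length header: skip this part
            continue
        payloads.append(raw[body_start:body_start + length])
        pos = body_start + length  # jump over the payload, ignore any padding
    return payloads


# Synthetic stream with padding before and junk between the parts.
sample = (
    b"HTTP/1.0 200 OK\r\n\r\n" + b"\x00" * 5
    + BOUNDARY + b"Content-length: 4\r\nContent-type: video/h264\r\n\r\n"
    + b"\x00\x00\x00\x01"
    + b"junk"
    + BOUNDARY + b"Content-length: 2\r\n\r\n" + b"AB"
)
parts = extract_parts(sample)
```

Skipping ahead by Content-length, rather than searching for the next boundary inside the payload, avoids false matches if the video bytes happen to contain the boundary text.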
So I extracted only the camera-to-Internet part of the communication, with the aim of isolating the video stream. Using Wireshark's "Follow TCP stream", I selected only one direction of the stream and set "Show data" as "Raw". Then I saved the resulting file.
The next step was to strip out the extra HTTP and other crap from the file, to leave only the h264 stream. It was pretty easy to find the h264 start code 0000 0001 (https://stackoverflow.com/questions/38094302/how-to-understand-header-of-h264) at offset 3590 dec (hex 00000e00 + 6) of the file:
00000d80: 0000 0000 0000 0000 0000 2d2d 7669 6465 ..........--vide
00000d90: 6f20 626f 756e 6461 7279 2d2d 0d0a 436f o boundary--..Co
00000da0: 6e74 656e 742d 6c65 6e67 7468 3a20 3336 ntent-length: 36
00000db0: 3134 340d 0a44 6174 653a 2031 312d 3130 144..Date: 11-10
00000dc0: 2d32 3031 3820 3038 3a32 323a 3034 2050 -2018 08:22:04 P
00000dd0: 4d20 494f 5f30 3030 3130 3030 305f 5054 M IO_00010000_PT
00000de0: 5f30 3030 5f30 3030 0d0a 436f 6e74 656e _000_000..Conten
00000df0: 742d 7479 7065 3a20 7669 6465 6f2f 6832 t-type: video/h2
00000e00: 3634 0d0a 0d0a 0000 0001 6742 001e a950 64........gB...P
00000e10: 1407 b420 0000 7d00 001d 4c00 8000 0000 ... ..}...L.....
00000e20: 0168 ce3c 8000 0000 0165 8880 4801 ffff .h.<.....e..H...
00000e30: c461 a162 8000 830f d562 e5f3 67ff c3eb .a.b.....b..g...
00000e40: 1829 f5d6 2b1e 7f18 805b 0617 d6bf 6edf .)..+....[....n.
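To double-check the offset read off the dump, here is a small Python sketch that works on the bytes of rows 00000df0 to 00000e20 above. It finds the end of the part headers, then lists the 4-byte start codes and the NAL unit type that follows each one (the low 5 bits of the next byte). The names nal_units and NAL_NAMES are mine, and the sketch only looks for the 4-byte form of the start code.

```python
START_CODE = b"\x00\x00\x00\x01"
NAL_NAMES = {5: "IDR slice", 7: "SPS", 8: "PPS"}  # only the types seen here

BASE = 0x0DF0  # file offset of the first byte below
dump = bytes.fromhex(
    "742d 7479 7065 3a20 7669 6465 6f2f 6832"
    "3634 0d0a 0d0a 0000 0001 6742 001e a950"
    "1407 b420 0000 7d00 001d 4c00 8000 0000"
    "0168 ce3c 8000 0000 0165 8880 4801 ffff"
)


def nal_units(data: bytes):
    """Yield (offset, nal_unit_type) for each 4-byte start code in data."""
    pos = data.find(START_CODE)
    while pos != -1 and pos + 4 < len(data):
        yield pos, data[pos + 4] & 0x1F
        pos = data.find(START_CODE, pos + 4)


# The payload begins right after the blank line that ends the part headers.
payload_offset = BASE + dump.find(b"\r\n\r\n") + 4

found = [(BASE + off, NAL_NAMES.get(t, str(t))) for off, t in nal_units(dump)]
```

On these bytes payload_offset comes out as 3590, and found holds an SPS at 0xe06, a PPS at 0xe1d and an IDR slice at 0xe25, which is what a stream should start with to be decodable.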
So I used dd to slice this file at that offset:
$ dd if=test/s9.raw bs=1 skip=3590 > sliced.h264
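With bs=1, dd copies one byte per read and write, which is fine for a file of this size but slow on bigger captures. A rough Python equivalent, assuming a hypothetical helper called slice_file, seeks past the header and copies the rest in larger chunks. The demo runs on a throwaway file, not on the real capture.

```python
import os
import shutil
import tempfile


def slice_file(src: str, dst: str, skip: int) -> int:
    """Copy src to dst without its first `skip` bytes; return bytes written."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fin.seek(skip)
        shutil.copyfileobj(fin, fout)
    return os.path.getsize(dst)


# Demo: a 6-byte fake header followed by a start code and some payload.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo.raw")
    dst = os.path.join(tmp, "demo.h264")
    with open(src, "wb") as f:
        f.write(b"HEADER" + b"\x00\x00\x00\x01payload")
    written = slice_file(src, dst, skip=6)
    with open(dst, "rb") as f:
        sliced = f.read()
```

For the real file the call would be slice_file("test/s9.raw", "sliced.h264", 3590).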
Then I did a quick sanity check on the sizes to make sure the difference was exactly 3590 bytes:
$ wc -c < sliced.h264
1459892
$ wc -c < test/s9.raw
1463482
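The same check, together with the hex-to-decimal conversion of the offset, written out as a few lines of Python. All the numbers are the ones from the post; only the variable names are mine.

```python
# Row 00000e00 of the dump starts with "64\r\n\r\n" (6 bytes), then the start code.
offset = 0x00000E00 + 6

original_size = 1463482  # wc -c < test/s9.raw
sliced_size = 1459892    # wc -c < sliced.h264
removed = original_size - sliced_size

assert offset == 3590
assert removed == offset  # dd dropped exactly the bytes before the start code
```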
Finally, I converted the h264 stream into mp4 using ffmpeg:
$ ffmpeg -framerate 24 -i sliced.h264 -c copy sliced.h264.mp4
And this is the final result. It's me working in front of the PC.
Actually, the reconstructed video is missing the bottom part, but I don't really care about that. I just wanted to prove that the video was sent in cleartext over the Internet.