/* open up Chrome dev tools (Menu > More tools > Developer tools)
 * go to the Network tab, refresh the page, wait for images to load (on some sites you may have to scroll down to the images for them to start loading)
 * right click/ctrl click on any entry in the network log, select Copy > Copy All as HAR
 * open up the JS console and enter: var har = [paste]
 * (pasting could take a while if there are a lot of requests)
 * paste the following JS code into the console
 * copy the output, paste into a text file
 * open up a terminal in the same directory as the text file, then: wget -i [that file]
 */
var imageUrls = [];
har.log.entries.forEach(function (entry) {
  // This step filters out all URLs except images. If you only want e.g. JPEGs, check mimeType against "image/jpeg" instead.
  if (entry.response.content.mimeType.indexOf("image/") !== 0) return;
  imageUrls.push(entry.request.url);
});
console.log(imageUrls.join('\n'));
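If pasting the HAR into the console is slow, the same filtering can be done offline on a saved HAR file. The following is only a minimal Python sketch of the logic in the snippet above, not part of the original gist; the function names and the file names `page.har` and `urls.txt` are hypothetical.

```python
import json


def image_urls(har, prefix="image/"):
    """Return request URLs whose response MIME type starts with prefix."""
    urls = []
    for entry in har["log"]["entries"]:
        # Treat a missing or null mimeType as "not an image".
        mime_type = entry["response"]["content"].get("mimeType") or ""
        if mime_type.startswith(prefix):
            urls.append(entry["request"]["url"])
    return urls


def write_url_list(har_path, list_path):
    """Read a HAR file and write one image URL per line to list_path."""
    with open(har_path, encoding="utf-8") as f:
        har = json.load(f)
    with open(list_path, "w", encoding="utf-8") as out:
        out.write("\n".join(image_urls(har)) + "\n")


# Example call (hypothetical file names):
# write_url_list("page.har", "urls.txt")
```

Passing `prefix="image/jpeg"` restricts the list to JPEGs, mirroring the comment in the JS code.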
When I input "wget -i [the file path]", Windows Terminal at first asked me to "Supply values for the following parameters: Uri:", and typing the target website comes back with an error.
The instructions I wrote are for Linux. I didn't think Windows even had wget, but it sounds like it does, with a different interface. Look up how to download files from a text file containing a list of URLs on Windows.
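One option that does not depend on wget at all is a short script using only the Python standard library. This is a rough sketch, assuming Python 3 is installed; the function names and the file name `urls.txt` are hypothetical, and files that share the same name will overwrite each other.

```python
import os
import sys
import urllib.request
from urllib.parse import unquote, urlsplit


def filename_from_url(url, index):
    """Derive a local file name from a URL, falling back to the index."""
    # Use the last path segment; the query string and fragment are ignored.
    name = os.path.basename(unquote(urlsplit(url).path))
    return name or f"file_{index}"


def download_all(list_path, out_dir="."):
    """Download every non-empty line of list_path into out_dir."""
    with open(list_path, encoding="utf-8") as f:
        urls = [line.strip() for line in f if line.strip()]
    os.makedirs(out_dir, exist_ok=True)
    for index, url in enumerate(urls):
        target = os.path.join(out_dir, filename_from_url(url, index))
        try:
            urllib.request.urlretrieve(url, target)
        except (OSError, ValueError) as err:
            # Report the failure and continue with the remaining URLs.
            print(f"failed: {url} ({err})", file=sys.stderr)


# Example call (hypothetical file name):
# download_all("urls.txt")
```

This sends no cookies, so it only helps for images that are publicly reachable.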
Cool. Thanks!
I don't understand what you mean by "it takes a long time to paste". When I paste, it's instant, and then I get the message "undefined".
It occurs in situations where one needs to download a bunch of images.
Update: you can also use Charles or Fiddler to proxy the Chrome/Firefox HTTP traffic, then select and save all the image files to your computer. Remember to add a file extension such as .jpeg or .png afterwards. This is effective when you need to download images that require cookies. However, this method won't keep the files in the order they appear in the DevTools Network panel.
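When working from the HAR file instead of a proxy, both the extension and the order can be recovered from each entry's position in the log and its response MIME type. The helper below is only an illustrative sketch; `ordered_name`, the `EXTENSIONS` table, and the `.bin` fallback are assumptions, not part of any tool mentioned here.

```python
# Explicit table so the result does not depend on the platform's MIME database.
EXTENSIONS = {
    "image/jpeg": ".jpeg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/webp": ".webp",
    "image/svg+xml": ".svg",
}


def ordered_name(index, mime_type, width=4):
    """Build a file name such as 0007.png from a log position and MIME type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    base_type = mime_type.split(";")[0].strip().lower()
    extension = EXTENSIONS.get(base_type, ".bin")
    # Zero padding keeps alphabetical order equal to network-log order.
    return f"{index:0{width}d}{extension}"
```

Sorting the saved files by name then reproduces the order of the Network panel.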
An example of Python code for downloading images from a HAR file with cookies, inspired by @puziyi:
import json
import requests

with open('source_har.har', 'r', encoding="utf-8") as f:
    har_json = json.load(f)

for i, entry in enumerate(har_json["log"]["entries"]):
    # Keep only JPEG responses; the index i preserves the order of the network log.
    if entry["response"]["content"].get("mimeType", "").startswith("image/jpeg"):
        url = entry["request"]["url"]
        name = str(i) + ".jpeg"
        # HAR stores request cookies as a list of objects with "name" and "value" keys;
        # requests expects a plain name -> value dict, so build one (values as str).
        cookies = {c["name"]: str(c["value"]) for c in entry["request"]["cookies"]}
        img = requests.get(url, cookies=cookies).content
        with open(name, "wb") as out:
            out.write(img)
This worked for me on Windows 10:
& 'C:\path\to\wget.exe' -r -nH --cut-dirs=<N> -P 'C:\Path\to\output' -i 'target_link.txt'
Thanks! saved me some time <3
If you get the error "'wget' is not recognized as an internal or external command", follow this: https://bobbyhadz.com/blog/wget-is-not-recognized-as-internal-or-external-command
On Windows, instead of
"wget -i [that file]"
use the following command from PowerShell:
Get-Content [that file] | ForEach-Object { Invoke-WebRequest -Uri $_ -OutFile (Split-Path -Leaf $_) }
I figured out a workaround a while ago by using Mozilla Firefox and following the steps from there. Now, my issue is at "8. open up a terminal in same directory as text file, then: wget -i [that file]" because when I input "wget -i [the file path]", Windows Terminal at first needed me to "Supply values for the following parameters: Uri:" and typing the target website comes back with an error. Should I go somewhere else because my problem deviates from the original topic?