Last active
June 3, 2020 17:50
Cache all packages from OpenWrt trunk repository
#!/usr/bin/env python
#coding=utf-8
#
# Openwrt Package Grabber
#
# Copyright (C) 2014 http://www.shuyz.com
#

import urllib2
import re
import os

# the url of package list page, end with "/"
baseurl = 'http://downloads.openwrt.org/snapshots/trunk/ar71xx/packages/'
# which directory to save all the packages, end with "/"
savedir = './download/'

if not os.path.exists(savedir):
    os.makedirs(savedir)

print 'fetching package list from ' + baseurl
content = urllib2.urlopen(baseurl, timeout=15).read()
print 'packages list ok, analysing...'

pattern = r'<a href="(.*?)">'
items = re.findall(pattern, content)
cnt = 0

for item in items:
    if item == '../':
        continue
    else:
        cnt += 1
        print 'downloading item %d: '%(cnt) + item
        if os.path.isfile(savedir + item):
            print 'file exists, ignored.'
        else:
            rfile = urllib2.urlopen(baseurl + item)
            with open(savedir + item, "wb") as code:
                code.write(rfile.read())

print 'done!'
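The script above targets Python 2 (`urllib2`, `print` statements). A minimal Python 3 sketch of the same skip-if-cached download loop might look like the following. The function name `download_missing` and the injectable `fetch` parameter are assumptions made for illustration and are not part of the original script; by default it fetches with `urllib.request.urlopen`.

```python
import os
import urllib.request


def _http_fetch(url):
    # Default fetcher: read the whole response body into memory.
    with urllib.request.urlopen(url, timeout=15) as resp:
        return resp.read()


def download_missing(baseurl, items, savedir, fetch=_http_fetch):
    """Download each item under baseurl into savedir, skipping existing files.

    Hypothetical helper: returns the list of names actually downloaded.
    """
    os.makedirs(savedir, exist_ok=True)
    downloaded = []
    for item in items:
        target = os.path.join(savedir, item)
        if os.path.isfile(target):
            continue  # already cached, same as 'file exists, ignored.'
        data = fetch(baseurl + item)
        with open(target, "wb") as out:
            out.write(data)
        downloaded.append(item)
    return downloaded
```

Passing a custom `fetch` makes it possible to try the caching logic without touching the network, for example with a function that returns fixed bytes.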
Hi @jacobq, comments are always welcome! I wrote the script for my own simple use without testing it carefully; thanks for the correction.
This would do approximately the same thing, and it can also descend the directory hierarchy:
wget --mirror --no-parent -nv http://downloads.openwrt.org/snapshots/trunk/ar71xx/
I just happened to stumble across this and am not sure if you're looking for issues/PRs/comments/whatever, but I noticed that this just scrapes all the links and thus has some trouble with the "breadcrumb" at the top (e.g. "Index of (root) / releases / 18.06.4 / targets / x86 / 64 / packages /" has a link at each level). A quick'n'dirty way to "fix" this is to replace
    if item == '../':
with
    if "/" in item:
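The check suggested here can be sketched as a small filter. This is a Python 3 sketch under stated assumptions: `package_links` is a hypothetical name, and the sample HTML is made up to resemble an index page with a breadcrumb. It reuses the script's regex and drops any href that contains "/".

```python
import re

# Same pattern the script uses to pull hrefs out of the index page.
HREF_PATTERN = re.compile(r'<a href="(.*?)">')


def package_links(html):
    """Return hrefs that look like plain file names (no '/' anywhere)."""
    return [href for href in HREF_PATTERN.findall(html) if "/" not in href]


# Made-up sample resembling an index page with breadcrumb links.
sample = (
    '<a href="/">(root)</a> / <a href="/releases/">releases</a>'
    '<a href="../">../</a>'
    '<a href="Packages.gz">Packages.gz</a>'
    '<a href="zlib_1.2.11-2_x86_64.ipk">zlib</a>'
)
print(package_links(sample))
```

Note that this filter also skips subdirectory links such as `base/`, so it only suits a flat package directory.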