@nenodias
Created October 10, 2017 14:16
CrowlerVimeo
"""CrowlerVimeo: collect video titles and links from a Vimeo album into saida.json."""
import json

import mechanicalsoup

BROWSER = mechanicalsoup.StatefulBrowser(
    soup_config={'features': 'lxml'},
    raise_on_404=True,
    user_agent='MyBot/0.1: mysite.example.com/bot_info',
)
PREFIXO = 'https://vimeo.com'
lista = []

# Walk the album pages from the last (5) to the first (1).
for page in range(5, 0, -1):
    BROWSER.open('https://vimeo.com/album/3493175/page:{0}/sort:date/format:thumbnail'.format(page))
    # Use a separate name for the parsed page so the loop counter is not shadowed.
    soup = BROWSER.get_current_page()
    messages = soup.find('ol', class_='browse_videos')
    lista_pagina = []
    if messages is not None:
        for item in messages.find_all('li'):
            link = item.find('a')
            if link:
                title = link.attrs['title']
                href = link.attrs['href']
                print(title)
                print(href)
                lista_pagina.append({"title": title, "href": PREFIXO + href})
    # Reverse the order of the entries within each page before appending.
    lista_pagina.reverse()
    lista.extend(lista_pagina)

# Write the collected entries to a JSON file.
with open('saida.json', 'w') as f:
    json.dump(lista, f)