wid widnyana

{Cloud Infrastructure, Software, System} Engineer

widnyana / imdb_next_page_spider.py

Created January 6, 2016 07:23 — forked from premit/imdb_next_page_spider.py

Scrapy reference: crawling next-page pagination links

	'''
	Spider for IMDb
	- Retrieve most popular movies & TV series with rating of 8.0 and above
	- Crawl next pages recursively
	'''

	from scrapy.contrib.spiders import CrawlSpider, Rule
	from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
	from scrapy.selector import Selector
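
The preview above stops at the imports, so the spider's rules are not visible here. As a framework-free sketch of the "crawl next pages recursively" idea only (not the gist's Scrapy code), the following follows a "next" link from page to page using the standard library. The `NextLinkParser` class, the `crawl` function, the `class="next"` marker, and the in-memory `PAGES` site are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class NextLinkParser(HTMLParser):
    """Remembers the href of the first <a class="next"> tag (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "next" and self.next_href is None:
            self.next_href = attrs.get("href")


def crawl(url, fetch, seen=None):
    """Yield each visited URL, following "next" links recursively.

    `fetch` is any callable mapping a URL to its HTML text; `seen`
    guards against pagination loops.
    """
    seen = set() if seen is None else seen
    if url is None or url in seen:
        return
    seen.add(url)
    yield url
    parser = NextLinkParser()
    parser.feed(fetch(url))
    yield from crawl(parser.next_href, fetch, seen)


# Hypothetical in-memory site standing in for real HTTP responses.
PAGES = {
    "/list?page=1": '<a class="next" href="/list?page=2">Next</a>',
    "/list?page=2": '<a class="next" href="/list?page=3">Next</a>',
    "/list?page=3": "<p>last page</p>",
}

visited = list(crawl("/list?page=1", PAGES.__getitem__))
```

In a Scrapy `CrawlSpider`, the same role is normally played by a `Rule` with a link extractor restricted to the pagination element, so the framework schedules the next page instead of a recursive call.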

widnyana / pipelines.py

Last active November 23, 2015 06:57 — forked from tzermias/pipelines.py

Scrapy MySQL pipeline. A mirror of the asynchronous MySQL pipeline. Copy-paste it directly into pipelines.py. Database credentials are stored in settings.py. Based on http://snipplr.com/view/66986/

	import MySQLdb.cursors
	from twisted.enterprise import adbapi

	from scrapy.xlib.pydispatch import dispatcher
	from scrapy import signals
	from scrapy.utils.project import get_project_settings
	from scrapy import log

	SETTINGS = get_project_settings()

widnyana / awesome_task.py

Last active August 29, 2015 14:15 — forked from winhamwr/awesome_task.py

	"""
	Celery base task aimed at longish-running jobs that return a result.

	``AwesomeResultTask`` adds thundering herd avoidance, result caching, progress
	reporting, error fallback and JSON encoding of results.
	"""
	from __future__ import division

	import logging
	import simplejson
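
The docstring above mentions thundering herd avoidance and result caching, but the preview ends before any implementation. As a rough illustration of the general idea (not the Celery task from the gist), the sketch below lets only one caller compute a value per key while concurrent callers wait and then reuse the cached result. `SingleFlightCache` and its `get` method are hypothetical names, and the in-process locks stand in for whatever shared cache a real Celery deployment would use.

```python
import threading


class SingleFlightCache:
    """Hypothetical helper: compute each key once, even under concurrent calls."""

    def __init__(self):
        self._results = {}              # key -> cached value
        self._locks = {}                # key -> lock held while computing
        self._guard = threading.Lock()  # protects the two dicts above

    def get(self, key, compute):
        # Fast path: return a cached result, or fetch the per-key lock.
        with self._guard:
            if key in self._results:
                return self._results[key]
            lock = self._locks.setdefault(key, threading.Lock())

        # Only one thread per key gets past this point at a time.
        with lock:
            with self._guard:
                if key in self._results:
                    # Another thread finished the work while we waited.
                    return self._results[key]
            value = compute()
            with self._guard:
                self._results[key] = value
            return value
```

Calling `cache.get("report", build_report)` from several threads at once would run `build_report` a single time, with the other callers blocking until the result is cached. If `compute` raises, nothing is cached and the next waiting caller retries.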

widnyana / uri_validate.py

Last active August 29, 2015 14:06 — forked from mnot/uri_validate.py

	#!/usr/bin/env python

	"""
	Regex for URIs

	These regexes are directly derived from the collected ABNF in RFC3986
	(except for DIGIT, ALPHA and HEXDIG, defined by RFC2234).

	They should be processed with re.VERBOSE.
	"""
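
The preview ends before any pattern is shown. To illustrate why such patterns are meant to be compiled with `re.VERBOSE`, here is a simplified sketch that composes a few ABNF-style fragments into one readable pattern. The fragment names loosely follow RFC 3986 rule names, but `SIMPLE_URI` and `URI_RE` are assumptions for illustration, not the gist's regexes: userinfo, IP literals, and relative references are deliberately left out.

```python
import re

# Simplified building blocks, loosely named after RFC 3986 ABNF rules.
SCHEME = r"[A-Za-z][A-Za-z0-9+.-]*"
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
UNRESERVED = r"[A-Za-z0-9._~-]"
SUB_DELIMS = r"[!$&'()*+,;=]"
PCHAR = rf"(?: {UNRESERVED} | {PCT_ENCODED} | {SUB_DELIMS} | [:@] )"

# With re.VERBOSE, whitespace outside character classes is ignored,
# so the pattern can be laid out one component per line.
# A literal '#' must be escaped, otherwise it would start a comment.
SIMPLE_URI = rf"""
    ^
    (?P<scheme> {SCHEME} ) :
    //
    (?P<host> (?: {UNRESERVED} | {PCT_ENCODED} | {SUB_DELIMS} )* )
    (?: : (?P<port> [0-9]* ) )?
    (?P<path> (?: / {PCHAR}* )* )
    (?: \? (?P<query> (?: {PCHAR} | [/?] )* ) )?
    (?: \# (?P<fragment> (?: {PCHAR} | [/?] )* ) )?
    $
"""

URI_RE = re.compile(SIMPLE_URI, re.VERBOSE)

match = URI_RE.match("http://example.com:8080/a/b?x=1#frag")
if match:
    print(match.groupdict())
```

Without `re.VERBOSE`, the spaces and line breaks in `SIMPLE_URI` would be treated as literal characters and the pattern would not match ordinary URIs.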
