# encoding:utf-8
'''
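1) View the hiring distribution of freight drivers in the Beijing area (including each administrative district)
python3 to_redis.py --city_name 北京 --position 货运司机 --include_district
2) View the hiring distribution of freight drivers in Beijing only
python3 to_redis.py --city_name 北京 --position 货运司机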
'''
import redis
import json
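The usage notes in the docstring above imply three command-line options. The preview does not show how the script parses them, so the following is only a minimal sketch using `argparse`; the function name `parse_args` and the help texts are assumptions, not the author's code.

```python
import argparse


def parse_args(argv=None):
    """Parse the options named in the docstring (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Show the hiring distribution for a position in a city."
    )
    # City to query, e.g. 北京
    parser.add_argument("--city_name", required=True, help="city to query")
    # Job title to query, e.g. 货运司机
    parser.add_argument("--position", required=True, help="job title to query")
    # Boolean switch: present means True, absent means False
    parser.add_argument(
        "--include_district",
        action="store_true",
        help="also break results down by administrative district",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(args.city_name, args.position, args.include_district)
```

Using `action="store_true"` matches the docstring, where `--include_district` appears as a bare flag without a value.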
from selenium import webdriver
def get_pdf():
    url = "http://www.baidu.com/"
    chrome_options = webdriver.ChromeOptions()
    settings = {
        "recentDestinations": [{
            "id": "Save as PDF",
            "origin": "local",
from sklearn.feature_extraction.text import TfidfVectorizer
documents = [
"The quick brown fox jumped over the lazy dog's back",
"Now is the time for all good men to come to the aid of their party"
]
vectorizer = TfidfVectorizer(stop_words=["for","is","of","the","to"])
X = vectorizer.fit_transform(documents)
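To see which terms survive the custom stop-word list, the tokenization step can be approximated in plain Python. This sketch assumes the vectorizer's default settings (lowercasing, and tokens of two or more word characters); `surviving_terms` is a hypothetical helper and not part of scikit-learn, and it does not compute any TF-IDF weights.

```python
import re

# Same stop words as passed to TfidfVectorizer above
STOP_WORDS = {"for", "is", "of", "the", "to"}

# Assumed default token pattern: runs of two or more word characters
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def surviving_terms(document, stop_words=STOP_WORDS):
    """Lowercase, tokenize, and drop stop words (hypothetical helper)."""
    tokens = TOKEN_PATTERN.findall(document.lower())
    return [token for token in tokens if token not in stop_words]


documents = [
    "The quick brown fox jumped over the lazy dog's back",
    "Now is the time for all good men to come to the aid of their party",
]

# Sorted set of distinct terms across both documents
vocabulary = sorted({term for doc in documents for term in surviving_terms(doc)})
print(vocabulary)
```

Under these assumptions, "dog's" contributes the token "dog", because the apostrophe splits the word and the single-character "s" is too short to match.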
@jerryan999
jerryan999 / medium_article_analysis.ipynb
Last active January 6, 2024 00:46
Given a tag name, this script crawls the posts related to that tag in the Medium archive
const puppeteer = require('puppeteer');
function delay(time) {
  return new Promise(function(resolve) {
    setTimeout(resolve, time);
  });
}
@jerryan999
jerryan999 / get_captcha_image.py
Created October 25, 2020 11:45
download captcha image for training study
import requests
import urllib
import json
import uuid
import os
image_folder = "images"
def save_image():
@jerryan999
jerryan999 / tiny-obj.cfg
Last active November 7, 2020 15:15
Yolo tiny object detection configuration (one kind of object)
[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=32
width=416
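In Darknet-style configs, `batch` is the number of images per training iteration and `subdivisions` splits that batch into smaller chunks that are processed one at a time. A minimal sketch of the resulting arithmetic for the values shown above, assuming that standard interpretation; `parse_net_options` is a hypothetical helper, not part of Darknet.

```python
# The lines shown in the preview above, copied as-is
CFG_TEXT = """\
[net]
# Testing
# batch=1
# subdivisions=1
# Training
batch=64
subdivisions=32
width=416
"""


def parse_net_options(text):
    """Collect integer key=value pairs, skipping comments and section headers."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = int(value)
    return options


options = parse_net_options(CFG_TEXT)

# Images loaded onto the device per forward pass
mini_batch = options["batch"] // options["subdivisions"]
print(mini_batch)
```

Raising `subdivisions` lowers memory use per pass while keeping the effective batch size at 64.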
@jerryan999
jerryan999 / crawl_by_puppeteer.js
Created December 10, 2020 06:18
crawl_by_puppeteer using abuyun proxy
const puppeteer = require("puppeteer");
const crypto = require('crypto');
// Proxy server
const proxyHost = "http-dyn.abuyun.com";
const proxyPort = 9020;
const proxyServer = "http://" + proxyHost + ":" + proxyPort;
// Setting the NODE_TLS_REJECT_UNAUTHORIZED environment variable to '0' makes TLS connections and HTTPS requests insecure by disabling certificate verification.
const puppeteer = require("puppeteer");
const crypto = require('crypto');
const { proxyRequest } = require('puppeteer-proxy');
// xun proxy
let orderno = 'ZF2020121058569meja6';
let secret = '';
@jerryan999
jerryan999 / sample.json
Created March 8, 2021 05:49
this is a sample json
{
"center": {
"x": 333.5164489746094,
"y": 104.28516387939453
},
"has_center": true,
"result": [
[
"box",
0.798782229423523,