m_fx nguyenhieuec

  • Viet Nam
@jagregory
jagregory / gist:710671
Created November 22, 2010 21:01
How to move to a fork after cloning
So you've cloned somebody's repo from GitHub, but now you want to fork it and contribute back. Never fear!
Technically, when you fork, "origin" should be your fork and "upstream" should be the project you forked; however, if you're willing to break this convention, then it's easy.
* Off the top of my head *
1. Fork their repo on GitHub
2. In your local clone, add a new remote pointing to your fork; then fetch it and push your changes up to it
git remote add my-fork [email protected]
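The two steps above can be sketched as a small Python helper that builds the git commands in order. This is only an illustration, not the gist author's script: the fork URL, the branch name `master`, and the helper names `fork_remote_commands` and `run_all` are all assumptions made for the example.

```python
import subprocess


def fork_remote_commands(fork_url, remote="my-fork", branch="master"):
    """Build the git commands for pointing an existing clone at your fork.

    fork_url and branch are placeholders; substitute your own values.
    """
    return [
        ["git", "remote", "add", remote, fork_url],  # register the fork as a new remote
        ["git", "fetch", remote],                    # fetch the fork's refs
        ["git", "push", remote, branch],             # push local commits to the fork
    ]


def run_all(commands, cwd="."):
    """Run each command inside the clone at cwd, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_all(fork_remote_commands("<your fork URL>"))` from inside the clone would execute the three commands; "origin" keeps pointing at the project you cloned, which is the break from convention mentioned above.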
@dnedbaylo
dnedbaylo / PersistentWebdriver.py
Created March 3, 2011 09:39
PersistentWebdriver
from selenium import webdriver
from selenium.webdriver.remote.remote_connection import RemoteConnection
from selenium.webdriver.remote.errorhandler import ErrorHandler
from selenium.webdriver.remote.command import Command
class PersistentWebdriver(webdriver.Remote):
    def __init__(self, session_id=None, browser_name=''):
        command_executor = 'http://localhost:4444/wd/hub'
@thuandt
thuandt / no_accent_vietnamese.py
Created August 22, 2012 03:07
Convert accented Vietnamese to unaccented Vietnamese
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Program to convert accented Vietnamese to unaccented Vietnamese
Adapted from source code by NamNT
http://www.vithon.org/2009/06/14/x%E1%BB%AD-ly-ti%E1%BA%BFng-vi%E1%BB%87t-trong-python
"""
import re
INTAB = "ạảãàáâậầấẩẫăắằặẳẵóòọõỏôộổỗồốơờớợởỡéèẻẹẽêếềệểễúùụủũưựữửừứíìịỉĩýỳỷỵỹđẠẢÃÀÁÂẬẦẤẨẪĂẮẰẶẲẴÓÒỌÕỎÔỘỔỖỒỐƠỜỚỢỞỠÉÈẺẸẼÊẾỀỆỂỄÚÙỤỦŨƯỰỮỬỪỨÍÌỊỈĨÝỲỶỴỸĐ"
@sebdah
sebdah / threading_example.py
Last active July 18, 2024 14:35
Running a background thread in Python
import threading
import time
class ThreadingExample(object):
    """ Threading example class
    The run() method will be started and it will run in the background
    until the application exits.
    """
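The preview above stops at the docstring. As a rough sketch of the idea it describes (a `run()` method started in the background that lives until the application exits), a daemon thread is one common way to do it. The class name `BackgroundWorker`, the `interval` parameter, the `ticks` counter, and the `stop()` method are assumptions added for illustration and are not taken from the gist.

```python
import threading
import time


class BackgroundWorker:
    """Runs run() in a daemon thread until stop() is called or the program exits."""

    def __init__(self, interval=1.0):
        self.interval = interval
        self.ticks = 0
        self._stop_event = threading.Event()
        # daemon=True means the thread does not keep the interpreter alive at exit
        self._thread = threading.Thread(target=self.run, daemon=True)
        self._thread.start()

    def run(self):
        """Loop in the background, doing one unit of work per interval."""
        while not self._stop_event.is_set():
            self.ticks += 1  # stand-in for real background work
            self._stop_event.wait(self.interval)  # sleep, but wake early on stop()

    def stop(self):
        """Ask the loop to finish and wait for the thread to end."""
        self._stop_event.set()
        self._thread.join()
```

Waiting on the `Event` instead of calling `time.sleep()` lets `stop()` take effect immediately rather than after the current interval elapses.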
@bsweger
bsweger / useful_pandas_snippets.md
Last active October 6, 2025 13:44
Useful Pandas Snippets

A personal diary of DataFrame munging over the years.

Data Types and Conversion

Convert Series datatype to numeric (will error if column has non-numeric values)
(h/t @makmanalp)
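A minimal sketch of that conversion, assuming `pd.to_numeric` is the call being described (the snippet itself is cut off in this listing). The sample Series are made up for illustration; `errors="coerce"` is shown as an optional alternative to letting the error propagate.

```python
import pandas as pd

# All values parse as numbers, so the conversion succeeds
raw = pd.Series(["1", "2.5", "3"])
numeric = pd.to_numeric(raw)

# A non-numeric value makes the default conversion raise ValueError
mixed = pd.Series(["1", "oops", "3"])
try:
    pd.to_numeric(mixed)
    raised = False
except ValueError:
    raised = True

# errors="coerce" replaces unparseable values with NaN instead of raising
coerced = pd.to_numeric(mixed, errors="coerce")
```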

@joshlk
joshlk / faster_toPandas.py
Last active September 19, 2025 16:11
PySpark faster toPandas using mapPartitions
import pandas as pd
def _map_to_pandas(rdds):
    """ Needs to be here due to pickling issues """
    return [pd.DataFrame(list(rdds))]
def toPandas(df, n_partitions=None):
    """
    Returns the contents of `df` as a local `pandas.DataFrame` in a speedy fashion. The DataFrame is
    repartitioned if `n_partitions` is passed.
@J2TEAM
J2TEAM / remove_accents.py
Created August 31, 2016 17:11 — forked from cinoss/remove_accents.py
Remove Vietnamese Accents in Python
s1 = u'ÀÁÂÃÈÉÊÌÍÒÓÔÕÙÚÝàáâãèéêìíòóôõùúýĂăĐđĨĩŨũƠơƯưẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặẸẹẺẻẼẽẾếỀềỂểỄễỆệỈỉỊịỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợỤụỦủỨứỪừỬửỮữỰựỲỳỴỵỶỷỸỹ'
s0 = u'AAAAEEEIIOOOOUUYaaaaeeeiioooouuyAaDdIiUuOoUuAaAaAaAaAaAaAaAaAaAaAaAaEeEeEeEeEeEeEeEeIiIiOoOoOoOoOoOoOoOoOoOoOoOoUuUuUuUuUuUuUuYyYyYyYy'
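def remove_accents(input_str):
    s = ''
    # print() call form works on both Python 2 and Python 3
    print(input_str.encode('utf-8'))
    for c in input_str:
        if c in s1:
            # replace the accented character with its unaccented counterpart
            s += s0[s1.index(c)]
        else:
            s += c
    return s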
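For comparison, here is a sketch of a different technique from the lookup tables above: Unicode NFD decomposition followed by dropping combining marks. It is not the gist's method. The function name `strip_vietnamese_accents` is hypothetical, and `đ`/`Đ` are handled separately because they have no canonical decomposition.

```python
import unicodedata


def strip_vietnamese_accents(text):
    """Return text with Vietnamese diacritics removed (Python 3 str)."""
    # đ and Đ are standalone letters, not base letter + combining mark
    text = text.replace("đ", "d").replace("Đ", "D")
    # NFD splits each accented letter into a base letter plus combining marks
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (category "Mn"), keep everything else
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
```

This avoids maintaining two parallel strings that must stay the same length, at the cost of also stripping combining marks from non-Vietnamese text.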
@anton-petrov
anton-petrov / recaptcha.py
Created September 24, 2017 19:11
Solve reCAPTCHA with Selenium + Python
import re, csv
from time import sleep, time
from random import uniform, randint
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
@snakers4
snakers4 / modeling.py
Created March 1, 2019 09:14
Best pretraining for Russian language - embedding bag interfaces
import numpy as np
import torch.nn as nn

class BertEmbeddingBag(nn.Module):
    """Construct the embeddings from word, position and token_type embeddings.
    """
    def __init__(self, config):
        super(BertEmbeddingBag, self).__init__()
        # self.word_embeddings = nn.Embedding(config.vocab_size, config.hidden_size)
        ngram_matrix = np.load(config.ngram_matrix_path)
        self.old_bag = config.old_bag