Skip to content

Instantly share code, notes, and snippets.

View franchb's full-sized avatar
🎯
Focusing

Eliah Rusin franchb

🎯
Focusing
View GitHub Profile
@franchb
franchb / gist:0ed60861afe5d39032520f1096629cef
Created February 2, 2018 12:47 — forked from casschin/gist:1990245
Python webdriver api quick sheet
### Locating UI elements ###
# By ID
<div id="coolestWidgetEvah">...</div>
element = driver.find_element_by_id("coolestWidgetEvah")
or
from selenium.webdriver.common.by import By
element = driver.find_element(by=By.ID, value="coolestWidgetEvah")
# By class name:

API workthough

  1. Open a browser

    # start an instance of firefox with selenium-webdriver
    driver = Selenium::WebDriver.for :firefox
    # :chrome -> chrome
    # :ie     -> iexplore
    
  • Go to a specified URL
@franchb
franchb / separator.py
Created February 14, 2018 09:12 — forked from jlln/separator.py
Efficiently split Pandas Dataframe cells containing lists into multiple rows, duplicating the other column's values.
def splitDataFrameList(df,target_column,separator):
''' df = dataframe to split,
target_column = the column containing the values to split
separator = the symbol used to perform the split
returns: a dataframe with each entry for the target column separated, with each element moved into a new row.
The values in the other columns are duplicated across the newly divided rows.
'''
def splitListToRows(row,row_accumulator,target_column,separator):
split_row = row[target_column].split(separator)
@franchb
franchb / SparkRowApply.scala
Created February 14, 2018 09:12 — forked from jlln/SparkRowApply.scala
How to apply a function to every row in a Spark DataFrame.
def findNull(row:Row):String = {
if (row.anyNull) {
val indices = (0 to row.length-1).toArray.filter(i => row.isNullAt(i))
indices.mkString(",")
}
else "-1"
}
sqlContext.udf.register("findNull", findNull _)
df = df.withColumn("MissingGroups",callUDF("findNull",struct(df.columns.map(df(_)) : _*)))
@franchb
franchb / main.go
Created February 20, 2018 11:17
reuse response Body in Golang
// read the response body to a variable
bodyBytes, _ := ioutil.ReadAll(response.Body)
// Use io.Copy to just dump the response body to the file. This supports huge files
err := ioutil.WriteFile("tmp/asdf.png", bodyBytes, 0644)
if err != nil {
log.Fatal(err)
}
fmt.Println("Captcha image saving success!")
//reset the response body to the original unread state
response.Body = ioutil.NopCloser(bytes.NewBuffer(bodyBytes))
@franchb
franchb / pandas_chunks.py
Created March 5, 2018 13:25
chunks pandas
chunksize = 500
chunks = []
for chunk in pd.read_csv('pizza.csv', chunksize=chunksize):
# Do stuff...
chunks.append(chunk)
df = pd.concat(chunks, axis=0)
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@franchb
franchb / sdev-calc.php
Created March 14, 2018 10:28
Standart deviation for in-system calculation
$this->addFunction('stats_standard_deviation',
function(array $a, $sample = false) {
$n = count($a);
if ($n === 0) {
return 0;
}
if ($sample && $n === 1) {
return 0;
}
$mean = array_sum($a) / $n;
@franchb
franchb / gist:f5551a1cd7394800df236c0843d89b59
Created March 14, 2018 11:47
PowerShell csv fast view
Import-Csv -Delimiter ‰ an_bizinvest_for_test.csv |Out-GridView
Create HADOOP_HOME environment variable pointing to your installation folder selected above.
Without admin:
$env:HADOOP_HOME = "D:\Tools\Hadoop"
With admin:
[Environment]::SetEnvironmentVariable("HADOOP_HOME", "D:\Tools\Hadoop", "Machine")
Add Hadoop bin folder to your Windows Path variable as %HADOOP_HOME%\bin.