Pieter Huybrechts PietrH

Working on machine observation data at the Research Institute for Nature and Forest. --- Biodiversity Informatics & Big Data: 🐟 🌿 🐦 📡

PietrH / intermed.R

Created June 18, 2019 11:31

Save intermediate result in dplyr pipe

	library(tidyverse)
	head(test) %>%
	{.->>intermed_result} %>% #save intermediate result
	print()

PietrH / date_day_not_equal.txt

Created June 19, 2019 10:36

Check if a date doesn't start with one of a number of day values in OpenRefine GREL (General Refine Expression Language)

	#parse the string as a date and extract the day of the month,
	#compare it to the string values '1' and '15'; if it equals either, return true, otherwise return false

	or((value.toDate('dd MMM yyyy').toString('d'))==toString(1),(value.toDate('dd MMM yyyy').toString('d'))==toString(15))

	#to be used as a Facet in OpenRefine

PietrH / distinct_rows.R

Created June 25, 2019 11:58

Keep only distinct rows of a single column in a dataframe in R, but return all columns of these rows

	library(dplyr)

	#View() comes from utils; dplyr itself does not export a lowercase view()
	distinct(DataFrame, Column_To_Filter_On, .keep_all = TRUE) %>%
	View('Dataframe distinct')

PietrH / grepl.R

Created June 26, 2019 08:01

R grepl Function, REGEX examples

	#Original Source: http://www.endmemo.com/program/R/grepl.php

	#grepl returns TRUE if a string contains the pattern, otherwise FALSE; if the parameter is a string vector,
	#returns a logical vector (match or not for each element of the vector).

	grepl(pattern, x, ignore.case = FALSE, perl = FALSE,
	fixed = FALSE, useBytes = FALSE)

	#pattern: regular expression, or string for fixed=TRUE
	#x: string, the character vector

PietrH / check_string_ending.txt

Created June 26, 2019 09:13

Checks if a string ends with a certain character, and if this is the case, omits that character in OpenRefine GREL

if(value.endsWith(';'),substring(value,0,-1),value)

PietrH / xpath_basic_syntax.md

Last active July 4, 2019 10:23

XPath basic syntax

| Expression | Description |
| --- | --- |
| `nodename` | Selects all nodes with the name "nodename" |
| `/` | Selects from the root node |
| `//` | Selects nodes in the document from the current node that match the selection no matter where they are |
| `.` | Selects the current node |
| `..` | Selects the parent of the current node |
| `@` | Selects attributes |
| `[]` | Condition on a selection |
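
The expressions above can be tried out with a short Python sketch. It uses the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath (for instance, absolute paths starting with `/` are not allowed on an element). The sample XML and all variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch
xml_text = """
<library>
  <book id="b1" lang="en"><title>Dune</title></book>
  <book id="b2" lang="nl"><title>Max Havelaar</title></book>
  <shelf><book id="b3" lang="en"><title>Emma</title></book></shelf>
</library>
"""

root = ET.fromstring(xml_text)

# nodename: direct children of the current node named "book"
direct_books = [b.get("id") for b in root.findall("book")]

# //: matching nodes anywhere below the current node
all_books = [b.get("id") for b in root.findall(".//book")]

# [] combined with @: condition on an attribute value
english_books = [b.get("id") for b in root.findall(".//book[@lang='en']")]

# ..: step up from each title to its parent element
title_parents = [p.get("id") for p in root.findall(".//title/..")]

print(direct_books, all_books, english_books, title_parents)
```

A full XPath 1.0 engine would be needed for expressions outside this subset.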

PietrH / xpath_wildcards.md

Created July 4, 2019 10:31

Wildcards for XPath queries

| Wildcard | Description |
| --- | --- |
| `*` | matches any element node |
| `@*` | matches any attribute node |
| `node()` | matches any node |
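
As a rough illustration of the `*` wildcard, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. That module does not support `@*` or `node()`, so the attribute lookup below uses `.attrib` as a stand-in rather than an XPath query. The sample XML is invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch
root = ET.fromstring('<root a="1" b="2"><x/><y><z/></y></root>')

# *: any child element of the current node
child_tags = [e.tag for e in root.findall("*")]

# .//*: any element at any depth below the current node
descendant_tags = [e.tag for e in root.findall(".//*")]

# Stand-in for @*: ElementTree exposes all attributes through .attrib
attributes = dict(root.attrib)

print(child_tags, descendant_tags, attributes)
```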

PietrH / number_from_string_re.py

Created July 5, 2019 12:12

Extract a number from a string in Python using regular expressions

	import re #regular expressions in Python

	def number_from_string(string):
	    #raw string so that \d reaches the regex engine unchanged
	    return re.findall(r"\d+", string)


	#The function will return a list of all digit sequences in the string
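
A short usage sketch of the function above. The input string is made up for illustration. Note that `re.findall` returns the matches as strings, so an explicit conversion is needed when integers are wanted.

```python
import re

def number_from_string(string):
    # Same approach as the snippet above: collect every run of digits
    return re.findall(r"\d+", string)

# Hypothetical input, invented for this example
matches = number_from_string("plot 12, trap 7, 2019-07-05")

# The matches are strings; convert them when numeric values are needed
numbers = [int(m) for m in matches]

print(matches)
print(numbers)
```

Converting to `int` drops leading zeros (`"07"` becomes `7`), so keep the string form when the exact digits matter.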

PietrH / ChangeExt.ps1

Created July 8, 2019 09:59

Bulk change the extension of files in a folder in Powershell

Get-ChildItem -Path C:\Demo -Filter *.txt | Rename-Item -NewName {[System.IO.Path]::ChangeExtension($_.Name, ".old")}

PietrH / count_nchar.R

Created July 15, 2019 06:59

Frequency table of character length

	library(dplyr)

	#we want to produce a frequency table of the length (nchar) of a column 'column' of our dataframe 'dataset'

	mutate(dataset,nchar=nchar(column)) %>% count(nchar)
