Dmitrii I Dmitrii-I

@Dmitrii-I
Dmitrii-I / power_set.py
Last active May 2, 2018 21:31
Power set of an arbitrary set, not allowed to use set functions
def power_set(any_set):
    any_set = list(any_set)
    subsets = []
    s = tuple(range(1, len(any_set)+1))
    buffer = [[el] for el in s]
    while buffer:
        new_buffer = []
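The preview above is cut off, so the author's full approach is not visible. As a separate, minimal sketch of the same task (building a power set with list operations only, no set functions), one common technique is to double the list of subsets for each element. The name `power_set_sketch` is hypothetical and this is not the gist's own implementation.

```python
def power_set_sketch(items):
    """Return every subset of `items` as a list of lists (hypothetical helper)."""
    elements = list(items)
    subsets = [[]]  # start with the empty subset
    for el in elements:
        # each existing subset yields one extra subset that also contains `el`
        subsets += [subset + [el] for subset in subsets]
    return subsets


result = power_set_sketch("abc")
print(len(result))  # 2 ** 3 subsets
```

A collection of n elements yields 2**n subsets, so this is only practical for small inputs.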
Dmitrii-I / timezones-POSIXct.R
Created May 2, 2017 17:50
Weird timezone behavior POSIXct objects in R
cat('Weird timezone behavior')
# Weird timezone behavior
(t1 <- as.POSIXct(c('2015-02-01 20:00', '2015-10-25 02:57'))) # CET CET
# [1] "2015-02-01 20:00:00 CET" "2015-10-25 02:57:00 CET"
(t2 <- as.POSIXct(c('2015-06-01 20:00', '2015-10-25 02:57'))) # CEST CEST
# [1] "2015-06-01 20:00:00 CEST" "2015-10-25 02:57:00 CEST"
(t3 <- as.POSIXct('2015-10-25 02:57')) # CEST
# [1] "2015-10-25 02:57:00 CEST"
t1[2] == t2[2]
# [1] FALSE
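One possible reading of the R output above: on 2015-10-25 clocks in Central Europe moved back from 03:00 CEST to 02:00 CET, so the wall-clock time 02:57 occurred twice that night. The Python sketch below illustrates that ambiguity only. It assumes the session timezone is `Europe/Zurich` (not stated in the gist) and does not reproduce how R picks one of the two instants.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

zurich = ZoneInfo("Europe/Zurich")  # assumption: a Central European session timezone

# fold=0 selects the first occurrence of 02:57 (still summer time),
# fold=1 selects the second occurrence (after the clocks moved back)
first = datetime(2015, 10, 25, 2, 57, tzinfo=zurich, fold=0)
second = datetime(2015, 10, 25, 2, 57, tzinfo=zurich, fold=1)

print(first.tzname(), first.utcoffset())
print(second.tzname(), second.utcoffset())

# the same wall-clock reading maps to two instants one hour apart
gap = second.astimezone(timezone.utc) - first.astimezone(timezone.utc)
print(gap)
```

If two parsed values land on different occurrences of 02:57, they print the same clock time but compare as unequal, which matches the `FALSE` shown above.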
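import os


def list_files_recursively(topdir):
    """Return the paths of all files under topdir, relative to topdir."""
    files = []
    topdir = os.path.expanduser(topdir)
    for dirname, dirnames, filenames in os.walk(topdir):
        for filename in filenames:
            fullpath = os.path.join(dirname, filename)
            # relpath also works when topdir ends with a path separator
            files.append(os.path.relpath(fullpath, topdir))
    return files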

What is machine learning exactly? Here's what has been said:

Wikipedia
Machine learning is a subfield of computer science (CS) and artificial intelligence (AI) that deals with the construction and study of systems that can learn from data, rather than follow only explicitly programmed instructions.

Stanford Machine Learning course - Coursera
Machine learning is the science of getting computers to act without being explicitly programmed.

Data Mining - Ian H. Witten, Eibe Frank, Mark A. Hall
We interpret machine learning as the acquisition of structural descriptions from examples.

Dmitrii-I / top_10_data_mining_algos_using_R.md
Last active December 26, 2021 05:36
Top 10 algorithms in data mining -- using R

Top 10 algorithms in data mining - with R

Wu et al. describe the top 10 algorithms in data mining in (LDO) "Top 10 algorithms in data mining" (2007). This document shows how to use these algorithms in R. The datasets used ship with R itself, so there is no need to download anything; run data() to see the available datasets. Nothing here is original: everything was Googled, and no sources are referenced. The purpose is to show how quickly you can prototype most of these algorithms in R with minimal code.

1. C4.5

require(rJava) # needed for printing strings out of Java objects
require(RWeka) # contains the J48() function that builds C4.5 decision trees 
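iris_c4.5 <- J48(Species ~ ., data=iris) # fit a C4.5 tree predicting Species from all other columns
writeLines(rJava::.jstrVal(iris_c4.5$classifier)) # R names are case-sensitive, so reuse iris_c4.5 exactly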
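For intuition about what J48() does when it chooses a split: C4.5 ranks candidate attributes by gain ratio, which is information gain divided by the entropy of the split itself. The Python sketch below computes that criterion for categorical attributes on made-up toy data. The function names and the data are illustrative assumptions, not part of RWeka, and the sketch does not build a tree.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` on the categorical attribute `values`."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # weighted entropy of the class labels that remains after the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # penalises attributes with many distinct values
    return info_gain / split_info if split_info > 0 else 0.0


# toy data (made up): `outlook` separates the classes perfectly, `windy` not at all
outlook = ["sunny", "sunny", "rain", "rain"]
windy = ["yes", "no", "yes", "no"]
play = ["no", "no", "yes", "yes"]

print(gain_ratio(outlook, play))
print(gain_ratio(windy, play))
```

An attribute with a higher gain ratio is the preferred split, so here `outlook` would be chosen over `windy`.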