Audhi Aprilliant audhiaprilliant

Eat Sleep Code Repeat

43 followers · 20 following

Towards Data Science
Indonesia
https://audhiaprilliant.github.io/

audhiaprilliant / threshold_precision_recall_fscore.py

Created December 24, 2020 03:03

How to choose the optimal threshold for imbalanced classification

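	# Calculate the f-score (set to 0 where precision + recall is 0, so argmax never lands on NaN)
	denom = precision + recall
	fscore = np.divide(2 * precision * recall, denom, out=np.zeros_like(denom, dtype=float), where=denom > 0)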

	# Find the optimal threshold
	index = np.argmax(fscore)
	thresholdOpt = round(thresholds[index], ndigits = 4)
	fscoreOpt = round(fscore[index], ndigits = 4)
	recallOpt = round(recall[index], ndigits = 4)
	precisionOpt = round(precision[index], ndigits = 4)
	print('Best Threshold: {} with F-Score: {}'.format(thresholdOpt, fscoreOpt))

audhiaprilliant / threshold_tuning.py

Created December 24, 2020 03:05

How to choose the optimal threshold for imbalanced classification

	# Array for finding the optimal threshold
	thresholds = np.arange(0.0, 1.0, 0.0001)
	fscore = np.zeros(shape=(len(thresholds)))
	print('Length of sequence: {}'.format(len(thresholds)))

	# Sweep the candidate thresholds
	for index, elem in enumerate(thresholds):
	    # Binary predictions at this threshold
	    y_pred_prob = (y_pred > elem).astype('int')
	    # Calculate the f-score
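
The preview stops at the F-score step. A minimal sketch of how such a threshold sweep could be completed, written in plain Python so it runs without dependencies. The names `y_true`, `y_score`, `f_score_at`, and `best_threshold`, the coarser step of 0.01, and the toy data are all assumptions for illustration, not taken from the gist:

```python
def f_score_at(y_true, y_score, threshold):
    """F1 score when scores strictly above `threshold` are predicted positive."""
    y_pred = [1 if s > threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no positives at all
    return 2 * tp / denom if denom else 0.0


def best_threshold(y_true, y_score, n_steps=100):
    """Evaluate thresholds 0, 1/n_steps, ... and return the first best one."""
    thresholds = [i / n_steps for i in range(n_steps)]
    scores = [f_score_at(y_true, y_score, t) for t in thresholds]
    best = max(range(n_steps), key=scores.__getitem__)
    return round(thresholds[best], 4), round(scores[best], 4)


# Toy imbalanced data (hypothetical): 3 positives out of 10 observations
y_true = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
y_score = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.45, 0.70, 0.90]

threshold_opt, fscore_opt = best_threshold(y_true, y_score)
print('Best Threshold: {} with F-Score: {}'.format(threshold_opt, fscore_opt))
```

When several thresholds tie on the F-score, this sketch keeps the lowest one, which mirrors how `np.argmax` returns the first maximum.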

audhiaprilliant / kprototype.ipynb

Last active April 13, 2023 16:37

Clustering Algorithm for Mixed Data Type

audhiaprilliant / fuzzy_optimization.py

Created February 12, 2021 12:18

Fuzzy String Matching

	# Import module for data manipulation
	import pandas as pd
	# Import module for linear algebra
	import numpy as np
	# Import module for Fuzzy string matching
	from fuzzywuzzy import fuzz, process
	# Import module for regex
	import re
	# Import module for iteration
	import itertools
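
The preview shows only the imports. As a rough illustration of the idea behind fuzzy string matching, here is a sketch that uses the standard library's `difflib` as a stand-in for `fuzzywuzzy`. This is a swapped-in technique, so its scores may differ from `fuzz.ratio`. The helpers `similarity` and `best_match`, the cutoff of 80, and the sample strings are hypothetical:

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Similarity on a 0-100 scale, ignoring case and surrounding whitespace."""
    ratio = SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()
    return round(100 * ratio)


def best_match(query, choices, min_score=80):
    """Return (choice, score) for the closest choice, or None if below min_score."""
    scored = [(choice, similarity(query, choice)) for choice in choices]
    choice, score = max(scored, key=lambda pair: pair[1])
    return (choice, score) if score >= min_score else None


# Hypothetical reference list and a misspelled query
cities = ["Jakarta", "Surabaya", "Bandung", "Yogyakarta"]
match = best_match("jakrta", cities)
print(match)
```

A cutoff such as `min_score` keeps unrelated strings from being forced onto their nearest neighbour; the right value depends on the data.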

audhiaprilliant / fuzzy_conventional.py

Last active February 12, 2021 12:51

Fuzzy String Matching

	# Import module for data manipulation
	import pandas as pd
	# Import module for linear algebra
	import numpy as np
	# Import module for Fuzzy string matching
	from fuzzywuzzy import fuzz, process
	# Import module for binary search

	def stringMatching(
	    df: pd.DataFrame,

audhiaprilliant / cluster_ensemble_data_manipulation.R

Last active March 7, 2021 08:11

Cluster Ensemble - Data Manipulation

	# Install packages for cluster ensemble
	install.packages('diceR')
	install.packages('treemapify')
	library(diceR)
	library(dplyr)
	library(ggplot2)
	library(treemapify)

	# Load the order data
	df_orders = read.csv(file = '../data/olist_orders_dataset.csv', header = TRUE, sep = ',')

audhiaprilliant / cluster_ensemble_data_visualization.R

Created March 7, 2021 08:22

Cluster Ensemble - Data Visualization

	# Group customers by their customer segment
	rfm_level_agg = rfm %>%
	  group_by(`Customer Segment`) %>%
	  summarize(
	    Recency = mean(Recency),
	    Frequency = mean(Frequency),
	    `Monetary Mean` = mean(Monetary),
	    `Monetary Count` = n(),
	    `Marketing Action` = unique(`Marketing Action`)
	  )

audhiaprilliant / kmodes.ipynb

Last active July 2, 2023 23:49

Clustering Algorithm for Categorical Data Type

audhiaprilliant / factor_analysis_composite_index.ipynb

Last active November 9, 2022 02:48

audhiaprilliant / simulation_central_limit_theorem.R

Last active May 28, 2021 03:33

	# ---------- Central Limit Theorem ----------
	# Parameters
	sample_mean = 100000
	sample_size = 20
	set.seed(1234)

	# 1 Exponential Distribution
	x = rexp(n = 4000, rate = 0.1)
	hist(x)
	mean(x)
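
The preview ends after drawing a single exponential sample. A sketch, in Python rather than R, of how such a simulation might continue: draw many samples of size `sample_size`, average each one, and check that the averages cluster around the population mean 1/rate = 10 with a spread close to (1/rate)/sqrt(sample_size). The number of repetitions (`n_means = 5000`) is an assumption chosen to keep the run short, not a value from the gist:

```python
import random
import statistics

random.seed(1234)

rate = 0.1         # same rate as the rexp() call; population mean and sd are both 1 / rate = 10
sample_size = 20   # observations averaged into each sample mean
n_means = 5000     # assumed number of repetitions

# Each entry is the mean of one independent exponential sample
sample_means = [
    statistics.fmean(random.expovariate(rate) for _ in range(sample_size))
    for _ in range(n_means)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = (1 / rate) / sample_size ** 0.5  # sigma / sqrt(n), roughly 2.24

print('Mean of sample means:', round(mean_of_means, 3))
print('SD of sample means:', round(sd_of_means, 3), 'vs expected', round(expected_sd, 3))
```

A histogram of `sample_means` would look far more symmetric than the raw exponential draws, which is the point of the theorem.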
