Shamil Nabiyev (shamilnabiyev)

@shamilnabiyev
shamilnabiyev / resize-image-and-bbox.py
Last active November 17, 2022 15:50
Resize an image and bounding boxes
# Source: https://sheldonsebastian94.medium.com/resizing-image-and-bounding-boxes-for-object-detection-7b9d9463125a
import albumentations
from PIL import Image
import numpy as np
sample_img = Image.open("data/img1.jpg")
sample_arr = np.asarray(sample_img)
def resize_image(img_arr, bboxes, h, w):
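The preview cuts off at the function definition. A minimal sketch of the body, assuming the albumentations Resize transform with Pascal VOC bounding boxes (not necessarily the gist's exact code):

def resize_image(img_arr, bboxes, h, w):
    # Resize the image array and rescale its [x_min, y_min, x_max, y_max] boxes together.
    transform = albumentations.Compose(
        [albumentations.Resize(height=h, width=w)],
        bbox_params=albumentations.BboxParams(format="pascal_voc"),
    )
    transformed = transform(image=img_arr, bboxes=bboxes)
    return transformed["image"], transformed["bboxes"]

# Example call (coordinates are illustrative only):
# resized_img, resized_boxes = resize_image(sample_arr, [[20, 30, 120, 140]], 256, 256)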
@shamilnabiyev
shamilnabiyev / pypi-dependency.md
Created November 12, 2022 19:29
Find dependencies of a selected PyPI package

https://pypi.org/pypi/seaborn/0.11.2/json

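That JSON endpoint exposes the package metadata; its info.requires_dist field lists the declared dependencies. A minimal sketch, assuming the requests package:

import requests

# Fetch the PyPI JSON metadata for seaborn 0.11.2 and print its declared dependencies.
resp = requests.get("https://pypi.org/pypi/seaborn/0.11.2/json", timeout=10)
resp.raise_for_status()
for requirement in (resp.json()["info"]["requires_dist"] or []):
    print(requirement)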
@shamilnabiyev
shamilnabiyev / confusion-matrix-cross-validation.md
Last active November 2, 2022 10:43
Confusion matrix for cross-validation. Results are saved as MLflow artifacts and retrieved later for evaluation.
import mlflow
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold
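The preview shows only the imports. A minimal sketch of the per-fold loop, reusing those imports; the file names and metric names are assumptions, not the gist's exact code:

# Cross-validate a RandomForestClassifier and log each fold's confusion matrix
# as an MLflow artifact.
X, y = load_breast_cancer(return_X_y=True)
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

with mlflow.start_run():
    for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
        clf = RandomForestClassifier(random_state=42)
        clf.fit(X[train_idx], y[train_idx])
        y_pred = clf.predict(X[test_idx])

        cm = confusion_matrix(y[test_idx], y_pred)
        np.savetxt(f"confusion_matrix_fold_{fold}.csv", cm, delimiter=",", fmt="%d")
        mlflow.log_artifact(f"confusion_matrix_fold_{fold}.csv")
        mlflow.log_metric(f"accuracy_fold_{fold}", accuracy_score(y[test_idx], y_pred))

The logged CSVs can later be pulled back from the run's artifact store (e.g. via mlflow.artifacts.download_artifacts) for evaluation.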
@shamilnabiyev
shamilnabiyev / hpo-optuna.md
Created October 31, 2022 18:52
Hyperparameter optimization for Random Forest Classifier using the Optuna library
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.metrics import classification_report
from sklearn.metrics import precision_recall_fscore_support
from sklearn.metrics import accuracy_score
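The preview stops after the imports. A minimal sketch of an objective function for tuning the RandomForestClassifier; the search space, trial count, and data are assumptions, not the gist's exact code:

# Tune a RandomForestClassifier with Optuna, maximizing cross-validated accuracy.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 500, step=100),
        "max_depth": trial.suggest_int("max_depth", 2, 20),
        "min_samples_split": trial.suggest_int("min_samples_split", 2, 10),
    }
    clf = RandomForestClassifier(random_state=42, **params)
    return cross_val_score(clf, X_train, y_train, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)

# Refit with the best parameters and report test-set performance.
best_clf = RandomForestClassifier(random_state=42, **study.best_params).fit(X_train, y_train)
print(classification_report(y_test, best_clf.predict(X_test)))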
@shamilnabiyev
shamilnabiyev / mlflow-autolog.md
Created October 31, 2022 18:17
MLflow with autologging and custom metrics
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
import mlflow
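The preview stops after the imports. A minimal sketch of combining autologging with manually logged test-set metrics; the metric names and data are assumptions, not the gist's exact code:

# Enable scikit-learn autologging, then add custom test-set metrics to the same run.
mlflow.sklearn.autolog()

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

with mlflow.start_run():
    clf = RandomForestClassifier(random_state=42)
    clf.fit(X_train, y_train)  # autolog records params and training metrics here
    y_pred = clf.predict(X_test)

    precision, recall, f1, _ = precision_recall_fscore_support(y_test, y_pred, average="macro")
    mlflow.log_metrics({
        "test_accuracy": accuracy_score(y_test, y_pred),
        "test_precision_macro": precision,
        "test_recall_macro": recall,
        "test_f1_macro": f1,
    })
    print(classification_report(y_test, y_pred))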
pandoc --citeproc --bibliography=test.bib -t docx -o test.docx

@shamilnabiyev
shamilnabiyev / paper.md
Last active October 4, 2022 09:47
Feature selection
  • Guyon, I., & Elisseeff, A. (2003). An Introduction to Variable and Feature Selection. J. Mach. Learn. Res., 3, 1157-1182. URL
  • Tang, J., Alelyani, S., & Liu, H. (2014). Feature Selection for Classification: A Review. Data Classification: Algorithms and Applications. URL
  • ...
@shamilnabiyev
shamilnabiyev / seaborn.md
Last active September 21, 2022 08:52
Matplotlib/Seaborn scatter plot with conditional colors and a custom color palette.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 5))

a = pd.Series(np.random.randint(60, 180, 25))
b = pd.Series(np.random.randint(55, 160, 25))

x_min = min(min(a), min(b))
y_max = max(max(a), max(b))

sns.scatterplot(x=a, y=b, ax=ax1)
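The preview stops before the conditional coloring. A minimal sketch of coloring the points by a condition with a custom palette on the second axis; the condition and hex colors are assumptions:

# Color points by whether a > b, using a custom two-color palette.
condition = (a > b).map({True: "a > b", False: "a <= b"})
palette = {"a > b": "#1f77b4", "a <= b": "#d62728"}
sns.scatterplot(x=a, y=b, hue=condition, palette=palette, ax=ax2)
ax2.set_xlim(left=x_min)
ax2.set_ylim(top=y_max)
plt.show()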
@shamilnabiyev
shamilnabiyev / RandomizedSearchCV.md
Last active October 31, 2022 18:20
Hyperparameter tuning for Random Forest Classifier using RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV
import numpy as np
SEED=42

# Number of trees in the random forest
n_estimators = list(range(100, 505, 100))
# Number of features to consider at every split
# ('auto' was removed in scikit-learn 1.3; use 'sqrt' or 'log2' instead)
max_features = ['sqrt', 'log2']
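The preview ends inside the search-space definition. A minimal sketch of the remaining distributions and the search itself; the extra distributions and n_iter are assumptions, not the gist's exact values:

# Assemble the parameter distributions and run the randomized search.
param_distributions = {
    "n_estimators": n_estimators,
    "max_features": max_features,
    "max_depth": [int(x) for x in np.linspace(10, 110, num=11)] + [None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "bootstrap": [True, False],
}

rf_random = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=SEED),
    param_distributions=param_distributions,
    n_iter=50,
    cv=5,
    random_state=SEED,
    n_jobs=-1,
)
# rf_random.fit(X_train, y_train)      # training data assumed to exist
# print(rf_random.best_params_)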
@shamilnabiyev
shamilnabiyev / one-hot-encoder.md
Last active August 15, 2022 08:18
OneHotEncoder
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Load the npz file
data = np.load('data.npz')
X, y = data["x"], data["y"]

print(y.shape)
# Output: (1519,)
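A minimal sketch of encoding the 1-D label vector; the sparse_output argument assumes scikit-learn >= 1.2 (older versions use sparse=False instead):

# Reshape y to a column vector and one-hot encode it.
encoder = OneHotEncoder(sparse_output=False)
y_onehot = encoder.fit_transform(y.reshape(-1, 1))

print(y_onehot.shape)       # (1519, n_classes)
print(encoder.categories_)  # class labels discovered in y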