Kamal Shrestha shresthakamal

Senior Machine Learning Engineer

shresthakamal / sampleREADME.md

Last active July 20, 2022 05:53 — forked from FrancesCoronel/sampleREADME.md

A sample README for all your GitHub projects.

# Kamal Shrestha

shresthakamal / pytest.py

Created May 29, 2021 09:34

Pytest

	""" Pytest Notes
	Tests increase your confidence that the code behaves as you expect and ensure that changes to your code won’t cause regressions.

	- Several shortcomings of unittest:
	- Need to import the TestCase class
	- Need to define a function for each test case

	- With pytest, common tasks take less code, and there are time-saving commands for advanced tasks
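The contrast drawn in these notes can be illustrated with a minimal sketch: the same check written once in unittest style and once in pytest style. The `add` function is hypothetical and exists only to give the tests something to exercise.

```python
import unittest


def add(a, b):
    """Hypothetical function under test (not part of the original notes)."""
    return a + b


# unittest style: import TestCase, subclass it, and use assert* methods.
class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(2, 3), 5)


# pytest style: a plain function with a bare assert; nothing to import or subclass.
def test_add():
    assert add(2, 3) == 5
```

Saved as `test_add.py`, running `pytest test_add.py` should collect both tests, since pytest also runs `unittest.TestCase` classes.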

shresthakamal / dataset.py

Last active November 2, 2024 12:14

Custom Dataset in Pytorch from Pandas Dataframe

	from torch.utils.data import Dataset

	class CustomTrainDataset(Dataset):
	    def __init__(self, df, tokenizer):
	        self.df = df
	        self.tokenizer = tokenizer

	    def __len__(self):
	        return len(self.df)

shresthakamal / text-processing.py

Created February 20, 2023 08:38

Common Text Processing Steps in NLP

	# standard pre-processing steps for text processing
	# 1. lower case
	# 2. remove punctuation
	# 3. remove stop words
	# 4. remove numbers
	# 5. remove short words
	# 6. lemmatize
	# 7. stem
	# 8. remove non-ascii characters
	# 9. remove extra spaces
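Most of the steps listed above can be sketched with the standard library alone. The version below is one possible arrangement, not the gist's own implementation: the stop-word set and the `min_len` threshold are assumptions, and lemmatizing and stemming (steps 6 and 7) are left out because they normally rely on a library such as NLTK or spaCy.

```python
import re
import string
import unicodedata

# Tiny illustrative stop-word set (an assumption); real projects usually
# take a fuller list from a library such as NLTK or spaCy.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in"}


def preprocess(text: str, min_len: int = 3) -> str:
    """Apply steps 1-5, 8 and 9 from the list; 6 and 7 are omitted here."""
    text = text.lower()                                                # 1. lower case
    # 8. decompose accented characters, then drop anything non-ASCII
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    text = text.translate(str.maketrans("", "", string.punctuation))   # 2. remove punctuation
    text = re.sub(r"\d+", "", text)                                    # 4. remove numbers
    tokens = text.split()                                              # 9. whitespace split drops extra spaces
    tokens = [t for t in tokens if t not in STOP_WORDS]                # 3. remove stop words
    tokens = [t for t in tokens if len(t) >= min_len]                  # 5. remove short words
    return " ".join(tokens)


print(preprocess("The  café is OPEN at 9am, in Paris!!"))  # → cafe open paris
```

The order of the steps matters: lowercasing first lets the stop-word lookup use a lowercase set, and stripping digits before the length filter means leftovers such as "am" from "9am" are dropped as short words.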

shresthakamal / finetunning-BERT.py

Created February 20, 2023 08:56

Standard Fine-tuning Steps

	# standard steps to follow for fine-tuning BERT
	# 1. Load the pre-trained model
	# 2. Tokenize the input
	# 3. Convert the tokens to their index numbers in the BERT vocabulary
	# 4. Set the gradients of all the model’s parameters to zero
	# 5. Run the forward pass, calculate the loss, and perform a backward pass to calculate the gradients
	# 6. Clip the gradients to a maximum norm of 1.0. This helps prevent the exploding gradient problem
	# 7. Update the model’s parameters
	# 8. Update the learning rate
	# 9. Clear the calculated gradients
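Steps 4 to 9 map onto an ordinary PyTorch training loop. The sketch below is a stand-in under stated assumptions: a tiny `nn.Linear` classifier and random tensors replace the pre-trained BERT model and the tokenized inputs (steps 1 to 3) so that it runs offline, and the optimizer, scheduler, and hyperparameters are illustrative choices rather than anything taken from the gist.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for steps 1-3 (assumption): a small linear classifier and random
# features take the place of BERT and its tokenized, indexed inputs.
model = nn.Linear(8, 2)
inputs = torch.randn(16, 8)
labels = torch.randint(0, 2, (16,))

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)
# Linearly decay the learning rate from 1e-2 to 1e-3 over 5 steps.
scheduler = torch.optim.lr_scheduler.LinearLR(
    optimizer, start_factor=1.0, end_factor=0.1, total_iters=5
)
loss_fn = nn.CrossEntropyLoss()

losses = []
model.train()
for step in range(5):
    optimizer.zero_grad()            # 4 and 9: gradients are cleared before each new pass
    logits = model(inputs)           # 5: forward pass
    loss = loss_fn(logits, labels)   # 5: loss
    loss.backward()                  # 5: backward pass computes the gradients
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # 6: clip to norm 1.0
    optimizer.step()                 # 7: update the parameters
    scheduler.step()                 # 8: update the learning rate
    losses.append(loss.item())
```

Calling `zero_grad()` at the top of the loop covers both step 4 and step 9, because the gradients from one iteration are cleared before the next backward pass. With a real BERT model, the forward pass would take the token ids and attention mask instead of a plain feature tensor.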

shresthakamal / .pre-commit-config.yaml

Last active November 3, 2024 06:16

Sample pre-commit hooks

	repos:
	  - repo: https://github.com/pre-commit/pre-commit-hooks
	    rev: v5.0.0
	    hooks:
	      - id: trailing-whitespace
	      - id: end-of-file-fixer
	      - id: check-yaml
	      - id: debug-statements
	      - id: name-tests-test
	      - id: requirements-txt-fixer

shresthakamal / dataclasses.py

Created February 21, 2024 04:11

Data Class Guides in Python >=3.10

	import random
	import string
	from dataclasses import dataclass, field


	def generate_id() -> str:
	    return "".join(random.choices(string.ascii_uppercase, k=12))


	"""

shresthakamal / pydantic._usage.py

Created February 21, 2024 04:39

Pydantic Usage [similar to dataclasses]

	"""
	Basic example showing how to read and validate data from a file using Pydantic.
	"""

	import json
	from typing import List, Optional

	import pydantic
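Reading and validating records with Pydantic usually comes down to declaring a model and constructing it from parsed JSON. The sketch below is hedged accordingly: the `Book` model and its fields are hypothetical, and an inline JSON string stands in for the file so the example stays self-contained. It sticks to calls that exist in both Pydantic v1 and v2.

```python
import json
from typing import List, Optional

import pydantic


class Book(pydantic.BaseModel):
    """Hypothetical schema; the fields are assumptions, not the gist's model."""
    title: str
    author: str
    price: float
    subtitle: Optional[str] = None


# Inline JSON stands in for the contents of a data file (assumption).
raw = '[{"title": "Dune", "author": "Frank Herbert", "price": "9.99"}]'

# Each dict is validated on construction; "9.99" is coerced to a float.
books: List[Book] = [Book(**item) for item in json.loads(raw)]

# A record with a missing required field raises ValidationError.
try:
    Book(title="No price", author="Unknown")
except pydantic.ValidationError as exc:
    error_count = len(exc.errors())
```

Compared with a plain dataclass, the model checks and converts types at construction time, so malformed records fail early with a list of field-level errors.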

shresthakamal / basics_bert.py

Created February 24, 2024 08:37

Basics of BERT with good generalizations and rules of thumb

	https://colab.research.google.com/drive/1yFphU6PW9Uo6lmDly_ud9a6c4RCYlwdX#scrollTo=Mq2PKplWfbFv
	https://mccormickml.com/2019/05/14/BERT-word-embeddings-tutorial/#31-running-bert-on-our-text


	# -*- coding: utf-8 -*-
	"""BERT Word Embeddings v2.ipynb

	Automatically generated by Colaboratory.

	Original file is located at

shresthakamal / tmux

Created February 28, 2024 10:50

Tmux Server Commands