Emeka boris ama Emekaborisama

Machine Learning Engineer, Data Scientist, YouTuber, and Dev Advocate

Emekaborisama / 100 Days of DS Code Curriculum

Last active March 21, 2019 11:37

Inspired by Siraj Raval's post on learning data science in 3 months. Check it out here: https://www.youtube.com/watch?v=9rDhY1P3YLA

All content here has been moved to https://github.com/Emekaborisama/100daysofdscode

Emekaborisama / sample.py

Created June 8, 2022 06:26

	from magniv.core import task
	from datetime import datetime
	import urllib
	import json

	import tweepy as tp
	# Auth for the Twitter API (the 'xxxx' strings are placeholders for real credentials)
	auth = tp.OAuthHandler('xxxxxxx', 'xxxxxxxx')
	auth.set_access_token('xxxxx-xxxxx', 'xxxxxxxx')
	api = tp.API(auth, wait_on_rate_limit=False)

Emekaborisama / get_btc_price.py

Created June 8, 2022 06:52

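	import urllib.request  # urlopen lives in the urllib.request submodule
	import json

	def get_bitcoin_data():
	    """Get BTC info via the Messari API."""
	    main_result = {}
	    try:
	        url = "https://data.messari.io/api/v1/assets/btc/metrics"
	        resp = urllib.request.urlopen(url).read()
	        # Assumption: the rest of the body is not shown; decoding the JSON is a guess
	        main_result = json.loads(resp)
	    except (OSError, ValueError) as err:  # assumption: original handler not shown
	        print(f"Could not fetch BTC data: {err}")
	    return main_result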
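
What happens to the response after the request is not visible in this snippet. As a hedged sketch only, the helper below pulls a USD price out of a response body. The field names `data`, `market_data`, and `price_usd` are assumptions about the Messari response layout, and `extract_btc_price` and the sample payload are hypothetical names used for illustration.

```python
import json


def extract_btc_price(raw_body):
    """Return the USD price from a Messari-style metrics body, or None.

    Assumption: the JSON looks like {"data": {"market_data": {"price_usd": ...}}}.
    """
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # Body was not valid JSON (json.JSONDecodeError subclasses ValueError).
        return None
    data = payload.get("data") if isinstance(payload, dict) else None
    market_data = data.get("market_data") if isinstance(data, dict) else None
    price = market_data.get("price_usd") if isinstance(market_data, dict) else None
    return float(price) if isinstance(price, (int, float)) else None


# Hypothetical payload, used only to exercise the parser offline.
sample_body = b'{"data": {"symbol": "BTC", "market_data": {"price_usd": 30123.5}}}'
print(extract_btc_price(sample_body))
```

Keeping the parsing separate from the network call means the extraction logic can be checked without hitting the API.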

Emekaborisama / auth_tweepy.py

Created June 8, 2022 06:53

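	import tweepy as tp

	# Auth for the Twitter API (the 'xxxx' strings are placeholders for real credentials)
	auth = tp.OAuthHandler('xxxxxxxxx', 'xxxxxxx')
	auth.set_access_token('xxxx-xxxxx', 'xxxxxx')
	api = tp.API(auth, wait_on_rate_limit=False)


	try:
	    api.verify_credentials()
	    print("Authentication done")
	# Assumption: the original handler is not shown. TweepyException is Tweepy v4+;
	# older releases raise tp.TweepError instead.
	except tp.TweepyException as err:
	    print(f"Authentication failed: {err}")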

Emekaborisama / load_hg_model.py

Last active August 30, 2022 12:29

Load a Hugging Face model

	from sentence_transformers import SentenceTransformer, util

	from transformers import AutoTokenizer, AutoModel
	import torch
	import torch.nn.functional as F

	# Mean pooling: take the attention mask into account for correct averaging
	def mean_pooling(model_output, attention_mask):
	    # The first element of model_output holds the per-token embeddings
	    token_embeddings = model_output[0]
	    print(token_embeddings)
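
To make the comment about the attention mask concrete: mean pooling averages only the token vectors whose mask value is 1, so padding tokens do not dilute the sentence embedding. The dependency-free sketch below shows that arithmetic on plain lists for a single sentence. `masked_mean` and the toy numbers are hypothetical; the gist itself operates on torch tensors.

```python
def masked_mean(token_embeddings, attention_mask):
    """Average token vectors, ignoring positions where the mask is 0."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vector):
                totals[i] += value
    # Guard against an all-zero mask so the division never hits zero.
    count = max(sum(attention_mask), 1e-9)
    return [total / count for total in totals]


# Three token slots; the last one is padding and must not affect the result.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(masked_mean(tokens, mask))  # [2.0, 3.0]
```

A plain average over all three slots would be pulled toward the padding vector, which is why the mask matters.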

Emekaborisama / trans_inference.py

Last active August 30, 2022 12:33

transformers_inference

	# Sentences we want sentence embeddings for
	sentences = ['This is an example sentence', 'This is sample of the sentence']



	import time
	start = time.time()

Emekaborisama / convert_transformer_to_onnx.py

Created August 30, 2022 12:39

Convert a transformers model to ONNX using PyTorch

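	# `model` and `encoded_input` are assumed to come from the earlier loading and tokenizing steps
	torch.onnx.export(
	    model,
	    tuple(encoded_input.values()),
	    f="torch-model.onnx",
	    input_names=['input_ids', 'attention_mask', 'token_type_ids'],
	    output_names=['logits'],
	    # Mark batch and sequence dimensions as dynamic so input shapes can vary at inference time
	    dynamic_axes={'input_ids': {0: 'batch_size', 1: 'sequence'},
	                  'attention_mask': {0: 'batch_size', 1: 'sequence'},
	                  'token_type_ids': {0: 'batch_size', 1: 'sequence'},
	                  'logits': {0: 'batch_size', 1: 'sequence'}},
	)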

Emekaborisama / onnx_runtime_inference.py

Last active August 30, 2022 12:41

ONNX inference on CPU with optimization

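	import onnxruntime
	import time

	# Load the exported model and run it on the CPU
	ort_session = onnxruntime.InferenceSession("torch-model.onnx", providers=["CPUExecutionProvider"])

	def to_numpy(tensor):
	    # detach() must be called: a tensor that requires grad cannot be converted directly
	    return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

	def run_inference(input):
	    # `tokenizer` is assumed to be the AutoTokenizer loaded in an earlier snippet
	    tokenei = tokenizer(input, padding=True, truncation=True, return_tensors="pt")
	    attention_mask = tokenei['attention_mask']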

Emekaborisama / read_data.py

Created November 7, 2022 20:00

read_data

	import pandas as pd

	# Read the CSV file
	train_df = pd.read_csv("train.csv")
	# Print the number of rows in the DataFrame
	print(len(train_df))
	# Print a summary of the dataset (columns, dtypes, non-null counts)
	train_df.info()

Emekaborisama / handle_missing.py

Created November 7, 2022 20:02

handle missing values

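	# Handle missing values
	import numpy as np
	import pandas as pd
	from sklearn.impute import SimpleImputer

	# Replace each NaN with the most frequent value in its column
	imp_ = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
	new_train_df = imp_.fit_transform(train_df)
	# fit_transform returns a NumPy array, so rebuild the DataFrame with the original columns
	new_train_df = pd.DataFrame(new_train_df, columns=train_df.columns)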
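
To illustrate what a most-frequent strategy does to a single column, here is a dependency-free sketch. It is not scikit-learn's implementation: `impute_most_frequent` is a hypothetical helper, `None` stands in for NaN, and breaking ties by taking the smallest value is a choice made here to keep the result deterministic.

```python
from collections import Counter


def impute_most_frequent(column):
    """Replace None entries with the most common non-missing value."""
    observed = [value for value in column if value is not None]
    if not observed:
        return list(column)  # nothing to learn from; leave the column unchanged
    counts = Counter(observed)
    top = max(counts.values())
    # Break ties by taking the smallest value so the result is deterministic.
    fill = min(value for value, n in counts.items() if n == top)
    return [fill if value is None else value for value in column]


ages = [22, None, 35, 22, None, 35, 40]
print(impute_most_frequent(ages))  # [22, 22, 35, 22, 22, 35, 40]
```

Here 22 and 35 both appear twice, so the smaller one fills the gaps. Imputing with the mode works for categorical and numeric columns alike, which is why it suits a mixed DataFrame.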