Abdullah Mohammed abodacs

@abodacs
abodacs / clean_code.md
Created September 13, 2023 19:03 — forked from wojteklu/clean_code.md
Summary of 'Clean code' by Robert C. Martin

Code is clean if it can be understood easily – by everyone on the team. Clean code can be read and enhanced by a developer other than its original author. With understandability comes readability, changeability, extensibility and maintainability.


General rules

  1. Follow standard conventions.
  2. Keep it simple, stupid (KISS). Simpler is always better. Reduce complexity as much as possible.
  3. Boy scout rule. Leave the campground cleaner than you found it.
  4. Always find the root cause. Look for the underlying cause of a problem rather than patching its symptoms.
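
As one possible illustration of rule 2, the sketch below shows the same logic written twice: once with nested branches and once as a flat lookup. The function names, weight threshold, and prices are hypothetical and are not taken from the book; the point is only that the second form exposes the four cases at a glance.

```python
# Hypothetical example: both functions compute the same shipping cost.

def shipping_cost_nested(weight_kg, is_express):
    # Harder to follow: the nesting hides four simple cases.
    if weight_kg <= 1:
        if is_express:
            return 10
        else:
            return 5
    else:
        if is_express:
            return 20
        else:
            return 12


# Simpler: the same rules as a table keyed by (is_heavy, is_express).
SHIPPING_COST = {
    (False, False): 5,
    (False, True): 10,
    (True, False): 12,
    (True, True): 20,
}


def shipping_cost(weight_kg, is_express):
    # A parcel counts as heavy when it weighs more than 1 kg (assumed rule).
    return SHIPPING_COST[(weight_kg > 1, is_express)]
```

Whether a table or a pair of guard clauses reads better depends on the team's conventions; either way the aim is to reduce the number of paths a reader has to trace.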

Design rules

What is this gist?

Explanation of a full-stack deployment of Wagtail in a dockerized environment with Nginx, Elasticsearch, Postgres and Memcached

Required Skills:

  • docker
  • docker-compose
  • getting a local Wagtail site running
abodacs / finetune_llama_v2.py
Created July 19, 2023 09:49 — forked from younesbelkada/finetune_llama_v2.py
Fine tune Llama v2 models on Guanaco Dataset
# coding=utf-8
# Copyright 2023 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# pip install git+https://github.com/huggingface/transformers.git
import datetime
import sys
from transformers import pipeline
from transformers.pipelines.audio_utils import ffmpeg_microphone_live
pipe = pipeline("automatic-speech-recognition", model="openai/whisper-base", device=0)
sampling_rate = pipe.feature_extractor.sampling_rate
abodacs / GPT4all-langchain-demo.ipynb
Created April 4, 2023 10:52 — forked from psychemedia/GPT4all-langchain-demo.ipynb
Example of running GPT4all local LLM via langchain in a Jupyter notebook (Python)
from sentence_transformers import SentenceTransformer, util
import torch
# save model in current directory
model = SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2', device='cpu', cache_folder='./')
# save model in models folder (you need to create the folder on your own beforehand)
# model = SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2', device='cpu', cache_folder='./models/')
# Corpus with example sentences
corpus = [
#!/usr/bin/python
# -*- coding=utf-8 -*-
"""
An example of cleaning Arabic text with the PyArabic library
Requirements: pip install pyarabic
Data: text file
Output: text file (cleaned)
"""
import sys
from phonemizer.backend import EspeakBackend
backend = EspeakBackend('en-us', preserve_punctuation=True, with_stress=True)
text = ["Hello, world!", "Welcome to Medium!"]
phonemized = backend.phonemize(text, strip=True)
print(phonemized)
abodacs / whisper-transcribe.bash
Created November 9, 2022 08:49 — forked from DaniruKun/whisper-transcribe.bash
Transcribe (and translate) any VOD (e.g. from Youtube) using Whisper from OpenAI and embed subtitles!
#!/usr/bin/env bash
# Small shell script that makes it easier to automatically download and transcribe live stream VODs.
# This uses yt-dlp, ffmpeg and the C++ port of Whisper: https://github.com/ggerganov/whisper.cpp
# Use `./transcribe-vod help` to print help info.
# MIT License
# Copyright (c) 2022 Daniils Petrovs
abodacs / gsoc_2022_work_product.md
Created October 11, 2022 22:33 — forked from yuroitaki/gsoc_2022_work_product.md
This document summarises the work that I have done as part of Google Summer of Code 2022.

Google Summer of Code 2022 Work Product

This document summarises the work that I have done as part of Google Summer of Code 2022 (GSoC).

Summary