Shantanu Oak shantanuo

  • oksoft
  • mumbai
@shantanuo
shantanuo / LibrePythonista-docker-linuxserver-libreoffice.md
Last active March 18, 2025 04:20 — forked from Amourspirit/LibrePythonista-docker-linuxserver-libreoffice.md
Running LibrePythonista in linuxserver/libreoffice docker

To install LibrePythonista in the linuxserver/libreoffice Docker image (Alpine-based), py3-pip must be installed first.

Create a Dockerfile and a docker-compose.yml file and place them in the same folder. Use the contents from the examples below.

Note that volumes are optional. To follow this yml file on Linux or Mac, create a folder named vm_shared in your home folder; that folder will be shared with your Docker container.

volumes:
 - ~/vm_shared:/vm_shared # Mount ~/vm_shared to /vm_shared inside the container
@shantanuo
shantanuo / unsupported.csv
Created February 1, 2025 05:54 — forked from jsoncow/unsupported.csv
Unsupported AWS Service Quotas
ServiceCode,ServiceName,QuotaCode,QuotaName
AWSCloudMap,AWS Cloud Map,L-D95E8A57,Instances per namespace
AWSCloudMap,AWS Cloud Map,L-2DA90E5C,Instances per service
AWSCloudMap,AWS Cloud Map,L-D589BB26,Custom attributes per instance
account,AWS Account Management,L-E37B66F4,Number of concurrent region-opt requests per account
account,AWS Account Management,L-33A0F311,Number of concurrent region-opt requests per organization
acm,AWS Certificate Manager (ACM),L-DA1D8B98,ACM certificates created in last 365 days
acm,AWS Certificate Manager (ACM),L-D2CB7DE9,Imported certificates
acm,AWS Certificate Manager (ACM),L-FB94F0B0,Domain names per ACM certificate
acm,AWS Certificate Manager (ACM),L-F141DD1D,ACM certificates
@shantanuo
shantanuo / fetch-aws-spotprice.bash
Created January 5, 2025 07:43 — forked from alexanderdavidsen/fetch-aws-spotprice.bash
Bash script to fetch AWS spot pricing
#!/bin/bash
# Colors
BLUE='\033[0;34m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
RED='\033[0;31m'
NC='\033[0m' # No Color
BOLD='\033[1m'
@shantanuo
shantanuo / tokenizers.md
Created January 30, 2023 06:29 — forked from akhan619/tokenizers.md
Exploring Tokenizers from Hugging Face

Hugging Face (HF) has made NLP (Natural Language Processing) a breeze. In this post, we are going to take a look at tokenization using a hands-on approach with the help of the Tokenizers library. We are going to load a real-world dataset containing 10-K filings of public firms and see how to train a tokenizer from scratch based on the BERT tokenization scheme. In the process, we will understand tokenization in detail and note some gotchas to keep an eye out for.
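To make the BERT tokenization scheme concrete before touching the library, here is a minimal pure-Python sketch of the greedy longest-match-first WordPiece split that BERT-style tokenizers apply to each word. It is an illustration only, not the Tokenizers library's implementation: the function names `wordpiece_tokenize` and `tokenize` and the tiny `toy_vocab` are hypothetical, and a real vocabulary would be learned from the corpus rather than written by hand.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of one word into WordPiece tokens."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No vocabulary entry matches, so the whole word is unknown.
            return [unk_token]
        tokens.append(piece)
        start = end
    return tokens


def tokenize(text, vocab):
    """Lowercase, split on whitespace, then apply WordPiece to each word."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(wordpiece_tokenize(word, vocab))
    return tokens


# Hypothetical toy vocabulary, chosen by hand for this illustration.
toy_vocab = {"[UNK]", "token", "##ization", "##izer", "##s", "filing"}

print(tokenize("Tokenization filings xyz", toy_vocab))
```

Words missing from the vocabulary are broken into known sub-word pieces where possible, and fall back to the unknown token only when no piece matches.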

Background on NLP (Optional)

If you already have an understanding of the NLP pipeline, you can safely skip this section.

For any NLP task, one of the first steps is pre-processing the data so that it can be fed into our NLP models. For those new to NLP, the general pipeline for any NLP task (text classification, question answering, etc.) is as follows:

@shantanuo
shantanuo / all_summary.py
Created December 30, 2021 09:50 — forked from NewscatcherAPI/all_summary.py
spacy_vs_nltk_newscatcher_blog
summary = [article['summary'] for article in articles]
sentence = summary[0]
@shantanuo
shantanuo / index.html
Last active May 17, 2021 00:50 — forked from aquilax/index.html
Sort textarea unique
<a href="javascript:(function(){Array.from(document.querySelectorAll('textarea')).map(function(b){var a=document.createElement('div');var d=document.createElement('button');d.textContent='↑';d.addEventListener('click',function(f){f.preventDefault();b.value=Array.from(new Set(b.value.split('\n'))).sort().join('\n')});var c=document.createElement('button');c.textContent='↓';c.addEventListener('click',function(f){f.preventDefault();b.value=Array.from(new Set(b.value.split('\n'))).sort().reverse().join('\n')});a.appendChild(d);a.appendChild(c);b.parentNode.insertBefore(a,b)})})();">Sort textarea unique</a>
@shantanuo
shantanuo / cache_example.py
Created October 28, 2019 09:55 — forked from treuille/cache_example.py
This demonstrates the st.cache function
import streamlit as st
import pandas as pd
# Reuse this data across runs!
read_and_cache_csv = st.cache(pd.read_csv)
BUCKET = "https://streamlit-self-driving.s3-us-west-2.amazonaws.com/"
data = read_and_cache_csv(BUCKET + "labels.csv.gz", nrows=1000)
desired_label = st.selectbox('Filter to:', ['car', 'truck'])
st.write(data[data.label == desired_label])
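# Grid search over the LinearSVC regularization strength C.
# X (a list of feature dicts) and y (labels) are assumed to be defined elsewhere.
from sklearn.feature_extraction import DictVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

clf = Pipeline([("dct", DictVectorizer()), ("svc", LinearSVC())])
params = {
    "svc__C": [1e15, 1e13, 1e11, 1e9, 1e7, 1e5, 1e3, 1e1, 1e-1, 1e-3, 1e-5]
}
gs = GridSearchCV(clf, params, cv=10, verbose=2, n_jobs=-1)
gs.fit(X, y)
model = gs.best_estimator_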