Skip to content

Instantly share code, notes, and snippets.

View snoop2head's full-sized avatar

snoop2head snoop2head

View GitHub Profile
@LevanKvirkvelia
LevanKvirkvelia / nanoBERT.py
Last active October 3, 2024 17:51
nanoBERT, inspired by @karpathy's nanoGPT
import torch.nn as nn
import torch.nn.functional as F
import math
from typing import Optional, Tuple
class BertSelfAttention(nn.Module):
def __init__(self, config):
super().__init__()
if config.hidden_size % config.num_attention_heads != 0:
@avidale
avidale / t5_extra_models.py
Created August 29, 2021 07:59
T5ForSequenceClassification
import torch
import copy
from torch import nn
from transformers import T5PreTrainedModel
from transformers.models.t5.modeling_t5 import T5Stack
from transformers.modeling_outputs import SequenceClassifierOutput
from torch.nn import BCEWithLogitsLoss, CrossEntropyLoss, MSELoss
def mean_pooling(inputs, mask):
@lovit
lovit / huggingface_konlpy.md
Last active November 20, 2024 18:00
huggingface + KoNLPy

Huggingface

  • NLP 관련 다양한 패키지를 제공하고 있으며, 특히 언어 모델 (language models) 을 학습하기 위하여 세 가지 패키지가 유용
package note
transformers Transformer 기반 (masked) language models 알고리즘, 기학습된 모델을 제공
tokenizers transformers 에서 사용할 수 있는 토크나이저들을 학습/사용할 수 있는 기능 제공. transformers 와 분리된 패키지로 제공
nlp 데이터셋 및 평가 척도 (evaluation metrics) 을 제공
@typokign
typokign / zoomsucks.md
Last active September 8, 2023 05:06
Zoom Sucks

Zoom Sucks

  • Zoom abuses the installer flow on MacOS to bypass permissions dialogs (source)
  • Zoom sends identifying device info to Facebook, even when users don't have a Facebook account (source) (fixed)
  • A bug in Zoom sent identifying information (including email addresses and profile pictures) of thousands of users to strangers (source)
  • Zoom claims that meetings are end-to-end encrypted in their white paper and marketing materials, but meetings are only encrypted in transit, and are available in plaintext to Zoom servers and employees. (source)
  • zoomAutenticationTool can be used to escalat
@aditya-malte
aditya-malte / smallberta_pretraining.ipynb
Created February 22, 2020 13:41
smallBERTa_Pretraining.ipynb
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@staugur
staugur / Python URL Validation.py
Created September 3, 2019 08:24 — forked from Integralist/Python URL Validation.py
[Python URL Validation] #python #urls #validation
import re
ip_middle_octet = u"(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5]))"
ip_last_octet = u"(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))"
regex = re.compile(
u"^"
# protocol identifier
u"(?:(?:https?|ftp)://)"
# user:pass authentication
import json
import urllib.parse
import boto3
print('Loading function')
s3 = boto3.client('s3')
def lambda_handler(event, context):
#1 - Get the bucket name
@weiaicunzai
weiaicunzai / accuracy.py
Created June 22, 2018 13:20
compute top1, top5 error using pytorch
from __future__ import print_function, absolute_import
__all__ = ['accuracy']
def accuracy(output, target, topk=(1,)):
"""Computes the precision@k for the specified values of k"""
maxk = max(topk)
batch_size = target.size(0)
_, pred = output.topk(maxk, 1, True, True)
@roycewilliams
roycewilliams / pwnedpasswords-v2-top20k.txt
Last active January 4, 2025 03:16
pwnedpasswords-v2-top20k.txt
#------------------------------------------------------------------------------
# Top 20K hashes from the Troy Hunt / haveibeenpwned Pwned Passwords list v2 (2018-02-21)
# with frequency count and cracked plaintext passwords
#
# The latest version of this file can be found here:
# https://gist.github.com/roycewilliams/281ce539915a947a23db17137d91aeb7
#
# NOTE: THIS FILE IS DEPRECATED.
# The equivalent of this file, but based on v6 of the Pwned Passwords, is here:
# https://gist.github.com/roycewilliams/226886fd01572964e1431ac8afc999ce