@veekaybee
veekaybee / chatgpt.md
Last active March 10, 2025 07:45
Everything I understand about chatgpt

ChatGPT Resources

Context

ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?

I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.

Model Architecture

@gmurdocca
gmurdocca / socat_caesar_dpi.md
Last active May 2, 2025 06:17
Circumventing Deep Packet Inspection with Socat and rot13

I have a Linux virtual machine inside a customer's private network. For security, this VM is reachable only via VPN + Citrix + Windows + a Windows SSH client (e.g. PuTTY). I am tasked with ensuring that this Citrix design is secure and that users cannot access their Linux VMs or other resources on the internal private network in any way other than through Citrix.

The VM can access the internet. This task should be easy. The VM's internet gateway allows it to connect anywhere on the internet to TCP ports 80, 443, and 8090 only. Connecting to an internet bastion box on one of these ports works and I can send and receive clear text data using netcat. I plan to use good old SSH, listening on tcp/8090 on the bastion, with a reverse port forward configured to expose sshd on the VM to the public, to show their Citrix gateway can be circumvented.

Rejected by Deep Packet Inspection

I hit an immediate snag. The moment I try to establish an SSH or SSL connection over o

@darconeous
darconeous / rect-starlink-cable-hack.md
Last active December 7, 2024 17:45
Hacking the Rectangular Starlink Dishy Cable
@avilde
avilde / createRandomWithSeed.ts
Last active February 26, 2025 16:59
Mulberry 32-bit random seed generator - Typescript
/* eslint-disable no-bitwise, no-param-reassign, operator-assignment */
// Mulberry32 - 32-bit random seed generator
// Source: https://github.com/bryc/code/blob/master/jshash/PRNGs.md#mulberry32
/**
* Function is used to get the same random value every time to ensure
* data is the same in unit tests and screenshot tests for storybook
* @param seed
* @returns random number based on input seed
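The doc comment above describes a seeded generator that returns the same values on every run. As a minimal sketch of that idea, here is the Mulberry32 step as published in the linked bryc reference, wrapped in a closure. The name `mulberry32Sketch` and the closure shape are assumptions for illustration and are not taken from this gist.

```typescript
// Hypothetical helper name; the arithmetic follows the public Mulberry32 reference.
function mulberry32Sketch(seed: number): () => number {
  // Keep the internal state as a 32-bit integer.
  let state = seed | 0;
  return (): number => {
    // Advance the state by a fixed odd constant, wrapping at 32 bits.
    state = (state + 0x6d2b79f5) | 0;
    let t = state;
    // Math.imul performs 32-bit integer multiplication.
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    // Convert to an unsigned 32-bit value and scale into [0, 1).
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Usage: two generators built from the same seed yield the same sequence,
// which is what keeps unit-test and screenshot-test data stable.
const nextA = mulberry32Sketch(42);
const nextB = mulberry32Sketch(42);
const sameSequence = nextA() === nextB() && nextA() === nextB();
```

Returning a closure keeps the state private, so each test can create its own generator without sharing a global seed.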
@0xabad1dea
0xabad1dea / copilot-risk-assessment.md
Last active September 11, 2023 10:21
Risk Assessment of GitHub Copilot

0xabad1dea, July 2021

this is a rough draft and may be updated with more examples

GitHub was kind enough to grant me swift access to the Copilot test phase despite me @'ing them several hundred times about ICE. I would like to examine it not in terms of productivity, but security. How risky is it to allow an AI to write some or all of your code?

Ultimately, a human being must take responsibility for every line of code that is committed. AI should not be used for "responsibility washing." However, Copilot is a tool, and workers need their tools to be reliable. A carpenter doesn't have to

@wlib
wlib / LICENSE
Last active April 30, 2024 17:07
Run a shell script with bash, line-by-line, prompted on each command. Useful for running unknown scripts or debugging. Not a secure substitute for understanding a script beforehand.
MIT License
Copyright (c) 2021 Daniel Ethridge
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
@dabit3
dabit3 / SingleTableAppSync.md
Last active February 24, 2023 20:05
GraphQL Single Table Design with DynamoDB and AWS AppSync

GraphQL

GraphQL Schema

type Customer {
  id: ID!
  email: String!
}
@maitrungduc1410
maitrungduc1410 / create-vod-hls.sh
Last active April 12, 2025 01:00
Bash scripts to create VOD HLS stream with ffmpeg (Extended version)
#!/usr/bin/env bash
START_TIME=$SECONDS
set -e
echo "-----START GENERATING HLS STREAM-----"
# Usage create-vod-hls.sh SOURCE_FILE [OUTPUT_NAME]
[[ ! "${1}" ]] && echo "Usage: create-vod-hls.sh SOURCE_FILE [OUTPUT_NAME]" && exit 1
# comment/add lines here to control which renditions would be created
renditions=(
@simonw
simonw / macos-machine-learning-models.md
Created May 21, 2020 15:46
Machine learning models installed on macOS is part of Vision.framework

I found these while poking around at the list of open files for photoanalysisd in Activity Monitor.

% ls -lahS /System/Library/Frameworks/Vision.framework/Versions/A/Resources
total 465616
-rw-r--r--    1 root  wheel    40M Dec 13 16:32 landmarks_v2.bin
-rw-r--r--    1 root  wheel    31M Dec 13 16:32 scenenet_sc2.4_sa1.4_ae1.4_r9_opt_int8.espresso.weights
-rw-r--r--    1 root  wheel    29M Dec 13 16:32 scenenet_sc2.4_sa1.4_ae1.6_r13.4_opt_int8_asymetric.espresso.weights
@tamuhey
tamuhey / tokenizations_post.md
Last active July 27, 2024 14:46
How to calculate the alignment between BERT and spaCy tokens effectively and robustly

site: https://tamuhey.github.io/tokenizations/

Natural Language Processing (NLP) has made great progress in recent years because of neural networks, which allow us to solve various tasks with end-to-end architectures. However, many NLP systems still require language-specific pre- and post-processing, especially for tokenization. In this article, I describe an algorithm that simplifies one such process: calculating the correspondence between tokens (e.g. BERT vs. spaCy). I also introduce Python and Rust libraries that implement this algorithm. Here are the library and the demo site links: