@pratos
pratos / zeppelin_ubuntu.md
Last active January 7, 2026 00:44
To Install Zeppelin [Scala and Spark] in Ubuntu 16.04 LTS

Install Zeppelin in Ubuntu systems

  • First install Java, Scala and Spark in Ubuntu

    • Install Java
      sudo apt-add-repository ppa:webupd8team/java   # add the WebUpd8 Java PPA
      sudo apt-get update                            # refresh package lists
      sudo apt-get install oracle-java8-installer    # install the Oracle JDK 8 package
      
@dusenberrymw
dusenberrymw / tensorflow_tips_and_tricks.md
Last active April 2, 2020 16:49
Tips and tricks for TensorFlow, Keras, CUDA, etc.

TensorFlow Tips & Tricks

GPU Memory Issues

  • nvidia-smi to check for current memory usage.
  • watch -n 1 nvidia-smi to monitor memory usage every second.
  • Often, extra Python processes can stay running in the background, maintaining a hold on the GPU memory, even if nvidia-smi doesn't show it.
    • This is probably due to running Keras in a notebook and then re-running the cell that starts the processes: the re-run forks the current process, which already holds GPU memory. In the future, restart the kernel first, and stop all processes before exiting (even though they are daemons and should stop automatically when the parent process ends). A sketch of one related mitigation follows this list.
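
Not from the gist itself, but a common companion mitigation in the TF 1.x / standalone-Keras setup these tips appear to target: tell TensorFlow to allocate GPU memory on demand rather than reserving the whole card up front, so a forgotten process holds less of it. A minimal sketch, assuming TF 1.x with the Keras TensorFlow backend:

    # Sketch (assumes TensorFlow 1.x and standalone Keras): allocate GPU memory
    # on demand instead of reserving the entire GPU at session creation.
    import tensorflow as tf
    from keras import backend as K

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # grow the allocation only as needed
    # Alternatively, cap the fraction of GPU memory a single process may take:
    # config.gpu_options.per_process_gpu_memory_fraction = 0.4

    K.set_session(tf.Session(config=config))  # make Keras use this session

With this, a stray notebook kernel still holds memory, but only what its models actually allocated, which makes the nvidia-smi output described above easier to interpret.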
@naotokui
naotokui / conv_autoencoder_keras.ipynb
Created January 10, 2017 04:17
Convolutional Autoencoder in Keras
(The notebook preview could not be rendered.)
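
Since the notebook itself does not render here, the following is an illustrative sketch of what a convolutional autoencoder in Keras typically looks like (for 28x28 grayscale images such as MNIST); it is not naotokui's actual code.

    # Illustrative only -- a typical convolutional autoencoder in standalone Keras,
    # not the contents of the notebook above.
    from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
    from keras.models import Model

    # Encoder: 28x28x1 input compressed to a 7x7x8 feature map.
    inp = Input(shape=(28, 28, 1))
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(inp)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)

    # Decoder: upsample back to 28x28x1 and reconstruct the input.
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

    autoencoder = Model(inp, decoded)
    autoencoder.compile(optimizer='adam', loss='binary_crossentropy')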
@W4ngatang
W4ngatang / download_glue_data.py
Last active October 21, 2025 02:22
Script for downloading data of the GLUE benchmark (gluebenchmark.com)
''' Script for downloading all GLUE data.
Note: for legal reasons, we are unable to host MRPC.
You can either use the version hosted by the SentEval team, which is already tokenized,
or you can download the original data from (https://download.microsoft.com/download/D/4/6/D46FF87A-F6B9-4252-AA8B-3604ED519838/MSRParaphraseCorpus.msi) and extract the data from it manually.
For Windows users, you can run the .msi file. For Mac and Linux users, consider an external library such as 'cabextract' (see below for an example).
You should then rename and place specific files in a folder (see below for an example).
mkdir MRPC
cabextract MSRParaphraseCorpus.msi -d MRPC
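
As background to what a downloader like this does for each hosted task, here is a generic fetch-and-unzip sketch; it is not the GLUE script itself, and the URL in the usage comment is a placeholder.

    # Sketch of the generic download-and-extract step; the URL below is a
    # placeholder, not a real GLUE data location.
    import os
    import urllib.request
    import zipfile

    def download_and_extract(task_url, data_dir, task_name):
        """Fetch one task archive and unpack it under data_dir."""
        os.makedirs(data_dir, exist_ok=True)
        zip_path = os.path.join(data_dir, task_name + ".zip")
        urllib.request.urlretrieve(task_url, zip_path)  # download the archive
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(data_dir)                     # unpack next to the zip
        os.remove(zip_path)                             # remove the archive afterwards

    # Hypothetical usage:
    # download_and_extract("https://example.com/CoLA.zip", "glue_data", "CoLA")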
@yoavg
yoavg / LLMs.md
Last active December 27, 2025 05:35

Some remarks on Large Language Models

Yoav Goldberg, January 2023

Audience: I assume you heard of chatGPT, maybe played with it a little, and were impressed by it (or tried very hard not to be). And that you also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective of my thoughts on these (and similar) models, and where we stand with respect to language understanding.

Intro

Around 2014-2017, right within the rise of neural-network based methods for NLP, I was giving a semi-academic-semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labour costs" to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We