# Based on younesbelkada/finetune_llama_v2.py
# Install the following libraries:
# pip install accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7 scipy
from dataclasses import dataclass, field
from typing import Optional
import torch
from datasets import load_dataset
from transformers import (
{"doc_id":"nXJBccSwtB8_0","video_id":"nXJBccSwtB8","content":"You said these are dangerous times. The world order is shifting before our eyes. We also both know that with hyper disruptive technologies like AI on the horizon, a good outcome is not guaranteed. Why do you think big tech will become the third superpower and what are the dangers and opportunities if it does? Big tech is essentially sovereign over the digital world. The fact that former President Trump was de-platformed from Facebook and from Twitter when he was president, you know, most powerful political figure on the planet. And he's just taken off of those networks and as a consequence, hundreds of millions of people that would be regularly engaging with him in real time suddenly can't see it. That wasn't a decision that was made by a government. It wasn't a decision made by a judge or by a regulatory authority or even by a multinational organization like, you know, the UN. It was made by individuals that own tech companies. The same thing is t
#!/usr/bin/env bash
export PROJECT_ID=$(gcloud config get-value project)
export PROJECT_USER=$(gcloud config get-value core/account) # set current user
export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")
export IDNS=${PROJECT_ID}.svc.id.goog # workload identity domain
export GCP_REGION="us-central1"
export GCP_ZONE="us-central1-a"
#!/bin/bash
# Add to instance metadata with `gcloud compute instances add-metadata \
# instance-name --metadata-from-file startup-script=idle-shutdown.sh` and reboot
# NOTE: requires `bc`, eg, sudo apt-get install bc
# Modified from https://stackoverflow.com/questions/30556920/how-can-i-automatically-kill-idle-gce-instances-based-on-cpu-usage
threshold=0.1
count=0
wait_minutes=60
while true
# This example demonstrates running furrr code distributed on 2 AWS instances ("nodes").
# The instances have already been created.
library(future)
library(furrr)
# Two t2.micro AWS instances
# Created from http://www.louisaslett.com/RStudio_AMI/
public_ip <- c("34.205.155.182", "34.201.26.217")
---
title: New York City Taxi & Limousine Commission (TLC) Trip Data Analysis Using Sparklyr
  and Google BigQuery
author: "Mirai Solutions"
date: 8\textsuperscript{th} January 2018
output:
  html_document:
    theme: flatly
params:
  # gcp_json_keyfile: gcp_keyfile.json
- iTerm2
- Command Line Tools
  xcode-select --install
""" Implementation of Okapi BM25 with sklearn's TfidfVectorizer
Distributed as CC-0 (https://creativecommons.org/publicdomain/zero/1.0/)
"""
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from scipy import sparse
class BM25(object):
#' Group Mean
#'
#' @examples
#' iris %>%
#' group_mean(Species, Petal.Length)
group_mean <- function(tbl, group_var, summary_var){
  tbl %>%
    group_by({{ group_var }}) %>%
    summarize(
      {{ summary_var }} := mean({{ summary_var }})