Hi everyone. I've wanted to make this video for a while. It is a comprehensive but general-audience introduction to large language models like ChatGPT, and what I'm hoping to achieve in this video is to give you mental models for thinking through what this tool is. It is obviously magical and amazing in some respects. It's really good at some things, not very good at other things, and there are also a lot of sharp edges to be aware of. So what is behind this text box? You can put anything in there and press enter, but what should we be putting there, and what are these words generated back? How does this work, and what are you talking to, exactly? I'm hoping to get at all those topics in this video. We're going to go through the entire pipeline of how this stuff is built, but I'm going to keep everything accessible to a general audience. So let's first take a look at how you build something like ChatGPT, and along the way I'm going to talk about some of the sort of cogniti
#!/usr/bin/env bash
FILE="/some/dir/to/file-01.sql";
echo "$FILE";
case "${FILE}" in
  **/file-01.sql \
  | **/file-02.sql \
  | **/file-03.sql)
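The snippet above is cut off before the body of the branch, so what it does on a match is not shown. As a rough illustration of the matching step only, here is a minimal Python sketch using the standard library's `fnmatch` module. The names `PATTERNS` and `matches_any` are hypothetical, and the sketch assumes that `fnmatch`-style wildcards are close enough to bash `case` patterns for these three alternatives (in both, `*` also matches `/`).

```python
from fnmatch import fnmatchcase

# Hypothetical pattern list mirroring the three alternatives in the case statement.
PATTERNS = ("**/file-01.sql", "**/file-02.sql", "**/file-03.sql")


def matches_any(path: str, patterns=PATTERNS) -> bool:
    """Return True if the path matches at least one wildcard pattern."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


print(matches_any("/some/dir/to/file-01.sql"))
print(matches_any("/some/dir/to/file-04.sql"))
```

Each pattern requires a `/` before the file name, so a bare `file-01.sql` would not match in this sketch.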
-- https://medium.com/codex/normalize-and-casefold-in-bigquery-675c670976b0
-- Example of two strings that are converted to the same string under all normalization modes
SELECT a, b,
  NORMALIZE_AND_CASEFOLD(a, NFD) AS a_nfd,
  NORMALIZE_AND_CASEFOLD(b, NFD) AS b_nfd,
  NORMALIZE_AND_CASEFOLD(a, NFC) AS a_nfc,
  NORMALIZE_AND_CASEFOLD(b, NFC) AS b_nfc,
  NORMALIZE_AND_CASEFOLD(a, NFKD) AS a_nfkd,
  NORMALIZE_AND_CASEFOLD(b, NFKD) AS b_nfkd,
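The idea behind the query can be tried outside BigQuery. The Python below is only an approximation, not BigQuery's implementation: it assumes that normalizing with `unicodedata.normalize` and then calling `str.casefold` is close enough for this pair. The helper name `normalize_and_casefold` and the sample strings `a` and `b` are illustrative choices, not taken from the original query, whose input values are not shown.

```python
import unicodedata


def normalize_and_casefold(text: str, form: str = "NFC") -> str:
    """Rough stand-in for NORMALIZE_AND_CASEFOLD: normalize, then case-fold."""
    return unicodedata.normalize(form, text).casefold()


# Hypothetical sample pair: precomposed capital A with ring above,
# versus lowercase "a" followed by a combining ring above.
a = "\u00C5"
b = "a\u030A"

# Compare the two strings under each normalization form.
results = {
    form: normalize_and_casefold(a, form) == normalize_and_casefold(b, form)
    for form in ("NFD", "NFC", "NFKD", "NFKC")
}
print(a == b, results)
```

The raw strings differ in both case and code-point sequence, yet they compare equal under every form once normalized and case-folded.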
#!/usr/bin/env bash
DATE_RANGE=$1; # 2023-05-01,2023-11-01
if [[ -z "$1" ]]; then
  echo "must specify date range (format YYYY-MM-DD,YYYY-MM-DD) as 1st argument" >&2
  exit 1
fi;
# split on ',', ';' or ':' into start and end dates
IFS=",;:" read -r START_DATE END_DATE <<< "${DATE_RANGE}";
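For comparison, the same split can be sketched in Python. This is a minimal sketch under stated assumptions, not a port of the script: the function name `parse_date_range` is hypothetical, and the two checks (exactly two parts, start not after end) are additions that the bash version does not perform.

```python
import re
from datetime import date
from typing import Tuple


def parse_date_range(arg: str) -> Tuple[date, date]:
    """Split 'START,END' on ',', ';' or ':' and parse both halves as ISO dates."""
    parts = re.split(r"[,;:]", arg.strip())
    if len(parts) != 2:
        raise ValueError("expected date range in the form YYYY-MM-DD,YYYY-MM-DD")
    start, end = (date.fromisoformat(part.strip()) for part in parts)
    # Assumption: an inverted range is treated as an error here.
    if start > end:
        raise ValueError("start date must not be after end date")
    return start, end


start_date, end_date = parse_date_range("2023-05-01,2023-11-01")
print(start_date, end_date)
```

Unlike `read`, which puts any leftover text into the last variable, this sketch rejects input with more than two parts.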
Date | Open | High | Low | Close | Adj Close | Volume
---|---|---|---|---|---|---
2022-12-01 | 197.080002 | 198.919998 | 191.800003 | 194.699997 | 194.699997 | 80046200
2022-12-02 | 191.779999 | 196.250000 | 191.110001 | 194.860001 | 194.860001 | 73645900
2022-12-05 | 189.440002 | 191.270004 | 180.550003 | 182.449997 | 182.449997 | 93122700
2022-12-06 | 181.220001 | 183.649994 | 175.330002 | 179.820007 | 179.820007 | 92150800
2022-12-07 | 175.029999 | 179.380005 | 172.220001 | 174.039993 | 174.039993 | 84213300
2022-12-08 | 172.199997 | 175.199997 | 169.059998 | 173.440002 | 173.440002 | 97624500
2022-12-09 | 173.839996 | 182.500000 | 173.360001 | 179.050003 | 179.050003 | 104872300
2022-12-12 | 176.100006 | 177.369995 | 167.520004 | 167.820007 | 167.820007 | 109794500
2022-12-13 | 174.869995 | 175.050003 | 156.910004 | 160.949997 | 160.949997 | 175862700
# clone only basic branch details / metadata & checkout only root dir
git clone --filter=blob:none --sparse %your-git-repo-url%
cd %your-local-github-repo%
# can checkout sub-directory or individual file
git sparse-checkout add %subdirectory-to-be-cloned%
conda env list
conda activate llm_3.9
pip install jupyter
pip install notebook
pip install langchain
pip install openai # required for chat and embeddings
pip install chromadb # required for default Vector DB
pip install pypdf # required for PyPDFLoader
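After running the installs, a quick check from inside the activated environment can confirm that the libraries are importable. The sketch below is an assumption-laden convenience, not part of the original setup: the names `REQUIRED` and `missing_packages` are hypothetical, and it assumes each package's import name matches its pip name, which holds for the four listed here.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Assumed import names for the libraries installed above.
REQUIRED = ("langchain", "openai", "chromadb", "pypdf")


def missing_packages(names: Iterable[str] = REQUIRED) -> List[str]:
    """Return the top-level packages that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


print(missing_packages())
```

An empty list means all four packages were found. Any names printed still need a `pip install`.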
# --------------------------------------------------------
# result of a `json_serving_input_fn` that returns
# tf.estimator.export.ServingInputReceiver(inputs, inputs)
# where 'inputs' contains the instance key
# --------------------------------------------------------
export MODE=local_single
export TRAIN_STEPS=2000
export NUM_EPOCHS=1
DATE=$(date '+%Y%m%d_%H%M%S')
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from collections import OrderedDict
import multiprocessing
import numpy as np
import six