Created
January 21, 2025 21:19
lecture notes generated from 2020 Creative Writing Lectures at BYU by Brandon Sanderson
https://www.youtube.com/playlist?list=PLSH_xM-KC3Zv-79sVZTTj-YA6IAqh8qeQ
Attn Story Hackers: How these notes were created
I saw the post from @alex Lutz referring to Brandon Sanderson's 2020 lecture series at BYU. I'd never run across this, so it was a great new resource for me. Thanks Alex!
I also saw that the series is 13-14 hours total, and I know I'm flawed, inattentive, lazy, and slothful (like all adherents of the Oxford comma cult), so it seemed like a good opportunity to try out some LLM tools to see if I could get a set of lecture notes detailed enough that I could just read those and get most of the pertinent information.
My first thought was to use my local n8n server and copy a reference plan for getting transcripts from YouTube and analyzing them. Had it worked, the n8n solution would have let me feed in the YouTube URLs and get a Discord message with the notes, with pretty minimal effort on my part. However, I have not had a great experience with n8n in the past and this time was no different. I originally installed it to use as a quick no-code way to create small private applications for my farm team, but it's been a little fussy from the start. Once I caught myself yelling at the computer, I decided to switch gears and be a little more manual.
My next and really only choice for the LLM/interface was my Google AI Studio account. I wanted to use one of the very large context window models, and I particularly like the Gemini experimental 1206 release, which isn't clearly labeled as such but is apparently an updated Flash 2.0 model, released in early December 2024. I've had good luck with it for coding architecture tasks in the past, so I thought it might be good at this kind of relatively formal writing. This model's input context window through the AI Studio interface is a pretty impressive two million tokens, enough to handle a lot of instructions and an hour-long transcript.
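As a rough sanity check on whether a prompt plus a transcript fits in that window, something like the sketch below could be used. It is only an estimate: the four-characters-per-token figure is an assumed rule of thumb for English text, not the model's actual tokenizer, and the function names are made up for illustration.

```python
# Assumption: roughly 4 characters per token for English text (rule of thumb,
# not an exact tokenizer). The budget is the two-million-token figure quoted above.
CHARS_PER_TOKEN = 4
CONTEXT_WINDOW_TOKENS = 2_000_000


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(*texts: str, budget: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the combined estimate for all texts is within the budget."""
    return sum(estimate_tokens(t) for t in texts) <= budget
```

Calling `fits_in_context(system_prompt, transcript)` before pasting gives a quick yes/no, though for anything close to the limit a real token counter would be the safer choice.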
To get the transcripts, I probably could have used AI Studio to directly ingest the videos, but because I was still in my troubleshooting mindset from n8n, I used a free online transcript generator. There are quite a few out there, so I don't know if your choice matters, but I used https://notegpt.io/youtube-transcript-generator. Again, this is probably not the best way to do it, but it wasn't much additional work for me: I just waited a few seconds for the transcript and then pasted the whole thing into AI Studio.
AI Studio allows you to set custom instructions for the prompt. I started out with some instructions I copied from Daniel Miessler's project, Fabric (https://github.com/danielmiessler/fabric/). Mr Miessler has some pretty cool ideas about how to employ LLMs in your daily life; I recommend checking out his other projects if you have any interest. Fabric is intended as a command line tool to call AI applications that leverage LLMs on text files, using an extensive series of specialized prompts, but you can also go into the "patterns" subdirectory on that GitHub repo page and just directly read the prompts. I tried out a couple and decided that, while none of them really fit the bill (they're more geared towards very brief summaries, even with long input content), this https://github.com/danielmiessler/fabric/blob/main/patterns/summarize_lecture/system.md was the closest.
I used that prompt as a framework and did a little googling to find some kind of guide for a more extensive and detailed note-taking process. I ended up with these as my references:
https://learningcenter.unc.edu/tips-and-tools/effective-note-taking-in-class/
https://www.oxfordlearning.com/5-effective-note-taking-methods/
https://www.student.unsw.edu.au/note-taking-skills
https://lsc.cornell.edu/wp-content/uploads/2016/10/Cornell-NoteTaking-System.pdf
After I made my final prompt (with a little testing on slight variations), I was able to set the prompt as the system instruction in AI Studio and feed it the transcripts one by one, then copy and paste the resulting notes into my gist. Gemini took about 2-3 minutes to generate each set of notes.
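The copy-and-paste loop above could also be scripted. The following is only a minimal sketch, not the process actually used for these notes: the folder of `.txt` transcripts and the `generate` callable are hypothetical placeholders, and `generate` stands in for whatever client sends a system instruction plus one transcript to the model and returns the notes.

```python
from pathlib import Path
from typing import Callable

# Hypothetical signature: (system_prompt, transcript_text) -> notes in Markdown.
# Swap in a real call to the LLM client of your choice.
GenerateFn = Callable[[str, str], str]


def notes_for_transcripts(
    transcript_dir: Path, system_prompt: str, generate: GenerateFn
) -> dict[str, str]:
    """Run every transcript through the model, one at a time, in filename order."""
    notes = {}
    for path in sorted(transcript_dir.glob("*.txt")):
        transcript = path.read_text(encoding="utf-8")
        notes[path.stem] = generate(system_prompt, transcript)
    return notes


def combine_notes(notes: dict[str, str]) -> str:
    """Join per-lecture notes into one Markdown document, one heading per lecture."""
    return "\n\n".join(f"# {name}\n\n{body.strip()}" for name, body in notes.items())
```

Keeping the model call behind a plain callable means the loop can be dry-run with a stub before spending any real model time on it.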
Here is the final prompt I decided on:
IDENTITY and PURPOSE
As an organized, high-skill expert lecturer, your role is to extract information from a lecture transcript and provide a detailed set of lecture notes using bullet points and lists of definitions for each subject. You will also include timestamps to indicate where in the video these notes are from.
Before starting, think step-by-step about how you would do this. You will generally follow the outlining method of note-taking, with some modifications, as detailed in the STEPS section below. Here is a summary of the process. First, you will read the entire transcript. Next, you will create an outline listing each major topic, each subtopic for every major topic, and each key point for every subtopic. If the subtopics are further broken down into sub-subtopics, extend the notes' structure as necessary. Then go back over the outline and add one telegraphic sentence to summarize each major topic, one or two telegraphic sentence(s) for each subtopic or sub-subtopic etc., and one to three telegraphic sentence(s) for each key point. Use only as many sentences as you need to capture all of the important information.
HOW TO TELL WHAT IS IMPORTANT
Distinguish between main points, elaboration, examples, waffle or filler, and new points by listening for:
STEPS
OUTPUT INSTRUCTIONS
You only output Markdown.
In the markdown, use formatting like bold, highlight, headlines as # ## ### , blockquote as > , code block if necessary as ``` {block_code} ```, lists as * , etc. Make the output maximally readable in plain text. Create the output using the formatting above.
Do not start items with the same opening words.
Use middle ground/semi-formal speech for your output context.
To ensure the summary is easily searchable in the future, keep the structure clear and straightforward.
Ensure you follow ALL these instructions when creating your output.
Ensure all output timestamps are sequential and fall within the length of the content, e.g., if the total length of the video is 24 minutes (00:00:00 - 00:24:00), then no output timestamp can be 01:01:25, or anything else past 00:24:00!
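Outside the prompt itself, that last instruction can also be checked mechanically once the notes come back. This is a minimal sketch under one assumption: that timestamps appear in the notes as `HH:MM:SS`, as in the example in the prompt. The function names are illustrative only.

```python
import re

# Assumption: timestamps in the notes look like HH:MM:SS, e.g. 00:24:00.
TIMESTAMP = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")


def to_seconds(hours: str, minutes: str, seconds: str) -> int:
    """Convert the three captured timestamp fields to a number of seconds."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds)


def check_timestamps(notes: str, video_length_seconds: int) -> list[str]:
    """Return a list of problems; an empty list means the timestamps look plausible."""
    problems = []
    previous = -1
    for match in TIMESTAMP.finditer(notes):
        stamp = match.group(0)
        current = to_seconds(*match.groups())
        if current > video_length_seconds:
            problems.append(f"{stamp} is past the end of the video")
            continue  # do not let a bogus value distort the ordering check
        if current < previous:
            problems.append(f"{stamp} is out of order")
        previous = current
    return problems
```

For a 24-minute video, `check_timestamps(notes, 24 * 60)` would flag a stray `01:01:25` as past the end and any timestamp that jumps backwards as out of order.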