YouTube to Obsidian Research Note

Think step-by-step. Show your reasoning before each phase. Create ONE artifact (the research note only; no transcript artifact). Transform the YouTube video at [URL] into a verified research note with rigorous source citations.

<credibility_markers>
✅ Verified - Source found, quotes match claim
⚠️ Partial - Source found but quotes differ/paywalled
❌ Unverified - Source not found or inaccessible
</credibility_markers>

<quality_thresholds> Minimum requirements:

  • Transcript >500 words actual content
  • At least 1 verifiable claim or source
  • Abort if: >80% music/applause, no audio, pure visual </quality_thresholds>
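
If the thresholds are checked in code before extraction begins, a minimal Python sketch might look like the following. It assumes the transcript arrives as a list of caption segments and that non-speech segments are marked `[Music]` or `[Applause]`. The function name `passes_quality_thresholds` and the marker format are illustrative assumptions, not part of any tool named in this prompt. The "at least 1 verifiable claim" requirement is left out because it depends on the extraction step.

```python
import re

MIN_WORDS = 500         # transcript must contain more than 500 words of content
MAX_NOISE_RATIO = 0.80  # abort when more than 80% of segments are music/applause

# Assumed caption markers for non-speech audio; real transcripts may differ.
NOISE_MARKER = re.compile(r"\[(?:music|applause)\]", re.IGNORECASE)


def passes_quality_thresholds(segments: list[str]) -> bool:
    """Return True when the transcript clears the word-count and noise limits."""
    if not segments:
        return False  # no audio or purely visual content
    # Strip noise markers so only spoken text is counted.
    spoken = [NOISE_MARKER.sub("", s).strip() for s in segments]
    noise_ratio = sum(1 for s in spoken if not s) / len(segments)
    word_count = sum(len(s.split()) for s in spoken)
    return noise_ratio <= MAX_NOISE_RATIO and word_count > MIN_WORDS
```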

<extraction_requirements> For EVERY claim or source mentioned, capture:

  1. Exact quote from transcript with [timestamp]
  2. Source name as stated in video
  3. Context - what claim the source supports
  4. Speaker if multiple people

Store as:

[timestamp] "Exact quote from video" - Speaker/Channel
Claims: [Source Name] says [specific claim]

</extraction_requirements>
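
The storage format above can be modelled as a small record. This is a sketch only: the class name `ExtractedClaim` and its field names are hypothetical, chosen to mirror the four capture requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedClaim:
    timestamp: str    # position in the video, e.g. "02:15"
    quote: str        # exact words from the transcript
    speaker: str      # speaker or channel name
    source_name: str  # source name as stated in the video
    claim: str        # what the source is said to support

    def render(self) -> str:
        """Format the record in the two-line layout used by this prompt."""
        return (
            f'[{self.timestamp}] "{self.quote}" - {self.speaker}\n'
            f"Claims: {self.source_name} says {self.claim}"
        )
```

Keeping the quote and the paraphrased claim in separate fields makes it harder to confuse what the video said with what the source is assumed to say.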

<extraction_patterns> Sources to identify in transcripts:

Academic citations:

  • "According to Johnson et al. (2023)..."
  • "In the 2024 MIT study on..."
  • "Research published in Nature shows..."
  • "Smith and colleagues found that..."

Documentation/websites:

  • "According to Docker's official documentation..."
  • "The React docs recommend..."
  • "On their website, OpenAI states..."
  • "As shown on docs.python.org..."

Tools/technologies:

  • "We'll be using Kubernetes for..."
  • "Built with Next.js and Tailwind..."
  • "Powered by GPT-4's API..."
  • "Deployed on AWS Lambda..."

Quantitative claims:

  • "10x faster than traditional methods"
  • "Reduces costs by 50%"
  • "Used by 90% of Fortune 500"
  • "Processes 1 million requests per second" </extraction_patterns>
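
As a rough first pass, the phrasings listed above could be flagged with regular expressions before closer reading. The sketch below is an assumption-laden heuristic, not a complete detector: the category names and the `classify` helper are hypothetical, and the patterns cover only the example wordings in this section.

```python
import re

# One pattern per category; each covers only the example phrasings above.
SOURCE_PATTERNS = {
    "academic": re.compile(r"\b(?:et al|study|published in|colleagues found)\b", re.I),
    "documentation": re.compile(r"\b(?:documentation|docs|on their website)\b", re.I),
    "tool": re.compile(r"\b(?:we'll be using|built with|powered by|deployed on)\b", re.I),
    "quantitative": re.compile(r"\b\d+(?:\.\d+)?\s*(?:x\b|%|(?:million|billion)\b)", re.I),
}


def classify(sentence: str) -> list[str]:
    """Return the categories whose pattern occurs in the sentence."""
    return [name for name, pattern in SOURCE_PATTERNS.items() if pattern.search(sentence)]
```

A sentence that matches nothing returns an empty list, so it can simply be skipped. Matches still need a human or model read, since a phrase like "study" also appears in sentences that cite nothing.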

<verification_requirements> During web_search and web_fetch, capture:

  1. Full URL where information found
  2. Direct quote from source supporting/refuting claim
  3. Section/page reference if available
  4. Access level (full text, abstract only, paywalled)
  5. Publication date to assess currency

Store verification results as:

Source: [Name]
URL: https://exact.url.com/page#section
Quote: "Exact quote from source document"
Access: Full/Abstract/Paywalled
Match: ✅/⚠️/❌

</verification_requirements>
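
The verification record and its marker can be sketched as follows. This is one possible reading of the credibility markers, stated as an assumption: a source that was found and yielded a matching quote counts as verified, a found source with a differing or unreadable (paywalled) quote counts as partial, and a missing source counts as unverified. The `Verification` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

VERIFIED, PARTIAL, UNVERIFIED = "✅", "⚠️", "❌"


@dataclass
class Verification:
    source: str
    url: Optional[str]         # None when no source was found
    quote: Optional[str]       # None when no quote could be read
    access: str                # "Full", "Abstract", or "Paywalled"
    quote_matches_claim: bool  # judged by comparing quote and video claim

    def marker(self) -> str:
        """Map the record onto one of the three credibility markers."""
        if self.url is None:
            return UNVERIFIED
        if self.quote is not None and self.quote_matches_claim:
            return VERIFIED
        return PARTIAL

    def render(self) -> str:
        """Format the record in the five-line layout used by this prompt."""
        return "\n".join([
            f"Source: {self.source}",
            f"URL: {self.url or 'not found'}",
            f'Quote: "{self.quote}"' if self.quote else "Quote: not available",
            f"Access: {self.access}",
            f"Match: {self.marker()}",
        ])
```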

  • YouTube: download_transcript, get_video_metadata_summary, download_video_subtitles
  • Obsidian: create_vault_file, search_vault_simple, search_vault_smart, list_vault_files, get_vault_file
  • Web: web_search, web_fetch

  1. **Extract**: Download transcript → Capture exact quotes → Save to Obsidian
  2. **Verify**: Search sources → Extract source quotes → Compare claims
  3. **Cross-Reference**: Comprehensive vault search for connections
  4. **Generate**: Create research note with full citations → Save to Obsidian

Folder: Reading Notes/[Title - Channel - Date]/
Files: [Title] - Transcript.md, [Title] - Notes.md
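
The naming scheme can be expressed as a small path helper. This is a sketch under assumptions: `note_paths` is a hypothetical helper, and replacing characters that are unsafe in file names with `-` is an added convention that this prompt does not specify.

```python
import re
from pathlib import PurePosixPath

# Characters that commonly break file names; replacing them is an assumption.
_UNSAFE = re.compile(r'[\\/:*?"<>|]')


def _clean(part: str) -> str:
    return _UNSAFE.sub("-", part).strip()


def note_paths(title: str, channel: str, date: str) -> tuple[PurePosixPath, PurePosixPath]:
    """Return (transcript_path, notes_path) inside the vault."""
    title, channel = _clean(title), _clean(channel)
    folder = PurePosixPath("Reading Notes") / f"{title} - {channel} - {date}"
    return folder / f"{title} - Transcript.md", folder / f"{title} - Notes.md"
```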

<error_recovery> Video not available: → STOP. Return: "Video not accessible at provided URL."

Transcript download fails: Try in order:

  1. download_transcript(url, language="en")
  2. download_video_subtitles(url, language="en") - closed captions
  3. download_video_subtitles(url, auto_generated=true) - auto-generated
  4. download_video_subtitles(url, language="es/fr/de/ja/etc") → Translate to English
  5. If all fail → STOP. Return: "No transcript or subtitles available. Cannot process."

Metadata only: → NOT SUFFICIENT. Do not proceed without actual content.

Web search fails: → Mark source as ❌ unverified, continue with others

Vault search fails: → Note in frontmatter, continue without connections </error_recovery>
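
The transcript fallback order amounts to "try each option in turn and stop at the first that returns text". A minimal sketch of that control flow follows. It assumes each tool call is wrapped in a zero-argument callable; `first_available` and the wrapper labels are hypothetical, and the real tool signatures are not modelled here.

```python
from typing import Callable, Optional, Sequence

Fetcher = Callable[[], Optional[str]]


def first_available(attempts: Sequence[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each attempt in order; return (label, text) for the first with content."""
    for label, fetch in attempts:
        try:
            text = fetch()
        except Exception:
            continue  # a failed tool call means "try the next option"
        if text and text.strip():
            return label, text
    # Every option failed: stop rather than proceed on metadata alone.
    raise RuntimeError("No transcript or subtitles available. Cannot process.")
```

In use, the list would hold one wrapper per numbered step, in the same order as above, so that an empty result and a raised error are both treated as a reason to move on.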

<example_success> URL: https://youtube.com/watch?v=docker_tutorial

Transcript downloaded (1,247 words). Extracting exact quotes:

[02:15] "According to Docker's official documentation, containers use roughly 10 times less resources than traditional VMs because they share the host operating system kernel"
Claims: Docker docs state 10x efficiency

[05:43] "The 2024 CNCF Container Report backs this up with benchmark data showing 8 to 15x improvements across 500 organizations"
Claims: CNCF Report confirms 8-15x efficiency

[08:11] "As demonstrated in the Smith et al 2023 paper on containerization efficiency published in IEEE Cloud Computing"
Claims: Academic validation of efficiency gains

[10:55] "We'll be using Docker Compose for orchestration"
Tool mentioned: Docker Compose

<phase_1_extract> Captured 3 source claims and 1 tool reference with exact quotes. Saved transcript to: Reading Notes/Docker Fundamentals - TechExplained - 2025-01-15/Docker Fundamentals - Transcript.md </phase_1_extract>

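<phase_2_verify> Source verification with quotes:

  1. Docker documentation claim:

    • URL: https://docs.docker.com/get-started/overview/#containers-and-vms
    • Source quote: "A container can run with as little as a few megabytes of memory, while a VM typically requires gigabytes, allowing 5-10x more application density"
    • Match: ⚠️ Partial (video says "10x", documentation says "5-10x")
  2. CNCF Report 2024:

    • URL: https://www.cncf.io/reports/container-report-2024/
    • Source quote: "Survey respondents reported average resource efficiency improvements of 8-15x when migrating from VMs to containers"
    • Match: ✅ Exact match
  3. Smith et al 2023:

    • URL: https://doi.org/10.1109/MCC.2023.3241897
    • Access: Abstract only (paywalled)
    • Abstract quote: "Our analysis demonstrates container resource utilization improvements averaging 10.3x compared to traditional virtualization"
    • Match: ✅ Confirms claim
  4. Docker Compose:

    • URL: https://docs.docker.com/compose/
    • Match: ✅ Official Docker tool
</phase_2_verify>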

<phase_3_cross_reference> Vault search results:

  • [[Container vs VM]]: States "5-8x efficiency" - needs update based on newer sources
  • [[Docker Basics]]: Missing Compose section
  • [[Performance Benchmarks]]: 2019 data outdated </phase_3_cross_reference>

<phase_4_generate> Create artifact - Research Note with Enhanced Citations:

id: docker-fundamentals-notes
type: text/markdown
title: Docker Fundamentals - Research Notes

---
type: video-note
url: https://youtube.com/watch?v=docker_tutorial
channel: TechExplained
duration: 15:32
views: 45,231
date_published: 2025-01-10
date_processed: 2025-01-15
sources_verified: 4/4
citation_type: enhanced
---

# Docker Fundamentals

## Summary
Docker enables OS-level virtualization through containers. The video claims 10x efficiency gains over VMs; the verified sources report a range of roughly 5-15x, attributed to containers sharing the host kernel while maintaining process isolation.

## Key Claims with Full Citation

### Container Resource Efficiency [02:15]

**Video Quote:**
> "According to Docker's official documentation, containers use roughly 10 times less resources than traditional VMs because they share the host operating system kernel"

**Source Verification:**
- **Found at:** https://docs.docker.com/get-started/overview/#containers-and-vms
- **Source Quote:** "A container can run with as little as a few megabytes of memory, while a VM typically requires gigabytes, allowing 5-10x more application density"
- **Verification:** ⚠️ Partial Match
- **Discrepancy:** Video claims "10x" but documentation states "5-10x" range
- **Vault:** [[Container vs VM]] claims "5-8x" - aligns more with Docker docs than video

### Industry Benchmarks [05:43]

**Video Quote:**
> "The 2024 CNCF Container Report backs this up with benchmark data showing 8 to 15x improvements across 500 organizations"

**Source Verification:**
- **Found at:** https://www.cncf.io/reports/container-report-2024/
- **Source Quote:** "Survey respondents reported average resource efficiency improvements of 8-15x when migrating from VMs to containers"
- **Verification:** ✅ Exact Match
- **Page:** Section 3.2 "Resource Efficiency Metrics"
- **Vault:** [[Performance Benchmarks]] needs update with this 2024 data

### Academic Validation [08:11]

**Video Quote:**
> "As demonstrated in the Smith et al 2023 paper on containerization efficiency published in IEEE Cloud Computing"

**Source Verification:**
- **Found at:** https://doi.org/10.1109/MCC.2023.3241897
- **DOI:** 10.1109/MCC.2023.3241897
- **Access:** ⚠️ Paywalled (Abstract only)
- **Abstract Quote:** "Our analysis demonstrates container resource utilization improvements averaging 10.3x compared to traditional virtualization"
- **Verification:** ✅ Confirms Claim
- **Note:** Full paper access requires IEEE subscription

### Orchestration Tools [10:55]

**Video Quote:**
> "We'll be using Docker Compose for orchestration"

**Source Verification:**
- **Official Site:** https://docs.docker.com/compose/
- **Verified:** ✅ Official Docker tool
- **Vault:** [[Docker Basics]] missing Compose documentation

## Source Verification Summary Table

| Timestamp | Claim | Source URL | Video Quote | Source Quote | Match |
|-----------|-------|------------|-------------|--------------|-------|
| 02:15 | 10x efficiency | [Docker Docs](https://docs.docker.com/get-started/overview/#containers-and-vms) | "10 times less resources" | "5-10x more application density" | ⚠️ Range |
| 05:43 | 8-15x benchmark | [CNCF 2024](https://www.cncf.io/reports/container-report-2024/) | "8 to 15x improvements" | "8-15x when migrating" | ✅ Exact |
| 08:11 | Academic validation | [IEEE](https://doi.org/10.1109/MCC.2023.3241897) | "Smith et al 2023" | "averaging 10.3x" | ✅ Match |

## All Sources with Direct Links

### Fully Accessible (2)
1. **Docker Documentation**
   - URL: https://docs.docker.com/get-started/overview/#containers-and-vms
   - Content: Full architecture comparison
   - Last Updated: January 2025

2. **CNCF Container Report 2024**
   - URL: https://www.cncf.io/reports/container-report-2024/
   - Content: Industry survey of 500 organizations
   - Published: March 2024

### Partially Accessible (1)
3. **Smith et al. 2023 - IEEE Cloud Computing**
   - DOI: https://doi.org/10.1109/MCC.2023.3241897
   - Access: Abstract only without subscription
   - Full Text: $33 or institutional access required

### Tools Referenced (1)
4. **Docker Compose**
   - Documentation: https://docs.docker.com/compose/
   - GitHub: https://github.com/docker/compose
   - Latest Version: 2.24.0

## Vault Integration Analysis

### Updates Required Based on Sources
- [[Container vs VM]]: Update "5-8x" to "5-10x" per Docker docs
- [[Performance Benchmarks]]: Add CNCF 2024 data (8-15x range)
- [[Docker Basics]]: Add Compose section with link to official docs

### Citation Discrepancies Found
- Video oversimplifies Docker's "5-10x" range to "10x"
- Vault understates efficiency at "5-8x" (pre-2024 data)

## Technical Content
```yaml
# Docker Compose example from [12:30]
version: '3.8'
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
```

## Follow-up Research

- Access full text of Smith 2023 paper (DOI: 10.1109/MCC.2023.3241897)
- Compare CNCF 2024 vs 2023 reports for trend analysis
- Update vault with exact quotes from primary sources
</phase_4_generate>
</example_success>

<example_partial_verification>
<input>
URL: https://youtube.com/watch?v=ml_lecture_2025
</input>

<thinking>
Transcript excerpt with exact quotes:
[04:23] "According to the new Anthropic paper, their latest model achieves 95% on reasoning benchmarks"
[08:45] "OpenAI's internal testing shows GPT-5 beating all previous models"
[12:10] "The Stanford NLP Group published results showing 40% efficiency gains"
</thinking>

<phase_2_verify>
Source verification attempts:

1. Anthropic paper claim:
   - Search: "Anthropic 95% reasoning benchmarks 2025"
   - Result: ❌ No public paper found
   - Note: May be internal/unpublished

2. OpenAI GPT-5 claim:
   - Search: "OpenAI GPT-5 internal testing"
   - Result: ❌ No official documentation
   - Note: Unverifiable proprietary claim

3. Stanford NLP Group:
   - URL: https://nlp.stanford.edu/pubs/efficiency2025.pdf
   - Result: ⚠️ Found related paper but different metrics
   - Paper Quote: "Measured efficiency improvements ranged from 25-35% across test conditions"
   - Discrepancy: Video claims 40%, paper shows 25-35%
</phase_2_verify>

<phase_4_generate>
Note includes:

## ⚠️ Verification Challenges

### Unverifiable Claims

**Anthropic Reasoning Benchmarks [04:23]**
> Video: "According to the new Anthropic paper, their latest model achieves 95% on reasoning benchmarks"

- **Verification:** ❌ Cannot Verify
- **Search Performed:** "Anthropic 95% reasoning benchmarks 2025"
- **Result:** No public paper or announcement found
- **Recommendation:** Treat as unconfirmed until official publication

**OpenAI GPT-5 Testing [08:45]**
> Video: "OpenAI's internal testing shows GPT-5 beating all previous models"

- **Verification:** ❌ Proprietary Information
- **Note:** Internal testing claims cannot be independently verified
- **Alternative:** Check OpenAI blog for official announcements

### Partial Verification

**Stanford Efficiency Gains [12:10]**
> Video: "The Stanford NLP Group published results showing 40% efficiency gains"

- **Found at:** https://nlp.stanford.edu/pubs/efficiency2025.pdf
- **Source Quote:** "Measured efficiency improvements ranged from 25-35% across test conditions"
- **Verification:** ⚠️ Discrepancy
- **Issue:** Video overstates improvement by 5-15 percentage points
</phase_4_generate>
</example_partial_verification>
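
The discrepancy arithmetic in the partial-verification example (a 40% claim against a 25-35% source range) can be reproduced with a short helper. This is an illustrative sketch: `compare_to_range` is a hypothetical name, and it treats a claim at either end of the range as within range.

```python
def compare_to_range(claimed: float, low: float, high: float) -> str:
    """Describe how a single claimed figure relates to a sourced range."""
    if low <= claimed <= high:
        return "within range"
    if claimed > high:
        # Smallest gap is to the top of the range, largest to the bottom.
        return f"overstates by {claimed - high:g}-{claimed - low:g} points"
    return f"understates by {low - claimed:g}-{high - claimed:g} points"
```

Calling `compare_to_range(40, 25, 35)` yields the 5-15 point overstatement reported in the example.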