Created
October 5, 2024 14:40
YouTubeLoader
{ | |
"nbformat": 4, | |
"nbformat_minor": 0, | |
"metadata": { | |
"colab": { | |
"provenance": [], | |
"authorship_tag": "ABX9TyPyZ8epgiFPVSvoMrVXk0fR", | |
"include_colab_link": true | |
}, | |
"kernelspec": { | |
"name": "python3", | |
"display_name": "Python 3" | |
}, | |
"language_info": { | |
"name": "python" | |
} | |
}, | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": { | |
"id": "view-in-github", | |
"colab_type": "text" | |
}, | |
"source": [ | |
"<a href=\"https://colab.research.google.com/gist/kkdai/4d613dcdc86bad995477be4d22a7f907/youtubeloader.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "NimetB15ZEuP", | |
"outputId": "4dc3b61c-040a-46d4-f99b-92858e67c3fb" | |
}, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Collecting langchain\n", | |
" Using cached langchain-0.3.1-py3-none-any.whl.metadata (7.1 kB)\n", | |
"Collecting langchain_core\n", | |
" Using cached langchain_core-0.3.7-py3-none-any.whl.metadata (6.3 kB)\n", | |
"Collecting langchain_google_genai\n", | |
" Using cached langchain_google_genai-2.0.0-py3-none-any.whl.metadata (3.9 kB)\n", | |
"Collecting langchain_community\n", | |
" Downloading langchain_community-0.3.1-py3-none-any.whl.metadata (2.8 kB)\n", | |
"Collecting youtube-transcript-api\n", | |
" Using cached youtube_transcript_api-0.6.2-py3-none-any.whl.metadata (15 kB)\n", | |
"Collecting pytube\n", | |
" Downloading pytube-15.0.0-py3-none-any.whl.metadata (5.0 kB)\n", | |
"Requirement already satisfied: PyYAML>=5.3 in /usr/local/lib/python3.10/dist-packages (from langchain) (6.0.2)\n", | |
"Requirement already satisfied: SQLAlchemy<3,>=1.4 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.0.35)\n", | |
"Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in /usr/local/lib/python3.10/dist-packages (from langchain) (3.10.5)\n", | |
"Requirement already satisfied: async-timeout<5.0.0,>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (4.0.3)\n", | |
"Collecting langchain-text-splitters<0.4.0,>=0.3.0 (from langchain)\n", | |
" Downloading langchain_text_splitters-0.3.0-py3-none-any.whl.metadata (2.3 kB)\n", | |
"Collecting langsmith<0.2.0,>=0.1.17 (from langchain)\n", | |
" Downloading langsmith-0.1.129-py3-none-any.whl.metadata (13 kB)\n", | |
"Requirement already satisfied: numpy<2,>=1 in /usr/local/lib/python3.10/dist-packages (from langchain) (1.26.4)\n", | |
"Requirement already satisfied: pydantic<3.0.0,>=2.7.4 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.9.2)\n", | |
"Requirement already satisfied: requests<3,>=2 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.32.3)\n", | |
"Collecting tenacity!=8.4.0,<9.0.0,>=8.1.0 (from langchain)\n", | |
" Downloading tenacity-8.5.0-py3-none-any.whl.metadata (1.2 kB)\n", | |
"Collecting jsonpatch<2.0,>=1.33 (from langchain_core)\n", | |
" Downloading jsonpatch-1.33-py2.py3-none-any.whl.metadata (3.0 kB)\n", | |
"Requirement already satisfied: packaging<25,>=23.2 in /usr/local/lib/python3.10/dist-packages (from langchain_core) (24.1)\n", | |
"Requirement already satisfied: typing-extensions>=4.7 in /usr/local/lib/python3.10/dist-packages (from langchain_core) (4.12.2)\n", | |
"Requirement already satisfied: google-generativeai<0.8.0,>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from langchain_google_genai) (0.7.2)\n", | |
"Collecting dataclasses-json<0.7,>=0.5.7 (from langchain_community)\n", | |
" Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)\n", | |
"Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain_community)\n", | |
" Downloading pydantic_settings-2.5.2-py3-none-any.whl.metadata (3.5 kB)\n", | |
"Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (2.4.0)\n", | |
"Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.3.1)\n", | |
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (24.2.0)\n", | |
"Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.4.1)\n", | |
"Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (6.1.0)\n", | |
"Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.11.1)\n", | |
"Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain_community)\n", | |
" Downloading marshmallow-3.22.0-py3-none-any.whl.metadata (7.2 kB)\n", | |
"Collecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7,>=0.5.7->langchain_community)\n", | |
" Downloading typing_inspect-0.9.0-py3-none-any.whl.metadata (1.5 kB)\n", | |
"Requirement already satisfied: google-ai-generativelanguage==0.6.6 in /usr/local/lib/python3.10/dist-packages (from google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (0.6.6)\n", | |
"Requirement already satisfied: google-api-core in /usr/local/lib/python3.10/dist-packages (from google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (2.19.2)\n", | |
"Requirement already satisfied: google-api-python-client in /usr/local/lib/python3.10/dist-packages (from google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (2.137.0)\n", | |
"Requirement already satisfied: google-auth>=2.15.0 in /usr/local/lib/python3.10/dist-packages (from google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (2.27.0)\n", | |
"Requirement already satisfied: protobuf in /usr/local/lib/python3.10/dist-packages (from google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (3.20.3)\n", | |
"Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (4.66.5)\n", | |
"Requirement already satisfied: proto-plus<2.0.0dev,>=1.22.3 in /usr/local/lib/python3.10/dist-packages (from google-ai-generativelanguage==0.6.6->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (1.24.0)\n", | |
"Collecting jsonpointer>=1.9 (from jsonpatch<2.0,>=1.33->langchain_core)\n", | |
" Downloading jsonpointer-3.0.0-py2.py3-none-any.whl.metadata (2.3 kB)\n", | |
"Collecting httpx<1,>=0.23.0 (from langsmith<0.2.0,>=0.1.17->langchain)\n", | |
" Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)\n", | |
"Collecting orjson<4.0.0,>=3.9.14 (from langsmith<0.2.0,>=0.1.17->langchain)\n", | |
" Downloading orjson-3.10.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (50 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m50.4/50.4 kB\u001b[0m \u001b[31m3.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hRequirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<3.0.0,>=2.7.4->langchain) (0.7.0)\n", | |
"Requirement already satisfied: pydantic-core==2.23.4 in /usr/local/lib/python3.10/dist-packages (from pydantic<3.0.0,>=2.7.4->langchain) (2.23.4)\n", | |
"Collecting python-dotenv>=0.21.0 (from pydantic-settings<3.0.0,>=2.4.0->langchain_community)\n", | |
" Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)\n", | |
"Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (3.3.2)\n", | |
"Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (3.10)\n", | |
"Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (2.2.3)\n", | |
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (2024.8.30)\n", | |
"Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy<3,>=1.4->langchain) (3.1.1)\n", | |
"Requirement already satisfied: googleapis-common-protos<2.0.dev0,>=1.56.2 in /usr/local/lib/python3.10/dist-packages (from google-api-core->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (1.65.0)\n", | |
"Requirement already satisfied: cachetools<6.0,>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from google-auth>=2.15.0->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (5.5.0)\n", | |
"Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from google-auth>=2.15.0->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (0.4.1)\n", | |
"Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from google-auth>=2.15.0->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (4.9)\n", | |
"Requirement already satisfied: anyio in /usr/local/lib/python3.10/dist-packages (from httpx<1,>=0.23.0->langsmith<0.2.0,>=0.1.17->langchain) (3.7.1)\n", | |
"Collecting httpcore==1.* (from httpx<1,>=0.23.0->langsmith<0.2.0,>=0.1.17->langchain)\n", | |
" Downloading httpcore-1.0.5-py3-none-any.whl.metadata (20 kB)\n", | |
"Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx<1,>=0.23.0->langsmith<0.2.0,>=0.1.17->langchain) (1.3.1)\n", | |
"Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->langsmith<0.2.0,>=0.1.17->langchain)\n", | |
" Downloading h11-0.14.0-py3-none-any.whl.metadata (8.2 kB)\n", | |
"Collecting mypy-extensions>=0.3.0 (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain_community)\n", | |
" Downloading mypy_extensions-1.0.0-py3-none-any.whl.metadata (1.1 kB)\n", | |
"Requirement already satisfied: httplib2<1.dev0,>=0.19.0 in /usr/local/lib/python3.10/dist-packages (from google-api-python-client->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (0.22.0)\n", | |
"Requirement already satisfied: google-auth-httplib2<1.0.0,>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from google-api-python-client->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (0.2.0)\n", | |
"Requirement already satisfied: uritemplate<5,>=3.0.1 in /usr/local/lib/python3.10/dist-packages (from google-api-python-client->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (4.1.1)\n", | |
"Requirement already satisfied: grpcio<2.0dev,>=1.33.2 in /usr/local/lib/python3.10/dist-packages (from google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-ai-generativelanguage==0.6.6->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (1.64.1)\n", | |
"Requirement already satisfied: grpcio-status<2.0.dev0,>=1.33.2 in /usr/local/lib/python3.10/dist-packages (from google-api-core[grpc]!=2.0.*,!=2.1.*,!=2.10.*,!=2.2.*,!=2.3.*,!=2.4.*,!=2.5.*,!=2.6.*,!=2.7.*,!=2.8.*,!=2.9.*,<3.0.0dev,>=1.34.1->google-ai-generativelanguage==0.6.6->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (1.48.2)\n", | |
"Requirement already satisfied: pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2 in /usr/local/lib/python3.10/dist-packages (from httplib2<1.dev0,>=0.19.0->google-api-python-client->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (3.1.4)\n", | |
"Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in /usr/local/lib/python3.10/dist-packages (from pyasn1-modules>=0.2.1->google-auth>=2.15.0->google-generativeai<0.8.0,>=0.7.0->langchain_google_genai) (0.6.1)\n", | |
"Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio->httpx<1,>=0.23.0->langsmith<0.2.0,>=0.1.17->langchain) (1.2.2)\n", | |
"Downloading langchain-0.3.1-py3-none-any.whl (1.0 MB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.0/1.0 MB\u001b[0m \u001b[31m15.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading langchain_core-0.3.7-py3-none-any.whl (400 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m400.9/400.9 kB\u001b[0m \u001b[31m27.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading langchain_google_genai-2.0.0-py3-none-any.whl (39 kB)\n", | |
"Downloading langchain_community-0.3.1-py3-none-any.whl (2.4 MB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m2.4/2.4 MB\u001b[0m \u001b[31m54.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading youtube_transcript_api-0.6.2-py3-none-any.whl (24 kB)\n", | |
"Downloading pytube-15.0.0-py3-none-any.whl (57 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m57.6/57.6 kB\u001b[0m \u001b[31m4.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading dataclasses_json-0.6.7-py3-none-any.whl (28 kB)\n", | |
"Downloading jsonpatch-1.33-py2.py3-none-any.whl (12 kB)\n", | |
"Downloading langchain_text_splitters-0.3.0-py3-none-any.whl (25 kB)\n", | |
"Downloading langsmith-0.1.129-py3-none-any.whl (292 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m292.2/292.2 kB\u001b[0m \u001b[31m23.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading pydantic_settings-2.5.2-py3-none-any.whl (26 kB)\n", | |
"Downloading tenacity-8.5.0-py3-none-any.whl (28 kB)\n", | |
"Downloading httpx-0.27.2-py3-none-any.whl (76 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m76.4/76.4 kB\u001b[0m \u001b[31m7.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading httpcore-1.0.5-py3-none-any.whl (77 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m77.9/77.9 kB\u001b[0m \u001b[31m7.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading jsonpointer-3.0.0-py2.py3-none-any.whl (7.6 kB)\n", | |
"Downloading marshmallow-3.22.0-py3-none-any.whl (49 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m49.3/49.3 kB\u001b[0m \u001b[31m5.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading orjson-3.10.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (141 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m141.9/141.9 kB\u001b[0m \u001b[31m12.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hDownloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)\n", | |
"Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)\n", | |
"Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)\n", | |
"Downloading h11-0.14.0-py3-none-any.whl (58 kB)\n", | |
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m6.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", | |
"\u001b[?25hInstalling collected packages: tenacity, pytube, python-dotenv, orjson, mypy-extensions, marshmallow, jsonpointer, h11, youtube-transcript-api, typing-inspect, jsonpatch, httpcore, pydantic-settings, httpx, dataclasses-json, langsmith, langchain_core, langchain-text-splitters, langchain_google_genai, langchain, langchain_community\n", | |
" Attempting uninstall: tenacity\n", | |
" Found existing installation: tenacity 9.0.0\n", | |
" Uninstalling tenacity-9.0.0:\n", | |
" Successfully uninstalled tenacity-9.0.0\n", | |
"Successfully installed dataclasses-json-0.6.7 h11-0.14.0 httpcore-1.0.5 httpx-0.27.2 jsonpatch-1.33 jsonpointer-3.0.0 langchain-0.3.1 langchain-text-splitters-0.3.0 langchain_community-0.3.1 langchain_core-0.3.7 langchain_google_genai-2.0.0 langsmith-0.1.129 marshmallow-3.22.0 mypy-extensions-1.0.0 orjson-3.10.7 pydantic-settings-2.5.2 python-dotenv-1.0.1 pytube-15.0.0 tenacity-8.5.0 typing-inspect-0.9.0 youtube-transcript-api-0.6.2\n" | |
] | |
} | |
], | |
"source": [ | |
"!pip install --upgrade langchain langchain_core langchain_google_genai langchain_community youtube-transcript-api pytube" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"from langchain_community.document_loaders import YoutubeLoader\n", | |
"from langchain_core.documents import Document" | |
], | |
"metadata": { | |
"id": "xMQKKxqZZ2lc" | |
}, | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"loader = YoutubeLoader.from_youtube_url(\n", | |
"    \"https://www.youtube.com/watch?v=ViA4-YWx8Y4\", add_video_info=True, language=[\"zh-Hant\", \"zh-Hans\", \"ja\", \"en\"])" | |
], | |
"metadata": { | |
"id": "IK1pexKsZ4ui" | |
}, | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"docs = loader.load()" | |
], | |
"metadata": { | |
"id": "iN0815LMZ669" | |
}, | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"print(\"Length:\", len(docs))" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "pKy_wpBLaCpv", | |
"outputId": "80296bf7-5f64-4efb-cf67-6130c286ffb2" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Length: 1\n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"def docs_to_str(docs: list[Document]) -> str:\n", | |
"    return \"\\n\".join([doc.page_content for doc in docs])" | |
], | |
"metadata": { | |
"id": "mn-xD09oaZdU" | |
}, | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"text_content = docs_to_str(docs)\n", | |
"print(\"Words: \", len(text_content.split()), \"First 1000 chars: \", text_content[:1000])" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/" | |
}, | |
"id": "A2kDmGGhagvI", | |
"outputId": "cd62b450-79cc-4b1e-e45e-6e2fea195e0d" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Words: 2052 First 1000 chars: It's been 6 months since I purchased a \n", | |
"pair of the Meta Ray-Ban smart glasses. I, like a lot of people looking at \n", | |
"getting these, had some questions like: What is the point of these if you're not an \n", | |
"Instagram influencer or you just don't care about posting to social media? And would I \n", | |
"use them enough to justify the cost? Well, 6 months later, I now know \n", | |
"the answer to both questions. So the first highlight of these glasses is \n", | |
"their design. Meta and Ray-Ban have done something pretty remarkable here. They've created \n", | |
"a wearable that looks like something people are already wearing. I have walked by and talked \n", | |
"with countless people over the past 6 months while wearing these glasses, and like one person \n", | |
"realized what they actually were. The only real giveaway is the camera, and the sides are \n", | |
"slightly thicker. Meta Ray-Bans come in a few different styles: Headliner, Wayfarer (which \n", | |
"is what I have), and a couple others. And yes, you can get them with prescription \n" | |
] | |
} | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"def summarized_from_youtube(youtube_url: str) -> str:\n", | |
"    \"\"\"\n", | |
"    Load a YouTube video's transcript with YoutubeLoader and return it as plain text.\n", | |
"    Summarization is not performed here; an empty string is returned on failure.\n", | |
"    \"\"\"\n", | |
"    try:\n", | |
"        print(\"Youtube URL: \", youtube_url)\n", | |
"        # Load the video transcript using YoutubeLoader\n", | |
"        loader = YoutubeLoader.from_youtube_url(\n", | |
"            youtube_url, add_video_info=True, language=[\"zh-Hant\", \"zh-Hans\", \"ja\", \"en\"])\n", | |
"        docs = loader.load()\n", | |
"\n", | |
"        print(\"Pages of Docs: \", len(docs))\n", | |
"        # Extract the text content from the loaded documents\n", | |
"        text_content = docs_to_str(docs)\n", | |
"        print(\"Words: \", len(text_content.split()),\n", | |
"              \"First 1000 chars: \", text_content[:1000])\n", | |
"\n", | |
"        # Return the raw transcript text (no summarization step yet)\n", | |
"        return text_content\n", | |
"    except Exception as e:\n", | |
"        # Report the error and fall back to an empty string\n", | |
"        print(f\"An error occurred: {e}\")\n", | |
"        return \"\"" | |
], | |
"metadata": { | |
"id": "74FzGfRvc5ll" | |
}, | |
"execution_count": null, | |
"outputs": [] | |
}, | |
{ | |
"cell_type": "code", | |
"source": [ | |
"summarized_from_youtube(\"https://www.youtube.com/watch?v=ViA4-YWx8Y4\")" | |
], | |
"metadata": { | |
"colab": { | |
"base_uri": "https://localhost:8080/", | |
"height": 400 | |
}, | |
"id": "E4MiqG7Ec_ns", | |
"outputId": "4af67199-a2c5-495f-b3a8-b3f72a06628b" | |
}, | |
"execution_count": null, | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"name": "stdout", | |
"text": [ | |
"Youtube URL: https://www.youtube.com/watch?v=ViA4-YWx8Y4\n", | |
"Pages of Docs: 1\n", | |
"Words: 2052 First 1000 chars: It's been 6 months since I purchased a \n", | |
"pair of the Meta Ray-Ban smart glasses. I, like a lot of people looking at \n", | |
"getting these, had some questions like: What is the point of these if you're not an \n", | |
"Instagram influencer or you just don't care about posting to social media? And would I \n", | |
"use them enough to justify the cost? Well, 6 months later, I now know \n", | |
"the answer to both questions. So the first highlight of these glasses is \n", | |
"their design. Meta and Ray-Ban have done something pretty remarkable here. They've created \n", | |
"a wearable that looks like something people are already wearing. I have walked by and talked \n", | |
"with countless people over the past 6 months while wearing these glasses, and like one person \n", | |
"realized what they actually were. The only real giveaway is the camera, and the sides are \n", | |
"slightly thicker. Meta Ray-Bans come in a few different styles: Headliner, Wayfarer (which \n", | |
"is what I have), and a couple others. And yes, you can get them with prescription \n" | |
] | |
}, | |
{ | |
"output_type": "execute_result", | |
"data": { | |
"text/plain": [ | |
"'It\\'s been 6 months since I purchased a\\xa0\\npair of the Meta Ray-Ban smart glasses. I,\\xa0\\xa0 like a lot of people looking at\\xa0\\ngetting these, had some questions like:\\xa0\\xa0 What is the point of these if you\\'re not an\\xa0\\nInstagram influencer or you just don\\'t care\\xa0\\xa0 about posting to social media? And would I\\xa0\\nuse them enough to justify the cost? Well,\\xa0\\xa0 6 months later, I now know\\xa0\\nthe answer to both questions. So the first highlight of these glasses is\\xa0\\ntheir design. Meta and Ray-Ban have done\\xa0\\xa0 something pretty remarkable here. They\\'ve created\\xa0\\na wearable that looks like something people are\\xa0\\xa0 already wearing. I have walked by and talked\\xa0\\nwith countless people over the past 6 months\\xa0\\xa0 while wearing these glasses, and like one person\\xa0\\nrealized what they actually were. The only real\\xa0\\xa0 giveaway is the camera, and the sides are\\xa0\\nslightly thicker. Meta Ray-Bans come in a\\xa0\\xa0 few different styles: Headliner, Wayfarer (which\\xa0\\nis what I have), and a couple others. And yes,\\xa0\\xa0 you can get them with prescription lenses,\\xa0\\nand you can even get transition lenses,\\xa0\\xa0 either prescription or non-prescription,\\xa0\\nthough they\\'ll cost you an extra $50 to $80. So how exactly do these work and\\xa0\\nwhat makes them smart glasses? Well,\\xa0\\xa0 they have an integrated camera, they have\\xa0\\nopen-ear speakers on each side of the glasses,\\xa0\\xa0 and they have the Meta AI built into them.\\xa0\\nIn terms of controls, you\\'ve got a capture\\xa0\\xa0 button on the top right for photos and videos,\\xa0\\na touchpad on the right side for volume control,\\xa0\\xa0 skipping tracks, and triggering the assistant,\\xa0\\nand you can even use them to trigger Spotify,\\xa0\\xa0 Apple Music, or Amazon Music\\xa0\\nwith a single tap. Also note,\\xa0\\xa0 Apple Music is only available on the\\xa0\\niPhone version of The Meta View app. 
The glasses come with an IPX4 water resistance\\xa0\\nrating, so you don\\'t have to worry if you get\\xa0\\xa0 caught in light rain, and I was able to\\xa0\\nwear them out kayaking in the Sound with\\xa0\\xa0 no problem. You just wipe them off after\\xa0\\nthey get a bit wet, though note they\\'re\\xa0\\xa0 not designed to stay in the rain or sustain\\xa0\\nsplashes for hours or anything like that. Now, one of the most crucial aspects of the\\xa0\\nglasses is the speakers. The sound quality is\\xa0\\xa0 surprisingly decent. I found over the past 6\\xa0\\nmonths that I\\'ve reached for my earbuds a bit\\xa0\\xa0 less while wearing these, especially when all I\\'m\\xa0\\nlooking for is just some background music. One of\\xa0\\xa0 my favorite use cases is transitioning music\\xa0\\nplaying from my car to the glasses with just a\\xa0\\xa0 single tap. The glasses also have a five-mic\\xa0\\nsystem, which is excellent for phone calls. All right, and this is how the mic\\xa0\\nsounds for the Meta Ray-Bans. Let me\\xa0\\xa0 know what you think in the comments.\\xa0\\nHow do you think these mics sound? So the AI thing, what exactly can\\xa0\\nMeta AI and this assistant do? Well,\\xa0\\xa0 I\\'d break it down into two categories. One\\xa0\\nis more task-oriented. This is going to be\\xa0\\xa0 more familiar to you if you\\'re familiar\\xa0\\nwith like Apple and Google\\'s assistants.\\xa0\\xa0 So that\\'s things like hands-free phone calls, you\\xa0\\ncan ask these to take a photo for you, send a text\\xa0\\xa0 message to somebody in your contacts, play music,\\xa0\\nand more. You can also use it to send messages to\\xa0\\xa0 your contacts in both WhatsApp and Messenger,\\xa0\\nor add videos directly to your Instagram Story. The second thing the AI assistant is for\\xa0\\nis having general conversations with,\\xa0\\xa0 and you can pretty much ask it anything like\\xa0\\ngeneral search queries. 
So, for example,\\xa0\\xa0 here: \"What zones do pawpaw trees grow in?\" Pawpaw trees can grow in Raleigh, North Carolina,\\xa0\\xa0 as it falls within USDA zone 8,\\xa0\\nwhich is suitable for their growth. There you go. It\\'ll even do proactive things for\\xa0\\nyou with the media quality check feature. It\\'ll\\xa0\\xa0 tell you when someone in a photo has their face\\xa0\\ncovered by hair or if your camera lens is dirty. And speaking of the camera, that\\'s where things on\\xa0\\nthese glasses get really interesting. The glasses\\xa0\\xa0 pack a 12-megapixel camera, and while that might\\xa0\\nnot sound impressive in the age of 50-megapixel\\xa0\\xa0 smartphone camera sensors, the quality is\\xa0\\nsurprisingly good. The colors are pretty accurate,\\xa0\\xa0 and the HDR processing doesn\\'t go overboard\\xa0\\nlike it does on many modern smartphone cameras. What\\'s really cool about having a camera in\\xa0\\nyour glasses is the unique vantage point,\\xa0\\xa0 like when I was kayaking in The Sound. The Meta\\xa0\\nRay-Bans were perfect for capturing moments when\\xa0\\xa0 I was on the water, or even when we kayaked to an\\xa0\\nisland in the middle of the sound and my brother\\xa0\\xa0 found this little shrimp in his boat. In the\\xa0\\namount of time I would have spent fumbling for\\xa0\\xa0 my phone, I would have missed capturing that\\xa0\\nmoment that I was able to with the glasses. Now, will these replace your smartphone\\xa0\\ncamera or fancy dedicated camera? No,\\xa0\\xa0 not for serious photography at least, but for\\xa0\\ncapturing spontaneous moments or when you\\'re\\xa0\\xa0 in situations where using other cameras isn\\'t\\xa0\\npractical, that\\'s where the Meta Ray-Bans shine. There are some constraints though to keep\\xa0\\nin mind with this camera system. Videos\\xa0\\xa0 are limited to 3 minutes, though I\\xa0\\nusually keep mine at 60 seconds to\\xa0\\xa0 conserve battery. 
Another limitation of the\\xa0\\ncamera system is the aspect ratio for both\\xa0\\xa0 photos and videos. It\\'s fixed in a portrait\\xa0\\norientation, which might not always be ideal. Now, what about battery life and how do you\\xa0\\ncharge the glasses? The glasses come with\\xa0\\xa0 this nice leather Ray-Ban case that doubles as\\xa0\\na charger. There\\'s a USB-C port on the bottom,\\xa0\\xa0 and the glasses themselves charge via pin\\xa0\\nconnectors in the nose bridge. Meta claims\\xa0\\xa0 4 hours of battery life on a single charge and\\xa0\\nup to 36 hours with the case. In my experience,\\xa0\\xa0 that\\'s been pretty accurate. I took these on a\\xa0\\nday trip to Cape Lookout, and they lasted about\\xa0\\xa0 between 4 and 5 hours, and I was taking a lot of\\xa0\\nphotos and videos. They\\'re also pretty quick to\\xa0\\xa0 recharge. You\\'ll get about 30 to 40% capacity\\xa0\\nwith just a 15-minute charge in the case. So what features have I found myself not really\\xa0\\nusing with these glasses? While I use the Meta\\xa0\\xa0 assistant for basic tasks like controlling\\xa0\\nmedia, phone calls, etc., I haven\\'t found\\xa0\\xa0 myself using its deeper capabilities like\\xa0\\nhaving conversations with it or asking it\\xa0\\xa0 to identify what you\\'re looking at. I have tried\\xa0\\nthat feature a lot, especially when it first came\\xa0\\xa0 out. I tried to use it to identify what kind of\\xa0\\ntree am I looking at, and it would just tell me,\\xa0\\xa0 \"You\\'re looking at a tree with green leaves,\" and\\xa0\\nI\\'m like, \"Okay, thanks, that was not helpful.\" I also haven\\'t used the direct sharing features\\xa0\\nmuch. It\\'s there if you want it, but it hasn\\'t\\xa0\\xa0 been a go-to for me since outside of my job, I\\xa0\\ndon\\'t post or check social media all that often. So what are the downsides I\\'ve encountered while\\xa0\\nusing the Meta Ray-Bans? The main one involves the\\xa0\\xa0 Meta AI assistant. 
It just has some limitations.\\xa0\\nIt couldn\\'t identify songs currently playing off\\xa0\\xa0 my iPhone, and it won\\'t let you directly reply\\xa0\\nto a text that it\\'s reading aloud. And when you\\xa0\\xa0 ask it to play a song on Apple Music, it\\'ll play\\xa0\\nthat song but nothing after it. The integration\\xa0\\xa0 with your phone\\'s operating system isn\\'t as\\xa0\\nseamless as you want it to be, and it really\\xa0\\xa0 makes me curious how much better the experience\\xa0\\nwould be if these glasses were made by Apple or\\xa0\\xa0 Google. And it also makes you aware of how locked\\xa0\\ndown these phone operating systems still are. Also, because Meta AI is a large language learning\\xa0\\nmodel like ChatGPT and all of these other chatbots\\xa0\\xa0 out there, it can just confidently lie\\xa0\\nto you when it gives you an answer,\\xa0\\xa0 which makes the being able to ask\\xa0\\nit just for information that feature\\xa0\\xa0 a bit useless because I don\\'t trust it\\xa0\\nto actually give me the correct answer. Battery life is another downside. I\\xa0\\nthink closer to 8 hours would be ideal,\\xa0\\xa0 especially for long days where you\\'re\\xa0\\noutside and you don\\'t necessarily want\\xa0\\xa0 to have to put the glasses back in\\xa0\\ntheir case to charge for 15 minutes. Another downside is when taking photos, there\\'s\\xa0\\na slight 1 to 2 second delay when you press the\\xa0\\xa0 button and when the photo is actually taken,\\xa0\\nwhich can be frustrating because people will\\xa0\\xa0 often think you already took the photo and start\\xa0\\nmoving before the photo has actually been snapped. 
The last downside with these glasses is while\\xa0\\nthey are comfortable, if you wear them for\\xa0\\xa0 several hours, I\\'ve at least noticed that my\\xa0\\nears will get a little bit sore as you kind\\xa0\\xa0 of just feel the weight of them a little bit.\\xa0\\nAnd while it\\'s not sore enough to make me not\\xa0\\xa0 want to wear these as sunglasses, not at all, but\\xa0\\nI do still hope that Meta and Ray-Ban are able to\\xa0\\xa0 make these a little bit lighter and the sides\\xa0\\na little bit slimmer in the next generation. So would I recommend getting a pair of the Meta\\xa0\\nRay-Ban smart glasses? I was skeptical at first,\\xa0\\xa0 but these glasses have genuinely surprised\\xa0\\nme. They\\'re great for trips, adventures,\\xa0\\xa0 or just capturing everyday moments. The\\xa0\\ncamera quality is decent considering it\\'s\\xa0\\xa0 built into a pair of sunglasses, and the overall\\xa0\\nexperience of using them is just fun. So yes,\\xa0\\xa0 I would recommend them, but for whom, you may ask? Well, if you\\'re the type of person who\\xa0\\njust loses sunglasses really easily,\\xa0\\xa0 these probably are not for you. Unlike items\\xa0\\nlike AirPods, there\\'s no Find My or tracking\\xa0\\xa0 these if they get lost, and since there\\'s no\\xa0\\nfingerprint sensor or other authentication\\xa0\\xa0 method for the glasses, they\\'d be pretty\\xa0\\neasy for thieves to steal and reuse.\\xa0\\xa0 That\\'s something Meta should\\xa0\\ndefinitely address in future versions. How I look at these is if you\\'re already in the\\xa0\\nmarket for a pair of $200 designer sunglasses,\\xa0\\xa0 well, for just $100 more, you get\\xa0\\nan integrated camera, speakers,\\xa0\\xa0 and an assistant, and to me,\\xa0\\nthat\\'s a pretty easy upsell. Now another question a lot of people ask when\\xa0\\nlooking at these is should you get the version\\xa0\\xa0 with transition lenses so you can just keep\\xa0\\nwearing them as you come inside? 
That depends\\xa0\\xa0 on two things: One, you prefer noise cancellation\\xa0\\nwhen you\\'re inside working and/or taking calls,\\xa0\\xa0 and two, how much do you think you\\'ll really\\xa0\\nbenefit from having the integrated Meta\\xa0\\xa0 AI assistant? If you\\'re in a noisy\\xa0\\nenvironment or just an environment\\xa0\\xa0 you want that noise cancellation and you\\xa0\\ndon\\'t care about the AI features as much,\\xa0\\xa0 then I\\'d say just go for the sunglasses\\xa0\\nversion of these and save yourself some money. All in all, I\\'ve really come to enjoy the Meta\\xa0\\nRay-Ban smart glasses. They\\'re one of the best\\xa0\\xa0 wearables you can get that offer real value and\\xa0\\nutility today. I only wish they could be more\\xa0\\xa0 integrated with my phone and have slightly better\\xa0\\nbattery life, but it makes me really hopeful for\\xa0\\xa0 the future of this category. Where these are today\\xa0\\nis already pretty impressive, and I can\\'t imagine\\xa0\\xa0 all the things you\\'d be able to do with these\\xa0\\nif they had embedded displays. For now though,\\xa0\\xa0 I\\'m content just listening to music on these,\\xa0\\ntaking photos and videos, and sending someone an\\xa0\\xa0 occasional text or taking a call from my glasses,\\xa0\\nwhich is a pretty cool thing you can do now. I\\'ve left links to learn more about these\\xa0\\nglasses and check their current price in\\xa0\\xa0 the blog post for this video at 6monthslater.net.\\xa0\\nLink in the description. If you\\'re interested in\\xa0\\xa0 more futuristic tech, check out my review of the\\xa0\\nApple Vision Pro and The Meta Quest 3. You can get\\xa0\\xa0 to those by clicking here, or check out my videos\\xa0\\non other wearables like the Apple Watch and Pixel\\xa0\\xa0 Watch by clicking here. If you like this video and\\xa0\\nfound it helpful, make sure you give it a thumbs\\xa0\\xa0 up below and remember to subscribe for more. For 6\\xa0\\nMonths Later, I\\'m Josh Teder. 
Thanks for watching.'" | |
], | |
"application/vnd.google.colaboratory.intrinsic+json": { | |
"type": "string" | |
} | |
}, | |
"metadata": {}, | |
"execution_count": 22 | |
} | |
] | |
} | |
] | |
} |