marp | theme | _class | paginate | backgroundColor | backgroundImage |
---|---|---|---|---|---|
true | gaia | lead | true | | |
The AI Conference from NVIDIA
- Why NVIDIA GTC (GPU Technology Conference)?
- What's new from NVIDIA?
- NVIDIA Inference Microservices
- GTC sessions that are interesting
- Tritonserver
- Architecting for the New Language Model Stack [S62702]
- Summaries of other talks (OpenAI, Together AI, AI careers)
For AI, what are the options? Google TPU, Groq LPU, AMD ROCm™
- Ollama supports AMD
- AMD's firmware is not open, and tinygrad is struggling with it
- Groq does not offer a way to run fine-tuned LLMs, so it is not model-agnostic yet
- NVIDIA GPUs hold a dominant position
- NVIDIA is getting serious about cloud services (https://build.nvidia.com/explore/discover)
- $4500 per GPU for 1 year, or $1 per hour of use
- Docker & Kubernetes: for maximum performance, you can't ignore NVIDIA at the firmware level
- Benchmarking LLMs with Performance Analyzer
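The two prices quoted above imply a break-even point between the annual and hourly options. The sketch below works through that arithmetic, taking the slide's figures at face value and assuming both prices refer to the same GPU; the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours in a non-leap year


def break_even_hours(annual_price: float, hourly_price: float) -> float:
    """Hours of use per year at which the annual plan costs the same as hourly billing."""
    return annual_price / hourly_price


def cheaper_option(hours_used: float, annual_price: float = 4500.0,
                   hourly_price: float = 1.0) -> str:
    """Return which plan is cheaper for a given number of hours used in a year."""
    return "annual" if hours_used * hourly_price > annual_price else "hourly"


hours = break_even_hours(4500.0, 1.0)
utilization = hours / HOURS_PER_YEAR
print(f"break-even: {hours:.0f} h/year ({utilization:.0%} utilization)")
# prints "break-even: 4500 h/year (51% utilization)"
```

So under these assumptions, the annual price only pays off if the GPU is busy for more than about half the year.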
import os
from dotenv import load_dotenv
from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings

# Load NVIDIA_API_KEY from a local .env file into the environment
load_dotenv()
if not os.getenv("NVIDIA_API_KEY"):
    raise RuntimeError("NVIDIA_API_KEY is not set")

# Embedding: separate embedders for passages and for queries
document_embedder = NVIDIAEmbeddings(model="nvolveqa_40k", model_type="passage")
query_embedder = NVIDIAEmbeddings(model="nvolveqa_40k", model_type="query")
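# LLM
import streamlit as st
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages(
    [("system", "You are a helpful AI assistant"), ("user", "{input}")]
)
# Streamlit chat box; the string is placeholder text, not a default value
user_input = st.chat_input("Can you tell me what NVIDIA is known for?")
llm = ChatNVIDIA(model="mixtral_8x7b")
chain = prompt_template | llm | StrOutputParser()
# chat_input returns None until the user submits a message
if user_input:
    st.write(chain.invoke({"input": user_input}))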
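
The passage and query embedders above are usually paired for retrieval: passages are embedded once, the query is embedded at request time, and passages are ranked by similarity to the query. Here is a minimal sketch of that ranking step in plain Python. The three-dimensional vectors are made-up stand-ins for real embeddings (in LangChain, `embed_documents` and `embed_query` return lists of floats), and `rank_passages` is a hypothetical helper, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs, passages, top_k=2):
    """Return the top_k (score, passage) pairs, highest similarity first."""
    scored = [(cosine_similarity(query_vec, vec), text)
              for vec, text in zip(passage_vecs, passages)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Stand-in data: real vectors would come from document_embedder / query_embedder.
passages = ["GTC keynote recap", "Triton model repository layout", "GPU pricing notes"]
passage_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.2, 0.9]]
query_vec = [0.2, 0.8, 0.0]

for score, text in rank_passages(query_vec, passage_vecs, passages):
    print(f"{score:.3f}  {text}")
```

The top-ranked passages would then be placed into the prompt as context before calling the chain.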
Check triton-inference-server tutorials
FROM nvcr.io/nvidia/tritonserver:23.10-py3
RUN pip install transformers==4.34.0 protobuf==3.20.3 sentencepiece==0.1.99 accelerate==0.23.0 einops==0.6.1
mkdir -p model_repository
cp -r hermes_2_pro/ model_repository/
docker build -t triton_transformer_server .
docker run --gpus all -it --rm --net=host \
--shm-size=1G --ulimit memlock=-1 \
--ulimit stack=67108864 \
-v ${PWD}/model_repository:/opt/tritonserver/model_repository \
triton_transformer_server tritonserver --model-repository=model_repository
# Note that the model name (hermes_2_pro) appears in the request path
curl -X POST localhost:8000/v2/models/hermes_2_pro/infer \
-d '{"inputs": [{"name":"text_input","datatype":"BYTES","shape":[1],"data":["I am going"]}]}'
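
The same request can be sent from Python. The sketch below mirrors the curl call using only the standard library; the host, port, model name, and input name are taken from the command above, while the helper names (`build_infer_request`, `infer_url`, `infer`) and the timeout are assumptions for illustration. The shape of the response depends on how the model is configured, so it is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request


def build_infer_request(prompt: str) -> dict:
    """Build the JSON body used by the curl example: one BYTES input named text_input."""
    return {
        "inputs": [
            {"name": "text_input", "datatype": "BYTES", "shape": [1], "data": [prompt]}
        ]
    }


def infer_url(model_name: str, host: str = "localhost:8000") -> str:
    """The model name is part of the path, as in the curl example."""
    return f"http://{host}/v2/models/{model_name}/infer"


def infer(model_name: str, prompt: str, timeout: float = 30.0) -> dict:
    """POST the request to a running Triton server and return the parsed JSON response."""
    body = json.dumps(build_infer_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        infer_url(model_name),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())


# Show what would be sent; call infer("hermes_2_pro", "I am going") once the server is up.
print(infer_url("hermes_2_pro"))
print(json.dumps(build_infer_request("I am going")))
```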
Foundation Model for Robots: GR00T
Speakers: Jensen Huang (NVIDIA), Ashish Vaswani (Essential AI), Noam Shazeer (Character AI), Aidan Gomez (Cohere), etc.
- LLMs allow software to understand textual prompts and generate images from them, marking the beginning of a new industrial revolution.
- Future: adaptive computation and universal transformers. With reasoning capability, models won't need as much data, so the quality of the data matters more.
- Evals: measuring progress matters, and what counts is observing whether the task actually gets finished.
Speaker: Brad Lightcap, Chief Operating Officer, OpenAI
- Start small, then tackle bigger issues
- Use various sizes of language models for different tasks
- Monitor and swap models and agents as needed
- Aim for reasoning agents for complex actions
- Example: AI for patient care - from data to treatment
- Adapt interfaces for changing user interactions
Speaker: Percy Liang, Co-Founder, Together AI
- We can think of foundation models as infrastructure.
- The Center for Research on Foundation Models (CRFM)
- Site: https://crfm.stanford.edu/blog.html
- HELM (Holistic Evaluation of Language Models): https://crfm.stanford.edu/helm/lite/latest/
- Anything new? KTO (Kahneman-Tversky Optimization) training
Navigating AI Careers in Europe [SE62721]
- AI is democratizing tech. You don't need to know the details of LLMs.
- Everything will have some AI in it. Dev jobs are not secure.
- All employees should be upskilled for AI, whether they are technical or not, and kept current with the latest skills. If their skills are outdated, the company will soon be outdated too, and its performance will suffer compared to other companies.
- Regarding coding: we're not at AGI yet, but the way coding is done is changing. The role will not stay the same; it becomes more like orchestration. Validating business requirements may still matter.