@smothiki
Created December 12, 2024 22:06
- name: Llama 3.1 Instruct
  displayName: Llama 3.1 Instruct
  modelHubID: llama-3.1-instruct
  category: Text Generation
  type: HF
  description: The Llama 3.1 NIM simplifies the deployment of the Llama 3.1 70B-Instruct, 8B-Instruct, and 8B base models, which are optimized for language understanding, reasoning, and text generation use cases and outperform many of the available open-source chat models on common industry benchmarks.
  requireLicense: True
  licenseAgreements:
    - label: Use Policy
      url: https://llama.meta.com/llama3/use-policy/
    - label: License Agreement
      url: https://llama.meta.com/llama3/license/
  modelVariants:
    - variantId: Llama 3.1 8B Instruct
      source:
        URL: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/llama-3_1-70b-instruct-nemo
      optimizationProfiles:
        - profileId: meta-llama/Llama-3.1-8B-instruct
          framework: TensorRT-LLM
          displayName: Llama 3.1 70B Instruct A100 BF16 ThroughputLoRA
          latestVersionSizeInBytes: 149093153785
          modelFormat: trt-llm
          spec:
            - key: PROFILE
              value: Throughput
            - key: PRECISION
              value: BF16
            - key: GPU
              value: A100
            - key: COUNT
              value: 4
            - key: GPU DEVICE
              value: 20b2:10de
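
The spec entries above are a list of key/value pairs, which is awkward to query directly. The following is a minimal Python sketch, not part of the original manifest, that flattens that list into a dict and derives a rough per-GPU size from latestVersionSizeInBytes and COUNT. The helper names spec_to_dict and size_per_gpu_gib are hypothetical, and the even split across GPUs is an assumption; the manifest does not say how the artifact is sharded.

```python
# Values copied from the optimization profile above.
spec = [
    {"key": "PROFILE", "value": "Throughput"},
    {"key": "PRECISION", "value": "BF16"},
    {"key": "GPU", "value": "A100"},
    {"key": "COUNT", "value": 4},
    {"key": "GPU DEVICE", "value": "20b2:10de"},
]
latest_version_size_in_bytes = 149093153785


def spec_to_dict(entries):
    """Flatten a list of {key, value} entries into a plain dict (hypothetical helper)."""
    return {entry["key"]: entry["value"] for entry in entries}


def size_per_gpu_gib(total_bytes, gpu_count):
    """Divide the artifact size evenly across GPUs, in GiB.

    Assumption: an even split. This is only a rough estimate of the
    on-disk size per GPU, not a measurement of runtime memory use.
    """
    return total_bytes / gpu_count / 2**30


profile = spec_to_dict(spec)
per_gpu = size_per_gpu_gib(latest_version_size_in_bytes, profile["COUNT"])
print(profile["GPU"], profile["PRECISION"], f"{per_gpu:.1f} GiB per GPU")
```

Under these assumptions the estimate comes out to a little under 35 GiB per GPU, which only describes the stored artifact; KV cache and activation memory at serving time are not covered by the manifest.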