Created
May 28, 2024 05:05
LLM Execution server
--model
/workspace/codesandbox-template-blank/llamafile/AutoCoder_S_6.gguf
--server
--host
0.0.0.0
-ngl
100
...
Test gguf file
- Download llamafile
- Download/Create gguf file
./llamafile -m ./AutoCoder_S_6.gguf --server --port 8889 --temp 0.3
Create standalone LLM server
- Create .args file
- Compile to server
cp llamafile AutoCoder.llamafile
./zipalign -j0 AutoCoder.llamafile AutoCoder_S_6.gguf .args
./AutoCoder.llamafile
./AutoCoder.llamafile --port 8890
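The "Create .args file" step can be done from the shell before running zipalign. This is a minimal sketch that writes the same arguments used in this gist, one per line; the ARGS_DIR variable is a hypothetical convenience and not part of llamafile.

```shell
# Write the .args file, one argument per line.
# ARGS_DIR is a hypothetical override; by default the file goes in the current directory.
args_dir="${ARGS_DIR:-.}"

# The final literal "..." line marks where any extra command-line
# arguments (for example --port 8890) are inserted at launch time.
printf '%s\n' \
  '--model' \
  '/workspace/codesandbox-template-blank/llamafile/AutoCoder_S_6.gguf' \
  '--server' \
  '--host' \
  '0.0.0.0' \
  '-ngl' \
  '100' \
  '...' > "$args_dir/.args"
```

Run it in the directory that holds AutoCoder.llamafile so that zipalign picks up .args from there.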
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" -d '{
  "model": "/workspace/codesandbox-template-blank/llamafile/AutoCoder_S_6.gguf",
  "stream": true,
  "messages": [
    {
      "role": "system",
      "content": "You are a poetic assistant."
    },
    {
      "role": "user",
      "content": "Compose a poem that explains FORTRAN."
    }
  ]
}'
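Because the request sets "stream": true, the reply arrives in pieces rather than as one JSON document. The sketch below assumes the server follows the OpenAI-style streaming convention: Server-Sent Events lines prefixed with "data: " and a final "data: [DONE]" line. This gist does not confirm that format, and sse_payloads is a hypothetical helper name.

```shell
# Read a Server-Sent Events stream on stdin and print only the JSON payloads.
# Assumes each event is a single line of the form "data: <json>" and that the
# stream ends with "data: [DONE]" (OpenAI-style convention, an assumption here).
sse_payloads() {
  # Drop carriage returns, stop at the [DONE] marker, strip the "data: " prefix.
  tr -d '\r' | sed -n -e '/^data: \[DONE\]$/q' -e 's/^data: //p'
}

# Usage (assumes the server is listening on port 8080 and request.json
# holds the JSON body from the curl command above):
#   curl -sN http://localhost:8080/v1/chat/completions \
#     -H "Content-Type: application/json" -d @request.json | sse_payloads
```

Each printed line is then one JSON chunk that can be passed to a JSON tool of your choice.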