Purpose:
Returns a list of available models. The server first checks an in-memory cache (refreshed every 20 minutes) and returns the cached models if still valid; otherwise, it refreshes the cache by executing the ilab model list command.
Request:
- Method: GET
- URL:
/models
Response:
- Status 200:
[
  {
    "name": "model-name-1",
    "last_modified": "2025-02-18T12:34:56Z",
    "size": "123 MB"
  },
  {
    "name": "model-name-2",
    "last_modified": "2025-02-17T09:21:30Z",
    "size": "456 MB"
  }
  // ... more models
]
Error Responses:
- 500 Internal Server Error: If encoding or cache refresh fails.
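This and the other GET list endpoints (/data, /jobs, /checkpoints) all return plain JSON, so one small helper covers them. A minimal Python sketch; the base URL http://localhost:8080 and the helper name get_json are assumptions, not part of the API:

```python
import json
from urllib import request

BASE_URL = "http://localhost:8080"  # assumed server address; adjust to your deployment

def get_json(path, base_url=BASE_URL):
    """GET a JSON endpoint (e.g. /models, /data, /jobs, /checkpoints) and decode the body."""
    with request.urlopen(base_url + path) as resp:
        return json.load(resp)

# Example: names of all available models
# names = [m["name"] for m in get_json("/models")]
```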
Purpose:
Retrieves a list of data records by running the ilab data list command and parsing its output.
Request:
- Method: GET
- URL:
/data
Response:
- Status 200:
[
  {
    "dataset": "dataset-1",
    "created_at": "2025-02-18T11:00:00Z",
    "file_size": "10 MB"
  },
  {
    "dataset": "dataset-2",
    "created_at": "2025-02-17T15:30:00Z",
    "file_size": "20 MB"
  }
  // ... more datasets
]
Error Responses:
- 500 Internal Server Error: If there’s an error running the command or parsing its output.
Purpose:
Starts a background job to generate new data (by running the ilab data generate command). In mock mode, it simulates a job.
Request:
- Method: POST
- URL:
/data/generate
- Body: No request body is required.
Response:
- Status 200:
{ "job_id": "g-<timestamp-nano>" }
The returned job_id uniquely identifies the generation job.
Error Responses:
- 500 Internal Server Error: If starting the job or creating a log file fails.
Purpose:
Starts a model training job. The endpoint performs a Git checkout for the provided branch and then initiates a training job via the ilab model train command. It supports both real and mock modes.
Request:
- Method: POST
- URL:
/model/train
- Headers:
Content-Type: application/json
- Body:
{
  "modelName": "models/instructlab/my-model",
  "branchName": "feature/train-improvements",
  "epochs": 10  // optional; must be a positive integer if provided
}
Response:
- Status 200:
{ "job_id": "t-<timestamp-nano>" }
The returned job_id identifies the training job.
Error Responses:
- 400 Bad Request: If required fields (modelName or branchName) are missing or if epochs is invalid.
- 500 Internal Server Error: If Git checkout or job creation fails.
Purpose:
Retrieves the current status of a job (data generation, training, pipeline, serving, etc.) given its job ID.
Request:
- Method: GET
- URL:
/jobs/{job_id}/status
Example: /jobs/g-123456789/status
Response:
- Status 200:
{
  "job_id": "g-123456789",
  "status": "running",  // possible values: "running", "finished", "failed"
  "branch": "feature/train-improvements",
  "command": "/path/to/ilab"
}
Error Responses:
- 404 Not Found: If no job exists with the provided job_id.
- 500 Internal Server Error: On database query errors.
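Because jobs are asynchronous, callers typically poll this endpoint until the status becomes "finished" or "failed". A sketch of such a polling loop; the base URL and the injectable fetch parameter (handy for testing) are assumptions:

```python
import json
import time
from urllib import request

BASE_URL = "http://localhost:8080"  # assumed server address; adjust to your deployment

def wait_for_job(job_id, poll_seconds=5.0, fetch=None):
    """Poll /jobs/{job_id}/status until the job reaches a terminal state
    ("finished" or "failed") and return the final status document.

    `fetch` can be injected (e.g. for tests); by default it GETs the endpoint.
    """
    def default_fetch():
        with request.urlopen(f"{BASE_URL}/jobs/{job_id}/status") as resp:
            return json.load(resp)

    fetch = fetch or default_fetch
    while True:
        info = fetch()
        if info["status"] in ("finished", "failed"):
            return info
        time.sleep(poll_seconds)
```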
Purpose:
Retrieves the log file contents for the specified job ID.
Request:
- Method: GET
- URL:
/jobs/{job_id}/logs
Example: /jobs/t-123456789/logs
Response:
- Status 200: Plain text log content.
Error Responses:
- 404 Not Found: If either the job or its log file is not found.
- 500 Internal Server Error: If reading the log file fails.
Purpose:
Lists all jobs stored in the database.
Request:
- Method: GET
- URL:
/jobs
Response:
- Status 200:
[
  {
    "job_id": "g-123456789",
    "cmd": "path/to/ilab",
    "args": ["data", "generate", "--pipeline", "full"],
    "status": "finished",
    "pid": 12345,
    "log_file": "logs/g-123456789.log",
    "start_time": "2025-02-18T12:00:00Z",
    "end_time": "2025-02-18T12:05:00Z",
    "branch": "",
    "served_model_name": ""
  }
  // ... other jobs
]
Error Responses:
- 500 Internal Server Error: If there is a database error.
Purpose:
Orchestrates a full pipeline that first generates data and then starts a model training job. This endpoint creates a pipeline job that monitors both steps sequentially.
Request:
- Method: POST
- URL:
/pipeline/generate-train
- Headers:
Content-Type: application/json
- Body:
{
  "modelName": "models/instructlab/my-model",
  "branchName": "feature/train-improvements",
  "epochs": 10  // optional
}
Response:
- Status 200:
{ "pipeline_job_id": "p-<timestamp-nano>" }
Error Responses:
- 400 Bad Request: If required fields are missing.
- 500 Internal Server Error: If any of the steps (data generation or training) fails to start.
Purpose:
Serves the latest checkpoint of a model on port 8001. The endpoint checks for an optional checkpoint parameter in the request body. If not provided, it selects the latest checkpoint directory (matching a prefix such as "samples_").
Request:
- Method: POST
- URL:
/model/serve-latest
- Headers:
Content-Type: application/json
- Body (optional):
{
  "checkpoint": "samples_12345"  // optional; if omitted, the latest checkpoint is used
}
Response:
- Status 200:
If using vLLM mode, a container is launched and the response includes the corresponding job ID.
{ "status": "model process started", "job_id": "ml-<timestamp-nano>" }
Error Responses:
- 404 Not Found: If the checkpoints directory or specified checkpoint does not exist.
- 500 Internal Server Error: If starting the model serving process fails.
Purpose:
Serves the base model on port 8000. Depending on configuration (vLLM enabled or not), it either spawns a container or launches a local serving process.
Request:
- Method: POST
- URL:
/model/serve-base
- Body: No JSON payload required.
Response:
- Status 200:
{ "status": "model process started", "job_id": "ml-<timestamp-nano>" }
Error Responses:
- 500 Internal Server Error: If there is an error launching the serving process or container.
Purpose:
Runs a QnA evaluation by launching a Podman container with a specified model and YAML configuration. It validates the existence of both the model path and YAML file before execution.
Request:
- Method: POST
- URL:
/qna-eval
- Headers:
Content-Type: application/json
- Body:
{ "model_path": "/path/to/model", "yaml_file": "/path/to/config.yaml" }
Response:
- Status 200:
{ "result": "Evaluation output from QnA-eval container..." }
Error Responses:
- 400 Bad Request: If the specified model_path or yaml_file does not exist or if the request body is malformed.
- 500 Internal Server Error: If the Podman command fails.
Purpose:
Lists all checkpoint directories available in the expected checkpoints folder (typically under the user's home directory).
Request:
- Method: GET
- URL:
/checkpoints
Response:
- Status 200:
[ "checkpoint_1", "checkpoint_2", "samples_12345" ]
Error Responses:
- 404 Not Found: If the checkpoints directory does not exist.
- 500 Internal Server Error: If there is an error reading the directory.
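The /checkpoints listing can be combined with /model/serve-latest: picking the newest "samples_" entry client-side lets you pass an explicit checkpoint rather than relying on the server's default selection. A sketch; the numeric-suffix ordering is an assumption about how checkpoint names sort:

```python
def latest_samples_checkpoint(names):
    """From a /checkpoints listing, return the 'samples_*' entry with the
    highest numeric suffix, or None if there are no sample checkpoints."""
    samples = [n for n in names if n.startswith("samples_")]
    if not samples:
        return None
    return max(samples, key=lambda n: int(n.split("_", 1)[1]))
```

The chosen name can then be sent as the optional "checkpoint" field in the /model/serve-latest request body.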
Purpose:
Lists all running vLLM containers. The endpoint calls Podman to list containers that match the vLLM image and then inspects each container to extract arguments such as the served model name and model path.
Request:
- Method: GET
- URL:
/vllm-containers
Response:
- Status 200:
{
  "containers": [
    {
      "container_id": "abcdef123456",
      "image": "vllm/vllm-openai:latest",
      "command": "[...command...]",
      "created_at": "2025-02-18T12:00:00Z",
      "status": "Up 5 minutes",
      "ports": "8001/tcp",
      "names": "vllm_container_1",
      "served_model_name": "post-train",
      "model_path": "/path/to/model"
    }
    // ... additional containers
  ]
}
Error Responses:
- 500 Internal Server Error: If listing or inspecting containers fails.
Purpose:
Stops (unloads) a running vLLM container based on the served model name. Only the names "pre-train" or "post-train" are valid.
Request:
- Method: POST
- URL:
/vllm-unload
- Headers:
Content-Type: application/json
- Body:
{ "model_name": "pre-train" }
Response:
- Status 200:
{
  "status": "success",
  "message": "Model 'pre-train' unloaded successfully",
  "modelName": "pre-train"
}
Error Responses:
- 400 Bad Request: If the provided model_name is not "pre-train" or "post-train".
- 500 Internal Server Error: If stopping the container fails.
Purpose:
Retrieves the status of a vLLM container for a specified served model. The request must include a query parameter specifying the model name (either "pre-train" or "post-train"). The endpoint checks container status and reads log files to determine if the container is still loading or has finished.
Request:
- Method: GET
- URL:
/vllm-status?model_name=post-train
Response:
- Status 200:
{
  "status": "running"  // possible values: "running", "loading", or "stopped"
}
Error Responses:
- 400 Bad Request: If model_name is invalid.
- 500 Internal Server Error: If querying the container status fails.
Purpose:
Checks the availability of GPUs by running the nvidia-smi command and returns the number of free GPUs along with the total GPU count.
Request:
- Method: GET
- URL:
/gpu-free
Response:
- Status 200:
{ "free_gpus": 1, "total_gpus": 4 }
Error Responses:
- 500 Internal Server Error: If executing nvidia-smi fails.
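A client that wants to queue work (e.g. a training or serving request) can gate on this endpoint until a GPU frees up. A sketch of such a loop; the base URL and the injectable fetch parameter (for testability) are assumptions:

```python
import json
import time
from urllib import request

BASE_URL = "http://localhost:8080"  # assumed server address; adjust to your deployment

def wait_for_free_gpu(poll_seconds=10.0, fetch=None):
    """Block until /gpu-free reports at least one free GPU, then return
    the status document (free_gpus / total_gpus).

    `fetch` can be injected for tests; by default it GETs the endpoint.
    """
    def default_fetch():
        with request.urlopen(BASE_URL + "/gpu-free") as resp:
            return json.load(resp)

    fetch = fetch or default_fetch
    while True:
        info = fetch()
        if info["free_gpus"] > 0:
            return info
        time.sleep(poll_seconds)
```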
Purpose:
A debug endpoint that returns the current mapping of served model names (e.g., "pre-train" or "post-train") to their associated running job IDs (for vLLM container processes).
Request:
- Method: GET
- URL:
/served-model-jobids
Response:
- Status 200:
{ "pre-train": "v-123456789", "post-train": "v-987654321" }
Purpose:
(Implementation details not fully provided in the snippet.)
Typically, this endpoint is expected to convert a model from one format to another (e.g., from one file format to GGUF). The request body likely contains parameters such as source model path, target format, and additional options.
Request:
- Method: POST
- URL:
/model/convert
- Headers:
Content-Type: application/json
- Body:
{
  "source_model_path": "/path/to/source/model",
  "target_format": "gguf"
  // ... additional conversion options
}
Response:
- Status 200:
{ "status": "conversion completed", "converted_model_path": "/path/to/converted/model" }
Error Responses:
- 400 Bad Request: For invalid or missing parameters.
- 500 Internal Server Error: If the conversion process fails.