
Ilab UI API Server


1. GET /models

Purpose:
Returns a list of available models. The server first checks an in‑memory cache (refreshed every 20 minutes) and returns the cached models if still valid; otherwise, it refreshes the cache by executing the ilab model list command.

Request:

  • Method: GET
  • URL: /models

Response:

  • Status 200:
    [
      {
        "name": "model-name-1",
        "last_modified": "2025-02-18T12:34:56Z",
        "size": "123 MB"
      },
      {
        "name": "model-name-2",
        "last_modified": "2025-02-17T09:21:30Z",
        "size": "456 MB"
      }
      // ... more models
    ]

Error Responses:

  • 500 Internal Server Error: If encoding or cache refresh fails.
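
For illustration, here is a minimal Python client call using the requests library. The base URL http://localhost:8080 is an assumption; this document does not state where the server listens, so adjust it for your deployment. The same assumed base URL is reused in the later examples.

    import requests

    resp = requests.get("http://localhost:8080/models")
    resp.raise_for_status()
    for model in resp.json():
        print(model["name"], model["last_modified"], model["size"])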

2. GET /data

Purpose:
Retrieves a list of data records by running the ilab data list command and parsing its output.

Request:

  • Method: GET
  • URL: /data

Response:

  • Status 200:
    [
      {
        "dataset": "dataset-1",
        "created_at": "2025-02-18T11:00:00Z",
        "file_size": "10 MB"
      },
      {
        "dataset": "dataset-2",
        "created_at": "2025-02-17T15:30:00Z",
        "file_size": "20 MB"
      }
      // ... more datasets
    ]

Error Responses:

  • 500 Internal Server Error: If there’s an error running the command or parsing its output.

3. POST /data/generate

Purpose:
Starts a background job to generate new data (by running the ilab data generate command). In mock mode, it simulates a job.

Request:

  • Method: POST
  • URL: /data/generate
  • Body: No request body is required.

Response:

  • Status 200:
    {
      "job_id": "g-<timestamp-nano>"
    }
    The returned job_id uniquely identifies the generation job.

Error Responses:

  • 500 Internal Server Error: If starting the job or creating a log file fails.
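
Illustrative Python sketch (same assumed base URL as in the /models example). The endpoint takes no body, so a bare POST suffices:

    import requests

    resp = requests.post("http://localhost:8080/data/generate")
    resp.raise_for_status()
    job_id = resp.json()["job_id"]  # e.g. a "g-<timestamp-nano>" value
    print("generation job started:", job_id)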

4. POST /model/train

Purpose:
Starts a model training job. The endpoint performs a Git checkout for the provided branch and then initiates a training job via the ilab model train command. It supports both real and mock modes.

Request:

  • Method: POST
  • URL: /model/train
  • Headers: Content-Type: application/json
  • Body:
    {
      "modelName": "models/instructlab/my-model",
      "branchName": "feature/train-improvements",
      "epochs": 10  // optional; must be a positive integer if provided
    }

Response:

  • Status 200:
    {
      "job_id": "t-<timestamp-nano>"
    }
    The job_id identifies the training job.

Error Responses:

  • 400 Bad Request: If required fields (modelName or branchName) are missing or if epochs is invalid.
  • 500 Internal Server Error: If Git checkout or job creation fails.
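
Illustrative Python sketch (assumed base URL as above; the payload values are the placeholders from the request example):

    import requests

    payload = {
        "modelName": "models/instructlab/my-model",
        "branchName": "feature/train-improvements",
        "epochs": 10,  # optional; must be a positive integer if provided
    }
    resp = requests.post("http://localhost:8080/model/train", json=payload)
    if resp.status_code == 400:
        print("bad request:", resp.text)  # missing field or invalid epochs
    else:
        resp.raise_for_status()
        print("training job:", resp.json()["job_id"])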

5. GET /jobs/{job_id}/status

Purpose:
Retrieves the current status of a job (data generation, training, pipeline, serving, etc.) given its job ID.

Request:

  • Method: GET
  • URL: /jobs/{job_id}/status
    Example: /jobs/g-123456789/status

Response:

  • Status 200:
    {
      "job_id": "g-123456789",
      "status": "running",   // possible values: "running", "finished", "failed"
      "branch": "feature/train-improvements",
      "command": "/path/to/ilab"
    }

Error Responses:

  • 404 Not Found: If no job exists with the provided job_id.
  • 500 Internal Server Error: On database query errors.
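
Since jobs run in the background, a client typically polls this endpoint until the job leaves the "running" state. A sketch (assumed base URL; the poll interval is the client's choice, not mandated by the API):

    import time
    import requests

    job_id = "g-123456789"  # as returned by /data/generate or /model/train
    while True:
        resp = requests.get(f"http://localhost:8080/jobs/{job_id}/status")
        resp.raise_for_status()
        status = resp.json()["status"]
        if status in ("finished", "failed"):
            break
        time.sleep(10)
    print("job ended with status:", status)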

6. GET /jobs/{job_id}/logs

Purpose:
Retrieves the log file contents for the specified job ID.

Request:

  • Method: GET
  • URL: /jobs/{job_id}/logs
    Example: /jobs/t-123456789/logs

Response:

  • Status 200: Plain text log content.

Error Responses:

  • 404 Not Found: If either the job or its log file is not found.
  • 500 Internal Server Error: If reading the log file fails.

7. GET /jobs

Purpose:
Lists all jobs stored in the database.

Request:

  • Method: GET
  • URL: /jobs

Response:

  • Status 200:
    [
      {
        "job_id": "g-123456789",
        "cmd": "path/to/ilab",
        "args": ["data", "generate", "--pipeline", "full"],
        "status": "finished",
        "pid": 12345,
        "log_file": "logs/g-123456789.log",
        "start_time": "2025-02-18T12:00:00Z",
        "end_time": "2025-02-18T12:05:00Z",
        "branch": "",
        "served_model_name": ""
      },
      // ... other jobs
    ]

Error Responses:

  • 500 Internal Server Error: If there is a database error.

8. POST /pipeline/generate-train

Purpose:
Orchestrates a full pipeline that first generates data and then starts a model training job. This endpoint creates a pipeline job that monitors both steps sequentially.

Request:

  • Method: POST
  • URL: /pipeline/generate-train
  • Headers: Content-Type: application/json
  • Body:
    {
      "modelName": "models/instructlab/my-model",
      "branchName": "feature/train-improvements",
      "epochs": 10  // optional
    }

Response:

  • Status 200:
    {
      "pipeline_job_id": "p-<timestamp-nano>"
    }

Error Responses:

  • 400 Bad Request: If required fields (modelName or branchName) are missing.
  • 500 Internal Server Error: If any of the steps (data generation or training) fails to start.

9. POST /model/serve-latest

Purpose:
Serves the latest checkpoint of a model on port 8001. The endpoint checks for an optional checkpoint parameter in the request body. If not provided, it selects the latest checkpoint directory (matching a prefix such as "samples_").

Request:

  • Method: POST
  • URL: /model/serve-latest
  • Headers: Content-Type: application/json
  • Body (optional):
    {
      "checkpoint": "samples_12345"  // Optional; if omitted, the latest checkpoint is used.
    }

Response:

  • Status 200:
    {
      "status": "model process started",
      "job_id": "ml-<timestamp-nano>"
    }
    If using vLLM mode, a container is launched and the response includes the corresponding job ID.

Error Responses:

  • 404 Not Found: If the checkpoints directory or specified checkpoint does not exist.
  • 500 Internal Server Error: If starting the model serving process fails.

10. POST /model/serve-base

Purpose:
Serves the base model on port 8000. Depending on configuration (vLLM enabled or not), it either spawns a container or launches a local serving process.

Request:

  • Method: POST
  • URL: /model/serve-base
  • Body: No JSON payload required.

Response:

  • Status 200:
    {
      "status": "model process started",
      "job_id": "ml-<timestamp-nano>"
    }

Error Responses:

  • 500 Internal Server Error: If there is an error launching the serving process or container.

11. POST /qna-eval

Purpose:
Runs a QnA evaluation by launching a Podman container with a specified model and YAML configuration. It validates the existence of both the model path and YAML file before execution.

Request:

  • Method: POST
  • URL: /qna-eval
  • Headers: Content-Type: application/json
  • Body:
    {
      "model_path": "/path/to/model",
      "yaml_file": "/path/to/config.yaml"
    }

Response:

  • Status 200:
    {
      "result": "Evaluation output from QnA-eval container..."
    }

Error Responses:

  • 400 Bad Request: If the specified model_path or yaml_file does not exist or if the request body is malformed.
  • 500 Internal Server Error: If the Podman command fails.

12. GET /checkpoints

Purpose:
Lists all checkpoint directories available in the expected checkpoints folder (typically under the user's home directory).

Request:

  • Method: GET
  • URL: /checkpoints

Response:

  • Status 200:
    [
      "checkpoint_1",
      "checkpoint_2",
      "samples_12345"
    ]

Error Responses:

  • 404 Not Found: If the checkpoints directory does not exist.
  • 500 Internal Server Error: If there is an error reading the directory.

13. GET /vllm-containers

Purpose:
Lists all running vLLM containers. The endpoint calls Podman to list containers that match the vLLM image and then inspects each container to extract arguments such as the served model name and model path.

Request:

  • Method: GET
  • URL: /vllm-containers

Response:

  • Status 200:
    {
      "containers": [
        {
          "container_id": "abcdef123456",
          "image": "vllm/vllm-openai:latest",
          "command": "[...command...]",
          "created_at": "2025-02-18T12:00:00Z",
          "status": "Up 5 minutes",
          "ports": "8001/tcp",
          "names": "vllm_container_1",
          "served_model_name": "post-train",
          "model_path": "/path/to/model"
        }
        // ... additional containers
      ]
    }

Error Responses:

  • 500 Internal Server Error: If listing or inspecting containers fails.

14. POST /vllm-unload

Purpose:
Stops (unloads) a running vLLM container based on the served model name. Only the names "pre-train" or "post-train" are valid.

Request:

  • Method: POST
  • URL: /vllm-unload
  • Headers: Content-Type: application/json
  • Body:
    {
      "model_name": "pre-train"
    }

Response:

  • Status 200:
    {
      "status": "success",
      "message": "Model 'pre-train' unloaded successfully",
      "modelName": "pre-train"
    }

Error Responses:

  • 400 Bad Request: If the provided model_name is not "pre-train" or "post-train".
  • 500 Internal Server Error: If stopping the container fails.

15. GET /vllm-status

Purpose:
Retrieves the status of a vLLM container for a specified served model. The request must include a query parameter specifying the model name (either "pre-train" or "post-train"). The endpoint checks container status and reads log files to determine if the container is still loading or has finished.

Request:

  • Method: GET
  • URL: /vllm-status?model_name=post-train

Response:

  • Status 200:
    {
      "status": "running"  // possible values: "running", "loading", or "stopped"
    }

Error Responses:

  • 400 Bad Request: If model_name is missing or is not "pre-train" or "post-train".
  • 500 Internal Server Error: If querying the container status fails.

16. GET /gpu-free

Purpose:
Checks the availability of GPUs by running the nvidia-smi command and returns the number of free GPUs along with the total GPU count.

Request:

  • Method: GET
  • URL: /gpu-free

Response:

  • Status 200:
    {
      "free_gpus": 1,
      "total_gpus": 4
    }

Error Responses:

  • 500 Internal Server Error: If executing nvidia-smi fails.
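
Illustrative Python sketch (assumed base URL), e.g. to gate a training request on GPU availability:

    import requests

    resp = requests.get("http://localhost:8080/gpu-free")
    resp.raise_for_status()
    gpus = resp.json()
    if gpus["free_gpus"] == 0:
        print("no free GPUs out of", gpus["total_gpus"])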

17. GET /served-model-jobids

Purpose:
A debug endpoint that returns the current mapping of served model names (e.g., "pre-train" or "post-train") to their associated running job IDs (for vLLM container processes).

Request:

  • Method: GET
  • URL: /served-model-jobids

Response:

  • Status 200:
    {
      "pre-train": "v-123456789",
      "post-train": "v-987654321"
    }

18. POST /model/convert

Purpose:
(Implementation details are not fully provided in the source snippet.)
This endpoint is expected to convert a model from one format to another (e.g., to GGUF). The request body likely contains parameters such as the source model path, the target format, and additional conversion options.

Request:

  • Method: POST
  • URL: /model/convert
  • Headers: Content-Type: application/json
  • Body:
    {
      "source_model_path": "/path/to/source/model",
      "target_format": "gguf"
      // ... additional conversion options
    }

Response:

  • Status 200:
    {
      "status": "conversion completed",
      "converted_model_path": "/path/to/converted/model"
    }

Error Responses:

  • 400 Bad Request: For invalid or missing parameters.
  • 500 Internal Server Error: If the conversion process fails.
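
A hypothetical Python sketch based on the speculated request shape above; since the implementation is not shown in this snippet, verify the actual field names against the server code before relying on this:

    import requests

    # Hypothetical payload; field names are assumptions, not confirmed API.
    resp = requests.post("http://localhost:8080/model/convert",
                         json={"source_model_path": "/path/to/source/model",
                               "target_format": "gguf"})
    resp.raise_for_status()
    print(resp.json().get("converted_model_path"))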
