wandbot evals including Intercom bot answers - May 19th 2025
[ | |
{ | |
"index": 0, | |
"question": """Hey I have a question about using wandb with fastapi in a prod environment. is it recommended to initialize wandb within a specific route function, ie | |
`@app.route('/') | |
def my_function(): | |
wandb.init(...) | |
` | |
or should i initialize beforehand: | |
`wandb.init(...) | |
@app.route('/') | |
def my_function(): | |
...` | |
I'm getting a long list of log items in the console and many of them are empty.""", | |
"answer": """We recommend initializing wandb as early as possible in your FastAPI application, preferably before defining your route functions. This helps capture any console output, including error messages, which can be useful for debugging. | |
To set up wandb in your FastAPI app: | |
Initialize wandb at the start of your main application file: | |
import wandb | |
wandb.init(project="your_project_name") | |
app = FastAPI() | |
@app.route('/') | |
def my_function(): | |
# Your route logic here | |
To address the empty log items, try increasing the initialization timeout: | |
import os | |
os.environ['WANDB_INIT_TIMEOUT'] = '600' | |
This gives wandb more time to start up. | |
If issues persist, enable debugging for more detailed logs: | |
os.environ['WANDB_DEBUG'] = 'true' | |
os.environ['WANDB_CORE_DEBUG'] = 'true' | |
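Putting these pieces together, here is a minimal sketch of a FastAPI app that starts one long-lived run before the routes are defined (the project name, route, and logged metric are placeholders, not a prescribed setup):
import os
import wandb
from fastapi import FastAPI
os.environ.setdefault("WANDB_INIT_TIMEOUT", "600")  # give the wandb backend extra time to start
run = wandb.init(project="your_project_name")  # one shared run for the whole app
app = FastAPI()
@app.get("/")
def my_function():
    run.log({"requests_served": 1})  # example metric; replace with your own logging
    return {"status": "ok"}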
These steps should help resolve your initialization and logging issues. """, | |
}, | |
{ | |
"index": 1, | |
"question": """Hey with wandb is it possible to link from the sweep api? | |
best_run = sweep.best_run() | |
I have the best run but it isn't clear to me how to link through the api. the UI works fine but trying to get the api working also. I have been reading the docs but missing something.""", | |
"answer": """To link to the best run from a sweep using the Weights & Biases API, you can use the best_run() method. Here's how to do it: | |
Import the wandb library: | |
import wandb | |
Initialize the API: | |
api = wandb.Api() | |
Get the sweep object: | |
sweep = api.sweep("<entity>/<project>/<sweep_id>") | |
Get the best run: | |
best_run = sweep.best_run() | |
This best_run object will contain information about the best run in the sweep, based on the metric defined in the sweep configuration. | |
You can then use this best_run object to access various properties or methods of the run, similar to how you'd interact with it in the UI.""", | |
}, | |
{ | |
"index": 2, | |
"question": """Explain how I can version datasets with Weights & Biases. How do I create a new version of a dataset? How does Weights & Biases work with versions? What are some best practices for versioning datasets?""", | |
"answer": """We use Artifacts for dataset versioning in Weights & Biases. Here's how it works: | |
Create new versions: | |
• From scratch: Use wandb.Artifact API with a new name. | |
• From existing version: Use artifact.new_draft() to modify a subset of files. | |
Versioning system: | |
• First version is v0, subsequent changes create v1, v2, etc. | |
• Use aliases like latest or best for easy reference. | |
Best practices: | |
• Log datasets as Artifacts in your runs. | |
• Use clear, short aliases for different versions. | |
• Explore datasets using W&B Tables for visualization. | |
Understanding versioning: | |
• W&B automatically tracks versions and creates a DAG lineage. | |
• Learn more in our Artifacts guide. | |
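Pulling the "create new versions" steps together, here is a minimal sketch (project, artifact, and file names are placeholders); logging the same artifact name again with changed contents is what produces v1, v2, and so on:
import wandb
with wandb.init(project="my-project", job_type="dataset-upload") as run:
    artifact = wandb.Artifact("my-dataset", type="dataset")
    artifact.add_file("data/train.csv")  # re-log with changed files to create the next version
    run.log_artifact(artifact)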
This system helps you track changes and manage your datasets easily throughout your ML workflow. """, | |
}, | |
{ | |
"index": 3, | |
"question": """Hi, can anybody help me with this issue? wandb.sdk.service.service.ServiceStartTimeoutError: Timed out waiting for wandb service to start after 30.0 seconds. Try increasing the timeout with the `_service_wait` setting. I have tried to set 300 seconds, reinstalled wandb as well. but I still get this error""", | |
"answer": """I understand this timeout error can be frustrating. Here are several approaches to resolve your ServiceStartTimeoutError: | |
Try setting the WANDB_INIT_TIMEOUT environment variable: | |
import os | |
os.environ['WANDB_INIT_TIMEOUT'] = '600' | |
Enable debugging to identify underlying issues: | |
os.environ['WANDB_DEBUG'] = 'true' | |
os.environ['WANDB_CORE_DEBUG'] = 'true' | |
Change the start method in your code: | |
wandb.init(settings=wandb.Settings(start_method="fork")) | |
For older SDK versions (pre-0.13.0), use "thread" instead of "fork". | |
For SDK version 0.12.5+, add this before your wandb.init(): | |
wandb.require("service") | |
Check your network connection and verify your API key and project settings are correct """, | |
}, | |
{ | |
"index": 4, | |
"question": """what is the difference between artifact.add_file and wandb.save""", | |
"answer": """The main difference between artifact.add_file and wandb.save in Weights & Biases is their purpose and how they handle files: | |
artifact.add_file: | |
• Used to add files to an Artifact object. | |
• Allows you to organize and version files within the Artifact structure. | |
• Useful for tracking datasets, models, and other assets. | |
wandb.save: | |
• Used to save individual files directly to the run. | |
• Doesn't create an Artifact structure. | |
• Typically used for saving code or small config files. | |
For better organization and versioning, especially for datasets and models, using Artifacts with add_file is often recommended """, | |
}, | |
{ | |
"index": 5, | |
"question": """Hi I have created an academic team, however I don't have admin rights in my own group and cannot access admin features at all""", | |
"answer": """When you create a team, you're not automatically assigned as an admin. Admin roles are specifically assigned by either: | |
The organization administrator (the first user who signed up with your organization domain) | |
An existing team administrator | |
As a team creator, you need to have the admin role explicitly assigned to you. Without admin rights, you won't be able to: | |
Add or remove team members | |
Manage team settings | |
Have full deletion permissions for all projects | |
To get admin rights, you'll need to contact your organization administrator and ask them to assign you the admin role for your academic team """, | |
}, | |
{ | |
"index": 6, | |
"question": """How can I add prediction-related data to a W&B run days after my training run (i.e. not in the same process/moment that I did the training)? Is it possible to link prediction data to an external tool such as BigQuery and still get rich visualizations in W&B?""", | |
"answer": """You can add prediction data to a completed W&B run using artifacts or by logging tables to an existing run. Here's how: | |
Using Artifacts: | |
run = wandb.init(project="my_project") | |
predictions = wandb.Artifact("test_predictions", type="predictions") | |
test_table = wandb.Table(data=your_prediction_data, columns=your_columns) | |
predictions.add(test_table, "predictions_key") | |
run.log_artifact(predictions) | |
Logging Tables to Runs: | |
run = wandb.init() | |
my_table = wandb.Table(columns=["id", "prediction", "ground_truth"], | |
data=your_prediction_data) | |
run.log({"predictions": my_table}) | |
For BigQuery data, you would need to extract the data first and then format it as a W&B Table before logging. This allows you to get the same rich visualizations for comparing predictions across time or different model versions """, | |
}, | |
{ | |
"index": 7, | |
"question": """On my laptop, wandb is configured to log in to my company's wandb. However, I want to use my personal wandb account for my personal projects. Even after wandb login --relogin, it will still try to go to company.wandb.ai instead of api.wandb.ai. How can I solve this? | |
Even better, is there a good way to handle two wandb accounts? I cannot give details about how I set my company wandb.""", | |
"answer": """You can specify a different host when logging in to your personal W&B account using the --host option: | |
wandb login --host https://api.wandb.ai --relogin | |
This will direct you to your personal account instead of your company's instance. | |
For managing two accounts, you can use the wandb init command to set up project-specific configurations: | |
Create a directory for your personal projects | |
Inside that directory, run: | |
wandb init -e YOUR_USERNAME -p YOUR_PROJECT --mode online | |
This creates a wandb/settings file in that directory that will override your global settings when you run wandb from there. | |
When switching between work and personal projects, just make sure you're in the appropriate directory before running your code """, | |
}, | |
{ | |
"index": 8, | |
"question": """I want to add a new columns to a wandb.Table with an operation involving other columns, what operations can I use and how to actually do this?""", | |
"answer": """You can add a new column to a wandb.Table based on operations involving other columns using the add_computed_columns method. | |
Here's how to do it: | |
# Create a function that defines your operation | |
def compute_new_columns(ndx, row): | |
# Example: Create a new column that adds values from columns 'a' and 'b' | |
return { | |
"sum_a_b": row["a"] + row["b"] | |
} | |
# Apply the function to create the new column | |
my_table.add_computed_columns(compute_new_columns) | |
The function you pass to add_computed_columns should: | |
Accept parameters ndx (row index) and row (dictionary of existing columns) | |
Return a dictionary with new column names as keys and computed values as values | |
You can perform any operation on the existing columns - addition, multiplication, string operations, or more complex transformations based on your needs """, | |
}, | |
{ | |
"index": 9, | |
"question": """Hey there, I just recently starting using the `WANDB_WATCH="all"` environment variable and I'm trying to find an intelligent way of viewing all these different values on the dashboard. Is there some type of high-level view? In particular I'd like to see how gradients/activations vary across layers for the `llama` model. How do I see outliers across the model weights? Are there any references/examples that you could point me to?""", | |
"answer": """When you set WANDB_WATCH="all", we log both gradients and parameters as histograms during model training. | |
To view these in the dashboard: | |
The data is automatically logged during training when this environment variable is set | |
You can also explicitly call wandb.watch(model, log="all", log_freq=10) in your code for more control | |
Once logged, you can: | |
Find histograms of gradients and parameters in the "Model" tab of your run page | |
Compare distributions across layers | |
Identify outliers by looking for unusual patterns in the histograms | |
Adjust the logging frequency with log_freq parameter if needed | |
For the llama model specifically, these visualizations will help you analyze how gradients and activations vary across the transformer layers """, | |
}, | |
{ | |
"index": 10, | |
"question": """Wandb ignores --configs flag when running training script. Running a script I want to specify what config YAML to use. The documentation states you can specify what configuration file to load using the `--configs` flag, however running | |
`python myscript.py --configs new-configs.yaml` | |
on the following script | |
`import wandb | |
wandb.init() | |
print(wandb.config)` | |
I can see that the configuration contained in `config-defaults.yaml` is being loaded instead. Why is this happening?""", | |
"answer": """The issue might be with how you're initializing wandb in your script. When using the --configs flag, you need to make sure wandb.init() is properly set up to use this command-line argument. | |
In your current script, you're using wandb.init() without any parameters, which means it will default to loading config-defaults.yaml automatically. | |
Try modifying your script to explicitly handle command-line arguments: | |
import wandb | |
# This will properly pick up the --configs flag | |
wandb.init() | |
print(wandb.config) | |
The --configs flag should work when used like this: | |
python myscript.py --configs new-configs.yaml | |
If you're still having issues, check that your YAML file is properly formatted and in the correct location relative to your script """, | |
}, | |
{ | |
"index": 11, | |
"question": """I want to submit a wandb Report to a conference, can I somehow anonymize the datA?""", | |
"answer": """When submitting a W&B Report to a conference, you can anonymize your data by: | |
Renaming runs in your report to make them more meaningful and less identifiable. This feature allows you to replace the original run names with anonymized versions while preserving the original names in your project. | |
For sharing with specific reviewers, use the "magic link" sharing option instead of making your project fully public. This gives access only to people with the link. | |
If you need more privacy controls, consider: | |
• Using the lock icon in the navigation bar to adjust privacy settings | |
• Creating a view-only link via a report for limited access | |
• Contacting your organization admin if certain options aren't available | |
The renamed runs will appear with their new names throughout all panel grids in your report, making your data presentation more professional while maintaining anonymity. """, | |
}, | |
{ | |
"index": 12, | |
"question": """i am trying to figure out the best way to organize/name/group my wandb runs when doing hyperparam tuning with cross-validation. | |
ideally, i'd like to be able to see the average score across all the runs that are part of a CV set. and then see the best CV score across the whole hyperparam sweep. (i am using optuna, not wandb, to manage the hyperparam sweep) | |
is there a recommended approach for this?""", | |
"answer": """For hyperparameter tuning with cross-validation in Weights & Biases, here's how to organize your runs: | |
Use the group parameter when initializing runs to organize related cross-validation runs together: | |
wandb.init(group="experiment_1", job_type="cross_validation") | |
This groups all CV runs for a specific hyperparameter set, making them viewable as a unified experiment. | |
Add metadata with config to track hyperparameters and cross-validation details: | |
wandb.init( | |
group="hyperparam_set_1", | |
config={"learning_rate": 0.01, "cv_fold": 3} | |
) | |
Use job_type to distinguish different runs within a group (e.g., "train" vs "eval"). | |
In the W&B UI, you can: | |
View grouped runs together | |
See the mean, min, max values across CV folds | |
Toggle grouping on/off with the "Group" button | |
Customize graph settings to show variance across CV folds | |
This approach lets you easily compare performance across different hyperparameter sets while tracking individual CV fold results. """, | |
}, | |
{ | |
"index": 13, | |
"question": """can you tell why this error occurred while uploading the training results to the server: | |
wandb: ERROR Error while calling W&B API: run lee086824/Graduate_Project/4iolqdaz was previously created and deleted; try a new run name (Response [409]>)""", | |
"answer": """This error occurs because you're trying to create a new run with the same ID as one that was previously deleted. When a run is deleted in W&B, its ID remains reserved for a period of time. | |
To resolve this issue, you have two options: | |
Use a different run name/ID in your code when initializing wandb | |
Recover the deleted run if it was deleted within the last 7 days: | |
• Go to your Project Overview page | |
• Click the three dots in the top right corner | |
• Select "Undelete recently deleted runs" | |
Note that you can only restore runs deleted within the last 7 days. If the run was deleted earlier, you'll need to use a new run name. """, | |
}, | |
{ | |
"index": 14, | |
"question": """why am I getting a table instead of confusion matrix chart using wandb.log({"confusion_matrix_normalized": wandb.plot.confusion_matrix( | |
y_true=y_test, | |
probs=y_probas, | |
class_names=text_labels | |
)})""", | |
"answer": """It seems you're encountering an unexpected result when trying to log a confusion matrix. To help you better, could you please share the exact error message or unexpected output you're seeing? This will help us pinpoint the issue more accurately. | |
Generally, when using wandb.plot.confusion_matrix(), you should get a confusion matrix chart, not a table. Here's a quick reminder of how to use it correctly: | |
cm = wandb.plot.confusion_matrix( | |
y_true=ground_truth, | |
preds=predictions, | |
class_names=class_names | |
) | |
wandb.log({"conf_mat": cm}) | |
Make sure you're providing the correct parameters: ground truth labels, predictions (or probabilities), and class names. If you're still getting a table, there might be an issue with the data format or how you're logging it """, | |
}, | |
{ | |
"index": 15, | |
"question": """How can I store data versions of large datasets using artefacts and tables? Such as millions of videos?""", | |
"answer": """We can help you store and version large datasets, including millions of videos, using our Artifacts and Tables features. Here's how: | |
Use Artifacts for dataset versioning: | |
• Create an Artifact with wandb.Artifact() | |
• Add your dataset files to the Artifact using artifact.add_file() or artifact.new_file() | |
• Log the Artifact with run.log_artifact() | |
For extremely large datasets stored externally: | |
• Use reference artifacts to track metadata (URLs, size, checksums) without uploading the actual data | |
Use Tables for structured data: | |
• Create a Table with wandb.Table() | |
• Add data incrementally with table.add_data() | |
• Log the Table to your run with run.log({"table_name": table}) | |
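For the reference-artifact approach mentioned above, a rough sketch (the bucket URI and names are placeholders) looks like this:
import wandb
with wandb.init(project="video-dataset") as run:
    artifact = wandb.Artifact("raw-videos", type="dataset")
    # track URLs/checksums for files that stay in your bucket instead of uploading them
    artifact.add_reference("s3://your-bucket/videos/", max_objects=10_000_000)
    run.log_artifact(artifact)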
This approach allows you to version and track your large datasets efficiently, even with millions of videos. """, | |
}, | |
{ | |
"index": 16, | |
"question": """how to make runs in a sweep parallel""", | |
"answer": """To run sweep runs in parallel in Weights & Biases, you have a few options: | |
Multi-CPU parallelization: | |
• Open multiple terminal windows on your machine. | |
• In each window, run the command: wandb agent sweep_id | |
• Replace sweep_id with your actual sweep ID. | |
Multi-GPU parallelization: | |
• Open multiple terminal windows. | |
• In each window, specify a different GPU using CUDA_VISIBLE_DEVICES. | |
• For example: | |
• Terminal 1: CUDA_VISIBLE_DEVICES=0 wandb agent sweep_id | |
• Terminal 2: CUDA_VISIBLE_DEVICES=1 wandb agent sweep_id | |
Jupyter Notebook parallelization: | |
• Open multiple Jupyter Notebooks. | |
• In each notebook, run: wandb.agent(sweep_id=sweep_id, function=function_name) | |
• Replace sweep_id and function_name with your actual values. | |
By using these methods, you can run multiple agents in parallel, speeding up your hyperparameter optimization process """, | |
}, | |
{ | |
"index": 17, | |
"question": """Can I copy runs from one project from a team to another non-team project from? I can see all the runs in my wandb folder but I don't know how to sync those runs to another project. | |
I need this because I have used all free tracked hours provided by weights and biases for the team entity and now I wish to transfer those runs to my personal project. There is an option for moving those runs as mentioned here (https://docs.wandb.ai/guides/runs/manage-runs) using GUI, but now I can not access those runs as I am being redirected to billing page because of using all the tracked hours.""", | |
"answer": """Unfortunately, W&B doesn't support copying runs directly - only moving them through the UI, which you can't access due to your billing limitation. | |
However, you can use the W&B API to export data from your team project and then create new runs in your personal project: | |
Create an API key from your user profile (if you don't have one already) | |
Use the API to access your runs data: | |
import wandb | |
api = wandb.Api() | |
runs = api.runs("team-entity/project-name") | |
For each run, you can extract metrics, configs, and other data, then create new runs in your personal project with this data. | |
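As a rough sketch of that export/re-log loop (the personal entity name is a placeholder, and wandb's internal history keys are skipped when re-logging):
import wandb
api = wandb.Api()
for src in api.runs("team-entity/project-name"):
    with wandb.init(entity="your-personal-entity", project="project-name",
                    name=src.name, config=dict(src.config)) as new_run:
        for row in src.scan_history():
            # drop internal keys such as _step and _timestamp before re-logging
            new_run.log({k: v for k, v in row.items() if not k.startswith("_")})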
This approach won't be an exact copy, but it allows you to transfer the essential information from your team project to your personal account. """, | |
}, | |
{ | |
"index": 18, | |
"question": """I'm getting Internal Server Error for url: https://api.wandb.ai/graphql when trying to download an artifact with the following code | |
``` | |
processed_data_at = run.use_artifact(f'{params.PROCESSED_DATA_AT}:latest') | |
processed_dataset_dir = Path(processed_data_at.download()) | |
```""", | |
"answer": """I understand this error can be frustrating, and I'm sorry for the trouble you're experiencing. To help you better, could you please share the exact error message you're seeing? Additionally, it would be helpful to know which specific artifact you're trying to download. | |
In general, when downloading artifacts, we use the following process: | |
Use the artifact: | |
artifact = run.use_artifact('artifact_name:latest') | |
Download the artifact: | |
artifact_dir = artifact.download() | |
If you're still encountering issues, please provide more details about the artifact and the full error message. This will help us identify the precise issue and find a solution for you """, | |
}, | |
{ | |
"index": 19, | |
"question": """do you support yolov5? how can i log COCO bounding boxes?""", | |
"answer": """Yes, we support YOLOv5! Ultralytics' YOLOv5 is directly integrated with Weights & Biases, making it easy to track experiments, version models and datasets, and visualize predictions. | |
To log COCO bounding boxes, you can use our wandb.Image with the boxes parameter. Here's a simplified example: | |
import wandb | |
import numpy as np | |
run = wandb.init() | |
image = np.array(your_image) # Your image data | |
# Define your class labels | |
class_labels = {0: "person", 1: "car", 2: "road", 3: "building"} | |
# Create wandb.Image with bounding boxes | |
img = wandb.Image( | |
image, | |
boxes={ | |
"predictions": { | |
"box_data": [ | |
{ | |
"position": {"minX": 0.1, "maxX": 0.2, "minY": 0.3, "maxY": 0.4}, | |
"class_id": 1, | |
"box_caption": class_labels[1] | |
} | |
], | |
"class_labels": class_labels | |
} | |
} | |
) | |
# Log the image with bounding boxes | |
run.log({"detection_results": img}) | |
You can specify boxes in either relative coordinates (0-1) or pixel coordinates. """, | |
}, | |
{ | |
"index": 20, | |
"question": """I have the following code, which finds a sweep by the sweep's display name and then gets the runs associated with it. | |
``` | |
def get_sweeps_by_sweep_name(entity, project, sweep_name, require_unique = True): | |
sweep_ids = [] | |
for sweep in api.from_path(f"{entity}/{project}").sweeps(): | |
if sweep.name == sweep_name: | |
sweep_ids.append(sweep.id) | |
if require_unique == True and len(sweep_ids) > 1: | |
raise ValueError("There are multiple sweeps with the same name") | |
return sweep_ids # singular if require_unique == True | |
def get_runs_by_sweep_name(entity, project, sweep_name): | |
sweep_ids = get_sweeps_by_sweep_name(entity, project, sweep_name, require_unique = True) | |
return api.runs(f"{entity}/{project}", filters={"sweep": sweep_ids[0]}) | |
# call with | |
sweepruns = get_runs_by_sweep_name("MYENTITY", "MYPROJECT", "MYRUNNAME1") | |
``` | |
Is there a better way to do this, maybe with just a single filter?""", | |
"answer": """Yes, there's a more direct way to get runs from a sweep by its display name. You can use the api.sweep() method to get the sweep directly, then access its runs: | |
import wandb | |
api = wandb.Api() | |
# Get the sweep by entity/project/sweep_name | |
sweep = api.sweep(f"{entity}/{project}/{sweep_name}") | |
# Access all runs in the sweep | |
sweep_runs = sweep.runs | |
This approach is more efficient than your current code because: | |
It directly fetches the sweep without iterating through all sweeps | |
It automatically gives you access to all runs in that sweep | |
It's more concise and readable | |
If you need to filter the runs further, you can still do that after getting the sweep.runs collection """, | |
}, | |
{ | |
"index": 21, | |
"question": """can you provide a bit more clarity on the difference between setting `resume` in `wandb.init` to `allow` vs. `auto`? | |
I guess the difference has to do with whether the previous run crashed or not. I guess if the run didn't crash, `auto` may overwrite if there's matching `id`?""", | |
"answer": """The main difference between resume="allow" and resume="auto" in wandb.init is how they handle existing runs: | |
resume="allow": | |
• If a run with the specified ID exists, it resumes from the last step. | |
• If no run with that ID exists, it creates a new run with the given ID. | |
• This option gives you more control over resuming specific runs. | |
resume="auto": | |
• Automatically tries to resume the previous run if it crashed on the same machine. | |
• If no crashed run is found, it starts a new run. | |
• This option is more hands-off and useful for automatic crash recovery. | |
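A minimal sketch of the resume="allow" case (the run ID here is a placeholder you control):
import wandb
# resumes run "exp-42" if it already exists in this project, otherwise creates it with that ID
run = wandb.init(project="my-project", id="exp-42", resume="allow")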
Neither option will overwrite existing runs. They either resume or create new runs based on the situation. """, | |
}, | |
{ | |
"index": 22, | |
"question": """I need some help with run resuming, does it automatically use the last saved checkpoint artifact model? How do i get my run resuming to start at the same place it ended? Right now its starting from epoch 1 every time.""", | |
"answer": """To resume your run from the last saved checkpoint, you need to: | |
Initialize a new run with the previous run ID: | |
with wandb.init( | |
project="your_project", | |
id="your_previous_run_id", | |
resume="must", | |
) as run: | |
Download your checkpoint artifact: | |
checkpoint_name = "checkpoint-your_run_id:latest" | |
checkpoint_artifact = run.use_artifact(checkpoint_name) | |
checkpoint_dir = checkpoint_artifact.download() | |
Use this checkpoint directory to initialize your model and trainer with the resume flag: | |
# For Hugging Face Transformers | |
trainer = Trainer(..., resume_from_checkpoint=True) | |
# For PyTorch Lightning | |
model = YourModel.load_from_checkpoint(Path(checkpoint_dir) / "model.ckpt") | |
This ensures your training continues from where it left off rather than starting from epoch 1.""", | |
}, | |
{ | |
"index": 23, | |
"question": """when using bayesian optimizer for sweep is it best to log the optimized metric only once (at the end of the training) or is it ok to log it throughout the training?""", | |
"answer": """When using the Bayesian optimizer for sweeps in Weights & Biases, it's generally better to log the optimized metric throughout the training, not just at the end. Here's why: | |
Continuous logging helps track progress and allows for early stopping if needed. | |
It provides more data points for the Bayesian optimizer to work with, potentially leading to better hyperparameter suggestions. | |
However, be mindful of logging frequency. For scalars (like most optimization metrics), aim for less than 100,000 logged points per metric. | |
You can adjust the logging frequency based on your total training steps. For example, if you have 1,000,000 steps, you might log every 100 or 1000 steps. | |
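A minimal sketch of that kind of throttled logging (the training step and metric below are stand-ins for your own):
import random
import wandb
run = wandb.init(project="sweep-demo")
total_steps = 1_000_000
for step in range(total_steps):
    val_loss = random.random()  # stand-in for your real validation metric
    if step % 1000 == 0:
        run.log({"val_loss": val_loss}, step=step)
run.finish()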
Remember to use wandb.log() to record your metric values throughout training, ensuring the Bayesian optimizer has enough information to make informed decisions. """, | |
}, | |
{ | |
"index": 24, | |
"question": """How can I link the run to model registry? How can I use the best run to do that?""", | |
"answer": """You can link a run to the model registry using the link_model method. Here's how: | |
import wandb | |
run = wandb.init(entity="your-entity", project="your-project") | |
run.link_model(path="path-to-model", registered_model_name="your-model-name") | |
run.finish() | |
To link your best run, simply use this method within that specific run or reference it: | |
Identify your best run based on your evaluation metrics | |
Use the link_model method with that run | |
Provide the path to your model files and the name for your registered model | |
If the registered model name already exists, your model will be added as a new version. If not, a new registered model will be created. | |
You can also link models interactively through the W&B UI after identifying your best run """, | |
}, | |
{ | |
"index": 25, | |
"question": """how can I use regex to search for all runs EXCEPT those containing a certain string, negative look-arounds such as ^((?!notthisrun).)*$ do not seem to work""", | |
"answer": """To search for runs excluding those with a specific string, you can use MongoDB's $nor operator with $regex in the W&B API: | |
api.runs( | |
path="your_entity/your_project", | |
filters={ | |
"$nor": [ | |
{"display_name": {"$regex": "notthisrun"}} | |
] | |
} | |
) | |
This query returns all runs where the display name does NOT match the regex pattern "notthisrun". You can apply this to other fields too, like config parameters or metrics. | |
The $nor operator performs a logical NOR operation on an array of expressions, selecting documents that fail all the specified conditions. """, | |
}, | |
{ | |
"index": 26, | |
"question": """Hi! I have a question regarding artifacts. I have deleted the data artifacts i've uploaded previously on wandb UI. I have uploaded new data. but seems like while using the new data artifacts (image paths) for future use, wandb is somehow returning the previously deleted paths.""", | |
"answer": """We understand this is unexpected behavior, and we'd like to help you resolve it. There are a few possible reasons for this: | |
Artifact versioning: When you use an artifact, W&B retrieves the version that was logged at the time of creation. Make sure you're using the latest version of your artifact. | |
Caching: W&B might be using cached data. Try clearing your local cache and re-downloading the artifact. | |
Reference artifacts: If you're using reference artifacts, ensure the underlying data in your file system has been updated. | |
To help us better understand your issue, could you please share: | |
The exact code you're using to create and use the artifact | |
Any error messages you're seeing | |
This information will help us provide a more accurate solution""", | |
}, | |
{ | |
"index": 27, | |
"question": """can I log image data with more than the standard 3 or 4 channels? How do I view it later online?""", | |
"answer": """Yes, you can log image data with more than the standard 3 or 4 channels in Weights & Biases. We support logging various types of media, including images with multiple channels. | |
To log multi-channel images, you can use the wandb.Image object. Here's a simple example: | |
import wandb | |
import numpy as np | |
# Create a multi-channel image (e.g., 6 channels) | |
multi_channel_img = np.random.rand(100, 100, 6) | |
# Log the image | |
wandb.log({"multi_channel_image": wandb.Image(multi_channel_img)}) | |
You can view these logged images later in the W&B UI. Our platform automatically handles the visualization of multi-channel images. | |
For more advanced logging and visualization options, check our documentation on logging media and objects. """, | |
}, | |
{ | |
"index": 28, | |
"question": """how to i download and concat every version of a Table thats been logged to artifacts?""", | |
"answer": """To download and concatenate all versions of a Table from artifacts, you'll need to: | |
First, get each version of your table using the artifact API: | |
import wandb | |
run = wandb.init(project="your_project") | |
# Get a specific version of your table | |
artifact_v1 = run.use_artifact("your-table-artifact:v0") | |
table_v1 = artifact_v1.get("table_name") | |
artifact_v2 = run.use_artifact("your-table-artifact:v1") | |
table_v2 = artifact_v2.get("table_name") | |
Then use wandb.JoinedTable() to combine them: | |
# Join tables (if they have a common key) | |
joined_table = wandb.JoinedTable(table_v1, table_v2, "common_key_column") | |
# Or convert to pandas dataframes and concatenate
import pandas as pd
df1 = table_v1.get_dataframe() | |
df2 = table_v2.get_dataframe() | |
combined_df = pd.concat([df1, df2]) | |
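To cover every version instead of hard-coding v0 and v1, a rough sketch (the artifact type "run_table" and the table key are assumptions; adjust them to whatever you logged):
import pandas as pd
import wandb
api = wandb.Api()
frames = []
for version in api.artifact_versions("run_table", "your-entity/your_project/your-table-artifact"):
    table = version.get("table_name")
    if table is not None:
        frames.append(pd.DataFrame(data=table.data, columns=table.columns))
combined_df = pd.concat(frames, ignore_index=True)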
You can repeat this process for all versions by iterating through them programmatically """, | |
}, | |
{ | |
"index": 29, | |
"question": """Can I download a model from W&B artifact without a W&B API key?""", | |
"answer": """To download a model from a Weights & Biases artifact, you typically need to use the W&B API, which requires an API key. However, there are ways to download artifacts without creating a full W&B run. | |
You can use the W&B Public API to download an artifact without initializing a run: | |
import wandb | |
api = wandb.Api() | |
artifact = api.artifact("user/project/artifact:latest") | |
local_path = artifact.download() | |
This method still requires authentication, usually through an API key. | |
If you need to download artifacts without any authentication, you might need to explore alternative methods or contact our support team for more specific guidance based on your use case. """, | |
}, | |
{ | |
"index": 30, | |
"question": """I'm using a free account. How can I share the my existing personal project with others? Your past instruction "Move your personal project to the team by going to the project's settings page and changing the "Owner" to the team you created." didn't work. Actually, I could not find the bottom. Could you please show me the way of sharing my existing project with others step by step?""", | |
"answer": """With a free account, you can share your personal project by making it public or sharing a view-only link: | |
To make your project public: | |
Go to your project page in the W&B web app | |
Click the lock icon in the navigation bar | |
Select "Public" from the privacy settings | |
Save your changes | |
If you prefer limited sharing, you can share specific content through reports: | |
Create a report from your project | |
Click the "Share" button in the upper right corner | |
Either enter an email address or copy the magic link | |
People with the magic link can view your report without logging in | |
Note that if the "Public" option isn't available due to account restrictions, sharing through reports is your best option """, | |
}, | |
{ | |
"index": 31, | |
"question": """Hello, I was able to run and automatically upload my results to wandb yesterday but is unable to do this today (received error of Network error(ReadTimeout). Then i tried the wandb offline command and ran my codes successfully, but encountered Network error (ConnectTimeout) while trying to sync it to wandb online (this was a few hours later since the ReadTimeout error appeared). Was wondering what is the issue here...can anyone help?""", | |
"answer": """These network timeout errors typically occur when there are connectivity issues between your machine and our servers. | |
When you experience connection problems: | |
For immediate progress: Continue using wandb offline mode as you've been doing. This stores all your metrics locally without requiring an internet connection. | |
To sync later: Once your internet connection is stable, use the command wandb sync DIRECTORY where DIRECTORY is the path to your offline run data. This will upload your stored metrics to our servers. | |
The W&B client automatically tries to reconnect when internet issues occur, but persistent network problems (like firewalls, unstable connections, or proxy issues) can cause these timeout errors. | |
I understand these connection issues can be frustrating when you're trying to track your experiments. Let me know if syncing your offline runs works once your connection stabilizes. """, | |
}, | |
{ | |
"index": 32, | |
"question": """what firewall addresses and ports are needed to use wandb?""", | |
"answer": """I apologize, but the provided sources don't contain specific information about firewall addresses and ports needed to use Weights & Biases. However, I can provide some general guidance on how to start using W&B. | |
To begin using Weights & Biases, you'll need to: | |
Sign up for a W&B account at wandb.ai/site. | |
Install the W&B library using pip: pip install wandb. | |
Log in to W&B in your Python environment using wandb.login(). | |
For local usage, W&B typically communicates over HTTPS (port 443) to send data to their servers. If you're running a local W&B server, you can specify a custom port using the -p or --port option when starting the server. | |
To provide more specific information about firewall settings, could you tell me more about your setup? Are you using W&B in a corporate environment with strict firewall policies, or are you trying to set up a local W&B server? """, | |
}, | |
{ | |
"index": 33, | |
"question": """I have a question about exporting CSV files from a web panel. Whenever I do this, I always get two extra columns for MAX and MIN values, even if I only have one data curve. Does anyone know how to solve this issue?""", | |
"answer": """The extra MAX and MIN columns appear because of the default point aggregation settings in the web panel. These columns represent the minimum and maximum values for each data point bucket. | |
You can control this behavior by adjusting the min/max rendering settings in your line plots: | |
In your line plot visualization, look for the point aggregation settings | |
Change the min/max display option to "Never" instead of "Always" or "On hover" | |
This will ensure that when you export your CSV, only the aggregated line values will be included without the additional MIN and MAX columns. | |
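Alternatively, you can pull the raw logged history with the public API and write your own CSV; a minimal sketch (the run path is a placeholder):
import wandb
api = wandb.Api()
run = api.run("your-entity/your-project/run_id")
history_df = run.history()  # sampled DataFrame of logged values, without MIN/MAX columns
history_df.to_csv("run_history.csv", index=False)
# for every logged row rather than a sample, iterate run.scan_history() instead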
If you need to export the complete run history without any aggregation, you can also download the run history artifact using the Parquet format through the API. """, | |
}, | |
{ | |
"index": 34, | |
"question": """I'd like to attach some images to my experiment run. I generate the images on a different host machine, but using the outputs of the current training run. How can I get this host machine to log images to the run that is happening on a different machine? I was thinking about sharing the run_id, but the docs seem to warn against this and recommend using groups. Groups seem like overkill for just needing to post images to the same run. Any suggestions?""", | |
"answer": """You can safely log images to an existing run from a different machine by using the run ID with the wandb.init() function and the mode="shared" setting. Here's how: | |
import wandb | |
# Initialize a run on the second machine using the run ID from your first machine | |
run = wandb.init( | |
id="YOUR_EXISTING_RUN_ID", # Replace with your actual run ID | |
settings=wandb.Settings(mode="shared", x_primary=False) | |
) | |
# Now you can log images to this run | |
run.log({"example_image": wandb.Image("path/to/your/image.jpg")}) | |
This approach is actually recommended for distributed training scenarios where you need to log from multiple machines to the same run. | |
The mode="shared" setting is specifically designed for this use case, allowing you to safely connect to and log to the same run from different machines. """, | |
}, | |
{ | |
"index": 35, | |
"question": """how would i pass the trained model name as metadata wandbtracker which is the module used in my training script, so that i can pass it accelerate launch args?""", | |
"answer": """To pass the trained model name as metadata to wandbtracker in your training script and include it in the accelerate launch arguments, you can use the init_trackers method from Accelerate. Here's how: | |
In your training script, use accelerator.init_trackers() to start logging with Weights & Biases. | |
Pass your model name and other metadata as part of the config parameter: | |
accelerator.init_trackers( | |
project_name="your_project_name", | |
config={ | |
"model_name": "your_model_name", | |
"other_metadata": "value" | |
} | |
) | |
This will make the model name and other metadata available in your W&B run. | |
Remember to call accelerator.end_training() when you're done to properly finish logging. """, | |
}, | |
{ | |
"index": 36, | |
"question": """I am new to wandb and am just trying to get started. The full blown courses are a bit to hefty for me so i just looked for some quick tutorial of the basic functionalities. | |
I am already failing at the .init function. When calling | |
``` | |
wandb.init(project='test') | |
``` | |
I get the following error: | |
``` | |
Problem at: /tmp/ipykernel_11856/3188084774.py 1 <module> | |
Traceback (most recent call last): | |
File "/home/sebastian/scripts/wandb tutorial/env/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 1144, in init | |
run = wi.init() | |
File "/home/sebastian/scripts/wandb tutorial/env/lib/python3.10/site-packages/wandb/sdk/wandb_init.py", line 773, in init | |
raise error | |
wandb.errors.CommError: Error communicating with wandb process, exiting... | |
For more info see: https://docs.wandb.ai/guides/track/tracking-faq#initstarterror-error-communicating-with-wandb-process- | |
wandb: ERROR Abnormal program exit | |
``` | |
How can i resolve this?""", | |
"answer": """This error indicates that our library is having trouble launching the process that syncs data to the server. You can resolve this by modifying your initialization code. | |
For Linux and OS X environments, try: | |
wandb.init(project='test', settings=wandb.Settings(start_method="fork")) | |
If you're using a version prior to 0.13.0, use: | |
wandb.init(project='test', settings=wandb.Settings(start_method="thread")) | |
This changes how we start the background process that handles communication with our servers, which often fixes this particular error """, | |
}, | |
{ | |
"index": 37, | |
"question": """how do i delete all data associated with a specific personal account?""", | |
"answer": """To delete all data associated with your personal account, you need to delete your user account completely: | |
Go to your user settings page | |
Scroll to the bottom of the page | |
Click the Delete Account button | |
This action is irreversible and takes effect immediately, removing all data associated with your personal account. | |
Please note that this is different from deleting an organization account, which requires contacting our support team. """, | |
}, | |
{ | |
"index": 38, | |
"question": """what does this error indicate: AttributeError: 'WandbLogger' object has no attribute 'log_image'? I'm using Wandb logger to log some images inside the validation step of my model. Here is the code that I have used: | |
`if batch_idx % 100 == 0: | |
grid = torchvision.utils.make_grid(front_img) | |
imgs = [i for i in front_img[:8]] | |
wandb_logger.log_image(key='front_image', image=imgs)` | |
When I remove this line Wandb logs all the other parameters. | |
But when I tried to log the images I get the error indicated above. The documentation says there is a `log_image` attribute in `WandbLogger`.""",
"answer": """The error occurs because you're likely using an older version of PyTorch Lightning where the WandbLogger implementation doesn't include the log_image method. | |
There are two ways to fix this: | |
Update your PyTorch Lightning version to access the newer WandbLogger with the log_image method. | |
Use the direct approach through trainer.logger.experiment.log() instead: | |
trainer.logger.experiment.log({ | |
"front_image": [wandb.Image(img) for img in front_img[:8]] | |
}) | |
This alternative approach accesses the underlying wandb run object directly and works across all PyTorch Lightning versions. """, | |
}, | |
{ | |
"index": 39, | |
"question": """My runs table went wrong. It showed "There was a problem rendering these panels. Click this button to reset your workspace to the default layout. This preserves runs, but custom charts will be lost." I clicked the button "Reset my workspace" It didn't work, specifically, nothing happened. I cannot take a look at my tables now. How to solve this problem?""", | |
"answer": """I understand this is frustrating when your workspace isn't responding properly to reset attempts. Let me help you resolve this issue. | |
Instead of using the "Reset my workspace" button that didn't work, try these alternative steps: | |
At the top of your workspace, click the action menu (three dots "...") | |
Click Reset workspace | |
If that doesn't work, you can also try adjusting your workspace settings: | |
Click the gear icon near the top of your workspace | |
Select Workspace layout to modify how your workspace is structured | |
Try toggling options like "Hide empty sections during search" or "Sort panels alphabetically" | |
These steps should help restore access to your runs table and resolve the rendering issue. If you continue experiencing problems, please share the exact error message you're seeing for more specific help """, | |
}, | |
{ | |
"index": 40, | |
"question": """when inserting wandb config and log code into python training script, at what point in the training code should i place the wandb.finish() code?""", | |
"answer": """You should place the wandb.finish() code at the end of your Python training script. This ensures that all your data is logged and uploaded before the script exits. | |
Here's a basic structure for your script: | |
Start with wandb.init() at the beginning of your script. | |
Add your training code and wandb.log() calls throughout the script. | |
Place wandb.finish() at the very end of your script. | |
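A minimal sketch of that overall shape (the project name and training step below are placeholders):
import wandb
def train_one_epoch():
    return 0.0  # stand-in for your real training code
run = wandb.init(project="my-project")
for epoch in range(10):
    loss = train_one_epoch()
    run.log({"train_loss": loss, "epoch": epoch})
run.finish()  # called once, after all logging is done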
This approach helps prevent your training job from hanging at the end and ensures all data is properly synced. Remember, if you don't call wandb.finish(), the run will automatically end when your script exits, but it's a good practice to include it explicitly. """, | |
}, | |
{ | |
"index": 41, | |
"question": """how can I save codes into artifacts? I have tried the methods offered here: | |
https://docs.wandb.ai/guides/app/features/panels/code | |
https://docs.wandb.ai/ref/python/run#log_code | |
But I can only save codes from the file calling `wandb.init()`, I still cannot save other files. Is there a case to show how to really do this?""", | |
"answer": """We offer several ways to save code files other than the one calling wandb.init() into artifacts: | |
Use wandb.run.log_code(".") after wandb.init(): | |
import wandb | |
wandb.init() | |
wandb.run.log_code(".") | |
This saves all Python files in the current directory and subdirectories. | |
Use code_dir in settings: | |
import wandb | |
wandb.init(settings=wandb.Settings(code_dir=".")) | |
This also captures all Python files in the current directory and subdirectories. | |
Create a code artifact: | |
code_artifact = wandb.Artifact(type="code") | |
code_artifact.add_file("./train.py") | |
wandb.log_artifact(code_artifact) | |
This allows you to add specific files to the artifact. | |
For more control over which files to save, check our reference docs for log_code(). | |
Was that helpful? | |
""", | |
}, | |
{ | |
"index": 42, | |
"question": """Is it possible to embed a Weights & Biases plot inside a webpage without using a report?""", | |
"answer": """Yes, it's possible to embed Weights & Biases plots inside a webpage without using a full report. We offer a few ways to do this: | |
Use the to_html() method: You can generate HTML containing an iframe that displays a specific plot. | |
Create a custom embed: You can use an iframe to embed a specific plot or panel directly in your webpage. | |
Here's a simple example of how to embed a plot using an iframe: | |
<iframe src="YOUR_PLOT_URL" style="border:none;height:400px;width:100%"></iframe> | |
Replace "YOUR_PLOT_URL" with the direct URL to your W&B plot. You can adjust the height and width as needed. | |
For more detailed options or help with specific plot types, let me know and I'll be happy to provide more information. """, | |
}, | |
{ | |
"index": 43, | |
"question": """If `wandb.markdown` is an actual method like you say, then why do I get the error "`AttributeError: module 'wandb' has no attribute 'markdown'`"?""", | |
"answer": """This error occurs because there is no wandb.markdown method in the Weights & Biases Python package. The error AttributeError: module 'wandb' has no attribute 'markdown' indicates that you're trying to use a function that doesn't exist in the wandb module. | |
This type of error typically happens when: | |
You have a local directory named "wandb" in your working directory that's being imported instead of the actual wandb package | |
Your wandb installation is corrupted | |
To fix this issue: | |
pip uninstall wandb; rm -rI wandb; pip install wandb | |
This command will uninstall wandb, remove any local wandb directory, and reinstall the package properly. """, | |
}, | |
{ | |
"index": 44, | |
"question": """How do I authenticate with W&B when doing training on a cloud service? Does every user using the service to train have to authenticate personally even when we have a team account? Only the training code runs on the cloud server, the user is not logged in to the server machines. The user only uses a client application to initiate the training. How will the server authenticate?""", | |
"answer": """When running training on a cloud server where users aren't logged in, you can use a team service account for authentication. This allows you to log runs without individual user authentication. | |
To set this up: | |
Configure the team in your training environment. | |
Use a service account from that team to log runs. | |
If you want to attribute runs to specific users, set the WANDB_USERNAME or WANDB_USER_EMAIL environment variable. The user must be part of the team. | |
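As a sketch of what the training job itself can do (the names are placeholders, and the service-account API key would be injected as a secret by your launcher rather than hard-coded):
import os
import wandb
os.environ["WANDB_API_KEY"] = "<service-account-api-key>"  # placeholder; inject via your secrets mechanism
os.environ["WANDB_USERNAME"] = "initiating-user"  # optional: attribute the run to a team member
wandb.login()
run = wandb.init(entity="your-team", project="your-project")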
For SageMaker specifically, you can use wandb.sagemaker_auth(path="source_dir") to generate a secrets.env file with authentication details. W&B will automatically load this when wandb.init() is called. | |
To confirm which features you can access, could you please let me know if you're on a Free, Pro, or Enterprise plan? """, | |
}, | |
{ | |
"index": 45, | |
"question": """I would like to finish a run and remove all hooks that may be attached to my torch modules and remove any other affects that wandb might still has on my code after calling wand.finish(). How do i do that?""", | |
"answer": """To ensure all Weights & Biases hooks and effects are removed from your torch modules and code after calling wandb.finish(), you can follow these steps: | |
Call wandb.finish() at the end of your script to properly close the run. | |
If you've used wandb.watch() on your model, the hooks are automatically removed when the run ends. | |
For any custom logging or tracking, make sure to stop these processes before calling wandb.finish(). | |
After wandb.finish(), you can create a new model instance without W&B integration if needed. | |
If you're still experiencing issues, please share the exact error message or unexpected behavior you're seeing. This will help us provide more specific guidance """, | |
}, | |
{ | |
"index": 46, | |
"question": """how do I fix an error with wandb Table construction from pandas dataframe: TypeError: Data row contained incompatible types""", | |
"answer": """To fix the "TypeError: Data row contained incompatible types" error when creating a wandb Table from a pandas DataFrame, try these steps: | |
Check your DataFrame for mixed data types within columns. Ensure each column has consistent data types. | |
Use the allow_mixed_types parameter when creating the Table: | |
wandb.Table(dataframe=your_dataframe, allow_mixed_types=True) | |
This option disables type validation, allowing mixed types in columns. | |
If the error persists, review your DataFrame for any unexpected data types or null values. You may need to clean or preprocess your data before creating the Table. | |
Consider using the dtype parameter to explicitly set column data types when creating the Table. | |
If you're still facing issues, please share the exact error message and a sample of your DataFrame for more specific help. """, | |
}, | |
{ | |
"index": 47, | |
"question": """I started the Sweep on one machine, that is logged into a certain wandb account. I saved the sweep_id, and passed it on to another wandb account which also tried to run the same Sweep. I do this because I want multiple machines to run the same Sweep to aggregate their results. However, when trying to run the same Sweep as the first machine on the second with the set sweep_id, I get the error: | |
'wandb.errors.UsageError: could not find project project-name/uncategorized during createAgent'""",
"answer": """To run the same Sweep across multiple machines using different Weights & Biases accounts, each account must have access to the same project where the Sweep was created. The error 'could not find project project-name/uncategorized during createAgent' usually means the second account does not have access to the specified project or the project does not exist under that account. | |
Here’s how to avoid this issue: | |
Make sure the project exists under the account you are using to launch the agent. If it does not, create the project in that account or have the project owner invite the other account as a collaborator or team member. | |
When starting the agent, specify both the --entity (username or team name) and --project flags to match the original Sweep’s location. For example: | |
wandb agent --entity original-entity --project project-name sweep_id | |
If you do not specify the correct entity and project, the agent will default to the current account’s default entity and may not find the Sweep or project, resulting in the error you saw. | |
Each account running the agent must have permission to write to the project where the Sweep was initialized. If you need to aggregate results across accounts, ensure all accounts are part of the same team or have been granted access to the project. | |
Was that helpful? | |
""", | |
}, | |
{ | |
"index": 48, | |
"question": """I'm getting `Artifact <artifact_name> already exists with type <old_type>; cannot create another with type <new_type>` although I deleted the <artifact_name> with the <old_type> in the UI and I can't see it any more. what else do I need to do to be able to create an artifact with the same name but the other type?""", | |
"answer": """I understand you're having trouble creating an artifact with the same name but a different type. To resolve this, you'll need to delete the existing artifact completely, not just from the UI. Here's what you can do: | |
Use the artifact_exists() method to check if the artifact still exists in the backend. | |
If it does, use the delete() method on the artifact to remove it entirely. | |
After confirming deletion, create a new artifact with the desired name and type. | |
Here's a sample code snippet: | |
import wandb | |
api = wandb.Api() | |
if api.artifact_exists("your_artifact_name:latest"): | |
artifact = api.artifact("your_artifact_name:latest") | |
artifact.delete() | |
# Now create your new artifact | |
new_artifact = wandb.Artifact("your_artifact_name", type="new_type") | |
Remember to replace "your_artifact_name" and "new_type" with your specific values. If you're still having issues, please share the exact error message you're seeing """, | |
}, | |
{ | |
"index": 49, | |
"question": """I really like the random names generated by wandb for every new run, however I mostly deal with grouped runs (for kfold training). Although we can name the groups using a custom name, is it possible to use the same engine that wandb uses for naming the runs, to name the groups instead.""", | |
"answer": """While we don't have a built-in feature to automatically generate random names for run groups, you can create a custom solution using our run naming system. Here's how: | |
Generate a random name using wandb.util.generate_id(). | |
Use this name for your group when initializing runs: | |
import wandb | |
group_name = wandb.util.generate_id() | |
wandb.init(group=group_name) | |
This approach will give your groups similar random names to individual runs. For k-fold training, you could use the same group name for all folds in a single experiment. | |
Remember, you can always access the run's name during execution using wandb.run.name if needed. """, | |
}, | |
{ | |
"index": 50, | |
"question": """i am using init_kwargs with accelerator for main process in my training network script. how can i track notes in the accelerate launch args?""", | |
"answer": """To track notes in the accelerate launch arguments when using init_kwargs with accelerator, you can use the init_trackers method of the Accelerator object. Here's how: | |
Create an Accelerator object with log_with="wandb". | |
Use init_trackers to start logging, passing your project name and any notes or config information. | |
Include your notes in the config parameter of init_trackers. | |
Here's an example: | |
from accelerate import Accelerator | |
accelerator = Accelerator(log_with="wandb") | |
accelerator.init_trackers( | |
project_name="my_project", | |
config={ | |
"notes": "Your notes here", | |
"learning_rate": 1e-2 | |
}, | |
init_kwargs={"wandb": {"entity": "my-wandb-team"}} | |
) | |
This approach will log your notes along with other config information to your W&B run. | |
Is that what you were looking for? """, | |
}, | |
{ | |
"index": 51, | |
"question": """i think i might have a bug with logging distributed training, how can I do it with wandb?""", | |
"answer": """Logging Distributed Training with W&B You have two main approaches for logging distributed training experiments: | |
Track a single process (recommended for PyTorch DDP): | |
• Log only from the rank 0 process (leader/coordinator) | |
• For fastai, use if rank_distrib() == 0: before calling wandb.init() | |
Track multiple processes: | |
• Log each process separately (one run per process) | |
• Log all processes to a single run | |
Implementation Example | |
import wandb | |
from fastai.distributed import * | |
# Only initialize wandb on the main process | |
if rank_distrib() == 0:
    wandb.init("your_project")
    callbacks = [WandbCallback()]
else:
    callbacks = []
# Continue with your training | |
learn = vision_learner(dls, model, metrics=metrics, cbs=callbacks) | |
with learn.distrib_ctx(): | |
learn.fit(epochs)
For troubleshooting, check whether you're initializing wandb on all processes instead of just the main one, which can cause duplicate runs. """,
}, | |
{ | |
"index": 52, | |
"question": """Explain in detail how can I use wandb to train a LLM and is there any way to log prompts keeping them compatible with tools like LangChainAI, OpenAI, Huggingface Transformers, etc.""", | |
"answer": """To train a large language model (LLM) with Weights & Biases (W&B), you can use our integrations with popular frameworks like Hugging Face Transformers. Start by installing the required libraries and setting up your environment. For example, with Hugging Face, you can use scripts like run_glue.py and set the WANDB_PROJECT environment variable to track your training runs automatically. This setup logs metrics, model checkpoints, and artifacts directly to your W&B dashboard, making it easy to monitor and compare experiments. | |
For prompt logging and compatibility with tools like LangChain, OpenAI, and Hugging Face, W&B offers integrations and tracing tools. With LangChain.js, you can use the WandbTracer to log prompts and chain executions, ensuring compatibility and easy tracking. This approach allows you to capture prompt inputs, outputs, and metadata in a structured way, making your workflow compatible with other LLM tools and frameworks. | |
Additionally, W&B Weave provides advanced tracing, output evaluation, and cost tracking for LLM workflows, supporting prompt logging and analysis. This is especially useful for prompt engineering and optimization tasks, and it works well with frameworks like DSPy and LangChain. | |
For more detailed, step-by-step guides and best practices, you can explore our AI Academy courses on LLM training, prompting, and app development, which cover practical integrations and logging strategies. """, | |
}, | |
{ | |
"index": 53, | |
"question": """who does weights and biases sell to?""", | |
"answer": """Weights & Biases offers different pricing plans tailored to various customers: | |
Free tier: For individual users or small teams getting started | |
Pro plan: For professional teams needing more advanced features | |
Enterprise license: For organizations requiring custom roles, fine-grained access control, and advanced security features | |
Enterprise customers can access features like: | |
Custom roles with specific permissions | |
Viewer roles for broader audience sharing | |
Model registry admin roles | |
Dedicated Cloud deployments on AWS and Azure | |
The platform is designed for AI developers, machine learning practitioners, and organizations building and deploying ML models and generative AI applications. """, | |
}, | |
{ | |
"index": 54, | |
"question": """What production-grade features are not available in a free personal license for W&B server?""", | |
"answer": """ """, | |
}, | |
{ | |
"index": 55, | |
"question": """how does Weights & Biases know that the actual dataset in S3 has not changed even though the version of the object has been updated in S3, thereby preventing Weights & Biases from creating a new version of the artifact ?""", | |
"answer": """Based on the documentation, several production-grade features require an Enterprise license and are not available with a free personal license for W&B Server: | |
Important security capabilities and enterprise-friendly functionality | |
Advanced security features like HIPAA compliance, Single Sign On (SSO), and Customer Managed Encryption Keys (CMEK) | |
Viewer role functionality - which allows sharing with broader audiences while limiting permissions | |
Model registry admin role - for controlling model promotion processes | |
Ability to enforce username and full name using OIDC claims from SSO providers | |
Option to disable personal projects to ensure all projects are created within teams | |
Private connectivity features (for Dedicated Cloud on GCP or Azure) | |
To access these features, you would need to obtain an Enterprise license """, | |
}, | |
{ | |
"index": 56, | |
"question": """where can I find a schema for the yaml or json definition of a sweep plan""", | |
"answer": """You can find examples and templates for sweep configuration in YAML or JSON format in our documentation. | |
The basic structure uses key-value pairs with these top-level keys: | |
program: Your training script | |
method: Search strategy (bayes, random, grid) | |
metric: What to optimize | |
parameters: Hyperparameters to search through | |
early_terminate: Optional stopping criteria | |
Here's a template example: | |
program: <insert>
method: <insert>
parameters:
  hyperparameter_name0:
    value: 0
  hyperparameter_name1:
    values: [0, 0, 0]
  # More parameters...
For complete documentation on all available options, check our "Sweep configuration options" guide which details all supported keys and parameter constraints """, | |
}, | |
{ | |
"index": 57, | |
"question": """please explain the use of datum when constructing custom charts. I would like an explanation of when to explicitly use `datum` in custom charts. The custom line chart provided by wandb uses datum.""", | |
"answer": """The provided sources don't specifically mention the use of datum in custom charts for Weights & Biases. However, I can explain how custom charts work in W&B based on the available information. | |
In W&B, you can create custom charts by logging data and then visualizing it. Here's a basic process: | |
Log data using wandb.Table() or other logging methods. | |
Use wandb.plot.line() or similar functions to create custom charts. | |
Customize the chart using the W&B UI, where you can map data to chart fields. | |
For line charts, you can add custom calculated expressions to the panel, including Y Axis Expressions and X Axis Expressions. | |
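For reference, a minimal sketch of that logging flow (the project name, keys, and values below are placeholders, not tied to the datum syntax you asked about):
import wandb
run = wandb.init(project="custom-charts-demo")
# Log the data you want to plot as a Table
data = [[step, step ** 2] for step in range(10)]
table = wandb.Table(data=data, columns=["step", "value"])
# Create a line plot panel from the table
run.log({"my_line": wandb.plot.line(table, "step", "value", title="Value vs Step")})
run.finish()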
To better help you with the use of datum in custom charts, could you provide more context about where you've seen datum used in W&B custom charts? This would help me give you a more accurate answer. """, | |
}, | |
{ | |
"index": 58, | |
"question": """I try to create a table with the columns ["full_path", "audio", "speaker_ID", "sentence", "label" ]. Everything works fine, but the problem is that I have around 8k audiofiles. | |
I work using google colab, and during the process of the table creation (around half of the process duration) the system crashes - not enough RAM (12 Gb). Seems like it firstly loads all audio into the RAM and only then uploads it to the W and B. if it is already more than 12 Gb in RAM, the final table on the W and B will be the same size then?""", | |
"answer": """ """, | |
}, | |
{ | |
"index": 59, | |
"question": """How can I log my results in my team profile and not in my personal profile?""", | |
"answer": """When your Google Colab crashes due to RAM limitations while creating a table with audio files, the final table size in Weights & Biases would indeed be large, potentially exceeding your available RAM. | |
The issue you're experiencing is that all audio files are being loaded into memory before uploading to W&B. This approach isn't efficient for large datasets. | |
Instead, consider: | |
Adding data incrementally using table.add_data() for each audio file, which allows processing one file at a time | |
Using batching to process smaller chunks of your dataset | |
Streaming the data directly to W&B rather than loading everything into memory first | |
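A rough sketch of the incremental approach (column names follow your question; audio_files stands in for your own iterable of metadata records, and the project name is illustrative):
import wandb
run = wandb.init(project="speech-dataset")
table = wandb.Table(columns=["full_path", "audio", "speaker_ID", "sentence", "label"])
for full_path, speaker_id, sentence, label in audio_files:  # your own metadata records
    table.add_data(full_path, wandb.Audio(full_path), speaker_id, sentence, label)
run.log({"speech_table": table})
run.finish()
To stay under Colab's RAM limit, you could also split this into several smaller tables (for example, one per few hundred files) rather than a single 8k-row table.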
W&B's default storage plan is 100GB, so storage capacity shouldn't be an issue for your 8k audio files, but the RAM limitations in Colab are the bottleneck. | |
For large media datasets, consider using the WandbEvalCallback which helps reduce memory footprint by referencing data after it's uploaded. | |
Is that what you were looking for? | |
""", | |
}, | |
{ | |
"index": 60, | |
"question": """The movement is choppy when I drag either horizontally or vertically the wandb table. It seems that it takes a long time loading blocks of data. Is there a way to accomodate this?""", | |
"answer": """We understand you're experiencing slow performance when dragging the W&B table. To improve this, try reducing the number of rows displayed. This can help speed up loading times. | |
You can do this in two ways: | |
Use the "Per page" control at the bottom-right of the page. You can set it to 10, 25, 50, or 100 rows. | |
If you prefer, you can manually change the 'pageSize' parameter in the URL to a number below 100. | |
Could you try these steps and let us know if it helps? Also, it would be helpful if you could share more details about what you were doing when you noticed the choppiness, and any specific error messages you saw. """, | |
}, | |
{ | |
"index": 61, | |
"question": """I am encountering issues with using sweep. The dev_loss variable over which I want to fine tune my hyperparameters does not appear in the dashboard (or is considered as null) but I checked in my code that it is neither null nor nan and it has the correct type float.""", | |
"answer": """The issue is likely in how you're logging the dev_loss metric to Weights & Biases. When using sweeps, you need to explicitly log the metric that's being optimized. | |
If you're storing dev_loss in a dictionary (like metrics = {"dev_loss": loss_value}), make sure you're accessing the specific value when logging: | |
# Correct way | |
wandb.log({"dev_loss": metrics["dev_loss"]}) | |
# Incorrect way | |
wandb.log({"dev_loss": metrics}) | |
The incorrect method passes the entire dictionary to the "dev_loss" key instead of the actual float value, causing it to appear as null in the dashboard. | |
Also, ensure you're not logging metrics inside a subdirectory, as this can prevent the sweep from properly tracking your optimization metric. """, | |
}, | |
{ | |
"index": 62, | |
"question": """Is there an integration with PyTorch? Can you show me an example?""", | |
"answer": """Yes, Weights & Biases has a direct integration with PyTorch. Here's a simple example of how to use it: | |
# Import the library | |
import wandb | |
# Start a new experiment | |
wandb.init(project="new-sota-model") | |
# Track hyperparameters | |
wandb.config = {"learning_rate": 0.001, "epochs": 100, "batch_size": 128} | |
# Set up model and data | |
model, dataloader = get_model(), get_data() | |
# Optional: track gradients | |
wandb.watch(model) | |
for batch in dataloader:
    metrics = model.training_step()
    # Log metrics inside your training loop
    wandb.log(metrics)
# Optional: save model at the end | |
model.to_onnx() | |
wandb.save("model.onnx") | |
You can try this integration in a Colab notebook: Simple PyTorch Integration. """, | |
}, | |
{ | |
"index": 63, | |
"question": """I've used the following code to log a metric | |
`wandb.log({"metric": [4, 5, 6]})` | |
but then found out that Wandb doesn't support plotting a list by default. I want to create a line plot where the y-axis is the first element of the metric array, and the x-axis is the step. | |
I've read the Custom Charts section of the document. I think I should use Vega to access the first element of the array. Here are the steps that I've taken: For the custom chart, I've set the data source as "history" and selected the "metric" key. | |
`query { | |
runSets | |
(runSets: "${runSets}" ) { | |
id | |
name | |
history | |
(keys: ["metric" ] ) | |
} | |
}` | |
In the Vega script, I tried to flatten the array, using this part of the documentation | |
"`transform": { | |
... | |
{"type": "flatten", "fields": ["${field:metric}"]}, | |
}` | |
This gives me a warning that "type" and "fields" arguments are not allowed, which means I should include this flattening block somewhere else (not in the transform section). I'm afraid I don't know where, and how I can achieve this. Is this even possible? If not, I think in my notebook I should write a script that accesses the wandb.run log data, and transform the data for each run. if so, any tips for that solution is also appreciated.""", | |
"answer": """Instead of using Vega's flatten transform, I recommend using W&B's built-in plotting functions to create your line plot. Here's a simpler approach: | |
Extract the first element of your array when logging: | |
# Instead of: wandb.log({"metric": [4, 5, 6]}) | |
# Do this: | |
metric_array = [4, 5, 6] | |
wandb.log({"metric_first_element": metric_array[0]}) | |
Alternatively, use wandb.plot.line() to create a custom line plot: | |
# Create x-y pairs where x is step and y is first element | |
data = [[step, metric_array[0]] for step in steps] | |
table = wandb.Table(data=data, columns=["step", "first_element"]) | |
wandb.log({ | |
"first_element_plot": wandb.plot.line( | |
table, "step", "first_element", title="First Element vs Step" | |
) | |
}) | |
This approach gives you direct control over which array element to plot and avoids the complexity of Vega transformations """, | |
}, | |
{ | |
"index": 64, | |
"question": """How can I log molecules to W&B? Is there any limitation of logging molecules to W&B?""", | |
"answer": """You can log molecules to Weights & Biases using the wandb.Molecule class, which offers two main methods: | |
from_rdkit - Convert RDKit-supported files or objects: | |
molecule = wandb.Molecule.from_rdkit( | |
data_or_path, # file name or rdkit.Chem.rdchem.Mol object | |
caption="Optional description" | |
) | |
wandb.log({"molecule": molecule}) | |
from_smiles - Convert directly from SMILES strings: | |
molecule = wandb.Molecule.from_smiles( | |
"C1=CC=CC=C1", # SMILES string for benzene | |
caption="Benzene molecule" | |
) | |
wandb.log({"molecule": molecule}) | |
Both methods have options to convert molecules to 3D and optimize them, though this can be time-consuming for complex molecules. | |
There are no explicit row limits mentioned for molecules (unlike tables which have a 200,000 row limit), but complex molecules may take longer to process, especially with 3D conversion and optimization enabled. """, | |
}, | |
{ | |
"index": 65, | |
"question": """I want to create my own integration to properly track the outputs of a NLP large language model. How can I make it automatically log for me in a single line of code""", | |
"answer": """For logging NLP large language model outputs to Weights & Biases, we offer two main approaches: | |
If you're using Hugging Face Transformers, you can use our built-in integration: | |
from transformers.integrations import WandbCallback | |
trainer = Trainer( | |
model=model, | |
callbacks=[WandbCallback()] | |
) | |
For LangChain.js applications, you can use: | |
import {WandbTracer} from '@wandb/sdk/integrations/langchain'; | |
const wbTracer = await WandbTracer.init({project: 'my-llm-project'}); | |
chain.call({input: "My prompt"}, wbTracer) | |
These integrations automatically track model outputs, metrics, and configurations with minimal code. """, | |
}, | |
{ | |
"index": 66, | |
"question": """Can we start multiple long-running WANDB runs at the same time, withinin the same project?""", | |
"answer": """Yes, you can start multiple long-running Weights & Biases runs simultaneously within the same project. Our system allows you to create and manage multiple runs concurrently, which is useful for parallel experiments or distributed training. | |
To start multiple runs, you can use the wandb.init() function in your code for each run you want to create. Each call to wandb.init() will create a new run in your project. | |
For example: | |
import wandb | |
# Start first run | |
run1 = wandb.init(project="your_project_name") | |
# Start second run | |
run2 = wandb.init(project="your_project_name") | |
This approach allows you to track and compare multiple experiments or models simultaneously within the same project. """, | |
}, | |
{ | |
"index": 67, | |
"question": """For distributed parallel runs, how can I get the `run_id` when I'm on a process other than RANK 0?""", | |
"answer": """To get the run_id in a distributed parallel run from a process that's not RANK 0, you need to share the run_id from the main process (RANK 0) with other processes. Here's how you can do it: | |
In the main process (RANK 0), start the run and get the run_id: | |
if rank_distrib() == 0:
    run = wandb.init()
    run_id = run.id
Share this run_id with other processes using your distributed communication method (e.g., torch.distributed). | |
In non-RANK 0 processes, use the shared run_id to initialize the run: | |
if rank_distrib() != 0:
    wandb.init(id=run_id)
This approach ensures all processes use the same run_id, allowing you to track your distributed training in a single W&B run. """, | |
}, | |
{ | |
"index": 68, | |
"question": """how can i use the method add_named_result over a specific trace_tree that was initiatied by WandTracer ?""", | |
"answer": """The add_named_result method is not directly associated with the WBTraceTree object or WandTracer. The WBTraceTree is a media object for trace tree data, which consists of a root span and an optional model dictionary. | |
To help you better, could you provide more context about what you're trying to achieve with the add_named_result method? Are you working with a specific tracing library or framework alongside Weights & Biases? """, | |
}, | |
{ | |
"index": 69, | |
"question": """How to export a single chart's data using the API?""", | |
"answer": """To export a single chart's data using our API, you'll need to follow these steps: | |
Create an API key from your user profile. | |
Find the run path (entity/project/run_id) for the run containing your chart. | |
Use the wandb.Api() to access the run: | |
import wandb | |
api = wandb.Api() | |
run = api.run("entity/project/run_id") | |
Use run.history() or run.scan_history() to get the chart data: | |
# For sampled data (default 500 points) | |
history = run.history() | |
# For all data points | |
history = run.scan_history() | |
Filter the data for your specific chart metric: | |
chart_data = history[["_step", "your_metric_name"]] | |
Export the data as needed (e.g., to CSV): | |
chart_data.to_csv("chart_data.csv", index=False) | |
Remember to replace "entity/project/run_id" and "your_metric_name" with your specific values. """, | |
}, | |
{ | |
"index": 70, | |
"question": """How do I get an artifact's metadata without downloading the artifact ?""", | |
"answer": """You can access an artifact's metadata without downloading it using the W&B API. Here's how: | |
import wandb | |
# Option 1: Using the Public API | |
api = wandb.Api() | |
artifact = api.artifact("entity/project/artifact:alias") | |
metadata = artifact.metadata | |
The metadata property gives you access to the artifact's metadata dictionary without downloading the actual artifact files. | |
You can also use the artifactVersion-metadata operation in the Query Panel to return the metadata dictionary for an artifact version. | |
This approach lets you check the metadata information while saving bandwidth and storage space since you're not downloading the artifact's contents. """, | |
}, | |
{ | |
"index": 71, | |
"question": """How do I run multiple instances of a sweep across multiple GPUs, note that I didn't configure the sweep through a yaml file.""", | |
"answer": """To run multiple instances of a sweep across multiple GPUs without using a YAML file, you can define your sweep configuration as a Python dictionary and initialize the sweep with wandb.sweep(sweep_config). This returns a sweep_id. | |
To parallelize across GPUs, open a separate terminal for each GPU and set the CUDA_VISIBLE_DEVICES environment variable to the GPU index you want to use. Then, start a sweep agent in each terminal with the same sweep_id. For example: | |
# Terminal 1 (GPU 0) | |
CUDA_VISIBLE_DEVICES=0 wandb agent sweep_id | |
# Terminal 2 (GPU 1) | |
CUDA_VISIBLE_DEVICES=1 wandb agent sweep_id | |
This approach lets you run multiple sweep agents in parallel, each on a different GPU, without needing a YAML configuration file """, | |
}, | |
{ | |
"index": 72, | |
"question": """I have reproduced this error with Tensorflow 2.12 and 2.13 and wandb 0.15.4 and 0.15.5. In the case where the following two conditions hold, | |
wandb is imported before tensorflow and | |
A function decorated with @tf.function calls another function also decorated with @tf.function, | |
the following error occurs when running code from a Python script: | |
`[libprotobuf FATAL google/protobuf/message_lite.cc:353] CHECK failed: target + size == res: | |
libc++abi: terminating with uncaught exception of type google::protobuf::FatalException: CHECK failed: target + size == res: | |
zsh: abort python mfe.py` | |
Running from a Jupyter notebook the kernel simply crashes without an error message. | |
Removing the inner @tf.function or changing the order of imports resolves this error. | |
Here is a MFE: | |
`import wandb # switch order of imports to toggle error | |
import tensorflow as tf | |
@tf.function # comment this out to toggle error | |
def custom_score(data):
    pass
data = tf.random.uniform((100, 20, 24, 2), 0, 1) | |
train = tf.data.Dataset.from_tensor_slices(data).batch(50) | |
@tf.function | |
def train_step(data):
    custom_score(data)
tf.config.run_functions_eagerly(False) | |
tf.print("Start training") | |
train_step(next(iter(train))) | |
tf.print("Runs finished without error.")` | |
Could someone please enlighten me as to what is going on here. This error also occurs when calling an @tf.function function from a keras model's train_step() function, I assume because it is also an Autograph function. It is impractical for me to run custom_score() eagerly as it massively slows down training times.""", | |
"answer": """We understand this issue is frustrating, and we're here to help you resolve it. The error you're encountering is likely due to a compatibility issue between TensorFlow and the protobuf library used by Weights & Biases. | |
To resolve this without changing import order or running functions eagerly, you can try the following: | |
Update both TensorFlow and Weights & Biases to their latest versions. | |
Use the TensorFlow estimator hook provided by Weights & Biases: | |
import tensorflow as tf | |
import wandb | |
wandb.init(config=tf.flags.FLAGS) | |
estimator.train(hooks=[wandb.tensorflow.WandbHook(steps_per_log=1000)]) | |
This approach allows more control over logging and may bypass the protobuf issue. | |
Can you provide more details about your specific use case and why changing the import order isn't feasible for your project? """, | |
}, | |
{ | |
"index": 73, | |
"question": """is there a good way to join 2 tables together programmatically?""", | |
"answer": """You can join two tables programmatically in Weights & Biases using wandb.JoinedTable(). Here's how: | |
import wandb | |
# Initialize your run | |
run = wandb.init(project="my_project") | |
# Get your tables (either from artifacts or create them) | |
table_1 = run.use_artifact("table1:latest").get("samples") | |
table_2 = run.use_artifact("table2:latest").get("samples") | |
# Join tables on a common key | |
joined_table = wandb.JoinedTable(table_1, table_2, "join_key") | |
# Log the joined table to an artifact | |
joined_artifact = wandb.Artifact("joined_tables", "analysis") | |
joined_artifact.add(joined_table, "joined_data") | |
run.log_artifact(joined_artifact) | |
The join_key parameter specifies which column to use for matching rows between tables. | |
You can also join tables directly in queries using the join operation with selectors for determining which columns to join on """, | |
}, | |
{ | |
"index": 74, | |
"question": """This is a very quick and basic question, that I don't think it even qualifies as "support": In the Runs view / Workspace, what button do we push to hide the sidebar with the list of all the run names? I'm on my iPad and I need more screen real estate to see my graphs. (If I were on desktop I could grab the boundary of the sidebar and move it, but on mobile that's not an option)""", | |
"answer": """To hide the sidebar with the list of run names in the Runs view/Workspace on your iPad, you can use the expand button. This button allows you to collapse the sidebar, giving you more screen space for your graphs. | |
Look for the expand button near the top of the sidebar. Clicking this will collapse the sidebar, expanding the main view area where your graphs are displayed. This should help you see your graphs more clearly on your iPad's screen. | |
If you need to bring the sidebar back later, you can use the same button to expand it again. """, | |
}, | |
{ | |
"index": 75, | |
"question": """Write Python code to loop through all Tables within a run and download each table directly to a Pandas DataFrame""", | |
"answer": """ """, | |
}, | |
{ | |
"index": 76, | |
"question": """what is a weave query expression that can calculate the median of a column of numbers in a wandb Table""", | |
"answer": """Here's how to loop through all Tables in a W&B run and download each to a Pandas DataFrame: | |
import wandb | |
import pandas as pd | |
# Access your run | |
api = wandb.Api() | |
run = api.run("<entity>/<project>/<run_id>") | |
# Get all logged data including tables | |
history = run.scan_history() | |
# Dictionary to store DataFrames | |
table_dataframes = {} | |
# Loop through history to find tables | |
for row in history:
    for key, value in row.items():
        if isinstance(value, wandb.data_types.Table):
            # Convert the table to a DataFrame
            df = pd.DataFrame(
                data=value.data,
                columns=value.columns
            )
            table_dataframes[key] = df
            print(f"Downloaded table: {key}")
# Now table_dataframes contains all tables as pandas DataFrames | |
This code gets all logged data from your run, identifies tables, and converts each to a DataFrame using the table's data and columns. """, | |
}, | |
{ | |
"index": 77, | |
"question": """example of login and authentication with sagemaker estimator train step""", | |
"answer": """To use W&B with a SageMaker Estimator during the train step, you need to set up authentication. Here's how to do it: | |
Create a requirements.txt file in your source directory and include wandb. | |
In your training script, use wandb.sagemaker_auth(path="source_dir") to generate a secrets.env file. This file will contain your W&B credentials. | |
Add the secrets.env file to your .gitignore to keep your credentials secure. | |
In your training script, call wandb.init(). This will automatically load the credentials from the secrets.env file. | |
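Put together, a rough sketch of the steps above (directory, file, and project names are illustrative):
# On the machine where you define the SageMaker Estimator
import wandb
wandb.sagemaker_auth(path="source_dir")  # writes source_dir/secrets.env with your W&B credentials
# Inside the training script that lives in source_dir (e.g. train.py)
import wandb
run = wandb.init()  # picks up credentials from secrets.env
# ... training code and wandb.log(...) calls ...
run.finish()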
Remember, if you're using a built-in SageMaker estimator, these steps are essential for proper authentication. This setup will help you start logging your SageMaker training runs with W&B. """, | |
}, | |
{ | |
"index": 78, | |
"question": """what are good practices for optimizing tracking of large data volumes?""", | |
"answer": """When tracking large data volumes with W&B, here are the best practices: | |
Use reference artifacts instead of uploading files directly to W&B. This lets you track files stored externally (S3, GCS, Azure, or file systems) while only storing metadata in W&B. | |
For cloud storage tracking: | |
artifact = wandb.Artifact("dataset-name", type="dataset") | |
artifact.add_reference("s3://my-bucket/datasets/my-data") | |
run.log_artifact(artifact) | |
For filesystem references: | |
artifact.add_reference("file:///path/to/data/") | |
You can adjust the file limit with max_objects= parameter. | |
Keep run counts under recommended limits (100,000 on SaaS Cloud, 10,000 on Dedicated/Self-managed) to maintain performance. | |
Consider moving older runs to "archive" projects to keep your working projects lean. | |
Set TTL (time-to-live) policies on artifacts for automatic deletion of older data. """, | |
}, | |
{ | |
"index": 79, | |
"question": """How do I specify what dir to cache artifacts?""", | |
"answer": """By default, artifacts in Weights & Biases download to the artifacts/ folder. You have two options to change this location: | |
Pass the directory path directly to the download method: | |
artifact.download(root="your/custom/directory") | |
Set the WANDB_ARTIFACT_DIR environment variable to your preferred location. | |
This gives you control over where artifact files are stored on your system """, | |
}, | |
{ | |
"index": 80, | |
"question": """In Prompts, how can I resize the Trace Timeline to make it bigger or full screen?""", | |
"answer": """To make the Trace Timeline panel bigger or view it in full screen, hover over the panel and click the action menu (the ... icon). Then, select the full-screen button, which looks like a viewfinder or an outline showing the four corners of a square. This will expand the panel to full screen for easier viewing. To return to the regular workspace, click the left-pointing arrow at the top of the page. """, | |
}, | |
{ | |
"index": 81, | |
"question": """I keep getting `Error in callback <bound method _WandbInit._pause_backend of <wandb.sdk.wandb_init._WandbInit object at 0x7fcee805af50>>` | |
```TypeError Traceback (most recent call last) | |
TypeError: _WandbInit._pause_backend() takes 1 positional argument but 2 were given | |
``` | |
when calling `wandb.init()`. It was working ok earlier. I get this with versions 0.15.12 and 0.15.10. | |
How can I resolve this?""", | |
"answer": """I understand this error can be frustrating when you're trying to run your code. This appears to be an issue with how the wandb process is being initialized. | |
Try modifying your wandb.init() call with a different start method: | |
wandb.init(settings=wandb.Settings(start_method="fork")) | |
If you're using a version older than 0.13.0, use this instead: | |
wandb.init(settings=wandb.Settings(start_method="thread")) | |
This should resolve the TypeError by changing how the background process that handles data synchronization is launched. """, | |
}, | |
{ | |
"index": 82, | |
"question": """I have a question about sweeps. How can you constrain relationship between parameters. For example, I now that if `num_layers * hidden_dim` is large, I'll run out of GPU memory. So, also I would like to explore some hyperparameter space, there are some combination I know will fail. `optuna` as a way to do that: you can throw an special exception to cancel a run during a sweep, so that it is not recorded. Is there something similar in W&B, or another way of pruning unwanted combination of hyperparameters?""", | |
"answer": """We don't have a direct equivalent to Optuna's run cancellation feature, but we offer ways to manage hyperparameter relationships in sweeps. You can define parameter constraints in your sweep configuration to avoid unwanted combinations. | |
For example, you can use the parameters key in your sweep config to specify valid ranges or distributions for each hyperparameter. This helps prevent combinations that might cause memory issues. | |
Additionally, we've recently added Optuna as a sweeps scheduler option in our Launch feature. This allows for more efficient exploration of hyperparameters, which might help address your concern about pruning unwanted combinations. | |
If you need more control, consider implementing a check in your training function that exits early when a combination is known to be infeasible (for example, when num_layers * hidden_dim exceeds your GPU memory budget), logging the result before ending the run, as in the sketch below.
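For illustration, a minimal sketch of such a guard in the function your sweep agent calls (MAX_UNITS and the parameter names are assumptions for your setup):
import wandb
MAX_UNITS = 1_000_000  # assumed budget; pick a value that fits your GPU memory
def train():
    run = wandb.init()
    cfg = wandb.config
    if cfg.num_layers * cfg.hidden_dim > MAX_UNITS:
        run.log({"skipped": True})  # mark the combination as skipped
        run.finish()
        return
    # ... normal training loop with wandb.log(...) calls ...
    run.finish()
""",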
}, | |
{ | |
"index": 83, | |
"question": """Hello, I am trying to save the gradients as I would with WandbCallback, but with a custom callback: | |
Is this correct? | |
class GradientClipperCallback(tf.keras.callbacks.Callback):
    def __init__(self, model):
        self.model = model
    def on_epoch_end(self, epoch, logs=None):
        trainable_vars = self.model.trainable_variables
        with tf.GradientTape() as tape:
            y_pred = self.model(self.model.inputs, training=False)
        gradients = tape.gradient(y_pred, trainable_vars)
        wandb.log({"gradients": gradients}, step=epoch)""",
"answer": """Your implementation needs some adjustments to correctly log gradients. The WandbCallback has built-in functionality for gradient logging that you can use instead of creating a custom implementation. | |
Here's how to log gradients with the official WandbCallback: | |
import wandb | |
from wandb.integration.keras import WandbCallback | |
# Initialize wandb | |
wandb.init(project="your-project-name") | |
# Set up your model | |
# ... | |
# Use WandbCallback with gradient logging enabled | |
model.fit(
X_train, | |
y_train, | |
validation_data=(X_test, y_test), | |
callbacks=[WandbCallback( | |
log_gradients=True, | |
training_data=(X_train, y_train) # Required for gradient logging | |
)] | |
) | |
The key parameters are log_gradients=True and providing your training_data which is needed for calculating the gradients """, | |
}, | |
{ | |
"index": 84, | |
"question": """Can you introduce me wandb? I'm a beginner.""", | |
"answer": """Weights & Biases (W&B) is an AI developer platform that helps you track, visualize, and manage machine learning experiments. | |
As a beginner, here's what you need to know: | |
What W&B does: | |
Tracks machine learning experiments and metrics | |
Visualizes model performance | |
Helps with hyperparameter tuning | |
Lets you save and share models | |
Getting started: | |
Sign up for a free account at wandb.ai | |
Install the library: pip install wandb | |
Add a few lines to your code: | |
import wandb | |
wandb.init(project="my-first-project") | |
# Your ML code here | |
wandb.log({"accuracy": 0.9}) | |
W&B works with popular frameworks like PyTorch, Keras, and scikit-learn. The basic workflow involves initializing a run, tracking metrics during training, and viewing results in the web dashboard. | |
Start with simple experiment tracking, then explore more advanced features as you grow comfortable with the platform. """, | |
}, | |
{ | |
"index": 85, | |
"question": """How can I stop a run in the web interface?""", | |
"answer": """I apologize, but I don't have specific information about stopping a run directly from the Weights & Biases web interface. However, I can provide some general guidance on managing runs. | |
Typically, runs are controlled from your local environment where the code is executed. To stop a run, you would usually interrupt the process on your machine (e.g., using Ctrl+C in the terminal). | |
If you need to manage runs that have already finished or crashed, you can use the web interface to view and organize them. The web interface allows you to compare runs, view metrics, and analyze results. | |
Can you provide more details about what you're trying to achieve? Are you looking to stop an ongoing run, or do you need to manage completed runs in some way? """, | |
}, | |
{ | |
"index": 86, | |
"question": """Can public cloud customers make W&B reports accessible to their colleagues without W&B access?""", | |
"answer": """Yes, public cloud customers can share Weights & Biases reports with colleagues who don't have W&B accounts. Team admins can enable magic link sharing, which allows easy access to reports for people outside the ML team. | |
To share a report: | |
Create the report using the W&B Reports feature | |
Use the sharing options to generate a magic link | |
Send this link to your colleagues | |
When colleagues receive the link, they can view the report after a simple sign-up process. This makes it easy to share findings throughout your organization, even with those who aren't regular W&B users.
Keep in mind that team admins can also disable magic link sharing for increased privacy if needed. """, | |
}, | |
{ | |
"index": 87, | |
"question": """what are best practices for logging artifacts that optimally wont consume all of my storage.""", | |
"answer": """To manage artifact storage efficiently in Weights & Biases: | |
Split artifacts into multiple files when it makes sense, which helps with scaling. | |
Before downloading artifacts, W&B checks if the right version is available locally using hashing technology (similar to torrenting and git), which prevents duplicate downloads and saves storage. | |
Use the "Manage artifact storage and memory allocation" feature to control how artifacts use your storage. | |
Consider implementing time-to-live (TTL) policies for your artifacts to automatically manage data retention. | |
You can control where artifacts are downloaded by checking the documentation on "Where are artifacts downloaded, and how can I control that?" """, | |
}, | |
{ | |
"index": 88, | |
"question": """In the Runs view / Workspace, what button do we push to hide the sidebar with the list of all the run names?""", | |
"answer": """In the Runs sidebar of the project, you can click the table icon in the upper-right corner to toggle the sidebar view. When the sidebar is expanded, this icon collapses it back to hide the full runs table. | |
The Runs Sidebar in the Workspace tab shows a list of all runs in your project. You can use this toggle feature to minimize the sidebar when you need more space to view your visualizations or other project data. """, | |
}, | |
{ | |
"index": 89, | |
"question": """I am logging some metrics using wandb.log method and I have runs grouped by Group. Within that group each run has the metrics logged. While analyzing the runs in the table format on the UI previously before some days it would show average metric of the runs within the group next to the group name, but it's showing basically nothing now. Has anything changed in wandb ?""", | |
"answer": """We're sorry to hear you're experiencing this change in the UI. It seems there might have been an update that affected how group metrics are displayed. To help you better, could you please provide more details? | |
When did you last see the average metric next to the group name? | |
Are you using any specific grouping method (e.g., wandb.run.group)? | |
Can you share a screenshot of what you're currently seeing? | |
Recent updates have expanded grouping features, including the ability to create new Groups after initial logging and dynamic grouping by columns in the UI. These changes might have affected how metrics are displayed for groups. | |
We'd like to investigate this further to provide a more accurate solution. Your additional information will help us understand the specific issue you're facing. """, | |
}, | |
{ | |
"index": 90, | |
"question": """does weights and balances have an integration with LangChain?""", | |
"answer": """Yes, we have an integration with LangChain.js (version 0.0.75 or higher). You can use our JavaScript library to integrate with LangChain.js as follows: | |
import {WandbTracer} from '@wandb/sdk/integrations/langchain'; | |
const wbTracer = await WandbTracer.init({project: 'langchain-test'}); | |
// run your langchain workloads... | |
chain.call({input: "My prompt"}, wbTracer) | |
await WandbTracer.finish(); | |
For more detailed examples, you can check out our test implementation in the wandb-js GitHub repository. """,
}, | |
{ | |
"index": 91, | |
"question": """how can I get the path to where an artifact is being written?""", | |
"answer": """By default, artifacts are downloaded to the artifacts/ folder in your working directory. | |
When you download an artifact, you can get its path in two ways: | |
The download() method returns the path where the artifact is saved: | |
artifact_path = artifact.download() | |
print(artifact_path) | |
You can specify a custom download location by: | |
• Passing it directly to the download method | |
• Setting the WANDB_ARTIFACT_DIR environment variable | |
This lets you track exactly where your artifacts are being stored on your system. """, | |
}, | |
{ | |
"index": 92, | |
"question": """is there an example of how to use Launch on Sagemaker using BYOI (Bring your own image)?""", | |
"answer": """Yes, we support using your own image (BYOI) with Launch on SageMaker. Here's a brief overview: | |
Store your Docker image in Amazon Elastic Container Registry (ECR). | |
When setting up your Launch queue for SageMaker, choose the option to use an existing Docker image. | |
In your Launch agent configuration file (launch-config.yaml), specify the ECR registry: | |
registry:
  type: ecr
  uri: <ecr-repo-arn>
Make sure your Launch agent has the necessary permissions to access the ECR repository. | |
When submitting a Launch job, reference your custom image. | |
Remember to upload your Docker image to ECR before submitting new Launch jobs if you're using image-based jobs """, | |
}, | |
{ | |
"index": 93, | |
"question": """Can I get an enlarged/focused view for charts?""", | |
"answer": """Yes, you can get an enlarged or focused view of charts in Weights & Biases using the zoom feature: | |
For line plots: | |
Click and drag to create a rectangle over the area you want to zoom in on | |
This will zoom both vertically and horizontally, changing both x-axis and y-axis views | |
You can also explore specific regions of your data: | |
Click and drag to select a specific region to zoom in on | |
For even more control, you can use synchronized zooming mode: | |
This feature automatically zooms all other line plots with a common x-axis to the same range | |
Enable this in the workspace display settings for line plots | |
To return to the original view, simply click outside the zoomed area. """, | |
}, | |
{ | |
"index": 94, | |
"question": """How can weights & biases help when building llm-powered apps?""", | |
"answer": """We offer several tools to help you build LLM-powered apps: | |
W&B Weave: This framework helps you track, experiment with, evaluate, and improve LLM applications. | |
Tracing & Monitoring: You can track LLM calls and app logic to debug and analyze your systems. | |
Experimentation: Our LLM Playground lets you test different models and prompts. | |
Evaluation: We provide custom and pre-built scorers to assess your LLM app's performance. | |
Integration with LangChain.js: Our JavaScript library works with LangChain.js for building LLM apps. | |
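For instance, tracing LLM calls with W&B Weave can be as small as the sketch below (the project name and function are illustrative, and the stub string stands in for a real model call):
import weave
weave.init("my-llm-app")
@weave.op()
def generate(prompt: str) -> str:
    # call your LLM of choice here; a stub keeps the sketch self-contained
    return "model response"
generate("Summarize our release notes")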
These tools can help you create, refine, and monitor your LLM-powered applications effectively. """, | |
}, | |
{ | |
"index": 95, | |
"question": """What are some great resources for me to better understand Weights & Biases usage for Deep Learning?""", | |
"answer": """We offer several great resources to help you understand how to use our tools for Deep Learning: | |
Tutorials: We have interactive tutorials covering the basics of experiment tracking, model evaluation, and hyperparameter tuning. | |
AI Academy: Our GitHub repository contains educational materials on deep learning, including math refreshers for machine learning. | |
Integration guides: We provide detailed guides for popular frameworks like PyTorch and Keras, with example Colab notebooks to get you started quickly. | |
These resources will help you learn how to track experiments, visualize predictions, tune hyperparameters, and more using our platform. Start with the tutorials for a solid foundation, then explore the integration guides for your preferred framework. """, | |
}, | |
{ | |
"index": 96, | |
"question": """Can multiple users write to the same stream table?""", | |
"answer": """While Weights & Biases documentation doesn't explicitly mention multiple users writing to the same stream table, it does provide information on how to log tables to runs. Each time a table is logged to the same key, a new version is created and stored in the backend. This suggests that multiple users could potentially write to the same stream table, as long as they have access to the same run. | |
To better help you, could you provide more context about your specific use case? Are you trying to collaborate on a project where multiple team members need to update the same table? """, | |
}, | |
{ | |
"index": 97, | |
"question": """please show me how to log training output_name model of the current training run""", | |
"answer": """To log the training output model of your current run, you can use the run.log_model() method. Here's how to do it: | |
First, make sure you've initialized a W&B run. | |
After training your model, save it to a local file or directory. | |
Use run.log_model() to log the model. | |
Here's a simple example: | |
import wandb | |
# Initialize a W&B run | |
run = wandb.init(project="your-project", entity="your-entity") | |
# Train your model... | |
# Save your model | |
model_path = "path/to/your/saved/model" | |
# Log the model | |
run.log_model(path=model_path, name="your_model_name") | |
# End the run | |
run.finish() | |
Replace "path/to/your/saved/model" with the actual path where your model is saved, and "your_model_name" with a name for your model artifact. """, | |
}, | |
] |