Weights & Biases (W&B)
Weights & Biases (commonly known as W&B or wandb) is an experiment tracking and visualization platform designed for machine learning teams. Compared to TensorBoard, W&B offers advanced features such as cloud storage, team collaboration, and automated hyperparameter search.
Basic Usage
Installation and Setup
```bash
pip install wandb
wandb login  # enter your API key (obtained from wandb.ai)
```
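In non-interactive environments such as CI pipelines, the API key can also be supplied via the `WANDB_API_KEY` environment variable instead of running `wandb login` (the placeholder value below is illustrative):

```shell
# CI / non-interactive setup: provide the key via environment variable;
# wandb picks it up automatically at wandb.init() time.
export WANDB_API_KEY="<your-api-key>"  # key shown at wandb.ai/authorize
```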
Basic Integration
```python
import wandb

# Initialize the experiment
wandb.init(
    project="my-project",
    name="resnet50-baseline",
    config={
        "learning_rate": 1e-3,
        "batch_size": 32,
        "epochs": 100,
        "architecture": "ResNet-50",
        "optimizer": "AdamW",
    },
)

for epoch in range(num_epochs):
    train_loss = train_one_epoch(model, train_loader, optimizer)
    val_loss, val_acc = evaluate(model, val_loader)
    # Log metrics
    wandb.log({
        "train_loss": train_loss,
        "val_loss": val_loss,
        "val_accuracy": val_acc,
        "lr": optimizer.param_groups[0]["lr"],
        "epoch": epoch,
    })

# Save the model file to the run
wandb.save("model_best.pth")
wandb.finish()
```
Core Features
Experiment Comparison
W&B automatically creates a Run for each call to wandb.init(). From the web interface, you can:
- Compare training curves across different experiments side by side
- Filter and sort experiments by hyperparameters
- Create custom Dashboards
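To make runs easy to compare and filter in the UI, related runs can share a `group` and carry `tags`, both accepted by `wandb.init()`. A minimal sketch that builds the init keyword arguments (the project, group, and tag names here are illustrative, not from the text above):

```python
# Sketch: build wandb.init() kwargs for related runs so the W&B UI can
# overlay their curves (shared group) and filter them (tags).
# Project/group/tag names are illustrative assumptions.
def run_settings(seed: int, lr: float) -> dict:
    return {
        "project": "my-project",
        "group": "resnet50-ablation",          # runs in one group are compared together
        "tags": ["baseline", f"seed-{seed}"],  # tags drive filtering/sorting in the UI
        "config": {"learning_rate": lr, "seed": seed},
    }

# Each settings dict would be passed as wandb.init(**run_settings(seed, lr)).
settings = [run_settings(s, 1e-3) for s in (0, 1, 2)]
```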
Hyperparameter Search (Sweep)
W&B Sweep provides automated hyperparameter search:
```python
# Define the search space
sweep_config = {
    "method": "bayes",  # one of: bayes, grid, random
    "metric": {"name": "val_accuracy", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-2, "distribution": "log_uniform_values"},
        "batch_size": {"values": [16, 32, 64, 128]},
        "weight_decay": {"min": 1e-5, "max": 1e-1, "distribution": "log_uniform_values"},
    },
}
sweep_id = wandb.sweep(sweep_config, project="my-project")

def train():
    wandb.init()
    config = wandb.config
    # train using config.learning_rate, config.batch_size, etc.
    ...

wandb.agent(sweep_id, function=train, count=50)  # run 50 trials
```
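To illustrate what an agent does for `method: "random"`, here is a toy sampler over the same style of search space. This is a sketch, not W&B's implementation; the `bayes` method instead fits a surrogate model of the metric to choose promising configurations rather than sampling uniformly.

```python
import math
import random

# Toy illustration of random search over a sweep-style parameter space.
# NOT the W&B implementation; it only sketches the sampling step.
space = {
    "learning_rate": {"min": 1e-5, "max": 1e-2, "distribution": "log_uniform_values"},
    "batch_size": {"values": [16, 32, 64, 128]},
}

def sample(space: dict, rng: random.Random) -> dict:
    cfg = {}
    for name, spec in space.items():
        if "values" in spec:
            # discrete choice
            cfg[name] = rng.choice(spec["values"])
        else:
            # log-uniform: sample uniformly in log space, then exponentiate
            lo, hi = math.log(spec["min"]), math.log(spec["max"])
            cfg[name] = math.exp(rng.uniform(lo, hi))
    return cfg

trial = sample(space, random.Random(0))
```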
Artifacts (Asset Management)
Artifacts are used for versioned management of datasets, models, and other files:
```python
# Save a model as an artifact
artifact = wandb.Artifact("trained-model", type="model")
artifact.add_file("model_best.pth")
wandb.log_artifact(artifact)

# Load an artifact (":latest" resolves to the newest version)
artifact = wandb.use_artifact("trained-model:latest")
artifact_dir = artifact.download()
```
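Artifact versions are content-addressed: logging the same artifact again creates a new version only when the tracked files' checksums change. A toy sketch of that idea (illustrative only, not W&B's actual digest scheme):

```python
import hashlib
import pathlib
import tempfile

# Toy sketch of content-addressed versioning: a new "version" appears only
# when a file's digest changes. Illustrative only, not W&B internals.
def digest(path: pathlib.Path) -> str:
    return hashlib.md5(path.read_bytes()).hexdigest()

with tempfile.TemporaryDirectory() as d:
    f = pathlib.Path(d, "model_best.pth")
    f.write_bytes(b"weights-v1")
    v1 = digest(f)  # first version
    f.write_bytes(b"weights-v1")
    v2 = digest(f)  # unchanged content -> same digest, no new version
    f.write_bytes(b"weights-v2")
    v3 = digest(f)  # changed content -> new digest, new version
```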
Tables and Visualization
```python
# Log prediction samples as a table
table = wandb.Table(columns=["image", "prediction", "ground_truth"])
for img, pred, gt in samples:
    table.add_data(wandb.Image(img), pred, gt)
wandb.log({"predictions": table})
```
Comparison with Other Tools
| Feature | TensorBoard | W&B | MLflow |
|---|---|---|---|
| Experiment Tracking | Local | Cloud-based | Local/Remote |
| Team Collaboration | Not supported | Native support | Supported |
| Hyperparameter Search | Not supported | Sweep (Bayesian, etc.) | Not supported (requires integration) |
| Model Registry | Not supported | Artifacts | Model Registry |
| Free Tier | Completely free | Free for individuals | Open-source and free |
| Learning Curve | Low | Low | Moderate |
Recommendations:
- Personal projects / quick experiments -> TensorBoard (zero configuration)
- Team collaboration / production projects -> W&B (most comprehensive feature set)
- Self-hosted / MLOps Pipeline requirements -> MLflow