Notebook-to-Script Workflow¶

How to iterate quickly in Jupyter notebooks and run large-scale experiments as Python scripts on OSC.

Overview¶

A common research workflow looks like this:

Explore and prototype in a Jupyter notebook (small data, quick feedback)
Convert working code into a parameterized .py script
Run experiments at scale via SLURM batch jobs on OSC

This page walks through each stage with concrete examples.

graph LR
    A[Jupyter Notebook<br/>Prototype & Debug] --> B[Python Script<br/>Parameterize & Clean]
    B --> C[SLURM Batch Job<br/>Run at Scale]
    C --> D[Notebook<br/>Analyze Results]

Stage 1: Prototype in a Notebook¶

Launching Jupyter on OSC via OnDemand¶

The easiest way to use notebooks on OSC is through the OSC OnDemand portal:

Log in at ondemand.osc.edu
Navigate to Interactive Apps > Jupyter
Select your project account, number of cores, and time limit
Click Launch and wait for the session to start
Click Connect to Jupyter when the session is ready

Use debug resources for prototyping

When iterating on code, request minimal resources (1-2 cores, 1 GPU if needed, 1-2 hours). Save large allocations for production runs.

Prototyping Best Practices¶

Keep your notebook focused on experimentation:

# Cell 1 — Imports and configuration
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np

# Use a small subset for fast iteration
DATA_DIR = "/fs/scratch/PAS1234/$USER/datasets/my_data"
SUBSET_SIZE = 500   # Small subset for prototyping
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Cell 2 — Load a small data sample
dataset = MyDataset(DATA_DIR, max_samples=SUBSET_SIZE)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

print(f"Dataset size: {len(dataset)}")
sample, label = dataset[0]
print(f"Sample shape: {sample.shape}, Label: {label}")

# Cell 3 — Define and test model
model = MyModel(num_classes=10).to(DEVICE)
batch = next(iter(loader))
inputs, labels = batch[0].to(DEVICE), batch[1].to(DEVICE)
output = model(inputs)
print(f"Output shape: {output.shape}")

# Cell 4 — Train for a few epochs to verify everything works
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for epoch in range(3):
    total_loss = 0
    for inputs, labels in loader:
        inputs, labels = inputs.to(DEVICE), labels.to(DEVICE)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"Epoch {epoch}: loss={total_loss/len(loader):.4f}")

# Cell 5 — Visualize results to confirm correctness
plt.plot(losses)
plt.xlabel("Step")
plt.ylabel("Loss")
plt.title("Training Loss")
plt.show()

Don't run full experiments in notebooks

Notebooks are great for prototyping, but they have drawbacks for long runs:

If the browser tab closes or your OnDemand session times out, you lose running state
Notebooks aren't easily parameterized for sweeps
Reproducibility is harder to guarantee (cell execution order matters)
They consume interactive resources that could be used for batch jobs

Stage 2: Convert to a Python Script¶

Once your notebook code works on a small sample, extract it into a standalone .py script.

Manual Conversion¶

Pull the working cells into a clean script with argparse for configuration:

src/train.py:

"""Training script — converted from prototyping notebook."""
import argparse
import os
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from src.models.model import MyModel
from src.data.dataset import MyDataset


def parse_args():
    parser = argparse.ArgumentParser(description="Train MyModel")
    parser.add_argument("--data-dir", type=str, required=True)
    parser.add_argument("--output-dir", type=str, default="results")
    parser.add_argument("--epochs", type=int, default=100)
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--num-classes", type=int, default=10)
    parser.add_argument("--num-workers", type=int, default=4)
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--resume", type=str, default=None,
                        help="Path to checkpoint to resume from")
    return parser.parse_args()


def train(args):
    # Reproducibility
    torch.manual_seed(args.seed)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Using device: {device}")

    # Data
    train_dataset = MyDataset(os.path.join(args.data_dir, "train"))
    val_dataset = MyDataset(os.path.join(args.data_dir, "val"))
    train_loader = DataLoader(train_dataset, batch_size=args.batch_size,
                              shuffle=True, num_workers=args.num_workers)
    val_loader = DataLoader(val_dataset, batch_size=args.batch_size,
                            shuffle=False, num_workers=args.num_workers)

    # Model, optimizer, loss
    model = MyModel(num_classes=args.num_classes).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)
    criterion = nn.CrossEntropyLoss()
    start_epoch = 0

    # Resume from checkpoint
    if args.resume:
        ckpt = torch.load(args.resume, map_location=device)
        model.load_state_dict(ckpt["model_state_dict"])
        optimizer.load_state_dict(ckpt["optimizer_state_dict"])
        start_epoch = ckpt["epoch"] + 1
        print(f"Resumed from epoch {start_epoch}")

    # Training loop
    os.makedirs(args.output_dir, exist_ok=True)
    best_val_acc = 0.0

    for epoch in range(start_epoch, args.epochs):
        model.train()
        train_loss = 0
        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            optimizer.step()
            train_loss += loss.item()

        # Validation
        model.eval()
        correct, total = 0, 0
        with torch.no_grad():
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                preds = model(inputs).argmax(dim=1)
                correct += (preds == labels).sum().item()
                total += labels.size(0)
        val_acc = correct / total

        print(f"Epoch {epoch}: train_loss={train_loss/len(train_loader):.4f}  "
              f"val_acc={val_acc:.4f}")

        # Checkpoint
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            torch.save({
                "epoch": epoch,
                "model_state_dict": model.state_dict(),
                "optimizer_state_dict": optimizer.state_dict(),
                "val_acc": val_acc,
            }, os.path.join(args.output_dir, "best_model.pt"))

    print(f"Done. Best val_acc: {best_val_acc:.4f}")


if __name__ == "__main__":
    train(parse_args())

Automated Conversion with `nbconvert`¶

You can also export a notebook directly and then clean it up:

# Convert notebook to .py file
jupyter nbconvert --to script notebooks/prototype.ipynb

# This creates notebooks/prototype.py
# Then edit it to add argparse, remove interactive cells, etc.

Key Differences: Notebook vs. Script¶

Aspect	Notebook	Script
Configuration	Hardcoded variables in cells	`argparse` or config files
Data size	Small subset	Full dataset
Visualization	Inline `plt.show()`	Save to file (`plt.savefig()`)
Output	Cell output visible	Print statements, log files
Execution	Manual cell-by-cell	`python train.py --args`
Reproducibility	Cell order dependent	Deterministic top-to-bottom

Stage 3: Run at Scale on OSC¶

Basic Experiment Job¶

scripts/run_experiment.sh:

#!/bin/bash
#SBATCH --job-name=train_exp
#SBATCH --account=PAS1234
#SBATCH --gpus-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=08:00:00
#SBATCH --output=logs/train_%j.out

module load python/3.12
# module load cuda/12.4  # Only needed for custom CUDA extensions — PyPI torch bundles CUDA
source ~/venvs/myproject/bin/activate

SCRATCH=/fs/scratch/PAS1234/$USER/my_project

python src/train.py \
    --data-dir $SCRATCH/datasets/my_data \
    --output-dir $SCRATCH/results/baseline \
    --epochs 100 \
    --batch-size 64 \
    --lr 1e-3 \
    --seed 42

# Create logs directory and submit
mkdir -p logs
sbatch scripts/run_experiment.sh

Hyperparameter Sweeps with Job Arrays¶

Use SLURM job arrays to run many configurations in parallel. For sweep script templates and job array patterns, see the Job Submission Guide.

Using Config Files Instead of CLI Args¶

For complex experiments, YAML config files are cleaner than long argument lists:

configs/baseline.yaml:

data_dir: /fs/scratch/PAS1234/user/datasets/my_data
output_dir: /fs/scratch/PAS1234/user/results/baseline
epochs: 100
batch_size: 64
lr: 0.001
num_classes: 10
seed: 42

Load in your script with a library like OmegaConf or plain PyYAML:

import yaml

with open(args.config) as f:
    cfg = yaml.safe_load(f)

Stage 4: Analyze Results Back in a Notebook¶

After experiments finish, use a notebook to compare runs and generate figures:

# Cell — Load and compare sweep results
import os, json
import pandas as pd
import matplotlib.pyplot as plt

results_dir = "/fs/scratch/PAS1234/user/my_project/results"
rows = []
for run in os.listdir(results_dir):
    metrics_path = os.path.join(results_dir, run, "metrics.json")
    if os.path.exists(metrics_path):
        with open(metrics_path) as f:
            m = json.load(f)
        m["run"] = run
        rows.append(m)

df = pd.DataFrame(rows)
print(df.sort_values("val_acc", ascending=False).head(10))

# Cell — Plot comparison
fig, ax = plt.subplots()
for lr, group in df.groupby("lr"):
    group.plot(x="batch_size", y="val_acc", ax=ax, label=f"lr={lr}", marker="o")
ax.set_ylabel("Validation Accuracy")
ax.set_title("Hyperparameter Sweep Results")
plt.savefig("figures/sweep_results.png", dpi=150)
plt.show()

Summary: The Full Cycle¶

graph TD
    A[1. Prototype in Notebook] -->|Small data, quick iterations| B[2. Convert to .py Script]
    B -->|Add argparse, clean up| C[3. Submit SLURM Jobs]
    C -->|Full data, sweeps| D[4. Analyze in Notebook]
    D -->|New ideas| A

Stage	Tool	Data Size	Resources
Prototype	Jupyter via OnDemand	Small subset	1-2 cores, optional GPU
Convert	Editor / `nbconvert`	N/A	N/A
Run at scale	`sbatch` / job arrays	Full dataset	Multi-GPU, long walltime
Analyze results	Jupyter via OnDemand	Result files	1-2 cores

Next Steps¶

Review Job Submission for SLURM details
Set up Data & Experiment Tracking for experiment logging with MLflow or W&B
Use a pipeline orchestrator for multi-step pipelines
Learn about PyTorch & GPU Setup for training at scale