File Transfer Guide

Learn efficient methods to transfer files between your local machine and OSC.

Quick Reference

Method     Best For                                Speed    Ease of Use
VS Code    Small files (<1 MB), editing            Medium   ⭐⭐⭐⭐⭐
SCP        Single files, medium size               Medium   ⭐⭐⭐⭐
Rsync      Large files (>100 MB), directory sync   Fast     ⭐⭐⭐
SFTP       Interactive browsing                    Medium   ⭐⭐⭐⭐
OnDemand   Web uploads (<100 MB)                   Medium   ⭐⭐⭐⭐⭐
Git        Code and scripts only                   Fast     ⭐⭐⭐⭐

flowchart TD
    A{What are you\ntransferring?} --> B[Code / scripts]
    A --> C[Small files\n< 1 MB]
    A --> D[Medium files\n1–100 MB]
    A --> E[Large files / dirs\n> 100 MB]
    B --> F[Git push/pull]
    C --> G[VS Code\ndrag & drop]
    D --> H{Single file or\ndirectory?}
    H -->|Single file| I[SCP]
    H -->|Directory| J[Rsync]
    E --> K[Rsync\n--partial --progress]
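
The decision flow above can also be written as a small shell function. This is only a sketch: the name suggest_method and its two arguments (a kind and a size in bytes) are hypothetical, and the 1 MB and 100 MB thresholds are taken from the Quick Reference table.

```shell
#!/bin/sh
# Hypothetical helper that mirrors the flowchart.
# Usage: suggest_method <code|file|dir> <size_in_bytes>
suggest_method() {
    kind=$1                 # what is being transferred: code, file, or dir
    bytes=$2                # total size in bytes
    mb=$((1024 * 1024))     # bytes per MB

    if [ "$kind" = "code" ]; then
        echo "git"
    elif [ "$bytes" -lt "$mb" ]; then
        echo "vscode"
    elif [ "$bytes" -le $((100 * mb)) ]; then
        # Medium transfers: a single file goes by SCP, a directory by rsync.
        if [ "$kind" = "dir" ]; then
            echo "rsync"
        else
            echo "scp"
        fi
    else
        echo "rsync --partial --progress"
    fi
}
```

For example, suggest_method dir 52428800 (a 50 MB directory) prints rsync.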

Method 1: VS Code (Easiest)

If you're using Remote Development, VS Code makes file transfer simple.

Drag and Drop

  1. Connect to OSC via Remote-SSH
  2. Drag files from your local file explorer into VS Code's file browser
  3. Files upload automatically

Download Files

  1. Right-click any file in VS Code
  2. Select "Download..."
  3. Choose destination

Upload Files

  1. Right-click in the file browser
  2. Select "Upload..."
  3. Choose files to upload

Method 2: SCP (Secure Copy)

SCP ships with OpenSSH and works from the command line.

Basic Usage

# Upload file to OSC
scp local_file.txt pitzer:~/

# Upload to specific directory
scp local_file.txt pitzer:~/projects/data/

# Download file from OSC
scp pitzer:~/remote_file.txt ./

# Download to specific directory
scp pitzer:~/remote_file.txt ~/Downloads/

Copy Directories

# Upload directory
scp -r local_directory/ pitzer:~/remote_directory/

# Download directory
scp -r pitzer:~/remote_directory/ ./local_directory/

Multiple Files

# Upload multiple files
scp file1.txt file2.txt file3.txt pitzer:~/data/

# Download multiple files (quote the remote path so the wildcard expands on OSC, not in your local shell)
scp 'pitzer:~/data/*.csv' ./

With Progress

# scp prints a per-file progress meter by default when run in a terminal (-q silences it).
# For overall progress and resumable transfers, use rsync instead (see below).

Method 3: Rsync

Rsync is the most efficient method for large files and directories.

Why Rsync?

  • ✅ Resumes interrupted transfers
  • ✅ Only transfers changed files
  • ✅ Shows progress
  • ✅ Can preserve permissions and timestamps
  • ✅ Compression during transfer

Basic Rsync

# Upload directory to OSC
rsync -avz --progress local_directory/ pitzer:~/remote_directory/

# Download directory from OSC
rsync -avz --progress pitzer:~/remote_directory/ ./local_directory/

Rsync Options Explained

  • -a (archive): Recurses into directories and preserves permissions, timestamps, and symbolic links
  • -v (verbose): Show files being transferred
  • -z (compress): Compress during transfer
  • --progress: Show transfer progress
  • -h (human-readable): Show sizes in KB, MB, GB
  • --exclude: Exclude files/patterns
  • --delete: Delete files in the destination that are not in the source
  • -n (dry run): Show what would be transferred without copying anything

Common Rsync Commands

# Sync with human-readable progress
rsync -avzh --progress source/ pitzer:~/destination/

# Exclude certain files
rsync -avz --progress --exclude='*.log' --exclude='.git/' source/ pitzer:~/destination/

# Dry run (see what would be transferred)
rsync -avzn --progress source/ pitzer:~/destination/

# Sync and delete files not in source
rsync -avz --progress --delete source/ pitzer:~/destination/

# Resume interrupted transfer
rsync -avz --progress --partial source/ pitzer:~/destination/
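
Because --delete removes files at the destination, it can help to pair it with the dry run shown above. The following is a minimal sketch of that pattern, not an official OSC tool: the function name mirror_with_preview and the RSYNC override variable are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# RSYNC can be overridden (for example RSYNC=echo) to print the commands
# instead of running them. This variable is an assumption of this sketch.
RSYNC=${RSYNC:-rsync}

# Hypothetical wrapper: preview a --delete sync, then apply it only after
# an explicit "y" from the user.
mirror_with_preview() {
    src=$1
    dest=$2

    # -n (dry run) lists what would be copied or deleted without changing anything.
    "$RSYNC" -avzn --delete "$src" "$dest" || return 1

    printf 'Apply these changes? [y/N] '
    read -r answer || answer=n      # treat end-of-input as "no"

    case $answer in
        y|Y) "$RSYNC" -avz --progress --delete "$src" "$dest" ;;
        *)   echo "Aborted; nothing was changed." ;;
    esac
}
```

A call such as mirror_with_preview source/ pitzer:~/destination/ first prints the dry-run listing and then waits for confirmation.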

Rsync Exclude Patterns

Create a .rsync-exclude file:

# Exclude patterns
*.log
*.tmp
.git/
__pycache__/
*.pyc
node_modules/
.DS_Store

Use it:

rsync -avz --progress --exclude-from=.rsync-exclude source/ pitzer:~/destination/

Large Dataset Sync

# For multi-GB datasets
rsync -avzh --progress \
  --partial --partial-dir=.rsync-partial \
  --exclude='*.log' \
  ./dataset/ pitzer:~/projects/data/dataset/

Options:

  • --partial: Keep partially transferred files if the transfer is interrupted
  • --partial-dir: Store those partial files in the named hidden directory (here .rsync-partial) instead of alongside the finished files
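
Since --partial lets a rerun pick up where an interrupted transfer stopped, a long sync can be wrapped in a simple retry loop. The sketch below assumes hypothetical names (rsync_with_retry, MAX_ATTEMPTS, RETRY_DELAY, and the RSYNC override); none of them are rsync features.

```shell
#!/bin/sh
# RSYNC can be overridden for testing; it defaults to the real rsync binary.
RSYNC=${RSYNC:-rsync}

# Hypothetical helper: rerun rsync until it succeeds or the attempt limit
# is reached. All arguments are passed through to rsync unchanged.
rsync_with_retry() {
    max=${MAX_ATTEMPTS:-5}      # illustrative default: give up after 5 tries
    delay=${RETRY_DELAY:-30}    # illustrative default: wait 30 s between tries
    attempt=1

    while [ "$attempt" -le "$max" ]; do
        # --partial keeps partly transferred files so the next attempt
        # does not start them again from zero.
        if "$RSYNC" -avzh --progress --partial "$@"; then
            return 0
        fi
        echo "Attempt $attempt of $max failed." >&2
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$max" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

Usage would look like rsync_with_retry --exclude='*.log' ./dataset/ pitzer:~/projects/data/dataset/.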

Method 4: OSC OnDemand (Web Interface)

OSC OnDemand includes a built-in file manager for uploading, downloading, and browsing files. See the OnDemand Guide for details.

Method 5: Git (For Code)

For code and small files, use Git:

# On local machine
git add .
git commit -m "Update code"
git push origin main

# On OSC
git pull origin main

Best Practices with Git

  • ✅ Use for code, scripts, configs
  • ✅ Use for small data files (< 10 MB)
  • ❌ Don't commit large datasets
  • ❌ Don't commit binary files (model checkpoints)
  • ❌ Don't commit generated files

Use .gitignore:

# Python
__pycache__/
*.pyc
.venv/

# Data
*.csv
*.h5
*.hdf5
*.npy
data/
datasets/

# Models
*.pth
*.ckpt
checkpoints/
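
As a complement to .gitignore, a quick scan for oversized files before committing can catch data that slipped through. This is a rough sketch: find_large_files is a hypothetical name, and the 10 MB default simply reuses the limit suggested in the list above.

```shell
#!/bin/sh
# Hypothetical pre-commit check: list files in the current directory tree
# that are larger than a size limit, skipping Git's own .git/ directory.
# Usage: find_large_files [limit_in_bytes]
find_large_files() {
    limit_bytes=${1:-10485760}   # default: 10 MB

    # -size +Nc matches files strictly larger than N bytes.
    find . -path ./.git -prune -o -type f -size +"${limit_bytes}c" -print
}
```

Run it from the repository root; any path it prints is a candidate for .gitignore or for transfer with rsync instead.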

Best Practices

1. Organize Your Files

On OSC:

~/
├── projects/
│   ├── project1/
│   │   ├── code/
│   │   ├── data/
│   │   └── results/
│   └── project2/
├── scratch/        # Temporary large files
└── shared/         # Shared with lab members

2. Use Scratch Space for Large Data

# Scratch space (faster, more space, but not backed up)
cd $TMPDIR          # Temporary scratch
cd /fs/scratch/     # Persistent scratch (your project)

3. Compress Before Transfer

# On local machine
tar -czf dataset.tar.gz dataset/
# -z is omitted here: the archive is already gzip-compressed
rsync -av --progress dataset.tar.gz pitzer:~/

# On OSC
ssh pitzer
tar -xzf dataset.tar.gz

4. Checksum Verification for Large Transfers

# Generate checksum locally (on macOS, use shasum -a 256 in place of sha256sum)
sha256sum large_file.zip > checksum.txt

# Transfer both
rsync -avz --progress large_file.zip checksum.txt pitzer:~/

# Verify on OSC
ssh pitzer
sha256sum -c checksum.txt
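
The same check can be extended to a whole directory. The sketch below assumes three hypothetical helpers (sha256_cmd, make_manifest, verify_manifest). It relies only on sha256sum (GNU coreutils) or shasum -a 256 (shipped with macOS), whichever is installed.

```shell
#!/bin/sh
# Print the name of an available SHA-256 tool.
sha256_cmd() {
    if command -v sha256sum >/dev/null 2>&1; then
        echo "sha256sum"
    elif command -v shasum >/dev/null 2>&1; then
        echo "shasum -a 256"
    else
        echo "error: no SHA-256 tool found" >&2
        return 1
    fi
}

# Write checksums for every file under a directory to standard output.
# Paths are recorded relative to that directory, so the manifest still
# works after the directory has been copied elsewhere.
make_manifest() {
    tool=$(sha256_cmd) || return 1
    # $tool is left unquoted on purpose so "shasum -a 256" splits into words.
    ( cd "$1" && find . -type f -exec $tool {} + | sort -k 2 )
}

# Check a directory against a manifest read from standard input.
verify_manifest() {
    tool=$(sha256_cmd) || return 1
    ( cd "$1" && $tool -c - )
}
```

A possible workflow: run make_manifest dataset > dataset.sha256 locally, transfer both, then run verify_manifest dataset < dataset.sha256 on OSC.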

5. Use .rsyncignore

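Create a project-level .rsyncignore file. The name is only a convention; rsync reads the file only when it is passed with --exclude-from: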

.git/
__pycache__/
*.pyc
.vscode/
.ipynb_checkpoints/
*.log

Sync command:

rsync -avz --progress --exclude-from=.rsyncignore ./ pitzer:~/project/

Troubleshooting

Transfer Interrupted

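Rerun the same rsync command. Files that already arrived are skipped, and --partial keeps a partially transferred file so it does not restart from zero: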

rsync -avz --progress --partial source/ pitzer:~/destination/

SCP cannot resume an interrupted transfer; use rsync instead.

Permission Denied

# Check permissions on OSC
ssh pitzer
ls -la ~/destination/
chmod 755 ~/destination/  # Fix if needed

Slow Transfer Speed

Causes:

  • Network congestion
  • Large number of small files
  • Uncompressed transfer

Solutions:

# Compress during transfer
rsync -avz --progress source/ pitzer:~/destination/

# Archive first, then transfer
tar -czf archive.tar.gz source/
rsync -avz --progress archive.tar.gz pitzer:~/
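
For directories with many small files, a variation on "archive first" is to stream the archive straight over SSH so that no intermediate tar.gz is written to disk. This is a sketch under assumptions: stream_dir and the REMOTE variable are hypothetical, the destination directory must already exist, and the destination path is wrapped in single quotes on the remote side, so write it as an absolute path or relative to your home directory rather than with ~.

```shell
#!/bin/sh
# REMOTE is the command that runs a shell command string on the far side.
# It defaults to the pitzer SSH alias used throughout this guide.
REMOTE=${REMOTE:-ssh pitzer}

# Hypothetical helper: pack a local directory, send it through a single
# connection, and unpack it on arrival.
# Usage: stream_dir <local_dir> <remote_dir>
stream_dir() {
    src=$1
    dest=$2
    # tar writes the gzip-compressed archive to stdout (-f -); the remote
    # tar reads it from stdin and extracts into the destination directory.
    # $REMOTE is unquoted on purpose so "ssh pitzer" splits into words.
    tar -czf - -C "$src" . | $REMOTE "tar -xzf - -C '$dest'"
}
```

For example, stream_dir ./source projects/data unpacks the contents of ./source into ~/projects/data on OSC. Unlike rsync, this approach cannot resume, so it suits one-off copies over a stable connection.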

Connection Drops During Transfer

Use tmux or screen to keep transfers running if your connection drops. For tmux setup and usage, see Remote Development — Using tmux for Persistent Sessions.

Disk Quota Exceeded

# Check quota on OSC
ssh pitzer
quota -s

# Clean up if needed
du -sh ~/*/  # See directory sizes
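
To see which directories to clean up first, the du output can be sorted by size. The helper below is a small sketch; largest_dirs is a hypothetical name, and it uses du -k rather than -h so that a plain numeric sort orders the results correctly.

```shell
#!/bin/sh
# Hypothetical helper: list the largest subdirectories of a directory,
# biggest first. Sizes are reported in kilobytes.
# Usage: largest_dirs [directory] [how_many]
largest_dirs() {
    dir=${1:-$HOME}
    count=${2:-10}
    du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}
```

Running largest_dirs ~ 5 on OSC would show the five largest directories in your home directory.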
