What is Metadata?

Metadata is additional structured data you can attach to each job run. It helps you:

Filter and search job runs in the dashboard
Track custom metrics and dimensions
Correlate failures with specific data characteristics
Generate custom reports and analytics

Adding Metadata to Jobs

Python Example

Attach metadata to provide context about each run

from seerpy import SEER

seer = SEER(api_key="your_api_key")

# load_data and process_data stand in for your own job logic
data = None
try:
    # Process data
    data = load_data("customers.csv")
    results = process_data(data)

    # Send success with metadata
    seer.success(
        metadata={
            "records_processed": len(data),
            "records_output": len(results),
            "source_file": "customers.csv",
            "environment": "production",
            "data_date": "2024-01-15",
            "processing_time_seconds": 45.2
        }
    )

except Exception as e:
    # Include metadata on errors too
    seer.error(
        error_message=str(e),
        metadata={
            "source_file": "customers.csv",
            "environment": "production",
            # data is still None if loading itself failed
            "partial_records": len(data) if data is not None else 0
        }
    )
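The example above hardcodes processing_time_seconds. In a real job you would probably measure it. The following is a minimal sketch, assuming a hypothetical helper named build_run_metadata (it is not part of seerpy) that times a callable and returns a dictionary you could pass as metadata= to seer.success:

```python
import time


def build_run_metadata(job, source_file, environment):
    """Run job() and return (results, metadata) with a measured duration.

    build_run_metadata is a hypothetical helper, not part of seerpy.
    """
    start = time.perf_counter()
    results = job()
    elapsed = time.perf_counter() - start
    metadata = {
        "records_output": len(results),
        "source_file": source_file,
        "environment": environment,
        # Rounded so the value stays readable in filters and reports
        "processing_time_seconds": round(elapsed, 3),
    }
    return results, metadata


if __name__ == "__main__":
    # A stand-in job that returns three records
    results, metadata = build_run_metadata(
        lambda: [1, 2, 3], "customers.csv", "staging"
    )
    print(metadata)
```

time.perf_counter is used here because it is a monotonic clock intended for measuring elapsed time, so system clock adjustments do not skew the duration.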

Useful Metadata Fields

Data Processing

records_processed - Number of records handled
source_file - Input file name or path
data_date - Date of data being processed
file_size_mb - Size of input file in megabytes
processing_time_seconds - Duration in seconds

Environment Information

environment - production, staging, development
region - us-east-1, eu-west-1, etc.
server - Hostname or server identifier
version - Application or script version

Business Context

customer_id - Customer or tenant identifier
department - Finance, Marketing, Operations
priority - high, medium, low
cost_usd - Processing cost for billing
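One way to keep these fields consistent is to build the environment fields once and merge per-run fields on top. The sketch below assumes two hypothetical helpers (base_metadata and with_context) and two assumed environment variable names (JOB_ENVIRONMENT and AWS_REGION); substitute whatever your deployment actually uses:

```python
import os
import socket


def base_metadata(version, environment=None):
    """Return environment fields shared by every job run.

    JOB_ENVIRONMENT and AWS_REGION are assumed variable names,
    chosen for illustration only.
    """
    return {
        "environment": environment or os.environ.get("JOB_ENVIRONMENT", "development"),
        "region": os.environ.get("AWS_REGION", "unknown"),
        "server": socket.gethostname(),
        "version": version,
    }


def with_context(base, **extra):
    """Merge per-run fields over the shared base without mutating it."""
    return {**base, **extra}


if __name__ == "__main__":
    base = base_metadata("1.4.2", environment="staging")
    print(with_context(base, customer_id="CUST-123", priority="high"))
```

The merged dictionary can then be passed as the metadata argument in the same way as the Python example earlier on this page.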

Filtering by Metadata in Dashboard

Using Metadata for Analysis

Filter and search runs based on metadata fields

Example Queries

Find all failures for a specific customer:
Filter: Status = Failed, Metadata: customer_id = "CUST-123"
Identify slow processing runs:
Filter: Metadata: processing_time_seconds > 300
Track production issues only:
Filter: Status = Failed, Metadata: environment = "production"
Analyze by data date:
Group runs by Metadata: data_date to see trends over time
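The dashboard applies these filters for you. To make their logic concrete, the sketch below runs the same four queries over a small in-memory list. It assumes each run is a dictionary with status and metadata keys, which is a hypothetical shape for illustration and not a documented SEER export format. The filter_runs function and the sample records are invented as well:

```python
from collections import Counter

# Invented sample records; the shape is an assumption for illustration
runs = [
    {"status": "Failed", "metadata": {"customer_id": "CUST-123", "environment": "production",
                                      "processing_time_seconds": 412.0, "data_date": "2024-01-15"}},
    {"status": "Success", "metadata": {"customer_id": "CUST-123", "environment": "production",
                                       "processing_time_seconds": 45.2, "data_date": "2024-01-15"}},
    {"status": "Failed", "metadata": {"customer_id": "CUST-456", "environment": "staging",
                                      "processing_time_seconds": 30.0, "data_date": "2024-01-16"}},
]


def filter_runs(runs, status=None, where=None, **equals):
    """Return runs matching a status, exact metadata values, and an optional predicate."""
    matched = []
    for run in runs:
        meta = run.get("metadata", {})
        if status is not None and run.get("status") != status:
            continue
        if any(meta.get(key) != value for key, value in equals.items()):
            continue
        if where is not None and not where(meta):
            continue
        matched.append(run)
    return matched


# Failures for a specific customer
customer_failures = filter_runs(runs, status="Failed", customer_id="CUST-123")
# Slow processing runs (over 300 seconds)
slow_runs = filter_runs(runs, where=lambda m: m.get("processing_time_seconds", 0) > 300)
# Production failures only
production_failures = filter_runs(runs, status="Failed", environment="production")
# Run counts grouped by data date
runs_by_date = Counter(run["metadata"]["data_date"] for run in runs)
```

Note that the numeric comparison only works because processing_time_seconds was sent as a number and not as a string.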

Best Practices

✓ Do:

Use consistent field names across all jobs
Keep metadata concise (avoid large objects)
Use appropriate data types (numbers for metrics, strings for labels)
Include environment and version information
Add business-relevant context for better insights

✗ Don't:

Include sensitive data (passwords, API keys, PII)
Send huge objects (keep metadata under 10KB)
Use inconsistent naming conventions
Include redundant data already tracked by SEER
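Some of these rules can be checked before a run is reported. The sketch below uses a hypothetical check_metadata helper that flags field names that look sensitive and payloads over 10KB. It assumes the size is measured as UTF-8 encoded JSON, which may differ from how SEER counts it, and the list of sensitive name hints is illustrative and not exhaustive:

```python
import json

MAX_METADATA_BYTES = 10 * 1024  # the 10KB guideline from the list above
# Illustrative substrings only; extend for your own data
SENSITIVE_HINTS = ("password", "secret", "token", "api_key")


def check_metadata(metadata):
    """Return a list of problems found; an empty list means no issue was detected."""
    problems = []
    for key in metadata:
        if any(hint in str(key).lower() for hint in SENSITIVE_HINTS):
            problems.append(f"possibly sensitive field: {key}")
    try:
        # Assumption: size is approximated by the UTF-8 encoded JSON form
        size = len(json.dumps(metadata).encode("utf-8"))
    except TypeError:
        problems.append("metadata is not JSON-serializable")
        return problems
    if size > MAX_METADATA_BYTES:
        problems.append(f"metadata is {size} bytes, over {MAX_METADATA_BYTES}")
    return problems


if __name__ == "__main__":
    print(check_metadata({"environment": "production", "db_password": "hunter2"}))
```

A name-based check like this only catches obvious cases. It does not detect PII stored under an innocuous field name, so it complements a review of what each job sends and does not replace one.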
