Core Concepts

Understand the building blocks of Parser Run: projects, schemas, jobs, and data.

Projects

A project is a scraping configuration targeting a specific data goal. Each project has its own database, schema, and schedule.

Example: Price Tracker Project
{
  id: "nike-price-tracker",
  name: "Nike Price Tracker",
  description: "Track Nike shoe prices across retailers",
  urls: ["https://nike.com/shoes", "https://zappos.com/nike"],
  schedule: "daily",
  status: "active"
}

Projects can be created via conversation or the API.
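
As a rough illustration, the project fields in the example above can be given a type and checked before use. This is a minimal sketch: `ProjectConfig` and `validateProject` are hypothetical names, not part of the Parser Run API, and the rules they enforce (non-empty id, at least one http(s) URL) are assumptions.

```typescript
// Hypothetical shape inferred from the example project; not an official type.
interface ProjectConfig {
  id: string;
  name: string;
  description: string;
  urls: string[];
  schedule: string; // e.g. "daily"
  status: string;   // e.g. "active"
}

// Returns a list of human-readable problems; an empty list means the config passed.
function validateProject(project: ProjectConfig): string[] {
  const problems: string[] = [];
  if (project.id.trim() === "") {
    problems.push("id must not be empty");
  }
  if (project.urls.length === 0) {
    problems.push("urls must contain at least one entry");
  }
  for (const url of project.urls) {
    // Assumed rule: only http and https targets are accepted.
    if (!/^https?:\/\//.test(url)) {
      problems.push(`not an http(s) URL: ${url}`);
    }
  }
  return problems;
}

const nikeTracker: ProjectConfig = {
  id: "nike-price-tracker",
  name: "Nike Price Tracker",
  description: "Track Nike shoe prices across retailers",
  urls: ["https://nike.com/shoes", "https://zappos.com/nike"],
  schedule: "daily",
  status: "active",
};

console.log(validateProject(nikeTracker)); // []
```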

Schemas

A schema defines what data to extract from each page. Parser Run auto-detects schemas using LLM analysis.

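| Type    | Description     | Example                 |
| ------- | --------------- | ----------------------- |
| text    | String values   | Product name, brand     |
| number  | Decimal numbers | Price, rating           |
| integer | Whole numbers   | Review count, stock qty |
| boolean | True/false      | In stock, on sale       |
| date    | ISO date string | Release date            |
| array   | List of values  | Tags, sizes, colors     |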
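
To make the field types above concrete, here is a minimal sketch of checking extracted data against a schema. The `Schema` shape (field name mapped to type name) and the function `findSchemaViolations` are assumptions for illustration; Parser Run's internal schema format may differ.

```typescript
// The six field types listed above.
type FieldType = "text" | "number" | "integer" | "boolean" | "date" | "array";

// Hypothetical schema shape: field name -> field type.
type Schema = { [field: string]: FieldType };

// One runtime check per field type.
const typeChecks: { [T in FieldType]: (value: unknown) => boolean } = {
  text: (v) => typeof v === "string",
  number: (v) => typeof v === "number" && isFinite(v),
  integer: (v) => typeof v === "number" && isFinite(v) && Math.floor(v) === v,
  boolean: (v) => typeof v === "boolean",
  // Accepts strings that start with YYYY-MM-DD and that Date.parse understands.
  date: (v) =>
    typeof v === "string" && /^\d{4}-\d{2}-\d{2}/.test(v) && !isNaN(Date.parse(v)),
  array: (v) => Array.isArray(v),
};

// Lists every schema field that is missing from the data or has the wrong type.
function findSchemaViolations(
  schema: Schema,
  data: { [field: string]: unknown }
): string[] {
  const violations: string[] = [];
  for (const field of Object.keys(schema)) {
    if (!(field in data)) {
      violations.push(`${field}: missing`);
    } else if (!typeChecks[schema[field]](data[field])) {
      violations.push(`${field}: expected ${schema[field]}`);
    }
  }
  return violations;
}

const productSchema: Schema = {
  product_name: "text",
  price: "number",
  review_count: "integer",
  in_stock: "boolean",
};

console.log(
  findSchemaViolations(productSchema, {
    product_name: "Nike Air Max 90",
    price: 129.99,
    review_count: 2341,
    in_stock: true,
  })
); // []
```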

Jobs

A job is a single execution of a project's scraping pipeline. Jobs progress through stages:

  1. discovery: Find all pages matching the target patterns
  2. acquisition: Fetch HTML from each page
  3. parsing: Extract data according to the schema
  4. complete: Data is stored and ready to query
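
The stage progression above can be modelled as a simple ordered sequence. This is a sketch only: the stage names come from the list, while `JobStage`, `nextStage`, and `walkStages` are hypothetical helpers, and it assumes stages always advance strictly in order.

```typescript
// Stage names in the order listed above.
const STAGES = ["discovery", "acquisition", "parsing", "complete"] as const;
type JobStage = (typeof STAGES)[number];

// Returns the stage that follows `stage`, or null once the job is complete.
function nextStage(stage: JobStage): JobStage | null {
  const index = STAGES.indexOf(stage);
  return index < STAGES.length - 1 ? STAGES[index + 1] : null;
}

// Collects every stage a job passes through, starting from discovery.
function walkStages(): JobStage[] {
  const visited: JobStage[] = [];
  let stage: JobStage | null = "discovery";
  while (stage !== null) {
    visited.push(stage);
    stage = nextStage(stage);
  }
  return visited;
}

console.log(walkStages().join(" -> "));
// discovery -> acquisition -> parsing -> complete
```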

Records

Records are the extracted data rows stored in your project database.

Example: Product Record
{
  id: "rec_abc123",
  projectId: "nike-price-tracker",
  jobId: "job_xyz789",
  url: "https://nike.com/shoes/air-max-90",
  data: {
    product_name: "Nike Air Max 90",
    price: 129.99,
    brand: "Nike",
    rating: 4.7,
    review_count: 2341,
    in_stock: true
  },
  extractedAt: "2025-01-15T10:30:00Z"
}
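
Because each record carries its source `url` and an `extractedAt` timestamp, repeated jobs produce a history per page. The sketch below picks the most recent record for each URL. `ScrapedRecord` and `latestByUrl` are hypothetical names, and the second sample record (`rec_def456`) is invented purely for illustration.

```typescript
// Hypothetical type mirroring the example record above.
interface ScrapedRecord {
  id: string;
  projectId: string;
  jobId: string;
  url: string;
  data: { [field: string]: unknown };
  extractedAt: string; // ISO 8601 timestamp
}

// Keeps only the most recently extracted record for each URL.
function latestByUrl(records: ScrapedRecord[]): { [url: string]: ScrapedRecord } {
  const latest: { [url: string]: ScrapedRecord } = {};
  for (const record of records) {
    const current = latest[record.url];
    if (
      current === undefined ||
      Date.parse(record.extractedAt) > Date.parse(current.extractedAt)
    ) {
      latest[record.url] = record;
    }
  }
  return latest;
}

const sampleRecords: ScrapedRecord[] = [
  {
    id: "rec_abc123",
    projectId: "nike-price-tracker",
    jobId: "job_xyz789",
    url: "https://nike.com/shoes/air-max-90",
    data: { product_name: "Nike Air Max 90", price: 129.99 },
    extractedAt: "2025-01-15T10:30:00Z",
  },
  {
    // Illustrative record from a later, made-up job run.
    id: "rec_def456",
    projectId: "nike-price-tracker",
    jobId: "job_uvw000",
    url: "https://nike.com/shoes/air-max-90",
    data: { product_name: "Nike Air Max 90", price: 119.99 },
    extractedAt: "2025-01-16T10:30:00Z",
  },
];

console.log(latestByUrl(sampleRecords)["https://nike.com/shoes/air-max-90"].id);
// rec_def456
```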

HTML Archives

Every page fetch is archived for debugging and replay. This enables:

  • Debugging — See exactly what the scraper saw
  • Re-extraction — Update extraction logic without re-fetching
  • Audit trail — Prove data provenance
  • A/B testing — Compare extraction algorithms
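
Re-extraction and A/B testing both come down to running extraction logic over stored HTML instead of fetching again. The following is a minimal sketch of that idea, not Parser Run's actual archive API: `ArchivedPage`, `Extractor`, and `reExtract` are hypothetical, and the regex-based extractors are deliberately simplistic stand-ins for real parsing.

```typescript
// Hypothetical shape of one archived fetch.
interface ArchivedPage {
  url: string;
  fetchedAt: string;
  html: string;
}

// An extractor turns raw HTML into a bag of fields.
type Extractor = (html: string) => { [field: string]: unknown };

// Runs an extractor over archived HTML; no network requests are made.
function reExtract(
  archive: ArchivedPage[],
  extractor: Extractor
): Array<{ url: string; data: { [field: string]: unknown } }> {
  return archive.map((page) => ({ url: page.url, data: extractor(page.html) }));
}

// Version 1: price only.
const extractorV1: Extractor = (html) => {
  const match = /<span class="price">\$([\d.]+)<\/span>/.exec(html);
  return { price: match ? parseFloat(match[1]) : null };
};

// Version 2: everything V1 extracts, plus the product name from the <h1>.
const extractorV2: Extractor = (html) => {
  const match = /<h1>([^<]+)<\/h1>/.exec(html);
  return { ...extractorV1(html), product_name: match ? match[1] : null };
};

const archive: ArchivedPage[] = [
  {
    url: "https://nike.com/shoes/air-max-90",
    fetchedAt: "2025-01-15T10:30:00Z",
    html: '<h1>Nike Air Max 90</h1><span class="price">$129.99</span>',
  },
];

// Compare both versions against the same archived input.
console.log(reExtract(archive, extractorV1));
console.log(reExtract(archive, extractorV2));
```

Since both extractors read the same archived bytes, any difference in output comes from the extraction logic rather than from changes on the live site.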

Alerts & Webhooks

Configure alerts to notify you when your data changes:

Example: Alert Configuration
{
  type: "price_drop",
  condition: {
    field: "price",
    operator: "decreased_by",
    value: 10,  // percent
  },
  notify: ["email", "webhook"],
  webhookUrl: "https://your-app.com/hooks/price-alert"
}
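
One plausible reading of the `decreased_by` condition is "the field dropped by at least `value` percent since the previous record". The sketch below evaluates that interpretation. The `AlertConfig` type and `shouldTrigger` function are hypothetical, and the "at least" threshold semantics are an assumption rather than documented behavior.

```typescript
// Hypothetical type mirroring the alert configuration above.
interface AlertConfig {
  type: string;
  condition: {
    field: string;
    operator: "decreased_by";
    value: number; // percent
  };
  notify: string[];
  webhookUrl?: string;
}

// True when `current` is at least `condition.value` percent below `previous`.
function shouldTrigger(config: AlertConfig, previous: number, current: number): boolean {
  if (previous <= 0) {
    return false; // a percentage drop is undefined without a positive baseline
  }
  const dropPercent = ((previous - current) / previous) * 100;
  return dropPercent >= config.condition.value;
}

const priceDrop: AlertConfig = {
  type: "price_drop",
  condition: { field: "price", operator: "decreased_by", value: 10 },
  notify: ["email", "webhook"],
  webhookUrl: "https://your-app.com/hooks/price-alert",
};

console.log(shouldTrigger(priceDrop, 129.99, 109.99)); // true  (about a 15% drop)
console.log(shouldTrigger(priceDrop, 129.99, 124.99)); // false (about a 4% drop)
```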

Next Steps