POST /workspaces/{workspaceId}/crawler/jobs

Create crawler job
curl --request POST \
  --url https://eu-gcp-api.vg-stuff.com/v3/workspaces/{workspaceId}/crawler/jobs \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "urls": [
    "https://example.com"
  ],
  "crawl": true,
  "crawlOptions": {
    "maxPages": 25,
    "urlMatchers": [
      "/"
    ],
    "stayOnDomain": true
  },
  "webhook": {
    "url": "https://example.com/webhooks/crawler",
    "events": [
      "page_scraped",
      "job_completed",
      "job_failed"
    ]
  }
}
'
{
  "success": true,
  "message": "<string>",
  "data": {
    "id": "<string>",
    "workspaceId": "<string>",
    "status": "queued",
    "primaryUrl": "<string>",
    "urls": [
      "<string>"
    ],
    "crawl": true,
    "crawlOptions": {
      "maxPages": 250,
      "urlMatchers": [
        "<string>"
      ],
      "unMatchers": [
        "<string>"
      ],
      "stayOnDomain": true
    },
    "useProxy": true,
    "deep": true,
    "refreshRate": "<string>",
    "toAgentId": "<string>",
    "toAgentIds": [
      "<string>"
    ],
    "done": true,
    "failed": true,
    "isCancelled": true,
    "message": "<string>",
    "resultError": "<string>",
    "createdAt": "<string>",
    "ts": 123,
    "currentPageIndex": 123,
    "scrapedPagesNum": 123,
    "failedPagesNum": 123,
    "pageLimit": 123,
    "creditsPerPage": 123,
    "estimatedCredits": 123,
    "activeScrapeUrl": "<string>",
    "crawlerJobId": "<string>",
    "webhook": {
      "url": "<string>",
      "events": [
        "page_scraped"
      ],
      "hasSecret": true,
      "hasBearerToken": true,
      "headerKeys": [
        "<string>"
      ]
    }
  }
}

Overview

Creates a new crawler job for a workspace and starts processing it in the background.
Crawler jobs are immutable after creation. If you need different settings, delete the job and create a new one.

Supports

  • Single-page scrape jobs
  • Multi-URL scrape jobs
  • Crawl jobs with crawlOptions
  • Optional outbound webhooks for page_scraped, job_completed, and job_failed

Billing

Credits are estimated at submission time and consumed per successfully scraped page.
Use useProxy: true only when needed; proxy scraping has a higher per-page credit cost.
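The exact pricing formula is not documented here, but the response fields pageLimit, creditsPerPage, estimatedCredits, and scrapedPagesNum suggest the relationship sketched below. Treat the formula as an assumption, not a billing guarantee.

```python
def estimate_credits(page_limit, credits_per_page):
    # Assumed upper-bound estimate at submission time:
    # pageLimit * creditsPerPage. The server reports its own
    # estimatedCredits value on the job object.
    return page_limit * credits_per_page

def consumed_credits(scraped_pages_num, credits_per_page):
    # Credits are consumed only for successfully scraped pages,
    # so actual spend can be lower than the estimate.
    return scraped_pages_num * credits_per_page
```

For example, a 25-page crawl at 2 credits per page is estimated at 50 credits, but a run that successfully scrapes only 18 pages consumes 36.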

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

workspaceId
string
required

The workspace that owns the crawler job.

Body

application/json
urls
string<uri>[]
required

One or more source URLs to scrape or use as crawl entry points.

Minimum array length: 1
crawl
boolean

If true, discovered URLs can be followed and scraped as part of the same job.

crawlOptions
object
deep
boolean

If true, use deep scraping behavior.

useProxy
boolean

If true, the crawler uses proxy scraping and paid proxy pricing.

refreshRate
string

Optional refresh cadence for KB-linked scrapes.

toAgentId
string

Optional single agent destination for KB import.

toAgentIds
string[]

Optional list of agent destinations for KB import.

webhook
object

Optional outbound webhook that receives page_scraped, job_completed, and job_failed events.
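The response's webhook object exposes hasSecret and hasBearerToken flags, which suggests deliveries can carry a shared-secret signature or a bearer token. The receiver below is a sketch under that assumption; the signature scheme (HMAC-SHA256 of the raw body) and the payload's event field are assumptions, not documented behavior.

```python
import hmac
import hashlib

def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Compare an HMAC-SHA256 of the raw request body against the received
    signature. Header name and scheme are assumptions, not documented here."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)

def handle_event(payload: dict) -> str:
    """Dispatch on the event type; the event names come from this page."""
    event = payload.get("event")
    if event == "page_scraped":
        return "store page"
    if event == "job_completed":
        return "mark job done"
    if event == "job_failed":
        return "alert on failure"
    return "ignore"
```

Always verify the signature against the raw bytes of the body before parsing JSON, and use a constant-time comparison as shown.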

Response

Successful response

success
boolean
required
message
string
required
data
object
required
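Every response uses the envelope shown in the schema above (success, message, data). A minimal sketch of unwrapping it client-side; the sample values ("job_123") are illustrative, and raising on success: false is a design choice, not mandated by the API.

```python
def unwrap_job(response_body: dict) -> dict:
    """Return the job object from a successful envelope, or raise with the
    server-provided message."""
    if not response_body.get("success"):
        raise RuntimeError(
            response_body.get("message") or "crawler job creation failed"
        )
    return response_body["data"]

# Illustrative envelope matching the response schema above.
sample = {
    "success": True,
    "message": "Job created",
    "data": {"id": "job_123", "status": "queued", "crawl": True},
}
job = unwrap_job(sample)  # job["status"] is "queued" until processing starts
```

Since jobs are immutable after creation, keep data.id if you need to reference, cancel, or delete the job later.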