GET /crawl/status/{jobId}
Get the status of a crawl job
curl --request GET \
  --url https://api.firecrawl.dev/v0/crawl/status/{jobId} \
  --header 'Authorization: Bearer <token>'
{
  "status": "<string>",
  "current": 123,
  "total": 123,
  "data": [
    {
      "markdown": "<string>",
      "content": "<string>",
      "html": "<string>",
      "rawHtml": "<string>",
      "index": 123,
      "metadata": {
        "title": "<string>",
        "description": "<string>",
        "language": "<string>",
        "sourceURL": "<string>",
        "<any other metadata> ": "<string>",
        "pageStatusCode": 123,
        "pageError": "<string>"
      }
    }
  ],
  "partial_data": [
    {
      "markdown": "<string>",
      "content": "<string>",
      "html": "<string>",
      "rawHtml": "<string>",
      "index": 123,
      "metadata": {
        "title": "<string>",
        "description": "<string>",
        "language": "<string>",
        "sourceURL": "<string>",
        "<any other metadata> ": "<string>",
        "pageStatusCode": 123,
        "pageError": "<string>"
      }
    }
  ]
}
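This endpoint retrieves the status of a crawl job. If the job has not finished, the response includes the content crawled so far in partial_data. Once the job completes, the content is available under data. We recommend keeping track of crawl jobs yourself, because crawl status results may expire after 24 hours.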
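As a rough illustration of how a client might poll this endpoint, here is a minimal Python sketch using only the standard library. The names `fetch_status` and `wait_for_crawl`, the polling interval, and the `max_polls` limit are hypothetical choices made for this example and are not part of the API. It also assumes that `completed` and `failed` are the states at which polling should stop.

```python
import json
import time
import urllib.request

# Base URL taken from the curl example on this page.
API_BASE = "https://api.firecrawl.dev/v0"


def fetch_status(job_id, token):
    """Call GET /crawl/status/{jobId} and return the parsed JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/crawl/status/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_crawl(job_id, token, fetch=fetch_status, interval=5.0, max_polls=120):
    """Poll until the job reports a final status or max_polls is reached.

    Assumption: "completed" and "failed" end the loop, while "active"
    and "paused" keep it polling.
    """
    for _ in range(max_polls):
        body = fetch(job_id, token)
        if body.get("status") in ("completed", "failed"):
            return body
        time.sleep(interval)
    raise TimeoutError(f"crawl job {job_id} did not finish after {max_polls} polls")
```

A call such as `wait_for_crawl(job_id, token)` would return the final response body. Because status results may expire, storing the returned `data` on your side right away is the safer choice. The `fetch` parameter exists only so the loop can be exercised without network access.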

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Path Parameters

jobId
string
required

ID of the crawl job

Response

Successful response

status
string

Status of the job (completed, active, failed, paused)

current
integer

Current page number

total
integer

Total number of pages

data
object[]

Data returned from the job (null when it is in progress)

partial_data
object[]

Partial documents returned while the site is being crawled (streaming). This feature is currently in alpha; expect breaking changes. When a page is ready, it is appended to the partial_data array, so there is no need to wait for the entire website to be crawled. When the crawl is done, partial_data becomes empty and the result is available in data. The array in the response holds at most 50 items: when a new item is added to a full array, the oldest item (at the top of the array) is removed.
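Because partial_data is a sliding window of at most 50 items, a client that wants every page as it arrives has to accumulate documents across polls. The sketch below shows one way to do that in Python. The helper name `merge_partial` is hypothetical, and it assumes that each document's `index` field uniquely identifies a page within a job, which this page does not state explicitly.

```python
def merge_partial(seen, body):
    """Fold one status response into `seen` and return the newly seen documents.

    `seen` is a dict keyed by each document's `index`.
    Assumption: `index` uniquely identifies a page within a crawl job.
    """
    # `data` is null while the job is in progress, so fall back to partial_data.
    docs = body.get("data") or body.get("partial_data") or []
    new_docs = []
    for doc in docs:
        key = doc.get("index")
        if key not in seen:
            seen[key] = doc
            new_docs.append(doc)
    return new_docs
```

Calling `merge_partial(seen, body)` after every poll leaves `seen` holding each document once, even after older items have been dropped from the window. If polls are spaced so far apart that more than 50 pages arrive between them, some items may be missed until the final `data` array is returned.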