使用Firecrawl搜索网页并抓取结果

Firecrawl将其SERP(搜索引擎结果页面)API与其强大的抓取基础设施集成,通过单一端点提供无缝的搜索和抓取功能。原因如下:

  1. 统一的搜索查询: 用户通过SERP端点提交搜索查询。

  2. 自动化结果抓取: Firecrawl自动处理搜索结果,并利用其抓取能力从每个结果页面提取数据。

  3. 数据交付: 从所有结果页面抓取的数据被编译并以干净的markdown格式交付,随时可用。

这种集成使用户能够高效地执行网络搜索,并以最少的努力从多个来源获得全面、抓取的数据。

有关更多详细信息,请参阅搜索端点文档

搜索任何查询

/search 端点

用于搜索网络,获取最相关的结果,抓取每个页面并返回markdown。

curl -X POST https://api.firecrawl.dev/v0/search \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "query": "firecrawl",
      "pageOptions": {
        "fetchPageContent": true // false for a fast serp api
      }
    }'
{
  "success": true,
  "data": [
    {
      "url": "https://docs.firecrawl.dev",
      "markdown": "# Markdown Content",
      "provider": "web-scraper",
      "metadata": {
        "title": "Firecrawl | Scrape the web reliably for your LLMs",
        "description": "AI for CX and Sales",
        "language": null,
        "sourceURL": "https://docs.firecrawl.dev/"
      }
    }
  ]
}

使用Python SDK

安装Python SDK

pip install firecrawl-py

搜索查询

from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="YOUR_API_KEY")

result = app.search(query="What is firecrawl?")

响应将与上面的curl命令中显示的内容相似。

使用JavaScript SDK

安装JavaScript SDK

npm install @mendable/firecrawl-js

搜索查询

import FirecrawlApp from '@mendable/firecrawl-js';

// 使用您的API密钥初始化FirecrawlApp
const app = new FirecrawlApp({ apiKey: 'YOUR_API_KEY' });

// 执行搜索
const result = await app.search('What is firecrawl?');

响应将与上面的curl命令中显示的内容相似。

使用Go SDK

安装Go SDK

go get github.com/mendableai/firecrawl-go

搜索查询

import (
  "fmt"
  "log"

  "github.com/mendableai/firecrawl-go"
)

func main() {
  app, err := firecrawl.NewFirecrawlApp("YOUR_API_KEY")
  if err != nil {
      log.Fatalf("Failed to initialize FirecrawlApp: %v", err)
  }

  query := "What is firecrawl?"
  searchResult, err := app.Search(query)
  if err != nil {
    log.Fatalf("Failed to search: %v", err)
  }
  fmt.Println(searchResult)
}

使用Rust SDK

安装Rust SDK

在您的Cargo.toml中添加以下内容:

[dependencies]
firecrawl = "^0.1"
tokio = { version = "^1", features = ["full"] }
serde = { version = "^1.0", features = ["derive"] }
serde_json = "^1.0"
uuid = { version = "^1.10", features = ["v4"] }

[build-dependencies]
tokio = { version = "1", features = ["full"] }

搜索查询

async fn main() {
  let api_key = "YOUR_API_KEY";
  let api_url = "https://api.firecrawl.dev";
  let app = FirecrawlApp::new(api_key, api_url).expect("Failed to initialize FirecrawlApp");

  let query = "What is firecrawl?";
  let search_result = app.search(query).await;

  match search_result {
    Ok(data) => println!("Search Result: {}", data),
    Err(e) => eprintln!("Failed to search: {}", e),
  }
}