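Overview

StagehandTool integrates the Stagehand framework with CrewAI so that agents can interact with websites and automate browser tasks using natural-language instructions.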

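Stagehand is a browser automation framework built by Browserbase. It lets AI agents:

Navigate to websites
Click buttons, links, and other elements
Fill in forms
Extract data from web pages
Observe and identify elements
Carry out complex workflows

StagehandTool wraps the Stagehand Python SDK and gives CrewAI agents browser control through three core primitives:

Act: performs actions such as clicking, typing, or navigating
Extract: pulls structured data from web pages
Observe: identifies and analyzes elements on the page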

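Prerequisites

Before using this tool, make sure you have:

A Browserbase account with an API key and a project ID
An API key for an LLM provider (OpenAI or Anthropic Claude)
The Stagehand Python SDK installed

Install the required dependency: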

pip install stagehand-py
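
One way to keep the credentials listed above out of source code is to read them from environment variables. The sketch below is only an illustration: the variable names (BROWSERBASE_API_KEY, BROWSERBASE_PROJECT_ID, MODEL_API_KEY) and the helper load_stagehand_credentials are assumptions made for this example, not names defined by Stagehand or CrewAI. The returned keys match the StagehandTool constructor arguments used in the examples on this page.

```python
import os

# Hypothetical environment variable names mapped to the constructor
# arguments shown on this page; adjust them to your own setup.
ENV_TO_KWARG = {
    "BROWSERBASE_API_KEY": "api_key",
    "BROWSERBASE_PROJECT_ID": "project_id",
    "MODEL_API_KEY": "model_api_key",
}


def load_stagehand_credentials(environ=None):
    """Return constructor kwargs read from the environment.

    Raises RuntimeError naming every variable that is unset or empty.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in ENV_TO_KWARG if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {kwarg: env[name] for name, kwarg in ENV_TO_KWARG.items()}
```

The resulting dict could then be unpacked into the constructor, for example StagehandTool(**load_stagehand_credentials()).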

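Usage

Basic implementation

StagehandTool can be used in two ways:

1. Using a context manager (recommended)

The context manager approach is recommended because it ensures resources are cleaned up correctly even if an exception is raised.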

from crewai import Agent, Task, Crew
from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel

# Initialize the tool with your API keys using a context manager
with StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",  # OpenAI or Anthropic API key
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,  # Optional: specify which model to use
) as stagehand_tool:
    # Create an agent with the tool
    researcher = Agent(
        role="Web Researcher",
        goal="Find and summarize information from websites",
        backstory="I'm an expert at finding information online.",
        verbose=True,
        tools=[stagehand_tool],
    )

    # Create a task that uses the tool
    research_task = Task(
        description="Go to https://www.example.com and tell me what you see on the homepage.",
        agent=researcher,
    )

    # Run the crew
    crew = Crew(
        agents=[researcher],
        tasks=[research_task],
        verbose=True,
    )

    result = crew.kickoff()
    print(result)

2. Manual resource management

from crewai import Agent, Task, Crew
from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel

# Initialize the tool with your API keys
stagehand_tool = StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
)

try:
    # Create an agent with the tool
    researcher = Agent(
        role="Web Researcher",
        goal="Find and summarize information from websites",
        backstory="I'm an expert at finding information online.",
        verbose=True,
        tools=[stagehand_tool],
    )

    # Create a task that uses the tool
    research_task = Task(
        description="Go to https://www.example.com and tell me what you see on the homepage.",
        agent=researcher,
    )

    # Run the crew
    crew = Crew(
        agents=[researcher],
        tasks=[research_task],
        verbose=True,
    )

    result = crew.kickoff()
    print(result)
finally:
    # Explicitly clean up resources
    stagehand_tool.close()
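
Both patterns above rely on the same guarantee: the cleanup code runs even when the body raises. The sketch below illustrates that guarantee with a stand-in class. FakeBrowserTool is hypothetical and only mimics an object with a close() method; it does not talk to Browserbase and is not the real StagehandTool.

```python
from contextlib import closing


class FakeBrowserTool:
    """Stand-in for any object that exposes close(); not the real StagehandTool."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def run_with_closing(tool):
    """Context-manager style: contextlib.closing calls close() on exit."""
    try:
        with closing(tool):
            raise ValueError("simulated failure during a browser task")
    except ValueError:
        pass  # the error is swallowed here only to inspect the tool afterwards
    return tool.closed


def run_with_finally(tool):
    """Manual style: try/finally around the work, as in the second example."""
    try:
        try:
            raise ValueError("simulated failure during a browser task")
        finally:
            tool.close()
    except ValueError:
        pass
    return tool.closed
```

In both functions the tool ends up closed despite the simulated failure, which is the behavior the recommendation above depends on.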

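Command types

StagehandTool supports three command types, each suited to a particular kind of web automation task:

1. Act command

The act command type (the default) handles page interactions such as clicking buttons, filling in forms, and navigating.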

# Perform an action (default behavior)
result = stagehand_tool.run(
    instruction="Click the login button", 
    url="https://example.com",
    command_type="act"  # Default, so can be omitted
)

# Fill out a form
result = stagehand_tool.run(
    instruction="Fill the contact form with name 'John Doe', email 'john@example.com', and message 'Hello world'", 
    url="https://example.com/contact"
)

2. Extract command

The extract command type retrieves structured data from a web page.

# Extract all product information
result = stagehand_tool.run(
    instruction="Extract all product names, prices, and descriptions", 
    url="https://example.com/products",
    command_type="extract"
)

# Extract specific information with a selector
result = stagehand_tool.run(
    instruction="Extract the main article title and content", 
    url="https://example.com/blog/article",
    command_type="extract",
    selector=".article-container"  # Optional CSS selector
)

3. Observe command

The observe command type identifies and analyzes elements on a web page.

# Find interactive elements
result = stagehand_tool.run(
    instruction="Find all interactive elements in the navigation menu", 
    url="https://example.com",
    command_type="observe"
)

# Identify form fields
result = stagehand_tool.run(
    instruction="Identify all the input fields in the registration form", 
    url="https://example.com/register",
    command_type="observe",
    selector="#registration-form"
)
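
Because command_type is passed as a plain string, a typo would only surface at call time. The sketch below shows one way to validate the arguments before calling run. build_run_kwargs is a hypothetical helper written for this example; the parameter names mirror the calls shown above, and nothing here is part of the StagehandTool API itself.

```python
from typing import Optional

# The three command types described in this section.
VALID_COMMAND_TYPES = ("act", "extract", "observe")


def build_run_kwargs(
    instruction: str,
    url: str,
    command_type: str = "act",
    selector: Optional[str] = None,
) -> dict:
    """Validate the arguments and return them as keyword arguments for run()."""
    if command_type not in VALID_COMMAND_TYPES:
        raise ValueError(
            f"command_type must be one of {VALID_COMMAND_TYPES}, got {command_type!r}"
        )
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    kwargs = {"instruction": instruction, "url": url, "command_type": command_type}
    # Only include the optional CSS selector when one was given.
    if selector is not None:
        kwargs["selector"] = selector
    return kwargs
```

A call could then look like stagehand_tool.run(**build_run_kwargs("Extract the main article title", "https://example.com/blog/article", "extract", ".article-container")).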

Configuration options

Customize the behavior of StagehandTool with these parameters:

from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel

stagehand_tool = StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
    dom_settle_timeout_ms=5000,  # Wait longer for DOM to settle
    headless=True,  # Run browser in headless mode
    self_heal=True,  # Attempt to recover from errors
    wait_for_captcha_solves=True,  # Wait for CAPTCHA solving
    verbose=1,  # Control logging verbosity (0-3)
)
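
To keep these options in one place and catch obviously invalid values early, they can be grouped in a small dataclass. This is a sketch under stated assumptions: StagehandOptions is a hypothetical class for this example, its defaults simply copy the values from the snippet above (they are not claimed to be the library's defaults), and the 0 to 3 range for verbose comes from the comment in that snippet.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StagehandOptions:
    """Hypothetical container for the optional settings shown above."""

    dom_settle_timeout_ms: int = 5000
    headless: bool = True
    self_heal: bool = True
    wait_for_captcha_solves: bool = True
    verbose: int = 1

    def __post_init__(self):
        # Reject values that cannot be meaningful before any browser starts.
        if self.dom_settle_timeout_ms <= 0:
            raise ValueError("dom_settle_timeout_ms must be positive")
        if not 0 <= self.verbose <= 3:
            raise ValueError("verbose must be between 0 and 3")

    def as_kwargs(self) -> dict:
        """Return the options as a dict suitable for ** unpacking."""
        return asdict(self)
```

The options could then be combined with the credentials, for example StagehandTool(api_key=..., project_id=..., model_api_key=..., **StagehandOptions(dom_settle_timeout_ms=10000).as_kwargs()).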

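Best practices

Be specific: detailed instructions produce better results
Choose the appropriate command type: pick the one that matches your task
Use selectors: CSS selectors improve accuracy
Break down complex tasks: split complex workflows into multiple tool calls
Implement error handling: add handling for potential failures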
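
The last two practices can be combined: split a workflow into small steps and handle the failure of each step separately. The sketch below assumes only that the tool object has a run method accepting keyword arguments, as in the examples on this page. run_steps and the shape of its result records are hypothetical choices made for this illustration.

```python
from typing import Any, Dict, List, Protocol


class RunnableTool(Protocol):
    """Anything with a run(**kwargs) method, such as the tool used on this page."""

    def run(self, **kwargs: Any) -> Any: ...


def run_steps(tool: RunnableTool, steps: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Run each step in order and stop at the first failure.

    Each step is a dict of keyword arguments for tool.run().
    Returns one record per attempted step.
    """
    results: List[Dict[str, Any]] = []
    for index, step in enumerate(steps):
        try:
            output = tool.run(**step)
        except Exception as exc:  # broad on purpose: the error types are not known here
            results.append({"step": index, "ok": False, "error": str(exc)})
            break  # later steps usually depend on earlier ones
        results.append({"step": index, "ok": True, "output": output})
    return results
```

Stopping at the first failure is a design choice for workflows where each step depends on the previous one; independent steps could instead continue past an error.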

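Troubleshooting

Common issues and solutions

Session issues: verify the API keys for Browserbase and your LLM provider
Element not found: increase dom_settle_timeout_ms for slower pages
Action fails: use observe first to identify the correct element
Incomplete data: refine the instruction or provide a specific selector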
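
The "use observe first" advice can be expressed as a two-call sequence. The sketch below is an illustration only: observe_then_act is a hypothetical helper, and it assumes that an empty or otherwise falsy observe result means nothing was found, which this page does not confirm. Check the actual return value of run in your environment before relying on that test.

```python
def observe_then_act(tool, url, target_description, action_instruction):
    """Observe the target first and act only if something was reported.

    Assumption: a falsy observe result means no matching element was found.
    """
    observed = tool.run(
        instruction=f"Find {target_description}",
        url=url,
        command_type="observe",
    )
    if not observed:
        # Skip the action rather than letting it fail on a missing element.
        return {"observed": observed, "acted": False, "result": None}
    result = tool.run(
        instruction=action_instruction,
        url=url,
        command_type="act",
    )
    return {"observed": observed, "acted": True, "result": result}
```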

Additional resources

For questions about the CrewAI integration:

Join Stagehand's Slack community
Open an issue in the Stagehand repository
Visit the Stagehand documentation
