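Overview

The StagehandTool integrates the Stagehand framework with CrewAI, enabling agents to interact with websites and automate browser tasks using natural language instructions.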

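Stagehand is a browser automation framework built by Browserbase that lets AI agents:

Navigate to websites
Click buttons, links, and other elements
Fill in forms
Extract data from web pages
Observe and identify elements
Perform complex workflows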

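The StagehandTool wraps the Stagehand Python SDK and gives CrewAI agents browser control through three core primitives:

Act: perform actions such as clicking, typing, or navigating
Extract: extract structured data from web pages
Observe: identify and analyze elements on the page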

Prerequisites

Before using this tool, make sure you have:

A Browserbase account with an API key and a project ID
An API key for an LLM (OpenAI or Anthropic Claude)
The Stagehand Python SDK installed

Install the required dependency:

pip install stagehand-py
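
The three credentials listed above are usually better kept out of source code. The following is a minimal sketch of reading them from environment variables and failing early when one is missing. The helper name `load_stagehand_credentials` and the variable names `BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`, and `MODEL_API_KEY` are assumptions made for this example, not something the tool requires; only the returned keys (`api_key`, `project_id`, `model_api_key`) come from the constructor calls shown on this page.

```python
import os


def load_stagehand_credentials(env=None):
    """Collect the three secrets used by StagehandTool from environment variables.

    The environment variable names are assumptions for this sketch;
    substitute whatever names your deployment defines.
    """
    env = os.environ if env is None else env
    names = {
        "api_key": "BROWSERBASE_API_KEY",
        "project_id": "BROWSERBASE_PROJECT_ID",
        "model_api_key": "MODEL_API_KEY",
    }
    # Report every missing variable at once instead of failing on the first.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {param: env[var] for param, var in names.items()}
```

The resulting dictionary can then be unpacked into the constructor, for example `StagehandTool(**load_stagehand_credentials())`.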

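Usage

Basic implementation

The StagehandTool can be used in two ways:

1. Using a context manager (recommended)

The context manager approach is recommended because it ensures resources are cleaned up properly even if an exception is raised.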

from crewai import Agent, Task, Crew
from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel

# Initialize the tool with your API keys using a context manager
with StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",  # OpenAI or Anthropic API key
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,  # Optional: specify which model to use
) as stagehand_tool:
    # Create an agent with the tool
    researcher = Agent(
        role="Web Researcher",
        goal="Find and summarize information from websites",
        backstory="I'm an expert at finding information online.",
        verbose=True,
        tools=[stagehand_tool],
    )

    # Create a task that uses the tool
    research_task = Task(
        description="Go to https://www.example.com and tell me what you see on the homepage.",
        agent=researcher,
    )

    # Run the crew
    crew = Crew(
        agents=[researcher],
        tasks=[research_task],
        verbose=True,
    )

    result = crew.kickoff()
    print(result)

2. Manual resource management

from crewai import Agent, Task, Crew
from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel

# Initialize the tool with your API keys
stagehand_tool = StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
)

try:
    # Create an agent with the tool
    researcher = Agent(
        role="Web Researcher",
        goal="Find and summarize information from websites",
        backstory="I'm an expert at finding information online.",
        verbose=True,
        tools=[stagehand_tool],
    )

    # Create a task that uses the tool
    research_task = Task(
        description="Go to https://www.example.com and tell me what you see on the homepage.",
        agent=researcher,
    )

    # Run the crew
    crew = Crew(
        agents=[researcher],
        tasks=[research_task],
        verbose=True,
    )

    result = crew.kickoff()
    print(result)
finally:
    # Explicitly clean up resources
    stagehand_tool.close()
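
The try/finally pattern above can also be expressed with the standard library's `contextlib.closing`, which calls `close()` on exit whether or not the body raises. The sketch below demonstrates this with a hypothetical stand-in class, `FakeTool`, so that it runs without a browser session; `run_with_cleanup` is likewise a name invented for this example.

```python
from contextlib import closing


class FakeTool:
    """Hypothetical stand-in for StagehandTool; only close() matters here."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def run_with_cleanup(tool, work):
    """Run work(tool) and call tool.close() afterwards, even if work raises."""
    with closing(tool):
        return work(tool)
```

With a real tool, `work` would be a function that builds the agent, task, and crew and returns the result of `crew.kickoff()`.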

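Command types

The StagehandTool supports three command types, each suited to a particular kind of web automation task.

1. Act command

The act command type (the default) is used for page interactions such as clicking buttons, filling in forms, and navigating.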

# Perform an action (default behavior)
result = stagehand_tool.run(
    instruction="Click the login button", 
    url="https://example.com",
    command_type="act"  # Default, so can be omitted
)

# Fill out a form
result = stagehand_tool.run(
    instruction="Fill the contact form with name 'John Doe', email 'john@example.com', and message 'Hello world'", 
    url="https://example.com/contact"
)

2. Extract command

The extract command type is used to retrieve structured data from web pages.

# Extract all product information
result = stagehand_tool.run(
    instruction="Extract all product names, prices, and descriptions", 
    url="https://example.com/products",
    command_type="extract"
)

# Extract specific information with a selector
result = stagehand_tool.run(
    instruction="Extract the main article title and content", 
    url="https://example.com/blog/article",
    command_type="extract",
    selector=".article-container"  # Optional CSS selector
)

3. Observe command

The observe command type is used to identify and analyze elements on a web page.

# Find interactive elements
result = stagehand_tool.run(
    instruction="Find all interactive elements in the navigation menu", 
    url="https://example.com",
    command_type="observe"
)

# Identify form fields
result = stagehand_tool.run(
    instruction="Identify all the input fields in the registration form", 
    url="https://example.com/register",
    command_type="observe",
    selector="#registration-form"
)
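
Because the command type is passed as a plain string, a typo would only surface once the tool is called. As a sketch under that assumption, a small helper can validate the arguments before they reach `stagehand_tool.run`. The helper `build_run_kwargs` and the constant `VALID_COMMAND_TYPES` are hypothetical names for this example; the keyword names mirror the calls shown above.

```python
VALID_COMMAND_TYPES = ("act", "extract", "observe")


def build_run_kwargs(instruction, url, command_type="act", selector=None):
    """Return keyword arguments for a run() call, rejecting unknown command types."""
    if command_type not in VALID_COMMAND_TYPES:
        raise ValueError(
            f"Unknown command_type {command_type!r}; "
            f"expected one of {', '.join(VALID_COMMAND_TYPES)}"
        )
    kwargs = {"instruction": instruction, "url": url, "command_type": command_type}
    # The selector is optional, so it is only included when provided.
    if selector is not None:
        kwargs["selector"] = selector
    return kwargs
```

A call would then look like `stagehand_tool.run(**build_run_kwargs("Extract the main article title", "https://example.com/blog/article", "extract", ".article-container"))`.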

Configuration options

Use these parameters to customize the StagehandTool's behavior:

from crewai_tools import StagehandTool
from stagehand.schemas import AvailableModel

stagehand_tool = StagehandTool(
    api_key="your-browserbase-api-key",
    project_id="your-browserbase-project-id",
    model_api_key="your-llm-api-key",
    model_name=AvailableModel.CLAUDE_3_7_SONNET_LATEST,
    dom_settle_timeout_ms=5000,  # Wait longer for DOM to settle
    headless=True,  # Run browser in headless mode
    self_heal=True,  # Attempt to recover from errors
    wait_for_captcha_solves=True,  # Wait for CAPTCHA solving
    verbose=1,  # Control logging verbosity (0-3)
)

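Best practices

Be specific: provide detailed instructions for better results
Choose the right command type: pick the command type that matches your task
Use selectors: use CSS selectors to improve accuracy
Break down complex tasks: split complex workflows into multiple tool calls
Implement error handling: add error handling for potential problems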
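
The last two practices can be combined. The following sketch runs a workflow as a sequence of small steps and stops at the first failure, reporting which step failed. It assumes failures surface as exceptions and takes the run function as a parameter, so `stagehand_tool.run` can be passed in directly; `run_steps` is a hypothetical helper name, not part of the tool.

```python
def run_steps(run, steps):
    """Execute steps in order and stop at the first one that raises.

    run   -- a callable accepting keyword arguments, e.g. stagehand_tool.run
    steps -- a list of keyword-argument dictionaries, one per tool call

    Returns (results, error): the results collected so far and either None
    or a message naming the failed step (numbered from 1).
    """
    results = []
    for number, step in enumerate(steps, start=1):
        try:
            results.append(run(**step))
        except Exception as exc:
            return results, f"step {number} failed: {exc}"
    return results, None
```

Each step is an ordinary dictionary such as `{"instruction": "Click the login button", "url": "https://example.com"}`, which keeps the workflow easy to log and to replay from the failed step.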

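Troubleshooting

Common problems and their solutions:

Session issues: verify the API keys for both Browserbase and the LLM provider
Element not found: increase dom_settle_timeout_ms for slow-loading pages
Action failures: use observe first to identify the correct element
Incomplete data: refine the instruction or provide a specific selector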
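
The advice to observe before acting can be sketched as a two-call helper. This is an illustration only: `observe_then_act` is a hypothetical name, the run function is passed in (for example `stagehand_tool.run`), and the keyword names follow the calls shown earlier on this page.

```python
def observe_then_act(run, url, find_instruction, act_instruction, selector=None):
    """Run an observe call to locate elements, then the act call on the same page.

    Returns a dictionary holding both results so the observation can be
    inspected if the action does not behave as expected.
    """
    observe_kwargs = {
        "instruction": find_instruction,
        "url": url,
        "command_type": "observe",
    }
    # Narrow the observation to a region of the page when a selector is given.
    if selector is not None:
        observe_kwargs["selector"] = selector
    observed = run(**observe_kwargs)
    acted = run(instruction=act_instruction, url=url, command_type="act")
    return {"observed": observed, "acted": acted}
```

If the action still fails, the `observed` value shows what the tool found on the page, which helps when rewording the instruction.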

Additional resources

For questions about the CrewAI integration:

Join Stagehand's Slack community
Open an issue in the Stagehand repository
Visit the Stagehand documentation
