Overview
LLM hooks execute at two critical points:
- Before the LLM call: modify messages, validate inputs, or block execution
- After the LLM call: transform responses, sanitize outputs, or modify conversation history
Hook Types
Before LLM Call Hooks
Executed before every LLM call, these hooks can:
- Inspect and modify the messages sent to the LLM
- Block LLM execution based on conditions
- Implement rate limiting or approval gates
- Add context or system messages
- Log request details
def before_hook(context: LLMCallHookContext) -> bool | None:
    # Return False to block execution
    # Return True or None to allow execution
    ...
After LLM Call Hooks
Executed after every LLM call, these hooks can:
- Modify or sanitize LLM responses
- Add metadata or formatting
- Log response details
- Update conversation history
- Implement content filtering
def after_hook(context: LLMCallHookContext) -> str | None:
    # Return modified response string
    # Return None to keep original response
    ...
LLM Hook Context
The LLMCallHookContext object provides comprehensive access to execution state:
class LLMCallHookContext:
    executor: CrewAgentExecutor  # Full executor reference
    messages: list               # Mutable message list
    agent: Agent                 # Current agent
    task: Task                   # Current task
    crew: Crew                   # Crew instance
    llm: BaseLLM                 # LLM instance
    iterations: int              # Current iteration count
    response: str | None         # LLM response (after hooks only)
Modifying Messages
Important: Always modify messages in-place:
# ✅ Correct - modify in-place
def add_context(context: LLMCallHookContext) -> None:
    context.messages.append({"role": "system", "content": "Be concise"})

# ❌ Wrong - replaces list reference
def wrong_approach(context: LLMCallHookContext) -> None:
    context.messages = [{"role": "system", "content": "Be concise"}]
Registration Methods
1. Global Hook Registration
Register hooks that apply to all LLM calls across all crews:
from crewai.hooks import register_before_llm_call_hook, register_after_llm_call_hook

def log_llm_call(context):
    print(f"LLM call by {context.agent.role} at iteration {context.iterations}")
    return None  # Allow execution

register_before_llm_call_hook(log_llm_call)
2. Decorator-Based Registration
Use decorators for cleaner syntax:
from crewai.hooks import before_llm_call, after_llm_call

@before_llm_call
def validate_iteration_count(context):
    if context.iterations > 10:
        print("⚠️ Exceeded maximum iterations")
        return False  # Block execution
    return None

@after_llm_call
def sanitize_response(context):
    if context.response and "API_KEY" in context.response:
        return context.response.replace("API_KEY", "[REDACTED]")
    return None
3. Crew-Scoped Hooks
Register hooks for a specific crew instance:
from crewai import Crew, Process
from crewai.hooks import after_llm_call_crew, before_llm_call_crew
from crewai.project import CrewBase, crew

@CrewBase
class MyProjCrew:
    @before_llm_call_crew
    def validate_inputs(self, context):
        # Only applies to this crew
        if context.iterations == 0:
            print(f"Starting task: {context.task.description}")
        return None

    @after_llm_call_crew
    def log_responses(self, context):
        # Crew-specific response logging
        print(f"Response length: {len(context.response)}")
        return None

    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,
            verbose=True
        )
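For reference, a minimal usage sketch; the bare kickoff() call is an assumption, and your tasks may require interpolated inputs:

# Hypothetical usage: the crew-scoped hooks above fire only for this crew's
# LLM calls, while globally registered hooks apply to every crew.
result = MyProjCrew().crew().kickoff()
print(result)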
Common Use Cases
1. Iteration Limiting
@before_llm_call
def limit_iterations(context: LLMCallHookContext) -> bool | None:
    max_iterations = 15
    if context.iterations > max_iterations:
        print(f"⛔ Blocked: Exceeded {max_iterations} iterations")
        return False  # Block execution
    return None
2. Human Approval Gate
@before_llm_call
def require_approval(context: LLMCallHookContext) -> bool | None:
    if context.iterations > 5:
        response = context.request_human_input(
            prompt=f"Iteration {context.iterations}: Approve LLM call?",
            default_message="Press Enter to approve, or type 'no' to block:"
        )
        if response.lower() == "no":
            print("🚫 LLM call blocked by user")
            return False
    return None
3. Adding System Context
@before_llm_call
def add_guardrails(context: LLMCallHookContext) -> None:
    # Add safety guidelines to every LLM call
    context.messages.append({
        "role": "system",
        "content": "Ensure responses are factual and cite sources when possible."
    })
    return None
4. Response Sanitization
import re

@after_llm_call
def sanitize_sensitive_data(context: LLMCallHookContext) -> str | None:
    if not context.response:
        return None
    # Remove sensitive patterns
    sanitized = context.response
    sanitized = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN-REDACTED]', sanitized)
    sanitized = re.sub(r'\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b', '[CARD-REDACTED]', sanitized)
    return sanitized
5. Cost Tracking
import tiktoken

@before_llm_call
def track_token_usage(context: LLMCallHookContext) -> None:
    encoding = tiktoken.get_encoding("cl100k_base")
    total_tokens = sum(
        len(encoding.encode(msg.get("content", "")))
        for msg in context.messages
    )
    print(f"📊 Input tokens: ~{total_tokens}")
    return None

@after_llm_call
def track_response_tokens(context: LLMCallHookContext) -> None:
    if context.response:
        encoding = tiktoken.get_encoding("cl100k_base")
        tokens = len(encoding.encode(context.response))
        print(f"📊 Response tokens: ~{tokens}")
    return None
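If you want a running total across calls instead of per-call prints, the same two hooks can feed a shared counter. A minimal sketch, assuming text-only message contents; the _token_totals counter is a hypothetical helper, not part of the CrewAI API:

import tiktoken
from crewai.hooks import before_llm_call, after_llm_call

_token_totals = {"input": 0, "output": 0}  # hypothetical module-level counter

@before_llm_call
def accumulate_input_tokens(context) -> None:
    encoding = tiktoken.get_encoding("cl100k_base")
    _token_totals["input"] += sum(
        len(encoding.encode(msg.get("content") or ""))  # skip messages without text
        for msg in context.messages
    )
    return None

@after_llm_call
def accumulate_output_tokens(context) -> None:
    if context.response:
        encoding = tiktoken.get_encoding("cl100k_base")
        _token_totals["output"] += len(encoding.encode(context.response))
    print(f"📊 Running totals: {_token_totals}")
    return None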
6. Debug Logging
@before_llm_call
def debug_request(context: LLMCallHookContext) -> None:
    print(f"""
🔍 LLM Call Debug:
- Agent: {context.agent.role}
- Task: {context.task.description[:50]}...
- Iteration: {context.iterations}
- Message Count: {len(context.messages)}
- Last Message: {context.messages[-1] if context.messages else 'None'}
""")
    return None

@after_llm_call
def debug_response(context: LLMCallHookContext) -> None:
    if context.response:
        print(f"✅ Response Preview: {context.response[:100]}...")
    return None
Hook Management
Unregistering Hooks
from crewai.hooks import (
    register_before_llm_call_hook,
    unregister_before_llm_call_hook,
    unregister_after_llm_call_hook
)

# Unregister a specific hook
def my_hook(context):
    ...

register_before_llm_call_hook(my_hook)

# Later...
unregister_before_llm_call_hook(my_hook)  # Returns True if found
Clearing Hooks
from crewai.hooks import (
    clear_before_llm_call_hooks,
    clear_after_llm_call_hooks,
    clear_all_llm_call_hooks
)

# Clear a specific hook type
count = clear_before_llm_call_hooks()
print(f"Cleared {count} before hooks")

# Clear all LLM hooks
before_count, after_count = clear_all_llm_call_hooks()
print(f"Cleared {before_count} before and {after_count} after hooks")
Listing Registered Hooks
from crewai.hooks import (
    get_before_llm_call_hooks,
    get_after_llm_call_hooks
)

# Get current hooks
before_hooks = get_before_llm_call_hooks()
after_hooks = get_after_llm_call_hooks()
print(f"Registered: {len(before_hooks)} before, {len(after_hooks)} after")
Advanced Patterns
Conditional Hook Execution
@before_llm_call
def conditional_blocking(context: LLMCallHookContext) -> bool | None:
    # Only block for specific agents
    if context.agent.role == "researcher" and context.iterations > 10:
        return False
    # Only block for specific tasks
    if "sensitive" in context.task.description.lower() and context.iterations > 5:
        return False
    return None
Context-Aware Modification
@before_llm_call
def adaptive_prompting(context: LLMCallHookContext) -> None:
    # Add different context based on iteration
    if context.iterations == 0:
        context.messages.append({
            "role": "system",
            "content": "Start with a high-level overview."
        })
    elif context.iterations > 3:
        context.messages.append({
            "role": "system",
            "content": "Focus on specific details and provide examples."
        })
    return None
Chained Hooks
# Multiple hooks execute in registration order
@before_llm_call
def first_hook(context):
    print("1. First hook executed")
    return None

@before_llm_call
def second_hook(context):
    print("2. Second hook executed")
    return None

@before_llm_call
def blocking_hook(context):
    if context.iterations > 10:
        print("3. Blocking hook - execution stopped")
        return False  # Subsequent hooks won't execute
    print("3. Blocking hook - execution allowed")
    return None
Best Practices
- Keep hooks focused: each hook should have a single responsibility
- Avoid heavy computation: hooks run on every LLM call
- Handle errors gracefully: use try-except so a failing hook doesn't break execution
- Use type hints: leverage LLMCallHookContext for better IDE support
- Document hook behavior: especially blocking conditions
- Test hooks in isolation: unit test hooks before using them in production
- Clear hooks in tests: call clear_all_llm_call_hooks() between test runs (see the sketch after this list)
- Modify in-place: always mutate context.messages in place; never replace it
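A minimal pytest sketch of the test-cleanup practice, using the registration and listing helpers shown earlier; the fixture and test names are illustrative, and the membership assertion assumes get_before_llm_call_hooks() returns the registered functions:

import pytest
from crewai.hooks import (
    clear_all_llm_call_hooks,
    get_before_llm_call_hooks,
    register_before_llm_call_hook
)

@pytest.fixture(autouse=True)
def reset_llm_hooks():
    # Start each test with a clean hook registry...
    clear_all_llm_call_hooks()
    yield
    # ...and drop whatever the test registered.
    clear_all_llm_call_hooks()

def test_hook_is_registered():
    def my_hook(context):
        return None

    register_before_llm_call_hook(my_hook)
    assert my_hook in get_before_llm_call_hooks()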
Error Handling
@before_llm_call
def safe_hook(context: LLMCallHookContext) -> bool | None:
    try:
        # Your hook logic
        if some_condition:  # placeholder for your own check
            return False
    except Exception as e:
        print(f"⚠️ Hook error: {e}")
        # Decide: allow or block on error
    return None  # Allow execution despite error
Type Safety
from crewai.hooks import (
    LLMCallHookContext,
    BeforeLLMCallHookType,
    AfterLLMCallHookType,
    register_before_llm_call_hook,
    register_after_llm_call_hook
)

# Explicit type annotations
def my_before_hook(context: LLMCallHookContext) -> bool | None:
    return None

def my_after_hook(context: LLMCallHookContext) -> str | None:
    return None

# Type-safe registration
register_before_llm_call_hook(my_before_hook)
register_after_llm_call_hook(my_after_hook)
Troubleshooting
Hooks Not Executing
- Verify hooks are registered before crew execution (see the check below)
- Check whether a previous hook returned False (which blocks subsequent hooks)
- Ensure the hook signature matches the expected type
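A quick registration check using the listing helpers from Hook Management; the __name__ access assumes your hooks are plain functions:

from crewai.hooks import get_before_llm_call_hooks, get_after_llm_call_hooks

# Run this right before crew.kickoff() to confirm your hooks made it into the registry
print("before hooks:", [hook.__name__ for hook in get_before_llm_call_hooks()])
print("after hooks:", [hook.__name__ for hook in get_after_llm_call_hooks()])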
Message Modifications Not Persisting
- Use in-place modification: context.messages.append(...)
- Don't replace the list: context.messages = [...]
Response Modifications Not Applied
- Return the modified string from the after hook
- Returning None keeps the original response
