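Overview

The CrewAI framework provides a sophisticated memory system designed to significantly enhance the capabilities of AI agents. CrewAI offers two distinct memory approaches that serve different use cases:
  1. Basic Memory System - built-in short-term, long-term, and entity memory.
  2. External Memory - standalone external memory providers.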

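Memory System Components

| Component | Description |
| Short-Term Memory | Temporarily stores recent interactions and outcomes using RAG, so agents can recall and use information relevant to their current context during the current execution. |
| Long-Term Memory | Preserves valuable insights and learnings from past executions, so agents can build and refine their knowledge over time. |
| Entity Memory | Captures and organizes information about entities (people, places, concepts) encountered during tasks, enabling deeper understanding and relationship mapping. Uses RAG to store entity information. |
| Contextual Memory | Maintains the context of interactions by combining Short-Term Memory, Long-Term Memory, External Memory, and Entity Memory, which helps keep agent responses coherent and relevant across a sequence of tasks or a conversation. |

1. Basic Memory System

The simplest and most commonly used approach. Enable memory for your crew with a single parameter.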

Quick Start

from crewai import Crew, Agent, Task, Process

# Enable basic memory system
crew = Crew(
    agents=[...],
    tasks=[...],
    process=Process.sequential,
    memory=True,  # Enables short-term, long-term, and entity memory
    verbose=True
)

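How It Works

  • Short-Term Memory: uses ChromaDB with RAG for the current context.
  • Long-Term Memory: uses SQLite3 to store task results across sessions.
  • Entity Memory: uses RAG to track entities (people, places, concepts).
  • Storage location: platform-specific, determined through the appdirs package.
  • Custom storage directory: set the CREWAI_STORAGE_DIR environment variable.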
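As a rough illustration of the long-term memory bullet above, the sketch below persists task results in a SQLite file so that a later run can read them back. The table name, columns, and helper functions are hypothetical and chosen only for this example; they are not CrewAI's actual schema or API.

```python
import json
import sqlite3
import time


def open_store(db_path):
    """Open (or create) a SQLite file for task results (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_results ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "task TEXT NOT NULL, "
        "metadata TEXT NOT NULL, "
        "saved_at REAL NOT NULL)"
    )
    return conn


def save_result(conn, task, metadata):
    """Store one result; the `with` block commits the transaction."""
    with conn:
        conn.execute(
            "INSERT INTO task_results (task, metadata, saved_at) VALUES (?, ?, ?)",
            (task, json.dumps(metadata), time.time()),
        )


def load_results(conn, task, limit=3):
    """Return the most recent results for a task, newest first."""
    rows = conn.execute(
        "SELECT metadata FROM task_results WHERE task = ? ORDER BY id DESC LIMIT ?",
        (task, limit),
    ).fetchall()
    return [json.loads(row[0]) for row in rows]
```

Because the data lives in a file rather than in process memory, closing the connection and reopening the same path in a later run returns the rows saved earlier, which is the property the long-term memory bullet describes.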

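Storage Location Transparency

Understanding storage locations: CrewAI stores memory and knowledge files in platform-specific directories that follow operating system conventions. Knowing these locations helps with production deployment, backups, and debugging.

Where CrewAI Stores Files

By default, CrewAI uses the appdirs library to determine storage locations according to platform conventions. Your files are stored at the following locations:

Default Storage Locations by Platform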

macOS
~/Library/Application Support/CrewAI/{project_name}/
├── knowledge/           # Knowledge base ChromaDB files
├── short_term_memory/   # Short-term memory ChromaDB files
├── long_term_memory/    # Long-term memory ChromaDB files
├── entities/            # Entity memory ChromaDB files
└── long_term_memory_storage.db  # SQLite database
Linux
~/.local/share/CrewAI/{project_name}/
├── knowledge/
├── short_term_memory/
├── long_term_memory/
├── entities/
└── long_term_memory_storage.db
Windows
C:\Users\{username}\AppData\Local\CrewAI\{project_name}\
├── knowledge\
├── short_term_memory\
├── long_term_memory\
├── entities\
└── long_term_memory_storage.db
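The listings above can be summarized as a small path-building function. This is only a sketch that reproduces the documented layout with the standard library; `sketch_storage_path` and its parameters are hypothetical, it assumes that a set CREWAI_STORAGE_DIR simply replaces the default location, and it ignores details that appdirs handles (such as XDG_DATA_HOME on Linux). To get the real path, call `db_storage_path()`.

```python
from pathlib import PurePosixPath, PureWindowsPath


def sketch_storage_path(platform, home, project_name, storage_dir=None):
    """Build the documented default storage directory for one platform.

    platform: a sys.platform value such as "darwin", "linux", or "win32".
    storage_dir: the value of CREWAI_STORAGE_DIR, if set (assumed to win).
    """
    if storage_dir:
        return storage_dir
    if platform == "darwin":
        base = PurePosixPath(home) / "Library" / "Application Support"
    elif platform == "win32":
        base = PureWindowsPath(home) / "AppData" / "Local"
    else:
        # Treat every other platform like Linux in this sketch
        base = PurePosixPath(home) / ".local" / "share"
    return str(base / "CrewAI" / project_name)
```

For example, `sketch_storage_path("linux", "/home/ana", "demo")` yields `/home/ana/.local/share/CrewAI/demo`, matching the Linux listing.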

Finding Your Storage Location

To see exactly where CrewAI stores files on your system:
from crewai.utilities.paths import db_storage_path
import os

# Get the base storage path
storage_path = db_storage_path()
print(f"CrewAI storage location: {storage_path}")

# List all CrewAI storage directories
if os.path.exists(storage_path):
    print("\nStored files and directories:")
    for item in os.listdir(storage_path):
        item_path = os.path.join(storage_path, item)
        if os.path.isdir(item_path):
            print(f"📁 {item}/")
            # Show ChromaDB collections
            if os.path.exists(item_path):
                for subitem in os.listdir(item_path):
                    print(f"   └── {subitem}")
        else:
            print(f"📄 {item}")
else:
    print("No CrewAI storage directory found yet.")

Controlling Storage Locations

Option 1: Environment variable

import os
from crewai import Crew

# Set custom storage location
os.environ["CREWAI_STORAGE_DIR"] = "./my_project_storage"

# All memory and knowledge will now be stored in ./my_project_storage/
crew = Crew(
    agents=[...],
    tasks=[...],
    memory=True
)

Option 2: Custom storage paths

import os
from crewai import Crew
from crewai.memory import LongTermMemory
from crewai.memory.storage.ltm_sqlite_storage import LTMSQLiteStorage

# Configure custom storage location
custom_storage_path = "./storage"
os.makedirs(custom_storage_path, exist_ok=True)

crew = Crew(
    memory=True,
    long_term_memory=LongTermMemory(
        storage=LTMSQLiteStorage(
            db_path=f"{custom_storage_path}/memory.db"
        )
    )
)

Option 3: Project-specific storage

import os
from pathlib import Path

# Store in project directory
project_root = Path(__file__).parent
storage_dir = project_root / "crewai_storage"

os.environ["CREWAI_STORAGE_DIR"] = str(storage_dir)

# Now all storage will be in your project directory

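Embedding Provider Defaults

Default embedding provider: for consistency and reliability, CrewAI uses OpenAI embeddings by default. You can customize this to match your LLM provider or to use local embeddings.

Understanding the Default Behavior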

# When using Claude as your LLM...
from crewai import Agent, LLM

agent = Agent(
    role="Analyst",
    goal="Analyze data",
    backstory="Expert analyst",
    llm=LLM(provider="anthropic", model="claude-3-sonnet")  # Using Claude
)

# CrewAI will use OpenAI embeddings by default for consistency
# You can easily customize this to match your preferred provider

Customizing the Embedding Provider

from crewai import Crew

# Option 1: Match your LLM provider
crew = Crew(
    agents=[agent],
    tasks=[task],
    memory=True,
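    embedder={
        "provider": "anthropic", # Match your LLM provider
        "config": {
            "api_key": "your-anthropic-key",
            # Placeholder: use an embedding model that your chosen provider
            # actually serves ("text-embedding-3-small" is an OpenAI model name)
            "model": "text-embedding-3-small"
        }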
    }
)

# Option 2: Use local embeddings (no external API calls)
crew = Crew(
    agents=[agent],
    tasks=[task],
    memory=True,
    embedder={
        "provider": "ollama",
        "config": {"model": "mxbai-embed-large"}
    }
)

Debugging Storage Issues

Check storage permissions

import os
from crewai.utilities.paths import db_storage_path

storage_path = db_storage_path()
print(f"Storage path: {storage_path}")
print(f"Path exists: {os.path.exists(storage_path)}")
print(f"Is writable: {os.access(storage_path, os.W_OK) if os.path.exists(storage_path) else 'Path does not exist'}")

# Create with proper permissions
if not os.path.exists(storage_path):
    os.makedirs(storage_path, mode=0o755, exist_ok=True)
    print(f"Created storage directory: {storage_path}")

Inspect ChromaDB collections

import os
import chromadb
from crewai.utilities.paths import db_storage_path

# Connect to CrewAI's ChromaDB
storage_path = db_storage_path()
chroma_path = os.path.join(storage_path, "knowledge")

if os.path.exists(chroma_path):
    client = chromadb.PersistentClient(path=chroma_path)
    collections = client.list_collections()

    print("ChromaDB Collections:")
    for collection in collections:
        print(f"  - {collection.name}: {collection.count()} documents")
else:
    print("No ChromaDB storage found")

Resetting Storage (for Debugging)

from crewai import Crew

# Create a crew with memory enabled
crew = Crew(agents=[...], tasks=[...], memory=True)

# Reset specific memory types
crew.reset_memories(command_type='short')     # Short-term memory
crew.reset_memories(command_type='long')      # Long-term memory
crew.reset_memories(command_type='entity')    # Entity memory
crew.reset_memories(command_type='knowledge') # Knowledge storage

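Production Best Practices

  1. Set CREWAI_STORAGE_DIR to a known location in production for better control.
  2. Choose an explicit embedding provider that matches your LLM setup.
  3. Monitor storage directory size in large-scale deployments.
  4. Include the storage directories in your backup strategy.
  5. Set appropriate file permissions (0o755 for directories, 0o644 for files).
  6. Use project-relative paths for containerized deployments.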
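For the size-monitoring item above, a minimal standard-library sketch could look like the following. The function names and the byte budget are hypothetical; in practice you would pass the path returned by `db_storage_path()` or the value of CREWAI_STORAGE_DIR.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files below root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def storage_over_budget(root, max_bytes):
    """Return True when the directory has grown past the given budget."""
    return directory_size_bytes(root) > max_bytes
```

A scheduled job could call `storage_over_budget(path, max_bytes=...)` and raise an alert or trigger a cleanup when it returns True.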

Common Storage Issues

"ChromaDB permission denied" errors
# Fix permissions
chmod -R 755 ~/.local/share/CrewAI/
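"Database is locked" errors
# Ensure only one CrewAI instance accesses storage
# Note: fcntl is available on Unix-like systems only
import fcntl
import os

from crewai.utilities.paths import db_storage_path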

storage_path = db_storage_path()
lock_file = os.path.join(storage_path, ".crewai.lock")

with open(lock_file, 'w') as f:
    fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    # Your CrewAI code here
Storage does not persist between runs
# Verify storage location is consistent
import os
from crewai.utilities.paths import db_storage_path
print("CREWAI_STORAGE_DIR:", os.getenv("CREWAI_STORAGE_DIR"))
print("Current working directory:", os.getcwd())
print("Computed storage path:", db_storage_path())

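Custom Embedder Configuration

CrewAI supports multiple embedding providers, so you can pick the option that best fits your use case. The following is a guide to configuring different embedding providers for your memory system.

Why Choose a Different Embedding Provider?

  • Cost optimization: local embeddings (Ollama) are free after the initial setup.
  • Privacy: keep your data local with Ollama, or use your preferred cloud provider.
  • Performance: some models perform better for specific domains or languages.
  • Consistency: match your embedding provider to your LLM provider.
  • Compliance: meet specific regulatory or organizational requirements.

OpenAI Embeddings (Default)

OpenAI provides reliable, high-quality embeddings that work well for most use cases.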
from crewai import Crew

# Basic OpenAI configuration (uses environment OPENAI_API_KEY)
crew = Crew(
    agents=[...],
    tasks=[...],
    memory=True,
    embedder={
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"  # or "text-embedding-3-large"
        }
    }
)

# Advanced OpenAI configuration
crew = Crew(
    memory=True,
    embedder={
        "provider": "openai",
        "config": {
            "api_key": "your-openai-api-key",  # Optional: override env var
            "model": "text-embedding-3-large",
            "dimensions": 1536,  # Optional: reduce dimensions for smaller storage
            "organization_id": "your-org-id"  # Optional: for organization accounts
        }
    }
)

Azure OpenAI Embeddings

For enterprise users with Azure OpenAI deployments.
crew = Crew(
    memory=True,
    embedder={
        "provider": "openai",  # Use openai provider for Azure
        "config": {
            "api_key": "your-azure-api-key",
            "api_base": "https://your-resource.openai.azure.com/",
            "api_type": "azure",
            "api_version": "2023-05-15",
            "model": "text-embedding-3-small",
            "deployment_id": "your-deployment-name"  # Azure deployment name
        }
    }
)

Google AI Embeddings

Use Google's text embedding models to integrate with Google Cloud services.
crew = Crew(
    memory=True,
    embedder={
        "provider": "google",
        "config": {
            "api_key": "your-google-api-key",
            "model": "text-embedding-004"  # or "text-embedding-preview-0409"
        }
    }
)

Vertex AI Embeddings

For Google Cloud users with Vertex AI access.
crew = Crew(
    memory=True,
    embedder={
        "provider": "vertexai",
        "config": {
            "project_id": "your-gcp-project-id",
            "region": "us-central1",  # or your preferred region
            "api_key": "your-service-account-key",
            "model_name": "textembedding-gecko"
        }
    }
)

Ollama Embeddings (Local)

Run embeddings locally for privacy and cost savings.
# First, install and run Ollama locally, then pull an embedding model:
# ollama pull mxbai-embed-large

crew = Crew(
    memory=True,
    embedder={
        "provider": "ollama",
        "config": {
            "model": "mxbai-embed-large",  # or "nomic-embed-text"
            "url": "http://localhost:11434/api/embeddings"  # Default Ollama URL
        }
    }
)

# For custom Ollama installations
crew = Crew(
    memory=True,
    embedder={
        "provider": "ollama",
        "config": {
            "model": "mxbai-embed-large",
            "url": "http://your-ollama-server:11434/api/embeddings"
        }
    }
)

Cohere Embeddings

Use Cohere's embedding models for multilingual support.
crew = Crew(
    memory=True,
    embedder={
        "provider": "cohere",
        "config": {
            "api_key": "your-cohere-api-key",
            "model": "embed-english-v3.0"  # or "embed-multilingual-v3.0"
        }
    }
)

VoyageAI Embeddings

High-performance embeddings optimized for retrieval tasks.
crew = Crew(
    memory=True,
    embedder={
        "provider": "voyageai",
        "config": {
            "api_key": "your-voyage-api-key",
            "model": "voyage-large-2",  # or "voyage-code-2" for code
            "input_type": "document"  # or "query"
        }
    }
)

AWS Bedrock Embeddings

For AWS users with Bedrock access.
crew = Crew(
    memory=True,
    embedder={
        "provider": "bedrock",
        "config": {
            "aws_access_key_id": "your-access-key",
            "aws_secret_access_key": "your-secret-key",
            "region_name": "us-east-1",
            "model": "amazon.titan-embed-text-v1"
        }
    }
)

Hugging Face Embeddings

Use open-source models from Hugging Face.
crew = Crew(
    memory=True,
    embedder={
        "provider": "huggingface",
        "config": {
            "api_key": "your-hf-token",  # Optional for public models
            "model": "sentence-transformers/all-MiniLM-L6-v2",
            "api_url": "https://api-inference.huggingface.co"  # or your custom endpoint
        }
    }
)

IBM Watson Embeddings

For IBM Cloud users.
crew = Crew(
    memory=True,
    embedder={
        "provider": "watson",
        "config": {
            "api_key": "your-watson-api-key",
            "url": "your-watson-instance-url",
            "model": "ibm/slate-125m-english-rtrvr"
        }
    }
)

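Mem0 Provider

Short-term memory and entity memory both support tight integration with Mem0 OSS and the Mem0 Client. Here is how to use Mem0 as a provider.
from crewai import Crew
from crewai.memory.short_term.short_term_memory import ShortTermMemory
from crewai.memory.entity.entity_memory import EntityMemory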

mem0_oss_embedder_config = {
        "provider": "mem0",
        "config": {
            "user_id": "john",
            "local_mem0_config": {
                "vector_store": {"provider": "qdrant","config": {"host": "localhost", "port": 6333}},
                "llm": {"provider": "openai","config": {"api_key": "your-api-key", "model": "gpt-4"}},
                "embedder": {"provider": "openai","config": {"api_key": "your-api-key", "model": "text-embedding-3-small"}}
            },
            "infer": True # Optional defaults to True
        },
    }


# Example value for the optional custom categories used below
new_categories = [
    {"personal_information": "Basic information about the user including name, preferences, and personality traits"}
]

mem0_client_embedder_config = {
        "provider": "mem0",
        "config": {
            "user_id": "john",
            "org_id": "my_org_id",        # Optional
            "project_id": "my_project_id", # Optional
            "api_key": "custom-api-key",   # Optional - overrides env var
            "run_id": "my_run_id",        # Optional - for short-term memory
            "includes": "include1",       # Optional
            "excludes": "exclude1",       # Optional
            "infer": True,                # Optional defaults to True
            "custom_categories": new_categories  # Optional - custom categories for user memory
        },
    }


short_term_memory_mem0_oss = ShortTermMemory(embedder_config=mem0_oss_embedder_config) # Short Term Memory with Mem0 OSS
short_term_memory_mem0_client = ShortTermMemory(embedder_config=mem0_client_embedder_config) # Short Term Memory with Mem0 Client
entity_memory_mem0_oss = EntityMemory(embedder_config=mem0_oss_embedder_config) # Entity Memory with Mem0 OSS
entity_memory_mem0_client = EntityMemory(embedder_config=mem0_client_embedder_config) # Entity Memory with Mem0 Client

crew = Crew(
    memory=True,
    short_term_memory=short_term_memory_mem0_oss, # or short_term_memory_mem0_client
    entity_memory=entity_memory_mem0_oss # or entity_memory_mem0_client
)

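Choosing the Right Embedding Provider

When selecting an embedding provider, consider performance, privacy, cost, and integration needs.
The comparison below can help you decide:
| Provider | Best For | Pros | Cons |
| OpenAI | General use, high reliability | High quality, widely tested | Paid service, requires an API key |
| Ollama | Privacy-focused use, cost savings | Free, runs locally, fully private | Requires local installation and setup |
| Google AI | Integration with the Google ecosystem | Strong performance, well supported | Requires a Google account |
| Azure OpenAI | Enterprise and compliance needs | Enterprise features, strong security | More complex setup |
| Cohere | Multilingual content | Excellent language support | More niche use cases |
| VoyageAI | Information retrieval and search | Optimized for retrieval tasks | Relatively new provider |
| Mem0 | Per-user personalization | Search-optimized embeddings | Paid service, requires an API key |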

Environment Variable Configuration

For security, store API keys in environment variables.
import os
from crewai import Crew

# Set environment variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["GOOGLE_API_KEY"] = "your-google-key"
os.environ["COHERE_API_KEY"] = "your-cohere-key"

# Use without exposing keys in code
crew = Crew(
    memory=True,
    embedder={
        "provider": "openai",
        "config": {
            "model": "text-embedding-3-small"
            # API key automatically loaded from environment
        }
    }
)

Testing Different Embedding Providers

Compare embedding providers to find the one that best fits your specific use case.
from crewai import Crew
from crewai.utilities.paths import db_storage_path

# Test different providers with the same data
providers_to_test = [
    {
        "name": "OpenAI",
        "config": {
            "provider": "openai",
            "config": {"model": "text-embedding-3-small"}
        }
    },
    {
        "name": "Ollama",
        "config": {
            "provider": "ollama",
            "config": {"model": "mxbai-embed-large"}
        }
    }
]

for provider in providers_to_test:
    print(f"\nTesting {provider['name']} embeddings...")

    # Create crew with specific embedder
    crew = Crew(
        agents=[...],
        tasks=[...],
        memory=True,
        embedder=provider['config']
    )

    # Run your test and measure performance
    result = crew.kickoff()
    print(f"{provider['name']} completed successfully")

Troubleshooting Embedding Issues

Model not found errors
# Verify model availability
from crewai.rag.embeddings.configurator import EmbeddingConfigurator

configurator = EmbeddingConfigurator()
try:
    embedder = configurator.configure_embedder({
        "provider": "ollama",
        "config": {"model": "mxbai-embed-large"}
    })
    print("Embedder configured successfully")
except Exception as e:
    print(f"Configuration error: {e}")
API key issues
import os

# Check if API keys are set
required_keys = ["OPENAI_API_KEY", "GOOGLE_API_KEY", "COHERE_API_KEY"]
for key in required_keys:
    if os.getenv(key):
        print(f"✅ {key} is set")
    else:
        print(f"❌ {key} is not set")
Performance comparison
import time
from crewai import Crew

def test_embedding_performance(embedder_config, test_text="This is a test document"):
    start_time = time.time()

    crew = Crew(
        agents=[...],
        tasks=[...],
        memory=True,
        embedder=embedder_config
    )

    # Simulate memory operation
    crew.kickoff()

    end_time = time.time()
    return end_time - start_time

# Compare performance
openai_time = test_embedding_performance({
    "provider": "openai",
    "config": {"model": "text-embedding-3-small"}
})

ollama_time = test_embedding_performance({
    "provider": "ollama",
    "config": {"model": "mxbai-embed-large"}
})

print(f"OpenAI: {openai_time:.2f}s")
print(f"Ollama: {ollama_time:.2f}s")

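Batch Behavior of Entity Memory

Entity memory supports batching when several entities are saved at once. When you pass a list of EntityMemoryItem objects, the system:
  • Emits one MemorySaveStartedEvent that includes entity_count
  • Saves each entity internally, collecting any partial errors
  • Emits one MemorySaveCompletedEvent with aggregated metadata (number saved, errors)
  • Raises a partial-save exception (including the counts) if some entities failed to save
This improves performance and observability when many entities are written in a single operation.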
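The batch flow described above can be sketched in plain Python. Everything here is illustrative: `SaveStarted`, `SaveCompleted`, `PartialSaveError`, `save_batch`, and the `save_one` and `emit` callbacks are hypothetical stand-ins, not CrewAI classes, and the ordering (emit the completed event, then raise) is an assumption based on the order of the steps listed.

```python
from dataclasses import dataclass


@dataclass
class SaveStarted:
    entity_count: int


@dataclass
class SaveCompleted:
    saved_count: int
    errors: list


class PartialSaveError(Exception):
    """Raised when at least one item in a batch could not be saved."""

    def __init__(self, saved_count, errors):
        total = saved_count + len(errors)
        super().__init__(f"{len(errors)} of {total} entities failed to save")
        self.saved_count = saved_count
        self.errors = errors


def save_batch(items, save_one, emit):
    """Save every item, report one start and one completion event."""
    emit(SaveStarted(entity_count=len(items)))
    saved, errors = 0, []
    for item in items:
        try:
            save_one(item)
            saved += 1
        except Exception as exc:
            # Keep going so one bad item does not block the rest
            errors.append(f"{item!r}: {exc}")
    emit(SaveCompleted(saved_count=saved, errors=errors))
    if errors:
        raise PartialSaveError(saved, errors)
    return saved
```

Collecting errors instead of failing on the first one means a listener sees a single pair of events per batch, while the caller still learns about partial failures through the exception.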

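2. External Memory

External memory provides a standalone memory system that operates independently of the crew's built-in memory. It is well suited to specialized memory providers or to sharing memory across applications.

Basic External Memory with Mem0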

import os
from crewai import Agent, Crew, Process, Task
from crewai.memory.external.external_memory import ExternalMemory

# Create external memory instance with local Mem0 Configuration
external_memory = ExternalMemory(
    embedder_config={
        "provider": "mem0",
        "config": {
            "user_id": "john",
            "local_mem0_config": {
                "vector_store": {
                    "provider": "qdrant",
                    "config": {"host": "localhost", "port": 6333}
                },
                "llm": {
                    "provider": "openai",
                    "config": {"api_key": "your-api-key", "model": "gpt-4"}
                },
                "embedder": {
                    "provider": "openai",
                    "config": {"api_key": "your-api-key", "model": "text-embedding-3-small"}
                }
            },
            "infer": True # Optional defaults to True
        },
    }
)

crew = Crew(
    agents=[...],
    tasks=[...],
    external_memory=external_memory, # Separate from basic memory
    process=Process.sequential,
    verbose=True
)

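Advanced External Memory with the Mem0 Client

When using the Mem0 Client, you can further customize the memory configuration with parameters such as 'includes', 'excludes', 'custom_categories', 'infer', and 'run_id' (which applies to short-term memory only). More details are available in the Mem0 documentation.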
import os
from crewai import Agent, Crew, Process, Task
from crewai.memory.external.external_memory import ExternalMemory

new_categories = [
    {"lifestyle_management_concerns": "Tracks daily routines, habits, hobbies and interests including cooking, time management and work-life balance"},
    {"seeking_structure": "Documents goals around creating routines, schedules, and organized systems in various life areas"},
    {"personal_information": "Basic information about the user including name, preferences, and personality traits"}
]

os.environ["MEM0_API_KEY"] = "your-api-key"

# Create external memory instance with Mem0 Client
external_memory = ExternalMemory(
    embedder_config={
        "provider": "mem0",
        "config": {
            "user_id": "john",
            "org_id": "my_org_id",        # Optional
            "project_id": "my_project_id", # Optional
            "api_key": "custom-api-key",   # Optional - overrides env var
            "run_id": "my_run_id",        # Optional - for short-term memory
            "includes": "include1",       # Optional
            "excludes": "exclude1",       # Optional
            "infer": True,                # Optional defaults to True
            "custom_categories": new_categories  # Optional - custom categories for user memory
        },
    }
)

crew = Crew(
    agents=[...],
    tasks=[...],
    external_memory=external_memory, # Separate from basic memory
    process=Process.sequential,
    verbose=True
)

Custom Storage Implementation

from crewai import Crew
from crewai.memory.external.external_memory import ExternalMemory
from crewai.memory.storage.interface import Storage

class CustomStorage(Storage):
    def __init__(self):
        self.memories = []

    def save(self, value, metadata=None, agent=None):
        self.memories.append({
            "value": value,
            "metadata": metadata,
            "agent": agent
        })

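    def search(self, query, limit=10, score_threshold=0.5):
        # Implement your search logic here. This simple substring match
        # ignores score_threshold and returns at most `limit` results.
        matches = [m for m in self.memories if query.lower() in str(m["value"]).lower()]
        return matches[:limit]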

    def reset(self):
        self.memories = []

# Use custom storage
external_memory = ExternalMemory(storage=CustomStorage())

crew = Crew(
    agents=[...],
    tasks=[...],
    external_memory=external_memory
)

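🧠 Memory System Comparison

| Category | Feature | Basic Memory | External Memory |
| Ease of Use | Setup complexity | Simple | Moderate |
| | Integration | Built-in (contextual) | Standalone |
| Persistence | Storage | Local files | Custom / Mem0 |
| | Cross-session support | | |
| Personalization | User-specific memory | | |
| | Custom providers | Limited | Any provider |
| Use Case Fit | Recommended for | Most general use cases | Specialized/custom needs |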

Supported Embedding Providers

OpenAI (Default)

crew = Crew(
    memory=True,
    embedder={
        "provider": "openai",
        "config": {"model": "text-embedding-3-small"}
    }
)

Ollama

crew = Crew(
    memory=True,
    embedder={
        "provider": "ollama",
        "config": {"model": "mxbai-embed-large"}
    }
)

Google AI

crew = Crew(
    memory=True,
    embedder={
        "provider": "google",
        "config": {
            "api_key": "your-api-key",
            "model": "text-embedding-004"
        }
    }
)

Azure OpenAI

crew = Crew(
    memory=True,
    embedder={
        "provider": "openai",
        "config": {
            "api_key": "your-api-key",
            "api_base": "https://your-resource.openai.azure.com/",
            "api_version": "2023-05-15",
            "model_name": "text-embedding-3-small"
        }
    }
)

Vertex AI

crew = Crew(
    memory=True,
    embedder={
        "provider": "vertexai",
        "config": {
            "project_id": "your-project-id",
            "region": "your-region",
            "api_key": "your-api-key",
            "model_name": "textembedding-gecko"
        }
    }
)

Security Best Practices

Environment Variables

import os
from crewai import Crew

# Store sensitive data in environment variables
crew = Crew(
    memory=True,
    embedder={
        "provider": "openai",
        "config": {
            "api_key": os.getenv("OPENAI_API_KEY"),
            "model": "text-embedding-3-small"
        }
    }
)

Storage Security

import os
from crewai import Crew
from crewai.memory import LongTermMemory
from crewai.memory.storage.ltm_sqlite_storage import LTMSQLiteStorage

# Use secure storage paths
storage_path = os.getenv("CREWAI_STORAGE_DIR", "./storage")
os.makedirs(storage_path, mode=0o700, exist_ok=True)  # Restricted permissions

crew = Crew(
    memory=True,
    long_term_memory=LongTermMemory(
        storage=LTMSQLiteStorage(
            db_path=f"{storage_path}/memory.db"
        )
    )
)

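Troubleshooting

Common Issues

Memory not persisting between sessions?
  • Check the CREWAI_STORAGE_DIR environment variable.
  • Make sure you have write permission for the storage directory.
  • Confirm that memory is enabled with memory=True.
Mem0 authentication errors?
  • Confirm that the MEM0_API_KEY environment variable is set.
  • Check the API key permissions in the Mem0 dashboard.
  • Make sure the mem0ai package is installed.
High memory usage with large datasets?
  • Consider using external memory with custom storage.
  • Implement pagination in the search method of your custom storage.
  • Use a smaller embedding model to reduce the memory footprint.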
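As one way to read the pagination suggestion above, the sketch below returns search results one page at a time instead of materializing every match. `PagedListStorage` and its `offset` parameter are hypothetical and not part of CrewAI's Storage interface; a real custom storage would push the offset and limit down to its database query.

```python
import itertools


class PagedListStorage:
    """In-memory storage whose search returns a single page of matches."""

    def __init__(self):
        self.memories = []

    def save(self, value, metadata=None):
        self.memories.append({"value": value, "metadata": metadata or {}})

    def search(self, query, limit=10, offset=0):
        needle = query.lower()
        # Generator: matches are produced lazily, never all held at once
        matches = (m for m in self.memories if needle in str(m["value"]).lower())
        # Skip `offset` matches, then take at most `limit`
        return list(itertools.islice(matches, offset, offset + limit))
```

A caller can then walk through results with `offset=0`, `offset=10`, and so on, stopping when a page comes back shorter than `limit`.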

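Performance Tips

  • For most use cases, use memory=True (the simplest and fastest option).
  • Use user memory only when you need user-specific persistence.
  • For large-scale or specialized needs, consider external memory.
  • Choose a smaller embedding model for faster processing.
  • Set appropriate search limits to control the size of memory retrievals.

Benefits of Using the CrewAI Memory System

  • 🦾 Adaptive learning: crews become more efficient over time, adapting to new information and refining how they approach tasks.
  • 🫡 Enhanced personalization: memory lets agents remember user preferences and past interactions, enabling personalized experiences.
  • 🧠 Improved problem solving: access to a rich memory store helps agents draw on past learnings and contextual insight to make better-informed decisions.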

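Memory Events

CrewAI's event system gives you detailed insight into memory operations. With memory events you can monitor, debug, and optimize the performance and behavior of the memory system.

Available Memory Events

CrewAI emits the following memory-related events:
| Event | Description | Key Properties |
| MemoryQueryStartedEvent | Emitted when a memory query begins | query, limit, score_threshold |
| MemoryQueryCompletedEvent | Emitted when a memory query completes successfully | query, results, limit, score_threshold, query_time_ms |
| MemoryQueryFailedEvent | Emitted when a memory query fails | query, limit, score_threshold, error |
| MemorySaveStartedEvent | Emitted when a memory save operation begins | value, metadata, agent_role |
| MemorySaveCompletedEvent | Emitted when a memory save operation completes successfully | value, metadata, agent_role, save_time_ms |
| MemorySaveFailedEvent | Emitted when a memory save operation fails | value, metadata, agent_role, error |
| MemoryRetrievalStartedEvent | Emitted when memory retrieval for a task prompt begins | task_id |
| MemoryRetrievalCompletedEvent | Emitted when memory retrieval completes successfully | task_id, memory_content, retrieval_time_ms |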

Practical Applications

1. Memory Performance Monitoring

Track the timing of memory operations to optimize your application.
from crewai.events import (
    BaseEventListener,
    MemoryQueryCompletedEvent,
    MemorySaveCompletedEvent
)
import time

class MemoryPerformanceMonitor(BaseEventListener):
    def __init__(self):
        super().__init__()
        self.query_times = []
        self.save_times = []

    def setup_listeners(self, crewai_event_bus):
        @crewai_event_bus.on(MemoryQueryCompletedEvent)
        def on_memory_query_completed(source, event: MemoryQueryCompletedEvent):
            self.query_times.append(event.query_time_ms)
            print(f"Memory query completed in {event.query_time_ms:.2f}ms. Query: '{event.query}'")
            print(f"Average query time: {sum(self.query_times)/len(self.query_times):.2f}ms")

        @crewai_event_bus.on(MemorySaveCompletedEvent)
        def on_memory_save_completed(source, event: MemorySaveCompletedEvent):
            self.save_times.append(event.save_time_ms)
            print(f"Memory save completed in {event.save_time_ms:.2f}ms")
            print(f"Average save time: {sum(self.save_times)/len(self.save_times):.2f}ms")

# Create an instance of your listener
memory_monitor = MemoryPerformanceMonitor()

2. Memory Content Logging

Log memory operations for debugging and insight.
from crewai.events import (
    BaseEventListener,
    MemorySaveStartedEvent,
    MemoryQueryStartedEvent,
    MemoryRetrievalCompletedEvent
)
import logging

# Configure logging
logger = logging.getLogger('memory_events')

class MemoryLogger(BaseEventListener):
    def setup_listeners(self, crewai_event_bus):
        @crewai_event_bus.on(MemorySaveStartedEvent)
        def on_memory_save_started(source, event: MemorySaveStartedEvent):
            if event.agent_role:
                logger.info(f"Agent '{event.agent_role}' saving memory: {event.value[:50]}...")
            else:
                logger.info(f"Saving memory: {event.value[:50]}...")

        @crewai_event_bus.on(MemoryQueryStartedEvent)
        def on_memory_query_started(source, event: MemoryQueryStartedEvent):
            logger.info(f"Memory query started: '{event.query}' (limit: {event.limit})")

        @crewai_event_bus.on(MemoryRetrievalCompletedEvent)
        def on_memory_retrieval_completed(source, event: MemoryRetrievalCompletedEvent):
            if event.task_id:
                logger.info(f"Memory retrieved for task {event.task_id} in {event.retrieval_time_ms:.2f}ms")
            else:
                logger.info(f"Memory retrieved in {event.retrieval_time_ms:.2f}ms")
            logger.debug(f"Memory content: {event.memory_content}")

# Create an instance of your listener
memory_logger = MemoryLogger()

3. Error Tracking and Notifications

Capture and respond to memory errors.
from crewai.events import (
    BaseEventListener,
    MemorySaveFailedEvent,
    MemoryQueryFailedEvent
)
import logging
from typing import Optional

# Configure logging
logger = logging.getLogger('memory_errors')

class MemoryErrorTracker(BaseEventListener):
    def __init__(self, notify_email: Optional[str] = None):
        super().__init__()
        self.notify_email = notify_email
        self.error_count = 0

    def setup_listeners(self, crewai_event_bus):
        @crewai_event_bus.on(MemorySaveFailedEvent)
        def on_memory_save_failed(source, event: MemorySaveFailedEvent):
            self.error_count += 1
            agent_info = f"Agent '{event.agent_role}'" if event.agent_role else "Unknown agent"
            error_message = f"Memory save failed: {event.error}. {agent_info}"
            logger.error(error_message)

            if self.notify_email and self.error_count % 5 == 0:
                self._send_notification(error_message)

        @crewai_event_bus.on(MemoryQueryFailedEvent)
        def on_memory_query_failed(source, event: MemoryQueryFailedEvent):
            self.error_count += 1
            error_message = f"Memory query failed: {event.error}. Query: '{event.query}'"
            logger.error(error_message)

            if self.notify_email and self.error_count % 5 == 0:
                self._send_notification(error_message)

    def _send_notification(self, message):
        # Implement your notification system (email, Slack, etc.)
        print(f"[NOTIFICATION] Would send to {self.notify_email}: {message}")

# Create an instance of your listener
error_tracker = MemoryErrorTracker(notify_email="admin@example.com")

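Integrating with Analytics Platforms

Memory events can be forwarded to analytics and monitoring platforms to track performance metrics, detect anomalies, and visualize memory usage patterns.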
from crewai.events import (
    BaseEventListener,
    MemoryQueryCompletedEvent,
    MemorySaveCompletedEvent
)

class MemoryAnalyticsForwarder(BaseEventListener):
    def __init__(self, analytics_client):
        super().__init__()
        self.client = analytics_client

    def setup_listeners(self, crewai_event_bus):
        @crewai_event_bus.on(MemoryQueryCompletedEvent)
        def on_memory_query_completed(source, event: MemoryQueryCompletedEvent):
            # Forward query metrics to analytics platform
            self.client.track_metric({
                "event_type": "memory_query",
                "query": event.query,
                "duration_ms": event.query_time_ms,
                "result_count": len(event.results) if hasattr(event.results, "__len__") else 0,
                "timestamp": event.timestamp
            })

        @crewai_event_bus.on(MemorySaveCompletedEvent)
        def on_memory_save_completed(source, event: MemorySaveCompletedEvent):
            # Forward save metrics to analytics platform
            self.client.track_metric({
                "event_type": "memory_save",
                "agent_role": event.agent_role,
                "duration_ms": event.save_time_ms,
                "timestamp": event.timestamp
            })

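Best Practices for Memory Event Listeners

  1. Keep handlers lightweight: avoid complex processing inside event handlers so they do not slow down execution.
  2. Use appropriate log levels: INFO for normal operations, DEBUG for details, ERROR for problems.
  3. Batch metrics where possible: accumulate metrics before sending them to external systems.
  4. Handle exceptions gracefully: make sure your event handlers do not crash on unexpected data.
  5. Consider memory consumption: be careful about storing large amounts of event data.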
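Practices 3 to 5 above can be combined in a small buffer that an event handler writes to. This is a sketch under stated assumptions: `MetricBuffer`, `send_batch`, and `max_size` are hypothetical names, and dropping a batch after a failed send is just one possible policy.

```python
import logging

logger = logging.getLogger("memory_metrics")


class MetricBuffer:
    """Accumulate metrics and send them in batches of at most max_size."""

    def __init__(self, send_batch, max_size=50):
        self._send_batch = send_batch  # callable that ships a list of metrics
        self._max_size = max_size
        self._pending = []

    def record(self, metric):
        """Cheap enough to call from inside an event handler."""
        self._pending.append(metric)
        if len(self._pending) >= self._max_size:
            self.flush()

    def flush(self):
        if not self._pending:
            return
        # Swap the list first so the buffer never grows without bound
        batch, self._pending = self._pending, []
        try:
            self._send_batch(batch)
        except Exception:
            # Never let a reporting failure propagate into the event bus
            logger.exception("Dropping %d metrics after send failure", len(batch))
```

A listener would call `buffer.record({...})` inside its handlers and `buffer.flush()` once at shutdown, so the external system sees a few batched calls instead of one call per event.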

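Conclusion

Integrating CrewAI's memory system into your project is straightforward. With the memory components and configurations described here, you can quickly give your agents the ability to remember, reason, and learn from their interactions.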