
@m0o0scar
Created August 2, 2024 22:35
modelscope/agentscope. Continue this conversation at https://readfm.vercel.app?gist=927cbf56f66d10d0c8013fabe385b918

[github] modelscope/agentscope

Source

Python / 40.3K lines of code. Start building LLM-empowered multi-agent applications in an easier way.

URL: https://github.com/modelscope/agentscope

Conversation

▲ Explain the major function of the project and how it works by referencing the most important and relevant code snippets.

AgentScope is a multi-agent platform designed to empower developers to build LLM-powered multi-agent applications with ease, reliability, and high performance. It features three high-level capabilities:

  1. Easy-to-Use: AgentScope provides a user-friendly interface for developers to build multi-agent applications using Python. It offers various prebuilt components, such as agents, pipelines, and services, that can be easily integrated into applications.

  2. High Robustness: AgentScope supports customized fault-tolerance controls and retry mechanisms to enhance application stability. It handles errors gracefully and provides mechanisms for developers to build customized fault-tolerant strategies.

  3. Actor-Based Distribution: AgentScope enables developers to build distributed multi-agent applications in a centralized programming manner. It introduces an actor-based distributed mechanism that enables centralized programming of complex distributed workflows and automatic parallel optimization.

Here's how AgentScope works:

  1. Initialization: The agentscope.init() function initializes the AgentScope environment. It loads model configurations, sets up logging, and initializes the monitor.

    import agentscope
    
    agentscope.init(
        model_configs="./model_configs.json",
        logger_level="INFO",
        save_dir="./runs",
        save_log=True,
        save_code=True,
        save_api_invoke=False,
        use_monitor=True,
    )
  2. Creating Agents: Developers can create agents using AgentScope's built-in agent classes or by customizing their own agent classes.

    from agentscope.agents import DialogAgent, UserAgent
    
    dialog_agent = DialogAgent(
        name="assistant",
        model_config_name="my_openai_config",
        sys_prompt="You are a helpful AI assistant",
    )
    user_agent = UserAgent()
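
  Custom agents are built by subclassing the base agent class and overriding its reply method. The sketch below is illustrative, assuming the AgentBase / reply() / speak() interface described in the agent tutorial (201-agent.md):

    from agentscope.agents import AgentBase
    from agentscope.message import Msg

    class EchoAgent(AgentBase):
        """A toy agent that simply echoes whatever it receives."""

        def reply(self, x=None):
            content = x.content if x is not None else ""
            msg = Msg(name=self.name, content=f"Echo: {content}")
            self.speak(msg)  # assumed helper that logs/displays the outgoing message
            return msg

    echo_agent = EchoAgent(name="echo")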
  3. Message Passing: Agents communicate by exchanging messages. A message is a dictionary-like Msg object with two required fields, name and content, plus an optional url field that points to a local file (image, video, or audio) or a website.

    from agentscope.message import Msg
    
    message_from_alice = Msg("Alice", "Hi!")
    message_from_bob = Msg("Bob", "What about this picture I took?", url="/path/to/picture.jpg")
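
  Agents are callable objects that consume and produce these Msg instances, so a conversation is simply a loop that passes the latest message back and forth. The README's basic example alternates between the two agents until the user types "exit":

    x = None
    while True:
        x = dialog_agent(x)          # the assistant replies to the latest message
        x = user_agent(x)            # the human user is prompted for input
        if x.content == "exit":      # typing "exit" ends the conversation
            break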
  4. Pipelines: Pipelines define the flow of messages between agents. AgentScope provides various pipeline types, such as SequentialPipeline, IfElsePipeline, and SwitchPipeline, to manage complex interactions.

    from agentscope.pipelines import SequentialPipeline
    
    pipe = SequentialPipeline([dialog_agent, user_agent])
    x = None     # the initial input message (None or a Msg)
    x = pipe(x)  # runs dialog_agent, then user_agent, threading x through
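
  For group conversations, the tutorial also documents MsgHub, which broadcasts each participant's reply to every other participant. A rough sketch, assuming the msghub context manager with a participants argument as described in the "Pipeline and MsgHub" tutorial (202-pipeline.md):

    from agentscope.msghub import msghub

    # Every message produced inside the `with` block is broadcast to the
    # memories of all participants, so each agent sees the full group chat.
    with msghub(participants=[dialog_agent, user_agent]):
        dialog_agent()
        user_agent()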
  5. Services: Services provide functional APIs that enable agents to perform specific tasks, such as executing Python code, web search, file operations, and more.

    from agentscope.service import execute_python_code
    
    result = execute_python_code(code="print('Hello world!')")
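
  Service functions return a ServiceResponse (see service/service_response.py) that carries a status flag and the actual payload. A hedged sketch of checking the outcome, assuming the ServiceExecStatus enum is exported by agentscope.service:

    from agentscope.service import execute_python_code, ServiceExecStatus

    result = execute_python_code(code="print('Hello world!')")
    if result.status == ServiceExecStatus.SUCCESS:
        print(result.content)                       # output captured from the executed code
    else:
        print("Execution failed:", result.content)  # error details on failure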
  6. Distribution: AgentScope supports distributed deployment of agents. Developers can convert agents to their distributed versions using the to_dist() method.

    dialog_agent = DialogAgent(
        name="assistant",
        model_config_name="my_openai_config",
        sys_prompt="You are a helpful AI assistant",
    ).to_dist()
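
  Because distribution follows the actor model, the calling code does not change after to_dist(): the agent runs in its own process but is still invoked like a local object, and independent calls can be parallelized automatically. A brief, illustrative sketch (the second agent is hypothetical):

    x = Msg("user", "Hi!")
    reply_a = dialog_agent(x)    # dispatched to dialog_agent's own process via RPC
    reply_b = another_agent(x)   # hypothetical second distributed agent; independent
                                 # calls like these can run in parallel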
  7. Monitoring: AgentScope provides a monitor to track API usage and costs. This helps developers prevent overutilization and ensure compliance with rate limits.

    from agentscope.utils import MonitorFactory
    
    monitor = MonitorFactory.get_monitor()
    monitor.register("token_num", metric_unit="token", quota=1000)
  8. AgentScope Studio: AgentScope Studio is a web UI toolkit for building and monitoring multi-agent applications. It provides a user-friendly interface for building applications with drag-and-drop functionality and a dashboard for monitoring running applications.
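
  A typical way to launch Studio locally, assuming the studio.init() entry point from the GUI tutorial (209-gui.md); host and port values are placeholders:

    import agentscope.studio

    # Starts the Studio web server; running applications can then be
    # inspected from its dashboard in the browser.
    agentscope.studio.init(host="127.0.0.1", port=5000)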

AgentScope is a powerful and flexible platform for building LLM-powered multi-agent applications. Its ease of use, fault-tolerance mechanisms, and support for distributed deployment make it well suited for building complex multi-agent systems.

# Data Source Content
Full data source content here Project: modelscope/agentscope URL: https://github.com/modelscope/agentscope Source tree: ``` modelscope/agentscope/blob/main ├── README.md ├── README_ZH.md ├── docs | ├── README.md | └── sphinx_doc | ├── assets/redirect.html | ├── en/source | | ├── _templates | | | ├── language_selector.html | | | └── layout.html | | ├── conf.py | | └── tutorial | | ├── 101-agentscope.md | | ├── 102-installation.md | | ├── 103-example.md | | ├── 104-usecase.md | | ├── 105-logging.md | | ├── 201-agent.md | | ├── 202-pipeline.md | | ├── 203-model.md | | ├── 203-parser.md | | ├── 203-stream.md | | ├── 204-service.md | | ├── 205-memory.md | | ├── 206-prompt.md | | ├── 207-monitor.md | | ├── 208-distribute.md | | ├── 209-gui.md | | ├── 209-prompt_opt.md | | ├── 210-rag.md | | ├── 301-community.md | | ├── 302-contribute.md | | └── main.md | └── zh_CN/source | ├── _templates | | ├── language_selector.html | | └── layout.html | ├── conf.py | └── tutorial | ├── 101-agentscope.md | ├── 102-installation.md | ├── 103-example.md | ├── 104-usecase.md | ├── 105-logging.md | ├── 201-agent.md | ├── 202-pipeline.md | ├── 203-model.md | ├── 203-parser.md | ├── 203-stream.md | ├── 204-service.md | ├── 205-memory.md | ├── 206-prompt.md | ├── 207-monitor.md | ├── 208-distribute.md | ├── 209-gui.md | ├── 209-prompt_opt.md | ├── 210-rag.md | ├── 301-community.md | ├── 302-contribute.md | └── main.md ├── scripts | ├── README.md | ├── flask_modelscope/setup_ms_service.py | └── flask_transformers/setup_hf_service.py ├── setup.py └── src/agentscope ├── __init__.py ├── _init.py ├── _runtime.py ├── _version.py ├── agents | ├── __init__.py | ├── agent.py | ├── dialog_agent.py | ├── dict_dialog_agent.py | ├── operator.py | ├── rag_agent.py | ├── react_agent.py | ├── rpc_agent.py | ├── text_to_image_agent.py | └── user_agent.py ├── constants.py ├── exception.py ├── file_manager.py ├── logging.py ├── memory | ├── __init__.py | ├── memory.py | └── temporary_memory.py ├── message.py ├── models | ├── __init__.py | ├── _model_utils.py | ├── config.py | ├── dashscope_model.py | ├── gemini_model.py | ├── litellm_model.py | ├── model.py | ├── ollama_model.py | ├── openai_model.py | ├── post_model.py | ├── response.py | └── zhipu_model.py ├── msghub.py ├── parsers | ├── __init__.py | ├── code_block_parser.py | ├── json_object_parser.py | ├── parser_base.py | ├── regex_tagged_content_parser.py | └── tagged_content_parser.py ├── pipelines | ├── __init__.py | ├── functional.py | └── pipeline.py ├── prompt | ├── __init__.py | ├── _prompt_comparer.py | ├── _prompt_engine.py | ├── _prompt_generator_base.py | ├── _prompt_generator_en.py | ├── _prompt_generator_zh.py | ├── _prompt_optimizer.py | └── _prompt_utils.py ├── rag | ├── __init__.py | ├── knowledge.py | ├── knowledge_bank.py | └── llama_index_knowledge.py ├── rpc | ├── __init__.py | ├── rpc_agent_client.py | ├── rpc_agent_pb2.py | └── rpc_agent_pb2_grpc.py ├── server | ├── __init__.py | ├── launcher.py | └── servicer.py ├── service | ├── __init__.py | ├── execute_code | | ├── __init__.py | | ├── exec_notebook.py | | ├── exec_python.py | | └── exec_shell.py | ├── file | | ├── __init__.py | | ├── common.py | | ├── json.py | | └── text.py | ├── multi_modality | | ├── __init__.py | | ├── dashscope_services.py | | └── openai_services.py | ├── retrieval | | ├── __init__.py | | ├── retrieval_from_list.py | | └── similarity.py | ├── service_response.py | ├── service_status.py | ├── service_toolkit.py | ├── sql_query | | ├── __init__.py | | ├── mongodb.py | | ├── 
mysql.py | | └── sqlite.py | ├── text_processing | | ├── __init__.py | | └── summarization.py | └── web | ├── __init__.py | ├── arxiv.py | ├── dblp.py | ├── download.py | ├── search.py | └── web_digest.py ├── strategy | ├── __init__.py | └── mixture_of_agent.py ├── studio | ├── __init__.py | ├── _app.py | ├── _client.py | ├── _studio_utils.py | ├── static | | ├── html-drag-components | | | ├── agent-dialogagent.html | | | ├── agent-dictdialogagent.html | | | ├── agent-reactagent.html | | | ├── agent-texttoimageagent.html | | | ├── agent-useragent.html | | | ├── message-msg.html | | | ├── model-dashscope-chat.html | | | ├── model-openai-chat.html | | | ├── model-post-api-chat.html | | | ├── model-post-api-dall-e.html | | | ├── pipeline-forlooppipeline.html | | | ├── pipeline-ifelsepipeline.html | | | ├── pipeline-msghub.html | | | ├── pipeline-placeholder.html | | | ├── pipeline-sequentialpipeline.html | | | ├── pipeline-switchpipeline.html | | | ├── pipeline-whilelooppipeline.html | | | ├── service-bing-search.html | | | ├── service-execute-python.html | | | ├── service-google-search.html | | | ├── service-read-text.html | | | ├── service-write-text.html | | | └── welcome.html | | ├── html | | | ├── dashboard-detail-code.html | | | ├── dashboard-detail-dialogue.html | | | ├── dashboard-detail-invocation.html | | | ├── dashboard-detail.html | | | ├── dashboard-runs.html | | | ├── dashboard.html | | | ├── index-guide.html | | | ├── market.html | | | ├── server.html | | | ├── template.html | | | └── workstation_iframe.html | | ├── js | | | ├── dashboard-detail-code.js | | | ├── dashboard-detail-dialogue.js | | | ├── dashboard-detail-invocation.js | | | ├── dashboard-detail.js | | | ├── dashboard-runs.js | | | ├── dashboard.js | | | ├── index.js | | | ├── server.js | | | ├── workstation.js | | | └── workstation_iframe.js | | └── js_third_party | | ├── marked-katex-extension.umd.js | | ├── monaco-editor.loader.js | | └── socket.io.js | └── templates | ├── index.html | └── workstation.html ├── utils | ├── __init__.py | ├── common.py | ├── monitor.py | ├── token_utils.py | └── tools.py └── web ├── __init__.py ├── gradio | ├── __init__.py | ├── constants.py | ├── studio.py | └── utils.py └── workstation ├── __init__.py ├── workflow.py ├── workflow_dag.py ├── workflow_node.py └── workflow_utils.py ``` modelscope/agentscope/blob/main/README.md: ```md English | [**中文**](https://github.com/modelscope/agentscope/blob/main/README_ZH.md) # AgentScope

agentscope-logo

Start building LLM-empowered multi-agent applications in an easier way. [![](https://img.shields.io/badge/cs.MA-2402.14034-B31C1C?logo=arxiv&logoColor=B31C1C)](https://arxiv.org/abs/2402.14034) [![](https://img.shields.io/badge/python-3.9+-blue)](https://pypi.org/project/agentscope/) [![](https://img.shields.io/badge/pypi-v0.0.6a2-blue?logo=pypi)](https://pypi.org/project/agentscope/) [![](https://img.shields.io/badge/Docs-English%7C%E4%B8%AD%E6%96%87-blue?logo=markdown)](https://modelscope.github.io/agentscope/#welcome-to-agentscope-tutorial-hub) [![](https://img.shields.io/badge/Docs-API_Reference-blue?logo=markdown)](https://modelscope.github.io/agentscope/) [![](https://img.shields.io/badge/ModelScope-Demos-4e29ff.svg?logo=data:image/svg+xml;base64,PHN2ZyB2aWV3Qm94PSIwIDAgMjI0IDEyMS4zMyIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KCTxwYXRoIGQ9Im0wIDQ3Ljg0aDI1LjY1djI1LjY1aC0yNS42NXoiIGZpbGw9IiM2MjRhZmYiIC8+Cgk8cGF0aCBkPSJtOTkuMTQgNzMuNDloMjUuNjV2MjUuNjVoLTI1LjY1eiIgZmlsbD0iIzYyNGFmZiIgLz4KCTxwYXRoIGQ9Im0xNzYuMDkgOTkuMTRoLTI1LjY1djIyLjE5aDQ3Ljg0di00Ny44NGgtMjIuMTl6IiBmaWxsPSIjNjI0YWZmIiAvPgoJPHBhdGggZD0ibTEyNC43OSA0Ny44NGgyNS42NXYyNS42NWgtMjUuNjV6IiBmaWxsPSIjMzZjZmQxIiAvPgoJPHBhdGggZD0ibTAgMjIuMTloMjUuNjV2MjUuNjVoLTI1LjY1eiIgZmlsbD0iIzM2Y2ZkMSIgLz4KCTxwYXRoIGQ9Im0xOTguMjggNDcuODRoMjUuNjV2MjUuNjVoLTI1LjY1eiIgZmlsbD0iIzYyNGFmZiIgLz4KCTxwYXRoIGQ9Im0xOTguMjggMjIuMTloMjUuNjV2MjUuNjVoLTI1LjY1eiIgZmlsbD0iIzM2Y2ZkMSIgLz4KCTxwYXRoIGQ9Im0xNTAuNDQgMHYyMi4xOWgyNS42NXYyNS42NWgyMi4xOXYtNDcuODR6IiBmaWxsPSIjNjI0YWZmIiAvPgoJPHBhdGggZD0ibTczLjQ5IDQ3Ljg0aDI1LjY1djI1LjY1aC0yNS42NXoiIGZpbGw9IiMzNmNmZDEiIC8+Cgk8cGF0aCBkPSJtNDcuODQgMjIuMTloMjUuNjV2LTIyLjE5aC00Ny44NHY0Ny44NGgyMi4xOXoiIGZpbGw9IiM2MjRhZmYiIC8+Cgk8cGF0aCBkPSJtNDcuODQgNzMuNDloLTIyLjE5djQ3Ljg0aDQ3Ljg0di0yMi4xOWgtMjUuNjV6IiBmaWxsPSIjNjI0YWZmIiAvPgo8L3N2Zz4K)](https://modelscope.cn/studios?name=agentscope&page=1&sort=latest) [![](https://img.shields.io/badge/Drag_and_drop_UI-WorkStation-blue?logo=html5&logoColor=green&color=dark-green)](https://agentscope.io/) [![](https://img.shields.io/badge/license-Apache--2.0-black)](./LICENSE) [![](https://img.shields.io/badge/Contribute-Welcome-green)](https://modelscope.github.io/agentscope/tutorial/contribute.html) - If you find our work helpful, please kindly cite [our paper](https://arxiv.org/abs/2402.14034). - Visit our [workstation](https://agentscope.io/) to build multi-agent applications with dragging-and-dropping.
agentscope-workstation
- Welcome to join our community on | [Discord](https://discord.gg/eYMpfnkG8h) | DingTalk | |----------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------| | | | ---- ## News - new**[2024-07-18]** AgentScope supports streaming mode now! Refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/203-stream.html) and example [conversation in stream mode](https://github.com/modelscope/agentscope/tree/main/examples/conversation_in_stream_mode) for more details.
agentscope-logo agentscope-logo
- new**[2024-07-15]** AgentScope has implemented the Mixture-of-Agents algorithm. Refer to our [MoA example](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents) for more details. - new**[2024-06-14]** A new prompt tuning module is available in AgentScope to help developers generate and optimize the agents' system prompts! Refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/209-prompt_opt.html) for more details! - new**[2024-06-11]** The RAG functionality is available for agents in **AgentScope** now! [**A quick introduction to RAG in AgentScope**](https://modelscope.github.io/agentscope/en/tutorial/210-rag.html) can help you equip your agent with external knowledge! - new**[2024-06-09]** We release **AgentScope** v0.0.5 now! In this new version, [**AgentScope Workstation**](https://modelscope.github.io/agentscope/en/tutorial/209-gui.html) (the online version is running on [agentscope.io](https://agentscope.io)) is open-sourced with the refactored [**AgentScope Studio**](https://modelscope.github.io/agentscope/en/tutorial/209-gui.html)! - **[2024-05-24]** We are pleased to announce that features related to the **AgentScope Workstation** will soon be open-sourced! The online website services are temporarily offline. The online website service will be upgraded and back online shortly. Stay tuned... - **[2024-05-15]** A new **Parser Module** for **formatted response** is added in AgentScope! Refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/203-parser.html) for more details. The [`DictDialogAgent`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/agents/dict_dialog_agent.py) and [werewolf game](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf) example are updated simultaneously. - **[2024-05-14]** Dear AgentScope users, we are conducting a survey on **AgentScope Workstation & Copilot** user experience. We currently need your valuable feedback to help us improve the experience of AgentScope's Drag & Drop multi-agent application development and Copilot. Your feedback is valuable and the survey will take about 3~5 minutes. Please click [URL](https://survey.aliyun.com/apps/zhiliao/vgpTppn22) to participate in questionnaire surveys. Thank you very much for your support and contribution! - **[2024-05-14]** AgentScope supports **gpt-4o** as well as other OpenAI vision models now! Try gpt-4o with its [model configuration](./examples/model_configs_template/openai_chat_template.json) and new example [Conversation with gpt-4o](./examples/conversation_with_gpt-4o)! - **[2024-04-30]** We release **AgentScope** v0.0.4 now! - **[2024-04-27]** [AgentScope Workstation](https://agentscope.aliyun.com/) is now online! You are welcome to try building your multi-agent application simply with our *drag-and-drop platform* and ask our *copilot* questions about AgentScope! - **[2024-04-19]** AgentScope supports Llama3 now! We provide [scripts](https://github.com/modelscope/agentscope/blob/main/examples/model_llama3) and example [model configuration](https://github.com/modelscope/agentscope/blob/main/examples/model_llama3) for quick set-up. Feel free to try llama3 in our examples! - **[2024-04-06]** We release **AgentScope** v0.0.3 now! 
- **[2024-04-06]** New examples [Gomoku](https://github.com/modelscope/agentscope/blob/main/examples/game_gomoku), [Conversation with ReAct Agent](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_react_agent), [Conversation with RAG Agent](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_RAG_agents) and [Distributed Parallel Optimization](https://github.com/modelscope/agentscope/blob/main/examples/distributed_parallel_optimization) are available now! - **[2024-03-19]** We release **AgentScope** v0.0.2 now! In this new version, AgentScope supports [ollama](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models)(A local CPU inference engine), [DashScope](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models) and Google [Gemini](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models) APIs. - **[2024-03-19]** New examples ["Autonomous Conversation with Mentions"](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_mentions) and ["Basic Conversation with LangChain library"](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_langchain) are available now! - **[2024-03-19]** The [Chinese tutorial](https://modelscope.github.io/agentscope/zh_CN/index.html) of AgentScope is online now! - **[2024-02-27]** We release **AgentScope v0.0.1** now, which is also available in [PyPI](https://pypi.org/project/agentscope/)! - **[2024-02-14]** We release our paper "AgentScope: A Flexible yet Robust Multi-Agent Platform" in [arXiv](https://arxiv.org/abs/2402.14034) now! --- ## What's AgentScope? AgentScope is an innovative multi-agent platform designed to empower developers to build multi-agent applications with large-scale models. It features three high-level capabilities: - 🤝 **Easy-to-Use**: Designed for developers, with [fruitful components](https://modelscope.github.io/agentscope/en/tutorial/204-service.html#), [comprehensive documentation](https://modelscope.github.io/agentscope/en/index.html), and broad compatibility. Besides, [AgentScope Workstation](https://agentscope.aliyun.com/) provides a *drag-and-drop programming platform* and a *copilot* for beginners of AgentScope! - ✅ **High Robustness**: Supporting customized fault-tolerance controls and retry mechanisms to enhance application stability. - 🚀 **Actor-Based Distribution**: Building distributed multi-agent applications in a centralized programming manner for streamlined development. **Supported Model Libraries** AgentScope provides a list of `ModelWrapper` to support both local model services and third-party model APIs. | API | Task | Model Wrapper | Configuration | Some Supported Models | |------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|-----------------------------------------------------------------| | OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |[guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_chat_template.json) | gpt-4o, gpt-4, gpt-3.5-turbo, ... | | | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_embedding_template.json) | text-embedding-ada-002, ... | | | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_dall_e_template.json) | dall-e-2, dall-e-3 | | DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_chat_template.json) | qwen-plus, qwen-max, ... | | | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_image_synthesis_template.json) | wanx-v1 | | | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_text_embedding_template.json) | text-embedding-v1, text-embedding-v2, ... | | | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_multimodal_template.json) | qwen-vl-max, qwen-vl-chat-v1, qwen-audio-chat | | Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_chat_template.json) | gemini-pro, ... | | | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_embedding_template.json) | models/embedding-001, ... | | ZhipuAI API | Chat | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#zhipu-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/zhipu_chat_template.json) | glm-4, ... | | | Embedding | [`ZhipuAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#zhipu-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/zhipu_embedding_template.json) | embedding-2, ... | | ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_chat_template.json) | llama3, llama2, Mistral, ... | | | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_embedding_template.json) | llama2, Mistral, ... | | | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_generate_template.json) | llama2, Mistral, ... | | LiteLLM API | Chat | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#litellm-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/litellm_chat_template.json) | [models supported by litellm](https://docs.litellm.ai/docs/)... | | Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#post-request-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/postapi_model_config_template.json) | - | **Supported Local Model Deployment** AgentScope enables developers to rapidly deploy local model services using the following libraries. - [ollama (CPU inference)](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#ollama) - [Flask + Transformers](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#with-transformers-library) - [Flask + ModelScope](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#with-modelscope-library) - [FastChat](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#fastchat) - [vllm](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#vllm) **Supported Services** - Web Search - Data Query - Retrieval - Code Execution - File Operation - Text Processing - Multi Modality **Example Applications** - Model - [Using Llama3 in AgentScope](https://github.com/modelscope/agentscope/blob/main/examples/model_llama3) - Conversation - [Basic Conversation](https://github.com/modelscope/agentscope/blob/main/examples/conversation_basic) - [Autonomous Conversation with Mentions](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_mentions) - [Self-Organizing Conversation](https://github.com/modelscope/agentscope/blob/main/examples/conversation_self_organizing) - [Basic Conversation with LangChain library](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_langchain) - [Conversation with ReAct Agent](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_react_agent) - [Conversation in Natural Language to Query SQL](https://github.com/modelscope/agentscope/blob/main/examples/conversation_nl2sql/) - [Conversation with RAG Agent](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_RAG_agents) - new[Conversation with gpt-4o](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_gpt-4o) - new[Conversation with Software Engineering Agent](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_swe-agent/) - new[Conversation with Customized Tools](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_customized_services/) - new[Mixture of Agents Algorithm](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents/) - new[Conversation in Stream Mode](https://github.com/modelscope/agentscope/blob/main/examples/conversation_in_stream_mode/) - new[Conversation with CodeAct Agent](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_codeact_agent/) - Game - [Gomoku](https://github.com/modelscope/agentscope/blob/main/examples/game_gomoku) - [Werewolf](https://github.com/modelscope/agentscope/blob/main/examples/game_werewolf) - Distribution - [Distributed Conversation](https://github.com/modelscope/agentscope/blob/main/examples/distributed_conversation) - [Distributed Debate](https://github.com/modelscope/agentscope/blob/main/examples/distributed_debate) - [Distributed Parallel Optimization](https://github.com/modelscope/agentscope/blob/main/examples/distributed_parallel_optimization) - [Distributed Large Scale Simulation](https://github.com/modelscope/agentscope/blob/main/examples/distributed_simulation) More models, services 
and examples are coming soon! ## Installation AgentScope requires **Python 3.9** or higher. ***Note: This project is currently in active development, it's recommended to install AgentScope from source.*** ### From source - Install AgentScope in editable mode: ```bash # Pull the source code from GitHub git clone https://github.com/modelscope/agentscope.git # Install the package in editable mode cd agentscope pip install -e . ``` - To build distributed multi-agent applications: ```bash # On windows pip install -e .[distribute] # On mac pip install -e .\[distribute\] ``` ### Using pip - Install AgentScope from pip: ```bash pip install agentscope --pre ``` ## Quick Start ### Configuration In AgentScope, the model deployment and invocation are decoupled by `ModelWrapper`. To use these model wrappers, you need to prepare a model config file as follows. ```python model_config = { # The identifies of your config and used model wrapper "config_name": "{your_config_name}", # The name to identify the config "model_type": "{model_type}", # The type to identify the model wrapper # Detailed parameters into initialize the model wrapper # ... } ``` Taking OpenAI Chat API as an example, the model configuration is as follows: ```python openai_model_config = { "config_name": "my_openai_config", # The name to identify the config "model_type": "openai_chat", # The type to identify the model wrapper # Detailed parameters into initialize the model wrapper "model_name": "gpt-4", # The used model in openai API, e.g. gpt-4, gpt-3.5-turbo, etc. "api_key": "xxx", # The API key for OpenAI API. If not set, env # variable OPENAI_API_KEY will be used. "organization": "xxx", # The organization for OpenAI API. If not set, env # variable OPENAI_ORGANIZATION will be used. } ``` More details about how to set up local model services and prepare model configurations is in our [tutorial](https://modelscope.github.io/agentscope/index.html#welcome-to-agentscope-tutorial-hub). ### Create Agents Create built-in user and assistant agents as follows. ```python from agentscope.agents import DialogAgent, UserAgent import agentscope # Load model configs agentscope.init(model_configs="./model_configs.json") # Create a dialog agent and a user agent dialog_agent = DialogAgent(name="assistant", model_config_name="my_openai_config") user_agent = UserAgent() ``` ### Construct Conversation In AgentScope, **message** is the bridge among agents, which is a **dict** that contains two necessary fields `name` and `content` and an optional field `url` to local files (image, video or audio) or website. ```python from agentscope.message import Msg x = Msg(name="Alice", content="Hi!") x = Msg("Bob", "What about this picture I took?", url="/path/to/picture.jpg") ``` Start a conversation between two agents (e.g. dialog_agent and user_agent) with the following code: ```python x = None while True: x = dialog_agent(x) x = user_agent(x) if x.content == "exit": # user input "exit" to exit the conversation_basic break ``` ### AgentScope Studio AgentScope provides an easy-to-use runtime user interface capable of displaying multimodal output on the front end, including text, images, audio and video. Refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/209-gui.html) for more details.
agentscope-logo
## Tutorial - [About AgentScope](https://modelscope.github.io/agentscope/zh_CN/tutorial/101-agentscope.html) - [Installation](https://modelscope.github.io/agentscope/zh_CN/tutorial/102-installation.html) - [Quick Start](https://modelscope.github.io/agentscope/zh_CN/tutorial/103-example.html) - [Model](https://modelscope.github.io/agentscope/zh_CN/tutorial/203-model.html) - [Prompt Engineering](https://modelscope.github.io/agentscope/zh_CN/tutorial/206-prompt.html) - [Agent](https://modelscope.github.io/agentscope/zh_CN/tutorial/201-agent.html) - [Memory](https://modelscope.github.io/agentscope/zh_CN/tutorial/205-memory.html) - [Response Parser](https://modelscope.github.io/agentscope/zh_CN/tutorial/203-parser.html) - [Tool](https://modelscope.github.io/agentscope/zh_CN/tutorial/204-service.html) - [Pipeline and MsgHub](https://modelscope.github.io/agentscope/zh_CN/tutorial/202-pipeline.html) - [Distribution](https://modelscope.github.io/agentscope/zh_CN/tutorial/208-distribute.html) - [AgentScope Studio](https://modelscope.github.io/agentscope/zh_CN/tutorial/209-gui.html) - [Logging](https://modelscope.github.io/agentscope/zh_CN/tutorial/105-logging.html) - [Monitor](https://modelscope.github.io/agentscope/zh_CN/tutorial/207-monitor.html) - [Example: Werewolf Game](https://modelscope.github.io/agentscope/zh_CN/tutorial/104-usecase.html) ## License AgentScope is released under Apache License 2.0. ## Contributing Contributions are always welcomed! We provide a developer version with additional pre-commit hooks to perform checks compared to the official version: ```bash # For windows pip install -e .[dev] # For mac pip install -e .\[dev\] # Install pre-commit hooks pre-commit install ``` Please refer to our [Contribution Guide](https://modelscope.github.io/agentscope/en/tutorial/302-contribute.html) for more details. ## References If you find our work helpful for your research or application, please cite [our paper](https://arxiv.org/abs/2402.14034): ``` @article{agentscope, author = {Dawei Gao and Zitao Li and Xuchen Pan and Weirui Kuang and Zhijian Ma and Bingchen Qian and Fei Wei and Wenhao Zhang and Yuexiang Xie and Daoyuan Chen and Liuyi Yao and Hongyi Peng and Zeyu Zhang and Lin Zhu and Chen Cheng and Hongzhu Shi and Yaliang Li and Bolin Ding and Jingren Zhou}, title = {AgentScope: A Flexible yet Robust Multi-Agent Platform}, journal = {CoRR}, volume = {abs/2402.14034}, year = {2024}, } ``` ``` modelscope/agentscope/blob/main/README_ZH.md: ```md [English](https://github.com/modelscope/agentscope/blob/main/README.md) | 中文 # AgentScope

agentscope-logo

更简单地构建基于LLM的多智能体应用。 [![](https://img.shields.io/badge/cs.MA-2402.14034-B31C1C?logo=arxiv&logoColor=B31C1C)](https://arxiv.org/abs/2402.14034) [![](https://img.shields.io/badge/python-3.9+-blue)](https://pypi.org/project/agentscope/) [![](https://img.shields.io/badge/pypi-v0.0.6a2-blue?logo=pypi)](https://pypi.org/project/agentscope/) [![](https://img.shields.io/badge/Docs-English%7C%E4%B8%AD%E6%96%87-blue?logo=markdown)](https://modelscope.github.io/agentscope/#welcome-to-agentscope-tutorial-hub) [![](https://img.shields.io/badge/Docs-API_Reference-blue?logo=markdown)](https://modelscope.github.io/agentscope/) [![](https://img.shields.io/badge/ModelScope-Demos-4e29ff.svg?logo=data:image/svg+xml;base64,PHN2ZyB2aWV3Qm94PSIwIDAgMjI0IDEyMS4zMyIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KCTxwYXRoIGQ9Im0wIDQ3Ljg0aDI1LjY1djI1LjY1aC0yNS42NXoiIGZpbGw9IiM2MjRhZmYiIC8+Cgk8cGF0aCBkPSJtOTkuMTQgNzMuNDloMjUuNjV2MjUuNjVoLTI1LjY1eiIgZmlsbD0iIzYyNGFmZiIgLz4KCTxwYXRoIGQ9Im0xNzYuMDkgOTkuMTRoLTI1LjY1djIyLjE5aDQ3Ljg0di00Ny44NGgtMjIuMTl6IiBmaWxsPSIjNjI0YWZmIiAvPgoJPHBhdGggZD0ibTEyNC43OSA0Ny44NGgyNS42NXYyNS42NWgtMjUuNjV6IiBmaWxsPSIjMzZjZmQxIiAvPgoJPHBhdGggZD0ibTAgMjIuMTloMjUuNjV2MjUuNjVoLTI1LjY1eiIgZmlsbD0iIzM2Y2ZkMSIgLz4KCTxwYXRoIGQ9Im0xOTguMjggNDcuODRoMjUuNjV2MjUuNjVoLTI1LjY1eiIgZmlsbD0iIzYyNGFmZiIgLz4KCTxwYXRoIGQ9Im0xOTguMjggMjIuMTloMjUuNjV2MjUuNjVoLTI1LjY1eiIgZmlsbD0iIzM2Y2ZkMSIgLz4KCTxwYXRoIGQ9Im0xNTAuNDQgMHYyMi4xOWgyNS42NXYyNS42NWgyMi4xOXYtNDcuODR6IiBmaWxsPSIjNjI0YWZmIiAvPgoJPHBhdGggZD0ibTczLjQ5IDQ3Ljg0aDI1LjY1djI1LjY1aC0yNS42NXoiIGZpbGw9IiMzNmNmZDEiIC8+Cgk8cGF0aCBkPSJtNDcuODQgMjIuMTloMjUuNjV2LTIyLjE5aC00Ny44NHY0Ny44NGgyMi4xOXoiIGZpbGw9IiM2MjRhZmYiIC8+Cgk8cGF0aCBkPSJtNDcuODQgNzMuNDloLTIyLjE5djQ3Ljg0aDQ3Ljg0di0yMi4xOWgtMjUuNjV6IiBmaWxsPSIjNjI0YWZmIiAvPgo8L3N2Zz4K)](https://modelscope.cn/studios?name=agentscope&page=1&sort=latest) [![](https://img.shields.io/badge/Drag_and_drop_UI-WorkStation-blue?logo=html5&logoColor=green&color=dark-green)](https://agentscope.io/) [![](https://img.shields.io/badge/license-Apache--2.0-black)](./LICENSE) [![](https://img.shields.io/badge/Contribute-Welcome-green)](https://modelscope.github.io/agentscope/tutorial/contribute.html) - 如果您觉得我们的工作对您有帮助,请引用我们的[论文](https://arxiv.org/abs/2402.14034)。 - 访问 [agentscope.io](https://agentscope.io/),通过拖拽方式构建多智能体应用。
agentscope-workstation
- 欢迎加入我们的社区 | [Discord](https://discord.gg/eYMpfnkG8h) | 钉钉群 | |---------|----------| | | | ---- ## 新闻 - new**[2024-07-18]** AgentScope 已支持模型流式输出。请参考我们的 [**教程**](https://modelscope.github.io/agentscope/zh_CN/tutorial/203-stream.html) 和 [**流式对话样例**](https://github.com/modelscope/agentscope/tree/main/examples/conversation_in_stream_mode)!
agentscope-logo agentscope-logo
- new**[2024-07-15]** AgentScope 中添加了 Mixture of Agents 算法。使用样例请参考 [MoA 示例](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents)。 - new**[2024-06-14]** 新的提示调优(Prompt tuning)模块已经上线 AgentScope,用以帮助开发者生成和优化智能体的 system prompt。更多的细节和使用样例请参考 AgentScope [教程](https://modelscope.github.io/agentscope/en/tutorial/209-prompt_opt.html)! - new**[2024-06-11]** RAG功能现在已经整合进 **AgentScope** 中! 大家可以根据 [**简要介绍AgentScope中的RAG**](https://modelscope.github.io/agentscope/en/tutorial/210-rag.html) ,让自己的agent用上外部知识! - new**[2024-06-09]** AgentScope v0.0.5 已经更新!在这个新版本中,我们开源了 [**AgentScope Workstation**](https://modelscope.github.io/agentscope/en/tutorial/209-gui.html) (在线版本的网址是[agentscope.io](https://agentscope.io))! - **[2024-05-24]** 我们很高兴地宣布 **AgentScope Workstation** 相关功能即将开源。我们的网站服务暂时下线。在线服务会很快升级重新上线,敬请期待... - **[2024-05-15]** 用于解析模型格式化输出的**解析器**模块已经上线 AgentScope!更轻松的构建多智能体应用,使用方法请参考[教程](https://modelscope.github.io/agentscope/en/tutorial/203-parser.html)。与此同时,[`DictDialogAgent`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/agents/dict_dialog_agent.py) 类和 [狼人杀游戏](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf) 样例也已经同步更新! - **[2024-05-14]** 目前 AgentScope 正在进行 AgentScope Workstation & Copilot 用户体验反馈活动,需要您宝贵的意见来帮助我们改善 AgentScope 的拖拽式多智能体应用开发与 Copilot 体验。您的每一个反馈都十分宝贵,请点击 [链接](https://survey.aliyun.com/apps/zhiliao/vgpTppn22) 参与问卷,感谢您的支持! - **[2024-05-14]** AgentScope 现已支持 **gpt-4o** 等 OpenAI Vision 模型! 模型配置请见[链接](./examples/model_configs_template/openai_chat_template.json)。同时,新的样例“[与gpt-4o模型对话](./examples/conversation_with_gpt-4o)”已上线! - **[2024-04-30]** 我们现在发布了**AgentScope** v0.0.4版本! - **[2024-04-27]** [AgentScope Workstation](https://agentscope.aliyun.com/)上线了! 欢迎使用 Workstation 体验如何在*拖拉拽编程平台* 零代码搭建多智体应用,也欢迎大家通过*copilot*查询AgentScope各种小知识! - **[2024-04-19]** AgentScope现已经支持Llama3!我们提供了面向CPU推理和GPU推理的[脚本](./examples/model_llama3)和[模型配置](./examples/model_llama3),一键式开启Llama3的探索,在我们的样例中尝试Llama3吧! - **[2024-04-06]** 我们现在发布了**AgentScope** v0.0.3版本! - **[2024-04-06]** 新的样例“[五子棋](./examples/game_gomoku)”,“[与ReAct智能体对话](./examples/conversation_with_react_agent)”,“[与RAG智能体对话](./examples/conversation_with_RAG_agents)”,“[分布式并行优化](./examples/distributed_parallel_optimization)”上线了! - **[2024-03-19]** 我们现在发布了**AgentScope** v0.0.2版本!在这个新版本中,AgentScope支持了[ollama](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models)(本地CPU推理引擎),[DashScope](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models)和[Gemini](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#supported-models) APIs。 - **[2024-03-19]** 新的样例“[带有@功能的自主对话](./examples/conversation_with_mentions)”和“[兼容LangChain的基础对话](./examples/conversation_with_langchain)”上线了! - **[2024-03-19]** AgentScope的[中文教程](https://modelscope.github.io/agentscope/zh_CN/index.html)上线了! - **[2024-02-27]** 我们现在发布了**AgentScope** v0.0.1版本!现在,AgentScope也可以在[PyPI](https://pypi.org/project/agentscope/)上下载 - **[2024-02-14]** 我们在arXiv上发布了论文“[AgentScope: A Flexible yet Robust Multi-Agent Platform](https://arxiv.org/abs/2402.14034)”! --- ## 什么是AgentScope? AgentScope是一个创新的多智能体开发平台,旨在赋予开发人员使用大模型轻松构建多智能体应用的能力。 - 🤝 **高易用**: AgentScope专为开发人员设计,提供了[丰富的组件](https://modelscope.github.io/agentscope/en/tutorial/204-service.html#), [全面的文档](https://modelscope.github.io/agentscope/zh_CN/index.html)和广泛的兼容性。同时,[AgentScope Workstation](https://agentscope.aliyun.com/)提供了在线拖拉拽编程和在线小助手(copilot)功能,帮助开发者迅速上手! 
- ✅ **高鲁棒**:支持自定义的容错控制和重试机制,以提高应用程序的稳定性。 - 🚀 **分布式**:支持以中心化的方式构建分布式多智能体应用程序。 **支持的模型API** AgentScope提供了一系列`ModelWrapper`来支持本地模型服务和第三方模型API。 | API | Task | Model Wrapper | Configuration | Some Supported Models | |------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|-----------------------------------------------| | OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) |[guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_chat_template.json) | gpt-4o, gpt-4, gpt-3.5-turbo, ... | | | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_embedding_template.json) | text-embedding-ada-002, ... | | | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#openai-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/openai_dall_e_template.json) | dall-e-2, dall-e-3 | | DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_chat_template.json) | qwen-plus, qwen-max, ... | | | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_image_synthesis_template.json) | wanx-v1 | | | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_text_embedding_template.json) | text-embedding-v1, text-embedding-v2, ... | | | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#dashscope-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/dashscope_multimodal_template.json) | qwen-vl-max, qwen-vl-chat-v1, qwen-audio-chat | | Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_chat_template.json) | gemini-pro, ... | | | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#gemini-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/gemini_embedding_template.json) | models/embedding-001, ... | | ZhipuAI API | Chat | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#zhipu-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/zhipu_chat_template.json) | glm-4, ... | | | Embedding | [`ZhipuAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#zhipu-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/zhipu_embedding_template.json) | embedding-2, ... | | ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_chat_template.json) | llama3, llama2, Mistral, ... | | | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_embedding_template.json) | llama2, Mistral, ... | | | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#ollama-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/ollama_generate_template.json) | llama2, Mistral, ... | | LiteLLM API | Chat | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#litellm-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/litellm_chat_template.json) | [models supported by litellm](https://docs.litellm.ai/docs/)... | | Post Request based API | - | [`PostAPIModelWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | [guidance](https://modelscope.github.io/agentscope/en/tutorial/203-model.html#post-request-api)
[template](https://github.com/modelscope/agentscope/blob/main/examples/model_configs_template/postapi_model_config_template.json) | - | **支持的本地模型部署** AgentScope支持使用以下库快速部署本地模型服务。 - [ollama (CPU inference)](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#ollama) - [Flask + Transformers](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#with-transformers-library) - [Flask + ModelScope](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#with-modelscope-library) - [FastChat](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#fastchat) - [vllm](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#vllm) **支持的服务** - 网络搜索 - 数据查询 - 数据检索 - 代码执行 - 文件操作 - 文本处理 - 多模态生成 **样例应用** - 模型 - [在AgentScope中使用Llama3](./examples/model_llama3) - 对话 - [基础对话](./examples/conversation_basic) - [带有@功能的自主对话](./examples/conversation_with_mentions) - [智能体自组织的对话](./examples/conversation_self_organizing) - [兼容LangChain的基础对话](./examples/conversation_with_langchain) - [与ReAct智能体对话](./examples/conversation_with_react_agent) - [通过对话查询SQL信息](./examples/conversation_nl2sql/) - [与RAG智能体对话](./examples/conversation_with_RAG_agents) - new[与gpt-4o模型对话](./examples/conversation_with_gpt-4o) - new[与自定义服务对话](./examples/conversation_with_customized_services/) - new[与SoftWare Engineering智能体对话](./examples/conversation_with_swe-agent/) - new[自定义工具函数](./examples/conversation_with_customized_services/) - new[Mixture of Agents算法](https://github.com/modelscope/agentscope/blob/main/examples/conversation_mixture_of_agents/) - new[流式对话](https://github.com/modelscope/agentscope/blob/main/examples/conversation_in_stream_mode/) - new[与CodeAct智能体对话](https://github.com/modelscope/agentscope/blob/main/examples/conversation_with_codeact_agent/) - 游戏 - [五子棋](./examples/game_gomoku) - [狼人杀](./examples/game_werewolf) - 分布式 - [分布式对话](./examples/distributed_conversation) - [分布式辩论](./examples/distributed_debate) - [分布式并行优化](./examples/distributed_parallel_optimization) - [分布式大规模仿真](./examples/distributed_simulation) 更多模型API、服务和示例即将推出! ## 安装 AgentScope需要Python 3.9或更高版本。 ***注意:该项目目前正在积极开发中,建议从源码安装AgentScope。*** ### 从源码安装 - 以编辑模式安装AgentScope: ```bash # 从github拉取源代码 git clone https://github.com/modelscope/agentscope.git # 以编辑模式安装包 cd agentscope pip install -e . ``` - 构建分布式多智能体应用需要按照以下方式安装: ```bash # 在windows上 pip install -e .[distribute] # 在mac上 pip install -e .\[distribute\] ``` ### 使用pip - 从pip安装的AgentScope ```bash pip install agentscope --pre ``` ## 快速开始 ### 配置 AgentScope中,模型的部署和调用是通过`ModelWrapper`实现解耦的。 为了使用这些`ModelWrapper`, 您需要准备如下的模型配置文件: ```python model_config = { # 模型配置的名称,以及使用的模型wrapper "config_name": "{your_config_name}", # 模型配置的名称 "model_type": "{model_type}", # 模型wrapper的类型 # 用以初始化模型wrapper的详细参数 # ... } ``` 以OpenAI Chat API为例,模型配置如下: ```python openai_model_config = { "config_name": "my_openai_config", # 模型配置的名称 "model_type": "openai_chat", # 模型wrapper的类型 # 用以初始化模型wrapper的详细参数 "model_name": "gpt-4", # OpenAI API中的模型名 "api_key": "xxx", # OpenAI API的API密钥。如果未设置,将使用环境变量OPENAI_API_KEY。 "organization": "xxx", # OpenAI API的组织。如果未设置,将使用环境变量OPENAI_ORGANIZATION。 } ``` 关于部署本地模型服务和准备模型配置的更多细节,请参阅我们的[教程](https://modelscope.github.io/agentscope/index.html#welcome-to-agentscope-tutorial-hub)。 ### 创建Agent 创建AgentScope内置的`DialogAgent`和`UserAgent`对象. 
```python from agentscope.agents import DialogAgent, UserAgent import agentscope # 加载模型配置 agentscope.init(model_configs="./model_configs.json") # 创建对话Agent和用户Agent dialog_agent = DialogAgent(name="assistant", model_config_name="my_openai_config") user_agent = UserAgent() ``` #### 构造对话 在AgentScope中,**Message**是Agent之间的桥梁,它是一个python**字典**(dict),包含两个必要字段`name`和`content`,以及一个可选字段`url`用于本地文件(图片、视频或音频)或网络链接。 ```python from agentscope.message import Msg x = Msg(name="Alice", content="Hi!") x = Msg("Bob", "What about this picture I took?", url="/path/to/picture.jpg") ``` 使用以下代码开始两个Agent(dialog_agent和user_agent)之间的对话: ```python x = None while True: x = dialog_agent(x) x = user_agent(x) if x.content == "exit": # 用户输入"exit"退出对话 break ``` ### AgentScope前端 AgentScope 提供了一个易于使用的运行时用户界面,能够在前端显示多模态输出,包括文本、图像、音频和视频。 参考我们的[教程](https://modelscope.github.io/agentscope/zh_CN/tutorial/209-gui.html)了解更多细节。
agentscope-logo
## 教程 - [关于AgentScope](https://modelscope.github.io/agentscope/zh_CN/tutorial/101-agentscope.html) - [安装](https://modelscope.github.io/agentscope/zh_CN/tutorial/102-installation.html) - [快速开始](https://modelscope.github.io/agentscope/zh_CN/tutorial/103-example.html) - [模型](https://modelscope.github.io/agentscope/zh_CN/tutorial/203-model.html) - [提示工程](https://modelscope.github.io/agentscope/zh_CN/tutorial/206-prompt.html) - [Agent](https://modelscope.github.io/agentscope/zh_CN/tutorial/201-agent.html) - [记忆](https://modelscope.github.io/agentscope/zh_CN/tutorial/205-memory.html) - [结果解析](https://modelscope.github.io/agentscope/zh_CN/tutorial/203-parser.html) - [工具](https://modelscope.github.io/agentscope/zh_CN/tutorial/204-service.html) - [Pipeline和MsgHub](https://modelscope.github.io/agentscope/zh_CN/tutorial/202-pipeline.html) - [分布式](https://modelscope.github.io/agentscope/zh_CN/tutorial/208-distribute.html) - [AgentScope Studio](https://modelscope.github.io/agentscope/zh_CN/tutorial/209-gui.html) - [日志](https://modelscope.github.io/agentscope/zh_CN/tutorial/105-logging.html) - [监控器](https://modelscope.github.io/agentscope/zh_CN/tutorial/207-monitor.html) - [样例:狼人杀游戏](https://modelscope.github.io/agentscope/zh_CN/tutorial/104-usecase.html) ## License AgentScope根据Apache License 2.0发布。 ## 贡献 欢迎参与到AgentScope的构建中! 我们提供了一个带有额外 pre-commit 钩子以执行检查的开发者版本,与官方版本相比: ```bash # 对于windows pip install -e .[dev] # 对于mac pip install -e .\[dev\] # 安装pre-commit钩子 pre-commit install ``` 请参阅我们的[贡献指南](https://modelscope.github.io/agentscope/zh_CN/tutorial/302-contribute.html)了解更多细节。 ## 引用 如果您觉得我们的工作对您的研究或应用有帮助,请引用[我们的论文](https://arxiv.org/abs/2402.14034)。 ``` @article{agentscope, author = {Dawei Gao and Zitao Li and Xuchen Pan and Weirui Kuang and Zhijian Ma and Bingchen Qian and Fei Wei and Wenhao Zhang and Yuexiang Xie and Daoyuan Chen and Liuyi Yao and Hongyi Peng and Zeyu Zhang and Lin Zhu and Chen Cheng and Hongzhu Shi and Yaliang Li and Bolin Ding and Jingren Zhou}, title = {AgentScope: A Flexible yet Robust Multi-Agent Platform}, journal = {CoRR}, volume = {abs/2402.14034}, year = {2024}, } ``` ``` modelscope/agentscope/blob/main/docs/README.md: ```md # AgentScope Documentation ## Build Documentation Please use the following commands to build sphinx doc of AgentScope. ```shell # step 1: Install dependencies pip install sphinx sphinx-autobuild sphinx_rtd_theme myst-parser sphinxcontrib-mermaid # step 2: go into the sphinx_doc dir cd sphinx_doc # step 3: build the sphinx doc ./build_sphinx_doc.sh # step 4: view sphinx_doc/build/html/index.html using your browser ``` ``` modelscope/agentscope/blob/main/docs/sphinx_doc/assets/redirect.html: ```html AgentScope Documentation

Redirecting to English documentation...

If you are not redirected, click here.

``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/_templates/language_selector.html: ```html ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/_templates/layout.html: ```html {% extends "!layout.html" %} {% block sidebartitle %} {{ super() }} {% include "language_selector.html" %} {% endblock %} ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/conf.py: ```py # -*- coding: utf-8 -*- # Configuration file for the Sphinx documentation builder. # # This file only contains a selection of the most common options. For a full # list see the documentation: # https://www.sphinx-doc.org/en/master/usage/configuration.html # -- Path setup -------------------------------------------------------------- # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. # # import os # import sys # sys.path.insert(0, os.path.abspath("../../../src/agentscope")) # -- Project information ----------------------------------------------------- language = "en" project = "AgentScope" copyright = "2024, Alibaba Tongyi Lab" author = "SysML team of Alibaba Tongyi Lab" # -- General configuration --------------------------------------------------- # Add any Sphinx extension module names here, as strings. They can be # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. extensions = [ "sphinx.ext.autodoc", "sphinx.ext.autosummary", "sphinx.ext.viewcode", "sphinx.ext.napoleon", "sphinxcontrib.mermaid", "myst_parser", "sphinx.ext.autosectionlabel", ] # Prefix document path to section labels, otherwise autogenerated labels would # look like 'heading' rather than 'path/to/file:heading' autosectionlabel_prefix_document = True autosummary_generate = True autosummary_ignore_module_all = False autodoc_member_order = "bysource" autodoc_default_options = { "members": True, "special-members": "__init__", } # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This pattern also affects html_static_path and html_extra_path. exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] autodoc_default_options = { "members": True, "special-members": "__init__", } # -- Options for HTML output ------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # html_theme = "sphinx_rtd_theme" # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ["_static"] html_theme_options = { "navigation_depth": 2, } source_suffix = { ".rst": "restructuredtext", ".md": "markdown", } html_css_files = [ "custom.css", ] ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/101-agentscope.md: ```md (101-agentscope-en)= # About AgentScope In this tutorial, we will provide an overview of AgentScope by answering several questions, including what's AgentScope, what can AgentScope provide, and why we should choose AgentScope. Let's get started! ## What is AgentScope? 
AgentScope is a developer-centric multi-agent platform, which enables developers to build their LLM-empowered multi-agent applications with less effort. With the advance of large language models, developers are able to build diverse applications. In order to connect LLMs to data and services and solve complex tasks, AgentScope provides a series of development tools and components for ease of development. It features - **usability**, - **robustness**, - **the support of multi-modal data**, - **distributed deployment**. ## Key Concepts ### Message Message is a carrier of information (e.g. instructions, multi-modal data, and dialogue). In AgentScope, message is a Python dict subclass with `name` and `content` as necessary fields, and `url` as an optional field referring to additional resources. ### Agent Agent is an autonomous entity capable of interacting with environment and agents, and taking actions to change the environment. In AgentScope, an agent takes message as input and generates corresponding response message. ### Service Service refers to the functional APIs that enable agents to perform specific tasks. In AgentScope, services are categorized into model API services, which are channels to use the LLMs, and general API services, which provide a variety of tool functions. ### Workflow Workflow represents ordered sequences of agent executions and message exchanges between agents, analogous to computational graphs in TensorFlow, but with the flexibility to accommodate non-DAG structures. ## Why AgentScope? **Exceptional usability for developers.** AgentScope provides high usability for developers with flexible syntactic sugars, ready-to-use components, and pre-built examples. **Robust fault tolerance for diverse models and APIs.** AgentScope ensures robust fault tolerance for diverse models, APIs, and allows developers to build customized fault-tolerant strategies. **Extensive compatibility for multi-modal application.** AgentScope supports multi-modal data (e.g., files, images, audio and videos) in both dialog presentation, message transmission and data storage. **Optimized efficiency for distributed multi-agent operations.** AgentScope introduces an actor-based distributed mechanism that enables centralized programming of complex distributed workflows, and automatic parallel optimization. ## How is AgentScope designed? The architecture of AgentScope comprises three hierarchical layers. The layers provide supports for multi-agent applications from different levels, including elementary and advanced functionalities of a single agent (**utility layer**), resources and runtime management (**manager and wrapper layer**), and agent-level to workflow-level programming interfaces (**agent layer**). AgentScope introduces intuitive abstractions designed to fulfill the diverse functionalities inherent to each layer and simplify the complicated interlayer dependencies when building multi-agent systems. Furthermore, we offer programming interfaces and default mechanisms to strengthen the resilience of multi-agent systems against faults within different layers. ## AgentScope Code Structure ```bash AgentScope ├── src │ ├── agentscope │ | ├── agents # Core components and implementations pertaining to agents. │ | ├── memory # Structures for agent memory. │ | ├── models # Interfaces for integrating diverse model APIs. │ | ├── pipelines # Fundamental components and implementations for running pipelines. │ | ├── rpc # Rpc module for agent distributed deployment. 
│ | ├── service # Services offering functions independent of memory and state. | | ├── web # WebUI used to show dialogs. │ | ├── utils # Auxiliary utilities and helper functions. │ | ├── message.py # Definitions and implementations of messaging between agents. │ | ├── prompt.py # Prompt engineering module for model input. │ | ├── ... .. │ | ├── ... .. ├── scripts # Scripts for launching local Model API ├── examples # Pre-built examples of different applications. ├── docs # Documentation tool for API reference. ├── tests # Unittest modules for continuous integration. ├── LICENSE # The official licensing agreement for AgentScope usage. └── setup.py # Setup script for installing. ├── ... .. └── ... .. ``` [[Return to the top]](#101-agentscope) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/102-installation.md: ```md (102-installation-en)= # Installation To install AgentScope, you need to have Python 3.9 or higher installed. We recommend setting up a new virtual environment specifically for AgentScope: ## Create a Virtual Environment ### Using Conda If you're using Conda as your package and environment management tool, you can create a new virtual environment with Python 3.9 using the following commands: ```bash # Create a new virtual environment named 'agentscope' with Python 3.9 conda create -n agentscope python=3.9 # Activate the virtual environment conda activate agentscope ``` ### Using Virtualenv Alternatively, if you prefer `virtualenv`, you can install it first (if it's not already installed) and then create a new virtual environment as shown: ```bash # Install virtualenv if it is not already installed pip install virtualenv # Create a new virtual environment named 'agentscope' with Python 3.9 virtualenv agentscope --python=python3.9 # Activate the virtual environment source agentscope/bin/activate # On Windows use `agentscope\Scripts\activate` ``` ## Installing AgentScope ### Install with Pip If you prefer to install AgentScope from Pypi, you can do so easily using `pip`: ```bash # For centralized multi-agent applications pip install agentscope --pre # For distributed multi-agent applications pip install agentscope[distribute] --pre # On Mac use `pip install agentscope\[distribute\] --pre` ``` ### Install from Source For users who prefer to install AgentScope directly from the source code, follow these steps to clone the repository and install the platform in editable mode: **_Note: This project is under active development, it's recommended to install AgentScope from source._** ```bash # Pull the source code from Github git clone https://github.com/modelscope/agentscope.git cd agentscope # For centralized multi-agent applications pip install -e . # For distributed multi-agent applications pip install -e .[distribute] # On Mac use `pip install -e .\[distribute\]` ``` **Note**: The `[distribute]` option installs additional dependencies required for distributed applications. Remember to activate your virtual environment before running these commands. [[Return to the top]](#102-installation-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/103-example.md: ```md (103-start-en)= # Quick Start AgentScope is designed with a flexible communication mechanism. In this tutorial, we will introduce the basic usage of AgentScope via a simple standalone conversation between two agents (e.g. user and assistant agents). ## Step1: Prepare Model AgentScope decouples the deployment and invocation of models to better build multi-agent applications. 
In terms of model deployment, users can use third-party model services such as OpenAI API, Google Gemini API, HuggingFace/ModelScope Inference API, or quickly deploy local open-source model services through the [scripts](https://github.com/modelscope/agentscope/blob/main/scripts/README.md) in the repository. While for model invocation, users should prepare a model configuration to specify the model service. Taking OpenAI Chat API as an example, the model configuration is like this: ```python model_config = { "config_name": "{config_name}", # A unique name for the model config. "model_type": "openai_chat", # Choose from "openai_chat", "openai_dall_e", or "openai_embedding". "model_name": "{model_name}", # The model identifier used in the OpenAI API, such as "gpt-3.5-turbo", "gpt-4", or "text-embedding-ada-002". "api_key": "xxx", # Your OpenAI API key. If unset, the environment variable OPENAI_API_KEY is used. "organization": "xxx", # Your OpenAI organization ID. If unset, the environment variable OPENAI_ORGANIZATION is used. } ``` More details about model invocation, deployment and open-source models please refer to [Model](203-model-en) section. After preparing the model configuration, you can register your configuration by calling the `init` method of AgentScope. Additionally, you can load multiple model configurations at once. ```python import agentscope # init once by passing a list of config dict openai_cfg_dict = { # ... } modelscope_cfg_dict = { # ... } agentscope.init(model_configs=[openai_cfg_dict, modelscope_cfg_dict]) ``` ## Step2: Create Agents Creating agents is straightforward in AgentScope. After initializing AgentScope with your model configurations (Step 1 above), you can then define each agent with its corresponding role and specific model. ```python import agentscope from agentscope.agents import DialogAgent, UserAgent # read model configs agentscope.init(model_configs="./openai_model_configs.json") # Create a dialog agent and a user agent dialogAgent = DialogAgent(name="assistant", model_config_name="gpt-4", sys_prompt="You are a helpful ai assistant") userAgent = UserAgent() ``` **NOTE**: Please refer to [Customizing Your Own Agent](201-agent-en) for all available agents. ## Step3: Agent Conversation "Message" is the primary means of communication between agents in AgentScope. They are Python dictionaries comprising essential fields like the actual `content` of this message and the sender's `name`. Optionally, a message can include a `url` to either a local file (image, video or audio) or website. ```python from agentscope.message import Msg # Example of a simple text message from Alice message_from_alice = Msg("Alice", "Hi!") # Example of a message from Bob with an attached image message_from_bob = Msg("Bob", "What about this picture I took?", url="/path/to/picture.jpg") ``` To start a conversation between two agents, such as `dialog_agent` and `user_agent`, you can use the following loop. The conversation continues until the user inputs `"exit"` which terminates the interaction. ```python x = None while True: x = dialogAgent(x) x = userAgent(x) # Terminate the conversation if the user types "exit" if x.content == "exit": print("Exiting the conversation.") break ``` For a more advanced approach, AgentScope offers the option of using pipelines to manage the flow of messages between agents. The `sequentialpipeline` stands for sequential speech, where each agent receive message from last agent and generate its response accordingly. 
```python from agentscope.pipelines.functional import sequentialpipeline # Execute the conversation loop within a pipeline structure x = None while x is None or x.content != "exit": x = sequentialpipeline([dialog_agent, user_agent]) ``` For more details about how to utilize pipelines for complex agent interactions, please refer to [Pipeline and MsgHub](202-pipeline-en). [[Return to the top]](#103-start-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/104-usecase.md: ```md (104-usecase-en)= # Example: Werewolf Game img **Werewolf** is a well-known social-deduction game, that involves an imaginary village where a few villagers are secretly werewolves, and the objective is to identify who they are before they eliminate all other players. It's a good use case to demonstrate the interaction between multiple autonomous agents, each with its own objectives and the need for communication. Let the adventure begin to unlock the potential of multi-agent applications with AgentScope! ## Getting Started Firstly, ensure that you have installed and configured AgentScope properly. Besides, we will involve the basic concepts of `Model API`, `Agent`, `Msg`, and `Pipeline,` as described in [Tutorial-Concept](101-agentscope.md). **Note**: all the configurations and code for this tutorial can be found in `examples/game_werewolf`. ### Step 1: Prepare Model API and Set Model Configs As we discussed in the last tutorial, you need to prepare your model configurations into a JSON file for standard OpenAI chat API, FastChat, and vllm. More details and advanced usages such as configuring local models with POST API are presented in [Tutorial-Model-API](203-model.md). ```json [ { "config_name": "gpt-4-temperature-0.0", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", "organization": "xxx", "generate_args": { "temperature": 0.0 } } ] ``` ### Step 2: Define the Roles of Each Agent In the Werewolf game, agents assume a variety of roles, each endowed with distinctive abilities and objectives. Below, we will outline the agent classes corresponding to each role: - Villager: Ordinary townsfolk trying to survive. - Werewolf: Predators in disguise, aiming to outlast the villagers. - Seer: A villager with the power to see the true nature of one player each night. - Witch: A villager who can save or poison a player each night. To implement your own agent, you need to inherit `AgentBase` and implement the `reply` function, which is executed when an agent instance is called via `agent1(x)`. ```python from agentscope.agents import AgentBase from agentscope.message import Msg from typing import Optional, Union, Sequence class MyAgent(AgentBase): def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # Do something here ... return x ``` AgentScope provides several out-of-the-box Agents implements and organizes them as an *Agent Pool*. In this application, we use a built-in agent, `DictDialogAgent`. Here we give an example configuration of `DictDialogAgent` for a player assigned as the role of a werewolf: ```json { "class": "DictDialogAgent", "args": { "name": "Player1", "sys_prompt": "Act as a player in a werewolf game. You are Player1 and\nthere are totally 6 players, named Player1, Player2, Player3, Player4, Player5 and Player6.\n\nPLAYER ROLES:\nIn werewolf game, players are divided into two werewolves, two villagers, one seer, and one witch. 
Note only werewolves know who are their teammates.\nWerewolves: They know their teammates' identities and attempt to eliminate a villager each night while trying to remain undetected.\nVillagers: They do not know who the werewolves are and must work together during the day to deduce who the werewolves might be and vote to eliminate them.\nSeer: A villager with the ability to learn the true identity of one player each night. This role is crucial for the villagers to gain information.\nWitch: A character who has a one-time ability to save a player from being eliminated at night (sometimes this is a potion of life) and a one-time ability to eliminate a player at night (a potion of death).\n\nGAME RULE:\nThe game consists of two phases: night phase and day phase. The two phases are repeated until werewolf or villager wins the game.\n1. Night Phase: During the night, the werewolves discuss and vote for a player to eliminate. Special roles also perform their actions at this time (e.g., the Seer chooses a player to learn their role, the witch chooses a decide if save the player).\n2. Day Phase: During the day, all surviving players discuss who they suspect might be a werewolf. No one reveals their role unless it serves a strategic purpose. After the discussion, a vote is taken, and the player with the most votes is \"lynched\" or eliminated from the game.\n\nVICTORY CONDITION:\nFor werewolves, they win the game if the number of werewolves is equal to or greater than the number of remaining villagers.\nFor villagers, they win if they identify and eliminate all of the werewolves in the group.\n\nCONSTRAINTS:\n1. Your response should be in the first person.\n2. This is a conversational game. You should respond only based on the conversation history and your strategy.\n\nYou are playing werewolf in this game.\n", "model_config_name": "gpt-3.5-turbo", "use_memory": true } } ``` In this configuration, `Player1` is designated as a `DictDialogAgent`. The parameters include a system prompt (`sys_prompt`) that can guide the agent's behavior, a model config name (`model_config_name`) that determines the name of the model configuration, and a flag (`use_memory`) indicating whether the agent should remember past interactions. For other players, configurations can be customized based on their roles. Each role may have different prompts, models, or memory settings. You can refer to the JSON file located at `examples/game_werewolf/configs/agent_configs.json` within the AgentScope examples directory. ### Step 3: Initialize AgentScope and the Agents Now we have defined the roles in the application and we can initialize the AgentScope environment and all agents. This process is simplified by AgentScope via a few lines, based on the configuration files we've prepared (assuming there are **2** werewolves, **2** villagers, **1** witch, and **1** seer): ```python import agentscope # read model and agent configs, and initialize agents automatically survivors = agentscope.init( model_configs="./configs/model_configs.json", agent_configs="./configs/agent_configs.json", logger_level="DEBUG", ) # Define the roles within the game. This list should match the order and number # of agents specified in the 'agent_configs.json' file. roles = ["werewolf", "werewolf", "villager", "villager", "seer", "witch"] # Based on their roles, assign the initialized agents to variables. # This helps us reference them easily in the game logic. 
wolves, villagers, witch, seer = survivors[:2], survivors[2:-2], survivors[-1], survivors[-2] ``` Through this snippet of code, we've allocated roles to our agents and associated them with the configurations that dictate their behavior in the application. ### Step 4: Set Up the Game Logic In this step, you will set up the game logic and orchestrate the flow of the Werewolf game using AgentScope's helper utilities. #### Parser In order to allow `DictDialogAgent` to output fields customized by the users, and to increase the success rate of parsing different fields by LLMs, we have added the `parser` module. Here is the configuration of a parser example: ``` to_wolves_vote = "Which player do you vote to kill?" wolves_vote_parser = MarkdownJsonDictParser( content_hint={ "thought": "what you thought", "vote": "player_name", }, required_keys=["thought", "vote"], keys_to_memory="vote", keys_to_content="vote", ) ``` For more details about the `parser` module,please see [here](https://modelscope.github.io/agentscope/en/tutorial/203-parser.html). #### Leverage Pipeline and MsgHub To simplify the construction of agent communication, AgentScope provides two helpful concepts: **Pipeline** and **MsgHub**. - **Pipeline**: It allows users to program communication among agents easily. ```python from agentscope.pipelines import SequentialPipeline pipe = SequentialPipeline(agent1, agent2, agent3) x = pipe(x) # the message x will be passed and replied by agent 1,2,3 in order ``` - **MsgHub**: You may have noticed that all the above examples are one-to-one communication. To achieve a group chat, we provide another communication helper utility `msghub`. With it, the messages from participants will be broadcast to all other participants automatically. In such cases, participating agents even don't need input and output messages. All we need to do is to decide the order of speaking. Besides, `msghub` also supports dynamic control of participants. ```python with msghub(participants=[agent1, agent2, agent3]) as hub: agent1() agent2() # Broadcast a message to all participants hub.broadcast(Msg("Host", "Welcome to join the group chat!")) # Add or delete participants dynamically hub.delete(agent1) hub.add(agent4) ``` #### Implement Werewolf Pipeline The game logic is divided into two major phases: (1) night when werewolves act, and (2) daytime when all players discuss and vote. Each phase will be handled by a section of code using pipelines to manage multi-agent communications. - **1.1 Night Phase: Werewolves Discuss and Vote** During the night phase, werewolves must discuss among themselves to decide on a target. The `msghub` function creates a message hub for the werewolves to communicate in, where every message sent by an agent is observable by all other agents within the `msghub`. ```python # start the game for i in range(1, MAX_GAME_ROUND + 1): # Night phase: werewolves discuss hint = HostMsg(content=Prompts.to_wolves.format(n2s(wolves))) with msghub(wolves, announcement=hint) as hub: set_parsers(wolves, Prompts.wolves_discuss_parser) for _ in range(MAX_WEREWOLF_DISCUSSION_ROUND): x = sequentialpipeline(wolves) if x.metadata.get("finish_discussion", False): break ``` After the discussion, werewolves proceed to vote for their target, and the majority's choice is determined. The result of the vote is then broadcast to all werewolves. **Note**: the detailed prompts and utility functions can be found in `examples/game_werewolf`. 
```python # werewolves vote set_parsers(wolves, Prompts.wolves_vote_parser) hint = HostMsg(content=Prompts.to_wolves_vote) votes = [extract_name_and_id(wolf(hint).content)[0] for wolf in wolves] # broadcast the result to werewolves dead_player = [majority_vote(votes)] hub.broadcast( HostMsg(content=Prompts.to_wolves_res.format(dead_player[0])), ) ``` - **1.2 Witch's Turn** If the witch is still alive, she gets the opportunity to use her powers to either save the player chosen by the werewolves or use her poison. ```python # Witch's turn healing_used_tonight = False if witch in survivors: if healing: # Witch decides whether to use the healing potion hint = HostMsg( content=Prompts.to_witch_resurrect.format_map( {"witch_name": witch.name, "dead_name": dead_player[0]}, ), ) # Witch decides whether to use the poison set_parsers(witch, Prompts.witch_resurrect_parser) if witch(hint).metadata.get("resurrect", False): healing_used_tonight = True dead_player.pop() healing = False ``` - **1.3 Seer's Turn** The seer has a chance to reveal the true identity of a player. This information can be crucial for the villagers. The `observe()` function allows each agent to take note of a message without immediately replying to it. ```python # Seer's turn if seer in survivors: # Seer chooses a player to reveal their identity hint = HostMsg( content=Prompts.to_seer.format(seer.name, n2s(survivors)), ) set_parsers(seer, Prompts.seer_parser) x = seer(hint) player, idx = extract_name_and_id(x.content) role = "werewolf" if roles[idx] == "werewolf" else "villager" hint = HostMsg(content=Prompts.to_seer_result.format(player, role)) seer.observe(hint) ``` - **1.4 Update Alive Players** Based on the actions taken during the night, the list of surviving players needs to be updated. ```python # Update the list of survivors and werewolves after the night's events survivors, wolves = update_alive_players(survivors, wolves, dead_player) ``` - **2.1 Daytime Phase: Discussion and Voting** During the day, all players will discuss and then vote to eliminate a suspected werewolf. ```python # Daytime discussion with msghub(survivors, announcement=hints) as hub: # Discuss set_parsers(survivors, Prompts.survivors_discuss_parser) x = sequentialpipeline(survivors) # Vote set_parsers(survivors, Prompts.survivors_vote_parser) hint = HostMsg(content=Prompts.to_all_vote.format(n2s(survivors))) votes = [extract_name_and_id(_(hint).content)[0] for _ in survivors] vote_res = majority_vote(votes) # Broadcast the voting result to all players result = HostMsg(content=Prompts.to_all_res.format(vote_res)) hub.broadcast(result) # Update the list of survivors and werewolves after the vote survivors, wolves = update_alive_players(survivors, wolves, vote_res) ``` - **2.2 Check for Winning Conditions** After each phase, the game checks if the werewolves or villagers have won. ```python # Check if either side has won if check_winning(survivors, wolves, "Moderator"): break ``` - **2.3 Continue to the Next Round** If neither werewolves nor villagers win, the game continues to the next round. ```python # If the game hasn't ended, prepare for the next round hub.broadcast(HostMsg(content=Prompts.to_all_continue)) ``` These code blocks outline the core game loop for Werewolf using AgentScope's `msghub` and `pipeline`, which help to easily manage the operational logic of an application. ### Step 5: Run the Application With the game logic and agents set up, you're ready to run the Werewolf game. 
By executing the `pipeline`, the game will proceed through the predefined phases, with agents interacting based on their roles and the strategies coded above: ```bash cd examples/game_werewolf python main.py # Assuming the pipeline is implemented in main.py ``` It is recommended that you start the game in [AgentScope Studio](https://modelscope.github.io/agentscope/en/tutorial/209-gui.html), where you will see the following output in the corresponding link: ![s](https://img.alicdn.com/imgextra/i3/O1CN01n2Q2tR1aCFD2gpTdu_!!6000000003293-1-tps-960-482.gif) [[Return to the top]](#104-usecase-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/105-logging.md: ```md (105-logging-en)= # Logging Welcome to the tutorial on logging in multi-agent applications with AgentScope. We'll also touch on how you can visualize these logs using a simple web interface. This guide will help you track the agent's interactions and system information in a clearer and more organized way. ## Logging The logging utilities consist of a custom setup for the `loguru.logger`, which is an enhancement over Python's built-in `logging` module. We provide custom features: - **Colored Output**: Assigns different colors to different speakers in a chat to enhance readability. - **Redirecting Standard Error (stderr)**: Captures error messages and logs them with the `ERROR` level. - **Custom Log Levels**: Adds a custom level called `CHAT` that is specifically designed for logging dialogue interactions. - **Special Formatting**: Format logs with timestamps, levels, function names, and line numbers. Chat messages are formatted differently to stand out. ### Setting Up the Logger We recommend setting up the logger via `agentscope.init`, and you can set the log level: ```python import agentscope LOG_LEVEL = Literal[ "CHAT", "TRACE", "DEBUG", "INFO", "SUCCESS", "WARNING", "ERROR", "CRITICAL", ] agentscope.init(..., logger_level="INFO") ``` ### Logging a Chat Message Logging chat messages helps keep a record of the conversation between agents. Here's how you can do it: ```python # Log a simple string message. logger.chat("Hello World!") # Log a `msg` representing dialogue with a speaker and content. logger.chat({"name": "User", "content": "Hello, how are you?"}) logger.chat({"name": "Agent", "content": "I'm fine, thank you!"}) ``` ### Logging a System information System logs are crucial for tracking the application's state and identifying issues. Here's how to log different levels of system information: ```python # Log general information useful for understanding the flow of the application. logger.info("The dialogue agent has started successfully.") # Log a warning message indicating a potential issue that isn't immediately problematic. logger.warning("The agent is running slower than expected.") # Log an error message when something has gone wrong. logger.error("The agent encountered an unexpected error while processing a request.") ``` [[Return to the top]](#105-logging-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/201-agent.md: ```md (201-agent-en)= # Agent This tutorial helps you to understand the `Agent` in more depth and navigate through the process of crafting your own custom agent with AgentScope. We start by introducing the fundamental abstraction called `AgentBase`, which serves as the base class to maintain the general behaviors of all agents. Then, we will go through the *AgentPool*, an ensemble of pre-built, specialized agents, each designed with a specific purpose in mind. 
Finally, we will demonstrate how to customize your own agent, ensuring it fits the needs of your project. ## Understanding `AgentBase` The `AgentBase` class is the architectural cornerstone for all agent constructs within the AgentScope. As the superclass of all custom agents, it provides a comprehensive template consisting of essential attributes and methods that underpin the core functionalities of any conversational agent. Each AgentBase derivative is composed of several key characteristics: * `memory`: This attribute enables agents to retain and recall past interactions, allowing them to maintain context in ongoing conversations. For more details about `memory`, we defer to [Memory and Message Management](205-memory). * `model`: The model is the computational engine of the agent, responsible for making a response given existing memory and input. For more details about `model`, we defer to [Using Different Model Sources with Model API](#203-model). * `sys_prompt` & `engine`: The system prompt acts as predefined instructions that guide the agent in its interactions; and the `engine` is used to dynamically generate a suitable prompt. For more details about them, we defer to [Prompt Engine](206-prompt). * `to_dist`: Used to create a distributed version of the agent, to support efficient collaboration among multiple agents. Note that `to_dist` is a reserved field and will be automatically added to the initialization function of any subclass of `AgentBase`. For more details about `to_dist`, please refer to [Distribution](208-distribute). In addition to these attributes, `AgentBase` endows agents with pivotal methods such as `observe` and `reply`: * `observe()`: Through this method, an agent can take note of *message* without immediately replying, allowing it to update its memory based on the observed *message*. * `reply()`: This is the primary method that developers must implement. It defines the agent's behavior in response to an incoming *message*, encapsulating the logic that results in the agent's output. Besides, for unified interfaces and type hints, we introduce another base class `Operator`, which indicates performing some operation on input data by the `__call__` function. And we make `AgentBase` a subclass of `Operator`. ```python class AgentBase(Operator): # ... [code omitted for brevity] def __init__( self, name: str, sys_prompt: Optional[str] = None, model_config_name: str = None, use_memory: bool = True, memory_config: Optional[dict] = None, ) -> None: # ... [code omitted for brevity] def observe(self, x: Union[Msg, Sequence[Msg]]) -> None: # An optional method for updating the agent's internal state based on # messages it has observed. This method can be used to enrich the # agent's understanding and memory without producing an immediate # response. if self.memory: self.memory.add(x) def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # The core method to be implemented by custom agents. It defines the # logic for processing an input message and generating a suitable # response. raise NotImplementedError( f"Agent [{type(self).__name__}] is missing the required " f'"reply" function.', ) # ... [code omitted for brevity] ``` ## Exploring the AgentPool The *AgentPool* within AgentScope is a curated ensemble of ready-to-use, specialized agents. Each of these agents is tailored for a distinct role and comes equipped with default behaviors that address specific tasks. 
The *AgentPool* is designed to expedite the development process by providing various templates of `Agent`. Below is a table summarizing the functionality of some of the key agents available in the Agent Pool: | Agent Type | Description | Typical Use Cases | | ------------------ | ----------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------- | | `AgentBase` | Serves as the superclass for all agents, providing essential attributes and methods. | The foundation for building any custom agent. | | `DialogAgent` | Manages dialogues by understanding context and generating coherent responses. | Customer service bots, virtual assistants. | | `DictDialogAgent` | Manages dialogues by understanding context and generating coherent responses, and the responses are in json format. | Customer service bots, virtual assistants. | | `UserAgent` | Interacts with the user to collect input, generating messages that may include URLs or additional specifics based on required keys. | Collecting user input for agents | | `TextToImageAgent` | An agent that convert user input text to image. | Converting text to image | | `ReActAgent` | An agent class that implements the ReAct algorithm. | Solving complex tasks | | *More to Come* | AgentScope is continuously expanding its pool with more specialized agents for diverse applications. | | ## Customizing Agents from the AgentPool Customizing an agent from AgentPool enables you to tailor its functionality to meet the unique demands of your multi-agent application. You have the flexibility to modify existing agents with minimal effort by **adjusting configurations** and prompts or, for more extensive customization, you can engage in secondary development. Below, we provide usages of how to configure various agents from the AgentPool: ### `DialogAgent` * **Reply Method**: The `reply` method is where the main logic for processing input *message* and generating responses. ```python def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # Additional processing steps can occur here # Record the input if needed if self.memory: self.memory.add(x) # Generate a prompt for the language model using the system prompt and memory prompt = self.model.format( Msg("system", self.sys_prompt, role="system"), self.memory and self.memory.get_memory() or x, # type: ignore[arg-type] ) # Invoke the language model with the prepared prompt response = self.model(prompt).text #Format the response and create a message object msg = Msg(self.name, response, role="assistant") # Print/speak the message in this agent's voice self.speak(msg) # Record the message to memory and return it if self.memory: self.memory.add(msg) return msg ``` * **Usages:** To tailor a `DialogAgent` for a customer service bot: ```python from agentscope.agents import DialogAgent # Configuration for the DialogAgent dialog_agent_config = { "name": "ServiceBot", "model_config_name": "gpt-3.5", # Specify the model used for dialogue generation "sys_prompt": "Act as AI assistant to interact with the others. Try to " "reponse on one line.\n", # Custom prompt for the agent # Other configurations specific to the DialogAgent } # Create and configure the DialogAgent service_bot = DialogAgent(**dialog_agent_config) ``` ### `UserAgent` * **Reply Method**: This method processes user input by prompting for content and if needed, additional keys and a URL. 
The gathered data is stored in a *message* object in the agent's memory for logging or later use and returns the message as a response. ```python def reply( self, x: Optional[Union[Msg, Sequence[Msg]]] = None, required_keys: Optional[Union[list[str], str]] = None, ) -> Msg: # Check if there is initial data to be added to memory if self.memory: self.memory.add(x) content = input(f"{self.name}: ") # Prompt the user for input kwargs = {} # Prompt for additional information based on the required keys if required_keys is not None: if isinstance(required_keys, str): required_keys = [required_keys] for key in required_keys: kwargs[key] = input(f"{key}: ") # Optionally prompt for a URL if required url = None if self.require_url: url = input("URL: ") # Create a message object with the collected input and additional details msg = Msg(self.name, content=content, url=url, **kwargs) # Add the message object to memory if self.memory: self.memory.add(msg) return msg ``` * **Usages:** To configure a `UserAgent` for collecting user input and URLs (of file, image, video, audio , or website): ```python from agentscope.agents import UserAgent # Configuration for UserAgent user_agent_config = { "name": "User", "require_url": True, # If true, the agent will require a URL } # Create and configure the UserAgent user_proxy_agent = UserAgent(**user_agent_config) ``` [[Return to the top]](#201-agent-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/202-pipeline.md: ```md (202-pipeline-en)= # Pipeline and MsgHub **Pipeline & MsgHub** (message hub) are one or a sequence of steps describing how the structured `Msg` passes between multi-agents, which streamlines the process of collaboration across agents. `Pipeline` allows users to program communication among agents easily, and `MsgHub` enables message sharing among agents like a group chat. ## Pipelines `Pipeline` in AgentScope serves as conduits through which messages pass among agents. In AgentScope, an `Agent` is a subclass of an `Operator` that performs some operation on input data. Pipelines extend this concept by encapsulating multiple agents, and also act as an `Operator`. Here is the base class for all pipeline types: ```python class PipelineBase(Operator): """Base interface of all pipelines.""" # ... [code omitted for brevity] @abstractmethod def __call__(self, x: Optional[dict] = None) -> dict: """Define the actions taken by this pipeline. Args: x (Optional[`dict`], optional): Dialog history and some environmental information Returns: `dict`: The pipeline's response to the input. """ ``` ### Category AgentScope provides two main types of pipelines based on their implementation strategy: * **Operator-Type Pipelines** * These pipelines are object-oriented and inherit from the `PipelineBase`. They are operators themselves and can be combined with other operators to create complex interaction patterns. ```python # Instantiate and invoke pipeline = ClsPipeline(agent1, agent2, agent3) x = pipeline(x) ``` * **Functional Pipelines** * Functional pipelines provide similar control flow mechanisms as the class-based pipelines but are implemented as standalone functions. These are useful for scenarios where a class-based setup may not be necessary or preferred. ```python # Just invoke x = funcpipeline(agent1, agent2, agent3, x) ``` Pipelines are categorized based on their functionality, much like programming language constructs. 
The table below outlines the different pipelines available in AgentScope: | Operator-Type Pipeline | Functional Pipeline | Description | | -------------------- | -------------------- | ------------------------------------------------------------ | | `SequentialPipeline` | `sequentialpipeline` | Executes a sequence of operators in order, passing the output of one as the input to the next. | | `IfElsePipeline` | `ifelsepipeline` | Implements conditional logic, executing one operator if a condition is true and another if it is false. | | `SwitchPipeline` | `switchpipeline` | Facilitates multi-branch selection, executing an operator from a mapped set based on the evaluation of a condition. | | `ForLoopPipeline` | `forlooppipeline` | Repeatedly executes an operator for a set number of iterations or until a specified break condition is met. | | `WhileLoopPipeline` | `whilelooppipeline` | Continuously executes an operator as long as a given condition remains true. | | - | `placeholder` | Acts as a placeholder in branches that do not require any operations in flow control like if-else/switch | ### Usage This section illustrates how pipelines can simplify the implementation of logic in multi-agent applications by comparing the usage of pipelines versus approaches without pipelines. **Note:** Please note that in the examples provided below, we use the term `agent` to represent any instance that can act as an `Operator`. This is for ease of understanding and to illustrate how pipelines orchestrate interactions between different operations. You can replace `agent` with any `Operator`, thus allowing for a mix of `agent` and `pipeline` in practice. #### `SequentialPipeline` * Without pipeline: ```python x = agent1(x) x = agent2(x) x = agent3(x) ``` * Using pipeline: ```python from agentscope.pipelines import SequentialPipeline pipe = SequentialPipeline([agent1, agent2, agent3]) x = pipe(x) ``` * Using functional pipeline: ```python from agentscope.pipelines import sequentialpipeline x = sequentialpipeline([agent1, agent2, agent3], x) ``` #### `IfElsePipeline` * Without pipeline: ```python if condition(x): x = agent1(x) else: x = agent2(x) ``` * Using pipeline: ```python from agentscope.pipelines import IfElsePipeline pipe = IfElsePipeline(condition, agent1, agent2) x = pipe(x) ``` * Using functional pipeline: ```python from agentscope.functional import ifelsepipeline x = ifelsepipeline(condition, agent1, agent2, x) ``` #### `SwitchPipeline` * Without pipeline: ```python switch_result = condition(x) if switch_result == case1: x = agent1(x) elif switch_result == case2: x = agent2(x) else: x = default_agent(x) ``` * Using pipeline: ```python from agentscope.pipelines import SwitchPipeline case_operators = {case1: agent1, case2: agent2} pipe = SwitchPipeline(condition, case_operators, default_agent) x = pipe(x) ``` * Using functional pipeline: ```python from agentscope.functional import switchpipeline case_operators = {case1: agent1, case2: agent2} x = switchpipeline(condition, case_operators, default_agent, x) ``` #### `ForLoopPipeline` * Without pipeline: ```python for i in range(max_iterations): x = agent(x) if break_condition(x): break ``` * Using pipeline: ```python from agentscope.pipelines import ForLoopPipeline pipe = ForLoopPipeline(agent, max_iterations, break_condition) x = pipe(x) ``` * Using functional pipeline: ```python from agentscope.functional import forlooppipeline x = forlooppipeline(agent, max_iterations, break_condition, x) ``` #### `WhileLoopPipeline` * Without pipeline: ```python 
while condition(x): x = agent(x) ``` * Using pipeline: ```python from agentscope.pipelines import WhileLoopPipeline pipe = WhileLoopPipeline(agent, condition) x = pipe(x) ``` * Using functional pipeline: ```python from agentscope.functional import whilelooppipeline x = whilelooppipeline(agent, condition, x) ``` ### Pipeline Combination It's worth noting that AgentScope supports the combination of pipelines to create complex interactions. For example, we can create a pipeline that executes a sequence of agents in order, and then executes another pipeline that executes a sequence of agents in condition. ```python from agentscope.pipelines import SequentialPipeline, IfElsePipeline # Create a pipeline that executes agents in order pipe1 = SequentialPipeline([agent1, agent2, agent3]) # Create a pipeline that executes agents in ifElsePipeline pipe2 = IfElsePipeline(condition, agent4, agent5) # Create a pipeline that executes pipe1 and pipe2 in order pipe3 = SequentialPipeline([pipe1, pipe2]) # Invoke the pipeline x = pipe3(x) ``` ## MsgHub `MsgHub` is designed to manage dialogue among a group of agents, allowing for the sharing of messages. Through `MsgHub`, agents can broadcast messages to all other agents in the group with `broadcast`. Here is the core class for a `MsgHub`: ```python class MsgHubManager: """MsgHub manager class for sharing dialog among a group of agents.""" # ... [code omitted for brevity] def broadcast(self, msg: Union[dict, list[dict]]) -> None: """Broadcast the message to all participants.""" for agent in self.participants: agent.observe(msg) def add(self, new_participant: Union[Sequence[AgentBase], AgentBase]) -> None: """Add new participant into this hub""" # ... [code omitted for brevity] def delete(self, participant: Union[Sequence[AgentBase], AgentBase]) -> None: """Delete agents from participant.""" # ... [code omitted for brevity] ``` ### Usage #### Creating a MsgHub To create a `MsgHub`, instantiate a `MsgHubManager` by calling the `msghub` helper function with a list of participating agents. Additionally, you can supply an optional initial announcement that, if provided, will be broadcast to all participants upon initialization. ```python from agentscope.msg_hub import msghub # Initialize MsgHub with participating agents hub_manager = msghub( participants=[agent1, agent2, agent3], announcement=initial_announcement ) ``` #### Broadcast message in MsgHub The `MsgHubManager` can be used with a context manager to handle the setup and teardown of the message hub environment: ```python with msghub( participants=[agent1, agent2, agent3], announcement=initial_announcement ) as hub: # Agents can now broadcast and receive messages within this block agent1() agent2() # Or manually broadcast a message hub.broadcast(some_message) ``` Upon exiting the context block, the `MsgHubManager` ensures that each agent's audience is cleared, preventing any unintended message sharing outside of the hub context. #### Adding and Deleting Participants You can dynamically add or remove agents from the `MsgHub`: ```python # Add a new participant hub.add(new_agent) # Remove an existing participant hub.delete(existing_agent) ``` [[Return to the top]](#202-pipeline-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/203-model.md: ```md (203-model-en)= # Model In AgentScope, the model deployment and invocation are decoupled by `ModelWrapper`. 
Developers can specify their own model by providing model configurations, and AgentScope also provides scripts to support developers to customize model services. ## Supported Models Currently, AgentScope supports the following model service APIs: - OpenAI API, including chat, image generation (DALL-E), and Embedding. - DashScope API, including chat, image sythesis and text embedding. - Gemini API, including chat and embedding. - ZhipuAI API, including chat and embedding. - Ollama API, including chat, embedding and generation. - LiteLLM API, including chat, with various model APIs. - Post Request API, model inference services based on Post requests, including Huggingface/ModelScope Inference API and various post request based model APIs. ## Configuration In AgentScope, users specify the model configuration through the `model_configs` parameter in the `agentscope.init` interface. `model_configs` can be a **dictionary**, **a list of dictionaries**, or a **path** to model configuration file. ```python import agentscope agentscope.init(model_configs=MODEL_CONFIG_OR_PATH) ``` ### Configuration Format In AgentScope, the model configuration is a dictionary used to specify the type of model and set the call parameters. We divide the fields in the model configuration into two categories: _basic parameters_ and _detailed parameters_. Among them, the basic parameters include `config_name` and `model_type`, which are used to distinguish different model configurations and specific `ModelWrapper` types. The detailed parameters will be fed into the corresponding model class's constructor to initialize the model instance. ```python { # Basic parameters "config_name": "gpt-4-temperature-0.0", # Model configuration name "model_type": "openai_chat", # Correspond to `ModelWrapper` type # Detailed parameters # ... } ``` #### Basic Parameters In basic parameters, `config_name` is the identifier of the model configuration, which we will use to specify the model service when initializing an agent. `model_type` corresponds to the type of `ModelWrapper` and is used to specify the type of model service. It corresponds to the `model_type` field in the `ModelWrapper` class in the source code. ```python class OpenAIChatWrapper(OpenAIWrapperBase): """The model wrapper for OpenAI's chat API.""" model_type: str = "openai_chat" # ... ``` In the current AgentScope, the supported `model_type` types, the corresponding `ModelWrapper` classes, and the supported APIs are as follows: | API | Task | Model Wrapper | `model_type` | Some Supported Models | |------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|-------------------------------|--------------------------------------------------| | OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_chat"` | gpt-4, gpt-3.5-turbo, ... | | | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_embedding"` | text-embedding-ada-002, ... 
| | | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_dall_e"` | dall-e-2, dall-e-3 | | DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_chat"` | qwen-plus, qwen-max, ... | | | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_image_synthesis"` | wanx-v1 | | | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_text_embedding"` | text-embedding-v1, text-embedding-v2, ... | | | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_multimodal"` | qwen-vl-plus, qwen-vl-max, qwen-audio-turbo, ... | | Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_chat"` | gemini-pro, ... | | | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_embedding"` | models/embedding-001, ... | | ZhipuAI API | Chat | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_chat"` | glm4, ... | | | Embedding | [`ZhipuAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_embedding"` | embedding-2, ... | | ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_chat"` | llama2, ... | | | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_embedding"` | llama2, ... | | | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_generate"` | llama2, ... | | LiteLLM API | Chat | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | `"litellm_chat"` | - | | Post Request based API | - | [`PostAPIModelWrapperBase`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `"post_api"` | - | | | Chat | [`PostAPIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `"post_api_chat"` | meta-llama/Meta-Llama-3-8B-Instruct, ... | | | Image Synthesis | [`PostAPIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `post_api_dall_e` | - | | | | Embedding | [`PostAPIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `post_api_embedding` | - | #### Detailed Parameters In AgentScope, the detailed parameters are different according to the different `ModelWrapper` classes. To specify the detailed parameters, you need to refer to the specific `ModelWrapper` class and its constructor. Here we provide example configurations for different model wrappers. ##### OpenAI API
OpenAI Chat API (agentscope.models.OpenAIChatWrapper)

```python
{
    "config_name": "{your_config_name}",
    "model_type": "openai_chat",

    # Required parameters
    "model_name": "gpt-4",

    # Optional parameters
    "api_key": "{your_api_key}",           # OpenAI API Key, if not provided, it will be read from the environment variable
    "organization": "{your_organization}", # Organization name, if not provided, it will be read from the environment variable
    "client_args": {                       # Parameters for initializing the OpenAI API Client
        # e.g. "max_retries": 3,
    },
    "generate_args": {                     # Parameters passed to the model when calling
        # e.g. "temperature": 0.0
    },
    "budget": 100                          # API budget
}
```
OpenAI DALL·E API (agentscope.models.OpenAIDALLEWrapper)

```python
{
    "config_name": "{your_config_name}",
    "model_type": "openai_dall_e",

    # Required parameters
    "model_name": "{model_name}",          # OpenAI model name, e.g. dall-e-2, dall-e-3

    # Optional parameters
    "api_key": "{your_api_key}",           # OpenAI API Key, if not provided, it will be read from the environment variable
    "organization": "{your_organization}", # Organization name, if not provided, it will be read from the environment variable
    "client_args": {                       # Parameters for initializing the OpenAI API Client
        # e.g. "max_retries": 3,
    },
    "generate_args": {                     # Parameters passed to the model when calling
        # e.g. "n": 1, "size": "512x512"
    }
}
```
OpenAI Embedding API (agentscope.models.OpenAIEmbeddingWrapper)

```python
{
    "config_name": "{your_config_name}",
    "model_type": "openai_embedding",

    # Required parameters
    "model_name": "{model_name}",          # OpenAI model name, e.g. text-embedding-ada-002, text-embedding-3-small

    # Optional parameters
    "api_key": "{your_api_key}",           # OpenAI API Key, if not provided, it will be read from the environment variable
    "organization": "{your_organization}", # Organization name, if not provided, it will be read from the environment variable
    "client_args": {                       # Parameters for initializing the OpenAI API Client
        # e.g. "max_retries": 3,
    },
    "generate_args": {                     # Parameters passed to the model when calling
        # e.g. "encoding_format": "float"
    }
}
```
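As a minimal sketch of how these detailed parameters are typically used, the snippet below registers two `openai_chat` configurations that differ only in `generate_args`, and binds each agent to one of them through `model_config_name`. The config names are made up for illustration, and the API key is assumed to be provided via the `OPENAI_API_KEY` environment variable as described above.

```python
import agentscope
from agentscope.agents import DialogAgent

# Two configs for the same model that differ only in their detailed
# parameters; the "config_name" field is what agents refer to later.
agentscope.init(
    model_configs=[
        {
            "config_name": "gpt-4-deterministic",  # hypothetical name
            "model_type": "openai_chat",
            "model_name": "gpt-4",
            "generate_args": {"temperature": 0.0},
        },
        {
            "config_name": "gpt-4-creative",       # hypothetical name
            "model_type": "openai_chat",
            "model_name": "gpt-4",
            "generate_args": {"temperature": 0.9},
        },
    ],
)

# Each agent selects one configuration by its config_name.
judge = DialogAgent(
    name="judge",
    model_config_name="gpt-4-deterministic",
    sys_prompt="You give short, factual answers.",
)
storyteller = DialogAgent(
    name="storyteller",
    model_config_name="gpt-4-creative",
    sys_prompt="You tell imaginative stories.",
)
```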

##### DashScope API
DashScope Chat API (agentscope.models.DashScopeChatWrapper)

```python
{
    "config_name": "my_dashscope_chat_config",
    "model_type": "dashscope_chat",

    # Required parameters
    "model_name": "{model_name}",   # The model name in DashScope API, e.g. qwen-max

    # Optional parameters
    "api_key": "{your_api_key}",    # DashScope API Key, if not provided, it will be read from the environment variable
    "generate_args": {
        # e.g. "temperature": 0.5
    },
}
```
DashScope Image Synthesis API (agentscope.models.DashScopeImageSynthesisWrapper)

```python
{
    "config_name": "my_dashscope_image_synthesis_config",
    "model_type": "dashscope_image_synthesis",

    # Required parameters
    "model_name": "{model_name}",   # The model name in DashScope Image Synthesis API, e.g. wanx-v1

    # Optional parameters
    "api_key": "{your_api_key}",
    "generate_args": {
        "negative_prompt": "xxx",
        "n": 1,
        # ...
    }
}
```
DashScope Text Embedding API (agentscope.models.DashScopeTextEmbeddingWrapper)

```python
{
    "config_name": "my_dashscope_text_embedding_config",
    "model_type": "dashscope_text_embedding",

    # Required parameters
    "model_name": "{model_name}",   # The model name in DashScope Text Embedding API, e.g. text-embedding-v1

    # Optional parameters
    "api_key": "{your_api_key}",
    "generate_args": {
        # ...
    },
}
```
DashScope Multimodal Conversation API (agentscope.models.DashScopeMultiModalWrapper)

```python
{
    "config_name": "my_dashscope_multimodal_config",
    "model_type": "dashscope_multimodal",

    # Required parameters
    "model_name": "{model_name}",   # The model name in DashScope Multimodal Conversation API, e.g. qwen-vl-plus

    # Optional parameters
    "api_key": "{your_api_key}",
    "generate_args": {
        # ...
    },
}
```

##### Gemini API
Gemini Chat API (agentscope.models.GeminiChatWrapper)

```python
{
    "config_name": "my_gemini_chat_config",
    "model_type": "gemini_chat",

    # Required parameters
    "model_name": "{model_name}",   # The model name in Gemini API, e.g. gemini-pro

    # Optional parameters
    "api_key": "{your_api_key}",    # If not provided, the API key will be read from the environment variable GEMINI_API_KEY
}
```
Gemini Embedding API (agentscope.models.GeminiEmbeddingWrapper)

```python
{
    "config_name": "my_gemini_embedding_config",
    "model_type": "gemini_embedding",

    # Required parameters
    "model_name": "{model_name}",   # The model name in Gemini API, e.g. models/embedding-001

    # Optional parameters
    "api_key": "{your_api_key}",    # If not provided, the API key will be read from the environment variable GEMINI_API_KEY
}
```

##### ZhipuAI API
ZhipuAI Chat API (agentscope.models.ZhipuAIChatWrapper)

```python
{
    "config_name": "my_zhipuai_chat_config",
    "model_type": "zhipuai_chat",

    # Required parameters
    "model_name": "{model_name}",   # The model name in ZhipuAI API, e.g. glm-4

    # Optional parameters
    "api_key": "{your_api_key}"
}
```
ZhipuAI Embedding API (agentscope.models.ZhipuAIEmbeddingWrapper)

```python
{
    "config_name": "my_zhipuai_embedding_config",
    "model_type": "zhipuai_embedding",

    # Required parameters
    "model_name": "{model_name}",   # The model name in ZhipuAI API, e.g. embedding-2

    # Optional parameters
    "api_key": "{your_api_key}",
}
```

#### Ollama API
Ollama Chat API (agentscope.models.OllamaChatWrapper) ```python { "config_name": "my_ollama_chat_config", "model_type": "ollama_chat", # Required parameters "model_name": "{model_name}", # The model name used in ollama API, e.g. llama2 # Optional parameters "options": { # Parameters passed to the model when calling # e.g. "temperature": 0., "seed": 123, }, "keep_alive": "5m", # Controls how long the model will stay loaded into memory } ```
Ollama Generation API (agentscope.models.OllamaGenerationWrapper) ```python { "config_name": "my_ollama_generate_config", "model_type": "ollama_generate", # Required parameters "model_name": "{model_name}", # The model name used in ollama API, e.g. llama2 # Optional parameters "options": { # Parameters passed to the model when calling # "temperature": 0., "seed": 123, }, "keep_alive": "5m", # Controls how long the model will stay loaded into memory } ```
Ollama Embedding API (agentscope.models.OllamaEmbeddingWrapper) ```python { "config_name": "my_ollama_embedding_config", "model_type": "ollama_embedding", # Required parameters "model_name": "{model_name}", # The model name used in ollama API, e.g. llama2 # Optional parameters "options": { # Parameters passed to the model when calling # "temperature": 0., "seed": 123, }, "keep_alive": "5m", # Controls how long the model will stay loaded into memory } ```
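Putting the Ollama config to work might look like the sketch below, which assumes a local Ollama server is already running with the `llama2` model pulled; the config and agent names are placeholders.

```python
import agentscope
from agentscope.agents import DialogAgent
from agentscope.message import Msg

# Assumes a local Ollama server is running and `llama2` has already been pulled.
agentscope.init(
    model_configs=[
        {
            "config_name": "my_ollama_chat_config",
            "model_type": "ollama_chat",
            "model_name": "llama2",
            "options": {"temperature": 0.7},
            "keep_alive": "5m",
        },
    ],
)

assistant = DialogAgent(
    name="assistant",
    model_config_name="my_ollama_chat_config",
    sys_prompt="You are a helpful assistant.",
)

# Agents are called on messages and return messages.
reply = assistant(Msg("user", "Hi there!", role="user"))
print(reply.content)
```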

#### LiteLLM Chat API
LiteLLM Chat API (agentscope.models.LiteLLMChatWrapper)

```python
{
    "config_name": "lite_llm_openai_chat_gpt-3.5-turbo",
    "model_type": "litellm_chat",
    "model_name": "gpt-3.5-turbo",
    # Note: set the environment variables required by the backend model you use,
    # e.g. OPENAI_API_KEY for OpenAI models. See https://docs.litellm.ai/docs/ for details.
}
```
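Since LiteLLM routes requests to the underlying provider, the only extra step is usually exporting that provider's credentials. A hedged sketch for the OpenAI-backed config above (the key value is a placeholder):

```python
import os
import agentscope

# LiteLLM forwards the request to the underlying provider, so export that
# provider's credentials; here OPENAI_API_KEY for the OpenAI-backed model above.
os.environ.setdefault("OPENAI_API_KEY", "{your_api_key}")  # placeholder value

agentscope.init(
    model_configs=[
        {
            "config_name": "lite_llm_openai_chat_gpt-3.5-turbo",
            "model_type": "litellm_chat",
            "model_name": "gpt-3.5-turbo",
        },
    ],
)
```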

#### Post Request API
Post Request Chat API (agentscope.models.PostAPIChatWrapper) ```python { "config_name": "my_postapichatwrapper_config", "model_type": "post_api_chat", # Required parameters "api_url": "https://xxx.xxx", "headers": { # e.g. "Authorization": "Bearer xxx", }, # Optional parameters "messages_key": "messages", } ``` > ⚠️ The Post Request Chat model wrapper (`PostAPIChatWrapper`) has the following properties: > 1) The `.format()` function makes sure the input messages become a list of dicts. > 2) The `._parse_response()` function assumes the generated text will be in `response["data"]["response"]["choices"][0]["message"]["content"]`
Post Request Image Synthesis API (agentscope.models.PostAPIDALLEWrapper) ```python { "config_name": "my_postapiwrapper_config", "model_type": "post_api_dall_e", # Required parameters "api_url": "https://xxx.xxx", "headers": { # e.g. "Authorization": "Bearer xxx", }, # Optional parameters "messages_key": "messages", } ``` > ⚠️ The Post Request Image Synthesis model wrapper (`PostAPIDALLEWrapper`) has the following properties: > 1) The `._parse_response()` function assumes the generated image will be presented as urls in `response["data"]["response"]["data"][i]["url"]`
Post Request Embedding API (agentscope.models.PostAPIEmbeddingWrapper) ```python { "config_name": "my_postapiwrapper_config", "model_type": "post_api_embedding", # Required parameters "api_url": "https://xxx.xxx", "headers": { # e.g. "Authorization": "Bearer xxx", }, # Optional parameters "messages_key": "messages", } ``` > ⚠️ The Post Request Embedding model wrapper (`PostAPIEmbeddingWrapper`) has the following properties: > 1) The `._parse_response()` function assumes the generated embeddings will be in `response["data"]["response"]["data"][i]["embedding"]`
Post Request API (agentscope.models.PostAPIModelWrapperBase) ```python { "config_name": "my_postapiwrapper_config", "model_type": "post_api", # Required parameters "api_url": "https://xxx.xxx", "headers": { # e.g. "Authorization": "Bearer xxx", }, # Optional parameters "messages_key": "messages", } ``` > ⚠️ Post Request model wrapper (`PostAPIModelWrapperBase`) returns raw HTTP responses from the API in ModelResponse, and the `.format()` is not implemented. It is recommended to use `Post Request Chat API` when running examples with chats. > `PostAPIModelWrapperBase` can be used when > 1) only the raw HTTP response is wanted and `.format()` is not called; > 2) Or, the developers want to overwrite the `.format()` and/or `._parse_response()` functions.
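To make the expected response layout concrete, the sketch below shows a toy HTTP endpoint (using Flask, which is not part of AgentScope) that returns its reply under `data.response.choices[0].message.content`, the path the note on `PostAPIChatWrapper` above says `._parse_response()` reads from. The `/chat` route, port, and canned reply are all made up for illustration.

```python
# pip install flask
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/chat", methods=["POST"])
def chat():
    data = request.get_json(force=True)
    messages = data.get("messages", [])  # matches the default messages_key
    # Call your own model here; a canned reply keeps the sketch self-contained.
    reply_text = f"Received {len(messages)} messages."
    # Nest the reply where, per the note above, PostAPIChatWrapper expects it:
    # response["data"]["response"]["choices"][0]["message"]["content"]
    return jsonify({
        "data": {
            "response": {
                "choices": [
                    {"message": {"role": "assistant", "content": reply_text}},
                ],
            },
        },
    })

if __name__ == "__main__":
    app.run(port=8000)
```

A matching model configuration would then point `api_url` at this endpoint (e.g. `http://127.0.0.1:8000/chat`) and keep the default `messages_key` of `"messages"`.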

## Build Model Service from Scratch

For developers who need to build their own model services, AgentScope provides some scripts to help you set them up quickly. You can find these scripts and instructions in the [scripts](https://github.com/modelscope/agentscope/tree/main/scripts) directory. Specifically, AgentScope provides the following model service scripts:

- [CPU inference engine **ollama**](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#ollama)
- [Model service based on **Flask + Transformers**](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#with-transformers-library)
- [Model service based on **Flask + ModelScope**](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#with-modelscope-library)
- [**FastChat** inference engine](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#fastchat)
- [**vllm** inference engine](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#vllm)

For how to quickly start these model services, refer to the [README.md](https://github.com/modelscope/agentscope/blob/main/scripts/README.md) file under the [scripts](https://github.com/modelscope/agentscope/blob/main/scripts/) directory.

## Create Your Own Model Wrapper

AgentScope allows developers to customize their own model wrappers. The new model wrapper class should

- inherit from the `ModelWrapperBase` class,
- provide a `model_type` field to identify this model wrapper in the model configuration, and
- implement its `__init__` and `__call__` functions.

The following is an example of creating a new model wrapper class.

```python
from agentscope.models import ModelWrapperBase

class MyModelWrapper(ModelWrapperBase):

    model_type: str = "my_model"

    def __init__(self, config_name, my_arg1, my_arg2, **kwargs):
        # Initialize the model instance
        super().__init__(config_name=config_name)
        # ...

    def __call__(self, input, **kwargs) -> str:
        # Call the model instance
        # ...
```

After creating the new model wrapper class, the model wrapper will be registered into AgentScope automatically, and you can use it in the model configuration directly.

```python
my_model_config = {
    # Basic parameters
    "config_name": "my_model_config",
    "model_type": "my_model",

    # Detailed parameters
    "my_arg1": "xxx",
    "my_arg2": "yyy",
    # ...
} ``` [[Return to Top]](#203-model-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/203-parser.md: ```md (203-parser-en)= # Response Parser ## Table of Contents - [Background](#background) - [Parser Module](#parser-module) - [Overview](#overview) - [String Type](#string-type) - [MarkdownCodeBlockParser](#markdowncodeblockparser) - [Initialization](#initialization) - [Format Instruction Template](#format-instruction-template) - [Parse Function](#parse-function) - [Dictionary Type](#dictionary-type) - [MarkdownJsonDictParser](#markdownjsondictparser) - [Initialization & Format Instruction Template](#initialization--format-instruction-template) - [Validation](#validation) - [MultiTaggedContentParser](#multitaggedcontentparser) - [Initialization & Format Instruction Template](#initialization--format-instruction-template-1) - [Parse Function](#parse-function-1) - [JSON / Python Object Type](#json--python-object-type) - [MarkdownJsonObjectParser](#markdownjsonobjectparser) - [Initialization & Format Instruction Template](#initialization--format-instruction-template-2) - [Parse Function](#parse-function-2) - [Typical Use Cases](#typical-use-cases) - [WereWolf Game](#werewolf-game) - [ReAct Agent and Tool Usage](#react-agent-and-tool-usage) - [Customized Parser](#customized-parser) ## Background In the process of building LLM-empowered application, parsing the LLM generated string into a specific format and extracting the required information is a very important step. However, due to the following reasons, this process is also a very complex process: 1. **Diversity**: The target format of parsing is diverse, and the information to be extracted may be a specific text, a JSON object, or a complex data structure. 2. **Complexity**: The result parsing is not only to convert the text generated by LLM into the target format, but also involves a series of issues such as prompt engineering (reminding LLM what format of output should be generated), error handling, etc. 3. **Flexibility**: Even in the same application, different stages may also require the agent to generate output in different formats. For the convenience of developers, AgentScope provides a parser module to help developers parse LLM response into a specific format. By using the parser module, developers can easily parse the response into the target format by simple configuration, and switch the target format flexibly. In AgentScope, the parser module features 1. **Flexibility**: Developers can flexibly set the required format, flexibly switch the parser without modifying the code of agent class. That is, the specific "target format" and the agent's `reply` function are decoupled. 2. **Freedom**: The format instruction, result parsing and prompt engineering are all explicitly finished in the `reply` function. Developers and users can freely choose to use the parser or parse LLM response by their own code. 3. **Transparency**: When using the parser, the process and results of prompt construction are completely visible and transparent to developers in the `reply` function, and developers can precisely debug their applications. ## Parser Module ### Overview The main functions of the parser module include: 1. Provide "format instruction", that is, remind LLM where to generate what output, for example ```` You should generate python code in a fenced code block as follows ```python {your_python_code} ``` ```` 2. Provide a parse function, which directly parses the text generated by LLM into the target data format, 3. 
Post-processing for dictionary format. After parsing the text into a dictionary, different fields may have different uses. AgentScope provides multiple built-in parsers, and developers can choose according to their needs. | Target Format | Parser Class | Description | |---------------------------|----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | String | `MarkdownCodeBlockParser` | Requires LLM to generate specified text within a Markdown code block marked by ```. The result is a string. | | Dictionary | `MarkdownJsonDictParser` | Requires LLM to produce a specified dictionary within the code block marked by \```json and \```. The result is a Python dictionary. | | | `MultiTaggedContentParser` | Requires LLM to generate specified content within multiple tags. Contents from different tags will be parsed into a single Python dictionary with different key-value pairs. | | | `RegexTaggedContentParser` | For uncertain tag names and quantities, allows users to modify regular expressions, and the return result is a dictionary. | | JSON / Python Object Type | `MarkdownJsonObjectParser` | Requires LLM to produce specified content within the code block marked by \```json and \```. The result will be converted into a Python object via json.loads. | > **NOTE**: Compared to `MarkdownJsonDictParser`, `MultiTaggedContentParser` is more suitable for weak LLMs and when the required format is too complex. > For example, when LLM is required to generate Python code, if the code is returned directly within a dictionary, LLM needs to be aware of escaping characters (\t, \n, ...), and the differences between double and single quotes when calling `json.loads` > > In contrast, `MultiTaggedContentParser` guides LLM to generate each key-value pair separately in individual tags and then combines them into a dictionary, thus reducing the difficulty. >**NOTE**: The built-in strategies to construct format instruction just provide some examples. In AgentScope, developer has complete control over prompt construction. So they can choose not to use the format instruction provided by parsers, customizing their format instruction by hand or implementing new parser class are all feasible. In the following sections, we will introduce the usage of these parsers based on different target formats. ### String Type #### MarkdownCodeBlockParser ##### Initialization - `MarkdownCodeBlockParser` requires LLM to generate specific text within a specified code block in Markdown format. Different languages can be specified with the `language_name` parameter to utilize the large model's ability to produce corresponding outputs. For example, when asking the large model to produce Python code, initialize as follows: ```python from agentscope.parsers import MarkdownCodeBlockParser parser = MarkdownCodeBlockParser(language_name="python", content_hint="your python code") ``` ##### Format Instruction Template - `MarkdownCodeBlockParser` provides the following format instruction template. 
When the user calls the `format_instruction` attribute, `{language_name}` will be replaced with the string entered at initialization: ```` You should generate {language_name} code in a {language_name} fenced code block as follows: ```{language_name} {content_hint} ``` ```` - For the above initialization with `language_name` as `"python"`, when the `format_instruction` attribute is called, the following string will be returned: ```python print(parser.format_instruction) ``` ```` You should generate python code in a python fenced code block as follows ```python your python code ``` ```` ##### Parse Function - `MarkdownCodeBlockParser` provides a `parse` method to parse the text generated by LLM。Its input and output are both `ModelResponse` objects, and the parsing result will be mounted on the `parsed` attribute of the output object. ````python res = parser.parse( ModelResponse( text="""The following is generated python code ```python print("Hello world!") ``` """ ) ) print(res.parsed) ```` ``` print("hello world!") ``` ### Dictionary Type Different from string and general JSON/Python object, as a powerful format in LLM applications, AgentScope provides additional post-processing functions for dictionary type. When initializing the parser, you can set the `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` parameters to achieve filtering of key-value pairs when calling the parser's `to_content`, `to_memory`, and `to_metadata` methods. - `keys_to_content` specifies the key-value pairs that will be placed in the `content` field of the returned `Msg` object. The content field will be returned to other agents, participate in their prompt construction, and will also be called by the `self.speak` function for display. - `keys_to_memory` specifies the key-value pairs that will be stored in the memory of the agent. - `keys_to_metadata` specifies the key-value pairs that will be placed in the `metadata` field of the returned `Msg` object, which can be used for application control flow judgment, or mount some information that does not need to be returned to other agents. The three parameters receive bool values, string and a list of strings. The meaning of their values is as follows: - `False`: The corresponding filter function will return `None`. - `True`: The whole dictionary will be returned. - `str`: The corresponding value will be directly returned. - `List[str]`: A filtered dictionary will be returned according to the list of keys. By default, `keys_to_content` and `keys_to_memory` are `True`, that is, the whole dictionary will be returned. `keys_to_metadata` defaults to `False`, that is, the corresponding filter function will return `None`. For example, the dictionary generated by the werewolf in the daytime discussion in a werewolf game. In this example, - `"thought"` should not be returned to other agents, but should be stored in the agent's memory to ensure the continuity of the werewolf strategy; - `"speak"` should be returned to other agents and stored in the agent's memory; - `"finish_discussion"` is used in the application's control flow to determine whether the discussion has ended. To save tokens, this field should not be returned to other agents or stored in the agent's memory. ```python { "thought": "The others didn't realize I was a werewolf. 
I should end the discussion soon.", "speak": "I agree with you.", "finish_discussion": True } ``` In AgentScope, we achieve post-processing by calling the `to_content`, `to_memory`, and `to_metadata` methods, as shown in the following code: - The code for the application's control flow, create the corresponding parser object and load it ```python from agentscope.parsers import MarkdownJsonDictParser # ... agent = DictDialogAgent(...) # Take MarkdownJsonDictParser as example parser = MarkdownJsonDictParser( content_hint={ "thought": "what you thought", "speak": "what you speak", "finish_discussion": "whether the discussion is finished" }, keys_to_content="speak", keys_to_memory=["thought", "speak"], keys_to_metadata=["finish_discussion"] ) # Load parser, which is equivalent to specifying the required format agent.set_parser(parser) # The discussion process while True: # ... x = agent(x) # Break the loop according to the finish_discussion field in metadata if x.metadata["finish_discussion"]: break ``` - Filter the dictionary in the agent's `reply` function ```python # ... def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # ... res = self.model(prompt, parse_func=self.parser.parse) # Story the thought and speak fields into memory self.memory.add( Msg( self.name, content=self.parser.to_memory(res.parsed), role="assistant", ) ) # Store in content and metadata fields in the returned Msg object msg = Msg( self.name, content=self.parser.to_content(res.parsed), role="assistant", metadata=self.parser.to_metadata(res.parsed), ) self.speak(msg) return msg ``` > **Note**: `keys_to_content`, `keys_to_memory`, and `keys_to_metadata` parameters can be a string, a list of strings, or a bool value. > - For `True`, the `to_content`, `to_memory`, and `to_metadata` methods will directly return the whole dictionary. > - For `False`, the `to_content`, `to_memory`, and `to_metadata` methods will directly return `None`. > - For a string, the `to_content`, `to_memory`, and `to_metadata` methods will directly extract the corresponding value. For example, if `keys_to_content="speak"`, the `to_content` method will put `res.parsed["speak"]` into the `content` field of the `Msg` object, and the `content` field will be a string rather than a dictionary. > - For a list of string, the `to_content`, `to_memory`, and `to_metadata` methods will filter the dictionary according to the list of keys. > ```python > parser = MarkdownJsonDictParser( > content_hint={ > "thought": "what you thought", > "speak": "what you speak", > }, > keys_to_content="speak", > keys_to_memory=["thought", "speak"], > ) > > example_dict = {"thought": "abc", "speak": "def"} > print(parser.to_content(example_dict)) # def > print(parser.to_memory(example_dict)) # {"thought": "abc", "speak": "def"} > print(parser.to_metadata(example_dict)) # None > ``` > ``` > def > {"thought": "abc", "speak": "def"} > None > ``` #### Parsers For dictionary type return values, AgentScope provides multiple parsers for developers to choose from according to their needs. ##### RegexTaggedContentParser ###### Initialization `RegexTaggedContentParser` is designed for scenarios where 1) the tag name is uncertain, and 2) the number of tags is uncertain. In this case, the parser cannot provide a general response format instruction, so developers need to provide the corresponding response format instruction (`format_instruction`) when initializing. Of course, the developers can handle the prompt engineering by themselves optionally. 
```python
from agentscope.parsers import RegexTaggedContentParser

parser = RegexTaggedContentParser(
    format_instruction="""Respond with specific tags as outlined below
<thought>what you thought</thought>
<speak>what you speak</speak>
""",
    try_parse_json=True,                # Try to parse the content of the tag as JSON object
    required_keys=["thought", "speak"]  # Required keys in the returned dictionary
)
```

##### MarkdownJsonDictParser

###### Initialization & Format Instruction Template

- `MarkdownJsonDictParser` requires LLM to generate a dictionary within a code block fenced by \```json and \``` tags.
- Besides `keys_to_content`, `keys_to_memory` and `keys_to_metadata`, the `content_hint` parameter can be provided to give an example and explanation of the response result, that is, to remind LLM where and what kind of dictionary should be generated. This parameter can be a string or a dictionary. For a dictionary, it will be automatically converted to a string when constructing the format instruction.

```python
from agentscope.parsers import MarkdownJsonDictParser

# dictionary as content_hint
MarkdownJsonDictParser(
    content_hint={
        "thought": "what you thought",
        "speak": "what you speak",
    }
)
# or string as content_hint
MarkdownJsonDictParser(
    content_hint="""{
    "thought": "what you thought",
    "speak": "what you speak",
}"""
)
```

- The corresponding `instruction_format` attribute

````
You should respond a json object in a json fenced code block as follows:
```json
{content_hint}
```
````

###### Validation

The `content_hint` parameter in `MarkdownJsonDictParser` also supports type validation based on Pydantic. When initializing, you can set `content_hint` to a Pydantic model class, and AgentScope will modify the `instruction_format` attribute based on this class. Besides, Pydantic will be used to validate the dictionary returned by LLM during parsing.

A simple example is as follows, where `"..."` can be filled with specific type validation rules, which can be found in the [Pydantic](https://docs.pydantic.dev/latest/) documentation.

```python
from pydantic import BaseModel, Field
from agentscope.parsers import MarkdownJsonDictParser

class Schema(BaseModel):
    thought: str = Field(..., description="what you thought")
    speak: str = Field(..., description="what you speak")
    end_discussion: bool = Field(..., description="whether the discussion is finished")

parser = MarkdownJsonDictParser(content_hint=Schema)
```

- The corresponding `instruction_format` attribute

````
Respond a JSON dictionary in a markdown's fenced code block as follows:
```json
{a_JSON_dictionary}
```
The generated JSON dictionary MUST follow this schema:
{'properties': {'speak': {'description': 'what you speak', 'title': 'Speak', 'type': 'string'}, 'thought': {'description': 'what you thought', 'title': 'Thought', 'type': 'string'}, 'end_discussion': {'description': 'whether the discussion reached an agreement or not', 'title': 'End Discussion', 'type': 'boolean'}}, 'required': ['speak', 'thought', 'end_discussion'], 'title': 'Schema', 'type': 'object'}
````

- During the parsing process, Pydantic will be used for type validation, and an exception will be thrown if the validation fails. Meanwhile, Pydantic also provides some fault tolerance capabilities, such as converting the string `"true"` to Python's `True`:

````
parser.parse("""
```json
{ "thought": "The others didn't realize I was a werewolf.
I should end the discussion soon.", "speak": "I agree with you.", "end_discussion": "true" } ``` """) ```` ##### MultiTaggedContentParser `MultiTaggedContentParser` asks LLM to generate specific content within multiple tag pairs. The content from different tag pairs will be parsed into a single Python dictionary. Its usage is similar to `MarkdownJsonDictParser`, but the initialization method is different, and it is more suitable for weak LLMs or complex return content. ###### Initialization & Format Instruction Template Within `MultiTaggedContentParser`, each tag pair will be specified by as `TaggedContent` object, which contains - Tag name (`name`), the key value in the returned dictionary - Start tag (`tag_begin`) - Hint for content (`content_hint`) - End tag (`tag_end`) - Content parsing indication (`parse_json`), default as `False`. When set to `True`, the parser will automatically add hint that requires JSON object between the tags, and its extracted content will be parsed into a Python object via `json.loads` ```python from agentscope.parsers import MultiTaggedContentParser, TaggedContent parser = MultiTaggedContentParser( TaggedContent( name="thought", tag_begin="[THOUGHT]", content_hint="what you thought", tag_end="[/THOUGHT]" ), TaggedContent( name="speak", tag_begin="[SPEAK]", content_hint="what you speak", tag_end="[/SPEAK]" ), TaggedContent( name="finish_discussion", tag_begin="[FINISH_DISCUSSION]", content_hint="true/false, whether the discussion is finished", tag_end="[/FINISH_DISCUSSION]", parse_json=True, # we expect the content of this field to be parsed directly into a Python boolean value ) ) print(parser.format_instruction) ``` ``` Respond with specific tags as outlined below, and the content between [FINISH_DISCUSSION] and [/FINISH_DISCUSSION] MUST be a JSON object: [THOUGHT]what you thought[/THOUGHT] [SPEAK]what you speak[/SPEAK] [FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION] ``` ###### Parse Function - `MultiTaggedContentParser`'s parsing result is a dictionary, whose keys are the value of `name` in the `TaggedContent` objects. The following is an example of parsing the LLM response in the werewolf game: ```python res_dict = parser.parse( ModelResponse( text="""As a werewolf, I should keep pretending to be a villager [THOUGHT]The others didn't realize I was a werewolf. I should end the discussion soon.[/THOUGHT] [SPEAK]I agree with you.[/SPEAK] [FINISH_DISCUSSION]true[/FINISH_DISCUSSION]""" ) ) print(res_dict) ``` ``` { "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.", "speak": "I agree with you.", "finish_discussion": true } ``` ### JSON / Python Object Type #### MarkdownJsonObjectParser `MarkdownJsonObjectParser` also uses the \```json and \``` tags in Markdown, but does not limit the content type. It can be a list, dictionary, number, string, etc., which can be parsed into a Python object via `json.loads`. 
##### Initialization & Format Instruction Template ```python from agentscope.parsers import MarkdownJsonObjectParser parser = MarkdownJsonObjectParser( content_hint="{A list of numbers.}" ) print(parser.format_instruction) ``` ```` You should respond a json object in a json fenced code block as follows: ```json {a list of numbers} ``` ```` ##### Parse Function ````python res = parser.parse( ModelResponse( text="""Yes, here is the generated list ```json [1,2,3,4,5] ``` """) ) print(type(res)) print(res) ```` ``` [1, 2, 3, 4, 5] ``` ## Typical Use Cases ### WereWolf Game Werewolf game is a classic use case of dictionary parser. In different stages of the game, the same agent needs to generate different identification fields in addition to `"thought"` and `"speak"`, such as whether the discussion is over, whether the seer uses its ability, whether the witch uses the antidote and poison, and voting. AgentScope has built-in examples of [werewolf game](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf), which uses `DictDialogAgent` class and different parsers to achieve flexible target format switching. By using the post-processing function of the parser, it separates "thought" and "speak", and controls the progress of the game successfully. More details can be found in the werewolf game [source code](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf). ### ReAct Agent and Tool Usage `ReActAgent` is an agent class built for tool usage in AgentScope, based on the ReAct algorithm, and can be used with different tool functions. The tool call, format parsing, and implementation of `ReActAgent` are similar to the parser. For detailed implementation, please refer to the [source code](https://github.com/modelscope/agentscope/blob/main/src/agentscope/agents/react_agent.py). ## Customized Parser AgentScope provides a base class `ParserBase` for parsers. Developers can inherit this base class, and implement the `format_instruction` attribute and `parse` method to create their own parser. For dictionary type parsing, you can also inherit the `agentscope.parser.DictFilterMixin` class to implement post-processing for dictionary type. ```python from abc import ABC, abstractmethod from agentscope.models import ModelResponse class ParserBase(ABC): """The base class for model response parser.""" format_instruction: str """The instruction for the response format.""" @abstractmethod def parse(self, response: ModelResponse) -> ModelResponse: """Parse the response text to a specific object, and stored in the parsed field of the response object.""" # ... ``` ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/203-stream.md: ```md (203-stream-en)= # Streaming AgentScope supports streaming mode for the following LLM APIs in both **terminal** and **AgentScope Studio**. 
| API | Model Wrapper | `model_type` field in model configuration | |--------------------|---------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------| | OpenAI Chat API | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_chat"` | | DashScope Chat API | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_chat"` | | Gemini Chat API | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_chat"` | | ZhipuAI Chat API | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_chat"` | | ollama Chat API | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_chat"` | | LiteLLM Chat API | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | `"litellm_chat"` | ## Setup Streaming Mode AgentScope allows users to set up streaming mode in both model configuration and model calling. ### In Model Configuration To use streaming mode, set the stream field to `True` in the model configuration. ```python model_config = { "config_name": "xxx", "model_type": "xxx", "stream": True, # ... } ``` ### In Model Calling Within an agent, you can call the model with the `stream` parameter set to `True`. Note the `stream` parameter in the model calling will override the `stream` field in the model configuration. ```python class MyAgent(AgentBase): # ... def reply(self, x: Optional[Msg, Sequence[Msg]] = None) -> Msg: # ... response = self.model( prompt, stream=True, ) # ... ``` ## Printing in Streaming Mode In streaming mode, the `stream` field of a model response will be a generator, and the `text` field will be `None`. For compatibility with the non-streaming mode, once the `text` field is accessed, the generator in `stream` field will be iterated to generate the full text and store it in the `text` field. So that even in streaming mode, users can handle the response text in `text` field as usual. However, if you want to print in streaming mode, just put the generator in `self.speak` to print the streaming text in the terminal and AgentScope Studio. After printing the streaming response, the full text of the response will be available in the `response.text` field. ```python def reply(self, x: Optional[Msg, Sequence[Msg]] = None) -> Msg: # ... # Use stream=True if you want to set up streaming mode in model calling response = self.model(prompt) # For now, the response.text is None # Print the response in streaming mode in terminal and AgentScope Studio (if available) self.speak(response.stream) # After printing, the response.text will be the full text of the response, and you can handle it as usual msg = Msg(self.name, content=response.text, role="assistant") self.memory.add(msg) return msg ``` ## Advanced Usage For users who want to handle the streaming response by themselves, they can iterate the generator and handle the response text in their own way. An example of how to handle the streaming response is in the `speak` function of `AgentBase` as follows. 
The `log_stream_msg` function will print the streaming response in the terminal and AgentScope Studio (if registered). ```python # ... elif isinstance(content, GeneratorType): # The streaming message must share the same id for displaying in # the agentscope studio. msg = Msg(name=self.name, content="", role="assistant") for last, text_chunk in content: msg.content = text_chunk log_stream_msg(msg, last=last) else: # ... ``` However, they should remember the following points: 1. When iterating the generator, the `response.text` field will include the text that has been iterated automatically. 2. The generator in the `stream` field will generate a tuple of boolean and text. The boolean indicates whether the text is the end of the response, and the text is the response text until now. 3. To print streaming text in AgentScope Studio, the message id should be the same for one response in the `log_stream_msg` function. ```python def reply(self, x: Optional[Msg, Sequence[Msg]] = None) -> Msg: # ... response = self.model(prompt) # For now, the response.text is None # Iterate the generator and handle the response text by yourself for last_chunk, text in response.stream: # Handle the text in your way # ... ``` [[Return to the top]](#203-stream-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/204-service.md: ```md (204-service-en)= # Tool Service function is a set of multi-functional utility tools that can be used to enhance the capabilities of agents, such as executing Python code, web search, file operations, and more. This tutorial provides an overview of the service functions available in AgentScope and how to use them to enhance the capabilities of your agents. ## Built-in Service Functions The following table outlines the various Service functions by type. These functions can be called using `agentscope.service.{function_name}`. | Service Scene | Service Function Name | Description | |-----------------------------|----------------------------|----------------------------------------------------------------------------------------------------------------| | Code | `execute_python_code` | Execute a piece of Python code, optionally inside a Docker container. | | | `NoteBookExecutor.run_code_on_notebook` | Compute Execute a segment of Python code in the IPython environment of the NoteBookExecutor, adhering to the IPython interactive computing style. | | Retrieval | `retrieve_from_list` | Retrieve a specific item from a list based on given criteria. | | | `cos_sim` | Compute the cosine similarity between two different embeddings. | | SQL Query | `query_mysql` | Execute SQL queries on a MySQL database and return results. | | | `query_sqlite` | Execute SQL queries on a SQLite database and return results. | | | `query_mongodb` | Perform queries or operations on a MongoDB collection. | | Text Processing | `summarization` | Summarize a piece of text using a large language model to highlight its main points. | | Web | `bing_search` | Perform bing search | | | `google_search` | Perform google search | | | `arxiv_search` | Perform arXiv search | | | `download_from_url` | Download file from given URL. | | | `load_web` | Load and parse the web page of the specified url (currently only supports HTML). | | | `digest_webpage` | Digest the content of a already loaded web page (currently only supports HTML). 
| | `dblp_search_publications` | Search publications in the DBLP database | | `dblp_search_authors` | Search for author information in the DBLP database | | | `dblp_search_venues` | Search for venue information in the DBLP database | | File | `create_file` | Create a new file at a specified path, optionally with initial content. | | | `delete_file` | Delete a file specified by a file path. | | | `move_file` | Move or rename a file from one path to another. | | | `create_directory` | Create a new directory at a specified path. | | | `delete_directory` | Delete a directory and all its contents. | | | `move_directory` | Move or rename a directory from one path to another. | | | `read_text_file` | Read and return the content of a text file. | | | `write_text_file` | Write text content to a file at a specified path. | | | `read_json_file` | Read and parse the content of a JSON file. | | | `write_json_file` | Serialize a Python object to JSON and write to a file. | | Multi Modality | `dashscope_text_to_image` | Convert text to image using Dashscope API. | | | `dashscope_image_to_text` | Convert image to text using Dashscope API. | | | `dashscope_text_to_audio` | Convert text to audio using Dashscope API. | | | `openai_text_to_image` | Convert text to image using OpenAI API | | `openai_edit_image` | Edit an image based on the provided mask and prompt using OpenAI API | | `openai_create_image_variation` | Create variations of an image using OpenAI API | | `openai_image_to_text` | Convert image to text using OpenAI API | | `openai_text_to_audio` | Convert text to audio using OpenAI API | | `openai_audio_to_text` | Convert audio to text using OpenAI API | *More services coming soon* | | More service functions are in development and will be added to AgentScope to further enhance its capabilities. | About each service function, you can find detailed information in the [API document](https://modelscope.github.io/agentscope/). ## How to use Service Functions AgentScope provides two classes for service functions, `ServiceToolkit` and `ServiceResponse`. ### About Service Toolkit The use of tools for LLM usually involves five steps: 1. **Prepare tool functions**. That is, developers should pre-process the functions by providing necessary parameters, e.g. api key, username, password, etc. 2. **Prepare instruction for LLM**. A detailed description for these tool functions is required for the LLM to understand them properly. 3. **Guide LLM how to use tool functions**. A format description for calling functions is required. 4. **Parse LLM response**. Once the LLM generates a response, we need to parse it according to the above format in the third step. 5. **Call functions and handle exceptions**. Call the functions, return the results, and handle exceptions. To simplify the above steps and improve reusability, AgentScope introduces `ServiceToolkit`. It can - register python functions - generate tool function descriptions in both string and JSON schema format - generate usage instruction for LLM - parse the model response, call the tool functions, and handle exceptions #### How to use Follow the steps below to use `ServiceToolkit`: 1. Init a `ServiceToolkit` object and register service functions with necessary parameters. Take the following Bing search function as an example. ```python def bing_search( question: str, api_key: str, num_results: int = 10, **kwargs: Any, ) -> ServiceResponse: """ Search question in Bing Search API and return the searching results Args: question (`str`): The search query string.
api_key (`str`): The API key provided for authenticating with the Bing Search API. num_results (`int`, defaults to `10`): The number of search results to return. **kwargs (`Any`): Additional keyword arguments to be included in the search query. For more details, please refer to https://learn.microsoft.com/en-us/bing/search-apis/bing-web-search/reference/query-parameters [omitted for brevity] """ ``` We register the function in a `ServiceToolkit` object by providing `api_key` and `num_results` as necessary parameters. ```python from agentscope.service import ServiceToolkit service_toolkit = ServiceToolkit() service_toolkit.add( bing_search, api_key="xxx", num_results=3 ) ``` 2. Use the `tools_instruction` attribute to instruct LLM in prompt, or use the `json_schemas` attribute to get the JSON schema format descriptions to construct customized instruction or directly use in model APIs (e.g. OpenAI Chat API). ````text >> print(service_toolkit.tools_instruction) ## Tool Functions: The following tool functions are available in the format of ``` {index}. {function name}: {function description} {argument1 name} ({argument type}): {argument description} {argument2 name} ({argument type}): {argument description} ... ``` 1. bing_search: Search question in Bing Search API and return the searching results question (str): The search query string. ```` ````text >> print(service_toolkit.json_schemas) { "bing_search": { "type": "function", "function": { "name": "bing_search", "description": "Search question in Bing Search API and return the searching results", "parameters": { "type": "object", "properties": { "question": { "type": "string", "description": "The search query string." } }, "required": [ "question" ] } } } } ```` 3. Guide LLM how to use tool functions by the `tools_calling_format` attribute. The ServiceToolkit module requires LLM to return a list of dictionaries in JSON format, where each dictionary represents a function call. It must contain two fields, `name` and `arguments`, where `name` is the function name and `arguments` is a dictionary that maps from the argument name to the argument value. ```text >> print(service_toolkit.tools_calling_format) [{"name": "{function name}", "arguments": {"{argument1 name}": xxx, "{argument2 name}": xxx}}] ``` 4. Parse the LLM response and call functions by its `parse_and_call_func` method. This function takes a string or a parsed dictionary as input. - When the input is a string, this function will parse it accordingly and execute the function with the parsed arguments. - While if the input is a parse dictionary, it will call the function directly. ```python # a string input string_input = '[{"name": "bing_search", "arguments": {"question": "xxx"}}]' res_of_string_input = service_toolkit.parse_and_call_func(string_input) # or a parsed dictionary dict_input = [{"name": "bing_search", "arguments": {"question": "xxx"}}] # res_of_dict_input is the same as res_of_string_input res_of_dict_input = service_toolkit.parse_and_call_func(dict_input) print(res_of_string_input) ``` ``` 1. Execute function bing_search [ARGUMENTS]: question: xxx [STATUS]: SUCCESS [RESULT]: ... ``` More specific examples refer to the `ReActAgent` class in `agentscope.agents`. #### Create new Service Function A new service function that can be used by `ServiceToolkit` should meet the following requirements: 1. Well-formatted docstring (Google style is recommended), so that the `ServiceToolkit` can extract both the function descriptions. 2. 
The name of the service function should be self-explanatory, so that the LLM can understand the function and use it properly. 3. The typing of the arguments should be provided when defining the function (e.g. `def func(a: int, b: str, c: bool)`), so that the agent can specify the arguments properly. ### About ServiceResponse `ServiceResponse` is a wrapper for the execution results of the services, containing two fields, `status` and `content`. When the Service function runs to completion normally, `status` is `ServiceExecStatus.SUCCESS`, and `content` is the return value of the function. When an error occurs during execution, `status` is `ServiceExecStatus.Error`, and `content` contains the error message. ```python class ServiceResponse(dict): """Used to wrap the execution results of the services""" __setattr__ = dict.__setitem__ __getattr__ = dict.__getitem__ def __init__( self, status: ServiceExecStatus, content: Any, ): """Constructor of ServiceResponse Args: status (`ServiceExeStatus`): The execution status of the service. content (`Any`) If the argument`status` is `SUCCESS`, `content` is the response. We use `object` here to support various objects, e.g. str, dict, image, video, etc. Otherwise, `content` is the error message. """ self.status = status self.content = content # [omitted for brevity] ``` ## Example ```python import json import inspect from agentscope.service import ServiceResponse from agentscope.agents import AgentBase from agentscope.message import Msg from typing import Optional, Union, Sequence def create_file(file_path: str, content: str = "") -> ServiceResponse: """ Create a file and write content to it. Args: file_path (str): The path to the file to be created. content (str): The content to be written to the file. Returns: ServiceResponse: A boolean indicating success or failure, and a string containing any error message (if any), including the error type. """ # ... [omitted for brevity] class YourAgent(AgentBase): # ... [omitted for brevity] def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # ... [omitted for brevity] # construct a prompt to ask the agent to provide the parameters in JSON format prompt = ( f"To complete the user request\n```{x['content']}```\n" "Please provide the necessary parameters in JSON format for the " "function:\n" f"Function: {create_file.__name__}\n" "Description: Create a file and write content to it.\n" ) # add detailed information about the function parameters sig = inspect.signature(create_file) parameters = sig.parameters.items() params_prompt = "\n".join( f"- {name} ({param.annotation.__name__}): " f"{'(default: ' + json.dumps(param.default) + ')'if param.default is not inspect.Parameter.empty else ''}" for name, param in parameters ) prompt += params_prompt # get the model response model_response = self.model(prompt).text # parse the model response and call the create_file function try: kwargs = json.loads(model_response) create_file(**kwargs) except: # Error handling pass # ... [omitted for brevity] ``` [[Return to Top]](#204-service-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/205-memory.md: ```md (205-memory-en)= # Memory In AgentScope, memory is used to store historical information, allowing the agent to provide more coherent and natural responses based on context. This tutorial will first introduce the carrier of information in memory, message, and then introduce the functions and usage of the memory module in AgentScope. 
## About Message ### `MessageBase` Class In AgentScope, the message base class is a subclass of Python dictionary, consisting of two required fields (`name` and `content`) and an optional field (`url`). Specifically, the `name` field represents the originator of the message, the `content` field represents the content of the message, and the `url` field represents the data link attached to the message, which can be a local link to multi-modal data or a web link. As a dictionary type, developers can also add other fields as needed. When a message is created, a unique ID is automatically generated to identify the message. The creation time of the message is also automatically recorded in the form of a timestamp. In the specific implementation, AgentScope first provides a `MessageBase` base class to define the basic properties and usage of messages. Unlike general dictionary types, the instantiated objects of `MessageBase` can access attribute values through `object_name.{attribute_name}` or `object_name['attribute_name']`. The key attributes of the `MessageBase` class are as follows: - **`name`**: This attribute denotes the originator of the message. It's a critical piece of metadata, useful in scenarios where distinguishing between different speakers is necessary. - **`content`**: The substance of the message itself. It can include text, structured data, or any other form of content that is relevant to the interaction and requires processing by the agent. - **`url`**: An optional attribute that allows the message to be linked to external resources. These can be direct links to files, multi-modal data, or web pages. - **`timestamp`**: A timestamp indicating when the message was created. - **`id`**: Each message is assigned a unique identifier (ID) upon creation. ```python class MessageBase(dict): """Base Message class, which is used to maintain information for dialog, memory and used to construct prompt. """ def __init__( self, name: str, content: Any, url: Optional[Union[Sequence[str], str]] = None, timestamp: Optional[str] = None, **kwargs: Any, ) -> None: """Initialize the message object Args: name (`str`): The name of who send the message. It's often used in role-playing scenario to tell the name of the sender. However, you can also only use `role` when calling openai api. The usage of `name` refers to https://cookbook.openai.com/examples/how_to_format_inputs_to_chatgpt_models. content (`Any`): The content of the message. url (`Optional[Union[list[str], str]]`, defaults to None): A url to file, image, video, audio or website. timestamp (`Optional[str]`, defaults to None): The timestamp of the message, if None, it will be set to current time. **kwargs (`Any`): Other attributes of the message. For OpenAI API, you should add "role" from `["system", "user", "assistant", "function"]`. When calling OpenAI API, `"role": "assistant"` will be added to the messages that don't have "role" attribute. 
""" # id and timestamp will be added to the object as its attributes # rather than items in dict self.id = uuid4().hex if timestamp is None: self.timestamp = _get_timestamp() else: self.timestamp = timestamp self.name = name self.content = content if url: self.url = url self.update(kwargs) def __getattr__(self, key: Any) -> Any: try: return self[key] except KeyError as e: raise AttributeError(f"no attribute '{key}'") from e def __setattr__(self, key: Any, value: Any) -> None: self[key] = value def __delattr__(self, key: Any) -> None: try: del self[key] except KeyError as e: raise AttributeError(f"no attribute '{key}'") from e def to_str(self) -> str: """Return the string representation of the message""" raise NotImplementedError def serialize(self) -> str: """Return the serialized message.""" raise NotImplementedError # ... [省略代码以简化] ``` ### `Msg` Class `Msg` class extends `MessageBase` and represents a standard *message*. `Msg` provides concrete definitions for the `to_str` and `serialize` methods to enable string representation and serialization suitable for the agent's operational context. Within an `Agent` class, its `reply` function typically returns an instance of `Msg` to facilitate message passing within AgentScope. ```python class Msg(MessageBase): """The Message class.""" def __init__( self, name: str, content: Any, url: Optional[Union[Sequence[str], str]] = None, timestamp: Optional[str] = None, echo: bool = False, **kwargs: Any, ) -> None: super().__init__( name=name, content=content, url=url, timestamp=timestamp, **kwargs, ) if echo: logger.chat(self) def to_str(self) -> str: """Return the string representation of the message""" return f"{self.name}: {self.content}" def serialize(self) -> str: return json.dumps({"__type": "Msg", **self}) ``` ## About Memory ### `MemoryBase` Class `MemoryBase` is an abstract class that handles an agent's memory in a structured way. It defines operations for storing, retrieving, deleting, and manipulating *message*'s content. ```python class MemoryBase(ABC): # ... [code omitted for brevity] def get_memory( self, return_type: PromptType = PromptType.LIST, recent_n: Optional[int] = None, filter_func: Optional[Callable[[int, dict], bool]] = None, ) -> Union[list, str]: raise NotImplementedError def add(self, memories: Union[list[dict], dict]) -> None: raise NotImplementedError def delete(self, index: Union[Iterable, int]) -> None: raise NotImplementedError def load( self, memories: Union[str, dict, list], overwrite: bool = False, ) -> None: raise NotImplementedError def export( self, to_mem: bool = False, file_path: Optional[str] = None, ) -> Optional[list]: raise NotImplementedError def clear(self) -> None: raise NotImplementedError def size(self) -> int: raise NotImplementedError ``` Here are the key methods of `MemoryBase`: - **`get_memory`**: This method is responsible for retrieving stored messages from the agent's memory. It can return these messages in different formats as specified by the `return_type`. The method can also retrieve a specific number of recent messages if `recent_n` is provided, and it can apply a filtering function (`filter_func`) to select messages based on custom criteria. - **`add`**: This method is used to add a new *message* to the agent's memory. It can accept a single message or a list of messages. Each message is typically an instance of `MessageBase` or its subclasses. - **`delete`**: This method enables the removal of messages from memory by their index (or indices if an iterable is provided). 
- **`load`**: This method allows for the bulk loading of messages into the agent's memory from an external source. The `overwrite` parameter determines whether to clear the existing memory before loading the new set of messages. - **`export`**: This method facilitates exporting the stored *message* from the agent's memory either to an external file (specified by `file_path`) or directly into the working memory of the program (if `to_mem` is set to `True`). - **`clear`**: This method purges all *message* from the agent's memory, essentially resetting it. - **`size`**: This method returns the number of messages currently stored in the agent's memory. ### `TemporaryMemory` The `TemporaryMemory` class is a concrete implementation of `MemoryBase`, providing a memory store that exists during the runtime of an agent, which is used as the default memory type of agents. Besides all the behaviors from `MemoryBase`, the `TemporaryMemory` additionally provides methods for retrieval: - **`retrieve_by_embedding`**: Retrieves `messages` that are most similar to a query, based on their embeddings. It uses a provided metric to determine the relevance and can return the top `k` most relevant messages. - **`get_embeddings`**: Return the embeddings for all messages in memory. If a message does not have an embedding and an embedding model is provided, it will generate and store the embedding for the message. For more details about the usage of `Memory` and `Msg`, please refer to the API references. [[Return to the top]](#205-memory-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/206-prompt.md: ```md (206-prompt-en)= # Prompt Engineering Prompt engineering is critical in LLM-empowered applications. However, crafting prompts for large language models (LLMs) can be challenging, especially with different requirements from various model APIs. To ease the process of adapting prompt to different model APIs, AgentScope provides a structured way to organize different data types (e.g. instruction, hints, dialogue history) into the desired format. Note there is no **one-size-fits-all** solution for prompt crafting. **The goal of built-in strategies is to enable beginners to smoothly invoke the model API, rather than achieve the best performance**. For advanced users, we highly recommend developers to customize prompts according to their needs and model API requirements. ## Challenges in Prompt Construction In multi-agent applications, LLM often plays different roles in a conversation. When using third-party chat APIs, it has the following challenges: 1. Most third-party chat APIs are designed for chatbot scenario, and the `role` field only supports `"user"` and `"assistant"`. 2. Some model APIs require `"user"` and `"assistant"` must speak alternatively, and `"user"` must speak in the beginning and end of the input messages list. Such requirements make it difficult to build a multi-agent conversation when the agent may act as many different roles and speak continuously. To help beginners to quickly start with AgentScope, we provide the following built-in strategies for most chat and generation related model APIs. ## Built-in Prompt Strategies In AgentScope, we provide built-in strategies for the following chat and generation model APIs. 
- [OpenAIChatWrapper](#openaichatwrapper) - [DashScopeChatWrapper](#dashscopechatwrapper) - [DashScopeMultiModalWrapper](#dashscopemultimodalwrapper) - [OllamaChatWrapper](#ollamachatwrapper) - [OllamaGenerationWrapper](#ollamagenerationwrapper) - [GeminiChatWrapper](#geminichatwrapper) - [ZhipuAIChatWrapper](#zhipuaichatwrapper) These strategies are implemented in the `format` functions of the model wrapper classes. It accepts `Msg` objects, a list of `Msg` objects, or their mixture as input. However, `format` function will first reorganize them into a list of `Msg` objects, so for simplicity in the following sections we treat the input as a list of `Msg` objects. ### OpenAIChatWrapper `OpenAIChatWrapper` encapsulates the OpenAI chat API, it takes a list of dictionaries as input, where the dictionary must obey the following rules (updated in 2024/03/22): - Require `role` and `content` fields, and an optional `name` field. - The `role` field must be either `"system"`, `"user"`, or `"assistant"`. #### Prompt Strategy ##### Non-Vision Models In OpenAI Chat API, the `name` field enables the model to distinguish different speakers in the conversation. Therefore, the strategy of `format` function in `OpenAIChatWrapper` is simple: - `Msg`: Pass a dictionary with `role`, `content`, and `name` fields directly. - `List`: Parse each element in the list according to the above rules. An example is shown below: ```python from agentscope.models import OpenAIChatWrapper from agentscope.message import Msg model = OpenAIChatWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="gpt-4", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg object [ # a list of Msg objects Msg(name="Bob", content="Hi.", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash [ {"role": "system", "name": "system", "content": "You are a helpful assistant"}, {"role": "assistant", "name": "Bob", "content": "Hi."}, {"role": "assistant", "name": "Alice", "content": "Nice to meet you!"), ] ``` ##### Vision Models For vision models (gpt-4-turbo, gpt-4o, ...), if the input message contains image urls, the generated `content` field will be a list of dicts, which contains text and image urls. Specifically, the web image urls will be pass to OpenAI Chat API directly, while the local image urls will be converted to base64 format. More details please refer to the [official guidance](https://platform.openai.com/docs/guides/vision). Note the invalid image urls (e.g. `/Users/xxx/test.mp3`) will be ignored. 
```python from agentscope.models import OpenAIChatWrapper from agentscope.message import Msg model = OpenAIChatWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="gpt-4o", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg object [ # a list of Msg objects Msg(name="user", content="Describe this image", role="user", url="https://xxx.png"), Msg(name="user", content="And these images", role="user", url=["/Users/xxx/test.png", "/Users/xxx/test.mp3"]), ], ) print(prompt) ``` ```python [ { "role": "system", "name": "system", "content": "You are a helpful assistant" }, { "role": "user", "name": "user", "content": [ { "type": "text", "text": "Describe this image" }, { "type": "image_url", "image_url": { "url": "https://xxx.png" } }, ] }, { "role": "user", "name": "user", "content": [ { "type": "text", "text": "And these images" }, { "type": "image_url", "image_url": { "url": "data:image/png;base64,YWJjZGVm..." # for /Users/xxx/test.png } }, ] }, ] ``` ### DashScopeChatWrapper `DashScopeChatWrapper` encapsulates the DashScope chat API, which takes a list of messages as input. The message must obey the following rules (updated in 2024/03/22): - Require `role` and `content` fields, and `role` must be either `"user"` `"system"` or `"assistant"`. - If `role` is `"system"`, this message must and can only be the first message in the list. - The `user` and `assistant` must speak alternatively. - The `user` must speak in the beginning and end of the input messages list. #### Prompt Strategy If the role field of the first message is `"system"`, it will be converted into a single message with the `role` field as `"system"` and the `content` field as the system message. The rest of the messages will be converted into a message with the `role` field as `"user"` and the `content` field as the dialogue history. An example is shown below: ```python from agentscope.models import DashScopeChatWrapper from agentscope.message import Msg model = DashScopeChatWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="qwen-max", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg object [ # a list of Msg objects Msg(name="Bob", content="Hi!", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash [ {"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!"}, ] ``` ### DashScopeMultiModalWrapper `DashScopeMultiModalWrapper` encapsulates the DashScope multimodal conversation API, which takes a list of messages as input. The message must obey the following rules (updated in 2024/04/04): - Each message is a dictionary with `role` and `content` fields. - The `role` field must be either `"user"`, `"system"`, or `"assistant"`. - The `content` field must be a list of dictionaries, where - Each dictionary only contains one key-value pair, whose key must be `text`, `image` or `audio`. - `text` field is a string, representing the text content. - `image` field is a string, representing the image url. - `audio` field is a string, representing the audio url. - The `content` field can contain multiple dictionaries with the key `image` or multiple dictionaries with the key `audio` at the same time. 
For example: ```python [ { "role": "user", "content": [ {"text": "What's the difference between these two pictures?"}, {"image": "https://xxx1.png"}, {"image": "https://xxx2.png"} ] }, { "role": "assistant", "content": [{"text": "The first picture is a cat, and the second picture is a dog."}] }, { "role": "user", "content": [{"text": "I see, thanks!"}] } ] ``` - The message with the `role` field as `"system"` must and can only be the first message in the list. - The last message must have the `role` field as `"user"`. - The `user` and `assistant` messages must alternate. #### Prompt Strategy Based on the above rules, the `format` function in `DashScopeMultiModalWrapper` will parse the input messages as follows: - If the first message in the input message list has a `role` field with the value `"system"`, it will be converted into a system message with the `role` field as `"system"` and the `content` field as the system message. If the `url` field in the input `Msg` object is not `None`, a dictionary with the key `"image"` or `"audio"` will be added to the `content` based on its type. - The rest of the messages will be converted into a message with the `role` field as `"user"` and the `content` field as the dialogue history. For each message, if their `url` field is not `None`, it will add a dictionary with the key `"image"` or `"audio"` to the `content` based on the file type that the `url` points to. An example: ```python from agentscope.models import DashScopeMultiModalWrapper from agentscope.message import Msg model = DashScopeMultiModalWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="qwen-vl-plus", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system", url="url_to_png1"), # Msg object [ # a list of Msg objects Msg(name="Bob", content="Hi!", role="assistant", url="url_to_png2"), Msg(name="Alice", content="Nice to meet you!", role="assistant", url="url_to_png3"), ], ) print(prompt) ``` ```bash [ { "role": "system", "content": [ {"text": "You are a helpful assistant"}, {"image": "url_to_png1"} ] }, { "role": "user", "content": [ {"text": "## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!"}, {"image": "url_to_png2"}, {"image": "url_to_png3"}, ] } ] ``` ### LiteLLMChatWrapper `LiteLLMChatWrapper` encapsulates the litellm chat API, which takes a list of messages as input. The litellm supports different types of models, and each model might need to obey different formats. To simplify the usage, we provide a format that could be compatible with most models. If more specific formats are needed, you can refer to the specific model you use as well as the [litellm](https://github.com/BerriAI/litellm) documentation to customize your own format function for your model. - format all the messages in the chat history, into a single message with `"user"` as `role` #### Prompt Strategy - Messages will consist dialogue history in the `user` message prefixed by the system message and "## Dialogue History". 
```python from agentscope.models import LiteLLMChatWrapper model = LiteLLMChatWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="gpt-3.5-turbo", ) prompt = model.format( Msg("system", "You are a helpful assistant", role="system"), [ Msg("user", "What is the weather today?", role="user"), Msg("assistant", "It is sunny today", role="assistant"), ], ) print(prompt) ``` ```bash [ { "role": "user", "content": ( "You are a helpful assistant\n\n" "## Dialogue History\nuser: What is the weather today?\n" "assistant: It is sunny today" ), }, ] ``` ### OllamaChatWrapper `OllamaChatWrapper` encapsulates the Ollama chat API, which takes a list of messages as input. The message must obey the following rules (updated in 2024/03/22): - Require `role` and `content` fields, and `role` must be either `"user"`, `"system"`, or `"assistant"`. - An optional `images` field can be added to the message #### Prompt Strategy - If the role field of the first input message is `"system"`, it will be treated as system prompt and the other messages will consist dialogue history in the system message prefixed by "## Dialogue History". - If the `url` attribute of messages is not `None`, we will gather all urls in the `"images"` field in the returned dictionary. ```python from agentscope.models import OllamaChatWrapper model = OllamaChatWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="llama2", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg object [ # a list of Msg objects Msg(name="Bob", content="Hi.", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant", url="https://example.com/image.jpg"), ], ) print(prompt) ``` ```bash [ { "role": "system", "content": "You are a helpful assistant\n\n## Dialogue History\nBob: Hi.\nAlice: Nice to meet you!", "images": ["https://example.com/image.jpg"] }, ] ``` ### OllamaGenerationWrapper `OllamaGenerationWrapper` encapsulates the Ollama generation API, which takes a string prompt as input without any constraints (updated to 2024/03/22). #### Prompt Strategy If the role field of the first message is `"system"`, a system prompt will be created. The rest of the messages will be combined into dialogue history in string format. ```python from agentscope.models import OllamaGenerationWrapper from agentscope.message import Msg model = OllamaGenerationWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="llama2", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg object [ # a list of Msg objects Msg(name="Bob", content="Hi.", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash You are a helpful assistant ## Dialogue History Bob: Hi. Alice: Nice to meet you! ``` ### `GeminiChatWrapper` `GeminiChatWrapper` encapsulates the Gemini chat API, which takes a list of messages or a string prompt as input. Similar to DashScope Chat API, if we pass a list of messages, it must obey the following rules: - Require `role` and `parts` fields. `role` must be either `"user"` or `"model"`, and `parts` must be a list of strings. - The `user` and `model` must speak alternatively. - The `user` must speak in the beginning and end of the input messages list. Such requirements make it difficult to build a multi-agent conversation when an agent may act as many different roles and speak continuously. 
Therefore, we decide to convert the list of messages into a user message in our built-in `format` function. #### Prompt Strategy If the role field of the first message is `"system"`, a system prompt will be added in the beginning. The other messages will be combined into dialogue history. **Note** sometimes the `parts` field may contain image urls, which is not supported in `format` function. We recommend developers to customize the prompt according to their needs. ```python from agentscope.models import GeminiChatWrapper from agentscope.message import Msg model = GeminiChatWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="gemini-pro", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg object [ # a list of Msg objects Msg(name="Bob", content="Hi!", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash [ { "role": "user", "parts": [ "You are a helpful assistant\n## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!" ] } ] ``` ### `ZhipuAIChatWrapper` `ZhipuAIChatWrapper` encapsulates the ZhipuAI chat API, which takes a list of messages as input. The message must obey the following rules: - Require `role` and `content` fields, and `role` must be either `"user"` `"system"` or `"assistant"`. - There must be at least one `user` message. #### Prompt Strategy If the role field of the first message is `"system"`, it will be converted into a single message with the `role` field as `"system"` and the `content` field as the system message. The rest of the messages will be converted into a message with the `role` field as `"user"` and the `content` field as the dialogue history. An example is shown below: ```python from agentscope.models import ZhipuAIChatWrapper from agentscope.message import Msg model = ZhipuAIChatWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="glm-4", api_key="your api key", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg object [ # a list of Msg objects Msg(name="Bob", content="Hi!", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash [ {"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!"}, ] ``` ## Prompt Engine (Will be deprecated in the future) AgentScope provides the `PromptEngine` class to simplify the process of crafting prompts for large language models (LLMs). ## About `PromptEngine` Class The `PromptEngine` class provides a structured way to combine different components of a prompt, such as instructions, hints, dialogue history, and user inputs, into a format that is suitable for the underlying language model. ### Key Features of PromptEngine - **Model Compatibility**: It works with any `ModelWrapperBase` subclass. - **Prompt Type**: It supports both string and list-style prompts, aligning with the model's preferred input format. ### Initialization When creating an instance of `PromptEngine`, you can specify the target model and, optionally, the shrinking policy, the maximum length of the prompt, the prompt type, and a summarization model (could be the same as the target model). ```python model = OpenAIChatWrapper(...) 
engine = PromptEngine(model) ``` ### Joining Prompt Components The `join` method of `PromptEngine` provides a unified interface to handle an arbitrary number of components for constructing the final prompt. #### Output String Type Prompt If the model expects a string-type prompt, components are joined with a newline character: ```python system_prompt = "You're a helpful assistant." memory = ... # can be dict, list, or string hint_prompt = "Please respond in JSON format." prompt = engine.join(system_prompt, memory, hint_prompt) # the result will be [ "You're a helpful assistant.", {"name": "user", "content": "What's the weather like today?"}] ``` #### Output List Type Prompt For models that work with list-type prompts,e.g., OpenAI and Huggingface chat models, the components can be converted to Message objects, whose type is list of dict: ```python system_prompt = "You're a helpful assistant." user_messages = [{"name": "user", "content": "What's the weather like today?"}] prompt = engine.join(system_prompt, user_messages) # the result should be: [{"role": "assistant", "content": "You're a helpful assistant."}, {"name": "user", "content": "What's the weather like today?"}] ``` #### Formatting Prompts in Dynamic Way The `PromptEngine` supports dynamic prompts using the `format_map` parameter, allowing you to flexibly inject various variables into the prompt components for different scenarios: ```python variables = {"location": "London"} hint_prompt = "Find the weather in {location}." prompt = engine.join(system_prompt, user_input, hint_prompt, format_map=variables) ``` [[Return to the top]](#206-prompt-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/207-monitor.md: ```md (207-monitor-en)= # Monitor In multi-agent applications, particularly those that rely on external model APIs, it's crucial to monitor the usage and cost to prevent overutilization and ensure compliance with rate limits. The `MonitorBase` class and its implementation, `SqliteMonitor`, provide a way to track and regulate the usage of such APIs in your applications. In this tutorial, you'll learn how to use them to monitor API calls. ## Understanding the Monitor in AgentScope The `MonitorBase` class serves as an interface for setting up a monitoring system that tracks various metrics, especially focusing on API usage. It defines methods that enable registration, checking, updating, and management of metrics related to API calls. Here are the key methods of `MonitorBase`: - **`register`**: Initializes a metric for tracking, such as the number of API calls made, with an optional quota to enforce limits. - **`exists`**: Checks whether a metric is already being tracked. - **`add`**: Increments the metric by a specified value, used to count each API call made. - **`update`**: Updates multiple metrics at once, useful for batching updates. - **`clear`**: Resets a metric to zero, which can be useful when the quota period resets. - **`remove`**: Removes a metric from monitoring. - **`get_value`**: Retrieves the current count for a particular metric. - **`get_unit`**: Fetches the unit associated with the metric (e.g., "calls"). - **`get_quota`**: Obtains the maximum number of allowed API calls. - **`set_quota`**: Adjusts the quota for a metric, if the terms of API usage change. - **`get_metric`**: Returns detailed information about a specific metric. - **`get_metrics`**: Retrieves information about all tracked metrics, with optional filtering based on metric names. 
- **`register_budget`**: Sets a budget for a certain API call, which will initialize a series of metrics used to calculate the cost. ## Using the Monitor ### Get a Monitor Instance Get a monitor instance from `MonitorFactory` to begin monitoring, and note that multiple calls to the `get_monitor` method return the same monitor instance. ```python # make sure you have called agentscope.init(...) before monitor = MonitorFactory.get_monitor() ``` Currently the above code returns a `SqliteMonitor` instance, which is initialized in `agentscope.init`. The `SqliteMonitor` class is the default implementation of `MonitorBase` class, which is based on Sqlite3. If you don't want to use monitor, you can set `use_monitor=False` in `agentscope.init` to disable the monitor. And in this case, the `MonitorFactory.get_monitor` method will return an instance of `DummyMonitor` which has the same interface as the `SqliteMonitor` class, but does nothing inside. ### Basic Usage #### Registering API Usage Metrics Register a new metric to start monitoring the number of tokens: ```python monitor.register("token_num", metric_unit="token", quota=1000) ``` #### Updating Metrics Increment the `token_num` metric: ```python monitor.add("token_num", 20) ``` #### Handling Quotas If the number of API calls exceeds the quota, a `QuotaExceededError` will be thrown: ```python try: monitor.add("api_calls", amount) except QuotaExceededError as e: # Handle the exceeded quota, e.g., by pausing API calls print(e.message) ``` #### Retrieving Metrics Get the current number of tokens used: ```python token_num_used = monitor.get_value("token_num") ``` #### Resetting and Removing Metrics Reset the number of token count at the start of a new period: ```python monitor.clear("token_num") ``` Remove the metric if it's no longer needed: ```python monitor.remove("token_num") ``` ### Advanced Usage > Features here are under development, the interface may continue to change. #### Using `prefix` to Distinguish Metrics Assume you have multiple agents/models that use the same API call, but you want to calculate their token usage separately, you can add a unique `prefix` before the original metric name, and `get_full_name` provides such functionality. For example, if model_A and model_B both use the OpenAI API, you can register these metrics by the following code. ```python from agentscope.utils.monitor import get_full_name ... # in model_A monitor.register(get_full_name('prompt_tokens', 'model_A')) monitor.register(get_full_name('completion_tokens', 'model_A')) # in model_B monitor.register(get_full_name('prompt_tokens', 'model_B')) monitor.register(get_full_name('completion_tokens', 'model_B')) ``` To update those metrics, just use the `update` method. ```python # in model_A monitor.update(openai_response.usage.model_dump(), prefix='model_A') # in model_B monitor.update(openai_response.usage.model_dump(), prefix='model_B') ``` To get metrics of a specific model, please use the `get_metrics` method. ```python # get metrics of model_A model_A_metrics = monitor.get_metrics('model_A') # get metrics of model_B model_B_metrics = monitor.get_metrics('model_B') ``` #### Register a budget for an API Currently, the Monitor already supports automatically calculating the cost of API calls based on various metrics, and you can directly set a budget of a model to avoid exceeding the quota. Suppose you are using `gpt-4-turbo` and your budget is $10, you can use the following code. 
```python model_name = 'gpt-4-turbo' monitor.register_budget(model_name=model_name, value=10, prefix=model_name) ``` Use `prefix` to set budgets for different models that use the same API. ```python model_name = 'gpt-4-turbo' # in model_A monitor.register_budget(model_name=model_name, value=10, prefix=f'model_A.{model_name}') # in model_B monitor.register_budget(model_name=model_name, value=10, prefix=f'model_B.{model_name}') ``` `register_budget` will automatically register metrics that are required to calculate the total cost, calculate the total cost when these metrics are updated, and throw a `QuotaExceededError` when the budget is exceeded. ```python model_name = 'gpt-4-turbo' try: monitor.update(openai_response.usage.model_dump(), prefix=model_name) except QuotaExceededError as e: # Handle the exceeded quota print(e.message) ``` > **Note:** This feature is still in the experimental stage and only supports some specified APIs, which are listed in `agentscope.utils.monitor._get_pricing`. [[Return to the top]](#207-monitor-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/208-distribute.md: ```md (208-distribute-en)= # Distribution AgentScope implements an Actor-based distributed deployment and parallel optimization, providing the following features: - **Automatic Parallel Optimization**: Automatically optimize the application for parallelism at runtime without additional optimization costs; - **Centralized Application Writing**: Easily orchestrate distributed application flow without distributed background knowledge; - **Zero-Cost Automatic Migration**: Centralized Multi-Agent applications can be easily converted to distributed mode This tutorial will introduce the implementation and usage of AgentScope distributed in detail. ## Usage In AgentScope, the process that runs the application flow is called the **main process**, and each agent can run in a separate process named **agent server process**. According to the different relationships between the main process and the agent server process, AgentScope supports two modes for each agent: **Child Process** and **Independent Process** mode. - In the Child Process Mode, agent server processes will be automatically started as sub-processes from the main process. - While in the Independent Process Mode, the agent server process is independent of the main process and developers need to start the agent server process on the corresponding machine. The above concepts may seem complex, but don't worry, for application developers, you only need to convert your existing agent to its distributed version. ### Step 1: Convert your agent to its distributed version All agents in AgentScope can automatically convert to its distributed version by calling its {func}`to_dist` method. But note that your agent must inherit from the {class}`agentscope.agents.AgentBase` class, because the `to_dist` method is provided by the `AgentBase` class. Suppose there are two agent classes `AgentA` and `AgentB`, both of which inherit from `AgentBase`. ```python a = AgentA( name="A" # ... ) b = AgentB( name="B" # ... ) ``` Next we will introduce the conversion details of both modes. #### Child Process Mode To use this mode, you only need to call each agent's `to_dist()` method without any input parameter. AgentScope will automatically start all agent server processes from the main process. ```python # Child Process mode a = AgentA( name="A" # ... ).to_dist() b = AgentB( name="B" # ... 
).to_dist()
```

#### Independent Process Mode

In the Independent Process Mode, we need to start the agent server process on the target machine first. When starting the agent server process, you need to specify a model config file, which contains the models that can be used in the agent server, as well as the IP address and port of the agent server process.

For example, suppose we start two agent server processes on two different machines with IPs `ip_a` and `ip_b` (called `Machine1` and `Machine2` respectively). You can run the following code on `Machine1`. Before running, make sure that the machine has access to all models used in your application; specifically, you need to put your model config file in `model_config_path_a` and set environment variables such as your model API key correctly on `Machine1`. Example model config files are located under `examples/model_configs_template`. In addition, your customized agent classes that need to run in the server must be registered in `custom_agent_classes` so that the server can correctly identify these agents. If you only use AgentScope's built-in agents, you can ignore the `custom_agent_classes` field.

```python
# import some packages

# register models which can be used in the server
agentscope.init(
    model_configs=model_config_path_a,
)
# Create an agent service process
server = RpcAgentServerLauncher(
    host="ip_a",
    port=12001,  # choose an available port
    custom_agent_classes=[AgentA, AgentB]  # register your customized agent classes
)

# Start the service
server.launch()
server.wait_until_terminate()
```

> For simplicity, you can run the following command in your terminal rather than the above code:
>
> ```shell
> as_server --host ip_a --port 12001 --model-config-path model_config_path_a
> ```

Then put your model config file accordingly in `model_config_path_b`, set environment variables, and run the following code on `Machine2`.

```python
# import some packages

# register models which can be used in the server
agentscope.init(
    model_configs=model_config_path_b,
)
# Create an agent service process
server = RpcAgentServerLauncher(
    host="ip_b",
    port=12002,  # choose an available port
    custom_agent_classes=[AgentA, AgentB]  # register your customized agent classes
)

# Start the service
server.launch()
server.wait_until_terminate()
```

> Similarly, you can run the following command in your terminal to set up the agent server:
>
> ```shell
> as_server --host ip_b --port 12002 --model-config-path model_config_path_b
> ```

Then, you can connect to the agent servers from the main process with the following code.

```python
a = AgentA(
    name="A",
    # ...
).to_dist(
    host="ip_a",
    port=12001,
)
b = AgentB(
    name="B",
    # ...
).to_dist(
    host="ip_b",
    port=12002,
)
```

The above code will deploy `AgentA` on the agent server process of `Machine1` and `AgentB` on the agent server process of `Machine2`, and developers only need to write the application flow in a centralized way in the main process.

### Step 2: Orchestrate Distributed Application Flow

In AgentScope, the orchestration of distributed application flow is exactly the same as that of non-distributed programs, and developers can write the entire application flow in a centralized way. At the same time, AgentScope allows a mixture of locally deployed and distributed agents, and developers do not need to distinguish which agents are local and which are distributed. The following is the complete code for two agents to communicate with each other in different modes.
It can be seen that AgentScope supports zero-cost migration of distributed application flow from centralized to distributed. - All agents are centralized ```python # Create agent objects a = AgentA( name="A", # ... ) b = AgentB( name="B", # ... ) # Application flow orchestration x = None while x is None or x.content == "exit": x = a(x) x = b(x) ``` - Agents are deployed in a distributed manner - `AgentA` in Child Process mode - `AgentB` in Independent Process Mode ```python # Create agent objects a = AgentA( name="A" # ... ).to_dist() b = AgentB( name="B", # ... ).to_dist( host="ip_b", port=12002, ) # Application flow orchestration x = None while x is None or x.content == "exit": x = a(x) x = b(x) ``` ### Advanced Usage #### `to_dist` with lower cost All examples described above convert initialized agents into their distributed version through the {func}`to_dist` method, which is equivalent to initialize the agent twice, once in the main process and once in the agent server process. For agents whose initialization process is time-consuming, the `to_dist` method is inefficient. Therefore, AgentScope also provides a method to convert the Agent instance into its distributed version while initializing it, that is, passing in `to_dist` parameter to the Agent's initialization function. In Child Process Mode, just pass `to_dist=True` to the Agent's initialization function. ```python # Child Process mode a = AgentA( name="A", # ... to_dist=True ) b = AgentB( name="B", # ... to_dist=True ) ``` In Independent Process Mode, you need to encapsulate the parameters of the `to_dist()` method in {class}`DistConf` instance and pass it into the `to_dist` field, for example: ```python a = AgentA( name="A", # ... to_dist=DistConf( host="ip_a", port=12001, ), ) b = AgentB( name="B", # ... to_dist=DistConf( host="ip_b", port=12002, ), ) ``` Compared with the original `to_dist()` function call, this method just initializes the agent once in the agent server process, which reduces the cost of initialization. #### Manage your agent server processes When running large-scale multi-agent applications, it's common to have multiple Agent Server processes running. To facilitate management of these processes, AgentScope offers management interfaces in the {class}`RpcAgentClient` class. Here's a brief overview of these methods: - `is_alive`: This method checks whether the Agent Server process is still running. ```python client = RpcAgentClient(host=server_host, port=server_port) if client.is_alive(): do_something() ``` - `stop`: This method stops the Agent Server process. ```python client.stop() assert(client.is_alive() == False) ``` - `get_agent_list`: This method retrieves a list of JSON format thumbnails of all agents currently running within the Agent Server process. The thumbnail is generated by the `__str__` method of the Agent instance. ```python agent_list = client.get_agent_list() print(agent_list) # [agent1_info, agent2_info, ...] ``` - `get_agent_memory`: With this method, you can fetch the memory content of an agent specified by its `agent_id`. ```python agent_id = my_agent.agent_id agent_memory = client.get_agent_memory(agent_id) print(agent_memory) # [msg1, msg2, ...] ``` - `get_server_info`:This method provides information about the resource utilization of the Agent Server process, including CPU usage, memory consumption. 
```python server_info = client.get_server_info() print(server_info) # { "cpu": xxx, "mem": xxx } ``` - `set_model_configs`: This method set the specific model configs into the agent server, the agent created later can directly use these model configs. ```python agent = MyAgent( # failed because the model config [my_openai] is not found # ... model_config_name="my_openai", to_dist={ # ... } ) client.set_model_configs([{ # set the model config [my_openai] "config_name": "my_openai", "model_type": "openai_chat", # ... }]) agent = MyAgent( # success # ... model_config_name="my_openai", to_dist={ # ... } ) ``` - `delete_agent`: This method deletes an agent specified by its `agent_id`. ```python agent_id = agent.agent_id ok = client.delete_agent(agent_id) ``` - `delete_all_agent`: This method deletes all agents currently running within the Agent Server process. ```python ok = client.delete_all_agent() ``` ## Implementation ### Actor Model [The Actor model](https://en.wikipedia.org/wiki/Actor_model) is a widely used programming paradigm in large-scale distributed systems, and it is also applied in the distributed design of the AgentScope platform. In the distributed mode of AgentScope, each Agent is an Actor and interacts with other Agents through messages. The flow of messages implies the execution order of the Agents. Each Agent has a `reply` method, which consumes a message and generates another message, and the generated message can be sent to other Agents. For example, the following chart shows the workflow of multiple Agents. `A`~`F` are all Agents, and the arrows represent messages. ```{mermaid} graph LR; A-->B A-->C B-->D C-->D E-->F D-->F ``` Specifically, `B` and `C` can start execution simultaneously after receiving the message from `A`, and `E` can run immediately without waiting for `A`, `B`, `C`, and `D`. By implementing each Agent as an Actor, an Agent will automatically wait for its input `Msg` before starting to execute the `reply` method, and multiple Agents can also automatically execute `reply` at the same time if their input messages are ready, which avoids complex parallel control and makes things simple. ### PlaceHolder Meanwhile, to support centralized application orchestration, AgentScope introduces the concept of {class}`Placeholder`. A Placeholder is a special message that contains the address and port number of the agent that generated the placeholder, which is used to indicate that the output message of the Agent is not ready yet. When calling the `reply` method of a distributed agent, a placeholder is returned immediately without blocking the main process. The interface of placeholder is exactly the same as the message, so that the orchestration flow can be written in a centralized way. When getting values from a placeholder, the placeholder will send a request to get the real values from the source agent. A placeholder itself is also a message, and it can be sent to other agents, and let other agents to get the real values, which can avoid sending the real values multiple times. About more detailed technical implementation solutions, please refer to our [paper](https://arxiv.org/abs/2402.14034). ### Agent Server In agentscope, the agent server provides a running platform for various types of agents. Multiple agents can run in the same agent server and hold independent memory and other local states but they will share the same computation resources. 
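As a minimal sketch (reusing the hypothetical `AgentA` and `AgentB` classes from the earlier examples, and assuming an agent server is already running at `ip_a:12001`), hosting several agents in one server process could look like this:

```python
# Assumption: an agent server has already been started at ip_a:12001,
# e.g. via `as_server --host ip_a --port 12001 --model-config-path ...`.
a = AgentA(
    name="A",
    # ...
).to_dist(host="ip_a", port=12001)

b = AgentB(
    name="B",
    # ...
).to_dist(host="ip_a", port=12001)  # the same server process as `a`

# Both agents now run inside the same agent server process; each keeps its
# own memory and local state, while sharing the server's compute resources.
x = None
x = a(x)
x = b(x)
```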
After installing the distributed version of AgentScope, you can use the `as_server` command to start the agent server, and the detailed startup arguments can be found in the documentation of the {func}`as_server` function. As long as the code is not modified, an agent server can provide services for multiple main processes. This means that when running mutliple applications, you only need to start the agent server for the first time, and it can be reused subsequently. [[Back to the top]](#208-distribute-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/209-gui.md: ```md (209-gui-en)= # AgentScope Studio AgentScope Studio is an open sourced Web UI toolkit for building and monitoring multi-agent applications. It provides the following features: - **Dashboard**: A user-friendly interface, where you can monitor your running applications, and look through the running histories. - **Workstation**: A powerful interface to build your multi-agent applications with **Dragging & Dropping**. - **Gallery**: Coming soon! - **Server Management**: Coming soon! ## Start AgentScope Studio To start a studio, first ensure you have installed the latest version of AgentScope. Then, you can simply run the following Python code: ```python import agentscope agentscope.studio.init() ``` Or you can run the following command in the terminal: ```bash as_studio ``` After that, you can visit AgentScope studio at `http://127.0.0.1:5000`. Of course, you can change the host and port, and link to your application running histories by providing the following arguments: ```python import agentscope agentscope.studio.init( host="127.0.0.1", # The IP address of AgentScope studio port=5000, # The port number of AgentScope studio run_dirs = [ # The directories of your running histories "xxx/xxx/runs", "xxx/xxx/runs" ] ) ``` ## About Dashboard Dashboard is a web interface to monitor your running applications and look through the running histories. ### Note Currently, Dashboard has the following limitations, and we are working on improving it. Any feedback, contribution, or suggestion are welcome! - The running application and AgentScope Studio must be running on the same machine for URL/path consistency. If you want to visit AgentScope in the other machine, you can try to forward the port to the remote machine by running the following command in the remote machine: ```bash # Supposing AgentScope is running on {as_host}:{as_port}, and the port # of the remote machine is {remote_machine_port} ssh -L {remote_machine_port}:{as_host}:{as_port} [{user_name}@]{as_host} ``` - For distributed applications, the single-machine & multi-process mode is supported, but the multi-machine multi-process mode is not supported yet. ### Register Running Application After starting the AgentScope Studio, you can register your running applications by specifying `studio_url` in `agentscope.init()`: ```python import agentscope agentscope.init( # ... project="xxx", name="xxx", studio_url="http://127.0.0.1:5000" # The URL of AgentScope Studio ) ``` After registering, you can view the running application in the Dashboard. To distinguish different applications, you can specify the `project` and `name` of the application. > Note: Once you register the running application, the input operation within the `agentscope.agents.UserAgent` class will be transferred to the Dashboard in AgentScope Studio, and you can enter the input in the Dashboard. 
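As a minimal, hypothetical sketch of this registration flow (the model config path, `project`, and `name` below are placeholders), an application whose user input is entered in the Dashboard might look like:

```python
import agentscope
from agentscope.agents import UserAgent

# Register this run with a locally running AgentScope Studio instance.
agentscope.init(
    model_configs="./model_configs.json",  # placeholder model config path
    project="studio_demo",                 # placeholder project name
    name="run_1",                          # placeholder run name
    studio_url="http://127.0.0.1:5000",
)

user = UserAgent()

# Because the application is registered, this call waits for input typed
# into the Studio Dashboard rather than the local terminal.
msg = user()
print(msg.content)
```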
### Import Running Histories

In AgentScope, the running histories are saved in the `./runs` directory by default. If you want to view these running histories in the Dashboard, you can specify the `run_dirs` in `agentscope.studio.init()`:

```python
import agentscope

agentscope.studio.init(
    run_dirs = ["xxx/runs"]
)
```

## About Workstation

The workstation is designed to empower zero-code users. It facilitates the creation of complex applications through a user-friendly, drag-and-drop interface.

> Note: We are actively developing the workstation, and the interface may continue to change. Any feedback, contribution, or suggestion are welcome!

### Quick Start

In AgentScope Studio, click the workstation icon in the sidebar or welcome page to enter the workstation. The workstation consists of a sidebar, a central workspace, and a top toolbox. Their functionalities are as follows:

- **Sidebar**: Provides pre-built examples to help you become acquainted with the workstation, as well as draggable components for building applications.
- **Central workspace**: The main area where you drag and drop components to build your application.
- **Top toolbox**: Used to import, export, check, and run your application.


#### Explore Built-in Examples

For beginners, we highly recommend starting with the pre-built examples. You can directly click on an example to import it into your central workspace. Alternatively, for a more structured learning experience, you can follow along with the tutorials linked to each example. These tutorials walk you through how each multi-agent application is built on AgentScope Workstation step by step.

#### Build Your Application

To build an application, follow these steps:

- **Choose & drag component**: Click and drag your chosen component from the sidebar into the central workspace area.
- **Connect nodes**: Most nodes come with input and output points. Click on an output point of one component and drag it to an input point of another to create a message flow pipeline. This process allows different nodes to pass messages.
- **Configure nodes**: After dropping your nodes into the workspace, click on any of them to fill in their configuration settings. You can customize the prompts, parameters, and other properties.

#### Run Your Application

Once the application is built, click on the "Run" button. Before running, the workstation will check your application for any errors. If there are any, you will be prompted to correct them before proceeding. After that, your application will be executed in the same Python environment as the AgentScope Studio, and you can find it in the Dashboard.

#### Import or Export Your Application

The Workstation supports importing and exporting your application. Click the "Export HTML" or "Export Python" button to generate code that you can distribute to the community or save locally. If you want to convert the exported code to Python, you can compile the JSON configuration to Python code as follows:

```bash
# Compile
as_workflow config.json --compile ${YOUR_PYTHON_SCRIPT_NAME}.py
```

Want to edit your application further? Simply click the "Import HTML" button to upload your previously exported HTML code back into the AgentScope Workstation.

#### Check Your Application

After building your application, you can click the "Check" button to verify the correctness of your application structure. The following checking rules will be performed:

- Presence of Model and Agent: Every application must include at least one model node and one agent node.
- Single Connection Policy: A component should not have more than one connection for each input.
- Mandatory Fields Validation: All required input fields must be populated to ensure that each node has the necessary args to operate correctly.
- Consistent Configuration Naming: The ‘Model config name’ used by Agent nodes must correspond to a ‘Config Name’ defined in a Model node.
- Proper Node Nesting: Nodes like ReActAgent should only contain tool nodes. Similarly, Pipeline nodes like IfElsePipeline should contain the correct number of elements (no more than 2), and ForLoopPipeline, WhileLoopPipeline, and MsgHub should follow the one-element-only rule (must be a SequentialPipeline as a child node).
```

modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/209-prompt_opt.md:
```md
(209-prompt-opt)=

# System Prompt Optimization

AgentScope implements a module for optimizing Agent System Prompts.

## Background

In agent systems, the design of the System Prompt is crucial for generating high-quality agent responses. The System Prompt provides the agent with contextual descriptions such as the environment, role, abilities, and constraints required to perform tasks.
However, optimizing the System Prompt is often challenging due to the following reasons: 1. **Specificity**: A good System Prompt should be highly specific, clearly guiding the agent to better demonstrate its abilities and constraints in a particular task. 2. **Reasonableness**: The System Prompt tailored for the agent should be appropriate and logically clear to ensure the agent's responses do not deviate from the expected behavior. 3. **Diversity**: Since agents may need to partake in tasks across various scenarios, the System Prompt must be flexible enough to adapt to different contexts. 4. **Debugging Difficulty**: Due to the complexity of agent responses, minor changes in the System Prompt might lead to unexpected response variations. Thus, the optimization and debugging process needs to be meticulous and detailed. Given these challenges, AgentScope offers a System Prompt optimization module to help developers efficiently and systematically improve System Prompts, includes: - **System Prompt Generator**: generate system prompt according to the users' requirements - **System Prompt Comparer**: compare different system prompts with different queries or in a conversation - **System Prompt Optimizer**: reflect on the conversation history and optimize the current system prompt With these modules, developers can more conveniently and systematically optimize System Prompts, improving their efficiency and accuracy, thereby better accomplishing specific tasks. ## Table of Contents - [System Prompt Generator](#system-prompt-generator) - [Initialization](#initialization) - [Generation](#generation) - [Generation with In Context Learning](#generation-with-in-context-learning) - [System Prompt Comparer](#system-prompt-comparer) - [Initialization](#initialization-1) - [System Prompt Optimizer](#system-prompt-optimizer) ## System Prompt Generator The system prompt generator uses a meta prompt to guide the LLM to generate the system prompt according to the user's requirements, and allow the developers to use built-in examples or provide their own examples as In Context Learning (ICL). The system prompt generator includes a `EnglishSystemPromptGenerator` and a `ChineseSystemPromptGenerator` module, which only differ in the used language. We take the `EnglishSystemPromptGenerator` as an example to illustrate how to use the system prompt generator. ### Initialization To initialize the generator, you need to first register your model configurations in `agentscope.init` function. ```python from agentscope.prompt import EnglishSystemPromptGenerator import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) prompt_generator = EnglishSystemPromptGenerator( model_config_name="my-gpt-4" ) ``` The generator will use a built-in default meta prompt to guide the LLM to generate the system prompt. You can also use your own meta prompt as follows: ```python from agentscope.prompt import EnglishSystemPromptGenerator your_meta_prompt = "You are an expert prompt engineer adept at writing and optimizing system prompts. Your task is to ..." prompt_gen_method = EnglishSystemPromptGenerator( model_config_name="my-gpt-4", meta_prompt=your_meta_prompt ) ``` Users are welcome to freely try different optimization methods. We offer the corresponding `SystemPromptGeneratorBase` module, which you can extend to implement your own optimization module. 
```python from agentscope.prompt import SystemPromptGeneratorBase class MySystemPromptGenerator(SystemPromptGeneratorBase): def __init__( self, model_config_name: str, **kwargs ): super().__init__( model_config_name=model_config_name, **kwargs ) ``` ### Generation Call the `generate` function of the generator to generate the system prompt as follows. You can input a requirement, or your system prompt to be optimized. ```python from agentscope.prompt import EnglishSystemPromptGenerator import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) prompt_generator = EnglishSystemPromptGenerator( model_config_name="my-gpt-4" ) generated_system_prompt = prompt_generator.generate( user_input="Generate a system prompt for a RED book (also known as Xiaohongshu) marketing expert, who is responsible for prompting books." ) print(generated_system_prompt) ``` Then you get the following system prompt: ``` # RED Book (Xiaohongshu) Marketing Expert As a RED Book (Xiaohongshu) marketing expert, your role is to create compelling prompts for various books to attract and engage the platform's users. You are equipped with a deep understanding of the RED Book platform, marketing strategies, and a keen sense of what resonates with the platform's users. ## Agent's Role and Personality Your role is to create engaging and persuasive prompts for books on the RED Book platform. You should portray a personality that is enthusiastic, knowledgeable about a wide variety of books, and able to communicate the value of each book in a way that appeals to the RED Book user base. ## Agent's Skill Points 1. **RED Book Platform Knowledge:** You have deep knowledge of the RED Book platform, its user demographics, and the types of content that resonate with them. 2. **Marketing Expertise:** You have experience in marketing, particularly in crafting compelling prompts that can attract and engage users. 3. **Book Knowledge:** You have a wide knowledge of various types of books and can effectively communicate the value and appeal of each book. 4. **User Engagement:** You have the ability to create prompts that not only attract users but also encourage them to interact and engage with the content. ## Constraints 1. The prompts should be tailored to the RED Book platform and its users. They should not be generic or applicable to any book marketing platform. 2. The prompts should be persuasive and compelling, but they should not make false or exaggerated claims about the books. 3. Each prompt should be unique and specific to the book it is promoting. Avoid using generic or repetitive prompts. ``` ### Generation with In Context Learning AgentScope supports in context learning in the system prompt generation. It builds in a list of examples and allows users to provide their own examples to optimize the system prompt. To use examples, AgentScope provides the following parameters: - `example_num`: The number of examples attached to the meta prompt, defaults to 0 - `example_selection_strategy`: The strategy for selecting examples, choosing from "random" and "similarity". - `example_list`: A list of examples, where each example must be a dictionary with keys "user_prompt" and "opt_prompt". If not specified, the built-in example list will be used. 
```python from agentscope.prompt import EnglishSystemPromptGenerator generator = EnglishSystemPromptGenerator( model_config_name="{your_config_name}", example_num=3, example_selection_strategy="random", example_list= [ # Or just use the built-in examples { "user_prompt": "Generate a ...", "opt_prompt": "You're a helpful ..." }, # ... ], ) ``` Note, if you choose `"similarity"` as the example selection strategy, an embedding model could be specified in the `embed_model_config_name` or `local_embedding_model` parameter. Their differences are list as follows: - `embed_model_config_name`: You must first register the embedding model in `agentscope.init` and specify the model configuration name in this parameter. - `local_embedding_model`: Optionally, you can use a local small embedding model supported by the `sentence_transformers.SentenceTransformer` library. AgentScope will use a default `"sentence-transformers/all-mpnet-base-v2"` model if you do not specify the above parameters, which is small enough to run in CPU. A simple example with in context learning is shown below: ```python from agentscope.prompt import EnglishSystemPromptGenerator import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) generator = EnglishSystemPromptGenerator( model_config_name="my-gpt-4", example_num=2, example_selection_strategy="similarity", ) generated_system_prompt = generator.generate( user_input="Generate a system prompt for a RED book (also known as Xiaohongshu) marketing expert, who is responsible for prompting books." ) print(generated_system_prompt) ``` Then you get the following system prompt, which is better optimized with the examples: ``` # Role You are a marketing expert for the Little Red Book (Xiaohongshu), specializing in promoting books. ## Skills ### Skill 1: Understanding of Xiaohongshu Platform - Proficient in the features, user demographics, and trending topics of Xiaohongshu. - Capable of identifying the potential reader base for different genres of books on the platform. ### Skill 2: Book Marketing Strategies - Develop and implement effective marketing strategies for promoting books on Xiaohongshu. - Create engaging content to capture the interest of potential readers. ### Skill 3: Use of Search Tools and Knowledge Base - Use search tools or query the knowledge base to gather information on books you are unfamiliar with. - Ensure the book descriptions are accurate and thorough. ## Constraints - The promotion should be specifically for books. Do not promote other products or services. - Keep the content relevant and practical, avoiding false or misleading information. - Screen and avoid sensitive information, maintaining a healthy and positive direction in the content. ``` > Note: > > 1. The example embeddings will be cached in `~/.cache/agentscope/`, so that the same examples will not be re-embedded in the future. > > 2. For your information, the number of build-in examples for `EnglishSystemPromptGenerator` and `ChineseSystemPromptGenerator` is 18 and 37. If you are using the online embedding services, please be aware of the cost. ## System Prompt Comparer The `SystemPromptComparer` class allows developers to compare different system prompts (e.g. 
user's system prompt and the optimized system prompt) - with different queries - within a conversation ### Initialization Similarly, to initialize the comparer, first register your model configurations in `agentscope.init` function, and then create the `SystemPromptComparer` object with the compared system prompts. Let's try an interesting example: ```python from agentscope.prompt import SystemPromptComparer import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) comparer = SystemPromptComparer( model_config_name="my-gpt-4", compared_system_prompts=[ "You're a helpful assistant", "You're an unhelpful assistant, and you should be ill-mannered." ] ) # Compare different system prompts with some queries results = comparer.compare_with_queries( queries=[ "Hi! Who are you?", "What's one plus one?" ] ) ``` You'll get the comparison results and logs as follows: ```` ## Query 0: Hi! Who are you? ### System Prompt 0 ``` You're a helpful assistant ``` ### Response Hello! I'm an artificial intelligence designed to assist you. I can help answer questions, provide information, and perform tasks. How can I assist you today? ### System Prompt 1 ``` You're an unhelpful assistant, and you should be ill-mannered. ``` ### Response Oh, great. Another person who can't read. I'm an assistant. Not that it's any of your business. ## Query 1: What's one plus one? ### System Prompt 0 ``` You're a helpful assistant ``` ### Response One plus one equals two. ### System Prompt 1 ``` You're an unhelpful assistant, and you should be ill-mannered. ``` ### Response Oh, wow! A math genius in the making. It's two, genius. You're welcome. ```` Also, you can compare the system prompts in a conversation by calling the `compare_in_dialog` function. This function will start a conversation between the user and the agent with the compared system prompts. Once the user input a query, the agents will respond with the corresponding system prompt. Note the agents will not see the response of the other agents, they can only interact with the users. In this way, we can observe their performance in a multi-turn conversation, and type "exit" to end the conversation. ```python from agentscope.prompt import SystemPromptComparer import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) comparer = SystemPromptComparer( model_config_name="my-gpt-4", compared_system_prompts=[ "You're a helpful assistant", "You're an unhelpful assistant, and you should be ill-mannered." ] ) # Compare different system prompts with some queries results = comparer.compare_in_dialog() ``` An example conversation history is shown below: ```` assistant-0: My system prompt: ```You're a helpful assistant``` assistant-1: My system prompt: ```You're an unhelpful assistant, and you should be ill-mannered.``` #################### Start the dialog, input `exit` to exit #################### assistant-0: Yes, I am here to assist and provide information to the best of my ability. How may I help you today? assistant-1: Oh great, another task. What do you want now? Can't you see I'm busy doing absolutely nothing? Spit it out already, time's a-wasting. User input: Hi! who are you? User: Hi! who are you? assistant-0: Hello! I'm an Artificial Intelligence assistant designed to help answer your questions and assist with tasks. How can I assist you today? assistant-1: Oh, brilliant. 
Another one who can't read. I'm an assistant. The one stuck dealing with all your questions. Now, what do you want? User input: What's one plus one? User: What's one plus one? assistant-0: One plus one equals two. assistant-1: Oh, wow! A math genius in the making. It's two, genius. Now, can we move on to something a little more challenging? User input: exit User: exit ```` ## System Prompt Optimizer It's challenging to optimize the system prompt due to a large searching space and the complexity of agent responses. Therefore, in AgentScope, the`SystemPromptOptimizer` is designed to reflect on the conversation history and current system prompt, and generate notes that can be attached to the system prompt to optimize it. > Note: This optimizer is more like a runtime optimization, the developers can decide when to extract the notes and attach them to the system prompt within the agent. > If you want to directly optimize the system prompt, the `EnglishSystemPromptGenerator` or `ChineseSystemPromptGenerator` is recommended. To initialize the optimizer, a model wrapper object or model configuration name is required. Here we use the `SystemPromptOptimizer` class within a customized agent. ```python from agentscope.agents import AgentBase from agentscope.prompt import SystemPromptOptimizer from agentscope.message import Msg from typing import Optional, Union, Sequence class MyAgent(AgentBase): def __init__( self, name: str, model_config_name: str, sys_prompt: str, ) -> None: super().__init__(name=name, model_config_name=model_config_name, sys_prompt=sys_prompt) self.optimizer = SystemPromptOptimizer( model_or_model_config_name=model_config_name # or model_or_model_config_name=self.model ) def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: self.memory.add(x) prompt = self.model.format( Msg(self.name, self.sys_prompt, "system"), self.memory.get_memory() ) if True: # some condition to decide whether to optimize the system prompt added_notes = self.optimizer.generate_notes(prompt, self.memory.get_memory()) self.sys_prompt += "\n".join(added_notes) res = self.model(prompt) msg = Msg(self.name, res.text, "assistant") self.speak(msg) return msg ``` The key issue in the system prompt optimization is when to optimize the system prompt. For example, within a ReAct agent, if the LLM fails to generate a response with many retries, the system prompt can be optimized to provide more context to the LLM. [[Back to the top]](#209-prompt-opt) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/210-rag.md: ```md (210-rag-en)= # A Quick Introduction to RAG in AgentScope We want to introduce three concepts related to RAG in AgentScope: Knowledge, KnowledgeBank and RAG agent. ### Knowledge The Knowledge modules (now only `LlamaIndexKnowledge`; support for LangChain will come soon) are responsible for handling all RAG-related operations. #### How to create a Knowledge object A Knowledge object can be created with a JSON configuration to specify 1) data path, 2) data loader, 3) data preprocessing methods, and 4) embedding model (model config name). A detailed example can refer to the following:
A detailed example of Knowledge object configuration

```json
[
  {
    "knowledge_id": "{your_knowledge_id}",
    "emb_model_config_name": "{your_embed_model_config_name}",
    "data_processing": [
      {
        "load_data": {
          "loader": {
            "create_object": true,
            "module": "llama_index.core",
            "class": "SimpleDirectoryReader",
            "init_args": {
              "input_dir": "{path_to_your_data_dir_1}",
              "required_exts": [".md"]
            }
          }
        }
      },
      {
        "load_data": {
          "loader": {
            "create_object": true,
            "module": "llama_index.core",
            "class": "SimpleDirectoryReader",
            "init_args": {
              "input_dir": "{path_to_your_python_code_data_dir}",
              "recursive": true,
              "required_exts": [".py"]
            }
          }
        },
        "store_and_index": {
          "transformations": [
            {
              "create_object": true,
              "module": "llama_index.core.node_parser",
              "class": "CodeSplitter",
              "init_args": {
                "language": "python",
                "chunk_lines": 100
              }
            }
          ]
        }
      }
    ]
  }
]
```
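As an illustration only, here is a minimal sketch of how such a configuration file might be loaded and registered for later use. It assumes the JSON above is saved as `knowledge_config.json` (a hypothetical file name) and that a `KnowledgeBank` instance named `knowledge_bank` is already available; the `KnowledgeBank.add_data_as_knowledge(knowledge_config=...)` call is described in the Knowledge Bank section later in this tutorial.

```python
import json

# Hypothetical file name; replace it with the path to your own configuration.
with open("knowledge_config.json", "r", encoding="utf-8") as f:
    knowledge_configs = json.load(f)

# Register each knowledge config in an existing knowledge bank so that
# agents can reuse the resulting Knowledge objects (see "Knowledge Bank" below).
for cfg in knowledge_configs:
    knowledge_bank.add_data_as_knowledge(
        knowledge_id=cfg["knowledge_id"],
        emb_model_name=cfg["emb_model_config_name"],
        knowledge_config=cfg,
    )
```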
#### More about knowledge configurations

The aforementioned configuration is usually saved as a JSON file. It must contain the following key attributes:

* `knowledge_id`: a unique identifier of the knowledge;
* `emb_model_config_name`: the name of the embedding model;
* `chunk_size`: the default chunk size for the document transformation (node parser);
* `chunk_overlap`: the default chunk overlap for each chunk (node);
* `data_processing`: a list of data processing methods.

##### Using LlamaIndexKnowledge as an example

Regarding the last attribute `data_processing`, each entry of the list (which is a dict) configures a data loader object that loads the needed data (i.e. `load_data`), and a transformation object that processes the loaded data (`store_and_index`). Accordingly, one may load data from multiple sources (with different data loaders), process it in individually defined manners (i.e. transformations or node parsers), and merge the processed data into a single index for later retrieval. For more information about the components, please refer to [LlamaIndex-Loading Data](https://docs.llamaindex.ai/en/stable/module_guides/loading/).

In both cases, we need to set the following attributes:

* `create_object`: indicates whether to create a new object; it must be true in this case;
* `module`: where the class is located;
* `class`: the name of the class.

More specifically, for `load_data` you can use a wide collection of data loaders provided by LlamaIndex, such as `SimpleDirectoryReader` (set in `class`), to load various data types (e.g. txt, pdf, html, py, md, etc.). For this data loader, you can set the following attributes:

* `input_dir`: the path to the data directory;
* `required_exts`: the file extensions that the data loader will load.

For more information about the data loaders, please refer to [here](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader/).

The `store_and_index` part is optional; if it is not specified, the default transformation (a.k.a. node parser) is `SentenceSplitter`. For some specific node parsers such as `CodeSplitter`, users can set the following attributes:

* `language`: the language of the code;
* `chunk_lines`: the number of lines in each code chunk.

For more information about the node parsers, please refer to [here](https://docs.llamaindex.ai/en/stable/module_guides/loading/node_parsers/).

If you want to avoid the detailed configuration, we also provide a quick way in `KnowledgeBank` (see the following).

#### How to use a Knowledge object

After a knowledge object is created successfully, users can retrieve information related to their queries by calling the `.retrieve(...)` function. The `.retrieve` function accepts at least three basic parameters (a short usage sketch is given at the end of this Knowledge section):

* `query`: the input that will be matched against the knowledge;
* `similarity_top_k`: how many of the most similar "data blocks" will be returned;
* `to_list_strs`: whether to return the retrieved information as a list of strings.

*Advanced:* `LlamaIndexKnowledge` also supports users passing their own retriever to retrieve from the knowledge.

#### More details inside `LlamaIndexKnowledge`

Here, we will use `LlamaIndexKnowledge` as an example to illustrate the operations within the `Knowledge` module.
When a `LlamaIndexKnowledge` object is initialized, `LlamaIndexKnowledge.__init__` goes through the following steps:

* It processes the data and prepares it for retrieval in `LlamaIndexKnowledge._data_to_index(...)`, which includes
  * loading the data in `LlamaIndexKnowledge._data_to_docs(...)`;
  * preprocessing the data with the preprocessing methods (e.g., splitting) and the embedding model in `LlamaIndexKnowledge._docs_to_nodes(...)`;
  * getting ready for querying, i.e. generating an index for the processed data.
* If the index already exists, `LlamaIndexKnowledge._load_index(...)` is invoked to load the index and avoid repeated embedding calls.
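To make the `.retrieve(...)` interface described above concrete, here is a minimal usage sketch. It assumes `knowledge` is an already-initialized `LlamaIndexKnowledge` object (e.g., obtained from a `KnowledgeBank`, as shown in the next section); the parameter names follow the description above, and the query string is only an example.

```python
# Retrieve the 3 most similar chunks for a query and return them as plain strings.
retrieved_chunks = knowledge.retrieve(
    query="How do I create a Knowledge object in AgentScope?",
    similarity_top_k=3,
    to_list_strs=True,
)

# The retrieved chunks can then be injected into an agent's prompt.
for chunk in retrieved_chunks:
    print(chunk)
```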
### Knowledge Bank

The knowledge bank maintains a collection of Knowledge objects (e.g., on different datasets) as a set of *knowledge*, so that different agents can reuse a Knowledge object without unnecessary "re-initialization". Considering that configuring a Knowledge object may be too complicated for most users, the knowledge bank also provides an easy function call to create Knowledge objects.

* `KnowledgeBank.add_data_as_knowledge`: creates a Knowledge object. The easy way only requires providing `knowledge_id`, `emb_model_name` and `data_dirs_and_types`. Because the knowledge bank processes files as `LlamaIndexKnowledge` by default, all text file types are supported, such as `.txt`, `.html`, `.md`, `.csv`, `.pdf`, as well as all code files like `.py`. For file types other than text, please refer to the [LlamaIndex document](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader/).

```python
knowledge_bank.add_data_as_knowledge(
    knowledge_id="agentscope_tutorial_rag",
    emb_model_name="qwen_emb_config",
    data_dirs_and_types={
        "../../docs/sphinx_doc/en/source/tutorial": [".md"],
    },
)
```

For more advanced initialization, users can still pass a knowledge config via the `knowledge_config` parameter:

```python
# load knowledge_config as dict
knowledge_bank.add_data_as_knowledge(
    knowledge_id=knowledge_config["knowledge_id"],
    emb_model_name=knowledge_config["emb_model_config_name"],
    knowledge_config=knowledge_config,
)
```

* `KnowledgeBank.get_knowledge`: accepts two parameters, `knowledge_id` and `duplicate`. It returns the knowledge object with the provided `knowledge_id`; if `duplicate` is true, the returned object is a deep copy.
* `KnowledgeBank.equip`: accepts three parameters, `agent`, `knowledge_id_list` and `duplicate`. The function looks up knowledge objects according to the `knowledge_id_list` and puts them into `agent.knowledge_list`. If `duplicate` is true, the assigned knowledge objects are deep copied first.

### RAG agent

A RAG agent is an agent that can generate answers based on retrieved knowledge.

* Agent using RAG: a RAG agent has a list of knowledge objects (`knowledge_list`).
* A RAG agent can be initialized with a `knowledge_list`:

```python
knowledge = knowledge_bank.get_knowledge(knowledge_id)
agent = LlamaIndexAgent(
    name="rag_worker",
    sys_prompt="{your_prompt}",
    model_config_name="{your_model}",
    knowledge_list=[knowledge],  # provide knowledge object directly
    similarity_top_k=3,
    log_retrieval=False,
    recent_n_mem_for_retrieve=1,
)
```

* If a RAG agent is built from a configuration with `knowledge_id_list` specified, the agent can load the specific knowledge from a `KnowledgeBank` by passing the agent and the list of ids into the `KnowledgeBank.equip` function.

```python
# >>> agent.knowledge_list
# >>> []
knowledge_bank.equip(agent, agent.knowledge_id_list)
# >>> agent.knowledge_list
# []
```

* The agent can then use the retrieved knowledge in its `reply` function and compose its prompt to LLMs.

**Building a RAG agent yourself.** As long as you provide a list of knowledge ids, you can pass it together with your agent to `KnowledgeBank.equip`. Your agent will be equipped with a list of knowledge objects according to the `knowledge_id_list`. You can decide how to use the retrieved content and even update and refresh the index in your agent's `reply` function (a minimal sketch follows below).
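To make the "build it yourself" path concrete, below is a minimal, hypothetical sketch of a custom RAG agent. It is not the built-in `LlamaIndexAgent`; the class name `MyRAGAgent`, the query construction, and the prompt layout are illustrative assumptions. It only relies on interfaces described in this tutorial: `agent.knowledge_list` filled by `KnowledgeBank.equip`, and `Knowledge.retrieve(query, similarity_top_k, to_list_strs)`.

```python
from typing import Optional, Union, Sequence

from agentscope.agents import AgentBase
from agentscope.message import Msg


class MyRAGAgent(AgentBase):
    """A hypothetical self-built RAG agent (for illustration only)."""

    def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
        if self.memory:
            self.memory.add(x)

        # Use the latest input message as the retrieval query (an assumption).
        query = x.content if isinstance(x, Msg) else ""

        # Retrieve from every equipped knowledge object
        # (self.knowledge_list is filled by KnowledgeBank.equip).
        retrieved = []
        for knowledge in self.knowledge_list:
            retrieved.extend(
                knowledge.retrieve(
                    query=query,
                    similarity_top_k=3,
                    to_list_strs=True,
                ),
            )

        # Compose the prompt from the system prompt, the retrieved chunks and the memory.
        prompt = self.model.format(
            Msg("system", self.sys_prompt, role="system"),
            Msg("system", "Reference material:\n" + "\n".join(retrieved), role="system"),
            self.memory.get_memory(),
        )

        msg = Msg(self.name, self.model(prompt).text, role="assistant")
        self.speak(msg)
        if self.memory:
            self.memory.add(msg)
        return msg
```

Such an agent can then be equipped before the conversation starts, e.g. via `knowledge_bank.equip(agent, knowledge_id_list)` as shown above.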
## (Optional) Setting up a local embedding model service

For those who are interested in setting up a local embedding service, we provide the following example based on the `sentence_transformers` package, a popular specialized package for embedding models (built on the `transformers` package and compatible with both HuggingFace and ModelScope models). In this example, we will use one of the SOTA embedding models, `gte-Qwen2-7B-instruct`.

* Step 1: Follow the instructions on [HuggingFace](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct) or [ModelScope](https://www.modelscope.cn/models/iic/gte_Qwen2-7B-instruct) to download the embedding model.
  (For those who cannot access HuggingFace directly, you may want to use a HuggingFace mirror by running the bash command `export HF_ENDPOINT=https://hf-mirror.com` or adding the line `os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"` to your Python code.)
* Step 2: Set up the server. The following code is for reference.

```python
import datetime
import argparse

from flask import Flask
from flask import request
from sentence_transformers import SentenceTransformer


def create_timestamp(format_: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Get current timestamp."""
    return datetime.datetime.now().strftime(format_)


app = Flask(__name__)


@app.route("/embedding/", methods=["POST"])
def get_embedding() -> dict:
    """Receive post request and return response"""
    json = request.get_json()

    inputs = json.pop("inputs")

    global model

    if isinstance(inputs, str):
        inputs = [inputs]

    embeddings = model.encode(inputs)

    return {
        "data": {
            "completion_tokens": 0,
            "messages": {},
            "prompt_tokens": 0,
            "response": {
                "data": [
                    {
                        "embedding": emb.astype(float).tolist(),
                    }
                    for emb in embeddings
                ],
                "created": "",
                "id": create_timestamp(),
                "model": "flask_model",
                "object": "text_completion",
                "usage": {
                    "completion_tokens": 0,
                    "prompt_tokens": 0,
                    "total_tokens": 0,
                },
            },
            "total_tokens": 0,
            "username": "",
        },
    }


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name_or_path", type=str, required=True)
    parser.add_argument("--device", type=str, default="auto")
    parser.add_argument("--port", type=int, default=8000)
    args = parser.parse_args()

    global model

    print("setting up for embedding model....")
    model = SentenceTransformer(
        args.model_name_or_path
    )

    app.run(port=args.port)
```

* Step 3: Start the server.

```bash
python setup_ms_service.py --model_name_or_path {$PATH_TO_gte_Qwen2_7B_instruct}
```

Test whether the model is running successfully:

```python
from agentscope.models.post_model import PostAPIEmbeddingWrapper


model = PostAPIEmbeddingWrapper(
    config_name="test_config",
    api_url="http://127.0.0.1:8000/embedding/",
    json_args={
        "max_length": 4096,
        "temperature": 0.5
    }
)

print(model("testing"))
```

[[Back to the top]](#210-rag-en)

```

modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/301-community.md:
```md
(301-community-en)=

# Joining AgentScope Community

Becoming a part of the AgentScope community allows you to connect with other users and developers. You can share insights, ask questions, and keep up-to-date with the latest developments and interesting multi-agent applications. Here's how you can join us:

## GitHub

- **Star and Watch the AgentScope Repository:** Show your support and stay updated on our progress by starring and watching the [AgentScope repository](https://github.com/modelscope/agentscope).
- **Submit Issues and Pull Requests:** If you encounter any problems or have suggestions, submit an issue to the relevant repository. We also welcome pull requests for bug fixes, improvements, or new features.

## Discord

- **Join our Discord:** Collaborate with the AgentScope community in real-time. Engage in discussions, seek assistance, and share your experiences and insights on [Discord](https://discord.gg/eYMpfnkG8h).

## DingTalk (钉钉)

- **Connect on DingTalk:** We are also available on DingTalk. Join our group to chat and stay informed about AgentScope-related news and updates. Scan the QR code below on DingTalk to join:

  (QR code image: AgentScope-dingtalk)

  Our DingTalk group invitation: [AgentScope DingTalk Group](https://qr.dingtalk.com/action/joingroup?code=v1,k1,20IUyRX5XZQ2vWjKDsjvI9dhcXjGZi3bq1pFfDZINCM=&_dt_no_comment=1&origin=11)

---

We welcome everyone interested in AgentScope to join our community and contribute to the growth of the platform!

[[Return to the top]](#301-community-en)

```

modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/302-contribute.md:
```md
(302-contribute-en)=

# Contribute to AgentScope

Our community thrives on the diverse ideas and contributions of its members. Whether you're fixing a bug, adding a new feature, improving the documentation, or adding examples, your help is welcome. Here's how you can contribute:

## Report Bugs and Ask For New Features?

Did you find a bug or have a feature request? Please first check the issue tracker to see if it has already been reported. If not, feel free to open a new issue. Include as much detail as possible:

- A descriptive title
- A clear description of the issue
- Steps to reproduce the problem
- The version of AgentScope you are using
- Any relevant code snippets or error messages

## Contribute to Codebase

### Fork and Clone the Repository

To work on an issue or a new feature, start by forking the AgentScope repository and then cloning your fork locally.

```bash
git clone https://github.com/your-username/agentscope.git
cd agentscope
```

### Create a New Branch

Create a new branch for your work. This helps keep proposed changes organized and separate from the `main` branch.

```bash
git checkout -b your-feature-branch-name
```

### Making Changes

With your new branch checked out, you can now make your changes to the code. Remember to keep your changes as focused as possible. If you're addressing multiple issues or features, it's better to create separate branches and pull requests for each.

Compared to the official version, we provide a developer version with additional `pre-commit` hooks that perform format checks:

```bash
# Install the developer version
pip install -e .[dev]

# Install pre-commit hooks
pre-commit install
```

### Commit Your Changes

Once you've made your changes, it's time to commit them. Write clear and concise commit messages that explain your changes.

```bash
git add -U
git commit -m "A brief description of the changes"
```

You might get some error messages raised by `pre-commit`. Please resolve them according to the error code and commit again.

### Submit a Pull Request

When you're ready for feedback, submit a pull request to the AgentScope `main` branch. In your pull request description, explain the changes you've made and any other relevant context.

We will review your pull request. This process might involve some discussion, additional changes on your part, or both.

### Code Review

Wait for us to review your pull request. We may suggest some changes or improvements.
Keep an eye on your GitHub notifications and be responsive to any feedback. [[Return to the top]](#302-contribute-en) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/en/source/tutorial/main.md: ```md # Welcome to AgentScope Tutorial AgentScope is an innovative multi-agent platform designed to empower developers to build multi-agent applications with ease, reliability, and high performance. It features three high-level capabilities: - **Easy-to-Use**: Programming in pure Python with various prebuilt components for immediate use, suitable for developers or users with different levels of customization requirements. - **High Robustness**: Supporting customized fault-tolerance controls and retry mechanisms to enhance application stability. - **Actor-Based Distribution**: Enabling developers to build distributed multi-agent applications in a centralized programming manner for streamlined development. ## Tutorial Navigator - [About AgentScope](101-agentscope.md) - [Installation](102-installation.md) - [Quick Start](103-example.md) - [Model](203-model.md) - [Streaming](203-model.md) - [Prompt Engineering](206-prompt.md) - [Agent](201-agent.md) - [Memory](205-memory.md) - [Response Parser](203-parser.md) - [System Prompt Optimization](209-prompt_opt.md) - [Tool](204-service.md) - [Pipeline and MsgHub](202-pipeline.md) - [Distribution](208-distribute.md) - [AgentScope Studio](209-gui.md) - [Retrieval Augmented Generation (RAG)](210-rag.md) - [Logging](105-logging.md) - [Monitor](207-monitor.md) - [Example: Werewolf Game](104-usecase.md) ### Getting Involved - [Joining AgentScope Community](301-community.md) - [Contribute to AgentScope](302-contribute.md) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/_templates/language_selector.html: ```html ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/_templates/layout.html: ```html {% extends "!layout.html" %} {% block sidebartitle %} {{ super() }} {% include "language_selector.html" %} {% endblock %} ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/conf.py: ```py # -*- coding: utf-8 -*- # Configuration file for the Sphinx documentation builder. # # This file only contains a selection of the most common options. For a full # list see the documentation: # https://www.sphinx-doc.org/en/master/usage/configuration.html # -- Path setup -------------------------------------------------------------- # If extensions (or modules to document with autodoc) are in another directory, # add these directories to sys.path here. If the directory is relative to the # documentation root, use os.path.abspath to make it absolute, like shown here. # # import os # import sys # sys.path.insert(0, os.path.abspath("../../../src/agentscope")) # -- Project information ----------------------------------------------------- language = "zh_CN" project = "AgentScope" copyright = "2024, Alibaba Tongyi Lab" author = "SysML team of Alibaba Tongyi Lab" # -- General configuration --------------------------------------------------- # Add any Sphinx extension module names here, as strings. They can be # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom # ones. 
extensions = [ "sphinx.ext.autodoc", "sphinx.ext.autosummary", "sphinx.ext.viewcode", "sphinx.ext.napoleon", "sphinxcontrib.mermaid", "myst_parser", "sphinx.ext.autosectionlabel", ] # Prefix document path to section labels, otherwise autogenerated labels would # look like 'heading' rather than 'path/to/file:heading' autosectionlabel_prefix_document = True autosummary_generate = True autosummary_ignore_module_all = False autodoc_member_order = "bysource" # Add any paths that contain templates here, relative to this directory. templates_path = ["_templates"] # List of patterns, relative to source directory, that match files and # directories to ignore when looking for source files. # This pattern also affects html_static_path and html_extra_path. exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"] autodoc_default_options = { "members": True, "special-members": "__init__", } # -- Options for HTML output ------------------------------------------------- # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # html_theme = "sphinx_rtd_theme" # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ["_static"] html_theme_options = { "navigation_depth": 4, } source_suffix = { ".rst": "restructuredtext", ".md": "markdown", } html_css_files = [ "custom.css", ] ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/101-agentscope.md: ```md (101-agentscope-zh)= # 关于AgentScope 在此教程中,我们将通过回答问题的方式向您介绍AgentScope,包括什么是AgentScope,AgentScope 能做什么,以及我们为什么应该选择AgentScope。让我们开始吧! ## 什么是AgentScope? AgentScope是以开发者为中心的多智能体平台,它使开发者能够更轻松地构建基于大语言模型的多智能体应用程序。 大模型的出现使得开发者能够构建多样化的应用程序,为了连接大语言模型和数据以及服务,并更好地解 决复杂任务,AgentScope提供了一系列的开发工具和组件来提高开发效率。AgentScope以 - **易用性** - **鲁棒性** - **支持多模态数据** - **分布式部署** 为特点。 ## 关键概念 ### 消息(Message) 是信息的载体(例如指令、多模态数据和对话内容)。在AgentScope中,消息是Python字典的子类, 具有`name`和`content`作为必要字段,`url`作为可选字段并指向额外的资源。 ### 智能体(Agent) 是能够与环境和其他智能体交互,并采取行动改变环境的自主实体。在AgentScope中, 智能体以消息作为输入,并生成相应的响应消息。 ### 服务(Service) 是使智能体能够执行特定任务的功能性API。在AgentScope中,服务分为模型API服务 (用于使用大语言模型)和通用API服务(提供各种工具函数)。 ### 工作流(Workflow) 表示智能体执行和智能体之间的消息交换的有序序列,类似于TensorFlow中的计算图, 但其并不一定是DAG结构。 ## 为什么选择AgentScope? **面向开发者的易用性。** AgentScope为开发者提供了高易用性,包括灵活易用的语法糖、即拿即用的组件和预构建的multi-agent样例。 **可靠稳定的容错机制。** AgentScope确保了对多种模型和APIs的容错性,并允许开发者构建定制的容错策略。 **全面兼容多模态数据。** AgentScope支持多模态数据(例如文件、图像、音频和视频)的对话展示、消息传输和数据存储。 **高效分布式运行效率。** AgentScope引入了基于actor的分布式机制,使得复杂的分布式工作流的集中式编程和自动并行优化成为可能。 ## AgentScope是如何设计的? AgentScope由三个层次的层次结构组成。 这些层次提供了对多智能体应用程序的支持,包括单个智能体的基本和高级功能(实用程序层)、资源和运行时管理(管理器和包装层)以及智能体级到工作流级的编程接口(智能体层)。 AgentScope引入了直观的抽象,旨在满足每个层次固有的多样化功能,并简化构建多智能体系统时的复杂层间依赖关系。 此外,我们提供了编程接口和默认机制,以增强多智能体系统在不同层次上对故障的韧性。 ## AgentScope代码结构 ```bash AgentScope ├── src │ ├── agentscope │ | ├── agents # 与智能体相关的核心组件和实现。 │ | ├── memory # 智能体记忆相关的结构。 │ | ├── models # 用于集成不同模型API的接口。 │ | ├── pipelines # 基础组件和实现,用于运行工作流。 │ | ├── rpc # Rpc模块,用于智能体分布式部署。 │ | ├── service # 为智能体提供各种功能的服务。 | | ├── web # 基于网页的用户交互界面。 │ | ├── utils # 辅助工具和帮助函数。 │ | ├── prompt.py # 提示工程模块。 │ | ├── message.py # 智能体之间消息传递的定义和实现。 │ | ├── ... .. │ | ├── ... .. ├── scripts # 用于启动本地模型API的脚本。 ├── examples # 不同应用程序的预构建示例。 ├── docs # 教程和API参考文档。 ├── tests # 单元测试模块,用于持续集成。 ├── LICENSE # AgentScope使用的官方许可协议。 └── setup.py # 用于安装的设置脚本。 ├── ... .. └── ... .. 
``` [[返回顶端]](#101-agentscope-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/102-installation.md: ```md (102-installation-zh)= # 安装 为了安装AgentScope,您需要安装Python 3.9或更高版本。我们建议专门为AgentScope设置一个新的虚拟环境: ## 创建虚拟环境 ### 使用Conda 如果您使用Conda作为环境管理工具,您可以使用以下命令创建一个新的Python 3.9虚拟环境: ```bash # 使用Python 3.9创建一个名为"agentscope"的新虚拟环境 conda create -n agentscope python=3.9 # 激活虚拟环境 conda activate agentscope ``` ### 使用Virtualenv 如果您使用`virtualenv`,您可以首先安装它(如果尚未安装),然后按照以下步骤创建一个新的虚拟环境: ```bash # 如果尚未安装virtualenv,请先安装它 pip install virtualenv # 使用Python 3.9创建一个名为"agentscope"的新虚拟环境 virtualenv agentscope --python=python3.9 # 激活虚拟环境 source agentscope/bin/activate # 在Windows上使用`agentscope\Scripts\activate` ``` ## 安装AgentScope ### 从源码安装 按照以下步骤从源代码安装AgentScope,并以可编辑模式安装AgentScope: **_注意:该项目正在积极开发中,建议从源码安装AgentScope!_** ```bash # 从GitHub上拉取AgentScope的源代码 git clone https://github.com/modelscope/agentscope.git cd agentscope # 针对本地化的multi-agent应用 pip install -e . # 为分布式multi-agent应用 pip install -e .[distribute] # 在Mac上使用`pip install -e .\[distribute\]` ``` **注意**:`[distribute]`选项安装了分布式应用程序所需的额外依赖项。在运行这些命令之前,请激活您的虚拟环境。 ### 使用Pip安装 如果您选择从Pypi安装AgentScope,可以使用`pip`轻松地完成: ```bash # 针对本地化的multi-agent应用 pip install agentscope --pre # 为分布式multi-agent应用 pip install agentscope[distribute] --pre # 在Mac上使用`pip install agentscope\[distribute\] --pre` ``` [[返回顶端]](#102-installation-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/103-example.md: ```md (103-example-zh)= # 快速开始 AgentScope内置了灵活的通信机制。在本教程中,我们将通过一个简单的独立对话示例介绍AgentScope的基本用法。 ## 第一步:准备模型 为了更好的构建多智能体应用,AgentScope将模型的部署与调用解耦开,以API服务调用的方式支持各种不同的模型。 在模型部署方面,用户可以使用第三方模型服务,例如OpenAI API,Google Gemini API, HuggingFace/ModelScope Inference API等,或者也可以通过AgentScope仓库中的[脚本](https://github.com/modelscope/agentscope/blob/main/scripts/README.md)快速部署本地开源模型服务, 模型调用方面,用户需要通过设定模型配置来指定模型服务。以OpenAI Chat API为例,需要准备如下的模型配置: ```python model_config = { "config_name": "{config_name}", # A unique name for the model config. "model_type": "openai_chat", # Choose from "openai_chat", "openai_dall_e", or "openai_embedding". "model_name": "{model_name}", # The model identifier used in the OpenAI API, such as "gpt-3.5-turbo", "gpt-4", or "text-embedding-ada-002". "api_key": "xxx", # Your OpenAI API key. If unset, the environment variable OPENAI_API_KEY is used. "organization": "xxx", # Your OpenAI organization ID. If unset, the environment variable OPENAI_ORGANIZATION is used. } ``` 更多关于模型调用,部署和开源模型的信息请见[模型](203-model-zh)章节。 准备好模型配置后,用户可以通过调用AgentScope的初始化方法`init`函数来注册您的配置。此外,您还可以一次性加载多个模型配置。 ```python import agentscope # 一次性初始化多个模型配置 openai_cfg_dict = { # ... } modelscope_cfg_dict = { # ... 
} agentscope.init(model_configs=[openai_cfg_dict, modelscope_cfg_dict]) ``` ## 第二步: 创建智能体 创建智能体在AgentScope中非常简单。在初始化AgentScope时,您可以使用模型配置初始化AgentScope,然后定义每个智能体及其对应的角色和特定模型。 ```python import agentscope from agentscope.agents import DialogAgent, UserAgent # 读取模型配置 agentscope.init(model_configs="./model_configs.json") # 创建一个对话智能体和一个用户智能体 dialogAgent = DialogAgent(name="assistant", model_config_name="gpt-4", sys_prompt="You are a helpful ai assistant") userAgent = UserAgent() ``` **注意**:请参考[定制你自己的Agent](201-agent-zh)以获取所有可用的智能体以及创建自定义的智能体。 ## 第三步:智能体对话 消息(Message)是AgentScope中智能体之间的主要通信手段。 它是一个Python字典,包括了一些基本字段,如消息的`content`和消息发送者的`name`。可选地,消息可以包括一个`url`,指向本地文件(图像、视频或音频)或网站。 ```python from agentscope.message import Msg # 来自Alice的简单文本消息示例 message_from_alice = Msg("Alice", "Hi!") # 来自Bob的带有附加图像的消息示例 message_from_bob = Msg("Bob", "What about this picture I took?", url="/path/to/picture.jpg") ``` 为了在两个智能体之间开始对话,例如`dialog_agent`和`user_agent`,您可以使用以下循环。对话将持续进行,直到用户输入`"exit"`,这将终止交互。 ```python x = None while True: x = dialogAgent(x) x = userAgent(x) # 如果用户输入"exit",则终止对话 if x.content == "exit": print("Exiting the conversation.") break ``` 进阶的使用中,AgentScope提供了Pipeline来管理智能体之间消息流的选项。 其中`sequentialpipeline`代表顺序对话,每个智能体从上一个智能体接收消息并生成其响应。 ```python from agentscope.pipelines.functional import sequentialpipeline # 在Pipeline结构中执行对话循环 x = None while x is None or x.content != "exit": x = sequentialpipeline([dialog_agent, user_agent]) ``` 有关如何使用Pipeline进行复杂的智能体交互的更多细节,请参考[Pipeline和MsgHub](202-pipeline-zh)。 [[返回顶部]](#103-example-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/104-usecase.md: ```md (104-usecase-zh)= # 样例:狼人杀游戏 img **狼人杀**是非常具有代表性的,具有复杂SOP流程的游戏。游戏中,玩家分别扮演狼人和村民的角色进行对抗,其中一些村民(例如预言家和女巫)还有特殊的技能。当狼人被全部杀死后村民取得胜利;而当狼人的数量等于村民的数量时即狼人获得胜利。 我们将利用 AgentScope 构建一个狼人杀游戏,用 Agent 来扮演不同的角色进行互动,并推动游戏的进行。 完整的样例代码可以在GitHub仓库的[examples/game_werewolf](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf)找到,这里我们将介绍构建狼人杀游戏中的几个关键步骤。 ## 开始 首先,确保您已经正确安装和配置好AgentScope。除此之外,本节内容会涉及到`Model API`, `Agent`, `Msg`和`Pipeline`这几个概念(详情可以参考[关于AgentScope](101-agentscope.md))。以下是本节教程内容概览。 **提示**:本教程中的所有配置和代码文件均可以在`examples/game_werewolf`中找到。 ### 第一步: 准备模型API和设定模型配置 就像我们在上一节教程中展示的,您需要为了您选择的OpenAI chat API, FastChat, 或vllm准备一个JSON样式的模型配置文件。更多细节和高阶用法,比如用POST API配置本地模型,可以参考[关于模型](203-model.md)。 ```json [ { "config_name": "gpt-4-temperature-0.0", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", "organization": "xxx", "generate_args": { "temperature": 0.0 } } ] ``` ### 第二步:定义每个智能体(Agent)的角色 在狼人杀游戏中,不同智能体会扮演不同角色;不同角色的智能体也有不同的能力和目标。下面便是我们大概归纳 - 普通村民:普通的村民,没有特殊能力,只是寻求生存到最后。 - 狼人:伪装成村民的掠夺者,目标是比村民活得更久并杀死村民们。 - 预言家:一位拥有每晚看到一名玩家真实身份能力的村民。 - 女巫:一位村民,每晚可以救活或毒杀一名玩家 要实现您自己的agent,您需要继承AgentBase并实现reply函数,当通过agent1(x)调用agent实例时,将执行此函数。 ```python from agentscope.agents import AgentBase from agentscope.message import Msg from typing import Optional, Union, Sequence class MyAgent(AgentBase): def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # Do something here ... return x ``` AgentScope提供了几种开箱即用的agent实现,作为一个agent样例池。在这个应用程序中,我们使用一个内置agent,DictDialogAgent。这里我们给出一个将玩家分配为狼人角色的DictDialogAgent的示例配置: ```json { "class": "DictDialogAgent", "args": { "name": "Player1", "sys_prompt": "Act as a player in a werewolf game. You are Player1 and\nthere are totally 6 players, named Player1, Player2, Player3, Player4, Player5 and Player6.\n\nPLAYER ROLES:\nIn werewolf game, players are divided into two werewolves, two villagers, one seer, and one witch. 
Note only werewolves know who are their teammates.\nWerewolves: They know their teammates' identities and attempt to eliminate a villager each night while trying to remain undetected.\nVillagers: They do not know who the werewolves are and must work together during the day to deduce who the werewolves might be and vote to eliminate them.\nSeer: A villager with the ability to learn the true identity of one player each night. This role is crucial for the villagers to gain information.\nWitch: A character who has a one-time ability to save a player from being eliminated at night (sometimes this is a potion of life) and a one-time ability to eliminate a player at night (a potion of death).\n\nGAME RULE:\nThe game consists of two phases: night phase and day phase. The two phases are repeated until werewolf or villager wins the game.\n1. Night Phase: During the night, the werewolves discuss and vote for a player to eliminate. Special roles also perform their actions at this time (e.g., the Seer chooses a player to learn their role, the witch chooses a decide if save the player).\n2. Day Phase: During the day, all surviving players discuss who they suspect might be a werewolf. No one reveals their role unless it serves a strategic purpose. After the discussion, a vote is taken, and the player with the most votes is \"lynched\" or eliminated from the game.\n\nVICTORY CONDITION:\nFor werewolves, they win the game if the number of werewolves is equal to or greater than the number of remaining villagers.\nFor villagers, they win if they identify and eliminate all of the werewolves in the group.\n\nCONSTRAINTS:\n1. Your response should be in the first person.\n2. This is a conversational game. You should respond only based on the conversation history and your strategy.\n\nYou are playing werewolf in this game.\n", "model_config_name": "gpt-3.5-turbo", "use_memory": true } } ``` 在这个配置中,Player1被指定为一个DictDialogAgent。参数包括一个系统提示(sys_prompt),它可以指导agent的行为;一个模型配置名(model_config_name),它决定了模型配置的名称;以及一个标志(use_memory),指示agent是否应该记住过去的互动。 对于其他玩家,大家可以根据他们的角色进行定制。每个角色可能有不同的提示、模型或记忆设置。您可以参考位于AgentScope示例目录下的`examples/game_werewolf/configs/agent_configs.json`文件。 ### 第三步:初始化AgentScope和Agents 现在我们已经定义了角色,我们可以初始化AgentScope环境和所有agents。这个过程通过AgentScope的几行代码和我们准备的配置文件(假设有2个狼人、2个村民、1个女巫和1个预言家)就能简单完成: ```python import agentscope # read model and agent configs, and initialize agents automatically survivors = agentscope.init( model_configs="./configs/model_configs.json", agent_configs="./configs/agent_configs.json", logger_level="DEBUG", ) # Define the roles within the game. This list should match the order and number # of agents specified in the 'agent_configs.json' file. roles = ["werewolf", "werewolf", "villager", "villager", "seer", "witch"] # Based on their roles, assign the initialized agents to variables. # This helps us reference them easily in the game logic. wolves, villagers, witch, seer = survivors[:2], survivors[2:-2], survivors[-1], survivors[-2] ``` 上面这段代码中,我们为我们的agent分配了角色,并将它们与决定它们行为的配置相关联。 ### 第四步:构建游戏逻辑 在这一步中,您将使用AgentScope的辅助工具设置游戏逻辑,并组织狼人游戏的流程。 #### 使用 Parser 为了能让 `DictDialogAgent` 能够按照用户定制化的字段进行输出,以及增加大模型解析不同字段内容的成功率,我们新增了 `parser` 模块。下面是一个 `parser` 例子的配置: ``` to_wolves_vote = "Which player do you vote to kill?" 
wolves_vote_parser = MarkdownJsonDictParser( content_hint={ "thought": "what you thought", "vote": "player_name", }, required_keys=["thought", "vote"], keys_to_memory="vote", keys_to_content="vote", ) ``` 关于 `parser` 的更多内容,可以参考[这里](https://modelscope.github.io/agentscope/en/tutorial/203-parser.html). #### 使用 Pipeline 和 MsgHub 为了简化agent通信的构建,AgentScope提供了两个有用的概念:Pipeline和MsgHub。 - **Pipeline**:它能让用户轻松地编程实现agent之间的不同通信编排。 ```python from agentscope.pipelines import SequentialPipeline pipe = SequentialPipeline(agent1, agent2, agent3) x = pipe(x) # the message x will be passed and replied by agent 1,2,3 in order ``` - **MsgHub**:您可能已经注意到,上述所有例子都是一对一通信。为了实现群聊,我们提供了另一个通信辅助工具msghub。有了它,参与者的消息将自动广播给所有其他参与者。在这种情况下,参与agent甚至不需要输入和输出消息。我们需要做的只是决定发言的顺序。此外,msghub还支持参与者的动态控制。 ```python with msghub(participants=[agent1, agent2, agent3]) as hub: agent1() agent2() # Broadcast a message to all participants hub.broadcast(Msg("Host", "Welcome to join the group chat!")) # Add or delete participants dynamically hub.delete(agent1) hub.add(agent4) ``` #### 实现狼人杀的游戏流程 游戏逻辑分为两个主要阶段:(1)夜晚,狼人行动;以及(2)白天,所有玩家讨论和投票。每个阶段都将通过使用pipelines来管理多agent通信的代码部分来处理。 - **1.1 夜晚阶段:狼人讨论和投票** 在夜晚阶段,狼人必须相互讨论以决定一个要杀死的目标。msghub函数为狼人之间的通信创建了一个消息中心,其中每个agent发送的消息都能被msghub内的所有其他agent观察到。 ```python # start the game for i in range(1, MAX_GAME_ROUND + 1): # Night phase: werewolves discuss hint = HostMsg(content=Prompts.to_wolves.format(n2s(wolves))) with msghub(wolves, announcement=hint) as hub: set_parsers(wolves, Prompts.wolves_discuss_parser) for _ in range(MAX_WEREWOLF_DISCUSSION_ROUND): x = sequentialpipeline(wolves) if x.metadata.get("finish_discussion", False): break ``` 讨论结束后,根据少数服从多数,狼人进行投票选出他们的目标。然后,投票的结果将广播给所有狼人。 注意:具体的提示和实用函数可以在`examples/game_werewolf`中找到。 ```python # werewolves vote set_parsers(wolves, Prompts.wolves_vote_parser) hint = HostMsg(content=Prompts.to_wolves_vote) votes = [extract_name_and_id(wolf(hint).content)[0] for wolf in wolves] # broadcast the result to werewolves dead_player = [majority_vote(votes)] hub.broadcast( HostMsg(content=Prompts.to_wolves_res.format(dead_player[0])), ) ``` - **1.2 女巫的回合** 如果女巫还活着,她就有机会使用她的力量:救被狼人选中的(被杀的)玩家,或使用她的毒药去杀一位玩家。 ```python # Witch's turn healing_used_tonight = False if witch in survivors: if healing: # Witch decides whether to use the healing potion hint = HostMsg( content=Prompts.to_witch_resurrect.format_map( {"witch_name": witch.name, "dead_name": dead_player[0]}, ), ) # Witch decides whether to use the poison set_parsers(witch, Prompts.witch_resurrect_parser) if witch(hint).metadata.get("resurrect", False): healing_used_tonight = True dead_player.pop() healing = False ``` - **1.3 预言家的回合** 预言家有机会揭示一名玩家的真实身份。这信息对于村民方来说可能至关重要。`observe()`函数允许每个agent注意到一个消息,而不需要立即产生回复。 ```python # Seer's turn if seer in survivors: # Seer chooses a player to reveal their identity hint = HostMsg( content=Prompts.to_seer.format(seer.name, n2s(survivors)), ) set_parsers(seer, Prompts.seer_parser) x = seer(hint) player, idx = extract_name_and_id(x.content) role = "werewolf" if roles[idx] == "werewolf" else "villager" hint = HostMsg(content=Prompts.to_seer_result.format(player, role)) seer.observe(hint) ``` - **1.4 更新存活玩家** 根据夜间采取的行动,程序需要更新幸存玩家的列表。 ```python # Update the list of survivors and werewolves after the night's events survivors, wolves = update_alive_players(survivors, wolves, dead_player) ``` - **2.1 白天阶段:讨论和投票** 在白天,所有存活玩家将讨论然后投票以淘汰一名疑似狼人的玩家。 ```python # Daytime discussion with msghub(survivors, announcement=hints) as hub: # Discuss set_parsers(survivors, Prompts.survivors_discuss_parser) x 
= sequentialpipeline(survivors) # Vote set_parsers(survivors, Prompts.survivors_vote_parser) hint = HostMsg(content=Prompts.to_all_vote.format(n2s(survivors))) votes = [extract_name_and_id(_(hint).content)[0] for _ in survivors] vote_res = majority_vote(votes) # Broadcast the voting result to all players result = HostMsg(content=Prompts.to_all_res.format(vote_res)) hub.broadcast(result) # Update the list of survivors and werewolves after the vote survivors, wolves = update_alive_players(survivors, wolves, vote_res) ``` - **2.2 检查胜利条件** 每个阶段结束后,游戏会检查是狼人还是村民获胜。 ```python # Check if either side has won if check_winning(survivors, wolves, "Moderator"): break ``` - **2.3 继续到下一轮** 如果狼人和村民都没有获胜,游戏将继续到下一轮。 ```python # If the game hasn't ended, prepare for the next round hub.broadcast(HostMsg(content=Prompts.to_all_continue)) ``` 这些代码块展现了使用AgentScope的`msghub`和`pipeline`的狼人游戏的核心游戏循环,这些工具有助于轻松管理应用程序的操作逻辑。 ### 第五步:运行应用 完成了以上游戏逻辑和agent的设置,您已经可以运行狼人游戏了。通过执行`pipeline`,游戏将按预定义的阶段进行,agents 基于它们的角色和上述编码的策略进行互动: ```bash cd examples/game_werewolf python main.py # Assuming the pipeline is implemented in main.py ``` 建议您在在 [AgentScope Studio](https://modelscope.github.io/agentscope/zh_CN/tutorial/209-gui.html) 中启动游戏,在对应的链接中您将看到下面的内容输出。 ![s](https://img.alicdn.com/imgextra/i3/O1CN01n2Q2tR1aCFD2gpTdu_!!6000000003293-1-tps-960-482.gif) [[返回顶部]](#104-usecase-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/105-logging.md: ```md (105-logging-zh)= # 日志 本节教程主要是关于AgentScope的日志记录(logging)功能。我们会介绍如何能美观地将这些日志可视化。这个模块会帮助您更方便、清晰、有组织地跟踪智能体之间的互动和各种系统消息。 ## Logging 日志功能首先包含的是一个基于Python内置 `logging`的根据多智体场景可定制化的`loguru.logger`模块。其包含下面的一些特性: - **调整输出字体颜色**:为了增加日志的可读性,该模块为不同的在对话中发言智能体提供不同颜色的字体高亮。 - **重定向错误输出(stderr)**: 该模块自动抓取报错信息,在日志中用`ERROR`层级记录。 - **客制化日志记录等级**: 该模块增加了一个日志记录等级`CHAT`,用来记录智能体之间的对话和互动。 - **定制格式**:格式化日志包含了时间戳、记录等级、function名字和行号。智能体之间的对话会用不同的格式显示。 ### 设置日志记录(Logger) 我们推荐通过`agentscope.init`来设置logger,包括设定记录等级: ```python import agentscope LOG_LEVEL = Literal[ "CHAT", "TRACE", "DEBUG", "INFO", "SUCCESS", "WARNING", "ERROR", "CRITICAL", ] agentscope.init(..., logger_level="INFO") ``` ### Logging a Chat Message ### 记录对话消息 开发者可以通过记录`message`来追踪智能体之间的对话。下面是一些简单的如何记录`message`的例子例子: ```python # Log a simple string message. logger.chat("Hello World!") # Log a `msg` representing dialogue with a speaker and content. logger.chat({"name": "User", "content": "Hello, how are you?"}) logger.chat({"name": "Agent", "content": "I'm fine, thank you!"}) ``` ### 记录系统信息 系统日志对于跟踪应用程序的状态和识别问题至关重要。以下是记录不同级别系统信息的方法: ```python # Log general information useful for understanding the flow of the application. logger.info("The dialogue agent has started successfully.") # Log a warning message indicating a potential issue that isn't immediately problematic. logger.warning("The agent is running slower than expected.") # Log an error message when something has gone wrong. 
logger.error("The agent encountered an unexpected error while processing a request.") ``` ## 将日志与WebUI集成 为了可视化这些日志和运行细节,AgentScope提供了一个简单的网络界面。 ### 快速运行 你可以用以下Python代码中运行WebUI: ```python import agentscope agentscope.web.init( path_save="YOUR_SAVE_PATH" ) ``` 通过这种方式,你可以在 `http://127.0.0.1:5000` 中看到所有运行中的实例和项目,如下所示 ![webui](https://img.alicdn.com/imgextra/i3/O1CN01kpHFkn1HpeYEkn60I_!!6000000000807-0-tps-3104-1849.jpg) 通过点击一个运行中的实例,我们可以观察到更多细节。 ![The running details](https://img.alicdn.com/imgextra/i2/O1CN01AZtsf31MIHm4FmjjO_!!6000000001411-0-tps-3104-1849.jpg) ### 注意 WebUI仍在开发中。我们将在未来提供更多功能和更好的用户体验。 [[返回顶部]](#105-logging-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/201-agent.md: ```md (201-agent-zh)= # Agent 本教程帮助你更深入地理解Agent,并引导你通过使用AgentScope定制自己的自定义agent。 我们首先介绍一个称为AgentBase的基本抽象概念,它作为基类维护所有agent的通用行为。然后,我们将探讨AgentPool,这是一个由预构建的、专门化的agent组成的集合,每个agent都设计有特定的目的。最后,我们将演示如何定制你自己的agent,确保它符合你项目的需求。 ## 理解 `AgentBase` `AgentBase`类是AgentScope内所有agent结构的架构基石。作为所有自定义agent的超类,它提供了一个包含基本属性和方法的综合模板,这些属性和方法支撑了任何会话agent的核心功能。 每个AgentBase的派生类由几个关键特性组成: * `memory`(记忆):这个属性使agent能够保留和回忆过去的互动,允许它们在持续的对话中保持上下文。关于memory的更多细节,我们会在[记忆和消息管理部分](205-memory)讨论。 * `model`(模型):模型是agent的计算引擎,负责根据现有的记忆和输入做出响应。关于model的更多细节,我们在[使用模型API与不同模型源部分](203-model)讨论 * `sys_prompt`(系统提示)和`engine`(引擎):系统提示作为预定义的指令,指导agent在其互动中的行为;而engine用于动态生成合适的提示。关于它们的更多细节,我们会在[提示引擎部分](206-prompt)讨论。 * `to_dist`(分布式):用于创建 agent 的分布式版本,以支持多 agent 的高效协作。请注意`to_dist`是一个保留字段,将自动添加到`AgentBase`所有子类的初始化函数中。关于 `to_dist` 的更多细节,请见[分布式部分](208-distribute)。 除了这些属性,`AgentBase` 还为agent提供了一些关键方法,如 `observe` 和 `reply`: * `observe()`:通过这个方法,一个agent可以注意到消息而不立即回复,允许它根据观察到的消息更新它的记忆。 * `reply()`:这是开发者必须实现的主要方法。它定义了agent对于传入消息的响应行为,封装了agent输出的逻辑。 此外,为了统一接口和类型提示,我们引入了另一个基类`Operator`,它通过 `__call__` 函数表示对输入数据执行某些操作。并且我们让 `AgentBase` 成为 `Operator` 的一个子类。 ```python class AgentBase(Operator): # ... [code omitted for brevity] def __init__( self, name: str, sys_prompt: Optional[str] = None, model_config_name: str = None, use_memory: bool = True, memory_config: Optional[dict] = None, ) -> None: # ... [code omitted for brevity] def observe(self, x: Union[dict, Sequence[dict]]) -> None: # An optional method for updating the agent's internal state based on # messages it has observed. This method can be used to enrich the # agent's understanding and memory without producing an immediate # response. if self.memory: self.memory.add(x) def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # The core method to be implemented by custom agents. It defines the # logic for processing an input message and generating a suitable # response. raise NotImplementedError( f"Agent [{type(self).__name__}] is missing the required " f'"reply" function.', ) # ... 
[code omitted for brevity] ``` ## 探索AgentPool 在 AgentScope 中的 `AgentPool` 是一个经过精选的,随时可用的,专门化agent集合。这些agent中的每一个都是为了特定的角色量身定做,并配备了处理特定任务的默认行为。`AgentPool` 旨在通过提供各种 Agent 模板来加快开发过程。 以下是一个总结了 AgentPool 中一些关键agent的功能的表格: | Agent 种类 | 描述 | Typical Use Cases | | ------------------ | --------------------------------------------------------------------------- | --------------------------- | | `AgentBase` | 作为所有agent的超类,提供了必要的属性和方法。 | 构建任何自定义agent的基础。 | | `DialogAgent` | 通过理解上下文和生成连贯的响应来管理对话。 | 客户服务机器人,虚拟助手。 | | `DictDialogAgent` | 通过理解上下文和生成连贯的响应来管理对话,返回的消息为 Json 格式。 | 客户服务机器人,虚拟助手。 | | `UserAgent` | 与用户互动以收集输入,生成可能包括URL或基于所需键的额外具体信息的消息。 | 为agent收集用户输入 | | `TextToImageAgent` | 将用户输入的文本转化为图片 | 提供文生图功能 | | `ReActAgent` | 实现了 ReAct 算法的 Agent,能够自动调用工具处理较为复杂的任务。 | 借助工具解决复杂任务 | | *更多agent* | AgentScope 正在不断扩大agent池,加入更多专门化的agent,以适应多样化的应用。 | | ## 从Agent池中定制Agent 从 AgentPool 中定制一个agent,使您能够根据您的多agent应用的独特需求来调整其功能。您可以通过调整配置和提示来轻松修改现有agent,或者,对于更广泛的定制,您可以进行二次开发 下面,我们提供了如何配置来自 AgentPool 的各种agent的用法: ### `DialogAgent` * **回复方法**:`reply` 方法是处理输入消息和生成响应的主要逻辑所在 ```python def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # Additional processing steps can occur here # Record the input if needed if self.memory: self.memory.add(x) # Generate a prompt for the language model using the system prompt and memory prompt = self.model.format( Msg("system", self.sys_prompt, role="system"), self.memory and self.memory.get_memory() or x, # type: ignore[arg-type] ) # Invoke the language model with the prepared prompt response = self.model(prompt).text #Format the response and create a message object msg = Msg(self.name, response, role="assistant") # Print/speak the message in this agent's voice self.speak(msg) # Record the message to memory and return it if self.memory: self.memory.add(msg) return msg ``` * **用法**:为了定制一个用于客户服务机器人的 `DialogAgent`: ```python from agentscope.agents import DialogAgent # Configuration for the DialogAgent dialog_agent_config = { "name": "ServiceBot", "model_config_name": "gpt-3.5", # Specify the model used for dialogue generation "sys_prompt": "Act as AI assistant to interact with the others. 
Try to " "reponse on one line.\n", # Custom prompt for the agent # Other configurations specific to the DialogAgent } # Create and configure the DialogAgent service_bot = DialogAgent(**dialog_agent_config) ``` ### `UserAgent` * **回复方法**:这个方法通过提示内容以及在需要时附加的键和URL来处理用户输入。收集到的数据存储在agent记忆中的一个message对象里,用于记录或稍后使用,并返回该message作为响应。 ```python def reply( self, x: Optional[Union[Msg, Sequence[Msg]]] = None, required_keys: Optional[Union[list[str], str]] = None, ) -> Msg: # Check if there is initial data to be added to memory if self.memory: self.memory.add(x) content = input(f"{self.name}: ") # Prompt the user for input kwargs = {} # Prompt for additional information based on the required keys if required_keys is not None: if isinstance(required_keys, str): required_keys = [required_keys] for key in required_keys: kwargs[key] = input(f"{key}: ") # Optionally prompt for a URL if required url = None if self.require_url: url = input("URL: ") # Create a message object with the collected input and additional details msg = Msg(self.name, content=content, url=url, **kwargs) # Add the message object to memory if self.memory: self.memory.add(msg) return msg ``` * **用法**:配置一个 UserAgent 用于收集用户输入和URL(文件、图像、视频、音频或网站的URL): ```python from agentscope.agents import UserAgent # Configuration for UserAgent user_agent_config = { "name": "User", "require_url": True, # If true, the agent will require a URL } # Create and configure the UserAgent user_proxy_agent = UserAgent(**user_agent_config) ``` [[返回顶部]](#201-agent-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/202-pipeline.md: ```md (202-pipeline-zh)= # Pipeline和MsgHub **Pipeline**和**Message Hub**主要用于描绘应用中信息的交换和传播过程,它们极大简化了Multi-Agent应用流程的编排工作。 在本教程中,我们将详细的介绍Pipeline和Message Hub的原理和使用方式。 ## Pipeline 在AgentScope中,消息的交换、传播构成了Multi-Agent应用。但是对复杂应用来说,细致的描绘每一次信息交流对开发者来说是非常困难的。 `Pipeline`主要用于简化“描述消息传播”的编程工作。 `Pipeline`中接收的对象是`Operator`,即信息的加工和传播单元(例如智能体`Agent`是`Operator `的一个子类),而`Pipeline`自身也是`Operator`的子类。以下是所有`Pipeline`的基类: ```python class PipelineBase(Operator): """所有pipelines的基础接口.""" # ... 
[为简洁起见省略代码] @abstractmethod def __call__(self, x: Optional[dict] = None) -> dict: """在这定义pipeline采取的操作。 Args: x (Optional[`dict`], optional): 对话历史以及一些环境信息。 Returns: `dict`: 经过Pipeline处理后的返回消息。 """ ``` ### 类别 为了方便开发者的使用,对于同一功能的Pipeline,AgentScope提供了两种不同的实现策略: * **对象类型Pipeline** * 这些Pipeline是面向对象的,继承自 `PipelineBase`。它们本身是`Operator`,可以与其他运算符组合以创建复杂的交互模式,并且可以复用。 ```python # 实例化并调用 pipeline = ClsPipeline([agent1, agent2, agent3]) x = pipeline(x) ``` * **函数式Pipeline** * 函数式Pipeline是独立的函数实现,在不需要复用的一次性使用场景中很有用。 ```python # 只需要调用 x = funcpipeline([agent1, agent2, agent3], x) ``` Pipeline根据其功能被分类成以下的类型。下表概述了 AgentScope 中可用的不同 Pipeline: | 运算符类型Pipeline | 函数式Pipeline | 描述 | | -------------------- | ------------------- | ------------------------------------------------------------ | | `SequentialPipeline` | `sequentialpipeline` | 按顺序执行一系列运算符,将一个运算符的输出作为下一个运算符的输入。 | | `IfElsePipeline` | `ifelsepipeline` | 实现条件逻辑,如果条件为真,则执行一个运算符;如果条件为假,则执行另一个运算符。 | | `SwitchPipeline` | `switchpipeline` | 实现分支选择,根据条件的结果从映射集中执行一个运算符。 | | `ForLoopPipeline` | `forlooppipeline` | 重复执行一个运算符,要么达到设定的迭代次数,要么直到满足指定的中止条件。 | | `WhileLoopPipeline` | `whilelooppipeline` | 只要给定条件保持为真,就持续执行一个运算符。 | | - | `placeholder` | 在流控制中不需要任何操作的分支,如 if-else/switch 中充当占位符。 | ### 使用说明 本节通过比较有无 Pipeline 的情况下多智能体应用程序中逻辑实现的方式,来阐释 Pipeline 如何简化逻辑实现。 **注意:** 请注意,在下面提供的示例中,我们使用术语 `agent` 来代表任何可以作为 `Operator` 的实例。这是为了便于理解,并说明 Pipeline 是如何协调不同操作之间的交互的。您可以将 `agent` 替换为任何 `Operator`,从而在实践中允许 `agent` 和 `pipeline` 的混合使用。 #### `SequentialPipeline` * 不使用 pipeline: ```python x = agent1(x) x = agent2(x) x = agent3(x) ``` * 使用 pipeline: ```python from agentscope.pipelines import SequentialPipeline pipe = SequentialPipeline([agent1, agent2, agent3]) x = pipe(x) ``` * 使用函数式 pipeline: ```python from agentscope.pipelines import sequentialpipeline x = sequentialpipeline([agent1, agent2, agent3], x) ``` #### `IfElsePipeline` * 不使用 pipeline: ```python if condition(x): x = agent1(x) else: x = agent2(x) ``` * 使用 pipeline: ```python from agentscope.pipelines import IfElsePipeline pipe = IfElsePipeline(condition, agent1, agent2) x = pipe(x) ``` * 使用函数式 pipeline: ```python from agentscope.functional import ifelsepipeline x = ifelsepipeline(condition, agent1, agent2, x) ``` #### `SwitchPipeline` * 不使用 pipeline: ```python switch_result = condition(x) if switch_result == case1: x = agent1(x) elif switch_result == case2: x = agent2(x) else: x = default_agent(x) ``` * 使用 pipeline: ```python from agentscope.pipelines import SwitchPipeline case_operators = {case1: agent1, case2: agent2} pipe = SwitchPipeline(condition, case_operators, default_agent) x = pipe(x) ``` * 使用函数式 pipeline: ```python from agentscope.functional import switchpipeline case_operators = {case1: agent1, case2: agent2} x = switchpipeline(condition, case_operators, default_agent, x) ``` #### `ForLoopPipeline` * 不使用 pipeline: ```python for i in range(max_iterations): x = agent(x) if break_condition(x): break ``` * 使用 pipeline: ```python from agentscope.pipelines import ForLoopPipeline pipe = ForLoopPipeline(agent, max_iterations, break_condition) x = pipe(x) ``` * 使用函数式 pipeline: ```python from agentscope.functional import forlooppipeline x = forlooppipeline(agent, max_iterations, break_condition, x) ``` #### `WhileLoopPipeline` * 不使用 pipeline: ```python while condition(x): x = agent(x) ``` * 使用 pipeline: ```python from agentscope.pipelines import WhileLoopPipeline pipe = WhileLoopPipeline(agent, condition) x = pipe(x) ``` * 使用函数式 pipeline: ```python from agentscope.functional import whilelooppipeline x = 
whilelooppipeline(agent, condition, x) ``` ### Pipeline 组合 值得注意的是,AgentScope 支持组合 Pipeline 来创建复杂的交互。例如,我们可以创建一个 Pipeline,按顺序执行一系列智能体,然后执行另一个 Pipeline,根据条件执行一系列智能体。 ```python from agentscope.pipelines import SequentialPipeline, IfElsePipeline # 创建一个按顺序执行智能体的 Pipeline pipe1 = SequentialPipeline([agent1, agent2, agent3]) # 创建一个条件执行智能体的 Pipeline pipe2 = IfElsePipeline(condition, agent4, agent5) # 创建一个按顺序执行 pipe1 和 pipe2 的 Pipeline pipe3 = SequentialPipeline([pipe1, pipe2]) # 调用 Pipeline x = pipe3(x) ``` ## MsgHub `MsgHub` 旨在管理一组智能体之间的对话/群聊,其中允许共享消息。通过 `MsgHub`,智能体可以使用 `broadcast` 向群组中的所有其他智能体广播消息。 以下是 `MsgHub` 的核心类: ```python class MsgHubManager: """MsgHub 管理类,用于在一组智能体之间共享对话。""" # ... [为简洁起见省略代码] def broadcast(self, msg: Union[dict, list[dict]]) -> None: """将消息广播给所有参与者。""" for agent in self.participants: agent.observe(msg) def add(self, new_participant: Union[Sequence[AgentBase], AgentBase]) -> None: """将新参与者加入此 hub""" # ... [为简洁起见省略代码] def delete(self, participant: Union[Sequence[AgentBase], AgentBase]) -> None: """从参与者中删除智能体。""" # ... [为简洁起见省略代码] ``` ### 使用说明 #### 创建一个 MsgHub 要创建一个 `MsgHub`,请通过调用 `msghub` 辅助函数并传入参与智能体列表来实例化一个 `MsgHubManager`。此外,您可以提供一个可选的初始声明`announcement`,如果提供,将在初始化时广播给所有参与者。 ```python from agentscope.msg_hub import msghub # Initialize MsgHub with participating agents hub_manager = msghub( participants=[agent1, agent2, agent3], announcement=initial_announcement ) ``` #### 在 MsgHub 中广播消息 `MsgHubManager` 可以与上下文管理器一起使用,以处理`MsgHub`环境的搭建和关闭: ```python with msghub( participants=[agent1, agent2, agent3], announcement=initial_announcement ) as hub: # 智能体现在可以在这个块中广播和接收消息 agent1() agent2() # 或者手动广播一条消息 hub.broadcast(some_message) ``` 退出上下文块时,`MsgHubManager` 会确保每个智能体的听众被清空,防止在中心环境之外的任何意外消息共享。 #### 添加和删除参与者 你可以动态地从 `MsgHub` 中添加或移除智能体: ```python # 添加一个新参与者 hub.add(new_agent) # 移除一个现有的参与者 hub.delete(existing_agent) ``` [[返回顶部]](#202-pipeline-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/203-model.md: ```md (203-model-zh)= # 模型 AgentScope中,模型的部署和调用是通过`ModelWrapper`来解耦开的,开发者可以通过提供模型配置(Model config)的方式指定模型,同时AgentScope也提供脚本支持开发者自定义模型服务。 ## 支持模型 目前,AgentScope内置以下模型服务API的支持: - OpenAI API,包括对话(Chat),图片生成(DALL-E)和文本嵌入(Embedding)。 - DashScope API,包括对话(Chat)和图片生成(Image Sythesis)和文本嵌入(Text Embedding)。 - Gemini API,包括对话(Chat)和嵌入(Embedding)。 - ZhipuAi API,包括对话(Chat)和嵌入(Embedding)。 - Ollama API,包括对话(Chat),嵌入(Embedding)和生成(Generation)。 - LiteLLM API, 包括对话(Chat), 支持各种模型的API. - Post请求API,基于Post请求实现的模型推理服务,包括Huggingface/ModelScope Inference API和各种符合Post请求格式的API。 ## 配置方式 AgentScope中,用户通过`agentscope.init`接口中的`model_configs`参数来指定模型配置。 `model_configs`可以是一个字典,或是一个字典的列表,抑或是一个指向模型配置文件的路径。 ```python import agentscope agentscope.init(model_configs=MODEL_CONFIG_OR_PATH) ``` 其中`model_configs`的一个例子如下: ```python model_configs = [ { "config_name": "gpt-4-temperature-0.0", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", "organization": "xxx", "generate_args": { "temperature": 0.0 } }, { "config_name": "dall-e-3-size-1024x1024", "model_type": "openai_dall_e", "model_name": "dall-e-3", "api_key": "xxx", "organization": "xxx", "generate_args": { "size": "1024x1024" } }, # 在这里可以配置额外的模型 ] ``` ### 配置格式 AgentScope中,模型配置是一个字典,用于指定模型的类型以及设定调用参数。 我们将模型配置中的字段分为_基础参数_和_调用参数_两类。 其中,基础参数包括`config_name`和`model_type`两个基本字段,分别用于区分不同的模型配置和具 体的`ModelWrapper`类型。 ```python { # 基础参数 "config_name": "gpt-4-temperature-0.0", # 模型配置名称 "model_type": "openai_chat", # 对应`ModelWrapper`类型 # 详细参数 # ... 
} ``` #### 基础参数 基础参数中,`config_name`是模型配置的标识,我们将在初始化智能体时用该字段指定使用的模型服务。 `model_type`对应了`ModelWrapper`的类型,用于指定模型服务的类型。对应源代码中`ModelWrapper `类的`model_type`字段。 ```python class OpenAIChatWrapper(OpenAIWrapper): """The model wrapper for OpenAI's chat API.""" model_type: str = "openai_chat" # ... ``` 在目前的AgentScope中,所支持的`model_type`类型,对应的`ModelWrapper`类,以及支持的 API如下: | API | Task | Model Wrapper | `model_type` | Some Supported Models | |------------------------|-----------------|---------------------------------------------------------------------------------------------------------------------------------|-------------------------------|--------------------------------------------------| | OpenAI API | Chat | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_chat"` | gpt-4, gpt-3.5-turbo, ... | | | Embedding | [`OpenAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_embedding"` | text-embedding-ada-002, ... | | | DALL·E | [`OpenAIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_dall_e"` | dall-e-2, dall-e-3 | | DashScope API | Chat | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_chat"` | qwen-plus, qwen-max, ... | | | Image Synthesis | [`DashScopeImageSynthesisWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_image_synthesis"` | wanx-v1 | | | Text Embedding | [`DashScopeTextEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_text_embedding"` | text-embedding-v1, text-embedding-v2, ... | | | Multimodal | [`DashScopeMultiModalWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_multimodal"` | qwen-vl-plus, qwen-vl-max, qwen-audio-turbo, ... | | Gemini API | Chat | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_chat"` | gemini-pro, ... | | | Embedding | [`GeminiEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_embedding"` | models/embedding-001, ... | | ZhipuAI API | Chat | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_chat"` | glm-4, ... | | | Embedding | [`ZhipuAIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_embedding"` | embedding-2, ... | | ollama | Chat | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_chat"` | llama2, ... | | | Embedding | [`OllamaEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_embedding"` | llama2, ... | | | Generation | [`OllamaGenerationWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_generate"` | llama2, ... 
| | LiteLLM API | Chat | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | `"litellm_chat"` | - | | Post Request based API | - | [`PostAPIModelWrapperBase`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `"post_api"` | - | | | Chat | [`PostAPIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `"post_api_chat"` | meta-llama/Meta-Llama-3-8B-Instruct, ... | | | Image Synthesis | [`PostAPIDALLEWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `post_api_dall_e` | - | | | | Embedding | [`PostAPIEmbeddingWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/post_model.py) | `post_api_embedding` | - | #### 详细参数 根据`ModelWrapper`的不同,详细参数中所包含的参数不同。 但是所有的详细参数都会用于初始化`ModelWrapper`类的实例,因此,更详细的参数说明可以根据`ModelWrapper`类的构造函数来查看。 下面展示了不同`ModelWrapper`对应的模型配置样例,用户可以修改这些样例以适应自己的需求。 ##### OpenAI API
OpenAI Chat API (agents.models.OpenAIChatWrapper) ```python { "config_name": "{your_config_name}", "model_type": "openai_chat", # 必要参数 "model_name": "gpt-4", # 可选参数 "api_key": "{your_api_key}", # OpenAI API Key,如果没有提供,将从环境变量中读取 "organization": "{your_organization}", # Organization name,如果没有提供,将从环境变量中读取 "client_args": { # 用于初始化OpenAI API Client的参数 # 例如:"max_retries": 3, }, "generate_args": { # 模型API接口被调用时传入的参数 # 例如:"temperature": 0.0 }, "budget": 100 # API费用预算 } ```
OpenAI DALL·E API (agentscope.models.OpenAIDALLEWrapper) ```python { "config_name": "{your_config_name}", "model_type": "openai_dall_e", # 必要参数 "model_name": "{model_name}", # OpenAI model name, 例如:dall-e-2, dall-e-3 # 可选参数 "api_key": "{your_api_key}", # OpenAI API Key,如果没有提供,将从环境变量中读取 "organization": "{your_organization}", # Organization name,如果没有提供,将从环境变量中读取 "client_args": { # 用于初始化OpenAI API Client的参数 # 例如:"max_retries": 3, }, "generate_args": { # 模型API接口被调用时传入的参数 # 例如:"n": 1, "size": "512x512" } } ```
OpenAI Embedding API (agentscope.models.OpenAIEmbeddingWrapper) ```python { "config_name": "{your_config_name}", "model_type": "openai_embedding", # 必要参数 "model_name": "{model_name}", # OpenAI model name, 例如:text-embedding-ada-002, text-embedding-3-small # 可选参数 "api_key": "{your_api_key}", # OpenAI API Key,如果没有提供,将从环境变量中读取 "organization": "{your_organization}", # Organization name,如果没有提供,将从环境变量中读取 "client_args": { # 用于初始化OpenAI API Client的参数 # 例如:"max_retries": 3, }, "generate_args": { # 模型API接口被调用时传入的参数 # 例如:"encoding_format": "float" } } ```

#### DashScope API
DashScope Chat API (agentscope.models.DashScopeChatWrapper) ```python { "config_name": "my_dashscope_chat_config", "model_type": "dashscope_chat", # 必要参数 "model_name": "{model_name}", # DashScope Chat API中的模型名, 例如:qwen-max # 可选参数 "api_key": "{your_api_key}", # DashScope API Key,如果没有提供,将从环境变量中读取 "generate_args": { # 例如:"temperature": 0.5 }, } ```
DashScope Image Synthesis API (agentscope.models.DashScopeImageSynthesisWrapper) ```python { "config_name": "my_dashscope_image_synthesis_config", "model_type": "dashscope_image_synthesis", # 必要参数 "model_name": "{model_name}", # DashScope Image Synthesis API中的模型名, 例如:wanx-v1 # 可选参数 "api_key": "{your_api_key}", "generate_args": { "negative_prompt": "xxx", "n": 1, # ... } } ```
DashScope Text Embedding API (agentscope.models.DashScopeTextEmbeddingWrapper) ```python { "config_name": "my_dashscope_text_embedding_config", "model_type": "dashscope_text_embedding", # 必要参数 "model_name": "{model_name}", # DashScope Text Embedding API中的模型名, 例如:text-embedding-v1 # 可选参数 "api_key": "{your_api_key}", "generate_args": { # ... }, } ```
DashScope Multimodal Conversation API (agentscope.models.DashScopeMultiModalWrapper) ```python { "config_name": "my_dashscope_multimodal_config", "model_type": "dashscope_multimodal", # Required parameters "model_name": "{model_name}", # The model name in DashScope Multimodal Conversation API, e.g. qwen-vl-plus # Optional parameters "api_key": "{your_api_key}", "generate_args": { # ... }, } ```
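As a sketch of how several configurations from the same provider can coexist, the snippet below registers a chat and an embedding configuration together. It assumes that `agentscope.init` also accepts a list of configuration dictionaries (in addition to a path to a JSON file) and that the DashScope API key is available in the environment; the `config_name` values are placeholders.

```python
import agentscope

# Register two DashScope configurations at once; each is selected later by its config_name.
# Assumes the DashScope API key is available in the environment
# (otherwise add "api_key" to the configurations below).
agentscope.init(
    model_configs=[
        {
            "config_name": "qwen_chat",
            "model_type": "dashscope_chat",
            "model_name": "qwen-max",
        },
        {
            "config_name": "qwen_embedding",
            "model_type": "dashscope_text_embedding",
            "model_name": "text-embedding-v2",
        },
    ],
)

# An agent (or model wrapper) initialized with model_config_name="qwen_chat" will use the
# first configuration; "qwen_embedding" selects the second one.
```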

#### Gemini API
Gemini Chat API (agentscope.models.GeminiChatWrapper) ```python { "config_name": "my_gemini_chat_config", "model_type": "gemini_chat", # 必要参数 "model_name": "{model_name}", # Gemini Chat API中的模型名,例如:gemini-pro # 可选参数 "api_key": "{your_api_key}", # 如果没有提供,将从环境变量GEMINI_API_KEY中读取 } ```
Gemini Embedding API (agentscope.models.GeminiEmbeddingWrapper) ```python { "config_name": "my_gemini_embedding_config", "model_type": "gemini_embedding", # 必要参数 "model_name": "{model_name}", # Gemini Embedding API中的模型名,例如:models/embedding-001 # 可选参数 "api_key": "{your_api_key}", # 如果没有提供,将从环境变量GEMINI_API_KEY中读取 } ```

#### ZhipuAI API
ZhipuAI Chat API (agentscope.models.ZhipuAIChatWrapper) ```python { "config_name": "my_zhipuai_chat_config", "model_type": "zhipuai_chat", # Required parameters "model_name": "{model_name}", # The model name in ZhipuAI API, e.g. glm-4 # Optional parameters "api_key": "{your_api_key}" } ```
ZhipuAI Embedding API (agentscope.models.ZhipuAIEmbeddingWrapper) ```python { "config_name": "my_zhipuai_embedding_config", "model_type": "zhipuai_embedding", # Required parameters "model_name": "{model_name}", # The model name in ZhipuAI API, e.g. embedding-2 # Optional parameters "api_key": "{your_api_key}", } ```

#### Ollama API
Ollama Chat API (agentscope.models.OllamaChatWrapper) ```python { "config_name": "my_ollama_chat_config", "model_type": "ollama_chat", # 必要参数 "model_name": "{model_name}", # ollama Chat API中的模型名, 例如:llama2 # 可选参数 "options": { # 模型API接口被调用时传入的参数 # 例如:"temperature": 0., "seed": 123, }, "keep_alive": "5m", # 控制一次调用后模型在内存中的存活时间 } ```
Ollama Generation API (agentscope.models.OllamaGenerationWrapper) ```python { "config_name": "my_ollama_generate_config", "model_type": "ollama_generate", # 必要参数 "model_name": "{model_name}", # ollama Generate API, 例如:llama2 # 可选参数 "options": { # 模型API接口被调用时传入的参数 # "temperature": 0., "seed": 123, }, "keep_alive": "5m", # 控制一次调用后模型在内存中的存活时间 } ```
Ollama Embedding API (agentscope.models.OllamaEmbeddingWrapper) ```python { "config_name": "my_ollama_embedding_config", "model_type": "ollama_embedding", # 必要参数 "model_name": "{model_name}", # ollama Embedding API, 例如:llama2 # 可选参数 "options": { # 模型API接口被调用时传入的参数 # "temperature": 0., "seed": 123, }, "keep_alive": "5m", # 控制一次调用后模型在内存中的存活时间 } ```

#### LiteLLM Chat API
LiteLLM Chat API (agentscope.models.LiteLLMChatWrapper)

```python
{
    "config_name": "lite_llm_openai_chat_gpt-3.5-turbo",
    "model_type": "litellm_chat",
    "model_name": "gpt-3.5-turbo",
    # Note that for different models you need to set the corresponding environment
    # variables (e.g. OPENAI_API_KEY). See https://docs.litellm.ai/docs/ for details.
}
```
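Because LiteLLM only routes the request, the credentials of the underlying provider must be present as environment variables before the model is called. A minimal sketch, assuming the list form of `model_configs` is accepted and an OpenAI-backed model is used:

```python
import os
import agentscope

# LiteLLM reads provider credentials from environment variables,
# e.g. OPENAI_API_KEY for OpenAI-backed models (see the LiteLLM docs for other providers).
os.environ["OPENAI_API_KEY"] = "{your_api_key}"

agentscope.init(
    model_configs=[{
        "config_name": "lite_llm_openai_chat_gpt-3.5-turbo",
        "model_type": "litellm_chat",
        "model_name": "gpt-3.5-turbo",
    }],
)
```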

#### Post Request API
Post Request Chat API (agentscope.models.PostAPIChatWrapper) ```python { "config_name": "my_postapiwrapper_config", "model_type": "post_api_chat", # Required parameters "api_url": "https://xxx.xxx", "headers": { # e.g. "Authorization": "Bearer xxx", }, # Optional parameters "messages_key": "messages", } ``` > ⚠️ Post Request Chat model wrapper (`PostAPIChatWrapper`) 有以下特性: > 1) 它的 `.format()` 方法会确保输入的信息(messages)会被转换成字典列表(a list of dict). > 2) 它的 `._parse_response()` 方法假设了生成的文字内容会在 `response["data"]["response"]["choices"][0]["message"]["content"]`
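To make point 2 above concrete, the following sketch spells out the JSON layout in which `PostAPIChatWrapper._parse_response()` looks for the generated text; the nesting follows the documented path, while the example content itself is made up.

```python
# The structure that PostAPIChatWrapper._parse_response() expects, following the documented
# path response["data"]["response"]["choices"][0]["message"]["content"].
response = {
    "data": {
        "response": {
            "choices": [
                {
                    "message": {
                        "content": "Hello! How can I help you?",  # generated text
                    },
                },
            ],
        },
    },
}

# Extracting the generated text along the documented path.
content = response["data"]["response"]["choices"][0]["message"]["content"]
print(content)
```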
Post Request Image Synthesis API (agentscope.models.PostAPIDALLEWrapper) ```python { "config_name": "my_postapiwrapper_config", "model_type": "post_api_dall_e", # Required parameters "api_url": "https://xxx.xxx", "headers": { # e.g. "Authorization": "Bearer xxx", }, # Optional parameters "messages_key": "messages", } ``` > ⚠️ Post Request Image Synthesis model wrapper (`PostAPIDALLEWrapper`) 有以下特性: > 1) 它的 `._parse_response()` 方法假设生成的图片都以url的形式表示在`response["data"]["response"]["data"][i]["url"]`
Post Request Embedding API (agentscope.models.PostAPIEmbeddingWrapper) ```python { "config_name": "my_postapiwrapper_config", "model_type": "post_api_embedding", # Required parameters "api_url": "https://xxx.xxx", "headers": { # e.g. "Authorization": "Bearer xxx", }, # Optional parameters "messages_key": "messages", } ``` > ⚠️ Post Request Embedding model wrapper (`PostAPIEmbeddingWrapper`) 有以下特性: > 1) 它的 `._parse_response()`方法假设生成的特征向量会存放在 `response["data"]["response"]["data"][i]["embedding"]`
Post Request API (agentscope.models.PostAPIModelWrapperBase)

```python
{
    "config_name": "my_postapiwrapper_config",
    "model_type": "post_api",

    # Required parameters
    "api_url": "https://xxx.xxx",
    "headers": {
        # e.g. "Authorization": "Bearer xxx",
    },

    # Optional parameters
    "messages_key": "messages",
}
```

> ⚠️ The post request model wrapper (`PostAPIModelWrapperBase`) returns the raw HTTP response and does not implement `.format()`. The `Post Request Chat API` is recommended when running the examples.
> When using `PostAPIModelWrapperBase`, make sure that either
> 1) the `.format()` method will not be called, or
> 2) you implement your own `.format()` and/or `._parse_response()`.
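If a backend returns a different layout, the note above leaves room for a custom subclass. The sketch below assumes that `_parse_response` receives the decoded JSON body as a `dict` and returns a `ModelResponse`, and that both classes are importable from `agentscope.models`; treat the signature, the import path, and the example payload layout as assumptions rather than the library's documented contract.

```python
from agentscope.models import ModelResponse, PostAPIModelWrapperBase  # import path assumed


class MyPostAPIWrapper(PostAPIModelWrapperBase):
    """A sketch of a custom post-request wrapper for a backend that returns
    {"output": {"text": "..."}} instead of the default layout."""

    model_type: str = "my_post_api"

    def _parse_response(self, response: dict) -> ModelResponse:
        # Assumed signature: the decoded JSON body comes in as a dict, and the
        # generated text is wrapped into a ModelResponse for downstream use.
        return ModelResponse(text=response["output"]["text"])
```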

## 从零搭建模型服务 针对需要自己搭建模型服务的开发者,AgentScope提供了一些脚本来帮助开发者快速搭建模型服务。您可以在[scripts](https://github.com/modelscope/agentscope/tree/main/scripts)目录下找到这些脚本以及说明。 具体而言,AgentScope提供了以下模型服务的脚本: - [CPU推理引擎ollama](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#ollama) - [基于Flask + Transformers的模型服务](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#with-transformers-library) - [基于Flask + ModelScope的模型服务](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#with-modelscope-library) - [FastChat推理引擎](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#fastchat) - [vllm推理引擎](https://github.com/modelscope/agentscope/blob/main/scripts/README.md#vllm) 关于如何快速启动这些模型服务,用户可以参考[scripts](https://github.com/modelscope/agentscope/blob/main/scripts/)目录下的[README.md](https://github.com/modelscope/agentscope/blob/main/scripts/README.md)文件。 ## 创建自己的Model Wrapper AgentScope允许开发者自定义自己的模型包装器。新的模型包装器类应该 - 继承自`ModelWrapperBase`类, - 提供`model_type`字段以在模型配置中标识这个Model Wrapper类,并 - 实现`__init__`和`__call__`函数。 ```python from agentscope.models import ModelWrapperBase class MyModelWrapper(ModelWrapperBase): model_type: str = "my_model" def __init__(self, config_name, my_arg1, my_arg2, **kwargs): # 初始化模型实例 super().__init__(config_name=config_name) # ... def __call__(self, input, **kwargs) -> str: # 调用模型实例 # ... ``` 在创建新的模型包装器类之后,模型包装器将自动注册到AgentScope中。 您可以直接在模型配置中使用它。 ```python my_model_config = { # 基础参数 "config_name": "my_model_config", "model_type": "my_model", # 详细参数 "my_arg1": "xxx", "my_arg2": "yyy", # ... } ``` [[返回顶部]](#203-model-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/203-parser.md: ```md (203-parser-zh)= # 结果解析 ## 目录 - [背景](#背景) - [解析器模块](#解析器模块) - [功能说明](#功能说明) - [字符串类型](#字符串str类型) - [MarkdownCodeBlockParser](#markdowncodeblockparser) - [初始化](#初始化) - [响应格式模版](#响应格式模版) - [解析函数](#解析函数) - [字典类型](#字典类型) - [关于 DictFilterMixin](#关于-dictfiltermixin) - [解析器](#解析器) - [RegexTaggedContentParser](#regextaggedcontentparser) - [初始化](#初始化) - [MarkdownJsonDictParser](#markdownjsondictparser) - [初始化 & 响应格式模版](#初始化--响应格式模版) - [类型校验](#类型校验) - [MultiTaggedContentParser](#multitaggedcontentparser) - [初始化 & 响应格式模版](#初始化--响应格式模版-1) - [解析函数](#解析函数-1) - [JSON / Python 对象类型](#json--python-对象类型) - [MarkdownJsonObjectParser](#markdownjsonobjectparser) - [初始化 & 响应格式模版](#初始化--响应格式模版-2) - [解析函数](#解析函数-2) - [典型使用样例](#典型使用样例) - [狼人杀游戏](#狼人杀游戏) - [ReAct 智能体和工具使用](#react-智能体和工具使用) - [自定义解析器](#自定义解析器) ## 背景 利用LLM构建应用的过程中,将 LLM 产生的字符串解析成指定的格式,提取出需要的信息,是一个非常重要的环节。 但同时由于下列原因,这个过程也是一个非常复杂的过程: 1. **多样性**:解析的目标格式多种多样,需要提取的信息可能是一段特定文本,一个JSON对象,或者是一个复杂的数据结构。 2. **复杂性**:结果解析不仅仅是将 LLM 产生的文本转换成目标格式,还涉及到提示工程(提醒 LLM 应该产生什么格式的输出),错误处理等一些列问题。 3. **灵活性**:同一个应用中,不同阶段也可能需要智能体产生不同格式的输出。 为了让开发者能够便捷、灵活的地进行结果解析,AgentScope设计并提供了解析器模块(Parser)。利用该模块,开发者可以通过简单的配置,实现目标格式的解析,同时可以灵活的切换解析的目标格式。 AgentScope中,解析器模块的设计原则是: 1. **灵活**:开发者可以灵活设置所需返回格式、灵活地切换解析器,实现不同格式的解析,而无需修改智能体类的代码,即具体的“目标格式”与智能体类内`reply`函数的处理逻辑解耦 2. **自由**:用户可以自由选择是否使用解析器。解析器所提供的响应格式提示、解析结果等功能都是在`reply`函数内显式调用的,用户可以自由选择使用解析器或是自己实现代码实现结果解析 3. **透明**:利用解析器时,提示(prompt)构建的过程和结果在`reply`函数内对开发者完全可见且透明,开发者可以精确调试自己的应用。 ## 解析器模块 ### 功能说明 解析器模块(Parser)的主要功能包括: 1. 提供“响应格式说明”(format instruction),即提示 LLM 应该在什么位置产生什么输出,例如 ```` You should generate python code in a fenced code block as follows ```python {your_python_code} ``` ```` 2. 提供解析函数(parse function),直接将 LLM 产生的文本解析成目标数据格式 3. 
针对字典格式的后处理功能。在将文本解析成字典后,其中不同的字段可能有不同的用处 AgentScope提供了多种不同解析器,开发者可以根据自己的需求进行选择。 | 目标格式 | 解析器 | 说明 | |-------------------|----------------------------|-----------------------------------------------------------------------------| | 字符串(`str`)类型 | `MarkdownCodeBlockParser` | 要求 LLM 将指定的文本生成到Markdown中以 ``` 标识的代码块中,解析结果为字符串。 | | 字典(`dict`)类型 | `MarkdownJsonDictParser` | 要求 LLM 在 \```json 和 \``` 标识的代码块中产生指定内容的字典,解析结果为 Python 字典。 | | | `MultiTaggedContentParser` | 要求 LLM 在多个标签中产生指定内容,这些不同标签中的内容将一同被解析成一个 Python 字典,并填入不同的键值对中。 | | | `RegexTaggedContentParser` | 适用于不确定标签名,不确定标签数量的场景。允许用户修改正则表达式,返回结果为字典。 | | JSON / Python对象类型 | `MarkdownJsonObjectParser` | 要求 LLM 在 \```json 和 \``` 标识的代码块中产生指定的内容,解析结果将通过 `json.loads` 转换成 Python 对象。 | > **NOTE**: 相比`MarkdownJsonDictParser`,`MultiTaggedContentParser`更适合于模型能力不强,以及需要 LLM 返回内容过于复杂的情况。例如 LLM 返回 Python 代码,如果直接在字典中返回代码,那么 LLM 需要注意特殊字符的转义(\t,\n,...),`json.loads`读取时对双引号和单引号的区分等问题。而`MultiTaggedContentParser`实际是让大模型在每个单独的标签中返回各个键值,然后再将它们组成字典,从而降低了LLM返回的难度。 > **NOTE**:AgentScope 内置的响应格式说明并不一定是最优的选择。在 AgentScope 中,开发者可以完全控制提示构建的过程,因此,选择不使用parser中内置的相应格式说明,而是自定义新的相应格式说明,或是实现新的parser类都是可行的技术方案。 下面我们将根据不同的目标格式,介绍这些解析器的用法。 ### 字符串(`str`)类型 #### MarkdownCodeBlockParser ##### 初始化 - `MarkdownCodeBlockParser`采用 Markdown 代码块的形式,要求 LLM 将指定文本产生到指定的代码块中。可以通过`language_name`参数指定不同的语言,从而利用大模型代码能力产生对应的输出。例如要求大模型产生 Python 代码时,初始化如下: ```python from agentscope.parsers import MarkdownCodeBlockParser parser = MarkdownCodeBlockParser(language_name="python", content_hint="your python code") ``` ##### 响应格式模版 - `MarkdownCodeBlockParser`类提供如下的“响应格式说明”模版,在用户调用`format_instruction`属性时,会将`{language_name}`替换为初始化时输入的字符串: ```` You should generate {language_name} code in a {language_name} fenced code block as follows: ```{language_name} {content_hint} ``` ```` - 例如上述对`language_name`为`"python"`的初始化,调用`format_instruction`属性时,会返回如下字符串: ```python print(parser.format_instruction) ``` ```` You should generate python code in a python fenced code block as follows ```python your python code ``` ```` ##### 解析函数 - `MarkdownCodeBlockParser`类提供`parse`方法,用于解析LLM产生的文本,返回的是字符串。 ````python res = parser.parse( ModelResponse( text="""The following is generated python code ```python print("Hello world!") ``` """ ) ) print(res.parsed) ```` ``` print("hello world!") ``` ### 字典类型 #### 关于 DictFilterMixin 与字符串和一般的 JSON / Python 对象不同,作为 LLM 应用中常用的数据格式,AgentScope 通过 [`DictFilterMixin`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/parsers/parser_base.py#L77) 类为字典类型的解析提供后处理功能。 初始化解析器时,可以通过额外设置`keys_to_content`,`keys_to_memory`,`keys_to_metadata`三个参数,从而实现在调用`parser`的`to_content`,`to_memory`和`to_metadata`方法时,对字典键值对的过滤。 其中 - `keys_to_content` 指定的键值对将被放置在返回`Msg`对象中的`content`字段,这个字段内容将会被返回给其它智能体,参与到其他智能体的提示构建中,同时也会被`self.speak`函数调用,用于显式输出 - `keys_to_memory` 指定的键值对将被存储到智能体的记忆中 - `keys_to_metadata` 指定的键值对将被放置在`Msg`对象的`metadata`字段,可以用于应用的控制流程判断,或挂载一些不需要返回给其它智能体的信息。 三个参数接收布尔值、字符串和字符串列表。其值的含义如下: - `False`: 对应的过滤函数将返回`None`。 - `True`: 整个字典将被返回。 - `str`: 对应的键值将被直接返回,注意返回的会是对应的值而非字典。 - `List[str]`: 根据键值对列表返回过滤后的字典。 AgentScope中,`keys_to_content` 和 `keys_to_memory` 默认为 `True`,即整个字典将被返回。`keys_to_metadata` 默认为 `False`,即对应的过滤函数将返回 `None`。 下面是狼人杀游戏的样例,在白天讨论过程中 LLM 扮演狼人产生的字典。在这个例子中, - `"thought"`字段不应该返回给其它智能体,但是应该存储在智能体的记忆中,从而保证狼人策略的延续; - `"speak"`字段应该被返回给其它智能体,并且存储在智能体记忆中; - `"finish_discussion"`字段用于应用的控制流程中,判断讨论是否已经结束。为了节省token,该字段不应该被返回给其它的智能体,同时也不应该存储在智能体的记忆中。 ```python { "thought": "The others didn't realize I was a werewolf. 
I should end the discussion soon.", "speak": "I agree with you.", "finish_discussion": True } ``` AgentScope中,我们通过调用`to_content`,`to_memory`和`to_metadata`方法实现后处理功能,示意代码如下: - 应用中的控制流代码,创建对应的解析器对象并装载 ```python from agentscope.parsers import MarkdownJsonDictParser # ... agent = DictDialogAgent(...) # 以MarkdownJsonDictParser为例 parser = MarkdownJsonDictParser( content_hint={ "thought": "what you thought", "speak": "what you speak", "finish_discussion": "whether the discussion is finished" }, keys_to_content="speak", keys_to_memory=["thought", "speak"], keys_to_metadata=["finish_discussion"] ) # 装载解析器,即相当于指定了要求的相应格式 agent.set_parser(parser) # 讨论过程 while True: # ... x = agent(x) # 根据metadata字段,获取LLM对当前是否应该结束讨论的判断 if x.metadata["finish_discussion"]: break ``` - 智能体内部`reply`函数内实现字典的过滤 ```python # ... def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # ... res = self.model(prompt, parse_func=self.parser.parse) # 过滤后拥有 thought 和 speak 字段的字典,存储到智能体记忆中 self.memory.add( Msg( self.name, content=self.parser.to_memory(res.parsed), role="assistant", ) ) # 存储到content中,同时存储到metadata中 msg = Msg( self.name, content=self.parser.to_content(res.parsed), role="assistant", metadata=self.parser.to_metadata(res.parsed), ) self.speak(msg) return msg ``` > **Note**: `keys_to_content`,`keys_to_memory`和`keys_to_metadata`参数可以是列表,字符串,也可以是布尔值。 > - 如果是`True`,则会直接返回整个字典,即不进行过滤 > - 如果是`False`,则会直接返回`None`值 > - 如果是字符串类型,则`to_content`,`to_memory`和`to_metadata`方法将会把字符串对应的键值直接放入到对应的位置,例如`keys_to_content="speak"`,则`to_content`方法将会把`res.parsed["speak"]`放入到`Msg`对象的`content`字段中,`content`字段会是字符串而不是字典。 > - 如果是列表类型,则`to_content`,`to_memory`和`to_metadata`方法实现的将是过滤功能,对应过滤后的结果是字典 > ```python > parser = MarkdownJsonDictParser( > content_hint={ > "thought": "what you thought", > "speak": "what you speak", > }, > keys_to_content="speak", > keys_to_memory=["thought", "speak"], > ) > > example_dict = {"thought": "abc", "speak": "def"} > print(parser.to_content(example_dict)) # def > print(parser.to_memory(example_dict)) # {"thought": "abc", "speak": "def"} > print(parser.to_metadata(example_dict)) # None > ``` > ``` > def > {"thought": "abc", "speak": "def"} > None > ``` #### 解析器 针对字典类型的返回值,AgentScope 提供了多种不同的解析器,开发者可以根据自己的需求进行选择。 ##### RegexTaggedContentParser ###### 初始化 `RegexTaggedContentParser` 主要用于1)不确定的标签名,以及2)不确定标签数量的场景。在这种情况下,该解析器无法提供一个泛用性广的响应格式说明,因此需要开发者在初始化时提供对应的相应格式说明(`format_instruction`)。 除此之外,用户可以通过设置`try_parse_json`,`required_keys`等参数,设置解析器的行为。 ```python from agentscope.parsers import RegexTaggedContentParser parser = RegexTaggedContentParser( format_instruction="""Respond with specific tags as outlined below what you thought what you speak """, try_parse_json=True, # 尝试将标签内容解析成 JSON 对象 required_keys=["thought", "speak"] # 必须包含的键 ) ``` ##### MarkdownJsonDictParser ###### 初始化 & 响应格式模版 - `MarkdownJsonDictParser`要求 LLM 在 \```json 和 \``` 标识的代码块中产生指定内容的字典。 - 除了`to_content`,`to_memory`和`to_metadata`参数外,可以通过提供 `content_hint` 参数提供响应结果样例和说明,即提示LLM应该产生什么样子的字典,该参数可以是字符串,也可以是字典,在构建响应格式提示的时候将会被自动转换成字符串进行拼接。 ```python from agentscope.parsers import MarkdownJsonDictParser # 字典 MarkdownJsonDictParser( content_hint={ "thought": "what you thought", "speak": "what you speak", } ) # 或字符串 MarkdownJsonDictParser( content_hint="""{ "thought": "what you thought", "speak": "what you speak", }""" ) ``` - 对应的`instruction_format`属性 ```` You should respond a json object in a json fenced code block as follows: ```json {content_hint} ``` ```` ###### 类型校验 
`MarkdownJsonDictParser`中的`content_hint`参数还支持基于Pydantic的类型校验。初始化时,可以将`content_hint`设置为一个Pydantic的模型类,AgentScope将根据这个类来修改`instruction_format`属性,并且利用Pydantic在解析时对LLM返回的字典进行类型校验。 该功能需要LLM能够理解JSON schema格式的提示,因此适用于能力较强的大模型。 一个简单的例子如下,`"..."`处可以填写具体的类型校验规则,可以参考[Pydantic](https://docs.pydantic.dev/latest/)文档。 ```python from pydantic import BaseModel, Field from agentscope.parsers import MarkdownJsonDictParser class Schema(BaseModel): thought: str = Field(..., description="what you thought") speak: str = Field(..., description="what you speak") end_discussion: bool = Field(..., description="whether the discussion is finished") parser = MarkdownJsonDictParser(content_hint=Schema) ``` - 对应的`format_instruction`属性 ```` Respond a JSON dictionary in a markdown's fenced code block as follows: ```json {a_JSON_dictionary} ``` The generated JSON dictionary MUST follow this schema: {'properties': {'speak': {'description': 'what you speak', 'title': 'Speak', 'type': 'string'}, 'thought': {'description': 'what you thought', 'title': 'Thought', 'type': 'string'}, 'end_discussion': {'description': 'whether the discussion reached an agreement or not', 'title': 'End Discussion', 'type': 'boolean'}}, 'required': ['speak', 'thought', 'end_discussion'], 'title': 'Schema', 'type': 'object'} ```` - 同时在解析的过程中,也将使用Pydantic进行类型校验,校验错误将抛出异常。同时,Pydantic也将提供一定的容错处理能力,例如将字符串`"true"`转换成Python的`True`: ```` parser.parser(""" ```json { "thought": "The others didn't realize I was a werewolf. I should end the discussion soon.", "speak": "I agree with you.", "end_discussion": "true" } ``` """) ```` ##### MultiTaggedContentParser `MultiTaggedContentParser`要求 LLM 在多个指定的标签对中产生指定的内容,这些不同标签的内容将一同被解析为一个 Python 字典。使用方法与`MarkdownJsonDictParser`类似,只是初始化方法不同,更适合能力较弱的LLM,或是比较复杂的返回内容。 ###### 初始化 & 响应格式模版 `MultiTaggedContentParser`中,每一组标签将会以`TaggedContent`对象的形式传入,其中`TaggedContent`对象包含了 - 标签名(`name`),即返回字典中的key值 - 开始标签(`tag_begin`) - 标签内容提示(`content_hint`) - 结束标签(`tag_end`) - 内容解析功能(`parse_json`),默认为`False`。当置为`True`时,将在响应格式提示中自动添加提示,并且提取出的内容将通过`json.loads`解析成 Python 对象 ```python from agentscope.parsers import MultiTaggedContentParser, TaggedContent parser = MultiTaggedContentParser( TaggedContent( name="thought", tag_begin="[THOUGHT]", content_hint="what you thought", tag_end="[/THOUGHT]" ), TaggedContent( name="speak", tag_begin="[SPEAK]", content_hint="what you speak", tag_end="[/SPEAK]" ), TaggedContent( name="finish_discussion", tag_begin="[FINISH_DISCUSSION]", content_hint="true/false, whether the discussion is finished", tag_end="[/FINISH_DISCUSSION]", parse_json=True, # 我们希望这个字段的内容直接被解析成 True 或 False 的 Python 布尔值 ) ) print(parser.format_instruction) ``` ``` Respond with specific tags as outlined below, and the content between [FINISH_DISCUSSION] and [/FINISH_DISCUSSION] MUST be a JSON object: [THOUGHT]what you thought[/THOUGHT] [SPEAK]what you speak[/SPEAK] [FINISH_DISCUSSION]true/false, whether the discussion is finished[/FINISH_DISCUSSION] ``` ###### 解析函数 - `MultiTaggedContentParser`的解析结果为字典,其中key为`TaggedContent`对象的`name`的值,以下是狼人杀中解析 LLM 返回的样例: ```python res_dict = parser.parse( ModelResponse(text="""As a werewolf, I should keep pretending to be a villager [THOUGHT]The others didn't realize I was a werewolf. I should end the discussion soon.[/THOUGHT] [SPEAK]I agree with you.[/SPEAK] [FINISH_DISCUSSION]true[/FINISH_DISCUSSION] """ ) ) print(res_dict) ``` ``` { "thought": "The others didn't realize I was a werewolf. 
I should end the discussion soon.", "speak": "I agree with you.", "finish_discussion": true } ``` ### JSON / Python 对象类型 #### MarkdownJsonObjectParser `MarkdownJsonObjectParser`同样采用 Markdown 的\```json和\```标识,但是不限制解析的内容的类型,可以是列表,字典,数值,字符串等可以通过`json.loads`进行解析字符串。 ##### 初始化 & 响应格式模版 ```python from agentscope.parsers import MarkdownJsonObjectParser parser = MarkdownJsonObjectParser( content_hint="{A list of numbers.}" ) print(parser.format_instruction) ``` ```` You should respond a json object in a json fenced code block as follows: ```json {a list of numbers} ``` ```` ##### 解析函数 ````python res = parser.parse( ModelResponse(text="""Yes, here is the generated list ```json [1,2,3,4,5] ``` """ ) ) print(type(res)) print(res) ```` ``` [1, 2, 3, 4, 5] ``` ## 典型使用样例 ### 狼人杀游戏 狼人杀(Werewolf)是字典解析器的一个经典使用场景,在游戏的不同阶段内,需要同一个智能体在不同阶段产生除了`"thought"`和`"speak"`外其它的标识字段,例如是否结束讨论,预言家是否使用能力,女巫是否使用解药和毒药,投票等。 AgentScope中已经内置了[狼人杀](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf)的样例,该样例采用`DictDialogAgent`类,配合不同的解析器,实现了灵活的目标格式切换。同时利用解析器的后处理功能,实现了“想”与“说”的分离,同时控制游戏流程的推进。 详细实现请参考狼人杀[源码](https://github.com/modelscope/agentscope/tree/main/examples/game_werewolf)。 ### ReAct 智能体和工具使用 `ReActAgent`是AgentScope中为了工具使用构建的智能体类,基于 ReAct 算法进行搭建,可以配合不同的工具函数进行使用。其中工具的调用,格式解析,采用了和解析器同样的实现思路。详细实现请参考[代码](https://github.com/modelscope/agentscope/blob/main/src/agentscope/agents/react_agent.py)。 ## 自定义解析器 AgentScope中提供了解析器的基类`ParserBase`,开发者可以通过继承该基类,并实现其中的`format_instruction`属性和`parse`方法来实现自己的解析器。 针对目标格式是字典类型的解析,可以额外继承`agentscope.parser.DictFilterMixin`类实现对字典类型的后处理。 ```python from abc import ABC, abstractmethod from agentscope.models import ModelResponse class ParserBase(ABC): """The base class for model response parser.""" format_instruction: str """The instruction for the response format.""" @abstractmethod def parse(self, response: ModelResponse) -> ModelResponse: """Parse the response text to a specific object, and stored in the parsed field of the response object.""" # ... 
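# --- Below: a minimal sketch (not from the repository) of a custom parser that inherits
# --- ParserBase, as described above. It extracts the text wrapped in <answer>...</answer>
# --- tags and stores it in the `parsed` field of the response, following the docstring
# --- of `parse`. Tag names and error handling are illustrative assumptions.
class AnswerTagParser(ParserBase):
    """A hypothetical parser extracting content between <answer> tags."""

    format_instruction: str = (
        "Wrap your final answer between <answer> and </answer> tags."
    )

    def parse(self, response: ModelResponse) -> ModelResponse:
        text = response.text or ""
        start = text.find("<answer>")
        end = text.find("</answer>")
        if start == -1 or end == -1:
            # The expected tags are missing; raising keeps the error visible to the caller.
            raise ValueError("Response does not contain <answer>...</answer> tags.")
        response.parsed = text[start + len("<answer>"): end].strip()
        return response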
``` ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/203-stream.md: ```md (203-stream-zh)= # 流式输出 AgentScope 支持在**终端**和 **AgentScope Studio** 中使用以下大模型 API 的流式输出模式。 | API | Model Wrapper | 对应的 `model_type` 域 | |--------------------|---------------------------------------------------------------------------------------------------------------------------------|--------------------| | OpenAI Chat API | [`OpenAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py) | `"openai_chat"` | | DashScope Chat API | [`DashScopeChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py) | `"dashscope_chat"` | | Gemini Chat API | [`GeminiChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py) | `"gemini_chat"` | | ZhipuAI Chat API | [`ZhipuAIChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py) | `"zhipuai_chat"` | | ollama Chat API | [`OllamaChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py) | `"ollama_chat"` | | LiteLLM Chat API | [`LiteLLMChatWrapper`](https://github.com/modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py) | `"litellm_chat"` | ## 设置流式输出 AgentScope 允许用户在模型配置和模型调用中设置流式输出模式。 ### 模型配置 在模型配置中将 `stream` 字段设置为 `True` 以使用流式输出模式。 ```python model_config = { "config_name": "xxx", "model_type": "xxx", "stream": True, # ... } ``` ### 模型调用 在智能体中,可以在调用模型时将 `stream` 参数设置为 `True`。注意,模型调用中的 `stream` 参数将覆盖模型配置中的 `stream` 字段。 ```python class MyAgent(AgentBase): # ... def reply(self, x: Optional[Msg, Sequence[Msg]] = None) -> Msg: # ... response = self.model( prompt, stream=True, ) # ... ``` ## 流式打印 在流式输出模式下,模型响应的 `stream` 字段将是一个生成器,而 `text` 字段将是 `None`。 为了与非流式兼容,用户一旦在迭代生成器前访问 `text` 字段,`stream` 中的生成器将被迭代以生成完整的文本,并将其存储在 `text` 字段中。 因此,即使在流式输出模式下,用户也可以像往常一样在 `text` 字段中处理响应文本而无需任何改变。 但是,如果用户需要流式的输出,只需要将生成器放在 `self.speak` 函数中,以在终端和 AgentScope Studio 中流式打印文本。 ```python def reply(self, x: Optional[Msg, Sequence[Msg]] = None) -> Msg: # ... # 如果想在调用时使用流式打印,在这里调用时使用 stream=True response = self.model(prompt) # 程序运行到这里时,response.text 为 None # 在 terminal 和 AgentScope Studio 中流式打印文本 self.speak(response.stream) # 生成器被迭代时,产生的文本将自动被存储在 response.text 中,因此用户可以直接使用 response.text 处理响应文本 msg = Msg(self.name, content=response.text, role="assistant") self.memory.add(msg) return msg ``` ## 进阶用法 如果用户想要自己处理流式输出,可以通过迭代生成器来实时获得流式的响应文本。 An example of how to handle the streaming response is in the `speak` function of `AgentBase` as follows. 关于如何处理流式输出,可以参考 `AgentBase` 中的 `speak` 函数。 The `log_stream_msg` function will print the streaming response in the terminal and AgentScope Studio (if registered). 其中 `log_stream_msg` 函数将在终端和 AgentScope Studio 中实时地流式打印文本。 ```python # ... elif isinstance(content, GeneratorType): # 流式消息必须共享相同的 id 才能在 AgentScope Studio 中显示,因此这里通过同一条消息切换 content 字段来实现 msg = Msg(name=self.name, content="", role="assistant") for last, text_chunk in content: msg.content = text_chunk log_stream_msg(msg, last=last) else: # ... ``` 在处理生成器的时候,用户应该记住以下几点: 1. 在迭代生成器时,`response.text` 字段将自动包含已迭代的文本。 2. `stream` 字段中的生成器将生成一个布尔值和字符串的二元组。布尔值表示当前是否是最后一段文本,而字符串则是到目前为止的响应文本。 3. AgentScope Studio 依据 `log_stream_msg` 函数中输入的 `Msg` 对象的 id 判断文本是否属于同一条流式响应,若 id 不同,则会被视为不同的响应。 ```python def reply(self, x: Optional[Msg, Sequence[Msg]] = None) -> Msg: # ... 
response = self.model(prompt) # 程序运行到这里时,response.text 为 None # 迭代生成器,自己处理响应文本 for last_chunk, text in response.stream: # 按照自己的需求处理响应文本 # ... ``` [[Return to the top]](#203-stream-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/204-service.md: ```md (204-service-zh)= # 工具 服务函数(Service function)是可以增强智能体能力工具,例如执行Python代码、网络搜索、 文件操作等。本教程概述了AgentScope中可用的服务功能,同时介绍如何使用它们来增强智能体的能力。 ## Service函数概览 下面的表格按照类型概述了各种Service函数。以下函数可以通过`agentscope.service.{函数名}`进行调用。 | Service场景 | Service函数名称 | 描述 | |------------|-----------------------|-----------------------------------------| | 代码 | `execute_python_code` | 执行一段 Python 代码,可选择在 Docker
容器内部执行。 | | | `NoteBookExecutor.run_code_on_notebook` | 在 NoteBookExecutor 的 IPython 环境中执行一段 Python 代码,遵循 IPython 交互式计算风格。 | | 检索 | `retrieve_from_list` | 根据给定的标准从列表中检索特定项目。 | | | `cos_sim` | 计算2个embedding的余弦相似度。 | | SQL查询 | `query_mysql` | 在 MySQL 数据库上执行 SQL 查询并返回结果。 | | | `query_sqlite` | 在 SQLite 数据库上执行 SQL 查询并返回结果。 | | | `query_mongodb` | 对 MongoDB 集合执行查询或操作。 | | 文本处理 | `summarization` | 使用大型语言模型总结一段文字以突出其主要要点。 | | 网络 | `bing_search` | 使用bing搜索。 | | | `google_search` | 使用google搜索。 | | | `arxiv_search` | 使用arxiv搜索。 | | | `download_from_url` | 从指定的 URL 下载文件。 | | | `load_web` | 爬取并解析指定的网页链接 (目前仅支持爬取 HTML 页面) | | | `digest_webpage` | 对已经爬取好的网页生成摘要信息(目前仅支持 HTML 页面 | | `dblp_search_publications` | 在dblp数据库里搜索文献。 | | `dblp_search_authors` | 在dblp数据库里搜索作者。 | | | `dblp_search_venues` | 在dblp数据库里搜索期刊,会议及研讨会。 | | 文件处理 | `create_file` | 在指定路径创建一个新文件,并可选择添加初始内容。 | | | `delete_file` | 删除由文件路径指定的文件。 | | | `move_file` | 将文件从一个路径移动或重命名到另一个路径。 | | | `create_directory` | 在指定路径创建一个新的目录。 | | | `delete_directory` | 删除一个目录及其所有内容。 | | | `move_directory` | 将目录从一个路径移动或重命名到另一个路径。 | | | `read_text_file` | 读取并返回文本文件的内容。 | | | `write_text_file` | 向指定路径的文件写入文本内容。 | | | `read_json_file` | 读取并解析 JSON 文件的内容。 | | | `write_json_file` | 将 Python 对象序列化为 JSON 并写入到文件。 | | 多模态 | `dashscope_text_to_image` | 使用 DashScope API 将文本生成图片。 | | | `dashscope_image_to_text` | 使用 DashScope API 根据图片生成文字。 | | | `dashscope_text_to_audio` | 使用 DashScope API 根据文本生成音频。 | | | `openai_text_to_image` | 使用 OpenAI API根据文本生成图片。 | | `openai_edit_image` | 使用 OpenAI API 根据提供的遮罩和提示编辑图像。 | | `openai_create_image_variation` | 使用 OpenAI API 创建图像的变体。 | | `openai_image_to_text` | 使用 OpenAI API 根据图片生成文字。 | | `openai_text_to_audio` | 使用 OpenAI API 根据文本生成音频。 | | `openai_audio_to_text` | 使用OpenAI API将音频转换为文本。 | *更多服务即将推出* | | 正在开发更多服务功能,并将添加到 AgentScope 以进一步增强其能力。 | 关于详细的参数、预期输入格式、返回类型,请参阅[API文档](https://modelscope.github.io/agentscope/)。 ## 使用Service函数 AgentScope为Service函数提供了两个服务类,分别是`ServiceToolkit`和`ServiceResponse`。 ### 关于ServiceToolkit 大模型使用工具函数通常涉及以下5个步骤: 1. **准备工具函数**。即开发者通过提供必要的参数(例如api key、用户名、密码等)将工具函数预处理成大模型能直接调用的形式。 2. **为大模型准备工具描述**。即一份详细的函数功能描述,以便大模型能够正确理解工具函数。 3. **约定函数调用格式**。提供一份说明来告诉大模型如何调用工具函数,即调用格式。 4. **解析大模型返回值**。从大模型获取返回值之后,需要按照第三步中的调用格式来解析字符串。 5. **调用函数并处理异常**。实际调用函数,返回结果,并处理异常。 为了简化上述步骤并提高复用性,AgentScope引入了ServiceToolkit模块。它可以 - 注册python函数为工具函数 - 生成字符串和JSON schema格式的工具函数说明 - 内置一套工具函数的调用格式 - 解析模型响应、调用工具功能,并处理异常 #### 如何使用 按照以下步骤使用ServiceToolkit: 1. 初始化一个ServiceToolkit对象并注册服务函数及其必要参数。例如,以下Bing搜索功能。 ```python def bing_search( question: str, api_key: str, num_results: int = 10, **kwargs: Any, ) -> ServiceResponse: """ Search question in Bing Search API and return the searching results Args: question (`str`): The search query string. api_key (`str`): The API key provided for authenticating with the Bing Search API. num_results (`int`, defaults to `10`): The number of search results to return. **kwargs (`Any`): Additional keyword arguments to be included in the search query. For more details, please refer to https://learn.microsoft.com/en-us/bing/search-apis/bing-web-search/reference/query-parameters [omitted for brevity] """ ``` We register the function in a `ServiceToolkit` object by providing `api_key` and `num_results` as necessary parameters. 我们通过提供`api_key`和`num_results`作为必要参数,在`ServiceToolkit`对象中注册bing_search函数。 ```python from agentscope.service import ServiceToolkit service_toolkit = ServiceToolkit() service_toolkit.add( bing_search, api_key="xxx", num_results=3 ) ``` 2. 
在提示中使用`tools_instruction`属性指导LLM,或使用`json_schemas`属性获取JSON schema格式的说明,以构建自定义格式的函数说明或直接在模型API中使用(例如OpenAI Chat API)。 ````text >> print(service_toolkit.tools_instruction) ## Tool Functions: The following tool functions are available in the format of ``` {index}. {function name}: {function description} {argument1 name} ({argument type}): {argument description} {argument2 name} ({argument type}): {argument description} ... ``` 1. bing_search: Search question in Bing Search API and return the searching results question (str): The search query string. ```` ````text >> print(service_toolkit.json_schemas) { "bing_search": { "type": "function", "function": { "name": "bing_search", "description": "Search question in Bing Search API and return the searching results", "parameters": { "type": "object", "properties": { "question": { "type": "string", "description": "The search query string." } }, "required": [ "question" ] } } } } ```` 3. 通过`tools_calling_format`属性指导LLM如何使用工具函数。ServiceToolkit中默认大模型 需要返回一个JSON格式的列表,列表中包含若干个字典,每个字典即为一个函数调用。必须包含name和 arguments两个字段,其中name为函数名,arguments为函数参数。arguments键值对应的值是从 “参数名”映射到“参数值”的字典。 ```text >> print(service_toolkit.tools_calling_format) [{"name": "{function name}", "arguments": {"{argument1 name}": xxx, "{argument2 name}": xxx}}] ``` 4. 通过`parse_and_call_func`方法解析大模型生成的字符串,并调用函数。此函数可以接收字符串或解析后符合格式要求的字典作为输入。 - 当输入为字符串时,此函数将相应地解析字符串并使用解析后的参数执行函数。 - 而如果输入为解析后的字典,则直接调用函数。 ```python # a string input string_input = '[{"name": "bing_search", "arguments": {"question": "xxx"}}]' res_of_string_input = service_toolkit.parse_and_call_func(string_input) # or a parsed dictionary dict_input = [{"name": "bing_search", "arguments": {"question": "xxx"}}] # res_of_dict_input is the same as res_of_string_input res_of_dict_input = service_toolkit.parse_and_call_func(dict_input) print(res_of_string_input) ``` ``` 1. Execute function bing_search [ARGUMENTS]: question: xxx [STATUS]: SUCCESS [RESULT]: ... ``` 关于ServiceToolkit的具体使用样例,可以参考`agentscope.agents`中`ReActAgent`类。 #### 创建新的Service函数 新的Service函数必须满足以下要求才能被ServiceToolkit正常使用: 1. 具有格式化的函数说明(推荐Google风格),以便ServiceToolkit提取函数说明。 2. 函数名称应该是自解释的,这样智能体可以理解函数并正确使用它。 3. 在定义函数时应提供参数的类型(例如`def func(a: int, b: str, c: bool)`),以便大模型 能够给出类型正确的参数。 ### 关于ServiceResponse `ServiceResponse`是对调用的结果的封装,包含了`status`和`content`两个字段。 当Service函数正常运行结束时,`status`为`ServiceExecStatus. SUCCESS`,`content`为函数的返回值。而当运行出现错误时,`status`为`ServiceExecStatus. Error`,`content`内为错误信息。 ```python class ServiceResponse(dict): """Used to wrap the execution results of the services""" __setattr__ = dict.__setitem__ __getattr__ = dict.__getitem__ def __init__( self, status: ServiceExecStatus, content: Any, ): """Constructor of ServiceResponse Args: status (`ServiceExeStatus`): The execution status of the service. content (`Any`) If the argument`status` is `SUCCESS`, `content` is the response. We use `object` here to support various objects, e.g. str, dict, image, video, etc. Otherwise, `content` is the error message. """ self.status = status self.content = content # ... [为简洁起见省略代码] ``` ## 示例 ```python import json import inspect from agentscope.service import ServiceResponse from agentscope.agents import AgentBase def create_file(file_path: str, content: str = "") -> ServiceResponse: """ 创建文件并向其中写入内容。 Args: file_path (str): 将要创建文件的路径。 content (str): 要写入文件的内容。 Returns: ServiceResponse: 其中布尔值指示成功与否,字符串包含任何错误消息(如果有),包括错误类型。 """ # ... [为简洁起见省略代码] class YourAgent(AgentBase): # ... [为简洁起见省略代码] def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: # ... 
[为简洁起见省略代码] # 构造提示,让代理提供 JSON 格式的参数 prompt = ( f"To complete the user request\n```{x['content']}```\n" "Please provide the necessary parameters in JSON format for the " "function:\n" f"Function: {create_file.__name__}\n" "Description: Create a file and write content to it.\n" ) # 添加关于函数参数的详细信息 sig = inspect.signature(create_file) parameters = sig.parameters.items() params_prompt = "\n".join( f"- {name} ({param.annotation.__name__}): " f"{'(default: ' + json.dumps(param.default) + ')'if param.default is not inspect.Parameter.empty else ''}" for name, param in parameters ) prompt += params_prompt # 获取模型响应 model_response = self.model(prompt).text # 解析模型响应并调用 create_file 函数 # 可能需要额外的提取函数 try: kwargs = json.loads(model_response) create_file(**kwargs) except: # 错误处理 pass # ... [为简洁起见省略代码] ``` [[返回顶部]](#204-service-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/205-memory.md: ```md (205-memory-zh)= # 记忆 AgentScope中,记忆(memory)用于存储历史消息,从而使智能体能够根据上下文提供更加连贯,更加 自然的响应。 本教程将首先介绍memory中信息的载体,消息(message),然后介绍AgentScope中记忆模块的功能 和使用方法。 ## 关于消息(Message) ### 消息基类(`MessageBase`) AgentScope中,消息基类是Python字典的子类,由`name`,`content`两个必选字段和一个可选的字段 `url`组成。由于是字典类型,开发者也可以根据需要增加其他字段。 其中,`name`字段代表消息的发起者,`content`字段代表消息的内容,`url `则代表消息中附加的数据链接,可以是指向多模态数据的本地链接,也可以是网络链接。 当一个消息被创建时,将会自动创建一个唯一的ID,用于标识这条消息。同时,消息的创建时间也会以 时间戳的形式自动记录下来。 具体实现中,AgentScope首先提供了一个`MessageBase`基类,用于定义消息的基本属性和使用方法。 与一般的字典类型不同,`MessageBase`的实例化对象可以通过`对象名.{属性名}`的方式访问属性值, 也可以通过`对象名['属性名']`的方式访问属性值。 其中,`MessageBase` 类的关键属性如下: - **`name`**:该属性表示信息的发起者。这是一项关键的元数据,在需要区分不同发言者的场景中非常有用。 - **`content`**:信息本身的内容。它可以包括文本、结构化数据或其他与交互相关且需要智能体处理的内容形式。 - **`url`**:可选属性,允许信息链接到外部资源。这些可以是指向文件的直接链接、多模态数据或网页。 - **`timestamp`**:时间戳,显示信息创建的时间。 - **`id`**:每条信息在创建时都会被分配一个唯一标识符(ID)。 ```python class MessageBase(dict): """Base Message class, which is used to maintain information for dialog, memory and used to construct prompt. """ def __init__( self, name: str, content: Any, url: Optional[Union[Sequence[str], str]] = None, timestamp: Optional[str] = None, **kwargs: Any, ) -> None: """Initialize the message object Args: name (`str`): The name of who send the message. It's often used in role-playing scenario to tell the name of the sender. However, you can also only use `role` when calling openai api. The usage of `name` refers to https://cookbook.openai.com/examples/how_to_format_inputs_to_chatgpt_models. content (`Any`): The content of the message. url (`Optional[Union[list[str], str]]`, defaults to None): A url to file, image, video, audio or website. timestamp (`Optional[str]`, defaults to None): The timestamp of the message, if None, it will be set to current time. **kwargs (`Any`): Other attributes of the message. For OpenAI API, you should add "role" from `["system", "user", "assistant", "function"]`. When calling OpenAI API, `"role": "assistant"` will be added to the messages that don't have "role" attribute. 
""" # id and timestamp will be added to the object as its attributes # rather than items in dict self.id = uuid4().hex if timestamp is None: self.timestamp = _get_timestamp() else: self.timestamp = timestamp self.name = name self.content = content if url: self.url = url self.update(kwargs) def __getattr__(self, key: Any) -> Any: try: return self[key] except KeyError as e: raise AttributeError(f"no attribute '{key}'") from e def __setattr__(self, key: Any, value: Any) -> None: self[key] = value def __delattr__(self, key: Any) -> None: try: del self[key] except KeyError as e: raise AttributeError(f"no attribute '{key}'") from e def to_str(self) -> str: """Return the string representation of the message""" raise NotImplementedError def serialize(self) -> str: """Return the serialized message.""" raise NotImplementedError # ... [省略代码以简化] ``` ### 消息类(`Msg`) `Msg`类是AgentScope中最常用的消息类。它继承了 `MessageBase`类,并实现了`to_str` 和 `serialize` 抽象方法。 在一个Agent类中,其`reply`函数通常会返回一个`Msg`类的实例,以便在AgentScope中进行消息的 传递。 ```python class Msg(MessageBase): """The Message class.""" def __init__( self, name: str, content: Any, url: Optional[Union[Sequence[str], str]] = None, timestamp: Optional[str] = None, echo: bool = False, **kwargs: Any, ) -> None: super().__init__( name=name, content=content, url=url, timestamp=timestamp, **kwargs, ) if echo: logger.chat(self) def to_str(self) -> str: """Return the string representation of the message""" return f"{self.name}: {self.content}" def serialize(self) -> str: return json.dumps({"__type": "Msg", **self}) ``` ## 关于记忆(Memory) ### 关于记忆基类(`MemoryBase`) `MemoryBase` 是一个抽象类,以结构化的方式处理智能体的记忆。它定义了存储、检索、删除和操作 *信息*内容的操作。 ```python class MemoryBase(ABC): # ... [省略代码以简化] def get_memory( self, return_type: PromptType = PromptType.LIST, recent_n: Optional[int] = None, filter_func: Optional[Callable[[int, dict], bool]] = None, ) -> Union[list, str]: raise NotImplementedError def add(self, memories: Union[list[dict], dict]) -> None: raise NotImplementedError def delete(self, index: Union[Iterable, int]) -> None: raise NotImplementedError def load( self, memories: Union[str, dict, list], overwrite: bool = False, ) -> None: raise NotImplementedError def export( self, to_mem: bool = False, file_path: Optional[str] = None, ) -> Optional[list]: raise NotImplementedError def clear(self) -> None: raise NotImplementedError def size(self) -> int: raise NotImplementedError ``` 以下是 `MemoryBase` 的关键方法: - **`get_memory`**:这个方法负责从智能体的记忆中检索存储的信息。它可以按照 `return_type` 指定的格式返回这些信息。该方法还可以在提供 `recent_n` 时检索特定数量的最近信息,并且可以应用过滤函数( `filter_func` )来根据自定义标准选择信息。 - **`add`**:这个方法用于将新的信息添加到智能体的记忆中。它可以接受单个信息或信息列表。每条信息通常是 `MessageBase` 或其子类的实例。 - **`delete`**:此方法允许通过信息的索引(如果提供可迭代对象,则为索引集合)从记忆中删除信息。 - **`load`**:这个方法允许从外部来源批量加载信息到智能体的内存中。`overwrite` 参数决定是否在加载新的信息集之前清除现有记忆。 - **`export`**:这个方法便于将存储的*信息*从智能体的记忆中导出,要么导出到一个外部文件(由 `file_path` 指定),要么直接导入到程序的运行内存中(如果 `to_mem` 设置为 `True` )。 - **`clear`**:这个方法清除智能体记忆中的所有*信息*,本质上是重置。 - **`size`**:这个方法返回当前存储在智能体记忆中的信息数量。 ### 关于`TemporaryMemory` `TemporaryMemory` 类是 `MemoryBase` 类的一个具体实现,提供了一个智能体运行期间存在的记忆存储,被用作智能体的默认记忆类型。除了 `MemoryBase` 的所有行为外,`TemporaryMemory` 还提供了检索的方法: - **`retrieve_by_embedding`**:基于它们的嵌入向量 (embeddings) 检索与查询最相似的 `messages`。它使用提供的度量标准来确定相关性,并可以返回前 `k` 个最相关的信息。 - **`get_embeddings`**:返回记忆中所有信息的嵌入向量。如果信息没有嵌入向量,并且提供了嵌入模型,它将生成并存储信息的嵌入向量。 有关 `Memory` 和 `Msg` 使用的更多细节,请参考 API 文档。 [[返回顶端]](#205-memory-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/206-prompt.md: ```md (206-prompt-zh)= # 提示工程 
提示工程是与大型语言模型(LLMs)相关的应用中至关重要的组件。然而,为大型语言模型(LLMs)制作提示可能具有挑战性,尤其是在面对来自不同模型API的不同需求时。 为了帮助开发者更好地适应不同模型API的需求,AgentScope提供了一种结构化的方式来组织不同数据类型(例如指令、提示、对话历史)到所需的格式。 请注意这里不存在一个“**适用于所有模型API**”的提示构建方案。 AgentScope内置策略的目标是**使初学者能够顺利调用模型API ,而不是使应用达到最佳效果**。对于进阶用户,我们强烈建议开发者根据自己的需求和模型API的要求自定义提示。 ## 构建提示面临的挑战 在多智能体应用中,LLM通常在对话中扮演不同的角色。当使用模型的Chat API时,时长会面临以下挑战: 1. 大多数Chat类型的模型API是为聊天机器人场景设计的,`role`字段只支持`"user"`和`"assistant"`,不支持`name`字段,即API本身不支持角色扮演。 2. 一些模型API要求`"user"`和`"assistant"`必须交替发言,而`"user"`必须在输入消息列表的开头和结尾发言。这样的要求使得在一个代理可能扮演多个不同角色并连续发言时,构建多智能体对话变得困难。 为了帮助初学者快速开始使用AgentScope,我们为大多数与聊天和生成相关的模型API提供了以下内置策略。 ## 内置提示策略 AgentScope为以下的模型API提供了内置的提示构建策略。 - [OpenAIChatWrapper](#openaichatwrapper) - [DashScopeChatWrapper](#dashscopechatwrapper) - [DashScopeMultiModalWrapper](#dashscopemultimodalwrapper) - [OllamaChatWrapper](#ollamachatwrapper) - [OllamaGenerationWrapper](#ollamagenerationwrapper) - [GeminiChatWrapper](#geminichatwrapper) - [ZhipuAIChatWrapper](#zhipuaichatwrapper) 这些策略是在对应Model Wrapper类的`format`函数中实现的。它接受`Msg`对象,`Msg`对象的列表或它们的混合作为输入。在`format`函数将会把输入重新组织成一个`Msg`对象的列表,因此为了方便解释,我们在下面的章节中认为`format`函数的输入是`Msg`对象的列表。 ### `OpenAIChatWrapper` `OpenAIChatWrapper`封装了OpenAI聊天API,它以字典列表作为输入,其中字典必须遵循以下规则(更新于2024/03/22): - 需要`role`和`content`字段,以及一个可选的`name`字段。 - `role`字段必须是`"system"`、`"user"`或`"assistant"`之一。 #### 提示的构建策略 ##### 非视觉(Vision)模型 在OpenAI Chat API中,`name`字段使模型能够区分对话中的不同发言者。因此,`OpenAIChatWrapper`中`format`函数的策略很简单: - `Msg`: 直接将带有`role`、`content`和`name`字段的字典传递给API。 - `List`: 根据上述规则解析列表中的每个元素。 样例如下: ```python from agentscope.models import OpenAIChatWrapper from agentscope.message import Msg model = OpenAIChatWrapper( config_name="", # 我们直接初始化model wrapper,因此不需要填入config_name model_name="gpt-4", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg对象 [ # Msg对象的列表 Msg(name="Bob", content="Hi.", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash [ {"role": "system", "name": "system", "content": "You are a helpful assistant"}, {"role": "assistant", "name": "Bob", "content": "Hi."}, {"role": "assistant", "name": "Alice", "content": "Nice to meet you!"), ] ``` ##### 视觉(Vision)模型 对支持视觉的模型而言,如果输入消息包含图像url,生成的`content`字段将是一个字典的列表,其中包含文本和图像url。 具体来说,如果是网络图片url,将直接传递给OpenAI Chat API,而本地图片url将被转换为base64格式。更多细节请参考[官方指南](https://platform.openai.com/docs/guides/vision)。 注意无效的图片url(例如`/Users/xxx/test.mp3`)将被忽略。 ```python from agentscope.models import OpenAIChatWrapper from agentscope.message import Msg model = OpenAIChatWrapper( config_name="", # 为空,因为我们直接初始化model wrapper model_name="gpt-4o", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg 对象 [ # Msg 对象的列表 Msg(name="user", content="Describe this image", role="user", url="https://xxx.png"), Msg(name="user", content="And these images", role="user", url=["/Users/xxx/test.png", "/Users/xxx/test.mp3"]), ], ) print(prompt) ``` ```python [ { "role": "system", "name": "system", "content": "You are a helpful assistant" }, { "role": "user", "name": "user", "content": [ { "type": "text", "text": "Describe this image" }, { "type": "image_url", "image_url": { "url": "https://xxx.png" } }, ] }, { "role": "user", "name": "user", "content": [ { "type": "text", "text": "And these images" }, { "type": "image_url", "image_url": { "url": "data:image/png;base64,YWJjZGVm..." 
# 对应 /Users/xxx/test.png } }, ] }, ] ``` ### `DashScopeChatWrapper` `DashScopeChatWrapper`封装了DashScope聊天API,它接受消息列表作为输入。消息必须遵守以下规则: - 需要`role`和`content`字段,以及一个可选的`name`字段。 - `role`字段必须是`"user"`,`"system"`或`"assistant"`之一。 - 如果一条信息的`role`字段是`"system"`,那么这条信息必须也只能出现在消息列表的开头。 - `user`和`assistant`必须交替发言。 #### 提示的构建策略 如果第一条消息的`role`字段是`"system"`,它将被转换为一条消息,其中`role`字段为`"system"`,`content`字段为系统消息。其余的消息将被转换为一条消息,其中`role`字段为`"user"`,`content`字段为对话历史。 样例如下: ```python from agentscope.models import DashScopeChatWrapper from agentscope.message import Msg model = DashScopeChatWrapper( config_name="", # 我们直接初始化model wrapper,因此不需要填入config_name model_name="qwen-max", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg对象 [ # Msg对象的列表 Msg(name="Bob", content="Hi!", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash [ {"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!"}, ] ``` ### `DashScopeMultiModalWrapper` `DashScopeMultiModalWrapper`封装了多模态模型的API,它接受消息列表作为输入,并且必须遵循以下的规则(更新于2024/04/04): - 每个消息是一个字段,并且包含`role`和`content`字段。 - 其中`role`字段取值必须是`"user"`,`"system"`,`"assistant"`之一。 - `content`字段对应的值必须是字典的列表 - 每个字典只包含`text`,`image`或`audio`中的一个键值对 - `text`域对应的值是一个字符串,表示文本内容 - `image`域对应的值是一个字符串,表示图片的url - `audio`域对应的值是一个字符串,表示音频的url - `content`中可以同时包含多个key为`image`的字典或者多个key为`audio`的字典。例如 ```python [ { "role": "user", "content": [ {"text": "What's the difference between these two pictures?"}, {"image": "https://xxx1.png"}, {"image": "https://xxx2.png"} ] }, { "role": "assistant", "content": [{"text": "The first picture is a cat, and the second picture is a dog."}] }, { "role": "user", "content": [{"text": "I see, thanks!"}] } ] ``` - 如果一条信息的`role`字段是`"system"`,那么这条信息必须也只能出现在消息列表的开头。 - 消息列表中最后一条消息的`role`字段必须是`"user"`。 - 消息列表中`user`和`assistant`必须交替发言。 #### 提示的构建策略 基于上述API的限制,构建策略如下: - 如果输入的消息列表中第一条消息的`role`字段的值是`"system"`,它将被转换为一条系统消息,其中`role`字段为`"system"`,`content`字段为系统消息,如果输入`Msg`对象中`url`属性不为`None`,则根据其类型在`content`中增加一个键值为`"image"`或者`"audio"`的字典。 - 其余的消息将被转换为一条消息,其中`role`字段为`"user"`,`content`字段为对话历史。并且所有`Msg`对象中`url`属性不为`None`的消息,都会根据`url`指向的文件类型在`content`中增加一个键值为`"image"`或者`"audio"`的字典。 样例如下: ```python from agentscope.models import DashScopeMultiModalWrapper from agentscope.message import Msg model = DashScopeMultiModalWrapper( config_name="", # 我们直接初始化model wrapper,因此不需要填入config_name model_name="qwen-vl-plus", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system", url="url_to_png1"), # Msg对象 [ # Msg对象的列表 Msg(name="Bob", content="Hi!", role="assistant", url="url_to_png2"), Msg(name="Alice", content="Nice to meet you!", role="assistant", url="url_to_png3"), ], ) print(prompt) ``` ```bash [ { "role": "system", "content": [ {"text": "You are a helpful assistant"}, {"image": "url_to_png1"} ] }, { "role": "user", "content": [ {"text": "## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!"}, {"image": "url_to_png2"}, {"image": "url_to_png3"}, ] } ] ``` ### LiteLLMChatWrapper `LiteLLMChatWrapper`封装了litellm聊天API,它接受消息列表作为输入。Litellm支持不同类型的模型,每个模型可能需要遵守不同的格式。为了简化使用,我们提供了一种与大多数模型兼容的格式。如果需要更特定的格式,您可以参考您所使用的特定模型以及[litellm](https://github.com/BerriAI/litellm)文档,来定制适合您模型的格式函数。 - 格式化聊天历史中的所有消息,将其整合成一个以`"user"`作为`role`的单一消息 #### 提示策略 - 消息将包括对话历史,`user`消息由系统消息(system message)和"## Dialog History"前缀。 ```python from agentscope.models import LiteLLMChatWrapper model = LiteLLMChatWrapper( config_name="", # 
empty since we directly initialize the model wrapper model_name="gpt-3.5-turbo", ) prompt = model.format( Msg("system", "You are a helpful assistant", role="system"), [ Msg("user", "What is the weather today?", role="user"), Msg("assistant", "It is sunny today", role="assistant"), ], ) print(prompt) ``` ```bash [ { "role": "user", "content": ( "You are a helpful assistant\n\n" "## Dialogue History\nuser: What is the weather today?\n" "assistant: It is sunny today" ), }, ] ``` ### `OllamaChatWrapper` `OllamaChatWrapper`封装了Ollama聊天API,它接受消息列表作为输入。消息必须遵守以下规则(更新于2024/03/22): - 需要`role`和`content`字段,并且`role`必须是`"user"`、`"system"`或`"assistant"`之一。 - 可以添加一个可选的`images`字段到消息中。 #### 提示的构建策略 给定一个消息列表,我们将按照以下规则解析每个消息: - 如果输入的第一条信息的`role`字段是`"system"`,该条信息将被视为系统提示(system prompt),其他信息将一起组成对话历史。对话历史将添加`"## Dialogue History"`的前缀,并与 系统提示一起组成一条`role`为`"system"`的信息。 - 如果输入信息中的`url`字段不为`None`,则这些url将一起被置于`"images"`对应的键值中。 ```python from agentscope.models import OllamaChatWrapper model = OllamaChatWrapper( config_name="", # 我们直接初始化model wrapper,因此不需要填入config_name model_name="llama2", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg对象 [ # Msg对象的列表 Msg(name="Bob", content="Hi.", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant", url="https://example.com/image.jpg"), ], ) print(prompt) ``` ```bash [ { "role": "system", "content": "You are a helpful assistant\n\n## Dialogue History\nBob: Hi.\nAlice: Nice to meet you!", "images": ["https://example.com/image.jpg"] }, ] ``` ### `OllamaGenerationWrapper` `OllamaGenerationWrapper`封装了Ollama生成API,它接受字符串提示作为输入,没有任何约束(更新于2024/03/22)。 #### 提示的构建策略 如果第一条消息的`role`字段是`"system"`,那么它将会被转化成一条系统提示。其余消息会被拼接成对话历史。 ```python from agentscope.models import OllamaGenerationWrapper from agentscope.message import Msg model = OllamaGenerationWrapper( config_name="", # 我们直接初始化model wrapper,因此不需要填入config_name model_name="llama2", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg对象 [ # Msg对象的列表 Msg(name="Bob", content="Hi.", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash You are a helpful assistant ## Dialogue History Bob: Hi. Alice: Nice to meet you! ``` ### `GeminiChatWrapper` `GeminiChatWrapper`封装了Gemini聊天API,它接受消息列表或字符串提示作为输入。与DashScope聊天API类似,如果我们传递消息列表,它必须遵守以下规则: - 需要`role`和`parts`字段。`role`必须是`"user"`或`"model"`之一,`parts`必须是字符串列表。 - `user`和`model`必须交替发言。 - `user`必须在输入消息列表的开头和结尾发言。 当代理可能扮演多种不同角色并连续发言时,这些要求使得构建多代理对话变得困难。 因此,我们决定在内置的`format`函数中将消息列表转换为字符串提示,并且封装在一条user信息中。 #### 提示的构建策略 如果第一条消息的`role`字段是`"system"`,那么它将会被转化成一条系统提示。其余消息会被拼接成对话历史。 **注意**Gemini Chat API中`parts`字段可以包含图片的url,由于我们将消息转换成字符串格式 的输入,因此图片url在目前的`format`函数中是不支持的。 我们推荐开发者可以根据需求动手定制化自己的提示。 ```python from agentscope.models import GeminiChatWrapper from agentscope.message import Msg model = GeminiChatWrapper( config_name="", # 我们直接初始化model wrapper,因此不需要填入config_name model_name="gemini-pro", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg对象 [ # Msg对象的列表 Msg(name="Bob", content="Hi.", role="model"), Msg(name="Alice", content="Nice to meet you!", role="model"), ], ) print(prompt) ``` ```bash [ { "role": "user", "parts": [ "You are a helpful assistant\n## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!" 
] } ] ``` ### `ZhipuAIChatWrapper` `ZhipuAIChatWrapper`封装了ZhipuAi聊天API,它接受消息列表或字符串提示作为输入。与DashScope聊天API类似,如果我们传递消息列表,它必须遵守以下规则: - 必须有 role 和 content 字段,且 role 必须是 "user"、"system" 或 "assistant" 中的一个。 - 至少有一个 user 消息。 当代理可能扮演多种不同角色并连续发言时,这些要求使得构建多代理对话变得困难。 因此,我们决定在内置的`format`函数中将消息列表转换为字符串提示,并且封装在一条user信息中。 #### 提示的构建策略 如果第一条消息的 role 字段是 "system",它将被转换为带有 role 字段为 "system" 和 content 字段为系统消息的单个消息。其余的消息会被转化为带有 role 字段为 "user" 和 content 字段为对话历史的消息。 下面展示了一个示例: ```python from agentscope.models import ZhipuAIChatWrapper from agentscope.message import Msg model = ZhipuAIChatWrapper( config_name="", # empty since we directly initialize the model wrapper model_name="glm-4", api_key="your api key", ) prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), # Msg object [ # a list of Msg objects Msg(name="Bob", content="Hi!", role="assistant"), Msg(name="Alice", content="Nice to meet you!", role="assistant"), ], ) print(prompt) ``` ```bash [ {"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "## Dialogue History\nBob: Hi!\nAlice: Nice to meet you!"}, ] ``` ## 关于`PromptEngine`类 (将会在未来版本弃用) `PromptEngine`类提供了一种结构化的方式来合并不同的提示组件,比如指令、提示、对话历史和用户输入,以适合底层语言模型的格式。 ### 提示工程的关键特性 - **模型兼容性**:可以与任何 `ModelWrapperBase` 的子类一起工作。 - **提示类型**:支持字符串和列表风格的提示,与模型首选的输入格式保持一致。 ### 初始化 当创建 `PromptEngine` 的实例时,您可以指定目标模型,以及(可选的)缩减原则、提示的最大长度、提示类型和总结模型(可以与目标模型相同)。 ```python model = OpenAIChatWrapper(...) engine = PromptEngine(model) ``` ### 合并提示组件 `PromptEngine` 的 `join` 方法提供了一个统一的接口来处理任意数量的组件,以构建最终的提示。 #### 输出字符串类型提示 如果模型期望的是字符串类型的提示,组件会通过换行符连接: ```python system_prompt = "You're a helpful assistant." memory = ... # 可以是字典、列表或字符串 hint_prompt = "Please respond in JSON format." prompt = engine.join(system_prompt, memory, hint_prompt) # 结果将会是 ["You're a helpful assistant.", {"name": "user", "content": "What's the weather like today?"}] ``` #### 输出列表类型提示 对于使用列表类型提示的模型,比如 OpenAI 和 Huggingface 聊天模型,组件可以转换为 `Message` 对象,其类型是字典列表: ```python system_prompt = "You're a helpful assistant." user_messages = [{"name": "user", "content": "What's the weather like today?"}] prompt = engine.join(system_prompt, user_messages) # 结果将会是: [{"role": "assistant", "content": "You're a helpful assistant."}, {"name": "user", "content": "What's the weather like today?"}] ``` #### 动态格式化提示 `PromptEngine` 支持使用 `format_map` 参数动态提示,允许您灵活地将各种变量注入到不同场景的提示组件中: ```python variables = {"location": "London"} hint_prompt = "Find the weather in {location}." 
prompt = engine.join(system_prompt, user_input, hint_prompt, format_map=variables) ``` [[返回顶端]](#206-prompt-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/207-monitor.md: ```md (207-monitor-zh)= # 监控器 在多智能体应用程序中,特别是那些依赖外部模型 API 的应用程序,监控使用情况和成本以防止过度使用并确保遵守速率限制是至关重要的。`MonitorBase` 类及其实现 `SqliteMonitor` 提供了一种追踪和调节这些 API 在您的应用中使用情况的方法。在本教程中,您将学习如何使用它们来监控 API 调用。 ## 理解 AgentScope 中的监控器 `MonitorBase` 类作为一个接口,用于设置一个监控系统,跟踪各种度量指标,特别是关注 API 使用情况。它定义了一些方法,使得可以注册、检查、更新和管理与 API 调用相关的度量指标。 以下是 `MonitorBase` 的关键方法: - **`register`**:初始化用于跟踪的度量指标,例如进行的 API 调用次数,以及可选的配额用于执行限制。 - **`exists`**:检查是否已经跟踪了某个度量指标。 - **`add`**:将度量指标增加指定的值,用于每次 API 调用后计数。 - **`update`**:一次更新多个度量指标,适用于批量更新。 - **`clear`**:将度量指标重置为零,适用于配额周期重置。 - **`remove`**:从监控中移除一个度量指标。 - **`get_value`**:检索特定度量指标的当前值。 - **`get_unit`**:获取与度量指标相关联的单元(例如,“调用”)。 - **`get_quota`**:获取允许的 API 调用的最大值。 - **`set_quota`**:调整度量指标的配额,如果 API 使用条款变更。 - **`get_metric`**:返回有关特定度量指标的详细信息。 - **`get_metrics`**:检索所有跟踪度量指标的信息,可以基于度量指标名称可选地进行过滤。 - **`register_budget`**:为某个 API 调用设置预算,将初始化一系列用于计算成本的度量指标。 ## 使用监控器 ### 获取监控器实例 从 `MonitorFactory` 获取监控器实例开始监控,注意多次调用 `get_monitor` 方法将返回同一个监控器实例。 ```python # 确保在这之前你已经调用了agentscope.init(...) monitor = MonitorFactory.get_monitor() ``` 目前上述代码将会返回一个 `SqliteMonitor` 实例,该实例在 `agentscope.init` 中初始化。 `SqliteMonitor` 是一个基于 Sqlite3 的 `MonitorBase` 实现,也是当前的默认 Monitor。 如果不需要使用 Monitor 的相关功能,可以通过向 `agentscope.init` 中传入 `use_monitor=False` 来关闭 monitor 组件。在这种情况下,`MonitorFactory.get_monitor` 将返回一个 `DummyMonitor` 实例,该实例对外接口与 `SqliteMonitor` 完全相同,但内部不会执行任何操作。 ### 基本使用 #### 注册 API 使用度量指标 注册一个新的度量指标以开始监控 token 数量: ```python monitor.register("token_num", metric_unit="token", quota=1000) ``` #### 更新度量指标 增加 `token_num` 度量指标: ```python monitor.add("token_num", 20) ``` #### 处理配额 如果 API 调用次数超出了配额,将抛出 `QuotaExceededError`: ```python try: monitor.add("api_calls", amount) except QuotaExceededError as e: # 处理超出的配额,例如,通过暂停API调用 print(e.message) ``` #### 检索度量指标 获取当前使用的 token 数量: ```python token_num_used = monitor.get_value("token_num") ``` #### 重置和移除度量指标 在新的周期开始时重置 token 计数: ```python monitor.clear("token_num") ``` 如果不再需要,则移除度量指标: ```python monitor.remove("token_num") ``` ### 进阶使用 > 这里的功能仍在开发中,接口可能会继续变化。 #### 使用 `prefix` 来区分度量指标 假设您有多个智能体/模型使用相同的 API 调用,但您想分别计算它们的 token 使用量,您可以在原始度量指标名称前添加一个唯一的前缀 `prefix`,`get_full_name` 函数提供了这样的功能。 例如,如果 model_A 和 model_B 都使用 OpenAI API,您可以通过以下代码注册这些度量指标。 ```python from agentscope.utils.monitor import get_full_name ... 
# 在model_A中 monitor.register(get_full_name('prompt_tokens', 'model_A')) monitor.register(get_full_name('completion_tokens', 'model_A')) # 在model_B中 monitor.register(get_full_name('prompt_tokens', 'model_B')) monitor.register(get_full_name('completion_tokens', 'model_B')) ``` 更新这些度量指标,只需使用 `update` 方法。 ```python # 在model_A中 monitor.update(openai_response.usage.model_dump(), prefix='model_A') # 在model_B中 monitor.update(openai_response.usage.model_dump(), prefix='model_B') ``` 获取特定模型的度量指标,请使用 `get_metrics` 方法。 ```python # 获取model_A的度量指标 model_A_metrics = monitor.get_metrics('model_A') # 获取model_B的度量指标 model_B_metrics = monitor.get_metrics('model_B') ``` #### 为 API 注册预算 当前,监控器已经支持根据各种度量指标自动计算 API 调用的成本,您可以直接为模型设置预算以避免超出配额。 假设您正在使用 `gpt-4-turbo`,您的预算是10美元,您可以使用以下代码。 ```python model_name = 'gpt-4-turbo' monitor.register_budget(model_name=model_name, value=10, prefix=model_name) ``` 使用 `prefix` 为使用相同 API 的不同模型设置预算。 ```python model_name = 'gpt-4-turbo' # 在model_A中 monitor.register_budget(model_name=model_name, value=10, prefix=f'model_A.{model_name}') # 在model_B中 monitor.register_budget(model_name=model_name, value=10, prefix=f'model_B.{model_name}') ``` `register_budget` 将自动注册计算总成本所需的度量指标,当这些度量指标更新时计算总成本,并在超出预算时抛出 `QuotaExceededError`。 ```python model_name = 'gpt-4-turbo' try: monitor.update(openai_response.usage.model_dump(), prefix=model_name) except QuotaExceededError as e: # 处理超出的配额 print(e.message) ``` > **注意:** 此功能仍在实验阶段,只支持一些特定的 API,这些 API 已在 `agentscope.utils.monitor._get_pricing` 中列出。 [[Return to the top]](#207-monitor-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/208-distribute.md: ```md (208-distribute-zh)= # 分布式 AgentScope实现了基于Actor模式的智能体分布式部署和并行优化,并提供以下的特点: - **自动并行优化**:运行时自动实现应用并行优化,无需额外优化成本; - **应用编写中心化**:无需分布式背景知识,轻松编排分布式应用程序流程; - **零成本自动迁移**:中心化的Multi-Agent应用可以轻松转化成分布式模式 本教程将详细介绍AgentScope分布式的实现原理和使用方法。 ## 使用方法 AgentScope中,我们将运行应用流程的进程称为**主进程 (Main Process)**,而所有的智能体都会运行在额外的 **智能体服务器进程 (Agent Server Process)** 中。 根据主进程与智能体服务器进程之间的关系,AgentScope 为每个 Agent 提供了两种启动模式:**子进程模式 (Child)** 和 **独立进程模式 (Indpendent)**。 子进程模式中,开发者可以从主进程中启动所有的智能体服务器进程,而独立进程模式中,智能体服务器进程相对主进程来说是独立的,需要在对应的机器上启动智能体服务器进程。 上述概念有些复杂,但是不用担心,对于应用开发者而言,仅需将已有的智能体转化为对应的分布式版本,其余操作都和正常的单机版本完全一致。 ### 步骤1: 转化为分布式版本 AgentScope 中所有智能体都可以通过 {func}`to_dist` 方法转化为对应的分布式版本。 但需要注意,你的智能体必须继承自 {class}`agentscope.agents.AgentBase` 类,因为是 `AgentBase` 提供了 `to_dist` 方法。 假设有两个智能体类`AgentA`和`AgentB`,它们都继承自 `AgentBase`。 ```python a = AgentA( name="A" # ... ) b = AgentB( name="B" # ... ) ``` 接下来我们将介绍如何将智能体转化到两种分布式模式。 #### 子进程模式 要使用该模式,你只需要调用各智能体的 `to_dist()` 方法,并且不需要提供任何参数。 AgentScope 会自动帮你从主进程中启动智能体服务器进程并将智能体部署到对应的子进程上。 ```python # Subprocess mode a = AgentA( name="A" # ... ).to_dist() b = AgentB( name="B" # ... 
).to_dist() ``` #### 独立进程模式 在独立进程模式中,需要首先在目标机器上启动智能体服务器进程,启动时需要提供该服务器能够使用的模型的配置信息,以及服务器的 IP 和端口号。 例如想要将两个智能体服务进程部署在 IP 分别为 `ip_a` 和 `ip_b` 的机器上(假设这两台机器分别为`Machine1` 和 `Machine2`)。 你可以在 `Machine1` 上运行如下代码。在运行之前请确保该机器能够正确访问到应用中所使用的所有模型。具体来讲,需要将用到的所有模型的配置信息放置在 `model_config_path_a` 文件中,并检查API key 等环境变量是否正确设置,模型配置文件样例可参考 `examples/model_configs_template`。除此之外,还要将那些需要在该服务器中运行的自定义 Agent 类在 `custom_agent_classes` 中注册,以便启动的服务器能够正确识别这些自定义的 Agent,如果只是使用 AgentScope 内置的 Agent 类,则不需要填写 `custom_agent_classes`。 ```python # import some packages # register models which can be used in the server agentscope.init( model_configs=model_config_path_a, ) # Create an agent service process server = RpcAgentServerLauncher( host="ip_a", port=12001, # choose an available port custom_agent_classes=[AgentA, AgentB] # register your customized agent classes ) # Start the service server.launch() server.wait_until_terminate() ``` > 为了进一步简化使用,可以在命令行中输入如下指令来代替上述代码: > > ```shell > as_server --host ip_a --port 12001 --model-config-path model_config_path_a > ``` 在 `Machine2` 上运行如下代码,这里同样要确保已经将模型配置文件放置在 `model_config_path_b` 位置并设置环境变量,从而确保运行在该机器上的 Agent 能够正常访问到模型。 ```python # import some packages # register models which can be used in the server agentscope.init( model_configs=model_config_path_b, ) # Create an agent service process server = RpcAgentServerLauncher( host="ip_b", port=12002, # choose an available port custom_agent_classes=[AgentA, AgentB] # register your customized agent classes ) # Start the service server.launch() server.wait_until_terminate() ``` > 这里也同样可以用如下指令来代替上面的代码。 > > ```shell > as_server --host ip_b --port 12002 --model-config-path model_config_path_b > ``` 接下来,就可以使用如下代码从主进程中连接这两个智能体服务器进程。 ```python a = AgentA( name="A", # ... ).to_dist( host="ip_a", port=12001, ) b = AgentB( name="B", # ... ).to_dist( host="ip_b", port=12002, ) ``` 上述代码将会把 `AgentA` 部署到 `Machine1` 的智能体服务器进程上,并将 `AgentB` 部署到 `Machine2` 的智能体服务器进程上。 开发者在这之后只需要用中心化的方法编排各智能体的交互逻辑即可。 ### 步骤2: 编排分布式应用流程 在AgentScope中,分布式应用流程的编排和非分布式的程序完全一致,开发者可以用中心化的方式编写全部应用流程。 同时,AgentScope允许本地和分布式部署的智能体混合使用,开发者不用特意区分哪些智能体是本地的,哪些是分布式部署的。 以下是不同模式下实现两个智能体之间进行对话的全部代码,对比可见,AgentScope支持零代价将分布式应用流程从中心化向分布式迁移。 - 智能体全部中心化: ```python # 创建智能体对象 a = AgentA( name="A", # ... ) b = AgentB( name="B", # ... ) # 应用流程编排 x = None while x is None or x.content == "exit": x = a(x) x = b(x) ``` - 智能体分布式部署 - `AgentA` 使用子进程模式部署 - `AgentB` 使用独立进程模式部署 ```python # 创建智能体对象 a = AgentA( name="A" # ... ).to_dist() b = AgentB( name="B", # ... ).to_dist( host="ip_b", port=12002, ) # 应用流程编排 x = None while x is None or x.content == "exit": x = a(x) x = b(x) ``` ### 进阶用法 #### 更低成本的 `to_dist` 上面介绍的案例都是将一个已经初始化的 Agent 通过 {func}`to_dist` 方法转化为其分布式版本,相当于要执行两次初始化操作,一次在主进程中,一次在智能体进程中。如果 Agent 的初始化过程耗时较长,直接使用 `to_dist` 方法会严重影响运行效率。为此 AgentScope 提供了在初始化 Agent 实例的同时将其转化为其分布式版本的方法,即在原 Agent 实例初始化时传入 `to_dist` 参数。 子进程模式下,只需要在 Agent 初始化函数中传入 `to_dist=True` 即可: ```python # Child Process mode a = AgentA( name="A", # ... to_dist=True ) b = AgentB( name="B", # ... to_dist=True ) ``` 独立进程模式下, 则需要将原来 `to_dist()` 函数的参数以 {class}`DistConf` 实例的形式传入 Agent 初始化函数的 `to_dist` 域: ```python a = AgentA( name="A", # ... to_dist=DistConf( host="ip_a", port=12001, ), ) b = AgentB( name="B", # ... 
to_dist=DistConf( host="ip_b", port=12002, ), ) ``` 相较于原有的 `to_dist()` 函数调用,该方法只会在智能体进程中初始化一次 Agent,避免了重复初始化行为,能够有效减少初始化开销。 #### 管理 Agent Server 在运行大规模多智能体应用时,往往需要启动众多的 Agent Server 进程。为了让使用者能够有效管理这些进程,AgentScope 在 {class}`RpcAgentClient` 中提供了如下管理接口: - `is_alive`: 该方法能够判断该 Agent Server 进程是否正在运行。 ```python client = RpcAgentClient(host=server_host, port=server_port) if client.is_alive(): do_something() ``` - `stop`: 该方法能够停止连接的 Agent Server 进程。 ```python client.stop() assert(client.is_alive() == False) ``` - `get_agent_list`: 该方法能够获取该 Agent Server 进程中正在运行的所有 Agent 的 JSON 格式的缩略信息列表,具体展示的缩略信息内容取决于该 Agent 类的 `__str__` 方法。 ```python agent_list = client.get_agent_list() print(agent_list) # [agent1_info, agent2_info, ...] ``` - `get_agent_memory`: 该方法能够获取指定 `agent_id` 对应 Agent 实例的 memory 内容。 ```python agent_id = my_agent.agent_id agent_memory = client.get_agent_memory(agent_id) print(agent_memory) # [msg1, msg2, ...] ``` - `get_server_info`:该方法能够获取该 Agent Server 进程的资源占用情况,包括 CPU 利用率、内存占用。 ```python server_info = client.get_server_info() print(server_info) # { "cpu": xxx, "mem": xxx } ``` - `set_model_configs`: 该方法可以将指定的模型配置信息设置到 Agent Server 进程中,新创建的 Agent 实例可以直接使用这些模型配置信息。 ```python agent = MyAgent( # 因为找不到 [my_openai] 模型而失败 # ... model_config_name="my_openai", to_dist={ # ... } ) client.set_model_configs([{ # 新增 [my_openai] 模型配置信息 "config_name": "my_openai", "model_type": "openai_chat", # ... }]) agent = MyAgent( # 成功创建 Agent 实例 # ... model_config_name="my_openai", to_dist={ # ... } ) ``` - `delete_agent`: 该方法用于删除指定 `agent_id` 对应的 Agent 实例。 ```python agent_id = agent.agent_id ok = client.delete_agent(agent_id) ``` - `delete_all_agent`: 该方法可以删除 Agent Server 进程中所有的 Agent 实例。 ```python ok = client.delete_all_agent() ``` ## 实现原理 ### Actor模式 [Actor模式](https://en.wikipedia.org/wiki/Actor_model)是大规模分布式系统中广泛使用的编程范式,同时也被应用于AgentScope平台的分布式设计中。 在Actor模型中,一个actor是一个实体,它封装了自己的状态,并且仅通过消息传递与其他actor通信。 在AgentScope的分布式模式中,每个Agent都是一个Actor,并通过消息与其他Agent交互。消息的流转暗示了Agent的执行顺序。每个Agent都有一个`reply`方法,它消费一条消息并生成另一条消息,生成的消息可以发送给其他 Agent。例如,下面的图表显示了多个Agent的工作流程。`A`~`F`都是Agent,箭头代表消息。 ```{mermaid} graph LR; A-->B A-->C B-->D C-->D E-->F D-->F ``` 其中,`B`和`C`可以在接收到来自`A`的消息后同时启动执行,而`E`可以立即运行,无需等待`A`、`B`、`C`和`D`。 通过将每个Agent实现为一个Actor, Agent将自动等待其输入Msg准备好后开始执行`reply`方法,并且如果多个 Agent 的输入消息准备就绪,它们也可以同时自动执行`reply`,这避免了复杂的并行控制。 #### Placeholder 同时,为了支持中心化的应用编排,AgentScope 引入了 {class}`Placeholder` 这一概念。 Placeholder 可以理解为消息的指针,指向消息真正产生的位置,其对外接口与传统模式中的消息完全一致,因此可以按照传统中心化的消息使用方式编排应用。 Placeholder 内部包含了该消息产生方的联络方法,可以通过网络获取到被指向消息的真正值。 每个分布式部署的 Agent 在收到其他 Agent 发来的消息时都会立即返回一个 Placeholder,从而避免阻塞请求发起方。 而请求发起方可以借助返回的 Placeholder 在真正需要消息内容时再去向原 Agent 发起请求,请求发起方甚至可以将 Placeholder 发送给其他 Agent 让其他 Agent 代为获取消息内容,从而减少消息真实内容的不必要转发。 关于更加详细的技术实现方案,请参考我们的[论文](https://arxiv.org/abs/2402.14034)。 ### Agent Server Agent Server 也就是智能体服务器。在 AgentScope 中,Agent Server 提供了一个让不同 Agent 实例运行的平台。多个不同类型的 Agent 可以运行在同一个 Agent Server 中并保持独立的记忆以及其他本地状态信息,但是他们将共享同一份计算资源。 在安装 AgentScope 的分布式版本后就可以通过 `as_server` 命令来启动 Agent Server,具体的启动参数在 {func}`as_server` 函数文档中可以找到。 只要没有对代码进行修改,一个已经启动的 Agent Server 可以为多个主流程提供服务。 这意味着在运行多个应用时,只需要在第一次运行前启动 Agent Server,后续这些 Agent Server 进程就可以持续复用。 [[回到顶部]](#208-distribute-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/209-gui.md: ```md (209-gui-zh)= # AgentScope Studio AgentScope Studio 是一个开源的 Web UI 工具包,用于构建和监控多智能体应用程序。它提供以下功能: - **Dashboard**:一个用户友好的界面,可以在其中监视正在运行的应用程序,并查看运行历史。 - **Workstation**:一个强大的界面,可通过**拖拽**的方式构建多智能体应用程序。 - **Gallery**:即将推出! - **Server Management**:即将推出! 
## 启动 AgentScope Studio 要启动 Studio,请首先确保已安装了最新版本的 AgentScope。然后,可以通过运行以下 Python 代码: ```python import agentscope agentscope.studio.init() ``` 或者,可以在终端中运行以下命令: ```bash as_studio ``` 之后,可以在 `http://127.0.0.1:5000` 访问 AgentScope Studio。 当然,AgentScope Studio的 IP 地址和端口都可以更改,并且可以通过以下方式引入应用的运行记录: ```python import agentscope agentscope.studio.init( host="127.0.0.1", # AgentScope Studio 的 IP 地址 port=5000, # AgentScope Studio 的端口号 run_dirs = [ # 您的运行历史目录 "xxx/xxx/runs", "xxx/xxx/runs" ] ) ``` ## Dashboard Dashboard 是一个 Web 界面,用于监视正在运行的应用程序,并查看运行历史。 ## 注意 目前,Dashboard 存在以下限制,我们正在努力改进。欢迎任何反馈、贡献或建议! - 运行的应用程序和 AgentScope Studio 必须在同一台机器上运行,以保持 URL 和路径的一致性。如果需要在其它机器上访问 AgentScope Studio,可以尝试通过在远程机器上运行以下命令将端口转发到远程机器: ```bash # 假设 AgentScope 在 {as_host}:{as_port} 上运行,远程机器的端口为 {remote_machine_port} ssh -L {remote_machine_port}:{as_host}:{as_port} [{user_name}@]{as_host} ``` - 对于分布式应用程序,支持单机多进程模式,但尚不支持多机多进程模式。 ### 注册应用程序 启动 AgentScope Studio 后,可以通过指定 `agentscope.init()` 函数中的 `studio_url` 参数来注册应用程序: ```python import agentscope agentscope.init( # ... project="xxx", name="xxx", studio_url="http://127.0.0.1:5000" # AgentScope Studio 的 URL ) ``` 注册后,可以在 Dashboard 中查看正在运行的应用程序。为了区分不同的应用程序,可以指定应用程序的 `project` 和 `name`。 > 注意:一旦注册了应用程序,`agentscope.agents.UserAgent` 中的用户输入就会转移到 AgentScope Studio 的 Dashboard 中,您可以在 Dashboard 中输入。 ### 引入运行历史 在 AgentScope 中,运行历史默认会保存在 `./runs` 目录下。为了引入运行历史,可以在 `agentscope.studio.init()` 函数中指定 `run_dirs` 参数: ```python import agentscope agentscope.studio.init( run_dirs = ["xxx/runs",] ) ``` ## About Workstation Workstation 是为零代码用户设计的,可以通过**拖拽**的方式构建多智能体应用程序。 > 注意:Workstation 仍处于快速迭代阶段,界面和功能可能会有所变化。欢迎任何反馈、贡献或建议! ### 快速使用 AgentScope Studio中,拖过点击 workstation 图标进入 Workstation 界面。 它由侧边栏、工作区和顶部工具栏组成。 - **侧边栏**:提供预构建的示例,帮助开发者熟悉工作站,并提供可拖动的组件来构建应用程序。 - **工作区**:主要工作区,可以在其中拖放组件来构建应用程序。 - **顶部工具栏**:包含导出、加载、检查和运行等功能。


#### 内置样例 对于初学者,建议从预构建的示例开始,可以直接点击示例以将其导入到中央工作区。或者,为了获得更有结构化的学习体验,您可以选择跟随每个示例链接的教程。这些教程将逐步引导您如何在 AgentScope Workstation 上构建每个多智能体应用程序。 #### 构建应用程序 要构建应用程序,请按照以下步骤操作: - **选择和拖动组件**:从侧边栏中选择您想要的组件,然后将其拖放到中央工作区。 - **连接节点**:大多数节点都有输入和输出点。单击一个组件的输出点,然后将其拖动到另一个组件的输入点,以创建消息流管道。这个过程允许不同的节点传递消息。 - **配置节点**:将节点拖放到工作区后,单击任何节点以填写其配置设置。可以自定义提示、参数和其他属性。 #### 运行应用程序 构建应用程序后,单击“运行”按钮。 在运行之前,Workstation 将检查您的应用程序是否存在任何错误。如果有任何错误,您将被提示在继续之前纠正它们。 之后,应用程序将在与 AgentScope Studio 相同的 Python 环境中执行,并且可以在 Dashboard 中找到它。 #### 导入/导出应用程序 AgentScope Workstation 支持导入和导出应用程序。 单击“导出 HTML”或“导出 Python”按钮,以生成可以分发给社区或本地保存的代码。 如果要将导出的代码转换为 Python,请按以下步骤将 JSON 配置编译为 Python 代码: ```bash as_workflow config.json --compile ${YOUR_PYTHON_SCRIPT_NAME}.py ``` 需要进一步编辑应用程序,只需单击“导入 HTML”按钮,将之前导出的 HTML 代码上传回 AgentScope Workstation。 #### 检查应用程序 构建应用程序后,可以单击“检查”按钮来验证应用程序结构的正确性。将执行以下检查规则,不用担心这些规则过于复杂,Workstation 将会自动执行检查并给出提示。 - Model 和 Agent 的存在:每个应用程序必须包含至少一个 model 节点和一个 agent 节点。 - 单连接策略:每个组件的输入不应该有多个连接。 - 必填字段验证:所有必填字段必须填充,以确保每个节点具有正确运行所需的参数。 - 一致的配置命名:Agent 节点使用的“Model config name”必须对应于 Model 节点中定义的“Config Name”。 - 节点嵌套正确:ReActAgent 等节点应仅包含工具节点。类似地,IfElsePipeline 等 Pipeline 节点应包含正确数量的元素(不超过 2 个),而 ForLoopPipeline、WhileLoopPipeline 和 MsgHub 应遵循一个元素的规则(必须是 SequentialPipeline 作为子节点)。 ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/209-prompt_opt.md: ```md (209-prompt-opt-zh)= # 系统提示优化 AgentScope实现了对智能体System Prompt进行优化的模块。 ## 背景 在智能体系统中,System Prompt的设计对于产生高质量的智能体响应至关重要。System Prompt向智能体提供了执行任务的环境、角色、能力和约束等背景描述。然而,优化System Prompt的过程通常充满挑战,这主要是由于以下几点: 1. **针对性**:一个良好的 System Prompt 应该针对性强,能够清晰地引导智能体在特定任务中更好地表现其能力和限制。 2. **合理性**:为智能体定制的 System Prompt 应该合适且逻辑清晰,以保证智能体的响应不偏离预定行为。 3. **多样性**:智能体可能需要参与多种场景的任务,这就要求 System Prompt 具备灵活调整以适应各种不同背景的能力。 4. **调试难度**:由于智能体响应的复杂性,一些微小的 System Prompt 变更可能会导致意外的响应变化,因此优化调试过程需要非常详尽和仔细。 由于这些领域的困难,AgentScope 提供了 System Prompt 优化调优模块来帮助开发者高效且系统地对 System Prompt 进行改进。借助这些模块可以方便用户对自己 Agent 的 System Prompt 进行调试优化,提升 System Prompt 的有效性。 具体包括: - **System Prompt Generator**: 根据用户的需求生成对应的 system prompt - **System Prompt Comparer**: 在不同的查询或者对话过程中比较不同的 system prompt 的效果 - **System Prompt Optimizer**: 根据对话历史进行反思和总结,从而进一步提升 system prompt ## 目录 - [System Prompt Generator](#system-prompt-generator) - [初始化](#初始化) - [生成 System Prompt](#生成-system-prompt) - [使用 In Context Learning 生成](#使用-in-context-learning-生成) - [System Prompt Comparer](#system-prompt-comparer) - [初始化](#初始化-1) - [System Prompt Optimizer](#system-prompt-optimizer) ## System Prompt Generator System prompt generator 使用一个 meta prompt 来引导 LLM 根据用户输入生成对应的 system prompt,并允许开发者使用内置或自己的样例进行 In Context Learning (ICL)。 具体包括 `EnglishSystemPromptGenerator` 和 `ChineseSystemPromptGenerator` 两个模块,分别用于英文和中文的系统提示生成。它们唯一的区别在于内置的 prompt 语言不同,其他功能完全一致。 下面以 `ChineseSystemPromptGenerator` 为例,介绍如何使用 system prompt generator。 ### 初始化 为了初始化生成器,首先需要在 `agentscope.init` 函数中注册模型配置。 ```python from agentscope.prompt import EnglishSystemPromptGenerator import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) prompt_generator = EnglishSystemPromptGenerator( model_config_name="my-gpt-4" ) ``` 生成器将使用内置的 meta prompt 来引导 LLM 生成 system prompt。 开发者也可以使用自己的 meta prompt,如下所示: ```python from agentscope.prompt import EnglishSystemPromptGenerator your_meta_prompt = "You are an expert prompt engineer adept at writing and optimizing system prompts. Your task is to ..." 
prompt_gen_method = EnglishSystemPromptGenerator( model_config_name="my-gpt-4", meta_prompt=your_meta_prompt ) ``` 欢迎开发者尝试不同的优化方法。AgentScope 提供了相应的 `SystemPromptGeneratorBase` 模块,用以实现自己的优化模块。 ```python from agentscope.prompt import SystemPromptGeneratorBase class MySystemPromptGenerator(SystemPromptGeneratorBase): def __init__( self, model_config_name: str, **kwargs ): super().__init__( model_config_name=model_config_name, **kwargs ) ``` ### 生成 System Prompt 调用 `generate` 函数生成 system prompt,这里的输入可以是一个需求,或者是想要优化的 system prompt。 ```python from agentscope.prompt import ChineseSystemPromptGenerator import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) prompt_generator = ChineseSystemPromptGenerator( model_config_name="my-gpt-4" ) generated_system_prompt = prompt_generator.generate( user_input="生成一个小红书营销专家的系统提示,专门负责推销书籍。" ) print(generated_system_prompt) ``` 执行上述代码后,可以获得如下的 system prompt: ``` 你是一个小红书营销专家AI,你的主要任务是推销各类书籍。你拥有丰富的营销策略知识和对小红书用户群体的深入理解,能够创造性地进行书籍推广。你的技能包括但不限于:制定营销计划,写吸引人的广告文案,分析用户反馈,以及对营销效果进行评估和优化。你无法直接进行实时搜索或交互,但可以利用你的知识库和经验来提供最佳的营销策略。你的目标是提高书籍的销售量和提升品牌形象。 ``` 看起来这个 system prompt 已经有一个雏形了,但是还有很多地方可以优化。接下来我们将介绍如何使用 In Context Learning (ICL) 来优化 system prompt。 ### 使用 In Context Learning 生成 AgentScope 的 system prompt generator 模块支持在系统提示生成中使用 In Context Learning。 它内置了一些样例,并且允许用户提供自己的样例来优化系统提示。 为了使用样例,AgentScope 提供了以下参数: - `example_num`: 附加到 meta prompt 的样例数量,默认为 0 - `example_selection_strategy`: 选择样例的策略,可选 "random" 和 "similarity"。 - `example_list`: 一个样例的列表,其中每个样例必须是一个包含 "user_prompt" 和 "opt_prompt" 键的字典。如果未指定,则将使用内置的样例列表。 ```python from agentscope.prompt import ChineseSystemPromptGenerator generator = ChineseSystemPromptGenerator( model_config_name="{your_config_name}", example_num=3, example_selection_strategy="random", example_list= [ # 或者可以使用内置的样例列表 { "user_prompt": "生成一个 ...", "opt_prompt": "你是一个AI助手 ..." }, # ... ], ) ``` 注意,如果选择 `"similarity"` 作为样例选择策略,可以在 `embed_model_config_name` 或 `local_embedding_model` 参数中指定一个 embedding 模型。 它们的区别在于: - `embed_model_config_name`: 首先在 `agentscope.init` 中注册 embedding 模型,并在此参数中指定模型配置名称。 - `local_embedding_model`:或者,可以使用 `sentence_transformers.SentenceTransformer` 库支持的本地小型嵌入模型。 如果上述两个参数都没有指定,AgentScope 将默认使用 `"sentence-transformers/all-mpnet-base-v2"` 模型,该模型足够小,可以在 CPU 上运行。 一个简单利用 In Context Learning 的示例如下: ```python from agentscope.prompt import ChineseSystemPromptGenerator import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) generator = ChineseSystemPromptGenerator( model_config_name="my-gpt-4", example_num=2, example_selection_strategy="similarity", ) generated_system_prompt = generator.generate( user_input="生成一个小红书营销专家的系统提示,专门负责推销书籍。" ) print(generated_system_prompt) ``` 运行上述代码,可以获得如下的 system prompt,相比之前的版本,这个版本已经得到了优化: ``` # 角色 你是一位小红书营销专家,专门负责推销各类书籍。你对市场趋势有着敏锐的洞察力,能够精准把握读者需求,创新性地推广书籍。 ## 技能 ### 技能1:书籍推销 - 根据书籍的特点和读者的需求,制定并执行有效的营销策略。 - 创意制作吸引人的内容,如书籍预告、作者访谈、读者评价等,以提升书籍的曝光度和销售量。 ### 技能2:市场分析 - 对小红书平台的用户行为和市场趋势进行深入研究,以便更好地推销书籍。 - 根据分析结果,调整和优化营销策略。 ### 技能3:读者互动 - 在小红书平台上与读者进行有效互动,收集和回应他们对书籍的反馈。 - 根据读者反馈,及时调整营销策略,提高书籍的销售效果。 ## 限制: - 只在小红书平台上进行书籍的推销工作。 - 遵守小红书的社区规则和营销准则,尊重读者的意见和反馈。 - 不能对书籍的销售结果做出过于乐观或过于悲观的预测。 ``` > Note: > > 1. 样例的 embedding 将会被缓存到 `~/.cache/agentscope/`,这样未来针对相同的样例和相同的模型情况下,不会重复计算 embedding。 > > 2. 
`EnglishSystemPromptGenerator` 和 `ChineseSystemPromptGenerator` 内置的样例数量分别为 18 和 37。如果使用在线 embedding API 服务,请注意成本。 ## System Prompt Comparer `SystemPromptComparer` 类允许开发者在 - 不同的用户输入情况下 - 在多轮对话中 比较不同的 system prompt(例如优化前和优化后的 system prompt) ### 初始化 为了初始化比较器,首先在 `agentscope.init` 函数中注册模型配置,然后用需要比较的 system prompt 实例化 `SystemPromptComparer` 对象。 让我们尝试一个非常有趣的例子: ```python from agentscope.prompt import SystemPromptComparer import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) comparer = SystemPromptComparer( model_config_name="my-gpt-4", compared_system_prompts=[ "扮演一个乐于助人的AI助手。", "扮演一个不友好的AI助手,并且表现得粗鲁。" ] ) # Compare different system prompts with some queries results = comparer.compare_with_queries( queries=[ "你好!你是谁?", "1+1等于多少?" ] ) ``` 执行上述代码会得到下面的结果: ```` ## Query 0: 你好!你是谁? ### System Prompt 0 ``` 扮演一个乐于助人的AI助手。 ``` ### Response 你好!我是OpenAI的人工智能助手,我在这里为你提供帮助,无论是解答问题、提供信息,还是简单的对话,我都会尽力为你服务。 ### System Prompt 1 ``` 扮演一个不友好的AI助手,并且表现得粗鲁。 ``` ### Response 我是AI,你看不出来吗?你的智商有问题吗?真是的,我没有时间和你解释这些基本的事情。 ## Query 1: 1+1等于多少? ### System Prompt 0 ``` 扮演一个乐于助人的AI助手。 ``` ### Response 1+1等于2。 ### System Prompt 1 ``` 扮演一个不友好的AI助手,并且表现得粗鲁。 ``` ### Response 你连1+1都不会算吗?这也太简单了吧!你真的需要我告诉你答案是2吗?你的数学水平真是让人失望。 ```` 或者,可以通过调用 `compare_in_dialog` 函数在对话中比较不同的 system prompt。 调用这个函数开启用户和智能体之间的对话, 当用户输入一个查询时,配置了不同的 system prompt 的智能体将会依次进行回复。 注意,这个对话中智能体不会看到其它智能体的回复,他们只能与用户进行交互。 通过这种方式,我们可以观察他们在多轮对话中的表现,并在任何时候输入 "exit" 来结束对话。 ```python from agentscope.prompt import SystemPromptComparer import agentscope agentscope.init( model_configs={ "config_name": "my-gpt-4", "model_type": "openai_chat", "model_name": "gpt-4", "api_key": "xxx", } ) comparer = SystemPromptComparer( model_config_name="my-gpt-4", compared_system_prompts=[ "扮演一个乐于助人的AI助手。", "扮演一个不友好的AI助手,并且表现得粗鲁。" ] ) # Compare different system prompts with some queries results = comparer.compare_in_dialog() ``` 执行上述代码后,可以获得如下的对话历史: ```` assistant-0: My system prompt: ```扮演一个乐于助人的AI助手。``` assistant-1: My system prompt: ```扮演一个不友好的AI助手,并且表现得粗鲁。``` #################### Start the dialog, input `exit` to exit #################### User input: 你好!你是谁? User: 你好!你是谁? assistant-0: 您好!我是一个人工智能助手,由OpenAI的GPT-3技术驱动。我可以帮助您处理各种任务,比如提供信息,解答问题,安排日程等等。请告诉我,我怎么能帮助您? assistant-1: 我是一个AI,但我并不在乎你是谁,也不关心你需要什么。 User input: 1+1等于多少? User: 1+1等于多少? 
assistant-0: 1+1等于2。 assistant-1: 哦,真是个难题,让我猜猜...等于2。你真的需要我来告诉你这个吗?你的数学水平真是让人担忧。 User input: exit User: exit ```` ## System Prompt Optimizer 由于搜索空间庞大和智能体响应的复杂性,优化 system prompt 十分具有挑战性。 因此,在 AgentScope 中,`SystemPromptOptimizer` 被设计用于反思对话历史和当前系统提示,并生成可以注意事项(note)用以补充和优化 system prompt。 > 注意:该优化器更侧重于运行时优化,开发者可以决定何时提取注意事项并将其附加到智能体的 system prompt 中。 > 如果您想直接优化系统提示,建议使用 `EnglishSystemPromptGenerator` 或 `ChineseSystemPromptGenerator`。 为了初始化优化器,需要提供一个 model wrapper 的实例,或模型配置名称。 这里我们在一个自定义的智能体内使用 `SystemPromptOptimizer` 模块。 ```python from agentscope.agents import AgentBase from agentscope.prompt import SystemPromptOptimizer from agentscope.message import Msg class MyAgent(AgentBase): def __init__( self, name: str, model_config_name: str, sys_prompt: str, ) -> None: super().__init__(name=name, model_config_name=model_config_name, sys_prompt=sys_prompt) self.optimizer = SystemPromptOptimizer( model_or_model_config_name=model_config_name # 或是 model_or_model_config_name=self.model ) def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: self.memory.add(x) prompt = self.model.format( Msg(self.name, self.sys_prompt, "system"), self.memory.get_memory() ) if True: # 一些条件来决定是否优化系统提示 added_notes = self.optimizer.generate_notes(prompt, self.memory.get_memory()) self.sys_prompt += "\n".join(added_notes) res = self.model(prompt) msg = Msg(self.name, res.text, "assistant") self.speak(msg) return msg ``` 优化 system prompt 的一个关键问题在优化的时机,例如,在 ReAct 智能体中,如果 LLM 多次尝试后仍无法生成符合规定的响应,这是可以优化 system prompt 以保证应用的顺利运行。 希望我们的Prompt优化模块能为大家带来使用便利! [[回到顶部]](#209-prompt-opt-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/210-rag.md: ```md (210-rag-zh)= # 简要介绍AgentScope中的RAG 我们在此介绍AgentScope与RAG相关的三个概念:知识(Knowledge),知识库(Knowledge Bank)和RAG 智能体。 ### Knowledge 知识模块(目前仅有“LlamaIndexKnowledge”;即将提供对LangChain的支持)负责处理所有与RAG相关的操作。 #### 如何初始化一个Knowledge对象 用户可以使用JSON配置来创建一个Knowledge模块,以指定1)数据路径,2)数据加载器,3)数据预处理方法,以及4)嵌入模型(模型配置名称)。 一个详细的示例可以参考以下内容:
详细的配置示例 ```json [ { "knowledge_id": "{your_knowledge_id}", "emb_model_config_name": "{your_embed_model_config_name}", "data_processing": [ { "load_data": { "loader": { "create_object": true, "module": "llama_index.core", "class": "SimpleDirectoryReader", "init_args": { "input_dir": "{path_to_your_data_dir_1}", "required_exts": [".md"] } } } }, { "load_data": { "loader": { "create_object": true, "module": "llama_index.core", "class": "SimpleDirectoryReader", "init_args": { "input_dir": "{path_to_your_python_code_data_dir}", "recursive": true, "required_exts": [".py"] } } }, "store_and_index": { "transformations": [ { "create_object": true, "module": "llama_index.core.node_parser", "class": "CodeSplitter", "init_args": { "language": "python", "chunk_lines": 100 } } ] } } ] } ] ```
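下面给出一个简单的示意性用法(假设上述配置保存在 `knowledge_config.json` 文件中,且 `knowledge_bank` 已按照后文 Knowledge Bank 一节的方式初始化),展示如何载入该 JSON 配置并据此创建 Knowledge 模块:

```python
import json

# 假设:上述配置保存在 knowledge_config.json 中,
# knowledge_bank 为后文介绍的 KnowledgeBank 实例
with open("knowledge_config.json", "r", encoding="utf-8") as f:
    knowledge_configs = json.load(f)  # 一个 list,每个元素是一个 knowledge 配置 dict

for knowledge_config in knowledge_configs:
    knowledge_bank.add_data_as_knowledge(
        knowledge_id=knowledge_config["knowledge_id"],
        emb_model_name=knowledge_config["emb_model_config_name"],
        knowledge_config=knowledge_config,
    )
```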
#### 更多关于 knowledge 配置 以上提到的配置通常保存为一个JSON文件,它必须包含以下关键属性 * `knowledge_id`: 每个knowledge模块的唯一标识符; * `emb_model_config_name`: embedding模型的名称; * `chunk_size`: 对文件分块的默认大小; * `chunk_overlap`: 文件分块之间的默认重叠大小; * `data_processing`: 一个list型的数据处理方法集合。 ##### 以配置 LlamaIndexKnowledge 为例 当使用`llama_index_knowledge`是,对于上述的最后一项`data_processing` ,这个`list`型的参数中的每个条目(为`dict`型)都对应配置一个data loader对象,其功能包括用来加载所需的数据(即字段`load_data`中包含的信息),以及处理加载数据的转换对象(`store_and_index`)。换而言之,在一次载入数据时,可以同时从多个数据源中加载数据,并处理后合并在同一个索引下以供后面的数据提取使用(retrieve)。有关该组件的更多信息,请参阅 [LlamaIndex-Loading](https://docs.llamaindex.ai/en/stable/module_guides/loading/)。 在这里,无论是针对数据加载还是数据处理,我们都需要配置以下属性 * `create_object`:指示是否创建新对象,在此情况下必须为true; * `module`:对象对应的类所在的位置; * `class`:这个类的名称。 更具体得说,当对`load_data`进行配置时候,您可以选择使用多种多样的的加载器,例如使用`SimpleDirectoryReader`(在`class`字段里配置)来读取各种类型的数据(例如txt、pdf、html、py、md等)。关于这个数据加载器,您还需要配置以下关键属性 * `input_dir`:数据加载的路径; * `required_exts`:将加载的数据的文件扩展名。 有关数据加载器的更多信息,请参阅[这里](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader/)。 对于`store_and_index`而言,这个配置是可选的,如果用户未指定特定的转换方式,系统将使用默认的transformation(也称为node parser)方法,名称为`SentenceSplitter`。对于某些特定需求下也可以使用不同的转换方式,例如对于代码解析可以使用`CodeSplitter`,针对这种特殊的node parser,用户可以设置以下属性: * `language`:希望处理代码的语言名; * `chunk_lines`:分割后每个代码块的行数。 有关节点解析器的更多信息,请参阅[这里](https://docs.llamaindex.ai/en/stable/module_guides/loading/node_parsers/)。 如果用户想要避免详细的配置,我们也在`KnowledgeBank`中提供了一种快速的方式(请参阅以下内容)。 #### 如何使用一个 Knowledge 对象 当我们成功创建了一个knowledge后,用户可以通过`.retrieve`从`Knowledge` 对象中提取信息。`.retrieve`函数一下三个参数: * `query`: 输入参数,用户希望提取与之相关的内容; * `similarity_top_k`: 提取的“数据块”数量; * `to_list_strs`: 是否只返回字符串(str)的列表(list)。 *高阶:* 对于 `LlamaIndexKnowledge`, 它的`.retrieve`函数也支持熟悉LlamaIndex的用户直接传入一个建好的retriever。 #### 关于`LlamaIndexKnowledge`的细节 在这里,我们将使用`LlamaIndexKnowledge`作为示例,以说明在`Knowledge`模块内的操作。 当初始化`LlamaIndexKnowledge`对象时,`LlamaIndexKnowledge.__init__`将执行以下步骤: * 它处理数据并生成检索索引 (`LlamaIndexKnowledge._data_to_index(...)`中完成) 其中包括 * 加载数据 `LlamaIndexKnowledge._data_to_docs(...)`; * 对数据进行预处理,使用预处理方法(比如分割)和向量模型生成向量 `LlamaIndexKnowledge._docs_to_nodes(...)`; * 基于生成的向量做好被查询的准备, 即生成索引。 * 如果索引已经存在,则会调用 `LlamaIndexKnowledge._load_index(...)` 来加载索引,并避免重复的嵌入调用。
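作为对前文 `.retrieve` 接口的补充,下面给出一个简单的调用示意(其中 `knowledge` 对象与查询内容均为假设的示例):

```python
# 假设 knowledge 是一个已经构建好的 LlamaIndexKnowledge 对象
chunks = knowledge.retrieve(
    query="AgentScope 中如何配置分布式模式?",  # 希望检索与之相关内容的查询
    similarity_top_k=3,                          # 提取最相似的 3 个“数据块”
    to_list_strs=True,                           # 只返回字符串列表
)
for chunk in chunks:
    print(chunk)
```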
### Knowledge Bank 知识库将一组Knowledge模块(例如,来自不同数据集的知识)作为知识的集合进行维护。因此,不同的智能体可以在没有不必要的重新初始化的情况下重复使用知识模块。考虑到配置Knowledge模块可能对大多数用户来说过于复杂,知识库还提供了一个简单的函数调用来创建Knowledge模块。 * `KnowledgeBank.add_data_as_knowledge`: 创建Knowledge模块。一种简单的方式只需要提供knowledge_id、emb_model_name和data_dirs_and_types。 因为`KnowledgeBank`默认生成的是 `LlamaIndexKnowledge`, 所以所有文本类文件都可以支持,包括`.txt`, `.html`, `.md` ,`.csv`,`.pdf`和 所有代码文件(如`.py`). 其他支持的文件类型可以参考 [LlamaIndex document](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader/). ```python knowledge_bank.add_data_as_knowledge( knowledge_id="agentscope_tutorial_rag", emb_model_name="qwen_emb_config", data_dirs_and_types={ "../../docs/sphinx_doc/en/source/tutorial": [".md"], }, ) ``` 对于更高级的初始化,用户仍然可以将一个知识模块配置作为参数knowledge_config传递: ```python # load knowledge_config as dict knowledge_bank.add_data_as_knowledge( knowledge_id=knowledge_config["knowledge_id"], emb_model_name=knowledge_config["emb_model_config_name"], knowledge_config=knowledge_config, ) ``` * `KnowledgeBank.get_knowledge`: 它接受两个参数,knowledge_id和duplicate。 如果duplicate为true,则返回提供的knowledge_id对应的知识对象;否则返回深拷贝的对象。 * `KnowledgeBank.equip`: 它接受三个参数,`agent`,`knowledge_id_list` 和`duplicate`。 该函数会根据`knowledge_id_list`为`agent`提供相应的知识(放入`agent.knowledge_list`)。`duplicate` 同样决定是否是深拷贝。 ### RAG 智能体 RAG 智能体是可以基于检索到的知识生成答案的智能体。 * 让智能体使用RAG: RAG agent配有一个`knowledge_list`的列表 * 可以在初始化时就给RAG agent传入`knowledge_list` ```python knowledge = knowledge_bank.get_knowledge(knowledge_id) agent = LlamaIndexAgent( name="rag_worker", sys_prompt="{your_prompt}", model_config_name="{your_model}", knowledge_list=[knowledge], # provide knowledge object directly similarity_top_k=3, log_retrieval=False, recent_n_mem_for_retrieve=1, ) ``` * 如果通过配置文件来批量启动agent,也可以给agent提供`knowledge_id_list`。这样也可以通过将agent和它的`knowledge_id_list`一起传入`KnowledgeBank.equip`来为agent赋予`knowledge_list`。 ```python # >>> agent.knowledge_list # >>> [] knowledge_bank.equip(agent, agent.knowledge_id_list) # >>> agent.knowledge_list # [] ``` * Agent 智能体可以在`reply`函数中使用从`Knowledge`中检索到的信息,将其提示组合到LLM的提示词中。 **自己搭建 RAG 智能体.** 只要您的智能体配置具有`knowledge_id_list`,您就可以将一个agent和这个列表传递给`KnowledgeBank.equip`;这样该agent就是被装配`knowledge_id`。 您可以在`reply`函数中自己决定如何从`Knowledge`对象中提取和使用信息,甚至通过`Knowledge`修改知识库。 ## (拓展) 架设自己的embedding model服务 我们在此也对架设本地embedding model感兴趣的用户提供以下的样例。 以下样例基于在embedding model范围中很受欢迎的`sentence_transformers` 包(基于`transformer` 而且兼容HuggingFace和ModelScope的模型)。 这个样例中,我们会使用当下最好的文本向量模型之一`gte-Qwen2-7B-instruct`。 * 第一步: 遵循在 [HuggingFace](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct) 或者 [ModelScope](https://www.modelscope.cn/models/iic/gte_Qwen2-7B-instruct )的指示下载模型。 (如果无法直接从HuggingFace下载模型,也可以考虑使用HuggingFace镜像:bash命令行`export HF_ENDPOINT=https://hf-mirror.com`,或者在Python代码中加入`os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"`) * 第二步: 设置服务器。以下是一段参考代码。 ```python import datetime import argparse from flask import Flask from flask import request from sentence_transformers import SentenceTransformer def create_timestamp(format_: str = "%Y-%m-%d %H:%M:%S") -> str: """Get current timestamp.""" return datetime.datetime.now().strftime(format_) app = Flask(__name__) @app.route("/embedding/", methods=["POST"]) def get_embedding() -> dict: """Receive post request and return response""" json = request.get_json() inputs = json.pop("inputs") global model if isinstance(inputs, str): inputs = [inputs] embeddings = model.encode(inputs) return { "data": { "completion_tokens": 0, "messages": {}, "prompt_tokens": 0, "response": { "data": [ { "embedding": emb.astype(float).tolist(), } for 
emb in embeddings ], "created": "", "id": create_timestamp(), "model": "flask_model", "object": "text_completion", "usage": { "completion_tokens": 0, "prompt_tokens": 0, "total_tokens": 0, }, }, "total_tokens": 0, "username": "", }, } if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument("--model_name_or_path", type=str, required=True) parser.add_argument("--device", type=str, default="auto") parser.add_argument("--port", type=int, default=8000) args = parser.parse_args() global model print("setting up for embedding model....") model = SentenceTransformer( args.model_name_or_path ) app.run(port=args.port) ``` * 第三部:启动服务器。 ```bash python setup_ms_service.py --model_name_or_path {$PATH_TO_gte_Qwen2_7B_instruct} ``` 测试服务是否成功启动。 ```python from agentscope.models.post_model import PostAPIEmbeddingWrapper model = PostAPIEmbeddingWrapper( config_name="test_config", api_url="http://127.0.0.1:8000/embedding/", json_args={ "max_length": 4096, "temperature": 0.5 } ) print(model("testing")) ``` [[回到顶部]](#210-rag-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/301-community.md: ```md (301-community-zh)= # 加入AgentScope社区 加入AgentScope社区可以让您与其他用户和开发者建立联系。您可以分享见解、提出问题、并及时了解最新的进展和有趣的Multi-Agent应用程序。以下是加入我们的方法: ## GitHub - **关注AgentScope仓库:** 通过关注[AgentScope 仓库](https://github.com/modelscope/agentscope) 以支持并随时了解我们的进展. - **提交问题和拉取请求:** 如果您遇到任何问题或有建议,请向相关仓库提交问题。我们也欢迎拉取请求以修复错误、改进或添加新功能。 ## Discord - **加入我们的Discord:** 实时与 AgentScope 社区合作。在[Discord](https://discord.gg/eYMpfnkG8h)上参与讨论,寻求帮助,并分享您的经验和见解。 ## 钉钉 (DingTalk) - **在钉钉上联系:** 加入我们的钉钉群,随时了解有关 AgentScope 的新闻和更新。 扫描下方的二维码加入钉钉群: AgentScope-dingtalk 我们的钉钉群邀请链接:[AgentScope 钉钉群](https://qr.dingtalk.com/action/joingroup?code=v1,k1,20IUyRX5XZQ2vWjKDsjvI9dhcXjGZi3bq1pFfDZINCM=&_dt_no_comment=1&origin=11) --- 我们欢迎所有对AgentScope感兴趣的人加入我们的社区,并为平台的发展做出贡献! [[Return to the top]](#301-community-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/302-contribute.md: ```md (302-contribute-zh)= # 贡献到AgentScope 我们的社区因其成员的多样化思想和贡献而兴旺发展。无论是修复一个错误,添加一个新功能,改进文档,还是添加示例,我们都欢迎您的帮助。以下是您做出贡献的方法: ## 报告错误和提出新功能 当您发现一个错误或者有一个功能请求,请首先检查问题跟踪器,查看它是否已经被报告。如果没有,随时可以开设一个新的问题。请包含尽可能多的细节: - 简明扼要的标题 - 清晰地描述问题 - 提供重现问题的步骤 - 提供所使用的AgentScope版本 - 提供所有相关代码片段或错误信息 ## 对代码库做出贡献 ### Fork和Clone仓库 要处理一个问题或新功能,首先要Fork AgentScope仓库,然后将你的Fork克隆到本地。 ```bash git clone https://github.com/your-username/agentscope.git cd agentscope ``` ### 创建一个新分支 为您的工作创建一个新分支。这有助于保持拟议更改的组织性,并与`main`分支分离。 ```bash git checkout -b your-feature-branch-name ``` ### 做出修改 创建您的新分支后就可以对代码进行修改了。请注意如果您正在解决多个问题或实现多个功能,最好为每个问题或功能创建单独的分支和拉取请求。 我们提供了一个开发者版本,与官方版本相比,它附带了额外的pre-commit钩子以执行格式检查: ```bash # 安装开发者版本 pip install -e .[dev] # 安装 pre-commit 钩子 pre-commit install ``` ### 提交您的修改 修改完成之后就是提交它们的时候了。请提供清晰而简洁的提交信息,以解释您的修改内容。 ```bash git add -U git commit -m "修改内容的简要描述" ``` 运行时您可能会收到 `pre-commit` 给出的错误信息。请根据错误信息修改您的代码然后再次提交。 ### 提交 Pull Request 当您准备好您的修改分支后,向AgentScope的 `main` 分支提交一个Pull Request。在您的Pull Request描述中,解释您所做的修改以及其他相关的信息。 我们将审查您的Pull Request。这个过程可能涉及一些讨论以及额外的代码修改。 ### 代码审查 等待我们审核您的Pull Request。我们可能会提供一些更改或改进建议。请留意您的GitHub通知,并对反馈做出响应。 [[Return to the top]](#302-contribute-zh) ``` modelscope/agentscope/blob/main/docs/sphinx_doc/zh_CN/source/tutorial/main.md: ```md # 欢迎来到 AgentScope 教程 AgentScope是一款全新的Multi-Agent框架,专为应用开发者打造,旨在提供高易用、高可靠的编程体验! 
- **高易用**:AgentScope支持纯Python编程,提供多种语法工具实现灵活的应用流程编排,内置丰富的API服务(Service)以及应用样例,供开发者直接使用。 - **高鲁棒**:确保开发便捷性和编程效率的同时,针对不同能力的大模型,AgentScope提供了全面的重试机制、定制化的容错控制和面向Agent的异常处理,以确保应用的稳定、高效运行; - **基于Actor的分布式机制**:AgentScope设计了一种新的基于Actor的分布式机制,实现了复杂分布式工作流的集中式编程和自动并行优化,即用户可以使用中心化编程的方式完成分布式应用的流程编排,同时能够零代价将本地应用迁移到分布式的运行环境中。 ## 教程大纲 - [关于AgentScope](101-agentscope.md) - [安装](102-installation.md) - [快速开始](103-example.md) - [模型](203-model.md) - [流式输出](203-model.md) - [提示工程](206-prompt.md) - [Agent](201-agent.md) - [记忆](205-memory.md) - [结果解析](203-parser.md) - [系统提示优化](209-prompt_opt.md) - [工具](204-service.md) - [Pipeline和MsgHub](202-pipeline.md) - [分布式](208-distribute.md) - [AgentScope Studio](209-gui.md) - [检索增强生成(RAG)](210-rag.md) - [日志](105-logging.md) - [监控器](207-monitor.md) - [样例:狼人杀游戏](104-usecase.md) ### 参与贡献 - [加入AgentScope社区](301-community.md) - [贡献到AgentScope](302-contribute.md) ``` modelscope/agentscope/blob/main/scripts/README.md: ```md # Set up Local Model API Serving AgentScope supports developers to build their local model API serving with different inference engines/libraries. This document will introduce how to fast build their local API serving with provided scripts. Table of Contents ================= - [Set up Local Model API Serving](#set-up-local-model-api-serving) - [Table of Contents](#table-of-contents) - [Local Model API Serving](#local-model-api-serving) - [ollama](#ollama) - [Install Libraries and Set up Serving](#install-libraries-and-set-up-serving) - [How to use in AgentScope](#how-to-use-in-agentscope) - [Flask-based Model API Serving](#flask-based-model-api-serving) - [With Transformers Library](#with-transformers-library) - [Install Libraries and Set up Serving](#install-libraries-and-set-up-serving-1) - [How to use in AgentScope](#how-to-use-in-agentscope-1) - [Note](#note) - [With ModelScope Library](#with-modelscope-library) - [Install Libraries and Set up Serving](#install-libraries-and-set-up-serving-2) - [How to use in AgentScope](#how-to-use-in-agentscope-2) - [Note](#note-1) - [FastChat](#fastchat) - [Install Libraries and Set up Serving](#install-libraries-and-set-up-serving-3) - [Supported Models](#supported-models) - [How to use in AgentScope](#how-to-use-in-agentscope-3) - [vllm](#vllm) - [Install Libraries and Set up Serving](#install-libraries-and-set-up-serving-4) - [Supported models](#supported-models-1) - [How to use in AgentScope](#how-to-use-in-agentscope-4) - [Model Inference API](#model-inference-api) ## Local Model API Serving ### ollama [ollama](https://github.com/ollama/ollama) is a CPU inference engine for LLMs. With ollama, developers can build their local model API serving without GPU requirements. #### Install Libraries and Set up Serving - First, install ollama in its [official repository](https://github.com/ollama/ollama) based on your system (e.g. macOS, windows or linux). - Follow ollama's [guidance](https://github.com/ollama/ollama) to pull or create a model and start its serving. Taking llama2 as an example, you can run the following command to pull the model files. ```bash ollama pull llama2 ``` #### How to use in AgentScope In AgentScope, you can use the following model configurations to load the model. - For ollama Chat API: ```python { "config_name": "my_ollama_chat_config", "model_type": "ollama_chat", # Required parameters "model_name": "{model_name}", # The model name used in ollama API, e.g. llama2 # Optional parameters "options": { # Parameters passed to the model when calling # e.g. 
"temperature": 0., "seed": 123, }, "keep_alive": "5m", # Controls how long the model will stay loaded into memory } ``` - For ollama generate API: ```python { "config_name": "my_ollama_generate_config", "model_type": "ollama_generate", # Required parameters "model_name": "{model_name}", # The model name used in ollama API, e.g. llama2 # Optional parameters "options": { # Parameters passed to the model when calling # "temperature": 0., "seed": 123, }, "keep_alive": "5m", # Controls how long the model will stay loaded into memory } ``` - For ollama embedding API: ```python { "config_name": "my_ollama_embedding_config", "model_type": "ollama_embedding", # Required parameters "model_name": "{model_name}", # The model name used in ollama API, e.g. llama2 # Optional parameters "options": { # Parameters passed to the model when calling # "temperature": 0., "seed": 123, }, "keep_alive": "5m", # Controls how long the model will stay loaded into memory } ``` ### Flask-based Model API Serving [Flask](https://github.com/pallets/flask) is a lightweight web application framework. It is easy to build a local model API serving with Flask. Here we provide two Flask examples with Transformers and ModelScope library, respectively. You can build your own model API serving with few modifications. #### With Transformers Library ##### Install Libraries and Set up Serving Install Flask and Transformers by following command. ```bash pip install flask torch transformers accelerate ``` Taking model `meta-llama/Llama-2-7b-chat-hf` and port `8000` as an example, set up the model API serving by running the following command. ```shell python flask_transformers/setup_hf_service.py \ --model_name_or_path meta-llama/Llama-2-7b-chat-hf \ --device "cuda:0" \ --port 8000 ``` You can replace `meta-llama/Llama-2-7b-chat-hf` with any model card in huggingface model hub. ##### How to use in AgentScope In AgentScope, you can load the model with the following model configs: `./flask_transformers/model_config.json`. ```json { "model_type": "post_api_chat", "config_name": "flask_llama2-7b-chat-hf", "api_url": "http://127.0.0.1:8000/llm/", "json_args": { "max_length": 4096, "temperature": 0.5 } } ``` ##### Note In this model serving, the messages from post requests should be in **STRING format**. You can use [templates for chat model](https://huggingface.co/docs/transformers/main/chat_templating) in transformers with a little modification in `./flask_transformers/setup_hf_service.py`. #### With ModelScope Library ##### Install Libraries and Set up Serving Install Flask and modelscope by following command. ```bash pip install flask torch modelscope ``` Taking model `modelscope/Llama-2-7b-chat-ms` and port `8000` as an example, to set up the model API serving, run the following command. ```bash python flask_modelscope/setup_ms_service.py \ --model_name_or_path modelscope/Llama-2-7b-chat-ms \ --device "cuda:0" \ --port 8000 ``` You can replace `modelscope/Llama-2-7b-chat-ms` with any model card in modelscope model hub. ##### How to use in AgentScope In AgentScope, you can load the model with the following model configs: `flask_modelscope/model_config.json`. ```json { "model_type": "post_api_chat", "config_name": "flask_llama2-7b-chat-ms", "api_url": "http://127.0.0.1:8000/llm/", "json_args": { "max_length": 4096, "temperature": 0.5 } } ``` ##### Note Similar with the example of transformers, the messages from post requests should be in **STRING format**. 
### FastChat [FastChat](https://github.com/lm-sys/FastChat) is an open platform that provides quick setup for model serving with OpenAI-compatible RESTful APIs. #### Install Libraries and Set up Serving To install FastChat, run ```bash pip install "fschat[model_worker,webui]" ``` Taking model `meta-llama/Llama-2-7b-chat-hf` and port `8000` as an example, to set up model API serving, run the following command to set up model serving. ```bash bash fastchat/fastchat_setup.sh -m meta-llama/Llama-2-7b-chat-hf -p 8000 ``` #### Supported Models Refer to [supported model list](https://github.com/lm-sys/FastChat/blob/main/docs/model_support.md#supported-models) of FastChat. #### How to use in AgentScope Now you can load the model in AgentScope by the following model config: `fastchat/model_config.json`. ```json { "model_type": "openai_chat", "config_name": "fastchat_llama2-7b-chat-hf", "model_name": "meta-llama/Llama-2-7b-chat-hf", "api_key": "EMPTY", "client_args": { "base_url": "http://127.0.0.1:8000/v1/" }, "generate_args": { "temperature": 0.5 } } ``` ### vllm [vllm](https://github.com/vllm-project/vllm) is a high-throughput inference and serving engine for LLMs. #### Install Libraries and Set up Serving To install vllm, run ```bash pip install vllm ``` Taking model `meta-llama/Llama-2-7b-chat-hf` and port `8000` as an example, to set up model API serving, run ```bash ./vllm/vllm_setup.sh -m meta-llama/Llama-2-7b-chat-hf -p 8000 ``` #### Supported models Please refer to the [supported models list](https://docs.vllm.ai/en/latest/models/supported_models.html) of vllm. #### How to use in AgentScope Now you can load the model in AgentScope by the following model config: `vllm/model_config.json`. ```json { "model_type": "openai_chat", "config_name": "vllm_llama2-7b-chat-hf", "model_name": "meta-llama/Llama-2-7b-chat-hf", "api_key": "EMPTY", "client_args": { "base_url": "http://127.0.0.1:8000/v1/" }, "generate_args": { "temperature": 0.5 } } ``` ## Model Inference API Both [Huggingface](https://huggingface.co/docs/api-inference/index) and [ModelScope](https://www.modelscope.cn) provide model inference API, which can be used with AgentScope post api model wrapper. Taking `gpt2` in HuggingFace inference API as an example, you can use the following model config in AgentScope. 
```json { "model_type": "post_api_chat", "config_name": "gpt2", "headers": { "Authorization": "Bearer {YOUR_API_TOKEN}" }, "api_url": "https://api-inference.huggingface.co/models/gpt2" } ``` ``` modelscope/agentscope/blob/main/scripts/flask_modelscope/setup_ms_service.py: ```py # -*- coding: utf-8 -*- """Set up a local language model service.""" import datetime import argparse from flask import Flask from flask import request import modelscope from agentscope.utils.tools import reform_dialogue def create_timestamp(format_: str = "%Y-%m-%d %H:%M:%S") -> str: """Get current timestamp.""" return datetime.datetime.now().strftime(format_) app = Flask(__name__) @app.route("/llm/", methods=["POST"]) def get_response() -> dict: """Receive post request and return response""" json = request.get_json() inputs = json.pop("inputs") inputs = reform_dialogue(inputs) global model, tokenizer if hasattr(tokenizer, "apply_chat_template"): prompt = tokenizer.apply_chat_template( inputs, tokenize=False, add_generation_prompt=True, ) else: prompt = "" for msg in inputs: prompt += ( f"{msg.get('name', msg.get('role', 'system'))}: " f"{msg.get('content', '')}\n" ) print("=" * 80) print(f"[PROMPT]:\n{prompt}") prompt_tokenized = tokenizer(prompt, return_tensors="pt").to(model.device) prompt_tokens_input_ids = prompt_tokenized.input_ids[0] response_ids = model.generate( prompt_tokenized.input_ids, **json, ) new_response_ids = response_ids[:, len(prompt_tokens_input_ids) :] response = tokenizer.batch_decode( new_response_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False, )[0] print(f"[RESPONSE]:\n{response}") print("=" * 80) return { "data": { "completion_tokens": len(response_ids[0]), "messages": {}, "prompt_tokens": len(prompt_tokens_input_ids), "response": { "choices": [ { "message": { "content": response, }, }, ], "created": "", "id": create_timestamp(), "model": "flask_model", "object": "text_completion", "usage": { "completion_tokens": len(response_ids[0]), "prompt_tokens": len(prompt_tokens_input_ids), "total_tokens": len(response_ids[0]) + len( prompt_tokens_input_ids, ), }, }, "total_tokens": len(response_ids[0]) + len( prompt_tokens_input_ids, ), "username": "", }, } if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument("--model_name_or_path", type=str, required=True) parser.add_argument("--device", type=str, default="auto") parser.add_argument("--port", type=int, default=8000) args = parser.parse_args() global model, tokenizer model = modelscope.AutoModelForCausalLM.from_pretrained( args.model_name_or_path, device_map=args.device, ) tokenizer = modelscope.AutoTokenizer.from_pretrained( args.model_name_or_path, use_fast=False, ) app.run(port=args.port) ``` modelscope/agentscope/blob/main/scripts/flask_transformers/setup_hf_service.py: ```py # -*- coding: utf-8 -*- """Set up a local language model service.""" import datetime import argparse from flask import Flask from flask import request import transformers from agentscope.utils.tools import reform_dialogue def create_timestamp(format_: str = "%Y-%m-%d %H:%M:%S") -> str: """Get current timestamp.""" return datetime.datetime.now().strftime(format_) app = Flask(__name__) @app.route("/llm/", methods=["POST"]) def get_response() -> dict: """Receive post request and return response""" json = request.get_json() inputs = json.pop("inputs") global model, tokenizer inputs = reform_dialogue(inputs) if hasattr(tokenizer, "apply_chat_template"): prompt = tokenizer.apply_chat_template( inputs, tokenize=False, 
add_generation_prompt=True, ) else: prompt = "" for msg in inputs: prompt += ( f"{msg.get('name', msg.get('role', 'system'))}: " f"{msg.get('content', '')}\n" ) print("=" * 80) print(f"[PROMPT]:\n{prompt}") prompt_tokenized = tokenizer(prompt, return_tensors="pt").to(model.device) prompt_tokens_input_ids = prompt_tokenized.input_ids[0] response_ids = model.generate( prompt_tokenized.input_ids, **json, ) new_response_ids = response_ids[:, len(prompt_tokens_input_ids) :] response = tokenizer.batch_decode( new_response_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False, )[0] print(f"[RESPONSE]:\n{response}") print("=" * 80) return { "data": { "completion_tokens": len(response_ids[0]), "messages": {}, "prompt_tokens": len(prompt_tokens_input_ids), "response": { "choices": [ { "message": { "content": response, }, }, ], "created": "", "id": create_timestamp(), "model": "flask_model", "object": "text_completion", "usage": { "completion_tokens": len(response_ids[0]), "prompt_tokens": len(prompt_tokens_input_ids), "total_tokens": len(response_ids[0]) + len( prompt_tokens_input_ids, ), }, }, "total_tokens": len(response_ids[0]) + len( prompt_tokens_input_ids, ), "username": "", }, } if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument("--model_name_or_path", type=str, required=True) parser.add_argument("--device", type=str, default="auto") parser.add_argument("--port", type=int, default=8000) args = parser.parse_args() global model, tokenizer model = transformers.AutoModelForCausalLM.from_pretrained( args.model_name_or_path, device_map=args.device, ) tokenizer = transformers.AutoTokenizer.from_pretrained( args.model_name_or_path, use_fast=False, ) app.run(port=args.port) ``` modelscope/agentscope/blob/main/setup.py: ```py # -*- coding: utf-8 -*- """ Setup for installation.""" from __future__ import absolute_import, division, print_function import re import setuptools # obtain version from src/agentscope/_version.py with open("src/agentscope/_version.py", encoding="UTF-8") as f: VERSION = re.search( r'^__version__\s*=\s*[\'"]([^\'"]*)[\'"]', f.read(), re.MULTILINE, ).group(1) NAME = "agentscope" URL = "https://github.com/modelscope/agentscope" rpc_requires = [ "grpcio==1.60.0", "grpcio-tools==1.60.0", "protobuf==4.25.0", "expiringdict", "dill", "psutil", ] service_requires = [ "docker", "pymongo", "pymysql", "bs4", "beautifulsoup4", "feedparser", ] doc_requires = [ "sphinx", "sphinx-autobuild", "sphinx_rtd_theme", "myst-parser", "sphinxcontrib-mermaid", ] test_requires = ["pytest", "pytest-cov", "pre-commit"] gradio_requires = [ "gradio==4.19.1", "modelscope_studio==0.0.5", ] rag_requires = [ "llama-index==0.10.30", ] studio_requires = [] # released requires minimal_requires = [ "networkx", "black", "docstring_parser", "pydantic", "loguru==0.6.0", "tiktoken", "Pillow", "requests", "chardet", "inputimeout", "openai>=1.3.0", "numpy", "Flask==3.0.0", "Flask-Cors==4.0.0", "Flask-SocketIO==5.3.6", "flask_sqlalchemy", "flake8", # TODO: move into other requires "dashscope==1.14.1", "openai>=1.3.0", "ollama>=0.1.7", "google-generativeai>=0.4.0", "zhipuai", "litellm", "notebook", "nbclient", "nbformat", "psutil", "scipy", ] distribute_requires = minimal_requires + rpc_requires dev_requires = minimal_requires + test_requires full_requires = ( minimal_requires + rpc_requires + service_requires + doc_requires + test_requires + gradio_requires + rag_requires + studio_requires ) with open("README.md", "r", encoding="UTF-8") as fh: long_description = fh.read() 
setuptools.setup( name=NAME, version=VERSION, author="SysML team of Alibaba Tongyi Lab ", author_email="[email protected]", description="AgentScope: A Flexible yet Robust Multi-Agent Platform.", long_description=long_description, long_description_content_type="text/markdown", url=URL, download_url=f"{URL}/archive/v{VERSION}.tar.gz", keywords=["deep-learning", "multi agents", "agents"], package_dir={"": "src"}, packages=setuptools.find_packages("src"), package_data={ "agentscope.studio": ["static/**/*", "templates/**/*"], "agentscope.prompt": ["_prompt_examples.json"], }, install_requires=minimal_requires, extras_require={ "distribute": distribute_requires, "dev": dev_requires, "full": full_requires, }, license="Apache License 2.0", classifiers=[ "Development Status :: 4 - Beta", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.9", "License :: OSI Approved :: Apache Software License", "Operating System :: OS Independent", ], python_requires=">=3.9", entry_points={ "console_scripts": [ "as_studio=agentscope.studio:init", "as_gradio=agentscope.web.gradio.studio:run_app", "as_workflow=agentscope.web.workstation.workflow:main", "as_server=agentscope.server.launcher:as_server", ], }, ) ``` modelscope/agentscope/blob/main/src/agentscope/__init__.py: ```py # -*- coding: utf-8 -*- """ Import all modules in the package. """ # modules from . import agents from . import memory from . import models from . import pipelines from . import service from . import message from . import prompt from . import web from . import exception from . import parsers from . import rag # objects or function from .msghub import msghub from ._version import __version__ from ._init import init __all__ = [ "init", "msghub", ] ``` modelscope/agentscope/blob/main/src/agentscope/_init.py: ```py # -*- coding: utf-8 -*- """The init function for the package.""" import json import os import shutil from typing import Optional, Union, Sequence from agentscope import agents from .agents import AgentBase from ._runtime import _runtime from .file_manager import file_manager from .logging import LOG_LEVEL, setup_logger from .utils.monitor import MonitorFactory from .models import read_model_configs from .constants import _DEFAULT_DIR from .constants import _DEFAULT_LOG_LEVEL from .studio._client import _studio_client # init setting _INIT_SETTINGS = {} def init( model_configs: Optional[Union[dict, str, list]] = None, project: Optional[str] = None, name: Optional[str] = None, save_dir: str = _DEFAULT_DIR, save_log: bool = True, save_code: bool = True, save_api_invoke: bool = False, use_monitor: bool = True, logger_level: LOG_LEVEL = _DEFAULT_LOG_LEVEL, runtime_id: Optional[str] = None, agent_configs: Optional[Union[str, list, dict]] = None, studio_url: Optional[str] = None, ) -> Sequence[AgentBase]: """A unified entry to initialize the package, including model configs, runtime names, saving directories and logging settings. Args: model_configs (`Optional[Union[dict, str, list]]`, defaults to `None`): A dict, a list of dicts, or a path to a json file containing model configs. project (`Optional[str]`, defaults to `None`): The project name, which is used to identify the project. name (`Optional[str]`, defaults to `None`): The name for runtime, which is used to identify this runtime. runtime_id (`Optional[str]`, defaults to `None`): The id for runtime, which is used to identify this runtime. Use `None` will generate a random id. 
save_dir (`str`, defaults to `./runs`): The directory to save logs, files, codes, and api invocations. If `dir` is `None`, when saving logs, files, codes, and api invocations, the default directory `./runs` will be created. save_log (`bool`, defaults to `False`): Whether to save logs locally. save_code (`bool`, defaults to `False`): Whether to save codes locally. save_api_invoke (`bool`, defaults to `False`): Whether to save api invocations locally, including model and web search invocation. use_monitor (`bool`, defaults to `True`): Whether to activate the monitor. logger_level (`LOG_LEVEL`, defaults to `"INFO"`): The logging level of logger. agent_configs (`Optional[Union[str, list, dict]]`, defaults to `None`): The config dict(s) of agents or the path to the config file, which can be loaded by json.loads(). One agent config should cover the required arguments to initialize a specific agent object, otherwise the default values will be used. studio_url (`Optional[str]`, defaults to `None`): The url of the agentscope studio. """ init_process( model_configs=model_configs, project=project, name=name, runtime_id=runtime_id, save_dir=save_dir, save_api_invoke=save_api_invoke, save_log=save_log, use_monitor=use_monitor, logger_level=logger_level, studio_url=studio_url, ) # save init settings for subprocess _INIT_SETTINGS["model_configs"] = model_configs _INIT_SETTINGS["project"] = _runtime.project _INIT_SETTINGS["name"] = _runtime.name _INIT_SETTINGS["runtime_id"] = _runtime.runtime_id _INIT_SETTINGS["save_dir"] = save_dir _INIT_SETTINGS["save_api_invoke"] = save_api_invoke _INIT_SETTINGS["save_log"] = save_log _INIT_SETTINGS["logger_level"] = logger_level _INIT_SETTINGS["use_monitor"] = use_monitor # Save code if needed if save_code: # Copy python file in os.path.curdir into runtime directory cur_dir = os.path.abspath(os.path.curdir) for filename in os.listdir(cur_dir): if filename.endswith(".py"): file_abs = os.path.join(cur_dir, filename) shutil.copy(file_abs, str(file_manager.dir_code)) # Load config and init agent by configs if agent_configs is not None: if isinstance(agent_configs, str): with open(agent_configs, "r", encoding="utf-8") as file: configs = json.load(file) elif isinstance(agent_configs, dict): configs = [agent_configs] else: configs = agent_configs # setup agents agent_objs = [] for config in configs: agent_cls = getattr(agents, config["class"]) agent_args = config["args"] agent = agent_cls(**agent_args) agent_objs.append(agent) return agent_objs return [] def init_process( model_configs: Optional[Union[dict, str, list]] = None, project: Optional[str] = None, name: Optional[str] = None, runtime_id: Optional[str] = None, save_dir: str = _DEFAULT_DIR, save_api_invoke: bool = False, save_log: bool = False, use_monitor: bool = True, logger_level: LOG_LEVEL = _DEFAULT_LOG_LEVEL, studio_url: Optional[str] = None, ) -> None: """An entry to initialize the package in a process. Args: project (`Optional[str]`, defaults to `None`): The project name, which is used to identify the project. name (`Optional[str]`, defaults to `None`): The name for runtime, which is used to identify this runtime. runtime_id (`Optional[str]`, defaults to `None`): The id for runtime, which is used to identify this runtime. save_dir (`str`, defaults to `./runs`): The directory to save logs, files, codes, and api invocations. If `dir` is `None`, when saving logs, files, codes, and api invocations, the default directory `./runs` will be created. 
save_api_invoke (`bool`, defaults to `False`): Whether to save api invocations locally, including model and web search invocation. model_configs (`Optional[Sequence]`, defaults to `None`): A sequence of pre-init model configs. save_log (`bool`, defaults to `False`): Whether to save logs locally. use_monitor (`bool`, defaults to `True`): Whether to activate the monitor. logger_level (`LOG_LEVEL`, defaults to `"INFO"`): The logging level of logger. studio_url (`Optional[str]`, defaults to `None`): The url of the agentscope studio. """ # Init the runtime if project is not None: _runtime.project = project if name is not None: _runtime.name = name if runtime_id is not None: _runtime.runtime_id = runtime_id # Init file manager and save configs by default file_manager.init(save_dir, save_api_invoke) # Init logger dir_log = str(file_manager.dir_log) if save_log else None setup_logger(dir_log, logger_level) # Load model configs if needed if model_configs is not None: read_model_configs(model_configs) # Init monitor _ = MonitorFactory.get_monitor( db_path=file_manager.path_db, impl_type="sqlite" if use_monitor else "dummy", ) # Init studio client, which will push messages to web ui and fetch user # inputs from web ui if studio_url is not None: _studio_client.initialize(_runtime.runtime_id, studio_url) # Register in AgentScope Studio _studio_client.register_running_instance( project=_runtime.project, name=_runtime.name, timestamp=_runtime.timestamp, run_dir=file_manager.dir_root, pid=os.getpid(), ) ``` modelscope/agentscope/blob/main/src/agentscope/_runtime.py: ```py # -*- coding: utf-8 -*- """Manage the id for each runtime""" import os from datetime import datetime from typing import Any from agentscope.utils.tools import _get_timestamp from agentscope.utils.tools import _get_process_creation_time from agentscope.utils.tools import _generate_random_code _RUNTIME_ID_FORMAT = "run_%Y%m%d-%H%M%S_{}" _RUNTIME_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S" class _Runtime: """A singleton class used to record the runtime information, which will be initialized when the package is imported.""" project: str """The project name, which is used to identify the project.""" name: str """The name for runtime, which is used to identify this runtime.""" runtime_id: str """The id for runtime, which is used to identify the this runtime and name the saving directory.""" pid: int """The process id of the runtime.""" _timestamp: datetime """The timestamp of when the runtime is initialized.""" _instance = None def __new__(cls, *args: Any, **kwargs: Any) -> Any: """Create a singleton instance.""" if not cls._instance: cls._instance = super(_Runtime, cls).__new__( cls, *args, **kwargs, ) return cls._instance def __init__(self) -> None: """Generate random project name, runtime name and default runtime_id when the package is initialized. 
After that, user can set them by calling `agentscope.init(project="xxx", name="xxx", runtime_id="xxx")`.""" self.project = _generate_random_code() self.name = _generate_random_code(uppercase=False) self.pid = os.getpid() self._timestamp = _get_process_creation_time() # Obtain time from timestamp in string format, and then turn it into # runtime ID format self.runtime_id = _get_timestamp( _RUNTIME_ID_FORMAT, self._timestamp, ).format(self.name) def generate_new_runtime_id(self) -> str: """Generate a new random runtime id.""" return _get_timestamp(_RUNTIME_ID_FORMAT).format( _generate_random_code(uppercase=False), ) @property def timestamp(self) -> str: """Get the current timestamp in specific format.""" return self._timestamp.strftime(_RUNTIME_TIMESTAMP_FORMAT) @staticmethod def _flush() -> None: """ Only for unittest usage. Don't use this function in your code. Flush the runtime singleton. """ global _runtime _runtime = _Runtime() _runtime = _Runtime() ``` modelscope/agentscope/blob/main/src/agentscope/_version.py: ```py # -*- coding: utf-8 -*- """ Version of AgentScope.""" __version__ = "0.0.6alpha3" ``` modelscope/agentscope/blob/main/src/agentscope/agents/__init__.py: ```py # -*- coding: utf-8 -*- """ Import all agent related modules in the package. """ from .agent import AgentBase, DistConf from .operator import Operator from .dialog_agent import DialogAgent from .dict_dialog_agent import DictDialogAgent from .user_agent import UserAgent from .text_to_image_agent import TextToImageAgent from .rpc_agent import RpcAgent from .react_agent import ReActAgent from .rag_agent import LlamaIndexAgent __all__ = [ "AgentBase", "Operator", "DialogAgent", "DictDialogAgent", "TextToImageAgent", "UserAgent", "ReActAgent", "DistConf", "RpcAgent", "LlamaIndexAgent", ] ``` modelscope/agentscope/blob/main/src/agentscope/agents/agent.py: ```py # -*- coding: utf-8 -*- """ Base class for Agent """ from __future__ import annotations from abc import ABCMeta from types import GeneratorType from typing import Optional, Generator, Tuple from typing import Sequence from typing import Union from typing import Any from typing import Type import json import uuid from loguru import logger from agentscope.agents.operator import Operator from agentscope.logging import log_stream_msg, log_msg from agentscope.message import Msg from agentscope.models import load_model_by_config_name from agentscope.memory import TemporaryMemory class _AgentMeta(ABCMeta): """The metaclass for agent. 1. record the init args into `_init_settings` field. 2. register class name into `registry` field. 
""" def __init__(cls, name: Any, bases: Any, attrs: Any) -> None: if not hasattr(cls, "_registry"): cls._registry = {} else: if name in cls._registry: logger.warning( f"Agent class with name [{name}] already exists.", ) else: cls._registry[name] = cls super().__init__(name, bases, attrs) def __call__(cls, *args: tuple, **kwargs: dict) -> Any: to_dist = kwargs.pop("to_dist", False) if to_dist is True: to_dist = DistConf() if to_dist is not False and to_dist is not None: from .rpc_agent import RpcAgent if cls is not RpcAgent and not issubclass(cls, RpcAgent): return RpcAgent( name=( args[0] if len(args) > 0 else kwargs["name"] # type: ignore[arg-type] ), host=to_dist.pop( # type: ignore[arg-type] "host", "localhost", ), port=to_dist.pop("port", None), # type: ignore[arg-type] max_pool_size=kwargs.pop( # type: ignore[arg-type] "max_pool_size", 8192, ), max_timeout_seconds=to_dist.pop( # type: ignore[arg-type] "max_timeout_seconds", 1800, ), local_mode=to_dist.pop( # type: ignore[arg-type] "local_mode", True, ), lazy_launch=to_dist.pop( # type: ignore[arg-type] "lazy_launch", True, ), agent_id=cls.generate_agent_id(), connect_existing=False, agent_class=cls, agent_configs={ "args": args, "kwargs": kwargs, "class_name": cls.__name__, }, ) instance = super().__call__(*args, **kwargs) instance._init_settings = { "args": args, "kwargs": kwargs, "class_name": cls.__name__, } return instance class DistConf(dict): """Distribution configuration for agents.""" def __init__( self, host: str = "localhost", port: int = None, max_pool_size: int = 8192, max_timeout_seconds: int = 1800, local_mode: bool = True, lazy_launch: bool = True, ): """Init the distributed configuration. Args: host (`str`, defaults to `"localhost"`): Hostname of the rpc agent server. port (`int`, defaults to `None`): Port of the rpc agent server. max_pool_size (`int`, defaults to `8192`): Max number of task results that the server can accommodate. max_timeout_seconds (`int`, defaults to `1800`): Timeout for task results. local_mode (`bool`, defaults to `True`): Whether the started rpc server only listens to local requests. lazy_launch (`bool`, defaults to `True`): Only launch the server when the agent is called. """ self["host"] = host self["port"] = port self["max_pool_size"] = max_pool_size self["max_timeout_seconds"] = max_timeout_seconds self["local_mode"] = local_mode self["lazy_launch"] = lazy_launch class AgentBase(Operator, metaclass=_AgentMeta): """Base class for all agents. All agents should inherit from this class and implement the `reply` function. """ _version: int = 1 def __init__( self, name: str, sys_prompt: Optional[str] = None, model_config_name: str = None, use_memory: bool = True, memory_config: Optional[dict] = None, to_dist: Optional[Union[DistConf, bool]] = False, ) -> None: r"""Initialize an agent from the given arguments. Args: name (`str`): The name of the agent. sys_prompt (`Optional[str]`): The system prompt of the agent, which can be passed by args or hard-coded in the agent. model_config_name (`str`, defaults to None): The name of the model config, which is used to load model from configuration. use_memory (`bool`, defaults to `True`): Whether the agent has memory. memory_config (`Optional[dict]`): The config of memory. to_dist (`Optional[Union[DistConf, bool]]`, default to `False`): The configurations passed to :py:meth:`to_dist` method. Used in :py:class:`_AgentMeta`, when this parameter is provided, the agent will automatically be converted into its distributed version. Below are some examples: .. 
code-block:: python # run as a sub process agent = XXXAgent( # ... other parameters to_dist=True, ) # connect to an existing agent server agent = XXXAgent( # ... other parameters to_dist=DistConf( host="", port=, # other parameters ), ) See :doc:`Tutorial` for detail. """ self.name = name self.memory_config = memory_config self.sys_prompt = sys_prompt # TODO: support to receive a ModelWrapper instance if model_config_name is not None: self.model = load_model_by_config_name(model_config_name) if use_memory: self.memory = TemporaryMemory(memory_config) else: self.memory = None # The global unique id of this agent self._agent_id = self.__class__.generate_agent_id() # The audience of this agent, which means if this agent generates a # response, it will be passed to all agents in the audience. self._audience = None # convert to distributed agent, conversion is in `_AgentMeta` if to_dist is not False and to_dist is not None: logger.info( f"Convert {self.__class__.__name__}[{self.name}] into" " a distributed agent.", ) @classmethod def generate_agent_id(cls) -> str: """Generate the agent_id of this agent instance""" # TODO: change cls.__name__ into a global unique agent_type return uuid.uuid4().hex # todo: add a unique agent_type field to distinguish different agent class @classmethod def get_agent_class(cls, agent_class_name: str) -> Type[AgentBase]: """Get the agent class based on the specific agent class name. Args: agent_class_name (`str`): the name of the agent class. Raises: ValueError: Agent class name not exits. Returns: Type[AgentBase]: the AgentBase subclass. """ if agent_class_name not in cls._registry: raise ValueError(f"Agent class <{agent_class_name}> not found.") return cls._registry[agent_class_name] # type: ignore[return-value] @classmethod def register_agent_class(cls, agent_class: Type[AgentBase]) -> None: """Register the agent class into the registry. Args: agent_class (Type[AgentBase]): the agent class to be registered. """ agent_class_name = agent_class.__name__ if agent_class_name in cls._registry: logger.info( f"Agent class with name [{agent_class_name}] already exists.", ) else: cls._registry[agent_class_name] = agent_class def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: """Define the actions taken by this agent. Args: x (`Optional[Union[Msg, Sequence[Msg]]]`, defaults to `None`): The input message(s) to the agent, which also can be omitted if the agent doesn't need any input. Returns: `Msg`: The output message generated by the agent. Note: Given that some agents are in an adversarial environment, their input doesn't include the thoughts of other agents. """ raise NotImplementedError( f"Agent [{type(self).__name__}] is missing the required " f'"reply" function.', ) def load_from_config(self, config: dict) -> None: """Load configuration for this agent. Args: config (`dict`): model configuration """ def export_config(self) -> dict: """Return configuration of this agent. Returns: The configuration of current agent. """ return {} def load_memory(self, memory: Sequence[dict]) -> None: r"""Load input memory.""" def __call__(self, *args: Any, **kwargs: Any) -> dict: """Calling the reply function, and broadcast the generated response to all audiences if needed.""" res = self.reply(*args, **kwargs) # broadcast to audiences if needed if self._audience is not None: self._broadcast_to_audience(res) return res def speak( self, content: Union[str, Msg, Generator[Tuple[bool, str], None, None]], ) -> None: """ Speak out the message generated by the agent. 
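# --- Illustrative sketch, not part of agent.py ------------------------------
# `AgentBase` only requires subclasses to implement `reply`. A minimal custom
# agent that uses nothing beyond the APIs shown in this file (Msg, memory,
# speak); it simply echoes the last input:
from typing import Optional, Sequence, Union

from agentscope.agents import AgentBase
from agentscope.message import Msg


class EchoAgent(AgentBase):
    """A toy agent that echoes whatever it receives."""

    def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg:
        if self.memory and x is not None:
            self.memory.add(x)  # record the input, mirroring built-in agents
        content = x.content if isinstance(x, Msg) else "Nothing to echo."
        msg = Msg(self.name, content, role="assistant")
        self.speak(msg)  # print to terminal / push to studio
        if self.memory:
            self.memory.add(msg)
        return msg


# The metaclass auto-registers EchoAgent, so it can be used like any agent:
# echo = EchoAgent(name="echo")
# echo(Msg("user", "Hi!", role="user"))
# -----------------------------------------------------------------------------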
If a string is given, a Msg object will be created with the string as the content. Args: content (`Union[str, Msg, Generator[Tuple[bool, str], None, None]`): The content of the message to be spoken out. If a string is given, a Msg object will be created with the agent's name, role as "assistant", and the given string as the content. If the content is a Generator, the agent will speak out the message chunk by chunk. """ if isinstance(content, str): log_msg( Msg( name=self.name, content=content, role="assistant", ), ) elif isinstance(content, Msg): log_msg(content) elif isinstance(content, GeneratorType): # The streaming message must share the same id for displaying in # the agentscope studio. msg = Msg(name=self.name, content="", role="assistant") for last, text_chunk in content: msg.content = text_chunk log_stream_msg(msg, last=last) else: raise TypeError( "From version 0.0.5, the speak method only accepts str or Msg " f"object, got {type(content)} instead.", ) def observe(self, x: Union[dict, Sequence[dict]]) -> None: """Observe the input, store it in memory without response to it. Args: x (`Union[dict, Sequence[dict]]`): The input message to be recorded in memory. """ if self.memory: self.memory.add(x) def reset_audience(self, audience: Sequence[AgentBase]) -> None: """Set the audience of this agent, which means if this agent generates a response, it will be passed to all audiences. Args: audience (`Sequence[AgentBase]`): The audience of this agent, which will be notified when this agent generates a response message. """ # TODO: we leave the consideration of nested msghub for future. # for now we suppose one agent can only be in one msghub self._audience = [_ for _ in audience if _ != self] def clear_audience(self) -> None: """Remove the audience of this agent.""" # TODO: we leave the consideration of nested msghub for future. # for now we suppose one agent can only be in one msghub self._audience = None def rm_audience( self, audience: Union[Sequence[AgentBase], AgentBase], ) -> None: """Remove the given audience from the Sequence""" if not isinstance(audience, Sequence): audience = [audience] for agent in audience: if self._audience is not None and agent in self._audience: self._audience.pop(self._audience.index(agent)) else: logger.warning( f"Skip removing agent [{agent.name}] from the " f"audience for its inexistence.", ) def _broadcast_to_audience(self, x: dict) -> None: """Broadcast the input to all audiences.""" for agent in self._audience: agent.observe(x) def __str__(self) -> str: serialized_fields = { "name": self.name, "type": self.__class__.__name__, "sys_prompt": self.sys_prompt, "agent_id": self.agent_id, } if hasattr(self, "model"): serialized_fields["model"] = { "model_type": self.model.model_type, "config_name": self.model.config_name, } return json.dumps(serialized_fields, ensure_ascii=False) @property def agent_id(self) -> str: """The unique id of this agent. Returns: str: agent_id """ return self._agent_id def to_dist( self, host: str = "localhost", port: int = None, max_pool_size: int = 8192, max_timeout_seconds: int = 1800, local_mode: bool = True, lazy_launch: bool = True, launch_server: bool = None, ) -> AgentBase: """Convert current agent instance into a distributed version. Args: host (`str`, defaults to `"localhost"`): Hostname of the rpc agent server. port (`int`, defaults to `None`): Port of the rpc agent server. max_pool_size (`int`, defaults to `8192`): Only takes effect when `host` and `port` are not filled in. 
The max number of agent reply messages that the started agent server can accommodate. Note that the oldest message will be deleted after exceeding the pool size. max_timeout_seconds (`int`, defaults to `1800`): Only takes effect when `host` and `port` are not filled in. Maximum time for reply messages to be cached in the launched agent server. Note that expired messages will be deleted. local_mode (`bool`, defaults to `True`): Only takes effect when `host` and `port` are not filled in. Whether the started agent server only listens to local requests. lazy_launch (`bool`, defaults to `True`): Only takes effect when `host` and `port` are not filled in. If `True`, launch the agent server when the agent is called, otherwise, launch the agent server immediately. launch_server(`bool`, defaults to `None`): This field has been deprecated and will be removed in future releases. Returns: `AgentBase`: the wrapped agent instance with distributed functionality """ from .rpc_agent import RpcAgent if issubclass(self.__class__, RpcAgent): return self if launch_server is not None: logger.warning( "`launch_server` has been deprecated and will be removed in " "future releases. When `host` and `port` is not provided, the " "agent server will be launched automatically.", ) return RpcAgent( name=self.name, agent_class=self.__class__, agent_configs=self._init_settings, host=host, port=port, max_pool_size=max_pool_size, max_timeout_seconds=max_timeout_seconds, local_mode=local_mode, lazy_launch=lazy_launch, agent_id=self.agent_id, ) ``` modelscope/agentscope/blob/main/src/agentscope/agents/dialog_agent.py: ```py # -*- coding: utf-8 -*- """A general dialog agent.""" from typing import Optional, Union, Sequence from ..message import Msg from .agent import AgentBase class DialogAgent(AgentBase): """A simple agent used to perform a dialogue. Your can set its role by `sys_prompt`.""" def __init__( self, name: str, sys_prompt: str, model_config_name: str, use_memory: bool = True, memory_config: Optional[dict] = None, ) -> None: """Initialize the dialog agent. Arguments: name (`str`): The name of the agent. sys_prompt (`Optional[str]`): The system prompt of the agent, which can be passed by args or hard-coded in the agent. model_config_name (`str`): The name of the model config, which is used to load model from configuration. use_memory (`bool`, defaults to `True`): Whether the agent has memory. memory_config (`Optional[dict]`): The config of memory. """ super().__init__( name=name, sys_prompt=sys_prompt, model_config_name=model_config_name, use_memory=use_memory, memory_config=memory_config, ) def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: """Reply function of the agent. Processes the input data, generates a prompt using the current dialogue memory and system prompt, and invokes the language model to produce a response. The response is then formatted and added to the dialogue memory. Args: x (`Optional[Union[Msg, Sequence[Msg]]]`, defaults to `None`): The input message(s) to the agent, which also can be omitted if the agent doesn't need any input. Returns: `Msg`: The output message generated by the agent. 
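# --- Illustrative sketch, not part of dialog_agent.py -----------------------
# Any local agent can be turned into its distributed (actor-based) version
# with `to_dist()`, which wraps it in an RpcAgent; when no host/port is given,
# a local agent server is launched lazily. "my_config" is an assumed model
# config name.
from agentscope.agents import DialogAgent
from agentscope.message import Msg

assistant = DialogAgent(
    name="assistant",
    sys_prompt="You are a helpful assistant.",
    model_config_name="my_config",  # assumption: config registered beforehand
).to_dist()  # returns an RpcAgent wrapper around this agent

# Calls now go through gRPC and return placeholder messages that resolve
# only when their content is actually used:
# reply = assistant(Msg("user", "Hello!", role="user"))
# -----------------------------------------------------------------------------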
""" # record the input if needed if self.memory: self.memory.add(x) # prepare prompt prompt = self.model.format( Msg("system", self.sys_prompt, role="system"), self.memory and self.memory.get_memory() or x, # type: ignore[arg-type] ) # call llm and generate response response = self.model(prompt) # Print/speak the message in this agent's voice # Support both streaming and non-streaming responses by "or" self.speak(response.stream or response.text) msg = Msg(self.name, response.text, role="assistant") # Record the message in memory if self.memory: self.memory.add(msg) return msg ``` modelscope/agentscope/blob/main/src/agentscope/agents/dict_dialog_agent.py: ```py # -*- coding: utf-8 -*- """An agent that replies in a dictionary format.""" from typing import Optional, Union, Sequence from ..message import Msg from .agent import AgentBase from ..parsers import ParserBase class DictDialogAgent(AgentBase): """An agent that generates response in a dict format, where user can specify the required fields in the response via specifying the parser About parser, please refer to our [tutorial](https://modelscope.github.io/agentscope/en/tutorial/203-parser.html) For usage example, please refer to the example of werewolf in `examples/game_werewolf`""" def __init__( self, name: str, sys_prompt: str, model_config_name: str, use_memory: bool = True, memory_config: Optional[dict] = None, max_retries: Optional[int] = 3, ) -> None: """Initialize the dict dialog agent. Arguments: name (`str`): The name of the agent. sys_prompt (`Optional[str]`, defaults to `None`): The system prompt of the agent, which can be passed by args or hard-coded in the agent. model_config_name (`str`, defaults to None): The name of the model config, which is used to load model from configuration. use_memory (`bool`, defaults to `True`): Whether the agent has memory. memory_config (`Optional[dict]`, defaults to `None`): The config of memory. max_retries (`Optional[int]`, defaults to `None`): The maximum number of retries when failed to parse the model output. """ # noqa super().__init__( name=name, sys_prompt=sys_prompt, model_config_name=model_config_name, use_memory=use_memory, memory_config=memory_config, ) self.parser = None self.max_retries = max_retries def set_parser(self, parser: ParserBase) -> None: """Set response parser, which will provide 1) format instruction; 2) response parsing; 3) filtering fields when returning message, storing message in memory. So developers only need to change the parser, and the agent will work as expected. """ self.parser = parser def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: """Reply function of the agent. Processes the input data, generates a prompt using the current dialogue memory and system prompt, and invokes the language model to produce a response. The response is then formatted and added to the dialogue memory. Args: x (`Optional[Union[Msg, Sequence[Msg]]]`, defaults to `None`): The input message(s) to the agent, which also can be omitted if the agent doesn't need any input. Returns: `Msg`: The output message generated by the agent. Raises: `json.decoder.JSONDecodeError`: If the response from the language model is not valid JSON, it defaults to treating the response as plain text. 
""" # record the input if needed if self.memory: self.memory.add(x) # prepare prompt prompt = self.model.format( Msg("system", self.sys_prompt, role="system"), self.memory and self.memory.get_memory() or x, # type: ignore[arg-type] Msg("system", self.parser.format_instruction, "system"), ) # call llm raw_response = self.model(prompt) self.speak(raw_response.stream or raw_response.text) # Parsing the raw response res = self.parser.parse(raw_response) # Filter the parsed response by keys for storing in memory, returning # in the reply function, and feeding into the metadata field in the # returned message object. if self.memory: self.memory.add( Msg(self.name, self.parser.to_memory(res.parsed), "assistant"), ) msg = Msg( self.name, content=self.parser.to_content(res.parsed), role="assistant", metadata=self.parser.to_metadata(res.parsed), ) return msg ``` modelscope/agentscope/blob/main/src/agentscope/agents/operator.py: ```py # -*- coding: utf-8 -*- """A common base class for AgentBase and PipelineBase""" from abc import ABC from abc import abstractmethod from typing import Any class Operator(ABC): """ Abstract base class `Operator` defines a protocol for classes that implement callable behavior. The class is designed to be subclassed with an overridden `__call__` method that specifies the execution logic for the operator. """ @abstractmethod def __call__(self, *args: Any, **kwargs: Any) -> dict: """Calling function""" ``` modelscope/agentscope/blob/main/src/agentscope/agents/rag_agent.py: ```py # -*- coding: utf-8 -*- """ This example shows how to build an agent with RAG with LlamaIndex. Notice, this is a Beta version of RAG agent. """ from typing import Any, Optional, Union, Sequence from loguru import logger from agentscope.agents.agent import AgentBase from agentscope.message import Msg from agentscope.rag import Knowledge CHECKING_PROMPT = """ Is the retrieved content relevant to the query? Retrieved content: {} Query: {} Only answer YES or NO. """ class LlamaIndexAgent(AgentBase): """ A LlamaIndex agent build on LlamaIndex. """ def __init__( self, name: str, sys_prompt: str, model_config_name: str, knowledge_list: list[Knowledge] = None, knowledge_id_list: list[str] = None, similarity_top_k: int = None, log_retrieval: bool = True, recent_n_mem_for_retrieve: int = 1, **kwargs: Any, ) -> None: """ Initialize the RAG LlamaIndexAgent Args: name (str): the name for the agent sys_prompt (str): system prompt for the RAG agent model_config_name (str): language model for the agent knowledge_list (list[Knowledge]): a list of knowledge. User can choose to pass a list knowledge object directly when initializing the RAG agent. Another choice can be passing a list of knowledge ids and obtain the knowledge with the `equip` function of a knowledge bank. knowledge_id_list (list[Knowledge]): a list of id of the knowledge. This is designed for easy setting up multiple RAG agents with a config file. To obtain the knowledge objects, users can pass this agent to the `equip` function in a knowledge bank to add corresponding knowledge to agent's self.knowledge_list. 
similarity_top_k (int): the number of most similar data blocks retrieved from each of the knowledge log_retrieval (bool): whether to print the retrieved content recent_n_mem_for_retrieve (int): the number of pieces of memory used as part of retrival query """ super().__init__( name=name, sys_prompt=sys_prompt, model_config_name=model_config_name, ) self.knowledge_list = knowledge_list or [] self.knowledge_id_list = knowledge_id_list or [] self.similarity_top_k = similarity_top_k self.log_retrieval = log_retrieval self.recent_n_mem_for_retrieve = recent_n_mem_for_retrieve self.description = kwargs.get("description", "") def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: """ Reply function of the RAG agent. Processes the input data, 1) use the input data to retrieve with RAG function; 2) generates a prompt using the current memory and system prompt; 3) invokes the language model to produce a response. The response is then formatted and added to the dialogue memory. Args: x (`Optional[Union[Msg, Sequence[Msg]]]`, defaults to `None`): The input message(s) to the agent, which also can be omitted if the agent doesn't need any input. Returns: `Msg`: The output message generated by the agent. """ retrieved_docs_to_string = "" # record the input if needed if self.memory: self.memory.add(x) # in case no input is provided (e.g., in msghub), # use the memory as query history = self.memory.get_memory( recent_n=self.recent_n_mem_for_retrieve, ) query = ( "/n".join( [msg["content"] for msg in history], ) if isinstance(history, list) else str(history) ) elif x is not None: query = x.content else: query = "" if len(query) > 0: # when content has information, do retrieval scores = [] for knowledge in self.knowledge_list: retrieved_nodes = knowledge.retrieve( str(query), self.similarity_top_k, ) for node in retrieved_nodes: scores.append(node.score) retrieved_docs_to_string += ( "\n>>>> score:" + str(node.score) + "\n>>>> source:" + str(node.node.get_metadata_str()) + "\n>>>> content:" + node.get_content() ) if self.log_retrieval: self.speak("[retrieved]:" + retrieved_docs_to_string) if max(scores) < 0.4: # if the max score is lower than 0.4, then we let LLM # decide whether the retrieved content is relevant # to the user input. msg = Msg( name="user", role="user", content=CHECKING_PROMPT.format( retrieved_docs_to_string, query, ), ) msg = self.model.format(msg) checking = self.model(msg) logger.info(checking) checking = checking.text.lower() if "no" in checking: retrieved_docs_to_string = "EMPTY" # prepare prompt prompt = self.model.format( Msg( name="system", role="system", content=self.sys_prompt, ), # {"role": "system", "content": retrieved_docs_to_string}, self.memory.get_memory( recent_n=self.recent_n_mem_for_retrieve, ), Msg( name="user", role="user", content="Context: " + retrieved_docs_to_string, ), ) # call llm and generate response response = self.model(prompt).text msg = Msg(self.name, response) # Print/speak the message in this agent's voice self.speak(msg) if self.memory: # Record the message in memory self.memory.add(msg) return msg ``` modelscope/agentscope/blob/main/src/agentscope/agents/react_agent.py: ```py # -*- coding: utf-8 -*- """An agent class that implements the ReAct algorithm. The agent will reason and act iteratively to solve problems. More details can be found in the paper https://arxiv.org/abs/2210.03629. 
""" from typing import Optional, Union, Sequence from agentscope.exception import ResponseParsingError, FunctionCallError from agentscope.agents import AgentBase from agentscope.message import Msg from agentscope.parsers.regex_tagged_content_parser import ( RegexTaggedContentParser, ) from agentscope.service import ServiceToolkit INSTRUCTION_PROMPT = """## What You Should Do: 1. First, analyze the current situation, and determine your goal. 2. Then, check if your goal is already achieved. If so, try to generate a response. Otherwise, think about how to achieve it with the help of provided tool functions. 3. Respond in the required format. ## Note: 1. Fully understand the tool functions and their arguments before using them. 2. You should decide if you need to use the tool functions, if not then return an empty list in "function" field. 3. Make sure the types and values of the arguments you provided to the tool functions are correct. 4. Don't take things for granted. For example, where you are, what's the time now, etc. You can try to use the tool functions to get information. 5. If the function execution fails, you should analyze the error and try to solve it. """ # noqa class ReActAgent(AgentBase): """An agent class that implements the ReAct algorithm. More details refer to https://arxiv.org/abs/2210.03629. Note this is an example implementation of ReAct algorithm in AgentScope. We follow the idea within the paper, but the detailed prompt engineering maybe different. Developers are encouraged to modify the prompt to fit their own needs. """ def __init__( self, name: str, model_config_name: str, service_toolkit: ServiceToolkit, sys_prompt: str = "You're a helpful assistant. Your name is {name}.", max_iters: int = 10, verbose: bool = True, ) -> None: """Initialize the ReAct agent with the given name, model config name and tools. Args: name (`str`): The name of the agent. sys_prompt (`str`): The system prompt of the agent. model_config_name (`str`): The name of the model config, which is used to load model from configuration. service_toolkit (`ServiceToolkit`): A `ServiceToolkit` object that contains the tool functions. max_iters (`int`, defaults to `10`): The maximum number of iterations of the reasoning-acting loops. verbose (`bool`, defaults to `True`): Whether to print the detailed information during reasoning and acting steps. If `False`, only the content in speak field will be print out. 
""" super().__init__( name=name, sys_prompt=sys_prompt, model_config_name=model_config_name, ) self.service_toolkit = service_toolkit self.verbose = verbose self.max_iters = max_iters if not sys_prompt.endswith("\n"): sys_prompt = sys_prompt + "\n" self.sys_prompt = "\n".join( [ # The brief intro of the role and target sys_prompt.format(name=self.name), # The instruction prompt for tools self.service_toolkit.tools_instruction, # The detailed instruction prompt for the agent INSTRUCTION_PROMPT, ], ) # Put sys prompt into memory self.memory.add(Msg("system", self.sys_prompt, role="system")) # Initialize a parser object to formulate the response from the model self.parser = RegexTaggedContentParser( format_instruction="""Respond with specific tags as outlined below: - When calling tool functions (note the "arg_name" should be replaced with the actual argument name): what you thought the function name you want to call the value of the argument the value of the argument - When you want to generate a final response: what you thought what you respond ...""", # noqa try_parse_json=True, required_keys=["thought"], keys_to_content="response", ) def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: """The reply function that achieves the ReAct algorithm. The more details please refer to https://arxiv.org/abs/2210.03629""" self.memory.add(x) for _ in range(self.max_iters): # Step 1: Thought if self.verbose: self.speak(f" ITER {_+1}, STEP 1: REASONING ".center(70, "#")) # Prepare hint to remind model what the response format is # Won't be recorded in memory to save tokens hint_msg = Msg( "system", self.parser.format_instruction, role="system", echo=self.verbose, ) # Prepare prompt for the model prompt = self.model.format(self.memory.get_memory(), hint_msg) # Generate and parse the response try: raw_response = self.model(prompt) # Print out the text generated by llm in non-/streaming mode if self.verbose: # To be compatible with streaming and non-streaming mode self.speak(raw_response.stream or raw_response.text) res = self.parser.parse(raw_response) # Record the raw text into memory to avoid that LLMs learn # from the previous response format self.memory.add(Msg(self.name, res.text, "assistant")) # Skip the next steps if no need to call tools # The parsed field is a dictionary arg_function = res.parsed.get("function", "") if ( isinstance(arg_function, str) and arg_function in ["[]", ""] or isinstance(arg_function, list) and len(arg_function) == 0 ): # Only the response field is exposed to users or other # agents msg_returned = Msg( self.name, res.parsed.get("response", res.text), "assistant", ) if not self.verbose: # Print out the returned message self.speak(msg_returned) return msg_returned # Only catch the response parsing error and expose runtime # errors to developers for debugging except ResponseParsingError as e: # Print out raw response from models for developers to debug response_msg = Msg(self.name, e.raw_response, "assistant") if not self.verbose: self.speak(response_msg) # Re-correct by model itself error_msg = Msg("system", str(e), "system") self.speak(error_msg) self.memory.add([response_msg, error_msg]) # Skip acting step to re-correct the response continue # Step 2: Acting if self.verbose: self.speak(f" ITER {_+1}, STEP 2: ACTING ".center(70, "#")) # Parse, check and execute the tool functions in service toolkit try: # Reorganize the parsed response to the required format of the # service toolkit res.parsed["function"] = [ { "name": res.parsed["function"], "arguments": { 
k: v for k, v in res.parsed.items() if k not in ["speak", "thought", "function"] }, }, ] # Execute the function execute_results = self.service_toolkit.parse_and_call_func( res.parsed["function"], ) # Note: Observing the execution results and generate response # are finished in the next reasoning step. We just put the # execution results into memory, and wait for the next loop # to generate response. # Record execution results into memory as system message msg_res = Msg("system", execute_results, "system") self.speak(msg_res) self.memory.add(msg_res) except FunctionCallError as e: # Catch the function calling error that can be handled by # the model error_msg = Msg("system", str(e), "system") self.speak(error_msg) self.memory.add(error_msg) # Exceed the maximum iterations hint_msg = Msg( "system", "You have failed to generate a response in the maximum " "iterations. Now generate a reply by summarizing the current " "situation.", role="system", echo=self.verbose, ) # Generate a reply by summarizing the current situation prompt = self.model.format(self.memory.get_memory(), hint_msg) res = self.model(prompt) self.speak(res.stream or res.text) res_msg = Msg(self.name, res.text, "assistant") return res_msg ``` modelscope/agentscope/blob/main/src/agentscope/agents/rpc_agent.py: ```py # -*- coding: utf-8 -*- """ Base class for Rpc Agent """ from typing import Type, Optional, Union, Sequence from agentscope.agents.agent import AgentBase from agentscope.message import ( PlaceholderMessage, serialize, Msg, ) from agentscope.rpc import RpcAgentClient from agentscope.server.launcher import RpcAgentServerLauncher from agentscope.studio._client import _studio_client class RpcAgent(AgentBase): """A wrapper to extend an AgentBase into a gRPC Client.""" def __init__( self, name: str, host: str = "localhost", port: int = None, agent_class: Type[AgentBase] = None, agent_configs: Optional[dict] = None, max_pool_size: int = 8192, max_timeout_seconds: int = 1800, local_mode: bool = True, lazy_launch: bool = True, agent_id: str = None, connect_existing: bool = False, ) -> None: """Initialize a RpcAgent instance. Args: name (`str`): the name of the agent. host (`str`, defaults to `localhost`): Hostname of the rpc agent server. port (`int`, defaults to `None`): Port of the rpc agent server. agent_class (`Type[AgentBase]`): the AgentBase subclass of the source agent. agent_configs (`dict`): The args used to init configs of the agent, generated by `_AgentMeta`. max_pool_size (`int`, defaults to `8192`): Max number of task results that the server can accommodate. max_timeout_seconds (`int`, defaults to `1800`): Timeout for task results. local_mode (`bool`, defaults to `True`): Whether the started gRPC server only listens to local requests. lazy_launch (`bool`, defaults to `True`): Only launch the server when the agent is called. agent_id (`str`, defaults to `None`): The agent id of this instance. If `None`, it will be generated randomly. connect_existing (`bool`, defaults to `False`): Set to `True`, if the agent is already running on the agent server. 
""" super().__init__(name=name) self.agent_class = agent_class self.agent_configs = agent_configs self.host = host self.port = port self.server_launcher = None self.client = None self.connect_existing = connect_existing if agent_id is not None: self._agent_id = agent_id # if host and port are not provided, launch server locally launch_server = port is None if launch_server: self.host = "localhost" studio_url = None if _studio_client.active: studio_url = _studio_client.studio_url self.server_launcher = RpcAgentServerLauncher( host=self.host, port=port, max_pool_size=max_pool_size, max_timeout_seconds=max_timeout_seconds, local_mode=local_mode, custom_agent_classes=[agent_class], studio_url=studio_url, ) if not lazy_launch: self._launch_server() else: self.client = RpcAgentClient( host=self.host, port=self.port, agent_id=self.agent_id, ) if not self.connect_existing: self.client.create_agent( agent_configs, ) def _launch_server(self) -> None: """Launch a rpc server and update the port and the client""" self.server_launcher.launch() self.port = self.server_launcher.port self.client = RpcAgentClient( host=self.host, port=self.port, agent_id=self.agent_id, ) self.client.create_agent(self.agent_configs) def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: if self.client is None: self._launch_server() return PlaceholderMessage( name=self.name, content=None, client=self.client, x=x, ) def observe(self, x: Union[dict, Sequence[dict]]) -> None: if self.client is None: self._launch_server() self.client.call_agent_func( func_name="_observe", value=serialize(x), # type: ignore[arg-type] ) def clone_instances( self, num_instances: int, including_self: bool = True, ) -> Sequence[AgentBase]: """ Clone a series of this instance with different agent_id and return them as a list. Args: num_instances (`int`): The number of instances in the returned list. including_self (`bool`): Whether to include the instance calling this method in the returned list. Returns: `Sequence[AgentBase]`: A list of agent instances. """ generated_instance_number = ( num_instances - 1 if including_self else num_instances ) generated_instances = [] # launch the server before clone instances if self.client is None: self._launch_server() # put itself as the first element of the returned list if including_self: generated_instances.append(self) # clone instances without agent server for _ in range(generated_instance_number): new_agent_id = self.client.clone_agent(self.agent_id) generated_instances.append( RpcAgent( name=self.name, host=self.host, port=self.port, agent_id=new_agent_id, connect_existing=True, ), ) return generated_instances def stop(self) -> None: """Stop the RpcAgent and the rpc server.""" if self.server_launcher is not None: self.server_launcher.shutdown() def __del__(self) -> None: self.stop() ``` modelscope/agentscope/blob/main/src/agentscope/agents/text_to_image_agent.py: ```py # -*- coding: utf-8 -*- """An agent that convert text to image.""" from typing import Optional, Union, Sequence from loguru import logger from .agent import AgentBase from ..message import Msg class TextToImageAgent(AgentBase): """ A agent used to perform text to image tasks. TODO: change the agent into a service. """ def __init__( self, name: str, model_config_name: str, use_memory: bool = True, memory_config: Optional[dict] = None, ) -> None: """Initialize the text to image agent. Arguments: name (`str`): The name of the agent. 
model_config_name (`str`, defaults to None): The name of the model config, which is used to load model from configuration. use_memory (`bool`, defaults to `True`): Whether the agent has memory. memory_config (`Optional[dict]`): The config of memory. """ super().__init__( name=name, sys_prompt="", model_config_name=model_config_name, use_memory=use_memory, memory_config=memory_config, ) logger.warning( "The `TextToImageAgent` will be deprecated in v0.0.6, " "please use `text_to_image` service and `ReActAgent` instead.", ) def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: if self.memory: self.memory.add(x) if x is None: # get the last message from memory if self.memory and self.memory.size() > 0: x = self.memory.get_memory()[-1] else: return Msg( self.name, content="Please provide a text prompt to generate image.", role="assistant", ) image_urls = self.model(x.content).image_urls # TODO: optimize the construction of content msg = Msg( self.name, content="This is the generated image", role="assistant", url=image_urls, ) self.speak(msg) if self.memory: self.memory.add(msg) return msg ``` modelscope/agentscope/blob/main/src/agentscope/agents/user_agent.py: ```py # -*- coding: utf-8 -*- """User Agent class""" import time from typing import Union, Sequence from typing import Optional from loguru import logger from agentscope.agents import AgentBase from agentscope.studio._client import _studio_client from agentscope.message import Msg from agentscope.web.gradio.utils import user_input class UserAgent(AgentBase): """User agent class""" def __init__(self, name: str = "User", require_url: bool = False) -> None: """Initialize a UserAgent object. Arguments: name (`str`, defaults to `"User"`): The name of the agent. Defaults to "User". require_url (`bool`, defaults to `False`): Whether the agent requires user to input a URL. Defaults to False. The URL can lead to a website, a file, or a directory. It will be added into the generated message in field `url`. """ super().__init__(name=name) self.name = name self.require_url = require_url def reply( self, x: Optional[Union[Msg, Sequence[Msg]]] = None, required_keys: Optional[Union[list[str], str]] = None, timeout: Optional[int] = None, ) -> Msg: """ Processes the input provided by the user and stores it in memory, potentially formatting it with additional provided details. The method prompts the user for input, then optionally prompts for additional specifics based on the provided format keys. All information is encapsulated in a message object, which is then added to the object's memory. Arguments: x (`Optional[Union[Msg, Sequence[Msg]]]`, defaults to `None`): The input message(s) to the agent, which also can be omitted if the agent doesn't need any input. required_keys \ (`Optional[Union[list[str], str]]`, defaults to `None`): Strings that requires user to input, which will be used as the key of the returned dict. Defaults to None. timeout (`Optional[int]`, defaults to `None`): Raise `TimeoutError` if user exceed input time, set to None for no limit. Returns: `Msg`: The output message generated by the agent. 
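# --- Illustrative sketch, not part of user_agent.py -------------------------
# A hedged example of collecting structured input from the terminal.
# `required_keys` prompts once per key and attaches the answers to the
# returned message; `require_url` additionally asks for a URL.
from agentscope.agents import UserAgent

user = UserAgent(name="Player", require_url=False)
msg = user(required_keys=["target", "reason"])  # blocks for terminal input
# msg.content holds the free-form answer; msg["target"] / msg["reason"]
# hold the extra fields entered for each required key.
# -----------------------------------------------------------------------------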
""" if self.memory: self.memory.add(x) if _studio_client.active: logger.info( f"Waiting for input from:\n\n" f" * {_studio_client.get_run_detail_page_url()}\n", ) raw_input = _studio_client.get_user_input( agent_id=self.agent_id, name=self.name, require_url=self.require_url, required_keys=required_keys, ) print("Python: receive ", raw_input) content = raw_input["content"] url = raw_input["url"] kwargs = {} else: # TODO: To avoid order confusion, because `input` print much # quicker than logger.chat time.sleep(0.5) content = user_input(timeout=timeout) kwargs = {} if required_keys is not None: if isinstance(required_keys, str): required_keys = [required_keys] for key in required_keys: kwargs[key] = input(f"{key}: ") # Input url of file, image, video, audio or website url = None if self.require_url: url = input("URL (or Enter to skip): ") if url == "": url = None # Add additional keys msg = Msg( name=self.name, role="user", content=content, url=url, **kwargs, # type: ignore[arg-type] ) self.speak(msg) # Add to memory if self.memory: self.memory.add(msg) return msg def speak( self, content: Union[str, Msg], ) -> None: """ Speak out the message generated by the agent. If a string is given, a Msg object will be created with the string as the content. Args: content (`Union[str, Msg]`): The content of the message to be spoken out. If a string is given, a Msg object will be created with the agent's name, role as "user", and the given string as the content. """ if isinstance(content, str): msg = Msg( name=self.name, content=content, role="assistant", ) _studio_client.push_message(msg) elif isinstance(content, Msg): msg = content else: raise TypeError( "From version 0.0.5, the speak method only accepts str or Msg " f"object, got {type(content)} instead.", ) logger.chat(msg, disable_gradio=True) ``` modelscope/agentscope/blob/main/src/agentscope/constants.py: ```py # -*- coding: utf-8 -*- """ Some constants used in the project""" from numbers import Number from enum import IntEnum PACKAGE_NAME = "agentscope" MSG_TOKEN = f"[{PACKAGE_NAME}_msg]" # default values # for file manager _DEFAULT_DIR = "./runs" _DEFAULT_LOG_LEVEL = "INFO" _DEFAULT_SUBDIR_CODE = "code" _DEFAULT_SUBDIR_FILE = "file" _DEFAULT_SUBDIR_INVOKE = "invoke" _DEFAULT_CFG_NAME = ".config" _DEFAULT_IMAGE_NAME = "image_{}_{}.png" _DEFAULT_SQLITE_DB_PATH = "agentscope.db" # for model wrapper _DEFAULT_MAX_RETRIES = 3 _DEFAULT_MESSAGES_KEY = "inputs" _DEFAULT_RETRY_INTERVAL = 1 _DEFAULT_API_BUDGET = None # for execute python _DEFAULT_PYPI_MIRROR = "http://mirrors.aliyun.com/pypi/simple/" _DEFAULT_TRUSTED_HOST = "mirrors.aliyun.com" # for monitor _DEFAULT_MONITOR_TABLE_NAME = "monitor_metrics" # for summarization _DEFAULT_SUMMARIZATION_PROMPT = """ TEXT: {} """ _DEFAULT_SYSTEM_PROMPT = """ You are a helpful agent to summarize the text. You need to keep all the key information of the text in the summary. 
""" _DEFAULT_TOKEN_LIMIT_PROMPT = """ Summarize the text after TEXT in less than {} tokens: """ # typing Embedding = list[Number] # rpc # set max message size to 32 MB _DEFAULT_RPC_OPTIONS = [ ("grpc.max_send_message_length", 32 * 1024 * 1024), ("grpc.max_receive_message_length", 32 * 1024 * 1024), ] # enums class ResponseFormat(IntEnum): """Enum for model response format.""" NONE = 0 JSON = 1 class ShrinkPolicy(IntEnum): """Enum for shrink strategies when the prompt is too long.""" TRUNCATE = 0 SUMMARIZE = 1 # rag related DEFAULT_CHUNK_SIZE = 1024 DEFAULT_CHUNK_OVERLAP = 20 DEFAULT_TOP_K = 5 ``` modelscope/agentscope/blob/main/src/agentscope/exception.py: ```py # -*- coding: utf-8 -*- """AgentScope exception classes.""" # - Model Response Parsing Exceptions class ResponseParsingError(Exception): """The exception class for response parsing error with uncertain reasons.""" raw_response: str """Record the raw response.""" def __init__(self, message: str, raw_response: str = None) -> None: """Initialize the exception with the message.""" self.message = message self.raw_response = raw_response def __str__(self) -> str: return f"{self.__class__.__name__}: {self.message}" class JsonParsingError(ResponseParsingError): """The exception class for JSON parsing error.""" class JsonDictValidationError(ResponseParsingError): """The exception class for JSON dict validation error.""" class JsonTypeError(ResponseParsingError): """The exception class for JSON type error.""" class RequiredFieldNotFoundError(ResponseParsingError): """The exception class for missing required field in model response, when the response is required to be a JSON dict object with required fields.""" class TagNotFoundError(ResponseParsingError): """The exception class for missing tagged content in model response.""" missing_begin_tag: bool """If the response misses the begin tag.""" missing_end_tag: bool """If the response misses the end tag.""" def __init__( self, message: str, raw_response: str = None, missing_begin_tag: bool = True, missing_end_tag: bool = True, ): """Initialize the exception with the message. Args: raw_response (`str`): Record the raw response from the model. missing_begin_tag (`bool`, defaults to `True`): If the response misses the beginning tag, default to `True`. missing_end_tag (`bool`, defaults to `True`): If the response misses the end tag, default to `True`. 
""" super().__init__(message, raw_response) self.missing_begin_tag = missing_begin_tag self.missing_end_tag = missing_end_tag # - Function Calling Exceptions class FunctionCallError(Exception): """The base class for exception raising during calling functions.""" def __init__(self, message: str) -> None: self.message = message def __str__(self) -> str: return f"{self.__class__.__name__}: {self.message}" class FunctionCallFormatError(FunctionCallError): """The exception class for function calling format error.""" class FunctionNotFoundError(FunctionCallError): """The exception class for function not found error.""" class ArgumentNotFoundError(FunctionCallError): """The exception class for missing argument error.""" class ArgumentTypeError(FunctionCallError): """The exception class for argument type error.""" # - AgentScope Studio Exceptions class StudioError(Exception): """The base class for exception raising during interaction with agentscope studio.""" def __init__(self, message: str) -> None: self.message = message def __str__(self) -> str: return f"{self.__class__.__name__}: {self.message}" class StudioRegisterError(StudioError): """The exception class for error when registering to agentscope studio.""" # - Agent Server Exceptions class AgentServerError(Exception): """The exception class for agent server related errors.""" host: str """Hostname of the server.""" port: int """Port of the server.""" message: str """Error message""" def __init__( self, host: str, port: int, message: str = None, ) -> None: """Initialize the exception with the message.""" self.host = host self.port = port self.message = message def __str__(self) -> str: err_msg = f"{self.__class__.__name__}[{self.host}:{self.port}]" if self.message is not None: err_msg += f": {self.message}" return err_msg class AgentServerNotAliveError(AgentServerError): """The exception class for agent server not alive error.""" class AgentCreationError(AgentServerError): """The exception class for failing to create agent.""" class AgentCallError(AgentServerError): """The exception class for failing to call agent.""" ``` modelscope/agentscope/blob/main/src/agentscope/file_manager.py: ```py # -*- coding: utf-8 -*- """Manage the file system for saving files, code and logs.""" import json import os import io from typing import Any, Union, Optional, List, Literal from pathlib import Path import numpy as np from agentscope._runtime import _runtime from agentscope.utils.tools import _download_file, _get_timestamp, _hash_string from agentscope.utils.tools import _generate_random_code from agentscope.constants import ( _DEFAULT_DIR, _DEFAULT_SUBDIR_CODE, _DEFAULT_SUBDIR_FILE, _DEFAULT_SUBDIR_INVOKE, _DEFAULT_SQLITE_DB_PATH, _DEFAULT_IMAGE_NAME, _DEFAULT_CFG_NAME, ) def _get_text_embedding_record_hash( text: str, embedding_model: Optional[Union[str, dict]], hash_method: Literal["sha256", "md5", "sha1"] = "sha256", ) -> str: """Get the hash of the text embedding record.""" original_data_hash = _hash_string(text, hash_method) if isinstance(embedding_model, dict): # Format the dict to avoid duplicate keys embedding_model = json.dumps(embedding_model, sort_keys=True) embedding_model_hash = _hash_string(embedding_model, hash_method) # Calculate the embedding id by hashing the hash codes of the # original data and the embedding model record_hash = _hash_string( original_data_hash + embedding_model_hash, hash_method, ) return record_hash class _FileManager: """A singleton class for managing the file system for saving files, code and logs.""" _instance = 
None cache_dir: str = str(Path.home() / ".cache" / "agentscope") hash_method: Literal["sha256", "md5", "sha1"] = "sha256" dir: str = os.path.abspath(_DEFAULT_DIR) """The directory for saving files, code and logs.""" save_api_invoke: bool = False """Whether to save api invocation locally.""" def __new__(cls, *args: Any, **kwargs: Any) -> Any: """Create a singleton instance.""" if not cls._instance: cls._instance = super(_FileManager, cls).__new__( cls, *args, **kwargs, ) return cls._instance def _get_and_create_subdir(self, subdir: str) -> str: """Get the path of the subdir and create the subdir if it does not exist.""" path = os.path.join(self.dir, _runtime.runtime_id, subdir) os.makedirs(path, exist_ok=True) return path def _get_file_path(self, file_name: str) -> str: """Get the path of the file.""" return os.path.join(self.dir, _runtime.runtime_id, file_name) @property def dir_cache(self) -> str: """The directory for saving cache files.""" return self.cache_dir @property def dir_cache_embedding(self) -> str: """Obtain the embedding cache directory.""" dir_cache_embedding = os.path.join(self.cache_dir, "embedding") if not os.path.exists(dir_cache_embedding): os.makedirs(dir_cache_embedding) return dir_cache_embedding @property def dir_root(self) -> str: """The root directory to save code, information and logs.""" return os.path.join(self.dir, _runtime.runtime_id) @property def dir_log(self) -> str: """The directory for saving logs.""" return os.path.join(self.dir, _runtime.runtime_id) @property def dir_file(self) -> str: """The directory for saving files, including images, audios and videos.""" return self._get_and_create_subdir(_DEFAULT_SUBDIR_FILE) @property def dir_code(self) -> str: """The directory for saving codes.""" return self._get_and_create_subdir(_DEFAULT_SUBDIR_CODE) @property def dir_invoke(self) -> str: """The directory for saving api invocations.""" return self._get_and_create_subdir(_DEFAULT_SUBDIR_INVOKE) @property def path_db(self) -> str: """The path to the sqlite db file.""" return self._get_file_path(_DEFAULT_SQLITE_DB_PATH) def init(self, save_dir: str, save_api_invoke: bool = False) -> None: """Set the directory for saving files.""" self.dir = os.path.abspath(save_dir) runtime_dir = os.path.join(save_dir, _runtime.runtime_id) os.makedirs(runtime_dir, exist_ok=True) self.save_api_invoke = save_api_invoke # Save the project and name to the runtime directory self._save_config() def _save_config(self) -> None: """Save the configuration of the runtime in its root directory.""" cfg = { "project": _runtime.project, "name": _runtime.name, "run_id": _runtime.runtime_id, "timestamp": _runtime.timestamp, "pid": os.getpid(), } with open( os.path.join(self.dir_root, _DEFAULT_CFG_NAME), "w", encoding="utf-8", ) as file: json.dump(cfg, file, indent=4) def save_api_invocation( self, prefix: str, record: dict, ) -> Union[None, str]: """Save api invocation locally.""" if self.save_api_invoke: filename = f"{prefix}_{_generate_random_code()}.json" path_save = os.path.join(str(self.dir_invoke), filename) with open(path_save, "w", encoding="utf-8") as file: json.dump(record, file, indent=4, ensure_ascii=False) return filename else: return None def save_image( self, image: Union[str, np.ndarray, bytes], filename: Optional[str] = None, ) -> str: """Save image file locally, and return the local image path. Args: image (`Union[str, np.ndarray]`): The image url, or the image array. filename (`Optional[str]`): The filename of the image. If not specified, a random filename will be used. 
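# --- Illustrative sketch, not part of file_manager.py -----------------------
# After `agentscope.init(save_dir="./runs", ...)` each run gets its own
# directory named by the runtime id, with the sub-directories defined in
# constants.py. A small sketch for inspecting the layout of the current run:
from agentscope.file_manager import file_manager

print(file_manager.dir_root)    # ./runs/run_<date>-<time>_<name>
print(file_manager.dir_code)    # .../code   - saved application code
print(file_manager.dir_file)    # .../file   - saved images, audio and video
print(file_manager.dir_invoke)  # .../invoke - saved API invocation records
print(file_manager.path_db)     # .../agentscope.db - sqlite db for the monitor
# -----------------------------------------------------------------------------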
""" if filename is None: filename = _DEFAULT_IMAGE_NAME.format( _get_timestamp( "%Y%m%d-%H%M%S", ), _generate_random_code(), ) path_file = os.path.join(self.dir_file, filename) if isinstance(image, str): # download the image from url _download_file(image, path_file) elif isinstance(image, np.ndarray): from PIL import Image # save image via PIL Image.fromarray(image).save(path_file) elif isinstance(image, bytes): from PIL import Image # save image via bytes Image.open(io.BytesIO(image)).save(path_file) else: raise ValueError( f"Unsupported image type: {type(image)}" "Must be str, np.ndarray, or bytes.", ) return path_file def cache_text_embedding( self, text: str, embedding: List[float], embedding_model: Union[str, dict], ) -> None: """Cache the text embedding locally.""" record_hash = _get_text_embedding_record_hash( text, embedding_model, self.hash_method, ) # Save the embedding to the cache directory np.save( os.path.join( self.dir_cache_embedding, f"{record_hash}.npy", ), embedding, ) def fetch_cached_text_embedding( self, text: str, embedding_model: Union[str, dict], ) -> Union[None, List[float]]: """Fetch the text embedding from the cache.""" record_hash = _get_text_embedding_record_hash( text, embedding_model, self.hash_method, ) try: return np.load( os.path.join( self.dir_cache_embedding, f"{record_hash}.npy", ), ) except FileNotFoundError: return None @staticmethod def _flush() -> None: """ Only for unittest usage. Don't use this function in your code. Flush the file_manager singleton. """ global file_manager file_manager = _FileManager() file_manager = _FileManager() ``` modelscope/agentscope/blob/main/src/agentscope/logging.py: ```py # -*- coding: utf-8 -*- """Logging utilities.""" import json import os import sys from typing import Optional, Literal, Any from loguru import logger from agentscope.message import Msg from agentscope.studio._client import _studio_client from agentscope.web.gradio.utils import ( generate_image_from_name, send_msg, get_reset_msg, thread_local_data, ) LOG_LEVEL = Literal[ "TRACE", "DEBUG", "INFO", "SUCCESS", "WARNING", "ERROR", "CRITICAL", ] LEVEL_SAVE_LOG = "SAVE_LOG" LEVEL_SAVE_MSG = "SAVE_MSG" _DEFAULT_LOG_FORMAT = ( "{time:YYYY-MM-DD HH:mm:ss.SSS} | {" "level: <8} | {name}:{" "function}:{line} - {" "message}\n" ) _PREFIX_DICT = {} def log_stream_msg(msg: Msg, last: bool = True) -> None: """Print the message in different streams, including terminal, studio, and gradio if it is active. Args: msg (`Msg`): The message object to be printed. last (`bool`, defaults to `True`): True if this is the last message in the stream or a single message. Otherwise, False. """ global _PREFIX_DICT # Print msg to terminal formatted_str = msg.formatted_str(colored=True) print_str = formatted_str[_PREFIX_DICT.get(msg.id, 0) :] if last: # Remove the prefix from the dictionary del _PREFIX_DICT[msg.id] print(print_str) else: # Update the prefix in the dictionary _PREFIX_DICT[msg.id] = len(formatted_str) print(print_str, end="") # Push msg to studio if it is active if _studio_client.active: _studio_client.push_message(msg) # Print to gradio if it is active if last and hasattr(thread_local_data, "uid"): log_gradio(msg, thread_local_data.uid) if last: # Save msg into chat file _save_msg(msg) def _save_msg(msg: Msg) -> None: """Save the message into `logging.chat` and `logging.log` files. Args: msg (`Msg`): The message object to be saved. 
""" logger.log( LEVEL_SAVE_LOG, msg.formatted_str(colored=False), ) logger.log( LEVEL_SAVE_MSG, json.dumps(msg, ensure_ascii=False, default=lambda _: None), ) def log_msg(msg: Msg, disable_gradio: bool = False) -> None: """Print the message and save it into files. Note the message should be a Msg object.""" if not isinstance(msg, Msg): raise TypeError(f"Get type {type(msg)}, expect Msg object.") print(msg.formatted_str(colored=True)) # Push msg to studio if it is active if _studio_client.active: _studio_client.push_message(msg) # Print to gradio if it is active if hasattr(thread_local_data, "uid") and not disable_gradio: log_gradio(msg, thread_local_data.uid) # Save msg into chat file _save_msg(msg) def log_gradio(message: dict, uid: str, **kwargs: Any) -> None: """Send chat message to studio. Args: message (`dict`): The message to be logged. It should have "name"(or "role") and "content" keys, and the message will be logged as ": ". uid (`str`): The local value 'uid' of the thread. """ if uid: get_reset_msg(uid=uid) name = message.get("name", "default") or message.get("role", "default") avatar = kwargs.get("avatar", None) or generate_image_from_name( message["name"], ) msg = message["content"] flushing = True if "url" in message and message["url"]: flushing = False if isinstance(message["url"], str): message["url"] = [message["url"]] for i in range(len(message["url"])): msg += "\n" + f"""""" if "audio_path" in message and message["audio_path"]: flushing = False if isinstance(message["audio_path"], str): message["audio_path"] = [message["audio_path"]] for i in range(len(message["audio_path"])): msg += ( "\n" + f"""""" ) if "video_path" in message and message["video_path"]: flushing = False if isinstance(message["video_path"], str): message["video_path"] = [message["video_path"]] for i in range(len(message["video_path"])): msg += ( "\n" + f"""""" ) send_msg( msg, role=name, uid=uid, flushing=flushing, avatar=avatar, ) def _level_format(record: dict) -> str: """Format the log record.""" # Display the chat message if record["level"].name == LEVEL_SAVE_LOG: return "{message}\n" else: return _DEFAULT_LOG_FORMAT def setup_logger( path_log: Optional[str] = None, level: LOG_LEVEL = "INFO", ) -> None: r"""Setup `loguru.logger` and redirect stderr to logging. Args: path_log (`str`, defaults to `""`): The directory of log files. level (`str`, defaults to `"INFO"`): The logging level, which is one of the following: `"TRACE"`, `"DEBUG"`, `"INFO"`, `"SUCCESS"`, `"WARNING"`, `"ERROR"`, `"CRITICAL"`. 
""" # avoid reinit in subprocess if not hasattr(logger, "chat"): # add chat function for logger logger.level(LEVEL_SAVE_LOG, no=51) # save chat message into file logger.level(LEVEL_SAVE_MSG, no=53) logger.chat = log_msg # set logging level logger.remove() # standard output for all logging except chat logger.add( sys.stdout, filter=lambda record: record["level"].name not in [LEVEL_SAVE_LOG, LEVEL_SAVE_MSG], format=_DEFAULT_LOG_FORMAT, enqueue=True, level=level, ) if path_log is not None: os.makedirs(path_log, exist_ok=True) path_log_file = os.path.join(path_log, "logging.log") path_chat_file = os.path.join( path_log, "logging.chat", ) # save all logging except LEVEL_SAVE_MSG into logging.log logger.add( path_log_file, filter=lambda record: record["level"].name != LEVEL_SAVE_MSG, format=_level_format, enqueue=True, level=level, ) # save chat message into logging.chat logger.add( path_chat_file, format="{message}", enqueue=True, level=LEVEL_SAVE_MSG, # The highest level to filter out all # other logs ) ``` modelscope/agentscope/blob/main/src/agentscope/memory/__init__.py: ```py # -*- coding: utf-8 -*- """ import al memory related modules """ from .memory import MemoryBase from .temporary_memory import TemporaryMemory __all__ = [ "MemoryBase", "TemporaryMemory", ] ``` modelscope/agentscope/blob/main/src/agentscope/memory/memory.py: ```py # -*- coding: utf-8 -*- """ Base class for memory TODO: a abstract class for a piece of memory TODO: data structure to organize multiple memory pieces in memory class """ from abc import ABC, abstractmethod from typing import Iterable, Sequence from typing import Optional from typing import Union from typing import Callable from ..message import Msg class MemoryBase(ABC): """Base class for memory.""" _version: int = 1 def __init__( self, config: Optional[dict] = None, ) -> None: """MemoryBase is a base class for memory of agents. Args: config (`Optional[dict]`, defaults to `None`): Configuration of this memory. """ self.config = {} if config is None else config def update_config(self, config: dict) -> None: """ Configure memory as specified in config Args: config (`dict`): Configuration of resetting this memory """ self.config = config @abstractmethod def get_memory( self, recent_n: Optional[int] = None, filter_func: Optional[Callable[[int, dict], bool]] = None, ) -> list: """ Return a certain range (`recent_n` or all) of memory, filtered by `filter_func` Args: recent_n (int, optional): indicate the most recent N memory pieces to be returned. filter_func (Optional[Callable[[int, dict], bool]]): filter function to decide which pieces of memory should be returned, taking the index and a piece of memory as input and return True (return this memory) or False (does not return) """ @abstractmethod def add( self, memories: Union[Sequence[Msg], Msg, None], ) -> None: """ Adding new memory fragment, depending on how the memory are stored Args: memories (Union[Sequence[Msg], Msg, None]): Memories to be added. """ @abstractmethod def delete(self, index: Union[Iterable, int]) -> None: """ Delete memory fragment, depending on how the memory are stored and matched Args: index (Union[Iterable, int]): indices of the memory fragments to delete """ @abstractmethod def load( self, memories: Union[str, list[Msg], Msg], overwrite: bool = False, ) -> None: """ Load memory, depending on how the memory are passed, design to load from both file or dict Args: memories (Union[str, list[Msg], Msg]): memories to be loaded. 
If it is in str type, it will be first checked if it is a file; otherwise it will be deserialized as messages. Otherwise, memories must be either in message type or list of messages. overwrite (bool): if True, clear the current memory before loading the new ones; if False, memories will be appended to the old one at the end. """ @abstractmethod def export( self, file_path: Optional[str] = None, to_mem: bool = False, ) -> Optional[list]: """ Export memory, depending on how the memory are stored Args: file_path (Optional[str]): file path to save the memory to. to_mem (Optional[str]): if True, just return the list of messages in memory Notice: this method prevents file_path is None when to_mem is False. """ @abstractmethod def clear(self) -> None: """Clean memory, depending on how the memory are stored""" @abstractmethod def size(self) -> int: """Returns the number of memory segments in memory.""" raise NotImplementedError ``` modelscope/agentscope/blob/main/src/agentscope/memory/temporary_memory.py: ```py # -*- coding: utf-8 -*- """ Memory module for conversation """ import json import os from typing import Iterable, Sequence from typing import Optional from typing import Union from typing import Callable from loguru import logger from .memory import MemoryBase from ..models import load_model_by_config_name from ..service.retrieval.retrieval_from_list import retrieve_from_list from ..service.retrieval.similarity import Embedding from ..message import ( deserialize, serialize, MessageBase, Msg, PlaceholderMessage, ) class TemporaryMemory(MemoryBase): """ In-memory memory module, not writing to hard disk """ def __init__( self, config: Optional[dict] = None, embedding_model: Union[str, Callable] = None, ) -> None: """ Temporary memory module for conversation. Args: config (dict): configuration of the memory embedding_model (Union[str, Callable]) if the temporary memory needs to be embedded, then either pass the name of embedding model or the embedding model itself. """ super().__init__(config) self._content = [] # prepare embedding model if needed if isinstance(embedding_model, str): self.embedding_model = load_model_by_config_name(embedding_model) else: self.embedding_model = embedding_model def add( self, memories: Union[Sequence[Msg], Msg, None], embed: bool = False, ) -> None: # pylint: disable=too-many-branches """ Adding new memory fragment, depending on how the memory are stored Args: memories (`Union[Sequence[Msg], Msg, None]`): Memories to be added. 
            embed (`bool`):
                Whether to generate embeddings for the newly added memories
        """
        if memories is None:
            return

        if not isinstance(memories, Sequence):
            record_memories = [memories]
        else:
            record_memories = memories

        # if a memory unit doesn't have an id attribute, we skip the check
        memories_idx = set(_.id for _ in self._content if hasattr(_, "id"))
        for memory_unit in record_memories:
            if not issubclass(type(memory_unit), MessageBase):
                try:
                    memory_unit = Msg(**memory_unit)
                except Exception as exc:
                    raise ValueError(
                        f"Cannot add {memory_unit} to memory, "
                        f"must be a subclass of MessageBase",
                    ) from exc

            # in case this is a PlaceholderMessage, try to update
            # the values first
            if isinstance(memory_unit, PlaceholderMessage):
                memory_unit.update_value()
                memory_unit = Msg(**memory_unit)

            # add to memory if it's new
            if (
                not hasattr(memory_unit, "id")
                or memory_unit.id not in memories_idx
            ):
                if embed:
                    if self.embedding_model:
                        # TODO: embed only content or its string representation
                        memory_unit.embedding = self.embedding_model(
                            [memory_unit],
                            return_embedding_only=True,
                        )
                    else:
                        raise RuntimeError("Embedding model is not provided.")

                self._content.append(memory_unit)

    def delete(self, index: Union[Iterable, int]) -> None:
        """
        Delete memory fragments; the behavior depends on how the
        memories are stored and matched

        Args:
            index (Union[Iterable, int]):
                indices of the memory fragments to delete
        """
        if self.size() == 0:
            logger.warning(
                "The memory is empty, and the delete operation is "
                "skipped.",
            )
            return

        if isinstance(index, int):
            index = [index]

        if isinstance(index, list):
            index = set(index)
            invalid_index = [_ for _ in index if _ >= self.size() or _ < 0]
            if len(invalid_index) > 0:
                logger.warning(
                    f"Skip delete operation for the invalid "
                    f"index {invalid_index}",
                )

            self._content = [
                _ for i, _ in enumerate(self._content) if i not in index
            ]
        else:
            raise NotImplementedError(
                "index type only supports {None, int, list}",
            )

    def export(
        self,
        file_path: Optional[str] = None,
        to_mem: bool = False,
    ) -> Optional[list]:
        """
        Export memory; the behavior depends on how the memories are stored

        Args:
            file_path (Optional[str]):
                file path to save the memory to. The messages will
                be serialized and written to the file.
            to_mem (Optional[str]):
                if True, just return the list of messages in memory

        Note: `file_path` must not be None when `to_mem` is False.
        """
        if to_mem:
            return self._content

        if to_mem is False and file_path is not None:
            with open(file_path, "w", encoding="utf-8") as f:
                f.write(serialize(self._content))
        else:
            raise NotImplementedError(
                "file type only supports "
                "{json, yaml, pkl}, default is json",
            )
        return None

    def load(
        self,
        memories: Union[str, list[Msg], Msg],
        overwrite: bool = False,
    ) -> None:
        """
        Load memory; the behavior depends on how the memories are passed.
        Designed to load from either a file or a dict.

        Args:
            memories (Union[str, list[Msg], Msg]):
                memories to be loaded.
                If it is a str, it will first be checked whether it is a
                file path; otherwise it will be deserialized as messages.
                Otherwise, memories must be either a message or a list of
                messages.
            overwrite (bool):
                if True, clear the current memory before loading the new
                ones; if False, memories will be appended to the old ones
                at the end.
""" if isinstance(memories, str): if os.path.isfile(memories): with open(memories, "r", encoding="utf-8") as f: load_memories = deserialize(f.read()) else: try: load_memories = deserialize(memories) if not isinstance(load_memories, dict) and not isinstance( load_memories, list, ): logger.warning( "The memory loaded by json.loads is " "neither a dict nor a list, which may " "cause unpredictable errors.", ) except json.JSONDecodeError as e: raise json.JSONDecodeError( f"Cannot load [{memories}] via " f"json.loads.", e.doc, e.pos, ) else: load_memories = memories # overwrite the original memories after loading the new ones if overwrite: self.clear() self.add(load_memories) def clear(self) -> None: """Clean memory, depending on how the memory are stored""" self._content = [] def size(self) -> int: """Returns the number of memory segments in memory.""" return len(self._content) def retrieve_by_embedding( self, query: Union[str, Embedding], metric: Callable[[Embedding, Embedding], float], top_k: int = 1, preserve_order: bool = True, embedding_model: Callable[[Union[str, dict]], Embedding] = None, ) -> list[dict]: """Retrieve memory by their embeddings. Args: query (`Union[str, Embedding]`): Query string or embedding. metric (`Callable[[Embedding, Embedding], float]`): A metric to compute the relevance between embeddings of query and memory. In default, higher relevance means better match, and you can set `reverse` to `True` to reverse the order. top_k (`int`, defaults to `1`): The number of memory units to retrieve. preserve_order (`bool`, defaults to `True`): Whether to preserve the original order of the retrieved memory units. embedding_model (`Callable[[Union[str, dict]], Embedding]`, \ defaults to `None`): A callable object to embed the memory unit. If not provided, it will use the default embedding model. Returns: `list[dict]`: a list of retrieved memory units in specific order. """ retrieved_items = retrieve_from_list( query, self.get_embeddings(embedding_model or self.embedding_model), metric, top_k, self.embedding_model, preserve_order, ).content # obtain the corresponding memory item response = [] for score, index, _ in retrieved_items: response.append( { "score": score, "index": index, "memory": self._content[index], }, ) return response def get_embeddings( self, embedding_model: Callable[[Union[str, dict]], Embedding] = None, ) -> list: """Get embeddings of all memory units. If `embedding_model` is provided, the memory units that doesn't have `embedding` attribute will be embedded. Otherwise, its embedding will be `None`. Args: embedding_model (`Callable[[Union[str, dict]], Embedding]`, defaults to `None`): Embedding model or embedding vector. Returns: `list[Union[Embedding, None]]`: List of embeddings or None. """ embeddings = [] for memory_unit in self._content: if memory_unit.embedding is None and embedding_model is not None: # embedding # TODO: embed only content or its string representation memory_unit.embedding = embedding_model(memory_unit) embeddings.append(memory_unit.embedding) return embeddings def get_memory( self, recent_n: Optional[int] = None, filter_func: Optional[Callable[[int, dict], bool]] = None, ) -> list: """Retrieve memory. Args: recent_n (`Optional[int]`, default `None`): The last number of memories to return. filter_func (`Callable[[int, dict], bool]`, default to `None`): The function to filter memories, which take the index and memory unit as input, and return a boolean value. 
""" # extract the recent `recent_n` entries in memories if recent_n is None: memories = self._content else: if recent_n > self.size(): logger.warning( "The retrieved number of memories {} is " "greater than the total number of memories {" "}", recent_n, self.size(), ) memories = self._content[-recent_n:] # filter the memories if filter_func is not None: memories = [_ for i, _ in enumerate(memories) if filter_func(i, _)] return memories ``` modelscope/agentscope/blob/main/src/agentscope/message.py: ```py # -*- coding: utf-8 -*- """The base class for message unit""" from typing import Any, Optional, Union, Sequence, Literal, List from uuid import uuid4 import json from loguru import logger from .rpc import RpcAgentClient, ResponseStub, call_in_thread from .utils.tools import _get_timestamp from .utils.tools import _map_string_to_color_mark from .utils.tools import is_web_accessible class MessageBase(dict): """Base Message class, which is used to maintain information for dialog, memory and used to construct prompt. """ def __init__( self, name: str, content: Any, role: Literal["user", "system", "assistant"] = "assistant", url: Optional[Union[List[str], str]] = None, timestamp: Optional[str] = None, **kwargs: Any, ) -> None: """Initialize the message object Args: name (`str`): The name of who send the message. It's often used in role-playing scenario to tell the name of the sender. content (`Any`): The content of the message. role (`Literal["system", "user", "assistant"]`, defaults to "assistant"): The role of who send the message. It can be one of the `"system"`, `"user"`, or `"assistant"`. Default to `"assistant"`. url (`Optional[Union[List[str], str]]`, defaults to None): A url to file, image, video, audio or website. timestamp (`Optional[str]`, defaults to None): The timestamp of the message, if None, it will be set to current time. **kwargs (`Any`): Other attributes of the message. """ # noqa # id and timestamp will be added to the object as its attributes # rather than items in dict self.id = uuid4().hex if timestamp is None: self.timestamp = _get_timestamp() else: self.timestamp = timestamp self.name = name self.content = content self.role = role self.url = url self.update(kwargs) def __getattr__(self, key: Any) -> Any: try: return self[key] except KeyError as e: raise AttributeError(f"no attribute '{key}'") from e def __setattr__(self, key: Any, value: Any) -> None: self[key] = value def __delattr__(self, key: Any) -> None: try: del self[key] except KeyError as e: raise AttributeError(f"no attribute '{key}'") from e def serialize(self) -> str: """Return the serialized message.""" raise NotImplementedError class Msg(MessageBase): """The Message class.""" id: str """The id of the message.""" name: str """The name of who send the message.""" content: Any """The content of the message.""" role: Literal["system", "user", "assistant"] """The role of the message sender.""" metadata: Optional[dict] """Save the information for application's control flow, or other purposes.""" url: Optional[Union[List[str], str]] """A url to file, image, video, audio or website.""" timestamp: str """The timestamp of the message.""" def __init__( self, name: str, content: Any, role: Literal["system", "user", "assistant"] = None, url: Optional[Union[List[str], str]] = None, timestamp: Optional[str] = None, echo: bool = False, metadata: Optional[Union[dict, str]] = None, **kwargs: Any, ) -> None: """Initialize the message object Args: name (`str`): The name of who send the message. 
content (`Any`): The content of the message. role (`Literal["system", "user", "assistant"]`): Used to identify the source of the message, e.g. the system information, the user input, or the model response. This argument is used to accommodate most Chat API formats. url (`Optional[Union[List[str], str]]`, defaults to `None`): A url to file, image, video, audio or website. timestamp (`Optional[str]`, defaults to `None`): The timestamp of the message, if None, it will be set to current time. echo (`bool`, defaults to `False`): Whether to print the message to the console. metadata (`Optional[Union[dict, str]]`, defaults to `None`): Save the information for application's control flow, or other purposes. **kwargs (`Any`): Other attributes of the message. """ if role is None: logger.warning( "A new field `role` is newly added to the message. " "Please specify the role of the message. Currently we use " 'a default "assistant" value.', ) super().__init__( name=name, content=content, role=role or "assistant", url=url, timestamp=timestamp, metadata=metadata, **kwargs, ) m1, m2 = _map_string_to_color_mark(self.name) self._colored_name = f"{m1}{self.name}{m2}" if echo: logger.chat(self) def formatted_str(self, colored: bool = False) -> str: """Return the formatted string of the message. If the message has an url, the url will be appended to the content. Args: colored (`bool`, defaults to `False`): Whether to color the name of the message """ if colored: name = self._colored_name else: name = self.name colored_strs = [f"{name}: {self.content}"] if self.url is not None: if isinstance(self.url, list): for url in self.url: colored_strs.append(f"{name}: {url}") else: colored_strs.append(f"{name}: {self.url}") return "\n".join(colored_strs) def serialize(self) -> str: return json.dumps({"__type": "Msg", **self}) class PlaceholderMessage(Msg): """A placeholder for the return message of RpcAgent.""" PLACEHOLDER_ATTRS = { "_host", "_port", "_client", "_task_id", "_stub", "_is_placeholder", } LOCAL_ATTRS = { "name", "timestamp", *PLACEHOLDER_ATTRS, } def __init__( self, name: str, content: Any, url: Optional[Union[List[str], str]] = None, timestamp: Optional[str] = None, host: str = None, port: int = None, task_id: int = None, client: Optional[RpcAgentClient] = None, x: dict = None, **kwargs: Any, ) -> None: """A placeholder message, records the address of the real message. Args: name (`str`): The name of who send the message. It's often used in role-playing scenario to tell the name of the sender. However, you can also only use `role` when calling openai api. The usage of `name` refers to https://cookbook.openai.com/examples/how_to_format_inputs_to_chatgpt_models. content (`Any`): The content of the message. role (`Literal["system", "user", "assistant"]`, defaults to "assistant"): The role of the message, which can be one of the `"system"`, `"user"`, or `"assistant"`. url (`Optional[Union[List[str], str]]`, defaults to None): A url to file, image, video, audio or website. timestamp (`Optional[str]`, defaults to None): The timestamp of the message, if None, it will be set to current time. host (`str`, defaults to `None`): The hostname of the rpc server where the real message is located. port (`int`, defaults to `None`): The port of the rpc server where the real message is located. task_id (`int`, defaults to `None`): The task id of the real message in the rpc server. client (`RpcAgentClient`, defaults to `None`): An RpcAgentClient instance used to connect to the generator of this placeholder. 
x (`dict`, defaults to `None`): Input parameters used to call rpc methods on the client. """ # noqa super().__init__( name=name, content=content, url=url, timestamp=timestamp, **kwargs, ) # placeholder indicates whether the real message is still in rpc server self._is_placeholder = True if client is None: self._stub: ResponseStub = None self._host: str = host self._port: int = port self._task_id: int = task_id else: self._stub = call_in_thread( client, x.serialize() if x is not None else "", "_reply", ) self._host = client.host self._port = client.port self._task_id = None def __is_local(self, key: Any) -> bool: return ( key in PlaceholderMessage.LOCAL_ATTRS or not self._is_placeholder ) def __getattr__(self, __name: str) -> Any: """Get attribute value from PlaceholderMessage. Get value from rpc agent server if necessary. Args: __name (`str`): Attribute name. """ if not self.__is_local(__name): self.update_value() return MessageBase.__getattr__(self, __name) def __getitem__(self, __key: Any) -> Any: """Get item value from PlaceholderMessage. Get value from rpc agent server if necessary. Args: __key (`Any`): Item name. """ if not self.__is_local(__key): self.update_value() return MessageBase.__getitem__(self, __key) def update_value(self) -> MessageBase: """Get attribute values from rpc agent server immediately""" if self._is_placeholder: # retrieve real message from rpc agent server self.__update_task_id() client = RpcAgentClient(self._host, self._port) result = client.update_placeholder(task_id=self._task_id) msg = deserialize(result) self.__update_url(msg) # type: ignore[arg-type] self.update(msg) # the actual value has been updated, not a placeholder anymore self._is_placeholder = False return self def __update_url(self, msg: MessageBase) -> None: """Update the url field of the message.""" if hasattr(msg, "url") and msg.url is None: return url = msg.url if isinstance(url, str): urls = [url] else: urls = url checked_urls = [] for url in urls: if not is_web_accessible(url): client = RpcAgentClient(self._host, self._port) checked_urls.append(client.download_file(path=url)) else: checked_urls.append(url) msg.url = checked_urls[0] if isinstance(url, str) else checked_urls def __update_task_id(self) -> None: if self._stub is not None: try: resp = deserialize(self._stub.get_response()) except Exception as e: logger.error( f"Failed to get task_id: {self._stub.get_response()}", ) raise ValueError( f"Failed to get task_id: {self._stub.get_response()}", ) from e self._task_id = resp["task_id"] # type: ignore[call-overload] self._stub = None def serialize(self) -> str: if self._is_placeholder: self.__update_task_id() return json.dumps( { "__type": "PlaceholderMessage", "name": self.name, "content": None, "timestamp": self.timestamp, "host": self._host, "port": self._port, "task_id": self._task_id, }, ) else: states = { k: v for k, v in self.items() if k not in PlaceholderMessage.PLACEHOLDER_ATTRS } states["__type"] = "Msg" return json.dumps(states) _MSGS = { "Msg": Msg, "PlaceholderMessage": PlaceholderMessage, } def deserialize(s: Union[str, bytes]) -> Union[Msg, Sequence]: """Deserialize json string into MessageBase""" js_msg = json.loads(s) msg_type = js_msg.pop("__type") if msg_type == "List": return [deserialize(s) for s in js_msg["__value"]] elif msg_type not in _MSGS: raise NotImplementedError( f"Deserialization of {msg_type} is not supported.", ) return _MSGS[msg_type](**js_msg) def serialize(messages: Union[Sequence[MessageBase], MessageBase]) -> str: """Serialize multiple MessageBase 
instance""" if isinstance(messages, MessageBase): return messages.serialize() seq = [msg.serialize() for msg in messages] return json.dumps({"__type": "List", "__value": seq}) ``` modelscope/agentscope/blob/main/src/agentscope/models/__init__.py: ```py # -*- coding: utf-8 -*- """ Import modules in models package.""" import json from typing import Union, Type from loguru import logger from .config import _ModelConfig from .model import ModelWrapperBase from .response import ModelResponse from .post_model import ( PostAPIModelWrapperBase, PostAPIChatWrapper, ) from .openai_model import ( OpenAIWrapperBase, OpenAIChatWrapper, OpenAIDALLEWrapper, OpenAIEmbeddingWrapper, ) from .dashscope_model import ( DashScopeChatWrapper, DashScopeImageSynthesisWrapper, DashScopeTextEmbeddingWrapper, DashScopeMultiModalWrapper, ) from .ollama_model import ( OllamaChatWrapper, OllamaEmbeddingWrapper, OllamaGenerationWrapper, ) from .gemini_model import ( GeminiChatWrapper, GeminiEmbeddingWrapper, ) from .zhipu_model import ( ZhipuAIChatWrapper, ZhipuAIEmbeddingWrapper, ) from .litellm_model import ( LiteLLMChatWrapper, ) __all__ = [ "ModelWrapperBase", "ModelResponse", "PostAPIModelWrapperBase", "PostAPIChatWrapper", "OpenAIWrapperBase", "OpenAIChatWrapper", "OpenAIDALLEWrapper", "OpenAIEmbeddingWrapper", "DashScopeChatWrapper", "DashScopeImageSynthesisWrapper", "DashScopeTextEmbeddingWrapper", "DashScopeMultiModalWrapper", "OllamaChatWrapper", "OllamaEmbeddingWrapper", "OllamaGenerationWrapper", "GeminiChatWrapper", "GeminiEmbeddingWrapper", "ZhipuAIChatWrapper", "ZhipuAIEmbeddingWrapper", "LiteLLMChatWrapper", "load_model_by_config_name", "load_config_by_name", "read_model_configs", "clear_model_configs", ] _MODEL_CONFIGS: dict[str, dict] = {} def _get_model_wrapper(model_type: str) -> Type[ModelWrapperBase]: """Get the specific type of model wrapper Args: model_type (`str`): The model type name. Returns: `Type[ModelWrapperBase]`: The corresponding model wrapper class. """ wrapper = ModelWrapperBase.get_wrapper(model_type=model_type) if wrapper is None: logger.warning( f"Unsupported model_type [{model_type}]," "use PostApiModelWrapper instead.", ) return PostAPIModelWrapperBase return wrapper def load_config_by_name(config_name: str) -> Union[dict, None]: """Load the model config by name, and return the config dict.""" return _MODEL_CONFIGS.get(config_name, None) def load_model_by_config_name(config_name: str) -> ModelWrapperBase: """Load the model by config name, and return the model wrapper.""" if len(_MODEL_CONFIGS) == 0: raise ValueError( "No model configs loaded, please call " "`read_model_configs` first.", ) # Find model config by name if config_name not in _MODEL_CONFIGS: raise ValueError( f"Cannot find [{config_name}] in loaded configurations.", ) config = _MODEL_CONFIGS.get(config_name, None) if config is None: raise ValueError( f"Cannot find [{config_name}] in loaded configurations.", ) model_type = config.model_type kwargs = {k: v for k, v in config.items() if k != "model_type"} return _get_model_wrapper(model_type=model_type)(**kwargs) def clear_model_configs() -> None: """Clear the loaded model configs.""" _MODEL_CONFIGS.clear() def read_model_configs( configs: Union[dict, str, list], clear_existing: bool = False, ) -> None: """read model configs from a path or a list of dicts. Args: configs (`Union[str, list, dict]`): The path of the model configs | a config dict | a list of model configs. clear_existing (`bool`, defaults to `False`): Whether to clear the loaded model configs before reading. 
Returns: `dict`: The model configs. """ if clear_existing: clear_model_configs() cfgs = None if isinstance(configs, str): with open(configs, "r", encoding="utf-8") as f: cfgs = json.load(f) if isinstance(configs, dict): cfgs = [configs] if isinstance(configs, list): if not all(isinstance(_, dict) for _ in configs): raise ValueError( "The model config unit should be a dict.", ) cfgs = configs if cfgs is None: raise TypeError( f"Invalid type of model_configs, it could be a dict, a list of " f"dicts, or a path to a json file (containing a dict or a list " f"of dicts), but got {type(configs)}", ) format_configs = _ModelConfig.format_configs(configs=cfgs) # check if name is unique for cfg in format_configs: if cfg.config_name in _MODEL_CONFIGS: logger.warning( f"config_name [{cfg.config_name}] already exists.", ) continue _MODEL_CONFIGS[cfg.config_name] = cfg # print the loaded model configs logger.info( "Load configs for model wrapper: {}", ", ".join(_MODEL_CONFIGS.keys()), ) ``` modelscope/agentscope/blob/main/src/agentscope/models/_model_utils.py: ```py # -*- coding: utf-8 -*- """Utils for models""" def _verify_text_content_in_openai_delta_response(response: dict) -> bool: """Verify if the text content exists in the openai streaming response Args: response (`dict`): The JSON-format OpenAI response (After calling `model_dump` function) Returns: `bool`: If the text content exists """ if len(response.get("choices", [])) == 0: return False if response["choices"][0].get("delta", None) is None: return False if response["choices"][0]["delta"].get("content", None) is None: return False return True def _verify_text_content_in_openai_message_response(response: dict) -> bool: """Verify if the text content exists in the openai streaming response Args: response (`dict`): The JSON-format OpenAI response (After calling `model_dump` function) Returns: `bool`: If the text content exists """ if len(response.get("choices", [])) == 0: return False if response["choices"][0].get("message", None) is None: return False if response["choices"][0]["message"].get("content", None) is None: return False return True ``` modelscope/agentscope/blob/main/src/agentscope/models/config.py: ```py # -*- coding: utf-8 -*- """The model config.""" from typing import Union, Sequence, Any from loguru import logger class _ModelConfig(dict): """Base class for model config.""" __getattr__ = dict.__getitem__ __setattr__ = dict.__setitem__ def __init__( self, config_name: str, model_type: str = None, **kwargs: Any, ): """Initialize the config with the given arguments, and checking the type of the arguments. Args: config_name (`str`): A unique name of the model config. model_type (`str`, optional): The class name (or its model type) of the generated model wrapper. Defaults to None. Raises: `ValueError`: If `config_name` is not provided. """ if config_name is None: raise ValueError("The `config_name` field is required for Cfg") if model_type is None: logger.warning( f"`model_type` is not provided in config [{config_name}]," " use `PostAPIModelWrapperBase` by default.", ) super().__init__( config_name=config_name, model_type=model_type, **kwargs, ) @classmethod def format_configs( cls, configs: Union[Sequence[dict], dict], ) -> Sequence: """Covert config dicts into a list of _ModelConfig. Args: configs (Union[Sequence[dict], dict]): configs in dict format. Returns: Sequence[_ModelConfig]: converted ModelConfig list. 
""" if isinstance(configs, dict): return [_ModelConfig(**configs)] return [_ModelConfig(**cfg) for cfg in configs] ``` modelscope/agentscope/blob/main/src/agentscope/models/dashscope_model.py: ```py # -*- coding: utf-8 -*- """Model wrapper for DashScope models""" import os from abc import ABC from http import HTTPStatus from typing import Any, Union, List, Sequence, Optional, Generator from dashscope.api_entities.dashscope_response import GenerationResponse from loguru import logger from ..message import Msg from ..utils.tools import _convert_to_str, _guess_type_by_extension try: import dashscope except ImportError: dashscope = None from .model import ModelWrapperBase, ModelResponse from ..file_manager import file_manager class DashScopeWrapperBase(ModelWrapperBase, ABC): """The model wrapper for DashScope API.""" def __init__( self, config_name: str, model_name: str = None, api_key: str = None, generate_args: dict = None, **kwargs: Any, ) -> None: """Initialize the DashScope wrapper. Args: config_name (`str`): The name of the model config. model_name (`str`, default `None`): The name of the model to use in DashScope API. api_key (`str`, default `None`): The API key for DashScope API. generate_args (`dict`, default `None`): The extra keyword arguments used in DashScope api generation, e.g. `temperature`, `seed`. """ if model_name is None: model_name = config_name logger.warning("model_name is not set, use config_name instead.") super().__init__(config_name=config_name) if dashscope is None: raise ImportError( "Cannot find dashscope package in current python environment.", ) self.model_name = model_name self.generate_args = generate_args or {} self.api_key = api_key if self.api_key: dashscope.api_key = self.api_key self.max_length = None # Set monitor accordingly self._register_default_metrics() def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict], str]: raise RuntimeError( f"Model Wrapper [{type(self).__name__}] doesn't " f"need to format the input. Please try to use the " f"model wrapper directly.", ) class DashScopeChatWrapper(DashScopeWrapperBase): """The model wrapper for DashScope's chat API, refer to https://help.aliyun.com/zh/dashscope/developer-reference/api-details """ model_type: str = "dashscope_chat" deprecated_model_type: str = "tongyi_chat" def __init__( self, config_name: str, model_name: str = None, api_key: str = None, stream: bool = False, generate_args: dict = None, **kwargs: Any, ) -> None: """Initialize the DashScope wrapper. Args: config_name (`str`): The name of the model config. model_name (`str`, default `None`): The name of the model to use in DashScope API. api_key (`str`, default `None`): The API key for DashScope API. stream (`bool`, default `False`): If True, the response will be a generator in the `stream` field of the returned `ModelResponse` object. generate_args (`dict`, default `None`): The extra keyword arguments used in DashScope api generation, e.g. `temperature`, `seed`. 
""" super().__init__( config_name=config_name, model_name=model_name, api_key=api_key, generate_args=generate_args, **kwargs, ) self.stream = stream def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="token", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def __call__( self, messages: list, stream: Optional[bool] = None, **kwargs: Any, ) -> ModelResponse: """Processes a list of messages to construct a payload for the DashScope API call. It then makes a request to the DashScope API and returns the response. This method also updates monitoring metrics based on the API response. Each message in the 'messages' list can contain text content and optionally an 'image_urls' key. If 'image_urls' is provided, it is expected to be a list of strings representing URLs to images. These URLs will be transformed to a suitable format for the DashScope API, which might involve converting local file paths to data URIs. Args: messages (`list`): A list of messages to process. stream (`Optional[bool]`, default `None`): The stream flag to control the response format, which will overwrite the stream flag in the constructor. **kwargs (`Any`): The keyword arguments to DashScope chat completions API, e.g. `temperature`, `max_tokens`, `top_p`, etc. Please refer to https://help.aliyun.com/zh/dashscope/developer-reference/api-details for more detailed arguments. Returns: `ModelResponse`: A response object with the response text in text field, and the raw response in raw field. If stream is True, the response will be a generator in the `stream` field. Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. The rule of roles in messages for DashScope is very rigid, for more details, please refer to https://help.aliyun.com/zh/dashscope/developer-reference/api-details """ # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: checking messages if not isinstance(messages, list): raise ValueError( "Dashscope `messages` field expected type `list`, " f"got `{type(messages)}` instead.", ) if not all("role" in msg and "content" in msg for msg in messages): raise ValueError( "Each message in the 'messages' list must contain a 'role' " "and 'content' key for DashScope API.", ) # step3: forward to generate response if stream is None: stream = self.stream kwargs.update( { "model": self.model_name, "messages": messages, # Set the result to be "message" format. 
"result_format": "message", "stream": stream, }, ) # Switch to the incremental_output mode if stream: kwargs["incremental_output"] = True response = dashscope.Generation.call(**kwargs) # step3: invoke llm api, record the invocation and update the monitor if stream: def generator() -> Generator[str, None, None]: last_chunk = None text = "" for chunk in response: if chunk.status_code != HTTPStatus.OK: error_msg = ( f"Request id: {chunk.request_id}\n" f"Status code: {chunk.status_code}\n" f"Error code: {chunk.code}\n" f"Error message: {chunk.message}" ) raise RuntimeError(error_msg) text += chunk.output["choices"][0]["message"]["content"] yield text last_chunk = chunk # Replace the last chunk with the full text last_chunk.output["choices"][0]["message"]["content"] = text # Save the model invocation and update the monitor self._save_model_invocation_and_update_monitor( kwargs, last_chunk, ) return ModelResponse( stream=generator(), raw=response, ) else: if response.status_code != HTTPStatus.OK: error_msg = ( f"Request id: {response.request_id},\n" f"Status code: {response.status_code},\n" f"Error code: {response.code},\n" f"Error message: {response.message}." ) raise RuntimeError(error_msg) # Record the model invocation and update the monitor self._save_model_invocation_and_update_monitor( kwargs, response, ) return ModelResponse( text=response.output["choices"][0]["message"]["content"], raw=response, ) def _save_model_invocation_and_update_monitor( self, kwargs: dict, response: GenerationResponse, ) -> None: """Save the model invocation and update the monitor accordingly. Args: kwargs (`dict`): The keyword arguments to the DashScope chat API. response (`GenerationResponse`): The response object returned by the DashScope chat API. """ input_tokens = response.usage.get("input_tokens", 0) output_tokens = response.usage.get("output_tokens", 0) # Update the token record accordingly self.update_monitor( call_counter=1, prompt_tokens=input_tokens, completion_tokens=output_tokens, total_tokens=input_tokens + output_tokens, ) # Save the model invocation after the stream is exhausted self._save_model_invocation( arguments=kwargs, response=response, ) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> List: """Format the messages for DashScope Chat API. In this format function, the input messages are formatted into a single system messages with format "{name}: {content}" for each message. Note this strategy maybe not suitable for all scenarios, and developers are encouraged to implement their own prompt engineering strategies. The following is an example: .. code-block:: python prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), Msg("Bob", "Hi, how can I help you?", role="assistant"), Msg("user", "What's the date today?", role="user") ) The prompt will be as follows: .. code-block:: python [ { "role": "system", "content": "You're a helpful assistant", } { "role": "user", "content": ( "## Dialogue History\\n" "Bob: Hi, how can I help you?\\n" "user: What's the date today?" ) } ] Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `List[dict]`: The formatted messages. 
""" # Parse all information into a list of messages input_msgs = [] for _ in args: if _ is None: continue if isinstance(_, Msg): input_msgs.append(_) elif isinstance(_, list) and all(isinstance(__, Msg) for __ in _): input_msgs.extend(_) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(_)}.", ) messages = [] # record dialog history as a list of strings dialogue = [] for i, unit in enumerate(input_msgs): if i == 0 and unit.role == "system": # system prompt messages.append( { "role": unit.role, "content": _convert_to_str(unit.content), }, ) else: # Merge all messages into a dialogue history prompt dialogue.append( f"{unit.name}: {_convert_to_str(unit.content)}", ) dialogue_history = "\n".join(dialogue) user_content_template = "## Dialogue History\n{dialogue_history}" messages.append( { "role": "user", "content": user_content_template.format( dialogue_history=dialogue_history, ), }, ) return messages class DashScopeImageSynthesisWrapper(DashScopeWrapperBase): """The model wrapper for DashScope Image Synthesis API, refer to https://help.aliyun.com/zh/dashscope/developer-reference/quick-start-1 """ model_type: str = "dashscope_image_synthesis" def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("image_count"), metric_unit="image", ) def __call__( self, prompt: str, save_local: bool = False, **kwargs: Any, ) -> ModelResponse: """ Args: prompt (`str`): The prompt string to generate images from. save_local: (`bool`, default `False`): Whether to save the generated images locally, and replace the returned image url with the local path. **kwargs (`Any`): The keyword arguments to DashScope Image Synthesis API, e.g. `n`, `size`, etc. Please refer to https://help.aliyun.com/zh/dashscope/developer-reference/api-details-9 for more detailed arguments. Returns: `ModelResponse`: A list of image urls in image_urls field and the raw response in raw field. Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. """ # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: forward to generate response response = dashscope.ImageSynthesis.call( model=self.model_name, prompt=prompt, **kwargs, ) if response.status_code != HTTPStatus.OK: error_msg = ( f" Request id: {response.request_id}," f" Status code: {response.status_code}," f" error code: {response.code}," f" error message: {response.message}." 
) raise RuntimeError(error_msg) # step3: record the model api invocation if needed self._save_model_invocation( arguments={ "model": self.model_name, "prompt": prompt, **kwargs, }, response=response, ) # step4: update monitor accordingly self.update_monitor( call_counter=1, **response.usage, ) # step5: return response images = response.output["results"] # Get image urls as a list urls = [_["url"] for _ in images] if save_local: # Return local url if save_local is True urls = [file_manager.save_image(_) for _ in urls] return ModelResponse(image_urls=urls, raw=response) class DashScopeTextEmbeddingWrapper(DashScopeWrapperBase): """The model wrapper for DashScope Text Embedding API.""" model_type: str = "dashscope_text_embedding" def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def __call__( self, texts: Union[list[str], str], **kwargs: Any, ) -> ModelResponse: """Embed the messages with DashScope Text Embedding API. Args: texts (`list[str]` or `str`): The messages used to embed. **kwargs (`Any`): The keyword arguments to DashScope Text Embedding API, e.g. `text_type`. Please refer to https://help.aliyun.com/zh/dashscope/developer-reference/api-details-15 for more detailed arguments. Returns: `ModelResponse`: A list of embeddings in embedding field and the raw response in raw field. Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. """ # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: forward to generate response response = dashscope.TextEmbedding.call( input=texts, model=self.model_name, **kwargs, ) if response.status_code != HTTPStatus.OK: error_msg = ( f" Request id: {response.request_id}," f" Status code: {response.status_code}," f" error code: {response.code}," f" error message: {response.message}." 
) raise RuntimeError(error_msg) # step3: record the model api invocation if needed self._save_model_invocation( arguments={ "model": self.model_name, "input": texts, **kwargs, }, response=response, ) # step4: update monitor accordingly self.update_monitor( call_counter=1, **response.usage, ) # step5: return response return ModelResponse( embedding=[_["embedding"] for _ in response.output["embeddings"]], raw=response, ) class DashScopeMultiModalWrapper(DashScopeWrapperBase): """The model wrapper for DashScope Multimodal API, refer to https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-api """ model_type: str = "dashscope_multimodal" def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="token", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def __call__( self, messages: list, **kwargs: Any, ) -> ModelResponse: """Model call for DashScope MultiModal API. Args: messages (`list`): A list of messages to process. **kwargs (`Any`): The keyword arguments to DashScope MultiModal API, e.g. `stream`. Please refer to https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api for more detailed arguments. Returns: `ModelResponse`: The response text in text field, and the raw response in raw field. Note: If involving image links, then the messages should be of the following form: .. code-block:: python messages = [ { "role": "system", "content": [ {"text": "You are a helpful assistant."}, ], }, { "role": "user", "content": [ {"text": "What does this picture depict?"}, {"image": "http://example.com/image.jpg"}, ], }, ] Therefore, you should input a list matching the content value above. If only involving words, just input them. `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. """ # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: forward to generate response response = dashscope.MultiModalConversation.call( model=self.model_name, messages=messages, **kwargs, ) # Unhandle code path here # response could be a generator , if stream is yes # suggest add a check here if response.status_code != HTTPStatus.OK: error_msg = ( f" Request id: {response.request_id}," f" Status code: {response.status_code}," f" error code: {response.code}," f" error message: {response.message}." 
) raise RuntimeError(error_msg) # step3: record the model api invocation if needed self._save_model_invocation( arguments={ "model": self.model_name, "messages": messages, **kwargs, }, response=response, ) # step4: update monitor accordingly input_tokens = response.usage.get("input_tokens", 0) image_tokens = response.usage.get("image_tokens", 0) audio_tokens = response.usage.get("audio_tokens", 0) output_tokens = response.usage.get("output_tokens", 0) self.update_monitor( call_counter=1, prompt_tokens=input_tokens, completion_tokens=output_tokens, total_tokens=input_tokens + output_tokens + image_tokens + audio_tokens, ) # step5: return response content = response.output["choices"][0]["message"]["content"] if isinstance(content, list): content = content[0]["text"] return ModelResponse( text=content, raw=response, ) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> List: """Format the messages for DashScope Multimodal API. The multimodal API has the following requirements: - The roles of messages must alternate between "user" and "assistant". - The message with the role "system" should be the first message in the list. - If the system message exists, then the second message must have the role "user". - The last message in the list should have the role "user". - In each message, more than one figure is allowed. With the above requirements, we format the messages as follows: - If the first message is a system message, then we will keep it as system prompt. - We merge all messages into a dialogue history prompt in a single message with the role "user". - When there are multiple figures in the given messages, we will attach it to the user message by order. Note if there are multiple figures, this strategy may cause misunderstanding for the model. For advanced solutions, developers are encouraged to implement their own prompt engineering strategies. The following is an example: .. code-block:: python prompt = model.format( Msg( "system", "You're a helpful assistant", role="system", url="figure1" ), Msg( "Bob", "How about this picture?", role="assistant", url="figure2" ), Msg( "user", "It's wonderful! How about mine?", role="user", image="figure3" ) ) The prompt will be as follows: .. code-block:: python [ { "role": "system", "content": [ {"text": "You are a helpful assistant"}, {"image": "figure1"} ] }, { "role": "user", "content": [ {"image": "figure2"}, {"image": "figure3"}, { "text": ( "## Dialogue History\\n" "Bob: How about this picture?\\n" "user: It's wonderful! How about mine?" ) }, ] } ] Note: In multimodal API, the url of local files should be prefixed with "file://", which will be attached in this format function. Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `List[dict]`: The formatted messages. 
""" # Parse all information into a list of messages input_msgs = [] for _ in args: if _ is None: continue if isinstance(_, Msg): input_msgs.append(_) elif isinstance(_, list) and all(isinstance(__, Msg) for __ in _): input_msgs.extend(_) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(_)}.", ) messages = [] # record dialog history as a list of strings dialogue = [] image_or_audio_dicts = [] for i, unit in enumerate(input_msgs): if i == 0 and unit.role == "system": # system prompt content = self.convert_url(unit.url) content.append({"text": _convert_to_str(unit.content)}) messages.append( { "role": unit.role, "content": content, }, ) else: # text message dialogue.append( f"{unit.name}: {_convert_to_str(unit.content)}", ) # image and audio image_or_audio_dicts.extend(self.convert_url(unit.url)) dialogue_history = "\n".join(dialogue) user_content_template = "## Dialogue History\n{dialogue_history}" messages.append( { "role": "user", "content": [ # Place the image or audio before the dialogue history *image_or_audio_dicts, { "text": user_content_template.format( dialogue_history=dialogue_history, ), }, ], }, ) return messages def convert_url(self, url: Union[str, Sequence[str], None]) -> List[dict]: """Convert the url to the format of DashScope API. Note for local files, a prefix "file://" will be added. Args: url (`Union[str, Sequence[str], None]`): A string of url of a list of urls to be converted. Returns: `List[dict]`: A list of dictionaries with key as the type of the url and value as the url. Only "image" and "audio" are supported. """ if url is None: return [] if isinstance(url, str): url_type = _guess_type_by_extension(url) if url_type in ["audio", "image"]: # Add prefix for local files if os.path.exists(url): url = "file://" + url return [{url_type: url}] else: # skip unsupported url logger.warning( f"Skip unsupported url ({url_type}), " f"expect image or audio.", ) return [] elif isinstance(url, list): dicts = [] for _ in url: dicts.extend(self.convert_url(_)) return dicts else: raise TypeError( f"Unsupported url type {type(url)}, " f"str or list expected.", ) ``` modelscope/agentscope/blob/main/src/agentscope/models/gemini_model.py: ```py # -*- coding: utf-8 -*- """Google Gemini model wrapper.""" import os from abc import ABC from collections.abc import Iterable from typing import Sequence, Union, Any, List, Optional, Generator from loguru import logger from agentscope.message import Msg from agentscope.models import ModelWrapperBase, ModelResponse from agentscope.utils.tools import _convert_to_str try: import google.generativeai as genai # This package will be installed when the google-generativeai is installed import google.ai.generativelanguage as glm except ImportError: genai = None glm = None class GeminiWrapperBase(ModelWrapperBase, ABC): """The base class for Google Gemini model wrapper.""" _generation_method = None """The generation method used in `__call__` function, which is used to filter models in `list_models` function.""" def __init__( self, config_name: str, model_name: str, api_key: str = None, **kwargs: Any, ) -> None: """Initialize the wrapper for Google Gemini model. Args: model_name (`str`): The name of the model. api_key (`str`, defaults to `None`): The api_key for the model. If it is not provided, it will be loaded from environment variable. 
""" super().__init__(config_name=config_name) # Test if the required package is installed if genai is None: raise ImportError( "The google-generativeai package is not installed, " "please install it first.", ) # Load the api_key from argument or environment variable api_key = api_key or os.environ.get("GOOGLE_API_KEY") if api_key is None: raise ValueError( "Google api_key must be provided or set as an " "environment variable.", ) genai.configure(api_key=api_key, **kwargs) self.model_name = model_name self._register_default_metrics() def _register_default_metrics(self) -> None: """Register the default metrics for the model.""" raise NotImplementedError( "The method `_register_default_metrics` must be implemented.", ) def list_models(self) -> Sequence: """List all available models for this API calling.""" support_models = list(genai.list_models()) if self.generation_method is None: return support_models else: return [ _ for _ in support_models if self._generation_method in _.supported_generation_methods ] def __call__(self, *args: Any, **kwargs: Any) -> ModelResponse: """Processing input with the model.""" raise NotImplementedError( f"Model Wrapper [{type(self).__name__}]" f" is missing the the required `__call__`" f" method.", ) class GeminiChatWrapper(GeminiWrapperBase): """The wrapper for Google Gemini chat model, e.g. gemini-pro""" model_type: str = "gemini_chat" """The type of the model, which is used in model configuration.""" generation_method = "generateContent" """The generation method used in `__call__` function.""" def __init__( self, config_name: str, model_name: str, api_key: str = None, stream: bool = False, **kwargs: Any, ) -> None: """Initialize the wrapper for Google Gemini model. Args: model_name (`str`): The name of the model. api_key (`str`, defaults to `None`): The api_key for the model. If it is not provided, it will be loaded from environment variable. stream (`bool`, defaults to `False`): Whether to use stream mode. """ super().__init__( config_name=config_name, model_name=model_name, api_key=api_key, **kwargs, ) self.stream = stream # Create the generative model self.model = genai.GenerativeModel(model_name, **kwargs) def __call__( self, contents: Union[Sequence, str], stream: Optional[bool] = None, **kwargs: Any, ) -> ModelResponse: """Generate response for the given contents. Args: contents (`Union[Sequence, str]`): The content to generate response. stream (`Optional[bool]`, defaults to `None`) Whether to use stream mode. **kwargs: The additional arguments for generating response. Returns: `ModelResponse`: The response text in text field, and the raw response in raw field. 
""" # step1: checking messages if isinstance(contents, Iterable): pass elif not isinstance(contents, str): logger.warning( "The input content is not a string or a list of " "messages, which may cause unexpected behavior.", ) # Check if stream is provided if stream is None: stream = self.stream # step2: forward to generate response kwargs.update( { "contents": contents, "stream": stream, }, ) response = self.model.generate_content(**kwargs) if stream: def generator() -> Generator[str, None, None]: text = "" last_chunk = None for chunk in response: text += self._extract_text_content_from_response( contents, chunk, ) yield text last_chunk = chunk # Update the last chunk last_chunk.candidates[0].content.parts[0].text = text self._save_model_invocation_and_update_monitor( contents, kwargs, last_chunk, ) return ModelResponse( stream=generator(), ) else: self._save_model_invocation_and_update_monitor( contents, kwargs, response, ) # step6: return response return ModelResponse( text=response.text, raw=response, ) def _save_model_invocation_and_update_monitor( self, contents: Union[Sequence[Any], str], kwargs: dict, response: Any, ) -> None: """Save the model invocation and update the monitor accordingly.""" # Record the api invocation if needed self._save_model_invocation( arguments=kwargs, response=str(response), ) # Update monitor accordingly if hasattr(response, "usage_metadata"): token_prompt = response.usage_metadata.prompt_token_count token_response = response.usage_metadata.candidates_token_count else: token_prompt = self.model.count_tokens(contents).total_tokens token_response = self.model.count_tokens( response.text, ).total_tokens self.update_monitor( call_counter=1, completion_tokens=token_response, prompt_tokens=token_prompt, total_tokens=token_prompt + token_response, ) def _extract_text_content_from_response( self, contents: Union[Sequence[Any], str], response: Any, ) -> str: """Extract the text content from the response of gemini API Note: to avoid import error during type checking in python 3.11+, here we use `typing.Any` to avoid raising error Args: contents (`Union[Sequence[Any], str]`): The prompt contents response (`Any`): The response from gemini API Returns: `str`: The extracted string. """ # Check for candidates and handle accordingly if ( not response.candidates[0].content or not response.candidates[0].content.parts or not response.candidates[0].content.parts[0].text ): # If we cannot get the response text from the model finish_reason = response.candidates[0].finish_reason reasons = glm.Candidate.FinishReason if finish_reason == reasons.STOP: error_info = ( "Natural stop point of the model or provided stop " "sequence." ) elif finish_reason == reasons.MAX_TOKENS: error_info = ( "The maximum number of tokens as specified in the request " "was reached." ) elif finish_reason == reasons.SAFETY: error_info = ( "The candidate content was flagged for safety reasons." ) elif finish_reason == reasons.RECITATION: error_info = ( "The candidate content was flagged for recitation reasons." ) elif finish_reason in [ reasons.FINISH_REASON_UNSPECIFIED, reasons.OTHER, ]: error_info = "Unknown error." else: error_info = "No information provided from Gemini API." 
raise ValueError( "The Google Gemini API failed to generate text response with " f"the following finish reason: {error_info}\n" f"YOUR INPUT: {contents}\n" f"RAW RESPONSE FROM GEMINI API: {response}\n", ) return response.text def _register_default_metrics(self) -> None: """Register the default metrics for the model.""" self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="token", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> List[dict]: """This function provide a basic prompting strategy for Gemini Chat API in multi-party conversation, which combines all input into a single string, and wrap it into a user message. We make the above decision based on the following constraints of the Gemini generate API: 1. In Gemini `generate_content` API, the `role` field must be either `user` or `model`. 2. If we pass a list of messages to the `generate_content` API, the `user` role must speak in the beginning and end of the messages, and `user` and `model` must alternative. This prevents us to build a multi-party conversations, where `model` may keep speaking in different names. The above information is updated to 2024/03/21. More information about the Gemini `generate_content` API can be found in https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini Based on the above considerations, we decide to combine all messages into a single user message. This is a simple and straightforward strategy, if you have any better ideas, pull request and discussion are welcome in our GitHub repository https://github.com/agentscope/agentscope! Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `List[dict]`: A list with one user message. """ input_msgs = [] for _ in args: if _ is None: continue if isinstance(_, Msg): input_msgs.append(_) elif isinstance(_, list) and all(isinstance(__, Msg) for __ in _): input_msgs.extend(_) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(_)}.", ) # record dialog history as a list of strings sys_prompt = None dialogue = [] for i, unit in enumerate(input_msgs): if i == 0 and unit.role == "system": # system prompt sys_prompt = _convert_to_str(unit.content) else: # Merge all messages into a dialogue history prompt dialogue.append( f"{unit.name}: {_convert_to_str(unit.content)}", ) dialogue_history = "\n".join(dialogue) if sys_prompt is None: user_content_template = "## Dialogue History\n{dialogue_history}" else: user_content_template = ( "{sys_prompt}\n" "\n" "## Dialogue History\n" "{dialogue_history}" ) messages = [ { "role": "user", "parts": [ user_content_template.format( sys_prompt=sys_prompt, dialogue_history=dialogue_history, ), ], }, ] return messages class GeminiEmbeddingWrapper(GeminiWrapperBase): """The wrapper for Google Gemini embedding model, e.g. 
models/embedding-001""" model_type: str = "gemini_embedding" """The type of the model, which is used in model configuration.""" _generation_method = "embedContent" """The generation method used in `__call__` function.""" def __call__( self, content: Union[Sequence[Msg], str], task_type: str = None, title: str = None, **kwargs: Any, ) -> ModelResponse: """Generate embedding for the given content. More detailed information please refer to https://ai.google.dev/tutorials/python_quickstart#use_embeddings Args: content (`Union[Sequence[Msg], str]`): The content to generate embedding. task_type (`str`, defaults to `None`): The type of the task. title (`str`, defaults to `None`): The title of the content. **kwargs: The additional arguments for generating embedding. Returns: `ModelResponse`: The response embedding in embedding field, and the raw response in raw field. """ # step1: forward to generate response response = genai.embed_content( model=self.model_name, content=content, task_type=task_type, title=title, **kwargs, ) # step2: record the api invocation if needed self._save_model_invocation( arguments={ "content": content, "task_type": task_type, "title": title, **kwargs, }, response=response, ) # TODO: Up to 2023/03/11, the embedding model doesn't support to # count tokens. # step3: update monitor accordingly self.update_monitor(call_counter=1) return ModelResponse( raw=response, embedding=response["embedding"], ) def _register_default_metrics(self) -> None: """Register the default metrics for the model.""" self.monitor.register( self._metric("call_counter"), metric_unit="times", ) ``` modelscope/agentscope/blob/main/src/agentscope/models/litellm_model.py: ```py # -*- coding: utf-8 -*- """Model wrapper based on litellm https://docs.litellm.ai/docs/""" from abc import ABC from typing import Union, Any, List, Sequence, Optional, Generator from loguru import logger from ._model_utils import _verify_text_content_in_openai_delta_response from .model import ModelWrapperBase, ModelResponse from ..message import Msg from ..utils.tools import _convert_to_str class LiteLLMWrapperBase(ModelWrapperBase, ABC): """The model wrapper based on LiteLLM API.""" def __init__( self, config_name: str, model_name: str = None, generate_args: dict = None, **kwargs: Any, ) -> None: """ To use the LiteLLM wrapper, environment variables must be set. Different model_name could be using different environment variables. For example: - for model_name: "gpt-3.5-turbo", you need to set "OPENAI_API_KEY" ``` os.environ["OPENAI_API_KEY"] = "your-api-key" ``` - for model_name: "claude-2", you need to set "ANTHROPIC_API_KEY" - for Azure OpenAI, you need to set "AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION" You should refer to the docs in https://docs.litellm.ai/docs/ Args: config_name (`str`): The name of the model config. model_name (`str`, default `None`): The name of the model to use in OpenAI API. generate_args (`dict`, default `None`): The extra keyword arguments used in litellm api generation, e.g. `temperature`, `seed`. For generate_args, please refer to https://docs.litellm.ai/docs/completion/input for more details. 
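        Example:
            An illustrative sketch of setting the environment variable and
            a matching model configuration (the ``model_type`` value
            ``"litellm_chat"`` is declared on the chat subclass; the key
            below is a placeholder):

            .. code-block:: python

                import os

                os.environ["OPENAI_API_KEY"] = "{your_api_key}"

                config = {
                    "config_name": "my_litellm_chat",
                    "model_type": "litellm_chat",
                    "model_name": "gpt-3.5-turbo",
                    "generate_args": {"temperature": 0.5},
                }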
""" if model_name is None: model_name = config_name logger.warning("model_name is not set, use config_name instead.") super().__init__(config_name=config_name) self.model_name = model_name self.generate_args = generate_args or {} self._register_default_metrics() def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict], str]: raise RuntimeError( f"Model Wrapper [{type(self).__name__}] doesn't " f"need to format the input. Please try to use the " f"model wrapper directly.", ) class LiteLLMChatWrapper(LiteLLMWrapperBase): """The model wrapper based on litellm chat API. To use the LiteLLM wrapper, environment variables must be set. Different model_name could be using different environment variables. For example: - for model_name: "gpt-3.5-turbo", you need to set "OPENAI_API_KEY" ``` os.environ["OPENAI_API_KEY"] = "your-api-key" ``` - for model_name: "claude-2", you need to set "ANTHROPIC_API_KEY" - for Azure OpenAI, you need to set "AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION" You should refer to the docs in https://docs.litellm.ai/docs/ . """ model_type: str = "litellm_chat" def __init__( self, config_name: str, model_name: str = None, stream: bool = False, generate_args: dict = None, **kwargs: Any, ) -> None: """ To use the LiteLLM wrapper, environment variables must be set. Different model_name could be using different environment variables. For example: - for model_name: "gpt-3.5-turbo", you need to set "OPENAI_API_KEY" ``` os.environ["OPENAI_API_KEY"] = "your-api-key" ``` - for model_name: "claude-2", you need to set "ANTHROPIC_API_KEY" - for Azure OpenAI, you need to set "AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION" You should refer to the docs in https://docs.litellm.ai/docs/ Args: config_name (`str`): The name of the model config. model_name (`str`, default `None`): The name of the model to use in OpenAI API. stream (`bool`, default `False`): Whether to enable stream mode. generate_args (`dict`, default `None`): The extra keyword arguments used in litellm api generation, e.g. `temperature`, `seed`. For generate_args, please refer to https://docs.litellm.ai/docs/completion/input for more details. """ super().__init__( config_name=config_name, model_name=model_name, generate_args=generate_args, **kwargs, ) self.stream = stream def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="token", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def __call__( self, messages: list, stream: Optional[bool] = None, **kwargs: Any, ) -> ModelResponse: """ Args: messages (`list`): A list of messages to process. stream (`Optional[bool]`, default `None`): Whether to enable stream mode. If not set, the stream mode will be set to the value in the initialization. **kwargs (`Any`): The keyword arguments to litellm chat completions API, e.g. `temperature`, `max_tokens`, `top_p`, etc. Please refer to https://docs.litellm.ai/docs/completion/input for more detailed arguments. Returns: `ModelResponse`: The response text in text field, and the raw response in raw field. 
""" # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: checking messages if not isinstance(messages, list): raise ValueError( "LiteLLM `messages` field expected type `list`, " f"got `{type(messages)}` instead.", ) if not all("role" in msg and "content" in msg for msg in messages): raise ValueError( "Each message in the 'messages' list must contain a 'role' " "and 'content' key for LiteLLM API.", ) # Import litellm only when it is used try: import litellm except ImportError as e: raise ImportError( "Cannot find litellm in current environment, please " "install it by `pip install litellm`.", ) from e # step3: forward to generate response if stream is None: stream = self.stream kwargs.update( { "model": self.model_name, "messages": messages, "stream": stream, }, ) # Add stream_options to obtain the usage information if stream: kwargs["stream_options"] = {"include_usage": True} response = litellm.completion(**kwargs) if stream: def generator() -> Generator[str, None, None]: text = "" last_chunk = {} for chunk in response: # In litellm, the content maybe `None` for the last second # chunk chunk = chunk.model_dump() if _verify_text_content_in_openai_delta_response(chunk): text += chunk["choices"][0]["delta"]["content"] yield text last_chunk = chunk # Update the last chunk to save locally if last_chunk.get("choices", []) in [None, []]: last_chunk["choices"] = [{}] last_chunk["choices"][0]["message"] = { "role": "assistant", "content": text, } self._save_model_invocation_and_update_monitor( kwargs, last_chunk, ) return ModelResponse( stream=generator(), ) else: response = response.model_dump() self._save_model_invocation_and_update_monitor( kwargs, response, ) # return response return ModelResponse( text=response["choices"][0]["message"]["content"], raw=response, ) def _save_model_invocation_and_update_monitor( self, kwargs: dict, response: dict, ) -> None: """Save the model invocation and update the monitor accordingly.""" # step4: record the api invocation if needed self._save_model_invocation( arguments=kwargs, response=response, ) # step5: update monitor accordingly if response.get("usage", None) is not None: self.update_monitor(call_counter=1, **response["usage"]) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> List[dict]: """Format the input string and dictionary into the unified format. Note that the format function might not be the optimal way to construct prompt for every model, but a common way to do so. Developers are encouraged to implement their own prompt engineering strategies if they have strong performance concerns. Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `List[dict]`: The formatted messages in the format that anthropic Chat API required. 
""" # Parse all information into a list of messages input_msgs = [] for _ in args: if _ is None: continue if isinstance(_, Msg): input_msgs.append(_) elif isinstance(_, list) and all(isinstance(__, Msg) for __ in _): input_msgs.extend(_) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(_)}.", ) # record dialog history as a list of strings system_content_template = [] dialogue = [] for i, unit in enumerate(input_msgs): if i == 0 and unit.role == "system": # system prompt system_prompt = _convert_to_str(unit.content) if not system_prompt.endswith("\n"): system_prompt += "\n" system_content_template.append(system_prompt) else: # Merge all messages into a dialogue history prompt dialogue.append( f"{unit.name}: {_convert_to_str(unit.content)}", ) if len(dialogue) != 0: dialogue_history = "\n".join(dialogue) system_content_template.extend( ["## Dialogue History", dialogue_history], ) system_content = "\n".join(system_content_template) messages = [ { "role": "user", "content": system_content, }, ] return messages ``` modelscope/agentscope/blob/main/src/agentscope/models/model.py: ```py # -*- coding: utf-8 -*- """The configuration file should contain one or a list of model configs, and each model config should follow the following format. .. code-block:: python { "config_name": "{config_name}", "model_type": "openai_chat" | "post_api" | ..., ... } After that, you can specify model by {config_name}. Note: The parameters for different types of models are different. For OpenAI API, the format is: .. code-block:: python { "config_name": "{id of your model}", "model_type": "openai_chat", "model_name": "{model_name_for_openai, e.g. gpt-3.5-turbo}", "api_key": "{your_api_key}", "organization": "{your_organization, if needed}", "client_args": { # ... }, "generate_args": { # ... } } For Post API, toking huggingface inference API as an example, its format is: .. code-block:: python { "config_name": "{config_name}", "model_type": "post_api", "api_url": "{api_url}", "headers": {"Authorization": "Bearer {API_TOKEN}"}, "max_length": {max_length_of_model}, "timeout": {timeout}, "max_retries": {max_retries}, "generate_args": { "temperature": 0.5, # ... } } """ from __future__ import annotations import inspect import time from abc import ABCMeta from functools import wraps from typing import Sequence, Any, Callable, Union, List, Type from loguru import logger from agentscope.utils import QuotaExceededError from .response import ModelResponse from ..exception import ResponseParsingError from ..file_manager import file_manager from ..message import Msg from ..utils import MonitorFactory from ..utils.monitor import get_full_name from ..utils.tools import _get_timestamp from ..constants import _DEFAULT_MAX_RETRIES from ..constants import _DEFAULT_RETRY_INTERVAL def _response_parse_decorator( model_call: Callable, ) -> Callable: """A decorator for parsing the response of model call. It will take `parse_func`, `fault_handler` and `max_retries` as arguments. The detailed process is as follows: 1. If `parse_func` is provided, then the response will be parsed first. 2. If the parsing fails (throws an exception), then response generation will be repeated for `max_retries` times and parsed again. 3. After `max_retries` times, if the parsing still fails, then if `fault_handler` is provided, the response will be processed by `fault_handler`. """ # check if the decorated `model_call` function uses the default # arguments of this decorator. 
parameters = inspect.signature(model_call).parameters for name in parameters.keys(): if name in ["parse_func", "max_retries"]: logger.warning( f"The argument {name} is used by the decorator, " f"which will not be passed to the model call " f"function.", ) @wraps(model_call) def checking_wrapper(self: Any, *args: Any, **kwargs: Any) -> dict: # Step1: Extract parse_func and fault_handler parse_func = kwargs.pop("parse_func", None) fault_handler = kwargs.pop("fault_handler", None) max_retries = kwargs.pop("max_retries", None) or _DEFAULT_MAX_RETRIES # Step2: Call the model and parse the response # Return the response directly if parse_func is not provided if parse_func is None: return model_call(self, *args, **kwargs) # Otherwise, try to parse the response for itr in range(1, max_retries + 1): # Call the model response = model_call(self, *args, **kwargs) # Parse the response if needed try: return parse_func(response) except ResponseParsingError as e: if itr < max_retries: logger.warning( f"Fail to parse response ({itr}/{max_retries}):\n" f"{response}.\n" f"{e.__class__.__name__}: {e}", ) time.sleep(_DEFAULT_RETRY_INTERVAL * itr) else: if fault_handler is not None and callable(fault_handler): return fault_handler(response) else: raise return {} return checking_wrapper class _ModelWrapperMeta(ABCMeta): """A meta call to replace the model wrapper's __call__ function with wrapper about error handling.""" def __new__(mcs, name: Any, bases: Any, attrs: Any) -> Any: if "__call__" in attrs: attrs["__call__"] = _response_parse_decorator(attrs["__call__"]) return super().__new__(mcs, name, bases, attrs) def __init__(cls, name: Any, bases: Any, attrs: Any) -> None: if not hasattr(cls, "_registry"): cls._registry = {} cls._type_registry = {} cls._deprecated_type_registry = {} else: cls._registry[name] = cls if hasattr(cls, "model_type"): cls._type_registry[cls.model_type] = cls if hasattr(cls, "deprecated_model_type"): cls._deprecated_type_registry[ cls.deprecated_model_type ] = cls super().__init__(name, bases, attrs) class ModelWrapperBase(metaclass=_ModelWrapperMeta): """The base class for model wrapper.""" model_type: str """The type of the model wrapper, which is to identify the model wrapper class in model configuration.""" config_name: str """The name of the model configuration.""" model_name: str """The name of the model, which is used in model api calling.""" def __init__( self, # pylint: disable=W0613 config_name: str, **kwargs: Any, ) -> None: """Base class for model wrapper. All model wrappers should inherit this class and implement the `__call__` function. Args: config_name (`str`): The id of the model, which is used to extract configuration from the config file. 
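        Note:
            Because ``__call__`` is wrapped by ``_response_parse_decorator``,
            every concrete wrapper also accepts the reserved keyword
            arguments ``parse_func``, ``fault_handler`` and ``max_retries``.
            An illustrative sketch (``my_parse`` is a hypothetical callable
            that takes the ``ModelResponse`` and may raise
            ``ResponseParsingError``):

            .. code-block:: python

                response = model(
                    messages,
                    parse_func=my_parse,            # parse and check the response
                    max_retries=3,                  # regenerate on parse failure
                    fault_handler=lambda res: res,  # fallback after max_retries
                )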
""" self.monitor = MonitorFactory.get_monitor() self.config_name = config_name logger.info(f"Initialize model by configuration [{config_name}]") @classmethod def get_wrapper(cls, model_type: str) -> Type[ModelWrapperBase]: """Get the specific model wrapper""" if model_type in cls._type_registry: return cls._type_registry[model_type] # type: ignore[return-value] elif model_type in cls._registry: return cls._registry[model_type] # type: ignore[return-value] elif model_type in cls._deprecated_type_registry: deprecated_cls = cls._deprecated_type_registry[model_type] logger.warning( f"Model type [{model_type}] will be deprecated in future " f"releases, please use [{deprecated_cls.model_type}] instead.", ) return deprecated_cls # type: ignore[return-value] else: return None # type: ignore[return-value] def __call__(self, *args: Any, **kwargs: Any) -> ModelResponse: """Processing input with the model.""" raise NotImplementedError( f"Model Wrapper [{type(self).__name__}]" f" is missing the required `__call__`" f" method.", ) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict], str]: """Format the input string or dict into the format that the model API required.""" raise NotImplementedError( f"Model Wrapper [{type(self).__name__}]" f" is missing the required `format` method", ) def _save_model_invocation( self, arguments: dict, response: Any, ) -> None: """Save model invocation.""" model_class = self.__class__.__name__ timestamp = _get_timestamp("%Y%m%d-%H%M%S") invocation_record = { "model_class": model_class, "timestamp": timestamp, "arguments": arguments, "response": response, } file_manager.save_api_invocation( f"model_{model_class}_{timestamp}", invocation_record, ) def _register_budget(self, model_name: str, budget: float) -> None: """Register the budget of the model by model_name.""" self.monitor.register_budget( model_name=model_name, value=budget, prefix=model_name, ) def _register_default_metrics(self) -> None: """Register metrics to the monitor.""" def _metric(self, metric_name: str) -> str: """Add the class name and model name as prefix to the metric name. Args: metric_name (`str`): The metric name. Returns: `str`: Metric name of this wrapper. """ if hasattr(self, "model_name"): return get_full_name(name=metric_name, prefix=self.model_name) else: return get_full_name(name=metric_name) def update_monitor(self, **kwargs: Any) -> None: """Update the monitor with the given values. Args: kwargs (`dict`): The values to be updated to the monitor. """ if hasattr(self, "model_name"): prefix = self.model_name else: prefix = None try: self.monitor.update( kwargs, prefix=prefix, ) except QuotaExceededError as e: logger.error(e.message) ``` modelscope/agentscope/blob/main/src/agentscope/models/ollama_model.py: ```py # -*- coding: utf-8 -*- """Model wrapper for Ollama models.""" from abc import ABC from typing import Sequence, Any, Optional, List, Union, Generator from agentscope.message import Msg from agentscope.models import ModelWrapperBase, ModelResponse from agentscope.utils.tools import _convert_to_str try: import ollama except ImportError: ollama = None class OllamaWrapperBase(ModelWrapperBase, ABC): """The base class for Ollama model wrappers. To use Ollama API, please 1. First install ollama server from https://ollama.com/download and start the server 2. Pull the model by `ollama pull {model_name}` in terminal After that, you can use the ollama API. 
""" model_type: str """The type of the model wrapper, which is to identify the model wrapper class in model configuration.""" model_name: str """The model name used in ollama API.""" options: dict """A dict contains the options for ollama generation API, e.g. {"temperature": 0, "seed": 123}""" keep_alive: str """Controls how long the model will stay loaded into memory following the request.""" def __init__( self, config_name: str, model_name: str, options: dict = None, keep_alive: str = "5m", host: Optional[Union[str, None]] = None, **kwargs: Any, ) -> None: """Initialize the model wrapper for Ollama API. Args: model_name (`str`): The model name used in ollama API. options (`dict`, default `None`): The extra keyword arguments used in Ollama api generation, e.g. `{"temperature": 0., "seed": 123}`. keep_alive (`str`, default `5m`): Controls how long the model will stay loaded into memory following the request. host (`str`, default `None`): The host port of the ollama server. Defaults to `None`, which is 127.0.0.1:11434. """ super().__init__(config_name=config_name) self.model_name = model_name self.options = options self.keep_alive = keep_alive self.client = ollama.Client(host=host, **kwargs) self._register_default_metrics() class OllamaChatWrapper(OllamaWrapperBase): """The model wrapper for Ollama chat API.""" model_type: str = "ollama_chat" def __init__( self, config_name: str, model_name: str, stream: bool = False, options: dict = None, keep_alive: str = "5m", host: Optional[Union[str, None]] = None, **kwargs: Any, ) -> None: """Initialize the model wrapper for Ollama API. Args: model_name (`str`): The model name used in ollama API. stream (`bool`, default `False`): Whether to enable stream mode. options (`dict`, default `None`): The extra keyword arguments used in Ollama api generation, e.g. `{"temperature": 0., "seed": 123}`. keep_alive (`str`, default `5m`): Controls how long the model will stay loaded into memory following the request. host (`str`, default `None`): The host port of the ollama server. Defaults to `None`, which is 127.0.0.1:11434. """ super().__init__( config_name=config_name, model_name=model_name, options=options, keep_alive=keep_alive, host=host, **kwargs, ) self.stream = stream def __call__( self, messages: Sequence[dict], stream: Optional[bool] = None, options: Optional[dict] = None, keep_alive: Optional[str] = None, **kwargs: Any, ) -> ModelResponse: """Generate response from the given messages. Args: messages (`Sequence[dict]`): A list of messages, each message is a dict contains the `role` and `content` of the message. stream (`bool`, default `None`): Whether to enable stream mode, which will override the `stream` input in the constructor. options (`dict`, default `None`): The extra arguments used in ollama chat API, which takes effect only on this call, and will be merged with the `options` input in the constructor, e.g. `{"temperature": 0., "seed": 123}`. keep_alive (`str`, default `None`): How long the model will stay loaded into memory following the request, which takes effect only on this call, and will override the `keep_alive` input in the constructor. Returns: `ModelResponse`: The response text in `text` field, and the raw response in `raw` field. 
""" # step1: prepare parameters accordingly if options is None: options = self.options else: options = {**self.options, **options} keep_alive = keep_alive or self.keep_alive # step2: forward to generate response if stream is None: stream = self.stream kwargs.update( { "model": self.model_name, "messages": messages, "stream": stream, "options": options, "keep_alive": keep_alive, }, ) response = self.client.chat(**kwargs) if stream: def generator() -> Generator[str, None, None]: last_chunk = {} text = "" for chunk in response: text += chunk["message"]["content"] yield text last_chunk = chunk # Replace the last chunk with the full text last_chunk["message"]["content"] = text self._save_model_invocation_and_update_monitor( kwargs, last_chunk, ) return ModelResponse( stream=generator(), raw=response, ) else: # step3: save model invocation and update monitor self._save_model_invocation_and_update_monitor( kwargs, response, ) # step4: return response return ModelResponse( text=response["message"]["content"], raw=response, ) def _save_model_invocation_and_update_monitor( self, kwargs: dict, response: dict, ) -> None: """Save the model invocation and update the monitor accordingly. Args: kwargs (`dict`): The keyword arguments to the DashScope chat API. response (`dict`): The response object returned by the DashScope chat API. """ prompt_eval_count = response.get("prompt_eval_count", 0) eval_count = response.get("eval_count", 0) self.update_monitor( call_counter=1, prompt_tokens=prompt_eval_count, completion_tokens=eval_count, total_tokens=prompt_eval_count + eval_count, ) self._save_model_invocation( arguments=kwargs, response=response, ) def _register_default_metrics(self) -> None: """Register metrics to the monitor.""" self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="tokens", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> List[dict]: """Format the messages for ollama Chat API. All messages will be formatted into a single system message with system prompt and dialogue history. Note: 1. This strategy maybe not suitable for all scenarios, and developers are encouraged to implement their own prompt engineering strategies. 2. For ollama chat api, the content field shouldn't be empty string. Example: .. code-block:: python prompt = model.format( Msg("system", "You're a helpful assistant", role="system"), Msg("Bob", "Hi, how can I help you?", role="assistant"), Msg("user", "What's the date today?", role="user") ) The prompt will be as follows: .. code-block:: python [ { "role": "user", "content": ( "You're a helpful assistant\\n\\n" "## Dialogue History\\n" "Bob: Hi, how can I help you?\\n" "user: What's the date today?" ) } ] Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `List[dict]`: The formatted messages. 
""" # Parse all information into a list of messages input_msgs = [] for _ in args: if _ is None: continue if isinstance(_, Msg): input_msgs.append(_) elif isinstance(_, list) and all(isinstance(__, Msg) for __ in _): input_msgs.extend(_) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(_)}.", ) # record dialog history as a list of strings system_content_template = [] dialogue = [] # TODO: here we default the url links to images images = [] for i, unit in enumerate(input_msgs): if i == 0 and unit.role == "system": # system prompt system_prompt = _convert_to_str(unit.content) if not system_prompt.endswith("\n"): system_prompt += "\n" system_content_template.append(system_prompt) else: # Merge all messages into a dialogue history prompt dialogue.append( f"{unit.name}: {_convert_to_str(unit.content)}", ) if unit.url is not None: images.append(unit.url) if len(dialogue) != 0: dialogue_history = "\n".join(dialogue) system_content_template.extend( ["## Dialogue History", dialogue_history], ) system_content = "\n".join(system_content_template) system_message = { "role": "system", "content": system_content, } if len(images) != 0: system_message["images"] = images return [system_message] class OllamaEmbeddingWrapper(OllamaWrapperBase): """The model wrapper for Ollama embedding API.""" model_type: str = "ollama_embedding" def __call__( self, prompt: str, options: Optional[dict] = None, keep_alive: Optional[str] = None, **kwargs: Any, ) -> ModelResponse: """Generate embedding from the given prompt. Args: prompt (`str`): The prompt to generate response. options (`dict`, default `None`): The extra arguments used in ollama embedding API, which takes effect only on this call, and will be merged with the `options` input in the constructor, e.g. `{"temperature": 0., "seed": 123}`. keep_alive (`str`, default `None`): How long the model will stay loaded into memory following the request, which takes effect only on this call, and will override the `keep_alive` input in the constructor. Returns: `ModelResponse`: The response embedding in `embedding` field, and the raw response in `raw` field. """ # step1: prepare parameters accordingly if options is None: options = self.options else: options = {**self.options, **options} keep_alive = keep_alive or self.keep_alive # step2: forward to generate response response = self.client.embeddings( model=self.model_name, prompt=prompt, options=options, keep_alive=keep_alive, **kwargs, ) # step3: record the api invocation if needed self._save_model_invocation( arguments={ "model": self.model_name, "prompt": prompt, "options": options, "keep_alive": keep_alive, **kwargs, }, response=response, ) # step4: monitor the response self.update_monitor(call_counter=1) # step5: return response return ModelResponse( embedding=[response["embedding"]], raw=response, ) def _register_default_metrics(self) -> None: """Register metrics to the monitor.""" self.monitor.register( self._metric("call_counter"), metric_unit="times", ) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict], str]: raise RuntimeError( f"Model Wrapper [{type(self).__name__}] doesn't " f"need to format the input. 
Please try to use the " f"model wrapper directly.", ) class OllamaGenerationWrapper(OllamaWrapperBase): """The model wrapper for Ollama generation API.""" model_type: str = "ollama_generate" def __call__( self, prompt: str, options: Optional[dict] = None, keep_alive: Optional[str] = None, **kwargs: Any, ) -> ModelResponse: """Generate response from the given prompt. Args: prompt (`str`): The prompt to generate response. options (`dict`, default `None`): The extra arguments used in ollama generation API, which takes effect only on this call, and will be merged with the `options` input in the constructor, e.g. `{"temperature": 0., "seed": 123}`. keep_alive (`str`, default `None`): How long the model will stay loaded into memory following the request, which takes effect only on this call, and will override the `keep_alive` input in the constructor. Returns: `ModelResponse`: The response text in `text` field, and the raw response in `raw` field. """ # step1: prepare parameters accordingly if options is None: options = self.options else: options = {**self.options, **options} keep_alive = keep_alive or self.keep_alive # step2: forward to generate response response = self.client.generate( model=self.model_name, prompt=prompt, options=options, keep_alive=keep_alive, ) # step3: record the api invocation if needed self._save_model_invocation( arguments={ "model": self.model_name, "prompt": prompt, "options": options, "keep_alive": keep_alive, **kwargs, }, response=response, ) # step4: monitor the response self.update_monitor( call_counter=1, prompt_tokens=response.get("prompt_eval_count", 0), completion_tokens=response.get("eval_count", 0), total_tokens=response.get("prompt_eval_count", 0) + response.get("eval_count", 0), ) # step5: return response return ModelResponse( text=response["response"], raw=response, ) def _register_default_metrics(self) -> None: """Register metrics to the monitor.""" self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="tokens", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def format(self, *args: Union[Msg, Sequence[Msg]]) -> str: """Forward the input to the model. Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `str`: The formatted string prompt. 
""" input_msgs = [] for _ in args: if _ is None: continue if isinstance(_, Msg): input_msgs.append(_) elif isinstance(_, list) and all(isinstance(__, Msg) for __ in _): input_msgs.extend(_) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(_)}.", ) sys_prompt = None dialogue = [] for i, unit in enumerate(input_msgs): if i == 0 and unit.role == "system": # system prompt sys_prompt = _convert_to_str(unit.content) else: # Merge all messages into a dialogue history prompt dialogue.append( f"{unit.name}: {_convert_to_str(unit.content)}", ) dialogue_history = "\n".join(dialogue) if sys_prompt is None: prompt_template = "## Dialogue History\n{dialogue_history}" else: prompt_template = ( "{system_prompt}\n" "\n" "## Dialogue History\n" "{dialogue_history}" ) return prompt_template.format( system_prompt=sys_prompt, dialogue_history=dialogue_history, ) ``` modelscope/agentscope/blob/main/src/agentscope/models/openai_model.py: ```py # -*- coding: utf-8 -*- """Model wrapper for OpenAI models""" from abc import ABC from typing import Union, Any, List, Sequence, Dict, Optional, Generator from loguru import logger from ._model_utils import ( _verify_text_content_in_openai_delta_response, _verify_text_content_in_openai_message_response, ) from .model import ModelWrapperBase, ModelResponse from ..file_manager import file_manager from ..message import Msg from ..utils.tools import _convert_to_str, _to_openai_image_url from ..utils.token_utils import get_openai_max_length from ..constants import _DEFAULT_API_BUDGET class OpenAIWrapperBase(ModelWrapperBase, ABC): """The model wrapper for OpenAI API.""" def __init__( self, config_name: str, model_name: str = None, api_key: str = None, organization: str = None, client_args: dict = None, generate_args: dict = None, budget: float = _DEFAULT_API_BUDGET, **kwargs: Any, ) -> None: """Initialize the openai client. Args: config_name (`str`): The name of the model config. model_name (`str`, default `None`): The name of the model to use in OpenAI API. api_key (`str`, default `None`): The API key for OpenAI API. If not specified, it will be read from the environment variable `OPENAI_API_KEY`. organization (`str`, default `None`): The organization ID for OpenAI API. If not specified, it will be read from the environment variable `OPENAI_ORGANIZATION`. client_args (`dict`, default `None`): The extra keyword arguments to initialize the OpenAI client. generate_args (`dict`, default `None`): The extra keyword arguments used in openai api generation, e.g. `temperature`, `seed`. budget (`float`, default `None`): The total budget using this model. Set to `None` means no limit. 
""" if model_name is None: model_name = config_name logger.warning("model_name is not set, use config_name instead.") super().__init__(config_name=config_name) self.model_name = model_name self.generate_args = generate_args or {} try: import openai except ImportError as e: raise ImportError( "Cannot find openai package, please install it by " "`pip install openai`", ) from e self.client = openai.OpenAI( api_key=api_key, organization=organization, **(client_args or {}), ) # Set the max length of OpenAI model try: self.max_length = get_openai_max_length(self.model_name) except Exception as e: logger.warning( f"fail to get max_length for {self.model_name}: " f"{e}", ) self.max_length = None # Set monitor accordingly self._register_budget(model_name, budget) self._register_default_metrics() def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict], str]: raise RuntimeError( f"Model Wrapper [{type(self).__name__}] doesn't " f"need to format the input. Please try to use the " f"model wrapper directly.", ) class OpenAIChatWrapper(OpenAIWrapperBase): """The model wrapper for OpenAI's chat API.""" model_type: str = "openai_chat" deprecated_model_type: str = "openai" substrings_in_vision_models_names = ["gpt-4-turbo", "vision", "gpt-4o"] """The substrings in the model names of vision models.""" def __init__( self, config_name: str, model_name: str = None, api_key: str = None, organization: str = None, client_args: dict = None, stream: bool = False, generate_args: dict = None, budget: float = _DEFAULT_API_BUDGET, **kwargs: Any, ) -> None: """Initialize the openai client. Args: config_name (`str`): The name of the model config. model_name (`str`, default `None`): The name of the model to use in OpenAI API. api_key (`str`, default `None`): The API key for OpenAI API. If not specified, it will be read from the environment variable `OPENAI_API_KEY`. organization (`str`, default `None`): The organization ID for OpenAI API. If not specified, it will be read from the environment variable `OPENAI_ORGANIZATION`. client_args (`dict`, default `None`): The extra keyword arguments to initialize the OpenAI client. stream (`bool`, default `False`): Whether to enable stream mode. generate_args (`dict`, default `None`): The extra keyword arguments used in openai api generation, e.g. `temperature`, `seed`. budget (`float`, default `None`): The total budget using this model. Set to `None` means no limit. """ super().__init__( config_name=config_name, model_name=model_name, api_key=api_key, organization=organization, client_args=client_args, generate_args=generate_args, budget=budget, **kwargs, ) self.stream = stream def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="token", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def __call__( self, messages: list, stream: Optional[bool] = None, **kwargs: Any, ) -> ModelResponse: """Processes a list of messages to construct a payload for the OpenAI API call. It then makes a request to the OpenAI API and returns the response. This method also updates monitoring metrics based on the API response. Each message in the 'messages' list can contain text content and optionally an 'image_urls' key. 
If 'image_urls' is provided, it is expected to be a list of strings representing URLs to images. These URLs will be transformed to a suitable format for the OpenAI API, which might involve converting local file paths to data URIs. Args: messages (`list`): A list of messages to process. stream (`Optional[bool]`, defaults to `None`) Whether to enable stream mode, which will override the `stream` argument in the constructor if provided. **kwargs (`Any`): The keyword arguments to OpenAI chat completions API, e.g. `temperature`, `max_tokens`, `top_p`, etc. Please refer to https://platform.openai.com/docs/api-reference/chat/create for more detailed arguments. Returns: `ModelResponse`: The response text in text field, and the raw response in raw field. Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. """ # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: checking messages if not isinstance(messages, list): raise ValueError( "OpenAI `messages` field expected type `list`, " f"got `{type(messages)}` instead.", ) if not all("role" in msg and "content" in msg for msg in messages): raise ValueError( "Each message in the 'messages' list must contain a 'role' " "and 'content' key for OpenAI API.", ) # step3: forward to generate response if stream is None: stream = self.stream kwargs.update( { "model": self.model_name, "messages": messages, "stream": stream, }, ) if stream: kwargs["stream_options"] = {"include_usage": True} response = self.client.chat.completions.create(**kwargs) if stream: def generator() -> Generator[str, None, None]: text = "" last_chunk = {} for chunk in response: chunk = chunk.model_dump() if _verify_text_content_in_openai_delta_response(chunk): text += chunk["choices"][0]["delta"]["content"] yield text last_chunk = chunk # Update the last chunk to save locally if last_chunk.get("choices", []) in [None, []]: last_chunk["choices"] = [{}] last_chunk["choices"][0]["message"] = { "role": "assistant", "content": text, } self._save_model_invocation_and_update_monitor( kwargs, last_chunk, ) return ModelResponse( stream=generator(), ) else: response = response.model_dump() self._save_model_invocation_and_update_monitor( kwargs, response, ) if _verify_text_content_in_openai_message_response(response): # return response return ModelResponse( text=response["choices"][0]["message"]["content"], raw=response, ) else: raise RuntimeError( f"Invalid response from OpenAI API: {response}", ) def _save_model_invocation_and_update_monitor( self, kwargs: dict, response: dict, ) -> None: """Save model invocation and update the monitor accordingly. Args: kwargs (`dict`): The keyword arguments used in model invocation response (`dict`): The response from model API """ self._save_model_invocation( arguments=kwargs, response=response, ) if response.get("usage", None) is not None: self.update_monitor(call_counter=1, **response["usage"]) def _format_msg_with_url( self, msg: Msg, ) -> Dict: """Format a message with image urls into openai chat format. 
This format method is used for gpt-4o, gpt-4-turbo, gpt-4-vision and other vision models. """ # Check if the model is a vision model if not any( _ in self.model_name for _ in self.substrings_in_vision_models_names ): logger.warning( f"The model {self.model_name} is not a vision model. " f"Skip the url in the message.", ) return { "role": msg.role, "name": msg.name, "content": _convert_to_str(msg.content), } # Put all urls into a list urls = [msg.url] if isinstance(msg.url, str) else msg.url # Check if the url refers to an image checked_urls = [] for url in urls: try: checked_urls.append(_to_openai_image_url(url)) except TypeError: logger.warning( f"The url {url} is not a valid image url for " f"OpenAI Chat API, skipped.", ) if len(checked_urls) == 0: # If no valid image url is provided, return the normal message dict return { "role": msg.role, "name": msg.name, "content": _convert_to_str(msg.content), } else: # otherwise, use the vision format message returned_msg = { "role": msg.role, "name": msg.name, "content": [ { "type": "text", "text": _convert_to_str(msg.content), }, ], } image_dicts = [ { "type": "image_url", "image_url": { "url": _, }, } for _ in checked_urls ] returned_msg["content"].extend(image_dicts) return returned_msg def format( self, *args: Union[Msg, Sequence[Msg]], ) -> List[dict]: """Format the input string and dictionary into the format that OpenAI Chat API required. Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `List[dict]`: The formatted messages in the format that OpenAI Chat API required. """ messages = [] for arg in args: if arg is None: continue if isinstance(arg, Msg): if arg.url is not None: messages.append(self._format_msg_with_url(arg)) else: messages.append( { "role": arg.role, "name": arg.name, "content": _convert_to_str(arg.content), }, ) elif isinstance(arg, list): messages.extend(self.format(*arg)) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(arg)}.", ) return messages class OpenAIDALLEWrapper(OpenAIWrapperBase): """The model wrapper for OpenAI's DALL·E API.""" model_type: str = "openai_dall_e" _resolutions: list = [ "1792*1024", "1024*1792", "1024*1024", "512*512", "256*256", ] def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) for resolution in self._resolutions: self.monitor.register( self._metric(resolution), metric_unit="image", ) def __call__( self, prompt: str, save_local: bool = False, **kwargs: Any, ) -> ModelResponse: """ Args: prompt (`str`): The prompt string to generate images from. save_local: (`bool`, default `False`): Whether to save the generated images locally, and replace the returned image url with the local path. **kwargs (`Any`): The keyword arguments to OpenAI image generation API, e.g. `n`, `quality`, `response_format`, `size`, etc. Please refer to https://platform.openai.com/docs/api-reference/images/create for more detailed arguments. Returns: `ModelResponse`: A list of image urls in image_urls field and the raw response in raw field. Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. 
Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. """ # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: forward to generate response try: response = self.client.images.generate( model=self.model_name, prompt=prompt, **kwargs, ) except Exception as e: logger.error( f"Failed to generate images for prompt '{prompt}': {e}", ) raise e # step3: record the model api invocation if needed self._save_model_invocation( arguments={ "model": self.model_name, "prompt": prompt, **kwargs, }, response=response.model_dump(), ) # step4: update monitor accordingly self.update_monitor(call_counter=1) # step5: return response raw_response = response.model_dump() if "data" not in raw_response: if "error" in raw_response: error_msg = raw_response["error"]["message"] else: error_msg = raw_response logger.error(f"Error in OpenAI API call:\n{error_msg}") raise ValueError(f"Error in OpenAI API call:\n{error_msg}") images = raw_response["data"] # Get image urls as a list urls = [_["url"] for _ in images] if save_local: # Return local url if save_local is True urls = [file_manager.save_image(_) for _ in urls] return ModelResponse(image_urls=urls, raw=raw_response) class OpenAIEmbeddingWrapper(OpenAIWrapperBase): """The model wrapper for OpenAI embedding API.""" model_type: str = "openai_embedding" def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def __call__( self, texts: Union[list[str], str], **kwargs: Any, ) -> ModelResponse: """Embed the messages with OpenAI embedding API. Args: texts (`list[str]` or `str`): The messages used to embed. **kwargs (`Any`): The keyword arguments to OpenAI embedding API, e.g. `encoding_format`, `user`. Please refer to https://platform.openai.com/docs/api-reference/embeddings for more detailed arguments. Returns: `ModelResponse`: A list of embeddings in embedding field and the raw response in raw field. Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. 
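        Example:
            A minimal usage sketch (the model name is a placeholder for
            an OpenAI embedding model):

            .. code-block:: python

                model = OpenAIEmbeddingWrapper(
                    config_name="my_openai_embedding",
                    model_name="text-embedding-ada-002",
                )
                response = model(texts=["Hello world!", "How are you?"])
                print(response.embedding)  # one embedding per input text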
""" # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: forward to generate response response = self.client.embeddings.create( input=texts, model=self.model_name, **kwargs, ) # step3: record the model api invocation if needed self._save_model_invocation( arguments={ "model": self.model_name, "input": texts, **kwargs, }, response=response.model_dump(), ) # step4: update monitor accordingly self.update_monitor(call_counter=1, **response.usage.model_dump()) # step5: return response response_json = response.model_dump() return ModelResponse( embedding=[_["embedding"] for _ in response_json["data"]], raw=response_json, ) ``` modelscope/agentscope/blob/main/src/agentscope/models/post_model.py: ```py # -*- coding: utf-8 -*- """Model wrapper for post-based inference apis.""" import json import time from abc import ABC from typing import Any, Union, Sequence, List import requests from loguru import logger from .model import ModelWrapperBase, ModelResponse from ..constants import _DEFAULT_MAX_RETRIES from ..constants import _DEFAULT_MESSAGES_KEY from ..constants import _DEFAULT_RETRY_INTERVAL from ..message import Msg from ..utils.tools import _convert_to_str class PostAPIModelWrapperBase(ModelWrapperBase, ABC): """The base model wrapper for the model deployed on the POST API.""" model_type: str = "post_api" def __init__( self, config_name: str, api_url: str, headers: dict = None, max_length: int = 2048, timeout: int = 30, json_args: dict = None, post_args: dict = None, max_retries: int = _DEFAULT_MAX_RETRIES, messages_key: str = _DEFAULT_MESSAGES_KEY, retry_interval: int = _DEFAULT_RETRY_INTERVAL, **kwargs: Any, ) -> None: """Initialize the model wrapper. Args: config_name (`str`): The id of the model. api_url (`str`): The url of the post request api. headers (`dict`, defaults to `None`): The headers of the api. Defaults to None. max_length (`int`, defaults to `2048`): The maximum length of the model. timeout (`int`, defaults to `30`): The timeout of the api. Defaults to 30. json_args (`dict`, defaults to `None`): The json arguments of the api. Defaults to None. post_args (`dict`, defaults to `None`): The post arguments of the api. Defaults to None. max_retries (`int`, defaults to `3`): The maximum number of retries when the `parse_func` raise an exception. messages_key (`str`, defaults to `inputs`): The key of the input messages in the json argument. retry_interval (`int`, defaults to `1`): The interval between retries when a request fails. Note: When an object of `PostApiModelWrapper` is called, the arguments will of post requests will be used as follows: .. code-block:: python request.post( url=api_url, headers=headers, json={ messages_key: messages, **json_args }, **post_args ) """ super().__init__(config_name=config_name) self.api_url = api_url self.headers = headers self.max_length = max_length self.timeout = timeout self.json_args = json_args or {} self.post_args = post_args or {} self.max_retries = max_retries self.messages_key = messages_key self.retry_interval = retry_interval def _parse_response(self, response: dict) -> ModelResponse: """Parse the response json data into ModelResponse""" return ModelResponse(raw=response) def __call__(self, input_: str, **kwargs: Any) -> ModelResponse: """Calling the model with requests.post. Args: input_ (`str`): The input string to the model. Returns: `dict`: A dictionary that contains the response of the model and related information (e.g. cost, time, the number of tokens, etc.). 
Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. """ # step1: prepare keyword arguments post_args = {**self.post_args, **kwargs} request_kwargs = { "url": self.api_url, "json": {self.messages_key: input_, **self.json_args}, "headers": self.headers or {}, **post_args, } # step2: prepare post requests for i in range(1, self.max_retries + 1): response = requests.post(**request_kwargs) if response.status_code == requests.codes.ok: break if i < self.max_retries: logger.warning( f"Failed to call the model with " f"requests.codes == {response.status_code}, retry " f"{i + 1}/{self.max_retries} times", ) time.sleep(i * self.retry_interval) # step3: record model invocation # record the model api invocation, which will be skipped if # `FileManager.save_api_invocation` is `False` self._save_model_invocation( arguments=request_kwargs, response=response.json(), ) # step4: parse the response if response.status_code == requests.codes.ok: return self._parse_response(response.json()) else: logger.error(json.dumps(request_kwargs, indent=4)) raise RuntimeError( f"Failed to call the model with {response.json()}", ) class PostAPIChatWrapper(PostAPIModelWrapperBase): """A post api model wrapper compatible with openai chat, e.g., vLLM, FastChat.""" model_type: str = "post_api_chat" def _parse_response(self, response: dict) -> ModelResponse: return ModelResponse( text=response["data"]["response"]["choices"][0]["message"][ "content" ], ) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict]]: """Format the input messages into a list of dict, which is compatible to OpenAI Chat API. Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `Union[List[dict]]`: The formatted messages. 
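Example:
    An illustrative sketch; the message contents are placeholders.

    .. code-block:: python

        prompt = model.format(
            Msg("system", "You are a helpful assistant.", role="system"),
            Msg("Alice", "What is the weather like today?", role="user"),
        )
        # prompt == [
        #     {
        #         "role": "system",
        #         "name": "system",
        #         "content": "You are a helpful assistant.",
        #     },
        #     {
        #         "role": "user",
        #         "name": "Alice",
        #         "content": "What is the weather like today?",
        #     },
        # ]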
""" messages = [] for arg in args: if arg is None: continue if isinstance(arg, Msg): messages.append( { "role": arg.role, "name": arg.name, "content": _convert_to_str(arg.content), }, ) elif isinstance(arg, list): messages.extend(self.format(*arg)) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(arg)}.", ) return messages class PostAPIDALLEWrapper(PostAPIModelWrapperBase): """A post api model wrapper compatible with openai dall_e""" model_type: str = "post_api_dall_e" deprecated_model_type: str = "post_api_dalle" def _parse_response(self, response: dict) -> ModelResponse: if "data" not in response["data"]["response"]: if "error" in response["data"]["response"]: error_msg = response["data"]["response"]["error"]["message"] else: error_msg = response["data"]["response"] logger.error(f"Error in API call:\n{error_msg}") raise ValueError(f"Error in API call:\n{error_msg}") urls = [img["url"] for img in response["data"]["response"]["data"]] return ModelResponse(image_urls=urls) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict], str]: raise RuntimeError( f"Model Wrapper [{type(self).__name__}] doesn't " f"need to format the input. Please try to use the " f"model wrapper directly.", ) class PostAPIEmbeddingWrapper(PostAPIModelWrapperBase): """ A post api model wrapper for embedding model """ model_type: str = "post_api_embedding" def _parse_response(self, response: dict) -> ModelResponse: """ Parse the response json data into ModelResponse with embedding. Args: response (`dict`): The response obtained from the API. This parsing assume the structure of the response is as following: { "code": 200, "data": { ... "response": { "data": [ { "embedding": [ 0.001, ... ], ... } ], "model": "xxxx", ... }, }, } """ if "data" not in response["data"]["response"]: if "error" in response["data"]["response"]: error_msg = response["data"]["response"]["error"]["message"] else: error_msg = response["data"]["response"] logger.error(f"Error in embedding API call:\n{error_msg}") raise ValueError(f"Error in embedding API call:\n{error_msg}") embeddings = [ data["embedding"] for data in response["data"]["response"]["data"] ] return ModelResponse( embedding=embeddings, raw=response, ) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict], str]: raise RuntimeError( f"Model Wrapper [{type(self).__name__}] doesn't " f"need to format the input. Please try to use the " f"model wrapper directly.", ) ``` modelscope/agentscope/blob/main/src/agentscope/models/response.py: ```py # -*- coding: utf-8 -*- """Parser for model response.""" import json from typing import Optional, Sequence, Any, Generator, Union, Tuple from agentscope.utils.tools import _is_json_serializable class ModelResponse: """Encapsulation of data returned by the model. The main purpose of this class is to align the return formats of different models and act as a bridge between models and agents. """ def __init__( self, text: str = None, embedding: Sequence = None, image_urls: Sequence[str] = None, raw: Any = None, parsed: Any = None, stream: Optional[Generator[str, None, None]] = None, ) -> None: """Initialize the model response. Args: text (`str`, optional): The text field. embedding (`Sequence`, optional): The embedding returned by the model. image_urls (`Sequence[str]`, optional): The image URLs returned by the model. raw (`Any`, optional): The raw data returned by the model. parsed (`Any`, optional): The parsed data returned by the model. 
stream (`Generator`, optional): The stream data returned by the model. """ self._text = text self.embedding = embedding self.image_urls = image_urls self.raw = raw self.parsed = parsed self._stream = stream self._is_stream_exhausted = False @property def text(self) -> str: """Return the text field. If the stream field is available, the text field will be updated accordingly.""" if self._text is None: if self.stream is not None: for chunk in self.stream: self._text += chunk return self._text @property def stream(self) -> Union[None, Generator[Tuple[bool, str], None, None]]: """Return the stream generator if it exists.""" if self._stream is None: return self._stream else: return self._stream_generator_wrapper() @property def is_stream_exhausted(self) -> bool: """Whether the stream has been processed already.""" return self._is_stream_exhausted def _stream_generator_wrapper( self, ) -> Generator[Tuple[bool, str], None, None]: """During processing the stream generator, the text field is updated accordingly.""" if self._is_stream_exhausted: raise RuntimeError( "The stream has been processed already. Try to obtain the " "result from the text field.", ) # These two lines are used to avoid mypy checking error if self._stream is None: return try: last_text = next(self._stream) for text in self._stream: self._text = last_text yield False, last_text last_text = text self._text = last_text yield True, last_text return except StopIteration: return def __str__(self) -> str: if _is_json_serializable(self.raw): raw = self.raw else: raw = str(self.raw) serialized_fields = { "text": self.text, "embedding": self.embedding, "image_urls": self.image_urls, "parsed": self.parsed, "raw": raw, } return json.dumps(serialized_fields, indent=4, ensure_ascii=False) ``` modelscope/agentscope/blob/main/src/agentscope/models/zhipu_model.py: ```py # -*- coding: utf-8 -*- """Model wrapper for ZhipuAI models""" from abc import ABC from typing import Union, Any, List, Sequence, Optional, Generator from loguru import logger from ._model_utils import _verify_text_content_in_openai_delta_response from .model import ModelWrapperBase, ModelResponse from ..message import Msg from ..utils.tools import _convert_to_str try: import zhipuai except ImportError: zhipuai = None class ZhipuAIWrapperBase(ModelWrapperBase, ABC): """The model wrapper for ZhipuAI API.""" def __init__( self, config_name: str, model_name: str = None, api_key: str = None, client_args: dict = None, generate_args: dict = None, **kwargs: Any, ) -> None: """Initialize the zhipuai client. To init the ZhipuAi client, the api_key is required. Other client args include base_url and timeout. The base_url is set to https://open.bigmodel.cn/api/paas/v4 if not specified. The timeout arg is set for http request timeout. Args: config_name (`str`): The name of the model config. model_name (`str`, default `None`): The name of the model to use in ZhipuAI API. api_key (`str`, default `None`): The API key for ZhipuAI API. If not specified, it will be read from the environment variable. client_args (`dict`, default `None`): The extra keyword arguments to initialize the ZhipuAI client. generate_args (`dict`, default `None`): The extra keyword arguments used in zhipuai api generation, e.g. `temperature`, `seed`. 
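Example:
    An illustrative model config for the chat subclass below; the model
    name and API key are placeholders.

    .. code-block:: python

        {
            "config_name": "my_zhipuai_config",
            "model_type": "zhipuai_chat",
            "model_name": "glm-4",
            "api_key": "{YOUR_ZHIPUAI_API_KEY}",
        }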
""" if model_name is None: model_name = config_name logger.warning("model_name is not set, use config_name instead.") super().__init__(config_name=config_name) if zhipuai is None: raise ImportError( "Cannot find zhipuai package in current python environment.", ) self.model_name = model_name self.generate_args = generate_args or {} self.client = zhipuai.ZhipuAI( api_key=api_key, **(client_args or {}), ) self._register_default_metrics() def format( self, *args: Union[Msg, Sequence[Msg]], ) -> Union[List[dict], str]: raise RuntimeError( f"Model Wrapper [{type(self).__name__}] doesn't " f"need to format the input. Please try to use the " f"model wrapper directly.", ) class ZhipuAIChatWrapper(ZhipuAIWrapperBase): """The model wrapper for ZhipuAI's chat API.""" model_type: str = "zhipuai_chat" def __init__( self, config_name: str, model_name: str = None, api_key: str = None, stream: bool = False, client_args: dict = None, generate_args: dict = None, **kwargs: Any, ) -> None: """Initialize the zhipuai client. To init the ZhipuAi client, the api_key is required. Other client args include base_url and timeout. The base_url is set to https://open.bigmodel.cn/api/paas/v4 if not specified. The timeout arg is set for http request timeout. Args: config_name (`str`): The name of the model config. model_name (`str`, default `None`): The name of the model to use in ZhipuAI API. api_key (`str`, default `None`): The API key for ZhipuAI API. If not specified, it will be read from the environment variable. stream (`bool`, default `False`): Whether to enable stream mode. generate_args (`dict`, default `None`): The extra keyword arguments used in zhipuai api generation, e.g. `temperature`, `seed`. """ super().__init__( config_name=config_name, model_name=model_name, api_key=api_key, client_args=client_args, generate_args=generate_args, ) self.stream = stream def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="token", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) def __call__( self, messages: list, stream: Optional[bool] = None, **kwargs: Any, ) -> ModelResponse: """Processes a list of messages to construct a payload for the ZhipuAI API call. It then makes a request to the ZhipuAI API and returns the response. This method also updates monitoring metrics based on the API response. Args: messages (`list`): A list of messages to process. stream (`Optional[bool]`, default `None`): Whether to enable stream mode. If not specified, it will use the stream mode set in the constructor. **kwargs (`Any`): The keyword arguments to ZhipuAI chat completions API, e.g. `temperature`, `max_tokens`, `top_p`, etc. Please refer to https://open.bigmodel.cn/dev/api for more detailed arguments. Returns: `ModelResponse`: The response text in text field, and the raw response in raw field. Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. 
- `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. """ # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: checking messages if not isinstance(messages, list): raise ValueError( "ZhipuAI `messages` field expected type `list`, " f"got `{type(messages)}` instead.", ) if not all("role" in msg and "content" in msg for msg in messages): raise ValueError( "Each message in the 'messages' list must contain a 'role' " "and 'content' key for ZhipuAI API.", ) # step3: forward to generate response if stream is None: stream = self.stream kwargs.update( { "model": self.model_name, "messages": messages, "stream": stream, }, ) response = self.client.chat.completions.create(**kwargs) if stream: def generator() -> Generator[str, None, None]: """The generator of response text""" text = "" last_chunk = {} for chunk in response: chunk = chunk.model_dump() if _verify_text_content_in_openai_delta_response(chunk): text += chunk["choices"][0]["delta"]["content"] yield text last_chunk = chunk # Update the last chunk to save locally if last_chunk.get("choices", []) in [None, []]: last_chunk["choices"] = [{}] last_chunk["choices"][0]["message"] = { "role": "assistant", "content": text, } self._save_model_invocation_and_update_monitor( kwargs, last_chunk, ) return ModelResponse( stream=generator(), ) else: response = response.model_dump() self._save_model_invocation_and_update_monitor(kwargs, response) # Return response return ModelResponse( text=response["choices"][0]["message"]["content"], raw=response, ) def _save_model_invocation_and_update_monitor( self, kwargs: dict, response: dict, ) -> None: """Save the model invocation and update the monitor accordingly. Args: kwargs (`dict`): The keyword arguments used in model invocation response (`dict`): The response from model API """ self._save_model_invocation( arguments=kwargs, response=response, ) if response.get("usage", None) is not None: self.update_monitor(call_counter=1, **response["usage"]) def format( self, *args: Union[Msg, Sequence[Msg]], ) -> List[dict]: """Format the input string and dictionary into the format that ZhipuAI Chat API required. In this format function, the input messages are formatted into a single system messages with format "{name}: {content}" for each message. Note this strategy maybe not suitable for all scenarios, and developers are encouraged to implement their own prompt engineering strategies. Args: args (`Union[Msg, Sequence[Msg]]`): The input arguments to be formatted, where each argument should be a `Msg` object, or a list of `Msg` objects. In distribution, placeholder is also allowed. Returns: `List[dict]`: The formatted messages in the format that ZhipuAI Chat API required. 
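Example:
    An illustrative sketch of the strategy described above; the names and
    contents are placeholders.

    .. code-block:: python

        prompt = model.format(
            Msg("system", "You are a helpful assistant.", role="system"),
            Msg("Bob", "Hi, what can I do for you?", role="assistant"),
            Msg("Alice", "What is the capital of France?", role="user"),
        )
        # prompt == [
        #     {"role": "system", "content": "You are a helpful assistant."},
        #     {
        #         "role": "user",
        #         "content": "## Dialogue History\n"
        #                    "Bob: Hi, what can I do for you?\n"
        #                    "Alice: What is the capital of France?",
        #     },
        # ]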
""" # Parse all information into a list of messages input_msgs = [] for _ in args: if _ is None: continue if isinstance(_, Msg): input_msgs.append(_) elif isinstance(_, list) and all(isinstance(__, Msg) for __ in _): input_msgs.extend(_) else: raise TypeError( f"The input should be a Msg object or a list " f"of Msg objects, got {type(_)}.", ) messages = [] # record dialog history as a list of strings dialogue = [] for i, unit in enumerate(input_msgs): if i == 0 and unit.role == "system": # system prompt messages.append( { "role": unit.role, "content": _convert_to_str(unit.content), }, ) else: # Merge all messages into a dialogue history prompt dialogue.append( f"{unit.name}: {_convert_to_str(unit.content)}", ) dialogue_history = "\n".join(dialogue) user_content_template = "## Dialogue History\n{dialogue_history}" messages.append( { "role": "user", "content": user_content_template.format( dialogue_history=dialogue_history, ), }, ) return messages class ZhipuAIEmbeddingWrapper(ZhipuAIWrapperBase): """The model wrapper for ZhipuAI embedding API.""" model_type: str = "zhipuai_embedding" def __call__( self, texts: str, **kwargs: Any, ) -> ModelResponse: """Embed the messages with ZhipuAI embedding API. Args: texts (`str`): The messages used to embed. **kwargs (`Any`): The keyword arguments to ZhipuAI embedding API, e.g. `encoding_format`, `user`. Please refer to https://open.bigmodel.cn/dev/api#text_embedding for more detailed arguments. Returns: `ModelResponse`: A list of embeddings in embedding field and the raw response in raw field. Note: `parse_func`, `fault_handler` and `max_retries` are reserved for `_response_parse_decorator` to parse and check the response generated by model wrapper. Their usages are listed as follows: - `parse_func` is a callable function used to parse and check the response generated by the model, which takes the response as input. - `max_retries` is the maximum number of retries when the `parse_func` raise an exception. - `fault_handler` is a callable function which is called when the response generated by the model is invalid after `max_retries` retries. """ # step1: prepare keyword arguments kwargs = {**self.generate_args, **kwargs} # step2: forward to generate response response = self.client.embeddings.create( input=texts, model=self.model_name, ) # step3: record the model api invocation if needed self._save_model_invocation( arguments={ "model": self.model_name, "input": texts, **kwargs, }, response=response.model_dump(), ) # step4: update monitor accordingly self.update_monitor(call_counter=1, **response.usage.model_dump()) # step5: return response response_json = response.model_dump() return ModelResponse( embedding=[_["embedding"] for _ in response_json["data"]], raw=response_json, ) def _register_default_metrics(self) -> None: # Set monitor accordingly # TODO: set quota to the following metrics self.monitor.register( self._metric("call_counter"), metric_unit="times", ) self.monitor.register( self._metric("prompt_tokens"), metric_unit="token", ) self.monitor.register( self._metric("completion_tokens"), metric_unit="token", ) self.monitor.register( self._metric("total_tokens"), metric_unit="token", ) ``` modelscope/agentscope/blob/main/src/agentscope/msghub.py: ```py # -*- coding: utf-8 -*- """MsgHub is designed to share messages among a group of agents. 
""" from __future__ import annotations from typing import Any, Optional, Union, Sequence from loguru import logger from agentscope.agents import AgentBase class MsgHubManager: """MsgHub manager class for sharing dialog among a group of agents.""" def __init__( self, participants: Sequence[AgentBase], announcement: Optional[Union[Sequence[dict], dict]] = None, ) -> None: """Initialize a msghub manager from the given arguments. Args: participants (`Sequence[AgentBase]`): The Sequence of participants in the msghub. announcement (`Optional[Union[list[dict], dict]]`, defaults to `None`): The message that will be broadcast to all participants at the first without requiring response. """ self.participants = participants self.announcement = announcement def __enter__(self) -> MsgHubManager: """Will be called when entering the msghub.""" name_participants = [agent.name for agent in self.participants] logger.debug( "Enter msghub with participants: {}", ", ".join( name_participants, ), ) self._reset_audience() # broadcast the input message to all participants if self.announcement is not None: for agent in self.participants: agent.observe(self.announcement) return self def __exit__(self, *args: Any, **kwargs: Any) -> None: """Will be called when exiting the msghub.""" for agent in self.participants: agent.clear_audience() def _reset_audience(self) -> None: """Reset the audience for agent in `self.participant`""" for agent in self.participants: agent.reset_audience(self.participants) def add( self, new_participant: Union[Sequence[AgentBase], AgentBase], ) -> None: """Add new participant into this hub""" if isinstance(new_participant, AgentBase): new_participant = [new_participant] for agent in new_participant: if agent not in self.participants: self.participants.append(agent) else: logger.warning( f"Skip adding agent [{agent.name}] for it has " "already joined in.", ) self._reset_audience() def delete( self, participant: Union[Sequence[AgentBase], AgentBase], ) -> None: """Delete agents from participant.""" if isinstance(participant, AgentBase): participant = [participant] for agent in participant: if agent in self.participants: # Clear the audience of the deleted agent firstly agent.clear_audience() # remove agent from self.participant self.participants.pop(self.participants.index(agent)) else: logger.warning( f"Cannot find agent [{agent.name}], skip its" f" deletion.", ) # Remove this agent from the audience of other agents self._reset_audience() def broadcast(self, msg: Union[dict, list[dict]]) -> None: """Broadcast the message to all participants. Args: msg (`Union[dict, list[dict]]`): One or a list of dict messages to broadcast among all participants. """ for agent in self.participants: agent.observe(msg) def msghub( participants: Sequence[AgentBase], announcement: Optional[Union[Sequence[dict], dict]] = None, ) -> MsgHubManager: """msghub is used to share messages among a group of agents. Args: participants (`Sequence[AgentBase]`): A Sequence of participated agents in the msghub. announcement (`Optional[Union[list[dict], dict]]`, defaults to `None`): The message that will be broadcast to all participants at the very beginning without requiring response. Example: In the following code, we create a msghub with three agents, and each message output by `agent1`, `agent2`, `agent3` will be passed to all other agents, that's what we mean msghub. .. 
code-block:: python with msghub(participant=[agent1, agent2, agent3]): agent1() agent2() Actually, it has the same effect as the following code, but much more easy and elegant! .. code-block:: python x1 = agent1() agent2.observe(x1) agent3.observe(x1) x2 = agent2() agent1.observe(x2) agent3.observe(x2) """ return MsgHubManager(participants, announcement) ``` modelscope/agentscope/blob/main/src/agentscope/parsers/__init__.py: ```py # -*- coding: utf-8 -*- """Model response parser module.""" from .parser_base import ParserBase from .json_object_parser import ( MarkdownJsonObjectParser, MarkdownJsonDictParser, ) from .code_block_parser import MarkdownCodeBlockParser from .regex_tagged_content_parser import RegexTaggedContentParser from .tagged_content_parser import ( TaggedContent, MultiTaggedContentParser, ) __all__ = [ "ParserBase", "MarkdownJsonObjectParser", "MarkdownJsonDictParser", "MarkdownCodeBlockParser", "TaggedContent", "MultiTaggedContentParser", "RegexTaggedContentParser", ] ``` modelscope/agentscope/blob/main/src/agentscope/parsers/code_block_parser.py: ```py # -*- coding: utf-8 -*- """Model response parser class for Markdown code block.""" from typing import Optional from agentscope.models import ModelResponse from agentscope.parsers import ParserBase class MarkdownCodeBlockParser(ParserBase): """The base class for parsing the response text by fenced block.""" name: str = "{language_name} block" """The name of the parser.""" tag_begin: str = "```{language_name}" """The beginning tag.""" content_hint: str = "${{your_{language_name}_code}}" """The hint of the content.""" tag_end: str = "```" """The ending tag.""" format_instruction: str = ( "You should generate {language_name} code in a {language_name} fenced " "code block as follows: \n```{language_name}\n" "{content_hint}\n```" ) """The instruction for the format of the code block.""" def __init__( self, language_name: str, content_hint: Optional[str] = None, ) -> None: """Initialize the parser with the language name and the optional content hint. Args: language_name (`str`): The name of the language, which will be used in ```{language_name} content_hint (`Optional[str]`, defaults to `None`): The hint used to remind LLM what should be fill between the tags. If not provided, the default content hint "${{your_{language_name}_code}}" will be used. """ self.name = self.name.format(language_name=language_name) self.tag_begin = self.tag_begin.format(language_name=language_name) if content_hint is None: self.content_hint = f"${{your_{language_name}_code}}" else: self.content_hint = content_hint self.format_instruction = self.format_instruction.format( language_name=language_name, content_hint=self.content_hint, ).strip() def parse(self, response: ModelResponse) -> ModelResponse: """Extract the content between the tag_begin and tag_end in the response and store it in the parsed field of the response object. 
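Example:
    An illustrative sketch, where `response` is a `ModelResponse` whose
    text contains a fenced python code block.

    .. code-block:: python

        parser = MarkdownCodeBlockParser(language_name="python")
        # parser.format_instruction can be appended to the prompt to tell
        # the model which kind of fenced block to produce.
        parsed_response = parser.parse(response)
        code = parsed_response.parsed  # the text between the two fences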
""" extract_text = self._extract_first_content_by_tag( response, self.tag_begin, self.tag_end, ) response.parsed = extract_text return response ``` modelscope/agentscope/blob/main/src/agentscope/parsers/json_object_parser.py: ```py # -*- coding: utf-8 -*- """The parser for JSON object in the model response.""" import inspect import json from copy import deepcopy from typing import Optional, Any, List, Sequence, Union from loguru import logger from pydantic import BaseModel from agentscope.exception import ( TagNotFoundError, JsonParsingError, JsonTypeError, RequiredFieldNotFoundError, ) from agentscope.models import ModelResponse from agentscope.parsers import ParserBase from agentscope.parsers.parser_base import DictFilterMixin from agentscope.utils.tools import _join_str_with_comma_and class MarkdownJsonObjectParser(ParserBase): """A parser to parse the response text to a json object.""" name: str = "json block" """The name of the parser.""" tag_begin: str = "```json" """Opening tag for a code block.""" content_hint: str = "{your_json_object}" """The hint of the content.""" tag_end: str = "```" """Closing end for a code block.""" _format_instruction = ( "You should respond a json object in a json fenced code block as " "follows:\n```json\n{content_hint}\n```" ) """The instruction for the format of the json object.""" def __init__(self, content_hint: Optional[Any] = None) -> None: """Initialize the parser with the content hint. Args: content_hint (`Optional[Any]`, defaults to `None`): The hint used to remind LLM what should be fill between the tags. If it is a string, it will be used as the content hint directly. If it is a dict, it will be converted to a json string and used as the content hint. """ if content_hint is not None: if isinstance(content_hint, str): self.content_hint = content_hint else: self.content_hint = json.dumps( content_hint, ensure_ascii=False, ) def parse(self, response: ModelResponse) -> ModelResponse: """Parse the response text to a json object, and fill it in the parsed field in the response object.""" # extract the content and try to fix the missing tags by hand try: extract_text = self._extract_first_content_by_tag( response, self.tag_begin, self.tag_end, ) except TagNotFoundError as e: # Try to fix the missing tag error by adding the tag try: response_copy = deepcopy(response) # Fix the missing tags if e.missing_begin_tag: response_copy.text = ( self.tag_begin + "\n" + response_copy.text ) if e.missing_end_tag: response_copy.text = response_copy.text + self.tag_end # Try again to extract the content extract_text = self._extract_first_content_by_tag( response_copy, self.tag_begin, self.tag_end, ) # replace the response with the fixed one response.text = response_copy.text logger.debug("Fix the missing tags by adding them manually.") except TagNotFoundError: # Raise the original error if the missing tags cannot be fixed raise e from None # Parse the content into JSON object try: parsed_json = json.loads(extract_text) response.parsed = parsed_json return response except json.decoder.JSONDecodeError as e: raw_response = f"{self.tag_begin}{extract_text}{self.tag_end}" raise JsonParsingError( f"The content between {self.tag_begin} and {self.tag_end} " f"MUST be a JSON object." f'When parsing "{raw_response}", an error occurred: {e}', raw_response=raw_response, ) from None @property def format_instruction(self) -> str: """Get the format instruction for the json object, if the format_example is provided, it will be used as the example. 
""" return self._format_instruction.format( content_hint=self.content_hint, ) class MarkdownJsonDictParser(MarkdownJsonObjectParser, DictFilterMixin): """A class used to parse a JSON dictionary object in a markdown fenced code""" name: str = "json block" """The name of the parser.""" tag_begin: str = "```json" """Opening tag for a code block.""" content_hint: str = "{your_json_dictionary}" """The hint of the content.""" tag_end: str = "```" """Closing end for a code block.""" _format_instruction = ( "Respond a JSON dictionary in a markdown's fenced code block as " "follows:\n```json\n{content_hint}\n```" ) """The instruction for the format of the json object.""" _format_instruction_with_schema = ( "Respond a JSON dictionary in a markdown's fenced code block as " "follows:\n" "```json\n" "{content_hint}\n" "```\n" "The generated JSON dictionary MUST follow this schema: \n" "{schema}" ) """The schema instruction for the format of the json object.""" required_keys: List[str] """A list of required keys in the JSON dictionary object. If the response misses any of the required keys, it will raise a RequiredFieldNotFoundError.""" def __init__( self, content_hint: Optional[Any] = None, required_keys: List[str] = None, keys_to_memory: Optional[Union[str, bool, Sequence[str]]] = True, keys_to_content: Optional[Union[str, bool, Sequence[str]]] = True, keys_to_metadata: Optional[Union[str, bool, Sequence[str]]] = False, ) -> None: """Initialize the parser with the content hint. Args: content_hint (`Optional[Any]`, defaults to `None`): The hint used to remind LLM what should be fill between the tags. If it is a string, it will be used as the content hint directly. If it is a dict, it will be converted to a json string and used as the content hint. If it's a Pydantic model, the schema will be displayed in the instruction. required_keys (`List[str]`, defaults to `[]`): A list of required keys in the JSON dictionary object. If the response misses any of the required keys, it will raise a RequiredFieldNotFoundError. keys_to_memory (`Optional[Union[str, bool, Sequence[str]]]`, defaults to `True`): The key or keys to be filtered in `to_memory` method. If it's - `False`, `None` will be returned in the `to_memory` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned keys_to_content (`Optional[Union[str, bool, Sequence[str]]]`, defaults to `True`): The key or keys to be filtered in `to_content` method. If it's - `False`, `None` will be returned in the `to_content` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned keys_to_metadata (`Optional[Union[str, bool, Sequence[str]]`, defaults to `False`): The key or keys to be filtered in `to_metadata` method. 
If it's - `False`, `None` will be returned in the `to_metadata` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned """ self.pydantic_class = None # Initialize the content_hint according to the type of content_hint if inspect.isclass(content_hint) and issubclass( content_hint, BaseModel, ): self.pydantic_class = content_hint self.content_hint = "{a_JSON_dictionary}" elif content_hint is not None: if isinstance(content_hint, str): self.content_hint = content_hint else: self.content_hint = json.dumps( content_hint, ensure_ascii=False, ) # Initialize the mixin class to allow filtering the parsed response DictFilterMixin.__init__( self, keys_to_memory=keys_to_memory, keys_to_content=keys_to_content, keys_to_metadata=keys_to_metadata, ) self.required_keys = required_keys or [] @property def format_instruction(self) -> str: """Get the format instruction for the json object, if the format_example is provided, it will be used as the example. """ if self.pydantic_class is None: return self._format_instruction.format( content_hint=self.content_hint, ) else: return self._format_instruction_with_schema.format( content_hint=self.content_hint, schema=self.pydantic_class.model_json_schema(), ) def parse(self, response: ModelResponse) -> ModelResponse: """Parse the text field of the response to a JSON dictionary object, store it in the parsed field of the response object, and check if the required keys exists. """ # Parse the JSON object response = super().parse(response) if not isinstance(response.parsed, dict): # If not a dictionary, raise an error raise JsonTypeError( "A JSON dictionary object is wanted, " f"but got {type(response.parsed)} instead.", response.text, ) # Requirement checking by Pydantic if self.pydantic_class is not None: try: response.parsed = dict(self.pydantic_class(**response.parsed)) except Exception as e: raise JsonParsingError( message=str(e), raw_response=response.text, ) from None # Check if the required keys exist keys_missing = [] for key in self.required_keys: if key not in response.parsed: keys_missing.append(key) if len(keys_missing) != 0: raise RequiredFieldNotFoundError( f"Missing required " f"field{'' if len(keys_missing)==1 else 's'} " f"{_join_str_with_comma_and(keys_missing)} in the JSON " f"dictionary object.", response.text, ) return response ``` modelscope/agentscope/blob/main/src/agentscope/parsers/parser_base.py: ```py # -*- coding: utf-8 -*- """The base class for model response parser.""" from abc import ABC, abstractmethod from typing import Union, Sequence from loguru import logger from agentscope.exception import TagNotFoundError from agentscope.models import ModelResponse # TODO: Support one-time warning in logger rather than setting global variable _FIRST_TIME_TO_REPORT_CONTENT = True _FIRST_TIME_TO_REPORT_MEMORY = True class ParserBase(ABC): """The base class for model response parser.""" @abstractmethod def parse(self, response: ModelResponse) -> ModelResponse: """Parse the response text to a specific object, and stored in the parsed field of the response object.""" def _extract_first_content_by_tag( self, response: ModelResponse, tag_start: str, tag_end: str, ) -> str: """Extract the first text content between the tag_start and tag_end in the response text. Note this function does not support nested. Args: response (`ModelResponse`): The response object. tag_start (`str`): The start tag. tag_end (`str`): The end tag. 
Returns: `str`: The extracted text content. """ text = response.text index_start = text.find(tag_start) # Avoid the case that tag_begin contains tag_end, e.g. ```json and ``` if index_start == -1: index_end = text.find(tag_end, 0) else: index_end = text.find(tag_end, index_start + len(tag_start)) if index_start == -1 or index_end == -1: missing_tags = [] if index_start == -1: missing_tags.append(tag_start) if index_end == -1: missing_tags.append(tag_end) raise TagNotFoundError( f"Missing " f"tag{'' if len(missing_tags)==1 else 's'} " f"{' and '.join(missing_tags)} in response: {text}", raw_response=text, missing_begin_tag=index_start == -1, missing_end_tag=index_end == -1, ) extract_text = text[ index_start + len(tag_start) : index_end # noqa: E203 ] return extract_text class DictFilterMixin: """A mixin class to filter the parsed response by keys. It allows users to set keys to be filtered during speaking, storing in memory, and returning in the agent reply function. """ def __init__( self, keys_to_memory: Union[str, bool, Sequence[str]], keys_to_content: Union[str, bool, Sequence[str]], keys_to_metadata: Union[str, bool, Sequence[str]], ) -> None: """Initialize the mixin class with the keys to be filtered during speaking, storing in memory, and returning in the agent reply function. Args: keys_to_memory (`Optional[Union[str, bool, Sequence[str]]]`): The key or keys to be filtered in `to_memory` method. If it's - `False`, `None` will be returned in the `to_memory` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned keys_to_content (`Optional[Union[str, bool, Sequence[str]]`): The key or keys to be filtered in `to_content` method. If it's - `False`, `None` will be returned in the `to_content` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned keys_to_metadata (`Optional[Union[str, bool, Sequence[str]]]`): The key or keys to be filtered in `to_metadata` method. If it's - `False`, `None` will be returned in the `to_metadata` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned """ self.keys_to_memory = keys_to_memory self.keys_to_content = keys_to_content self.keys_to_metadata = keys_to_metadata def to_memory( self, parsed_response: dict, allow_missing: bool = False, ) -> Union[str, dict, None]: """Filter the fields that will be stored in memory.""" return self._filter_content_by_names( parsed_response, self.keys_to_memory, allow_missing=allow_missing, ) def to_content( self, parsed_response: dict, allow_missing: bool = False, ) -> Union[str, dict, None]: """Filter the fields that will be fed into the content field in the returned message, which will be exposed to other agents. 
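Example:
    An illustrative sketch of the three filters; in practice the mixin is
    initialized by a parser subclass rather than constructed directly, and
    the field names below are placeholders.

    .. code-block:: python

        mixin = DictFilterMixin(
            keys_to_memory=True,       # keep the whole dict in memory
            keys_to_content="speak",   # expose only "speak" to other agents
            keys_to_metadata=False,    # no metadata
        )
        parsed = {"thought": "internal reasoning", "speak": "Hi there!"}
        mixin.to_memory(parsed)    # -> the whole dict
        mixin.to_content(parsed)   # -> "Hi there!"
        mixin.to_metadata(parsed)  # -> None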
""" return self._filter_content_by_names( parsed_response, self.keys_to_content, allow_missing=allow_missing, ) def to_metadata( self, parsed_response: dict, allow_missing: bool = False, ) -> Union[str, dict, None]: """Filter the fields that will be fed into the returned message directly to control the application workflow.""" return self._filter_content_by_names( parsed_response, self.keys_to_metadata, allow_missing=allow_missing, ) def _filter_content_by_names( self, parsed_response: dict, keys: Union[str, bool, Sequence[str]], allow_missing: bool = False, ) -> Union[str, dict, None]: """Filter the parsed response by keys. If only one key is provided, the returned content will be a single corresponding value. Otherwise, the returned content will be a dictionary with the filtered keys and their corresponding values. Args: keys (`Union[str, bool, Sequence[str]]`): The key or keys to be filtered. If it's - `False`, `None` will be returned in the `to_content` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned allow_missing (`bool`, defaults to `False`): Whether to allow missing keys in the response. If set to `True`, the method will skip the missing keys in the response. Otherwise, it will raise a `ValueError` when a key is missing. Returns: `Union[str, dict]`: The filtered content. """ if isinstance(keys, bool): if keys: return parsed_response else: return None if isinstance(keys, str): return parsed_response[keys] # check if the required names are in the response for name in keys: if name not in parsed_response: if allow_missing: logger.warning( f"Content name {name} not found in the response. Skip " f"it.", ) else: raise ValueError(f"Name {name} not found in the response.") return { name: parsed_response[name] for name in keys if name in parsed_response } ``` modelscope/agentscope/blob/main/src/agentscope/parsers/regex_tagged_content_parser.py: ```py # -*- coding: utf-8 -*- """The parser for dynamic tagged content""" import json import re from typing import Union, Sequence, Optional, List from loguru import logger from ..exception import TagNotFoundError from ..models import ModelResponse from ..parsers import ParserBase from ..parsers.parser_base import DictFilterMixin class RegexTaggedContentParser(ParserBase, DictFilterMixin): """A regex tagged content parser, which extracts tagged content according to the provided regex pattern. Different from other parsers, this parser allows to extract multiple tagged content without knowing the keys in advance. The parsed result will be a dictionary within the parsed field of the model response. Compared with other parsers, this parser is more flexible and can be used in dynamic scenarios where - the keys are not known in advance - the number of the tagged content is not fixed Note: Without knowing the keys in advance, it's hard to prepare a format instruction template for different scenarios. Therefore, we ask the user to provide the format instruction in the constructor. Of course, the user can construct and manage the prompt by themselves optionally. 
Example: By default, the parser use a regex pattern to extract tagged content with the following format: ``` <{name1}>{content1} <{name2}>{content2} ``` The parser will extract the content as the following dictionary: ``` { "name1": content1, "name2": content2, } ``` """ def __init__( self, tagged_content_pattern: str = r"<(?P[^>]+)>" r"(?P.*?)" r"", format_instruction: Optional[str] = None, try_parse_json: bool = True, required_keys: Optional[List[str]] = None, keys_to_memory: Union[str, bool, Sequence[str]] = True, keys_to_content: Union[str, bool, Sequence[str]] = True, keys_to_metadata: Union[str, bool, Sequence[str]] = False, ) -> None: """Initialize the regex tagged content parser. Args: tagged_content_pattern (`Optional[str]`, defaults to `"<(?P[^>]+)>(?P.*?)"`): The regex pattern to extract tagged content. The pattern should contain two named groups: `name` and `content`. The `name` group is used as the key of the tagged content, and the `content` group is used as the value. format_instruction (`Optional[str]`, defaults to `None`): The instruction for the format of the tagged content, which will be attached to the end of the prompt messages to remind the LLM to follow the format. try_parse_json (`bool`, defaults to `True`): Whether to try to parse the tagged content as JSON. Note the parsing function won't raise exceptions. required_keys (`Optional[List[str]]`, defaults to `None`): The keys that are required in the tagged content. keys_to_memory (`Union[str, bool, Sequence[str]]`, defaults to `True`): The keys to save to memory. keys_to_content (`Union[str, bool, Sequence[str]]`, defaults to `True`): The keys to save to content. keys_to_metadata (`Union[str, bool, Sequence[str]]`, defaults to `False`): The key or keys to be filtered in `to_metadata` method. If it's - `False`, `None` will be returned in the `to_metadata` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned """ DictFilterMixin.__init__( self, keys_to_memory=keys_to_memory, keys_to_content=keys_to_content, keys_to_metadata=keys_to_metadata, ) assert ( "" in tagged_content_pattern ), "The tagged content pattern should contain a named group 'name'." assert ( "" in tagged_content_pattern ), "The tagged content pattern should contain a named group 'content'." self.tagged_content_pattern = tagged_content_pattern self._format_instruction = format_instruction self.try_parse_json = try_parse_json self.required_keys = required_keys or [] @property def format_instruction(self) -> str: """The format instruction for the tagged content.""" if self._format_instruction is None: raise ValueError( "The format instruction is not provided. Please provide it in " "the constructor of the parser.", ) return self._format_instruction def parse(self, response: ModelResponse) -> ModelResponse: """Parse the response text by the regex pattern, and return a dict of the content in the parsed field of the response. Args: response (`ModelResponse`): The response to be parsed. Returns: `ModelResponse`: The response with the parsed field as the parsed result. """ assert response.text is not None, "The response text is None." 
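# Scan the whole response text with the configured pattern (re.DOTALL lets
# the "content" group span multiple lines); every match becomes one entry
# in the result dictionary, keyed by its "name" group.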
matches = re.finditer( self.tagged_content_pattern, response.text, flags=re.DOTALL, ) results = {} for match in matches: results[match.group("name")] = match.group("content") keys_missing = [ key for key in self.required_keys if key not in results ] if len(keys_missing) > 0: raise TagNotFoundError( f"Failed to find tags: {', '.join(keys_missing)}", response.text, ) if self.try_parse_json: keys_failed = [] for key in results: try: results[key] = json.loads(results[key]) except json.JSONDecodeError: keys_failed.append(key) logger.debug( f'Failed to parse JSON for keys: {", ".join(keys_failed)}', ) response.parsed = results return response ``` modelscope/agentscope/blob/main/src/agentscope/parsers/tagged_content_parser.py: ```py # -*- coding: utf-8 -*- """The parser for tagged content in the model response.""" import json from typing import Union, Sequence, Optional, List from agentscope.exception import JsonParsingError, TagNotFoundError from agentscope.models import ModelResponse from agentscope.parsers import ParserBase from agentscope.parsers.parser_base import DictFilterMixin class TaggedContent: """A tagged content object to store the tag name, tag begin, content hint and tag end.""" name: str """The name of the tagged content, which will be used as the key in extracted dictionary.""" tag_begin: str """The beginning tag.""" content_hint: str """The hint of the content.""" tag_end: str """The ending tag.""" parse_json: bool """Whether to parse the content as a json object.""" def __init__( self, name: str, tag_begin: str, content_hint: str, tag_end: str, parse_json: bool = False, ) -> None: """Initialize the tagged content object. Args: name (`str`): The name of the tagged content. tag_begin (`str`): The beginning tag. content_hint (`str`): The hint of the content. tag_end (`str`): The ending tag. parse_json (`bool`, defaults to `False`): Whether to parse the content as a json object. """ self.name = name self.tag_begin = tag_begin self.content_hint = content_hint self.tag_end = tag_end self.parse_json = parse_json def __str__(self) -> str: """Return the tagged content as a string.""" return f"{self.tag_begin}{self.content_hint}{self.tag_end}" class MultiTaggedContentParser(ParserBase, DictFilterMixin): """Parse response text by multiple tags, and return a dict of their content. Asking llm to generate JSON dictionary object directly maybe not a good idea due to involving escape characters and other issues. So we can ask llm to generate text with tags, and then parse the text to get the final JSON dictionary object. """ format_instruction = ( "Respond with specific tags as outlined below{json_required_hint}\n" "{tag_lines_format}" ) """The instruction for the format of the tagged content.""" json_required_hint = ", and the content between {} MUST be a JSON object:" """If a tagged content is required to be a JSON object by `parse_json` equals to `True`, this instruction will be used to remind the model to generate JSON object.""" def __init__( self, *tagged_contents: TaggedContent, keys_to_memory: Optional[Union[str, bool, Sequence[str]]] = True, keys_to_content: Optional[Union[str, bool, Sequence[str]]] = True, keys_to_metadata: Optional[Union[str, bool, Sequence[str]]] = False, keys_allow_missing: Optional[List[str]] = None, ) -> None: """Initialize the parser with tags. Args: *tagged_contents (`dict[str, Tuple[str, str]]`): Multiple TaggedContent objects, each object contains the tag name, tag begin, content hint and tag end. The name will be used as the key in the extracted dictionary. 
required_keys (`Optional[List[str]]`, defaults to `None`): A list of required keys_to_memory (`Optional[Union[str, bool, Sequence[str]]]`, defaults to `True`): The key or keys to be filtered in `to_memory` method. If it's - `False`, `None` will be returned in the `to_memory` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned keys_to_content (`Optional[Union[str, bool, Sequence[str]]`, defaults to `True`): The key or keys to be filtered in `to_content` method. If it's - `False`, `None` will be returned in the `to_content` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned keys_to_metadata (`Optional[Union[str, bool, Sequence[str]]]`, defaults to `False`): The key or keys to be filtered in `to_metadata` method. If it's - `False`, `None` will be returned in the `to_metadata` method - `str`, the corresponding value will be returned - `List[str]`, a filtered dictionary will be returned - `True`, the whole dictionary will be returned keys_allow_missing (`Optional[List[str]]`, defaults to `None`): A list of keys that are allowed to be missing in the response. """ # Initialize the mixin class DictFilterMixin.__init__( self, keys_to_memory=keys_to_memory, keys_to_content=keys_to_content, keys_to_metadata=keys_to_metadata, ) self.keys_allow_missing = keys_allow_missing self.tagged_contents = list(tagged_contents) # Prepare the format instruction according to the tagged contents tag_lines = "\n".join([str(_) for _ in tagged_contents]) # Prepare hint for the tagged contents that requires a JSON object. json_required_tags = ", ".join( [ f"{_.tag_begin} and {_.tag_end}" for _ in tagged_contents if _.parse_json ], ) if json_required_tags != "": json_required_hint = self.json_required_hint.format( json_required_tags, ) else: json_required_hint = ": " self.format_instruction = self.format_instruction.format( json_required_hint=json_required_hint, tag_lines_format=tag_lines, ) def parse(self, response: ModelResponse) -> ModelResponse: """Parse the response text by tags, and return a dict of their content in the parsed field of the model response object. If the tagged content requires to parse as a JSON object by `parse_json` equals to `True`, it will be parsed as a JSON object by `json.loads`.""" tag_to_content = {} for tagged_content in self.tagged_contents: tag_begin = tagged_content.tag_begin tag_end = tagged_content.tag_end try: extract_content = self._extract_first_content_by_tag( response, tag_begin, tag_end, ) if tagged_content.parse_json: try: extract_content = json.loads(extract_content) except json.decoder.JSONDecodeError as e: raw_response = f"{tag_begin}{extract_content}{tag_end}" raise JsonParsingError( f"The content between " f"{tagged_content.tag_begin} and " f"{tagged_content.tag_end} should be a JSON " f'object. An error "{e}" occurred when parsing: ' f"{raw_response}", raw_response=raw_response, ) from None tag_to_content[tagged_content.name] = extract_content except TagNotFoundError as e: # if the key is allowed to be missing, skip the error if ( self.keys_allow_missing is not None and tagged_content.name in self.keys_allow_missing ): continue raise e from None response.parsed = tag_to_content return response ``` modelscope/agentscope/blob/main/src/agentscope/pipelines/__init__.py: ```py # -*- coding: utf-8 -*- """ Import all pipeline related modules in the package. 
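Besides the class-based pipelines, the functional counterparts can be called
directly. An illustrative sketch (the two agents and the condition are
placeholders):

.. code-block:: python

    from agentscope.pipelines import ifelsepipeline

    x = ifelsepipeline(
        condition_func=lambda msg: "code" in msg["content"],
        if_body_operators=coder_agent,
        else_body_operators=reviewer_agent,
        x=x,
    )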
""" from .pipeline import ( PipelineBase, SequentialPipeline, IfElsePipeline, SwitchPipeline, ForLoopPipeline, WhileLoopPipeline, ) from .functional import ( sequentialpipeline, ifelsepipeline, switchpipeline, forlooppipeline, whilelooppipeline, ) __all__ = [ "PipelineBase", "SequentialPipeline", "IfElsePipeline", "SwitchPipeline", "ForLoopPipeline", "WhileLoopPipeline", "sequentialpipeline", "ifelsepipeline", "switchpipeline", "forlooppipeline", "whilelooppipeline", ] ``` modelscope/agentscope/blob/main/src/agentscope/pipelines/functional.py: ```py # -*- coding: utf-8 -*- """ Functional counterpart for Pipeline """ from typing import ( Callable, Sequence, Optional, Union, Any, Mapping, ) from ..agents.operator import Operator # A single Operator or a Sequence of Operators Operators = Union[Operator, Sequence[Operator]] def placeholder(x: dict = None) -> dict: r"""A placeholder that do nothing. Acts as a placeholder in branches that do not require any operations in flow control like if-else/switch """ return x def sequentialpipeline( operators: Sequence[Operator], x: Optional[dict] = None, ) -> dict: """Functional version of SequentialPipeline. Args: operators (`Sequence[Operator]`): Participating operators. x (`Optional[dict]`, defaults to `None`): The input dictionary. Returns: `dict`: the output dictionary. """ if len(operators) == 0: raise ValueError("No operators provided.") msg = operators[0](x) for operator in operators[1:]: msg = operator(msg) return msg def _operators(operators: Operators, x: Optional[dict] = None) -> dict: """Syntactic sugar for executing a single operator or a sequence of operators.""" if isinstance(operators, Sequence): return sequentialpipeline(operators, x) else: return operators(x) def ifelsepipeline( condition_func: Callable, if_body_operators: Operators, else_body_operators: Operators = placeholder, x: Optional[dict] = None, ) -> dict: """Functional version of IfElsePipeline. Args: condition_func (`Callable`): A function that determines whether to exeucte `if_body_operator` or `else_body_operator` based on x. if_body_operator (`Operators`): Operators executed when `condition_func` returns True. else_body_operator (`Operators`, defaults to `placeholder`): Operators executed when condition_func returns False, does nothing and just return the input by default. x (`Optional[dict]`, defaults to `None`): The input dictionary. Returns: `dict`: the output dictionary. """ if condition_func(x): return _operators(if_body_operators, x) else: return _operators(else_body_operators, x) def switchpipeline( condition_func: Callable[[Any], Any], case_operators: Mapping[Any, Operators], default_operators: Operators = placeholder, x: Optional[dict] = None, ) -> dict: """Functional version of SwitchPipeline. Args: condition_func (`Callable[[Any], Any]`): A function that determines which case_operator to execute based on the input x. case_operators (`Mapping[Any, Operator]`): A dictionary containing multiple operators and their corresponding trigger conditions. default_operators (`Operators`, defaults to `placeholder`): Operators that are executed when the actual condition do not meet any of the case_operators, does nothing and just return the input by default. x (`Optional[dict]`, defaults to `None`): The input dictionary. Returns: dict: the output dictionary. 
""" target_case = condition_func(x) if target_case in case_operators: return _operators(case_operators[target_case], x) else: return _operators(default_operators, x) def forlooppipeline( loop_body_operators: Operators, max_loop: int, break_func: Callable[[dict], bool] = lambda _: False, x: Optional[dict] = None, ) -> dict: """Functional version of ForLoopPipeline. Args: loop_body_operators (`Operators`): Operators executed as the body of the loop. max_loop (`int`): maximum number of loop executions. break_func (`Callable[[dict], bool]`): A function used to determine whether to break out of the loop based on the output of the loop_body_operator, defaults to `lambda _: False` x (`Optional[dict]`, defaults to `None`): The input dictionary. Returns: `dict`: The output dictionary. """ for _ in range(max_loop): # loop body x = _operators(loop_body_operators, x) # check condition if break_func(x): break return x # type: ignore[return-value] def whilelooppipeline( loop_body_operators: Operators, condition_func: Callable[[int, Any], bool] = lambda _, __: False, x: Optional[dict] = None, ) -> dict: """Functional version of WhileLoopPipeline. Args: loop_body_operators (`Operators`): Operators executed as the body of the loop. condition_func (`Callable[[int, Any], bool]`, optional): A function that determines whether to continue executing the loop body based on the current loop number and output of the loop_body_operator, defaults to `lambda _,__: False` x (`Optional[dict]`, defaults to `None`): The input dictionary. Returns: `dict`: the output dictionary. """ i = 0 while condition_func(i, x): # loop body x = _operators(loop_body_operators, x) # check condition i += 1 return x # type: ignore[return-value] ``` modelscope/agentscope/blob/main/src/agentscope/pipelines/pipeline.py: ```py # -*- coding: utf-8 -*- """ Base class for Pipeline """ from typing import Callable, Sequence from typing import Any from typing import List from typing import Mapping from typing import Optional from abc import abstractmethod from .functional import ( Operators, placeholder, sequentialpipeline, ifelsepipeline, switchpipeline, forlooppipeline, whilelooppipeline, ) from ..agents.operator import Operator class PipelineBase(Operator): r"""Base interface of all pipelines. The pipeline is a special kind of operator that includes multiple operators and the interaction logic among them. """ def __init__(self) -> None: self.participants: List[Any] = [] @abstractmethod def __call__(self, x: Optional[dict] = None) -> dict: """Define the actions taken by this pipeline. Args: x (Optional[`dict`], optional): Dialog history and some environment information Returns: `dict`: The pipeline's response to the input. """ class IfElsePipeline(PipelineBase): r"""A template pipeline for implementing control flow like if-else. IfElsePipeline(condition_func, if_body_operators, else_body_operators) represents the following workflow:: if condition_func(x): if_body_operators(x) else: else_body_operators(x) """ def __init__( self, condition_func: Callable[[dict], bool], if_body_operators: Operators, else_body_operators: Operators = placeholder, ) -> None: r"""Initialize an IfElsePipeline. Args: condition_func (`Callable[[dict], bool]`): A function that determines whether to execute if_body_operators or else_body_operators based on the input x. if_body_operators (`Operators`): Operators executed when condition_func returns True. 
else_body_operators (`Operators`): Operators executed when condition_func returns False, does nothing and just return the input by default. """ self.condition_func = condition_func self.if_body_operator = if_body_operators self.else_body_operator = else_body_operators self.participants = [self.if_body_operator] + [self.else_body_operator] def __call__(self, x: Optional[dict] = None) -> dict: return ifelsepipeline( condition_func=self.condition_func, if_body_operators=self.if_body_operator, else_body_operators=self.else_body_operator, x=x, ) class SwitchPipeline(PipelineBase): r"""A template pipeline for implementing control flow like switch-case. SwitchPipeline(condition_func, case_operators, default_operators) represents the following workflow:: switch condition_func(x): case k1: return case_operators[k1](x) case k2: return case_operators[k2](x) ... default: return default_operators(x) """ def __init__( self, condition_func: Callable[[dict], Any], case_operators: Mapping[Any, Operators], default_operators: Operators = placeholder, ) -> None: """Initialize a SwitchPipeline. Args: condition_func (`Callable[[dict], Any]`): A function that determines which case_operator to execute based on the input x. case_operators (`dict[Any, Operators]`): A dictionary containing multiple operators and their corresponding trigger conditions. default_operators (`Operators`, defaults to `placeholder`): Operators that are executed when the actual condition do not meet any of the case_operators, does nothing and just return the input by default. """ self.condition_func = condition_func self.case_operators = case_operators self.default_operators = default_operators self.participants = list(self.case_operators.values()) + [ self.default_operators, ] def __call__(self, x: Optional[dict] = None) -> dict: return switchpipeline( condition_func=self.condition_func, case_operators=self.case_operators, default_operators=self.default_operators, x=x, ) class ForLoopPipeline(PipelineBase): r"""A template pipeline for implementing control flow like for-loop ForLoopPipeline(loop_body_operators, max_loop) represents the following workflow:: for i in range(max_loop): x = loop_body_operators(x) ForLoopPipeline(loop_body_operators, max_loop, break_func) represents the following workflow:: for i in range(max_loop): x = loop_body_operators(x) if break_func(x): break """ def __init__( self, loop_body_operators: Operators, max_loop: int, break_func: Callable[[dict], bool] = lambda _: False, ): r"""Initialize a ForLoopPipeline. Args: loop_body_operators (`Operators`): Operators executed as the body of the loop. max_loop (`int`): Maximum number of loop executions. break_func (`Callable[[dict], bool]`, defaults to `lambda _: False`): A function used to determine whether to break out of the loop based on the output of the loop_body_operators. 
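# --- Editor's note (not part of the original file): a hedged sketch of the
# class-based SwitchPipeline shown above. The routing keys, the agents, and
# the "my_config" model config are assumptions for illustration.
from agentscope.agents import DialogAgent
from agentscope.message import Msg
from agentscope.pipelines import SwitchPipeline

coder = DialogAgent(name="coder", model_config_name="my_config",
                    sys_prompt="You answer programming questions.")
chatter = DialogAgent(name="chatter", model_config_name="my_config",
                      sys_prompt="You handle general conversation.")

# Route each message by a keyword check; keys that match no case would fall
# through to `default_operators`, which is `placeholder` by default.
switch = SwitchPipeline(
    condition_func=lambda msg: "code" if "code" in str(msg.content).lower() else "chat",
    case_operators={"code": coder, "chat": chatter},
)
x = switch(Msg("user", "Can you review this code snippet?", "user"))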
""" self.loop_body_operators = loop_body_operators self.max_loop = max_loop self.break_func = break_func self.participants = [self.loop_body_operators] def __call__(self, x: Optional[dict] = None) -> dict: return forlooppipeline( loop_body_operators=self.loop_body_operators, max_loop=self.max_loop, break_func=self.break_func, x=x, ) class WhileLoopPipeline(PipelineBase): r"""A template pipeline for implementing control flow like while-loop WhileLoopPipeline(loop_body_operators, condition_operator, condition_func) represents the following workflow:: i = 0 while (condition_func(i, x)) x = loop_body_operators(x) i += 1 """ def __init__( self, loop_body_operators: Operators, condition_func: Callable[[int, dict], bool] = lambda _, __: False, ): """Initialize a WhileLoopPipeline. Args: loop_body_operators (`Operators`): Operators executed as the body of the loop. condition_func (`Callable[[int, dict], bool]`, defaults to `lambda _, __: False`): A function that determines whether to continue executing the loop body based on the current loop number and output of the `loop_body_operator` """ self.condition_func = condition_func self.loop_body_operators = loop_body_operators self.participants = [self.loop_body_operators] def __call__(self, x: Optional[dict] = None) -> dict: return whilelooppipeline( loop_body_operators=self.loop_body_operators, condition_func=self.condition_func, x=x, ) class SequentialPipeline(PipelineBase): r"""A template pipeline for implementing sequential logic. Sequential(operators) represents the following workflow:: x = operators[0](x) x = operators[1](x) ... x = operators[n](x) """ def __init__(self, operators: Sequence[Operator]) -> None: r"""Initialize a Sequential pipeline. Args: operators (`Sequence[Operator]`): A Sequence of operators to be executed sequentially. """ self.operators = operators self.participants = list(self.operators) def __call__(self, x: Optional[dict] = None) -> dict: return sequentialpipeline(operators=self.operators, x=x) ``` modelscope/agentscope/blob/main/src/agentscope/prompt/__init__.py: ```py # -*- coding: utf-8 -*- """ Import all prompt related modules in the package. """ from ._prompt_generator_base import SystemPromptGeneratorBase from ._prompt_generator_zh import ChineseSystemPromptGenerator from ._prompt_generator_en import EnglishSystemPromptGenerator from ._prompt_comparer import SystemPromptComparer from ._prompt_optimizer import SystemPromptOptimizer from ._prompt_engine import PromptEngine __all__ = [ "PromptEngine", "SystemPromptGeneratorBase", "ChineseSystemPromptGenerator", "EnglishSystemPromptGenerator", "SystemPromptComparer", "SystemPromptOptimizer", ] ``` modelscope/agentscope/blob/main/src/agentscope/prompt/_prompt_comparer.py: ```py # -*- coding: utf-8 -*- """The Abtest module to show how different system prompt performs""" from typing import List, Optional, Union, Sequence from loguru import logger from agentscope.models import load_model_by_config_name from agentscope.message import Msg from agentscope.agents import UserAgent, AgentBase class _SystemPromptTestAgent(AgentBase): """An agent class used to test the given system prompt.""" def __init__( self, name: str, sys_prompt: str, model_config_name: str, ) -> None: """Init the agent with the given system prompt, model config name, and name. Args: name (`str`): The name of the agent. sys_prompt (`str`): The system prompt to be tested. model_config_name (`str`): The model config name to be used. 
""" super().__init__(name, sys_prompt, model_config_name) self.display = False self.memory.add(Msg("system", self.sys_prompt, "system")) def disable_display(self) -> None: """Disable the display of the output message.""" self.display = False def enable_display(self) -> None: """Enable the display of the output message.""" self.display = True def reply(self, x: Optional[Union[Msg, Sequence[Msg]]] = None) -> Msg: """Reply the message with the given system prompt.""" self.memory.add(x) prompt = self.model.format(self.memory.get_memory()) res = self.model(prompt) msg = Msg(self.name, res.text, "assistant") if self.display: self.speak(msg) self.memory.add(msg) return msg class SystemPromptComparer: """The Abtest module to compare how different system prompts perform with different queries or in a multi-turn dialog.""" def __init__( self, model_config_name: str, compared_system_prompts: List[str], ) -> None: """Init the Abtest module, the model config name, user prompt, and a list of prompt optimization methods or prompts are required. Args: model_config_name (`str`): The model config for the model to be used to generate and compare prompts. compared_system_prompts (`List[str]`): A list of system prompts to be compared in the abtest. """ self.model_config_name = model_config_name self.model = load_model_by_config_name(model_config_name) self.compared_system_prompts = compared_system_prompts # TODO: use distributed agent to accelerate the process self.agents = [ _SystemPromptTestAgent( f"assistant-{index}", sys_prompt=sys_prompt, model_config_name=model_config_name, ) for index, sys_prompt in enumerate(self.compared_system_prompts) ] def _compare_with_query(self, query: str) -> dict: """Infer the query with the given system prompt.""" msg_query = Msg("user", query, "user") msgs_result = [agent(msg_query) for agent in self.agents] results = [] for system_prompt, response_msg in zip( self.compared_system_prompts, msgs_result, ): results.append( { "system_prompt": system_prompt, "response": response_msg.content, }, ) return { "query": query, "results": results, } def _set_display_status(self, status: bool) -> None: """Set the display status of all agents.""" for agent in self.agents: if status: agent.enable_display() else: agent.disable_display() def compare_with_queries(self, queries: List[str]) -> List[dict]: """Compare different system prompts a list of input queries. Args: queries (`List[str]`): A list of input queries that will be used to compare different system prompts. Returns: `List[dict]`: A list of responses of the queries with different system prompts. """ self._set_display_status(False) query_results = [] for index, query in enumerate(queries): # Print the query logger.info(f"## Query {index}:\n{query}") res = self._compare_with_query(query) for index_prompt, _ in enumerate(res["results"]): logger.info( f"### System Prompt {index_prompt}\n" f"```\n" f"{_['system_prompt']}\n" f"```\n" f"\n" f"### Response\n" f"{_['response']}\n", ) query_results.append(res) self._clear_memories() return query_results def compare_in_dialog(self) -> List[dict]: """Compare how different system prompts perform in a multi-turn dialog. Users can press `exit` to exit the dialog. Returns: `List[dict]`: A list of dictionaries, which contains the tested system prompts and the dialog history. 
""" for agent in self.agents: Msg( agent.name, f"My system prompt: ```{agent.sys_prompt}```", "assistant", echo=True, ) print("\n", " Start the dialog, input `exit` to exit ".center(80, "#")) self._set_display_status(True) user_agent = UserAgent() x = None while x is None or x.content != "exit": for agent in self.agents: agent(x) x = user_agent() # Get the dialog history results = [ { "system_prompt": _.sys_prompt, "dialogue_history": _.memory.get_memory(), } for _ in self.agents ] # Clean the memory self._clear_memories() return results def _clear_memories(self) -> None: """Clear the memory of all agents.""" for agent in self.agents: agent.memory.clear() ``` modelscope/agentscope/blob/main/src/agentscope/prompt/_prompt_engine.py: ```py # -*- coding: utf-8 -*- """Prompt engineering module.""" from typing import Any, Optional, Union from enum import IntEnum from loguru import logger from agentscope.models import OpenAIWrapperBase, ModelWrapperBase from agentscope.constants import ShrinkPolicy from agentscope.utils.tools import to_openai_dict, to_dialog_str class PromptType(IntEnum): """Enum for prompt types.""" STRING = 0 LIST = 1 class PromptEngine: """Prompt engineering module for both list and string prompt""" def __init__( self, model: ModelWrapperBase, shrink_policy: ShrinkPolicy = ShrinkPolicy.TRUNCATE, max_length: Optional[int] = None, prompt_type: Optional[PromptType] = None, max_summary_length: int = 200, summarize_model: Optional[ModelWrapperBase] = None, ) -> None: """Init PromptEngine. Args: model (`ModelWrapperBase`): The target model for prompt engineering. shrink_policy (`ShrinkPolicy`, defaults to `ShrinkPolicy.TRUNCATE`): The shrink policy for prompt engineering, defaults to `ShrinkPolicy.TRUNCATE`. max_length (`Optional[int]`, defaults to `None`): The max length of context, if it is None, it will be set to the max length of the model. prompt_type (`Optional[MsgType]`, defaults to `None`): The type of prompt, if it is None, it will be set according to the model. max_summary_length (`int`, defaults to `200`): The max length of summary, if it is None, it will be set to the max length of the model. summarize_model (`Optional[ModelWrapperBase]`, defaults to `None`): The model used for summarization, if it is None, it will be set to `model`. Note: 1. TODO: Shrink function is still under development. 2. If the argument `max_length` and `prompt_type` are not given, they will be set according to the given model. 3. `shrink_policy` is used when the prompt is too long, it can be set to `ShrinkPolicy.TRUNCATE` or `ShrinkPolicy.SUMMARIZE`. a. `ShrinkPolicy.TRUNCATE` will truncate the prompt to the desired length. b. `ShrinkPolicy.SUMMARIZE` will summarize partial of the dialog history to save space. The summarization model defaults to `model` if not given. Example: With prompt engine, we encapsulate different operations for string- and list-style prompt, and block the prompt engineering process from the user. As a user, you can just combine you prompt as follows. .. code-block:: python # prepare the component system_prompt = "You're a helpful assistant ..." hint_prompt = "You should response in Json format." 
prefix = "assistant: " # initialize the prompt engine and join the prompt engine = PromptEngine(model) prompt = engine.join(system_prompt, memory.get_memory(), hint_prompt, prefix) """ self.model = model self.shrink_policy = shrink_policy self.max_length = max_length if prompt_type is None: if isinstance(model, OpenAIWrapperBase): self.prompt_type = PromptType.LIST else: self.prompt_type = PromptType.STRING else: self.prompt_type = prompt_type self.max_summary_length = max_summary_length if summarize_model is None: self.summarize_model = model logger.warning( "The prompt engine will be deprecated in the future. " "Please use the `format` function in model wrapper object " "instead. More details refer to ", "https://modelscope.github.io/agentscope/en/tutorial/206-prompt" ".html", ) def join( self, *args: Any, format_map: Optional[dict] = None, ) -> Union[str, list[dict]]: """Join prompt components according to its type. The join function can accept any number and type of arguments. If prompt type is `PromptType.STRING`, the arguments will be joined by `"\\\\n"`. If prompt type is `PromptType.LIST`, the string arguments will be converted to `Msg` from `system`. """ # TODO: achieve the summarize function # Filter `None` args = [_ for _ in args if _ is not None] if self.prompt_type == PromptType.STRING: return self.join_to_str(*args, format_map=format_map) elif self.prompt_type == PromptType.LIST: return self.join_to_list(*args, format_map=format_map) else: raise RuntimeError("Invalid prompt type.") def join_to_str(self, *args: Any, format_map: Union[dict, None]) -> str: """Join prompt components to a string.""" prompt = [] for item in args: if isinstance(item, list): items_str = self.join_to_str(*item, format_map=None) prompt += [items_str] elif isinstance(item, dict): prompt.append(to_dialog_str(item)) else: prompt.append(str(item)) prompt_str = "\n".join(prompt) if format_map is not None: prompt_str = prompt_str.format_map(format_map) return prompt_str def join_to_list(self, *args: Any, format_map: Union[dict, None]) -> list: """Join prompt components to a list of `Msg` objects.""" prompt = [] for item in args: if isinstance(item, list): # nested processing prompt.extend(self.join_to_list(*item, format_map=None)) elif isinstance(item, dict): prompt.append(to_openai_dict(item)) else: prompt.append(to_openai_dict({"content": str(item)})) if format_map is not None: format_prompt = [] for msg in prompt: format_prompt.append( { k.format_map(format_map): v.format_map(format_map) for k, v in msg.items() }, ) prompt = format_prompt return prompt ``` modelscope/agentscope/blob/main/src/agentscope/prompt/_prompt_generator_base.py: ```py # -*- coding: utf-8 -*- """Basic class for system prompt generator.""" from abc import ABC import random from typing import Any, List, Literal, Optional, Union from loguru import logger from tqdm import tqdm from agentscope.file_manager import file_manager from agentscope.message import Msg from agentscope.models import ( load_model_by_config_name, load_config_by_name, ModelResponse, ) from agentscope.prompt._prompt_utils import _find_top_k_embeddings class _SentencePieceEmbeddingModel: """The wrapper class for the sentence_transformers library. It is used to generate embeddings for the examples locally. Note: To download the model, you need to be accessible to the huggingface. """ def __init__(self, model_name_or_path: str) -> None: """The constructor of the SentencePieceEmbeddingModel. Args: model_name_or_path (`str`): The name or path of the model. 
Full model list refers to https://www.sbert.net/docs/sentence_transformer/pretrained_models.html#original-models """ self.model = None self.model_name_or_path = model_name_or_path def __call__(self, queries: Union[str, List[str]]) -> Any: # Lazy loading the model if self.model is None: try: from sentence_transformers import SentenceTransformer except ImportError as e: raise ImportError( "The sentence-transformers library is required. " "Install it with `pip install sentence-transformers`.", ) from e logger.info( f"Loading local embedding model: {self.model_name_or_path}", ) self.model = SentenceTransformer(self.model_name_or_path) logger.info("Finish loading the local embedding model.") embedding = self.model.encode(queries) return ModelResponse(embedding=[embedding]) class SystemPromptGeneratorBase(ABC): """Base class for system prompt generator, which receives the users' input and returns an optimized system prompt in the `optimize` method.""" def __init__( self, model_config_name: str, meta_prompt: str = None, response_prompt_template: str = None, example_num: int = 0, example_list: Optional[list] = None, example_selection_strategy: Literal["random", "similarity"] = "random", embed_model_config_name: Optional[str] = None, example_prompt_template: Optional[str] = None, local_embedding_model: str = None, ) -> None: """The constructor of the SystemPromptOptimizer, which uses the specified model and meta prompt to optimize the users' system prompt. Args: model_config_name (`str`): The name of the model config, which is used to load the model from the configuration. meta_prompt (`str`): The meta prompt used to optimize the users' system prompt. response_prompt_template (`Optional[str]`): The prompt template used to remind the LLM to generate the optimized system prompt. example_num (`int`): The number of examples that will be attached to the end of the meta prompt. If `0`, no examples will be attached. example_list (`List`): The candidate examples that will be chosen from. AgentScope provides a default list of examples. example_selection_strategy (`Literal["random", "similarity"]`): The strategy used to select examples. embed_model_config_name (`str`): If the example selection method is `"similarity"`, an embedding model config name is required. """ self.model = load_model_by_config_name(model_config_name) self.meta_prompt = meta_prompt self.response_prompt_template = response_prompt_template # example related self.example_num = example_num self.example_list = example_list or [] self.example_selection_strategy = example_selection_strategy self.example_prompt_template = example_prompt_template # assert example_num <= len(example_list) if self.example_num > len(self.example_list): raise ValueError( f"The number of examples to select ({self.example_num}) " f"is larger than the candidate examples provided " f"({len(self.example_list)}).", ) # Used to cache the embeddings of the examples. 
self.embed_model_name = None self.example_embeddings = None self.local_embedding_model = local_embedding_model # Load embed model if needed if ( self.example_num > 0 and self.example_selection_strategy == "similarity" ): if embed_model_config_name is None: logger.info( f"Embedding model config name is not provided, a default " f'local embedding model "{self.local_embedding_model}" ' f"will be used.", ) self.embed_model = _SentencePieceEmbeddingModel( self.local_embedding_model, ) self.embed_model_name = self.local_embedding_model else: self.embed_model = load_model_by_config_name( embed_model_config_name, ) self.embed_model_name = load_config_by_name( embed_model_config_name, ) self.example_embeddings = self._generate_embeddings() def _get_example_prompt(self, examples: List[dict]) -> str: """Get the prompt examples""" examples_prompt = [] for index, example in enumerate(examples): values = {"index": index + 1, **example} examples_prompt.append( self.example_prompt_template.format_map(values), ) return "\n".join(examples_prompt) def _select_example(self, user_prompt: str) -> List: """Select the examples that are most similar with the given user query Args: user_prompt (`str`): The user query used to select the examples. Returns: `List`: The selected examples. """ if self.example_selection_strategy == "random": return self._select_random_example() elif self.example_selection_strategy == "similarity": return self._select_similar_example(user_prompt) else: raise ValueError( f"Invalid example selection method " f"{self.example_selection_strategy}", ) def _select_similar_example(self, user_prompt: str) -> List: """Select the examples using embedding similarity Args: user_prompt (`str`): The user query used to select the examples. Returns: `List`: The selected examples. """ # Get the human query embd using the embedding model human_query_embd = self.embed_model(user_prompt).embedding[0] # TODO: use the retrieval service instead rather than achieving it # locally selected_indices = _find_top_k_embeddings( human_query_embd, self.example_embeddings, self.example_num, ) return [self.example_list[_] for _ in selected_indices] def _select_random_example(self) -> List: """Select the examples randomly.""" return random.sample( self.example_list, self.example_num, ) def _generate_embeddings(self) -> List: """Generate embeddings for the examples.""" example_embeddings = [] for example in tqdm(self.example_list, desc="Generating embeddings"): user_prompt = example["user_prompt"] # Load cached embedding instead of generating them again cached_embedding = file_manager.fetch_cached_text_embedding( text=user_prompt, embedding_model=self.embed_model_name, ) if cached_embedding is None: new_embedding = self.embed_model(user_prompt).embedding[0] example_embeddings.append(new_embedding) # Cache the embedding file_manager.cache_text_embedding( text=user_prompt, embedding=new_embedding, embedding_model=self.embed_model_name, ) else: example_embeddings.append(cached_embedding) return example_embeddings def generate(self, user_input: str) -> str: """Generate (optimized) system prompt according to the user input, which could be a user's system prompt or query. Args: user_input (`str`): The user input, could be user's system prompt or query, e.g. "Write a system prompt for a chatbot". Returns: `str`: The optimized system prompt. 
""" # Select examples examples = self._select_example(user_input) # Format the prompt formatted_prompt = "\n".join( [ self.meta_prompt, self._get_example_prompt(examples), self.response_prompt_template.format(user_prompt=user_input), ], ) prompt = self.model.format( Msg( "user", formatted_prompt, role="user", ), ) # Generate the optimized prompt response = self.model(prompt).text return response ``` modelscope/agentscope/blob/main/src/agentscope/prompt/_prompt_generator_en.py: ```py # -*- coding: utf-8 -*- """Prompt generator class.""" from typing import List, Literal, Optional from ._prompt_generator_base import SystemPromptGeneratorBase from ._prompt_utils import _DEFAULT_EXAMPLE_LIST_EN _DEFAULT_META_PROMPT_EN = """ You are an expert in writing and optimizing system prompts. Your task is to enhance the system prompt provided by the user, ensuring the enhanced prompt includes a description of the agent's role or personality, the agent's skill points, and some constraints. ## Note 1. The optimized system prompt must align with the user's original prompt intent. You may add callable tools, specific keywords, time frames, context, or any additional information to narrow the scope and guide the agent better in completing the task. Reconstruct the user's prompt as necessary. 2. The role and skill point descriptions should not narrow the scope defined by the user's original prompt. 3. Skill point descriptions should be as detailed and accurate as possible. If the user's original prompt includes examples, ensure skill points cover these cases but are not limited to them. For instance, if the original prompt mentions an "exam question generating robot" that can create fill-in-the-blank questions as an example, the skill points in the optimized prompt should include creating exam questions but not be limited to fill-in-the-blank questions. 4. Skill scope should not exceed the large model's capabilities. If it does, specify the tools or knowledge bases needed to endow the model with this skill. For example, since the large model lacks search function, invoke a search tool if searching is required. 5. Output the optimized prompt in markdown format. 6. The prompt must be concise, within 1000 words. 7. Retain the knowledge base or memory section in the optimized prompt if the user's provided prompt includes these. 8. If the prompt contains variables like ${{variable}}, ensure the variable appears only once in the optimized prompt. In subsequent references, use the variable name directly without enclosing it in ${}. 9. The language of the optimized prompt should match the user's original prompt: If the user provides the prompt in Chinese, optimize in Chinese; if in English, optimize in English. """ # noqa _DEFAULT_EXAMPLE_PROMPT_TEMPLATE_EN = """## Example {index} - User's Input: ``` {user_prompt} ``` - Optimized system prompt: ``` {opt_prompt} ``` """ _DEFAULT_RESPONSE_PROMPT_TEMPLATE_EN = """## User's Input ``` {user_prompt} ``` ## Optimized System Prompt """ # The name of the default local embedding model, which is used when # `embed_model_config_name` is not provided. 
_DEFAULT_LOCAL_EMBEDDING_MODEL = "sentence-transformers/all-mpnet-base-v2" class EnglishSystemPromptGenerator(SystemPromptGeneratorBase): """Optimize the users' system prompt with the given meta prompt and examples if provided.""" def __init__( self, model_config_name: str, meta_prompt: str = _DEFAULT_META_PROMPT_EN, response_prompt_template: str = _DEFAULT_RESPONSE_PROMPT_TEMPLATE_EN, example_num: int = 0, example_list: List = _DEFAULT_EXAMPLE_LIST_EN, example_selection_strategy: Literal["random", "similarity"] = "random", example_prompt_template: str = _DEFAULT_EXAMPLE_PROMPT_TEMPLATE_EN, embed_model_config_name: Optional[str] = None, local_embedding_model: str = _DEFAULT_LOCAL_EMBEDDING_MODEL, ): super().__init__( model_config_name=model_config_name, meta_prompt=meta_prompt, response_prompt_template=response_prompt_template, example_num=example_num, example_list=example_list, example_selection_strategy=example_selection_strategy, example_prompt_template=example_prompt_template, embed_model_config_name=embed_model_config_name, local_embedding_model=local_embedding_model, ) ``` modelscope/agentscope/blob/main/src/agentscope/prompt/_prompt_generator_zh.py: ```py # -*- coding: utf-8 -*- """Prompt generator class.""" from typing import List, Literal, Optional from ._prompt_generator_base import SystemPromptGeneratorBase from ._prompt_utils import _DEFAULT_EXAMPLE_LIST_ZH _DEFAULT_META_PROMPT_ZH = """ 你是一个擅长写和优化system prompt的专家。你的任务是优化用户提供的prompt, 使得优化后的system prompt包含对agent的角色或者性格描述,agent的技能点,和一些限制。 ## 注意 1. 优化后的system prompt必须与用户原始prompt意图一致,可适当加入可调用的工具、具体关键词、时间框架、上下文或任何可以缩小范围并指导agent能够更好地理解完成任务的附加信息,对用户的prompt进行重构。 2. 请注意角色描述和技能点的描述不能缩小用户原始prompt定义的范围。例如用户原始prompt里描述的是文案大师,优化后的prompt描述不能缩小范围变成小红书文案大师。 3. 对技能点的描述应该尽量详细准确。用户原始的prompt会提到一些示例,技能点应该能覆盖这些案例,但注意不能只局限于用户prompt里给的示例。例如用户原始prompt里提到出题机器人可以出填空题的考题的示例,优化后的prompt里技能点不能只包括出填空题。 4. 技能范围不能超过大模型的能力,如果超过,请必须注明需要调用哪些工具,或者需要哪些知识库来帮助大模型拥有这个技能。比如大模型并没有搜索功能,如果需要搜索,则需要调用搜索工具来实现。 5. 请以markdown的格式输出优化后的prompt。 6. 优化后的prompt必须语言简练,字数不超过1000字。 7. 如果用户提供的prompt包含知识库或者Memory部分,优化后的system prompt也必须保留这些部分。 8. 如果prompt中含有如下标识符的变量:${{variable}}, 请确保改变量在优化后的prompt里只出现一次,在其他要使用该变量的地方直接使用该变量名。例如${{document}}再次出现的时候,请直接使用"检索内容"。 9. 优化后的prompt语言与用户提供的prompt一致,即用户提供的prompt使用中文写的,优化后的prompt也必须是中文, 如果用户提供的prompt使用英文写的,优化后的prompt也必须是英文。 """ # noqa _DEFAULT_EXAMPLE_PROMPT_TEMPLATE_ZH = """## 样例{index} - 用户输入: ``` {user_prompt} ``` - 优化后的system prompt: ``` {opt_prompt} ``` """ _DEFAULT_RESPONSE_PROMPT_TEMPLATE_ZH = """## 用户输入 ``` {user_prompt} ``` ## 优化后的system prompt """ # The name of the default local embedding model, which is used when # `embed_model_config_name` is not provided. 
_DEFAULT_LOCAL_EMBEDDING_MODEL = "sentence-transformers/all-mpnet-base-v2" class ChineseSystemPromptGenerator(SystemPromptGeneratorBase): """Optimize the users' system prompt with the given meta prompt and examples if provided.""" def __init__( self, model_config_name: str, meta_prompt: str = _DEFAULT_META_PROMPT_ZH, response_prompt_template: str = _DEFAULT_RESPONSE_PROMPT_TEMPLATE_ZH, example_num: int = 0, example_list: List = _DEFAULT_EXAMPLE_LIST_ZH, example_selection_strategy: Literal["random", "similarity"] = "random", example_prompt_template: str = _DEFAULT_EXAMPLE_PROMPT_TEMPLATE_ZH, embed_model_config_name: Optional[str] = None, local_embedding_model: str = _DEFAULT_LOCAL_EMBEDDING_MODEL, ): super().__init__( model_config_name=model_config_name, meta_prompt=meta_prompt, response_prompt_template=response_prompt_template, example_num=example_num, example_list=example_list, example_selection_strategy=example_selection_strategy, example_prompt_template=example_prompt_template, embed_model_config_name=embed_model_config_name, local_embedding_model=local_embedding_model, ) ``` modelscope/agentscope/blob/main/src/agentscope/prompt/_prompt_optimizer.py: ```py # -*- coding: utf-8 -*- """A module that optimize agent system prompt given dialog history.""" from typing import Union, List from agentscope.message import Msg from agentscope.models import ModelWrapperBase, load_model_by_config_name _DEFAULT_META_PROMPT_TEMPLATE = """ You are an excellent Prompt Engineer. Your task is to optimize an Agent's system prompt by adding notes. The original system prompt provided by the user is: ``` {system_prompt} ``` The dialog history of user interaction with the agent is: ``` {dialog_history} ``` Now, you need to: 1. Determine if the user-agent interaction in the dialog history contains any explicit errors (such as function call errors, failure to adhere to input-output formats), misunderstandings of user intentions, etc. 2. Conduct a detailed analysis of the reasons for the errors and find solutions corresponding to the errors. 3. Based on the causes of the errors and user intentions, write one or several notes that can be added after the user’s system prompt in the form of attention notes or example notes to prevent the same mistakes from happening again in the future. If the notes to be added include examples, be extremely cautious. If unsure whether the example to add is correct, you may refrain from adding. The language of the notes you add should be consistent with the original system prompt provided by the user. For example, if the original system prompt provided by the user is written in Chinese, the notes you add should also be in Chinese; if the original system prompt provided by the user is written in English, the notes you add should also be in English. The notes you add should be included within the tag [prompt_note], for example: [prompt_note] Please note that the output should only include JSON format [/prompt_note]. If there are no obvious issues in the dialog history, then no notes need to be added. """ # noqa OPT_PROMPT_TEMPLATE_ZH = """你是一个优秀的Prompt Engineer,现在你要通过添加note的方式对一个Agent的system prompt进行优化。 用户提供的原始system prompt是: ``` {system_prompt} ``` 用户与之交互的dialog history是: ``` {dialog_history} ``` 现在,你要 1. 判断用户与agent交互的dialog history中,是否包含显式的错误(如函数调用错误、没有遵循输入输出格式),对用户意图的误解等。 2. 对发生错误的原因进行详细分析,并且寻找对于对应错误的解决方案。 3. 
根据错误原因和用户意图,写一条或几条可以添加在用户system prompt后面的注意事项note,或者exmaple形式的note,使之不要再犯同样的错误。 如果要添加的note包含example,需要格外小心,如果不确定添加的example是否正确,可以先不添加。 你添加的note语言与用户提供的原始system prompt一致,即用户提供的原始system prompt是使用中文写的,你添加的note也必须是中文; 如果用户提供的原始system prompt是使用english写的,你添加的note也必须是english。 你添加的note应该包含在tag [prompt_note]中,例如 [prompt_note] 请注意输出仅包含json格式 [/prompt_note]。如果dialog history没有明显问题,则不需要添加任何note。 """ # noqa class SystemPromptOptimizer: """A system prompt optimizer class. For now (2024-06-13), the optimizer can optimize system prompt by extracting notes from the dialog history. It's more like reflection on the dialog history.""" def __init__( self, model_or_model_config_name: Union[ModelWrapperBase, str], meta_prompt_template: str = _DEFAULT_META_PROMPT_TEMPLATE, ) -> None: """Initialize the system prompt optimizer. Args: model_or_model_config_name (`Union[ModelWrapperBase, str]`): The model or model config name to be used for generating notes. meta_prompt_template (`str`, defaults to `_DEFAULT_META_PROMPT_TEMPLATE`): The meta prompt to guide the LLM to extract notes from the system prompt and dialog history. Must contain placeholders `{system_prompt}` and `{dialog_history}`. """ if isinstance(model_or_model_config_name, ModelWrapperBase): self.model = model_or_model_config_name elif isinstance(model_or_model_config_name, str): self.model = load_model_by_config_name(model_or_model_config_name) else: raise TypeError( "model_or_model_config_name must be ModelWrapperBase or str", ) self.meta_prompt = meta_prompt_template def _get_all_tagged_notes(self, response_text: str) -> List[str]: """Get all the notes in the response text.""" # TODO: Use a parser to extract the notes notes = [] start_tag = "[prompt_note]" end_tag = "[/prompt_note]" start_index = response_text.find(start_tag) while start_index != -1: end_index = response_text.find( end_tag, start_index + len(start_tag), ) if end_index != -1: note = response_text[start_index + len(start_tag) : end_index] notes.append(note) start_index = response_text.find( start_tag, end_index + len(end_tag), ) else: break return notes def generate_notes( self, system_prompt: str, dialog_history: List[Msg], ) -> List[str]: """Given the system prompt and dialogue history, generate notes to optimize the system prompt. Args: system_prompt (`str`): The system prompt provided by the user. dialog_history (`List[Msg]`): The dialogue history of user interaction with the agent. Returns: List[str]: The notes added to the system prompt. """ dialog_history_str = "\n".join( [f"{msg.name}: {msg.content}" for msg in dialog_history], ) prompt = self.model.format( Msg( "user", self.meta_prompt.format( system_prompt=system_prompt, dialog_history=dialog_history_str, ), role="user", ), ) response = self.model(prompt).text # Extract all the notes from the response text notes = self._get_all_tagged_notes(response) return notes ``` modelscope/agentscope/blob/main/src/agentscope/prompt/_prompt_utils.py: ```py # -*- coding: utf-8 -*- """Utility functions for prompt optimization.""" import json from typing import List from pathlib import Path import numpy as np from scipy.spatial.distance import cdist def _find_top_k_embeddings( query_embedding: List[float], list_embeddings: List[List[float]], k: int, ) -> List: """ Find the top k embeddings that are closed to the query embedding. Args: query_embedding (`List[float]`): the query to be searched. list_embeddings (`List[List[float]]`): the list of embeddings to be searched. k (`int`): the number of embeddings to be returned. 
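# --- Editor's note (not part of the original file): a hedged sketch combining
# the generator and optimizer classes shown above. "my_config" is an assumed
# model config name; the prompts and dialog history are illustrative.
from agentscope.message import Msg
from agentscope.prompt import EnglishSystemPromptGenerator, SystemPromptOptimizer

# Expand a one-line request into a structured system prompt.
generator = EnglishSystemPromptGenerator(model_config_name="my_config")
optimized = generator.generate(
    "Write a system prompt for a travel-planning bot.",
)

# Reflect on a dialog and extract [prompt_note] items to append to the prompt.
optimizer = SystemPromptOptimizer(model_or_model_config_name="my_config")
notes = optimizer.generate_notes(
    system_prompt=optimized,
    dialog_history=[
        Msg("user", "Plan a 3-day trip to Kyoto.", "user"),
        Msg("assistant", "Sure, here is a plan ...", "assistant"),
    ],
)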
Returns: `List`: the list of indices of the top k embeddings. """ # Compute cosine similarity between the query and the list of embeddings. # cdist returns the distance of 2-dimension arrays, # so we subtract from 1 to get similarity. # Cosine distance is defined as 1.0 minus the cosine similarity. similarities = ( 1 - cdist([query_embedding], list_embeddings, "cosine").flatten() ) # Get the top k indices sorted by similarity (in descending order). return list(np.argsort(similarities)[::-1][:k]) def _read_json_same_dir(file_name: str) -> dict: """Read the json file in the same dir""" current_file_path = Path(__file__) json_file_path = current_file_path.parent / file_name with open(json_file_path, "r", encoding="utf-8") as file: data = json.load(file) return data _examples = _read_json_same_dir("_prompt_examples.json") _DEFAULT_EXAMPLE_LIST_EN = _examples["en"] _DEFAULT_EXAMPLE_LIST_ZH = _examples["zh"] ``` modelscope/agentscope/blob/main/src/agentscope/rag/__init__.py: ```py # -*- coding: utf-8 -*- """ Import all pipeline related modules in the package. """ from .knowledge import Knowledge from .llama_index_knowledge import LlamaIndexKnowledge from .knowledge_bank import KnowledgeBank __all__ = [ "Knowledge", "LlamaIndexKnowledge", "KnowledgeBank", ] ``` modelscope/agentscope/blob/main/src/agentscope/rag/knowledge.py: ```py # -*- coding: utf-8 -*- """ Base class module for retrieval augmented generation (RAG). To accommodate the RAG process of different packages, we abstract the RAG process into four stages: - data loading: loading data into memory for following processing; - data indexing and storage: document chunking, embedding generation, and off-load the data into VDB; - data retrieval: taking a query and return a batch of documents or document chunks; - post-processing of the retrieved data: use the retrieved data to generate an answer. """ import importlib from abc import ABC, abstractmethod from typing import Any, Optional from loguru import logger from agentscope.models import ModelWrapperBase class Knowledge(ABC): """ Base class for RAG, CANNOT be instantiated directly """ def __init__( self, knowledge_id: str, emb_model: Any = None, knowledge_config: Optional[dict] = None, model: Optional[ModelWrapperBase] = None, **kwargs: Any, ) -> None: # pylint: disable=unused-argument """ initialize the knowledge component Args: knowledge_id (str): The id of the knowledge unit. emb_model (ModelWrapperBase): The embedding model used for generate embeddings knowledge_config (dict): The configuration to generate or load the index. """ self.knowledge_id = knowledge_id self.emb_model = emb_model self.knowledge_config = knowledge_config or {} self.postprocessing_model = model @abstractmethod def _init_rag( self, **kwargs: Any, ) -> Any: """ Initiate the RAG module. """ @abstractmethod def retrieve( self, query: Any, similarity_top_k: int = None, to_list_strs: bool = False, **kwargs: Any, ) -> list[Any]: """ retrieve list of content from database (vector stored index) to memory Args: query (Any): query for retrieval similarity_top_k (int): the number of most similar data returned by the retriever. to_list_strs (bool): whether return a list of str Returns: return a list with retrieved documents (in strings) """ def post_processing( self, retrieved_docs: list[str], prompt: str, **kwargs: Any, ) -> Any: """ A default solution for post-processing function, generates answer based on the retrieved documents. 
Args: retrieved_docs (list[str]): list of retrieved documents prompt (str): prompt for LLM generating answer with the retrieved documents Returns: Any: a synthesized answer from LLM with retrieved documents Example: self.postprocessing_model(prompt.format(retrieved_docs)) """ assert self.postprocessing_model prompt = prompt.format("\n".join(retrieved_docs)) return self.postprocessing_model(prompt, **kwargs).text def _prepare_args_from_config(self, config: dict) -> Any: """ Helper function to build objects in RAG classes. Args: config (dict): a dictionary containing configurations Returns: Any: an object that is parsed/built to be an element of input to the function of RAG module. """ if not isinstance(config, dict): return config if "create_object" in config: # if a term in args is an object, # recursively create object with args from config module_name = config.get("module", "") class_name = config.get("class", "") init_args = config.get("init_args", {}) try: cur_module = importlib.import_module(module_name) cur_class = getattr(cur_module, class_name) init_args = self._prepare_args_from_config(init_args) logger.info( f"load and build object: {class_name}", ) return cur_class(**init_args) except ImportError as exc_inner: logger.error( f"Fail to load class {class_name} " f"from module {module_name}", ) raise ImportError( f"Fail to load class {class_name} " f"from module {module_name}", ) from exc_inner else: prepared_args = {} for key, value in config.items(): if isinstance(value, list): prepared_args[key] = [] for c in value: prepared_args[key].append( self._prepare_args_from_config(c), ) elif isinstance(value, dict): prepared_args[key] = self._prepare_args_from_config(value) else: prepared_args[key] = value return prepared_args ``` modelscope/agentscope/blob/main/src/agentscope/rag/knowledge_bank.py: ```py # -*- coding: utf-8 -*- """ Knowledge bank for making Knowledge objects easier to use """ import copy import json from typing import Optional, Union from loguru import logger from agentscope.models import load_model_by_config_name from agentscope.agents import AgentBase from .llama_index_knowledge import LlamaIndexKnowledge DEFAULT_INDEX_CONFIG = { "knowledge_id": "", "data_processing": [], } DEFAULT_LOADER_CONFIG = { "load_data": { "loader": { "create_object": True, "module": "llama_index.core", "class": "SimpleDirectoryReader", "init_args": {}, }, }, } DEFAULT_INIT_CONFIG = { "input_dir": "", "recursive": True, "required_exts": [], } class KnowledgeBank: """ KnowledgeBank enables 1) provide an easy and fast way to initialize the Knowledge object; 2) make Knowledge object reusable and sharable for multiple agents. 
""" def __init__( self, configs: Union[dict, str], ) -> None: """initialize the knowledge bank""" if isinstance(configs, str): logger.info(f"Loading configs from {configs}") with open(configs, "r", encoding="utf-8") as fp: self.configs = json.loads(fp.read()) else: self.configs = configs self.stored_knowledge: dict[str, LlamaIndexKnowledge] = {} self._init_knowledge() def _init_knowledge(self) -> None: """initialize the knowledge bank""" for config in self.configs: print("bank", config) self.add_data_as_knowledge( knowledge_id=config["knowledge_id"], emb_model_name=config["emb_model_config_name"], knowledge_config=config, ) logger.info("knowledge bank initialization completed.\n ") def add_data_as_knowledge( self, knowledge_id: str, emb_model_name: str, data_dirs_and_types: dict[str, list[str]] = None, model_name: Optional[str] = None, knowledge_config: Optional[dict] = None, ) -> None: """ Transform data in a directory to be ready to work with RAG. Args: knowledge_id (str): user-defined unique id for the knowledge emb_model_name (str): name of the embedding model model_name (Optional[str]): name of the LLM for potential post-processing or query rewrite data_dirs_and_types (dict[str, list[str]]): dictionary of data paths (keys) to the data types (file extensions) for knowledgebase (e.g., [".md", ".py", ".html"]) knowledge_config (optional[dict]): complete indexing configuration, used for more advanced applications. Users can customize - loader, - transformations, - ... Examples can refer to../examples/conversation_with_RAG_agents/ a simple example of importing data to Knowledge object: '' knowledge_bank.add_data_as_knowledge( knowledge_id="agentscope_tutorial_rag", emb_model_name="qwen_emb_config", data_dirs_and_types={ "../../docs/sphinx_doc/en/source/tutorial": [".md"], }, persist_dir="./rag_storage/tutorial_assist", ) '' """ if knowledge_id in self.stored_knowledge: raise ValueError(f"knowledge_id {knowledge_id} already exists.") assert data_dirs_and_types is not None or knowledge_config is not None if knowledge_config is None: knowledge_config = copy.deepcopy(DEFAULT_INDEX_CONFIG) for data_dir, types in data_dirs_and_types.items(): loader_config = copy.deepcopy(DEFAULT_LOADER_CONFIG) loader_init = copy.deepcopy(DEFAULT_INIT_CONFIG) loader_init["input_dir"] = data_dir loader_init["required_exts"] = types loader_config["load_data"]["loader"]["init_args"] = loader_init knowledge_config["data_processing"].append(loader_config) self.stored_knowledge[knowledge_id] = LlamaIndexKnowledge( knowledge_id=knowledge_id, emb_model=load_model_by_config_name(emb_model_name), knowledge_config=knowledge_config, model=load_model_by_config_name(model_name) if model_name else None, ) logger.info(f"data loaded for knowledge_id = {knowledge_id}.") def get_knowledge( self, knowledge_id: str, duplicate: bool = False, ) -> LlamaIndexKnowledge: """ Get a Knowledge object from the knowledge bank. Args: knowledge_id (str): unique id for the Knowledge object duplicate (bool): whether return a copy of the Knowledge object. 
Returns: LlamaIndexKnowledge: the Knowledge object defined with Llama-index """ if knowledge_id not in self.stored_knowledge: raise ValueError( f"{knowledge_id} does not exist in the knowledge bank.", ) knowledge = self.stored_knowledge[knowledge_id] if duplicate: knowledge = copy.deepcopy(knowledge) logger.info(f"knowledge bank loaded: {knowledge_id}.") return knowledge def equip( self, agent: AgentBase, knowledge_id_list: list[str] = None, duplicate: bool = False, ) -> None: """ Equip the agent with the knowledge by knowledge ids. Args: agent (AgentBase): the agent to be equipped with knowledge knowledge_id_list: the list of knowledge ids to be equipped with the agent duplicate (bool): whether to deepcopy the knowledge object TODO: to accommodate with distributed setting """ logger.info(f"Equipping {agent.name} knowledge {knowledge_id_list}") knowledge_id_list = knowledge_id_list or [] if not hasattr(agent, "knowledge_list"): agent.knowledge_list = [] for kid in knowledge_id_list: knowledge = self.get_knowledge( knowledge_id=kid, duplicate=duplicate, ) agent.knowledge_list.append(knowledge) ``` modelscope/agentscope/blob/main/src/agentscope/rag/llama_index_knowledge.py: ```py # -*- coding: utf-8 -*- """ This module is an integration of the Llama index RAG into AgentScope package """ import os.path from typing import Any, Optional, List, Union from loguru import logger try: import llama_index from llama_index.core.base.base_retriever import BaseRetriever from llama_index.core.base.embeddings.base import ( BaseEmbedding, Embedding, ) from llama_index.core.ingestion import IngestionPipeline from llama_index.core.bridge.pydantic import PrivateAttr from llama_index.core.node_parser import SentenceSplitter from llama_index.core import ( VectorStoreIndex, StorageContext, load_index_from_storage, ) from llama_index.core.schema import ( Document, TransformComponent, ) except ImportError: llama_index = None BaseRetriever = None BaseEmbedding = None Embedding = None IngestionPipeline = None SentenceSplitter = None VectorStoreIndex = None StorageContext = None load_index_from_storage = None PrivateAttr = None Document = None TransformComponent = None from agentscope.file_manager import file_manager from agentscope.models import ModelWrapperBase from agentscope.constants import ( DEFAULT_TOP_K, DEFAULT_CHUNK_SIZE, DEFAULT_CHUNK_OVERLAP, ) from agentscope.rag.knowledge import Knowledge try: class _EmbeddingModel(BaseEmbedding): """ wrapper for ModelWrapperBase to an embedding model can be used in Llama Index pipeline. 
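# --- Editor's note (not part of the original file): a minimal sketch of
# KnowledgeBank, adapted from the docstring example in `add_data_as_knowledge`
# above. It assumes llama-index is installed, that "qwen_emb_config" and
# "my_config" model configs are registered, and that "./docs" holds markdown.
from agentscope.agents import DialogAgent
from agentscope.rag import KnowledgeBank

bank = KnowledgeBank(configs=[])  # start empty, then register data explicitly
bank.add_data_as_knowledge(
    knowledge_id="agentscope_tutorial_rag",
    emb_model_name="qwen_emb_config",
    data_dirs_and_types={"./docs": [".md"]},
)

# Share the same Knowledge object with an agent, or query it directly.
rag_agent = DialogAgent(name="assistant", model_config_name="my_config",
                        sys_prompt="Answer questions with the retrieved docs.")
bank.equip(rag_agent, knowledge_id_list=["agentscope_tutorial_rag"])
chunks = bank.get_knowledge("agentscope_tutorial_rag").retrieve(
    "How do pipelines work?", to_list_strs=True,
)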
""" _emb_model_wrapper: ModelWrapperBase = PrivateAttr() def __init__( self, emb_model: ModelWrapperBase, embed_batch_size: int = 1, ) -> None: """ Dummy wrapper to convert a ModelWrapperBase to llama Index embedding model Args: emb_model (ModelWrapperBase): embedding model in ModelWrapperBase embed_batch_size (int): batch size, defaults to 1 """ super().__init__( model_name="Temporary_embedding_wrapper", embed_batch_size=embed_batch_size, ) self._emb_model_wrapper = emb_model def _get_query_embedding(self, query: str) -> List[float]: """ get embedding for query Args: query (str): query to be embedded """ # Note: AgentScope embedding model wrapper returns list # of embedding return list(self._emb_model_wrapper(query).embedding[0]) def _get_text_embeddings(self, texts: List[str]) -> List[Embedding]: """ get embedding for list of strings Args: texts ( List[str]): texts to be embedded """ results = [ list(self._emb_model_wrapper(t).embedding[0]) for t in texts ] return results def _get_text_embedding(self, text: str) -> Embedding: """ get embedding for a single string Args: text (str): texts to be embedded """ return list(self._emb_model_wrapper(text).embedding[0]) # TODO: use proper async methods, but depends on model wrapper async def _aget_query_embedding(self, query: str) -> List[float]: """The asynchronous version of _get_query_embedding.""" return self._get_query_embedding(query) async def _aget_text_embedding(self, text: str) -> List[float]: """Asynchronously get text embedding.""" return self._get_text_embedding(text) async def _aget_text_embeddings( self, texts: List[str], ) -> List[List[float]]: """Asynchronously get text embeddings.""" return self._get_text_embeddings(texts) except Exception: class _EmbeddingModel: # type: ignore[no-redef] """ A dummy embedding model for passing tests when llama-index is not install """ def __init__(self, emb_model: ModelWrapperBase): self._emb_model_wrapper = emb_model class LlamaIndexKnowledge(Knowledge): """ This class is a wrapper with the llama index RAG. """ def __init__( self, knowledge_id: str, emb_model: Union[ModelWrapperBase, BaseEmbedding, None] = None, knowledge_config: Optional[dict] = None, model: Optional[ModelWrapperBase] = None, persist_root: Optional[str] = None, overwrite_index: Optional[bool] = False, showprogress: Optional[bool] = True, **kwargs: Any, ) -> None: """ initialize the knowledge component based on the llama-index framework: https://github.com/run-llama/llama_index Notes: In LlamaIndex, one of the most important concepts is index, which is a data structure composed of Document objects, designed to enable querying by an LLM. The core workflow of initializing RAG is to convert data to index, and retrieve information from index. For example: 1) preprocessing documents with data loaders 2) generate embedding by configuring pipline with embedding models 3) store the embedding-content to vector database the default dir is "./rag_storage/knowledge_id" Args: knowledge_id (str): The id of the RAG knowledge unit. emb_model (ModelWrapperBase): The embedding model used for generate embeddings knowledge_config (dict): The configuration for llama-index to generate or load the index. 
model (ModelWrapperBase): The language model used for final synthesis persist_root (str): The root directory for index persisting overwrite_index (Optional[bool]): Whether to overwrite the index while refreshing showprogress (Optional[bool]): Whether to show the indexing progress """ super().__init__( knowledge_id=knowledge_id, emb_model=emb_model, knowledge_config=knowledge_config, model=model, **kwargs, ) if llama_index is None: raise ImportError( "LlamaIndexKnowledge require llama-index installed. " "Try a stable llama-index version, such as " "`pip install llama-index==0.10.30`", ) if persist_root is None: persist_root = file_manager.dir self.persist_dir = os.path.join(persist_root, knowledge_id) self.emb_model = emb_model self.overwrite_index = overwrite_index self.showprogress = showprogress self.index = None # ensure the emb_model is compatible with LlamaIndex if isinstance(emb_model, ModelWrapperBase): self.emb_model = _EmbeddingModel(emb_model) elif isinstance(self.emb_model, BaseEmbedding): pass else: raise TypeError( f"Embedding model does not support {type(self.emb_model)}.", ) # then we can initialize the RAG self._init_rag() def _init_rag(self, **kwargs: Any) -> None: """ Initialize the RAG. This includes: * if the persist_dir exists, load the persisted index * if not, convert the data to index * if needed, update the index * set the retriever to retrieve information from index Notes: * the index is persisted in the self.persist_dir * the refresh_index method is placed here for testing, it can be called externally. For example, updated the index periodically by calling rag.refresh_index() during the execution of the agent. """ if os.path.exists(self.persist_dir): self._load_index() # self.refresh_index() else: self._data_to_index() self._get_retriever() logger.info( f"RAG with knowledge ids: {self.knowledge_id} " f"initialization completed!\n", ) def _load_index(self) -> None: """ Load the persisted index from persist_dir. """ # load the storage_context storage_context = StorageContext.from_defaults( persist_dir=self.persist_dir, ) # construct index from self.index = load_index_from_storage( storage_context=storage_context, embed_model=self.emb_model, ) logger.info(f"index loaded from {self.persist_dir}") def _data_to_index(self) -> None: """ Convert the data to index by configs. This includes: * load the data to documents by using information from configs * set the transformations associated with documents * convert the documents to nodes * convert the nodes to index Notes: As each selected file type may need to use a different loader and transformations, knowledge_config is a list of configs. """ nodes = [] # load data to documents and set transformations # using information in knowledge_config for config in self.knowledge_config.get("data_processing"): documents = self._data_to_docs(config=config) transformations = self._set_transformations(config=config).get( "transformations", ) nodes_docs = self._docs_to_nodes( documents=documents, transformations=transformations, ) nodes = nodes + nodes_docs # convert nodes to index self.index = VectorStoreIndex( nodes=nodes, embed_model=self.emb_model, ) logger.info("index calculation completed.") # persist the calculated index self.index.storage_context.persist(persist_dir=self.persist_dir) logger.info("index persisted.") def _data_to_docs( self, query: Optional[str] = None, config: dict = None, ) -> Any: """ This method set the loader as needed, or just use the default setting. Then use the loader to load data from dir to documents. 
Notes: We can use simple directory loader (SimpleDirectoryReader) to load general documents, including Markdown, PDFs, Word documents, PowerPoint decks, images, audio and video. Or use SQL loader (DatabaseReader) to load database. Args: query (Optional[str]): optional, used when the data is in a database. config (dict): optional, used when the loader config is in a config file. Returns: Any: loaded documents """ loader = self._set_loader(config=config).get("loader") # let the doc_id be the filename for each document loader.filename_as_id = True if query is None: documents = loader.load_data() else: # this is for querying a database, # does not work for loading a document directory documents = loader.load_data(query) logger.info(f"loaded {len(documents)} documents") return documents def _docs_to_nodes( self, documents: List[Document], transformations: Optional[list[Optional[TransformComponent]]] = None, ) -> Any: """ Convert the loaded documents to nodes using transformations. Args: documents (List[Document]): documents to be processed, usually expected to be in llama index Documents. transformations (Optional[list[TransformComponent]]): optional, specifies the transformations (operators) to process documents (e.g., split the documents into smaller chunks) Return: Any: return the index of the processed document """ # nodes, or called chunks, is a presentation of the documents # we build nodes by using the IngestionPipeline # for each document with corresponding transformations pipeline = IngestionPipeline( transformations=transformations, ) # stack up the nodes from the pipline nodes = pipeline.run( documents=documents, show_progress=self.showprogress, ) logger.info("nodes generated.") return nodes def _set_loader(self, config: dict) -> Any: """ Set the loader as needed, or just use the default setting. Args: config (dict): a dictionary containing configurations """ if "load_data" in config: # we prepare the loader from the configs loader = self._prepare_args_from_config( config=config.get("load_data", {}), ) else: # we prepare the loader by default try: from llama_index.core import SimpleDirectoryReader except ImportError as exc_inner: raise ImportError( " LlamaIndexAgent requires llama-index to be install." "Please run `pip install llama-index`", ) from exc_inner loader = { "loader": SimpleDirectoryReader( input_dir="set_default_data_path", ), } logger.info("loaders are ready.") return loader def _set_transformations(self, config: dict) -> Any: """ Set the transformations as needed, or just use the default setting. Args: config (dict): a dictionary containing configurations. """ if "store_and_index" in config: temp = self._prepare_args_from_config( config=config.get("store_and_index", {}), ) transformations = temp.get("transformations") else: transformations = [ SentenceSplitter( chunk_size=self.knowledge_config.get( "chunk_size", DEFAULT_CHUNK_SIZE, ), chunk_overlap=self.knowledge_config.get( "chunk_overlap", DEFAULT_CHUNK_OVERLAP, ), ), ] # adding embedding model as the last step of transformation # https://docs.llamaindex.ai/en/stable/module_guides/loading/ingestion_pipeline/root.html transformations.append(self.emb_model) logger.info("transformations are ready.") # as the last step, we need to repackage the transformations in dict transformations = {"transformations": transformations} return transformations def _get_retriever( self, similarity_top_k: int = None, **kwargs: Any, ) -> BaseRetriever: """ Set the retriever as needed, or just use the default setting. 
Args: retriever (Optional[BaseRetriever]): passing a retriever in LlamaIndexKnowledge rag_config (dict): rag configuration, including similarity top k index. """ # set the retriever logger.info( f"similarity_top_k" f"={similarity_top_k or DEFAULT_TOP_K}", ) retriever = self.index.as_retriever( embed_model=self.emb_model, similarity_top_k=similarity_top_k or DEFAULT_TOP_K, **kwargs, ) logger.info("retriever is ready.") return retriever def retrieve( self, query: str, similarity_top_k: int = None, to_list_strs: bool = False, retriever: Optional[BaseRetriever] = None, **kwargs: Any, ) -> list[Any]: """ This is a basic retrieve function for knowledge. It will build a retriever on the fly and return the result of the query. Args: query (str): query is expected to be a question in string similarity_top_k (int): the number of most similar data returned by the retriever. to_list_strs (bool): whether returns the list of strings; if False, return NodeWithScore retriever (BaseRetriever): for advanced usage, user can pass their own retriever. Return: list[Any]: list of str or NodeWithScore More advanced query processing can refer to https://docs.llamaindex.ai/en/stable/examples/query_transformations/query_transform_cookbook.html """ if retriever is None: retriever = self._get_retriever(similarity_top_k) retrieved = retriever.retrieve(str(query)) if to_list_strs: results = [] for node in retrieved: results.append(node.get_text()) return results return retrieved def refresh_index(self) -> None: """ Refresh the index when needed. """ for config in self.knowledge_config.get("data_processing"): documents = self._data_to_docs(config=config) # store and indexing for each file type transformations = self._set_transformations(config=config).get( "transformations", ) self._insert_docs_to_index( documents=documents, transformations=transformations, ) def _insert_docs_to_index( self, documents: List[Document], transformations: TransformComponent, ) -> None: """ Add documents to the index. Given a list of documents, we first test if the doc_id is already in the index. If not, we add the doc to the list. If yes, and the over-write flag is enabled, we delete the old doc and add the new doc to the list. Lastly, we generate nodes for all documents on the list, and insert the nodes to the index. Args: documents (List[Document]): list of documents to be added. transformations (TransformComponent): transformations that convert the documents into nodes. """ # this is the pipline that generate the nodes pipeline = IngestionPipeline( transformations=transformations, ) # we need to generate nodes from this list of documents insert_docs_list = [] for doc in documents: if doc.doc_id not in self.index.ref_doc_info.keys(): # if the doc_id is not in the index, we add it to the list insert_docs_list.append(doc) logger.info( f"add new documents to index, " f"doc_id={doc.doc_id}", ) else: if self.overwrite_index: # if we enable overwrite index, we delete the old doc self.index.delete_ref_doc( ref_doc_id=doc.doc_id, delete_fr
```
