Rename the `ChatMessage` and `AgentEvent` base classes to `BaseChatMessage` and `BaseAgentEvent`.
Bring back `ChatMessage` and `AgentEvent` as unions of the built-in concrete types to avoid breaking existing applications that depend on Pydantic serialization.
Why?
Much existing code uses containers like this:
```python
from pydantic import BaseModel
from autogen_agentchat.messages import ChatMessage


class AppMessage(BaseModel):
    name: str
    message: ChatMessage


# Serialization looks like this:
m = AppMessage(...)
m.model_dump_json()
# If ChatMessage is only a base class, fields like HandoffMessage.target are lost,
# because the base class has no content or target fields.
```
Many existing code bases may assume that `ChatMessage` and `AgentEvent` are unions of concrete types. This PR therefore brings back the union types, while keeping the type hints on methods such as `on_messages` on the `BaseChatMessage` and `BaseAgentEvent` base classes for flexibility.
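To make the serialization difference concrete, here is a minimal sketch using Pydantic v2. The classes `BaseMsg`, `TextMsg`, `HandoffMsg`, `BaseTyped` and `UnionTyped` are hypothetical stand-ins invented for illustration; they are not the actual message classes touched by this PR.

```python
from typing import Union

from pydantic import BaseModel


# Hypothetical stand-ins for a base message class and two concrete messages.
class BaseMsg(BaseModel):
    source: str


class TextMsg(BaseMsg):
    content: str


class HandoffMsg(BaseMsg):
    content: str
    target: str


# Union of the concrete types, analogous to the restored union aliases.
AnyMsg = Union[TextMsg, HandoffMsg]


class BaseTyped(BaseModel):
    message: BaseMsg  # field annotated with the base class only


class UnionTyped(BaseModel):
    message: AnyMsg  # field annotated with the union of concrete types


msg = HandoffMsg(source="a", content="go", target="b")

# Serialized with the base class schema: subclass fields are dropped.
base_dump = BaseTyped(message=msg).model_dump(warnings=False)

# Serialized with the matching union member: all fields are kept.
union_dump = UnionTyped(message=msg).model_dump()
```

With the base-class annotation, only `source` survives the dump; with the union annotation, `content` and `target` are preserved, which is the behavior existing containers rely on.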
1. Add `on_pause` and `on_resume` APIs to `ChatAgent` to support pausing
an agent while `on_messages` is running concurrently.
2. Add `GroupChatPause` and `GroupChatResume` RPC events and handle them
in `ChatAgentContainer`.
3. Add `pause` and `resume` APIs to `BaseGroupChat` to make this
behavior accessible from the public API.
4. Improve the `SequentialRoutedAgent` class so that it can customize which
message types are handled sequentially, making it possible to handle
some messages (e.g., `GroupChatPause`) concurrently.
5. Add unit tests.
See `test_group_chat_pause_resume.py` for how to use this feature.
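The interaction between pause handling and selective sequential handling can be sketched with plain `asyncio`. Everything below (`Work`, `Pause`, `Resume`, `PausableWorker`, `demo`) is a hypothetical stand-in written for illustration and is not the implementation in this PR: work messages are queued behind a lock, while pause and resume messages bypass it so they can take effect in the middle of a turn.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Work:
    steps: int


class Pause: ...


class Resume: ...


class PausableWorker:
    """Hypothetical worker; only `sequential_types` are handled one at a time."""

    sequential_types = (Work,)

    def __init__(self):
        self._lock = asyncio.Lock()
        self._running = asyncio.Event()
        self._running.set()  # start unpaused
        self.completed = 0

    async def handle(self, message):
        if isinstance(message, self.sequential_types):
            async with self._lock:  # queue behind any in-progress work
                await self._dispatch(message)
        else:
            await self._dispatch(message)  # handled concurrently with work

    async def _dispatch(self, message):
        if isinstance(message, Work):
            for _ in range(message.steps):
                await self._running.wait()  # blocks here while paused
                await asyncio.sleep(0.01)   # simulate one step of a long turn
                self.completed += 1
        elif isinstance(message, Pause):
            self._running.clear()
        elif isinstance(message, Resume):
            self._running.set()


async def demo():
    worker = PausableWorker()
    turn = asyncio.create_task(worker.handle(Work(steps=5)))
    await asyncio.sleep(0.025)
    await worker.handle(Pause())   # takes effect while the turn is running
    await asyncio.sleep(0.02)      # let the in-flight step finish
    at_pause = worker.completed
    await asyncio.sleep(0.05)
    still_paused = worker.completed
    await worker.handle(Resume())
    await turn
    return at_pause, still_paused, worker.completed


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If `Pause` were also listed in `sequential_types`, it would wait behind the lock until the work finished, which is why the pause message needs to be handled concurrently.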
What is the difference between pause/resume vs. termination and restart?
- Pause and resume issue direct RPC calls to the participating agents
of a team while they are running, which allows ongoing generation or
actions to be put on hold. This is useful when an agent's turn takes
a long time and multiple steps to complete, and the user or application
wants to re-evaluate whether to continue the step or cancel it. It
also allows the user or application to pause and resume individual
agents independently of the team API.
- Termination and restart require the whole team to come to a
full stop, and termination conditions are checked between agents'
turns. So termination can only happen when no agent is working on its
turn. A termination condition may therefore be reached well
before the team actually terminates, if an agent is taking a long time
to generate a response.
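The turn-boundary behavior of termination can be illustrated with a small `asyncio` sketch. The functions `long_turn`, `run_team` and `demo` are hypothetical and only model the idea that a stop condition is checked between turns; they do not reflect the actual team implementation.

```python
import asyncio


async def long_turn(log, steps):
    # Simulate a multi-step turn that yields control between steps.
    for step in range(steps):
        await asyncio.sleep(0)
        log.append(f"step {step}")


async def run_team(turns, stop_requested, log):
    for _ in range(turns):
        # The stop condition is only checked between turns (assumption
        # modeled on the description above).
        if stop_requested.is_set():
            log.append("terminated")
            return
        await long_turn(log, steps=3)
    log.append("finished")


async def demo():
    log = []
    stop_requested = asyncio.Event()
    team = asyncio.create_task(run_team(3, stop_requested, log))
    await asyncio.sleep(0)   # let the first turn start
    stop_requested.set()     # condition is reached in the middle of the turn
    await team
    return log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Although the stop condition is set during the first turn, all three steps of that turn still complete before the team stops. A pause, by contrast, can put the turn on hold partway through.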
Resolves: #5881