diff --git a/.github/workflows/openai.yml b/.github/workflows/openai.yml
index b848ef9fb..41e6cd7b7 100644
--- a/.github/workflows/openai.yml
+++ b/.github/workflows/openai.yml
@@ -9,11 +9,11 @@ on:
paths:
- 'autogen/**'
- 'test/**'
- - 'notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb'
- - 'notebook/autogen_agentchat_function_call.ipynb'
- - 'notebook/autogen_agentchat_MathChat.ipynb'
- - 'notebook/autogen_openai_completion.ipynb'
- - 'notebook/autogen_chatgpt_gpt4.ipynb'
+ - 'notebook/agentchat_auto_feedback_from_code_execution.ipynb'
+ - 'notebook/agentchat_function_call.ipynb'
+ - 'notebook/agentchat_MathChat.ipynb'
+ - 'notebook/oai_completion.ipynb'
+ - 'notebook/oai_chatgpt_gpt4.ipynb'
- '.github/workflows/openai.yml'
jobs:
diff --git a/notebook/autogen_agentchat_MathChat.ipynb b/notebook/agentchat_MathChat.ipynb
similarity index 95%
rename from notebook/autogen_agentchat_MathChat.ipynb
rename to notebook/agentchat_MathChat.ipynb
index 29709716a..38844708d 100644
--- a/notebook/autogen_agentchat_MathChat.ipynb
+++ b/notebook/agentchat_MathChat.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
@@ -17,7 +17,7 @@
"\n",
"AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation. Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
"\n",
- "MathChat is an experimental convesational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto reply mechanism corresponding to the MathChat prompts. You can find more details in the paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337) or the [blogpost](https://microsoft.github.io/autogen/blog/2023/06/28/MathChat).\n",
+ "MathChat is an experimental conversational framework for math problem solving. In this notebook, we demonstrate how to use MathChat to solve math problems. MathChat uses the `AssistantAgent` and `MathUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `MathUserProxyAgent` implements a different auto-reply mechanism corresponding to the MathChat prompts. You can find more details in the paper [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337) or the [blogpost](https://microsoft.github.io/autogen/blog/2023/06/28/MathChat).\n",
"\n",
"## Requirements\n",
"\n",
diff --git a/notebook/autogen_agentchat_RetrieveChat.ipynb b/notebook/agentchat_RetrieveChat.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_RetrieveChat.ipynb
rename to notebook/agentchat_RetrieveChat.ipynb
index eaa24931e..bdc994d0f 100644
--- a/notebook/autogen_agentchat_RetrieveChat.ipynb
+++ b/notebook/agentchat_RetrieveChat.ipynb
@@ -1,10 +1,11 @@
{
"cells": [
{
+ "attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
@@ -18,7 +19,7 @@
"AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framwork allows tool use and human participance through multi-agent conversation.\n",
"Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
"\n",
- "RetrieveChat is a convesational system for retrieve augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentations that are not present in the LLM's training dataset. RetrieveChat uses the `RetrieveAssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveAssistantAgent` and `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
+ "RetrieveChat is a conversational system for retrieval-augmented code generation and question answering. In this notebook, we demonstrate how to utilize RetrieveChat to generate code and answer questions based on customized documentation that is not present in the LLM's training dataset. RetrieveChat uses the `RetrieveAssistantAgent` and `RetrieveUserProxyAgent`, which is similar to the usage of `AssistantAgent` and `UserProxyAgent` in other notebooks (e.g., [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)). Essentially, `RetrieveAssistantAgent` and `RetrieveUserProxyAgent` implement a different auto-reply mechanism corresponding to the RetrieveChat prompts.\n",
"\n",
"## Table of Contents\n",
"We'll demonstrates five examples of using RetrieveChat for code generation and question answering:\n",
diff --git a/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb b/notebook/agentchat_auto_feedback_from_code_execution.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb
rename to notebook/agentchat_auto_feedback_from_code_execution.ipynb
index b64360bb4..de81cf4f3 100644
--- a/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb
+++ b/notebook/agentchat_auto_feedback_from_code_execution.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_chess.ipynb b/notebook/agentchat_chess.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_chess.ipynb
rename to notebook/agentchat_chess.ipynb
index 253708a51..d859f4701 100644
--- a/notebook/autogen_agentchat_chess.ipynb
+++ b/notebook/agentchat_chess.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_function_call.ipynb b/notebook/agentchat_function_call.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_function_call.ipynb
rename to notebook/agentchat_function_call.ipynb
index 36249153c..35e051f38 100644
--- a/notebook/autogen_agentchat_function_call.ipynb
+++ b/notebook/agentchat_function_call.ipynb
@@ -6,7 +6,7 @@
"id": "ae1f50ec",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_groupchat.ipynb b/notebook/agentchat_groupchat.ipynb
similarity index 98%
rename from notebook/autogen_agentchat_groupchat.ipynb
rename to notebook/agentchat_groupchat.ipynb
index f6e0d5188..6964408eb 100644
--- a/notebook/autogen_agentchat_groupchat.ipynb
+++ b/notebook/agentchat_groupchat.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_groupchat_research.ipynb b/notebook/agentchat_groupchat_research.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_groupchat_research.ipynb
rename to notebook/agentchat_groupchat_research.ipynb
index 6f01ddacd..599309f1c 100644
--- a/notebook/autogen_agentchat_groupchat_research.ipynb
+++ b/notebook/agentchat_groupchat_research.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_groupchat_vis.ipynb b/notebook/agentchat_groupchat_vis.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_groupchat_vis.ipynb
rename to notebook/agentchat_groupchat_vis.ipynb
index c8be0ced0..4348d224d 100644
--- a/notebook/autogen_agentchat_groupchat_vis.ipynb
+++ b/notebook/agentchat_groupchat_vis.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_human_feedback.ipynb b/notebook/agentchat_human_feedback.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_human_feedback.ipynb
rename to notebook/agentchat_human_feedback.ipynb
index 2119eb5cf..232462ad7 100644
--- a/notebook/autogen_agentchat_human_feedback.ipynb
+++ b/notebook/agentchat_human_feedback.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_planning.ipynb b/notebook/agentchat_planning.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_planning.ipynb
rename to notebook/agentchat_planning.ipynb
index 8622f0cf7..93513fb64 100644
--- a/notebook/autogen_agentchat_planning.ipynb
+++ b/notebook/agentchat_planning.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_stream.ipynb b/notebook/agentchat_stream.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_stream.ipynb
rename to notebook/agentchat_stream.ipynb
index 1c9c53b80..bc2ee88f1 100644
--- a/notebook/autogen_agentchat_stream.ipynb
+++ b/notebook/agentchat_stream.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/agentchat_teaching.ipynb b/notebook/agentchat_teaching.ipynb
new file mode 100644
index 000000000..0d427b70f
--- /dev/null
+++ b/notebook/agentchat_teaching.ipynb
@@ -0,0 +1,970 @@
+{
+ "cells": [
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# %pip install --quiet flaml\"[autogen]\"~=2.1.0"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Auto Generated Agent Chat: Teaching\n",
+ "\n",
+ "AutoGen offers conversable agents powered by LLM, tool or human, which can be used to perform tasks collectively via automated chat. This framework makes it easy to build many advanced applications of LLMs.\n",
+ "Please find documentation about this feature [here](https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat).\n",
+ "\n",
+ "This notebook demonstrates how AutoGen enables a user to teach AI new skills via natural agent interactions, without requiring knowledge of a programming language. It is adapted from https://github.com/microsoft/FLAML/blob/evaluation/notebook/research_paper/teaching.ipynb and https://github.com/microsoft/FLAML/blob/evaluation/notebook/research_paper/teaching_recipe_reuse.ipynb.\n",
+ "\n",
+ "## Requirements\n",
+ "\n",
+ "AutoGen requires `Python>=3.8`. To run this notebook example, please install:\n",
+ "```bash\n",
+ "pip install pyautogen\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# %pip install --quiet pyautogen~=0.1.0"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Set your API Endpoint\n",
+ "\n",
+ "The [`config_list_from_json`](https://microsoft.github.io/autogen/docs/reference/oai/openai_utils#config_list_from_json) function loads a list of configurations from an environment variable or a JSON file.\n",
+ "\n",
+ "It first looks for the environment variable \"OAI_CONFIG_LIST\", which needs to be a valid JSON string. If that variable is not found, it then looks for a JSON file named \"OAI_CONFIG_LIST\". It filters the configs by model (you can filter by other keys as well).\n",
+ "\n",
+ "The JSON looks like the following:\n",
+ "```json\n",
+ "[\n",
+ " {\n",
+ " \"model\": \"gpt-4\",\n",
+ " \"api_key\": \"\"\n",
+ " },\n",
+ " {\n",
+ " \"model\": \"gpt-4\",\n",
+ " \"api_key\": \"\",\n",
+ " \"api_base\": \"\",\n",
+ " \"api_type\": \"azure\",\n",
+ " \"api_version\": \"2023-06-01-preview\"\n",
+ " },\n",
+ " {\n",
+ " \"model\": \"gpt-4-32k\",\n",
+ " \"api_key\": \"\",\n",
+ " \"api_base\": \"\",\n",
+ " \"api_type\": \"azure\",\n",
+ " \"api_version\": \"2023-06-01-preview\"\n",
+ " }\n",
+ "]\n",
+ "```\n",
+ "\n",
+ "If you open this notebook in Colab, you can upload your files by clicking the file icon on the left panel and then choosing the \"upload file\" icon.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import autogen\n",
+ "\n",
+ "llm_config = {\n",
+ " \"request_timeout\": 600,\n",
+ " \"seed\": 44, # change the seed for different trials\n",
+ " \"config_list\": autogen.config_list_from_json(\n",
+ " \"OAI_CONFIG_LIST\",\n",
+ " filter_dict={\"model\": [\"gpt-4-32k\"]},\n",
+ " ),\n",
+ " \"temperature\": 0,\n",
+ "}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Example Task: Literature Survey\n",
+ "\n",
+ "We consider a scenario where one needs to find research papers on a certain topic, categorize the application domains, and plot a bar chart of the number of papers in each domain."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Construct Agents\n",
+ "\n",
+ "We create an assistant agent to solve tasks with coding and language skills. We create a user proxy agent to describe tasks and execute the code suggested by the assistant agent."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# create an AssistantAgent instance named \"assistant\"\n",
+ "assistant = autogen.AssistantAgent(\n",
+ " name=\"assistant\",\n",
+ " llm_config=llm_config,\n",
+ " is_termination_msg=lambda x: \"TERMINATE\" in (x.get(\"content\") or \"\"),\n",
+ ")\n",
+ "# create a UserProxyAgent instance named \"user_proxy\"\n",
+ "user_proxy = autogen.UserProxyAgent(\n",
+ " name=\"user_proxy\",\n",
+ " human_input_mode=\"NEVER\",\n",
+ " is_termination_msg=lambda x: \"TERMINATE\" in (x.get(\"content\") or \"\"),\n",
+ " max_consecutive_auto_reply=10,\n",
+ " code_execution_config={\n",
+ " \"work_dir\": \"work_dir\",\n",
+ " \"use_docker\": False,\n",
+ " },\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Step-by-step Requests"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "\n",
+ "Find arxiv papers that show how are people studying trust calibration in AI based systems \n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "To find arxiv papers related to trust calibration in AI-based systems, we can use the arxiv API to search for relevant papers. I will write a Python script that queries the arxiv API and prints the titles and URLs of the top 10 papers.\n",
+ "\n",
+ "Please execute the following Python code:\n",
+ "\n",
+ "```python\n",
+ "import requests\n",
+ "import feedparser\n",
+ "\n",
+ "def search_arxiv(query, max_results=10):\n",
+ " base_url = \"http://export.arxiv.org/api/query?\"\n",
+ " search_query = f\"search_query=all:{query}\"\n",
+ " start = 0\n",
+ " max_results = f\"max_results={max_results}\"\n",
+ " url = f\"{base_url}{search_query}&start={start}&{max_results}\"\n",
+ " response = requests.get(url)\n",
+ " feed = feedparser.parse(response.content)\n",
+ " return feed.entries\n",
+ "\n",
+ "query = \"trust calibration AI\"\n",
+ "papers = search_arxiv(query)\n",
+ "\n",
+ "for i, paper in enumerate(papers):\n",
+ " print(f\"{i+1}. {paper.title}\")\n",
+ " print(f\"URL: {paper.link}\\n\")\n",
+ "```\n",
+ "After executing the code, I will analyze the results and provide you with the relevant papers.\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[31m\n",
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "exitcode: 0 (execution succeeded)\n",
+ "Code output: \n",
+ "1. Who Should I Trust: AI or Myself? Leveraging Human and AI Correctness\n",
+ " Likelihood to Promote Appropriate Trust in AI-Assisted Decision-Making\n",
+ "URL: http://arxiv.org/abs/2301.05809v1\n",
+ "\n",
+ "2. Effect of Confidence and Explanation on Accuracy and Trust Calibration\n",
+ " in AI-Assisted Decision Making\n",
+ "URL: http://arxiv.org/abs/2001.02114v1\n",
+ "\n",
+ "3. Trust Explanations to Do What They Say\n",
+ "URL: http://arxiv.org/abs/2303.13526v1\n",
+ "\n",
+ "4. A Systematic Literature Review of User Trust in AI-Enabled Systems: An\n",
+ " HCI Perspective\n",
+ "URL: http://arxiv.org/abs/2304.08795v1\n",
+ "\n",
+ "5. Trust Calibration and Trust Respect: A Method for Building Team Cohesion\n",
+ " in Human Robot Teams\n",
+ "URL: http://arxiv.org/abs/2110.06809v1\n",
+ "\n",
+ "6. Trust Considerations for Explainable Robots: A Human Factors Perspective\n",
+ "URL: http://arxiv.org/abs/2005.05940v1\n",
+ "\n",
+ "7. Experimental Investigation of Trust in Anthropomorphic Agents as Task\n",
+ " Partners\n",
+ "URL: http://arxiv.org/abs/2202.01077v2\n",
+ "\n",
+ "8. Uncalibrated Models Can Improve Human-AI Collaboration\n",
+ "URL: http://arxiv.org/abs/2202.05983v3\n",
+ "\n",
+ "9. Knowing About Knowing: An Illusion of Human Competence Can Hinder\n",
+ " Appropriate Reliance on AI Systems\n",
+ "URL: http://arxiv.org/abs/2301.11333v1\n",
+ "\n",
+ "10. A Turing Test for Transparency\n",
+ "URL: http://arxiv.org/abs/2106.11394v1\n",
+ "\n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "I have found the top 10 arxiv papers related to trust calibration in AI-based systems. Here are the titles and URLs of these papers:\n",
+ "\n",
+ "1. Who Should I Trust: AI or Myself? Leveraging Human and AI Correctness Likelihood to Promote Appropriate Trust in AI-Assisted Decision-Making\n",
+ "URL: http://arxiv.org/abs/2301.05809v1\n",
+ "\n",
+ "2. Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making\n",
+ "URL: http://arxiv.org/abs/2001.02114v1\n",
+ "\n",
+ "3. Trust Explanations to Do What They Say\n",
+ "URL: http://arxiv.org/abs/2303.13526v1\n",
+ "\n",
+ "4. A Systematic Literature Review of User Trust in AI-Enabled Systems: An HCI Perspective\n",
+ "URL: http://arxiv.org/abs/2304.08795v1\n",
+ "\n",
+ "5. Trust Calibration and Trust Respect: A Method for Building Team Cohesion in Human Robot Teams\n",
+ "URL: http://arxiv.org/abs/2110.06809v1\n",
+ "\n",
+ "6. Trust Considerations for Explainable Robots: A Human Factors Perspective\n",
+ "URL: http://arxiv.org/abs/2005.05940v1\n",
+ "\n",
+ "7. Experimental Investigation of Trust in Anthropomorphic Agents as Task Partners\n",
+ "URL: http://arxiv.org/abs/2202.01077v2\n",
+ "\n",
+ "8. Uncalibrated Models Can Improve Human-AI Collaboration\n",
+ "URL: http://arxiv.org/abs/2202.05983v3\n",
+ "\n",
+ "9. Knowing About Knowing: An Illusion of Human Competence Can Hinder Appropriate Reliance on AI Systems\n",
+ "URL: http://arxiv.org/abs/2301.11333v1\n",
+ "\n",
+ "10. A Turing Test for Transparency\n",
+ "URL: http://arxiv.org/abs/2106.11394v1\n",
+ "\n",
+ "You can read these papers to learn more about how people are studying trust calibration in AI-based systems. If you need further assistance, please let me know.\n",
+ "\n",
+ "TERMINATE\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n"
+ ]
+ }
+ ],
+ "source": [
+ "task1 = \"\"\"\n",
+ "Find arxiv papers that show how are people studying trust calibration in AI based systems \n",
+ "\"\"\"\n",
+ "\n",
+ "user_proxy.initiate_chat(assistant, message=task1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "analyze the above the results to list the application domains studied by these papers \n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "I will analyze the abstracts of the top 10 arxiv papers related to trust calibration in AI-based systems to identify the application domains studied by these papers. Please execute the following Python code to print the abstracts:\n",
+ "\n",
+ "```python\n",
+ "for i, paper in enumerate(papers):\n",
+ " print(f\"{i+1}. {paper.title}\")\n",
+ " print(f\"Abstract: {paper.summary}\\n\")\n",
+ "```\n",
+ "\n",
+ "After executing the code, I will analyze the abstracts and provide you with the application domains studied by these papers.\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[31m\n",
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "exitcode: 1 (execution failed)\n",
+ "Code output: \n",
+ "Traceback (most recent call last):\n",
+ " File \"\", line 1, in \n",
+ " for i, paper in enumerate(papers):\n",
+ "NameError: name 'papers' is not defined\n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "I apologize for the error. It seems the variable 'papers' is not available anymore. Let me provide you with the complete code to fetch the papers and print their abstracts. Please execute the following Python code:\n",
+ "\n",
+ "```python\n",
+ "import requests\n",
+ "import feedparser\n",
+ "\n",
+ "def search_arxiv(query, max_results=10):\n",
+ " base_url = \"http://export.arxiv.org/api/query?\"\n",
+ " search_query = f\"search_query=all:{query}\"\n",
+ " start = 0\n",
+ " max_results = f\"max_results={max_results}\"\n",
+ " url = f\"{base_url}{search_query}&start={start}&{max_results}\"\n",
+ " response = requests.get(url)\n",
+ " feed = feedparser.parse(response.content)\n",
+ " return feed.entries\n",
+ "\n",
+ "query = \"trust calibration AI\"\n",
+ "papers = search_arxiv(query)\n",
+ "\n",
+ "for i, paper in enumerate(papers):\n",
+ " print(f\"{i+1}. {paper.title}\")\n",
+ " print(f\"Abstract: {paper.summary}\\n\")\n",
+ "```\n",
+ "\n",
+ "After executing the code, I will analyze the abstracts and provide you with the application domains studied by these papers.\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[31m\n",
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "exitcode: 0 (execution succeeded)\n",
+ "Code output: \n",
+ "1. Who Should I Trust: AI or Myself? Leveraging Human and AI Correctness\n",
+ " Likelihood to Promote Appropriate Trust in AI-Assisted Decision-Making\n",
+ "Abstract: In AI-assisted decision-making, it is critical for human decision-makers to\n",
+ "know when to trust AI and when to trust themselves. However, prior studies\n",
+ "calibrated human trust only based on AI confidence indicating AI's correctness\n",
+ "likelihood (CL) but ignored humans' CL, hindering optimal team decision-making.\n",
+ "To mitigate this gap, we proposed to promote humans' appropriate trust based on\n",
+ "the CL of both sides at a task-instance level. We first modeled humans' CL by\n",
+ "approximating their decision-making models and computing their potential\n",
+ "performance in similar instances. We demonstrated the feasibility and\n",
+ "effectiveness of our model via two preliminary studies. Then, we proposed three\n",
+ "CL exploitation strategies to calibrate users' trust explicitly/implicitly in\n",
+ "the AI-assisted decision-making process. Results from a between-subjects\n",
+ "experiment (N=293) showed that our CL exploitation strategies promoted more\n",
+ "appropriate human trust in AI, compared with only using AI confidence. We\n",
+ "further provided practical implications for more human-compatible AI-assisted\n",
+ "decision-making.\n",
+ "\n",
+ "2. Effect of Confidence and Explanation on Accuracy and Trust Calibration\n",
+ " in AI-Assisted Decision Making\n",
+ "Abstract: Today, AI is being increasingly used to help human experts make decisions in\n",
+ "high-stakes scenarios. In these scenarios, full automation is often\n",
+ "undesirable, not only due to the significance of the outcome, but also because\n",
+ "human experts can draw on their domain knowledge complementary to the model's\n",
+ "to ensure task success. We refer to these scenarios as AI-assisted decision\n",
+ "making, where the individual strengths of the human and the AI come together to\n",
+ "optimize the joint decision outcome. A key to their success is to appropriately\n",
+ "\\textit{calibrate} human trust in the AI on a case-by-case basis; knowing when\n",
+ "to trust or distrust the AI allows the human expert to appropriately apply\n",
+ "their knowledge, improving decision outcomes in cases where the model is likely\n",
+ "to perform poorly. This research conducts a case study of AI-assisted decision\n",
+ "making in which humans and AI have comparable performance alone, and explores\n",
+ "whether features that reveal case-specific model information can calibrate\n",
+ "trust and improve the joint performance of the human and AI. Specifically, we\n",
+ "study the effect of showing confidence score and local explanation for a\n",
+ "particular prediction. Through two human experiments, we show that confidence\n",
+ "score can help calibrate people's trust in an AI model, but trust calibration\n",
+ "alone is not sufficient to improve AI-assisted decision making, which may also\n",
+ "depend on whether the human can bring in enough unique knowledge to complement\n",
+ "the AI's errors. We also highlight the problems in using local explanation for\n",
+ "AI-assisted decision making scenarios and invite the research community to\n",
+ "explore new approaches to explainability for calibrating human trust in AI.\n",
+ "\n",
+ "3. Trust Explanations to Do What They Say\n",
+ "Abstract: How much are we to trust a decision made by an AI algorithm? Trusting an\n",
+ "algorithm without cause may lead to abuse, and mistrusting it may similarly\n",
+ "lead to disuse. Trust in an AI is only desirable if it is warranted; thus,\n",
+ "calibrating trust is critical to ensuring appropriate use. In the name of\n",
+ "calibrating trust appropriately, AI developers should provide contracts\n",
+ "specifying use cases in which an algorithm can and cannot be trusted. Automated\n",
+ "explanation of AI outputs is often touted as a method by which trust can be\n",
+ "built in the algorithm. However, automated explanations arise from algorithms\n",
+ "themselves, so trust in these explanations is similarly only desirable if it is\n",
+ "warranted. Developers of algorithms explaining AI outputs (xAI algorithms)\n",
+ "should provide similar contracts, which should specify use cases in which an\n",
+ "explanation can and cannot be trusted.\n",
+ "\n",
+ "4. A Systematic Literature Review of User Trust in AI-Enabled Systems: An\n",
+ " HCI Perspective\n",
+ "Abstract: User trust in Artificial Intelligence (AI) enabled systems has been\n",
+ "increasingly recognized and proven as a key element to fostering adoption. It\n",
+ "has been suggested that AI-enabled systems must go beyond technical-centric\n",
+ "approaches and towards embracing a more human centric approach, a core\n",
+ "principle of the human-computer interaction (HCI) field. This review aims to\n",
+ "provide an overview of the user trust definitions, influencing factors, and\n",
+ "measurement methods from 23 empirical studies to gather insight for future\n",
+ "technical and design strategies, research, and initiatives to calibrate the\n",
+ "user AI relationship. The findings confirm that there is more than one way to\n",
+ "define trust. Selecting the most appropriate trust definition to depict user\n",
+ "trust in a specific context should be the focus instead of comparing\n",
+ "definitions. User trust in AI-enabled systems is found to be influenced by\n",
+ "three main themes, namely socio-ethical considerations, technical and design\n",
+ "features, and user characteristics. User characteristics dominate the findings,\n",
+ "reinforcing the importance of user involvement from development through to\n",
+ "monitoring of AI enabled systems. In conclusion, user trust needs to be\n",
+ "addressed directly in every context where AI-enabled systems are being used or\n",
+ "discussed. In addition, calibrating the user-AI relationship requires finding\n",
+ "the optimal balance that works for not only the user but also the system.\n",
+ "\n",
+ "5. Trust Calibration and Trust Respect: A Method for Building Team Cohesion\n",
+ " in Human Robot Teams\n",
+ "Abstract: Recent advances in the areas of human-robot interaction (HRI) and robot\n",
+ "autonomy are changing the world. Today robots are used in a variety of\n",
+ "applications. People and robots work together in human autonomous teams (HATs)\n",
+ "to accomplish tasks that, separately, cannot be easily accomplished. Trust\n",
+ "between robots and humans in HATs is vital to task completion and effective\n",
+ "team cohesion. For optimal performance and safety of human operators in HRI,\n",
+ "human trust should be adjusted to the actual performance and reliability of the\n",
+ "robotic system. The cost of poor trust calibration in HRI, is at a minimum, low\n",
+ "performance, and at higher levels it causes human injury or critical task\n",
+ "failures. While the role of trust calibration is vital to team cohesion it is\n",
+ "also important for a robot to be able to assess whether or not a human is\n",
+ "exhibiting signs of mistrust due to some other factor such as anger,\n",
+ "distraction or frustration. In these situations the robot chooses not to\n",
+ "calibrate trust, instead the robot chooses to respect trust. The decision to\n",
+ "respect trust is determined by the robots knowledge of whether or not a human\n",
+ "should trust the robot based on its actions(successes and failures) and its\n",
+ "feedback to the human. We show that the feedback in the form of trust\n",
+ "calibration cues(TCCs) can effectively change the trust level in humans. This\n",
+ "information is potentially useful in aiding a robot it its decision to respect\n",
+ "trust.\n",
+ "\n",
+ "6. Trust Considerations for Explainable Robots: A Human Factors Perspective\n",
+ "Abstract: Recent advances in artificial intelligence (AI) and robotics have drawn\n",
+ "attention to the need for AI systems and robots to be understandable to human\n",
+ "users. The explainable AI (XAI) and explainable robots literature aims to\n",
+ "enhance human understanding and human-robot team performance by providing users\n",
+ "with necessary information about AI and robot behavior. Simultaneously, the\n",
+ "human factors literature has long addressed important considerations that\n",
+ "contribute to human performance, including human trust in autonomous systems.\n",
+ "In this paper, drawing from the human factors literature, we discuss three\n",
+ "important trust-related considerations for the design of explainable robot\n",
+ "systems: the bases of trust, trust calibration, and trust specificity. We\n",
+ "further detail existing and potential metrics for assessing trust in robotic\n",
+ "systems based on explanations provided by explainable robots.\n",
+ "\n",
+ "7. Experimental Investigation of Trust in Anthropomorphic Agents as Task\n",
+ " Partners\n",
+ "Abstract: This study investigated whether human trust in a social robot with\n",
+ "anthropomorphic physicality is similar to that in an AI agent or in a human in\n",
+ "order to clarify how anthropomorphic physicality influences human trust in an\n",
+ "agent. We conducted an online experiment using two types of cognitive tasks,\n",
+ "calculation and emotion recognition tasks, where participants answered after\n",
+ "referring to the answers of an AI agent, a human, or a social robot. During the\n",
+ "experiment, the participants rated their trust levels in their partners. As a\n",
+ "result, trust in the social robot was basically neither similar to that in the\n",
+ "AI agent nor in the human and instead settled between them. The results showed\n",
+ "a possibility that manipulating anthropomorphic features would help assist\n",
+ "human users in appropriately calibrating trust in an agent.\n",
+ "\n",
+ "8. Uncalibrated Models Can Improve Human-AI Collaboration\n",
+ "Abstract: In many practical applications of AI, an AI model is used as a decision aid\n",
+ "for human users. The AI provides advice that a human (sometimes) incorporates\n",
+ "into their decision-making process. The AI advice is often presented with some\n",
+ "measure of \"confidence\" that the human can use to calibrate how much they\n",
+ "depend on or trust the advice. In this paper, we present an initial exploration\n",
+ "that suggests showing AI models as more confident than they actually are, even\n",
+ "when the original AI is well-calibrated, can improve human-AI performance\n",
+ "(measured as the accuracy and confidence of the human's final prediction after\n",
+ "seeing the AI advice). We first train a model to predict human incorporation of\n",
+ "AI advice using data from thousands of human-AI interactions. This enables us\n",
+ "to explicitly estimate how to transform the AI's prediction confidence, making\n",
+ "the AI uncalibrated, in order to improve the final human prediction. We\n",
+ "empirically validate our results across four different tasks--dealing with\n",
+ "images, text and tabular data--involving hundreds of human participants. We\n",
+ "further support our findings with simulation analysis. Our findings suggest the\n",
+ "importance of jointly optimizing the human-AI system as opposed to the standard\n",
+ "paradigm of optimizing the AI model alone.\n",
+ "\n",
+ "9. Knowing About Knowing: An Illusion of Human Competence Can Hinder\n",
+ " Appropriate Reliance on AI Systems\n",
+ "Abstract: The dazzling promises of AI systems to augment humans in various tasks hinge\n",
+ "on whether humans can appropriately rely on them. Recent research has shown\n",
+ "that appropriate reliance is the key to achieving complementary team\n",
+ "performance in AI-assisted decision making. This paper addresses an\n",
+ "under-explored problem of whether the Dunning-Kruger Effect (DKE) among people\n",
+ "can hinder their appropriate reliance on AI systems. DKE is a metacognitive\n",
+ "bias due to which less-competent individuals overestimate their own skill and\n",
+ "performance. Through an empirical study (N = 249), we explored the impact of\n",
+ "DKE on human reliance on an AI system, and whether such effects can be\n",
+ "mitigated using a tutorial intervention that reveals the fallibility of AI\n",
+ "advice, and exploiting logic units-based explanations to improve user\n",
+ "understanding of AI advice. We found that participants who overestimate their\n",
+ "performance tend to exhibit under-reliance on AI systems, which hinders optimal\n",
+ "team performance. Logic units-based explanations did not help users in either\n",
+ "improving the calibration of their competence or facilitating appropriate\n",
+ "reliance. While the tutorial intervention was highly effective in helping users\n",
+ "calibrate their self-assessment and facilitating appropriate reliance among\n",
+ "participants with overestimated self-assessment, we found that it can\n",
+ "potentially hurt the appropriate reliance of participants with underestimated\n",
+ "self-assessment. Our work has broad implications on the design of methods to\n",
+ "tackle user cognitive biases while facilitating appropriate reliance on AI\n",
+ "systems. Our findings advance the current understanding of the role of\n",
+ "self-assessment in shaping trust and reliance in human-AI decision making. This\n",
+ "lays out promising future directions for relevant HCI research in this\n",
+ "community.\n",
+ "\n",
+ "10. A Turing Test for Transparency\n",
+ "Abstract: A central goal of explainable artificial intelligence (XAI) is to improve the\n",
+ "trust relationship in human-AI interaction. One assumption underlying research\n",
+ "in transparent AI systems is that explanations help to better assess\n",
+ "predictions of machine learning (ML) models, for instance by enabling humans to\n",
+ "identify wrong predictions more efficiently. Recent empirical evidence however\n",
+ "shows that explanations can have the opposite effect: When presenting\n",
+ "explanations of ML predictions humans often tend to trust ML predictions even\n",
+ "when these are wrong. Experimental evidence suggests that this effect can be\n",
+ "attributed to how intuitive, or human, an AI or explanation appears. This\n",
+ "effect challenges the very goal of XAI and implies that responsible usage of\n",
+ "transparent AI methods has to consider the ability of humans to distinguish\n",
+ "machine generated from human explanations. Here we propose a quantitative\n",
+ "metric for XAI methods based on Turing's imitation game, a Turing Test for\n",
+ "Transparency. A human interrogator is asked to judge whether an explanation was\n",
+ "generated by a human or by an XAI method. Explanations of XAI methods that can\n",
+ "not be detected by humans above chance performance in this binary\n",
+ "classification task are passing the test. Detecting such explanations is a\n",
+ "requirement for assessing and calibrating the trust relationship in human-AI\n",
+ "interaction. We present experimental results on a crowd-sourced text\n",
+ "classification task demonstrating that even for basic ML models and XAI\n",
+ "approaches most participants were not able to differentiate human from machine\n",
+ "generated explanations. We discuss ethical and practical implications of our\n",
+ "results for applications of transparent ML.\n",
+ "\n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "After analyzing the abstracts of the top 10 arxiv papers related to trust calibration in AI-based systems, I have identified the following application domains studied by these papers:\n",
+ "\n",
+ "1. AI-assisted decision-making: Papers 1, 2, and 9 focus on how humans can appropriately trust AI systems in decision-making processes, considering factors such as AI confidence, human competence, and the Dunning-Kruger Effect.\n",
+ "\n",
+ "2. Human-robot interaction (HRI) and human-autonomous teams (HATs): Papers 5 and 6 discuss trust calibration in the context of human-robot teams, focusing on team cohesion, trust calibration cues, and explainable robot systems.\n",
+ "\n",
+ "3. Explainable AI (XAI) and transparency: Papers 3, 8, and 10 explore the role of explanations and transparency in AI systems, addressing issues such as trust in explanations, uncalibrated models, and the Turing Test for Transparency.\n",
+ "\n",
+ "4. Human-computer interaction (HCI) perspective on user trust in AI-enabled systems: Paper 4 presents a systematic literature review of user trust in AI-enabled systems from an HCI perspective, discussing trust definitions, influencing factors, and measurement methods.\n",
+ "\n",
+ "5. Anthropomorphic agents and social robots: Paper 7 investigates the influence of anthropomorphic physicality on human trust in agents, comparing trust in AI agents, humans, and social robots.\n",
+ "\n",
+ "These application domains provide insights into various aspects of trust calibration in AI-based systems, including decision-making, human-robot interaction, explainable AI, and user trust from an HCI perspective.\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "TERMINATE\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n"
+ ]
+ }
+ ],
+ "source": [
+ "task2 = \"analyze the above the results to list the application domains studied by these papers \"\n",
+ "user_proxy.initiate_chat(assistant, message=task2, clear_history=False)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "Use this data to generate a bar chart of domains and number of papers in that domain and save to a file \n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "I will create a Python script that generates a bar chart of the application domains and the number of papers in each domain using the data we have analyzed. The chart will be saved as an image file. Please execute the following Python code:\n",
+ "\n",
+ "```python\n",
+ "import matplotlib.pyplot as plt\n",
+ "\n",
+ "domains = {\n",
+ " \"AI-assisted decision-making\": 3,\n",
+ " \"Human-robot interaction (HRI) and human-autonomous teams (HATs)\": 2,\n",
+ " \"Explainable AI (XAI) and transparency\": 3,\n",
+ " \"HCI perspective on user trust in AI-enabled systems\": 1,\n",
+ " \"Anthropomorphic agents and social robots\": 1,\n",
+ "}\n",
+ "\n",
+ "fig, ax = plt.subplots()\n",
+ "ax.bar(domains.keys(), domains.values())\n",
+ "plt.xticks(rotation=45, ha=\"right\")\n",
+ "plt.xlabel(\"Application Domains\")\n",
+ "plt.ylabel(\"Number of Papers\")\n",
+ "plt.title(\"Number of Papers per Application Domain\")\n",
+ "\n",
+ "plt.tight_layout()\n",
+ "plt.savefig(\"domains_bar_chart.png\")\n",
+ "plt.show()\n",
+ "```\n",
+ "\n",
+ "This code will generate a bar chart of the application domains and the number of papers in each domain, and save it as an image file named \"domains_bar_chart.png\" in the current working directory.\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[31m\n",
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "exitcode: 0 (execution succeeded)\n",
+ "Code output: \n",
+ "Figure(640x480)\n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "I'm glad the code executed successfully. The bar chart of the application domains and the number of papers in each domain has been generated and saved as an image file named \"domains_bar_chart.png\" in your current working directory. You can now view the chart by opening the image file.\n",
+ "\n",
+ "If you need further assistance, please let me know.\n",
+ "\n",
+ "TERMINATE\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n"
+ ]
+ }
+ ],
+ "source": [
+ "task3 = \"\"\"Use this data to generate a bar chart of domains and number of papers in that domain and save to a file \n",
+ "\"\"\"\n",
+ "user_proxy.initiate_chat(assistant, message=task3, clear_history=False)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# from IPython.display import Image\n",
+ "\n",
+ "# Image(filename='work_dir/domains_bar_chart.png')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Create Recipes\n",
+ "\n",
+ "The task has now been completed through a number of interactions, and the user does not want to repeat all of these steps in the future. What can the user do?\n",
+ "\n",
+ "A follow-up request can be made to create a reusable recipe."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "Reflect on the sequence and create a recipe containing all the steps \n",
+ "necessary and name for it. Suggest well-documented, generalized python function(s)\n",
+ " to perform similar tasks for coding steps in future. Make sure coding steps and \n",
+ " non-coding steps are never mixed in one function. In the docstr of the function(s),\n",
+ " clarify what non-coding steps are needed to use the language skill of the assistant.\n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "Based on the sequence of tasks performed, I propose the following recipe:\n",
+ "\n",
+ "**Recipe Name:** Analyzing and Visualizing Application Domains in Arxiv Papers\n",
+ "\n",
+ "**Steps:**\n",
+ "\n",
+ "1. Collect relevant papers from arxiv using a search query.\n",
+ "2. Analyze the abstracts of the collected papers to identify application domains.\n",
+ "3. Count the number of papers in each application domain.\n",
+ "4. Generate a bar chart of the application domains and the number of papers in each domain.\n",
+ "5. Save the bar chart as an image file.\n",
+ "\n",
+ "Here are the well-documented, generalized Python functions to perform the coding steps in the future:\n",
+ "\n",
+ "```python\n",
+ "import requests\n",
+ "import feedparser\n",
+ "import matplotlib.pyplot as plt\n",
+ "from typing import List, Dict\n",
+ "\n",
+ "def search_arxiv(query: str, max_results: int = 10) -> List[Dict[str, str]]:\n",
+ " \"\"\"\n",
+ " Search arxiv for papers related to a specific query.\n",
+ " \n",
+ " :param query: The search query for arxiv papers.\n",
+ " :param max_results: The maximum number of results to return. Default is 10.\n",
+ " :return: A list of dictionaries containing the title, link, and summary of each paper.\n",
+ " \"\"\"\n",
+ " base_url = \"http://export.arxiv.org/api/query?\"\n",
+ " search_query = f\"search_query=all:{query}\"\n",
+ " start = 0\n",
+ " max_results = f\"max_results={max_results}\"\n",
+ " url = f\"{base_url}{search_query}&start={start}&{max_results}\"\n",
+ " response = requests.get(url)\n",
+ " feed = feedparser.parse(response.content)\n",
+ " \n",
+ " papers = [{\"title\": entry.title, \"link\": entry.link, \"summary\": entry.summary} for entry in feed.entries]\n",
+ " return papers\n",
+ "\n",
+ "def generate_bar_chart(domains: Dict[str, int], output_file: str) -> None:\n",
+ " \"\"\"\n",
+ " Generate a bar chart of application domains and the number of papers in each domain, and save it as an image file.\n",
+ " \n",
+ " :param domains: A dictionary containing application domains as keys and the number of papers as values.\n",
+ " :param output_file: The name of the output image file.\n",
+ " \"\"\"\n",
+ " fig, ax = plt.subplots()\n",
+ " ax.bar(domains.keys(), domains.values())\n",
+ " plt.xticks(rotation=45, ha=\"right\")\n",
+ " plt.xlabel(\"Application Domains\")\n",
+ " plt.ylabel(\"Number of Papers\")\n",
+ " plt.title(\"Number of Papers per Application Domain\")\n",
+ "\n",
+ " plt.tight_layout()\n",
+ " plt.savefig(output_file)\n",
+ " plt.show()\n",
+ "```\n",
+ "\n",
+ "**Usage:**\n",
+ "\n",
+ "1. Use the `search_arxiv` function to collect relevant papers from arxiv using a search query.\n",
+ "2. Analyze the abstracts of the collected papers using your language skills to identify application domains and count the number of papers in each domain.\n",
+ "3. Use the `generate_bar_chart` function to generate a bar chart of the application domains and the number of papers in each domain, and save it as an image file.\n",
+ "\n",
+ "In the docstrings of the functions, I have clarified that non-coding steps, such as analyzing the abstracts of the collected papers, are needed to use the language skill of the assistant.\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[31m\n",
+ ">>>>>>>> EXECUTING CODE BLOCK 0 (inferred language is python)...\u001b[0m\n",
+ "\u001b[33muser_proxy\u001b[0m (to assistant):\n",
+ "\n",
+ "exitcode: 0 (execution succeeded)\n",
+ "Code output: \n",
+ "\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n",
+ "\u001b[33massistant\u001b[0m (to user_proxy):\n",
+ "\n",
+ "I'm glad you found the provided recipe and Python functions helpful. If you have any questions or need further assistance, please feel free to ask.\n",
+ "\n",
+ "TERMINATE\n",
+ "\n",
+ "--------------------------------------------------------------------------------\n"
+ ]
+ }
+ ],
+ "source": [
+ "task4 = \"\"\"Reflect on the sequence and create a recipe containing all the steps \n",
+ "necessary and name for it. Suggest well-documented, generalized python function(s)\n",
+ " to perform similar tasks for coding steps in future. Make sure coding steps and \n",
+ " non-coding steps are never mixed in one function. In the docstr of the function(s),\n",
+ " clarify what non-coding steps are needed to use the language skill of the assistant.\n",
+ "\"\"\"\n",
+ "user_proxy.initiate_chat(assistant, message=task4, clear_history=False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Reuse Recipes\n",
+ "\n",
+ "The user can apply the same recipe to similar tasks in the future.\n",
+ "\n",
+ "### Example Application"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# create an AssistantAgent instance named \"assistant\"\n",
+ "assistant = autogen.AssistantAgent(\n",
+ " name=\"assistant\",\n",
+ " llm_config=llm_config,\n",
+ " is_termination_msg=lambda x: \"TERMINATE\" in (x.get(\"content\") or \"\"),\n",
+ ")\n",
+ "# create a UserProxyAgent instance named \"user_proxy\"\n",
+ "user_proxy = autogen.UserProxyAgent(\n",
+ " name=\"user_proxy\",\n",
+ " human_input_mode=\"NEVER\",\n",
+ " is_termination_msg=lambda x: \"TERMINATE\" in (x.get(\"content\") or \"\"),\n",
+ " max_consecutive_auto_reply=10,\n",
+ " code_execution_config={\n",
+ " \"work_dir\": \"work_dir\",\n",
+ " \"use_docker\": False,\n",
+ " },\n",
+ ")\n",
+ "\n",
+ "task1 = '''\n",
+ "This recipe is available for you to reuse.\n",
+ "\n",
+ "\n",
+ "**Recipe Name:** Analyzing and Visualizing Application Domains in Arxiv Papers\n",
+ "\n",
+ "**Steps:**\n",
+ "\n",
+ "1. Collect relevant papers from arxiv using a search query.\n",
+ "2. Analyze the abstracts of the collected papers to identify application domains.\n",
+ "3. Count the number of papers in each application domain.\n",
+ "4. Generate a bar chart of the application domains and the number of papers in each domain.\n",
+ "5. Save the bar chart as an image file.\n",
+ "\n",
+ "Here are the well-documented, generalized Python functions to perform the coding steps in the future:\n",
+ "\n",
+ "```python\n",
+ "import requests\n",
+ "import feedparser\n",
+ "import matplotlib.pyplot as plt\n",
+ "from typing import List, Dict\n",
+ "\n",
+ "def search_arxiv(query: str, max_results: int = 10) -> List[Dict[str, str]]:\n",
+ " \"\"\"\n",
+ " Search arxiv for papers related to a specific query.\n",
+ " \n",
+ " :param query: The search query for arxiv papers.\n",
+ " :param max_results: The maximum number of results to return. Default is 10.\n",
+ " :return: A list of dictionaries containing the title, link, and summary of each paper.\n",
+ " \"\"\"\n",
+ " base_url = \"http://export.arxiv.org/api/query?\"\n",
+ " search_query = f\"search_query=all:{query}\"\n",
+ " start = 0\n",
+ " max_results = f\"max_results={max_results}\"\n",
+ " url = f\"{base_url}{search_query}&start={start}&{max_results}\"\n",
+ " response = requests.get(url)\n",
+ " feed = feedparser.parse(response.content)\n",
+ " \n",
+ " papers = [{\"title\": entry.title, \"link\": entry.link, \"summary\": entry.summary} for entry in feed.entries]\n",
+ " return papers\n",
+ "\n",
+ "def generate_bar_chart(domains: Dict[str, int], output_file: str) -> None:\n",
+ " \"\"\"\n",
+ " Generate a bar chart of application domains and the number of papers in each domain, and save it as an image file.\n",
+ " \n",
+ " :param domains: A dictionary containing application domains as keys and the number of papers as values.\n",
+ " :param output_file: The name of the output image file.\n",
+ " \"\"\"\n",
+ " fig, ax = plt.subplots()\n",
+ " ax.bar(domains.keys(), domains.values())\n",
+ " plt.xticks(rotation=45, ha=\"right\")\n",
+ " plt.xlabel(\"Application Domains\")\n",
+ " plt.ylabel(\"Number of Papers\")\n",
+ " plt.title(\"Number of Papers per Application Domain\")\n",
+ "\n",
+ " plt.tight_layout()\n",
+ " plt.savefig(output_file)\n",
+ " plt.show()\n",
+ "```\n",
+ "\n",
+ "**Usage:**\n",
+ "\n",
+ "1. Use the `search_arxiv` function to collect relevant papers from arxiv using a search query.\n",
+ "2. Analyze the abstracts of the collected papers using your language skills to identify application domains and count the number of papers in each domain.\n",
+ "3. Use the `generate_bar_chart` function to generate a bar chart of the application domains and the number of papers in each domain, and save it as an image file.\n",
+ "\n",
+ "\n",
+ "\n",
+ "\n",
+ "Here is a new task:\n",
+ "Plot a chart for application domains of GPT models\n",
+ "'''\n",
+ "\n",
+ "user_proxy.initiate_chat(assistant, message=task1)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "flaml-eval",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.17"
+ },
+ "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/notebook/autogen_agentchat_two_users.ipynb b/notebook/agentchat_two_users.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_two_users.ipynb
rename to notebook/agentchat_two_users.ipynb
index dba644ca9..1c20a58b1 100644
--- a/notebook/autogen_agentchat_two_users.ipynb
+++ b/notebook/agentchat_two_users.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_agentchat_web_info.ipynb b/notebook/agentchat_web_info.ipynb
similarity index 99%
rename from notebook/autogen_agentchat_web_info.ipynb
rename to notebook/agentchat_web_info.ipynb
index d8cd9c21f..6ec0b0671 100644
--- a/notebook/autogen_agentchat_web_info.ipynb
+++ b/notebook/agentchat_web_info.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_chatgpt_gpt4.ipynb b/notebook/oai_chatgpt_gpt4.ipynb
similarity index 99%
rename from notebook/autogen_chatgpt_gpt4.ipynb
rename to notebook/oai_chatgpt_gpt4.ipynb
index db07b23af..c3f09943e 100644
--- a/notebook/autogen_chatgpt_gpt4.ipynb
+++ b/notebook/oai_chatgpt_gpt4.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/notebook/autogen_openai_completion.ipynb b/notebook/oai_completion.ipynb
similarity index 99%
rename from notebook/autogen_openai_completion.ipynb
rename to notebook/oai_completion.ipynb
index 5ffde3cf0..01cd77a49 100644
--- a/notebook/autogen_openai_completion.ipynb
+++ b/notebook/oai_completion.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "
"
+ "
"
]
},
{
diff --git a/test/test_notebook.py b/test/test_notebook.py
index 880249cdf..9d50f7325 100644
--- a/test/test_notebook.py
+++ b/test/test_notebook.py
@@ -48,45 +48,45 @@ def run_notebook(input_nb, output_nb="executed_openai_notebook.ipynb", save=Fals
skip or not sys.version.startswith("3.10"),
reason="do not run if openai is not installed or py!=3.10",
)
-def test_autogen_agentchat_auto_feedback_from_code(save=False):
- run_notebook("autogen_agentchat_auto_feedback_from_code_execution.ipynb", save=save)
+def test_agentchat_auto_feedback_from_code(save=False):
+ run_notebook("agentchat_auto_feedback_from_code_execution.ipynb", save=save)
@pytest.mark.skipif(
skip or not sys.version.startswith("3.10"),
reason="do not run if openai is not installed or py!=3.10",
)
-def test_autogen_openai_completion(save=False):
- run_notebook("autogen_openai_completion.ipynb", save=save)
+def test_openai_completion(save=False):
+ run_notebook("oai_completion.ipynb", save=save)
@pytest.mark.skipif(
skip or not sys.version.startswith("3.10"),
reason="do not run if openai is not installed or py!=3.10",
)
-def test_autogen_agentchat_function_call(save=False):
- run_notebook("autogen_agentchat_function_call.ipynb", save=save)
+def test_agentchat_function_call(save=False):
+ run_notebook("agentchat_function_call.ipynb", save=save)
@pytest.mark.skipif(
skip or not sys.version.startswith("3.10"),
reason="do not run if openai is not installed or py!=3.10",
)
-def test_autogen_agentchat_MathChat(save=False):
- run_notebook("autogen_agentchat_MathChat.ipynb", save=save)
+def test_agentchat_MathChat(save=False):
+ run_notebook("agentchat_MathChat.ipynb", save=save)
@pytest.mark.skipif(
skip or not sys.version.startswith("3.11"),
reason="do not run if openai is not installed or py!=3.11",
)
-def test_autogen_chatgpt_gpt4(save=False):
- run_notebook("autogen_chatgpt_gpt4.ipynb", save=save)
+def test_chatgpt_gpt4(save=False):
+ run_notebook("oai_chatgpt_gpt4.ipynb", save=save)
if __name__ == "__main__":
- test_autogen_agentchat_auto_feedback_from_code(save=True)
- # test_autogen_chatgpt_gpt4(save=True)
- # test_autogen_openai_completion(save=True)
- # test_autogen_agentchat_MathChat(save=True)
- # test_autogen_agentchat_function_call(save=True)
+ test_agentchat_auto_feedback_from_code(save=True)
+ # test_chatgpt_gpt4(save=True)
+ # test_openai_completion(save=True)
+ # test_agentchat_MathChat(save=True)
+ # test_agentchat_function_call(save=True)
diff --git a/website/blog/2023-06-28-MathChat/index.mdx b/website/blog/2023-06-28-MathChat/index.mdx
index 986beed4e..cb2bcf296 100644
--- a/website/blog/2023-06-28-MathChat/index.mdx
+++ b/website/blog/2023-06-28-MathChat/index.mdx
@@ -75,7 +75,7 @@ We found that compared to basic prompting, which demonstrates the innate capabil
For categories like Algebra and Prealgebra, PoT and PS showed little improvement, and in some instances, even led to a decrease in accuracy. However, MathChat was able to enhance total accuracy by around 6% compared to PoT and PS, showing competitive performance across all categories. Remarkably, MathChat improved accuracy in the Algebra category by about 15% over other methods. Note that categories like Intermediate Algebra and Precalculus remained challenging for all methods, with only about 20% of problems solved accurately.
The code for experiments can be found at this [repository](https://github.com/kevin666aa/FLAML/tree/gpt_math_solver/flaml/autogen/math).
-We now provide an implementation of MathChat using the interactive agents in AutoGen. See this [notebook](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_MathChat.ipynb) for example usage.
+We now provide an implementation of MathChat using the interactive agents in AutoGen. See this [notebook](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_MathChat.ipynb) for example usage.
## Future Directions
diff --git a/website/docs/Examples/AutoGen-AgentChat.md b/website/docs/Examples/AutoGen-AgentChat.md
index e34d3cf70..a41cd2f25 100644
--- a/website/docs/Examples/AutoGen-AgentChat.md
+++ b/website/docs/Examples/AutoGen-AgentChat.md
@@ -4,14 +4,15 @@ AutoGen offers conversable agents powered by LLM, tool or human, which can be us
Please find documentation about this feature [here](/docs/Use-Cases/agent_chat).
Links to notebook examples:
-* [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)
-* [Auto Code Generation, Execution, Debugging and Human Feedback](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_human_feedback.ipynb)
-* [Solve Tasks Requiring Web Info](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_web_info.ipynb)
-* [Use Provided Tools as Functions](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_function_call.ipynb)
-* [Automated Task Solving with Coding & Planning Agents](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_planning.ipynb)
-* [Automated Task Solving with GPT-4 + Multiple Human Users](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_two_users.ipynb)
-* [Automated Chess Game Playing & Chitchatting by GPT-4 Agents](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_chess.ipynb)
-* [Automated Task Solving by Group Chat (with 3 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_groupchat.ipynb)
-* [Automated Data Visualization by Group Chat (with 3 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_groupchat_vis.ipynb)
-* [Automated Complex Task Solving by Group Chat (with 6 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_groupchat_research.ipynb)
-* [Automated Continual Learning from New Data](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_stream.ipynb)
+* [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)
+* [Auto Code Generation, Execution, Debugging and Human Feedback](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_human_feedback.ipynb)
+* [Solve Tasks Requiring Web Info](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_web_info.ipynb)
+* [Use Provided Tools as Functions](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_function_call.ipynb)
+* [Automated Task Solving with Coding & Planning Agents](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_planning.ipynb)
+* [Automated Task Solving with GPT-4 + Multiple Human Users](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_two_users.ipynb)
+* [Automated Chess Game Playing & Chitchatting by GPT-4 Agents](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_chess.ipynb)
+* [Automated Task Solving by Group Chat (with 3 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat.ipynb)
+* [Automated Data Visualization by Group Chat (with 3 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat_vis.ipynb)
+* [Automated Complex Task Solving by Group Chat (with 6 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat_research.ipynb)
+* [Automated Continual Learning from New Data](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_stream.ipynb)
+* [Teach Agents New Skills & Reuse via Automated Chat](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_teaching.ipynb)
diff --git a/website/docs/Examples/AutoGen-OpenAI.md b/website/docs/Examples/AutoGen-OpenAI.md
index c44b77018..d68504a1c 100644
--- a/website/docs/Examples/AutoGen-OpenAI.md
+++ b/website/docs/Examples/AutoGen-OpenAI.md
@@ -4,5 +4,5 @@ AutoGen also offers a cost-effective hyperparameter optimization technique [EcoO
Please find documentation about this feature [here](/docs/Use-Cases/enhanced_inference).
Links to notebook examples:
-* [Optimize for Code Generation](https://github.com/microsoft/autogen/blob/main/notebook/autogen_openai_completion.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/autogen_openai_completion.ipynb)
-* [Optimize for Math](https://github.com/microsoft/autogen/blob/main/notebook/autogen_chatgpt_gpt4.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/autogen_chatgpt_gpt4.ipynb)
+* [Optimize for Code Generation](https://github.com/microsoft/autogen/blob/main/notebook/oai_completion.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/oai_completion.ipynb)
+* [Optimize for Math](https://github.com/microsoft/autogen/blob/main/notebook/oai_chatgpt_gpt4.ipynb) | [Open in colab](https://colab.research.google.com/github/microsoft/autogen/blob/main/notebook/oai_chatgpt_gpt4.ipynb)
diff --git a/website/docs/Use-Cases/agent_chat.md b/website/docs/Use-Cases/agent_chat.md
index 231d96f9f..9da5c8ec1 100644
--- a/website/docs/Use-Cases/agent_chat.md
+++ b/website/docs/Use-Cases/agent_chat.md
@@ -69,29 +69,29 @@ On the one hand, one can achieve fully autonomous conversations after an initial
By adopting the conversation-driven control with both programming language and natural language, AutoGen inherently allows dynamic conversation. Dynamic conversation allows the agent topology to change depending on the actual flow of conversation under different input problem instances, while the flow of a static conversation always follows a pre-defined topology. The dynamic conversation pattern is useful in complex applications where the patterns of interaction cannot be predetermined in advance. AutoGen provides two general approaches to achieving dynamic conversation:
-- Registered auto-reply. With the pluggable auto-reply function, one can choose to invoke conversations with other agents depending on the content of the current message and context. A working system demonstrating this type of dynamic conversation can be found in this code example, demonstrating a [dynamic group chat](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_groupchat.ipynb). In the system, we register an auto-reply function in the group chat manager, which lets LLM decide who will the next speaker be in a group chat setting.
+- Registered auto-reply. With the pluggable auto-reply function, one can choose to invoke conversations with other agents depending on the content of the current message and context. A working system demonstrating this type of dynamic conversation can be found in this code example of a [dynamic group chat](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat.ipynb). In the system, we register an auto-reply function in the group chat manager, which lets the LLM decide who the next speaker will be in a group chat setting.
-- LLM-based function call. In this approach, LLM decides whether or not to call a particular function depending on the conversation status in each inference call.
-By messaging additional agents in the called functions, the LLM can drive dynamic multi-agent conversation. A working system showcasing this type of dynamic conversation can be found in the [multi-user math problem solving scenario](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_two_users.ipynb), where a student assistant would automatically resort to an expert using function calls.
+- LLM-based function call. In this approach, the LLM decides whether to call a particular function depending on the conversation status in each inference call.
+By messaging additional agents in the called functions, the LLM can drive dynamic multi-agent conversation. A working system showcasing this type of dynamic conversation can be found in the [multi-user math problem solving scenario](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_two_users.ipynb), where a student assistant automatically consults an expert through function calls.
-### Diverse Applications Implemented with AutoGen
+### Diverse Applications Implemented with AutoGen
The figure below shows six examples of applications built using AutoGen.

-* [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_auto_feedback_from_code_execution.ipynb)
-* [Auto Code Generation, Execution, Debugging and Human Feedback](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_human_feedback.ipynb)
-* [Solve Tasks Requiring Web Info](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_web_info.ipynb)
-* [Use Provided Tools as Functions](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_function_call.ipynb)
-* [Automated Task Solving with Coding & Planning Agents](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_planning.ipynb)
-* [Automated Task Solving with GPT-4 + Multiple Human Users](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_two_users.ipynb)
-* [Automated Chess Game Playing & Chitchatting by GPT-4 Agents](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_chess.ipynb)
-* [Automated Task Solving by Group Chat (with 3 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_groupchat.ipynb)
-* [Automated Data Visualization by Group Chat (with 3 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_groupchat_vis.ipynb)
-* [Automated Complex Task Solving by Group Chat (with 6 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_groupchat_research.ipynb)
-* [Automated Continual Learning from New Data](https://github.com/microsoft/autogen/blob/main/notebook/autogen_agentchat_stream.ipynb)
-
+* [Automated Task Solving with Code Generation, Execution & Debugging](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_auto_feedback_from_code_execution.ipynb)
+* [Auto Code Generation, Execution, Debugging and Human Feedback](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_human_feedback.ipynb)
+* [Solve Tasks Requiring Web Info](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_web_info.ipynb)
+* [Use Provided Tools as Functions](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_function_call.ipynb)
+* [Automated Task Solving with Coding & Planning Agents](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_planning.ipynb)
+* [Automated Task Solving with GPT-4 + Multiple Human Users](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_two_users.ipynb)
+* [Automated Chess Game Playing & Chitchatting by GPT-4 Agents](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_chess.ipynb)
+* [Automated Task Solving by Group Chat (with 3 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat.ipynb)
+* [Automated Data Visualization by Group Chat (with 3 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat_vis.ipynb)
+* [Automated Complex Task Solving by Group Chat (with 6 group member agents and 1 manager agent)](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_groupchat_research.ipynb)
+* [Automated Continual Learning from New Data](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_stream.ipynb)
+* [Teach Agents New Skills & Reuse via Automated Chat](https://github.com/microsoft/autogen/blob/main/notebook/agentchat_teaching.ipynb)
@@ -101,4 +101,4 @@ The figure below shows six examples of applications built using AutoGen.
* [AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework](https://arxiv.org/abs/2308.08155). Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Shaokun Zhang, Erkang Zhu, Beibin Li, Li Jiang, Xiaoyun Zhang and Chi Wang. ArXiv 2023.
-* [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337). Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2306.01337 (2023).
\ No newline at end of file
+* [An Empirical Study on Challenging Math Problem Solving with GPT-4](https://arxiv.org/abs/2306.01337). Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang. ArXiv preprint arXiv:2306.01337 (2023).
diff --git a/website/docs/Use-Cases/enhanced_inference.md b/website/docs/Use-Cases/enhanced_inference.md
index 5de73003c..8bf27661d 100644
--- a/website/docs/Use-Cases/enhanced_inference.md
+++ b/website/docs/Use-Cases/enhanced_inference.md
@@ -6,8 +6,8 @@ There are a number of benefits of using `autogen` to perform inference: performa
## Tune Inference Parameters
*Links to notebook examples:*
-* [Optimize for Code Generation](https://github.com/microsoft/autogen/blob/main/notebook/autogen_openai_completion.ipynb)
-* [Optimize for Math](https://github.com/microsoft/autogen/blob/main/notebook/autogen_chatgpt_gpt4.ipynb)
+* [Optimize for Code Generation](https://github.com/microsoft/autogen/blob/main/notebook/oai_completion.ipynb)
+* [Optimize for Math](https://github.com/microsoft/autogen/blob/main/notebook/oai_chatgpt_gpt4.ipynb)
### Choices to optimize
@@ -158,7 +158,7 @@ response = autogen.Completion.create(
It will try querying Azure OpenAI gpt-4, OpenAI gpt-3.5-turbo, and a locally hosted llama-7B one by one, ignoring AuthenticationError, RateLimitError and Timeout,
until a valid result is returned. This can speed up the development process where the rate limit is a bottleneck. An error will be raised if the last choice fails. So make sure the last choice in the list has the best availability.
-For convenience, we provide a number of utility functions to load config lists, such as `config_list_from_json`.
+For convenience, we provide a number of utility functions to load config lists, such as [`config_list_from_json`](/docs/reference/oai/openai_utils#config_list_from_json).
### Logic error
@@ -364,4 +364,3 @@ Set `compact=False` in `start_logging()` to switch.
```
It can be seen that the individual API call history contains redundant information of the conversation. For a long conversation the degree of redundancy is high.
The compact history is more efficient and the individual API call history contains more details.
-