autogen/notebook/oai_openai_utils.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# In-depth Guide to OpenAI Utility Functions\n",
    "\n",
    "Managing API configurations can be tricky, especially when dealing with multiple models and API versions. The provided utility functions assist users in managing these configurations effectively. Ensure your API keys and other sensitive data are stored securely. You might store keys in `.txt` or `.env` files or environment variables for local development. Never expose your API keys publicly. If you insist on storing your key files locally on your repo (you shouldn't), ensure the key file path is added to the `.gitignore` file.\n",
    "\n",
    "#### Steps:\n",
    "1. Obtain API keys from OpenAI and optionally from Azure OpenAI (or other provider).\n",
    "2. Store them securely using either:\n",
    "    - Environment Variables: `export OPENAI_API_KEY='your-key'` in your shell.\n",
    "    - Text File: Save the key in a `key_openai.txt` file.\n",
    "    - Env File: Save the key to a `.env` file eg: `OPENAI_API_KEY=sk-********************`\n",
    "\n",
    "---\n",
    "\n",
    "**TL;DR:** <br>\n",
    "There are many ways to generate a `config_list` depending on your use case:\n",
    "\n",
    "- `get_config_list`: Generates configurations for API calls, primarily from provided API keys.\n",
    "- `config_list_openai_aoai`: Constructs a list of configurations using both Azure OpenAI and OpenAI endpoints, sourcing API keys from environment variables or local files.\n",
    "- `config_list_from_json`: Loads configurations from a JSON structure, either from an environment variable or a local JSON file, with the flexibility of filtering configurations based on given criteria.\n",
    "- `config_list_from_models`: Creates configurations based on a provided list of models, useful when targeting specific models without manually specifying each configuration.\n",
    "- `config_list_from_dotenv`: Constructs a configuration list from a `.env` file, offering a consolidated way to manage multiple API configurations and keys from a single file."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### What is a `config_list`?\n",
    "When instantiating an assistant, such as the example below, it is passed a `config_list`. This is used to tell the `AssistantAgent` which models or configurations it has access to:\n",
    "```python\n",
    "\n",
    "assistant = AssistantAgent(\n",
    "    name=\"assistant\",\n",
    "    llm_config={\n",
    "        \"request_timeout\": 600,\n",
    "        \"seed\": 42,\n",
    "        \"config_list\": config_list,\n",
    "        \"temperature\": 0,\n",
    "    },\n",
    ")\n",
    "```\n",
    "\n",
    "Consider an intelligent assistant that utilizes OpenAI's GPT models. Depending on user requests, it might need to:\n",
    "\n",
    "- Generate creative content (using gpt-4).\n",
    "- Answer general queries (using gpt-3.5-turbo).\n",
    "\n",
    "Different tasks may require different models, and the `config_list` aids in dynamically selecting the appropriate model configuration, managing API keys, endpoints, and versions for efficient operation of the intelligent assistant. In summary, the `config_list` helps the agents work efficiently, reliably, and optimally by managing various configurations and interactions with the OpenAI API - enhancing the adaptability and functionality of the agents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# ! pip install pyautogen"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import autogen "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## get_config_list\n",
    "\n",
    "Used to generate configurations for API calls."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "api_keys = [\"YOUR_OPENAI_API_KEY\"]\n",
    "api_bases = None  # You can specify API base URLs if needed. eg: localhost:8000\n",
    "api_type = \"openai\"  # Type of API, e.g., \"openai\" or \"aoai\".\n",
    "api_version = None  # Specify API version if needed.\n",
    "\n",
    "config_list = autogen.get_config_list(\n",
    "    api_keys,\n",
    "    api_bases=api_bases,\n",
    "    api_type=api_type,\n",
    "    api_version=api_version\n",
    ")\n",
    "\n",
    "print(config_list)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## config_list_openai_aoai\n",
    "\n",
    "This method creates a list of configurations using Azure OpenAI endpoints and OpenAI endpoints. It tries to extract API keys and bases from environment variables or local text files.\n",
    "\n",
    "Steps:\n",
    "- Store OpenAI API key in:\n",
    "    - Environment variable: `OPENAI_API_KEY`\n",
    "    - or Local file: `key_openai.txt`\n",
    "- Store Azure OpenAI API key in:\n",
    "    - Environment variable: `AZURE_OPENAI_API_KEY`\n",
    "    - or Local file: `key_aoai.txt` (Supports multiple keys, one per line)\n",
    "- Store Azure OpenAI API base in:\n",
    "    - Environment variable: `AZURE_OPENAI_API_BASE`\n",
    "    - or Local file: `base_aoai.txt` (Supports multiple bases, one per line)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "config_list = autogen.config_list_openai_aoai(\n",
    "    key_file_path=\".\",\n",
    "    openai_api_key_file=\"key_openai.txt\",\n",
    "    aoai_api_key_file=\"key_aoai.txt\",\n",
    "    aoai_api_base_file=\"base_aoai.txt\",\n",
    "    exclude=None # The API type to exclude, eg: \"openai\" or \"aoai\".\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## config_list_from_json\n",
    "\n",
    "This method loads configurations from an environment variable or a JSON file. It provides flexibility by allowing users to filter configurations based on certain criteria.\n",
    "\n",
    "Steps:\n",
    "- Setup the JSON Configuration:\n",
    "    1. Store configurations in an environment variable named `OAI_CONFIG_LIST` as a valid JSON string.\n",
    "    2. Alternatively, save configurations in a local JSON file named `OAI_CONFIG_LIST.json`\n",
    "    3. Add `OAI_CONFIG_LIST` to your `.gitignore` file on your local repository.\n",
    "\n",
    "Your JSON structure should look something like this:\n",
    "\n",
    "```json\n",
    "# OAI_CONFIG_LIST file example\n",
    "[\n",
    "    {\n",
    "        \"model\": \"gpt-4\",\n",
    "        \"api_key\": \"YOUR_OPENAI_API_KEY\"\n",
    "    },\n",
    "    {\n",
    "        \"model\": \"gpt-3.5-turbo\",\n",
    "        \"api_key\": \"YOUR_OPENAI_API_KEY\",\n",
    "        \"api_version\": \"2023-03-01-preview\"\n",
    "    }\n",
    "]\n",
    "\n",
    "```\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "config_list = autogen.config_list_from_json(\n",
    "    env_or_file=\"OAI_CONFIG_LIST\",  # or OAI_CONFIG_LIST.json if file extension is added\n",
    "    filter_dict={\n",
    "        \"model\": {\n",
    "            \"gpt-4\",\n",
    "            \"gpt-3.5-turbo\",\n",
    "        }\n",
    "    }\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### What is `filter_dict`?\n",
    "\n",
    "The z parameter in `autogen.config_list_from_json` function is used to selectively filter the configurations loaded from the environment variable or JSON file based on specified criteria. It allows you to define criteria to select only those configurations that match the defined conditions.\n",
    "\n",
    "let's say you want to configure an assistant agent to only LLM type. Take the below example: even though we have \"gpt-3.5-turbo\" and \"gpt-4\" in our `OAI_CONFIG_LIST`, this agent would only be configured to use"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cheap_config_list = autogen.config_list_from_json(\n",
    "    env_or_file=\"OAI_CONFIG_LIST\",  \n",
    "    filter_dict={\n",
    "        \"model\": {\n",
    "            \"gpt-3.5-turbo\",\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "costly_config_list = autogen.config_list_from_json(\n",
    "    env_or_file=\"OAI_CONFIG_LIST\", \n",
    "    filter_dict={\n",
    "        \"model\": {\n",
    "            \"gpt-4\",\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "# Assistant using GPT 3.5 Turbo\n",
    "assistant_one = autogen.AssistantAgent(\n",
    "    name=\"3.5-assistant\",\n",
    "    llm_config={\n",
    "        \"request_timeout\": 600,\n",
    "        \"seed\": 42,\n",
    "        \"config_list\": cheap_config_list,\n",
    "        \"temperature\": 0,\n",
    "    },\n",
    ")\n",
    "\n",
    "# Assistant using GPT 4\n",
    "assistant_two = autogen.AssistantAgent(\n",
    "    name=\"4-assistant\",\n",
    "    llm_config={\n",
    "        \"request_timeout\": 600,\n",
    "        \"seed\": 42,\n",
    "        \"config_list\": costly_config_list,\n",
    "        \"temperature\": 0,\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the `OAI_CONFIG_LIST` we set earlier, there isn't much to filter on. But when the complexity of a project grows and you're managing multiple models for various purposes, you can see how `filter_dict` can be useful. \n",
    "\n",
    "A more complex filtering criteria could be the following: Assuming we have an `OAI_CONFIG_LIST` several models used to create various agents - Let's say we want to load configurations for `gpt-4` using API version `\"2023-03-01-preview\"` and we want the `api_type` to be `aoai`, we can set up `filter_dict` as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "config_list = autogen.config_list_from_json(\n",
    "    env_or_file=\"OAI_CONFIG_LIST\",\n",
    "    filter_dict = {\n",
    "        \"model\": {\n",
    "            \"gpt-4\"\n",
    "        },\n",
    "        \"api_version\": {\n",
    "            \"2023-03-01-preview\"\n",
    "        },\n",
    "        \"api_type\": \n",
    "        [\"aoai\"]\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## config_list_from_models\n",
    "\n",
    "This method creates configurations based on a provided list of models. It's useful when you have specific models in mind and don't want to manually specify each configuration. The [`config_list_from_models`](/docs/reference/oai/openai_utils#config_list_from_models) function tries to create a list of configurations using Azure OpenAI endpoints and OpenAI endpoints for the provided list of models. It assumes the api keys and api bases are stored in the corresponding environment variables or local txt files. It's okay to only have the OpenAI API key, OR only the Azure OpenAI API key + base. For Azure the model name refers to the OpenAI Studio deployment name.\n",
    "\n",
    "Steps:\n",
    "- Similar to method 1, store API keys and bases either in environment variables or `.txt` files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "config_list = autogen.config_list_from_models(\n",
    "    key_file_path = \".\",\n",
    "    openai_api_key_file = \"key_openai.txt\",\n",
    "    aoai_api_key_file = \"key_aoai.txt\",\n",
    "    aoai_api_base_file = \"base_aoai.txt\",\n",
    "    exclude=\"aoai\",\n",
    "    model_list = None,\n",
    "    model_list=[\"gpt-4\", \"gpt-3.5-turbo\", \"gpt-3.5-turbo-16k\"],\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## config_list_from_dotenv\n",
    "\n",
    "If you are interested in keeping all of your keys in a single location like a `.env` file rather than using a configuration specifically for OpenAI, you can use `config_list_from_dotenv`. This allows you to conveniently create a config list without creating a complex `OAI_CONFIG_LIST` file.\n",
    "\n",
    "The `model_api_key_map` parameter is a dictionary that maps model names to the environment variable names in the `.env` file where their respective API keys are stored. It lets the code know which API key to use for each model. \n",
    "\n",
    "If not provided, it defaults to using `OPENAI_API_KEY` for `gpt-4` and `OPENAI_API_KEY` for `gpt-3.5-turbo`.\n",
    "\n",
    "```python\n",
    "    # default key map\n",
    "    model_api_key_map = {\n",
    "        \"gpt-4\": \"OPENAI_API_KEY\",\n",
    "        \"gpt-3.5-turbo\": \"OPENAI_API_KEY\",\n",
    "    }\n",
    "```\n",
    "\n",
    "Here is an example `.env` file:\n",
    "\n",
    "```bash\n",
    "OPENAI_API_KEY=sk-*********************\n",
    "HUGGING_FACE_API_KEY=**************************\n",
    "ANOTHER_API_KEY=1234567890234567890\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'api_key': 'sk-*********************', 'model': 'gpt-4'},\n",
       " {'api_key': 'sk-*********************', 'model': 'gpt-3.5-turbo'}]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import autogen\n",
    "\n",
    "config_list = autogen.config_list_from_dotenv(\n",
    "    dotenv_file_path='.env', # If None the function will try to find in the working directory\n",
    "    filter_dict={\n",
    "        \"model\": {\n",
    "            \"gpt-4\",\n",
    "            \"gpt-3.5-turbo\",\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "config_list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'api_key': '1234567890234567890', 'model': 'gpt-4'},\n",
       " {'api_key': 'sk-*********************', 'model': 'gpt-3.5-turbo'}]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# gpt-3.5-turbo will default to OPENAI_API_KEY\n",
    "config_list = autogen.config_list_from_dotenv(\n",
    "    dotenv_file_path='.env', # If None the function will try to find in the working directory\n",
    "    model_api_key_map={\n",
    "        \"gpt-4\": \"ANOTHER_API_KEY\",  # String or dict accepted\n",
    "    },\n",
    "    filter_dict={\n",
    "        \"model\": {\n",
    "            \"gpt-4\",\n",
    "            \"gpt-3.5-turbo\",\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "config_list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'api_key': 'sk-*********************', 'model': 'gpt-4'},\n",
       " {'api_key': '**************************', 'model': 'vicuna'}]"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# example using different environment variable names\n",
    "config_list = autogen.config_list_from_dotenv(\n",
    "    dotenv_file_path='.env',\n",
    "    model_api_key_map={\n",
    "        \"gpt-4\": \"OPENAI_API_KEY\",\n",
    "        \"vicuna\": \"HUGGING_FACE_API_KEY\",\n",
    "    },\n",
    "    filter_dict={\n",
    "        \"model\": {\n",
    "            \"gpt-4\",\n",
    "            \"vicuna\",\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "config_list"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also provide additional configurations for APIs, simply by replacing the string value with a dictionary expanding on the configurations. See the example below showing the example of using `gpt-4` on `openai` by default, and using `gpt-3.5-turbo` with additional configurations for `aoai`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'api_key': 'sk-*********************', 'model': 'gpt-4'},\n",
       " {'api_key': '1234567890234567890',\n",
       "  'api_base': 'https://api.someotherapi.com',\n",
       "  'api_type': 'aoai',\n",
       "  'api_version': 'v2',\n",
       "  'model': 'gpt-3.5-turbo'}]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "config_list = autogen.config_list_from_dotenv(\n",
    "    dotenv_file_path='.env',\n",
    "    model_api_key_map={\n",
    "        \"gpt-4\": \"OPENAI_API_KEY\",\n",
    "        \"gpt-3.5-turbo\": {\n",
    "            \"api_key_env_var\": \"ANOTHER_API_KEY\",\n",
    "            \"api_type\": \"aoai\",\n",
    "            \"api_version\": \"v2\",\n",
    "            \"api_base\": \"https://api.someotherapi.com\"\n",
    "        }\n",
    "    },\n",
    "    filter_dict={\n",
    "        \"model\": {\n",
    "            \"gpt-4\",\n",
    "            \"gpt-3.5-turbo\",\n",
    "        }\n",
    "    }\n",
    ")\n",
    "\n",
    "config_list"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "masterclass",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}