mirror of https://github.com/OpenSPG/KAG
feat(examples): output qfs evaluation results as json and markdown (#240)
* fix vectorize_model configuration key typo
* fix permissions of data files
* fix examples README.md inconsistency
* output qfs evaluation results as json and markdown
* format summarization_metrics.py with black
This commit is contained in:
parent
75b3447097
commit
fb15dcec26

@@ -21,7 +21,7 @@ cd kag/examples/2wiki
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representive model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representational model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.

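The same rename recurs in the example READMEs below. As a quick check after editing, here is a minimal sketch (assumptions: PyYAML is available, the three sections are top-level keys of kag_config.yaml, and an unset ``api_key`` is empty) that verifies the renamed key is the one in use:

```python
# Minimal sketch: confirm kag_config.yaml uses the renamed key
# ``vectorize_model`` (not the old ``vectorizer_model``) and that each
# model section has an api_key filled in. Assumes PyYAML is installed
# and that the sections are top-level keys, as the README excerpts suggest.
import yaml

with open("kag_config.yaml", encoding="utf-8") as f:
    config = yaml.safe_load(f)

assert "vectorizer_model" not in config, "old key: rename to vectorize_model"
for name in ("openie_llm", "chat_llm", "vectorize_model"):
    section = config.get(name) or {}
    assert section.get("api_key"), f"{name}: api_key is not set"
print("model configuration looks complete")
```
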
@@ -21,7 +21,7 @@ cd kag/examples/2wiki
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.

@@ -152,7 +152,7 @@ kag_solver_pipeline:
 
 #------------kag-solver configuration end----------------#
 ```
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representive model configuration ``vectorizer_model`` in the configuration file.
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representational model configuration ``vectorize_model`` in the configuration file.
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.

@@ -203,7 +203,7 @@ cd kag/examples/TwoWikiTest
 
 #### Step 2: Edit project configuration
 
-**Note**: The embedding vectors generated by different representation models can vary significantly. It is recommended not to update the ``vectorizer_model`` configuration after the project is created. If you need to update the ``vectorizer_model`` configuration, please create a new project.
+**Note**: The embedding vectors generated by different representation models can vary significantly. It is recommended not to update the ``vectorize_model`` configuration after the project is created. If you need to update the ``vectorize_model`` configuration, please create a new project.
 
 ```bash
 vim ./kag_config.yaml

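The note is about comparability: vectors already written to the index are only meaningful against query vectors produced by the same embedding model. A toy illustration of the failure mode, using random vectors as stand-ins for real embeddings:

```python
# Toy illustration: cosine similarity is only meaningful when index and
# query vectors come from the same embedding model. Random vectors stand
# in for embeddings here; dimensions from different models may not even match.
import numpy as np

rng = np.random.default_rng(0)
indexed = rng.normal(size=1024)                       # written by model A at build time
query_same = indexed + 0.05 * rng.normal(size=1024)   # model A again at query time
query_other = rng.normal(size=1024)                   # a different model's vector space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(indexed, query_same))    # close to 1.0: same embedding space
print(cosine(indexed, query_other))   # near 0: retrieval quality collapses
```
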
@@ -152,7 +152,7 @@ kag_solver_pipeline:
 
 #------------kag-solver configuration end----------------#
 ```
 
-You need to update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorizer_model`` in it.
+You need to update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorize_model`` in it.
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.

@@ -203,7 +203,7 @@ cd kag/examples/TwoWikiTest
 
 #### Step 2: Edit project configuration
 
-**Note**: The embedding vectors generated by different representation models differ significantly. It is recommended not to update the ``vectorizer_model`` configuration after the project is created; if you need to update the ``vectorizer_model`` configuration, please create a new project.
+**Note**: The embedding vectors generated by different representation models differ significantly. It is recommended not to update the ``vectorize_model`` configuration after the project is created; if you need to update the ``vectorize_model`` configuration, please create a new project.
 
 ```bash
 vim ./kag_config.yaml

@@ -17,7 +17,7 @@ cd kag/examples/baike
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representive model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representational model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.

@@ -17,7 +17,7 @@ cd kag/examples/baike
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.

@@ -1,3 +1,5 @@
 ckpt/
 /cs.jsonl
 /solver/data/csqa_kag_answers.json
+/solver/csqa_qfs_res_*.json
+/solver/csqa_qfs_res_*.md

@@ -29,7 +29,7 @@ python generate_data.py
 
 ### Step 3: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representive model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representational model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.

@@ -29,7 +29,7 @@ python generate_data.py
 
 ### Step 3: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.

@@ -48,6 +48,10 @@ class SummarizationMetricsEvaluator(object):
                 "groundtruth_answer": " ".join(x["answers"]),
                 "kag_answer": x["kag_answer"],
                 "lightrag_answer": y["lightrag_answer"],
+                "context": x.get("context", ""),
+                "meta": x.get("meta"),
+                "kag_trace_log": x.get("kag_trace_log", ""),
+                "lightrag_context": y.get("lightrag_context", ""),
             }
             result.append(item)
         return result

@@ -118,12 +122,116 @@ class SummarizationMetricsEvaluator(object):
         ) / 2
         return average_metrics
 
-    def run(self):
-        metrics = self._compute_summarization_metrics()
-        reverse_metrics = self._compute_reverse_summarization_metrics()
-        average_metrics = self._compute_average_summarization_metrics(
-            metrics, reverse_metrics
+    def _save_evaluation_responses(self, metrics, reverse_metrics):
+        responses = metrics["responses"]
+        reverse_responses = reverse_metrics["responses"]
+        for item, response, reverse_response in zip(
+            self._questions_and_answers, responses, reverse_responses
+        ):
+            item["response"] = response
+            item["reverse_response"] = reverse_response
+
+    def _format_winner(self, score1, score2, *, is_reversed):
+        if score1 > score2:
+            return "KAG" if not is_reversed else "LightRAG"
+        if score1 < score2:
+            return "LightRAG" if not is_reversed else "KAG"
+        return "None"
+
+    def _format_description(self, description, *, is_reversed):
+        if not is_reversed:
+            description = description.replace("Answer 1", "KAG")
+            description = description.replace("Answer 2", "LightRAG")
+        else:
+            description = description.replace("Answer 1", "LightRAG")
+            description = description.replace("Answer 2", "KAG")
+        return description
+
+    def _format_evaluation_response(self, r, *, is_reversed):
+        if r is None:
+            return "None"
+        all_keys = "Comprehensiveness", "Diversity", "Empowerment", "Overall"
+        string = ""
+        for index, key in enumerate(all_keys):
+            if index > 0:
+                string += "\n\n"
+            string += "**%s**" % key
+            string += "\n\n%s Score: %d" % (
+                "KAG" if not is_reversed else "LightRAG",
+                r[key]["Score 1"],
+            )
+            string += "\n\n%s Score: %d" % (
+                "LightRAG" if not is_reversed else "KAG",
+                r[key]["Score 2"],
+            )
+            string += "\n\nWinner: %s" % self._format_winner(
+                r[key]["Score 1"], r[key]["Score 2"], is_reversed=is_reversed
+            )
+            string += "\n\nExplanation: %s" % self._format_description(
+                r[key]["Explanation"], is_reversed=is_reversed
+            )
+        return string
+
+    def _format_question(self, item):
+        string = item["question"]
+        string += "\n" + "=" * 80
+        string += "\n\nground truth"
+        string += "\n" + "-" * 80
+        string += "\n" + item["groundtruth_answer"]
+        string += "\n\nKAG"
+        string += "\n" + "-" * 80
+        string += "\n" + item["kag_answer"]
+        string += "\n\nLightRAG"
+        string += "\n" + "-" * 80
+        string += "\n" + item["lightrag_answer"]
+        string += "\n\n%s evaluation" % self._evaluator_kwargs["model"]
+        string += ": KAG vs LightRAG"
+        string += "\n" + "-" * 80
+        string += "\n" + self._format_evaluation_response(
+            item["response"], is_reversed=False
+        )
+        string += "\n\n%s evaluation" % self._evaluator_kwargs["model"]
+        string += ": LightRAG vs KAG"
+        string += "\n" + "-" * 80
+        string += "\n" + self._format_evaluation_response(
+            item["reverse_response"], is_reversed=True
+        )
+        return string
+
+    def _format_questions(self):
+        string = ""
+        for index, item in enumerate(self._questions_and_answers):
+            if index > 0:
+                string += "\n\n"
+            string += self._format_question(item)
+        return string
+
+    def _save_evaluation_results(self, metrics, reverse_metrics, average_metrics):
+        import io
+        import os
+        import json
+        import time
+
+        data = {
+            "metricses": {
+                "Metrics: KAG vs LightRAG": metrics["average_metrics"],
+                "Metrics: LightRAG vs KAG": reverse_metrics["average_metrics"],
+                "Average: KAG vs LightRAG": average_metrics["average_metrics"],
+            },
+            "questions": self._questions_and_answers,
+        }
+        start_time = time.time()
+        dir_path = os.path.dirname(os.path.abspath(__file__))
+        file_path = os.path.join(dir_path, f"csqa_qfs_res_{start_time}.json")
+        with io.open(file_path, "w", encoding="utf-8", newline="\n") as fout:
+            json.dump(data, fout, separators=(",", ": "), indent=4, ensure_ascii=False)
+            print(file=fout)
+        file_path = os.path.join(dir_path, f"csqa_qfs_res_{start_time}.md")
+        with io.open(file_path, "w", encoding="utf-8", newline="\n") as fout:
+            string = self._format_questions()
+            print(string, file=fout)
+
     def _print_evaluation_results(self, metrics, reverse_metrics, average_metrics):
         all_keys = "Comprehensiveness", "Diversity", "Empowerment", "Overall"
         all_items = "Score 1", "Score 2"
         titles = (

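The forward and reverse passes exist to cancel the judge's position bias: each pair of answers is scored once as "KAG vs LightRAG" and once swapped, and ``_format_winner`` / ``_format_description`` map the judge's "Answer 1"/"Answer 2" slots back to system names. A standalone sketch of that remapping, with hypothetical scores:

```python
# Standalone sketch of the slot-to-system remapping done by _format_winner.
# The judge only sees "Answer 1" and "Answer 2"; is_reversed records which
# system occupied slot 1, so the reported winner is named correctly.
def winner(score1, score2, *, is_reversed):
    if score1 > score2:
        return "KAG" if not is_reversed else "LightRAG"
    if score1 < score2:
        return "LightRAG" if not is_reversed else "KAG"
    return "None"

# Hypothetical scores for the same question, forward and reversed order:
print(winner(8, 6, is_reversed=False))  # KAG held slot 1 -> "KAG"
print(winner(8, 6, is_reversed=True))   # LightRAG held slot 1 -> "LightRAG"
```
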
@@ -150,6 +258,16 @@ class SummarizationMetricsEvaluator(object):
             string += " %.2f" % metrics["average_metrics"][key][item]
         print(string)
 
+    def run(self):
+        metrics = self._compute_summarization_metrics()
+        reverse_metrics = self._compute_reverse_summarization_metrics()
+        self._save_evaluation_responses(metrics, reverse_metrics)
+        average_metrics = self._compute_average_summarization_metrics(
+            metrics, reverse_metrics
+        )
+        self._save_evaluation_results(metrics, reverse_metrics, average_metrics)
+        self._print_evaluation_results(metrics, reverse_metrics, average_metrics)
+
 
 def main():
     evaluator = SummarizationMetricsEvaluator()

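Each ``run()`` now leaves a timestamped ``csqa_qfs_res_<timestamp>.json`` and ``.md`` beside the script, which is what the new .gitignore entries above cover. A small sketch of consuming the JSON; the ``metricses`` and ``questions`` keys follow ``_save_evaluation_results`` above, while the glob pattern assumes the script's directory is the working directory:

```python
# Sketch: load the most recent csqa_qfs_res_*.json produced by
# _save_evaluation_results and print the three aggregate metric tables.
import glob
import json
import os

paths = sorted(glob.glob("csqa_qfs_res_*.json"), key=os.path.getmtime)
if paths:
    with open(paths[-1], encoding="utf-8") as fin:
        data = json.load(fin)
    for title, table in data["metricses"].items():
        print(title)
        for key, scores in table.items():
            print("    %s: %s" % (key, scores))
    print("questions evaluated:", len(data["questions"]))
```
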
@@ -21,7 +21,7 @@ cd kag/examples/hotpotqa
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representive model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representational model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.

@@ -21,7 +21,7 @@ cd kag/examples/hotpotqa
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.

@@ -21,7 +21,7 @@ cd kag/examples/medicine
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representive model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representational model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.

@@ -21,7 +21,7 @@ cd kag/examples/medicine
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.

@@ -21,7 +21,7 @@ cd kag/examples/musique
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representive model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representational model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.

@@ -21,7 +21,7 @@ cd kag/examples/musique
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorizer_model`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.

@@ -23,12 +23,10 @@ cd kag/examples/riskmining
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representational model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.
 
-Since the representive model is not used in this example, you can retain the default configuration for the representative model ``vectorizer_model``.
-
 ### Step 3: Project initialization
 
 Initiate the project with the following command.

@@ -21,12 +21,10 @@ cd kag/examples/riskmining
 
 ### Step 2: Configure models
 
-Update the generative model configurations ``openie_llm`` and ``chat_llm`` in [kag_config.yaml](./kag_config.yaml).
+Update the generative model configurations ``openie_llm`` and ``chat_llm`` and the representation model configuration ``vectorize_model`` in [kag_config.yaml](./kag_config.yaml).
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.
 
-Since the representation model is not used in this example, you can keep the default configuration for the representation model ``vectorizer_model``.
-
 ### Step 3: Initialize the project
 
 First, initialize the project.

@@ -41,7 +41,7 @@ Update the generative model configurations ``openie_llm`` and ``chat_llm`` in [k
 
 You need to fill in correct ``api_key``s. If your model providers and model names are different from the default values, you also need to update ``base_url`` and ``model``.
 
-Since the representive model is not used in this example, you can retain the default configuration for the representative model ``vectorizer_model``.
+Since the representational model is not used in this example, you can retain the default configuration for the representative model ``vectorize_model``.
 
 #### Step 3: Project initialization

@@ -43,7 +43,7 @@ cd kag/examples/supplychain
 
 You need to set a correct ``api_key``. If the model provider and model name you use differ from the defaults, you also need to update ``base_url`` and ``model``.
 
-Since the representation model is not used in this example, you can keep the default configuration for the representation model ``vectorizer_model``.
+Since the representation model is not used in this example, you can keep the default configuration for the representation model ``vectorize_model``.
 
 #### Step 3: Initialize the project