KAG Example: TwoWiki

2WikiMultiHopQA is a multi-hop QA dataset designed for comprehensive evaluation of reasoning steps. Both KAG and HippoRAG use it to evaluate multi-hop question answering performance.

Here we demonstrate how to build a knowledge graph for the 2WikiMultiHopQA dataset, generate answers to its evaluation questions with KAG, and calculate the EM (exact match) and F1 metrics of the KAG-generated answers against the ground-truth answers.
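For readers unfamiliar with the two metrics, the following is a minimal sketch of how EM and token-level F1 are commonly computed for QA benchmarks. It is an illustration only: the normalization rules (lowercasing, stripping punctuation and English articles) and the function names are assumptions borrowed from common SQuAD-style evaluation, and may differ from what evaFor2wiki.py actually does.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumed normalization (SQuAD-style); not taken from the KAG source.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    if not pred_tokens or not truth_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == truth_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction of "Paris, France" against a ground truth of "Paris" scores 0 on EM but receives partial credit on F1, which is why the two metrics are usually reported together.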

1. Prerequisites

Please refer to Quick Start to install KAG and its dependency OpenSPG server, and learn about using KAG in developer mode.

2. Steps to reproduce

Step 1: Enter the example directory

cd kag/examples/2wiki

Step 2: Configure models

Update the generative model configurations openie_llm and chat_llm and the representational model configuration vectorize_model in kag_config.yaml.

You need to fill in a valid api_key for each model. If your model providers or model names differ from the default values, you also need to update base_url and model.

Step 3: Project initialization

Initialize the project with the following command.

knext project restore --host_addr http://127.0.0.1:8887 --proj_path .

Step 4: Commit the schema

Execute the following command to commit the schema TwoWiki.schema.

knext schema commit

Step 5: Build the knowledge graph

Execute indexer.py in the builder directory to build the knowledge graph.

cd builder && python indexer.py && cd ..

Step 6: Execute the QA tasks

Execute evaFor2wiki.py in the solver directory to generate the answers and calculate the EM and F1 metrics.

cd solver && python evaFor2wiki.py && cd ..

The generated answers are saved to ./solver/2wiki_res_*.json.

The calculated EM and F1 metrics are saved to ./solver/2wiki_metrics_*.json.
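Because the output file names contain a wildcard portion, a small helper can locate the most recent file. The sketch below is one possible way to do that using only the standard library. The helper name load_latest is hypothetical, and the code makes no assumption about the JSON structure inside the files; it simply pretty-prints whatever it finds.

```python
import glob
import json
import os


def load_latest(pattern: str):
    """Return (path, parsed JSON) for the newest file matching pattern, or None."""
    paths = glob.glob(pattern)
    if not paths:
        return None
    # Pick the file with the most recent modification time.
    newest = max(paths, key=os.path.getmtime)
    with open(newest, encoding="utf-8") as f:
        return newest, json.load(f)


if __name__ == "__main__":
    # Run from the example directory (kag/examples/2wiki).
    found = load_latest("./solver/2wiki_metrics_*.json")
    if found is None:
        print("No metrics file found; run Step 6 first.")
    else:
        path, metrics = found
        print(path)
        print(json.dumps(metrics, indent=2, ensure_ascii=False))
```

The same helper works for the answers file by passing "./solver/2wiki_res_*.json" as the pattern.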

Step 7: (Optional) Cleanup

To delete the checkpoints, execute the following command.

rm -rf ./builder/ckpt
rm -rf ./solver/ckpt

To delete the KAG project and its knowledge graph, execute a command like the following, replacing the OpenSPG server address and KAG project id with your actual values.

curl "http://127.0.0.1:8887/project/api/delete?projectId=1"
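If you script the cleanup, the server address and project id can be substituted programmatically. The following is a minimal sketch that only builds the URL shown in the command above; the helper name build_delete_url is hypothetical, and the endpoint path and projectId parameter are copied from that command rather than from any API reference.

```python
from urllib.parse import urlencode


def build_delete_url(host_addr: str, project_id: int) -> str:
    """Build the project deletion URL for a given OpenSPG server and project id."""
    query = urlencode({"projectId": project_id})
    # Strip a trailing slash so the path is not doubled.
    return f"{host_addr.rstrip('/')}/project/api/delete?{query}"


if __name__ == "__main__":
    # Example values from this README; replace with your own.
    print(build_delete_url("http://127.0.0.1:8887", 1))
```

The printed URL can then be passed to curl in quotes, or requested with any HTTP client. Deletion is irreversible, so double-check the project id first.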

Step 8: (Optional) Try the larger datasets

To try the larger datasets, modify indexer.py and evaFor2wiki.py accordingly and repeat the procedure from Step 1.