Commit Graph

149 Commits

Author SHA1 Message Date
田常@蚂蚁 4fbdd1515b
feat(kag): update readme (#462)
* update readme

* update readme
2025-04-18 15:40:49 +08:00
田常@蚂蚁 7c7910ab67
update readme (#460) 2025-04-18 10:05:27 +08:00
zhuzhongshu123 13cea5f6fe
feat(kag): update to v0.7 (#456)
* add think cost

* update csv scanner

* add final rerank

* add reasoner

* add iterative planner

* fix dpr search

* fix dpr search

* add reference data

* move odps import

* update requirement.txt

* update 2wiki

* add missing file

* fix markdown reader

* add iterative planning

* update version

* update runner

* update 2wiki example

* update bridge

* merge solver and solver_new

* add cur day

* writer delete

* update multi process

* add missing files

* fix report

* add chunk retrieved executor

* update try in stream runner result

* add path

* add math executor

* update hotpotqa example

* remove log

* fix python coder solver

* update hotpotqa example

* fix python coder solver

* update config

* fix bad

* add log

* remove unused code

* commit with task thought

* move kag model to common

* add default chat llm

* fix

* use static planner

* support chunk graph node

* add args

* support naive rag

* llm client support tool calls

* add default async

* add openai

* fix result

* fix markdown reader

* fix thinker

* update asyncio interface

* feat(solver): add mcp support (#444)

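* upload mcp client related code

* 1. complete an mcp client call flow, from pipeline through planner and executor
2. allow multiple mcp_server entries in the json, with the LLM choosing which to invoke
3. get baidu_map_mcp working

* 1. schema

* bugfix: remove redundant code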

---------

Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>

* fix affairqa after solver refactor

* fix affairqa after solver refactor

* fix readme

* add params

* update version

* update mcp executor

* update mcp executor

* solver add mcp executor

* add missing file

* add mcp executor

* add executor

* x

* update

* fix requirement

* fix main llm config

* fix solver

* bugfix: fix invoke function call logic

* chg eva

* update example

* add kag layer

* add step task

* support dot refresh

* support dot refresh

* support dot refresh

* support dot refresh

* add retrieved num

* add retrieved num

* add pipelineconf

* update ppr

* update musique prompts

* update

* add to_dict for BuilderComponentData

* async build

* add deduce prompt

* add deduce prompt

* add deduce prompt

* fix reader

* add deduce prompt

* add page thinker report

* modify prompt

* add step status

* add self cognition

* add self cognition

* add memory graph storage

* add now time

* update memory config

* add now time

* chg graph loader

* add prqa dataset and code

* bugfix: fix prqa call logic

* optimize: improve code logic, normalize generated answers

* add retry py code

* update memory graph

* update memory graph

* fix

* fix ner

* add with_out_refer generator prompt

* fix

* close ckpt

* fix query

* fix query

* update version

* add llm checker

* add llm checker

* 1. upload evalutor.py and change the gold_answer.json format
2. optimize code logic
3. update the README.md file

* update exp

* update exp

* rerank support

* add static rewrite query

* recall more chunks

* fix graph load

* add static rewrite query

* fix bugs

* add finish check

* add finish check

* add finish check

* add finish check

* 1. upload the results of evalutor.py
2. optimize code logic and improve the readme file

* add lf retry

* add memory graph api

* fix reader api

* add ner

* add metrics

* fix bug

* remove ner

* add reraise for retry

* add edge prop to memory graph

* add memory graph

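* 1. correct the evaluation dataset results
2. optimize the evaluator.py code
3. delete questions that have an answer in gold_answer but no result

* delete evaluation result files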

* fix knext host addr

* async eva

* add lf prompt

* add lf prompt

* add config

* add retry

* add unknown check

* add rc result

* add rc result

* add rc result

* add rc result

* modify code logic to follow the kag pipeline format; tests pass

* bugfix: remove redundant code

* fix report prompt

* bugfix: trigger retry mechanism

* bugfix: Chinese punctuation error

* fix rethinker prompt

* update version to 0.6.2b78

* update version

* 1. modify evaluator.py to compute accuracy with an LLM, in line with the latest call logic
2. modify the prompt so that unanswered results are re-tested

* update affairqa for evaluate

* update affairqa for evaluate

* bugfix: correct dataset

* bugfix: correct dataset

* bugfix: correct dataset

* fix name conflict

* bugfix: delete erroneous questions

* bugfix: wrong file name caused evaluator failure

* update for affairqa eval

* bugfix: modify code to keep evaluate logic consistent

* x

* update for affairqa readme

* remove temp eval scripts

* bugfix for math deduce

* merge 0.6.2_dev

* merge 0.6.2_dev

* fix

* update client addr

* updated version

* update for affairqa eval

* evaUtils supports Chinese

* fix affairqa eval:

* remove unused example

* update kag config

* fix default value

* update readme

* fix init

* update comments and add descriptions for some classes

* update example config

* Tc 0.7.0 (#459)

* commit affairQA code

* fix affairqa eval

---------

Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>

* fix all examples

* reformat

---------

Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: wanxingyu.wxy <wanxingyu.wxy@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
2025-04-17 17:23:52 +08:00
bingchu c6b107ce56
fix(knext): set token in request of write_graph (#409)
* fix(knext): set token in request of write_graph

* refine code
2025-03-12 10:13:03 +08:00
Xinhong Zhang 31b895a3fa
fix(knext): project update addr (#408)
* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* use config default
2025-03-11 17:29:15 +08:00
Andy bf46bbbd4d
core_team #andy (#389) 2025-03-03 19:13:44 +08:00
royzhao 5d12979694
fix(solver): bugfix SPO Retrieval LLM response parse (#378)
* bugfix official_name node has same prop object

* reformat by black

* adapter spo retrieval llm response
2025-02-27 11:28:01 +08:00
royzhao 6a16df3565
fix(builder): bugfix official_name node has same prop object (#372)
* bugfix official_name node has same prop object

* reformat by black
2025-02-25 18:16:08 +08:00
Xinhong Zhang 8d51e66d6a
fix(builder): fix pdf reader for normalizing text in outline (#344)
* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* fix pdf reader

* fix pdf reader

* fix pdf reader
2025-02-17 14:02:39 +08:00
xueguanwen daa536fb3f
Update baike kag_config.yaml (#339)
The class referenced by default_chunk_retriever is incorrect; fix it.
2025-02-11 11:01:47 +08:00
hy89 b02553243d
fix the error when the stream parameter is True (#336) 2025-02-11 11:01:26 +08:00
luzizhuo 4a40479e6b
Add Discord link and wechat qr code. (#338)
* Add qr code

* Update README.md

Add discord and how to join the wechat group.
2025-02-08 21:15:56 +08:00
royzhao bd0c3ec92e
fix empty data generate (#319) 2025-01-22 14:16:45 +08:00
zhuzhongshu123 3348dfeaa4
use json repair for llm client (#312) 2025-01-21 11:20:26 +08:00
royzhao cdf0ea3933
add retry (#306) 2025-01-20 14:14:10 +08:00
zhuzhongshu123 1e57016373
disable entity linking in postprocess by default (#304) 2025-01-20 11:19:44 +08:00
zhuzhongshu123 4ad5bded26
delete checkpoint of postprocess (#302) 2025-01-18 12:05:31 +08:00
zhuzhongshu123 7666ca40dd
feat(kag): catch unexpected exceptions (#298)
* x (#280)

* feat(bridge): spg server bridge (#283)

* x

* bridge add solver

* x

* feat(bridge): Spg server bridge check (#285)

* x

* bridge add solver

* x

* add invoke

* feat(common): llm client catch exception (#294)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* feat(solver): catch chunk retriever exception (#297)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* feat(common):llm except (#299)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* print llm invoke error info

* with except

* feat(common): force raise except (#300)

* x

* bridge add solver

* x

* add invoke

* llm client catch error

* catch exception

* print llm invoke error info

* with except

* force raise except
2025-01-17 17:11:51 +08:00
zhuzhongshu123 deae277510
feat(bridge): spg server bridge supports config check and run solver (#287)
* x

* x (#280)

* bridge add solver

* x

* feat(bridge): spg server bridge (#283)

* x

* bridge add solver

* x

* add invoke

* llm client catch error
2025-01-17 13:52:00 +08:00
zhuzhongshu123 ca31351971
support custom kag config file (#279) 2025-01-15 18:10:55 +08:00
zhuzhongshu123 a40980a294
fix(examples): fix qa file name (#251) 2025-01-14 20:18:38 +08:00
Xinhong Zhang 248b22520f
fix(builder): fix markdown reader for id (#273)
* fix builder init

* add pro commit

* rename graphalgoclient to graphclient

* first fix
2025-01-14 14:36:41 +08:00
joseosvaldo16 6494fd20c0
feat(builder): add Azure Open AI Compatibility (#269)
* feat(llm): add Azure OpenAI client and vectorization support

* chore: add .DS_Store to .gitignore

* refactor(llm):add description for api_version and default value

* refactor(vectorize_model): added description for api_version and default values for some params

* refactor(openai_model): enhance docstring for Azure AD token and deployment parameters
2025-01-14 12:57:43 +08:00
zhuzhongshu123 671a9a016c
fix mix reader (#270) 2025-01-14 10:18:38 +08:00
Andy c2056ef2f6
update(kag) Update README (#264)
* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update README #andy

* update README #andy
2025-01-13 10:57:10 +08:00
Andy 724a026b15
update(kag) Update README (#258)
* update README #andy

* update README #andy

* update README #andy

* update README #andy
2025-01-10 17:38:23 +08:00
zhuzhongshu123 e1fccef44c
chore(examples): domain KG inject example (#249)
* add timeout param for llm and embedding model

* add example

* fix title
2025-01-09 17:14:51 +08:00
xionghuaidong fb15dcec26
feat(examples): output qfs evaluation results as json and markdown (#240)
* fix vectorize_model configuration key typo

* fix permissions of data files

* fix examples README.md inconsistency

* output qfs evaluation results as json and markdown

* format summarization_metrics.py with black
2025-01-08 16:52:31 +08:00
田常@蚂蚁 75b3447097
Update README.md (#237) 2025-01-08 16:10:10 +08:00
田常@蚂蚁 9cf076de79
Update README_cn.md (#238) 2025-01-08 16:04:31 +08:00
zhuzhongshu123 1bf129c1f6
add timeout param for llm and embedding model (#236) 2025-01-08 15:58:38 +08:00
zhuzhongshu123 1131dbbb28
fix ollama register name (#234) 2025-01-08 11:35:22 +08:00
zhuzhongshu123 57c5aad42a
fix(knext): fix knext client (#230)
* fix knext client

* x
2025-01-07 19:34:26 +08:00
zhuzhongshu123 35b1135dc4
fix knext client (#229) 2025-01-07 19:23:51 +08:00
royzhao 38a85ec334
fix(KAG): change level log (#227)
* change log level to debug

* fix(example): fix vectorize model config in example (#220)

* fix vectorize model config

* remove ak

* remove ak

* x

* fix knext env (#223)

* fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222)

* reduce warn (#225)

* change log level to debug

---------

Co-authored-by: zhuzhongshu123 <152354526+zhuzhongshu123@users.noreply.github.com>
Co-authored-by: Xinhong Zhang <zhangxinhong.zxh@antgroup.com>
2025-01-07 17:19:49 +08:00
田常@蚂蚁 416fa8a6c4
update(kag) update log level (#226)
* update default yaml and corpus

* update log level to debug
2025-01-07 16:45:03 +08:00
zhuzhongshu123 16767f7e84
reduce warn (#225) 2025-01-07 16:39:29 +08:00
yangman a56d8a88a4
fix 'ModuleNotFoundError: No module named xxx' error in Windows (#222) 2025-01-07 16:06:23 +08:00
Xinhong Zhang 838c161e61
fix knext env (#223) 2025-01-07 16:01:06 +08:00
royzhao 564fd964f6
change log level to debug (#221) 2025-01-07 15:43:04 +08:00
zhuzhongshu123 f141596981
fix(example): fix vectorize model config in example (#220)
* fix vectorize model config

* remove ak

* remove ak

* x
2025-01-07 15:35:00 +08:00
xionghuaidong ca6a35ebbb
fix examples README.md to match quick start doc (#218) 2025-01-07 14:34:33 +08:00
Xinhong Zhang b415c4d933
fix(knext): fix knext project env (#211)
* fix create project

* fix create project

* fix create project

* fix create project
2025-01-07 14:28:31 +08:00
田常@蚂蚁 9b11539a42
update default yaml and corpus (#217) 2025-01-07 13:51:58 +08:00
xionghuaidong f1671e8419
move more images for kag examples to _static/images (#216) 2025-01-07 11:23:15 +08:00
xionghuaidong 3264744b93
move images for kag examples to _static/images (#214) 2025-01-07 10:55:25 +08:00
xionghuaidong cde3495ca6
docs(examples): finish README.md for the examples directory (#210)
* add data introduction for example supplychain

* add schema modeling for example supplychain

* add kg construction for example supplychain

* add kg query for example supplychain

* finish README_cn.md for the supplychain example

* finish README.md for the supplychain example

* add README.md for the riskmining example

* add README.md for the medicine example

* update README.md of hotpotqa, 2wiki and musique

* add README_cn.md for hotpotqa, 2wiki and musique

* update README.md of csqa

* add README_cn.md for csqa

* update link targets to 0.6 version of the docs

* add README.md for baike

* reformat Python code in examples

* add README_cn.md for examples

* finish README.md for examples

* fix typo in README.md for examples
2025-01-06 20:03:29 +08:00
Xinhong Zhang 87c29a98e2
fix create project (#208) 2025-01-06 15:30:31 +08:00
xionghuaidong 93b92c7629
docs(examples): finish README.md for builtin kag examples (#207)
* add data introduction for example supplychain

* add schema modeling for example supplychain

* add kg construction for example supplychain

* add kg query for example supplychain

* finish README_cn.md for the supplychain example

* finish README.md for the supplychain example

* add README.md for the riskmining example

* add README.md for the medicine example

* update README.md of hotpotqa, 2wiki and musique

* add README_cn.md for hotpotqa, 2wiki and musique

* update README.md of csqa

* add README_cn.md for csqa

* update link targets to 0.6 version of the docs

* add README.md for baike

* reformat Python code in examples
2025-01-06 11:14:02 +08:00
zhuzhongshu123 f5ad5f1101
refactor(all): kag v0.6 (#174)
* add path find

* fix find path

* spg guided relation extraction

* fix dict parse with same key

* rename graphalgoclient to graphclient

* rename graphalgoclient to graphclient

* file reader supports http url

* add checkpointer class

* parser supports checkpoint

* add build

* remove incorrect logs

* remove logs

* update examples

* update chain checkpointer

* vectorizer batch size set to 32

* add a zodb backended checkpointer

* add a zodb backended checkpointer

* fix zodb based checkpointer

* add thread for zodb IO

* fix(common): resolve multithread conflict in zodb IO

* fix(common): load existing zodb checkpoints

* update examples

* update examples

* fix zodb writer

* add docstring

* fix jieba version mismatch

* commit kag_config-tc.yaml

1. rename type to register_name
2. give register_name a unique and specific name
3. rename reader to scanner
4. rename parser to reader
5. rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6. rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7. pre-define llm & vectorize_model and refer to them in the yaml file

Issues to be resolved:
1. examples of event extract & spg extract
2. statistics of the indexer, such as numbers of nodes & edges extracted and ratio of llm invocations.
3. Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4. conf of solver needs to be re-examined.

* commit kag_config-tc.yaml

1. rename type to register_name
2. give register_name a unique and specific name
3. rename reader to scanner
4. rename parser to reader
5. rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6. rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7. pre-define llm & vectorize_model and refer to them in the yaml file

Issues to be resolved:
1. examples of event extract & spg extract
2. statistics of the indexer, such as numbers of nodes & edges extracted and ratio of llm invocations.
3. Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4. conf of solver needs to be re-examined.

* 1. fix bug in base_table_splitter

* 1. fix bug in base_table_splitter

* 1. fix bug in default_chain

* add solver

* add kag

* update outline splitter

* add main test

* add op

* code refactor

* add tools

* fix outline splitter

* fix outline prompt

* graph api pass

* commit with page rank

* add search api and graph api

* add markdown report

* fix vectorizer num batch compute

* add retry for vectorize model call

* update markdown reader

* update markdown reader

* update pdf reader

* raise extractor failure

* add default expr

* add log

* merge jc reader features

* rm import

* add build

* fix zodb based checkpointer

* add thread for zodb IO

* fix(common): resolve multithread conflict in zodb IO

* fix(common): load existing zodb checkpoints

* update examples

* update examples

* fix zodb writer

* add docstring

* fix jieba version mismatch

* commit kag_config-tc.yaml

1. rename type to register_name
2. give register_name a unique and specific name
3. rename reader to scanner
4. rename parser to reader
5. rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6. rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7. pre-define llm & vectorize_model and refer to them in the yaml file

Issues to be resolved:
1. examples of event extract & spg extract
2. statistics of the indexer, such as numbers of nodes & edges extracted and ratio of llm invocations.
3. Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4. conf of solver needs to be re-examined.

* commit kag_config-tc.yaml

1. rename type to register_name
2. give register_name a unique and specific name
3. rename reader to scanner
4. rename parser to reader
5. rename num_parallel to num_parallel_file, rename chain_level_num_paralle to num_parallel_chain_of_file
6. rename kag_extractor to schema_free_extractor, schema_base_extractor to schema_constraint_extractor
7. pre-define llm & vectorize_model and refer to them in the yaml file

Issues to be resolved:
1. examples of event extract & spg extract
2. statistics of the indexer, such as numbers of nodes & edges extracted and ratio of llm invocations.
3. Exceptions such as Debt, account does not exist should be thrown in llm invoke.
4. conf of solver needs to be re-examined.

* 1. fix bug in base_table_splitter

* 1. fix bug in base_table_splitter

* 1. fix bug in default_chain

* update outline splitter

* add main test

* add markdown report

* code refactor

* fix outline splitter

* fix outline prompt

* update markdown reader

* fix vectorizer num batch compute

* add retry for vectorize model call

* update markdown reader

* raise extractor failure

* rm parser

* run pipeline

* add config option of whether to perform llm config check, default to false

* fix

* recover pdf reader

* several components can be null for default chain

* support full qa run

* add if

* remove unused code

* fall back to chunks

* excluded source relation to choose

* add generate

* default recall 10

* add local memory

* exclude similar edges

* add safeguards

* fix concurrency issue

* add debug logger

* support parameterized topk

* support chunk truncation and adjust the spo select prompt

* add safeguards for query requests

* add force_chunk config

* fix entity linker algorithm

* add sub query rewriting

* fix md reader dup in test

* fix

* merge knext to kag parallel

* fix package

* fix metric drop issue

* scanner update

* scanner update

* add doc and update example scripts

* fix

* add bridge to spg server

* add format

* fix bridge

* update conf for baike

* disable ckpt for spg server runner

* llm invoke error default raise exceptions

* chore(version): bump version to X.Y.Z

* update default response generation prompt

* add method getSummarizationMetrics

* fix(common): fix project conf empty error

* fix typo

* add reporting info

* modify main solver

* postprocessor support spg server

* change solver support name

* fix language

* modify chunker interface, add openapi

* rename vectorizer to vectorize_model in spg server config

* generate_random_string start with gen

* add knext llm vector checker

* add knext llm vector checker

* add knext llm vector checker

* remove default values from solver

* update yaml and register_name for baike

* update yaml and register_name for baike

* remove config key check

* fix llmmodule

* fix knext project

* update yaml and register_name for examples

* update yaml and register_name for examples

* Revert "update yaml and register_name for examples"

This reverts commit b3fa5ca9ba.

* update register name

* fix

* fix

* support multiple register names

* update component

* update reader register names (#183)

* fix markdown reader

* fix llm client for retry

* feat(common): add processed chunk id checkpoint (#185)

* update reader register names

* add processed chunk id checkpoint

* feat(example): add example config (#186)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* add max_workers parameter for getSummarizationMetrics to make it faster

* add csqa data generation script generate_data.py

* commit generated csqa builder and solver data

* add csqa basic project files

* adjust split_length and num_threads_per_chain to match lightrag settings

* ignore ckpt dirs

* add csqa evaluation script eval.py

* save evaluation scripts summarization_metrics.py and factual_correctness.py

* save LightRAG output csqa_lightrag_answers.json

* ignore KAG output csqa_kag_answers.json

* add README.md for CSQA

* fix(solver): fix solver pipeline conf (#191)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* update links and file paths

* reformat csqa kag_config.yaml

* reformat csqa python files

* reformat getSummarizationMetrics and compare_summarization_answers

* fix(solver): fix solver config (#192)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* add except

* fix typo in csqa README.md

* feat(conf): support reinitialize config for call from java side (#199)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* support reinitialize config for java call

* revert default response generation prompt

* update project list

* add README.md for the hotpotqa, 2wiki and musique examples

* add spo retrieval

* turn off kag config dump by default

* turn off knext schema dump by default

* add .gitignore and fix kag_config.yaml

* add README.md for the medicine example

* add README.md for the supplychain example

* bugfix for risk mining

* use exact out

* refactor(solver): format solver code (#205)

* update reader register names

* add processed chunk id checkpoint

* add example config file

* update solver pipeline config

* fix project create

* fix main solver conf

* support reinitialize config for java call

* black format

---------

Co-authored-by: peilong <peilong.zpl@antgroup.com>
Co-authored-by: 锦呈 <zhangxinhong.zxh@antgroup.com>
Co-authored-by: zhengke.gzk <zhengke.gzk@antgroup.com>
Co-authored-by: huaidong.xhd <huaidong.xhd@antgroup.com>
2025-01-03 17:10:51 +08:00