Implement full-text search of posts with Lucene default (#2675)

#### What type of PR is this?

/kind feature
/area core
/milestone 2.0

#### What this PR does / why we need it:

This PR implements full-text search of posts and provides an extension point for other search engines.

It also introduces ExtensionGetter, which gets the implementation(s) of an extension point from the system ConfigMap.

There are still some things to do here:

- [x] Update documents when posts are published or become non-public.
- [x] Delete documents when posts are unpublished or deleted.

These items depend on https://github.com/halo-dev/halo/pull/2659 being merged.

This PR adds two endpoints:

1. Full-text search of posts

    ```bash
    curl -X 'GET' \
      'http://localhost:8090/apis/api.halo.run/v1alpha1/indices/post?keyword=halo&limit=10000&highlightPreTag=%3CB%3E&highlightPostTag=%3C%2FB%3E' \
      -H 'accept: */*'
    ```

1. Refreshing indices

    ```bash
    curl -X 'POST' \
      'http://localhost:8090/apis/api.console.halo.run/v1alpha1/indices/post' \
      -H 'accept: */*' \
      -d ''
    ```
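
As a rough illustration of how a client might build the search request above, the following Java sketch assembles the same URL. The class `PostSearchUri` and its `build` method are hypothetical and not part of this PR; only the path and the query parameter names are taken from the curl example.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of this PR) that builds the post search URI. */
final class PostSearchUri {

    private PostSearchUri() {
    }

    static URI build(String baseUrl, String keyword, int limit,
                     String highlightPreTag, String highlightPostTag) {
        // Form-encode every value so that tags such as "<B>" become "%3CB%3E".
        String query = "keyword=" + encode(keyword)
            + "&limit=" + limit
            + "&highlightPreTag=" + encode(highlightPreTag)
            + "&highlightPostTag=" + encode(highlightPostTag);
        return URI.create(baseUrl + "/apis/api.halo.run/v1alpha1/indices/post?" + query);
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

The resulting `URI` can be passed to `java.net.http.HttpRequest.newBuilder(uri)` to issue the GET request.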

#### Which issue(s) this PR fixes:

Fixes https://github.com/halo-dev/halo/issues/2637

#### Special notes for your reviewer:

#### Does this PR introduce a user-facing change?

```release-note
Provide full-text search of posts and support search engine extensions
```
John Niang 2022-11-12 00:12:13 +08:00 committed by GitHub
parent 8b9ea1d301
commit dac4eecea6
37 changed files with 1468 additions and 31 deletions

View File

@ -62,6 +62,12 @@ dependencies {
implementation 'org.openapi4j:openapi-schema-validator:1.0.7'
implementation "net.bytebuddy:byte-buddy"
// Apache Lucene
implementation 'org.apache.lucene:lucene-core:9.4.1'
implementation 'org.apache.lucene:lucene-queryparser:9.4.1'
implementation 'org.apache.lucene:lucene-highlighter:9.4.1'
implementation 'cn.shenyanchao.ik-analyzer:ik-analyzer:9.0.0'
implementation "org.apache.commons:commons-lang3:$commonsLang3"
implementation "io.seruco.encoding:base62:$base62"
implementation "org.pf4j:pf4j:$pf4j"

View File

@ -0,0 +1,356 @@
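# Full-text search in Halo
Themes need a full-text search API for fuzzy searching of posts, and that API must be very efficient. An issue has already been opened for this; see <https://github.com/halo-dev/halo/issues/2637>.

Among local solutions for full-text search, Apache's open-source [Lucene](https://lucene.apache.org/) is the best option. [Hibernate Search](https://hibernate.org/search/) also implements full-text search on top of Lucene. However, custom models in Halo 2.0 are not built directly on Hibernate, which means Hibernate is only optional in Halo 2.0, so we will probably not adopt Hibernate Search in the end, despite its many advantages.

Like Hibernate Search, Halo can also adapt to multiple search engines, such as Lucene, ElasticSearch, and MeiliSearch. The default implementation is Lucene, because it has the lowest deployment cost for users.
## Search API design
### Search parameters
The fields are as follows:
- keyword: string. The search keyword.
- sort: string[]. The fields to sort by and their sort direction.
- offset: number. The offset of the results of this query.
- limit: number. The maximum number of results returned by this query.

For example: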
```bash
http://localhost:8090/apis/api.halo.run/v1alpha1/posts?keyword=halo&sort=title,asc&sort=publishTimestamp,desc&offset=20&limit=10
```
### Search results
```yaml
hits:
  - name: halo01
    title: Halo 01
    permalink: /posts/halo01
    categories:
      - a
      - b
    tags:
      - c
      - d
  - name: halo02
    title: Halo 02
    permalink: /posts/halo02
    categories:
      - a
      - b
    tags:
      - c
      - d
query: "halo"
total: 100
limit: 20
offset: 10
processingTimeMills: 2
```
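
A typed view of this response could look like the Java sketch below. The record names `PostHit` and `SearchResult` and the helper `hasMore` are assumptions made for illustration; only the field names come from the YAML sample, and the types actually used in the PR may differ.

```java
import java.util.List;

/** Hypothetical record mirroring one entry of "hits" in the sample. */
record PostHit(String name, String title, String permalink,
               List<String> categories, List<String> tags) {
}

/** Hypothetical record mirroring the top-level fields of the sample. */
record SearchResult(List<PostHit> hits, String query, long total,
                    int limit, int offset, long processingTimeMills) {

    /** True when hits exist beyond the window described by offset and hits. */
    boolean hasMore() {
        return (long) offset + hits.size() < total;
    }
}
```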
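#### Pagination of search results
For performance reasons, most search engines currently either do not offer pagination directly or do not recommend it.
See: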
- <https://solr.apache.org/guide/solr/latest/query-guide/pagination-of-results.html>
- <https://docs.meilisearch.com/learn/advanced/pagination.html>
- <https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html>
- <https://discourse.algolia.com/t/pagination-limit/10585>
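Based on the discussions above, we have tentatively decided not to support pagination. However, the number of records returned by a single query can be configured (limit <= max_limit).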
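
The `limit <= max_limit` rule could be enforced with a small normalization step, sketched below. The class `SearchLimit` and the values of `DEFAULT_LIMIT` and `MAX_LIMIT` are assumptions chosen for illustration; this document does not specify them.

```java
/** Hypothetical helper that keeps a requested limit within [1, MAX_LIMIT]. */
final class SearchLimit {

    // Both values are assumed for illustration only.
    static final int DEFAULT_LIMIT = 100;
    static final int MAX_LIMIT = 1000;

    private SearchLimit() {
    }

    static int normalize(Integer requested) {
        // A missing or non-positive limit falls back to the default.
        if (requested == null || requested <= 0) {
            return DEFAULT_LIMIT;
        }
        // Anything larger than the cap is silently reduced to the cap.
        return Math.min(requested, MAX_LIMIT);
    }
}
```

Clamping silently is only one possible design; rejecting an oversized limit with a client error would be an equally reasonable choice.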
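#### Optimizing Chinese search
Lucene's default analyzer does not segment Chinese text well. We need external dependencies or externally curated dictionaries to segment Chinese sentences better and thereby improve Chinese search results.
The following Java libraries provide Chinese analyzers: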
- <https://gitee.com/lionsoul/jcseg>
- <https://code.google.com/archive/p/ik-analyzer>
- <https://github.com/huaban/jieba-analysis>
- <https://github.com/medcl/elasticsearch-analysis-ik>
- <https://github.com/blueshen/ik-analyzer>
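
To give a feel for what dictionary-based segmentation means, here is a minimal Java sketch of forward maximum matching. It is a deliberately simple stand-in and not the algorithm of any library listed above; the class `ForwardMaxMatch`, the tiny dictionary in the usage note, and the `maxWordLength` parameter are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical segmenter: greedily takes the longest dictionary word at each position. */
final class ForwardMaxMatch {

    private ForwardMaxMatch() {
    }

    static List<String> segment(String text, Set<String> dictionary, int maxWordLength) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(text.length(), start + maxWordLength);
            // Shrink the window until it matches a dictionary word;
            // if nothing matches, fall back to a single character.
            while (end > start + 1 && !dictionary.contains(text.substring(start, end))) {
                end--;
            }
            tokens.add(text.substring(start, end));
            start = end;
        }
        return tokens;
    }
}
```

With the dictionary `{全文, 搜索, 引擎, 搜索引擎}` and `maxWordLength = 4`, the text `全文搜索引擎` is split into `全文` and `搜索引擎`, whereas an empty dictionary degrades to one token per character. The sketch works on UTF-16 code units, so characters outside the Basic Multilingual Plane would need extra care.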
### Search engine examples
#### MeiliSearch
```bash
curl 'http://localhost:7700/indexes/movies/search' \
-H 'Accept: */*' \
-H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,zh-TW;q=0.5' \
-H 'Authorization: Bearer MASTER_KEY' \
-H 'Connection: keep-alive' \
-H 'Content-Type: application/json' \
-H 'Cookie: logged_in=yes; adminer_permanent=; XSRF-TOKEN=75995791-980a-4f3e-81fb-2e199d8f3934' \
-H 'Origin: http://localhost:7700' \
-H 'Referer: http://localhost:7700/' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: same-origin' \
-H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26' \
-H 'X-Meilisearch-Client: Meilisearch mini-dashboard (v0.2.2) ; Meilisearch instant-meilisearch (v0.8.2) ; Meilisearch JavaScript (v0.27.0)' \
-H 'sec-ch-ua: "Microsoft Edge";v="107", "Chromium";v="107", "Not=A?Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Windows"' \
--data-raw '{"q":"halo","attributesToHighlight":["*"],"highlightPreTag":"<ais-highlight-0000000000>","highlightPostTag":"</ais-highlight-0000000000>","limit":21}' \
--compressed
```
```json
{
"hits": [
{
"id": 108761,
"title": "I Am... Yours: An Intimate Performance at Wynn Las Vegas",
"overview": "Filmed at the Encore Theater at Wynn Las Vegas, this extraordinary concert features performances of over 30 songs from Beyoncés three multi-platinum solo releases, Destinys Child catalog and a few surprises. This amazing concert includes the #1 hits, “Single Ladies (Put A Ring On It),” “If I Were A Boy,” “Halo,” “Sweet Dreams” and showcases a gut-wrenching performance of “Thats Why Youre Beautiful.” Included on \"I AM... YOURS An Intimate Performance At Wynn Las Vegas,\" is a biographical storytelling woven between many songs and exclusive behind-the-scenes footage.",
"genres": ["Music", "Documentary"],
"poster": "https://image.tmdb.org/t/p/w500/j8n1XQNfw874Ka7SS3HQLCVNBxb.jpg",
"release_date": 1258934400,
"_formatted": {
"id": "108761",
"title": "I Am... Yours: An Intimate Performance at Wynn Las Vegas",
"overview": "Filmed at the Encore Theater at Wynn Las Vegas, this extraordinary concert features performances of over 30 songs from Beyoncés three multi-platinum solo releases, Destinys Child catalog and a few surprises. This amazing concert includes the #1 hits, “Single Ladies (Put A Ring On It),” “If I Were A Boy,” “<ais-highlight-0000000000>Halo</ais-highlight-0000000000>,” “Sweet Dreams” and showcases a gut-wrenching performance of “Thats Why Youre Beautiful.” Included on \"I AM... YOURS An Intimate Performance At Wynn Las Vegas,\" is a biographical storytelling woven between many songs and exclusive behind-the-scenes footage.",
"genres": ["Music", "Documentary"],
"poster": "https://image.tmdb.org/t/p/w500/j8n1XQNfw874Ka7SS3HQLCVNBxb.jpg",
"release_date": "1258934400"
}
}
],
"estimatedTotalHits": 10,
"query": "halo",
"limit": 21,
"offset": 0,
"processingTimeMs": 2
}
```
![MeiliSearch UI](./meilisearch.jpg)
#### Algolia
```bash
curl 'https://og53ly1oqh-dsn.algolia.net/1/indexes/*/queries?x-algolia-agent=Algolia%20for%20JavaScript%20(4.14.2)%3B%20Browser%20(lite)%3B%20docsearch%20(3.2.1)%3B%20docsearch-react%20(3.2.1)%3B%20docusaurus%20(2.1.0)&x-algolia-api-key=739f2a55c6d13d93af146c22a4885669&x-algolia-application-id=OG53LY1OQH' \
-H 'Accept: */*' \
-H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,zh-TW;q=0.5' \
-H 'Connection: keep-alive' \
-H 'Origin: https://docs.halo.run' \
-H 'Referer: https://docs.halo.run/' \
-H 'Sec-Fetch-Dest: empty' \
-H 'Sec-Fetch-Mode: cors' \
-H 'Sec-Fetch-Site: cross-site' \
-H 'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26' \
-H 'content-type: application/x-www-form-urlencoded' \
-H 'sec-ch-ua: "Microsoft Edge";v="107", "Chromium";v="107", "Not=A?Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Windows"' \
--data-raw '{"requests":[{"query":"halo","indexName":"docs","params":"attributesToRetrieve=%5B%22hierarchy.lvl0%22%2C%22hierarchy.lvl1%22%2C%22hierarchy.lvl2%22%2C%22hierarchy.lvl3%22%2C%22hierarchy.lvl4%22%2C%22hierarchy.lvl5%22%2C%22hierarchy.lvl6%22%2C%22content%22%2C%22type%22%2C%22url%22%5D&attributesToSnippet=%5B%22hierarchy.lvl1%3A5%22%2C%22hierarchy.lvl2%3A5%22%2C%22hierarchy.lvl3%3A5%22%2C%22hierarchy.lvl4%3A5%22%2C%22hierarchy.lvl5%3A5%22%2C%22hierarchy.lvl6%3A5%22%2C%22content%3A5%22%5D&snippetEllipsisText=%E2%80%A6&highlightPreTag=%3Cmark%3E&highlightPostTag=%3C%2Fmark%3E&hitsPerPage=20&facetFilters=%5B%22language%3Azh-Hans%22%2C%5B%22docusaurus_tag%3Adefault%22%2C%22docusaurus_tag%3Adocs-default-1.6%22%5D%5D"}]}' \
--compressed
```
```json
{
"results": [
{
"hits": [
{
"content": null,
"hierarchy": {
"lvl0": "Documentation",
"lvl1": "使用 Docker Compose 部署 Halo",
"lvl2": "更新容器组 ",
"lvl3": null,
"lvl4": null,
"lvl5": null,
"lvl6": null
},
"type": "lvl2",
"url": "https://docs.halo.run/getting-started/install/other/docker-compose/#更新容器组",
"objectID": "4ccfa93009143feb6e423274a4944496267beea8",
"_snippetResult": {
"hierarchy": {
"lvl1": {
"value": "… Docker Compose 部署 <mark>Halo</mark>",
"matchLevel": "full"
},
"lvl2": {
"value": "更新容器组 ",
"matchLevel": "none"
}
}
},
"_highlightResult": {
"hierarchy": {
"lvl0": {
"value": "Documentation",
"matchLevel": "none",
"matchedWords": []
},
"lvl1": {
"value": "使用 Docker Compose 部署 <mark>Halo</mark>",
"matchLevel": "full",
"fullyHighlighted": false,
"matchedWords": ["halo"]
},
"lvl2": {
"value": "更新容器组 ",
"matchLevel": "none",
"matchedWords": []
}
},
"hierarchy_camel": [
{
"lvl0": {
"value": "Documentation",
"matchLevel": "none",
"matchedWords": []
},
"lvl1": {
"value": "使用 Docker Compose 部署 <mark>Halo</mark>",
"matchLevel": "full",
"fullyHighlighted": false,
"matchedWords": ["halo"]
},
"lvl2": {
"value": "更新容器组 ",
"matchLevel": "none",
"matchedWords": []
}
}
]
}
}
],
"nbHits": 113,
"page": 0,
"nbPages": 6,
"hitsPerPage": 20,
"exhaustiveNbHits": true,
"exhaustiveTypo": true,
"exhaustive": {
"nbHits": true,
"typo": true
},
"query": "halo",
"params": "query=halo&attributesToRetrieve=%5B%22hierarchy.lvl0%22%2C%22hierarchy.lvl1%22%2C%22hierarchy.lvl2%22%2C%22hierarchy.lvl3%22%2C%22hierarchy.lvl4%22%2C%22hierarchy.lvl5%22%2C%22hierarchy.lvl6%22%2C%22content%22%2C%22type%22%2C%22url%22%5D&attributesToSnippet=%5B%22hierarchy.lvl1%3A5%22%2C%22hierarchy.lvl2%3A5%22%2C%22hierarchy.lvl3%3A5%22%2C%22hierarchy.lvl4%3A5%22%2C%22hierarchy.lvl5%3A5%22%2C%22hierarchy.lvl6%3A5%22%2C%22content%3A5%22%5D&snippetEllipsisText=%E2%80%A6&highlightPreTag=%3Cmark%3E&highlightPostTag=%3C%2Fmark%3E&hitsPerPage=20&facetFilters=%5B%22language%3Azh-Hans%22%2C%5B%22docusaurus_tag%3Adefault%22%2C%22docusaurus_tag%3Adocs-default-1.6%22%5D%5D",
"index": "docs",
"renderingContent": {},
"processingTimeMS": 1,
"processingTimingsMS": {
"total": 1
}
}
]
}
```
![Algolia UI](./algolia.png)
#### Wiki
```bash
curl 'https://wiki.fit2cloud.com/rest/api/search?cql=siteSearch%20~%20%22halo%22%20AND%20type%20in%20(%22space%22%2C%22user%22%2C%22com.atlassian.confluence.extra.team-calendars%3Acalendar-content-type%22%2C%22attachment%22%2C%22page%22%2C%22com.atlassian.confluence.extra.team-calendars%3Aspace-calendars-view-content-type%22%2C%22blogpost%22)&start=20&limit=20&excerpt=highlight&expand=space.icon&includeArchivedSpaces=false&src=next.ui.search' \
-H 'authority: wiki.fit2cloud.com' \
-H 'accept: */*' \
-H 'accept-language: zh-CN,zh;q=0.9,en;q=0.8,en-GB;q=0.7,en-US;q=0.6,zh-TW;q=0.5' \
-H 'cache-control: no-cache, no-store, must-revalidate' \
-H 'cookie: _ga=GA1.2.1720479041.1657188862; seraph.confluence=89915546%3A6fc1394f8d537ffa08fb679e6e4dd64993448051; mywork.tab.tasks=false; JSESSIONID=5347D8618AC5883DE9B702E77152170D' \
-H 'expires: 0' \
-H 'pragma: no-cache' \
-H 'referer: https://wiki.fit2cloud.com/' \
-H 'sec-ch-ua: "Microsoft Edge";v="107", "Chromium";v="107", "Not=A?Brand";v="24"' \
-H 'sec-ch-ua-mobile: ?0' \
-H 'sec-ch-ua-platform: "Windows"' \
-H 'sec-fetch-dest: empty' \
-H 'sec-fetch-mode: cors' \
-H 'sec-fetch-site: same-origin' \
-H 'user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.26' \
--compressed
```
```json
{
"results": [
{
"content": {
"id": "76722",
"type": "page",
"status": "current",
"title": "2.3 测试 - 接口",
"restrictions": {},
"_links": {
"webui": "/pages/viewpage.action?pageId=721",
"tinyui": "/x/8K_SB",
"self": "https://wiki.halo.run/rest/api/content/76720"
},
"_expandable": {
"container": "",
"metadata": "",
"extensions": "",
"operations": "",
"children": "",
"history": "/rest/api/content/7670/history",
"ancestors": "",
"body": "",
"version": "",
"descendants": "",
"space": "/rest/api/space/IT"
}
},
"title": "2.3 接口 - 接口",
"excerpt": "另存为新用例",
"url": "/pages/viewpage.action?pageId=7672",
"resultGlobalContainer": {
"title": "IT 客户",
"displayUrl": "/display/IT"
},
"entityType": "content",
"iconCssClass": "aui-icon content-type-page",
"lastModified": "2022-05-11T22:40:53.000+08:00",
"friendlyLastModified": "五月 11, 2022",
"timestamp": 1652280053000
}
],
"start": 20,
"limit": 20,
"size": 20,
"totalSize": 70,
"cqlQuery": "siteSearch ~ \"halo\" AND type in (\"space\",\"user\",\"com.atlassian.confluence.extra.team-calendars:calendar-content-type\",\"attachment\",\"page\",\"com.atlassian.confluence.extra.team-calendars:space-calendars-view-content-type\",\"blogpost\")",
"searchDuration": 36,
"_links": {
"base": "https://wiki.halo.run",
"context": ""
}
}
```
### FAQ
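#### Should the parameters and response body structure be unified?
The pros and cons of unifying the parameters and response body structure are as follows:

Pros:
- The search result UI on the theme side stays consistent and does not change when a different search engine is used.

Cons:
- The full capabilities of a given search engine cannot be exploited. For example, one search engine may offer a very useful feature that others lack.
- Halo Core has to adapt to each search engine, which is tedious.
#### Should an extension point be provided for integrating other search engines?
Since Lucene is very powerful and already meets our needs for now, why would we still integrate other search engines?
- Lucene is currently used as a dependency of Halo, which means that only single-instance deployment of Halo is supported. This hinders Halo's future move toward statelessness.
- In contrast, other search engines (such as Solr, MeiliSearch, and ElasticSearch) can be deployed independently. Halo only needs to talk to the search engine through the corresponding SDK, regardless of whether Halo is deployed as multiple instances.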

Binary file not shown (image, 35 KiB).

Binary file not shown (image, 89 KiB).

View File

@ -4,8 +4,10 @@ import com.fasterxml.jackson.annotation.JsonInclude;
import org.springframework.boot.autoconfigure.jackson.Jackson2ObjectMapperBuilderCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
@Configuration(proxyBeanMethods = false)
@EnableAsync
public class HaloConfiguration {
@Bean

View File

@ -1,5 +1,7 @@
package run.halo.app.core.extension;
import static java.lang.Boolean.parseBoolean;
import com.fasterxml.jackson.annotation.JsonIgnore;
import io.swagger.v3.oas.annotations.media.Schema;
import java.time.Instant;
@ -13,6 +15,7 @@ import lombok.ToString;
import run.halo.app.extension.AbstractExtension;
import run.halo.app.extension.ExtensionUtil;
import run.halo.app.extension.GVK;
import run.halo.app.extension.MetadataOperator;
import run.halo.app.infra.Condition;
/**
@ -62,8 +65,12 @@ public class Post extends AbstractExtension {
@JsonIgnore
public boolean isPublished() {
Map<String, String> labels = getMetadata().getLabels();
return labels != null && labels.getOrDefault(PUBLISHED_LABEL, "false").equals("true");
return isPublished(this.getMetadata());
}
public static boolean isPublished(MetadataOperator metadata) {
var labels = metadata.getLabels();
return labels != null && parseBoolean(labels.getOrDefault(PUBLISHED_LABEL, "false"));
}
@Data
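
The switch from `.equals("true")` to `parseBoolean` slightly changes which label values count as published, because `Boolean.parseBoolean` ignores case. The standalone Java sketch below shows the behavior; the class `PublishedLabel` and the label key it uses are assumptions for illustration, and the real key is whatever `Post.PUBLISHED_LABEL` holds.

```java
import java.util.Map;

/** Hypothetical standalone version of the label check in the hunk above. */
final class PublishedLabel {

    // Assumed key for illustration; the real constant is Post.PUBLISHED_LABEL.
    static final String PUBLISHED_LABEL = "content.halo.run/published";

    private PublishedLabel() {
    }

    static boolean isPublished(Map<String, String> labels) {
        // parseBoolean returns true only for "true", compared case-insensitively.
        return labels != null
            && Boolean.parseBoolean(labels.getOrDefault(PUBLISHED_LABEL, "false"));
    }
}
```

A missing label map, a missing key, and any value other than a case-insensitive `true` are all treated as not published.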

View File

@ -6,20 +6,27 @@ import static org.springdoc.core.fn.builders.parameter.Builder.parameterBuilder;
import static org.springdoc.core.fn.builders.requestbody.Builder.requestBodyBuilder;
import io.swagger.v3.oas.annotations.enums.ParameterIn;
import java.time.Duration;
import lombok.AllArgsConstructor;
import org.springdoc.core.fn.builders.schema.Builder;
import org.springdoc.webflux.core.fn.SpringdocRouteBuilder;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.dao.OptimisticLockingFailureException;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Component;
import org.springframework.web.reactive.function.server.RouterFunction;
import org.springframework.web.reactive.function.server.ServerRequest;
import org.springframework.web.reactive.function.server.ServerResponse;
import reactor.core.publisher.Mono;
import reactor.util.retry.Retry;
import run.halo.app.content.ListedPost;
import run.halo.app.content.PostQuery;
import run.halo.app.content.PostRequest;
import run.halo.app.content.PostService;
import run.halo.app.core.extension.Post;
import run.halo.app.event.post.PostPublishedEvent;
import run.halo.app.event.post.PostRecycledEvent;
import run.halo.app.event.post.PostUnpublishedEvent;
import run.halo.app.extension.ListResult;
import run.halo.app.extension.ReactiveExtensionClient;
import run.halo.app.extension.router.QueryParamBuildUtil;
@ -37,6 +44,8 @@ public class PostEndpoint implements CustomEndpoint {
private final PostService postService;
private final ReactiveExtensionClient client;
private final ApplicationEventPublisher eventPublisher;
@Override
public RouterFunction<ServerResponse> endpoint() {
final var tag = "api.console.halo.run/v1alpha1/Post";
@ -91,9 +100,29 @@ public class PostEndpoint implements CustomEndpoint {
.in(ParameterIn.PATH)
.required(true)
.implementation(String.class))
.parameter(parameterBuilder().name("headSnapshot")
.description("Head snapshot name of content.")
.in(ParameterIn.QUERY)
.required(false))
.response(responseBuilder()
.implementation(Post.class))
)
.PUT("posts/{name}/unpublish", this::unpublishPost,
builder -> builder.operationId("UnpublishPost")
.description("Unpublish a post.")
.tag(tag)
.parameter(parameterBuilder().name("name")
.in(ParameterIn.PATH)
.required(true))
.response(responseBuilder()
.implementation(Post.class)))
.PUT("posts/{name}/recycle", this::recyclePost,
builder -> builder.operationId("RecyclePost")
.description("Recycle a post.")
.tag(tag)
.parameter(parameterBuilder().name("name")
.in(ParameterIn.PATH)
.required(true)))
.build();
}
@ -110,15 +139,54 @@ public class PostEndpoint implements CustomEndpoint {
}
Mono<ServerResponse> publishPost(ServerRequest request) {
String name = request.pathVariable("name");
return client.fetch(Post.class, name)
.flatMap(post -> {
Post.PostSpec spec = post.getSpec();
var name = request.pathVariable("name");
return client.get(Post.class, name)
.doOnNext(post -> {
var spec = post.getSpec();
request.queryParam("headSnapshot").ifPresent(spec::setHeadSnapshot);
spec.setPublish(true);
// TODO Provide release snapshot query param to control
spec.setReleaseSnapshot(spec.getHeadSnapshot());
return client.update(post);
})
.flatMap(client::update)
.retryWhen(Retry.backoff(3, Duration.ofMillis(100))
.filter(t -> t instanceof OptimisticLockingFailureException))
.flatMap(post -> postService.publishPost(post.getMetadata().getName()))
// TODO Fire published event in reconciler in the future
.doOnNext(post -> eventPublisher.publishEvent(
new PostPublishedEvent(this, post.getMetadata().getName())))
.flatMap(post -> ServerResponse.ok().bodyValue(post));
}
private Mono<ServerResponse> unpublishPost(ServerRequest request) {
var name = request.pathVariable("name");
return client.get(Post.class, name)
.doOnNext(post -> {
var spec = post.getSpec();
spec.setPublish(false);
})
.flatMap(client::update)
.retryWhen(Retry.backoff(3, Duration.ofMillis(100))
.filter(t -> t instanceof OptimisticLockingFailureException))
// TODO Fire unpublished event in reconciler in the future
.doOnNext(post -> eventPublisher.publishEvent(
new PostUnpublishedEvent(this, post.getMetadata().getName())))
.flatMap(post -> ServerResponse.ok().bodyValue(post));
}
private Mono<ServerResponse> recyclePost(ServerRequest request) {
var name = request.pathVariable("name");
return client.get(Post.class, name)
.doOnNext(post -> {
var spec = post.getSpec();
spec.setDeleted(true);
})
.flatMap(client::update)
.retryWhen(Retry.backoff(3, Duration.ofMillis(100))
.filter(t -> t instanceof OptimisticLockingFailureException))
// TODO Fire recycled event in reconciler in the future
.doOnNext(post -> eventPublisher.publishEvent(
new PostRecycledEvent(this, post.getMetadata().getName())))
.flatMap(post -> ServerResponse.ok().bodyValue(post));
}
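
The handlers above retry the update up to three times with backoff, but only when the failure is an optimistic locking conflict. For readers unfamiliar with the pattern, here is a plain-Java sketch of the same idea. It is blocking and omits refinements such as jitter, so it is not equivalent to the Reactor operator used in the diff; the class `BackoffRetry` and its parameters are hypothetical.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical blocking helper: retry with doubling delay, filtered by exception. */
final class BackoffRetry {

    private BackoffRetry() {
    }

    static <T> T call(Supplier<T> action, int maxRetries, Duration minBackoff,
                      Predicate<RuntimeException> retryable) {
        Duration delay = minBackoff;
        for (int attempt = 0; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Give up when retries are exhausted or the error is not retryable.
                if (attempt >= maxRetries || !retryable.test(e)) {
                    throw e;
                }
                sleep(delay);
                delay = delay.multipliedBy(2);
            }
        }
    }

    private static void sleep(Duration delay) {
        try {
            Thread.sleep(delay.toMillis());
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop retrying.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while backing off", e);
        }
    }
}
```

Here `maxRetries = 3` allows up to four attempts in total: the initial call plus three retries. Each retry must re-read the latest version of the resource before updating it, otherwise the same conflict would simply repeat.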

View File

@ -0,0 +1,17 @@
package run.halo.app.event.post;
import org.springframework.context.ApplicationEvent;
public class PostDeletedEvent extends ApplicationEvent {
private final String postName;
public PostDeletedEvent(Object source, String postName) {
super(source);
this.postName = postName;
}
public String getPostName() {
return postName;
}
}

View File

@ -0,0 +1,18 @@
package run.halo.app.event.post;
import org.springframework.context.ApplicationEvent;
public class PostPublishedEvent extends ApplicationEvent {
private final String postName;
public PostPublishedEvent(Object source, String postName) {
super(source);
this.postName = postName;
}
public String getPostName() {
return postName;
}
}

View File

@ -0,0 +1,17 @@
package run.halo.app.event.post;
import org.springframework.context.ApplicationEvent;
public class PostRecycledEvent extends ApplicationEvent {
private final String postName;
public PostRecycledEvent(Object source, String postName) {
super(source);
this.postName = postName;
}
public String getPostName() {
return postName;
}
}

View File

@ -0,0 +1,18 @@
package run.halo.app.event.post;
import org.springframework.context.ApplicationEvent;
public class PostUnpublishedEvent extends ApplicationEvent {
private final String postName;
public PostUnpublishedEvent(Object source, String postName) {
super(source);
this.postName = postName;
}
public String getPostName() {
return postName;
}
}
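
These four events are what allow the search index to follow the post lifecycle: a published post is added or replaced in the index, and an unpublished, recycled, or deleted post is removed from it. The sketch below illustrates that contract with an in-memory map instead of Lucene. The class `InMemoryPostIndex` and its methods are hypothetical; in particular, the real events carry only the post name, so a real listener would have to load the content itself.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

/** Hypothetical in-memory stand-in for a search index keyed by post name. */
final class InMemoryPostIndex {

    // Post name -> lower-cased searchable text.
    private final Map<String, String> documents = new ConcurrentHashMap<>();

    /** Reaction to a "published" event: add or replace the document. */
    void onPublished(String postName, String content) {
        documents.put(postName, content.toLowerCase(Locale.ROOT));
    }

    /** Reaction to "unpublished", "recycled", or "deleted" events: drop the document. */
    void onRemoved(String postName) {
        documents.remove(postName);
    }

    /** Naive substring search, returning the names of matching posts. */
    Set<String> search(String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        return documents.entrySet().stream()
            .filter(entry -> entry.getValue().contains(needle))
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
    }
}
```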

View File

@ -1,5 +1,7 @@
package run.halo.app.extension;
import static run.halo.app.infra.utils.GenericClassUtils.generateConcreteClass;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import io.swagger.v3.oas.annotations.media.Schema;
@ -120,15 +122,8 @@ public class ListResult<T> implements Streamable<T> {
* @return generic ListResult class.
*/
public static <T> Class<?> generateGenericClass(Class<T> type) {
var generic =
TypeDescription.Generic.Builder.parameterizedType(ListResult.class, type)
.build();
return new ByteBuddy()
.subclass(generic)
.name(type.getSimpleName() + "List")
.make()
.load(ListResult.class.getClassLoader())
.getLoaded();
return generateConcreteClass(ListResult.class, type,
() -> type.getSimpleName() + "List");
}
public static <T> ListResult<T> emptyResult() {

View File

@ -14,6 +14,7 @@ import java.io.IOException;
import java.time.Instant;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
@ -45,6 +46,10 @@ public class Unstructured implements Extension {
this.data = data;
}
public Map getData() {
return Collections.unmodifiableMap(data);
}
@Override
public String getApiVersion() {
return (String) data.get("apiVersion");
@ -161,7 +166,7 @@ public class Unstructured implements Extension {
data.put("metadata", metadataMap);
}
static Optional<Object> getNestedValue(Map map, String... fields) {
public static Optional<Object> getNestedValue(Map map, String... fields) {
if (fields == null || fields.length == 0) {
return Optional.of(map);
}
@ -177,11 +182,11 @@ public class Unstructured implements Extension {
}
@SuppressWarnings("unchecked")
static Optional<List<String>> getNestedStringList(Map map, String... fields) {
public static Optional<List<String>> getNestedStringList(Map map, String... fields) {
return getNestedValue(map, fields).map(value -> (List<String>) value);
}
static Optional<Set<String>> getNestedStringSet(Map map, String... fields) {
public static Optional<Set<String>> getNestedStringSet(Map map, String... fields) {
return getNestedValue(map, fields).map(value -> {
if (value instanceof Collection collection) {
return new LinkedHashSet<>(collection);
@ -192,7 +197,7 @@ public class Unstructured implements Extension {
}
@SuppressWarnings("unchecked")
static void setNestedValue(Map map, Object value, String... fields) {
public static void setNestedValue(Map map, Object value, String... fields) {
if (fields == null || fields.length == 0) {
// do nothing when no fields provided
return;
@ -205,12 +210,13 @@ public class Unstructured implements Extension {
});
}
static Optional<Map> getNestedMap(Map map, String... fields) {
public static Optional<Map> getNestedMap(Map map, String... fields) {
return getNestedValue(map, fields).map(value -> (Map) value);
}
@SuppressWarnings("unchecked")
static Optional<Map<String, String>> getNestedStringStringMap(Map map, String... fields) {
public static Optional<Map<String, String>> getNestedStringStringMap(Map map,
String... fields) {
return getNestedValue(map, fields)
.map(labelsObj -> {
var labels = (Map) labelsObj;
@ -220,7 +226,7 @@ public class Unstructured implements Extension {
});
}
static Optional<Instant> getNestedInstant(Map map, String... fields) {
public static Optional<Instant> getNestedInstant(Map map, String... fields) {
return getNestedValue(map, fields)
.map(instantValue -> {
if (instantValue instanceof Instant instant) {
@ -231,7 +237,7 @@ public class Unstructured implements Extension {
}
static Optional<Long> getNestedLong(Map map, String... fields) {
public static Optional<Long> getNestedLong(Map map, String... fields) {
return getNestedValue(map, fields)
.map(longObj -> {
if (longObj instanceof Long l) {

View File

@ -0,0 +1,11 @@
package run.halo.app.infra;
import org.springframework.context.ApplicationEvent;
public class SchemeInitializedEvent extends ApplicationEvent {
public SchemeInitializedEvent(Object source) {
super(source);
}
}

View File

@ -1,6 +1,7 @@
package run.halo.app.infra;
import org.springframework.boot.context.event.ApplicationStartedEvent;
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.ApplicationListener;
import org.springframework.lang.NonNull;
import org.springframework.stereotype.Component;
@ -27,6 +28,7 @@ import run.halo.app.core.extension.attachment.Policy;
import run.halo.app.core.extension.attachment.PolicyTemplate;
import run.halo.app.extension.ConfigMap;
import run.halo.app.extension.SchemeManager;
import run.halo.app.search.extension.SearchEngine;
import run.halo.app.security.authentication.pat.PersonalAccessToken;
@Component
@ -34,15 +36,23 @@ public class SchemeInitializer implements ApplicationListener<ApplicationStarted
private final SchemeManager schemeManager;
public SchemeInitializer(SchemeManager schemeManager) {
private final ApplicationEventPublisher eventPublisher;
public SchemeInitializer(SchemeManager schemeManager,
ApplicationEventPublisher eventPublisher) {
this.schemeManager = schemeManager;
this.eventPublisher = eventPublisher;
}
@Override
public void onApplicationEvent(@NonNull ApplicationStartedEvent event) {
schemeManager.register(Role.class);
schemeManager.register(PersonalAccessToken.class);
// plugin.halo.run
schemeManager.register(Plugin.class);
schemeManager.register(SearchEngine.class);
schemeManager.register(RoleBinding.class);
schemeManager.register(User.class);
schemeManager.register(ReverseProxy.class);
@ -65,5 +75,7 @@ public class SchemeInitializer implements ApplicationListener<ApplicationStarted
schemeManager.register(PolicyTemplate.class);
// metrics.halo.run
schemeManager.register(Counter.class);
eventPublisher.publishEvent(new SchemeInitializedEvent(this));
}
}

View File

@ -1,5 +1,7 @@
package run.halo.app.infra;
import java.util.LinkedHashMap;
import java.util.Set;
import lombok.Data;
/**
@ -87,4 +89,17 @@ public class SystemSetting {
public static final String GROUP = "menu";
public String primary;
}
/**
* The key of ExtensionPointEnabled is the fully qualified name of an extension point and the
* value is a set of fully qualified names of its implementations.
*/
public static class ExtensionPointEnabled extends LinkedHashMap<String, Set<String>> {
public static final ExtensionPointEnabled EMPTY = new ExtensionPointEnabled();
public static final String GROUP = "extensionPointEnabled";
}
}

View File

@ -0,0 +1,48 @@
package run.halo.app.infra.utils;
import static net.bytebuddy.description.type.TypeDescription.Generic.Builder.parameterizedType;
import java.io.IOException;
import java.util.function.Supplier;
import net.bytebuddy.ByteBuddy;
import reactor.core.Exceptions;
public enum GenericClassUtils {
;
/**
* Generate concrete class of generic class. e.g.: {@code List<String>}
*
* @param rawClass is generic class, like {@code List.class}
* @param parameterType is parameter type of generic class
* @param <T> parameter type
* @return generated class
*/
public static <T> Class<?> generateConcreteClass(Class<?> rawClass, Class<T> parameterType) {
return generateConcreteClass(rawClass, parameterType, () ->
parameterType.getSimpleName() + rawClass.getSimpleName());
}
/**
* Generate concrete class of generic class. e.g.: {@code List<String>}
*
* @param rawClass is generic class, like {@code List.class}
* @param parameterType is parameter type of generic class
* @param nameGenerator supplies the name of the generated class
* @param <T> parameter type
* @return generated class
*/
public static <T> Class<?> generateConcreteClass(Class<?> rawClass, Class<T> parameterType,
Supplier<String> nameGenerator) {
var concreteType = parameterizedType(rawClass, parameterType).build();
try (var unloaded = new ByteBuddy()
.subclass(concreteType)
.name(nameGenerator.get())
.make()) {
return unloaded.load(rawClass.getClassLoader()).getLoaded();
} catch (IOException e) {
// Should never happen
throw Exceptions.propagate(e);
}
}
}
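
Generating a subclass of a parameterized type is useful because a subclass keeps its type argument in class metadata, whereas a plain generic instance loses it to erasure. Presumably this is why the utility exists, for example so that tools inspecting the class can see the concrete element type. The Java sketch below demonstrates the underlying mechanism with a hand-written subclass and reflection only; `GenericTypeDemo` and `StringList` are hypothetical names, and no ByteBuddy is involved.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;

/** Hypothetical demo: a concrete subclass retains its generic type argument. */
final class GenericTypeDemo {

    /** Hand-written counterpart of a generated concrete class. */
    static class StringList extends ArrayList<String> {
    }

    private GenericTypeDemo() {
    }

    /** Returns the first type argument of the generic superclass, or null if none. */
    static Type firstTypeArgument(Class<?> type) {
        Type superType = type.getGenericSuperclass();
        if (superType instanceof ParameterizedType parameterized) {
            return parameterized.getActualTypeArguments()[0];
        }
        return null;
    }
}
```

For `StringList` the method returns `String.class`, while for `ArrayList` itself it returns only the unresolved type variable `E`.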

View File

@ -0,0 +1,67 @@
package run.halo.app.plugin.extensionpoint;
import java.util.Set;
import org.pf4j.ExtensionPoint;
import org.springframework.context.ApplicationContext;
import org.springframework.stereotype.Component;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import run.halo.app.infra.SystemConfigurableEnvironmentFetcher;
import run.halo.app.infra.SystemSetting.ExtensionPointEnabled;
import run.halo.app.plugin.HaloPluginManager;
@Component
public class DefaultExtensionGetter implements ExtensionGetter {
private final SystemConfigurableEnvironmentFetcher systemConfigFetcher;
private final HaloPluginManager pluginManager;
private final ApplicationContext applicationContext;
public DefaultExtensionGetter(SystemConfigurableEnvironmentFetcher systemConfigFetcher,
HaloPluginManager pluginManager, ApplicationContext applicationContext) {
this.systemConfigFetcher = systemConfigFetcher;
this.pluginManager = pluginManager;
this.applicationContext = applicationContext;
}
@Override
public <T extends ExtensionPoint> Mono<T> getEnabledExtension(Class<T> extensionPoint) {
return systemConfigFetcher.fetch(ExtensionPointEnabled.GROUP, ExtensionPointEnabled.class)
.switchIfEmpty(Mono.just(ExtensionPointEnabled.EMPTY))
.mapNotNull(enabled -> {
var implClassNames = enabled.getOrDefault(extensionPoint.getName(), Set.of());
return pluginManager.getExtensions(extensionPoint)
.stream()
.filter(impl -> implClassNames.contains(impl.getClass().getName()))
.findFirst()
// Fallback to local implementation of the extension point.
// This will happen when no proper configuration is found.
.orElseGet(() ->
applicationContext.getBeanProvider(extensionPoint).getIfAvailable());
});
}
@Override
public <T extends ExtensionPoint> Flux<T> getEnabledExtensions(Class<T> extensionPoint) {
return systemConfigFetcher.fetch(ExtensionPointEnabled.GROUP, ExtensionPointEnabled.class)
.switchIfEmpty(Mono.just(ExtensionPointEnabled.EMPTY))
.flatMapMany(enabled -> {
var implClassNames = enabled.getOrDefault(extensionPoint.getName(), Set.of());
var extensions = pluginManager.getExtensions(extensionPoint)
.stream()
.filter(impl -> implClassNames.contains(impl.getClass().getName()))
.toList();
if (extensions.isEmpty()) {
extensions = applicationContext.getBeanProvider(extensionPoint)
.orderedStream()
// we only fetch one implementation here
.limit(1)
.toList();
}
return Flux.fromIterable(extensions);
});
}
}

@ -0,0 +1,27 @@
package run.halo.app.plugin.extensionpoint;
import org.pf4j.ExtensionPoint;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
public interface ExtensionGetter {
/**
* Get only one enabled extension from system configuration.
*
* @param extensionPoint the extension point class.
* @return implementation of the corresponding extension point. If no configuration is found,
* we will use the default implementation from application context instead.
*/
<T extends ExtensionPoint> Mono<T> getEnabledExtension(Class<T> extensionPoint);
/**
* Get enabled extension list from system configuration.
*
* @param extensionPoint the extension point class.
* @return implementations of the corresponding extension point. If no configuration is found,
* we will use the default implementation from application context instead.
*/
<T extends ExtensionPoint> Flux<T> getEnabledExtensions(Class<T> extensionPoint);
}
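The selection rule described in the Javadoc above (prefer an implementation enabled in the system configuration, otherwise fall back to the default one) can be sketched without Spring or PF4J. This is only an illustration: `ExtensionSelectionSketch`, its `select` method, and the use of plain class-name strings are assumptions made for the example, not part of the Halo API.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for the lookup; real extensions are objects, not strings.
final class ExtensionSelectionSketch {

    /**
     * Returns the first candidate whose class name is enabled for the given
     * extension point, or the default implementation when none is enabled.
     */
    static Optional<String> select(Map<String, Set<String>> enabled,
                                   String extensionPoint,
                                   List<String> candidates,
                                   Optional<String> defaultImpl) {
        // No configuration entry means an empty set of enabled class names.
        Set<String> names = enabled.getOrDefault(extensionPoint, Set.of());
        return candidates.stream()
            .filter(names::contains)
            .findFirst()
            // Fall back to the local default, which may itself be absent.
            .or(() -> defaultImpl);
    }
}
```

With an empty configuration map the default implementation is returned, which mirrors the fallback to the application context in `DefaultExtensionGetter`.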

@ -0,0 +1,46 @@
package run.halo.app.search;
import lombok.extern.slf4j.Slf4j;
import org.springdoc.webflux.core.fn.SpringdocRouteBuilder;
import org.springframework.stereotype.Component;
import org.springframework.web.reactive.function.server.RouterFunction;
import org.springframework.web.reactive.function.server.ServerRequest;
import org.springframework.web.reactive.function.server.ServerResponse;
import reactor.core.publisher.Mono;
import run.halo.app.core.extension.endpoint.CustomEndpoint;
import run.halo.app.extension.GroupVersion;
@Component
@Slf4j
public class IndicesEndpoint implements CustomEndpoint {
private final IndicesService indicesService;
private static final String API_VERSION = "api.console.halo.run/v1alpha1";
public IndicesEndpoint(IndicesService indicesService) {
this.indicesService = indicesService;
}
@Override
public RouterFunction<ServerResponse> endpoint() {
final var tag = API_VERSION + "/Indices";
return SpringdocRouteBuilder.route()
.POST("indices/post", this::rebuildPostIndices,
builder -> builder.operationId("BuildPostIndices")
.tag(tag)
.description("Build or rebuild post indices for full text search"))
.build();
}
private Mono<ServerResponse> rebuildPostIndices(ServerRequest request) {
return indicesService.rebuildPostIndices()
.then(Mono.defer(() -> ServerResponse.ok().bodyValue("Rebuild post indices")));
}
@Override
public GroupVersion groupVersion() {
return GroupVersion.parseAPIVersion(API_VERSION);
}
}

@ -0,0 +1,36 @@
package run.halo.app.search;
import java.util.concurrent.CountDownLatch;
import lombok.extern.slf4j.Slf4j;
import org.springframework.context.event.EventListener;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Component;
import org.springframework.util.StopWatch;
import run.halo.app.infra.SchemeInitializedEvent;
@Slf4j
@Component
public class IndicesInitializer {
private final IndicesService indicesService;
public IndicesInitializer(IndicesService indicesService) {
this.indicesService = indicesService;
}
@Async
@EventListener(SchemeInitializedEvent.class)
public void whenSchemeInitialized(SchemeInitializedEvent event) throws InterruptedException {
var latch = new CountDownLatch(1);
log.info("Initialize post indices...");
var watch = new StopWatch("PostIndicesWatch");
watch.start("rebuild");
indicesService.rebuildPostIndices()
.doFinally(signalType -> latch.countDown())
.subscribe();
latch.await();
watch.stop();
log.info("Initialized post indices. Usage: {}", watch);
}
}
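The listener above blocks its `@Async` thread on a `CountDownLatch` until the reactive rebuild has terminated. The same wait-for-completion pattern can be sketched with the JDK alone; `LatchWaitSketch` and `runAndWait` are hypothetical names, and `CompletableFuture` stands in for the Reactor pipeline used in the commit.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

final class LatchWaitSketch {

    /** Starts the task asynchronously and blocks the caller until it has finished. */
    static boolean runAndWait(Runnable task) throws InterruptedException {
        var latch = new CountDownLatch(1);
        var succeeded = new AtomicBoolean(false);
        CompletableFuture.runAsync(task)
            // Invoked on success and on failure, similar in spirit to doFinally.
            .whenComplete((ignored, error) -> {
                succeeded.set(error == null);
                latch.countDown();
            });
        // Wait here until the asynchronous work has signalled termination.
        latch.await();
        return succeeded.get();
    }
}
```

Counting down in a callback that runs on every termination path matters: if the latch were released only on success, a failed rebuild would leave the waiting thread blocked forever.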

@ -0,0 +1,9 @@
package run.halo.app.search;
import reactor.core.publisher.Mono;
public interface IndicesService {
Mono<Void> rebuildPostIndices();
}

@ -0,0 +1,47 @@
package run.halo.app.search;
import org.springframework.stereotype.Service;
import reactor.core.Exceptions;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;
import run.halo.app.core.extension.Post;
import run.halo.app.plugin.extensionpoint.ExtensionGetter;
import run.halo.app.search.post.PostDoc;
import run.halo.app.search.post.PostSearchService;
import run.halo.app.theme.finders.PostFinder;
@Service
public class IndicesServiceImpl implements IndicesService {
private final ExtensionGetter extensionGetter;
private final PostFinder postFinder;
public IndicesServiceImpl(ExtensionGetter extensionGetter, PostFinder postFinder) {
this.extensionGetter = extensionGetter;
this.postFinder = postFinder;
}
@Override
public Mono<Void> rebuildPostIndices() {
return extensionGetter.getEnabledExtension(PostSearchService.class)
// TODO List posts in a non-blocking way.
.flatMap(searchService -> Flux.fromStream(() -> postFinder.list(0, 0)
.stream()
.filter(post -> Post.isPublished(post.getMetadata()))
.peek(post -> postFinder.content(post.getMetadata().getName()))
.map(PostDoc::from))
.subscribeOn(Schedulers.boundedElastic())
.limitRate(100)
.buffer(100)
.doOnNext(postDocs -> {
try {
searchService.addDocuments(postDocs);
} catch (Exception e) {
throw Exceptions.propagate(e);
}
})
.then()
);
}
}
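`rebuildPostIndices` groups the published posts into batches of 100 before handing each batch to `addDocuments`. A rough, library-free sketch of that batching step follows; `BatchSketch` and `batches` are names invented for the example, and the real code relies on Reactor's `buffer(100)` instead.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchSketch {

    /** Splits the items into consecutive batches of at most {@code size} elements. */
    static <T> List<List<T>> batches(List<T> items, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int from = 0; from < items.size(); from += size) {
            // The last batch may be shorter than the requested size.
            int to = Math.min(from + size, items.size());
            result.add(List.copyOf(items.subList(from, to)));
        }
        return result;
    }
}
```

Batching keeps the number of `IndexWriter` open/close cycles low, since the Lucene implementation in this commit opens a writer per `addDocuments` call.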

@ -0,0 +1,63 @@
package run.halo.app.search;
import io.swagger.v3.oas.annotations.media.Schema;
import org.springframework.util.MultiValueMap;
import org.springframework.util.StringUtils;
import org.springframework.web.server.ServerWebInputException;
public class SearchParam {
private static final int DEFAULT_LIMIT = 10;
private static final String DEFAULT_HIGHLIGHT_PRE_TAG = "<B>";
private static final String DEFAULT_HIGHLIGHT_POST_TAG = "</B>";
private final MultiValueMap<String, String> query;
public SearchParam(MultiValueMap<String, String> query) {
this.query = query;
}
@Schema(name = "keyword", required = true)
public String getKeyword() {
var keyword = query.getFirst("keyword");
if (!StringUtils.hasText(keyword)) {
throw new ServerWebInputException("keyword is required");
}
return keyword;
}
@Schema(name = "limit", defaultValue = "10", maximum = "1000")
public int getLimit() {
var limitString = query.getFirst("limit");
int limit = 0;
if (StringUtils.hasText(limitString)) {
try {
limit = Integer.parseInt(limitString);
} catch (NumberFormatException nfe) {
throw new ServerWebInputException("Failed to parse limit as an integer");
}
}
if (limit <= 0) {
limit = DEFAULT_LIMIT;
}
return limit;
}
@Schema(name = "highlightPreTag", defaultValue = DEFAULT_HIGHLIGHT_PRE_TAG)
public String getHighlightPreTag() {
var highlightPreTag = query.getFirst("highlightPreTag");
if (!StringUtils.hasText(highlightPreTag)) {
highlightPreTag = DEFAULT_HIGHLIGHT_PRE_TAG;
}
return highlightPreTag;
}
@Schema(name = "highlightPostTag", defaultValue = DEFAULT_HIGHLIGHT_POST_TAG)
public String getHighlightPostTag() {
var highlightPostTag = query.getFirst("highlightPostTag");
if (!StringUtils.hasText(highlightPostTag)) {
highlightPostTag = DEFAULT_HIGHLIGHT_POST_TAG;
}
return highlightPostTag;
}
}
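The handling of `limit` in `getLimit()` boils down to three cases: a missing or blank value, a non-numeric value, and a non-positive value. The sketch below restates those rules in isolation. `LimitSketch` and `parseLimit` are hypothetical, and `IllegalArgumentException` replaces Spring's `ServerWebInputException` so the example needs no dependencies.

```java
final class LimitSketch {

    static final int DEFAULT_LIMIT = 10;

    /** Parses the raw query value, falling back to the default when it is absent or not positive. */
    static int parseLimit(String raw) {
        if (raw == null || raw.isBlank()) {
            return DEFAULT_LIMIT;
        }
        int limit;
        try {
            limit = Integer.parseInt(raw);
        } catch (NumberFormatException nfe) {
            // The endpoint reports this case as a client input error.
            throw new IllegalArgumentException("limit must be an integer", nfe);
        }
        return limit <= 0 ? DEFAULT_LIMIT : limit;
    }
}
```

Note that nothing in `getLimit()` enforces the `maximum` declared in the `@Schema` annotation; the sketch leaves that unchanged as well.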

@ -0,0 +1,13 @@
package run.halo.app.search;
import java.util.List;
import lombok.Data;
@Data
public class SearchResult<T> {
private List<T> hits;
private String keyword;
private Long total;
private int limit;
private long processingTimeMillis;
}

@ -0,0 +1,39 @@
package run.halo.app.search.extension;
import io.swagger.v3.oas.annotations.media.Schema;
import lombok.Data;
import lombok.EqualsAndHashCode;
import lombok.ToString;
import run.halo.app.extension.AbstractExtension;
import run.halo.app.extension.GVK;
import run.halo.app.extension.Ref;
@Data
@ToString(callSuper = true)
@EqualsAndHashCode(callSuper = true)
@GVK(group = "plugin.halo.run", version = "v1alpha1", kind = "SearchEngine",
plural = "searchengines", singular = "searchengine")
public class SearchEngine extends AbstractExtension {
@Schema(required = true)
private SearchEngineSpec spec;
@Data
public static class SearchEngineSpec {
private String logo;
private String website;
@Schema(required = true)
private String displayName;
private String description;
private Ref settingRef;
private String postSearchImpl;
}
}

@ -0,0 +1,196 @@
package run.halo.app.search.post;
import static org.apache.commons.lang3.StringUtils.stripToEmpty;
import static org.apache.lucene.document.Field.Store.NO;
import static org.apache.lucene.document.Field.Store.YES;
import static org.apache.lucene.index.IndexWriterConfig.OpenMode.APPEND;
import static org.apache.lucene.index.IndexWriterConfig.OpenMode.CREATE_OR_APPEND;
import java.io.IOException;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.lang3.StringUtils;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleFragmenter;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;
import org.springframework.beans.factory.DisposableBean;
import org.springframework.stereotype.Service;
import org.springframework.util.StopWatch;
import org.wltea.analyzer.lucene.IKAnalyzer;
import reactor.core.Exceptions;
import run.halo.app.infra.properties.HaloProperties;
import run.halo.app.search.SearchParam;
import run.halo.app.search.SearchResult;
@Service
@Slf4j
public class LucenePostSearchService implements PostSearchService, DisposableBean {
public static final int MAX_FRAGMENT_SIZE = 100;
private final Analyzer analyzer;
private final Directory postIndexDir;
public LucenePostSearchService(HaloProperties haloProperties)
throws IOException {
analyzer = new IKAnalyzer();
var postIdxPath = haloProperties.getWorkDir().resolve("indices/posts");
postIndexDir = FSDirectory.open(postIdxPath);
}
@Override
public SearchResult<PostHit> search(SearchParam param) throws Exception {
// Open the reader in try-with-resources so its file handles are released after the search.
try (var dirReader = DirectoryReader.open(postIndexDir)) {
var searcher = new IndexSearcher(dirReader);
var keyword = param.getKeyword();
var watch = new StopWatch("SearchWatch");
watch.start("search for " + keyword);
var query = buildQuery(keyword);
var topDocs = searcher.search(query, param.getLimit(), Sort.RELEVANCE);
watch.stop();
var highlighter = new Highlighter(
new SimpleHTMLFormatter(param.getHighlightPreTag(), param.getHighlightPostTag()),
new QueryScorer(query));
highlighter.setTextFragmenter(new SimpleFragmenter(MAX_FRAGMENT_SIZE));
var hits = new ArrayList<PostHit>(topDocs.scoreDocs.length);
for (var scoreDoc : topDocs.scoreDocs) {
hits.add(convert(searcher.doc(scoreDoc.doc), highlighter));
}
var result = new SearchResult<PostHit>();
result.setHits(hits);
result.setTotal(topDocs.totalHits.value);
result.setKeyword(param.getKeyword());
result.setLimit(param.getLimit());
result.setProcessingTimeMillis(watch.getTotalTimeMillis());
return result;
}
}
@Override
public void addDocuments(List<PostDoc> posts) throws IOException {
var writeConfig = new IndexWriterConfig(analyzer);
writeConfig.setOpenMode(CREATE_OR_APPEND);
try (var writer = new IndexWriter(postIndexDir, writeConfig)) {
posts.forEach(post -> {
var doc = this.convert(post);
try {
var seqNum =
writer.updateDocument(new Term(PostDoc.ID_FIELD, post.getName()), doc);
if (log.isDebugEnabled()) {
log.debug("Updated document({}) with sequence number {} returned",
post.getName(), seqNum);
}
} catch (IOException e) {
throw Exceptions.propagate(e);
}
});
}
}
@Override
public void removeDocuments(Set<String> postNames) throws IOException {
var writeConfig = new IndexWriterConfig(analyzer);
writeConfig.setOpenMode(APPEND);
try (var writer = new IndexWriter(postIndexDir, writeConfig)) {
var terms = postNames.stream()
.map(postName -> new Term(PostDoc.ID_FIELD, postName))
.toArray(Term[]::new);
long seqNum = writer.deleteDocuments(terms);
log.debug("Deleted documents({}) with sequence number {}", terms.length, seqNum);
}
}
@Override
public void destroy() throws Exception {
analyzer.close();
postIndexDir.close();
}
private Query buildQuery(String keyword) {
keyword = stripToEmpty(keyword).toLowerCase();
if (log.isDebugEnabled()) {
log.debug("Trying to search for keyword: {}", keyword);
}
return new FuzzyQuery(new Term("searchable", keyword));
}
private Document convert(PostDoc post) {
var doc = new Document();
doc.add(new StringField("name", post.getName(), YES));
doc.add(new StoredField("title", post.getTitle()));
var content = Jsoup.clean(stripToEmpty(post.getExcerpt()) + stripToEmpty(post.getContent()),
Safelist.none());
doc.add(new StoredField("content", content));
doc.add(new TextField("searchable", post.getTitle() + content, NO));
long publishTimestamp = post.getPublishTimestamp().toEpochMilli();
doc.add(new LongPoint("publishTimestamp", publishTimestamp));
doc.add(new StoredField("publishTimestamp", publishTimestamp));
doc.add(new StoredField("permalink", post.getPermalink()));
return doc;
}
private PostHit convert(Document doc, Highlighter highlighter)
throws IOException, InvalidTokenOffsetsException {
var post = new PostHit();
post.setName(doc.get("name"));
var title = getHighlightedText(doc, "title", highlighter, MAX_FRAGMENT_SIZE);
post.setTitle(title);
var content = getHighlightedText(doc, "content", highlighter, MAX_FRAGMENT_SIZE);
post.setContent(content);
var publishTimestamp = doc.getField("publishTimestamp").numericValue().longValue();
post.setPublishTimestamp(Instant.ofEpochMilli(publishTimestamp));
post.setPermalink(doc.get("permalink"));
return post;
}
private String getHighlightedText(Document doc, String field, Highlighter highlighter,
int maxLength)
throws InvalidTokenOffsetsException, IOException {
try {
var highlightedText = highlighter.getBestFragment(analyzer, field, doc.get(field));
if (highlightedText != null) {
return highlightedText;
}
} catch (IllegalArgumentException iae) {
// TODO Ignore this specific error for now because no proper fix is known yet.
if (!"boost must be a positive float, got -1.0".equals(iae.getMessage())) {
throw iae;
}
}
// Fall back to the leading part of the field when nothing could be highlighted.
var fieldValue = doc.get(field);
return StringUtils.substring(fieldValue, 0, maxLength);
}
}

@ -0,0 +1,36 @@
package run.halo.app.search.post;
import java.time.Instant;
import lombok.Data;
import run.halo.app.theme.finders.vo.PostVo;
@Data
public class PostDoc {
public static final String ID_FIELD = "name";
private String name;
private String title;
private String excerpt;
private String content;
private Instant publishTimestamp;
private String permalink;
// TODO Move this static method to another place.
public static PostDoc from(PostVo postVo) {
var post = new PostDoc();
post.setName(postVo.getMetadata().getName());
post.setTitle(postVo.getSpec().getTitle());
post.setExcerpt(postVo.getStatus().getExcerpt());
post.setPublishTimestamp(postVo.getSpec().getPublishTime());
post.setContent(postVo.getContent().getContent());
post.setPermalink(postVo.getStatus().getPermalink());
return post;
}
}

@ -0,0 +1,76 @@
package run.halo.app.search.post;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CountDownLatch;
import org.springframework.context.event.EventListener;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Component;
import reactor.core.Exceptions;
import run.halo.app.event.post.PostPublishedEvent;
import run.halo.app.event.post.PostRecycledEvent;
import run.halo.app.event.post.PostUnpublishedEvent;
import run.halo.app.plugin.extensionpoint.ExtensionGetter;
import run.halo.app.theme.finders.PostFinder;
@Component
public class PostEventListener {
private final ExtensionGetter extensionGetter;
private final PostFinder postFinder;
public PostEventListener(ExtensionGetter extensionGetter,
PostFinder postFinder) {
this.extensionGetter = extensionGetter;
this.postFinder = postFinder;
}
@Async
@EventListener(PostPublishedEvent.class)
public void handlePostPublished(PostPublishedEvent publishedEvent) throws InterruptedException {
var postVo = postFinder.getByName(publishedEvent.getPostName());
var postDoc = PostDoc.from(postVo);
var latch = new CountDownLatch(1);
extensionGetter.getEnabledExtension(PostSearchService.class)
.doOnNext(searchService -> {
try {
searchService.addDocuments(List.of(postDoc));
} catch (Exception e) {
throw Exceptions.propagate(e);
}
})
.doFinally(signalType -> latch.countDown())
.subscribe();
latch.await();
}
@Async
@EventListener(PostUnpublishedEvent.class)
public void handlePostUnpublished(PostUnpublishedEvent unpublishedEvent)
throws InterruptedException {
deletePostDoc(unpublishedEvent.getPostName());
}
@Async
@EventListener(PostRecycledEvent.class)
public void handlePostRecycled(PostRecycledEvent recycledEvent) throws InterruptedException {
deletePostDoc(recycledEvent.getPostName());
}
void deletePostDoc(String postName) throws InterruptedException {
var latch = new CountDownLatch(1);
extensionGetter.getEnabledExtension(PostSearchService.class)
.doOnNext(searchService -> {
try {
searchService.removeDocuments(Set.of(postName));
} catch (Exception e) {
throw Exceptions.propagate(e);
}
})
.doFinally(signalType -> latch.countDown())
.subscribe();
latch.await();
}
}

@ -0,0 +1,19 @@
package run.halo.app.search.post;
import java.time.Instant;
import lombok.Data;
@Data
public class PostHit {
private String name;
private String title;
private String content;
private Instant publishTimestamp;
private String permalink;
}

@ -0,0 +1,69 @@
package run.halo.app.search.post;
import static run.halo.app.extension.router.QueryParamBuildUtil.buildParametersFromType;
import static run.halo.app.infra.utils.GenericClassUtils.generateConcreteClass;
import org.springdoc.core.fn.builders.apiresponse.Builder;
import org.springdoc.webflux.core.fn.SpringdocRouteBuilder;
import org.springframework.stereotype.Component;
import org.springframework.web.reactive.function.server.RouterFunction;
import org.springframework.web.reactive.function.server.ServerRequest;
import org.springframework.web.reactive.function.server.ServerResponse;
import reactor.core.Exceptions;
import reactor.core.publisher.Mono;
import run.halo.app.core.extension.endpoint.CustomEndpoint;
import run.halo.app.extension.GroupVersion;
import run.halo.app.plugin.extensionpoint.ExtensionGetter;
import run.halo.app.search.SearchParam;
import run.halo.app.search.SearchResult;
@Component
public class PostSearchEndpoint implements CustomEndpoint {
private static final String API_VERSION = "api.halo.run/v1alpha1";
private final ExtensionGetter extensionGetter;
public PostSearchEndpoint(ExtensionGetter extensionGetter) {
this.extensionGetter = extensionGetter;
}
@Override
public RouterFunction<ServerResponse> endpoint() {
final var tag = API_VERSION + "/Post";
return SpringdocRouteBuilder.route()
.GET("indices/post", this::search,
builder -> {
builder.operationId("SearchPost")
.tag(tag)
.description("Search posts with fuzzy query")
.response(Builder.responseBuilder().implementation(
generateConcreteClass(SearchResult.class, PostHit.class,
() -> "PostHits")));
buildParametersFromType(builder, SearchParam.class);
}
)
.build();
}
private Mono<ServerResponse> search(ServerRequest request) {
return Mono.fromSupplier(
() -> new SearchParam(request.queryParams()))
.flatMap(param -> extensionGetter.getEnabledExtension(PostSearchService.class)
.switchIfEmpty(Mono.error(() ->
new RuntimeException("Please enable a post search service before searching")))
.map(searchService -> {
try {
return searchService.search(param);
} catch (Exception e) {
throw Exceptions.propagate(e);
}
}))
.flatMap(result -> ServerResponse.ok().bodyValue(result));
}
@Override
public GroupVersion groupVersion() {
return GroupVersion.parseAPIVersion(API_VERSION);
}
}

@ -0,0 +1,17 @@
package run.halo.app.search.post;
import java.util.List;
import java.util.Set;
import org.pf4j.ExtensionPoint;
import run.halo.app.search.SearchParam;
import run.halo.app.search.SearchResult;
public interface PostSearchService extends ExtensionPoint {
SearchResult<PostHit> search(SearchParam searchParam) throws Exception;
void addDocuments(List<PostDoc> posts) throws Exception;
void removeDocuments(Set<String> postNames) throws Exception;
}

@ -0,0 +1,10 @@
apiVersion: plugin.halo.run/v1alpha1
kind: SearchEngine
metadata:
name: lucene
spec:
logo: https://lucene.apache.org/theme/images/lucene/lucene_logo_green_300.png
website: https://lucene.apache.org/
displayName: Lucene
description: Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.
postSearchImpl: run.halo.app.search.post.LucenePostSearchService

@ -0,0 +1,31 @@
package run.halo.app.core.extension;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.mockito.Mockito.when;
import java.util.Map;
import java.util.function.Function;
import org.junit.jupiter.api.Test;
import org.mockito.Mockito;
import run.halo.app.extension.MetadataOperator;
class PostTest {
@Test
void staticIsPublishedTest() {
var test = (Function<Map<String, String>, Boolean>) (labels) -> {
var metadata = Mockito.mock(MetadataOperator.class);
when(metadata.getLabels()).thenReturn(labels);
return Post.isPublished(metadata);
};
assertEquals(false, test.apply(Map.of()));
assertEquals(false, test.apply(Map.of("content.halo.run/published", "false")));
assertEquals(false, test.apply(Map.of("content.halo.run/published", "False")));
assertEquals(false, test.apply(Map.of("content.halo.run/published", "0")));
assertEquals(false, test.apply(Map.of("content.halo.run/published", "1")));
assertEquals(false, test.apply(Map.of("content.halo.run/published", "T")));
assertEquals(false, test.apply(Map.of("content.halo.run/published", "")));
assertEquals(true, test.apply(Map.of("content.halo.run/published", "true")));
assertEquals(true, test.apply(Map.of("content.halo.run/published", "True")));
}
}
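The assertions in this test accept only a case-insensitive `true` as published; values such as `1`, `T`, or an empty string count as unpublished. One way to get exactly that behaviour is `Boolean.parseBoolean`. The sketch below is an assumption about how such a check could look, since `Post.isPublished` itself is not shown in this part of the diff; `PublishedLabelSketch` is a hypothetical class.

```java
import java.util.Map;

final class PublishedLabelSketch {

    static final String PUBLISHED_LABEL = "content.halo.run/published";

    /** Returns true only when the label value equals "true", ignoring case. */
    static boolean isPublished(Map<String, String> labels) {
        // parseBoolean returns false for null and for anything other than "true".
        return labels != null && Boolean.parseBoolean(labels.get(PUBLISHED_LABEL));
    }
}
```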

@ -3,6 +3,8 @@ package run.halo.app.core.extension.endpoint;
import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.ArgumentMatchers.eq;
+import static org.mockito.ArgumentMatchers.isA;
+import static org.mockito.Mockito.doNothing;
import static org.mockito.Mockito.when;
import org.junit.jupiter.api.BeforeEach;
@ -11,12 +13,14 @@ import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;
+import org.springframework.context.ApplicationEventPublisher;
import org.springframework.test.web.reactive.server.WebTestClient;
import reactor.core.publisher.Mono;
import run.halo.app.content.PostRequest;
import run.halo.app.content.PostService;
import run.halo.app.content.TestPost;
import run.halo.app.core.extension.Post;
+import run.halo.app.event.post.PostPublishedEvent;
import run.halo.app.extension.ReactiveExtensionClient;
/**
@ -27,15 +31,19 @@ import run.halo.app.extension.ReactiveExtensionClient;
*/
@ExtendWith(MockitoExtension.class)
class PostEndpointTest {
@Mock
-private PostService postService;
+PostService postService;
@Mock
-private ReactiveExtensionClient client;
+ReactiveExtensionClient client;
+@Mock
+ApplicationEventPublisher eventPublisher;
@InjectMocks
-private PostEndpoint postEndpoint;
+PostEndpoint postEndpoint;
-private WebTestClient webTestClient;
+WebTestClient webTestClient;
@BeforeEach
void setUp() {
@ -75,9 +83,10 @@ class PostEndpointTest {
void publishPost() {
Post post = TestPost.postV1();
when(postService.publishPost(any())).thenReturn(Mono.just(post));
-when(client.fetch(eq(Post.class), eq(post.getMetadata().getName())))
+when(client.get(eq(Post.class), eq(post.getMetadata().getName())))
.thenReturn(Mono.just(post));
when(client.update(any())).thenReturn(Mono.just(post));
+doNothing().when(eventPublisher).publishEvent(isA(PostPublishedEvent.class));
webTestClient.put()
.uri("/posts/post-A/publish")

@ -0,0 +1,30 @@
package run.halo.app.infra;
import static org.junit.jupiter.api.Assertions.assertTrue;
import org.junit.jupiter.api.Nested;
import org.junit.jupiter.api.Test;
import run.halo.app.infra.SystemSetting.ExtensionPointEnabled;
import run.halo.app.infra.utils.JsonUtils;
class SystemSettingTest {
@Nested
class ExtensionPointEnabledTest {
@Test
void deserializeTest() {
var json = """
{
"run.halo.app.search.post.PostSearchService": [
"run.halo.app.search.post.LucenePostSearchService"
]
}
""";
var enabled = JsonUtils.jsonToObject(json, ExtensionPointEnabled.class);
assertTrue(enabled.containsKey("run.halo.app.search.post.PostSearchService"));
}
}
}

@ -19,7 +19,8 @@ import run.halo.app.extension.ReactiveExtensionClient;
@SpringBootTest(properties = {"halo.security.initializer.disabled=false",
"halo.security.initializer.super-admin-username=fake-admin",
-"halo.security.initializer.super-admin-password=fake-password"})
+"halo.security.initializer.super-admin-password=fake-password",
+"halo.required-extension-disabled=true"})
@AutoConfigureWebTestClient
@AutoConfigureTestDatabase
class SuperAdminInitializerTest {