
Elasticsearch (Part 4): Basic Queries, REST and Java Implementations

JAVA面试官 73


Querying is the most important part of ES.

4.1 Preparing the test data

The test data in this article has the format below: a data structure that resembles an article.

A batch of documents was loaded ahead of time for this article; readers may want to load a few dozen documents before running the examples.

Mapping of the test data

{
  "mappings": {
    "_doc": {
      "properties": {
        "authors": {
          "type": "keyword"
        },
        "code": {
          "type": "keyword"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "id": {
          "type": "keyword"
        },
        "orgName": {
          "type": "keyword"
        },
        "pubTime": {
          "type": "integer"
        },
        "title": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "type": {
          "type": "keyword"
        }
      }
    }
  }
}
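4.2 term query

A term query is an exact match: the search keyword is not analyzed (tokenized) before the search runs.

When a term query targets a text field, the keyword is matched against the tokens that the analyzer produced for that field, not against the original text.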
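To make the difference between keyword and text fields concrete, here is a minimal sketch of a toy inverted index in plain Java. It is not how Elasticsearch is implemented: the class `TermQuerySketch` and its methods are hypothetical names, and a whitespace split stands in for the ik_smart analyzer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index (hypothetical): shows why term behaves differently on keyword and text fields. */
class TermQuerySketch {
    // token -> ids of the documents whose field contains that token
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** keyword field: the whole value is indexed as one single token. */
    void indexKeyword(int docId, String value) {
        postings.computeIfAbsent(value, k -> new TreeSet<>()).add(docId);
    }

    /** text field: the value is split into tokens first (whitespace split is an assumption). */
    void indexText(int docId, String value) {
        for (String token : value.trim().split("\\s+")) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** term query: the keyword is looked up as-is, without being analyzed. */
    Set<Integer> term(String keyword) {
        return postings.getOrDefault(keyword, Collections.emptySet());
    }
}
```

With this model, a term lookup on a keyword field only succeeds for the complete value, while on a text field it only succeeds for one of the individual tokens.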

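REST request

### term
POST /points/_search
{
  "from": 0,
  "size": 2,
  "query": {
    "term": {
      "authors": {
        "value": "李某某"
      }
    }
  }
}

Java code

// Imports shared by the Java examples in this article (Elasticsearch high-level REST client).
// @Test comes from whichever JUnit version the project uses.
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import java.io.IOException;
import java.util.Map;

@Test
public void term() throws IOException {
    // Build the SearchRequest
    SearchRequest searchRequest = new SearchRequest("points");
    // Specify the query conditions
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.from(0);
    searchSourceBuilder.size(2);
    // termQuery (single value) on the "authors" field defined in the mapping
    searchSourceBuilder.query(QueryBuilders.termQuery("authors", "李某某"));
    searchRequest.source(searchSourceBuilder);
    // Execute the query
    SearchResponse searchResponse = ClientHelper.client().search(searchRequest, RequestOptions.DEFAULT);
    // Read the _source of each hit
    for (SearchHit hit : searchResponse.getHits().getHits()) {
        Map<String, Object> item = hit.getSourceAsMap();
        System.out.println(item);
    }
}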
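4.3 terms query

terms works like term: the keyword is not analyzed and must match exactly.

terms is used when a field should match any one of several values; it is the equivalent of IN in MySQL.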
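The IN semantics can be sketched as a union of individual term lookups. The snippet below is only an illustration in plain Java; `TermsQuerySketch` and the `postings` map (token to document ids) are hypothetical and do not reflect Elasticsearch internals.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TermsQuerySketch {
    /** terms query: a document matches if its field equals ANY of the given values (like SQL IN). */
    static Set<Integer> terms(Map<String, Set<Integer>> postings, String... values) {
        Set<Integer> hits = new TreeSet<>();
        for (String value : values) {
            // each value is an exact, unanalyzed lookup, exactly like a single term query
            hits.addAll(postings.getOrDefault(value, Collections.emptySet()));
        }
        return hits;
    }
}
```

Values that match nothing simply contribute no documents, so the result is the union of the hits of the remaining values.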

REST request

### terms
POST /points/_search
{
  "from": 0,
  "size": 2,
  "query": {
    "terms": {
      "authors": ["李某某", "刘某某", "王某某"]
    }
  }
}

Java code

@Test
public void terms() throws IOException {
    // Build the SearchRequest
    SearchRequest searchRequest = new SearchRequest("points");
    // Specify the query conditions
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.from(0);
    searchSourceBuilder.size(2);
    // The field is "authors", as defined in the mapping
    searchSourceBuilder.query(QueryBuilders.termsQuery("authors", "李某某", "刘某某", "王某某"));
    searchRequest.source(searchSourceBuilder);
    // Execute the query
    SearchResponse searchResponse = ClientHelper.client().search(searchRequest, RequestOptions.DEFAULT);
    // Read the _source of each hit
    for (SearchHit hit : searchResponse.getHits().getHits()) {
        Map<String, Object> item = hit.getSourceAsMap();
        System.out.println(item);
    }
}
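4.4 match and match_all queries

match is a higher-level query that chooses how to search based on the type of the queried field:

If the field is a date or a number, the query string is converted to that type automatically.

If the field is a keyword, match does not analyze the query string.

If the field holds analyzable content (text), match analyzes the query string with the field's analyzer.

Under the hood, a match query is executed as several term queries.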
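The claim that match boils down to several term queries can be sketched as follows. This is a simplified model in plain Java, not the real implementation: `MatchQuerySketch` is a hypothetical name, a whitespace split stands in for ik_smart, scoring is ignored, and the `and` flag mirrors the operator option shown in section 4.5.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class MatchQuerySketch {
    /** Stand-in analyzer (assumption): split on whitespace. */
    static List<String> analyze(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    /** match: one term lookup per token, combined with OR (union) or AND (intersection). */
    static Set<Integer> match(Map<String, Set<Integer>> postings, String query, boolean and) {
        Set<Integer> result = null;
        for (String token : analyze(query)) {
            Set<Integer> hits = postings.getOrDefault(token, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);   // first token seeds the result
            } else if (and) {
                result.retainAll(hits);         // AND: keep documents containing every token
            } else {
                result.addAll(hits);            // OR: keep documents containing any token
            }
        }
        return result == null ? new TreeSet<>() : result;
    }
}
```

The only difference from a term query in this model is the analysis step in front of the lookups.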

match_all via REST. It matches every document. Note: by default ES returns only 10 hits.

POST /points/_search
{
  "query": {
    "match_all": {}
  }
}

Java implementation of match_all.

@Test
public void matchAll() throws IOException {
    // Build the SearchRequest
    SearchRequest searchRequest = new SearchRequest("points");
    // Specify the query conditions
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(QueryBuilders.matchAllQuery());
    searchRequest.source(searchSourceBuilder);
    // Execute the query
    SearchResponse searchResponse = ClientHelper.client().search(searchRequest, RequestOptions.DEFAULT);
    // Read the _source of each hit
    for (SearchHit hit : searchResponse.getHits().getHits()) {
        Map<String, Object> item = hit.getSourceAsMap();
        System.out.println(item);
    }
}

This section extracts a shared method, query; the Java examples that follow only build the query condition.

// Shared method: runs any query against the "points" index and prints every hit
public void query(QueryBuilder query) throws IOException {
    // Build the SearchRequest
    SearchRequest searchRequest = new SearchRequest("points");
    // Specify the query conditions
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(query);
    searchRequest.source(searchSourceBuilder);
    // Execute the query
    SearchResponse searchResponse = ClientHelper.client().search(searchRequest, RequestOptions.DEFAULT);
    // Read the _source of each hit
    for (SearchHit hit : searchResponse.getHits().getHits()) {
        Map<String, Object> item = hit.getSourceAsMap();
        System.out.println(item);
    }
}
4.5 match query and boolean match query

【1】match via REST

POST /points/_search
{
  "query": {
    "match": {
      "title": "家居建材"
    }
  }
}

Note that each hit in the response carries a _score field, which indicates how well it matched.

Java implementation of match

@Test
public void match() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders.matchQuery("title", "家居建材");
    this.query(queryBuilder);
}

【2】Boolean match: searches a single field and combines the analyzed terms with and or or.

# and: the title must contain both 家居 and 建材
POST /points/_search
{
  "query": {
    "match": {
      "title": {
        "query": "家居 建材",
        "operator": "and"
      }
    }
  }
}

# or: the title must contain 家居 or 建材
POST /points/_search
{
  "query": {
    "match": {
      "title": {
        "query": "家居 建材",
        "operator": "or"
      }
    }
  }
}

Java

@Test
public void boolMatchAnd() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders.matchQuery("title", "家居 建材").operator(Operator.AND);
    this.query(queryBuilder);
}

@Test
public void boolMatchOr() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders.matchQuery("title", "家居 建材").operator(Operator.OR);
    this.query(queryBuilder);
}
4.6 multi_match query

match searches a single field; multi_match searches several fields for the same query text.

REST

POST /points/_search
{
  "query": {
    "multi_match": {
      "query": "公司",
      "fields": ["title", "orgName"]
    }
  }
}

Java

@Test
public void multiMatch() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders.multiMatchQuery("公司", "title", "orgName");
    this.query(queryBuilder);
}
4.7 id and ids queries

A lookup by id is the equivalent of where id = ? in SQL.

id via REST

GET /points/_doc/6060

id in Java

@Test
public void getById() throws IOException {
    // Create the request
    GetRequest getRequest = new GetRequest("points", "6060");
    // Execute it
    GetResponse getResponse = ClientHelper.client().get(getRequest, RequestOptions.DEFAULT);
    System.out.println(getResponse.getSourceAsMap());
}

ids is the equivalent of where id in (...).

ids via REST

POST /points/_search
{
  "query": {
    "ids": {
      "values": ["6060", "6061"]
    }
  }
}

ids in Java

@Test
public void getByIds() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders.idsQuery().addIds("6060", "6061");
    this.query(queryBuilder);
}
4.8 prefix query

A prefix query finds documents whose field value starts with the given keyword.
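One way to picture a prefix lookup is as a range scan over a sorted set of terms. The sketch below is an illustration in plain Java under that assumption; `PrefixQuerySketch` is a hypothetical name and the `TreeSet` merely stands in for the index's sorted terms.

```java
import java.util.SortedSet;
import java.util.TreeSet;

class PrefixQuerySketch {
    /**
     * Returns every term that starts with the prefix by scanning one contiguous range of the sorted set.
     * Assumes no term contains the character '\uffff' right after the prefix.
     */
    static SortedSet<String> prefix(TreeSet<String> terms, String prefix) {
        // all strings starting with `prefix` sort between prefix and prefix + '\uffff'
        return terms.subSet(prefix, true, prefix + Character.MAX_VALUE, true);
    }
}
```

Because the matching terms are adjacent in sorted order, the scan touches as many terms as share the prefix, which is why a very short prefix can be expensive.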

REST

POST /points/_search
{
  "query": {
    "prefix": {
      "orgName": {
        "value": "字节"
      }
    }
  }
}

Java

@Test
public void prefix() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders.prefixQuery("orgName", "字节");
    this.query(queryBuilder);
}
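4.9 fuzzy query

Given an approximate term, ES returns results that are close to it, even when the input contains typos.

prefix_length sets how many leading characters must match exactly.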
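The idea can be sketched with a plain edit-distance check. This is a simplified model, not Elasticsearch's implementation: `FuzzyQuerySketch` and the `maxEdits` parameter are hypothetical, and the distance used here is classic Levenshtein, without the transposition handling that ES supports.

```java
class FuzzyQuerySketch {
    /** Classic Levenshtein distance: minimum number of insertions, deletions and substitutions. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    /** A term matches if its first prefixLength characters are identical and it is within maxEdits edits. */
    static boolean fuzzyMatches(String query, String term, int prefixLength, int maxEdits) {
        if (query.length() < prefixLength || term.length() < prefixLength) {
            return false;
        }
        if (!query.regionMatches(0, term, 0, prefixLength)) {
            return false;
        }
        return editDistance(query, term) <= maxEdits;
    }
}
```

With the article's example, 今日后条 (presumably a misspelling of 今日头条) shares the two-character prefix 今日 with 今日头条 and differs by a single substitution, so it matches when one edit is allowed.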

REST

POST /points/_search
{
  "query": {
    "fuzzy": {
      "orgName": {
        "value": "今日后条",
        "prefix_length": 2
      }
    }
  }
}

Java

@Test
public void fuzzy() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders
            .fuzzyQuery("orgName", "今日后条")
            .prefixLength(2);
    this.query(queryBuilder);
}
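4.10 wildcard query

A wildcard query is the equivalent of LIKE in SQL. The pattern may contain the wildcard *, which matches any number of characters, and the placeholder ?, which matches exactly one character.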
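The meaning of the two symbols can be illustrated by translating a wildcard pattern into a regular expression. The snippet is a sketch in plain Java, not what Elasticsearch does internally; `WildcardQuerySketch` is a hypothetical name and escape sequences are not handled.

```java
import java.util.regex.Pattern;

class WildcardQuerySketch {
    /** Translates a wildcard pattern into a regex: * becomes .*, ? becomes ., anything else is literal. */
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    /** The pattern has to cover the whole term, just like LIKE without surrounding % signs. */
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }
}
```

Under this model 中国* accepts any value that starts with 中国, while 中国??? only accepts values with exactly three more characters after 中国.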

REST

# the value may contain the wildcard * and the placeholder ?
POST /points/_search
{
  "query": {
    "wildcard": {
      "orgName": {
        "value": "中国*"
      }
    }
  }
}

Java

@Test
public void wildcard() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders
            .wildcardQuery("orgName", "中国???");
    this.query(queryBuilder);
}
4.11 range query

A range query applies greater-than and less-than bounds to a field. It is most commonly used on numeric fields, as in this example, and it also works on date fields.

REST

POST /points/_search
{
  "query": {
    "range": {
      "pubTime": {
        "gte": 20230810,
        "lte": 20230830
      }
    }
  }
}

Java

@Test
public void range() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders
            .rangeQuery("pubTime")
            .gte(20230810)
            .lte(20230830);
    this.query(queryBuilder);
}
4.12 regexp query

Matches content with a regular expression.

prefix, wildcard, fuzzy, and regexp queries are comparatively slow.

REST

POST /points/_search
{
  "query": {
    "regexp": {
      "code": "758[0-9]{10}"
    }
  }
}

Java

@Test
public void regexp() throws IOException {
    QueryBuilder queryBuilder = QueryBuilders.regexpQuery("code", "758[0-9]{10}");
    this.query(queryBuilder);
}
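4.13 Deep paging with scroll

ES limits from + size: by default their sum cannot exceed 10,000 (the index.max_result_window setting).

How ES executes a from + size query:

【1】Analyze the keyword, look the tokens up in the index, and obtain the matching document IDs.

【2】Fetch the corresponding data from every shard (slow).

【3】Sort the data by score (slow).

【4】Discard part of the fetched data according to from and size, and return the rest.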
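The cost of these steps can be sketched with a small simulation in plain Java. It is a simplified model, not Elasticsearch code: `FromSizeSketch` is a hypothetical name and each shard is represented only by the scores of its matching documents.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class FromSizeSketch {
    /** Returns the scores on the page [from, from + size) of the global ranking. */
    static List<Double> page(List<List<Double>> shards, int from, int size) {
        List<Double> candidates = new ArrayList<>();
        for (List<Double> shard : shards) {
            List<Double> sorted = new ArrayList<>(shard);
            sorted.sort(Comparator.reverseOrder());
            // every shard has to contribute its own top (from + size) hits
            candidates.addAll(sorted.subList(0, Math.min(from + size, sorted.size())));
        }
        // merge and sort everything that was fetched
        candidates.sort(Comparator.reverseOrder());
        if (from >= candidates.size()) {
            return new ArrayList<>();
        }
        // throw away the first `from` entries and keep `size` of the rest
        return new ArrayList<>(candidates.subList(from, Math.min(from + size, candidates.size())));
    }
}
```

In this model up to shards × (from + size) entries are fetched and sorted just to return size of them, which is why the work grows with the page depth.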

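How ES executes a scroll query:

【1】Analyze the keyword, look the tokens up in the index, and obtain the matching document IDs.

【2】Store the document IDs in a context kept by ES.

【3】Fetch size documents; IDs that have been returned are removed from the context.

【4】On the next request, take the IDs straight from the context and fetch those documents.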
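The scroll steps can be modelled as a cursor that drains a stored list of IDs. The class below is a conceptual sketch in plain Java; `ScrollSketch` and its members are hypothetical names, and the real scroll context holds more than a list of IDs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class ScrollSketch {
    // the "context": IDs that matched the query and have not been returned yet
    private final Deque<Integer> remainingIds;
    private final int size;

    ScrollSketch(List<Integer> matchingIds, int size) {
        // the IDs are captured once, when the scroll starts
        this.remainingIds = new ArrayDeque<>(matchingIds);
        this.size = size;
    }

    /** Returns the next batch and removes its IDs from the context; an empty list means the scroll is done. */
    List<Integer> next() {
        List<Integer> batch = new ArrayList<>();
        while (batch.size() < size && !remainingIds.isEmpty()) {
            batch.add(remainingIds.poll());
        }
        return batch;
    }
}
```

Each call only does the work for one batch, and a caller loops until an empty batch comes back.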

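Pros and cons

Scroll is not suited to real-time queries, because the context is a snapshot taken at the first request and later changes to the index are not visible in it. It is a good fit for back-office and admin tasks.

REST: 1m is how long the scroll context is kept alive.

# Step 1
GET points/_search?scroll=1m
{
  "query": {
    "match_all": {}
  },
  "size": 20,
  "sort": [
    {
      "pubTime": {
        "order": "desc"
      }
    }
  ]
}

# The response contains a scroll ID

# Step 2
POST _search/scroll
{
  "scroll_id": "#scrollID#",
  "scroll": "1m"
}

# Delete the scroll context to end the scroll
DELETE _search/scroll/#scrollID#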

Java

@Test
public void scroll() throws IOException {
    //【1】Start the scroll
    SearchRequest searchRequest = new SearchRequest("points");
    // Keep the scroll context alive for 1 minute, as in the REST example
    searchRequest.scroll(TimeValue.timeValueMinutes(1L));
    // Query
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.size(5);
    searchSourceBuilder.sort("pubTime", SortOrder.DESC);
    searchSourceBuilder.query(QueryBuilders.matchAllQuery());
    searchRequest.source(searchSourceBuilder);
    // Execute
    SearchResponse searchResponse = ClientHelper.client().search(searchRequest, RequestOptions.DEFAULT);
    String scrollId = searchResponse.getScrollId();
    for (SearchHit hit : searchResponse.getHits().getHits()) {
        System.out.println(hit.getSourceAsMap());
    }

    //【2】Loop over the remaining batches
    while (true) {
        SearchScrollRequest searchScrollRequest = new SearchScrollRequest(scrollId);
        searchScrollRequest.scroll(TimeValue.timeValueMinutes(1L));
        // Call scroll(), not search(): search() would rerun the original query from the start
        SearchResponse searchResp = ClientHelper.client().scroll(searchScrollRequest, RequestOptions.DEFAULT);
        // Always continue with the scroll ID from the latest response
        scrollId = searchResp.getScrollId();
        SearchHit[] searchHits = searchResp.getHits().getHits();
        // An empty batch means every hit has been returned
        if (searchHits == null || searchHits.length == 0) {
            break;
        }
        for (SearchHit searchHit : searchHits) {
            System.out.println(searchHit.getSourceAsMap());
        }
    }

    //【3】Delete the scroll context
    ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
    clearScrollRequest.addScrollId(scrollId);
    ClearScrollResponse clearScrollResponse = ClientHelper.client().clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
    System.out.println(clearScrollResponse);
}
4.14 delete-by-query

Deletes a large number of documents that match a query condition such as term or match.

If most of the data has to go, it is better to create a brand-new index and copy the documents worth keeping into it.

REST

POST /points/_delete_by_query
{
  "query": {
    "range": {
      "pubTime": {
        "gte": 20230810,
        "lte": 20230830
      }
    }
  }
}

Java

@Test
public void deleteByQuery() throws IOException {
    // Request
    DeleteByQueryRequest deleteByQueryRequest = new DeleteByQueryRequest("points");
    QueryBuilder queryBuilder = QueryBuilders
            .rangeQuery("pubTime")
            .gte(20230810)
            .lte(20230830);
    deleteByQueryRequest.setQuery(queryBuilder);
    // Execute
    BulkByScrollResponse resp = ClientHelper.client().deleteByQuery(deleteByQueryRequest, RequestOptions.DEFAULT);
    System.out.println(resp);
}
