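Elasticsearch

Introduction

Elasticsearch is a distributed, scalable, real-time search and analytics engine built on top of the full-text search library Apache Lucene™, and it is the core of the Elastic Stack. ES is written in Java and exposes a simple RESTful API, so developers can build search features against that API without dealing with Lucene's complexity directly.

Advantages:

- A distributed, real-time document store in which every field can be indexed and searched
- A distributed, real-time analytics and search engine
- Scales out to hundreds of nodes and handles petabytes of structured or unstructured data

Elastic Stack: consists of Elasticsearch, Kibana, Beats, and Logstash (also known as the ELK Stack)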

Installation

Installing with Docker

Pull the image: docker pull elasticsearch:7.14.0

Create the data directories:

mkdir -p /opt/docker_app/elasticsearch/config
mkdir -p /opt/docker_app/elasticsearch/logs
mkdir -p /opt/docker_app/elasticsearch/data
mkdir -p /opt/docker_app/elasticsearch/plugins

Create the configuration file:

echo "http.host: 0.0.0.0">>/opt/docker_app/elasticsearch/config/elasticsearch.yml

Adjust the system configuration:

vim /etc/sysctl.conf
# add this line to /etc/sysctl.conf:
vm.max_map_count=655360
# reload the settings:
sysctl -p /etc/sysctl.conf

Create the container (note: Elasticsearch cannot be started as the root user):

docker run --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e ES_JAVA_OPTS="-Xms1g -Xmx2g" --restart=always -v /opt/docker_app/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /opt/docker_app/elasticsearch/data:/usr/share/elasticsearch/data -v /opt/docker_app/elasticsearch/plugins:/usr/share/elasticsearch/plugins -v /opt/docker_app/elasticsearch/logs:/usr/share/elasticsearch/logs -d elasticsearch:7.14.0

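Check that the node is reachable: open http://ip:9200 in a browser.
If a JSON document describing the node comes back, the installation succeeded.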

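Kibana

Kibana is an open-source analytics and visualization platform for Elasticsearch. With Kibana you can search, view, and interact with the data stored in ES indices, run advanced data analysis, and display the results as charts, tables, and maps. (If you already know the ES API well, a tool such as Postman works too.)

Installing with Docker

Pull the image: docker pull kibana:7.14.0

Create the data directory and the configuration file: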

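mkdir -p /opt/docker_app/kibana/config
vim /opt/docker_app/kibana/config/kibana.yml

# Configuration contents:
server.host: "0.0.0.0"
server.shutdownTimeout: "5s"
# Elasticsearch cluster address (must be reachable from inside the Kibana container)
elasticsearch.hosts: [ "http://localhost:9200" ]
# Elasticsearch username and password
#elasticsearch.username: "kibana_system"
#elasticsearch.password: "JNeepMbbA0inbAI8voK3"
# Show the Kibana UI in Chinese
i18n.locale: zh-CN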

Start the container:

docker run -d --name=kibana -p 5601:5601 --restart=always -v /opt/docker_app/kibana/config/kibana.yml:/usr/share/kibana/config/kibana.yml kibana:7.14.0

Open Kibana: http://ip:5601

Setting passwords

Setting the ES passwords

Add the following to the ES configuration file (elasticsearch.yml):

http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-headers: Authorization
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true

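Enter the container and run ./bin/elasticsearch-setup-passwords interactive
You are prompted to set a password for each built-in account.
Accounts:
- elastic: superuser account
- kibana: account Kibana uses to connect to ES
- logstash_system: account Logstash uses to connect to ES
- beats_system: account Beats (for example Filebeat) uses to connect to ES
- apm_system: account used by the APM system
- remote_monitoring_user: account used for remote monitoring
From now on, accessing ES requires a username and password (any of the accounts set up above can be used).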

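Core concepts

Index

An index is a collection of documents with broadly similar characteristics. It is roughly comparable to a database in a traditional relational system: a place to store related documents. For example, you might have one index for product data, one for order data, and one for user data. An index is identified by a name (which must be all lowercase), and that name is used whenever documents in the index are indexed, searched, updated, or deleted.

Mapping

A mapping defines how a document and the fields it contains are stored and indexed. By default ES creates the mapping automatically from the data that is inserted; a mapping can also be created manually. A mapping mainly consists of field names, field types, and similar settings.

Document

A document is a single record stored in an index, and it is the smallest unit that can be indexed. Documents in ES are represented as lightweight JSON.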

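Basic operations

Index operations

Create an index: PUT /index_name (by default ES creates one primary shard and one replica shard for a new index)
- Example: PUT /products

  On a single-node cluster the replica shard cannot be allocated to the same node as its primary, so it stays unassigned and the index status is yellow.
  # Create an index without replica shards:
  PUT /goods
  {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  }
  Index health status in ES: red (index unavailable), yellow (index usable, but at risk), green (healthy).
List all indices in ES: GET /_cat/indices, or GET /_cat/indices?v (adds column headers)
Delete an index: DELETE /index_name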

Mapping operations

Creating a mapping (created together with the index)

Example:

# Create the index together with its mapping
PUT /products
{
	"settings":{
		"number_of_shards": 1,
		"number_of_replicas": 0
	},
	"mappings": {
	  "properties": {
	    "id":{
	      "type": "integer"
	    },
	    "title":{
	      "type": "keyword"
	    },
	    "price":{
	      "type": "double"
	    },
	    "created_at":{
	      "type": "date"
	    },
	    "description":{
	      "type": "text"
	    }
	  }
	}
}

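Mapping types:

String types: keyword (exact value), text (full text)

Integer types: integer, long

Floating-point types: float, double

Boolean type: boolean

Date type: date

IPv4 and IPv6 address type: ip

Automatic type detection: when a field has no explicit mapping, ES infers its type from the data (dynamic mapping)

Nested: nested object type

For more types, see the official documentation.

View the mapping of an index: GET /index_name/_mapping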

Document operations

Add a document: POST /index_name/_doc/[id]

Example:

# Let ES generate the document _id
POST /products/_doc/
{
	"title" : "可口可乐" ,
	"price": 3.5,
	"created_at" : "2022-09-15",
	"description" : "肥宅快乐水"
}

# Specify the document _id
POST /products/_doc/1
{
	"id": 1,
	"title" : "卫龙小辣棒" ,
	"price": 6.2,
	"created_at" : "2022-09-15",
	"description" : "辣条"
}

Get a document: GET /index_name/_doc/_id
Delete a document: DELETE /index_name/_doc/_id

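Update a document:

Option 1: PUT /index_name/_doc/_id {document body} (this deletes the original document and indexes the new one, so any field missing from the request body is lost)

Example:

# Update (replace) the document
PUT /products/_doc/1
{
	"price":6.5
}

Option 2: POST /index_name/_update/_id {"doc": {fields to change}} (partial update: only the listed fields change)

Example:

# Partial update
POST /products/_update/1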
{   
  "doc":{
    "price":4.8
  }
}

Bulk operations: POST /index_name/_bulk {operations}

Example 1: bulk insert (each action and each document must stay on a single line)

POST /products/_bulk
{"index":{"_id":2}}
{"id": 2,"title" : "百事可乐" ,"price": 3.0,"created_at" : "2022-09-15","description" : "肥宅快乐水02"}
{"index":{"_id":3}}
{"id": 3,"title" : "乐事薯片" ,"price": 6.9,"created_at" : "2022-09-15","description" : "看电视时吃"}

Example 2: mixed bulk operations

# Bulk index, update, and delete
POST /products/_bulk
{"index":{"_id":4}}
{"id": 4,"title" : "口香糖" ,"price": 9.9,"created_at" : "2022-09-15","description" : "清新口气"}
{"update":{"_id":3}}
{"doc":{"price":7.9}}
{"delete":{"_id":2}}

Note: a bulk request does not fail as a whole when one operation fails. The remaining operations still run, and the response reports the status of each operation individually.
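The one-line-per-entry rule comes from the newline-delimited JSON body that the bulk endpoint expects. As a rough illustration, the Java sketch below assembles such a body by hand. `BulkBodySketch` and its methods are hypothetical helpers invented for this example, not part of any Elasticsearch client, and the documents are assumed to be already-serialized JSON strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an Elasticsearch API): assembles a bulk request body as NDJSON.
class BulkBodySketch {
    private final List<String> lines = new ArrayList<>();

    // "index" action: one action line followed by one document line.
    BulkBodySketch index(int id, String documentJson) {
        lines.add("{\"index\":{\"_id\":" + id + "}}");
        lines.add(requireSingleLine(documentJson));
        return this;
    }

    // "update" action: the partial document is wrapped in {"doc": ...}.
    BulkBodySketch update(int id, String partialDocJson) {
        lines.add("{\"update\":{\"_id\":" + id + "}}");
        lines.add("{\"doc\":" + requireSingleLine(partialDocJson) + "}");
        return this;
    }

    // "delete" action: only an action line, no document line.
    BulkBodySketch delete(int id) {
        lines.add("{\"delete\":{\"_id\":" + id + "}}");
        return this;
    }

    // Every line, including the last one, is terminated by a newline.
    String build() {
        StringBuilder body = new StringBuilder();
        for (String line : lines) {
            body.append(line).append('\n');
        }
        return body.toString();
    }

    // Rejects entries that would break the one-entry-per-line layout.
    private static String requireSingleLine(String json) {
        if (json.contains("\n") || json.contains("\r")) {
            throw new IllegalArgumentException("bulk entries must not contain line breaks");
        }
        return json;
    }

    public static void main(String[] args) {
        String body = new BulkBodySketch()
                .index(4, "{\"id\":4,\"title\":\"口香糖\",\"price\":9.9}")
                .update(3, "{\"price\":7.9}")
                .delete(2)
                .build();
        System.out.print(body);
    }
}
```

The resulting string would be sent as the body of POST /products/_bulk. In a real project a JSON library or the client's own bulk request classes would normally take the place of the string concatenation.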

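Advanced queries

DSL queries

ES provides a powerful way to search data called Query DSL. Query DSL is a flexible and expressive query language: a JSON request body is sent to ES through the REST API, and its rich syntax makes searches both more powerful and more concise.

Basic syntax:

GET /index_name/_search {JSON request body}

Match all: use match_all

Example:

# Match all documents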
GET /products/_search
{
  "query":{
    "match_all":{}
  }
}

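Term query: use term to search by exact term
- Example:
  # Term query
  GET /products/_search
  {
    "query": {
      "term": {
        "title": {
          "value": "可口可乐"
        }
      }
    }
  }
  Fields of every type other than text (keyword, integer, double, and so on) are not analyzed, so term must match the whole value exactly. text fields go through the standard analyzer, which splits Chinese into single characters and English into single words, so a term query on a text field only matches one character or one word.
Match query: use match to specify a match condition
- Example:
  # Match query
  GET /products/_search
  {
    "query": {
      "match": {
        "price": 3.5
      }
    }
  }
  For numeric fields match is an exact match; for text fields the query string is analyzed and matched against the indexed tokens (full-text match).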

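Range query: use range to specify a range

Example:

# Range query: gt (greater than) / gte (greater than or equal) / lt (less than) / lte (less than or equal)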
GET /products/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 1,
        "lte": 10
      }
    }
  }
}

Prefix query: use prefix to match by prefix (still matched against the indexed terms, as with term)

Example:

# Prefix query
GET /products/_search
{
  "query": {
    "prefix": {
      "title": {
        "value": "乐事"
      }
    }
  }
}

Wildcard query: use wildcard to search with wildcards (* matches any number of characters, ? matches exactly one character)

Example:

# Wildcard query
GET /products/_search
{
  "query": {
    "wildcard": {
      "description":{
        "value": "看*"
      }
    }
  }
}

Query by _id: use ids to fetch several documents by id

Example:

# Query by _id
GET /products/_search
{
  "query": {
    "ids": {
      "values": [1,2,3]
    }
  }
}

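Fuzzy query: use fuzzy for approximate matching
- Example:
  # Fuzzy query
  GET /products/_search
  {
    "query": {
      "fuzzy": {
        "title": {
          "value": "乐事"
        }
      }
    }
  }
  Note: the number of edits a fuzzy query tolerates must be between 0 and 2
  - search terms of up to 2 characters: no edits allowed
  - search terms of 3 to 5 characters: one edit allowed
  - search terms longer than 5 characters: at most two edits allowed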
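The length rule in this note matches what the Elasticsearch documentation describes for the AUTO fuzziness setting. It can be written down as a small Java function. `FuzzinessSketch.allowedEdits` is a hypothetical name used only for illustration; the real check happens inside Elasticsearch and Lucene.

```java
// Hypothetical illustration of the length rule; not an Elasticsearch API.
class FuzzinessSketch {
    // Returns how many single-character edits are tolerated for the given search term.
    static int allowedEdits(String term) {
        // Count characters (code points) rather than UTF-16 units.
        int length = term.codePointCount(0, term.length());
        if (length <= 2) {
            return 0; // too short: the term must match exactly
        } else if (length <= 5) {
            return 1; // 3 to 5 characters: one edit
        }
        return 2;     // longer than 5 characters: at most two edits
    }

    public static void main(String[] args) {
        for (String term : new String[]{"乐事", "cola", "elastic"}) {
            System.out.println(term + " -> " + allowedEdits(term));
        }
    }
}
```

Under this rule the two-character term 乐事 from the example tolerates no edits at all, so that fuzzy query behaves like an exact term match.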

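Bool query: use bool to combine several conditions into a complex query

Keywords:
- must: logical AND; every condition must match
- should: logical OR; at least one condition must match
- must_not: logical NOT; none of the conditions may match

Example:

# Bool query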
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "ids": {"values": [1,3,5]}
        },
        {
          "prefix": {
            "title": {
              "value": "乐事"
            }
          }
        }
      ]
    }
  }
}

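Multi-field query: use multi_match to search several fields at once (like match, multi_match does full-text matching on text fields and exact matching on other types)

Example:

# Multi-field query
GET /products/_search
{
  "query": {
    "multi_match": {
      "query": "乐事",
      "fields": ["title", "description"]
    }
  }
}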

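Query string on a default field: use query_string; the query text is handled according to the field's mapping (if the field is analyzed, the query is analyzed and matched token by token; if not, it is matched as a whole)

Example:

# Query string on a default field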
GET /products/_search
{
  "query": {
    "query_string": {
      "default_field": "title",
      "query": "乐事薯片"
    }
  }
}

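Highlighting: use highlight to mark the matched keywords in the returned documents

Example:

# Highlight query
# pre_tags: opening tag placed before each match (default <em>)
# post_tags: closing tag placed after each match (default </em>)
# require_field_match: by default only fields the query matched are highlighted; false lifts that restriction
# fields: which fields to highlight ("*" means all fields)
GET /products/_search
{
  "query": {
    "query_string": {
      "default_field": "description",
      "query": "电视快乐睡"
    }
  },
  "highlight": {
    "pre_tags": ["<span style='color:red;'>"],
    "post_tags": ["</span>"],
    "require_field_match": "false",
    "fields": {
      "*":{}
    }
  }
}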

Pagination: use size and from to control paging

Keywords:
- size: the number of hits to return
- from: the offset of the first hit to return

Example:

# Pagination
GET /products/_search
{
  "query": {
    "query_string": {
      "default_field": "description",
      "query": "电视快乐睡"
    }
  },
  "highlight": {
    "pre_tags": ["<span style='color:red;'>"], 
    "post_tags": ["</span>"],
    "require_field_match": "false", 
    "fields": {
      "*":{}
    }
  }
  ,
  "size": 1,
  "from": 0
}
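Applications usually think in page numbers rather than offsets. A minimal Java sketch of the conversion, assuming 1-based page numbers; `PageSketch` is a hypothetical helper written for this example, not an Elasticsearch API.

```java
// Hypothetical helper (not an Elasticsearch API): turns a 1-based page number into from/size.
class PageSketch {
    // Offset of the first hit on the requested page.
    static int from(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be at least 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    // Request body for one page of a match_all search.
    static String searchBody(int pageNumber, int pageSize) {
        return "{\"query\":{\"match_all\":{}},"
                + "\"from\":" + from(pageNumber, pageSize) + ","
                + "\"size\":" + pageSize + "}";
    }

    public static void main(String[] args) {
        System.out.println(searchBody(1, 10));
        System.out.println(searchBody(3, 10));
    }
}
```

By default ES rejects requests where from + size exceeds 10,000 (the index.max_result_window setting), so this style of paging suits shallow result lists rather than full scans.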

Sorting: use sort to order the results, with order set to desc (descending) or asc (ascending)

Example:

# Sorting
GET /products/_search
{
  "query":{
    "match_all":{}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}

Selecting returned fields: use _source to list the fields to return

Example:

# Select the returned fields
GET /products/_search
{
  "query":{
    "match_all":{}
  },
  "_source": [
    "description", "title" 
  ]
}

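Filter queries

Strictly speaking, search operations in ES come in two kinds: queries and filters. A query is the Query DSL search described above; by default it computes a relevance score for every returned document and sorts by that score. A filter only selects the documents that match, computes no score, and its results can be cached.

So in terms of performance alone, filtering is faster than querying. Put differently, filters suit narrowing down a large data set, while queries suit matching the data precisely. In practice, filter the data first and then run the query on what remains.

Basic syntax:

# must: the query conditions
# filter: the filter conditions (applied before the query is scored; ES caches frequently used filters)
GET /index_name/_search
{
  "query": {
    "bool": {
      "must": [
        {"match_all": {}}
      ],
      "filter": {...}
    }
  }
}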

term and terms filters: filter by exact term

# Filtering with term
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {
          "title": {
            "value": "可口可乐"
          }
        }}
      ],
      "filter": {
        "term": {
          "description":"好吃"
        }
      }
    }
  }
}
# Filtering with terms
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {
          "title": {
            "value": "可口可乐"
          }
        }}
      ],
      "filter": {
        "terms": {
          "description":["饮料","好吃"]
        }
      }
    }
  }
}

range filter: filter by range

# range filter
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {
          "title": {
            "value": "可口可乐"
          }
        }}
      ],
      "filter": {
        "range": {
          "price": {
            "gte": 0,
            "lte": 5.2
          }
        }
      }
    }
  }
}

exists filter: keeps only the documents in which the given field exists, that is, has a non-null value

# exists filter
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {
          "title": {
            "value": "可口可乐"
          }
        }}
      ],
      "filter": {
        "exists": {
          "field":"price"
        }
      }
    }
  }
}

ids filter: keeps only the documents with the given _id values

# ids filter
GET /products/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {
          "title": {
            "value": "可口可乐"
          }
        }}
      ],
      "filter": {
        "ids": {
          "values": ["1","2","3"]
        }
      }
    }
  }
}

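Aggregations

Aggregations (Aggregation, abbreviated aggs) are the feature ES provides, alongside search, for statistical analysis of the data it holds. An aggregation summarizes the documents selected by a search query. Aggregation is an important capability in databases, and ES, as both a search engine and a data store, offers powerful aggregation analysis as well: it buckets and computes over the data matched by the query conditions, somewhat like GROUP BY combined with aggregate functions in SQL.

Note: text fields do not support aggregations by default

Usage

Group by a field:

# price_group: the name of the aggregation (user-defined)
# terms.field: the field to group by
GET /products/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "price_group": {
      "terms": {
        "field": "price"
      }
    }
  }
}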

Maximum / minimum (use "min" in place of "max" to get the minimum):

GET /products/_search
{
  "query": {
    "match_all": {}
  }
  , "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}

Average:

GET /products/_search
{
  "query": {
    "match_all": {}
  }
  , "aggs": {
    "avg_price": {
      "avg": {
        "field": "price"
      }
    }
  }
}
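To make the GROUP BY analogy concrete, the Java sketch below computes the same three results (terms buckets, max, avg) over an in-memory list of prices. It only illustrates what the aggregations return; `AggregationSketch` is a hypothetical class, and none of this is how ES computes them internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration of terms / max / avg aggregations.
class AggregationSketch {
    // Like a terms aggregation on "price": one bucket per distinct value with its document count.
    static Map<Double, Long> termsBuckets(List<Double> prices) {
        return prices.stream()
                .collect(Collectors.groupingBy(p -> p, TreeMap::new, Collectors.counting()));
    }

    // Like a max aggregation; NaN when there are no values.
    static double max(List<Double> prices) {
        return prices.stream().mapToDouble(Double::doubleValue).max().orElse(Double.NaN);
    }

    // Like an avg aggregation; NaN when there are no values.
    static double avg(List<Double> prices) {
        return prices.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        List<Double> prices = Arrays.asList(3.5, 6.2, 3.5, 9.9);
        System.out.println(termsBuckets(prices));
        System.out.println(max(prices));
        System.out.println(avg(prices));
    }
}
```

One difference worth noting: this sketch orders buckets by key, whereas a terms aggregation in ES orders them by document count, largest first, unless told otherwise.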

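Analyzers

Analysis and Analyzer

Analysis: text analysis is the process of converting full text into a series of words (terms/tokens), also called tokenization.

Analysis is carried out by an Analyzer. The Analyzer splits a document into individual terms, and each term points to the documents that contain it.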
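The "each term points to the documents that contain it" structure is an inverted index. A toy Java sketch of the idea follows. The class name and the lowercase-and-split-on-whitespace analysis are assumptions made for this example; real analyzers and Lucene's index structures are far more elaborate.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy inverted index: term -> ids of the documents that contain it.
class InvertedIndexSketch {
    private final Map<String, SortedSet<Integer>> postings = new TreeMap<>();

    // Stand-in for an Analyzer: lowercase the text and split it on whitespace.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Looking up a term returns the documents that contain it, without scanning any text.
    SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Quick brown fox");
        index.add(2, "quick red fox");
        index.add(3, "lazy dog");
        System.out.println(index.search("quick"));
        System.out.println(index.search("dog"));
    }
}
```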

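Components of an Analyzer

By default ES uses the standard analyzer (StandardAnalyzer), which splits Chinese text into single characters and English text into single words.

An Analyzer is made up of three parts, applied in this order: character filters, tokenizers, and token filters.

- character filters: preprocess the text before it is tokenized.
- tokenizers: split the text into tokens. English can be split on whitespace; Chinese segmentation is more complex and may rely on machine-learning algorithms.
- token filters: post-process the tokens, for example lowercasing ("Quick" becomes "quick"), removing stop words (such as "a", "and", "the"), and adding synonyms (such as "jump" and "leap").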
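The three stages can be sketched as three small functions chained together. In the Java example below, the specific choices (stripping HTML-like tags as the character filter, whitespace splitting as the tokenizer, lowercasing plus a three-word stop list as the token filters) are assumptions picked for illustration; `AnalyzerSketch` is a hypothetical class, not the ES implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical toy analyzer: character filter -> tokenizer -> token filters.
class AnalyzerSketch {
    private static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("a", "and", "the"));

    // 1. Character filter: replace HTML-like tags with a space before tokenizing.
    static String charFilter(String text) {
        return text.replaceAll("<[^>]*>", " ");
    }

    // 2. Tokenizer: split the filtered text on whitespace.
    static List<String> tokenize(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }

    // 3. Token filters: lowercase every token, then drop stop words.
    static List<String> tokenFilter(List<String> tokens) {
        return tokens.stream()
                .map(token -> token.toLowerCase(Locale.ROOT))
                .filter(token -> !STOP_WORDS.contains(token))
                .collect(Collectors.toList());
    }

    // The full pipeline, applied in order.
    static List<String> analyze(String text) {
        return tokenFilter(tokenize(charFilter(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("<b>The Quick</b> fox and a dog"));
    }
}
```

The order matters: stop-word removal runs after lowercasing, so "The" is caught by the lowercase stop list.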

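Built-in analyzers

- Standard Analyzer: the default analyzer. Splits English into words, lowercases them, and removes punctuation; splits Chinese into single characters.
- Simple Analyzer: splits English into words, lowercases them, and drops symbols; Chinese is only split at spaces.
- Stop Analyzer: lowercases and removes stop words (the, a, is).
- Whitespace Analyzer: splits both Chinese and English on whitespace only; does not lowercase and does not remove punctuation.
- Keyword Analyzer: no tokenization; the input is emitted unchanged as a single token.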

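To test an analyzer (replace standard with the short name of the analyzer to test, and the text with your own):

POST /_analyze
{
	"analyzer": "standard",
	"text": "text to analyze"
}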

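Setting the analyzer when creating an index

# "analyzer" sets the analyzer for a field; it applies to text fields (keyword fields are not analyzed)
PUT /products
{
	"settings":{
		"number_of_shards": 1,
		"number_of_replicas": 0
	},
	"mappings": {
	  "properties": {
	    "title":{
	      "type": "text",
	      "analyzer":"standard"
	    }
	  }
	}
}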

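IK analyzer

The IK analyzer is an ES plugin that splits a passage of Chinese or English text into keywords. At search time the search input is analyzed, the data in the index is analyzed as well, and the two are then matched against each other. The default analyzer treats every Chinese character as a separate word, which is the limitation IK addresses.

Installation

1. Download the IK analyzer (https://github.com/medcl/elasticsearch-analysis-ik/releases); the version must match the ES version
2. Unpack it and upload it to the plugins volume of the Docker container: /opt/docker_app/elasticsearch/plugins
3. Restart the container

Testing

IK offers two levels of granularity:

- ik_smart: the coarsest-grained split
- ik_max_word: the finest-grained split

ik_smart test: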

POST /_analyze
{
	"analyzer":"ik_smart",
	"text": "分词文本真的不错"
}

ik_max_word test:

POST /_analyze
{
	"analyzer":"ik_max_word",
	"text": "分词文本真的不错"
}

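Configuring extension words and stop words

The IK analyzer supports custom extension dictionaries and stop-word dictionaries.

Extension dictionary: words that are not recognized as keywords by default, but that you want ES to use as search keywords, can be added to the extension dictionary.

Stop-word dictionary: words that are recognized as keywords, but that for business reasons should not be searchable, can be added to the stop-word dictionary.

Both dictionaries are configured in the file IKAnalyzer.cfg.xml in the IK analyzer's config directory.

Configuration

1. Create the .dic files for the extension words and stop words (one word per line):

# Create the file and enter the extension words
vim ext_dict.dic
# Create the file and enter the stop words
vim ext_stopword.dic

2. Reference them in IKAnalyzer.cfg.xml:

vim IKAnalyzer.cfg.xml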

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
        <comment>IK Analyzer extension configuration</comment>
        <!-- Configure your own extension dictionary here -->
        <entry key="ext_dict">ext_dict.dic</entry>
        <!-- Configure your own extension stop-word dictionary here -->
        <entry key="ext_stopwords">ext_stopword.dic</entry>
        <!-- Configure a remote extension dictionary here -->
        <!-- <entry key="remote_ext_dict">words_location</entry> -->
        <!-- Configure a remote extension stop-word dictionary here -->
        <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

3. Restart ES

Alternatively, you can use the dictionaries that ship with IK:

Extension dictionary: extra_main.dic

Stop-word dictionary: extra_stopword.dic

Integrating with Spring Boot

Add the dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

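Configure the client

import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.client.ClientConfiguration;
import org.springframework.data.elasticsearch.client.RestClients;
import org.springframework.data.elasticsearch.config.AbstractElasticsearchConfiguration;

// Registers the Elasticsearch client; spring.elasticsearch.host is a property defined by this project, in "host:port" form
@Configuration
public class ElasticsearchClientConfig extends AbstractElasticsearchConfiguration {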

    @Value("${spring.elasticsearch.host}")
    private String host;

    @Override
    @Bean
    public RestHighLevelClient elasticsearchClient() {
        final ClientConfiguration clientConfiguration = ClientConfiguration.builder()
                .connectedTo(host)
                .build();
        return RestClients.create(clientConfiguration).rest();
    }
}

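With this configuration in place, the Spring container holds a RestHighLevelClient bean (the REST-style way of interacting shown earlier) and an ElasticsearchOperations bean (the object-oriented way).

Client objects

ElasticsearchOperations

Characteristic: operates on Elasticsearch in an object-oriented way

Related annotations:

@Document(indexName = "index_name", createIndex = true): placed on a class; each object of the class represents one document
- indexName: the index name
- createIndex: whether to create the index (it is created if it does not exist)
@Id: placed on a field; maps the field to the document _id in ES
@Field(type = FieldType.Text, analyzer = "ik_max_word"): placed on a field; describes the field's type in ES and how it is analyzed
- type: the field type
- analyzer: the analyzer to use

Example

Document entity: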

@Data
@AllArgsConstructor
@NoArgsConstructor
@Document(indexName = "products")
public class Product {
    @Id
    private Integer id;
    @Field(type = FieldType.Keyword)
    private String title;
    @Field(type = FieldType.Double)
    private Double price;
    @Field(type = FieldType.Text,analyzer = "ik_max_word")
    private String description;
}

Test:

@SpringBootTest
class EsDemoApplicationTests {

    @Autowired
    ElasticsearchOperations operations;

    /**
     * save inserts the document if the id does not exist and updates it if the id exists
     */
    @Test
    void testSave() {
        Product product = new Product(1,"可口可乐",3.5,"可乐很好喝，肥宅快乐水");
        operations.save(product);
    }

    @Test
    void testSearch(){
        // Get by id
        Product product = operations.get("1", Product.class);
        System.out.println(product);
        // Find all
        SearchHits<Product> searchHits = operations.search(Query.findAll(), Product.class);
        System.out.println(searchHits);
    }

    @Test
    void testDel(){
        // Delete by id
        operations.delete("1", Product.class);
        // Option 2: delete by entity
        Product product = new Product();
        product.setId(1);
        operations.delete(product);
        // Delete all
        operations.delete(Query.findAll(),Product.class);
    }
}

RestHighLevelClient

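Testing index operations:

Create an index:

@Test
public void testCreateIndexAndMapping() throws IOException {
    // create(...) below takes two arguments: the create-index request and the request options
    CreateIndexRequest createIndexRequest = new CreateIndexRequest("goods");
    // Specify the mapping as JSON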
    createIndexRequest.mapping("{\n" +
                               "\t  \"properties\": {\n" +
                               "\t    \"id\":{\n" +
                               "\t      \"type\": \"integer\"\n" +
                               "\t    },\n" +
                               "\t    \"title\":{\n" +
                               "\t      \"type\": \"keyword\"\n" +
                               "\t    },\n" +
                               "\t    \"price\":{\n" +
                               "\t      \"type\": \"double\"\n" +
                               "\t    },\n" +
                               "\t    \"created_at\":{\n" +
                               "\t      \"type\": \"date\"\n" +
                               "\t    },\n" +
                               "\t    \"description\":{\n" +
                               "\t      \"type\": \"text\"\n" +
                               "\t    }\n" +
                               "\t  }\n" +
                               "\t}", XContentType.JSON);
    CreateIndexResponse createIndexResponse = restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT);
    // Check whether the index was created
    System.out.println(createIndexResponse.isAcknowledged());
    restHighLevelClient.close();
}

Delete an index:

@Test
public void testDelIndex() throws IOException {
    // Argument 1: the delete-index request; argument 2: the request options
    AcknowledgedResponse acknowledgedResponse = restHighLevelClient.indices().delete(new DeleteIndexRequest("goods"), RequestOptions.DEFAULT);
    System.out.println(acknowledgedResponse.isAcknowledged());
}

Testing document operations:

Insert a document:

@Test
public void testCreateDoc() throws IOException {
    // Target index
    IndexRequest indexRequest = new IndexRequest("goods");
    // Document id
    indexRequest.id("1");
    // Document content
    indexRequest.source("{\n" +
                        "\t\"title\" : \"卫龙小辣棒\",\n" +
                        "\t\"price\":6.2,\n" +
                        "\t\"created_at\" : \"2022-09-15\",\n" +
                        "\t\"description\" : \"辣条\"\n" +
                        "}", XContentType.JSON);
    IndexResponse response = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
    // Print the status of the insert
    System.out.println(response.status());

}

Update a document:

@Test
public void testUpdateDoc() throws IOException {
    // Target index and document id
    UpdateRequest updateRequest = new UpdateRequest("goods","1");
    // Fields to change
    updateRequest.doc("{\n" +
                      "    \"price\":4.8\n" +
                      "  }",XContentType.JSON);
    restHighLevelClient.update(updateRequest,RequestOptions.DEFAULT);
}

Delete a document:

@Test
public void testDelDoc() throws IOException {
    // Target index and document id
    DeleteRequest deleteRequest = new DeleteRequest("goods","1");
    restHighLevelClient.delete(deleteRequest,RequestOptions.DEFAULT);
}

Get a document by id:

@Test
public void testQueryById() throws IOException {
    GetResponse documentFields = restHighLevelClient.get(new GetRequest("goods","1"), RequestOptions.DEFAULT);
    // Print the id and the document source
    System.out.println(documentFields.getId());
    System.out.println(documentFields.getSourceAsString());
}

Various queries:

@Test
public void testQuery() throws IOException {
    // term query
    query(QueryBuilders.termQuery("description", "辣"));
    // range query
    query(QueryBuilders.rangeQuery("price").from(0).to(6.5));
    // match query
    query(QueryBuilders.matchQuery("description","辣条"));
    // prefix query
    query(QueryBuilders.prefixQuery("title","卫龙"));
    // wildcard query
    query(QueryBuilders.wildcardQuery("title","卫龙*"));
    // ids query
    query(QueryBuilders.idsQuery().addIds("1","2"));
    // multi-field query
    query(QueryBuilders.multiMatchQuery("辣","title","description"));
}

public void query(QueryBuilder queryBuilder) throws IOException {
    // Index to search
    SearchRequest searchRequest = new SearchRequest("goods");
    // Query conditions
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(queryBuilder);
    searchRequest.source(sourceBuilder);
    SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    // Total hit count and maximum score
    System.out.println("Total hits: " + response.getHits().getTotalHits().value);
    System.out.println("Max score: " + response.getHits().getMaxScore());
    // Print the returned documents
    SearchHit[] hits = response.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getId());
        System.out.println(hit.getSourceAsString());
    }
}

Pagination, sorting, selecting returned fields, and highlighting:

public void query(QueryBuilder queryBuilder) throws IOException {
    // Index to search
    SearchRequest searchRequest = new SearchRequest("goods");
    // Query conditions
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    // Fields to highlight
    HighlightBuilder highlightBuilder = new HighlightBuilder();
    highlightBuilder.requireFieldMatch(false).field("description").field("title").preTags("<span>").postTags("</span>");
    sourceBuilder.query(queryBuilder)
        .from(0)
        .size(1)
        .sort("price", SortOrder.DESC)
        .fetchSource(new String[]{"title"},new String[]{})
        .highlighter(highlightBuilder);
    searchRequest.source(sourceBuilder);
    SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    // Total hit count and maximum score
    System.out.println("Total hits: " + response.getHits().getTotalHits().value);
    System.out.println("Max score: " + response.getHits().getMaxScore());
    // Print the returned documents
    SearchHit[] hits = response.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getId());
        System.out.println(hit.getSourceAsString());
    }
}

Filter query:

@Test
public void testFilterQuery() throws IOException {
    SearchRequest searchRequest = new SearchRequest("goods");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery());
    // Set the filter condition
    sourceBuilder.postFilter(QueryBuilders.termQuery("title","卫龙小辣棒"));
    searchRequest.source(sourceBuilder);
    SearchResponse response = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
    // Total hit count and maximum score
    System.out.println("Total hits: " + response.getHits().getTotalHits().value);
    System.out.println("Max score: " + response.getHits().getMaxScore());
    // Print the returned documents
    SearchHit[] hits = response.getHits().getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getId());
        System.out.println(hit.getSourceAsString());
    }
}

Aggregation query:

@Test
public void testAggs() throws IOException {
    SearchRequest searchRequest = new SearchRequest("products");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery())
        .aggregation(AggregationBuilders.terms("price_group").field("price"))
        .size(0);
    searchRequest.source(sourceBuilder);
    SearchResponse response = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    Aggregations aggregations = response.getAggregations();
    ParsedDoubleTerms aggregation = aggregations.get("price_group");
    List<? extends Terms.Bucket> buckets = aggregation.getBuckets();
    for (Terms.Bucket bucket : buckets) {
        System.out.println(bucket.getKey()+" "+bucket.getDocCount());
    }
}

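Metric aggregations such as max, min, sum, and avg return a single value rather than buckets, so the result is read as ParsedMax for max, ParsedMin for min, ParsedSum for sum, and ParsedAvg for avg.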

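Cluster

A cluster is one or more nodes organized together; they jointly hold all of the data and together provide indexing and search. A cluster is identified by a unique name, which defaults to elasticsearch. The name matters, because a node joins a cluster only by specifying that cluster's name.

Node:

A node is a single server in the cluster. As part of the cluster it stores data and takes part in the cluster's indexing and search.

Shard:

Elasticsearch can split an index into multiple pieces, called shards. When creating an index you specify the number of shards you want. Each shard is itself a fully functional, independent "index" that can be placed on any node in the cluster.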
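Which shard a document lands on can be pictured as a hash of its routing value (the document _id by default) taken modulo the number of primary shards. The Java sketch below shows that idea only. It assumes `String.hashCode()` as a stand-in hash, whereas Elasticsearch uses its own hash function and a slightly more involved formula; `ShardRoutingSketch` is a hypothetical name.

```java
// Hypothetical illustration of hash-based shard routing; not the Elasticsearch implementation.
class ShardRoutingSketch {
    // Maps a routing value to a shard number in the range [0, numberOfPrimaryShards).
    static int shardFor(String routing, int numberOfPrimaryShards) {
        if (numberOfPrimaryShards < 1) {
            throw new IllegalArgumentException("an index needs at least one primary shard");
        }
        // String.hashCode() is only a stand-in; floorMod keeps the result non-negative.
        return Math.floorMod(routing.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        for (String id : new String[]{"1", "2", "3", "4"}) {
            System.out.println("_id " + id + " -> shard " + shardFor(id, 3));
        }
    }
}
```

Because the result depends on the shard count, the same _id would map to a different shard if the number of primary shards changed, which is one way to see why that number is chosen when the index is created.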

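Setup

Cluster plan:

# Three nodes
- node1: HTTP port 9201, TCP (transport) port 9301
- node2: HTTP port 9202, TCP (transport) port 9302
- node3: HTTP port 9203, TCP (transport) port 9303

Create the data volumes for each node in turn:

# Node 1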
mkdir -p /opt/docker_app/elasticsearch/config_node1
mkdir -p /opt/docker_app/elasticsearch/data_node1
mkdir -p /opt/docker_app/elasticsearch/plugins_node1
echo "http.host: 0.0.0.0">>/opt/docker_app/elasticsearch/config_node1/elasticsearch.yml
# Node 2
mkdir -p /opt/docker_app/elasticsearch/config_node2
mkdir -p /opt/docker_app/elasticsearch/data_node2
mkdir -p /opt/docker_app/elasticsearch/plugins_node2
echo "http.host: 0.0.0.0">>/opt/docker_app/elasticsearch/config_node2/elasticsearch.yml
# Node 3
mkdir -p /opt/docker_app/elasticsearch/config_node3
mkdir -p /opt/docker_app/elasticsearch/data_node3
mkdir -p /opt/docker_app/elasticsearch/plugins_node3
echo "http.host: 0.0.0.0">>/opt/docker_app/elasticsearch/config_node3/elasticsearch.yml

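Edit the configuration file of each node:

node-1

# Cluster name; must be identical on every node of the cluster
cluster.name: es-cluster
# Node name; must be unique per node
node.name: node-1
# Allow remote connections
network.host: 0.0.0.0
# Address published to the other nodes for communication within the cluster
network.publish_host: 192.168.227.130
# HTTP port
http.port: 9201
# TCP (transport) port
transport.tcp.port: 9301
# Transport addresses of all nodes, used for discovery
discovery.seed_hosts: ["192.168.227.130:9301","192.168.227.130:9302","192.168.227.130:9303"]
# Names of the nodes that may bootstrap the cluster as master
cluster.initial_master_nodes: ["node-1","node-2","node-3"]
# Minimum number of nodes that must be available before recovery starts
gateway.recover_after_nodes: 1
# Allow cross-origin requests
http.cors.enabled: true
http.cors.allow-origin: "*"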

node-2

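# Cluster name; must be identical on every node of the cluster
cluster.name: es-cluster
# Node name; must be unique per node
node.name: node-2
# Allow remote connections
network.host: 0.0.0.0
# Address published to the other nodes for communication within the cluster
network.publish_host: 192.168.227.130
# HTTP port
http.port: 9202
# TCP (transport) port
transport.tcp.port: 9302
# Transport addresses of all nodes, used for discovery
discovery.seed_hosts: ["192.168.227.130:9301","192.168.227.130:9302","192.168.227.130:9303"]
# Names of the nodes that may bootstrap the cluster as master
cluster.initial_master_nodes: ["node-1","node-2","node-3"]
# Minimum number of nodes that must be available before recovery starts
gateway.recover_after_nodes: 1
# Allow cross-origin requests
http.cors.enabled: true
http.cors.allow-origin: "*"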

node-3

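# Cluster name; must be identical on every node of the cluster
cluster.name: es-cluster
# Node name; must be unique per node
node.name: node-3
# Allow remote connections
network.host: 0.0.0.0
# Address published to the other nodes for communication within the cluster
network.publish_host: 192.168.227.130
# HTTP port
http.port: 9203
# TCP (transport) port
transport.tcp.port: 9303
# Transport addresses of all nodes, used for discovery
discovery.seed_hosts: ["192.168.227.130:9301","192.168.227.130:9302","192.168.227.130:9303"]
# Names of the nodes that may bootstrap the cluster as master
cluster.initial_master_nodes: ["node-1","node-2","node-3"]
# Minimum number of nodes that must be available before recovery starts
gateway.recover_after_nodes: 2
# Allow cross-origin requests
http.cors.enabled: true
http.cors.allow-origin: "*"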

Start a container for each node:

# node-1
docker run --name elasticsearch01 -p 9201:9201 -p 9301:9301 -e ES_JAVA_OPTS="-Xms256m -Xmx256m"  -v /opt/docker_app/elasticsearch/config_node1/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /opt/docker_app/elasticsearch/data_node1:/usr/share/elasticsearch/data -v /opt/docker_app/elasticsearch/plugins_node1:/usr/share/elasticsearch/plugins  -d elasticsearch:7.14.0
# node-2
docker run --name elasticsearch02 -p 9202:9202 -p 9302:9302 -e ES_JAVA_OPTS="-Xms256m -Xmx256m"  -v /opt/docker_app/elasticsearch/config_node2/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /opt/docker_app/elasticsearch/data_node2:/usr/share/elasticsearch/data -v /opt/docker_app/elasticsearch/plugins_node2:/usr/share/elasticsearch/plugins  -d elasticsearch:7.14.0
# node-3
docker run --name elasticsearch03 -p 9203:9203 -p 9303:9303 -e ES_JAVA_OPTS="-Xms256m -Xmx256m"  -v /opt/docker_app/elasticsearch/config_node3/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /opt/docker_app/elasticsearch/data_node3:/usr/share/elasticsearch/data -v /opt/docker_app/elasticsearch/plugins_node3:/usr/share/elasticsearch/plugins  -d elasticsearch:7.14.0