This guide uses Ubuntu 16.04 and Elasticsearch 6.5 as its example. For reference, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html
Installing Java
Reference article: https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-get-on-ubuntu-16-04
$ sudo apt-get update
$ sudo apt-get install -y default-jre
$ sudo add-apt-repository ppa:webupd8team/java && sudo apt-get update
$ sudo apt-get install oracle-java8-installer
$ export JAVA_HOME="/usr/lib/jvm/java-8-oracle"
$ java -version    # verify the Java installation
$ echo $JAVA_HOME  # verify JAVA_HOME
Elasticsearch
Installation (6.5.4)
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.zip
$ unzip elasticsearch-6.5.4.zip
Starting Elasticsearch
$ cd elasticsearch-6.5.4/bin
$ ./elasticsearch
If startup fails with the error "vm.max_map_count [65530] is too low", run the following command:
$ sudo sysctl -w vm.max_map_count=262144
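Before restarting, it can help to confirm what the kernel limit currently is. The following is a minimal sketch, assuming a Linux host where the setting is exposed at /proc/sys/vm/max_map_count; the function names are made up for illustration and are not part of Elasticsearch.

```python
from pathlib import Path

# Value used in the sysctl command above.
REQUIRED_MAX_MAP_COUNT = 262144


def needs_raise(current: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True when the current kernel limit is below the required value."""
    return current < required


def read_max_map_count(path: str = "/proc/sys/vm/max_map_count") -> int:
    """Read the current limit from procfs (assumes Linux)."""
    return int(Path(path).read_text().strip())


# The default value reported in the error message is too low:
print(needs_raise(65530))
# After running the sysctl command the check passes:
print(needs_raise(262144))
```

On a real host, `needs_raise(read_max_map_count())` performs the check against the live value. Note that `sysctl -w` only changes the running kernel, so the value has to be set again after a reboot unless it is also written to a sysctl configuration file.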
Test with curl. Output like the following means Elasticsearch started successfully and the installation works:
$ curl 127.0.0.1:9200
{
  "name" : "c5skAub",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "bdkUuVtQSvWOiY_vXEFnvw",
  "version" : {
    "number" : "6.5.4",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "d2ef93d",
    "build_date" : "2018-12-17T21:17:40.758843Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}
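The same check can be scripted by parsing the JSON that the root endpoint returns. This is a small sketch using only the standard library; `describe_node` is a hypothetical helper name, and the sample string is a trimmed copy of the response shown above rather than a live request.

```python
import json


def describe_node(body: str) -> str:
    """Summarize the JSON returned by GET / on an Elasticsearch node."""
    info = json.loads(body)
    version = info["version"]
    return (
        f'{info["cluster_name"]} running Elasticsearch {version["number"]} '
        f'(Lucene {version["lucene_version"]})'
    )


# Trimmed copy of the curl output above.
sample = (
    '{"name": "c5skAub", "cluster_name": "elasticsearch", '
    '"version": {"number": "6.5.4", "lucene_version": "7.5.0"}, '
    '"tagline": "You Know, for Search"}'
)
print(describe_node(sample))
```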
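Basic concepts
Elasticsearch is currently the go-to choice for full-text search. At its core it is a non-relational database; the table below compares some of its concepts with their MySQL counterparts.
MySQL | Elasticsearch |
---|---|
database | index |
table | type (deprecated in 7.x) |
row | document |
column | field |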
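Basic operations
All Elasticsearch operations go through its REST API. The examples below omit the surrounding curl invocation, which has the following form (the request path is appended to the URL):
curl -XMETHOD "http://localhost:9200" -H 'Content-Type: application/json' [-d 'request body']
To allow remote access, set network.host: 0.0.0.0 in /path-to-elastic/config/elasticsearch.yml and restart Elasticsearch.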
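To make the shorthand concrete, the sketch below expands a METHOD /path pair (plus an optional JSON body) into the full curl command it stands for. `to_curl` is a hypothetical helper written for this article, and the default host is the local address used throughout; it only builds the command string and does not contact a server.

```python
import json
import shlex
from typing import Optional


def to_curl(method: str, path: str, body: Optional[dict] = None,
            host: str = "http://localhost:9200") -> str:
    """Expand the shorthand 'METHOD /path {body}' into a full curl command."""
    parts = ["curl", "-X", method, host + path,
             "-H", "Content-Type: application/json"]
    if body is not None:
        parts += ["-d", json.dumps(body)]
    # shlex.quote adds shell quoting only where an argument needs it.
    return " ".join(shlex.quote(part) for part in parts)


print(to_curl("PUT", "/customer?pretty"))
print(to_curl("PUT", "/customer/_doc/1?pretty", {"name": "luke"}))
```

The first call prints `curl -X PUT 'http://localhost:9200/customer?pretty' -H 'Content-Type: application/json'`, which can be pasted into a shell as is.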
Working with indices
Create an index named customer. The ?pretty parameter makes Elasticsearch return pretty-printed JSON.
$ PUT /customer?pretty
List all indices:
$ GET /_cat/indices?v
Delete an index:
$ DELETE /customer
Working with documents
Create a document with ID 1. Because types are being phased out, each index holds only one type, conventionally named _doc.
$ PUT /customer/_doc/1?pretty
{
  "name": "luke"
}
If you use POST and leave out the ID, Elasticsearch generates a random ID:
$ POST /customer/_doc?pretty
{"name": "php"}

{
  "_index": "customer",
  "_type": "_doc",
  "_id": "hIkkLGgBFVhvdLuiNNGD",   ## the generated ID
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 3
}
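Updating a document works the same way as creating one: send the changed data to the same ID, which replaces the whole document. Alternatively, use the _update endpoint to change only some fields:
$ POST /customer/_doc/1/_update?pretty
{
  "doc": {
    "name": "luke44",
    "age": 24
  }
}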
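With the "doc" form of _update, the partial document is merged into the stored one: nested objects are merged recursively, while plain values and arrays are replaced. The sketch below imitates that merge locally to show the effect. It is an approximation for illustration, not Elasticsearch's own code, and `merge_partial` is a made-up name.

```python
def merge_partial(source: dict, partial: dict) -> dict:
    """Recursively merge a partial document into a copy of the stored source."""
    merged = dict(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field.
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars and arrays simply replace the old value.
            merged[key] = value
    return merged


stored = {"name": "luke"}
print(merge_partial(stored, {"name": "luke44", "age": 24}))
# {'name': 'luke44', 'age': 24}
```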
Update with a simple script. Here ctx._source refers to the document that is about to be modified.
$ POST /customer/_doc/1/_update?pretty
{
  "script" : "ctx._source.age += 5"
}
Get the document with ID 1:
$ GET /customer/_doc/1?pretty

{
  "_index": "customer",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "luke"
  }
}
Delete a document:
$ DELETE /customer/_doc/2?pretty
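Bulk operations. The request below indexes (creates or replaces) the documents with IDs 1 and 2 in one call. Note that in Postman the body must end with a newline, that is, an empty last line.
$ POST /customer/_doc/_bulk?pretty
{"index":{"_id":"1"}}
{"name": "luke" }
{"index":{"_id":"2"}}
{"name": "php", "age": "20" }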
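The bulk body is newline-delimited JSON: one action line, optionally followed by one source line, and a newline after the last line. A rough sketch of building such a body programmatically is shown here; `bulk_body` is a hypothetical helper, and nothing is sent to a server.

```python
import json
from typing import Iterable, Optional, Tuple


def bulk_body(actions: Iterable[Tuple[dict, Optional[dict]]]) -> str:
    """Serialize (action, source) pairs into a newline-delimited bulk body."""
    lines = []
    for action, source in actions:
        lines.append(json.dumps(action))
        if source is not None:  # delete actions carry no source line
            lines.append(json.dumps(source))
    # The trailing newline is required by the bulk endpoint.
    return "\n".join(lines) + "\n"


body = bulk_body([
    ({"index": {"_id": "1"}}, {"name": "luke"}),
    ({"index": {"_id": "2"}}, {"name": "php", "age": "20"}),
])
print(body, end="")
```

When sending such a body with curl, `--data-binary` keeps the newlines intact, whereas `-d` strips them.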
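First update the document with ID 1, then delete the document with ID 2:
$ POST /customer/_doc/_bulk?pretty
{"update":{"_id":"1"}}
{"doc":{"name":"php best"}}
{"delete":{"_id":"2"}}
When one action in a bulk request fails, the remaining actions still run. The response reports the status of each action in the order it was executed.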
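Because a bulk request can partly fail, the per-item results in the response are worth checking. The sketch below walks the "items" array and collects the failed actions. The sample response is hand-written to illustrate the shape (an update that failed because the document was missing, followed by a successful delete); it is not captured output, and `failed_items` is a made-up helper name.

```python
def failed_items(response: dict) -> list:
    """Return (position, action, _id, status) for every failed bulk item."""
    failures = []
    for position, item in enumerate(response.get("items", [])):
        # Each item is a single-key object such as {"update": {...}}.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((position, action, result.get("_id"), result["status"]))
    return failures


# Hand-written illustration of a bulk response, not real server output.
sample = {
    "took": 30,
    "errors": True,
    "items": [
        {"update": {"_index": "customer", "_type": "_doc", "_id": "1",
                    "status": 404,
                    "error": {"type": "document_missing_exception"}}},
        {"delete": {"_index": "customer", "_type": "_doc", "_id": "2",
                    "status": 200, "result": "deleted"}},
    ],
}
print(failed_items(sample))
# [(0, 'update', '1', 404)]
```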
Exploring data
First prepare a fictitious data set of bank customer accounts. Each record has the format shown below. Download the data set and save it as accounts.json.
{
  "account_number": 0,
  "balance": 16623,
  "firstname": "Bradshaw",
  "lastname": "Mckenzie",
  "age": 29,
  "gender": "F",
  "address": "244 Columbus Place",
  "employer": "Euron",
  "email": "bradshawmckenzie@euron.com",
  "city": "Hobucken",
  "state": "CO"
}
Import the data set:
$ POST /bank/_doc/_bulk?pretty&refresh --data-binary "@accounts.json"
$ GET /_cat/indices?v
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   bank     3inMmuQzRqaTpMkzfh07_A   5   1       1000            0     95.9kb         95.9kb
yellow open   customer gSRgPG9cScKHcuycJE2drw   5   1          2            0      7.7kb          7.7kb
match_all query
Using a URI search, q=* matches every document and sort=account_number:asc sorts the results by account_number in ascending order.
$ GET /bank/_search?q=*&sort=account_number:asc&pretty

{
  "took" : 63,            // time taken, in milliseconds
  "timed_out" : false,    // whether the search timed out
  "_shards" : {           // shards searched
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {              // search results
    "total" : 1000,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "0",
        "sort": [0],
        "_score" : null,
        "_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
      },
      {
        "_index" : "bank",
        "_type" : "_doc",
        "_id" : "1",
        "sort": [1],
        "_score" : null,
        "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
      },
      ...
    ]
  }
}
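In client code the interesting parts of this response are usually hits.total and the _source of each hit. The sketch below pulls those out of an already parsed response. The sample dictionary is a trimmed copy of the output above, and `summarize` is a hypothetical helper. It also accepts the object form of hits.total that Elasticsearch 7.x and later return, which goes beyond the 6.5 setup used in this article.

```python
def summarize(response: dict):
    """Return (total hit count, list of _source documents) from a search response."""
    hits = response["hits"]
    total = hits["total"]
    if isinstance(total, dict):
        # 7.x and later return {"value": ..., "relation": ...} instead of an int.
        total = total["value"]
    sources = [hit["_source"] for hit in hits["hits"]]
    return total, sources


# Trimmed copy of the response shown above.
sample = {
    "took": 63,
    "timed_out": False,
    "hits": {
        "total": 1000,
        "max_score": None,
        "hits": [
            {"_id": "0", "_score": None, "sort": [0],
             "_source": {"account_number": 0, "balance": 16623, "firstname": "Bradshaw"}},
            {"_id": "1", "_score": None, "sort": [1],
             "_source": {"account_number": 1, "balance": 39225, "firstname": "Amber"}},
        ],
    },
}
total, sources = summarize(sample)
print(total, [doc["firstname"] for doc in sources])
```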
The same search can be expressed with a JSON request body:
$ GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
Use size and from to limit the number of results, similar to LIMIT and OFFSET in MySQL. Use _source to return only selected fields.
$ GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } },
  "from": 10,
  "size": 15,   // defaults to 10
  "_source": ["account_number", "balance"]
}
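Page-based navigation maps onto from and size in the same way it maps onto OFFSET and LIMIT in SQL. The following is a minimal sketch of that arithmetic, assuming 1-based page numbers; `page_query` is a made-up helper, and the sort field is the one used earlier in this section.

```python
def page_query(page: int, per_page: int = 10) -> dict:
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {
        "query": {"match_all": {}},
        "sort": [{"account_number": "asc"}],
        "from": (page - 1) * per_page,  # number of hits to skip
        "size": per_page,               # number of hits to return
    }


print(page_query(3))         # skips the first 20 hits
print(page_query(2, 15))     # skips the first 15 hits
```

By default Elasticsearch rejects requests where from + size exceeds 10000 (the index.max_result_window setting), so this approach suits shallow paging only.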
match query
Find all accounts whose account_number is 20:
$ GET /bank/_search
{
  "query": { "match": { "account_number": 20 } }
}
Find all accounts whose address contains the word mill:
$ GET /bank/_search
{
  "query": { "match": { "address": "mill" } }
}
Find all accounts whose address contains the word mill or the word lane:
$ GET /bank/_search
{
  "query": { "match": { "address": "mill lane" } }
}
match_phrase query, a variant of match: find all accounts whose address contains the phrase mill lane.
$ GET /bank/_search
{
  "query": { "match_phrase": { "address": "mill lane" } }
}
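The difference between match and match_phrase is easier to see on a few sample addresses. The toy functions below imitate the two behaviors locally: match succeeds if any query term occurs in the field, while match_phrase requires the terms to appear next to each other in order. This is a simplified model that only lowercases and splits on whitespace; the real analyzer and scoring are more involved. The first two addresses are made up for illustration, and the third comes from the sample output earlier.

```python
def tokens(text: str) -> list:
    """Very rough stand-in for analysis: lowercase and split on whitespace."""
    return text.lower().split()


def match(field_value: str, query: str) -> bool:
    """True if at least one query term occurs in the field (OR semantics)."""
    field_terms = set(tokens(field_value))
    return any(term in field_terms for term in tokens(query))


def match_phrase(field_value: str, query: str) -> bool:
    """True if the query terms occur consecutively and in order."""
    field_terms, phrase = tokens(field_value), tokens(query)
    width = len(phrase)
    return any(field_terms[i:i + width] == phrase
               for i in range(len(field_terms) - width + 1))


for address in ["990 Mill Road", "198 Mill Lane", "880 Holmes Lane"]:
    print(address, match(address, "mill lane"), match_phrase(address, "mill lane"))
```

All three addresses satisfy match, but only "198 Mill Lane" satisfies match_phrase.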
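bool query
Find all accounts whose address contains both the word mill and the word lane. The bool must clause lists the queries that must all be true for a document to count as a match.
$ GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
      // "should": [...]    matches if any of the queries is true (OR)
      // "must_not": [...]  matches if none of the queries is true
    }
  }
}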
Combined query: find the accounts of customers who are 40 years old and do not live in the state ID.
$ GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}
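When queries like this are assembled in application code, a small builder keeps the nesting manageable. This is a sketch only: `bool_query` is a hypothetical helper rather than part of any Elasticsearch client, and it just produces the request body as a dictionary, leaving out clause types that are empty.

```python
import json


def bool_query(must=(), must_not=(), should=(), filters=()) -> dict:
    """Build a bool query body, omitting clause types that are empty."""
    clauses = {
        "must": list(must),
        "must_not": list(must_not),
        "should": list(should),
        "filter": list(filters),
    }
    return {"query": {"bool": {name: items
                               for name, items in clauses.items() if items}}}


# Same request as above: age 40 and not living in the state ID.
body = bool_query(
    must=[{"match": {"age": "40"}}],
    must_not=[{"match": {"state": "ID"}}],
)
print(json.dumps(body, indent=2))
```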
bool filter
Find the accounts with a balance between 20000 and 30000 (inclusive):
$ GET /bank/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
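A filter clause only decides whether a document is included; it does not contribute to the relevance score. The range condition itself behaves like a closed interval, which the sketch below mirrors on a handful of in-memory records. The first two balances come from the sample output earlier; the last three records are invented to show the boundary cases.

```python
def in_range(value, gte=None, lte=None) -> bool:
    """Mimic a range filter with inclusive gte / lte bounds."""
    if gte is not None and value < gte:
        return False
    if lte is not None and value > lte:
        return False
    return True


accounts = [
    {"account_number": 0, "balance": 16623},    # from the sample data
    {"account_number": 1, "balance": 39225},    # from the sample data
    {"account_number": 901, "balance": 20000},  # invented boundary case
    {"account_number": 902, "balance": 30000},  # invented boundary case
    {"account_number": 903, "balance": 30001},  # invented boundary case
]
selected = [a["account_number"] for a in accounts
            if in_range(a["balance"], gte=20000, lte=30000)]
print(selected)
# [901, 902]
```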
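Elasticsearch is a tool that is both simple and complex. This article covered its basics and how to work with it through some of its REST APIs; more advanced topics are left for later.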
Author: Luke_44
Source: https://www.cnblogs.com/luke44/p/elasticsearch-doc.html