Elasticsearch Getting-Started Tutorial: Installation and Basic Usage

This tutorial uses Ubuntu 16.04 and Elasticsearch 6.5 as its example and follows the official guide at https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html

Installing Java

Reference article: https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-get-on-ubuntu-16-04

$ sudo apt-get update
$ sudo apt-get install -y default-jre
$ sudo add-apt-repository ppa:webupd8team/java && sudo apt-get update
$ sudo apt-get install oracle-java8-installer
$ export JAVA_HOME="/usr/lib/jvm/java-8-oracle"
$ java -version     # check that Java is installed
$ echo $JAVA_HOME   # check that JAVA_HOME is set

Elasticsearch

Installation (6.5.4)

$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.zip
$ unzip elasticsearch-6.5.4.zip

Startup

$ cd elasticsearch-6.5.4/bin
$ ./elasticsearch

If startup fails with the error "vm.max_map_count [65530] is too low", raise the limit with the following command:

$ sudo sysctl -w vm.max_map_count=262144

Test with curl. Output like the following means the node started successfully and the installation works.

$ curl 127.0.0.1:9200
{
  "name" : "c5skAub",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "bdkUuVtQSvWOiY_vXEFnvw",
  "version" : {
    "number" : "6.5.4",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "d2ef93d",
    "build_date" : "2018-12-17T21:17:40.758843Z",
    "build_snapshot" : false,
    "lucene_version" : "7.5.0",
    "minimum_wire_compatibility_version" : "5.6.0",
    "minimum_index_compatibility_version" : "5.0.0"
  },
  "tagline" : "You Know, for Search"
}

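Basic Concepts

Elasticsearch is one of the most widely used full-text search engines. At its core it is a non-relational database; the table below maps some of its concepts to their MySQL counterparts.

MySQL	Elasticsearch
database	index
table	type (deprecated in 7.x)
row	document
column	field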

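Basic Operations

All Elasticsearch operations go through its REST API. The examples below omit the surrounding curl invocation, curl -XMETHOD "http://localhost:9200/path" -H 'Content-Type: application/json' [-d 'request body'], and show only the method, the path, and the request body. To allow remote access, set network.host: 0.0.0.0 in /path-to-elastic/config/elasticsearch.yml and restart the node.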
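
The shorthand used in the rest of this article can be expanded mechanically into a full HTTP request. The sketch below shows one possible way to do that with Python's standard library. The helper names build_request and send are made up for illustration, and the base URL assumes a local node listening on the default port 9200.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local node on the default port


def build_request(method, path, body=None, base_url=BASE_URL):
    """Expand shorthand such as `PUT /customer?pretty` into a full HTTP request."""
    data = None
    if body is not None:
        # Request bodies are sent as UTF-8 encoded JSON.
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send the request and decode the JSON reply (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the request, so no node is required here.
    req = build_request("PUT", "/customer/_doc/1?pretty", {"name": "luke"})
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect a request before it touches the cluster; calling send(req) performs the actual call.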

Working with Indices

Create an index named customer. The ?pretty parameter makes Elasticsearch return pretty-printed JSON.

$ PUT /customer?pretty

List all indices

$ GET /_cat/indices?v

Delete an index

$ DELETE /customer

Working with Documents

Create a document with id 1. Because types are being phased out, each index should contain a single type, conventionally named _doc.

$ PUT /customer/_doc/1?pretty
{
    "name": "luke"
}

If you use POST and leave out the id, Elasticsearch generates a random id.

$ POST /customer/_doc?pretty
{"name": "php"}

Response:

{
    "_index": "customer",
    "_type": "_doc",
    "_id": "hIkkLGgBFVhvdLuiNNGD",  ## the generated id
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "_seq_no": 0,
    "_primary_term": 3
}

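To update a document, send the same request as for creating it, with the new data; this replaces the whole document. Alternatively, use the _update endpoint to change only the listed fields:

$ POST /customer/_doc/1/_update?pretty
{
    "doc": { "name": "luke44", "age": 24 }
}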

Update with a simple script. Here ctx._source refers to the document that is about to be modified.

$ POST /customer/_doc/1/_update?pretty
{
  "script" : "ctx._source.age += 5"
}

Fetch the document with id 1

$ GET /customer/_doc/1?pretty
{
    "_index": "customer",
    "_type": "_doc",
    "_id": "1",
    "_version": 1,
    "found": true,
    "_source": {
        "name": "luke"
    }
}

Delete a document

$ DELETE /customer/_doc/2?pretty

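Bulk operations: index the documents with ids 1 and 2 in a single request. Note that the body must end with a newline; in Postman, leave an empty line at the end of the body.

$ POST /customer/_doc/_bulk?pretty
{"index":{"_id":"1"}}
{"name": "luke" }
{"index":{"_id":"2"}}
{"name": "php", "age": "20" }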

First update the document with id 1, then delete the document with id 2

$ POST /customer/_doc/_bulk?pretty
{"update":{"_id":"1"}}
{"doc":{"name":"php best"}}
{"delete":{"_id":"2"}}

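If one action in a bulk request fails, the remaining actions are still executed. When the request finishes, the response reports the status of each action in the order the actions were submitted.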
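
The bulk requests above use a newline-delimited format: one JSON line describing the action, optionally followed by one JSON line with the data, and a newline after every line including the last. The following is a minimal sketch of assembling such a body in Python; the function name build_bulk_body and the (action, source) pair convention are assumptions made for this example.

```python
import json


def build_bulk_body(actions):
    """Serialize (action, source) pairs into the newline-delimited bulk format.

    `source` is None for actions without a second line, such as delete.
    """
    lines = []
    for action, source in actions:
        lines.append(json.dumps(action))
        if source is not None:
            lines.append(json.dumps(source))
    # Every line, including the last one, is terminated by a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    body = build_bulk_body([
        ({"update": {"_id": "1"}}, {"doc": {"name": "php best"}}),
        ({"delete": {"_id": "2"}}, None),
    ])
    print(body, end="")
```

Each action and each document is serialized on its own line, so the JSON must not be pretty-printed across several lines.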

Exploring the Data

First prepare a fictitious data set of bank customer accounts in the format shown below. Download the data set and save it as accounts.json.

{
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
    "gender": "F",
    "address": "244 Columbus Place",
    "employer": "Euron",
    "email": "bradshawmckenzie@euron.com",
    "city": "Hobucken",
    "state": "CO"
}

Import the data set

$ POST /bank/_doc/_bulk?pretty&refresh --data-binary "@accounts.json"
$ GET /_cat/indices?v
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   bank     3inMmuQzRqaTpMkzfh07_A   5   1       1000            0     95.9kb         95.9kb
yellow open   customer gSRgPG9cScKHcuycJE2drw   5   1          2            0      7.7kb          7.7kb

match_all Queries

Search through the request URI: q=* matches everything, and sort=account_number:asc sorts the results by account_number in ascending order.

$ GET /bank/_search?q=*&sort=account_number:asc&pretty
{
  "took" : 63,           // time taken, in milliseconds
  "timed_out" : false,   // whether the search timed out
  "_shards" : {          // shards that were searched
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {             // search results
    "total" : 1000,
    "max_score" : null,
    "hits" : [ {
      "_index" : "bank",
      "_type" : "_doc",
      "_id" : "0",
      "sort": [0],
      "_score" : null,
      "_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
    }, {
      "_index" : "bank",
      "_type" : "_doc",
      "_id" : "1",
      "sort": [1],
      "_score" : null,
      "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
    }, ...
    ]
  }
}

Search with a JSON request body to get the same result as above

$ GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}

Use from and size to skip and limit results, similar to OFFSET and LIMIT in MySQL. Use _source to return only the specified fields.

$ GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } },
  "from": 10,
  "size": 15,    // defaults to 10
  "_source": ["account_number", "balance"]
}
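
Applications usually think in page numbers rather than offsets. As a rough sketch, a 1-based page number can be turned into the from and size values used above like this; the function names page_to_from_size and build_page_query are hypothetical helpers, not part of any Elasticsearch client.

```python
def page_to_from_size(page, per_page=10):
    """Translate a 1-based page number into a `from`/`size` pair."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    return {"from": (page - 1) * per_page, "size": per_page}


def build_page_query(page, per_page=10, fields=None):
    """Build a match_all search body for one page, sorted by balance."""
    body = {
        "query": {"match_all": {}},
        "sort": {"balance": {"order": "desc"}},
    }
    body.update(page_to_from_size(page, per_page))
    if fields is not None:
        # Restrict the returned _source to the requested fields.
        body["_source"] = fields
    return body


if __name__ == "__main__":
    print(build_page_query(3, per_page=15, fields=["account_number", "balance"]))
```

With 15 results per page, page 3 skips the first 30 hits, so from is 30 and size is 15.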

match Queries

Find all accounts whose account_number is 20

$ GET /bank/_search
{
  "query": { "match": { "account_number": 20 } }
}

Find all accounts whose address contains the word mill

$ GET /bank/_search
{
  "query": { "match": { "address": "mill" } }
}

Find all accounts whose address contains the word mill or the word lane

$ GET /bank/_search
{
  "query": { "match": { "address": "mill lane" } }
}

match_phrase is a variant of match. Find all accounts whose address contains the phrase "mill lane"

$ GET /bank/_search
{
  "query": { "match_phrase": { "address": "mill lane" } }
}
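
The difference between match and match_phrase is easiest to see on a few sample values. The code below is a deliberately simplified model, not how Elasticsearch is implemented: it stands in for the analyzer with a lowercase whitespace split and ignores relevance scoring entirely. The addresses other than "880 Holmes Lane" are invented for illustration.

```python
def tokenize(text):
    """Rough stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """True if the field contains at least one query term (OR semantics)."""
    field_terms = set(tokenize(field_value))
    return any(term in field_terms for term in tokenize(query))


def match_phrase(field_value, query):
    """True if the query terms appear next to each other, in order."""
    field_terms = tokenize(field_value)
    query_terms = tokenize(query)
    n = len(query_terms)
    return n > 0 and any(
        field_terms[i:i + n] == query_terms
        for i in range(len(field_terms) - n + 1)
    )


if __name__ == "__main__":
    # Hypothetical sample addresses, except for "880 Holmes Lane".
    for address in ["990 Mill Road", "198 Mill Lane", "880 Holmes Lane"]:
        print(address, match(address, "mill lane"), match_phrase(address, "mill lane"))
```

In this model all three addresses satisfy match, because each contains mill or lane, while only "198 Mill Lane" satisfies match_phrase.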

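bool Queries

Find all accounts whose address contains both mill and lane. The must clause of a bool query lists queries that must all match for a document to count as a hit.

$ GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
      // "should": [...]    at least one of the queries matches (OR)
      // "must_not": [...]  none of the queries match
    }
  }
}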

Combined query: find the accounts of customers who are 40 years old and do not live in the state ID (Idaho)

$ GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}

bool Filters

Find the accounts whose balance is between 20000 and 30000 (inclusive)

$ GET /bank/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
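
Since every bool query has the same shape, request bodies like the ones above can be assembled from their clauses in code. The following is one possible sketch; bool_query is a hypothetical helper written for this article, and the parameter is called filter_ only to avoid shadowing Python's built-in filter.

```python
import json


def bool_query(must=None, should=None, must_not=None, filter_=None):
    """Assemble a bool query body, leaving out clauses that were not given."""
    clauses = {
        "must": must,
        "should": should,
        "must_not": must_not,
        "filter": filter_,
    }
    present = {name: value for name, value in clauses.items() if value is not None}
    return {"query": {"bool": present}}


if __name__ == "__main__":
    # Customers aged 40, not in state ID, with a balance of 20000 to 30000.
    body = bool_query(
        must=[{"match": {"age": "40"}}],
        must_not=[{"match": {"state": "ID"}}],
        filter_={"range": {"balance": {"gte": 20000, "lte": 30000}}},
    )
    print(json.dumps(body, indent=2))
```

Clauses placed under filter restrict which documents match without contributing to the relevance score, which is why the balance range fits there rather than under must.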

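Elasticsearch is a tool that is both simple and complex. The sections above cover its basics and how to work with it through a few REST APIs; more advanced topics are left for later.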

Author: Luke_44

Source: https://www.cnblogs.com/luke44/p/elasticsearch-doc.html