elastic – A Programmer's Daily Life

December 6, 2021

Querying logs in kibana discover

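The Kibana query language is based on Lucene query syntax. Here are some tips:

To run a free-text search, simply enter a text string. For example, if you are searching web server logs, you can enter the keyword "chrome" to find every document in which any field contains "chrome".
To search for a specific value in a specific field, prefix the value with the field name. For example, entering status:200 finds all documents whose status field is 200.
To search a range of values, use the bracketed range syntax [START_VALUE TO END_VALUE]. For example, to find documents with a 4xx status code, enter status:[400 TO 499].
To specify more complex search criteria, use the Boolean operators AND, OR, and NOT. For example, to find documents with a 4xx status code whose extension field is php or html, enter status:[400 TO 499] AND (extension:php OR extension:html).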
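The tips above can be sketched as a few string helpers. This is only an illustration of how the query text is assembled; the helper names (field_term, field_range, any_of, all_of) are made up for this post and are not part of Kibana or Lucene.

```python
def field_term(field: str, value) -> str:
    """Match one value in one field, e.g. status:200."""
    return f"{field}:{value}"


def field_range(field: str, start, end) -> str:
    """Inclusive range in bracket syntax, e.g. status:[400 TO 499]."""
    return f"{field}:[{start} TO {end}]"


def any_of(*clauses: str) -> str:
    """Join clauses with OR and wrap them in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses: str) -> str:
    """Join clauses with AND."""
    return " AND ".join(clauses)


# 4xx status codes whose extension is php or html
query = all_of(
    field_range("status", 400, 499),
    any_of(field_term("extension", "php"), field_term("extension", "html")),
)
print(query)  # status:[400 TO 499] AND (extension:php OR extension:html)
```

The printed string is what you would paste into the Discover search bar.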

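response:200 matches documents whose response field has the value 200.

A string wrapped in quotation marks is a phrase search. For example, message:"Quick brown fox" searches the message field for the phrase "quick brown fox". Without the quotes, the query matches every document that contains these words, regardless of their order. That means it matches "quick brown fox" and also "quick fox brown". (Aside: the quotes make the words match as a single unit.)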

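In Kibana's newer query syntax, the query parser no longer splits on whitespace. Multiple search terms must be separated by explicit Boolean operators. Note that the Boolean operators are not case sensitive.

The Lucene query response:200 extension:php therefore has to be written as response:200 and extension:php. This matches documents whose response field matches 200 and whose extension field matches php.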

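If we change the operator in the middle to or, response:200 or extension:php matches documents whose response field matches 200 or whose extension field matches php.

By default, and has higher precedence than or.

response:200 and extension:php or extension:css matches documents where response is 200 and extension is php, or documents where extension is css and response is anything.

Parentheses can override this precedence.

You can also use not.

not response:200 matches documents whose response is not 200.

response:200 and not (extension:php or extension:css) matches documents where response is 200 and extension is neither php nor css.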
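Python's own and / or / not follow the same precedence, so the rules above can be checked on a few hand-written sample documents. This is a sketch of the Boolean logic only: the documents are invented, and real matching in Elasticsearch also depends on field mappings and analyzers.

```python
docs = [
    {"response": 200, "extension": "php"},
    {"response": 200, "extension": "css"},
    {"response": 404, "extension": "css"},
    {"response": 404, "extension": "php"},
    {"response": 200, "extension": "html"},
]


def default_precedence(d):
    # response:200 and extension:php or extension:css
    return d["response"] == 200 and d["extension"] == "php" or d["extension"] == "css"


def grouped(d):
    # response:200 and (extension:php or extension:css)
    return d["response"] == 200 and (d["extension"] == "php" or d["extension"] == "css")


def negated_group(d):
    # response:200 and not (extension:php or extension:css)
    return d["response"] == 200 and not (d["extension"] == "php" or d["extension"] == "css")


default_hits = [i for i, d in enumerate(docs) if default_precedence(d)]
grouped_hits = [i for i, d in enumerate(docs) if grouped(d)]
negated_hits = [i for i, d in enumerate(docs) if negated_group(d)]
print(default_hits, grouped_hits, negated_hits)  # [0, 1, 2] [0, 1] [4]
```

Without parentheses the 404 css document (index 2) is a hit, because extension:css stands on its own once and has bound first.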

March 23, 2020

elastic: memory

1. JVM heap

ES recommends giving no more than 50% of the machine's memory to the JVM, and no more than 32 GB in total.

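# Set the max heap size (ES_HEAP_SIZE applies to ES versions before 5.0; later versions set -Xms/-Xmx in jvm.options)
export ES_HEAP_SIZE=10g

# Show the current max heap size
GET /_cat/nodes?h=heap.max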

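The less memory the JVM uses, the more is left for Lucene, which means more memory available as cache and better performance. But the JVM heap cannot be too small either, or the node will hit OOM errors or full GCs; those can make the node time out and get removed from the cluster, and the resulting shard reallocation hurts performance badly.

Two ES components need memory:

JVM
Lucene

The JVM mainly holds the various in-memory data structures, while Lucene handles the underlying storage. Lucene relies on the operating system's caching, and the operating system uses all free memory as page cache, so an oversized JVM heap leaves too little cache for segments.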

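64-bit pointers carry extra performance overhead. To improve performance, Java uses a technique called compressed oops: it keeps 32-bit pointers, but each pointer represents an object offset rather than a byte address. A 32-bit pointer can then address 4 G objects instead of 4 G bytes, which lets the heap grow to 32 GB. If the heap is given more than 32 GB, however, Java switches to 64-bit ordinary object pointers (OOPs) and performance drops noticeably; by some estimates the heap has to reach 40-50 GB before performance matches that of a 32 GB heap.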
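The 32 GB figure falls out of simple arithmetic. The sketch below assumes objects are aligned on 8-byte boundaries (the usual HotSpot default, stated here as an assumption), so a 32-bit offset counts 8-byte units instead of single bytes.

```python
POINTER_BITS = 32
ALIGNMENT_BYTES = 8  # assumption: default object alignment in HotSpot
GIB = 2 ** 30

addressable_units = 2 ** POINTER_BITS

# A 32-bit pointer used as a plain byte address reaches 4 GiB.
plain_limit_gib = addressable_units // GIB

# The same pointer used as an offset in 8-byte units reaches 32 GiB.
compressed_limit_gib = addressable_units * ALIGNMENT_BYTES // GIB

print(plain_limit_gib, compressed_limit_gib)  # 4 32
```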

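Even if the machine has far more than 32 GB of memory, do not give the heap more than 32 GB. Consider splitting the machine into several nodes with 64 GB of memory each, and giving each node a 32 GB heap.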
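The two sizing rules (at most half of the memory, never above 32 GB) fit in one line of code. recommended_heap_gb is a hypothetical helper for illustration, not an ES API.

```python
def recommended_heap_gb(node_memory_gb: float, cap_gb: float = 32.0) -> float:
    """Half of the node's memory, capped at 32 GB."""
    return min(node_memory_gb * 0.5, cap_gb)


for memory_gb in (16, 64, 256):
    print(memory_gb, recommended_heap_gb(memory_gb))
```

A 16 GB node gets 8 GB of heap, while 64 GB and 256 GB nodes both stop at 32 GB.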

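2. GC

The JVM requests one large block of memory called the heap. All objects in the heap fall into two groups, Young (Eden) and Old; in general the Old space is significantly larger than the Young space.

Whenever the Young space fills up, a young GC starts and every surviving object is marked as a survivor. An object that is marked as a survivor twice in a row is moved to Old. Old works the same way: an old GC is triggered when its space runs out. Both kinds of GC cause a stop-the-world pause. Because the Young space is small, a young GC is usually short; the severe pauses generally come from old GCs.

By default ES starts GC when JVM heap usage exceeds 75%, so you should watch for heap usage above 80%, which indicates that GC can no longer reclaim memory effectively. The default heap is 1 GB, which is too small; in production, size it according to the machine's memory (generally less than 50% of it). A JVM heap that is too large, however, makes GC take too long, and a node that is unresponsive for more than 30s is removed from the cluster by the master.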
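A monitoring check for the 80% rule could look roughly like this. It assumes the input has the shape of a GET _nodes/stats/jvm response (nodes, then name and jvm.mem.heap_used_percent per node); the sample data below is hand-written, not real cluster output, and hot_nodes is a hypothetical name.

```python
def hot_nodes(node_stats: dict, threshold_percent: int = 80) -> list:
    """Return the names of nodes whose heap usage exceeds the threshold."""
    names = []
    for node in node_stats["nodes"].values():
        if node["jvm"]["mem"]["heap_used_percent"] > threshold_percent:
            names.append(node["name"])
    return sorted(names)


# Hand-written sample in the assumed shape of a node stats response
sample = {
    "nodes": {
        "id-a": {"name": "node-1", "jvm": {"mem": {"heap_used_percent": 62}}},
        "id-b": {"name": "node-2", "jvm": {"mem": {"heap_used_percent": 85}}},
    }
}
print(hot_nodes(sample))  # ['node-2']
```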

CMS & G1

Java Bugs in various JVMs affecting Lucene / Solr

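ES uses CMS by default, which performs poorly with a large heap (longer stop-the-world pauses). G1 performs better with large heaps by comparison, but ES officially advises against G1 because Lucene has hit bugs under G1 that cause data loss.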

Do not, under any circumstances, run Lucene with the G1 garbage collector. Lucene’s test suite fails with the G1 garbage collector on a regular basis, including bugs that cause index corruption. There is no person on this planet that seems to understand such bugs (see https://bugs.openjdk.java.net/browse/JDK-8038348, open for over a year), so don’t count on the situation changing soon.

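3. swap

Do everything possible to keep the system from swapping memory.

Turning swap off outright is not recommended; instead, tune swappiness so that the system swaps less often. swappiness ranges from 0 to 100: 0 means swap only on OOM, 100 means swap as aggressively as possible. The default is 60, and ES recommends setting it to 1.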

# Check swappiness
cat /proc/sys/vm/swappiness

# Set it: add the line below to /etc/sysctl.conf, then reload with sysctl -p
sudo vi /etc/sysctl.conf
vm.swappiness = 1

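If you cannot change system parameters, you can enable ES's mlockall setting, which makes the JVM lock the process address space in memory so that it is never swapped out: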

bootstrap.memory_lock: true

# Check whether mlockall is enabled
GET _nodes?filter_path=**.mlockall

March 23, 2020

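elastic: creating documents

Creating a new document (indexing) takes two steps: refresh and flush.

Refresh:

Newly inserted documents are placed in the in-memory buffer
A refresh, by default once per second, writes the in-memory buffer into an in-memory segment (after this step the new documents are searchable)

Every shard is made up of a sequence of segments. Segments are immutable, so every update means two steps:

Write the new data
Mark the old data as deleted (it is only physically removed when segments are merged)

Flush:

While a document is written to the in-memory buffer, it is also written to the shard's translog
A flush is triggered every 30 minutes (or when the translog reaches a size threshold)
A flush empties the in-memory buffer, commits the in-memory segments to disk, and then clears the translog

The translog is committed to disk every 5s; the size threshold that triggers a flush can be adjusted with index.translog.flush_threshold_size.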
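The refresh and flush steps described above can be mimicked with a toy model. This is only a teaching sketch: ToyShard and its attributes are invented here, nothing touches a disk, and treating "flush empties the buffer" as "flush refreshes first" is a modelling assumption.

```python
class ToyShard:
    def __init__(self):
        self.buffer = []           # in-memory buffer: not searchable yet
        self.memory_segments = []  # refreshed segments: searchable, not committed
        self.disk_segments = []    # committed segments
        self.translog = []         # every indexed document is also logged here

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        # Turn the buffer into a new immutable segment (a tuple).
        if self.buffer:
            self.memory_segments.append(tuple(self.buffer))
            self.buffer = []

    def flush(self):
        # Empty the buffer, commit in-memory segments, then clear the translog.
        self.refresh()
        self.disk_segments.extend(self.memory_segments)
        self.memory_segments = []
        self.translog = []

    def searchable(self):
        segments = self.disk_segments + self.memory_segments
        return [doc for segment in segments for doc in segment]


shard = ToyShard()
shard.index("a")
before_refresh = shard.searchable()  # nothing is visible before a refresh
shard.refresh()
after_refresh = shard.searchable()   # "a" is searchable, translog still holds it
shard.index("b")
shard.flush()
print(before_refresh, after_refresh, shard.searchable(), shard.translog)
```

After the flush both documents are in committed segments and the translog is empty.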

Relevant metrics:

indices.indexing.index_total
indices.indexing.index_time_in_millis
indices.indexing.index_current
indices.refresh.total
indices.refresh.total_time_in_millis
indices.flush.total
indices.flush.total_time_in_millis

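Monitoring indexing latency, which can be derived from index_time_in_millis and index_total, shows whether ES is approaching its write limit.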
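One common way to get an indexing latency figure is to sample the counters twice and divide the change in time by the change in document count. The sketch below assumes two snapshots of the indices.indexing section; the numbers are invented and indexing_latency_ms is a hypothetical helper.

```python
def indexing_latency_ms(previous: dict, current: dict):
    """Average milliseconds per indexed document between two samples."""
    docs = current["index_total"] - previous["index_total"]
    millis = current["index_time_in_millis"] - previous["index_time_in_millis"]
    if docs <= 0:
        return None  # nothing was indexed in this interval
    return millis / docs


# Invented sample values for two consecutive polls
previous = {"index_total": 1000, "index_time_in_millis": 5000}
current = {"index_total": 1500, "index_time_in_millis": 6000}
print(indexing_latency_ms(previous, current))  # 2.0
```

Using deltas rather than the raw totals keeps old history from hiding a recent slowdown.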

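1. Improving write performance

Choose an appropriate number of shards so that the index is spread across the nodes, which improves concurrency.

Turn off merge throttling. If ES notices that merging cannot keep up with indexing, it automatically throttles indexing; setting indices.store.throttle.type to none disables this.

Increase the index buffer; indices.memory.index_buffer_size defaults to 10%.

When initializing an index that needs a large amount of indexing, leave replicas unset at first and configure them only after the bulk load finishes; this improves performance significantly.

Refresh runs once per second by default; lowering the refresh frequency improves write performance.

Since ES 2.0 the translog is flushed after every request. Setting index.translog.durability=async makes the flush asynchronous, with the interval controlled by sync_interval.

Note that the number of primary shards can only be specified when the index is created and cannot be changed afterwards. The only way to scale out is to create a new index and reindex into it.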
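As a rough sketch, the replica and refresh advice can be expressed as two settings bodies: one to apply before a big load and one to restore afterwards. The bodies would be sent with PUT /<index>/_settings; the helper names and the 30s value are assumptions chosen for illustration, and nothing here talks to a cluster.

```python
import json


def bulk_load_settings(refresh_interval: str = "30s") -> dict:
    """No replicas and a slower refresh while the initial load runs."""
    return {"index": {"number_of_replicas": 0, "refresh_interval": refresh_interval}}


def normal_settings(replicas: int = 1, refresh_interval: str = "1s") -> dict:
    """Settings to restore once the load has finished."""
    return {"index": {"number_of_replicas": replicas, "refresh_interval": refresh_interval}}


before_load = bulk_load_settings()
after_load = normal_settings()
print(json.dumps(before_load))
print(json.dumps(after_load))
```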

2. Store

File system storage types

GET /_settings

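When creating an index you can choose the file system storage type. The options are:

fs: the system default, with good compatibility across systems;
simplefs: poor concurrent performance;
niofs: efficient under concurrency, not supported on Windows;
mmapfs: uses mmap

niofs is recommended.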

February 21, 2020

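Bulk operations in elasticsearch with bulk

The bulk format:
{action:{metadata}}\n
{request body}\n

action: the operation to perform, one of create (create the document if it does not exist), update (update a document), index (create a new document or replace an existing one), or delete (delete a document).
The difference between create and index: if the document already exists, create fails with a message that the document already exists, whereas index succeeds.
metadata: the index information the action applies to; it must specify the document's _index, _type, and _id.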

Bulk add

Each document takes two lines: an action line with the index metadata, followed by the request body.

POST /lib2/books/_bulk
{"index":{"_id":1}}
{"title":"Java","price":"55"}
{"index":{"_id":2}}
{"title":"Html5","price":"45"}
{"index":{"_id":3}}
{"title":"Php","price":"35"}
{"index":{"_id":4}}
{"title":"Python","price":"50"}

Bulk delete
A delete action in a bulk request needs no request body. In the request below, the index action gives no _id, so elasticsearch generates one automatically; the update action must include an _id and fails if the document does not exist.

POST /lib/books/_bulk
{"delete":{"_index":"lib","_type":"books","_id":"4"}}
{"create":{"_index":"tt","_type":"ttt","_id":"100"}}
{"name":"lisi"}
{"index":{"_index":"tt","_type":"ttt"}}
{"name":"zhaosi"}
{"update":{"_index":"lib","_type":"books","_id":"4"}}
{"doc":{"price":58}}

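How much data can one bulk request handle?
bulk loads the data to be processed into memory, so the amount is limited. There is no single best size: it depends on your hardware, the size and complexity of your documents, and your indexing and search load.

The usual recommendation is 1000-5000 documents or 5-15 MB per request. By default a request cannot exceed 100 MB; this can be changed in the es configuration file (elasticsearch.yml, in the config directory under $ES_HOME). The bulk thread pool size is configured as the number of cores + 1.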
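Splitting a large load into bulk bodies that respect both limits might look like the sketch below. It only builds the newline-delimited text; sending it is left out. to_bulk_pairs and chunk_bulk are made-up helper names, the action lines omit _type for brevity, and the default limits simply reuse the low end of the ranges quoted above.

```python
import json


def to_bulk_pairs(index: str, docs):
    """Yield (action_line, source_line) for each (doc_id, source) pair."""
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_id": doc_id}}
        yield json.dumps(action), json.dumps(source)


def chunk_bulk(pairs, max_docs: int = 1000, max_bytes: int = 5 * 1024 * 1024):
    """Group pairs into bulk bodies that stay within both limits."""
    lines, count, size = [], 0, 0
    for action, source in pairs:
        pair_size = len(action.encode()) + len(source.encode()) + 2  # two newlines
        if count and (count + 1 > max_docs or size + pair_size > max_bytes):
            yield "\n".join(lines) + "\n"  # every line must end with a newline
            lines, count, size = [], 0, 0
        lines += [action, source]
        count += 1
        size += pair_size
    if lines:
        yield "\n".join(lines) + "\n"


books = [
    (1, {"title": "Java", "price": "55"}),
    (2, {"title": "Html5", "price": "45"}),
    (3, {"title": "Php", "price": "35"}),
]
chunks = list(chunk_bulk(to_bulk_pairs("lib2", books), max_docs=2))
print(len(chunks))  # 2
print(chunks[0], end="")
```

With max_docs=2 the three books end up in two bodies, the first holding two action/source pairs.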

February 15, 2020

Elasticsearch update api

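The Update API updates a document based on a supplied script. The operation gets the document from the index, runs the script (the script language and parameters are optional), and indexes back the result (the script can also delete the document or ignore the operation). It uses versioning to make sure no update happened between the "get" (fetching the document) and the "reindex" (indexing the document again).

Note that this operation still means a full reindex of the document (the update fetches the document, merges the changes, deletes the previous document, and adds the merged one); it only removes some network round trips and reduces the chance of a version conflict between the get and the index. The _source field must be enabled for this feature to work.

For example, index a simple document: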

PUT test/_doc/1
{
    "counter" : 1,
    "tags" : ["red"]
}

Scripted updates

The following example runs a script that increments counter:

POST test/_doc/1/_update
{
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    }
}

Now we can add a tag to the tags list (note that the tag is added even if it already exists, because tags is a list)

POST test/_doc/1/_update
{
    "script" : {
        "source": "ctx._source.tags.add(params.tag)",
        "lang": "painless",
        "params" : {
            "tag" : "blue"
        }
    }
}

Besides _source, the following variables are available through ctx: _index, _type, _id, _version, _routing, and _now (the current timestamp)

The following example shows how to read _id:

POST test/_doc/1/_update
{
    "script" : "ctx._source.tags.add(ctx._id)"
}

You can also add a new field to the document:

POST test/_doc/1/_update
{
    "script" : "ctx._source.new_field = 'value_of_new_field'"
}

Remove a field from the document:

POST test/_doc/1/_update
{
    "script" : "ctx._source.remove('new_field')"
}

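You can even change the operation that is executed. In the following example the document is deleted if the tags field contains green; otherwise nothing is done (the operation is ignored and noop is returned):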

POST test/_doc/1/_update
{
    "script" : {
        "source": "if (ctx._source.tags.contains(params.tag)) { ctx.op = 'delete' } else { ctx.op = 'none' }",
        "lang": "painless",
        "params" : {
            "tag" : "green"
        }
    }
}

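Partial document updates

The update API also accepts a partial document, which is merged into the existing document (a simple recursive merge: objects are merged internally, while core "keys/values" and arrays are replaced). To replace the existing document entirely, use the index API. The following example uses a partial update to add a new field to an existing document: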

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    }
}
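The merge rule (objects merged recursively, plain values and arrays replaced) can be sketched in a few lines. This is an illustration of the described behavior, not Elasticsearch's actual code; merge_partial and the author field are invented for the example.

```python
def merge_partial(existing: dict, partial: dict) -> dict:
    """Recursively merge a partial document into a copy of the existing one."""
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)  # objects merge
        else:
            merged[key] = value  # plain values and arrays are replaced
    return merged


existing = {"counter": 1, "tags": ["red"], "author": {"first": "li", "last": "si"}}
partial = {"name": "new_name", "tags": ["blue"], "author": {"last": "zhao"}}
merged = merge_partial(existing, partial)
print(merged)
```

Here tags becomes ["blue"] rather than ["red", "blue"], while author keeps first and only changes last.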

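Specifying both doc and script produces an error. It is better to put the partial document's field pairs into the script itself (I don't know how to do that yet).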

POST test/_doc/1/_update
{
  "doc" : {
        "age" : "18"
    },
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    }
}

The response is:

{
  "error": {
    "root_cause": [
      {
        "type": "action_request_validation_exception",
        "reason": "Validation Failed: 1: can't provide both script and doc;"
      }
    ],
    "type": "action_request_validation_exception",
    "reason": "Validation Failed: 1: can't provide both script and doc;"
  },
  "status": 400
}
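One possible way to fold the field pair into the script is to pass it as a script parameter and assign it inside the script source, so the request carries no doc element at all. The body below is an untested sketch built as a Python dict; it has not been run against a cluster, and the age field is simply carried over from the failing request above.

```python
import json

# Request body for POST test/_doc/1/_update: only "script", no "doc".
body = {
    "script": {
        "source": "ctx._source.counter += params.count; ctx._source.age = params.age;",
        "lang": "painless",
        "params": {"count": 4, "age": "18"},
    }
}
print(json.dumps(body, indent=2))
```

Because the values travel in params, the script source stays the same from one request to the next.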

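Detecting noop updates
If doc is specified, its values are merged with the existing _source. By default, an update that changes nothing is detected and returns "result": "noop", as shown below: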

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    }
}

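If name was already new_name before the request was sent, the whole update request is ignored. When a request is ignored, the result element in the response is noop.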

{
  "_index": "test",
  "_type": "_doc",
  "_id": "1",
  "_version": 2,
  "result": "noop",
  "_shards": {
    "total": 0,
    "successful": 0,
    "failed": 0
  }
}

Setting "detect_noop": false disables this default behavior:

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    },
    "detect_noop": false
}

Upserts

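If the document does not exist yet, the contents of the upsert element are inserted as a new document. If the document does exist, the script is executed: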

POST test/_doc/1/_update
{
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    },
    "upsert" : {
        "counter" : 1
    }
}

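scripted_upsert
If you want the script to run whether or not the document exists (that is, the script initializes the document instead of the upsert element), set scripted_upsert to true: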

POST sessions/session/dh3sgudg8gsrgl/_update
{
    "scripted_upsert":true,
    "script" : {
        "id": "my_web_session_summariser",
        "params" : {
            "pageViewEvent" : {
                "url":"foo.com/bar",
                "response":404,
                "time":"2014-01-01 12:32"
            }
        }
    },
    "upsert" : {}
}

Now compare this with writing only the script, without an upsert element. When the document does not exist, the request below fails.

POST test/_doc/1/_update
{
    "scripted_upsert":true,
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    }
}

The error response is:

{
  "error": {
    "root_cause": [
      {
        "type": "document_missing_exception",
        "reason": "[_doc][1]: document missing",
        "index_uuid": "YgmlkeEERGm20yUBDJHKtQ",
        "shard": "3",
        "index": "test"
      }
    ],
    "type": "document_missing_exception",
    "reason": "[_doc][1]: document missing",
    "index_uuid": "YgmlkeEERGm20yUBDJHKtQ",
    "shard": "3",
    "index": "test"
  },
  "status": 404
}

With scripted_upsert: true and an upsert element, run the following when the document does not exist:

POST test/_doc/1/_update
{
    "scripted_upsert":true,
    "script" : {
        "source": "ctx._source.counter += params.count",
        "lang": "painless",
        "params" : {
            "count" : 4
        }
    },
    "upsert" : {
        "counter" : 10
    }
}

The response is:

{
  "_index": "test",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 6,
  "_primary_term": 1
}

The request succeeded. Now look at the document:

{
  "_index": "test",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "counter": 14
  }
}

counter is 14, which shows that the upsert content was applied first and the script ran afterwards.
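The behavior observed in these experiments can be summed up in a small simulation. It models only what the requests above showed (script on an existing document, upsert on a missing one, script on top of upsert when scripted_upsert is true); apply_update and add_four are hypothetical names and this is not how Elasticsearch is implemented.

```python
def apply_update(existing, script, upsert=None, scripted_upsert=False):
    """Return (result, source) for a scripted update; raise if nothing can be done."""
    if existing is not None:
        source = dict(existing)
        script(source)
        return "updated", source
    if upsert is None:
        raise LookupError("document missing")
    source = dict(upsert)
    if scripted_upsert:
        script(source)  # the script also runs on the freshly inserted upsert
    return "created", source


def add_four(source):
    # Stands in for: ctx._source.counter += params.count, with count = 4
    source["counter"] += 4


print(apply_update(None, add_four, {"counter": 10}, scripted_upsert=True))
print(apply_update(None, add_four, {"counter": 1}))
print(apply_update({"counter": 1}, add_four, {"counter": 1}))
```

The first call reproduces the counter of 14 seen above; the second inserts the upsert untouched; the third ignores upsert because the document already exists.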

doc_as_upsert
Setting doc_as_upsert to true uses the contents of doc as the upsert value, instead of sending a partial doc plus an upsert document:

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    },
    "doc_as_upsert" : true
}

Compare this with sending doc alone:

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    }
}

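When the document does not exist, the request with doc_as_upsert set to true succeeds, whereas the one above fails with a document-missing error. What happens if you write it like this?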

POST test/_doc/1/_update
{
    "doc" : {
        "name" : "new_name"
    },
    "upsert" : {
        "counter" : 10
    },
    "doc_as_upsert" : true
}

The result is that upsert is never applied: whether or not the document exists, it is always the contents of doc that are used.

Reposted from: https://www.cnblogs.com/ginb/p/9413382.html