提交者
GitHub
Merge pull request #1 from zzy5066/patch-1
Update readme-cn.md
正在显示
1 个修改的文件
包含
189 行增加
和
40 行删除
| @@ -6,12 +6,19 @@ go-stash是一个高效的从Kafka获取,根据配置的规则进行处理, | @@ -6,12 +6,19 @@ go-stash是一个高效的从Kafka获取,根据配置的规则进行处理, | ||
| 6 | 6 | ||
| 7 | go-stash有大概logstash 5倍的吞吐性能,并且部署简单,一个可执行文件即可。 | 7 | go-stash有大概logstash 5倍的吞吐性能,并且部署简单,一个可执行文件即可。 |
| 8 | 8 | ||
| 9 | - | 9 | + |
| 10 | 10 | ||
| 11 | -## Quick Start | 11 | + |
| 12 | +### 安装 | ||
| 13 | + | ||
| 14 | +```shell | ||
| 15 | +cd stash && go build stash.go | ||
| 16 | +``` | ||
| 17 | + | ||
| 18 | +### Quick Start | ||
| 12 | 19 | ||
| 13 | ```shell | 20 | ```shell |
| 14 | -gostash -f etc/config.yaml | 21 | +./stash -f etc/config.yaml |
| 15 | ``` | 22 | ``` |
| 16 | 23 | ||
| 17 | config.yaml示例如下: | 24 | config.yaml示例如下: |
| @@ -20,49 +27,189 @@ config.yaml示例如下: | @@ -20,49 +27,189 @@ config.yaml示例如下: | ||
| 20 | Clusters: | 27 | Clusters: |
| 21 | - Input: | 28 | - Input: |
| 22 | Kafka: | 29 | Kafka: |
| 23 | - Name: gostash | 30 | + Name: go-stash |
| 31 | + Log: | ||
| 32 | + Mode: file | ||
| 24 | Brokers: | 33 | Brokers: |
| 25 | - - "172.16.186.16:19092" | ||
| 26 | - - "172.16.186.17:19092" | ||
| 27 | - Topics: | ||
| 28 | - - k8slog | ||
| 29 | - Group: pro | ||
| 30 | - Consumers: 16 | 34 | + - "172.16.48.41:9092" |
| 35 | + - "172.16.48.42:9092" | ||
| 36 | + - "172.16.48.43:9092" | ||
| 37 | + Topic: ngapplog | ||
| 38 | + Group: stash | ||
| 39 | + Conns: 3 | ||
| 40 | + Consumers: 10 | ||
| 41 | + Processors: 60 | ||
| 42 | + MinBytes: 1048576 | ||
| 43 | + MaxBytes: 10485760 | ||
| 44 | + Offset: first | ||
| 31 | Filters: | 45 | Filters: |
| 32 | - - Action: drop | ||
| 33 | - Conditions: | ||
| 34 | - - Key: k8s_container_name | ||
| 35 | - Value: "-rpc" | ||
| 36 | - Type: contains | ||
| 37 | - - Key: level | ||
| 38 | - Value: info | ||
| 39 | - Type: match | ||
| 40 | - Op: and | ||
| 41 | - - Action: remove_field | ||
| 42 | - Fields: | ||
| 43 | - - message | ||
| 44 | - - _source | ||
| 45 | - - _type | ||
| 46 | - - _score | ||
| 47 | - - _id | ||
| 48 | - - "@version" | ||
| 49 | - - topic | ||
| 50 | - - index | ||
| 51 | - - beat | ||
| 52 | - - docker_container | ||
| 53 | - - Action: transfer | ||
| 54 | - Field: message | ||
| 55 | - Target: data | 46 | + - Action: drop |
| 47 | + Conditions: | ||
| 48 | + - Key: status | ||
| 49 | + Value: 503 | ||
| 50 | + Type: contains | ||
| 51 | + - Key: type | ||
| 52 | + Value: "app" | ||
| 53 | + Type: match | ||
| 54 | + Op: and | ||
| 55 | + - Action: remove_field | ||
| 56 | + Fields: | ||
| 57 | + - message | ||
| 58 | + - source | ||
| 59 | + - beat | ||
| 60 | + - fields | ||
| 61 | + - input_type | ||
| 62 | + - offset | ||
| 63 | + - "@version" | ||
| 64 | + - _score | ||
| 65 | + - _type | ||
| 66 | + - clientip | ||
| 67 | + - http_host | ||
| 68 | + - request_time | ||
| 56 | Output: | 69 | Output: |
| 57 | ElasticSearch: | 70 | ElasticSearch: |
| 58 | Hosts: | 71 | Hosts: |
| 59 | - - "172.16.141.4:9200" | ||
| 60 | - - "172.16.141.5:9200" | ||
| 61 | - # {.event}是json输入的event属性值 | ||
| 62 | - # {{yyyy-MM-dd}}表示日期,比如2020-09-09 | ||
| 63 | - Index: {.event}-{{yyyy-MM-dd}} | 72 | + - "http://172.16.188.73:9200" |
| 73 | + - "http://172.16.188.74:9200" | ||
| 74 | + - "http://172.16.188.75:9200" | ||
| 75 | + Index: "go-stash-{{yyyy.MM.dd}}" | ||
| 76 | + MaxChunkBytes: 5242880 | ||
| 77 | + GracePeriod: 10s | ||
| 78 | + Compress: false | ||
| 79 | + TimeZone: UTC | ||
| 80 | +``` | ||
| 81 | + | ||
| 82 | +## 详细说明 | ||
| 83 | + | ||
| 84 | +### input | ||
| 85 | + | ||
| 86 | +```shell | ||
| 87 | +Conns: 3 | ||
| 88 | +Consumers: 10 | ||
| 89 | +Processors: 60 | ||
| 90 | +MinBytes: 1048576 | ||
| 91 | +MaxBytes: 10485760 | ||
| 92 | +Offset: first | ||
| 93 | +``` | ||
| 94 | +#### Conns | ||
| 95 | + 链接kafka的链接数,链接数依据cpu的核数,一般<= CPU的核数; | ||
| 96 | + | ||
| 97 | +#### Consumers | ||
| 98 | + 每个连接数打开的线程数,计算规则为Conns * Consumers,不建议超过分片总数,比如topic分片为30,Conns *Consumers <= 30 | ||
| 99 | + | ||
| 100 | +#### Processors | ||
| 101 | + 处理数据的线程数量,依据CPU的核数,可以适当增加,建议配置:Conns * Consumers * 2 或 Conns * Consumers * 3,例如:60 或 90 | ||
| 102 | + | ||
| 103 | +#### MinBytes MaxBytes | ||
| 104 | + 每次从kafka获取数据块的区间大小,默认为1M~10M,网络和IO较好的情况下,可以适当调高 | ||
| 105 | + | ||
| 106 | +#### Offset | ||
| 107 | + 可选last和false,默认为last,表示从头从kafka开始读取数据 | ||
| 108 | + | ||
| 109 | + | ||
| 110 | +### Filters | ||
| 111 | + | ||
| 112 | +```shell | ||
| 113 | +- Action: drop | ||
| 114 | + Conditions: | ||
| 115 | + - Key: k8s_container_name | ||
| 116 | + Value: "-rpc" | ||
| 117 | + Type: contains | ||
| 118 | + - Key: level | ||
| 119 | + Value: info | ||
| 120 | + Type: match | ||
| 121 | + Op: and | ||
| 122 | +- Action: remove_field | ||
| 123 | + Fields: | ||
| 124 | + - message | ||
| 125 | + - _source | ||
| 126 | + - _type | ||
| 127 | + - _score | ||
| 128 | + - _id | ||
| 129 | + - "@version" | ||
| 130 | + - topic | ||
| 131 | + - index | ||
| 132 | + - beat | ||
| 133 | + - docker_container | ||
| 134 | + - offset | ||
| 135 | + - prospector | ||
| 136 | + - source | ||
| 137 | + - stream | ||
| 138 | +- Action: transfer | ||
| 139 | + Field: message | ||
| 140 | + Target: data | ||
| 141 | + | ||
| 142 | +``` | ||
| 143 | +#### - Action: drop | ||
| 144 | + - 删除标识:满足此条件的数据,在处理时将被移除,不进入es | ||
| 145 | + - 按照删除条件,指定key字段及Value的值,Type字段可选contains(包含)或match(匹配) | ||
| 146 | + - 拼接条件Op: and,也可写or | ||
| 147 | + | ||
| 148 | +#### - Action: remove_field | ||
| 149 | + 移除字段标识:需要移除的字段,在下面列出即可 | ||
| 150 | + | ||
| 151 | +#### - Action: transfer | ||
| 152 | + 转移字段标识:例如可以将message字段,重新定义为data字段 | ||
| 153 | + | ||
| 154 | + | ||
| 155 | +### Output | ||
| 156 | + | ||
| 157 | +#### Index | ||
| 158 | + 索引名称,indexname-{{yyyy.MM.dd}}表示年.月.日,也可以用{{yyyy-MM-dd}},格式自己定义 | ||
| 159 | + | ||
| 160 | +#### MaxChunkBytes | ||
| 161 | + 每次往ES提交的bulk大小,默认是5M,可依据ES的io情况,适当的调整 | ||
| 162 | + | ||
| 163 | +#### GracePeriod | ||
| 164 | + 默认为10s,在程序关闭后,在10s内用于处理余下的消费和数据,优雅退出 | ||
| 165 | + | ||
| 166 | +#### Compress | ||
| 167 | + 数据压缩,压缩会减少传输的数据量,但会增加一定的处理性能,可选值true/false,默认为false | ||
| 168 | + | ||
| 169 | +#### TimeZone | ||
| 170 | + 默认值为UTC,世界标准时间 | ||
| 171 | + | ||
| 172 | + | ||
| 173 | + | ||
| 174 | + | ||
| 175 | + | ||
| 176 | +## ES性能写入测试 | ||
| 177 | + | ||
| 178 | + | ||
| 179 | +### 测试环境 | ||
| 180 | +- stash服务器:3台 4核 8G | ||
| 181 | +- es服务器: 15台 16核 64G | ||
| 182 | + | ||
| 183 | +### 关键配置 | ||
| 184 | + | ||
| 185 | +```shell | ||
| 186 | +- Input: | ||
| 187 | + Conns: 3 | ||
| 188 | + Consumers: 10 | ||
| 189 | + Processors: 60 | ||
| 190 | + MinBytes: 1048576 | ||
| 191 | + MaxBytes: 10485760 | ||
| 192 | + Filters: | ||
| 193 | + - Action: remove_field | ||
| 194 | + Fields: | ||
| 195 | + - message | ||
| 196 | + - source | ||
| 197 | + - beat | ||
| 198 | + - fields | ||
| 199 | + - input_type | ||
| 200 | + - offset | ||
| 201 | + - request_time | ||
| 202 | + Output: | ||
| 203 | + Index: "nginx_pro-{{yyyy.MM.d}}" | ||
| 204 | + Compress: false | ||
| 205 | + MaxChunkBytes: 5242880 | ||
| 206 | + TimeZone: UTC | ||
| 64 | ``` | 207 | ``` |
| 65 | 208 | ||
| 209 | +### 写入速度平均在15W/S以上 | ||
| 210 | + | ||
| 211 | + | ||
| 212 | + | ||
| 66 | ### 微信交流群 | 213 | ### 微信交流群 |
| 67 | 214 | ||
| 68 | 加群之前有劳给一个star,一个小小的star是作者们回答问题的动力。 | 215 | 加群之前有劳给一个star,一个小小的star是作者们回答问题的动力。 |
| @@ -73,4 +220,6 @@ Clusters: | @@ -73,4 +220,6 @@ Clusters: | ||
| 73 | 220 | ||
| 74 | 如果您发现bug请及时提issue,我们会尽快确认并修改。 | 221 | 如果您发现bug请及时提issue,我们会尽快确认并修改。 |
| 75 | 222 | ||
| 76 | -添加我的微信:kevwan,请注明go-stash,我拉进go-stash社区群🤝 | ||
| 223 | +添加我的微信:kevwan,请注明go-stash,我拉进go-stash社区群🤝 | ||
| 224 | + | ||
| 225 | +### --END |
-
请 注册 或 登录 后发表评论