正在显示
1 个修改的文件
包含
168 行增加
和
19 行删除
@@ -6,12 +6,19 @@ go-stash是一个高效的从Kafka获取,根据配置的规则进行处理, | @@ -6,12 +6,19 @@ go-stash是一个高效的从Kafka获取,根据配置的规则进行处理, | ||
6 | 6 | ||
7 | go-stash有大概logstash 5倍的吞吐性能,并且部署简单,一个可执行文件即可。 | 7 | go-stash有大概logstash 5倍的吞吐性能,并且部署简单,一个可执行文件即可。 |
8 | 8 | ||
9 | -![go-stash](doc/flow.png) | 9 | +![go-stash](https://pro-public.xiaoheiban.cn/icon/84cc2f235035d7f1da6df512d4ba97b7.png) |
10 | 10 | ||
11 | -## Quick Start | 11 | + |
12 | +### 安装 | ||
13 | + | ||
14 | +```shell | ||
15 | +cd stash && go build stash.go | ||
16 | +``` | ||
17 | + | ||
18 | +### Quick Start | ||
12 | 19 | ||
13 | ```shell | 20 | ```shell |
14 | -gostash -f etc/config.yaml | 21 | +./stash -f etc/config.yaml |
15 | ``` | 22 | ``` |
16 | 23 | ||
17 | config.yaml示例如下: | 24 | config.yaml示例如下: |
@@ -20,17 +27,91 @@ config.yaml示例如下: | @@ -20,17 +27,91 @@ config.yaml示例如下: | ||
20 | Clusters: | 27 | Clusters: |
21 | - Input: | 28 | - Input: |
22 | Kafka: | 29 | Kafka: |
23 | - Name: gostash | 30 | + Name: go-stash |
31 | + Log: | ||
32 | + Mode: file | ||
24 | Brokers: | 33 | Brokers: |
25 | - - "172.16.186.16:19092" | ||
26 | - - "172.16.186.17:19092" | ||
27 | - Topics: | ||
28 | - - k8slog | ||
29 | - Group: pro | ||
30 | - Consumers: 16 | 34 | + - "172.16.48.41:9092" |
35 | + - "172.16.48.42:9092" | ||
36 | + - "172.16.48.43:9092" | ||
37 | + Topic: ngapplog | ||
38 | + Group: stash | ||
39 | + Conns: 3 | ||
40 | + Consumers: 10 | ||
41 | + Processors: 60 | ||
42 | + MinBytes: 1048576 | ||
43 | + MaxBytes: 10485760 | ||
44 | + Offset: first | ||
31 | Filters: | 45 | Filters: |
32 | - Action: drop | 46 | - Action: drop |
33 | Conditions: | 47 | Conditions: |
48 | + - Key: status | ||
49 | + Value: 503 | ||
50 | + Type: contains | ||
51 | + - Key: type | ||
52 | + Value: "app" | ||
53 | + Type: match | ||
54 | + Op: and | ||
55 | + - Action: remove_field | ||
56 | + Fields: | ||
57 | + - message | ||
58 | + - source | ||
59 | + - beat | ||
60 | + - fields | ||
61 | + - input_type | ||
62 | + - offset | ||
63 | + - "@version" | ||
64 | + - _score | ||
65 | + - _type | ||
66 | + - clientip | ||
67 | + - http_host | ||
68 | + - request_time | ||
69 | + Output: | ||
70 | + ElasticSearch: | ||
71 | + Hosts: | ||
72 | + - "http://172.16.188.73:9200" | ||
73 | + - "http://172.16.188.74:9200" | ||
74 | + - "http://172.16.188.75:9200" | ||
75 | + Index: "go-stash-{{yyyy.MM.dd}}" | ||
76 | + MaxChunkBytes: 5242880 | ||
77 | + GracePeriod: 10s | ||
78 | + Compress: false | ||
79 | + TimeZone: UTC | ||
80 | +``` | ||
81 | + | ||
82 | +## 详细说明 | ||
83 | + | ||
84 | +### input | ||
85 | + | ||
86 | +```shell | ||
87 | +Conns: 3 | ||
88 | +Consumers: 10 | ||
89 | +Processors: 60 | ||
90 | +MinBytes: 1048576 | ||
91 | +MaxBytes: 10485760 | ||
92 | +Offset: first | ||
93 | +``` | ||
94 | +#### Conns | ||
95 | + 链接kafka的链接数,链接数依据cpu的核数,一般<= CPU的核数; | ||
96 | + | ||
97 | +#### Consumers | ||
98 | + 每个连接数打开的线程数,计算规则为Conns * Consumers,不建议超过分片总数,比如topic分片为30,Conns *Consumers <= 30 | ||
99 | + | ||
100 | +#### Processors | ||
101 | + 处理数据的线程数量,依据CPU的核数,可以适当增加,建议配置:Conns * Consumers * 2 或 Conns * Consumers * 3,例如:60 或 90 | ||
102 | + | ||
103 | +#### MinBytes MaxBytes | ||
104 | + 每次从kafka获取数据块的区间大小,默认为1M~10M,网络和IO较好的情况下,可以适当调高 | ||
105 | + | ||
106 | +#### Offset | ||
107 | + 可选last和false,默认为last,表示从头从kafka开始读取数据 | ||
108 | + | ||
109 | + | ||
110 | +### Filters | ||
111 | + | ||
112 | +```shell | ||
113 | +- Action: drop | ||
114 | + Conditions: | ||
34 | - Key: k8s_container_name | 115 | - Key: k8s_container_name |
35 | Value: "-rpc" | 116 | Value: "-rpc" |
36 | Type: contains | 117 | Type: contains |
@@ -38,7 +119,7 @@ Clusters: | @@ -38,7 +119,7 @@ Clusters: | ||
38 | Value: info | 119 | Value: info |
39 | Type: match | 120 | Type: match |
40 | Op: and | 121 | Op: and |
41 | - - Action: remove_field | 122 | +- Action: remove_field |
42 | Fields: | 123 | Fields: |
43 | - message | 124 | - message |
44 | - _source | 125 | - _source |
@@ -50,19 +131,85 @@ Clusters: | @@ -50,19 +131,85 @@ Clusters: | ||
50 | - index | 131 | - index |
51 | - beat | 132 | - beat |
52 | - docker_container | 133 | - docker_container |
53 | - - Action: transfer | 134 | + - offset |
135 | + - prospector | ||
136 | + - source | ||
137 | + - stream | ||
138 | +- Action: transfer | ||
54 | Field: message | 139 | Field: message |
55 | Target: data | 140 | Target: data |
141 | + | ||
142 | +``` | ||
143 | +#### - Action: drop | ||
144 | + - 删除标识:满足此条件的数据,在处理时将被移除,不进入es | ||
145 | + - 按照删除条件,指定key字段及Value的值,Type字段可选contains(包含)或match(匹配) | ||
146 | + - 拼接条件Op: and,也可写or | ||
147 | + | ||
148 | +#### - Action: remove_field | ||
149 | + 移除字段标识:需要移除的字段,在下面列出即可 | ||
150 | + | ||
151 | +#### - Action: transfer | ||
152 | + 转移字段标识:例如可以将message字段,重新定义为data字段 | ||
153 | + | ||
154 | + | ||
155 | +### Output | ||
156 | + | ||
157 | +#### Index | ||
158 | + 索引名称,indexname-{{yyyy.MM.dd}}表示年.月.日,也可以用{{yyyy-MM-dd}},格式自己定义 | ||
159 | + | ||
160 | +#### MaxChunkBytes | ||
161 | + 每次往ES提交的bulk大小,默认是5M,可依据ES的io情况,适当的调整 | ||
162 | + | ||
163 | +#### GracePeriod | ||
164 | + 默认为10s,在程序关闭后,在10s内用于处理余下的消费和数据,优雅退出 | ||
165 | + | ||
166 | +#### Compress | ||
167 | + 数据压缩,压缩会减少传输的数据量,但会增加一定的处理性能,可选值true/false,默认为false | ||
168 | + | ||
169 | +#### TimeZone | ||
170 | + 默认值为UTC,世界标准时间 | ||
171 | + | ||
172 | + | ||
173 | + | ||
174 | + | ||
175 | + | ||
176 | +## ES性能写入测试 | ||
177 | + | ||
178 | + | ||
179 | +### 测试环境 | ||
180 | +- stash服务器:3台 4核 8G | ||
181 | +- es服务器: 15台 16核 64G | ||
182 | + | ||
183 | +### 关键配置 | ||
184 | + | ||
185 | +```shell | ||
186 | +- Input: | ||
187 | + Conns: 3 | ||
188 | + Consumers: 10 | ||
189 | + Processors: 60 | ||
190 | + MinBytes: 1048576 | ||
191 | + MaxBytes: 10485760 | ||
192 | + Filters: | ||
193 | + - Action: remove_field | ||
194 | + Fields: | ||
195 | + - message | ||
196 | + - source | ||
197 | + - beat | ||
198 | + - fields | ||
199 | + - input_type | ||
200 | + - offset | ||
201 | + - request_time | ||
56 | Output: | 202 | Output: |
57 | - ElasticSearch: | ||
58 | - Hosts: | ||
59 | - - "172.16.141.4:9200" | ||
60 | - - "172.16.141.5:9200" | ||
61 | - # {.event}是json输入的event属性值 | ||
62 | - # {{yyyy-MM-dd}}表示日期,比如2020-09-09 | ||
63 | - Index: {.event}-{{yyyy-MM-dd}} | 203 | + Index: "nginx_pro-{{yyyy.MM.d}}" |
204 | + Compress: false | ||
205 | + MaxChunkBytes: 5242880 | ||
206 | + TimeZone: UTC | ||
64 | ``` | 207 | ``` |
65 | 208 | ||
209 | +### 写入速度平均在15W/S以上 | ||
210 | +![go-stash](https://pro-public.xiaoheiban.cn/icon/ee207a1cb094c0b3dcaa91ae75b118b8.png) | ||
211 | + | ||
212 | + | ||
66 | ### 微信交流群 | 213 | ### 微信交流群 |
67 | 214 | ||
68 | 加群之前有劳给一个star,一个小小的star是作者们回答问题的动力。 | 215 | 加群之前有劳给一个star,一个小小的star是作者们回答问题的动力。 |
@@ -74,3 +221,5 @@ Clusters: | @@ -74,3 +221,5 @@ Clusters: | ||
74 | 如果您发现bug请及时提issue,我们会尽快确认并修改。 | 221 | 如果您发现bug请及时提issue,我们会尽快确认并修改。 |
75 | 222 | ||
76 | 添加我的微信:kevwan,请注明go-stash,我拉进go-stash社区群🤝 | 223 | 添加我的微信:kevwan,请注明go-stash,我拉进go-stash社区群🤝 |
224 | + | ||
225 | +### --END |
-
请 注册 或 登录 后发表评论