正在显示
1 个修改的文件
包含
189 行增加
和
40 行删除
@@ -6,12 +6,19 @@ go-stash是一个高效的从Kafka获取,根据配置的规则进行处理, | @@ -6,12 +6,19 @@ go-stash是一个高效的从Kafka获取,根据配置的规则进行处理, | ||
6 | 6 | ||
7 | go-stash有大概logstash 5倍的吞吐性能,并且部署简单,一个可执行文件即可。 | 7 | go-stash有大概logstash 5倍的吞吐性能,并且部署简单,一个可执行文件即可。 |
8 | 8 | ||
9 | -![go-stash](doc/flow.png) | 9 | +![go-stash](https://pro-public.xiaoheiban.cn/icon/84cc2f235035d7f1da6df512d4ba97b7.png) |
10 | 10 | ||
11 | -## Quick Start | 11 | + |
12 | +### 安装 | ||
13 | + | ||
14 | +```shell | ||
15 | +cd stash && go build stash.go | ||
16 | +``` | ||
17 | + | ||
18 | +### Quick Start | ||
12 | 19 | ||
13 | ```shell | 20 | ```shell |
14 | -gostash -f etc/config.yaml | 21 | +./stash -f etc/config.yaml |
15 | ``` | 22 | ``` |
16 | 23 | ||
17 | config.yaml示例如下: | 24 | config.yaml示例如下: |
@@ -20,49 +27,189 @@ config.yaml示例如下: | @@ -20,49 +27,189 @@ config.yaml示例如下: | ||
20 | Clusters: | 27 | Clusters: |
21 | - Input: | 28 | - Input: |
22 | Kafka: | 29 | Kafka: |
23 | - Name: gostash | 30 | + Name: go-stash |
31 | + Log: | ||
32 | + Mode: file | ||
24 | Brokers: | 33 | Brokers: |
25 | - - "172.16.186.16:19092" | ||
26 | - - "172.16.186.17:19092" | ||
27 | - Topics: | ||
28 | - - k8slog | ||
29 | - Group: pro | ||
30 | - Consumers: 16 | 34 | + - "172.16.48.41:9092" |
35 | + - "172.16.48.42:9092" | ||
36 | + - "172.16.48.43:9092" | ||
37 | + Topic: ngapplog | ||
38 | + Group: stash | ||
39 | + Conns: 3 | ||
40 | + Consumers: 10 | ||
41 | + Processors: 60 | ||
42 | + MinBytes: 1048576 | ||
43 | + MaxBytes: 10485760 | ||
44 | + Offset: first | ||
31 | Filters: | 45 | Filters: |
32 | - - Action: drop | ||
33 | - Conditions: | ||
34 | - - Key: k8s_container_name | ||
35 | - Value: "-rpc" | ||
36 | - Type: contains | ||
37 | - - Key: level | ||
38 | - Value: info | ||
39 | - Type: match | ||
40 | - Op: and | ||
41 | - - Action: remove_field | ||
42 | - Fields: | ||
43 | - - message | ||
44 | - - _source | ||
45 | - - _type | ||
46 | - - _score | ||
47 | - - _id | ||
48 | - - "@version" | ||
49 | - - topic | ||
50 | - - index | ||
51 | - - beat | ||
52 | - - docker_container | ||
53 | - - Action: transfer | ||
54 | - Field: message | ||
55 | - Target: data | 46 | + - Action: drop |
47 | + Conditions: | ||
48 | + - Key: status | ||
49 | + Value: 503 | ||
50 | + Type: contains | ||
51 | + - Key: type | ||
52 | + Value: "app" | ||
53 | + Type: match | ||
54 | + Op: and | ||
55 | + - Action: remove_field | ||
56 | + Fields: | ||
57 | + - message | ||
58 | + - source | ||
59 | + - beat | ||
60 | + - fields | ||
61 | + - input_type | ||
62 | + - offset | ||
63 | + - "@version" | ||
64 | + - _score | ||
65 | + - _type | ||
66 | + - clientip | ||
67 | + - http_host | ||
68 | + - request_time | ||
56 | Output: | 69 | Output: |
57 | ElasticSearch: | 70 | ElasticSearch: |
58 | Hosts: | 71 | Hosts: |
59 | - - "172.16.141.4:9200" | ||
60 | - - "172.16.141.5:9200" | ||
61 | - # {.event}是json输入的event属性值 | ||
62 | - # {{yyyy-MM-dd}}表示日期,比如2020-09-09 | ||
63 | - Index: {.event}-{{yyyy-MM-dd}} | 72 | + - "http://172.16.188.73:9200" |
73 | + - "http://172.16.188.74:9200" | ||
74 | + - "http://172.16.188.75:9200" | ||
75 | + Index: "go-stash-{{yyyy.MM.dd}}" | ||
76 | + MaxChunkBytes: 5242880 | ||
77 | + GracePeriod: 10s | ||
78 | + Compress: false | ||
79 | + TimeZone: UTC | ||
80 | +``` | ||
81 | + | ||
82 | +## 详细说明 | ||
83 | + | ||
84 | +### input | ||
85 | + | ||
86 | +```shell | ||
87 | +Conns: 3 | ||
88 | +Consumers: 10 | ||
89 | +Processors: 60 | ||
90 | +MinBytes: 1048576 | ||
91 | +MaxBytes: 10485760 | ||
92 | +Offset: first | ||
93 | +``` | ||
94 | +#### Conns | ||
95 | + 链接kafka的链接数,链接数依据cpu的核数,一般<= CPU的核数; | ||
96 | + | ||
97 | +#### Consumers | ||
98 | + 每个连接数打开的线程数,计算规则为Conns * Consumers,不建议超过分片总数,比如topic分片为30,Conns *Consumers <= 30 | ||
99 | + | ||
100 | +#### Processors | ||
101 | + 处理数据的线程数量,依据CPU的核数,可以适当增加,建议配置:Conns * Consumers * 2 或 Conns * Consumers * 3,例如:60 或 90 | ||
102 | + | ||
103 | +#### MinBytes MaxBytes | ||
104 | + 每次从kafka获取数据块的区间大小,默认为1M~10M,网络和IO较好的情况下,可以适当调高 | ||
105 | + | ||
106 | +#### Offset | ||
107 | + 可选last和false,默认为last,表示从头从kafka开始读取数据 | ||
108 | + | ||
109 | + | ||
110 | +### Filters | ||
111 | + | ||
112 | +```shell | ||
113 | +- Action: drop | ||
114 | + Conditions: | ||
115 | + - Key: k8s_container_name | ||
116 | + Value: "-rpc" | ||
117 | + Type: contains | ||
118 | + - Key: level | ||
119 | + Value: info | ||
120 | + Type: match | ||
121 | + Op: and | ||
122 | +- Action: remove_field | ||
123 | + Fields: | ||
124 | + - message | ||
125 | + - _source | ||
126 | + - _type | ||
127 | + - _score | ||
128 | + - _id | ||
129 | + - "@version" | ||
130 | + - topic | ||
131 | + - index | ||
132 | + - beat | ||
133 | + - docker_container | ||
134 | + - offset | ||
135 | + - prospector | ||
136 | + - source | ||
137 | + - stream | ||
138 | +- Action: transfer | ||
139 | + Field: message | ||
140 | + Target: data | ||
141 | + | ||
142 | +``` | ||
143 | +#### - Action: drop | ||
144 | + - 删除标识:满足此条件的数据,在处理时将被移除,不进入es | ||
145 | + - 按照删除条件,指定key字段及Value的值,Type字段可选contains(包含)或match(匹配) | ||
146 | + - 拼接条件Op: and,也可写or | ||
147 | + | ||
148 | +#### - Action: remove_field | ||
149 | + 移除字段标识:需要移除的字段,在下面列出即可 | ||
150 | + | ||
151 | +#### - Action: transfer | ||
152 | + 转移字段标识:例如可以将message字段,重新定义为data字段 | ||
153 | + | ||
154 | + | ||
155 | +### Output | ||
156 | + | ||
157 | +#### Index | ||
158 | + 索引名称,indexname-{{yyyy.MM.dd}}表示年.月.日,也可以用{{yyyy-MM-dd}},格式自己定义 | ||
159 | + | ||
160 | +#### MaxChunkBytes | ||
161 | + 每次往ES提交的bulk大小,默认是5M,可依据ES的io情况,适当的调整 | ||
162 | + | ||
163 | +#### GracePeriod | ||
164 | + 默认为10s,在程序关闭后,在10s内用于处理余下的消费和数据,优雅退出 | ||
165 | + | ||
166 | +#### Compress | ||
167 | + 数据压缩,压缩会减少传输的数据量,但会增加一定的处理性能,可选值true/false,默认为false | ||
168 | + | ||
169 | +#### TimeZone | ||
170 | + 默认值为UTC,世界标准时间 | ||
171 | + | ||
172 | + | ||
173 | + | ||
174 | + | ||
175 | + | ||
176 | +## ES性能写入测试 | ||
177 | + | ||
178 | + | ||
179 | +### 测试环境 | ||
180 | +- stash服务器:3台 4核 8G | ||
181 | +- es服务器: 15台 16核 64G | ||
182 | + | ||
183 | +### 关键配置 | ||
184 | + | ||
185 | +```shell | ||
186 | +- Input: | ||
187 | + Conns: 3 | ||
188 | + Consumers: 10 | ||
189 | + Processors: 60 | ||
190 | + MinBytes: 1048576 | ||
191 | + MaxBytes: 10485760 | ||
192 | + Filters: | ||
193 | + - Action: remove_field | ||
194 | + Fields: | ||
195 | + - message | ||
196 | + - source | ||
197 | + - beat | ||
198 | + - fields | ||
199 | + - input_type | ||
200 | + - offset | ||
201 | + - request_time | ||
202 | + Output: | ||
203 | + Index: "nginx_pro-{{yyyy.MM.d}}" | ||
204 | + Compress: false | ||
205 | + MaxChunkBytes: 5242880 | ||
206 | + TimeZone: UTC | ||
64 | ``` | 207 | ``` |
65 | 208 | ||
209 | +### 写入速度平均在15W/S以上 | ||
210 | +![go-stash](https://pro-public.xiaoheiban.cn/icon/ee207a1cb094c0b3dcaa91ae75b118b8.png) | ||
211 | + | ||
212 | + | ||
66 | ### 微信交流群 | 213 | ### 微信交流群 |
67 | 214 | ||
68 | 加群之前有劳给一个star,一个小小的star是作者们回答问题的动力。 | 215 | 加群之前有劳给一个star,一个小小的star是作者们回答问题的动力。 |
@@ -73,4 +220,6 @@ Clusters: | @@ -73,4 +220,6 @@ Clusters: | ||
73 | 220 | ||
74 | 如果您发现bug请及时提issue,我们会尽快确认并修改。 | 221 | 如果您发现bug请及时提issue,我们会尽快确认并修改。 |
75 | 222 | ||
76 | -添加我的微信:kevwan,请注明go-stash,我拉进go-stash社区群🤝 | ||
223 | +添加我的微信:kevwan,请注明go-stash,我拉进go-stash社区群🤝 | ||
224 | + | ||
225 | +### --END |
-
请 注册 或 登录 后发表评论