Customizing Ambari Alerts

Update thresholds for the Ambari alert


  • Using Host Disk Usage as the example

Environment

Product   Version
Ambari    2.2.1.0-169
HDP       2.4.0.0-169

Purpose

Change the WARNING threshold of the Host Disk Usage alert, which defaults to 50%. The example sets it to 70%.

Procedure

1. Find the cluster name

[root@gmjk-dsj8 ~]# curl --user admin:admin http://localhost:8080/api/v1/clusters/
{
"href" : "http://localhost:8080/api/v1/clusters/",
"items" : [
{
"href" : "http://localhost:8080/api/v1/clusters/test",
"Clusters" : {
"cluster_name" : "test",
"version" : "HDP-2.4"
}
}
]
}
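
When the later steps are scripted, the cluster name can be pulled out of this response instead of being read by eye. The following is a minimal sketch, assuming the pretty-printed, one-field-per-line JSON shown above. `cluster_names` is a hypothetical helper and not part of Ambari.

```shell
# Hypothetical helper: reads the /api/v1/clusters response on stdin
# and prints every cluster_name value, one per line.
# Assumes one JSON field per line, as in the output above.
cluster_names() {
  sed -n 's/.*"cluster_name"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage (same host and credentials as above):
#   curl -s --user admin:admin http://localhost:8080/api/v1/clusters/ | cluster_names
```

The `-s` flag only silences curl's progress meter so that nothing but the JSON reaches the pipe.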

2. Dump the full cluster configuration to a file

[root@gmjk-dsj8 ~]# curl --user admin:admin http://localhost:8080/api/v1/clusters/test > /root/0620
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 885k 0 885k 0 0 2475k 0 --:--:-- --:--:-- --:--:-- 2488k

3. Find the Host Disk Usage definition in the dump. Its URL should be:

http://gmjk-dsj8:8080/api/v1/clusters/test/alert_definitions/3
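
Searching an 885k dump by hand is tedious, so the lookup can be sketched with awk. This assumes that the dump lists each alert definition with its "href" line before its "name" line, one JSON field per line, and that the internal name is `ambari_agent_disk_usage` as shown in the definition details. `definition_href` is a hypothetical helper.

```shell
# Hypothetical helper: prints the href of the alert definition whose
# "name" equals $1, reading the cluster dump on stdin.
# Assumes each definition lists its "href" line before its "name" line,
# with one JSON field per line.
definition_href() {
  awk -v want="$1" '
    { val = $3; gsub(/[",]/, "", val) }     # third field, quotes and commas removed
    $1 == "\"href\"" { href = val }         # remember the most recent href
    $1 == "\"name\"" && val == want { print href; exit }
  '
}

# Usage with the dump file written in step 2:
#   definition_href ambari_agent_disk_usage < /root/0620
```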

4. View the details of the Host Disk Usage definition

[root@gmjk-dsj8 ~]# curl -u admin:admin -X GET http://gmjk-dsj8:8080/api/v1/clusters/test/alert_definitions/3
{
"href" : "http://gmjk-dsj8:8080/api/v1/clusters/test/alert_definitions/3",
"AlertDefinition" : {
"cluster_name" : "test",
"component_name" : "AMBARI_AGENT",
"description" : "This host-level alert is triggered if the amount of disk space used goes above specific thresholds. The default threshold values are 70% for WARNING and 80% for CRITICAL",
"enabled" : true,
"id" : 3,
"ignore_host" : false,
"interval" : 1,
"label" : "Host Disk Usage",
"name" : "ambari_agent_disk_usage",
"scope" : "HOST",
"service_name" : "AMBARI",
"source" : {
"parameters" : [
{
"display_name" : "Minimum Free Space",
"description" : "The overall amount of free disk space left before an alert is triggered.",
"name" : "minimum.free.space",
"value" : "5.0E9",
"type" : "NUMERIC",
"units" : "bytes",
"threshold" : "WARNING"
},
{
"display_name" : "Warning",
"description" : "The percent of disk space consumed before a warning is triggered.",
"name" : "percent.used.space.warning.threshold",
"value" : "0.7",
"type" : "PERCENT",
"units" : "%",
"threshold" : "WARNING"
},
{
"display_name" : "Critical",
"description" : "The percent of disk space consumed before a critical alert is triggered.",
"name" : "percent.free.space.critical.threshold",
"value" : "0.8",
"type" : "PERCENT",
"units" : "%",
"threshold" : "CRITICAL"
}
],
"path" : "alert_disk_space.py",
"type" : "SCRIPT"
}
}
}

5. Copy the output above, remove the "href" line, and change the values and the description as needed
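
The same edit can be sketched with sed instead of a text editor. This assumes the definition was saved one field per line, with each parameter's "value" line after its "name" line, as in the GET output. `prepare_payload` is a hypothetical helper. It writes the new threshold as a bare number, matching the PUT body in the next step, and leaves the description text for manual editing.

```shell
# Hypothetical helper: reads a saved alert definition on stdin, drops the
# "href" line, and sets the warning threshold to $1 (a fraction such as 0.7).
# Assumes the "value" line follows the "name" line of the same parameter.
prepare_payload() {
  sed -e '/"href"[[:space:]]*:/d' \
      -e '/"percent\.used\.space\.warning\.threshold"/,/"value"/ s/\("value"[[:space:]]*:[[:space:]]*\)"\{0,1\}[^",]*"\{0,1\}/\1'"$1"'/'
}

# Usage, where definition.json is a hypothetical file holding the GET output:
#   prepare_payload 0.7 < definition.json > payload.json
```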

6. Upload the modified JSON with PUT. Note the single quotes that wrap the body after -d

[root@gmjk-dsj8 ~]# curl -u admin:admin -H 'X-Requested-By:admin' -X PUT  -d '{
> "AlertDefinition" : {
> "cluster_name" : "test",
> "component_name" : "AMBARI_AGENT",
> "description" : "This host-level alert is triggered if the amount of disk space used goes above specific thresholds. The default threshold values are 70% for WARNING and 80% for CRITICAL",
> "enabled" : true,
> "id" : 3,
> "ignore_host" : false,
> "interval" : 1,
> "label" : "Host Disk Usage",
> "name" : "ambari_agent_disk_usage",
> "scope" : "HOST",
> "service_name" : "AMBARI",
> "source" : {
> "parameters" : [
> {
> "name" : "minimum.free.space",
> "display_name" : "Minimum Free Space",
> "units" : "bytes",
> "value" : 5.0E9,
> "description" : "The overall amount of free disk space left before an alert is triggered.",
> "type" : "NUMERIC",
> "threshold" : "WARNING"
> },
> {
> "name" : "percent.used.space.warning.threshold",
> "display_name" : "Warning",
> "units" : "%",
> "value" : 0.7,
> "description" : "The percent of disk space consumed before a warning is triggered.",
> "type" : "PERCENT",
> "threshold" : "WARNING"
> },
> {
> "name" : "percent.free.space.critical.threshold",
> "display_name" : "Critical",
> "units" : "%",
> "value" : 0.8,
> "description" : "The percent of disk space consumed before a critical alert is triggered.",
> "type" : "PERCENT",
> "threshold" : "CRITICAL"
> }
> ],
> "path" : "alert_disk_space.py",
> "type" : "SCRIPT"
> }
> }
> }' http://gmjk-dsj8:8080/api/v1/clusters/test/alert_definitions/3

7. Verify the modified alert definition
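
One way to check the result is to repeat the GET from step 4 and read back the stored threshold. The following is a minimal sketch, assuming the same one-field-per-line output in which each parameter's "value" line follows its "name" line. `parameter_value` is a hypothetical helper.

```shell
# Hypothetical helper: reads an alert definition on stdin and prints the
# value of the parameter named $1, with surrounding quotes removed.
# Assumes the "value" line follows the matching "name" line.
parameter_value() {
  awk -v want="$1" '
    { val = $3; gsub(/[",]/, "", val) }          # third field, quotes and commas removed
    $1 == "\"name\"" { found = (val == want) }   # track whether we are inside the wanted parameter
    $1 == "\"value\"" && found { print val; exit }
  '
}

# Usage (same host and credentials as in step 4):
#   curl -s -u admin:admin http://gmjk-dsj8:8080/api/v1/clusters/test/alert_definitions/3 \
#     | parameter_value percent.used.space.warning.threshold
```

If the update was applied, the printed value should match the one sent in the PUT body.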
