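diff --git a/code-language/java/README.md b/code-language/java/README.md
index 03027ad..37d75de 100644
--- a/code-language/java/README.md
+++ b/code-language/java/README.md
@@ -1580,7 +1580,164 @@ return orderDO;
 ### 3.24 Design - The business code is written; does that mean it is production-ready?
-
+- Preparation before going live
+  - Provide a health check endpoint
+  - Expose internal application information
+  - Set up application metrics monitoring
+
+#### A. Preparation: configuring Spring Boot Actuator
+- Spring Boot ships an Actuator module that packages production-ready features such as health checks, internal application information, and metrics.
+- Add Actuator as a dependency in the pom:
+```xml
+<dependency>
+    <groupId>org.springframework.boot</groupId>
+    <artifactId>spring-boot-starter-actuator</artifactId>
+</dependency>
+```
+- A few important Actuator settings:
+  - If you do not want the Actuator management port to share the application port of the web application, set a separate port with management.server.port.
+  - Actuator comes with many ready-to-use endpoints that can be exposed over JMX or the web. Because some of the information is sensitive, not all built-in endpoints are exposed by default.
+  - By default the root path for web access to Actuator is /actuator; it can be changed with management.endpoints.web.base-path. The configuration below changes it to /admin.
+```properties
+management.server.port=45679
+management.endpoints.web.exposure.include=*
+management.endpoints.web.base-path=/admin
+```
+- You can now visit http://localhost:45679/admin to see the URLs of all Actuator features:
+  - Most endpoints provide read-only information, such as Spring beans, the ConfigurableEnvironment, scheduled tasks, Spring Boot auto-configuration, and Spring MVC mappings; a few endpoints also support changes,
+    such as gracefully shutting down the application, downloading a thread dump, downloading a heap dump, and changing log levels.
+  - Spring Boot Admin, a management tool for Spring Boot, wraps most of the functionality offered by the Actuator endpoints in a web UI.
+
+#### B. 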
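Health checks need to reach key components
+- Spring Boot Actuator ships ready-made HealthIndicator implementations for third-party systems such as databases, InfluxDB, Elasticsearch, Redis, and RabbitMQ.
+- Through Spring Boot auto-configuration, these indicators take effect automatically. When one of these components has a problem, its HealthIndicator returns DOWN or OUT_OF_SERVICE and the health endpoint's HTTP status code becomes 503, which we can use to configure health monitoring and alerting.
+- For the demo, set management.endpoint.health.show-details to always in the configuration file so that every user can see the health of each component
+  (if it is set to when_authorized, the authorized roles can be configured with management.endpoint.health.roles):
+```properties
+management.endpoint.health.show-details=always
+```
+- Visiting the health endpoint shows that the database, disk, RabbitMQ, Redis, and other components are UP, and so is the application as a whole:
+- With the basic configuration understood, consider a program that depends on an important third-party service: when that service is unreachable, we want the application's own health status to be DOWN as well.
+- For example, suppose the third-party service has a user endpoint that fails 50% of the time:
+  - com.baiye.demo.case24.health.UserServiceController
+  - Tying the application's overall health to whether this user endpoint responds correctly is simple: define a UserServiceHealthIndicator that implements the HealthIndicator interface.
+  - com.baiye.demo.case24.health.UserServiceHealthIndicator
+- Next, a case that aggregates several HealthIndicators: define a CompositeHealthContributor that aggregates multiple HealthContributors to monitor a group of thread pools.
+- First, ThreadPoolProvider defines two thread pools: demoThreadPool has a single worker thread and uses an ArrayBlockingQueue of length 10 as its work queue; ioThreadPool simulates an IO thread pool with 10 core threads and a maximum of 50 threads:
+  - com.baiye.demo.case24.health.ThreadPoolProvider
+- Then we define an endpoint that submits a very long-running task to demoThreadPool, to simulate a full thread pool queue:
+  - com.baiye.demo.case24.health.UserServiceController.slowTask
+- With that in place, we implement the custom HealthIndicator class that reports the health of a single thread pool.
+- It takes a ThreadPoolExecutor and decides the component's health from the remaining capacity of the queue: UP if any capacity remains, DOWN otherwise. It also adds two important queue figures, the current number of queued elements and the remaining capacity, to the Health as details:
+  - com.baiye.demo.case24.health.ThreadPoolHealthIndicator
+- Then define a CompositeHealthContributor that aggregates two ThreadPoolHealthIndicator instances, one for each thread pool defined in ThreadPoolProvider:
+  - com.baiye.demo.case24.health.ThreadPoolsHealthContributor
+- After the program starts, the health endpoint shows the health of the thread pools and of the external service userService.
+- Spring Boot 2.3.0 enhanced health checking and added dedicated Liveness and Readiness endpoints, which makes it easier to integrate Spring Boot applications with Kubernetes.
+
+#### C. 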
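Exposing the state of important internal components
+- Besides using thread pool state as a basis for the application's health, we can use Actuator's InfoContributor feature to expose state data of important internal components. The example here shows how to view that data in two ways: the info HTTP endpoint and a JMX MBean.
+- Implement a ThreadPoolInfoContributor to show thread pool information.
+  - com.baiye.demo.case24.info.ThreadPoolInfoContributor
+- Visiting the /admin/info endpoint shows this data:
+- ![Component status](pic/组件的状态监控.png)
+- In addition, if JMX is enabled:
+```properties
+spring.jmx.enabled=true
+```
+- With jconsole you can find the Info MBean under org.springframework.boot.Endpoint and run its info operation, which shows the information about the two thread pools produced by the custom InfoContributor:
+- ![Component status](pic/组件的状态监控1.png)
+- To view and operate MBeans without jconsole, you can use jolokia to expose JMX over HTTP. Add the dependency:
+```xml
+<dependency>
+    <groupId>org.jolokia</groupId>
+    <artifactId>jolokia-core</artifactId>
+</dependency>
+```
+- Then you can use jolokia to run the info operation of the org.springframework.boot:type=Endpoint,name=Info MBean:
+- ![Component status](pic/组件的状态监控2.png)
+
+#### D. Metrics are the "golden key" to locating problems quickly
+- Let's use a real case to see how charts help locate a problem quickly.
+  - Consider the ordering and delivery flow of a food delivery service, shown in the figure below. OrderController places the order: it first validates the parameters; if they are valid it calls another service to query the merchant's status; if the merchant is open it continues placing the order and, once the order succeeds, sends a message to RabbitMQ to start the asynchronous delivery flow. A separate
+    DeliverOrderHandler listens for this message and performs the delivery.
+  - ![Component status](pic/组件的状态监控3.png)
+  - For a business flow like this that involves both synchronous and asynchronous calls, if users report that ordering fails, how can we quickly find out which step went wrong?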
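+  - This is where a metrics system helps. We can set up metrics to monitor the two key operations, ordering and delivery.
+  - For ordering, we can set up 4 metrics:
+    - Total orders: the cumulative number of orders in the whole system;
+    - Order requests: incremented by 1 for each order request received, before it is processed;
+    - Order successes: incremented by 1 each time an order completes successfully;
+    - Order failures: incremented by 1 when order processing throws an exception, with the failure reason attached to the metric.
+  - For delivery, we set up 4 similar metrics. We can collect metrics with the Micrometer framework, which is also the metrics framework chosen by Spring Boot Actuator. It provides abstractions for several kinds of metrics, three of which are commonly used:
+    - gauge (red): reflects the current value of a metric as is and does not accumulate; examples are the total orders metric in this case, the number of online players in a game, or the current JVM thread count.
+    - counter (green): increases by 1 each time the method is called and does accumulate, for example the order requests metric in this case. For instance, if the method is called 10 times within 5 seconds and Micrometer sends metrics to the backend store every 5 seconds, it can send a single value of 10.
+    - timer (blue): similar to a counter, but records elapsed time as well as the count, for example the order success and order failure metrics in this case.
+  - Every metric can also carry tags as supplementary data. For example, when an operation fails we attach a reason tag to the metric.
+  - Besides abstracting metrics, Micrometer also abstracts storage. You can think of Micrometer as similar to SLF4J: SLF4J abstracts logging, while Micrometer abstracts metrics. By pulling in different registries, Micrometer can connect to various monitoring systems or time-series databases.
+- In this case we add the micrometer-registry-influx dependency, which brings in Micrometer's core dependency as well as Micrometer's binding for InfluxDB (a time-series database that specializes in storing metric data), so that metrics can be saved to InfluxDB:
+```xml
+<dependency>
+    <groupId>io.micrometer</groupId>
+    <artifactId>micrometer-registry-influx</artifactId>
+</dependency>
+```
+- Then modify the configuration file to enable exporting metrics to InfluxDB, configure the InfluxDB address, and have metrics aggregated on the client once per second before being sent to InfluxDB:
+```properties
+management.metrics.export.influx.enabled=true
+management.metrics.export.influx.uri=http://localhost:8086
+management.metrics.export.influx.step=1S
+```
+- Next, we add code to the business logic to record the metrics.
+- OrderController uses Micrometer to implement the four metrics: total orders, order requests, order successes, and order failures.
+  - com.baiye.demo.case24.metrics.OrderController.createOrder
+  - A user ID below 10 simulates invalid user data, and a merchant ID other than 2 simulates a merchant that is closed.
+- Next is the implementation of the DeliverOrderHandler delivery service.
+  - Its deliverOrder method listens for the MQ messages sent by OrderController and simulates delivery. See the following code:
+  - com.baiye.demo.case24.metrics.DeliverOrderHandler
+  - We also simulate a switch for the overall state of the delivery service; calling the status endpoint changes it. That completes the scenario setup; next we configure metrics monitoring.
+- Install Grafana, then configure an InfluxDB data source in Grafana:
+  - ![Component status monitoring](pic/组件的状态监控4.png)
+  - Once the data source is configured, add a dashboard and then add charts to it. For example, the order count chart contains three metrics: orders received, succeeded, and failed.
+  - ![Component status monitoring](pic/组件的状态监控5.png)
+  - About the configuration in this figure:
+    - Red box: data source configuration; select the data source configured above.
+    - Blue box: FROM configuration; select our metric name.
+    - Green box: SELECT 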
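configuration; select the metric field to query, optionally applying aggregate functions. Here we take the value of the count field and sum it with the sum function.
+    - Purple box: GROUP BY configuration; we group by a 1-minute time granularity and by the reason field, so the Y axis represents QPM (requests per minute) and each kind of failure gets its own curve.
+    - Yellow box: ALIAS BY configuration; it sets an alias for each metric, and the alias references the reason tag.
+  - That is how InfluxDB metrics are configured in Grafana. FROM, SELECT, and GROUP BY mean much the same as in SQL, so they should not be hard to understand.
+  - In the same way, we build a complete business monitoring dashboard containing the 8 metrics implemented earlier:
+    - 2 Gauge charts showing the total number of completed orders and the total number of completed deliveries.
+    - 4 Graph charts showing the count and performance of the ordering operation and of the delivery operation.
+- With this dashboard in place, we use wrk to load-test four scenarios:
+  - Scenario 1: run for a while with a valid user ID and an open merchant ID:
+    - wrk -t 1 -c 1 -d 3600s http://localhost:45678/order/createOrder\?userId\=20\&merchantId\=2
+    - The dashboard shows at a glance how the whole system is doing. The system is currently healthy: both ordering and delivery succeed, ordering takes 400ms on average and delivery about 500ms, as expected (note that the green and yellow curves in the order count chart actually overlap,
+      which means every order succeeded):
+    - ![Component status monitoring](pic/组件的状态监控6.png)
+  - Scenario 2: run for a while with an invalid user ID:
+    - wrk -t 1 -c 1 -d 3600s http://localhost:45678/order/createOrder\?userId\=2\&merchantId\=2
+    - Ordering with an invalid user will obviously make every order fail. Let's see whether the monitoring charts show it.
+    - ![Component status monitoring](pic/组件的状态监控7.png)
+    - In the green box, a blue invalid user curve has now appeared and it coincides with the green order-requests-received curve, which means every order failed because of the invalid user error, while the source of the requests is fine.
+    - In the red box, although every order fails, the ordering time dropped from 400ms to 200ms, which means 200ms is spent before the order fails (consistent with the code). And because the response time of failed orders is halved, throughput doubles.
+    - The two delivery charts show the delivery curves dropping to 0, because orders fail and no MQ message is ever sent. Note the blue line as well: the delivery curve drops to 0 later than the order success curve does,
+      because delivery is asynchronous: although every order fails from some moment on, the MQ queue still holds some earlier unprocessed messages.
+  - Scenario 3: order failures caused by a closed merchant:
+    - wrk -t 1 -c 1 -d 3600s http://localhost:45678/order/createOrder\?userId\=20\&merchantId\=1
+    - ![Component status monitoring](pic/组件的状态监控8.png)
+  - Scenario 4: delivery stops. We call the endpoint with curl to turn the delivery switch off:
+    - curl -X POST 'http://localhost:45678/deliver/status?status=false'
+    - ![Component status monitoring](pic/组件的状态监控9.png)
+    - The monitoring shows that from the moment the switch is turned off, every delivery message fails with the reason deliver outofservice, and the delivery time drops from about 500ms to 0ms, which shows that the delivery failure is a fast local failure, not one caused by something like a service timeout. And although delivery fails, ordering works normally:
+  - Finally, besides the business metrics we add by hand, Micrometer also automatically provides many metrics about JVM internals. In the InfluxDB command-line client you can see the following tables (metrics): the first 8 are the business metrics we created, and the rest are JVM and component
+    state metrics created for us by the framework:
+  - We can pick some of these metrics as needed and configure an application monitoring dashboard in Grafana:
+    - 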
![Component status monitoring](pic/组件的状态监控10.png)
+
+#### E. A complete monitoring system
+- ![Component status monitoring](pic/组件的状态监控11.png)
 ### 3.25 Design - Asynchronous processing is handy but very easy to get wrong
 - Asynchronous processing is an indispensable architectural pattern for internet applications; most business projects are built from three complementary modes: synchronous processing, asynchronous processing, and scheduled tasks.
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/CommonMistakesApplication.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/CommonMistakesApplication.java
new file mode 100644
index 0000000..f2f7f0f
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/CommonMistakesApplication.java
@@ -0,0 +1,23 @@
+package com.baiye.demo.case24.health;
+
+import com.baiye.demo.utils.Utils;
+import org.springframework.boot.SpringApplication;
+import org.springframework.boot.autoconfigure.SpringBootApplication;
+import org.springframework.context.annotation.Bean;
+import org.springframework.web.client.RestTemplate;
+
+@SpringBootApplication
+public class CommonMistakesApplication {
+
+    public static void main(String[] args) {
+        Utils.loadPropertySource(CommonMistakesApplication.class, "actuator.properties");
+
+        SpringApplication.run(CommonMistakesApplication.class, args);
+    }
+
+    @Bean
+    public RestTemplate restTemplate() {
+        return new RestTemplate();
+    }
+}
+
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolHealthIndicator.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolHealthIndicator.java
new file mode 100644
index 0000000..26273c0
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolHealthIndicator.java
@@ -0,0 +1,32 @@
+package com.baiye.demo.case24.health;
+
+import org.springframework.boot.actuate.health.Health;
+import org.springframework.boot.actuate.health.HealthIndicator;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.ThreadPoolExecutor;
+
+public class ThreadPoolHealthIndicator implements HealthIndicator {
+    private 
ThreadPoolExecutor threadPool;
+
+    public ThreadPoolHealthIndicator(ThreadPoolExecutor threadPool) {
+        this.threadPool = threadPool;
+    }
+
+    @Override
+    public Health health() {
+        // Supplementary details
+        Map<String, Object> detail = new HashMap<>();
+        // Current number of elements in the queue
+        detail.put("queue_size", threadPool.getQueue().size());
+        // Remaining capacity of the queue
+        detail.put("queue_remaining", threadPool.getQueue().remainingCapacity());
+        // Return UP if any capacity remains, otherwise DOWN
+        if (threadPool.getQueue().remainingCapacity() > 0) {
+            return Health.up().withDetails(detail).build();
+        } else {
+            return Health.down().withDetails(detail).build();
+        }
+    }
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolProvider.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolProvider.java
new file mode 100644
index 0000000..23c746d
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolProvider.java
@@ -0,0 +1,35 @@
+package com.baiye.demo.case24.health;
+
+import jodd.util.concurrent.ThreadFactoryBuilder;
+
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+public class ThreadPoolProvider {
+    /**
+     * Thread pool with a single worker thread and a queue of length 10
+     */
+    private static ThreadPoolExecutor demoThreadPool = new ThreadPoolExecutor(
+            1, 1,
+            2, TimeUnit.SECONDS,
+            new ArrayBlockingQueue<>(10),
+            new ThreadFactoryBuilder().setNameFormat("demo-threadpool-%d").get());
+
+    /**
+     * Thread pool with 10 core threads, at most 50 threads, and a queue of length 100
+     */
+    private static ThreadPoolExecutor ioThreadPool = new ThreadPoolExecutor(
+            10, 50,
+            2, TimeUnit.SECONDS,
+            new ArrayBlockingQueue<>(100),
+            new ThreadFactoryBuilder().setNameFormat("io-threadpool-%d").get());
+
+    public static ThreadPoolExecutor getDemoThreadPool() {
+        return demoThreadPool;
+    }
+
+    public static ThreadPoolExecutor getIOThreadPool() {
+        return ioThreadPool;
+    }
+}
diff --git 
a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolsHealthContributor.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolsHealthContributor.java
new file mode 100644
index 0000000..7a1039e
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/ThreadPoolsHealthContributor.java
@@ -0,0 +1,38 @@
+package com.baiye.demo.case24.health;
+
+import org.springframework.boot.actuate.health.CompositeHealthContributor;
+import org.springframework.boot.actuate.health.HealthContributor;
+import org.springframework.boot.actuate.health.NamedContributor;
+import org.springframework.stereotype.Component;
+
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.Map;
+
+@Component
+public class ThreadPoolsHealthContributor implements CompositeHealthContributor {
+
+    /**
+     * Holds all child HealthContributors
+     */
+    private Map<String, HealthContributor> contributors = new HashMap<>();
+
+    ThreadPoolsHealthContributor() {
+        // One entry for each of the two thread pools defined in ThreadPoolProvider
+        this.contributors.put("demoThreadPool", new ThreadPoolHealthIndicator(ThreadPoolProvider.getDemoThreadPool()));
+        this.contributors.put("ioThreadPool", new ThreadPoolHealthIndicator(ThreadPoolProvider.getIOThreadPool()));
+    }
+
+    @Override
+    public HealthContributor getContributor(String name) {
+        // Look up a single HealthContributor by name
+        return contributors.get(name);
+    }
+
+    @Override
+    public Iterator<NamedContributor<HealthContributor>> iterator() {
+        // Return an iterator of NamedContributor, i.e. a Contributor instance plus a name
+        return contributors.entrySet().stream()
+                .map((entry) -> NamedContributor.of(entry.getKey(), entry.getValue())).iterator();
+    }
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/User.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/User.java
new file mode 100644
index 0000000..488904e
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/User.java
@@ -0,0 +1,13 @@
+package com.baiye.demo.case24.health;
+
+import lombok.AllArgsConstructor;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+@Data
+@AllArgsConstructor
+@NoArgsConstructor
+public class User {
+    private long userId;
+    private String userName;
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/UserServiceController.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/UserServiceController.java
new file mode 100644
index 0000000..eee53b6
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/UserServiceController.java
@@ -0,0 +1,33 @@
+package com.baiye.demo.case24.health;
+
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.web.bind.annotation.GetMapping;
+import org.springframework.web.bind.annotation.RequestMapping;
+import org.springframework.web.bind.annotation.RequestParam;
+import org.springframework.web.bind.annotation.RestController;
+
+import java.util.concurrent.ThreadLocalRandom;
+import java.util.concurrent.TimeUnit;
+
+@Slf4j
+@RestController
+@RequestMapping("user")
+public class UserServiceController {
+    @GetMapping
+    public User getUser(@RequestParam("userId") long id) {
+        if (ThreadLocalRandom.current().nextInt() % 2 == 0)
+            return new User(id, "name" + id);
+        else
+            throw new RuntimeException("error");
+    }
+
+    @GetMapping("slowTask")
+    public void slowTask() {
+        ThreadPoolProvider.getDemoThreadPool().execute(() -> {
+            try {
+                TimeUnit.HOURS.sleep(1);
+            } catch (InterruptedException e) {
+            }
+        });
+    }
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/UserServiceHealthIndicator.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/UserServiceHealthIndicator.java
new file mode 100644
index 0000000..30448ec
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/UserServiceHealthIndicator.java
@@ -0,0 +1,40 @@
+package 
com.baiye.demo.case24.health;
+
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.beans.factory.annotation.Autowired;
+import org.springframework.boot.actuate.health.Health;
+import org.springframework.boot.actuate.health.HealthIndicator;
+import org.springframework.stereotype.Component;
+import org.springframework.web.client.RestTemplate;
+
+@Component
+@Slf4j
+public class UserServiceHealthIndicator implements HealthIndicator {
+    @Autowired
+    private RestTemplate restTemplate;
+
+    @Override
+    public Health health() {
+        long begin = System.currentTimeMillis();
+        long userId = 1L;
+        User user = null;
+        try {
+            // Call the remote endpoint
+            user = restTemplate.getForObject("http://localhost:45678/user?userId=" + userId, User.class);
+            if (user != null && user.getUserId() == userId) {
+                // Correct result: return UP, with the elapsed time and the user as details
+                return Health.up()
+                        .withDetail("user", user)
+                        .withDetail("took", System.currentTimeMillis() - begin)
+                        .build();
+            } else {
+                // Incorrect result: return DOWN, with the elapsed time as a detail
+                return Health.down().withDetail("took", System.currentTimeMillis() - begin).build();
+            }
+        } catch (Exception ex) {
+            // On exception: log it first, then return DOWN with the exception and the elapsed time as details
+            log.warn("health check failed!", ex);
+            return Health.down(ex).withDetail("took", System.currentTimeMillis() - begin).build();
+        }
+    }
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/actuator.properties b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/actuator.properties
new file mode 100644
index 0000000..1851fd9
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/health/actuator.properties
@@ -0,0 +1 @@
+spring.mvc.throw-exception-if-no-handler-found=true
\ No newline at end of file
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/CommonMistakesApplication.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/CommonMistakesApplication.java
new file mode 100644
index 0000000..f2e689b
--- /dev/null
+++ 
b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/CommonMistakesApplication.java
@@ -0,0 +1,23 @@
+package com.baiye.demo.case24.info;
+
+import org.geekbang.time.commonmistakes.common.Utils;
+import org.springframework.boot.SpringApplication;
+import org.springframework.boot.autoconfigure.SpringBootApplication;
+import org.springframework.context.annotation.Bean;
+import org.springframework.web.client.RestTemplate;
+
+@SpringBootApplication
+public class CommonMistakesApplication {
+
+    public static void main(String[] args) {
+        Utils.loadPropertySource(CommonMistakesApplication.class, "actuator.properties");
+
+        SpringApplication.run(CommonMistakesApplication.class, args);
+    }
+
+    @Bean
+    public RestTemplate restTemplate() {
+        return new RestTemplate();
+    }
+}
+
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/TestEndpoint.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/TestEndpoint.java
new file mode 100644
index 0000000..5118a04
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/TestEndpoint.java
@@ -0,0 +1,24 @@
+package com.baiye.demo.case24.info;
+
+import org.springframework.boot.actuate.endpoint.annotation.Endpoint;
+import org.springframework.boot.actuate.endpoint.annotation.ReadOperation;
+import org.springframework.boot.actuate.endpoint.annotation.WriteOperation;
+import org.springframework.stereotype.Component;
+
+import java.util.concurrent.atomic.AtomicLong;
+
+@Endpoint(id = "adder")
+@Component
+public class TestEndpoint {
+    private static AtomicLong atomicLong = new AtomicLong();
+
+    @ReadOperation
+    public String get() {
+        return String.valueOf(atomicLong.get());
+    }
+
+    @WriteOperation
+    public String increment() {
+        return String.valueOf(atomicLong.incrementAndGet());
+    }
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/ThreadPoolInfoContributor.java 
b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/ThreadPoolInfoContributor.java
new file mode 100644
index 0000000..55457ab
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/ThreadPoolInfoContributor.java
@@ -0,0 +1,33 @@
+package com.baiye.demo.case24.info;
+
+import org.springframework.boot.actuate.info.Info;
+import org.springframework.boot.actuate.info.InfoContributor;
+import org.springframework.stereotype.Component;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.ThreadPoolExecutor;
+
+@Component
+public class ThreadPoolInfoContributor implements InfoContributor {
+    private static Map<String, Object> threadPoolInfo(ThreadPoolExecutor threadPool) {
+        Map<String, Object> info = new HashMap<>();
+        // Current pool size
+        info.put("poolSize", threadPool.getPoolSize());
+        // Configured core pool size
+        info.put("corePoolSize", threadPool.getCorePoolSize());
+        // Largest pool size ever reached
+        info.put("largestPoolSize", threadPool.getLargestPoolSize());
+        // Configured maximum pool size
+        info.put("maximumPoolSize", threadPool.getMaximumPoolSize());
+        // Total number of completed tasks
+        info.put("completedTaskCount", threadPool.getCompletedTaskCount());
+        return info;
+    }
+
+    @Override
+    public void contribute(Info.Builder builder) {
+        builder.withDetail("demoThreadPool", threadPoolInfo(ThreadPoolProvider.getDemoThreadPool()));
+        builder.withDetail("ioThreadPool", threadPoolInfo(ThreadPoolProvider.getIOThreadPool()));
+    }
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/ThreadPoolProvider.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/ThreadPoolProvider.java
new file mode 100644
index 0000000..b4346d6
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/ThreadPoolProvider.java
@@ -0,0 +1,30 @@
+package com.baiye.demo.case24.info;
+
+import jodd.util.concurrent.ThreadFactoryBuilder;
+
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import 
java.util.concurrent.TimeUnit;
+
+public class ThreadPoolProvider {
+    private static ThreadPoolExecutor demoThreadPool = new ThreadPoolExecutor(
+            1, 1,
+            2, TimeUnit.SECONDS,
+            new ArrayBlockingQueue<>(10),
+            new ThreadFactoryBuilder().setNameFormat("demo-threadpool-%d").get());
+
+    private static ThreadPoolExecutor ioThreadPool = new ThreadPoolExecutor(
+            10, 50,
+            2, TimeUnit.SECONDS,
+            new ArrayBlockingQueue<>(100),
+            new ThreadFactoryBuilder().setNameFormat("io-threadpool-%d").get());
+
+    public static ThreadPoolExecutor getDemoThreadPool() {
+        return demoThreadPool;
+    }
+
+    public static ThreadPoolExecutor getIOThreadPool() {
+        return ioThreadPool;
+    }
+
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/User.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/User.java
new file mode 100644
index 0000000..90a389f
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/User.java
@@ -0,0 +1,13 @@
+package com.baiye.demo.case24.info;
+
+import lombok.AllArgsConstructor;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+@Data
+@AllArgsConstructor
+@NoArgsConstructor
+public class User {
+    private long userId;
+    private String userName;
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/UserServiceController.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/UserServiceController.java
new file mode 100644
index 0000000..d5158cd
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/UserServiceController.java
@@ -0,0 +1,34 @@
+package com.baiye.demo.case24.info;
+
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.web.bind.annotation.GetMapping;
+import org.springframework.web.bind.annotation.RequestMapping;
+import org.springframework.web.bind.annotation.RequestParam;
+import org.springframework.web.bind.annotation.RestController;
+
+import 
java.util.concurrent.ThreadLocalRandom;
+import java.util.concurrent.TimeUnit;
+
+@Slf4j
+@RestController
+@RequestMapping("user")
+public class UserServiceController {
+    @GetMapping
+    public User getUser(@RequestParam("userId") long id) {
+        // Return a correct response half of the time and throw an exception the other half
+        if (ThreadLocalRandom.current().nextInt() % 2 == 0)
+            return new User(id, "name" + id);
+        else
+            throw new RuntimeException("error");
+    }
+
+    @GetMapping("slowTask")
+    public void slowTask() {
+        ThreadPoolProvider.getDemoThreadPool().execute(() -> {
+            try {
+                TimeUnit.HOURS.sleep(1);
+            } catch (InterruptedException e) {
+            }
+        });
+    }
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/actuator.properties b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/actuator.properties
new file mode 100644
index 0000000..8c449e2
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/info/actuator.properties
@@ -0,0 +1,5 @@
+management.server.port=45679
+management.endpoints.web.exposure.include=*
+management.endpoint.health.show-details=always
+management.endpoints.web.base-path=/admin
+spring.jmx.enabled=true
\ No newline at end of file
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/CommonMistakesApplication.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/CommonMistakesApplication.java
new file mode 100644
index 0000000..98067f1
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/CommonMistakesApplication.java
@@ -0,0 +1,22 @@
+package com.baiye.demo.case24.metrics;
+
+import org.geekbang.time.commonmistakes.common.Utils;
+import org.springframework.boot.SpringApplication;
+import org.springframework.boot.autoconfigure.SpringBootApplication;
+import org.springframework.context.annotation.Bean;
+import org.springframework.web.client.RestTemplate;
+
+@SpringBootApplication
+public class CommonMistakesApplication {
+
+    public 
static void main(String[] args) {
+        Utils.loadPropertySource(CommonMistakesApplication.class, "influxdb.properties");
+        SpringApplication.run(CommonMistakesApplication.class, args);
+    }
+
+    @Bean
+    public RestTemplate restTemplate() {
+        return new RestTemplate();
+    }
+}
+
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/Consts.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/Consts.java
new file mode 100644
index 0000000..9802ca5
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/Consts.java
@@ -0,0 +1,7 @@
+package com.baiye.demo.case24.metrics;
+
+public class Consts {
+    public static final String QUEUE = "createOrder";
+    public static final String EXCHANGE = "createOrder";
+    public static final String ROUTING_KEY = "createOrder";
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/DeliverOrderHandler.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/DeliverOrderHandler.java
new file mode 100644
index 0000000..67d5425
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/DeliverOrderHandler.java
@@ -0,0 +1,66 @@
+package com.baiye.demo.case24.metrics;
+
+import io.micrometer.core.instrument.Metrics;
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.amqp.rabbit.annotation.RabbitListener;
+import org.springframework.web.bind.annotation.PostMapping;
+import org.springframework.web.bind.annotation.RequestMapping;
+import org.springframework.web.bind.annotation.RequestParam;
+import org.springframework.web.bind.annotation.RestController;
+
+import javax.annotation.PostConstruct;
+import java.time.Duration;
+import java.time.Instant;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+
+/**
+ * Message handler for the delivery service
+ */
+@RestController
+@Slf4j
+@RequestMapping("deliver")
+public class DeliverOrderHandler {
+
+    /**
+     * 
Running state of the delivery service
+     */
+    private volatile boolean deliverStatus = true;
+    private AtomicLong deliverCounter = new AtomicLong();
+
+    /**
+     * An external endpoint that changes the delivery state, to simulate the delivery service going out of service
+     */
+    @PostMapping("status")
+    public void status(@RequestParam("status") boolean status) {
+        deliverStatus = status;
+    }
+
+    @PostConstruct
+    public void init() {
+        //fixme Metric 1: register a gauge deliverOrder.totalSuccess for the total number of deliveries; it only needs to be registered once
+        Metrics.gauge("deliverOrder.totalSuccess", deliverCounter);
+    }
+
+    /**
+     * Listen for MQ messages
+     */
+    @RabbitListener(queues = Consts.QUEUE, concurrency = "5")
+    public void deliverOrder(Order order) {
+        Instant begin = Instant.now();
+        //fixme Metric 2: increment deliverOrder.received to record one received order message (counter)
+        Metrics.counter("deliverOrder.received").increment();
+        try {
+            if (!deliverStatus)
+                throw new RuntimeException("deliver outofservice");
+            TimeUnit.MILLISECONDS.sleep(500);
+            deliverCounter.incrementAndGet();
+            //fixme Metric 3: delivery success metric deliverOrder.success (timer)
+            Metrics.timer("deliverOrder.success").record(Duration.between(begin, Instant.now()));
+        } catch (Exception ex) {
+            log.error("deliver Order {} failed", order, ex);
+            //fixme Metric 4: delivery failure metric deliverOrder.failed, with the failure reason attached as a tag (timer)
+            Metrics.timer("deliverOrder.failed", "reason", ex.getMessage()).record(Duration.between(begin, Instant.now()));
+        }
+    }
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/Order.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/Order.java
new file mode 100644
index 0000000..b080f95
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/Order.java
@@ -0,0 +1,12 @@
+package com.baiye.demo.case24.metrics;
+
+import lombok.Data;
+
+import java.io.Serializable;
+
+@Data
+public class Order implements Serializable {
+    private Long id;
+    private Long userId;
+    private Long merchantId;
+}
diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/OrderController.java 
b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/OrderController.java
new file mode 100644
index 0000000..5799747
--- /dev/null
+++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/OrderController.java
@@ -0,0 +1,85 @@
+package com.baiye.demo.case24.metrics;
+
+import io.micrometer.core.instrument.Metrics;
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.amqp.rabbit.core.RabbitAdmin;
+import org.springframework.amqp.rabbit.core.RabbitTemplate;
+import org.springframework.beans.factory.annotation.Autowired;
+import org.springframework.web.bind.annotation.GetMapping;
+import org.springframework.web.bind.annotation.RequestMapping;
+import org.springframework.web.bind.annotation.RequestParam;
+import org.springframework.web.bind.annotation.RestController;
+import org.springframework.web.client.RestTemplate;
+
+import javax.annotation.PostConstruct;
+import java.time.Duration;
+import java.time.Instant;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+
+/**
+ * Order placement, plus the merchant service endpoint
+ */
+@Slf4j
+@RestController
+@RequestMapping("order")
+public class OrderController {
+    /**
+     * Total number of orders created
+     */
+    private AtomicLong createOrderCounter = new AtomicLong();
+    private RabbitAdmin rabbitAdmin;
+    @Autowired
+    private RabbitTemplate rabbitTemplate;
+    @Autowired
+    private RestTemplate restTemplate;
+
+    @PostConstruct
+    public void init() {
+        // fixme Metric 1: register the createOrder.totalSuccess gauge; a gauge only needs to be initialized once like this, bound directly to the AtomicLong reference
+        Metrics.gauge("createOrder.totalSuccess", createOrderCounter);
+    }
+
+    /**
+     * Order endpoint; takes the user ID and merchant ID as parameters
+     */
+    @GetMapping("createOrder")
+    public void createOrder(@RequestParam("userId") long userId, @RequestParam("merchantId") long merchantId) {
+        //fixme Metric 2: record one createOrder.received, meaning an order request was received
+        Metrics.counter("createOrder.received").increment();
+        Instant begin = Instant.now();
+        try {
+            TimeUnit.MILLISECONDS.sleep(200);
+            // Simulate an invalid user: IDs below 10 are invalid
+            if (userId < 10)
+                
throw new RuntimeException("invalid user"); + //Query the merchant service + Boolean merchantStatus = restTemplate.getForObject("http://localhost:45678/order/getMerchantStatus?merchantId=" + merchantId, Boolean.class); + if (merchantStatus == null || !merchantStatus) + throw new RuntimeException("closed merchant"); + Order order = new Order(); + // The gauge picks up this change automatically + order.setId(createOrderCounter.incrementAndGet()); + order.setUserId(userId); + order.setMerchantId(merchantId); + //Send the MQ message + rabbitTemplate.convertAndSend(Consts.EXCHANGE, Consts.ROUTING_KEY, order); + // fixme Metric 3: record the timer createOrder.success, meaning the order succeeded, together with the elapsed time + Metrics.timer("createOrder.success").record(Duration.between(begin, Instant.now())); + } catch (Exception ex) { + log.error("createOrder userId {} failed", userId, ex); + //fixme Metric 4: record the timer createOrder.failed, meaning the order failed, together with the elapsed time and the failure reason as a tag + Metrics.timer("createOrder.failed", "reason", ex.getMessage()).record(Duration.between(begin, Instant.now())); + } + } + + /** + * Merchant status lookup endpoint + */ + @GetMapping("getMerchantStatus") + public boolean getMerchantStatus(@RequestParam("merchantId") long merchantId) throws InterruptedException { + //Only the merchant with ID 2 is open + TimeUnit.MILLISECONDS.sleep(200); + return merchantId == 2; + } +} diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/RabbitConfiguration.java b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/RabbitConfiguration.java new file mode 100644 index 0000000..98c5d76 --- /dev/null +++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/RabbitConfiguration.java @@ -0,0 +1,26 @@ +package com.baiye.demo.case24.metrics; + +import org.springframework.amqp.core.*; +import org.springframework.context.annotation.Bean; +import org.springframework.context.annotation.Configuration; + +@Configuration +public class RabbitConfiguration { + //Queue + @Bean + public Queue queue() { + return new Queue(Consts.QUEUE); + } + + //Exchange + @Bean + public Exchange exchange() { + return
ExchangeBuilder.directExchange(Consts.EXCHANGE).durable(true).build(); + } + + //Binding + @Bean + public Binding binding() { + return BindingBuilder.bind(queue()).to(exchange()).with(Consts.ROUTING_KEY).noargs(); + } +} diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/influxdb.properties b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/influxdb.properties new file mode 100644 index 0000000..b71592f --- /dev/null +++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/influxdb.properties @@ -0,0 +1,3 @@ +management.metrics.export.influx.enabled=true +management.metrics.export.influx.uri=http://localhost:8086 +management.metrics.export.influx.step=1S \ No newline at end of file diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/业务监控.json b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/业务监控.json new file mode 100644 index 0000000..75a698e --- /dev/null +++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/业务监控.json @@ -0,0 +1,923 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 1, + "links": [], + "panels": [ + { + "gridPos": { + "h": 4, + "w": 12, + "x": 0, + "y": 0 + }, + "id": 6, + "links": [], + "options": { + "fieldOptions": { + "calcs": [ + "last" + ], + "defaults": { + "max": 100, + "min": 0 + }, + "mappings": [], + "override": {}, + "thresholds": [ + { + "color": "green", + "index": 0, + "value": null + } + ], + "values": false + }, + "orientation": "auto", + "showThresholdLabels": false, + "showThresholdMarkers": true + }, + "pluginVersion": "6.2.4", + "targets": [ + { + "alias": "", + "groupBy": [], + "measurement": "createOrder_totalSuccess", +
"orderByTime": "ASC", + "policy": "default", + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "value" + ], + "type": "field" + }, + { + "params": [], + "type": "last" + } + ] + ], + "tags": [] + } + ], + "timeFrom": null, + "timeShift": null, + "title": "成功单量", + "type": "gauge" + }, + { + "gridPos": { + "h": 4, + "w": 12, + "x": 12, + "y": 0 + }, + "id": 7, + "links": [], + "options": { + "fieldOptions": { + "calcs": [ + "last" + ], + "defaults": { + "max": 100, + "min": 0 + }, + "mappings": [], + "override": {}, + "thresholds": [ + { + "color": "green", + "index": 0, + "value": null + } + ], + "values": false + }, + "orientation": "auto", + "showThresholdLabels": false, + "showThresholdMarkers": true + }, + "pluginVersion": "6.2.4", + "targets": [ + { + "alias": "", + "groupBy": [], + "measurement": "deliverOrder_totalSuccess", + "orderByTime": "ASC", + "policy": "default", + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "value" + ], + "type": "field" + }, + { + "params": [], + "type": "last" + } + ] + ], + "tags": [] + } + ], + "timeFrom": null, + "timeShift": null, + "title": "成功配送", + "type": "gauge" + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "InfluxDB", + "fill": 1, + "gridPos": { + "h": 6, + "w": 12, + "x": 0, + "y": 4 + }, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": {}, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "alias": "收到", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": 
"createOrder_received", + "orderByTime": "ASC", + "policy": "default", + "query": "SELECT sum(\"value\") FROM \"createOrder_received\" WHERE $timeFilter GROUP BY time($__interval) fill(null)", + "rawQuery": false, + "refId": "B", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "value" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + }, + { + "alias": "成功", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "createOrder_success", + "orderByTime": "ASC", + "policy": "default", + "query": "SELECT sum(\"count\") FROM \"createOrder_success\" WHERE $timeFilter GROUP BY time($__interval) fill(null)", + "rawQuery": false, + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "count" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + }, + { + "alias": "失败:$tag_reason", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "reason" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "createOrder_failed", + "orderByTime": "ASC", + "policy": "default", + "query": "SELECT sum(\"count\") FROM \"createOrder_failed\" WHERE $timeFilter GROUP BY time($__interval), \"reason\" fill(null)", + "rawQuery": false, + "refId": "C", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "count" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "下单次数", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": 
null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "fill": 1, + "gridPos": { + "h": 6, + "w": 12, + "x": 12, + "y": 4 + }, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": {}, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "alias": "成功", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "area" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "createOrder_success", + "orderByTime": "ASC", + "policy": "default", + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "upper" + ], + "type": "field" + }, + { + "params": [], + "type": "max" + } + ] + ], + "tags": [] + }, + { + "alias": "失败:$tag_reason", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "reason" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "createOrder_failed", + "orderByTime": "ASC", + "policy": "default", + "refId": "C", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "upper" + ], + "type": "field" + }, + { + "params": [], + "type": "max" + } + ] + ], + "tags": [] + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "下单性能", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": 
null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "ms", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "fill": 1, + "gridPos": { + "h": 6, + "w": 12, + "x": 0, + "y": 10 + }, + "id": 8, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": {}, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "alias": "成功", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "area" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "deliverOrder_success", + "orderByTime": "ASC", + "policy": "default", + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "count" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + }, + { + "alias": "失败 $tag_reason", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "reason" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "deliverOrder_failed", + "orderByTime": "ASC", + "policy": "default", + "refId": "C", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "count" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + }, + { + "alias": "收到", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": 
"time" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "deliverOrder_received", + "orderByTime": "ASC", + "policy": "default", + "refId": "B", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "value" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "配送次数", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "fill": 1, + "gridPos": { + "h": 6, + "w": 12, + "x": 12, + "y": 10 + }, + "id": 9, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": {}, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "alias": "成功", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "area" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "deliverOrder_success", + "orderByTime": "ASC", + "policy": "default", + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "upper" + ], + "type": "field" + }, + { + "params": [], + "type": "max" + } + ] + ], + 
"tags": [] + }, + { + "alias": "失败:$tag_reason", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "reason" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "deliverOrder_failed", + "orderByTime": "ASC", + "policy": "default", + "refId": "C", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "upper" + ], + "type": "field" + }, + { + "params": [], + "type": "max" + } + ] + ], + "tags": [] + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "配送性能", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "ms", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "refresh": "10s", + "schemaVersion": 18, + "style": "dark", + "tags": [], + "templating": { + "list": [] + }, + "time": { + "from": "now-30m", + "to": "now" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "", + "title": "业务监控", + "uid": "W13pcS8Zk", + "version": 35 +} \ No newline at end of file diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/应用监控.json b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/应用监控.json new file mode 100644 index 0000000..3a644fb --- /dev/null +++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/metrics/应用监控.json @@ -0,0 +1,621 @@ +{ + "annotations": { + "list": [ + { + 
"builtIn": 1, + "datasource": "-- Grafana --", + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "type": "dashboard" + } + ] + }, + "editable": true, + "gnetId": null, + "graphTooltip": 0, + "id": 2, + "links": [], + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "InfluxDB", + "fill": 1, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 0 + }, + "id": 2, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": {}, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "alias": "$tag_state", + "groupBy": [ + { + "params": [ + "$__interval" + ], + "type": "time" + }, + { + "params": [ + "state" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "jvm_threads_states", + "orderByTime": "ASC", + "policy": "default", + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "value" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "JVM 线程数", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + 
{ + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "InfluxDB", + "fill": 1, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 0 + }, + "id": 4, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": {}, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "alias": "$tag_action", + "groupBy": [ + { + "params": [ + "$__interval" + ], + "type": "time" + }, + { + "params": [ + "action" + ], + "type": "tag" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "jvm_gc_pause", + "orderByTime": "ASC", + "policy": "default", + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "count" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "JVM GC次数", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": 0, + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "InfluxDB", + "fill": 1, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 8 + }, + "id": 3, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + 
"show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": {}, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "alias": "发布", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "rabbitmq_published", + "orderByTime": "ASC", + "policy": "default", + "query": "SELECT sum(\"value\") FROM \"jdbc_connections_min\" WHERE $timeFilter GROUP BY time($__interval), \"metric_type\" fill(null)", + "rawQuery": false, + "refId": "B", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "value" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + }, + { + "alias": "确认", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "rabbitmq_acknowledged", + "orderByTime": "ASC", + "policy": "default", + "query": "SELECT sum(\"value\") FROM \"jdbc_connections_min\" WHERE $timeFilter GROUP BY time($__interval), \"metric_type\" fill(null)", + "rawQuery": false, + "refId": "A", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "value" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + }, + { + "alias": "消费", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "rabbitmq_consumed", + "orderByTime": "ASC", + "policy": "default", + "query": "SELECT sum(\"value\") FROM \"jdbc_connections_min\" WHERE $timeFilter GROUP BY time($__interval), \"metric_type\" fill(null)", + "rawQuery": false, + "refId": "C", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "value" + ], + 
"type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "RMQ 消息 QPM", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "none", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "InfluxDB", + "fill": 1, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 8 + }, + "id": 5, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": {}, + "percentage": false, + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "alias": "", + "groupBy": [ + { + "params": [ + "1m" + ], + "type": "time" + }, + { + "params": [ + "null" + ], + "type": "fill" + } + ], + "measurement": "http_server_requests", + "orderByTime": "ASC", + "policy": "default", + "query": "SELECT sum(\"value\") FROM \"jdbc_connections_min\" WHERE $timeFilter GROUP BY time($__interval), \"metric_type\" fill(null)", + "rawQuery": false, + "refId": "B", + "resultFormat": "time_series", + "select": [ + [ + { + "params": [ + "count" + ], + "type": "field" + }, + { + "params": [], + "type": "sum" + } + ] + ], + "tags": [] + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "HTTP 请求 
QPM", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "none", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "refresh": false, + "schemaVersion": 18, + "style": "dark", + "tags": [], + "templating": { + "list": [] + }, + "time": { + "from": "2020-02-04T11:43:31.520Z", + "to": "2020-02-04T12:02:45.177Z" + }, + "timepicker": { + "refresh_intervals": [ + "5s", + "10s", + "30s", + "1m", + "5m", + "15m", + "30m", + "1h", + "2h", + "1d" + ], + "time_options": [ + "5m", + "15m", + "1h", + "6h", + "12h", + "24h", + "2d", + "7d", + "30d" + ] + }, + "timezone": "", + "title": "应用监控", + "uid": "WiO4uOUWk", + "version": 2 +} \ No newline at end of file diff --git a/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/readme.md b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/readme.md new file mode 100644 index 0000000..0255c2d --- /dev/null +++ b/code-language/java/java-demo/src/main/java/com/baiye/demo/case24/readme.md @@ -0,0 +1,5 @@ +## Is the application production-ready once the business code is written? 
+- Preparation: configuring Spring Boot Actuator: N/A +- Health checks need to reach key components: health +- Exposing the state of important internal components: info +- Metrics are the "golden key" to locating problems quickly: metrics \ No newline at end of file diff --git a/code-language/java/pic/组件的状态监控.png b/code-language/java/pic/组件的状态监控.png new file mode 100644 index 0000000..d6e776d Binary files /dev/null and b/code-language/java/pic/组件的状态监控.png differ diff --git a/code-language/java/pic/组件的状态监控1.png b/code-language/java/pic/组件的状态监控1.png new file mode 100644 index 0000000..74f10ca Binary files /dev/null and b/code-language/java/pic/组件的状态监控1.png differ diff --git a/code-language/java/pic/组件的状态监控10.png b/code-language/java/pic/组件的状态监控10.png new file mode 100644 index 0000000..464d009 Binary files /dev/null and b/code-language/java/pic/组件的状态监控10.png differ diff --git a/code-language/java/pic/组件的状态监控11.png b/code-language/java/pic/组件的状态监控11.png new file mode 100644 index 0000000..0ff36e7 Binary files /dev/null and b/code-language/java/pic/组件的状态监控11.png differ diff --git a/code-language/java/pic/组件的状态监控2.png b/code-language/java/pic/组件的状态监控2.png new file mode 100644 index 0000000..d7306fb Binary files /dev/null and b/code-language/java/pic/组件的状态监控2.png differ diff --git a/code-language/java/pic/组件的状态监控3.png b/code-language/java/pic/组件的状态监控3.png new file mode 100644 index 0000000..fa64501 Binary files /dev/null and b/code-language/java/pic/组件的状态监控3.png differ diff --git a/code-language/java/pic/组件的状态监控4.png b/code-language/java/pic/组件的状态监控4.png new file mode 100644 index 0000000..9a633a9 Binary files /dev/null and b/code-language/java/pic/组件的状态监控4.png differ diff --git a/code-language/java/pic/组件的状态监控5.png b/code-language/java/pic/组件的状态监控5.png new file mode 100644 index 0000000..ca804a7 Binary files /dev/null and b/code-language/java/pic/组件的状态监控5.png differ diff --git a/code-language/java/pic/组件的状态监控6.png b/code-language/java/pic/组件的状态监控6.png new file mode 100644 index 0000000..3b1af2e Binary files /dev/null and b/code-language/java/pic/组件的状态监控6.png differ diff --git
a/code-language/java/pic/组件的状态监控7.png b/code-language/java/pic/组件的状态监控7.png new file mode 100644 index 0000000..eec67bb Binary files /dev/null and b/code-language/java/pic/组件的状态监控7.png differ diff --git a/code-language/java/pic/组件的状态监控8.png b/code-language/java/pic/组件的状态监控8.png new file mode 100644 index 0000000..da297cd Binary files /dev/null and b/code-language/java/pic/组件的状态监控8.png differ diff --git a/code-language/java/pic/组件的状态监控9.png b/code-language/java/pic/组件的状态监控9.png new file mode 100644 index 0000000..67d442f Binary files /dev/null and b/code-language/java/pic/组件的状态监控9.png differ