* Add third-party libraries * Add tool documentation * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md
Showing 6 changed files with 130,415 additions and 2 deletions.
@@ -89,6 +89,10 @@ kworker/u256:1 15144 13516 1131
node 14221 2589 3355
```

How it works:

[Preemptive scheduling analysis](docs/preempt_time.md)

### 3. **Scheduling delay statistics:**

Analyzes the scheduling delay of processes in the system and reports statistics: the maximum, minimum, and average scheduling delay of the current system.

@@ -107,6 +111,9 @@ node 14221 2589 3355
17:31:35 362.039000 217053.545000 6.462000
17:31:36 373.751000 217053.545000 6.462000
```
How it works:

[Scheduling delay analysis](docs/schedule_delay.md)

### 4. **System call response time statistics:**

@@ -142,7 +149,7 @@ Time Pid comm syscall_id delay/us

How it works:

[lmp/eBPF_Supermarket/CPU_Subsystem/cpu_watcher/docs/mq_delay功能介绍.md at develop · albertxu216/lmp (github.com)](https://github.com/albertxu216/lmp/blob/develop/eBPF_Supermarket/CPU_Subsystem/cpu_watcher/docs/mq_delay功能介绍.md)
[Message queue delay analysis](docs/mq_delay.md)

### 6. Measuring the execution time of the kernel function schedule()

@@ -222,4 +229,4 @@ per_len = 1000

If you are also interested in cpu_watcher or eBPF, you are welcome to join us in developing the cpu_watcher tool. We hope to grow together.

**cpu_watcher maintainers:**[email protected] [email protected] [email protected]
**cpu_watcher maintainers:** [email protected] [email protected] [email protected]
44 changes: 44 additions & 0 deletions
eBPF_Supermarket/CPU_Subsystem/cpu_watcher/docs/preempt_time.md
@@ -0,0 +1,44 @@
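## Introduction to the preempt_time tool

preempt_time measures the time taken by each preemptive context switch in the system.

### How it works

A BTF raw tracepoint monitors every scheduling event in the kernel:

```c
SEC("tp_btf/sched_switch")
```
The main difference between a BTF raw tracepoint and a regular raw tracepoint is that the BTF version can access kernel memory directly from the eBPF program, whereas a regular raw tracepoint needs helper functions such as `bpf_core_read` or `bpf_probe_read_kernel` to do so.
```c
int BPF_PROG(sched_switch, bool preempt, struct task_struct *prev, struct task_struct *next)
```

This event provides a `preempt` argument describing whether the switch is a preemption; its value determines whether the current scheduling event is recorded.

The second attach point is kprobe:finish_task_switch, the function that performs cleanup once the context switch has completed. At that point, the current time minus the timestamp previously stored in an eBPF map gives the duration of this preemptive switch:

```c
SEC("kprobe/finish_task_switch")
```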
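
The two-probe bookkeeping described above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the tool's actual eBPF code: a fixed-size array stands in for the eBPF map, the entry is keyed by the pid of the incoming task, and the names `on_sched_switch`, `on_finish_task_switch`, and `PREEMPT_SLOTS` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PREEMPT_SLOTS 64 /* hypothetical capacity of the stand-in map */

struct preempt_entry {
    bool     used;
    int      prev_pid;  /* task that was preempted */
    int      next_pid;  /* task that takes the CPU (lookup key, an assumption) */
    uint64_t start_ns;  /* timestamp taken in the sched_switch handler */
};

/* Stand-in for the eBPF map shared by the two probes. */
static struct preempt_entry preempt_table[PREEMPT_SLOTS];

static struct preempt_entry *preempt_lookup(int next_pid)
{
    for (size_t i = 0; i < PREEMPT_SLOTS; i++)
        if (preempt_table[i].used && preempt_table[i].next_pid == next_pid)
            return &preempt_table[i];
    return NULL;
}

/* Models the sched_switch handler: only preemptive switches are recorded.
 * Returns true when a start timestamp was stored. */
bool on_sched_switch(bool preempt, int prev_pid, int next_pid, uint64_t now_ns)
{
    if (!preempt)
        return false;

    /* Reuse an existing entry for this pid, otherwise take the first free slot. */
    struct preempt_entry *e = preempt_lookup(next_pid);
    for (size_t i = 0; e == NULL && i < PREEMPT_SLOTS; i++)
        if (!preempt_table[i].used)
            e = &preempt_table[i];
    if (e == NULL)
        return false; /* table full: drop the sample */

    e->used = true;
    e->prev_pid = prev_pid;
    e->next_pid = next_pid;
    e->start_ns = now_ns;
    return true;
}

/* Models the finish_task_switch handler: subtracts the stored timestamp.
 * Returns the duration in nanoseconds, or 0 when nothing was recorded. */
uint64_t on_finish_task_switch(int current_pid, uint64_t now_ns)
{
    struct preempt_entry *e = preempt_lookup(current_pid);
    if (e == NULL)
        return 0;

    uint64_t duration_ns = now_ns - e->start_ns;
    e->used = false; /* one measurement per recorded switch */
    return duration_ns;
}
```

Calling `on_sched_switch(true, 14221, 2589, t0)` followed by `on_finish_task_switch(2589, t1)` yields `t1 - t0`; a non-preemptive switch leaves the table untouched, so the second call returns 0.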
### Output
The tool reports the `pid` and name of the preempting process, the `pid` of the preempted process, and the duration of the preemption in nanoseconds.
```
COMM prev_pid next_pid duration_ns
node 14221 2589 3014
kworker/u256:1 15144 13516 1277
node 14221 2589 3115
kworker/u256:1 15144 13516 1125
kworker/u256:1 15144 13516 974
node 14221 2589 2560
kworker/u256:1 15144 13516 1132
node 14221 2589 2717
kworker/u256:1 15144 13516 1206
kworker/u256:1 15144 13516 1131
node 14221 2589 3355
```
57 changes: 57 additions & 0 deletions
eBPF_Supermarket/CPU_Subsystem/cpu_watcher/docs/schedule_delay.md
@@ -0,0 +1,57 @@
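## Introduction to the schedule_delay tool

The schedule_delay tool measures the system's current scheduling delay, that is, the time from when a task becomes runnable until it actually executes (is given the CPU).

Observing this metric in real time helps us understand the current load on the operating system.

### How it works

The only question is when a task is added to the run queue to wait for execution. The kernel provides two functions for this:

- A newly created process is added to the runqueue by a call to `wake_up_new_task` and waits to be scheduled.
- A process woken from sleep goes through `ttwu_do_wakeup` and enters the runqueue to wait to be scheduled.

The kernel provides a corresponding `tracepoint` for each of these two functions:

| Kernel function | Corresponding `tracepoint` |
| :--------------: | :--------------------: |
| wake_up_new_task | sched:sched_wakeup_new |
| ttwu_do_wakeup | sched:sched_wakeup |

When either tracepoint fires, the process's information and the time it entered the run queue are recorded.

In addition, when a process is **forced off the CPU**, its state is still `TASK_RUNNING`, so at schedule time we must also check whether that process needs to be recorded.

| Kernel function | Corresponding `tracepoint` |
| :------: | :----------------: |
| schedule | sched:sched_switch |

When this tracepoint fires, the information of the process about to take the CPU is recorded, and the current time minus the run-queue entry time stored in the eBPF map gives the scheduling delay. Here we also need to decide whether the previous process should be recorded.

```c
if(prev_state == TASK_RUNNING)// check the state of the process leaving the CPU
```

Finally, to keep the map from overflowing, the data recorded in the map must be deleted when a process exits.

| Kernel function | Corresponding `tracepoint` |
| :------: | :----------------------: |
| do_exit | sched:sched_process_exit |
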
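
The bookkeeping described in this section can be sketched in plain user-space C. This is a simplified model under stated assumptions rather than the tool's eBPF source: a fixed-size array replaces the eBPF map, and the names `record_enqueue`, `on_switch`, `on_process_exit`, and `RUNQ_SLOTS` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TASK_RUNNING 0  /* runnable state, same value as in the kernel */
#define RUNQ_SLOTS 128  /* hypothetical capacity of the stand-in map */

struct enq_entry {
    bool     used;
    int      pid;
    uint64_t enqueue_ns; /* time the task entered the run queue */
};

/* Stand-in for the eBPF map keyed by pid. */
static struct enq_entry enq_table[RUNQ_SLOTS];

static struct enq_entry *enq_find(int pid)
{
    for (size_t i = 0; i < RUNQ_SLOTS; i++)
        if (enq_table[i].used && enq_table[i].pid == pid)
            return &enq_table[i];
    return NULL;
}

/* Models sched_wakeup and sched_wakeup_new: remember when the task became runnable. */
void record_enqueue(int pid, uint64_t now_ns)
{
    struct enq_entry *e = enq_find(pid);
    for (size_t i = 0; e == NULL && i < RUNQ_SLOTS; i++)
        if (!enq_table[i].used)
            e = &enq_table[i];
    if (e == NULL)
        return; /* table full: drop the sample */

    e->used = true;
    e->pid = pid;
    e->enqueue_ns = now_ns;
}

/* Models sched_switch. Returns the scheduling delay of the incoming task in
 * nanoseconds, or -1 when no enqueue time is known for it. */
int64_t on_switch(int prev_pid, long prev_state, int next_pid, uint64_t now_ns)
{
    /* A task forced off the CPU is still runnable and goes back to waiting. */
    if (prev_state == TASK_RUNNING)
        record_enqueue(prev_pid, now_ns);

    struct enq_entry *e = enq_find(next_pid);
    if (e == NULL)
        return -1;

    int64_t delay_ns = (int64_t)(now_ns - e->enqueue_ns);
    e->used = false; /* the task is running now, so its wait is over */
    return delay_ns;
}

/* Models sched_process_exit: drop the entry so the table does not fill up. */
void on_process_exit(int pid)
{
    struct enq_entry *e = enq_find(pid);
    if (e != NULL)
        e->used = false;
}
```

A task woken at 1000 ns and switched in at 1500 ns produces a delay of 500 ns. A task that is preempted is re-recorded at the moment it leaves the CPU, so its next delay is measured from that point.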
### Output

The tool reports the average, maximum, and minimum scheduling delay observed from the time the eBPF program was loaded up to now:

```
TIME avg_delay/μs max_delay/μs min_delay/μs
17:31:28 35.005000 97.663000 9.399000
17:31:29 326.518000 12618.465000 7.994000
17:31:30 455.837000 217053.545000 6.462000
17:31:31 422.582000 217053.545000 6.462000
17:31:32 382.627000 217053.545000 6.462000
17:31:33 360.499000 217053.545000 6.462000
17:31:34 364.805000 217053.545000 6.462000
17:31:35 362.039000 217053.545000 6.462000
17:31:36 373.751000 217053.545000 6.462000
```
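
Cumulative figures like those in the table above can be maintained with a small running aggregate. The sketch below is an assumption about how such statistics might be kept, not the tool's own code; `struct delay_stats`, `stats_add`, and `stats_avg_us` are hypothetical names.

```c
#include <stdint.h>

/* Running aggregate over all delays seen so far; zero-initialize before use. */
struct delay_stats {
    uint64_t count;
    uint64_t sum_ns;
    uint64_t max_ns;
    uint64_t min_ns;
};

/* Fold one scheduling delay (in nanoseconds) into the aggregate. */
void stats_add(struct delay_stats *s, uint64_t delay_ns)
{
    /* The first sample always sets the minimum. */
    if (s->count == 0 || delay_ns < s->min_ns)
        s->min_ns = delay_ns;
    if (delay_ns > s->max_ns)
        s->max_ns = delay_ns;
    s->sum_ns += delay_ns;
    s->count++;
}

/* Average delay in microseconds, or 0.0 when no sample has been recorded. */
double stats_avg_us(const struct delay_stats *s)
{
    if (s->count == 0)
        return 0.0;
    return (double)s->sum_ns / (double)s->count / 1000.0;
}
```

Because the maximum and minimum are never reset, they stay constant across rows once an extreme value has been seen, which matches the repeated 217053.545000 and 6.462000 values in the sample output.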