小樱 posted on 2024/8/11 02:18

kangle 3.6.0 adds Boost fcontext coroutines, which should further improve context-switch performance under multithreading compared with the previous ucontext

In practice, kangle running on a multi-core CPU should perform better with fcontext.
https://www.boost.org/
https://github.com/boostorg/context

The previously used ucontext:
https://github.com/kaniini/libucontext

How to build:
cmake .. -DENABLE_FCONTEXT=1

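In testing, the new fcontext build improved multithreaded performance by about 11% (the two runs shown below work out to about 7.6% in requests per second).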
ucontext
Concurrency Level:      100
Time taken for tests:   10.829 seconds
Complete requests:      100000
Failed requests:      0
Write errors:         0
Total transferred:      48314024 bytes
HTML transferred:       23500000 bytes
Requests per second:    9234.44 [#/sec] (mean)
Time per request:       10.829 (mean)
Time per request:       0.108 (mean, across all concurrent requests)
Transfer rate:          4356.96 received
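For anyone reading these ab reports for the first time, the derived figures follow from three inputs: concurrency, total time, and completed requests. The sketch below is just that arithmetic as I understand ab's report, with variable names of my own choosing; the two "Time per request" values are in milliseconds. The throughput it prints differs from ab's in the second decimal because ab divides by the unrounded elapsed time.

```python
# Figures copied from the ucontext run above.
concurrency = 100
time_taken_s = 10.829
complete_requests = 100000

# Throughput: completed requests divided by wall-clock time.
requests_per_second = complete_requests / time_taken_s

# Mean time per request as seen by one concurrent client slot, in ms.
time_per_request_ms = concurrency * time_taken_s * 1000 / complete_requests

# Mean time per request averaged across all concurrent slots, in ms.
time_per_request_all_ms = time_taken_s * 1000 / complete_requests

print(f"{requests_per_second:.2f} req/s")
print(f"{time_per_request_ms:.3f} ms (mean)")
print(f"{time_per_request_all_ms:.3f} ms (mean, across all concurrent requests)")
```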


fcontext
Concurrency Level:      100
Time taken for tests:   10.063 seconds
Complete requests:      100000
Failed requests:      0
Write errors:         0
Total transferred:      48313336 bytes
HTML transferred:       23500000 bytes
Requests per second:    9937.07 [#/sec] (mean)
Time per request:       10.063 (mean)
Time per request:       0.101 (mean, across all concurrent requests)
Transfer rate:          4688.41 received
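The gain between the two runs above can be checked directly from their "Requests per second" lines. A minimal sketch, where `pct_gain` is a helper name made up for illustration and not part of kangle or ab:

```python
def pct_gain(baseline_rps: float, new_rps: float) -> float:
    """Relative throughput change of new vs. baseline, in percent."""
    return (new_rps - baseline_rps) / baseline_rps * 100.0

# "Requests per second" values copied from the two ab runs above.
ucontext_rps = 9234.44
fcontext_rps = 9937.07

gain = pct_gain(ucontext_rps, fcontext_rps)
print(f"fcontext vs ucontext: {gain:+.1f}%")  # prints "fcontext vs ucontext: +7.6%"
```

A single pair of runs is noisy, so repeating the benchmark several times and comparing the spread is more trustworthy than any one percentage.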


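3.5.21.16 production build. Comparing the 3.6.0 development build against the production build really is... well, the production build does have jemalloc's high-performance allocator behind it.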
Concurrency Level:      100
Time taken for tests:   7.525 seconds
Complete requests:      100000
Failed requests:      0
Write errors:         0
Total transferred:      48381928 bytes
HTML transferred:       23500000 bytes
Requests per second:    13288.51 [#/sec] (mean)
Time per request:       7.525 (mean)
Time per request:       0.075 (mean, across all concurrent requests)
Transfer rate:          6278.55 received


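Comparing the two 3.6 development builds, fcontext is indeed a tiny bit faster, though it feels like measurement noise; over several runs the gain was roughly 5-7%.
I suspected the cause was building without -DCMAKE_BUILD_TYPE=Release, so I enabled it and tested once more... the result was still the same, no difference, and performance is about 30% below 3.5.21.16 (in the runs below, about 22% fewer requests per second, or about 29% more time for the same 100000 requests).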
3.5.21.16
Concurrency Level:      100
Time taken for tests:   7.234 seconds
Complete requests:      100000
Failed requests:      0
Write errors:         0
Total transferred:      24644184 bytes
HTML transferred:       0 bytes
Requests per second:    13823.46 [#/sec] (mean)
Time per request:       7.234 (mean)
Time per request:       0.072 (mean, across all concurrent requests)
Transfer rate:          3326.83 received

ucontext
Concurrency Level:      100
Time taken for tests:   9.313 seconds
Complete requests:      100000
Failed requests:      0
Write errors:         0
Total transferred:      24616784 bytes
HTML transferred:       0 bytes
Requests per second:    10738.10 [#/sec] (mean)
Time per request:       9.313 (mean)
Time per request:       0.093 (mean, across all concurrent requests)
Transfer rate:          2581.42 received

fcontext
Concurrency Level:      100
Time taken for tests:   9.349 seconds
Complete requests:      100000
Failed requests:      0
Write errors:         0
Total transferred:      24604296 bytes
HTML transferred:       0 bytes
Requests per second:    10696.76 [#/sec] (mean)
Time per request:       9.349 (mean)
Time per request:       0.093 (mean, across all concurrent requests)
Transfer rate:          2570.18 received
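To line up the three Release-build runs above against the 3.5.21.16 baseline, here is a small sketch that pulls the throughput figure out of ab's report text with a regex. The function names and the dictionary layout are my own for illustration; only the `Requests per second:` line format comes from the ab output above.

```python
import re

# Matches the throughput line that ab prints, e.g.
# "Requests per second:    13823.46 [#/sec] (mean)"
RPS_RE = re.compile(r"^Requests per second:\s+([\d.]+)", re.MULTILINE)

def parse_rps(ab_output: str) -> float:
    """Extract the mean requests-per-second figure from ab's report text."""
    match = RPS_RE.search(ab_output)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))

def relative_to(baseline: float, value: float) -> float:
    """Percent change of value relative to baseline (negative means slower)."""
    return (value - baseline) / baseline * 100.0

# Throughput lines copied from the three runs above.
runs = {
    "3.5.21.16": "Requests per second:    13823.46 [#/sec] (mean)",
    "ucontext": "Requests per second:    10738.10 [#/sec] (mean)",
    "fcontext": "Requests per second:    10696.76 [#/sec] (mean)",
}

rps = {name: parse_rps(text) for name, text in runs.items()}
baseline = rps["3.5.21.16"]
for name, value in rps.items():
    print(f"{name:10s} {value:9.2f} req/s  {relative_to(baseline, value):+6.1f}%")
```

On these numbers both 3.6 builds land a little over 22% below the baseline, and the gap between ucontext and fcontext is well under 1%, which is consistent with it being noise.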


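Addendum, 2024/8/21
The maintainers have optimized and fixed the 3.6.0 performance regression, so please update to the latest committed code. Building with -DENABLE_JEMALLOC=1 can also bring about a 10% performance gain.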
https://github.com/keengo99/kangle/commit/d7f955d5b038e72e94b17b001a85f9f4ebf2a56f
In my own tests, however, it still falls short... performance is still much worse than 3.5.

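The new 3.6 has been refactored into dynamically loaded modules, so a compiled build cannot simply be copied elsewhere; it has to be compiled on each machine where it will run.
If I sent you my compiled binary, it would not run on your machine; it would report that a .so file was not found.
This may also be one reason for the performance drop: the statically linked binary used to be 50 MB, while the dynamically linked one is now only 2 MB.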

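Addendum, 2024/11/29
With today's final 3.6 release, network send and receive were optimized and context switches were reduced, bringing performance basically on par with 3.5. I feel there is still some room for optimization, though, probably by enabling the io_uring model.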

maxrate posted on 2024/8/11 13:00

Version: 3.5.21.16 (enterprise). Where do I get the latest kangle to install? Right now I just use the one-click script.

小樱 posted on 2024/8/11 19:01

maxrate posted on 2024/8/11 13:00
Version: 3.5.21.16 (enterprise). Where do I get the latest kangle to install? Right now I just use the one-click script.

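3.6.0 is currently in development. If you are interested, you can download the source code from the official repository and build it yourself: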
https://github.com/keengo99/kangle