
Why I keep saying that KVM VPSes built on driver-level virtualization have poor CPU performance, and especially poor disk I/O read/write: PVE's cache modes

Posted on 2024/12/5 06:42

After all, it is driver-level virtualization, so slightly lower CPU performance is acceptable; but if the I/O is bad, anything that depends on I/O, such as a website's database, is crippled.
Generally speaking, many providers use PVE to create their KVM virtual machines. Even when they avoid qcow2 (which supports compression and encryption, making it a bit like running the VM inside an archive file) and use raw images instead, the I/O is still terrible.
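
If you have shell access to the PVE host, you can confirm which format a VM disk actually uses. A minimal check, assuming a hypothetical VM ID 100 and the default directory storage path (not taken from this post):
# on the PVE host
qm config 100 | grep -E '^(scsi|virtio|ide|sata)'       # shows the disk line and its options
qemu-img info /var/lib/vz/images/100/vm-100-disk-0.raw  # "file format: raw" vs "file format: qcow2"
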
(Screenshots: 1.jpg, 1.png, 2.png)

A one-minute animated GIF; see for yourself what bad KVM I/O looks like:
Xshell_JpAL8Ueh5q.gif
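
To put a number on it rather than just watching the terminal stall, a quick direct-I/O test inside the VPS is enough. A minimal sketch; the file names and sizes are arbitrary, and fio is only used if it happens to be installed:
# inside the VPS, bypassing the guest page cache so the host's disk path is what gets measured
dd if=/dev/zero of=./iotest bs=1M count=512 oflag=direct
fio --name=iotest-rand --rw=randwrite --bs=4k --size=256M --direct=1 --ioengine=libaio --iodepth=32
rm -f ./iotest iotest-rand*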

htop does not show the wa (I/O wait) share by default; otherwise you would see the CPU sitting at 100%.
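
For the record, the counters are still visible from inside the guest with the usual tools (and htop can show them too if "Detailed CPU time" is enabled in its F2 setup screen):
vmstat 1 5                 # the last two columns are wa (I/O wait) and st (steal)
top -bn1 | grep 'Cpu(s)'   # the %Cpu(s) line includes "... wa" and "... st"
grep '^cpu ' /proc/stat    # raw jiffies; the 5th value is iowait, the 8th is steal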

Going through the PVE documentation, the likely cause is that the host provider has set cache=writeback, which creates severe performance problems; it should be changed to cache=writethrough.
https://pve.proxmox.com/wiki/Performance_Tweaks
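
For reference, the cache mode is a per-disk option in the VM configuration on the host, so only the provider can change it. A minimal sketch with VM ID 100 and storage name local-lvm as placeholders; keep whatever other options (size=, discard=, ...) are already on the disk line:
# on the PVE host
grep -E '^(scsi|virtio)0' /etc/pve/qemu-server/100.conf   # e.g. scsi0: local-lvm:vm-100-disk-0,cache=writeback,size=32G
qm set 100 --scsi0 local-lvm:vm-100-disk-0,cache=writethrough,size=32G
# the new cache mode only takes effect after the VM is fully stopped and started again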

It even escalates to kernel bugs, with the disk becoming unresponsive:
Message from syslogd@vm906233 at Dec  5 13:31:08 ...
kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 49s! [kworker/0:2:25910]
Message from syslogd@vm906233 at Dec  5 13:37:13 ...
kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 24s! [kworker/0:3:26358]
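
These warnings also end up in the kernel log, so they can be collected inside the VPS as evidence for the ticket (journalctl assumes a systemd-based guest):
dmesg -T | grep -Ei 'soft lockup|blocked for more than'
journalctl -k --since '1 hour ago' | grep -i 'soft lockup'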

You can contact your provider along these lines:
Dear service provider,

Hello!

We have run into a serious performance problem while using your service. After researching the Proxmox VE documentation, our preliminary conclusion is that a host-side setting is causing the severe degradation of VPS performance. A closer reading of the documentation shows that the host's cache policy is configured as `cache=writeback`, and this appears to be the root cause. I therefore strongly recommend changing the cache policy of all VMs to `cache=writethrough` to improve the current situation.

In a Proxmox VE environment, the cache policy has a significant impact on storage performance. `cache=writeback` buffers I/O in host memory, which makes writes look faster, but under heavy load it can lead to long I/O waits and high CPU steal time (st), which is one of the reasons we are seeing such high load on your service.

According to the performance tuning advice in the official documentation, `cache=writethrough` is the more conservative policy: a write is only acknowledged once it has been committed to the storage device. Although this may slightly reduce the apparent I/O speed, in the long run it lowers system load and improves overall stability and reliability. In a virtualized environment in particular, it maximizes data consistency and durability and avoids the risk of data loss from cached writes that never reach the disk.

In addition, real-world monitoring and analysis show that the `cache=writeback` policy drives CPU usage on the host very high, and the st value rises sharply when large volumes of requests are being handled. This indicates very uneven resource consumption, which affects normal VPS operation and the user experience.

To keep the system stable and improve performance, we ask that you adjust the host's VM configuration as soon as possible and change the cache policy to `cache=writethrough`. This should effectively relieve the current high load and ensure reliable operation going forward. We also suggest monitoring the system after the change to confirm that the new policy has the expected effect. Please feel free to contact us if you have any further questions.

Thank you for your attention and cooperation!


Use the English version when communicating with the provider:
Hello, I found that the st value of this server is very high, driving
the system load up to about 10; a normal load for this VM would be
around 0.00-0.05.
The current average share of st is about 90%.
https://send.itzmx.com/files/IXOPUe26e0hrWshBDQnQqR7.jpg
https://send.itzmx.com/files/8Uzqld614nbzqu7nU66oGYG.jpg
#90xxxx-xxxxx
server ip 45.140.xxx.xxx
root
O0RkSiW0lHsSm3H

From the top task manager wiki/README:
0.0% st — the percentage of CPU time stolen by the hypervisor (steal
time) while the VM waits for a physical CPU during virtualization,
typically a symptom of overselling.
The st value should stay at 0.0%; sustained values above that indicate
that the host is oversold.

Dear [Service Provider],

I hope this message finds you well.

We have encountered severe performance issues with your service, and after researching Proxmox VE documentation, we suspect the issue is due to the host machine's cache settings. The current setting, `cache=writeback`, seems to be causing significant performance degradation. I strongly recommend switching all VMs to `cache=writethrough` to improve performance.

In a Proxmox VE environment, cache settings greatly impact storage performance. The `cache=writeback` option caches I/O operations in host memory, making writes appear faster. However, under high load this can lead to long I/O waits and high CPU steal time (st), which is contributing to the load issues we are experiencing.

According to the official performance optimization documentation, the `cache=writethrough` mode is more conservative: write operations are only acknowledged once they have been committed to the storage device. While this may slightly reduce apparent I/O speed, it reduces system load over the long term and improves stability and reliability. This is especially important in virtualized environments, as it maximizes data consistency and durability and avoids the risk of data loss from cached writes that never reach disk.

Moreover, our analysis shows that the `cache=writeback` setting leads to high CPU usage on the host machine, particularly during heavy data requests, which significantly increases the st value. This indicates uneven resource consumption, affecting the normal operation of the VPS and user experience.

To ensure system stability and performance enhancement, we suggest promptly adjusting the VM configurations on the host to `cache=writethrough`. This change should alleviate the current high load issues and ensure reliable future operations. Additionally, we recommend monitoring the system after making these changes to ensure the desired outcomes. Please feel free to reach out if you have further questions.

Thank you for your attention and cooperation.


I also got in touch with the Proxmox staff; they replied that enabling the iothread option on the host can mitigate some of the problems as well.
Hi,
I'd recommend turning on the iothread option for the VM disks, otherwise the IO load is handled by the QEMU main thread and that can cause virtual CPUs to get stuck.
Best regards,
Fiona
https://forum.proxmox.com/threads/i-used-a-kvm-virtual-host-in-pve-and-found-a-lot-of-i-o-wait-times-steal-time-st-usage-on-the-virtual-machine-which-lasted-up-to-90.158740/
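
For completeness, iothread is also a per-disk flag set on the host side. A minimal sketch with placeholder IDs again; for SCSI disks PVE pairs iothread with the virtio-scsi-single controller, and the change takes effect after a full stop/start of the VM:
# on the PVE host
qm set 100 --scsihw virtio-scsi-single
qm set 100 --scsi0 local-lvm:vm-100-disk-0,cache=writethrough,iothread=1,size=32G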
