Discussion: [Qemu-discuss] Qemu freeze on I/O intensive workload
Jean-Tiare LE BIGOT
2018-09-26 17:17:49 UTC
Hi,

I am using Qemu in a test suite to virtualize 3 x86_64 machines in an
isolated network. The end goal is to run integration tests on a Yocto
generated distribution.

In the setup phase of the test suite, we start 4 Qemu instances in
parallel with the "-daemonize" option and a global lock to prevent parallel
starts (until each instance has daemonized).
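
For reference, the start-up serialization is essentially the following
(a simplified sketch; the lock file path and command are placeholders, not
the actual test-suite code):

import fcntl
import subprocess

LOCK_PATH = "/tmp/qemu-start.lock"   # hypothetical lock file, not the real path

def start_serialized(qemu_cmd):
    # Take the global lock; only one QEMU instance may be starting at a time.
    with open(LOCK_PATH, "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)
        # With -daemonize in qemu_cmd, run() returns once the guest is set
        # up, so the lock covers exactly the non-daemonized window.
        subprocess.run(qemu_cmd, check=True)
    # Lock is released when the file is closed.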

The machine's configuration files start as follows (prior to variable substitution):
#
# General setup
#
[machine]
accel = "kvm" # For acceleration
type = "q35" # For AHCI drive (sda instead of ide)
[memory]
size = "1G"
#
# Control socket
#
[chardev "monitor"]
backend = "socket"
server = "on"
wait = "off"
path = "%%MACHINE_DATA%%/monitor.sock"
[mon]
chardev = "monitor"
mode = "control"
#
# UEFI setup
#
[drive]
if = "pflash"
format = "raw"
readonly = "on"
file = "%%MACHINE_DATA%%/OVMF_CODE.fd"
[drive]
if = "pflash"
format = "raw"
file = "%%MACHINE_DATA%%/OVMF_VARS.fd"
#
# Harddrive and install ISO
#
[drive]
if = "ide"
format = "raw"
file = "%%MACHINE_DATA%%/sda.img"
[drive]
if = "ide"
index = "2"
format = "raw"
file = "%%MACHINE_DATA%%/cdrom-rw.iso"
When started, the machines install themselves. The install is based on a
"dd" of a pre-generated ext4 image onto the newly created partition set.

What we observe is that, sometimes, the virtualized machine freezes during
the "dd" step. From the host side, we observe that the
"voluntary_ctxt_switches" counter of the Qemu vCPU thread does not increase
for 3 seconds, suggesting that the guest is blocked.
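
The check is roughly equivalent to this sketch (hard-coded 3 second window;
the thread id comes from the test suite):

import time

def vcpu_is_stuck(tid, window=3.0):
    """True if voluntary_ctxt_switches does not move within `window` seconds."""
    def switches():
        with open("/proc/%d/status" % tid) as f:
            for line in f:
                if line.startswith("voluntary_ctxt_switches"):
                    return int(line.split(":")[1])
        return None   # thread gone or field missing

    before = switches()
    time.sleep(window)
    return before is not None and switches() == before

# When this triggers, we read /proc/<tid>/stack (as root), which gives the
# trace below.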

The stack trace (/proc/[Thread ID]/stack) is:

[<ffffffffc055565a>] kvm_vcpu_block+0x8a/0x2f0 [kvm]
[<ffffffffc0571529>] kvm_arch_vcpu_ioctl_run+0x159/0x1620 [kvm]
[<ffffffffc0555146>] kvm_vcpu_ioctl+0x2a6/0x620 [kvm]
[<ffffffffaf287b55>] do_vfs_ioctl+0xa5/0x600
[<ffffffffaf288129>] SyS_ioctl+0x79/0x90
[<ffffffffaf89c0b7>] entry_SYSCALL_64_fastpath+0x1a/0xa5
[<ffffffffffffffff>] 0xffffffffffffffff
We are using Qemu 3.0 with kernel 4.13.9-300.fc27.x86_64 on the host. The
guest is a Yocto Linux 4.9 with a custom stripped-down configuration.

We do not know where the freeze comes from. From what we observe, the
freeze may come from the host kernel, the guest kernel or Qemu itself.

How can we go further in the diagnosis? We can enable traces, apply
patches, run under gdb, ...
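
We can also probe the frozen guests through the QMP socket declared in the
config above; a minimal helper would look something like this (socket path
substituted by hand, error and async-event handling omitted):

import json
import socket

def qmp(sock_path, command):
    """Send a single QMP command to the daemonized QEMU and return the reply."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(sock_path)
    f = s.makefile("rw")
    f.readline()                                        # greeting banner
    f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
    f.flush()
    f.readline()                                        # capabilities ack
    f.write(json.dumps({"execute": command}) + "\n")
    f.flush()
    reply = json.loads(f.readline())
    s.close()
    return reply

# e.g. qmp(".../monitor.sock", "query-status")  -> is the VM still "running"?
#      qmp(".../monitor.sock", "query-block")   -> state of the IDE drives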

Thanks !
--
Jean-Tiare Le Bigot
Jean-Tiare LE BIGOT
2018-10-02 10:24:22 UTC
On one of the frozen guests, I can see ATA and clock errors in the console:
[ 4.513829] input: ImExPS/2 Generic Explorer Mouse as
/devices/platform/i8042/serio1/input/input3
[ 66.539190] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
frozen
[ 66.542582] ata1.00: failed command: FLUSH CACHE
[ 66.544780] ata1.00: cmd e7/00:00:00:00:00/00:00:00:00:00/a0 tag 9
[ 66.544780] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4
(timeout)
[ 66.549921] ata1.00: status: { DRDY }
[ 66.550952] ata1: hard resetting link
[ 103.850697] clocksource: timekeeping watchdog on CPU0: Marking
[ 103.958111] clocksource: 'hpet' wd_now: 6db28ff8
wd_last: 8f3cee23 mask: ffffffff
[ 103.960891] clocksource: 'tsc' cs_now: 317cb4fc49
cs_last: 20265917c3 mask: ffffffffffffffff
[ 103.969066] clocksource: Switched to clocksource hpet
[ 104.295987] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 104.298393] ata1.00: configured for UDMA/100
[ 104.300329] ata1.00: retrying FLUSH 0xe7 Emask 0x4
[ 104.303472] ata1.00: device reported invalid CHS sector 0
[ 104.304884] ata1: EH complete
It looks like there is some 60s timeout followed by a ~40s link reset. The
virtual hard drives are raw files stored locally on an LVM RAID 0 of 2 SSD
drives. Does this ring a bell? Similar reports on the Internet seem to be
related to remote disks, and I'm not sure what to make of the clock skew.
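
Since the failed guest command is FLUSH CACHE, which for a raw image should
end up as an fsync on the backing file, one way to check the host side would
be to time fsync on the same LVM RAID 0 volume while the guests run their
"dd". A quick sketch (probe path made up):

import os
import time

PROBE = "/mnt/raid0/fsync-probe.tmp"   # made-up path on the same LVM volume

def fsync_latency(size=1 << 20):
    """Write `size` bytes and time the fsync, in seconds."""
    fd = os.open(PROBE, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, b"\0" * size)
        start = time.monotonic()
        os.fsync(fd)
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(PROBE)

# Sampling this in a loop during a run should show whether host-side flushes
# ever get anywhere near the guest's ATA command timeout.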

Any pointer would be greatly appreciated !

--
Jean-Tiare Le Bigot