mirror of
https://github.com/JakeHillion/drgn.git
synced 2024-12-23 09:43:06 +00:00
be85631471
Every few builds or so, a vmtest VM crashes after printing "x86: Booting SMP configuration:". After some difficult debugging, I determined that the crash happens in arch/x86/realmode/rm/trampoline_64.S (the code that initializes secondary CPUs) at the ljmp from startup_32 to startup_64. The real problem happens earlier in startup_32:

    movl $pa_trampoline_pgd, %eax
    movl %eax, %cr3

Sometimes, the store to CR3 "fails" and CR3 remains zero, which causes the later ljmp to triple fault.

This can be reproduced by the following script:

    #!/bin/sh

    curl -L 'https://www.dropbox.com/sh/2mcf2xvg319qdaw/AABFKsISWRpndNZ1gz60O-qSa/x86_64/vmlinuz-5.8.0-rc7-vmtest1?dl=1' -o vmlinuz

    cat > commands.gdb << "EOF"
    set confirm off
    target remote :1234

    # arch/x86/realmode/rm/trampoline_64.S:startup_32 after CR3 store.
    hbreak *0x9ae09 if $cr3 == 0
    command
    info registers eax cr3
    quit 1
    end

    # kernel/smp.c:smp_init() after all CPUs have been brought up. If we get here,
    # the bug wasn't triggered.
    hbreak *0xffffffff81ed4484
    command
    kill
    quit 0
    end

    continue
    EOF

    while true; do
        qemu-system-x86_64 -cpu host -enable-kvm -smp 64 -m 128M \
            -nodefaults -display none -serial file:/dev/stdout -no-reboot \
            -kernel vmlinuz -append 'console=0,115200 panic=-1 nokaslr' \
            -s -S &
        gdb -batch -x commands.gdb || exit 1
    done

This seems to be a problem with nested virtualization that was fixed by Linux kernel commit b4d185175bc1 ("KVM: VMX: give unrestricted guest full control of CR3") (in v4.17). Apparently, the Google Cloud hosts that Travis runs on are missing this fix. We obviously can't patch those hosts, but we can work around it. Disabling unrestricted guest support in the Travis VM causes CR3 stores in the nested vmtest VM to be emulated, bypassing the bug.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
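To confirm that the workaround described in the commit message took effect, one option is to read the module parameter back after kvm_intel is reloaded. The following is a minimal sketch, not part of the original change: the helper name `unrestricted_guest_state` is hypothetical, and it assumes the parameter is exposed at `/sys/module/kvm_intel/parameters/unrestricted_guest` and reads as `Y` or `N`.

```shell
#!/bin/sh

# Hypothetical helper: report whether kvm_intel's unrestricted guest support
# is on. The default path is an assumption about where the loaded module
# exposes its parameter; pass another file as $1 to override it.
unrestricted_guest_state() {
    param_file="${1:-/sys/module/kvm_intel/parameters/unrestricted_guest}"

    # If kvm_intel is not loaded (or this is not an Intel host), the file
    # does not exist and nothing can be concluded.
    if [ ! -r "$param_file" ]; then
        echo "unknown"
        return 0
    fi

    case "$(cat "$param_file")" in
        Y|1) echo "enabled" ;;
        N|0) echo "disabled" ;;
        *)   echo "unknown" ;;
    esac
}
```

Calling `unrestricted_guest_state` with no arguments after the `modprobe kvm_intel` step should print `disabled` if the option was picked up.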
38 lines
1.3 KiB
YAML
dist: bionic

language: python
python:
  - '3.8'
  - '3.7'
  - '3.6'
install:
  # If the host is running a kernel without Linux kernel commit b4d185175bc1
  # ("KVM: VMX: give unrestricted guest full control of CR3") (in v4.17), then
  # stores to CR3 in the nested guest can spuriously fail and cause it to
  # crash. We can work around this by disabling unrestricted guest support.
  - |
    if grep -q '^flags\b.*\bvmx\b' /proc/cpuinfo; then
      echo "options kvm_intel unrestricted_guest=N" | sudo tee /etc/modprobe.d/kvm-cr3-workaround.conf > /dev/null
      sudo modprobe -r kvm_intel
      sudo modprobe kvm_intel
    fi
  # Upstream defaults to world-read-writeable /dev/kvm. Debian/Ubuntu override
  # this; see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892945. We want
  # the upstream default.
  - echo 'KERNEL=="kvm", GROUP="kvm", MODE="0666", OPTIONS+="static_node=kvm"' | sudo tee /lib/udev/rules.d/99-fix-kvm.rules > /dev/null
  - sudo udevadm control --reload-rules
  # On systemd >= 238 we can use udevadm trigger -w and remove udevadm settle.
  - sudo udevadm trigger /dev/kvm
  - sudo udevadm settle
script: python setup.py test -K

addons:
  apt:
    packages:
      - busybox-static
      - libbz2-dev
      - liblzma-dev
      - qemu-kvm
      - zlib1g-dev
      - zstd
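
The udev rule in the install steps is meant to leave /dev/kvm world read-writeable. A rough way to check that the rule was applied after `udevadm settle` is sketched below. This is not part of the original configuration: the helper names `node_mode` and `node_is_world_rw` are hypothetical, and the sketch assumes a `stat` that supports `-c` (GNU coreutils or BusyBox).

```shell
#!/bin/sh

# Hypothetical helper: print the octal permission bits of a device node
# (default: /dev/kvm). Assumes stat supports the -c format option.
node_mode() {
    node="${1:-/dev/kvm}"
    if [ ! -e "$node" ]; then
        echo "missing"
        return 1
    fi
    stat -c '%a' "$node"
}

# Hypothetical helper: succeed only if the node is readable and writable
# by "other" users.
node_is_world_rw() {
    mode=$(node_mode "$1") || return 1
    # The last octal digit holds the "other" permissions; 6 and 7 both
    # include read and write.
    case "$mode" in
        *6|*7) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, `node_is_world_rw /dev/kvm || echo "/dev/kvm is not world read-writeable"` could be run as an extra install step to catch the case where the distribution's own rule still wins.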