Opened 8 years ago
Last modified 7 years ago
#16417 new defect
vboxsf crashes process or VM in case of many file stat operations on shared vboxsf file system in a short time
Reported by: | mrtngrnr | Owned by: | |
---|---|---|---|
Component: | shared folders | Version: | VirtualBox 5.1.14 |
Keywords: | Cc: | dusan.vuckovic@… | |
Guest type: | Linux | Host type: | Linux |
Description
We use a vagrant based toolchain to automatically generate VMs for executing unit tests. They run on Ubuntu 16.04 LTS 64-bit hosts, and the VMs themselves contain various different older and newer Linux distributions. We use current VirtualBox 5.1.14 and also the 5.1.14 guest tools.
Quite a significant percentage (10-20%) of our VM deployments was not successful. In case of "default" paravirtualization using KVM, the whole VM freezes, and the VBox.log file just says:
00:00:42.027263 VMMDev: vmmDevHeartbeatFlatlinedTimer: Guest seems to be unresponsive. Last heartbeat received 4 seconds ago
If using the "minimal" paravirtualization, VM operation continues, just the unit test process itself dies (see log below). This happens consistently at the same place where there are many stat() calls made to get the ctime information of many thousands of (empty) cache files on a shared vboxsf file system, before outdated files are supposed to be cleaned up. If I remove this code part, the error does not occur any more. Therefore my (uneducated) assumption is that it could be related to running many stat operations in a short time. Here's the stack trace as found in /var/log/kern.log in the instance.
Jan 21 22:23:14 vagrant kernel: [ 8.021389] vboxsf: Successfully loaded version 5.1.14 (interface 0x00010004) Jan 21 22:23:14 vagrant kernel: [ 8.328536] VBoxService 5.1.14 r112924 (verbosity: 0) linux.amd64 (Jan 16 2017 17:38:46) release log Jan 21 22:23:14 vagrant kernel: [ 8.328536] 00:00:00.000093 main Log opened 2017-01-21T22:23:14.323839000Z Jan 21 22:23:14 vagrant kernel: [ 8.328566] 00:00:00.000154 main OS Product: Linux Jan 21 22:23:14 vagrant kernel: [ 8.328578] 00:00:00.000170 main OS Release: 3.16.0-4-amd64 Jan 21 22:23:14 vagrant kernel: [ 8.328589] 00:00:00.000182 main OS Version: #1 SMP Debian 3.16.39-1 (2016-12-30) Jan 21 22:23:14 vagrant kernel: [ 8.328614] 00:00:00.000193 main Executable: /opt/VBoxGuestAdditions-5.1.14/sbin/VBoxService Jan 21 22:23:14 vagrant kernel: [ 8.328614] 00:00:00.000194 main Process ID: 1028 Jan 21 22:23:14 vagrant kernel: [ 8.328614] 00:00:00.000195 main Package type: LINUX_64BITS_GENERIC Jan 21 22:23:14 vagrant kernel: [ 8.329284] 00:00:00.000860 main 5.1.14 r112924 started. Verbose level = 0 Jan 21 22:27:01 vagrant kernel: [ 233.348410] VBOXGUEST_IOCTL_HGCM_CALL: 64 Failed. rc=-228. Jan 21 22:27:01 vagrant kernel: [ 233.348430] general protection fault: 0000 [#1] SMP Jan 21 22:27:01 vagrant kernel: [ 233.348461] Modules linked in: vboxsf(O) nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc vboxvideo(O) ttm drm_kms_helper ppdev drm evdev vboxguest(O) crc32_pclmul i2c_piix4 pcspkr i2c_core serio_raw aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd parport_pc battery processor parport video thermal_sys ac button autofs4 ext4 crc16 mbcache jbd2 dm_mod sg sd_mod crc_t10dif crct10dif_generic ata_generic crct10dif_pclmul crct10dif_common crc32c_intel e1000 psmouse ata_piix libata scsi_mod Jan 21 22:27:01 vagrant kernel: [ 233.348614] CPU: 1 PID: 1925 Comm: otrs.UnitTest.p Tainted: G O 3.16.0-4-amd64 #1 Debian 3.16.39-1 Jan 21 22:27:01 vagrant kernel: [ 233.348661] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Jan 21 22:27:01 vagrant kernel: [ 233.348689] task: ffff8800797bcce0 ti: ffff88007bdcc000 task.ti: ffff88007bdcc000 Jan 21 22:27:01 vagrant kernel: [ 233.348718] RIP: 0010:[<ffffffffa02666c1>] [<ffffffffa02666c1>] VGDrvCommonIoCtl+0x261/0x1bf0 [vboxguest] Jan 21 22:27:01 vagrant kernel: [ 233.348763] RSP: 0018:ffff88007bdcfb98 EFLAGS: 00010282 Jan 21 22:27:01 vagrant kernel: [ 233.348784] RAX: 00000000ffffff1c RBX: 0000a0007bdcfc84 RCX: 0000000000000001 Jan 21 22:27:01 vagrant kernel: [ 233.348813] RDX: 0000000000000000 RSI: 000000000000002f RDI: ffff88007bdcf9a0 Jan 21 22:27:01 vagrant kernel: [ 233.348842] RBP: 5f7c81f4149bd010 R08: 000000000000000a R09: 00000000000001c0 Jan 21 22:27:01 vagrant kernel: [ 233.348870] R10: 0000000000000000 R11: ffff88007bdcf6be R12: 0000b00000000000 Jan 21 22:27:01 vagrant kernel: [ 233.348898] R13: 1b3eed7500000000 R14: 5f7c81f4149bd064 R15: 5f7c81f4149bd010 Jan 21 22:27:01 vagrant kernel: [ 233.348927] FS: 00007ff073ca7700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000 Jan 21 22:27:01 vagrant kernel: [ 233.348958] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 21 22:27:01 vagrant kernel: [ 233.348981] CR2: 00007ffec82d3ff8 CR3: 000000007bcd8000 CR4: 00000000000406e0 Jan 21 22:27:01 vagrant kernel: [ 233.349008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 21 22:27:01 vagrant kernel: [ 233.349033] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Jan 21 22:27:01 vagrant kernel: [ 233.349072] Stack: Jan 21 22:27:01 vagrant kernel: [ 233.349082] 001041ed149bd010 000003e900000000 00000002000003e9 013aa00300000902 Jan 21 22:27:01 vagrant kernel: [ 233.349114] 0000000000000000 0000000000000000 ffff880023350880 0000006c00000004 Jan 21 22:27:01 vagrant kernel: [ 233.349147] ffff88007bdcfbfc 0000a00000020830 ffff88007b0c6000 ffffffffa03f877c Jan 21 22:27:01 vagrant kernel: [ 233.349178] Call Trace: Jan 21 22:27:01 vagrant kernel: [ 233.349194] [<ffffffffa03f877c>] ? sf_stat+0x5c/0x110 [vboxsf] Jan 21 22:27:01 vagrant kernel: [ 233.349219] [<ffffffffa03f88b0>] ? sf_inode_revalidate+0x80/0xd0 [vboxsf] Jan 21 22:27:01 vagrant kernel: [ 233.349246] [<ffffffff811b5192>] ? __inode_permission+0x22/0xb0 Jan 21 22:27:01 vagrant kernel: [ 233.349269] [<ffffffff811b58e6>] ? link_path_walk+0x256/0x870 Jan 21 22:27:01 vagrant kernel: [ 233.349292] [<ffffffff811c281a>] ? __d_lookup+0x7a/0x150 Jan 21 22:27:01 vagrant kernel: [ 233.349979] [<ffffffffa03f8914>] ? sf_dentry_revalidate+0x14/0x20 [vboxsf] Jan 21 22:27:01 vagrant kernel: [ 233.350641] [<ffffffff811b47de>] ? lookup_fast+0x25e/0x2b0 Jan 21 22:27:01 vagrant kernel: [ 233.351302] [<ffffffff811b6055>] ? path_lookupat+0x155/0x780 Jan 21 22:27:01 vagrant kernel: [ 233.351927] [<ffffffffa03fa6ee>] ? VbglR0SfCreate+0x7e/0xa0 [vboxsf] Jan 21 22:27:01 vagrant kernel: [ 233.352603] [<ffffffff811b66a6>] ? filename_lookup+0x26/0xc0 Jan 21 22:27:01 vagrant kernel: [ 233.353267] [<ffffffff811ba8e4>] ? user_path_at_empty+0x54/0x90 Jan 21 22:27:01 vagrant kernel: [ 233.353855] [<ffffffff810ed71e>] ? from_kgid_munged+0xe/0x20 Jan 21 22:27:01 vagrant kernel: [ 233.354431] [<ffffffff811af53a>] ? cp_new_stat+0x13a/0x160 Jan 21 22:27:01 vagrant kernel: [ 233.354976] [<ffffffff811af0e6>] ? vfs_fstatat+0x46/0x90 Jan 21 22:27:01 vagrant kernel: [ 233.355536] [<ffffffff811af57a>] ? SYSC_newstat+0x1a/0x40 Jan 21 22:27:01 vagrant kernel: [ 233.356067] [<ffffffff8151adcd>] ? system_call_fast_compare_end+0x10/0x15 Jan 21 22:27:01 vagrant kernel: [ 233.356577] Code: 01 b9 ff ff ff ff 48 89 da 4c 89 d6 4c 89 ef e8 b6 df ff ff 41 89 c7 0f 1f 00 48 83 c4 40 44 89 f8 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 66 0f 1f 44 00 00 49 83 f8 17 0f 86 34 0c 00 00 48 8d 83 00 Jan 21 22:27:01 vagrant kernel: [ 233.358611] RIP [<ffffffffa02666c1>] VGDrvCommonIoCtl+0x261/0x1bf0 [vboxguest] Jan 21 22:27:01 vagrant kernel: [ 233.359118] RSP <ffff88007bdcfb98> Jan 21 22:27:01 vagrant kernel: [ 233.359636] ---[ end trace 54b9778e1a5d744b ]---
For reference, here's the Perl code that is used to cleanup the (empty cache files):
my $Wanted = sub { # Skip nonregular files and directories. return if ( !-f $File::Find::name ); my $Stat = File::stat::stat($File::Find::name); if ( $Stat && ( time() - $Stat->ctime() > $CacheTTLSeconds ) ) { ## no critic unlink $File::Find::name; # Don't die here, this could be removed in the background by another process too. } }; File::Find::find( $Wanted, $CacheDir );
Less often, a similar error occurs during the vagrant provisioning of the instances, where we copy the source code of our project from the /vagrant shared vboxsf folder into the VM. Here just a simple "cp -r" is used during the deployment. It could be the same problem:
==> default: Copying OTRS code to /opt/otrs ==> default: cp: cannot stat â/vagrant/Sources/otrs/var/httpd/htdocs/js/thirdparty/ckeditor-4.5.6/plugins/mathjax/lang/cy.jsâ: Protocol error
Please let me know if I should try to provide additional information, and thanks a lot for taking a look at this.
Attachments (7)
Change History (14)
comment:1 by , 8 years ago
comment:2 by , 7 years ago
Definitely interesting but perhaps there is a reproduction scenrio which does not need a vagrant toolchain.
comment:3 by , 7 years ago
Oh, and I would still appreciate a VBox.log file of such a VM which hangs or crashes under heavy load...
comment:4 by , 7 years ago
Hello,
Same kernel panic here, also while using Vagrant.
See Vagrant and VirtualBox logs in ticket's attachements.
comment:5 by , 7 years ago
Thi, thanks for the logs! Could you also provide the vboxsf.ko module from the guest where the crash happens? Thank you!
comment:6 by , 7 years ago
It started to happen to me as well.
I have 2 Mac machines running Virtual Box 5.1.22, one one of them is experience the issue.
Machine 1 (do not have the issue)
Machine 2 (Does experience the issue)
[ 3208.333482] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffc0362b1d [ 3208.333482] [ 3208.388327] CPU: 0 PID: 8859 Comm: php Not tainted 4.4.0-83-generic #106-Ubuntu [ 3208.442679] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 3208.488888] 0000000000000086 0000000005648689 ffff88007b947a10 ffffffff813f9513 [ 3208.590333] ffffffff81cba598 ffff88007b947aa8 ffff88007b947a98 ffffffff8118d607 [ 3208.694710] ffffffff00000010 ffff88007b947aa8 ffff88007b947a40 0000000005648689 [ 3208.771867] Call Trace: [ 3208.797156] [<ffffffff813f9513>] dump_stack+0x63/0x90 [ 3208.850053] [<ffffffff8118d607>] panic+0xd3/0x215 [ 3208.916594] [<ffffffffc0362b1d>] ? sf_stat+0xfd/0x100 [vboxsf] [ 3208.988018] [<ffffffffc03a58a5>] ? rtR0MemFree+0x55/0x70 [vboxguest] [ 3209.054548] [<ffffffff81081119>] __stack_chk_fail+0x19/0x20 [ 3209.117923] [<ffffffffc0362b1d>] sf_stat+0xfd/0x100 [vboxsf] [ 3209.184369] [<ffffffffc0364c4e>] ? VbglR0SfCreate+0x8e/0xb0 [vboxsf] [ 3209.205510] [<ffffffffc0362c13>] ? sf_dentry_revalidate+0x13/0x30 [vboxsf] [ 3209.281477] [<ffffffff8121a3af>] ? lookup_fast+0x24f/0x330 [ 3209.341436] [<ffffffff8121ba79>] ? walk_component+0x49/0x300 [ 3209.406658] [<ffffffff8121d001>] ? link_path_walk+0x191/0x5b0 [ 3209.450785] [<ffffffff8121b46b>] ? path_init+0x1eb/0x3c0 [ 3209.480281] [<ffffffff8121d51c>] ? path_lookupat+0x7c/0x110 [ 3209.557010] [<ffffffff8121f141>] ? filename_lookup+0xb1/0x180 [ 3209.609035] [<ffffffff811ed747>] ? kmem_cache_alloc+0x187/0x1f0 [ 3209.661132] [<ffffffff8121ed46>] ? getname_flags+0x56/0x1f0 [ 3209.701208] [<ffffffff8121f2e6>] ? user_path_at_empty+0x36/0x40 [ 3209.789207] [<ffffffff81214076>] ? vfs_fstatat+0x66/0xc0 [ 3209.853415] [<ffffffff81214631>] ? SYSC_newlstat+0x31/0x60 [ 3209.892152] [<ffffffff8121476e>] ? SyS_newlstat+0xe/0x10 [ 3209.942207] [<ffffffff81840b72>] ? entry_SYSCALL_64_fastpath+0x16/0x71 [ 3210.000542] Kernel Offset: disabled [ 3210.070708] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffc0362b1d [ 3210.070708]
by , 7 years ago
Attachment: | ubuntu-xenial-16.04-cloudimg-console.3.log added |
---|
Machine 3, Linux ubuntu-xenial 4.10.0-24-generic #28~16.04.1-Ubuntu SMP Wed Jun 14 11:58:14 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
by , 7 years ago
lib/modules/4.10.0-24-generic/kernel/ubuntu/vbox/vboxsf/vboxsf.ko
comment:7 by , 7 years ago
Same issue after upgrading kernel, this time 4.10.0-24-generic
Virtual Box version: 5.1.22
Ubuntu 16.04.2 LTS ubuntu-xenial ttyS0 ubuntu-xenial login: [ 598.928632] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffc04e0863 [ 598.928632] [ 598.984839] CPU: 0 PID: 4948 Comm: php Not tainted 4.10.0-24-generic #28~16.04.1-Ubuntu [ 599.008107] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 599.048215] Call Trace: [ 599.061841] dump_stack+0x63/0x90 [ 599.072364] panic+0xe4/0x22d [ 599.084014] ? VbglR0HGCMInternalCall+0xe83/0xe90 [vboxguest] [ 599.103309] ? rtR0MemFree+0x55/0x70 [vboxguest] [ 599.125034] __stack_chk_fail+0x19/0x20 [ 599.136182] VbglR0HGCMInternalCall+0xe83/0xe90 [vboxguest] [ 599.159284] ? vgdrvHgcmAsyncWaitCallbackWorker+0x220/0x220 [vboxguest] [ 599.186149] ? vbglDriverIOCtl+0x16/0x20 [vboxsf] [ 599.195686] ? VbglHGCMCall+0x43/0x90 [vboxsf] [ 599.220713] ? VbglR0SfCreate+0x8e/0xb0 [vboxsf] [ 599.230384] ? sf_stat+0x5d/0x100 [vboxsf] [ 599.245906] ? VGDrvCommonIoCtl+0x263/0x1bf0 [vboxguest] [ 599.272188] ? VBoxGuestIDCCall+0x64/0x70 [vboxguest] [ 599.288242] ? vbglDriverIOCtl+0x16/0x20 [vboxsf] [ 599.307753] ? VbglHGCMCall+0x43/0x90 [vboxsf] [ 599.312484] ? VbglR0SfCreate+0x8e/0xb0 [vboxsf] [ 599.325318] ? sf_stat+0x5d/0x100 [vboxsf] [ 599.328750] ? sf_inode_revalidate+0x8c/0xe0 [vboxsf] [ 599.344254] ? __check_object_size+0x108/0x1e3 [ 599.353403] ? strncpy_from_user+0x4d/0x170 [ 599.355276] ? sf_getattr+0x1e/0x40 [vboxsf] [ 599.360343] ? apparmor_inode_getattr+0x1d/0x20 [ 599.376027] ? vfs_getattr_nosec+0x29/0x40 [ 599.383014] ? vfs_getattr+0x26/0x30 [ 599.384696] ? vfs_fstatat+0x78/0xc0 [ 599.401083] ? SYSC_newstat+0x2e/0x60 [ 599.413682] ? SyS_mmap_pgoff+0xf8/0x290 [ 599.428594] ? SyS_newstat+0xe/0x10 [ 599.435137] ? entry_SYSCALL_64_fastpath+0x1e/0xad [ 599.459465] Kernel Offset: 0x39a00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) [ 599.462856] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffc04e0863 [ 599.462856]
Attaching:
I forgot to mention that we use the following mount options for the vboxsf folders: