Opened 6 years ago
Closed 12 days ago
#18176 closed defect (fixed)
USB resets and hangs during high I/O
Reported by: | Jimson | Owned by: | |
---|---|---|---|
Component: | USB | Version: | VirtualBox 5.2.22 |
Keywords: | usb reset superspeed | Cc: | |
Guest type: | Linux | Host type: | Mac OS X |
Description (last modified by )
Experiencing USB reset and hangs on Virtualbox 5.2.22 on Macbook Pro 2017 10.13.6, on Oracle Linux 7.5 and Ubuntu 18.04 VMs. I'm able to replicate the problem, very quickly, by running the following:
$ sudo dd if=/dev/zero of=./testfile status=progress bs=1024k 639631360 bytes (640 MB) copied, 5.448752 s, 117 MB/s
It starts out pretty fast, but once the reset occurs, the speeds taper down to < 50MB/s. The OL system log shows the resets and I/O errors as:
Dec 10 15:55:28 jdnissen-lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Dec 10 15:55:38 jdnissen-lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Dec 10 15:55:44 jdnissen-lobi7 systemd: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 1858 (cma) Dec 10 15:55:44 jdnissen-lobi7 systemd: Mounting Arbitrary Executable File Formats File System... Dec 10 15:55:45 jdnissen-lobi7 systemd: Mounted Arbitrary Executable File Formats File System. Dec 10 15:55:46 jdnissen-lobi7 kernel: EXT4-fs (dm-3): Delayed block allocation failed for inode 12 at logical offset 546816 with max blocks 2048 with error 5 Dec 10 15:55:46 jdnissen-lobi7 kernel: EXT4-fs (dm-3): This should not happen!! Data will be lost Dec 10 15:55:46 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:55:51 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:55:57 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:56:01 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:56:07 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:56:12 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:56:17 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:56:22 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:56:27 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Dec 10 15:56:32 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
I've attempted to workaround this with the new test build 6.0.0 RC1, and it didn't help. OL and Ubuntu updates don't help, either.
It's happening on two different USB drives: Seagate 2TB and Western Digital 2TB. The seagate is formatted as NTFS and the WD is Linux EXT4.
In addition, I can attach the drives to an old laptop running bare-metal OL7.2, and it works just fine...no resets, hangs, or slow I/Os, during the 'dd' (or file copies).
I work for Oracle, and this problem is preventing me from being able to copy large patch bundles, used during customer installs.
Attachments (2)
Change History (34)
by , 6 years ago
comment:1 by , 6 years ago
comment:2 by , 6 years ago
The original errors in code-blocks for readability...
Dec 7 10:09:50 lobi7 kernel: usb 2-1: new SuperSpeed USB device number 4 using xhci_hcd Dec 7 10:09:50 lobi7 kernel: usb 2-1: New USB device found, idVendor=1058, idProduct=25a2 Dec 7 10:09:50 lobi7 kernel: usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3 Dec 7 10:09:50 lobi7 kernel: usb 2-1: Product: Elements 25A2 Dec 7 10:09:50 lobi7 kernel: usb 2-1: Manufacturer: Western Digital Dec 7 10:09:50 lobi7 kernel: usb 2-1: SerialNumber: 575854314536384353524154 Dec 7 10:09:50 lobi7 kernel: usb-storage 2-1:1.0: USB Mass Storage device detected Dec 7 10:09:50 lobi7 kernel: scsi host6: usb-storage 2-1:1.0 Dec 7 10:09:50 lobi7 mtp-probe: checking bus 2, device 4: "/sys/devices/pci0000:00/0000:00:0c.0/usb2/2-1" Dec 7 10:09:50 lobi7 mtp-probe: bus: 2, device: 4 was not an MTP device Dec 7 10:09:51 lobi7 kernel: scsi 6:0:0:0: Direct-Access WD Elements 25A2 1021 PQ: 0 ANSI: 6 Dec 7 10:09:51 lobi7 kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0 Dec 7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] 3906963456 512-byte logical blocks: (2.00 TB/1.81 TiB) Dec 7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] Write Protect is off Dec 7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] No Caching mode page found Dec 7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] Assuming drive cache: write through Dec 7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] Attached SCSI disk ... Dec 7 09:47:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:24 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:24 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:24 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:24 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:24 lobi7 kernel: sd 5:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK Dec 7 09:47:24 lobi7 kernel: sd 5:0:0:0: [sdb] CDB: Write(10) 2a 00 07 4e c4 00 00 04 00 00 Dec 7 09:47:24 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 122602496 Dec 7 09:47:24 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:316: I/O error -5 writing to inode 12 (offset 41783656448 size 8388608 starting block 15324800) Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324800 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324801 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324802 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324803 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324804 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324805 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324806 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324807 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324808 Dec 7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324809 … Dec 7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd Dec 7 09:47:26 lobi7 kernel: sd 5:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK Dec 7 09:47:26 lobi7 kernel: sd 5:0:0:0: [sdb] CDB: Write(10) 2a 00 07 4e d0 00 00 04 00 00 Dec 7 09:47:26 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 122605568 Dec 7 09:47:26 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:316: I/O error -5 writing to inode 12 (offset 41792045056 size 8388608 starting block 15325184)
by , 6 years ago
Attachment: | Oracle Linux 7.2 OBI-2018-12-12-09-33-55.log added |
---|
comment:3 by , 6 years ago
I was able to reproduce this... and then I wasn't. But it certainly didn't happen because anything got intentionally fixed.
The underlying problem unfortunately appears to be some race condition which is inherently very configuration specific and at least for me, currently so hard to reproduce that I can't meaningfully try to fix it. I can read hundreds of gigabytes at ~180 MB/s, no problem, the resets just aren't happening. That doesn't mean the resets won't happen on other machines, but at the moment I don't have those other machines.
Please try again with a Windows host, and report any issues encountered. I believe the problem is much less severe on a Windows host because it only slows things down but does not cause I/O errors.
If you find something that makes the errors more frequent (number of virtual CPUs? idle/busy host? VM memory size?), please add a comment here.
comment:4 by , 6 years ago
Summary: | USB 3.0 resets and hangs during high I/O → USB resets and hangs during high I/O |
---|
I'll add that the problem is not USB 3.0 specific. There is some probability involved, and the error may be more likely to show up as the number of USB transactions go up, so with fast devices the error probably just shows up quicker.
There is definitely a problem in the VirtualBox Linux USB proxy in that it does now allow USB devices to be truly reset. It looks like the USB device may get sufficiently confused that it simply stops responding (that could in fact be triggering the initial reset attempt) and because we do not really reset it, the USB device stays confused until the guest OS gives up on it.
comment:5 by , 6 years ago
My attempts at reproducing this problem on two home Windows 10 computers failed, at first, I think due to bug 84741. However, an upgrade to Vbox 6.0 fixed that, and interestingly, I have been unable to reproduce this bug on Windows 10 hosts running Vbox 6.0, running the same Oracle Linux VM. I have read a large 450GB file and wrote a large 1.7TB file, until the device filled up. However, there were a few minor USB resets:
Dec 21 11:21:38 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Dec 21 12:33:36 lobi7 kernel: EXT4-fs (dm-3): error count since last fsck: 5 Dec 21 12:33:36 lobi7 kernel: EXT4-fs (dm-3): initial error at time 1544198049: ext4_journal_check_start:56 Dec 21 12:33:36 lobi7 kernel: EXT4-fs (dm-3): last error at time 1544478866: ext4_wait_block_bitmap:497 Dec 21 12:33:56 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
The upgrade to Vbox 6.0 doesn't fix this bug, on the Mac host, however. The problem still occurs there, rather quickly.
comment:6 by , 6 years ago
Actually, you did reproduce half of the problem -- it's that "reset SuperSpeed USB device number 2 using xhci_hcd" message. The difference is that on a Windows host, the VM / USB device / VirtualBox can recover from that situation, and on Linux or OS X hosts it can't.
I know exactly why the problem exists on Linux, it's because we only pretend to reset the USB device but don't really. On OS X I'm not sure why it's happening, I thought we should be really resetting the device but perhaps we aren't. I haven't looked at the OS X behavior in detail yet.
On Linux hosts, the question is how to fix it, and I don't have the answer yet though I have some ideas.
comment:7 by , 6 years ago
Michal,
Do you need any more data points? I'm on OSX 10.11.6 with plenty of VMs to test this on. And 2 USB3 HDs.
Also, since you have the power, could you edit the messages from 'Jimson' to include the logs/messages in {{{ ... }}} tags, to make reading a little bit easier? TIA.
Or you could do it too Jimson, they're your messages, you have the authority. Except the original one, the ticket report itself. As the poet once said, "U Can't Touch This!" ;)
comment:8 by , 6 years ago
Description: | modified (diff) |
---|
comment:9 by , 6 years ago
I have had zero luck reproducing the problem on my Mac. The guest never even tries to do the USB resets, so there's no problem. That's not to say there is no issue, just that I can't see it.
The latest test builds include a change in the Linux USB backend. When the guest asks for a USB device to be reset, it is now really reset (rather than ignoring the request as we did before). For me, that fixes the reset behavior -- that is to say, copying occasionally triggers USB resets but the guest recovers from those and continues, instead of losing the device. Feedback on the Linux builds would be useful.
comment:10 by , 6 years ago
Sorry, I missed some of the previous posts, until I noticed the changelog for 6.0.2 fixed some USB reset issues. I just finished installing 6.0.2 maintenance release on my Mac, and I'm able to easily reproduce the problem, while writing a large file...
Jan 15 10:49:52 lobi7 kernel: sd 4:0:0:0: [sdb] CDB: Write(10) 2a 00 07 38 18 00 00 04 00 00 Jan 15 10:49:52 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 121116672 Jan 15 10:49:52 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:302: I/O error -5 writing to inode 12 (offset 41162899456 size 8388608 starting block 15139072) Jan 15 10:49:52 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:52 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:53 lobi7 kernel: sd 4:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK Jan 15 10:49:53 lobi7 kernel: sd 4:0:0:0: [sdb] CDB: Write(10) 2a 00 07 38 1c 00 00 04 00 00 Jan 15 10:49:53 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 121117696 Jan 15 10:49:53 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:302: I/O error -5 writing to inode 12 (offset 41162899456 size 8388608 starting block 15139200) Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:54 lobi7 kernel: sd 4:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK Jan 15 10:49:54 lobi7 kernel: sd 4:0:0:0: [sdb] CDB: Write(10) 2a 00 07 38 20 00 00 04 00 00 Jan 15 10:49:54 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 121118720 Jan 15 10:49:54 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:302: I/O error -5 writing to inode 12 (offset 41162899456 size 8388608 starting block 15139328) Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:55 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:55 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:55 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:49:55 lobi7 kernel: sd 4:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK Jan 15 10:49:55 lobi7 kernel: sd 4:0:0:0: [sdb] CDB: Read(10) 28 00 07 00 10 78 00 00 08 00 Jan 15 10:49:55 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 117444728 Jan 15 10:49:55 lobi7 kernel: EXT4-fs error (device dm-3): ext4_wait_block_bitmap:516: comm kworker/u2:0: Cannot read block bitmap - block_group = 463, block_bitmap = 14680079 Jan 15 10:50:01 lobi7 systemd: Created slice User Slice of root. Jan 15 10:50:01 lobi7 systemd: Started Session 3 of user root. Jan 15 10:50:01 lobi7 systemd: Removed slice User Slice of root. Jan 15 10:50:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:50:57 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:51:07 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd Jan 15 10:51:15 lobi7 kernel: EXT4-fs (dm-3): Delayed block allocation failed for inode 12 at logical offset 10082304 with max blocks 103 with error 5 Jan 15 10:51:15 lobi7 kernel: EXT4-fs (dm-3): This should not happen!! Data will be lost Jan 15 10:51:15 lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Jan 15 10:51:20 lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 Jan 15 10:51:25 lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8 ...
One slight change in behavior is it recovered during the first round of these I/O errors and USB resets. However, shortly after, it hit a second round, and was unable to recover, and the dd stopped running ...
[ b8049a7e-c0ce-4739-a586-2bb6e05ec5e6]$ sudo dd if=/dev/zero of=./testfile status=progress bs=1024k [sudo] password for jdnissen: 59401830400 bytes (59 GB) copied, 817.845151 s, 72.6 MB/s dd: error writing ‘./testfile’: Read-only file system 56661+0 records in 56660+0 records out 59413278720 bytes (59 GB) copied, 1051.39 s, 56.5 MB/s
My Mac is now running 10.13.6.
follow-up: 12 comment:11 by , 6 years ago
Sorry if I wasn't clear. There is no change in behavior expected on OS X hosts, since there were no changes. Changes were on Linux hosts only.
comment:12 by , 6 years ago
Replying to michaln:
Sorry if I wasn't clear. There is no change in behavior expected on OS X hosts, since there were no changes. Changes were on Linux hosts only.
Michal, completely understand, but saw the notes on troubles reproducing this on OS X, and just wanted to state it is still easy for me to reproduce, on my OS X laptop.
comment:13 by , 5 years ago
I have observed the same issue with OS X host running Debian client using VBox 6.0.4. When external USB connected disk is in use by linux host it randomly resets with dimes showing:
usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
comment:14 by , 5 years ago
Just an update on this issue. Recently upgraded my Virtualbox (on Mac) to the recent 6.0.8, and retested this with both Oracle Linux 7.6 and Ubuntu 18.04. The 'dd' locked up on Ubuntu, but it took a while (a couple of 100's of GB). However, on OL7.6, the large file continued writing until the external USB was full, at 2TB. There were some USB resets and "hung thread" dumps, along the way, but it kept cranking.
I think the I/O hangs on Ubuntu were a reproduction of the issue, and think Ubuntu failed over OL because the I/O rate on it was much faster. Until the failure, Ubuntu was writing up to 80MB/s, where OL streamed around 45-60MB/s. Maybe this race condition occurred on Ubuntu because of the accelerated I/O rates.
comment:15 by , 5 years ago
What's the status of resolving this issue? It is preventing me from copying files from NTFS to Linux EXT4 drives.
comment:16 by , 5 years ago
No good news I'm afraid. I spent a lot of time on this problem and got absolutely nowhere. Customers complaining would help, as well as some much better reproduction scenario.
comment:17 by , 5 years ago
I can reproduce it 100%. With any Linux distro I tried. Run Linux in VB (macOS host) - connect external hdd to VB (via USB).
then run non destructive data test:
sudo badblocks -nsv /dev/sda
On Ubuntu/Dabian you can install it with:
sudo apt-get install e2fsprogs
I never managed to finish 1TB disk. It always sooner or later happens.
comment:18 by , 5 years ago
The problem isn't the "sooner", it's the "later". How long does a test need to be run to be certain the problem isn't there? When that takes 30+ minutes, the cycle gets *really* long and I could spend weeks on it and still get nowhere.
One thing I'm wondering about though, is this only happening on Macs or is no one using big external drives with any other host OS?
comment:19 by , 5 years ago
Hello I'm having exactly the same problem I am trying to securely initialise my encrypted hard drive by doing.
pv -tpreb /dev/zero | sudo dd of=/dev/mapper/data bs=128M
And is pretty random, after sometimes 20GB or sometimes 250GB the USB controller hangs, and the only options is to disconnect the external hard drive. I'm having exactly the same errors as @Jimson
I'm using MacOS Majove as host, and Ubuntu server 19.04 as guest.
follow-up: 21 comment:20 by , 4 years ago
I'm seeing this issue on a Windows 10 host with VB 6.10 64-bit with a 64-bit Fedora-32 guest. This is with no Extension Pack installed (the license doesn't seem to permit it in my situation) so I'm running at USB 1.1 speed.
Sometimes I can push 10 GB (slowly) through to an external hard drive, and other times the error occurs pretty quickly. When it occurs, the external filesystem gets unmounted. I can mount it and try the transfer again, and it seems to have about the same probability of success as before.
follow-up: 22 comment:21 by , 4 years ago
Replying to various:
I'm seeing this issue on a Windows 10 host with VB 6.10 64-bit with a 64-bit Fedora-32 guest. This is with no Extension Pack installed (the license doesn't seem to permit it in my situation) so I'm running at USB 1.1 speed.
Could you please attach a VBox.log from the VM that exhibits the problem, and also a snippet of the guest's dmesg output if there any errors logged there? At first glance that does not sound like the same problem, but if it is, it means we've been looking for the bug in the wrong place.
Also, what is the USB storage device that's being passed through?
comment:22 by , 4 years ago
Replying to michaln:
Replying to various:
I'm seeing this issue on a Windows 10 host with VB 6.10 64-bit with a 64-bit Fedora-32 guest. This is with no Extension Pack installed (the license doesn't seem to permit it in my situation) so I'm running at USB 1.1 speed.
Could you please attach a VBox.log from the VM that exhibits the problem, and also a snippet of the guest's dmesg output if there any errors logged there? At first glance that does not sound like the same problem, but if it is, it means we've been looking for the bug in the wrong place.
Also, what is the USB storage device that's being passed through?
I shouldn't copy files from that machine. But I'll try to describe what I see:
The storage device is a WD Passport rotating HDD. The VBox.log says:
VUSB: Attached '00000xxxxx[proxy 1058:07a8]' to port 3 on RootHub$0 (HighSpeed)
That's about the only entry that appears in that file when I connect the HDD.
The dmesg output is more extensive and suggestive. I can't type it all here, but some highlights include:
usb 1-3: new full-speed USB device number 4 using ohci-pci usb 1-3: config 1 interface 0 altsetting 0 endpoint 0x8B has invalid maxpacket 512, setting to 64 usb 1-3: config 1 interface 0 altsetting 0 endpoint 0xA has invalid maxpacket 512, setting to 64 usb 1-3: New USB device found, idVendor-1058, idProduct=007a8, bcdDevice=10.42 usb-storage: 1-3:1.0: USB Mass Storage device detected scsi host6: usb-storage 1-3:1.0 . . . scsi 6:0:0:1: Wrong diagnostic page; asked for 1 got 8 scsi 6:0:0:1: Faild to bind enclosure -19 sdc: sdc1 sdc2 sdc3 sd 6:0:0:0: [sdc] Attached SCSI disk ses 6:0:0:1: Attached Enclosure device . . about 430 seconds pass (during which data are flowing normally), and then: pktcdvd: pktcdvd0: ... some hex bytes... sens 04.0f.00 (Hardware error) pktcdvd: pktcdvd0: ... some hex bytes... sens ad.4b.af (INVALID) . . usb 1-3: reset full-speed USB device number 4 using ohci-pci usb 1-3: device firmware changed usb 1-3: USB disconnect, device number 4 blk_update_request: I/O error, dev sdc, sector xxxx op 0x1:(WRITE) flags 0x104000 phys_seg 9 prio class 0 (and 3 similar) . . . scsi 6:0:0:0:0: regecting I/O to dead device . . . usb 1-3: new full-speed USB device number 5 using ohci-pci
the sequence starts again, but without the errors since I was no longer trying to use the device.
Is that any help?
comment:23 by , 4 years ago
Hi,
I found this problem also. Host: New Macbook Pro (10.15.6). VB 6.1.10. Guest: new install of xubuntu lts. dd writes to the usb disk didn't consistently cause usb reset "crashes", but the problem happened within the first few GB of data written trying to use a 'zpool replace' operation from a local diskimage to a usb 3 disk during the resilver operation. The reset immediately prevented any further use of the usb disk until re-plug-in. I had to give up trying this. From what I can tell, relevant log entries on both sides are similar to those above, but I can provide log extracts and host/guest details if it might help.
Manny
comment:24 by , 2 years ago
I am experiencing this issue as well with VB 6.1.34. The host is an old MacBook Pro (A1502) running Big Sur 11.6.8 and the guest is openmediavault, which is debian based.
Thaison
comment:25 by , 2 years ago
I continue to hit this issue, even with newest Vbox 6.1.36, though the signatures are slightly different. Today, I retested with both GUI file drag copy and 'cp -R' from large set of files on shared folder to direct-mounted EXT4 USB drive. The file copy always fails with following errors:
[jdnissen@ol8 ~]$ sudo grep -i mount /var/log/messages Oct 13 09:16:43 ol8 kernel: EXT4-fs warning (device sdb1): ext4_clear_journal_err:5215: Filesystem error recorded from previous mount: IO failure Oct 13 09:16:43 ol8 kernel: EXT4-fs (sdb1): warning: mounting fs with errors, running e2fsck is recommended Oct 13 09:16:43 ol8 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: errors=remount-ro Oct 13 09:16:43 ol8 journal[941]: Mounted /dev/sdb1 at /run/media/someuser/EXT4 on behalf of uid 1000 Oct 13 09:51:30 ol8 kernel: EXT4-fs (sdb1): Remounting filesystem read-only Oct 13 09:51:50 ol8 journal[941]: Cleaning up mount point /run/media/someuser/EXT4 (device 8:17 no longer exists) Oct 13 09:51:50 ol8 systemd[1]: run-media-jdnissen-EXT4.mount: Succeeded. Oct 13 12:37:33 ol8 kernel: EXT4-fs warning (device sdb1): ext4_clear_journal_err:5215: Filesystem error recorded from previous mount: IO failure Oct 13 12:37:33 ol8 kernel: EXT4-fs (sdb1): warning: mounting fs with errors, running e2fsck is recommended Oct 13 12:37:33 ol8 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: errors=remount-ro Oct 13 12:37:33 ol8 journal[941]: Mounted /dev/sdb1 at /run/media/someuser/EXT4 on behalf of uid 1000 Oct 13 13:45:19 ol8 kernel: EXT4-fs (sdb1): Remounting filesystem read-only
The file copy will then give the following errors: [jdnissen@ol8 downloads]$ cp -R bigdir /run/media/someuser/EXT4 cp: error writing '/run/media/someuser/EXT4/<somefile>': Input/output error cp: cannot unlink ‘/run/media/someuser/EXT4/<nextfile>: Read-only file system cp: cannot create regular file '/run/media/someuser/EXT4/<nextfile+1>': Read-only file system ...
The filesystem is automatically mounted with the 'errors=remount-ro' option, so the read-only errors are a result of Linux remounting the disk due to the I/O errors, but why do the large I/O demands result in I/O errors only on Vbox VMs?
$ mount ... dev/sdb1 on /run/media/someuser/EXT4 type ext4 (ro,nosuid,nodev,relatime,seclabel,errors=remount-ro,uhelper=udisks2)'
comment:26 by , 2 years ago
Same sort of resets as back when I filed this bug...
Oct 13 13:45:10 ol8 kernel: usb 2-1: reset SuperSpeed Gen 1 USB device number 5 using xhci_hcd Oct 13 13:45:10 ol8 kernel: usb 2-1: reset SuperSpeed Gen 1 USB device number 5 using xhci_hcd Oct 13 13:45:10 ol8 kernel: usb 2-1: reset SuperSpeed Gen 1 USB device number 5 using xhci_hcd Oct 13 13:45:11 ol8 kernel: sd 3:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK Oct 13 13:45:11 ol8 kernel: sd 3:0:0:0: [sdb] tag#0 CDB: Write(10) 2a 00 1f 2a 44 68 00 02 00 00 Oct 13 13:45:11 ol8 kernel: blk_update_request: I/O error, dev sdb, sector 522863720 op 0x1:(WRITE) flags 0x4000 phys_seg 64 prio class 0 Oct 13 13:45:11 ol8 kernel: EXT4-fs warning (device sdb1): ext4_end_bio:317: I/O error 10 writing to inode 97517639 (offset 687865856 size 8388608 starting block 65358029) Oct 13 13:45:11 ol8 kernel: buffer_io_error: 597 callbacks suppressed Oct 13 13:45:11 ol8 kernel: Buffer I/O error on device sdb1, logical block 65357517 Oct 13 13:45:11 ol8 kernel: Buffer I/O error on device sdb1, logical block 65357518
Is there any way someone can pick this bug up and look into it?
comment:27 by , 2 years ago
same problem in virtualbox 6.1.40, win11 host, archlinux guest.
10月 28 00:18:40 PeterWang-ArchLinux kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd 10月 28 00:18:41 PeterWang-ArchLinux kernel: usb 2-1: Enable of device-initiated U1 failed. 10月 28 00:18:41 PeterWang-ArchLinux kernel: usb 2-1: Enable of device-initiated U2 failed.
Note: Guest system is installed in my external ssd for some specific reasons. I passed ssd to VM via usb filter and it was reset when system had a relatively high I/O.
comment:28 by , 2 years ago
I'm not certain, but I suspect what I've seen may be related:
Running a Linux OS in a VB 6.1.x (many versions, up to the current x=40) with the Guest Additions installed (but not the Extension Pack), I've noticed lots of little glitches with USB interactions, including I/O errors of USB disks when the USB data were passed through the host to the guest. These manifest as general sluggishness and unreliability. In particular, during large file transfers an I/O error would be reported in the log of the guest OS, and the USB disk would be unmounted. It would be possible to re-mount the USB disk, so it's more a matter of flakiness than a hard error.
A USB mouse connected to the host OS and passed to the guest OS as an emulated USB tablet or USB mouse had poor performance, too: Scroll actions especially would sometimes not reach the guest, and recently I noticed that button-release events failed to be received if they followed mouse-motion events (so, e.g., if I dragged a window by its titlebar, released the mouse button, and then moved the mouse, the window would make a little jump before the guest recognized that the mouse button was no longer pressed). I confirmed this behavior using 'xev' on the guest.
Finally, the host had been plagued with sporadic bluescreen crashes that didn't seem related to any user activity on either the guest or the host. The 'stop codes' were not consistent from event to event until late October 2022, after which they were all "Faulty Hardware Corrupted Page".
All of these symptoms disappeared when I changed the emulated mouse from USB to PS/2. Now the guest reacts properly to mouse input, large file transfers over USB disks mounted on the guest OS complete as expected (though slowly, since they're still using USB1), and the host hasn't bluescreened in weeks.
So I suspect that an issue in the Guest Additions is corrupting the USB data, either by dropping USB packets, or by injecting a lot of junk packets that interfere with normal USB transfers and sometimes confuses the host OS kernel to the point it crashes.
UPDATE: Windows bluescreened again, so I was apparently optimistic that VB was the source of those crashes and that adjusting the mouse configuration would cure them.
comment:29 by , 2 weeks ago
I think this issue may be resolved now. Currently running MacOS 14.7 with Virtualbox 7.1.4, and Oracle Linux 9 VM. Running the 'dd if=/dev/zero' test that immediately replicated the issue with no longer slows to a crawl, and isn't accompanied with SCSI/USB resets on the Oracle Linux system.
Not sure what combo of MacOS+VBox+OL9 fixed it.
comment:30 by , 2 weeks ago
Just tested this with Oracle Linux 8, same MacOS and Vbox versions, and had the same results, so this issue is definitely fixed now.
comment:31 by , 12 days ago
Thanks for the update! Yes, this problem was fixed several months ago. A customer ran into what I'm pretty sure was the same problem with a DVD writer and was able to send us hardware to reproduce.
The bug turned out to be caused by missing notifications from the guest. There was a race condition, but the window was small enough that on many systems the problem was extremely difficult or impossible to reproduce. But with the right hardware, it happened often enough.
The bug had nothing to do with the type of device, or even USB2 vs USB3, it could affect any device. But of course the more USB traffic, the more chances for the bug to show up. One key ingredient was using only one single USB device, because additional USB traffic would cancel the effect of the bug.
comment:32 by , 12 days ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Made several attempts at reproducing this issue on a non-Mac host, using the same VM exported/imported. However, my attempts at reproducing on two Windows 10 hosts failed, as the device wouldn't even mount in the VM, on either, with following mount attempt errors...
I ran out of time to troubleshoot this Windows 10 Vbox USB issue, so made another attempt on an old Dell E7440 running Oracle Linux 7.6, and though it took much longer, the USB resets occurred there, too...
Log is attached.