VirtualBox

Opened 6 years ago

Closed 12 days ago

#18176 closed defect (fixed)

USB resets and hangs during high I/O

Reported by: Jimson Owned by:
Component: USB Version: VirtualBox 5.2.22
Keywords: usb reset superspeed Cc:
Guest type: Linux Host type: Mac OS X

Description (last modified by michaln)

Experiencing USB reset and hangs on Virtualbox 5.2.22 on Macbook Pro 2017 10.13.6, on Oracle Linux 7.5 and Ubuntu 18.04 VMs. I'm able to replicate the problem, very quickly, by running the following:

$ sudo dd if=/dev/zero of=./testfile status=progress bs=1024k
639631360 bytes (640 MB) copied, 5.448752 s, 117 MB/s

It starts out pretty fast, but once the reset occurs, the speeds taper down to < 50MB/s. The OL system log shows the resets and I/O errors as:

Dec 10 15:55:28 jdnissen-lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 10 15:55:38 jdnissen-lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 10 15:55:44 jdnissen-lobi7 systemd: Got automount request for /proc/sys/fs/binfmt_misc, triggered by 1858 (cma)
Dec 10 15:55:44 jdnissen-lobi7 systemd: Mounting Arbitrary Executable File Formats File System...
Dec 10 15:55:45 jdnissen-lobi7 systemd: Mounted Arbitrary Executable File Formats File System.
Dec 10 15:55:46 jdnissen-lobi7 kernel: EXT4-fs (dm-3): Delayed block allocation failed for inode 12 at logical offset 546816 with max blocks 2048 with error 5
Dec 10 15:55:46 jdnissen-lobi7 kernel: EXT4-fs (dm-3): This should not happen!! Data will be lost
Dec 10 15:55:46 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:55:51 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:55:57 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:56:01 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:56:07 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:56:12 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:56:17 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:56:22 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:56:27 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Dec 10 15:56:32 jdnissen-lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8

I've attempted to workaround this with the new test build 6.0.0 RC1, and it didn't help. OL and Ubuntu updates don't help, either.

It's happening on two different USB drives: Seagate 2TB and Western Digital 2TB. The seagate is formatted as NTFS and the WD is Linux EXT4.

In addition, I can attach the drives to an old laptop running bare-metal OL7.2, and it works just fine...no resets, hangs, or slow I/Os, during the 'dd' (or file copies).

I work for Oracle, and this problem is preventing me from being able to copy large patch bundles, used during customer installs.

Attachments (2)

VBox.log (112.4 KB ) - added by Jimson 6 years ago.
Oracle Linux 7.2 OBI-2018-12-12-09-33-55.log (83.2 KB ) - added by Jimson 6 years ago.

Download all attachments as: .zip

Change History (34)

by Jimson, 6 years ago

Attachment: VBox.log added

comment:1 by Jimson, 6 years ago

Made several attempts at reproducing this issue on a non-Mac host, using the same VM exported/imported. However, my attempts at reproducing on two Windows 10 hosts failed, as the device wouldn't even mount in the VM, on either, with following mount attempt errors...

Dec 11 12:05:50 jdnissen-lobi7 kernel: usb 1-1: new high-speed USB device number 2 using xhci_hcd
Dec 11 12:05:50 jdnissen-lobi7 kernel: usb 1-1: device descriptor read/64, error 18
Dec 11 12:05:51 jdnissen-lobi7 kernel: usb 1-1: device descriptor read/64, error 18
Dec 11 12:05:51 jdnissen-lobi7 kernel: usb 1-1: new high-speed USB device number 3 using xhci_hcd
Dec 11 12:05:51 jdnissen-lobi7 kernel: usb 1-1: device descriptor read/64, error 18
Dec 11 12:05:51 jdnissen-lobi7 kernel: usb 1-1: device descriptor read/64, error 18
Dec 11 12:05:52 jdnissen-lobi7 kernel: usb 1-1: new high-speed USB device number 4 using xhci_hcd
Dec 11 12:05:52 jdnissen-lobi7 kernel: usb 1-1: Invalid ep0 maxpacket: 9
Dec 11 12:05:52 jdnissen-lobi7 kernel: usb 1-1: new high-speed USB device number 5 using xhci_hcd
Dec 11 12:05:52 jdnissen-lobi7 kernel: usb 1-1: Invalid ep0 maxpacket: 9
Dec 11 12:05:52 jdnissen-lobi7 kernel: usb usb1-port1: unable to enumerate USB device

I ran out of time to troubleshoot this Windows 10 Vbox USB issue, so made another attempt on an old Dell E7440 running Oracle Linux 7.6, and though it took much longer, the USB resets occurred there, too...

Dec 12 08:54:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 12 08:54:23 lobi7 kernel: sd 0:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Dec 12 08:54:23 lobi7 kernel: sd 0:0:0:0: [sdb] CDB: ATA command pass through(16) 85 06 2c 00 00 00 00 00 00 00 00 00 00 00 e5 00
Dec 12 08:54:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 12 08:54:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 12 08:54:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 12 08:54:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 12 08:54:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 12 08:54:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd

[lobi7 ~]$ cd /run/media/jdnissen/b8049a7e-c0ce-4739-a586-2bb6e05ec5e6/
[lobi7 b8049a7e-c0ce-4739-a586-2bb6e05ec5e6]$ sudo dd if=/dev/zero of=./testfile status=progress bs=1024k
1393245028352 bytes (1.4 TB) copied, 13969.899470 s, 99.7 MB/s
dd: error writing ‘./testfile’: Read-only file system
1328739+0 records in
1328738+0 records out
1393283014656 bytes (1.4 TB) copied, 14164.6 s, 98.4 MB/s

Log is attached.

in reply to:  description comment:2 by Jimson, 6 years ago

The original errors in code-blocks for readability...

Dec  7 10:09:50 lobi7 kernel: usb 2-1: new SuperSpeed USB device number 4 using xhci_hcd
Dec  7 10:09:50 lobi7 kernel: usb 2-1: New USB device found, idVendor=1058, idProduct=25a2
Dec  7 10:09:50 lobi7 kernel: usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Dec  7 10:09:50 lobi7 kernel: usb 2-1: Product: Elements 25A2
Dec  7 10:09:50 lobi7 kernel: usb 2-1: Manufacturer: Western Digital
Dec  7 10:09:50 lobi7 kernel: usb 2-1: SerialNumber: 575854314536384353524154
Dec  7 10:09:50 lobi7 kernel: usb-storage 2-1:1.0: USB Mass Storage device detected
Dec  7 10:09:50 lobi7 kernel: scsi host6: usb-storage 2-1:1.0
Dec  7 10:09:50 lobi7 mtp-probe: checking bus 2, device 4: "/sys/devices/pci0000:00/0000:00:0c.0/usb2/2-1"
Dec  7 10:09:50 lobi7 mtp-probe: bus: 2, device: 4 was not an MTP device
Dec  7 10:09:51 lobi7 kernel: scsi 6:0:0:0: Direct-Access     WD       Elements 25A2    1021 PQ: 0 ANSI: 6
Dec  7 10:09:51 lobi7 kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0
Dec  7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] 3906963456 512-byte logical blocks: (2.00 TB/1.81 TiB)
Dec  7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] Write Protect is off
Dec  7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] No Caching mode page found
Dec  7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] Assuming drive cache: write through
Dec  7 10:09:51 lobi7 kernel: sd 6:0:0:0: [sdc] Attached SCSI disk
...
Dec  7 09:47:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:23 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:24 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:24 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:24 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:24 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:24 lobi7 kernel: sd 5:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Dec  7 09:47:24 lobi7 kernel: sd 5:0:0:0: [sdb] CDB: Write(10) 2a 00 07 4e c4 00 00 04 00 00
Dec  7 09:47:24 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 122602496
Dec  7 09:47:24 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:316: I/O error -5 writing to inode 12 (offset 41783656448 size 8388608 starting block 15324800)
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324800
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324801
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324802
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324803
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324804
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324805
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324806
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324807
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324808
Dec  7 09:47:24 lobi7 kernel: Buffer I/O error on device dm-3, logical block 15324809
…
Dec  7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd
Dec  7 09:47:26 lobi7 kernel: sd 5:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Dec  7 09:47:26 lobi7 kernel: sd 5:0:0:0: [sdb] CDB: Write(10) 2a 00 07 4e d0 00 00 04 00 00
Dec  7 09:47:26 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 122605568
Dec  7 09:47:26 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:316: I/O error -5 writing to inode 12 (offset 41792045056 size 8388608 starting block 15325184)
Version 0, edited 6 years ago by Jimson (next)

comment:3 by michaln, 6 years ago

I was able to reproduce this... and then I wasn't. But it certainly didn't happen because anything got intentionally fixed.

The underlying problem unfortunately appears to be some race condition which is inherently very configuration specific and at least for me, currently so hard to reproduce that I can't meaningfully try to fix it. I can read hundreds of gigabytes at ~180 MB/s, no problem, the resets just aren't happening. That doesn't mean the resets won't happen on other machines, but at the moment I don't have those other machines.

Please try again with a Windows host, and report any issues encountered. I believe the problem is much less severe on a Windows host because it only slows things down but does not cause I/O errors.

If you find something that makes the errors more frequent (number of virtual CPUs? idle/busy host? VM memory size?), please add a comment here.

comment:4 by michaln, 6 years ago

Summary: USB 3.0 resets and hangs during high I/OUSB resets and hangs during high I/O

I'll add that the problem is not USB 3.0 specific. There is some probability involved, and the error may be more likely to show up as the number of USB transactions go up, so with fast devices the error probably just shows up quicker.

There is definitely a problem in the VirtualBox Linux USB proxy in that it does now allow USB devices to be truly reset. It looks like the USB device may get sufficiently confused that it simply stops responding (that could in fact be triggering the initial reset attempt) and because we do not really reset it, the USB device stays confused until the guest OS gives up on it.

comment:5 by Jimson, 6 years ago

My attempts at reproducing this problem on two home Windows 10 computers failed, at first, I think due to bug 84741. However, an upgrade to Vbox 6.0 fixed that, and interestingly, I have been unable to reproduce this bug on Windows 10 hosts running Vbox 6.0, running the same Oracle Linux VM. I have read a large 450GB file and wrote a large 1.7TB file, until the device filled up. However, there were a few minor USB resets:

Dec 21 11:21:38 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Dec 21 12:33:36 lobi7 kernel: EXT4-fs (dm-3): error count since last fsck: 5
Dec 21 12:33:36 lobi7 kernel: EXT4-fs (dm-3): initial error at time 1544198049: ext4_journal_check_start:56
Dec 21 12:33:36 lobi7 kernel: EXT4-fs (dm-3): last error at time 1544478866: ext4_wait_block_bitmap:497
Dec 21 12:33:56 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 3 using xhci_hcd 

The upgrade to Vbox 6.0 doesn't fix this bug, on the Mac host, however. The problem still occurs there, rather quickly.

Last edited 6 years ago by michaln (previous) (diff)

comment:6 by michaln, 6 years ago

Actually, you did reproduce half of the problem -- it's that "reset SuperSpeed USB device number 2 using xhci_hcd" message. The difference is that on a Windows host, the VM / USB device / VirtualBox can recover from that situation, and on Linux or OS X hosts it can't.

I know exactly why the problem exists on Linux, it's because we only pretend to reset the USB device but don't really. On OS X I'm not sure why it's happening, I thought we should be really resetting the device but perhaps we aren't. I haven't looked at the OS X behavior in detail yet.

On Linux hosts, the question is how to fix it, and I don't have the answer yet though I have some ideas.

comment:7 by Socratis, 6 years ago

Michal,

Do you need any more data points? I'm on OSX 10.11.6 with plenty of VMs to test this on. And 2 USB3 HDs.

Also, since you have the power, could you edit the messages from 'Jimson' to include the logs/messages in {{{ ... }}} tags, to make reading a little bit easier? TIA.

Or you could do it too Jimson, they're your messages, you have the authority. Except the original one, the ticket report itself. As the poet once said, "U Can't Touch This!" ;)

comment:8 by michaln, 6 years ago

Description: modified (diff)

comment:9 by michaln, 6 years ago

I have had zero luck reproducing the problem on my Mac. The guest never even tries to do the USB resets, so there's no problem. That's not to say there is no issue, just that I can't see it.

The latest test builds include a change in the Linux USB backend. When the guest asks for a USB device to be reset, it is now really reset (rather than ignoring the request as we did before). For me, that fixes the reset behavior -- that is to say, copying occasionally triggers USB resets but the guest recovers from those and continues, instead of losing the device. Feedback on the Linux builds would be useful.

comment:10 by Jimson, 6 years ago

Sorry, I missed some of the previous posts, until I noticed the changelog for 6.0.2 fixed some USB reset issues. I just finished installing 6.0.2 maintenance release on my Mac, and I'm able to easily reproduce the problem, while writing a large file...

Jan 15 10:49:52 lobi7 kernel: sd 4:0:0:0: [sdb] CDB: Write(10) 2a 00 07 38 18 00 00 04 00 00
Jan 15 10:49:52 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 121116672
Jan 15 10:49:52 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:302: I/O error -5 writing to inode 12 (offset 41162899456 size 8388608 starting block 15139072)
Jan 15 10:49:52 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:52 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:53 lobi7 kernel: sd 4:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Jan 15 10:49:53 lobi7 kernel: sd 4:0:0:0: [sdb] CDB: Write(10) 2a 00 07 38 1c 00 00 04 00 00
Jan 15 10:49:53 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 121117696
Jan 15 10:49:53 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:302: I/O error -5 writing to inode 12 (offset 41162899456 size 8388608 starting block 15139200)
Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:53 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:54 lobi7 kernel: sd 4:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Jan 15 10:49:54 lobi7 kernel: sd 4:0:0:0: [sdb] CDB: Write(10) 2a 00 07 38 20 00 00 04 00 00
Jan 15 10:49:54 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 121118720
Jan 15 10:49:54 lobi7 kernel: EXT4-fs warning (device dm-3): ext4_end_bio:302: I/O error -5 writing to inode 12 (offset 41162899456 size 8388608 starting block 15139328)
Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:54 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:55 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:55 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:55 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:49:55 lobi7 kernel: sd 4:0:0:0: [sdb] FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Jan 15 10:49:55 lobi7 kernel: sd 4:0:0:0: [sdb] CDB: Read(10) 28 00 07 00 10 78 00 00 08 00
Jan 15 10:49:55 lobi7 kernel: blk_update_request: I/O error, dev sdb, sector 117444728
Jan 15 10:49:55 lobi7 kernel: EXT4-fs error (device dm-3): ext4_wait_block_bitmap:516: comm kworker/u2:0: Cannot read block bitmap - block_group = 463, block_bitmap = 14680079
Jan 15 10:50:01 lobi7 systemd: Created slice User Slice of root.
Jan 15 10:50:01 lobi7 systemd: Started Session 3 of user root.
Jan 15 10:50:01 lobi7 systemd: Removed slice User Slice of root.
Jan 15 10:50:26 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:50:57 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:51:07 lobi7 kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
Jan 15 10:51:15 lobi7 kernel: EXT4-fs (dm-3): Delayed block allocation failed for inode 12 at logical offset 10082304 with max blocks 103 with error 5
Jan 15 10:51:15 lobi7 kernel: EXT4-fs (dm-3): This should not happen!! Data will be lost
Jan 15 10:51:15 lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Jan 15 10:51:20 lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
Jan 15 10:51:25 lobi7 kernel: JBD2: Detected IO errors while flushing file data on dm-3-8
...

One slight change in behavior is it recovered during the first round of these I/O errors and USB resets. However, shortly after, it hit a second round, and was unable to recover, and the dd stopped running ...

[ b8049a7e-c0ce-4739-a586-2bb6e05ec5e6]$ sudo dd if=/dev/zero of=./testfile status=progress bs=1024k
[sudo] password for jdnissen: 
59401830400 bytes (59 GB) copied, 817.845151 s, 72.6 MB/s
dd: error writing ‘./testfile’: Read-only file system
56661+0 records in
56660+0 records out
59413278720 bytes (59 GB) copied, 1051.39 s, 56.5 MB/s

My Mac is now running 10.13.6.

comment:11 by michaln, 6 years ago

Sorry if I wasn't clear. There is no change in behavior expected on OS X hosts, since there were no changes. Changes were on Linux hosts only.

in reply to:  11 comment:12 by Jimson, 6 years ago

Replying to michaln:

Sorry if I wasn't clear. There is no change in behavior expected on OS X hosts, since there were no changes. Changes were on Linux hosts only.

Michal, completely understand, but saw the notes on troubles reproducing this on OS X, and just wanted to state it is still easy for me to reproduce, on my OS X laptop.

comment:13 by Dariusz, 5 years ago

I have observed the same issue with OS X host running Debian client using VBox 6.0.4. When external USB connected disk is in use by linux host it randomly resets with dimes showing:

usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd

Last edited 5 years ago by Dariusz (previous) (diff)

comment:14 by Jimson, 5 years ago

Just an update on this issue. Recently upgraded my Virtualbox (on Mac) to the recent 6.0.8, and retested this with both Oracle Linux 7.6 and Ubuntu 18.04. The 'dd' locked up on Ubuntu, but it took a while (a couple of 100's of GB). However, on OL7.6, the large file continued writing until the external USB was full, at 2TB. There were some USB resets and "hung thread" dumps, along the way, but it kept cranking.

I think the I/O hangs on Ubuntu were a reproduction of the issue, and think Ubuntu failed over OL because the I/O rate on it was much faster. Until the failure, Ubuntu was writing up to 80MB/s, where OL streamed around 45-60MB/s. Maybe this race condition occurred on Ubuntu because of the accelerated I/O rates.

comment:15 by Jimson, 5 years ago

What's the status of resolving this issue? It is preventing me from copying files from NTFS to Linux EXT4 drives.

comment:16 by michaln, 5 years ago

No good news I'm afraid. I spent a lot of time on this problem and got absolutely nowhere. Customers complaining would help, as well as some much better reproduction scenario.

comment:17 by Dariusz, 5 years ago

I can reproduce it 100%. With any Linux distro I tried. Run Linux in VB (macOS host) - connect external hdd to VB (via USB).

then run non destructive data test:

sudo badblocks -nsv /dev/sda

On Ubuntu/Dabian you can install it with:

sudo apt-get install e2fsprogs

I never managed to finish 1TB disk. It always sooner or later happens.

comment:18 by michaln, 5 years ago

The problem isn't the "sooner", it's the "later". How long does a test need to be run to be certain the problem isn't there? When that takes 30+ minutes, the cycle gets *really* long and I could spend weeks on it and still get nowhere.

One thing I'm wondering about though, is this only happening on Macs or is no one using big external drives with any other host OS?

Last edited 5 years ago by michaln (previous) (diff)

comment:19 by Juan-Se, 5 years ago

Hello I'm having exactly the same problem I am trying to securely initialise my encrypted hard drive by doing.

pv -tpreb /dev/zero | sudo dd of=/dev/mapper/data bs=128M

And is pretty random, after sometimes 20GB or sometimes 250GB the USB controller hangs, and the only options is to disconnect the external hard drive. I'm having exactly the same errors as @Jimson

I'm using MacOS Majove as host, and Ubuntu server 19.04 as guest.

comment:20 by various, 4 years ago

I'm seeing this issue on a Windows 10 host with VB 6.10 64-bit with a 64-bit Fedora-32 guest. This is with no Extension Pack installed (the license doesn't seem to permit it in my situation) so I'm running at USB 1.1 speed.

Sometimes I can push 10 GB (slowly) through to an external hard drive, and other times the error occurs pretty quickly. When it occurs, the external filesystem gets unmounted. I can mount it and try the transfer again, and it seems to have about the same probability of success as before.

in reply to:  20 ; comment:21 by michaln, 4 years ago

Replying to various:

I'm seeing this issue on a Windows 10 host with VB 6.10 64-bit with a 64-bit Fedora-32 guest. This is with no Extension Pack installed (the license doesn't seem to permit it in my situation) so I'm running at USB 1.1 speed.

Could you please attach a VBox.log from the VM that exhibits the problem, and also a snippet of the guest's dmesg output if there any errors logged there? At first glance that does not sound like the same problem, but if it is, it means we've been looking for the bug in the wrong place.

Also, what is the USB storage device that's being passed through?

in reply to:  21 comment:22 by various, 4 years ago

Replying to michaln:

Replying to various:

I'm seeing this issue on a Windows 10 host with VB 6.10 64-bit with a 64-bit Fedora-32 guest. This is with no Extension Pack installed (the license doesn't seem to permit it in my situation) so I'm running at USB 1.1 speed.

Could you please attach a VBox.log from the VM that exhibits the problem, and also a snippet of the guest's dmesg output if there any errors logged there? At first glance that does not sound like the same problem, but if it is, it means we've been looking for the bug in the wrong place.

Also, what is the USB storage device that's being passed through?

I shouldn't copy files from that machine. But I'll try to describe what I see:

The storage device is a WD Passport rotating HDD. The VBox.log says:

  VUSB: Attached '00000xxxxx[proxy 1058:07a8]' to port 3 on RootHub$0 (HighSpeed)

That's about the only entry that appears in that file when I connect the HDD.

The dmesg output is more extensive and suggestive. I can't type it all here, but some highlights include:

usb 1-3: new full-speed USB device number 4 using ohci-pci
usb 1-3: config 1 interface 0 altsetting 0 endpoint 0x8B has invalid maxpacket 512, setting to 64
usb 1-3: config 1 interface 0 altsetting 0 endpoint 0xA has invalid maxpacket 512, setting to 64
usb 1-3: New USB device found, idVendor-1058, idProduct=007a8, bcdDevice=10.42
usb-storage: 1-3:1.0: USB Mass Storage device detected
scsi host6: usb-storage 1-3:1.0
.
.
.
scsi 6:0:0:1: Wrong diagnostic page; asked for 1 got 8
scsi 6:0:0:1: Faild to bind enclosure -19
sdc: sdc1 sdc2 sdc3
sd 6:0:0:0: [sdc] Attached SCSI disk
ses 6:0:0:1: Attached Enclosure device
.
.
about 430 seconds pass (during which data are flowing normally), and then:

pktcdvd: pktcdvd0: ... some hex bytes... sens 04.0f.00 (Hardware error)
pktcdvd: pktcdvd0: ... some hex bytes... sens ad.4b.af (INVALID)
.
.
usb 1-3: reset full-speed USB device number 4 using ohci-pci
usb 1-3: device firmware changed
usb 1-3: USB disconnect, device number 4
blk_update_request: I/O error, dev sdc, sector xxxx op 0x1:(WRITE) flags 0x104000 phys_seg 9 prio class 0
(and 3 similar)
.
.
.
scsi 6:0:0:0:0: regecting I/O to dead device
.
.
.
usb 1-3: new full-speed USB device number 5 using ohci-pci

the sequence starts again, but without the errors since I was no longer trying to use the device.

Is that any help?

Last edited 4 years ago by various (previous) (diff)

comment:23 by Manny, 4 years ago

Hi,

I found this problem also. Host: New Macbook Pro (10.15.6). VB 6.1.10. Guest: new install of xubuntu lts. dd writes to the usb disk didn't consistently cause usb reset "crashes", but the problem happened within the first few GB of data written trying to use a 'zpool replace' operation from a local diskimage to a usb 3 disk during the resilver operation. The reset immediately prevented any further use of the usb disk until re-plug-in. I had to give up trying this. From what I can tell, relevant log entries on both sides are similar to those above, but I can provide log extracts and host/guest details if it might help.

Manny

Last edited 4 years ago by Manny (previous) (diff)

comment:24 by thaison, 2 years ago

I am experiencing this issue as well with VB 6.1.34. The host is an old MacBook Pro (A1502) running Big Sur 11.6.8 and the guest is openmediavault, which is debian based.

Thaison

comment:25 by Jimson, 2 years ago

I continue to hit this issue, even with newest Vbox 6.1.36, though the signatures are slightly different. Today, I retested with both GUI file drag copy and 'cp -R' from large set of files on shared folder to direct-mounted EXT4 USB drive. The file copy always fails with following errors:

[jdnissen@ol8 ~]$ sudo grep -i mount /var/log/messages Oct 13 09:16:43 ol8 kernel: EXT4-fs warning (device sdb1): ext4_clear_journal_err:5215: Filesystem error recorded from previous mount: IO failure Oct 13 09:16:43 ol8 kernel: EXT4-fs (sdb1): warning: mounting fs with errors, running e2fsck is recommended Oct 13 09:16:43 ol8 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: errors=remount-ro Oct 13 09:16:43 ol8 journal[941]: Mounted /dev/sdb1 at /run/media/someuser/EXT4 on behalf of uid 1000 Oct 13 09:51:30 ol8 kernel: EXT4-fs (sdb1): Remounting filesystem read-only Oct 13 09:51:50 ol8 journal[941]: Cleaning up mount point /run/media/someuser/EXT4 (device 8:17 no longer exists) Oct 13 09:51:50 ol8 systemd[1]: run-media-jdnissen-EXT4.mount: Succeeded. Oct 13 12:37:33 ol8 kernel: EXT4-fs warning (device sdb1): ext4_clear_journal_err:5215: Filesystem error recorded from previous mount: IO failure Oct 13 12:37:33 ol8 kernel: EXT4-fs (sdb1): warning: mounting fs with errors, running e2fsck is recommended Oct 13 12:37:33 ol8 kernel: EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: errors=remount-ro Oct 13 12:37:33 ol8 journal[941]: Mounted /dev/sdb1 at /run/media/someuser/EXT4 on behalf of uid 1000 Oct 13 13:45:19 ol8 kernel: EXT4-fs (sdb1): Remounting filesystem read-only

The file copy will then give the following errors: [jdnissen@ol8 downloads]$ cp -R bigdir /run/media/someuser/EXT4 cp: error writing '/run/media/someuser/EXT4/<somefile>': Input/output error cp: cannot unlink ‘/run/media/someuser/EXT4/<nextfile>: Read-only file system cp: cannot create regular file '/run/media/someuser/EXT4/<nextfile+1>': Read-only file system ...

The filesystem is automatically mounted with the 'errors=remount-ro' option, so the read-only errors are a result of Linux remounting the disk due to the I/O errors, but why do the large I/O demands result in I/O errors only on Vbox VMs?

$ mount ... dev/sdb1 on /run/media/someuser/EXT4 type ext4 (ro,nosuid,nodev,relatime,seclabel,errors=remount-ro,uhelper=udisks2)'

comment:26 by Jimson, 2 years ago

Same sort of resets as back when I filed this bug...

Oct 13 13:45:10 ol8 kernel: usb 2-1: reset SuperSpeed Gen 1 USB device number 5 using xhci_hcd Oct 13 13:45:10 ol8 kernel: usb 2-1: reset SuperSpeed Gen 1 USB device number 5 using xhci_hcd Oct 13 13:45:10 ol8 kernel: usb 2-1: reset SuperSpeed Gen 1 USB device number 5 using xhci_hcd Oct 13 13:45:11 ol8 kernel: sd 3:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK Oct 13 13:45:11 ol8 kernel: sd 3:0:0:0: [sdb] tag#0 CDB: Write(10) 2a 00 1f 2a 44 68 00 02 00 00 Oct 13 13:45:11 ol8 kernel: blk_update_request: I/O error, dev sdb, sector 522863720 op 0x1:(WRITE) flags 0x4000 phys_seg 64 prio class 0 Oct 13 13:45:11 ol8 kernel: EXT4-fs warning (device sdb1): ext4_end_bio:317: I/O error 10 writing to inode 97517639 (offset 687865856 size 8388608 starting block 65358029) Oct 13 13:45:11 ol8 kernel: buffer_io_error: 597 callbacks suppressed Oct 13 13:45:11 ol8 kernel: Buffer I/O error on device sdb1, logical block 65357517 Oct 13 13:45:11 ol8 kernel: Buffer I/O error on device sdb1, logical block 65357518

Is there any way someone can pick this bug up and look into it?

comment:27 by PeterWang-dev, 2 years ago

same problem in virtualbox 6.1.40, win11 host, archlinux guest.

10月 28 00:18:40 PeterWang-ArchLinux kernel: usb 2-1: reset SuperSpeed USB device number 2 using xhci_hcd
10月 28 00:18:41 PeterWang-ArchLinux kernel: usb 2-1: Enable of device-initiated U1 failed.
10月 28 00:18:41 PeterWang-ArchLinux kernel: usb 2-1: Enable of device-initiated U2 failed.

Note: Guest system is installed in my external ssd for some specific reasons. I passed ssd to VM via usb filter and it was reset when system had a relatively high I/O.

comment:28 by various, 2 years ago

I'm not certain, but I suspect what I've seen may be related:

Running a Linux OS in a VB 6.1.x (many versions, up to the current x=40) with the Guest Additions installed (but not the Extension Pack), I've noticed lots of little glitches with USB interactions, including I/O errors of USB disks when the USB data were passed through the host to the guest. These manifest as general sluggishness and unreliability. In particular, during large file transfers an I/O error would be reported in the log of the guest OS, and the USB disk would be unmounted. It would be possible to re-mount the USB disk, so it's more a matter of flakiness than a hard error.

A USB mouse connected to the host OS and passed to the guest OS as an emulated USB tablet or USB mouse had poor performance, too: Scroll actions especially would sometimes not reach the guest, and recently I noticed that button-release events failed to be received if they followed mouse-motion events (so, e.g., if I dragged a window by its titlebar, released the mouse button, and then moved the mouse, the window would make a little jump before the guest recognized that the mouse button was no longer pressed). I confirmed this behavior using 'xev' on the guest.

Finally, the host had been plagued with sporadic bluescreen crashes that didn't seem related to any user activity on either the guest or the host. The 'stop codes' were not consistent from event to event until late October 2022, after which they were all "Faulty Hardware Corrupted Page".

All of these symptoms disappeared when I changed the emulated mouse from USB to PS/2. Now the guest reacts properly to mouse input, large file transfers over USB disks mounted on the guest OS complete as expected (though slowly, since they're still using USB1), and the host hasn't bluescreened in weeks.

So I suspect that an issue in the Guest Additions is corrupting the USB data, either by dropping USB packets, or by injecting a lot of junk packets that interfere with normal USB transfers and sometimes confuses the host OS kernel to the point it crashes.

UPDATE: Windows bluescreened again, so I was apparently optimistic that VB was the source of those crashes and that adjusting the mouse configuration would cure them.

Last edited 2 years ago by various (previous) (diff)

comment:29 by Jimson, 2 weeks ago

I think this issue may be resolved now. Currently running MacOS 14.7 with Virtualbox 7.1.4, and Oracle Linux 9 VM. Running the 'dd if=/dev/zero' test that immediately replicated the issue with no longer slows to a crawl, and isn't accompanied with SCSI/USB resets on the Oracle Linux system.

Not sure what combo of MacOS+VBox+OL9 fixed it.

comment:30 by Jimson, 2 weeks ago

Just tested this with Oracle Linux 8, same MacOS and Vbox versions, and had the same results, so this issue is definitely fixed now.

comment:31 by michaln, 12 days ago

Thanks for the update! Yes, this problem was fixed several months ago. A customer ran into what I'm pretty sure was the same problem with a DVD writer and was able to send us hardware to reproduce.

The bug turned out to be caused by missing notifications from the guest. There was a race condition, but the window was small enough that on many systems the problem was extremely difficult or impossible to reproduce. But with the right hardware, it happened often enough.

The bug had nothing to do with the type of device, or even USB2 vs USB3, it could affect any device. But of course the more USB traffic, the more chances for the bug to show up. One key ingredient was using only one single USB device, because additional USB traffic would cancel the effect of the bug.

comment:32 by michaln, 12 days ago

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.

© 2024 Oracle Support Privacy / Do Not Sell My Info Terms of Use Trademark Policy Automated Access Etiquette