#19313 closed defect (obsolete)
random network dropouts
Reported by: | SkipMMT | Owned by: | |
---|---|---|---|
Component: | network | Version: | VirtualBox 6.1.2 |
Keywords: | dropouts | Cc: | |
Guest type: | Linux | Host type: | Linux |
Description
Since upgrading the Linux host to VirtualBox-6.1.2, the Linux guest has been experiencing random network dropouts, 1 to 10 times a day, each lasting a few seconds. At these times, the Linux guest reports:
Feb 13 20:04:59 kernel: e1000 0000:00:03.0 eth0: Reset adapter
Feb 13 20:05:01 kernel: e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
After downgrading to VirtualBox-6.0.16, the dropouts and the messages disappear.
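To put a number on "lasting for a few seconds", the two guest messages can be paired up and timed. The following is a minimal sketch, not part of the original report: it assumes syslog-style lines like the ones quoted above (with or without a hostname field), and the function name `outage_durations` and the `year` parameter are made up for illustration. Syslog timestamps carry no year, so one has to be supplied.

```python
import re
from datetime import datetime

# Syslog-style line: "Feb 13 20:04:59 [hostname] kernel: <message>".
# The hostname field is optional because the lines quoted in the ticket omit it.
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?:\S+\s+)?kernel:\s+(.*)$"
)


def outage_durations(lines, year=2020):
    """Pair each 'Reset adapter' with the next 'NIC Link is Up'.

    Returns the outage lengths in seconds. `year` is an assumption,
    since syslog timestamps do not include one.
    """
    durations = []
    reset_at = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        message = match.group(2)
        if "Reset adapter" in message:
            reset_at = stamp
        elif "NIC Link is Up" in message and reset_at is not None:
            durations.append((stamp - reset_at).total_seconds())
            reset_at = None
    return durations


if __name__ == "__main__":
    sample = [
        "Feb 13 20:04:59 kernel: e1000 0000:00:03.0 eth0: Reset adapter",
        "Feb 13 20:05:01 kernel: e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX",
    ]
    print(outage_durations(sample))
```

Run against a full guest log, the length of the returned list gives the number of dropouts, which could be compared between a 6.0.x and a 6.1.x host.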
Attachments (1)
Change History (9)
by , 5 years ago
comment:1 by , 4 years ago
comment:2 by , 4 years ago
Description: | modified (diff) |
---|
comment:3 by , 4 years ago
Status: | new → awaitsfeedback |
---|
One possible explanation for the lack of interest could be that there isn't much information in this ticket to work with, i.e. no basic information about the guest and host OS, the guest configuration, the offending VM's log file, or specifically the networking configuration, and no information whatsoever on how to reproduce this in house to start an evaluation of the problem. Basically, your ticket is entirely content free aside from that one observation you made from a message in a guest's log file. So you have found the link to the public bugtracker for submitting bugs, yet you have apparently not read the preamble on that same page: https://www.virtualbox.org/wiki/Bugtracker
comment:4 by , 4 years ago
I did read the preamble, and provided the requested information in the ticket submission form. I was expecting an acknowledgment that someone had seen the ticket and needed more information and would let me know specifically what that information might be.
I know of no way to trigger this failure so that you can reproduce it. Like I said, it is random. This happens on four different hosts and two different guests, connected to two different Cisco switches on different networks. The four hosts have similar hardware, and they all use the Intel 82574L network interface hardware. These hosts and guests have run Fedora 30, 31, and 32 with all the kernels released for them, which has made no difference to the problem. Obviously, the guests are configured to use the e1000 network interface. At the time of the failure, the host message log has no relevant information, and the guest message log has only those two lines I already posted.
I realize that intermittent problems are the worst to troubleshoot. I haven't found anything that correlates with the failures, except that it happens on every VirtualBox 6.1.x, and never on any VirtualBox 6.0.x. It does seem to be more frequent on the guest with more network traffic. I know that there is not much information to go on, but that's the way it is. The only approach that I can think of to attack the problem is to inspect the VirtualBox network code changes between 6.0 and 6.1 and see what might produce these random outages. Please let me know if there is anything else I can do to help.
comment:5 by , 4 years ago
Hello,
I can confirm the behaviour described by SkipMMT. It began after upgrading to the 6.1 series. My guest OS is an up-to-date CentOS 7 with a single adapter bridged to the host OS.
My host OS kernel is the following:
5.4.0-sabayon #1 SMP Sun Jul 12 21:09:29 UTC 2020 x86_64 Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz GenuineIntel GNU/Linux
I was using VirtIO when the problems arose, so I tried switching the Ethernet adapter to Intel's. The e1000 driver is intelligent enough to reset itself when it detects a jam. However, it still takes time for e1000 to do so.
The frequency of the jams increases with uptime. If I reboot the host OS, things go normally for a week or so, then the jams start again, and their frequency keeps rising after that. I watched it go from once in 3 days to twice a day. Then I rebooted the host OS. After 8 days, it happened again. Frustrating.. :/
I can clear the jam by doing ifconfig down/up on both the host OS and the guest OS. Then things go normally for a while. The host OS's secondary Ethernet interface is a Realtek Ethernet card dedicated to this VM, driven by the r8169 module. It does not even have an IP address on the host OS. A problem that is beyond our reach is probably plaguing the VirtualBox kernel modules.
This system was set up in haste until I can build a new corporate VM infrastructure. It is under constant communication traffic, and this Ethernet jamming problem is a bummer for me..
I am not a programmer; however, if you can guide me on how to collect more descriptive data, I'll try my best.
Thank you.
comment:6 by , 4 years ago
Hello Again,
As of today, meaning after 30 days of stability, I think "Version 6.1.6 r137129 (Qt5.6.1)" solves this problem. As a side note, I did not do a kernel upgrade or a firmware pack upgrade on my system. I solely upgraded VirtualBox itself and its kernel modules.
Hope this helps someone out there..
All the best..
comment:7 by , 4 years ago
Resolution: | → obsolete |
---|---|
Status: | awaitsfeedback → closed |
Thanks for the update.
comment:8 by , 4 years ago
Problem still exists, using version 6.1.14. All suggestions were tested (gro off, tso off, no vbox additions, etc.) and it still resets.
[66169.615963] ------------[ cut here ]------------
[66169.615978] WARNING: CPU: 0 PID: 0 at /build/linux-o3gOgM/linux-4.9.189/net/sched/sch_generic.c:316 dev_watchdog+0x233/0x240
[66169.615980] NETDEV WATCHDOG: enp0s3 (e1000): transmit queue 0 timed out
[66169.615981] Modules linked in: binfmt_misc vboxvideo(O) ipt_REJECT nf_reject_ipv4 xt_multiport xt_tcpudp ip6table_filter ip6_tables iptable_filter sb_edac edac_core iTCO_wdt intel_powerclamp kvm_intel iTCO_vendor_support kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel evdev sg intel_rapl_perf lpc_ich serio_raw vmwgfx mfd_core pcspkr rng_core vboxguest(O) ttm drm_kms_helper drm video button ac ip_tables x_tables autofs4 ext4 crc16 jbd2 crc32c_generic fscrypto ecb mbcache sd_mod ata_generic crc32c_intel ata_piix ahci libahci aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd psmouse e1000 libata scsi_mod i2c_piix4
[66169.616064] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 4.9.0-11-amd64 #1 Debian 4.9.189-3+deb9u2
[66169.616066] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[66169.616069] 0000000000000000 ffffffffa4136404 ffff9550e3c03e20 0000000000000000
[66169.616075] ffffffffa3e7b83b 0000000000000000 ffff9550e3c03e78 ffff9550ccd1a000
[66169.616080] 0000000000000000 ffff9550cd242a80 0000000000000001 ffffffffa3e7b8bf
[66169.616085] Call Trace:
[66169.616088] <IRQ>
[66169.616097] [<ffffffffa4136404>] ? dump_stack+0x5c/0x78
[66169.616102] [<ffffffffa3e7b83b>] ? __warn+0xcb/0xf0
[66169.616106] [<ffffffffa3e7b8bf>] ? warn_slowpath_fmt+0x5f/0x80
[66169.616112] [<ffffffffa433eb93>] ? dev_watchdog+0x233/0x240
[66169.616117] [<ffffffffa433e960>] ? dev_deactivate_queue.constprop.26+0x60/0x60
[66169.616122] [<ffffffffa3eea292>] ? call_timer_fn+0x32/0x120
[66169.616126] [<ffffffffa3eea607>] ? run_timer_softirq+0x1d7/0x430
[66169.616132] [<ffffffffa413f564>] ? timerqueue_add+0x54/0xa0
[66169.616136] [<ffffffffa3eec2f8>] ? enqueue_hrtimer+0x38/0x80
[66169.616141] [<ffffffffa44220ad>] ? __do_softirq+0x10d/0x2b0
[66169.616147] [<ffffffffa3e81e52>] ? irq_exit+0xc2/0xd0
[66169.616150] [<ffffffffa4421b2c>] ? smp_apic_timer_interrupt+0x4c/0x60
[66169.616156] [<ffffffffa442025e>] ? apic_timer_interrupt+0x9e/0xb0
[66169.616158] <EOI>
[66169.616162] [<ffffffffa441da92>] ? mwait_idle+0x72/0x160
[66169.616171] [<ffffffffa3ebf33a>] ? cpu_startup_entry+0x1ca/0x240
[66169.616180] [<ffffffffa4b3ef5e>] ? start_kernel+0x447/0x467
[66169.616185] [<ffffffffa4b3e120>] ? early_idt_handler_array+0x120/0x120
[66169.616188] [<ffffffffa4b3e408>] ? x86_64_start_kernel+0x14c/0x170
[66169.616191] ---[ end trace b2398e43d8835b28 ]---
[66169.616224] e1000 0000:00:03.0 enp0s3: Reset adapter
[66171.728617] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[68665.596236] e1000 0000:00:03.0 enp0s3: Reset adapter
[68667.676228] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
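A kernel log like the one above can be summarized to show how long each reset takes and how far apart the resets are. This is only a sketch under stated assumptions: it expects dmesg-style `[seconds.microseconds]` timestamps and the two e1000 messages seen in this ticket, and the helper names `reset_events` and `summarize` are hypothetical. It does not diagnose the cause of the transmit queue timeout.

```python
import re

# Matches "[<timestamp>] e1000 ... Reset adapter" and
# "[<timestamp>] e1000: ... NIC Link is Up". The [^\[]*? part keeps a match
# from running into the next bracketed timestamp, so this also works on
# output that was pasted as one long line.
EVENT_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+e1000[^\[]*?(Reset adapter|NIC Link is Up)"
)


def reset_events(text):
    """Return (timestamp_seconds, event) tuples in log order."""
    return [(float(stamp), event) for stamp, event in EVENT_RE.findall(text)]


def summarize(text):
    """Count resets, time each recovery, and measure gaps between resets."""
    resets = []
    recoveries = []
    pending = None
    for stamp, event in reset_events(text):
        if event == "Reset adapter":
            resets.append(stamp)
            pending = stamp
        elif pending is not None:
            # Link came back up after a reset we saw earlier.
            recoveries.append(stamp - pending)
            pending = None
    gaps = [later - earlier for earlier, later in zip(resets, resets[1:])]
    return {"resets": len(resets), "recovery_s": recoveries, "gap_s": gaps}


if __name__ == "__main__":
    log = """
[66169.615980] NETDEV WATCHDOG: enp0s3 (e1000): transmit queue 0 timed out
[66169.616224] e1000 0000:00:03.0 enp0s3: Reset adapter
[66171.728617] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[68665.596236] e1000 0000:00:03.0 enp0s3: Reset adapter
[68667.676228] e1000: enp0s3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
"""
    print(summarize(log))
```

For the excerpt in this comment, the two resets each take roughly two seconds to recover and are about 2,500 seconds apart, which matches the "few seconds" outages in the original description.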
Has there been any progress on fixing this regression in version 6.1? It is still present in 6.1.10. This is a show stopper for me. I have applications that fail due to this regression. I can't stay at version 6.0 because 6.0 can't be installed on fedora 32, and I must upgrade to fedora 32.