0

In the past I used a consumer-grade Gigabit switch with both 100MBit and Gigabit cards connected to it. Some of my Gigabit cards were prone to "drop" the Ethernet connection and "reconnect" at 100MBit, sometimes persistently, sometimes "switching" back to Gigabit speed after a while. After removing all 100MBit NICs the problem vanished for good.

Can somebody explain to me what happened here? Were the Switch's buffers too small? How do 100MBit and Gigabit Cards coexist on a switch?

T Nierath
  • Certainly sounds like a malfunctioning switch to me... – u1686_grawity May 18 '18 at 08:38
  • No... I bought another switch from another manufacturer, but similar specs (cheap 4-Port Gigabit Switch), same problem. – T Nierath May 18 '18 at 08:40
  • Can you identify the switch(es) and NICs used? Possibly also include a photo of the cables (and their terminations). Did you make the cables yourself or purchase off the shelf? What standard (CAT-x) are the cables? Did you replace the removed 100BASE-T devices with 1000BASE-T devices? If not, were the now unused ports left unused? – Attie May 18 '18 at 09:12
  • Ethernet has a fairly extensive [autonegotiation](https://en.wikipedia.org/wiki/Autonegotiation) mechanism. Dropping links and re-negotiating at a slower speed is likely due to one end (or the other) no longer advertising the higher speeds in an attempt to improve the link's quality - I don't believe that this is part of the standard. Did you observe any dropped / invalid packets on these "gigabit" links? – Attie May 18 '18 at 09:16
  • @Attie The connection was just gone for a couple of seconds, then "recovered" (Windows Connection Status changed to "not plugged in"). I switched all components (Switch, Cables, NIC) and the problem persisted until I finally buckled up and exchanged the 100MBit WRT54G (which was still fast enough for my Broadband connection...) for a Gigabit speed router. – T Nierath May 18 '18 at 09:24
  • What about running `netstat -e` shortly after the link "_recovered_"? Does it show any discarded packets / errors? – Attie May 18 '18 at 09:29
  • Well, I don't use the setup now, so I can't test. I did not have an account here at the time, but now wanted to see if somebody has an idea. Thank you. – T Nierath May 18 '18 at 09:30
  • In my home network they have coexisted for many years w/o issues. I have devices that simply cannot be upgraded to Gigabit versions (they don't exist, e.g. networked TV tuner box). And my ADSL modem/router is 100Base-T. Re switch buffering: see https://superuser.com/questions/441931/what-differences-are-there-between-home-switches-and-professional-switches/441934#441934 and https://superuser.com/questions/1220611/is-a-trunk-switchport-multiple-collision-domains-for-all-its-individual-device-s/1220635#1220635 – sawdust May 18 '18 at 21:08
  • Since you "fixed" the problem, how are you going to objectively evaluate any guesses you receive? Haven't you created a guessing game with no validation possible? As worded, I vote to close. – sawdust May 18 '18 at 21:13
  • Why? I did not ask about a fix, I asked about how NICs with different speed do coexist in a network. Basically, how does ethernet/the hardware involved do it? – T Nierath May 19 '18 at 04:26

3 Answers

2

When gigabit links fall back to 100 Mbit/s, the usual cause is bad cabling. 10, 100, 1000 and even 10,000 Mbit/s links coexist nicely on the same switch (faster ones possibly too, although faster switches support 10 and 100 Mbit/s less and less). Check the NIC statistics for FCS errors, runts or other drops.

1000BASE-T requires all four twisted pairs to work, while 100BASE-TX uses only two of them. 1000BASE-T is also slightly pickier about the cable, as its line encoding is a bit more delicate. Quite a few devices fall back to 100BASE-TX when gigabit negotiation fails. The link may also fail altogether.
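As a rough illustration of how each link picks its speed independently, here is a minimal Python sketch of "highest common mode" selection. The priority list is a simplified version of the autonegotiation priority order (rarely used modes such as 100BASE-T4 are left out), and the names `PRIORITY`, `resolve`, `gigabit_nic`, `fast_ethernet_router` and `switch_port` are made up for this example. The last call models the fallback described above as a NIC that no longer advertises gigabit, which is an assumption about how such a fallback behaves, not a documented mechanism.

```python
# Priority order, highest first. Simplified: rarely used modes are omitted.
PRIORITY = [
    "1000BASE-T full",
    "1000BASE-T half",
    "100BASE-TX full",
    "100BASE-TX half",
    "10BASE-T full",
    "10BASE-T half",
]


def resolve(local, partner):
    """Return the best mode both link partners advertise, or None."""
    common = set(local) & set(partner)
    for mode in PRIORITY:
        if mode in common:
            return mode
    return None


# Hypothetical devices: what each one advertises on its own link.
gigabit_nic = set(PRIORITY)
fast_ethernet_router = {m for m in PRIORITY if not m.startswith("1000")}
switch_port = set(PRIORITY)

# Each link is negotiated on its own; the router's link does not
# influence what the gigabit NIC and its switch port agree on.
print(resolve(gigabit_nic, switch_port))
print(resolve(fast_ethernet_router, switch_port))

# Assumed fallback: the NIC stops advertising gigabit after a failed attempt.
no_gigabit = {m for m in gigabit_nic if not m.startswith("1000")}
print(resolve(no_gigabit, switch_port))
```

The first call yields the gigabit full-duplex mode and the other two yield 100BASE-TX full duplex. In this model nothing about another port's speed enters the decision.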

Everything else that's been described here (buffer overflow, flow control) has NO impact on the negotiated link speed (physical layer, L1) and will NEVER cause a link drop or fallback.

Most switches receive a frame completely before forwarding it (store-and-forward), and between ports of different link speeds all switches do. It's no problem at all to receive a frame on a 10 Mbit/s port and forward it out a 100 Gbit/s port, or vice versa.
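To put numbers on the speed mismatch, the following Python sketch computes how long one frame occupies the wire at different link rates. It assumes a 1518-byte frame (the usual maximum without VLAN tag) and ignores preamble and inter-frame gap; `FRAME_BYTES` and `serialization_time_us` are names chosen for this example.

```python
FRAME_BYTES = 1518  # assumed frame size; preamble and inter-frame gap ignored


def serialization_time_us(frame_bytes, link_bits_per_s):
    """Time in microseconds needed to clock one frame onto the wire."""
    return frame_bytes * 8 / link_bits_per_s * 1e6


links = [
    ("10 Mbit/s", 10e6),
    ("100 Mbit/s", 100e6),
    ("1 Gbit/s", 1e9),
    ("100 Gbit/s", 100e9),
]

for name, rate in links:
    print(f"{name:>10}: {serialization_time_us(FRAME_BYTES, rate):10.3f} us")
```

The frame takes roughly 1.2 ms to arrive at 10 Mbit/s but only about 0.12 µs to leave at 100 Gbit/s. The slow side cannot supply bits fast enough to stream them straight out of the fast port, so the switch buffers the whole frame first.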

Flow control might interfere with the effective throughput rate but never changes the physical layer link rate.

When a gigabit port tries to send a full rate flow to a 100 (or 10) Mbit/s device and flow control is active on all devices, the pause frames sent from the low-speed device will throttle the gigabit port of the sender even if another receiver might want to receive full rate - this is called head-of-line blocking and is a design flaw.
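The effect can be sketched with a deliberately crude Python model. It assumes a single sender port that offers fixed rates to two receivers and does not adapt (unlike TCP), and it treats pause frames as scaling down everything the sender transmits. The function `delivered_rates` and the host names are hypothetical, and real switches differ in many details.

```python
def delivered_rates(offered, egress_capacity, pause_enabled):
    """Toy model of one sender port feeding several receivers.

    offered:         receiver -> rate offered by the sender, in Mbit/s (> 0)
    egress_capacity: receiver -> link rate of that receiver, in Mbit/s
    Returns receiver -> delivered rate in Mbit/s.
    """
    if pause_enabled:
        # Pause frames stop the whole sender port, so every flow is scaled
        # down until the most congested receiver can keep up.
        scale = min(1.0, min(egress_capacity[r] / offered[r] for r in offered))
        return {r: offered[r] * scale for r in offered}
    # Without pause frames each egress port only drops its own excess.
    return {r: min(offered[r], egress_capacity[r]) for r in offered}


offered = {"slow_host": 500, "fast_host": 500}
capacity = {"slow_host": 100, "fast_host": 1000}

print(delivered_rates(offered, capacity, pause_enabled=True))
print(delivered_rates(offered, capacity, pause_enabled=False))
```

With pausing, both receivers end up at about 100 Mbit/s, so the fast host is held back by the slow one. Without it, the slow host still gets 100 Mbit/s (the rest is dropped) while the fast host receives its full 500 Mbit/s.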

Legacy flow control should generally not be used unless you understand its function and it works in your scenario. Flow control is much better left to the transport layer (esp. TCP) or application layer protocols.

Zac67
  • Hm, it really shouldn't be a cable problem, since removing the 100MBit router made the problem vanish, while exchanging the cables did not. Or can this behavior be explained with link speed negotiation? Will a NIC really never change its link speed depending on criteria such as dropped frames? – T Nierath May 29 '18 at 06:25
  • No, it won't. Link speed is negotiated by fast link pulses (FLP); the link comes up, and while it's up it doesn't change. 1000BASE-T also uses PCS negotiation for the single lanes/pairs and if that fails *some* NICs may fall back to 100BASE-TX. The cable is *not* tested more than that. – Zac67 May 29 '18 at 06:30
  • Thanks, but why does it work when removing the slow router? I'd assume it's a layer2 problem since the physical connection between the switch and my host is unchanged. – T Nierath May 29 '18 at 07:56
  • Well, I can't speak for sure for *every* device on the market, but dropping the link speed because of L2 problems is nothing I've ever seen or even heard of. I've worked with roughly 50 or 60 different models of switches. – Zac67 May 29 '18 at 10:46
  • Yes, I thought it was all quite peculiar... but since the phenomenon persisted even after exchanging all parts (cables, NICs, switches) I decided on turning it into a question. Regarding your expanded answer, what happens when a Gigabit Switch sends more Frames than the 100MBit side can receive and after the buffers are full, just dropping frames? – T Nierath May 29 '18 at 12:55
  • For the first few moments, the buffer fills up: a 1 MB buffer can take close to 9 ms of 900 Mbit/s. When the buffer is exhausted, frames are dropped. With quality-of-service priority and (usually) multiple queues, the switch drops lower priority frames first. – Zac67 May 29 '18 at 14:46
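
The 9 ms figure in the last comment can be checked with a few lines of Python. The sketch assumes that "1 MB" means 10^6 bytes and that 900 Mbit/s is the net fill rate (1000 Mbit/s arriving, 100 Mbit/s draining); `buffer_fill_time_ms` is a name invented for this example.

```python
def buffer_fill_time_ms(buffer_bytes, ingress_mbit, egress_mbit):
    """Milliseconds until the buffer overflows when data arrives faster than it drains."""
    net_fill_mbit = ingress_mbit - egress_mbit
    if net_fill_mbit <= 0:
        return float("inf")  # the buffer drains at least as fast as it fills
    return buffer_bytes * 8 / (net_fill_mbit * 1e6) * 1e3


# Assumed: 1 MB = 1,000,000 bytes, gigabit in, Fast Ethernet out.
print(round(buffer_fill_time_ms(1_000_000, 1000, 100), 2))  # 8.89
```

Under these assumptions the buffer absorbs a sustained burst for just under 9 ms before frames start being dropped.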
1

The article When Flow Control is not a Good Thing describes the case when there is a mixture of gigabit and Fast Ethernet (100 Mbps) clients in non-managed gigabit switches, where readers have reported gigabit links being forced to Fast Ethernet speeds.


The article lays the blame on 802.3x Flow Control and says:

Unfortunately, it seems (at least in small networks) that 802.3x does more harm than good. This may be partly because it duplicates the loss-based flow control mechanism already built into the TCP protocol. But whatever, the reason, I was able to confirm that the throughput loss that some people were attributing to "defective" or "low performance" switches, was in fact, due to Flow Control.

You may find a good treatise on the subject in the article To flow or not to flow?. The article gives three reasons for disabling it:

  • Buffer limitations on some switches
  • Modern devices are now more capable of handling data and processing it fast enough to where flow control is not only unnecessary, but actually a hindrance to better performance
  • Better to manage flow control higher up the stack in the form of congestion control.

Flow Control is disabled by default on many switches, but check your switch. If enabled, try disabling it. You may need to disable it on all endpoints in some rare cases.

harrymc
  • Thank you, I will read it. But how would I disable flow control for an unmanaged switch? I played around with the NICs driver settings, but none of it had any effect. – T Nierath May 18 '18 at 09:36
  • It would help to have the model of your switch (answer not guaranteed). – harrymc May 18 '18 at 10:15
  • I currently use https://www.zyxel.com/products_services/5-Port-Desktop-Gigabit-Ethernet-Switch-GS-105B-v3/ I doubt it can be configured... I also did experience the same problem with another similar "dumb" 5-port Switch (from netgear) so it should not be the switch's problem. Maybe the WRT54G's NIC somehow triggered flow control. – T Nierath May 18 '18 at 10:37
  • The GS-105B documentation says it's indeed doing flow control, but no documented way to disable it. You might contact Zyxel Support with the problem. Are you sure that this isn't a mundane cable problem, requiring a better cable or shorter distance? – harrymc May 18 '18 at 11:09
  • I actually exchanged all parts (cables, switch and NIC) and the problem persisted until I trashed my old WRT54G and upgraded to a router with a gigabit NIC. Since then everything works as intended, so it's not a current problem of mine, fortunately. Anyway, thanks for your help. I upvoted your answer but will keep the question open for some time to see if somebody has more ideas on the subject. – T Nierath May 18 '18 at 11:31
  • You probably now have a router that is fast enough that the switch no longer needs to use flow control or memory buffers. I think your actual problem is that the switch handles slow connections badly and doesn't manage its memory buffers in an optimal way. – harrymc May 18 '18 at 13:04
  • Is flow control from host to switch, switch to router or end-to-end (host-to-router)? Also, do you know how these buffers work? One per port? Two per port (in/out)? I guess it's possible that both switches were roughly equally bad and therefore had similar behavior. – T Nierath May 18 '18 at 13:24
  • That requires detailed knowledge of the switch. Sorry, can't do. – harrymc May 18 '18 at 13:31
  • No problem, I didn't think there would be big differences in the basic switch architecture, but I really just treat them as a black box. – T Nierath May 18 '18 at 13:39
  • Flow control is a layer 2 feature which does cause head-of-line blocking with differing speed, but there is no impact on physical layer links at all. – Zac67 May 28 '18 at 18:20
0

Intermittent results sound a bit like CAT5e (instead of CAT6a or CAT6) cabling or, more likely (and worse yet), CAT5 (instead of CAT5e) cabling. Removing the 100MBit NICs wouldn't likely help much with that, though, so...

The switch's buffers being too small sounds like a possibility.
Unlike a "hub", a "switch" has communications that are independent per port. In other words, if you have a 24 port switch, then those are 24 independent connections. If one connection is 100Mbit, that shouldn't prevent another connection from being gigabit.

I could see some potential benefit from a switch being able to get rid of any packet data that it might be remembering. If it is successfully using gigabit connections, then it can complete conversations more quickly, and just be done with a job sooner. This may lower buffer usage. This could also reduce "overheating" issues, if the jobs get done sooner so heat generated by work can stop existing sooner.

TOOGAM
  • I did exchange cables and it was all very short distance (single room), so it shouldn't be the cables. My idea is that e.g. I'm sending files to another Gigabit host while accessing the Internet via an old 100MBit router. I guess my port has only one buffer and it can't just throw away Frames willy nilly, so it slows down. But I don't know if things really work like this. – T Nierath May 18 '18 at 09:07