In this post, we’ll take a closer look at auto negotiation. Auto negotiation has the following characteristics:
- It is required to be supported.
- Transmits capabilities for speed/duplex.
- Negotiates Energy Efficient Ethernet (EEE) capabilities.
- Determines the leader/follower relationship on the link.
- Needed for PHY Control, a PMA subfunction.
- Performed when initializing the link.
- Auto-MDIX.
The Auto Negotiation transmitter and receiver are actually a separate system in their own right. In multi-speed PHY devices, auto negotiation is used to select the highest speed that both sides of the link are capable of, before the link is trained. However, it is important to understand that while a 1000BASE-T PHY must support auto negotiation, the wording of the standard does not require it to actually be used (thanks to Eric Peterson for clarifying this). A leader and follower must be decided so that clock synchronization can take place. Without auto negotiation, this would have to be manually configured. On some devices it is possible to configure the speed on a 1000BASE-T interface. However, this normally does not disable auto negotiation, but rather limits which capabilities get advertised.
Auto negotiation is performed using Fast Link Pulses (FLP). Historically, 10BASE-T used the Link Test Pulse (LTP) to verify the integrity of the link. A Normal Link Pulse (NLP), used in auto negotiation, is identical to the LTP. An FLP is a quick burst of NLPs. The FLP burst consists of 17 NLPs that are sent within 2 ms. The next burst starts 16 ms ± 8 ms later. This is shown below:
As part of the FLP, data may optionally be interspersed between the NLPs:
Remember, there are 17 NLPs as part of the FLP. Depending on how much data is interspersed, between 17 and 33 pulses will be sent. A pulse present in a data position represents a logical 1, and no pulse represents a logical 0.
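To make the pulse counting concrete, here is a minimal Python sketch of one FLP burst. It assumes the 17 always-present pulses are evenly spaced across the 2 ms window (which works out to 125 µs between them), that the 16 data positions carry a 16-bit link code word, and that the lowest bit is sent first. The function names and the example value `0x01E1` are just illustrative choices, not something taken from the standard.

```python
CLOCK_PULSES = 17              # NLPs that are always present in a burst
BURST_WIDTH_US = 2000          # the 17 pulses span 2 ms
DATA_BITS = CLOCK_PULSES - 1   # one data position between each pair of NLPs


def clock_pulse_times_us(burst_start_us=0.0):
    """Nominal times of the 17 NLPs, assuming even spacing (an assumption)."""
    spacing = BURST_WIDTH_US / (CLOCK_PULSES - 1)   # 16 intervals of 125 us
    return [burst_start_us + i * spacing for i in range(CLOCK_PULSES)]


def encode_flp(code_word):
    """Return 33 slots: NLP, data, NLP, data, ..., NLP.

    True means a pulse is sent in that slot. Bit 0 is assumed to go first.
    """
    if not 0 <= code_word <= 0xFFFF:
        raise ValueError("code word must fit in 16 bits")
    slots = []
    for bit in range(DATA_BITS):
        slots.append(True)                            # NLP, always present
        slots.append(bool((code_word >> bit) & 1))    # data pulse only for a 1
    slots.append(True)                                # 17th NLP closes the burst
    return slots


def decode_flp(slots):
    """Rebuild the 16-bit word from the data positions (odd slot indexes)."""
    word = 0
    for bit in range(DATA_BITS):
        if slots[2 * bit + 1]:
            word |= 1 << bit
    return word
```

Counting the `True` slots gives the number of pulses on the wire: `sum(encode_flp(0x0000))` is 17, `sum(encode_flp(0xFFFF))` is 33, and a word with five bits set, such as `0x01E1`, lands in between at 22.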
The initial negotiation between devices uses the following flow:
- 1. The two link partners transmit FLP bursts containing their link code words (capabilities). At this stage the Acknowledge bit is not set.
- 2. Within 6 to 17 (inclusive) pulses of the first FLP burst, each link partner detects whether the other party is auto negotiation capable.
- 3. The link partner now waits for 3 complete, consecutive, and consistent FLP bursts (ignoring the Ack bit). It then enters the Acknowledge Detect state and begins transmitting FLP bursts whose link code word has the Ack bit set.
- 4. It now waits for another 3 complete, consecutive, and consistent FLP bursts where the Ack bit is set. It then enters the Complete Acknowledge state and transmits 6 to 8 (inclusive) FLP bursts containing a link code word where the Ack bit is set.
- 5. After another 4-6 (inclusive) FLP bursts, the Next Page exchange takes place (if needed).
- 6. When the optional Next Page exchange is completed, the link partners decide on Highest Common Denominator (HCD), and negotiate to that link if possible. If no common technologies are shared, no link is established.
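Two parts of this flow are easy to sketch in Python: the "3 consistent bursts" check from steps 3 and 4, and the HCD choice from step 6. This is a rough illustration and not the state machine from the standard. The Ack bit is assumed to be bit 14 of the link code word, the priority list is a simplified assumption that leaves out technologies not discussed here, and all names are hypothetical.

```python
ACK_BIT = 1 << 14   # assumed position of the Acknowledge bit

# Simplified priority order, highest first. Illustrative assumption only;
# the authoritative priority resolution table lives in the standard.
PRIORITY = [
    "1000BASE-T full duplex",
    "1000BASE-T half duplex",
    "100BASE-TX full duplex",
    "100BASE-TX half duplex",
    "10BASE-T full duplex",
    "10BASE-T half duplex",
]


def consistent_bursts(words, needed=3, require_ack=False):
    """Check whether the last `needed` received code words agree.

    The Ack bit is masked out before comparing (step 3). With
    require_ack=True every one of those words must also have Ack set (step 4).
    """
    if len(words) < needed:
        return False
    tail = words[-needed:]
    if require_ack and not all(w & ACK_BIT for w in tail):
        return False
    return len({w & ~ACK_BIT for w in tail}) == 1


def resolve_hcd(local, partner):
    """Pick the highest-priority technology both sides advertise, or None."""
    common = set(local) & set(partner)
    for tech in PRIORITY:
        if tech in common:
            return tech
    return None   # no common technology, so no link
```

Configuring a fixed speed on a port that still negotiates would, in this model, simply mean passing a shorter `local` list to `resolve_hcd`, so the result can never be higher than what was configured.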
The flow is shown below:
What happens if one side is not capable of auto negotiation or has a fixed speed? One of two things can happen:
- The device enables whichever speed PMA it supports, or
- The device refuses to establish the link.
Note that the link would be half duplex as auto negotiation is required to negotiate full duplex capabilities.
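The two outcomes can be captured in a few lines of Python. This is only a sketch of the behavior described above: the function name is hypothetical, and speeds are plain integers in Mbit/s.

```python
def link_without_negotiation(detected_speed_mbps, local_speeds_mbps):
    """Outcome when the link partner does not take part in auto negotiation.

    Returns (speed, duplex) if the local device has a PMA for the detected
    speed, otherwise None (the link is refused). Duplex is always "half"
    because full duplex can only be agreed on through auto negotiation.
    """
    if detected_speed_mbps in local_speeds_mbps:
        return (detected_speed_mbps, "half")
    return None


# Hypothetical examples:
print(link_without_negotiation(100, {10, 100}))   # (100, 'half')
print(link_without_negotiation(100, {10}))        # None
```

If the fixed side happens to be set to full duplex, the two ends end up disagreeing on duplex, which is one reason to be careful with fixed settings.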
Now that we have a good understanding of 1000BASE-T, PCS, PMA, and auto negotiation, in the next post we’ll cover what happens at PHY level in a link down event.
Hey Daniel,
I’m curious to hear your take on Auto-MDIX as it relates to 1000BASE-T.
Two specific areas:
First, is the “crossover” stuff really part of the information exchanged in FLPs? That would be a departure from previous generations of Ethernet, but maybe that’s to be expected.
Second, I understand that 1000BASE-T can generally cope with “crossover cables” (whatever that means in a Gigabit world), but does this capability have much relation with the HP patent for Auto-MDIX, which seems to revolve around getting transmitters hooked to receivers and vice versa… Something which doesn’t really apply with gig interfaces.
My impression is that 1000BASE-T has a general problem regardless of the cable in use: There are four transmit channels (in each direction), and the sides must agree about how they’re indexed.
It’s not really about “crossing over” between transmitters and receivers at all, because each loop is both a transmitter and a receiver.
As always, I’m excited to read your thorough take on the matter if you choose to tackle it in the future.
Hey Chris! It’s been a long time!
Let me add it to my list of potential blog posts. I’ll do some research and see what I come up with. I have a few posts already planned out (but not written), so it might take a while until I get there.
Thanks for commenting!
Hi Daniel, I wanted to make one minor comment. The requirement as written in 802.3ab is that components “shall support” auto-negotiation. All of the various components must have the functionality necessary for auto-negotiation to work. But, the standard does not require the implementation of auto-negotiation, and it is possible to run 1000BASE-T with those components statically configured, if you have the configuration interfaces to do so.
This is a very small nit that will not matter for the overwhelming majority of readers and users, but I know that you care about technical accuracy in your learning journey.
Thank you for this outstanding series, it’s as concise and readable of an explanation of the guts of Ethernet as I’ve ever seen. You have a gift for explaining complex things!
Thanks a lot, Eric! (I just realized we met at CL last year)
Like you said, I strive to be as correct as possible. When reading 802.3-2022, section 40.5.1, it says:
“a) To negotiate that the PHY is capable of supporting 1000BASE-T half duplex or full duplex transmission.
b) To determine the MASTER-SLAVE relationship between the PHYs at each end of the link.
c) To negotiate EEE capabilities as specified in 28C.12.
This relationship is necessary for establishing the timing control of each PHY. The 1000BASE-T MASTER PHY is clocked from a local source. The SLAVE PHY uses loop timing where the clock is recovered from the received data stream.”
Are you saying that there are PHYs that don’t do auto negotiation at all and provide the clocking through another method?
I’m aware of being able to configure speed on 1000BASE-T interfaces, but on the NOSes I’ve worked on that simply means it filters what capabilities it advertises. If you configure 100 Mbit/s for example, it does not advertise that it can do 1 Gbit/s to the other PHY.
I know you work heavily on these things so happy to learn something new here 🙂
> Are you saying that there are PHYs that don’t do auto negotiation at all and provide the clocking through another method?
I’m aware of an implementation (custom gear not generally available) which doesn’t do negotiation at all and allows for master/slave (gross) link clocking election via explicit configuration.
So, it’s *possible* to run a link like this, but my read of the standard is in alignment with yours: If it’s not doing negotiation, we’re talking about something very much like 1000BASE-T, but *not* capital-E Ethernet.
So, it’s kind of like jumbo frames, but a more egregious departure from the standard.
Hi Daniel, thanks for the response.
The source of ambiguity (or, as I like to see it, flexibility) is in 40.12.6
MF1 All 1000BASE-T PHYs shall provide support for Auto-Negotiation…
In the conformance language of the IEEE, “shall provide support” means that the PHY must be able to be configured for auto-negotiation, but does not specify that it must use auto-negotiation for operation. It would say “shall implement” in the case that the use of the specified feature was a mandatory part of the standard, and you can find plenty of instances of this language in other sub-clauses within the standard.
This is something I’ve directly discussed with 802 members and most of them agree that the intent was probably to require implementation rather than support, but it is a very specific and well-understood differentiation in the requirements language that’s consistent across all of the 802 standards.
You are correct that a NOS will not generally support this configuration. Some of them will, on a platform dependent basis, allow you to manually set leader/follower, with mixed results. As you said, for most of these platforms, it will just change/suppress the options it’s offering. Sometimes it will claim to be statically configured but will still be sending and expecting FLPs and will not bring the link up if the other end is not participating in auto-negotiation. The various flavors of Linux variably support disabling auto-negotiation, depending on your exact distro and network card.
If you have direct configuration register access, you can manually configure all aspects of its operation, and it will happily operate in this manner so long as both ends of the link have the appropriate configurations, without FLPs.
This is very interesting, Eric! Thank you for taking the time to properly explain it rather than stating the fact. I’ll make some edits and also post your comment so that others can learn from it.
Thank you!
Hmmm.
I see what Eric is saying, BUT, I disagree with the conclusions. The Protocol implementation conformance statement (PICS) proformas are how a device is assessed for compliance. 40.12.2 says that 1000BASE-T devices must support AN.
40.12.2 Major capabilities/options
AN Support for Auto-Negotiation (Clause 28) 40.5.1 M Yes [ ] Required
The specific PICS entries for AN say
you have to use AN to determine MASTER-SLAVE.
If EEE is supported, you have to exchange support info using AN
40.12.6.1 1000BASE-T Specific Auto-Negotiation Requirements
AN1 1000BASE-T PHYs shall 40.5.1.2 M Yes [ ] Exchange one Auto-Negotiation Base Page, a 1000BASE-T formatted Next Page, and two 1000BASE-T Unformatted Next Pages in sequence, without interruption, as specified in Table 40–4.
AN2 The MASTER-SLAVE relationship shall be determined during Auto-Negotiation 40.5.2 M Yes [ ] Using Table 40–5 with the 1000BASE-T Technology Ability Next Page bit values specified in Table 40–4 and information received
AN3 Successful completion of the MASTER-SLAVE resolution shall 40.5.2 M Yes [ ] Be treated as MASTER-SLAVE configuration resolution complete.
40.12.6 Management interface
MF1 All 1000BASE-T PHYs shall provide support for Auto-Negotiation (Clause 28) and shall be capable of operating as MASTER or SLAVE. 40.5.1 M Yes [ ]
Clause 28 extensions for 1000BASE-T
28D.5 Extensions required for Clause 40 (1000BASE-T)
Clause 40 (1000BASE-T) makes special use of Auto-Negotiation and requires additional MII registers. This use is summarized below. Details are provided in 40.5.
a) Auto-Negotiation is mandatory for 1000BASE-T (see 40.5.1).
b) 1000BASE-T requires an ordered exchange of Next Page messages (see 40.5.1.2), or optionally an exchange of an Extended Next Page message
c) 1000BASE-T parameters are configured based on information provided by the exchange of Next Page messages.
d) 1000BASE-T uses MASTER and SLAVE to define PHY operations and to facilitate the timing of transmit and receive operations. Auto-Negotiation is used to provide information used to configure MASTER-SLAVE status (see 40.5.2).
e) 1000BASE-T transmits and receives Next Pages for exchange of information related to MASTER-SLAVE operation. The information is specified in MII registers 9 and 10 (see 32.5.2 and 40.5.1.1), which are required in addition to registers 0-8 as defined in 28.2.4.
f) 1000BASE-T adds new message codes to be transmitted during Auto-Negotiation (see 40.5.1.3).
g) 1000BASE-T adds 1000BASE-T full duplex and half duplex capabilities to the priority resolution table (see 28B.3) and MII Extended Status Register (see 22.2.2.4).
h) 1000BASE-T is defined as a valid value for “x” in 28.3.1 (e.g., link_status_1GigT.) 1GigT represents that the 1000BASE-T PMA is the signal source.
i) 1000BASE-T supports Asymmetric Pause as defined in Annex 28B.
My read of this is that completely disabling AN is not compliant with the standard. Systems often implement “fixed configuration” (e.g. 100Mb/s only) by not advertising the other capabilities and letting AN resolve to what was specified.
Thanks for the related posts, Daniel! I myself became interested in the topic when I had to make friends with vendor X and Cisco 3650 using Cisco’s native MM 1G transceivers. At first, on the part of Cisco and vendor X, the port configuration was the default. Link didn’t get up. The port status on Cisco was err-disable, and this status was not generated by some protocols; errdisable recovery would have worked that way. When I tried the plug/unplug transceiver, I received several blinks of the port indicator in green and one in orange, after which the port status in Cisco was displayed as err-disable. Until I “nailed” the following parameters in the vendor X config:
– speed 1000G
– duplex full
– mtu 1500
After that I got 1000G full on both sides and the data ran without problems.
*I also want to note that the vendor X port was 10g, and the Cisco port was 1g.*
After this, it became interesting what was happening inside. Thanks for your hard work, good luck with your research!
Interesting scenario! Thanks, Artem!