Thursday, 30 August 2012

Multicast, IGMP and Spanning-Tree

I've run into this problem many times, so I thought I'd write a post to help others in the same situation.

The situation is this - you have a large network of switches, using spanning-tree to prevent loops, but you are also using the network for multicast streaming. If you have any significant amount of multicast going on (maybe an IPTV system) then you'll be using IGMP snooping on all the switches to make sure that you don't have traffic going where it's not required. You set it up, and everything is working fine.

But then... it breaks. Badly. Your network starts flooding occasionally, for a couple of minutes at a time. During that time, all traffic on the network is delayed at best, and often dropped.

The interactions of IGMP, STP and your large amounts of multicast traffic are killing your network.

Let's break it down to explain the different things that are happening here:

Why does my network grind to a halt?

It's flooded! When using IGMP snooping, the multicast traffic on your network is normally only sent to those people who want to receive it. However, in this situation, your switches are momentarily sending traffic to all ports. There is so much traffic that your switch ports may be running at capacity, or the end-hosts are getting sent so much unwanted multicast traffic that they can't keep up.


So why does IGMP snooping suddenly stop working?

It doesn't. It is choosing to flood your multicast traffic because it thinks that is the best course of action in the given situation. If we look at the debug messages for IGMP snooping:
00:08:15: IGMPSN: mgt: Received topology change on vlan 1
00:08:15: IGMPSN: mgt: Updating all GCEs with flood portset for in Vlan 1
When spanning-tree protocol tells the switch that a topology change has occurred (more on this below), IGMP snooping will flood your multicast traffic to all ports, assuming that if the topology has changed and your traffic is mission-critical, then it had better send it to all ports to make sure it gets to your end user!

But I don't want that...

Ok, no problem - you can turn it off. On Cisco switches, you need to add this command to every interface on which you want to stop the flooding.
 no ip igmp snooping tcn flood
That probably means all your edge ports, and potentially some of your uplink trunks, although these should probably be high enough bandwidth to be able to cope with all your multicast! This command is basically telling your switches "Don't flood traffic when you receive a topology-change notification (TCN)".
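As a rough sketch, applying this to a block of edge ports might look like the following. The interface range here is purely an assumption for illustration, so substitute your own ports.

```
! Hypothetical port range: replace with your own edge ports
configure terminal
interface range FastEthernet0/1 - 24
 no ip igmp snooping tcn flood
end
```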

What is this topology change anyway? I didn't change anything!

Spanning-tree protocol, although very useful, can be very tricky to get configured correctly, and can cause you a lot of problems. When any switch believes a topology change has occurred, it sends a notification out of its root port, towards the root bridge. When the root bridge receives this, it sets the topology-change (TC) bit in its BPDUs, to notify the whole of the rest of the network that a topology change has occurred.
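
If you want a quick check of whether topology changes really are occurring, the detailed spanning-tree output includes a topology change counter, the time since the last change, and the port it came from. A minimal sketch, assuming VLAN 1 as in the debug output above (the exact output wording varies between platforms and IOS versions):

```
! VLAN 1 is an assumption: use the VLAN carrying your multicast
show spanning-tree vlan 1 detail | include topology|from
```

If the counter keeps climbing when you haven't touched the network, something out there is generating TCNs.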


So why are they happening if my network isn't changing?

Spanning tree will send a topology-change-notification (TCN) whenever it believes a topology change has occurred. If you already understand spanning-tree, you will know that any port, as it comes up, will go through two different states, "listening" and then "learning", before finally entering the "forwarding" state and starting to operate normally. Any port transitioning in or out of this forwarding state will trigger a TCN. However, if a port is configured with "portfast" it will skip the "listening" and "learning" states and jump straight to "forwarding", without triggering a TCN. So, put simply, any port going up or down, anywhere on your network, that is not in portfast mode, will trigger a TCN, as shown below:
Without Portfast:
02:30:04: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up
02:30:05: set portid: VLAN0001 Fa0/1: new port id 8001
02:30:05: STP: VLAN0001 Fa0/1 -> listening
02:30:06: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/1, changed state to up
02:30:20: STP: VLAN0001 Fa0/1 -> learning
02:30:35: STP: VLAN0001 sent Topology Change Notice on Gi0/1
02:30:35: STP: VLAN0001 Fa0/1 -> forwarding
With Portfast:
02:29:10: %LINK-3-UPDOWN: Interface FastEthernet0/1, changed state to up
02:29:11: set portid: VLAN0001 Fa0/1: new port id 8001
02:29:11: STP: VLAN0001 Fa0/1 ->jump to forwarding from blocking
02:29:12: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/1, changed state to up
As you can see, portfast is very desirable, because not only does it stop unwanted TCNs, but it also means your ports will come up much faster. If you're anything like me, you already put all your access ports into portfast, just because you want them to come up fast. However, you may not have done the same for your edge trunk ports (perhaps for a server, wireless AP or VoIP phone), and those will still trigger a TCN every time they go up or down.

And this will fix all my problems?

Not necessarily. It's possible that you are getting legitimate topology changes within your network. For example, I have seen a case where a faulty fibre link was causing an interface to flap on an unused switch at the edge of a network. You can track down the source of your TCNs by using "debug spanning-tree events" on your switches. Start with the root bridge, and when your TCN occurs, you should see something like this:
02:38:48: STP: VLAN0001 Topology Change rcvd on Fa0/24
So, work out which switch is on Fa0/24, log into that and run debugs there. Repeat the process until you find the port that is flapping. A quicker way of doing this is to set up all your switches to log debug messages to a syslog server, and turn on spanning-tree event debugs on all the switches at the same time. That way you only need to catch a single TCN, rather than having to keep waiting for it to occur again at each hop. I'll put up another post about syslogs on Cisco another time.
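
A minimal sketch of that syslog setup might look like this. The server address 192.0.2.10 is a placeholder, and this assumes your switches can already reach it over IP. Bear in mind that debugs add load to the switch, so turn them off again once you have what you need.

```
! 192.0.2.10 is a placeholder syslog server address
configure terminal
service timestamps debug datetime msec
logging host 192.0.2.10
! Send everything up to and including debug level to the server
logging trap debugging
end
! Enable the spanning-tree event debugs (from exec mode)
debug spanning-tree events
! When you have found the culprit, turn the debugs off again
undebug all
```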

I'm still confused, how do I stop this flooding happening?!

The quickest way is to add the "no ip igmp snooping tcn flood" command to all your interfaces. If you want to stop the underlying cause, make sure all ports where a single device is connected are set up with "spanning-tree portfast" for access ports or "spanning-tree portfast trunk" for trunk ports. Don't do this for links to switches - they should be set up as part of your spanning tree.
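
The portfast side of that might look something like this sketch. The interface numbers are made up for illustration, and it assumes the ports are already configured as access or trunk ports as appropriate. Some newer software releases spell these commands with an extra "edge" keyword, so check what your switch accepts.

```
configure terminal
! Hypothetical access ports, each with a single end device
interface range FastEthernet0/1 - 20
 spanning-tree portfast
! Hypothetical edge trunk port (e.g. a server or wireless AP)
interface FastEthernet0/21
 spanning-tree portfast trunk
end
```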


I hope this is useful to some people. I've dealt with this situation quite a few times, and the first few times it took me a while to figure out what was happening. If you want to understand this further, Cisco has a very helpful page about this here.

Edit: See my next entry for info on some of the Linux commands I used to test multicast in my lab.
Update: See my newer post about IGMP Query Solicitation

12 comments:

Lloyd said...

Thanks for sharing this, I've been fighting a problem with one switch in a stack flooding multicast for weeks. Assuming it was related to snooping/querying, I could never understand how switches in a stack could behave differently. After reading the blog I found that the Portfast on the second switch was different than the first.
Thanks again.

Chris Smailes said...

Thanks for posting this Steve.
I don't know if you monitor this, but I have a question, please. As far as I can see, this flooding behaviour doesn't happen on Cisco 6500s?

nervegrind3r said...

Hello, and thanks for the great post. This is spot on as to what's occurring in one of our customer's multicast networks.
I have some questions:

Can you elaborate on what you mean by "Don't do this for links to switches - they should be set up as part of your spanning tree."? The only trunks we have in our network are closets that connect back to the core switch, and they are set up as portchannels. In this case, should we only set up portfast on access ports, and forget about issuing spanning-tree portfast trunk on the portchannel links?


As for the command "no ip igmp snooping tcn flood", should this be added to our closet uplinks (portchannel) in addition to all our edge device access ports?

Steve Haskew said...

Chris, I'd be surprised if the behaviour doesn't occur on 6500s, but it's possible that the default behaviour is different for some features. Also possible that with a 6500 you have enough bandwidth that you just don't notice a flood.

nervegrind3r, it's normal best practice not to use portfast between switches, but only for 'edge' devices. This Cisco page explains a bit: http://www.cisco.com/c/en/us/support/docs/lan-switching/spanning-tree-protocol/12013-17.html. Basically portfast is 'act first, check later' vs the usual 'check first, then act' so you are weighing up convenience/speed with possible disruption in the event of someone creating a physical loop.
The 'no ip igmp snooping tcn flood' can be used on your uplinks yes. Probably look at your traffic... if you have a roughly fixed bandwidth of multicast (e.g. IPTV system in a hotel with a fixed number of channels) then you can calculate which links would get saturated in the event of a flood. So if you have any older switches on a 100M uplink, you probably want to use this command to avoid killing that link. If your whole core and distribution layer is 1G and you have 600M of multicast then no need to prevent the flood within the core/distribution layer as it won't saturate any links (and probably you want flood behaviour so that you have minimum disruption to receiving multicast traffic). Hope that answers your question..?

Gabnet77 said...

Hi Steve, thanks a lot for your post. It really explains what we just had in the network. I have one question regarding the "no ip igmp snooping tcn flood".
This is what Cisco doc says about it: "With the no ip igmp snooping tcn flood command, you can disable multicast flooding on a switch interface following a topology change. Only the multicast groups that have been joined by a port are sent to that port, even during a topology change."

So my question is: if we configure this feature on a port facing another switch, to which the mcast clients are connected, in order to avoid an mcast flood, what would happen if there's a TCN and STP reconverges on the network? Will all the mcast traffic be stopped on that inter-switch link, or will at least the legitimate mcast channels the clients were subscribed to continue to be forwarded? It's not clear to me whether all the flooding is stopped or only the non-legitimate part. If all traffic is cut in order to avoid mcast flooding, maybe we are penalising the legitimate traffic? If so, would IGMP snooping query solicitation solve this situation? What I'm trying to avoid is the flooding, but not the legitimate traffic.

Thanks for your post again
Cheers
Gab

Steve Haskew said...

Hi Gab,

Sorry for the delayed reply! So on a port with "no ip igmp snooping tcn flood" the port will still receive the traffic for the multicast groups it is subscribed to even when there is a TC. That applies even if the port goes to another switch.

However, if you have a redundant topology managed with STP, using this command on *all* ports may cause you a problem. If a user on a switch is the only user in the whole LAN receiving a given multicast group, and the switch they are connected to has an uplink failure (and so uses its redundant path to a different switch) then they will get an interruption of the multicast service for a few moments. The best practice would be not to use this on inter-switch links, but to make sure you have enough bandwidth on these links for all multicast to be temporarily flooded between switches.

Gabnet77 said...

Hi Steve,

It's clear now. Thanks again for your feedback.

Unknown said...

Does this sound like flooding? I recently installed an IPTV system at a sports bar with multiple IPTV encoders and many IPTV decoders, using Cisco 2960G switches for distribution. If I reboot or disconnect/reconnect (a physical topology change) any of the devices on the network, the video goes crazy: blocking, scrambling and the like. Does this sound like it's caused by flooding? Thanks, J

Steve Haskew said...

Thanks for the question! That does sound a lot like this exact issue. Check the configuration of the ports where the devices are - do you have spanning-tree portfast configured? It sounds a lot like each time a device reboots or is disconnected, you have a Spanning Tree topology change, triggering this temporary multicast flood!

Jay said...

I have been pulling the remainder of my hair out for months over this issue. I thought it was incorrect IGMP settings, defective hardware somewhere on my network or just plain-old something I did wrong. BUT this sounds like what the problem may be. All along I'd been feeling like when I boot a set-top box it was "sending data" to the entire network rather than just to the server that contains the channel guide information. A friend who knows way more about complex networking than I do has been helping, but we haven't had a solution. The other day, just by unplugging the Ethernet cable of one of the set-top boxes on a 48-port switch, all the TVs went bonkers. When I described that situation to him, he suggested that it sounded like a topology change issue and that I should try disabling Spanning Tree. I did some searching and found your article. I am away for a few days, but as soon as I return I will test out the settings and hope for the best. Thank you again for all your information!

JAZ X said...

Thanks for sharing. It solved a problem I'd had for a year and a half.
The content is very useful for multicast network setups in the CCTV industry.

farhad moayyed said...

Hi,
I use a Cisco 2960X with several encoders connected to it (VLAN 10) and a trunk out to another head-end. I have a problem with the ports that are connected to the sources (encoders): I expect to see only RX multicast on these ports, but I see TX multicast on them too. Please let me know how I can block outgoing multicast on these ports. Note that I tested the SWITCHPORT BLOCK MULTICAST command but it did not work, and an ACL was useless because it can only be applied to incoming multicast. Please help me solve my problem.