Troubleshooting VLAN Switch Problems: No Connectivity

Posted on April 13, 2012 by Joe Rinehart in Networking

Having already solved your first trouble ticket for the day, which basically was an “Internet is down” issue (including common switch issues, VLAN-related issues, and spanning-tree issues), you’re now ready to tackle your second trouble ticket. This time, we’re looking into a “No connectivity” issue.

(Instructional video below provides a walkthrough of the steps contained in this article.)
Just like the first one, you pick up the ticket, assign it to yourself, contact the requester to inform him/her that you are already actively working on the problem, and perform troubleshooting and resolution.

Here’s a screenshot of this particular request:

Everything is down

You start by going back to some of the devices you were looking at earlier. After some initial investigation, you learn that users aren’t even getting any DHCP addresses.

As always, you begin troubleshooting at the Physical Layer. You execute a show ip interface brief to get a quick summary, and you see that everything is doing well from there.


You follow that up with a command that will give you more detailed information:

show interfaces

All relevant interfaces are still showing up/up.

Just to make sure you eliminate any potential problem, you issue the show controllers fastEthernet 0/0 command. The results tell you that nothing is really wrong there.

For instance, there are no collisions…


… it is autonegotiated …


… and so on.


At this point, you presume that the problem is not physical.
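The physical-layer pass above could be run as the following short sequence. This is only a sketch: the R1-3# prompt is an assumed hostname prompt for Router 1-3, and the interface number is the one used in the walkthrough.

```
R1-3# show ip interface brief
R1-3# show interfaces
R1-3# show controllers fastEthernet 0/0
```

The things to look for are the ones mentioned above: up/up status on the relevant interfaces, no collisions, and successful autonegotiation.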

You then execute the show cdp neighbors command. You see SW1-3, so you conclude that Layer 2 is still working.


You follow that up with a ping to 192.168.1.1.

No connectivity there.


Same with VLAN11.


Since all pings are failing, you proceed to execute:

show ip eigrp interfaces

You see nothing in the output, even though all your interfaces are configured correctly and routing is configured.

You execute show ip eigrp neighbors.


Everything does seem to be down.

You follow that with show ip route, and you see only the router’s local routes.
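Taken together, the Layer 3 checks on this router might look like the sequence below. The R1-3# prompt is assumed; the commands and the address are the ones used above.

```
R1-3# ping 192.168.1.1
R1-3# show ip eigrp interfaces
R1-3# show ip eigrp neighbors
R1-3# show ip route
```

An empty EIGRP neighbor table together with a routing table that holds only local routes suggests the router is cut off from its upstream peers, which is why the next step is to look at other devices on the path.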


You then decide to see whether the problem is a little more widespread. You go to Router 1-1 to see if there’s a problem there.


You start by issuing the show ip interface brief command, which then shows that everything is up.


You again execute show ip route. This time, unlike on R1-3, you get all sorts of routes.

You try to see whether the Internet is working from here.


It is.

You then try pinging 192.168.1.2. You have connectivity there. You also ping 192.168.1.3, which you know isn’t working.


You then proceed to ping Switch 1-1 (192.168.1.111), Switch 1-2 (192.168.1.112), and Switch 1-3 (192.168.1.113) because these are all on the path. Of the three, Switch 1-3 is the only one you cannot reach.

Your initial observations tell you that something between Switch 1-2 and Switch 1-3 isn’t working correctly.
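The hop-by-hop test from Router 1-1 can be sketched as follows. The R1-1# prompt is an assumption; the addresses are the ones listed above.

```
! Router addresses: .2 answers, .3 is the one known to be down
R1-1# ping 192.168.1.2
R1-1# ping 192.168.1.3
! Switches along the path, in order
R1-1# ping 192.168.1.111
R1-1# ping 192.168.1.112
R1-1# ping 192.168.1.113
```

Pinging the devices in path order makes the break easy to locate: the first address that stops answering marks the far side of the faulty link.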

You go to Switch 1-3 and issue the show cdp neighbors command. You see the downstream neighbor but not the upstream one. This confirms your suspicion that the problem is between Switch 1-2 and Switch 1-3.

Next, you run show spanning-tree and notice that it returns no spanning-tree information.

You ping 192.168.1.1, but you can’t reach it from here.


You go to Switch 1-2 and ping 192.168.1.1 again. You also ping 192.168.11.1. They are both reachable from here.


When you do a show cdp neighbors, you notice that Switch 1-3 is missing.


You execute show spanning-tree. Everything is working as expected.


Next, you execute the show interfaces trunk command. You see no problem there either.


Taking into account all your observations up to this point, you suspect that the problem is really on Switch 1-3 rather than on Switch 1-2, where every check has come back clean.

You go back to Switch 1-3 and issue the show vtp status command. No problem there either.

You finally decide to do a configuration check. To do that, you execute the show running-config command.


When the results come out, two things catch your attention. First, the channel-group 4 mode on lines tell you that these interfaces are part of the upstream EtherChannel trunk. Second, the switchport access vlan 111 line tells you that an access VLAN has been configured.

A port configured this way will not work as the trunk it is supposed to be. The problem is really getting clearer now.

You execute configure terminal, and then interface range fa0/23 - 24, which is a very helpful command because it allows you to configure multiple interfaces at once. You then enter switchport mode trunk. The resulting messages indicate that your port channel is back up.
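Written out as a session, the fix might look like this. The SW1-3 prompt is assumed, and the two show commands at the end are an added verification step rather than something the walkthrough runs at this point.

```
SW1-3# configure terminal
SW1-3(config)# interface range fa0/23 - 24
SW1-3(config-if-range)# switchport mode trunk
SW1-3(config-if-range)# end
! Added check: confirm the trunk and the bundle are both up
SW1-3# show interfaces trunk
SW1-3# show etherchannel summary
```

On some Catalyst platforms that support both ISL and 802.1Q, switchport mode trunk is rejected until the encapsulation is set explicitly with switchport trunk encapsulation dot1q. Whether that applies here depends on the switch model, which the walkthrough does not state.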


You go back to Router 1-3 and issue the show cdp neighbors command.


Follow that up with a show ip eigrp interfaces command.


After that, you execute show ip route again. This time, you get all sorts of information.


You ping 192.168.11.1.

Success.

Next, 216.145.1.2.

Success again.


You therefore conclude everything is really back up.

Just like in the first ticket, you summarize everything in a sort of root cause analysis.

  1. The fault was identified on SW1-3.
  2. The fault was isolated to Layer 2, specifically in the VLAN and trunking configuration.
  3. The fault was caused by a misconfigured port mode (a human error).
  4. It was resolved by restoring the original trunking mode configuration on the switch port.

After changing the status of the request to Resolved, you go back to the Home tab. You’re relieved to see only one more situation left to troubleshoot.


This last situation is somewhat different from the other two in that the network is not hard down, but there is a performance issue.

As you look at the request on the ticketing system, you see a message that doesn’t sound as urgent as the other two. Still, it’s your job to find a solution for the problem.

network is slow

The closest point to the problem is Router 1-3, so you start your troubleshooting work there.

You’ve already established that everything is working from the workstations’ point of view. You know there’s connectivity to the Internet and to the workstations. There’s basically full reachability, so you know that it’s not any kind of routing issue.

The first command you issue is show ip interface brief. You find all pertinent interfaces up and running.

Next, you check out Layer 2 by issuing the show cdp neighbors command. That is working as well.


You follow that with show ip eigrp interfaces and then show ip eigrp neighbors. Both results still show nothing wrong.


Same with show ip route.


But then when you try a ping test, the result is slower than what you expected.


Still the question remains – where is the performance issue?

You try a similar ping test from Router 1-1. You get the same slow results.
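To make the comparison between the two routers less dependent on a handful of packets, the ping can be repeated with a larger sample. The sketch below makes three assumptions: 216.145.1.2 (the Internet-side address used earlier) as the target, an arbitrary count of 100, and illustrative prompts.

```
R1-3# ping 216.145.1.2 repeat 100
R1-1# ping 216.145.1.2 repeat 100
```

IOS reports the minimum, average, and maximum round-trip times at the end of each run, which gives a baseline to compare against once a fix is in place.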


So you know that the problem lies somewhere between Router 1-1 and Router 1-3, including the switch fabric in between. It could be at Router 1-3 or in the switching fabric.

You do a show running-config and start looking for anything that would limit traffic, such as filtering, access lists, or anything of that sort.

But then you see that everything is completely in order in this device, so at this point, you eliminate it as a possible cause of the problem.
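One way to speed up that review is to filter the configuration for the features most likely to restrict traffic. The filter pattern below is only a starting point and an assumption about what might be configured; it is not taken from the walkthrough.

```
R1-3# show running-config | include access-list|access-group|service-policy
R1-3# show access-lists
```

Empty output from both commands is consistent with the conclusion that this router is not filtering or policing traffic, although a full read of the configuration remains the more thorough check.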


You go to Switch 1-3 and issue a ping to 216.145.1.2. It appears to be unreachable.


To find out why, you execute a show run. There, you notice that the VLAN interface is shut down, which explains why you could not ping out.


You issue a no shut. Sure enough, after that you are able to ping out again.
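As a session, re-enabling the VLAN interface might look like the sketch below. The walkthrough does not say which VLAN interface was shut down, so <vlan-id> is a placeholder to fill in, and the SW1-3 prompt is assumed.

```
SW1-3# configure terminal
SW1-3(config)# interface vlan <vlan-id>
SW1-3(config-if)# no shutdown
SW1-3(config-if)# end
! Repeat the test that failed before
SW1-3# ping 216.145.1.2
```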


It is still slower than you were expecting, so you proceed by issuing a show cdp neighbors. You see your EtherChannel link to the next switch. Again, everything appears to be in order from here.

You continue with a show spanning-tree. You see that the port channel is up and running, and the other portions of the result aren’t showing anything unusual either.


You move on to Switch 1-2 and begin with a show cdp neighbors. You find your two connections upstream to Switch 1-1, so it really looks like your system should be doing well.


Again you do a show spanning-tree. As soon as you get to VLAN 11, you notice something peculiar. When you review the results of your show cdp neighbors command, you notice that the EtherChannel links come in groups of two.

Then when you look at the show spanning-tree results, you see the ports listed individually rather than as a single port channel.

To see if something is awry, you execute a show running-config.


The idea is that EtherChannel doubles the bandwidth between the switches. If it is in place on every link except this one, this link becomes the constrained-bandwidth area.

You see the two interfaces. When you scroll down, you find no channel group under either of them: the channel-group configuration is missing. You presume that, for some reason, it has been deleted.

To start solving that problem, you execute the following commands:

interface range fastEthernet 0/1 - 2
channel-group 1 mode on

Sure enough, it brings the port channel up.

You then go back to Switch 1-1 and issue the following commands:

interface gi0/15
channel-group 1 mode on
interface gi0/17
channel-group 1 mode on


After you do a show spanning-tree, you observe that the port channel goes to a “learning” state and then goes back up.


You see port channel 1.


When you go back to Router 1-3 and do your ping again, you see a significant improvement in the round-trip time. You conclude that the EtherChannel between the switches had been deleted, and that this degraded the bandwidth.
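For reference, the whole repair can be laid out as one session across both switches. The prompts are assumed, the configure terminal and end lines are filled in around the commands given above, and the final two show commands are an added verification step.

```
! On Switch 1-2
SW1-2# configure terminal
SW1-2(config)# interface range fastEthernet 0/1 - 2
SW1-2(config-if-range)# channel-group 1 mode on
SW1-2(config-if-range)# end
!
! On Switch 1-1
SW1-1# configure terminal
SW1-1(config)# interface gi0/15
SW1-1(config-if)# channel-group 1 mode on
SW1-1(config-if)# interface gi0/17
SW1-1(config-if)# channel-group 1 mode on
SW1-1(config-if)# end
!
! Added check: the bundle should appear as one port channel in VLAN 11
SW1-1# show etherchannel summary
SW1-1# show spanning-tree vlan 11
```

Because mode on bundles the ports without PAgP or LACP negotiation, both ends need the matching setting. Configuring one side well before the other can leave the link in an inconsistent state, so it is worth doing the two switches back to back.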

In summarizing everything into your root cause analysis, you take note of the following:

  1. The fault was identified on devices SW1-1 and SW1-2.
  2. The fault was isolated to Layer 2, EtherChannel.
  3. The fault was caused by deleted configuration, although you’re not sure how that happened.
  4. The problem was resolved by restoring EtherChannel functionality with the channel-group 1 mode on command.

Conclusion

In this series, we covered VLAN/switch troubleshooting techniques. This concludes this two-part article.
