Network Troubleshooting

The thing that makes networks awesome is that once you set up a network, you’re not done. That may sound like a drawback, but it is excellent job security! Engineers and administrators must watch the performance of an organization’s network to make sure that productivity is not affected. Network outages can have a huge effect on an organization: lost revenue and the cost of unproductive employees can do severe damage. Earlier this year I also talked about the troubleshooting process, which goes hand in hand with network troubleshooting.

One of the major things a network engineer or administrator needs is documentation of the network. That may sound like common sense, and it is, but outdated documentation, or no documentation at all, makes it difficult to diagnose where the problem is and even more difficult when help is needed from outside sources. Just like your computer, car and home, networks need regular maintenance. Spending time on an operational network can improve its performance and uptime, instead of relying on the “duct tape” approach when outages occur.

So we know documentation is important, but it always seems to be on the back burner: “I’ll get to it when there is time to burn.” In my recent experience working with a hospital, when things went down I had no idea where to start. The documentation was outdated twice over, and let’s just say the problem could have been fixed in 5 minutes instead of an hour. Needless to say, I broke out Microsoft Visio and worked from the ground up. Having documentation is a good idea, but the kicker is remembering to update it when changes occur :).

Simple problems usually have simple solutions. We have all been there: a problem comes up 5 minutes before heading home, and the problem itself is simple, but for whatever reason the solution cannot be found. From my experience in troubleshooting labs and real-world environments, I have found that it helps to look at the whole problem and take it one step at a time. For example, if a server or PC no longer has network access, is another PC or server on the same network able to use the resources needed? Continue to use the Troubleshooting Process and ask those simple questions until the solution is found.
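
To make the "one question at a time" idea concrete, here is a minimal Python sketch. It is only an illustration, not a tool from this post: the function names, the questions, and the idea of using a TCP connection as the reachability test are all assumptions, and any hosts or ports you feed it would come from your own documentation.

```python
import socket
from typing import Callable, List, Optional, Tuple


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def first_failure(checks: List[Tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Ask each question in order and return the first one answered 'no'.

    Returns None when every check passes.
    """
    for question, check in checks:
        if not check():
            return question
    return None


def build_checks(gateway: str, server: str, port: int) -> List[Tuple[str, Callable[[], bool]]]:
    """Hypothetical ordering: start close to the PC and work outward."""
    return [
        (f"Can this PC reach the gateway {gateway}?", lambda: tcp_reachable(gateway, port)),
        (f"Can this PC reach the server {server}?", lambda: tcp_reachable(server, port)),
    ]
```

Running the same checks from a second PC on the same network answers the question above: if the second PC passes and the first does not, the problem is most likely local to the first machine.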

Don’t cover your tracks! I have done this a couple of times myself. When configuring a device like a router, switch or firewall, no matter how simple the configuration change seems to be, always copy the commands to a text file after you have configured the device. Or better yet, copy the commands to a text document first, read it through to make sure it looks correct, and if something fails, reverse the commands.
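
One possible way to keep that text file consistent is a small helper that writes the planned commands and the commands that reverse them into one record per change. This is a sketch under assumptions: the function names, the file naming scheme and the record layout are made up for illustration, and the rollback commands are still written by hand.

```python
from datetime import datetime
from pathlib import Path
from typing import List


def format_change_record(device: str, commands: List[str],
                         rollback: List[str], when: datetime) -> str:
    """Build a plain-text record of a planned change and how to undo it."""
    lines = [
        f"Device: {device}",
        f"Planned: {when.isoformat(timespec='minutes')}",
        "",
        "Commands to apply:",
        *[f"  {command}" for command in commands],
        "",
        "Commands to reverse the change:",
        *[f"  {command}" for command in rollback],
    ]
    return "\n".join(lines) + "\n"


def save_change_record(directory: Path, device: str, text: str, when: datetime) -> Path:
    """Write the record to <directory>/<YYYYmmdd-HHMM>_<device>.txt and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{when:%Y%m%d-%H%M}_{device}.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

For example, a record for a hypothetical switch named sw1 could list "interface GigabitEthernet0/1" and "description Uplink to core" as the commands to apply, with "interface GigabitEthernet0/1" and "no description" as the commands to reverse them. Reading the record through before touching the device is the whole point.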

Don’t copy-run start! This kind of goes with the one above, but this method has saved me in some sketchy situations. Sometimes I take an already working device, throw some extra commands on it, and it disconnects from the network, making an existing problem worse than before. Because the changes were never saved to the startup configuration, a simple reload or a pull of the power plug brings the device back up the way it was before I made those changes. Once I have confirmed that the commands work, I then issue the famous copy run start command.
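
Before saving, it can help to see exactly what a reload would throw away. The sketch below assumes you have already pasted the output of show startup-config and show running-config into two strings; the function name is hypothetical, and real output may include header lines (timestamps, byte counts) that differ even when nothing was changed.

```python
import difflib
from typing import List


def unsaved_changes(startup: str, running: str) -> List[str]:
    """Return the lines that differ between the saved and the running config.

    Lines prefixed with '+ ' exist only in the running config (lost on reload);
    lines prefixed with '- ' exist only in the startup config.
    """
    diff = difflib.ndiff(startup.splitlines(), running.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

An empty list means the two configurations match, so there is nothing left to save or to lose.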

Monitor your network. How do you know what “normal” looks like when you have nothing looking at it? There are tons of paid and free tools to help you find that “normal” network load, so when something happens outside of normal you know about it. Some tools that I have worked with are PRTG, MRTG, and syslog servers like Kiwi Syslog. I have also heard of Nagios, WhatsUp Gold, and many more. A simple search will point you in the right direction ;).
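
To show what finding “normal” can mean in practice, here is a rough Python sketch. It is not how any of the tools above work internally; it simply assumes you have a list of past load readings (for example, link utilization in percent) and treats anything more than a few standard deviations from the average as outside of normal. The function names and the default threshold of 3 are assumptions.

```python
from statistics import mean, stdev
from typing import List, Tuple


def baseline(samples: List[float]) -> Tuple[float, float]:
    """Return (average, standard deviation) of past readings.

    Needs at least two samples, otherwise stdev raises StatisticsError.
    """
    return mean(samples), stdev(samples)


def is_abnormal(value: float, average: float, spread: float, k: float = 3.0) -> bool:
    """Flag a reading that is more than k standard deviations from the average."""
    return abs(value - average) > k * spread
```

With readings of 40, 42, 38, 41 and 39 percent, the average is 40, so a new reading of 43 would pass quietly while a reading of 95 would be flagged. A real monitoring tool would also account for time of day and day of week, which this sketch ignores.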

I hope this information is helpful when troubleshooting networks because, although it can be a headache sometimes, it’s my favorite part of the job. If you have any ideas for the next topic that deals with either ICND1 or ICND2, you can always comment below.