Network Uptime Strategy: Five Ways to Avoid Late-Night Calls
It’s Saturday night. You’re just sitting down to a fantastic dinner with some great friends. Life is good. And then the phone rings. The network is down.
This is a familiar scene when you work in IT, and the timing is always terrible. So, how do you avoid getting that call? How do you solve the nuisance of having to skip out on plans, troubleshoot late into the night and eat cold leftovers by yourself?
The best way is to start with a network strategy that increases redundancy, fault tolerance and functionality. ANM can help with every aspect of this strategy – from initial network design to monitoring, testing, support and documentation. We see to it that you get the least intrusive failures, the most uptime and the most time doing the things you really enjoy.
Here are the five steps we can help you take to protect and secure your network from the ground up.
- Design. The old adage about ‘failing to plan’ applies quite aptly to your enterprise network. Begin with the assumption that, at some point, something will fail. This may be a faulty switch, a cut fiber or a lengthy power outage. Murphy’s Law assures us that something will happen, so understand how problems will affect your network traffic and build your network with planned failures in mind.
Critical equipment should include dual power supplies and redundant cores. Cisco’s Hot Standby Router Protocol (HSRP) and multi-chassis link aggregation groups (LAGs) are widely used to provide the redundancy needed to keep things running. Your design strategy should also include equal-cost multipath (ECMP) routing, Layer 3 access and routing protocols configured to reroute traffic when a link fails.
Newer technologies such as SD-WAN are ideal for future-proofing your network. Automatic failover, self-healing attributes and performance-based routing are all great strategies to implement early in your design phase. These approaches can be deployed before a packet ever hits the wire, saving you from hours of weekend troubleshooting and minimizing negative impacts.
- Monitoring. A frantic customer call is not the best way to learn of a network outage. But would you even know if a power supply in one of your switches died? With effective network monitoring, you are alerted to issues before your end users notice, giving you valuable lead time to troubleshoot and solve the problem.
Leading tools like Cisco DNA Center have robust monitoring and assurance capabilities to confirm your equipment is performing as expected. This central network management system provides granular insight into your users, applications and devices – with the ability to learn and adapt to network changes. ANM offers dedicated 24/7 managed services to handle monitoring and troubleshooting on your behalf – so your plans are never interrupted and you get the rest you need.
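To make the idea concrete, here is a minimal sketch of the simplest kind of check a monitoring system performs: trying to reach each device and reporting the ones that do not answer. It is an illustration only, not how Cisco DNA Center or ANM's managed services work. The `Target` class, the `tcp_probe` and `find_down` functions and the choice of a plain TCP connection test are all assumptions made for this example.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Target:
    """A device to check (hypothetical structure for this sketch)."""
    name: str
    host: str
    port: int


def tcp_probe(target: Target, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the target opens within the timeout."""
    try:
        with socket.create_connection((target.host, target.port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def find_down(
    targets: Iterable[Target],
    probe: Callable[[Target], bool] = tcp_probe,
) -> List[str]:
    """Return the names of targets that failed the probe, in input order."""
    return [target.name for target in targets if not probe(target)]
```

Calling `find_down` on a schedule with a list of your own devices, and sending an alert whenever the result is not empty, gives you a basic early warning. A real monitoring platform goes much further, tracking things like power supply state, interface errors and application performance.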
- Testing. You’ve designed a robust network with failures in mind and constant monitoring in place. So what happens when an ISP failure occurs? Do your redundant cores actually work as designed? You need to test to find out.
Testing is a time-consuming but necessary step in your network strategy. This includes scheduling a maintenance window and effectively simulating various failure scenarios. It is important to get ahead of outages and make sure your failover plan works while you have the luxury of time. Fail your ISP links. Reboot a core. Make sure that your robust network design is working as intended. Then, when something does fail at 2 a.m., you will already know what’s going on.
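Before the maintenance window, a quick paper exercise can help you decide which failures to test first. The sketch below assumes a made-up topology (the device names and the `LINKS` list are hypothetical) and removes one link at a time to see whether traffic can still get from an access switch to the ISP. It does not replace a live failover test; it only points out where a single failure would cut you off.

```python
from collections import deque


def reachable(links, start):
    """Return every node reachable from start over the given links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def single_link_failures(links, source, destination):
    """Return the links whose individual failure cuts source off from destination."""
    critical = []
    for index, failed in enumerate(links):
        # Remove exactly one link, even if a parallel link has the same endpoints.
        remaining = links[:index] + links[index + 1:]
        if destination not in reachable(remaining, source):
            critical.append(failed)
    return critical


# Hypothetical topology: dual cores, but only one ISP link.
LINKS = [
    ("access-1", "core-a"),
    ("access-1", "core-b"),
    ("core-a", "edge"),
    ("core-b", "edge"),
    ("edge", "isp-1"),
]

print(single_link_failures(LINKS, "access-1", "isp-1"))
# prints [('edge', 'isp-1')]
```

In this example the redundant cores protect against any single core link failing, but the lone ISP link is still a single point of failure, so that is the scenario to put at the top of the test plan.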
- Documentation. You have to know what your network looks like to know what can go wrong with it. Proper documentation is key to understanding and troubleshooting potential network issues. While a greenfield deployment may be well documented, networks grow, change and evolve over time. Documentation must be reviewed often and updated frequently. Small changes that you plan to document later somehow never make it into the record.
By helping you keep up with important documentation, ANM simplifies troubleshooting, supports seamless knowledge transfers and makes it easier to onboard new staff. By keeping your network blueprint up to date, we save you hours of future headaches.
- Maintenance. Once your network is well designed, well documented, tested and monitored, there is one more step to minimizing dinner party interruptions. Over time, vendors release new patches, software images, security fixes and features for the devices in your network. The switch with a 17-year uptime looks great, but it is likely running code with a long list of security vulnerabilities. In addition to back-end network care and feeding, you must maintain current software images on your devices. This helps you stay ahead of bugs and security holes that get exposed over the years.
While network outages may be inevitable, following the five steps above will result in fewer calls and faster resolution when the phone does ring. As your IT partner, ANM can assist at every stage of networking and automation. We can help with a network assessment, using tools such as Cisco Active Advisor and RISC assessments. We also help you keep on top of what devices you have out there, what their end-of-service and end-of-life dates are and what code version you’re actually running.
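As a rough illustration of what keeping on top of your devices can look like, the sketch below checks a small inventory for software that is behind the version you intend to run and for devices at or near their end-of-life date. The `Device` fields, the `audit` function and the one-year warning window are assumptions made for this example; they do not describe the output of any particular assessment tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class Device:
    """One inventory record (hypothetical fields for this sketch)."""
    name: str
    running_version: str
    target_version: str
    end_of_life: date


def audit(devices: Iterable[Device], today: date, warn_days: int = 365) -> List[str]:
    """Return human-readable findings for version drift and end-of-life dates."""
    findings = []
    for device in devices:
        if device.running_version != device.target_version:
            findings.append(
                f"{device.name}: running {device.running_version}, "
                f"target is {device.target_version}"
            )
        days_left = (device.end_of_life - today).days
        if days_left < 0:
            findings.append(f"{device.name}: past end of life")
        elif days_left <= warn_days:
            findings.append(f"{device.name}: end of life in {days_left} days")
    return findings
```

Fed from an inventory you already maintain, a check like this turns "we should upgrade that switch someday" into a short list you can schedule into maintenance windows.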
Our experienced engineers have seen just about everything and always have you covered. We’re even happy to answer that call on Saturday night.
John Gallow, Senior Consulting Engineer
John is a pre-sales architect with ANM who has implemented redundant designs for many large enterprises. Previously, he served as a deployment engineer specializing in core route/switch and wireless technologies. Outside of work, he can often be found on his motorcycle, fly fishing in the mountains or working on his home PC.