It would be hard to overstate the importance of avoiding downtime. Every minute your website is inaccessible represents both a financial and customer relations hit. The cumulative effect can lead to catastrophic consequences for e-commerce sites, in particular.
Internet outage events routinely lead to news-grabbing headlines. End-users rely on web platforms for an endless list of day-to-day activities like banking, navigation, and research. When the ability to reach these services suddenly grinds to a halt, the result is a chain reaction of frustration, confusion, and even panic.
Imagine running a retail store that is suddenly unable to complete in-process sales transactions or to accept credit cards. Not only would you lose sales in the moment, but you would also face the lingering effects of customer frustration. Will the impacted customers want to return to a store that couldn’t complete a simple sale?
Scale that situation up to millions of customers, and you might have an idea about what happened when Cloudflare went offline twice in the summer of 2019. The networking giant dropped 15% of its global traffic, impacting hundreds of the websites it hosts, including e-commerce behemoth Amazon.com. It was hours before Cloudflare fully regained a steady traffic flow. The company placed the blame on a traffic routing issue caused by Verizon.
About a week later, Cloudflare experienced a second outage, apparently due to a human coding error. This time, the entire service went down for a short time.
Most of us are not maintaining Amazon-level websites, but it is clear that a consistent online presence is essential for companies of any size. In the best-case scenario, visitors will be frustrated but try again later. Other visitors will simply move on to an alternate site, one that may become their preference in the future, as well.
In the meantime, frustrated end users may well start spreading the word about your unreliable website, leading to the ripple effects of bad PR.
Downtime causes and solutions
Websites experience sudden outages for many reasons. Some causes are squarely out of our control, but we have the power to prevent others.
Common causes
These causes make up the majority of downtime incidents. Many of these issues are preventable.
Hosting provider limitations and failures
Issue
Cheaper hosting solutions save money in the short term, but cheaper often correlates with less reliable service. Many times, these services are not professionally managed and can leave your site literally in the dark if they go out of business.
Cheap hosts are more likely to skimp on basics like security, leaving them more prone to DDoS attacks and hacks. Less aware companies sometimes sign on with a cheap web host without understanding key elements like bandwidth and data transfer rates. Insufficient bandwidth can be a clear path to unexpected downtime.
Solution
Partner with hosts that have a good reputation and high reliability ratings. Be sure to select a hosting plan that can handle your expected traffic and unexpected spikes. Consider a hosting plan that allows for unlimited traffic.
Expired domain or hosting agreement
Issue
Websites go down every day because someone forgot to renew a domain name or a hosting agreement contract.
Solution
The best way to avoid these issues is to plan ahead. Consider using an auto-renew option for both your domain name and web hosting plans. Record renewal dates in your digital calendar as well, in case the auto-payment doesn’t complete. After all, credit cards expire, too.
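As a rough illustration of the calendar idea, a small script can flag upcoming renewals. This is a minimal sketch, not a complete tool: the record names, the dates, and the `due_soon` helper are all hypothetical placeholders, and the 30-day warning window is an assumption you should adjust.

```python
from datetime import date

# Hypothetical renewal records; the names and dates are placeholders.
RENEWALS = {
    "example.com domain": date(2030, 3, 1),
    "hosting plan": date(2030, 6, 15),
}


def due_soon(renewals, today, warn_days=30):
    """Return (name, days_left) pairs for items expiring within warn_days.

    Items that have already expired appear with a negative days_left.
    The result is sorted so the most urgent item comes first.
    """
    upcoming = []
    for name, expires in renewals.items():
        days_left = (expires - today).days
        if days_left <= warn_days:
            upcoming.append((name, days_left))
    return sorted(upcoming, key=lambda item: item[1])


if __name__ == "__main__":
    for name, days_left in due_soon(RENEWALS, date.today()):
        print(f"{name}: {days_left} day(s) until renewal")
```

Run on a schedule, a check like this acts as a backstop for the case where auto-renewal silently fails.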
Human error
Issue
Even well-trained, capable employees make occasional mistakes. Something as simple as removing the wrong file from a server can take down an entire website. A single critical coding error can have the same result.
Solution
In addition to hiring capable SecOps team members, you may need to provide increased oversight if errors are commonplace.
If you have a large enough team, encourage coding partners who help test code before it goes live. Create a set of best practices around coding and system maintenance tasks. This helps to establish consistency and the ability to track down errors more easily.
Traffic overload (DDoS) attack
Issue
Traffic overload happens for a few reasons. We would all love to “go viral,” but a huge spike in legitimate traffic over a short time frame can take down a website without sufficient traffic capacity.
A Distributed Denial of Service (DDoS) attack creates the same kind of overload deliberately, at the hands of bad actors who target a website by directing massive amounts of traffic to it.
Solution
Opt for a hosting plan with sufficient traffic capacity to potentially handle a DDoS attack. If you do experience unexpected traffic, your site will be better prepared to handle the volume. Establish and follow network security protocols and keep your security parameters up to date.
Technical causes
These causes are more technical and generally more challenging to prevent.
DNS failures
Issue
The DNS service itself may have failed, but downtime due to DNS can also be the result of human error, such as a misconfigured record, or of a general slowdown. Sometimes, DDoS attacks create DNS failures.
Solution
Investigate the DNS to uncover the root cause and then address it. If necessary, contact your web host, who should be able to verify if the issue is on the hosting or domain name side. Be sure the DNS is configured correctly, and take steps to prevent improper configuration in other instances.
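A first step in that investigation can be as simple as checking whether your hostname resolves and whether it resolves to the address you expect. The sketch below uses Python's standard `socket.getaddrinfo`. The `resolve_addresses` and `dns_status` helpers and the three status labels are illustrative assumptions, not part of any particular tool.

```python
import socket


def resolve_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses for hostname, or [] on failure."""
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # The name could not be resolved at all.
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({entry[4][0] for entry in results})


def dns_status(hostname, expected_ips, resolver=socket.getaddrinfo):
    """Classify hostname as 'ok', 'unresolved', or 'mismatch' against expected_ips."""
    found = resolve_addresses(hostname, resolver)
    if not found:
        return "unresolved"
    if set(found) & set(expected_ips):
        return "ok"
    # The name resolves, but not to an address we recognize: likely misconfiguration.
    return "mismatch"


if __name__ == "__main__":
    # Placeholder hostname and address; substitute your own values.
    print(dns_status("example.com", ["192.0.2.10"]))
```

An "unresolved" result suggests a problem on the domain name side, while a "mismatch" points toward a misconfigured record. Either way, it gives you something concrete to bring to your web host.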
Malicious attacks
Issue
These attacks can occur in several ways.
- Targeted attacks on your database can corrupt tables, interrupt your server functionality, and paralyze your website.
- Volume-based DDoS attacks can create a traffic spike beyond the limitations of any host.
- Less sophisticated but equally effective targeting can involve phishing and offline scams that grant system access to an intruder.
Solution
Again, network security is critical. Keep abreast of your network status at all times through the use of a reliable security platform. Establish best practices around system access. Instruct employees never to grant access to an unknown party. Instead, create a shortlist of people who are allowed to handle system access.
Consider a modern, AI-based security platform that can immediately detect changes and issue alerts.
CDN failures
Issue
Utilizing the cloud increases convenience and usability across a wide range of SaaS, online database platforms, file-sharing programs, and website backup functions.
Ideally, a content delivery network (CDN) helps to prevent website downtime, but it is not without risks. As we saw with the Cloudflare incidents, using a CDN-based host can open you up to additional risk versus a traditional server-based hosting provider.
The CDN itself can suffer from a DDoS attack or a targeted online hack, creating a sequence of events that ends with your site going down.
Solution
CDN failure is one of those issues that is mostly out of your control, but taking a Multi CDN approach has become more popular.
This way, you aren’t relying on a single service. As a best practice, partner with only reliable CDN-based hosts, as less reliable hosts are more prone to failures and attacks. Actively monitor your network so that you can respond proactively as much as possible.
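To make the Multi CDN idea concrete, here is a minimal failover sketch: check each CDN in priority order and use the first one that responds. The endpoint URLs and the `http_ok` and `pick_cdn` helpers are hypothetical, and a real setup would more likely rely on DNS-level or load-balancer-level switching than on a script like this.

```python
from urllib.request import urlopen

# Hypothetical health-check URLs, listed in priority order.
CDN_ENDPOINTS = [
    "https://cdn-a.example.com/health",
    "https://cdn-b.example.com/health",
]


def http_ok(url, timeout=3.0):
    """Return True if url answers with a 2xx status within timeout seconds."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False


def pick_cdn(endpoints, is_healthy=http_ok):
    """Return the first endpoint that passes the health check, or None if all fail."""
    for url in endpoints:
        if is_healthy(url):
            return url
    return None


if __name__ == "__main__":
    active = pick_cdn(CDN_ENDPOINTS)
    print(active or "No healthy CDN found; raise an alert.")
```

Passing the health check in as a parameter keeps the selection logic easy to test without touching the network.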
When downtime happens
Despite your best efforts, your website is likely to experience an outage at some point. The good news is that there are off-the-shelf solutions that can help you quickly and effectively respond, and more importantly, reduce the likelihood of downtime.
Mlytics’ Multi CDN solutions provide companies with a more secure, stable, and reliable web presence. Our single-platform product is easy to manage, intuitive, and robust enough to handle complex Multi CDN scenarios. Mlytics gives you access to multiple CDNs, AI-based global server load balancing, a web application firewall, DDoS protection, and much more.
Try Mlytics free for 14 days to find out how Mlytics will become your team’s favorite resource.