Keeping websites up and running when under pressure

We’ve had several instances where we’ve had to manage big peaks of traffic to government websites

dxw provides development, hosting, and support services for over 100 public and third sector websites. The kind of support or hosting we provide varies depending on the size and profile of the sites and the capability of the internal tech teams we work with.

Most websites come under pressure at some point, and it’s important that they’re able to keep running and providing services to people when that happens. In this post, I’ve shared some of the approaches we take to make sure sites are able to cope with spikes in traffic.

Keeping national infrastructure up and running

Among the websites we support are a number of major government sites that provide essential information and services to the public. They include sites like NHS England and the Judiciary. As you’d expect, at times, they come under extreme pressure.

Traffic spikes can be caused by events outside an organisation’s control that drive people to a site for news and guidance, or by something we know is coming, like a big announcement.

All the websites we host are built to be resilient to peaks. NHS England, for example, is supported by 6 machines in multiple locations so it’s able to deal with its normal load, plus a bit more. For newer sites, we can put auto-scaling in place, which automatically adjusts capacity to maintain a website’s performance as demand changes.

Advance warning means we can be better prepared

Our new hosting platform offers greater scalability, but we know that in some circumstances auto-scaling just won’t be fast enough: if there’s a sudden rush of traffic, the indicators that trigger scaling lag behind it. It helps to have advance warning wherever possible so we can prepare for a spike that we know is coming, for example when a message is about to be sent out asking people to do something that means logging on to the site. Even 1 hour’s notice means we can increase our capacity to cope.
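To illustrate the lag in indicators, here is a minimal Python sketch. It assumes a scaling rule that fires when the average number of requests over the last few minutes goes above a threshold. The function name, the traffic figures, the 5 minute window and the threshold are all made up for illustration and are not taken from our platform.

```python
from collections import deque


def minute_scaling_triggers(traffic, window, threshold):
    """Return the first minute at which the rolling average of `traffic`
    over the last `window` minutes exceeds `threshold`, or None if it
    never does. A hypothetical stand-in for an auto-scaling rule."""
    recent = deque(maxlen=window)  # keeps only the last `window` readings
    for minute, requests in enumerate(traffic):
        recent.append(requests)
        if sum(recent) / len(recent) > threshold:
            return minute
    return None


# Hypothetical load: 10 quiet minutes at 100 requests a minute,
# then a sudden jump to 1,000 requests a minute.
traffic = [100] * 10 + [1000] * 10
spike_starts = 10

triggered = minute_scaling_triggers(traffic, window=5, threshold=600)
lag = triggered - spike_starts
print(f"Spike starts at minute {spike_starts}, rule fires at minute {triggered}")
print(f"Lag before scaling even begins: {lag} minutes")
```

In this made-up case the rule only fires 2 minutes after the spike starts, and new capacity still has to start up after that. Adding capacity ahead of a known event avoids both delays.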

Sometimes doing something quite simple can make all the difference. When an announcement is due, for example, lots of people search for the topic before anything has been published, and that can be hard for a site to cope with. One way we can manage the impact is to work with organisations to change the way they publish things. Creating a holding page means fewer people search repeatedly, which helps to mitigate peaks.

There’s a difference between scaling transactional sites and content-based sites. The latter are much easier to scale. A content-based site can easily be cached by a Content Delivery Network (CDN) or a local in-memory cache, so the database is queried less often and fewer web server resources are spent rendering pages.
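As a rough sketch of why a local in-memory cache takes pressure off the database and the web server, the Python below keeps a rendered page for a fixed time and only renders it again once that time has passed. The class name, the 60 second lifetime and the page path are assumptions made for illustration. A real site would more likely use an existing caching layer than hand-rolled code like this.

```python
import time


class TTLCache:
    """A tiny in-memory cache that keeps each value for `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._store = {}    # key -> (expiry time, cached value)

    def get_or_render(self, key, render):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no rendering, no database query
        value = render()     # missing or expired: do the expensive work once
        self._store[key] = (now + self.ttl, value)
        return value


render_count = 0


def render_page():
    """Stands in for a database query plus template rendering."""
    global render_count
    render_count += 1
    return "<h1>Guidance</h1>"


cache = TTLCache(ttl_seconds=60)
for _ in range(1000):
    html = cache.get_or_render("/guidance", render_page)

print(f"1,000 requests served with {render_count} render")
```

Here 1,000 requests for the same page lead to a single render. A CDN applies the same idea further out, so most requests never reach the web servers at all.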

On the other hand, transactional sites are likely to write to the database or render the pages differently for every user. So you’re likely to have to scale your web servers and database to deal with the extra load of a spike in visitors if you want to maintain your service and stop it from crashing.
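One place this difference shows up is in the HTTP Cache-Control header a site sends with each page. The sketch below is an assumption about how a site might choose that header, not a description of any site we host. The function name and the 5 minute lifetime are made up for illustration.

```python
def cache_control_for(personalised: bool, max_age_seconds: int = 300) -> str:
    """Pick a Cache-Control header value for a page.

    Pages rendered differently for each user must not be stored by shared
    caches such as a CDN, so every request reaches the web servers.
    Pages that are the same for everyone can be cached and reused.
    """
    if personalised:
        return "private, no-store"
    return f"public, max-age={max_age_seconds}"


print(cache_control_for(personalised=False))  # a guidance page
print(cache_control_for(personalised=True))   # a signed-in account page
```

Every page that falls into the second case has to be served by your own web servers and database, which is why they need to scale with the traffic.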

Our experience this year

Over the past 12 months, we’ve had several instances where we’ve had to manage big peaks of traffic to different websites. Sometimes this has been in relation to the pandemic, sometimes it’s been the result of other high profile announcements or events.

Having the right foundations in place, and being prepared wherever possible, has allowed us to respond quickly and keep services available to the public when they need them.