Misconfiguration Takes Down Internet Ads
Last week the internet went without ads, at least for about 90 minutes, after Google's DoubleClick for Publishers (DFP) service, the ad-serving platform used by many sites, had an apparent outage.
InformationWeek reported, "Google's publisher ad network brought in about $3.4 billion in Q3 2014, so 90 minutes of that works out to about $2.4 million."
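The arithmetic behind that estimate can be sketched as follows. This is only a rough illustration, assuming revenue is spread evenly across every minute of the quarter (real ad revenue varies by time of day); the function name and the choice of quarter length are assumptions, not something the quoted report specifies.

```python
def revenue_for_window(quarterly_revenue, window_minutes, days_in_quarter):
    """Prorate quarterly revenue over a time window, assuming a flat rate."""
    minutes_in_quarter = days_in_quarter * 24 * 60
    return quarterly_revenue * window_minutes / minutes_in_quarter


Q3_REVENUE = 3.4e9    # about $3.4 billion, as quoted above
OUTAGE_MINUTES = 90   # approximate length of the outage

# Try a rounded 90-day quarter and the 92 calendar days of July-September.
for days in (90, 92):
    estimate = revenue_for_window(Q3_REVENUE, OUTAGE_MINUTES, days)
    print(f"{days}-day quarter: ${estimate / 1e6:.2f} million")
```

Under these assumptions the result lands between roughly $2.3 million and $2.4 million, in line with the quoted figure.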
Gigaom posted "News of the failure began to circulate on Twitter and on Google forums around 9:15ET, as publishing and advertising workers began to ask where all the ads had gone."
AdWeek emphasized that "Some rivals jumped on the disruption to suggest that DoubleClick's system may be straining under the weight of data used by real-time online ad-placement systems that try to show the right ads at the right time in the right place."
Twitter filled with comments and complaints.
What Caused This Outage?
As data center infrastructure grows more complex and changes are demanded more frequently, keeping track of those changes becomes nearly overwhelming.
The outage, according to Google, was "due to a misconfiguration, [and] we were unable to prevent the outage."
The misconfiguration sparked an outage lasting about 90 minutes on November 12th, affecting an estimated 55,185 and leaving publishers briefly without revenue.
The outage showed just how many web publishers rely on Google's ad-serving system. Events like this one also highlight the pressures that IT operations face in managing complex systems. Seemingly minor IT changes can slip into complex systems at any time, whether authorized or not. As shown here, even an infrastructure the size of Google's can come undone by a single misconfiguration that pushes a stable system into an incident state and results in an outage. As we saw, this not only damages reputation but also makes for angry customers and heavy financial losses.