The internet is littered with ‘dead links’

There’s an adage that the internet is forever, but new research finds that’s not exactly true.



A quarter of web pages that existed between 2013 and 2023 are no longer accessible, according to new research from Pew Research Center. This so-called “digital decay” is leaving a trail of dead links across government websites, news media, and Wikipedia, raising an important question: What is the long-term cost of losing a Library of Alexandria’s worth of web pages over the course of a decade?



To arrive at that number, researchers collected a random sample of nearly one million web pages from Common Crawl, an internet archive service, and checked whether they were still accessible. Pew found that 38% of pages from 2013 were inaccessible, compared with 15% of pages from 2022 and 8% of pages from 2023. That shows that while link rot grows with time, it’s still a problem even for pages that are just a year old.



Pew defined inaccessible pages as those that no longer exist on their host servers (in other words, users get some sort of “404 not found” message when they visit the page). Researchers found at least one broken link in 54% of Wikipedia “reference” sections, on 23% of news webpages, and on 21% of government webpages.
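
For readers curious what such a check might look like in practice, here is a minimal Python sketch. It is not Pew's code or methodology: the function names (`is_inaccessible`, `fetch_status`, `share_inaccessible`) are hypothetical, and the rule it applies, counting a page as dead when the server answers 404 or does not answer at all, is a simplifying assumption based on the description above.

```python
from http.client import HTTPException
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def is_inaccessible(status):
    """Assumed rule: a page is dead if there was no response or a 404."""
    return status is None or status == 404


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrives."""
    request = Request(url, method="HEAD")  # HEAD skips downloading the body
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error code such as 404.
        return err.code
    except (OSError, HTTPException, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def share_inaccessible(statuses):
    """Fraction of the given status codes that count as dead pages."""
    statuses = list(statuses)
    if not statuses:
        return 0.0
    return sum(is_inaccessible(s) for s in statuses) / len(statuses)
```

Running `fetch_status` over a sample of URLs and passing the results to `share_inaccessible` gives a rough dead-link rate for that sample. Some servers reject HEAD requests, so a more careful check would fall back to GET before declaring a page gone.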



“News sites with a high level of site traffic and those with less are about equally likely to contain broken links,” authors from Pew wrote. “Local-level government webpages (those belonging to city governments) are especially likely to have broken links.”



The research suggested social media sites also deal with a high level of dead links. Nearly 20% of posts on X, the site previously known as Twitter, are no longer publicly visible. Of those posts, 60% were from accounts that had since been made private, suspended, or deleted, and 40% were posts that had been deleted from accounts that still exist, researchers found. Though they didn’t look at MySpace, there wouldn’t have been much to see even if they had. In 2019, the site lost every piece of content uploaded before 2016.



“The internet is an unimaginably vast repository of modern life, with hundreds of billions of indexed webpages,” the authors wrote. “But even as users across the world rely on the web to access books, images, news articles, and other resources, this content sometimes disappears from view.”



While former MySpace users might not mourn the loss of awkward photos from their teens and early twenties, the overall phenomenon of digital decay threatens to leave us with less information and an incomplete picture of the evolution of the web.



In a time before personal computers and smartphones, a research project might entail visiting a library to scour physical encyclopedias or view old newspapers on microfilm. Today, people assume the internet will be an eternal repository of knowledge, available with a few taps of a keyboard. It turns out we might have far less available to us than we assume.
