Archiving Websites and Data

Tools that can help preserve access to websites and datasets needed for research and publication. This guide was created from the work of Shauna-Kay Harrison and Abby Sypniewski.

Websites Are Ephemeral

There is no guarantee that anything we can access on the web today will be there tomorrow. The Pew Research Center found that 38% of webpages that existed in 2013 were no longer accessible a decade later. On government websites (local, state, and federal), Pew found that about 20% of webpages contained at least one broken link. For news sites, the figure was even higher, at 23%.

If you are using web content for your research, you should take steps to ensure that you don't lose access to it.

This guide aims to help you decide which web archiving tool best fits your needs. It is organized by use case (i.e., what you are archiving), with additional factors to help narrow down your choice.

Web Archiving

Web archiving is how we make sure that what we care about on the web doesn’t disappear. It is the process of capturing a “snapshot” of a website and preserving it over time so that someone can later experience it the way it appeared at the time of capture.
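
To make the idea of a “snapshot” concrete, here is a minimal sketch in Python. It is an illustration only, not one of the tools this guide recommends: the function names (snapshot_name, save_snapshot, capture) and the file naming scheme are assumptions made up for this example. It saves the raw HTML of a single page together with the capture time and a checksum. Dedicated web archiving tools typically also capture images, stylesheets, and scripts, which this sketch does not.

```python
import hashlib
import urllib.request
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


def snapshot_name(url: str, captured_at: datetime) -> str:
    """Build a file name recording which page was captured and when (hypothetical scheme)."""
    parsed = urlparse(url)
    raw = (parsed.netloc + parsed.path).strip("/")
    # Replace anything that is not filesystem-safe with an underscore.
    safe = "".join(c if c.isalnum() or c in "-." else "_" for c in raw)
    stamp = captured_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{stamp}_{safe or 'page'}.html"


def save_snapshot(content: bytes, url: str, out_dir: Path,
                  captured_at: Optional[datetime] = None) -> Path:
    """Write one capture to disk, plus a SHA-256 checksum file next to it."""
    captured_at = captured_at or datetime.now(timezone.utc)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / snapshot_name(url, captured_at)
    target.write_bytes(content)
    # The checksum lets you confirm later that the saved copy has not changed.
    digest = hashlib.sha256(content).hexdigest()
    target.with_suffix(".sha256").write_text(f"{digest}  {target.name}\n")
    return target


def capture(url: str, out_dir: Path) -> Path:
    """Fetch a page over the network and save it as a snapshot (HTML only)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return save_snapshot(response.read(), url, out_dir)
```

Calling capture("https://example.org/", Path("snapshots")) would write a timestamped .html file and a matching .sha256 file into a snapshots folder. Recording the capture time in the file name is what makes each copy a snapshot of the page at a particular moment rather than just a download.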

Missing U.S. Government Websites and Data Rescue Efforts

If you're looking for US government information online, you may have encountered websites or datasets that have gone missing. It has become increasingly common for US government datasets that were previously publicly available to be removed. Some of these datasets may be altered and made available again, while others may remain offline indefinitely.

The Data Rescue Project Data Portal is a clearinghouse for information about US government data that are at risk, as well as information about preservation efforts and ways to access data that have been preserved.

Our U.S. Government Information Research Guide has a list of non-governmental resources that have preserved US government-produced websites and data.

Web Archiving Initiatives

A number of groups are working collaboratively to preserve what is currently on the web, including datasets, news articles, fanfiction, videos, websites, and more. Some of these organizations include:

For More Information on Web Archiving

Last Updated: Oct 8, 2025 7:38 AM