Media Cloud is an "open source and open data platform for storing, retrieving, visualizing, and analyzing online news. Media Cloud is newly multi-platform, using connected API keys to allow for retrieving, visualizing, and analyzing content from various social media platforms."
Many researchers are interested in automated computational analysis of large quantities of news content ("text mining"). The library's typical license agreements with the vendors of news databases do not allow individual users to download content en masse for text mining. The full text of articles may also be unavailable for export, or available only in a file format that is inconvenient for text mining. However, some vendors are now working with libraries to make news content available for text mining (see more resources on this page).
Visit our Text and Data Mining (TDM) support page for more details about TDM, and for help deciding whether the public interface of a news database meets your needs or whether text mining is more appropriate for your research.
The library has negotiated with ProQuest to make the historical files of several newspapers available for text mining. For each title, there is a limited date range for which the files are available. Files are in .txt format, and often number in the millions (see details below).
Please view our FAQ for more detailed information about ProQuest Historical Newspaper files for text mining.
These files are available only to current U-M faculty, staff, and students, and all researchers must sign an MOU (memorandum of understanding) before getting access.
Please contact shevonad@umich.edu or sdenn@umich.edu with any questions or to request access to the ProQuest historical newspaper files.
| Title | Coverage Start Date | Coverage End Date | Number of Text Files |
|---|---|---|---|
| American Israelite | 1854 | 1925 | 250,000 |
| Boston Globe | 1872 | 1983 | 11,239,627 |
| Chicago Defender | 1909 | 1975 | 1,925,000 |
| Chicago Tribune | 1849 | 1935 | 5,250,000 |
| Detroit Free Press | 1831 | 1922 | 4,812,453 |
| Detroit Free Press | 1923 | 1999 | 1,821,606 |
| Guardian & Observer [UK] | 1791 | 1909 | 2,825,000 |
| Los Angeles Sentinel | 1934 | 2005 | 1,018,296 |
| Los Angeles Times | 1881 | 1931 | 4,667,709 |
| New York Times | 1851 | 1934 | 8,073,453 |
| Times of India | 1838 | 2005 | 6,828,509 |
| Wall Street Journal | 1889 | 1936 | 2,650,000 |
| Washington Post | 1877 | 1935 | 5,275,000 |
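
Because a single title can comprise millions of small .txt files, it usually helps to stream through them one at a time rather than load everything into memory. The Python sketch below shows one minimal way to do that, assuming the files have already been copied to a local directory tree. The function names `iter_txt_files` and `count_words`, the UTF-8 encoding, and the very simple tokenizer are illustrative assumptions, not a description of how the ProQuest files are actually organized or delivered.

```python
import os
import re
from collections import Counter
from typing import Iterator

# Hypothetical tokenizer: lowercase runs of letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def iter_txt_files(root: str) -> Iterator[str]:
    """Lazily yield paths of .txt files under root, one at a time.

    os.scandir avoids building a full list of millions of paths in memory.
    """
    stack = [root]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file() and entry.name.lower().endswith(".txt"):
                    yield entry.path


def count_words(root: str, encoding: str = "utf-8") -> Counter:
    """Count word frequencies across every .txt file under root.

    The encoding is an assumption; undecodable bytes are replaced
    rather than raising an error.
    """
    counts: Counter = Counter()
    for path in iter_txt_files(root):
        with open(path, encoding=encoding, errors="replace") as handle:
            for line in handle:
                counts.update(TOKEN_RE.findall(line.lower()))
    return counts
```

Calling `count_words("path/to/local/copy").most_common(20)` would then list the twenty most frequent words, where the path is a placeholder for wherever the files are stored. For a collection of this size, a real project would likely add batching, parallelism, or checkpointing on top of this pattern.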