Library of Congress has archived 170 billion tweets … and counting
The Library of Congress has archived every public tweet shared on Twitter from its inception in April 2006 to the present, and is exploring how to make the archive accessible to researchers in a more comprehensive and useful way.
The archive now holds approximately 170 billion tweets and has been growing by about half a billion tweets each day since October 2012.
The Library and Twitter signed an agreement in April 2010 that would provide the Library with the public tweets from the company’s founding to the date of the agreement, along with all future public tweets under the same terms. This month, all of the objectives of the original agreement will be complete: the Library will have acquired and preserved the 2006 to 2010 archive, established a secure and sustainable process for receiving and preserving the ongoing stream of tweets, and created a structure for organizing the entire archive by date.
“Twitter is a new kind of collection for the Library of Congress but an important one to its mission. As society turns to social media as a primary method of communication and creative expression, social media is supplementing, and in some cases supplanting, letters, journals, serial publications and other sources routinely collected by research libraries,” wrote Erin Allen on the Library’s blog.