The Library of Congress is among the largest repositories of literature in the world. It has now set its sights on archiving every public tweet posted on Twitter. To this end, it has already collected about 170 billion tweets and is in the process of archiving them.
The Library of Congress reached a deal with Twitter in 2010 to archive tweets. Under the deal, Twitter is providing the Library with access to all of its public tweets. Initially, the Library sorted and archived the 21 billion tweets generated between 2006 and 2010. Twitter later provided access to another 150 billion tweets posted since April 2010.
The volume of tweets generated each day has grown rapidly in recent months. Twitter users currently generate nearly half a billion tweets each day. The Library of Congress apparently plans to archive every public tweet, deeming the collection a significant expression of society. In doing so, however, it will have to expend significant resources as the number of tweets posted each day continues to grow.
Commenting on the project, the Library recently wrote, “Twitter is a new kind of collection for the Library of Congress but an important one to its mission. As society turns to social media as a primary method of communication and creative expression, social media is supplementing, and in some cases supplanting, letters, journals, serial publications, and other sources routinely collected by research libraries.”
The Library of Congress further said that although a number of researchers have asked for access to the collection of tweets to further their own research, it has not yet approved any of these requests. Archiving the 170 billion tweets collected so far has taken 133 terabytes of storage, a figure that includes two full copies of the data.
Source: Library of Congress