Anyone who thought they could sneak around Twitter’s increasingly restricted API and get at historical and real-time tweets through the archive in the Library of Congress had better think again. While the Library is making a serious effort to index all tweets since 2006, it is only opening up this archive to “known researchers” who have the Library’s approval to access the information.
Audrey Watters of O’Reilly Radar took a close look at the Library of Congress’ Twitter archive one year after the Library partnered with Twitter to begin collecting the data.
The Library has access to Twitter’s historical and real-time tweets through Twitter’s data partner Gnip, which also sells access to the Twitter firehose to interested developers and publishers.
Watters notes that the Library of Congress has been archiving digital content, such as politicians’ websites and digital newspapers, for over a decade, but that Twitter’s constant flow of content (as much as 140 million tweets per day) poses a unique challenge to its archiving abilities.
As it stands, the Library isn’t seeking to catalog all of the tweets on Twitter just yet, but rather to provide an index of these tweets that researchers can search when conducting a study. It will not be opening up this search to the general public, however:
“…access to the Twitter archive will be restricted to ‘known researchers’ who will need to go through the Library of Congress approval process to gain access to the data.”
So, while the Library expects to open its digital doors to its Twitter archive in about four or five months, the average citizen won’t be able to casually look up what their first tweet was, at least for the foreseeable future.