Think Before You Tweet: Library of Congress Archives Twitter

In an April 2010 agreement, Twitter granted the Library of Congress permission to archive public tweets from the social network’s inception in 2006 until 2010. 170 billion tweets later, the Library of Congress has completed its task and continues to process roughly 50 million tweets per day in an effort to record the digital communications that mark societal trends.

Integral to serving both Congress and the public, an archive of tweets not only represents the modes of communication that define the twenty-first century but also reflects the dominant narratives shaping society.

To help lawmakers craft more relevant legislation in the future, the archive of tweets will be made available to researchers and legislators, provided it is not used for profit. The Library of Congress announced in a public document,

“Archiving and preserving outlets such as Twitter will enable future researchers access to a fuller picture of today’s cultural norms, dialogue, trends and events to inform scholarship, the legislative process, new works of authorship, education and other purposes.”

Because of technological constraints, however, the Library of Congress has not yet granted access to scholars and legislators. “It is clear that technology to allow for scholarship access to large data sets is not nearly as advanced as the technology for creating and distributing that data,” Gayle Osterberg, the library’s director of communications, wrote in the press release.

Since the archive’s completion, over 400 researchers have requested access, with inquiries covering topics such as the rise of citizen journalism, elected officials’ use of the social network, vaccination rates, and stock market activity. Because of the technological constraints mentioned above, the archive is not yet user-friendly. A single search, Osterberg continues, could take 24 hours, severely limiting search possibilities.

Furthermore, while tweets are public information, concerns over privacy have prevented the Library of Congress from agreeing to make these records readily available to the public. These concerns centered primarily on deleted tweets and personal information, prompting the Library of Congress to include the following in its agreement with Twitter:

“The Library could make available any portion of the collection six months after it was originally posted on Twitter to ‘bona fide’ researchers.”

The Library of Congress also reserves the right to “filter certain things or wait longer to make them available,” said Martha Anderson, director of the National Digital Information Infrastructure and Preservation Program at the library.

The completion of the first phase of curation comes on the heels of a December announcement that users can download their personal archives. Curious about what your first tweet was about? Instructions on how to download your archive can be found here.