2011 State of the Union
The data files below are samples taken from live Twitter and Facebook feed imports during the 2011 State of the Union address.
All data files are in PCAT-archive XML format. To use one, download it and re-upload it to your DiscoverText project using the "DiscoverText XML" format.
Because the files are quite large, right-click the link and save the file instead of just clicking on it.
Huffington Post (Twitter, de-duplicated) (14,476 documents, 11 MB)
Redstate (Twitter, de-duplicated) (1,571 documents, 1 MB)
Sean Hannity (Twitter) (1,170 documents, 930 KB)
#SOTU (Twitter, de-duplicated) (53,369 documents, 40 MB)
#SOTU (Twitter, full dataset) (109,601 documents, 82 MB)
Sarah Palin (mentions on Twitter in #SOTU feeds) (1,271 documents, 927 KB)
Whitehouse Official Facebook Page (Facebook) (423,358 documents, 315 MB)
Obama (Twitter, de-duplicated) (116,776 documents, 87 MB)
Obama (Twitter, full dataset) (222,441 documents, 166 MB)
Part of the State of the Union 2011 - Mobile Dial-Testing/Polling Initiative with Survey Analytics
Update: 5/4/2011, 5:24 PM EST. Our apologies. We have been notified directly by Twitter that our offering of the Twitter datasets above for free is a violation of their Terms of Service as well as their API Terms of Service. We are sorry, but we cannot continue to offer these sample datasets, or any of the sample Twitter datasets in DiscoverText, in any form whatsoever.