We have been delighted with the response to our call for beta testers to try the GNIP-enabled PowerTrack for Twitter. You can still sign up. Round 1 of the beta test concludes on October 31, 2011. Testing the system’s data filtering and collecting capabilities for just 1 or 2 days, or even 1-2 hours, may convert you into a devoted user of GNIP via DiscoverText. As part of taking beta tester applications, we asked folks to tell us something about how they planned to use the beta test opportunity. Thanks to “Wordle” we can visualize an answer to the question: “Why do people want to take part in the GNIP beta test via DiscoverText?”
This is an 11-minute tutorial covering how to get started using the GNIP PowerTrack for Twitter (the “full firehose”) to capture large numbers of Tweets for analysis.
This short video talks about some of the advantages of using the GNIP-enabled PowerTrack to gather Tweets via DiscoverText.
DiscoverText is preparing to launch a short and exclusive beta test period using “PowerTrack for Twitter Firehose Filtering,” a service provided by GNIP. Compared to the “rate limited” service offered by DiscoverText through the public Twitter API, the “Full Firehose” delivers 50-100 times the volume, with powerful Klout, language and keyword filters. If you would like to participate in this trial, please leave us your contact information and tell us a little bit about your work. We will not be able to offer this trial service to everyone, so please make the case for the value you or your organization will add as a beta tester.
The evolution of the API opens the door for third-party developers to access information on social media networks. In the best case, this provides a healthy, democratic flow of information. Yesterday, DiscoverText had “rate limits” imposed on its access to Twitter data. As written, the Twitter API allows 150 unauthenticated calls per hour, per IP address. Authorized calls (made by users logged on with their Twitter credentials, also known as OAuth) are allowed at up to 350 calls per hour, per person. In addition, the Twitter Search API has internal rate limiting mechanisms, but Twitter does not publish those specific limits for fear of abuse. Going over any of these limits results in the user being presented with “Error 420”, which simply means that the user is being rate limited. This hampers the ability to harvest Twitter feeds within DiscoverText. We have never had rate limit problems prior to this, but according to timestamps on articles posted on Twitter’s developer website, Twitter might have become more cognizant of those harvesting large amounts of data (not just us) and, as a result, is cracking down on heavy users. At Texifter, we fully respect the rules and regulations of the Twitter API, and in no way seek to disobey or bend these rules in our flagship software product, DiscoverText. On August 18, 2011, the same day we learned of the 420 errors, we performed emergency maintenance to better cope with Twitter rate limitations. We also wanted to handle rate limitation errors more gracefully and to ensure we abide by the Twitter Terms of Service. With that said, in order to continue harvesting information from Twitter and performing our cutting-edge research, we are currently exploring easier and more reliable ways to harvest data. After the maintenance, DiscoverText still allows 1500 items per fetch, as determined by Twitter’s architecture on the public API.
In addition, no extraneous error messages should result when DiscoverText is being rate limited. Some searches might be silently delayed for 5 minutes; however, these fetches will catch up as soon as they can. In the near future, look for new developments for DiscoverText. We’ve got big plans for our social media API fetching that will greatly enhance our users’ ability to receive timely and actionable social media feeds. We don’t want to reveal too much right this moment, but we’re sure you’ll like what we have in store and, in traditional Texifter style, we’ll plan a large announcement when the time is right.
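For readers who build their own harvesters, here is a minimal Python sketch of the general idea described above: pace calls to stay under an hourly quota, and when a fetch comes back with status 420, wait quietly and try again. This is not DiscoverText’s actual code. The names `min_interval_seconds` and `fetch_with_backoff`, the `fetch` callable that returns a `(status, payload)` pair, and the retry count are all hypothetical; only the 150 and 350 calls-per-hour limits, the 420 status, and the 5-minute delay come from the post.

```python
import time

RATE_LIMITED = 420            # status the post says is returned when rate limited
RETRY_DELAY_SECONDS = 5 * 60  # the post mentions searches delayed for 5 minutes


def min_interval_seconds(calls_per_hour):
    """Smallest even spacing between calls that stays within an hourly quota."""
    return 3600.0 / calls_per_hour


def fetch_with_backoff(fetch, max_attempts=3, delay=RETRY_DELAY_SECONDS,
                       sleep=time.sleep):
    """Call fetch(); on a 420, wait `delay` seconds and retry.

    `fetch` is a hypothetical callable returning (status, payload).
    `sleep` is injectable so the logic can be exercised without waiting.
    """
    for attempt in range(max_attempts):
        status, payload = fetch()
        if status != RATE_LIMITED:
            return status, payload
        # Only wait if another attempt will follow.
        if attempt < max_attempts - 1:
            sleep(delay)
    # Still rate limited after all attempts: report it without raising.
    return RATE_LIMITED, None


if __name__ == "__main__":
    # Spacing implied by the quotas quoted in the post.
    print(min_interval_seconds(150))  # unauthenticated calls
    print(min_interval_seconds(350))  # OAuth calls

    # Simulated fetch: rate limited once, then succeeds. No real waiting.
    responses = iter([(420, None), (200, ["tweet"])])
    print(fetch_with_backoff(lambda: next(responses), sleep=lambda s: None))
```

Spacing calls evenly is only one possible pacing policy, and a real client would also need to respect whatever limits the service actually reports at the time.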