This is the latest DiscoverText filtering feature designed to speed up the creation of accurate custom machine classifiers. This video shows how we use an interactive display of classifier scores to isolate items in a dataset that require further human coding to improve the accuracy of the classifier. Click on the screenshot below to start the video.
This 60-second video introduces our start-up. Click on the image below to see the screencast and let us know what you think.
This is Version 1 of the 60-second Texifter elevator pitch. Feedback and questions are truly welcome. Just email firstname.lastname@example.org. ~Thanks!
On November 9th, the Federal Emergency Management Agency (FEMA) conducted its first national test of the Emergency Alert System. In some communities this meant full involvement, with teams responding to mock emergencies and managers monitoring the execution. In the deaf community, the response centered on monitoring two Twitter hashtags: #SMEM and #DEMX. The #SMEM hashtag is specific to the emergency response community and was created over a year ago; the #DEMX hashtag is specific to the deaf community and was created specifically for this event.

Monitoring the usage of these hashtags was Steph Jo Kent, a PhD candidate in Communications at the University of Massachusetts. Steph's goal was to track the spread of these hashtags through the deaf and emergency response communities and to see how they crossed channels. To do this, she utilized DiscoverText, which is how I was lucky enough to become involved in the project. Monitoring these specific Tweets adds to the already diverse functionality of DiscoverText.

To start the project, we simply used the Twitter API to harvest uses of #SMEM and #DEMX beginning on November 2. After the event on November 9, we continued to harvest uses of the hashtags. By early December, we had archived nearly 800 Tweets using the hashtag #DEMX and nearly 8,000 Tweets using the hashtag #SMEM. From these two archives, it is possible to break down Tweets by time and person, giving us valuable information about key individuals and how they spread the hashtag.

For Steph's research, it was particularly valuable to isolate the crossover between the two hashtags. Using our search feature, we were able to isolate cases of crossover and bucket those results. This allows us to move from noisy data to a more manageable and germane grouping of Tweets. From here, we utilized the newly optimized TopMeta feature to break down the occurrences by day and by user.
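For readers curious about the mechanics, the crossover-and-bucket step can be sketched in a few lines of Python. This is a hypothetical illustration, not DiscoverText's actual implementation: it assumes each harvested Tweet is a simple dictionary with "user", "date", and "text" fields, and it uses a naive case-insensitive substring match for hashtags.

```python
# Hypothetical sketch of hashtag crossover analysis.
# Assumes tweets are dicts with "user", "date", and "text" keys;
# DiscoverText's own search and TopMeta features are not shown here.
from collections import Counter

tweets = [
    {"user": "alice", "date": "2011-11-08", "text": "Testing the EAS today #SMEM #DEMX"},
    {"user": "bob",   "date": "2011-11-09", "text": "National EAS test underway #SMEM"},
    {"user": "carol", "date": "2011-11-09", "text": "Deaf community alert check #DEMX"},
]

def has_tag(tweet, tag):
    # Naive substring match; a real system would tokenize hashtags properly.
    return tag.lower() in tweet["text"].lower()

# Isolate crossover: Tweets carrying both hashtags.
crossover = [t for t in tweets if has_tag(t, "#SMEM") and has_tag(t, "#DEMX")]

# Break the crossover bucket down by day and by user.
by_day = Counter(t["date"] for t in crossover)
by_user = Counter(t["user"] for t in crossover)

print(len(crossover), by_day.most_common(1), by_user.most_common(1))
```

On the sample data above, only the first Tweet lands in the crossover bucket; the two counters then give the per-day and per-user breakdown of that bucket.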
We were able to discover which days and individuals produced the most Tweets. The information we found allowed us to better visualize how the Tweets broke down before and after the event. The results showed a small number of users producing the majority of Tweets, and more usage of the hashtags prior to the event than after it. Unfortunately, the mass crossover of Tweets that we had envisioned did not occur. There was a minimal amount of crossover, meaning the message did not travel well between the two communities. Steph has posted a detailed analysis of her findings on her blog, where she uses her expertise to analyze the project.

In the future, this same methodology can be applied to hashtags created for marketing or other purposes, such as hashtags for television shows and large events. There is valuable information in these hashtags; they reflect an emergent folksonomy that influences how ideas, links, and memes spread over Twitter. Using the GNIP Power Track, these hashtags can be leveraged as metadata, broken down over time, and used to display how well information did or did not travel. Overall, this was a great experiment, and I am happy to have had the opportunity to collaborate with Steph and to have participated in a project that has the power to influence the way social media is used to interact with those in the deaf community.