A lot of cynical thinking has been ascribed to the position Mark Zuckerberg set out on the regulation of Facebook and other online platforms. In this instance, he is absolutely right. This is not a new problem. Lawrence Lessig was among those who made clear, long before Facebook and Twitter were born, that some balance of regulatory mechanisms must be struck to prevent the overpowering rise of “architecture” embedded in software code. This video (posted June 1, 2015) about “Code and Other Laws of Cyberspace” is our Lawrence Lessig-inspired answer to the challenge of how to think about regulation in this case.
Over the last ten years, we have had the privilege of working with a wide variety of amazing researchers interested in studying social data. From the earliest days of our work developing the free, open-source Coding Analysis Toolkit, researchers have asked us about getting blog data, RSS feeds, and other text or image data streams via APIs.
Along the way, we have learned many important lessons and adapted our software to try to keep up with the most ambitious scholars. Researchers are always seeking more and better data, along with new tools to unearth insights from, for example, the flood of multilingual social media. Dr. Patricia Riley of the USC Annenberg School of Communication is one such researcher. In this interview, “Patti” shares a number of important ideas about how to study the future with social data.
I am very excited to be undertaking a whole new line of research into political fear. Inspired by my work with Glen Szczypka and the Health Media Collaboratory, I had this proposal accepted for the October 30, 2012 Sentiment Analysis Symposium.
Title: Fear and Loathing on the Social Campaign Trail
Abstract: What are voters afraid of on the eve of the 2012 election? Fear is one of the most freely expressed forms of sentiment in social media. This “Voice of the Voter” presentation looks at social data collected in the final week of October and speaks to the nature and salience of fear among the electorate. Bridging political and computational science, Dr. Shulman will present a frightening array of scenarios predicted in the Tweets and Facebook updates as the final phase of the campaign unfolds.
Be afraid. Be very politically afraid. Then, please join millions of others already scared out of their wits and express yourself on social media about the approaching election. Results to be posted here October 30th, 2012.
Thanks to some excellent groundwork by Joe Delfino and Sean Kelleher, Joe, Sean, and I were able to make a pilgrimage to Google, Facebook, and Reputation.com for a wildly exciting day of briefings with Q&A. While I’d love to share the details, I can’t! Big secret 😉 However, I can share a few pictures and stories from our day in Silicon Valley…

Stu at Google. Takeaway message: “This was a great meeting!”

Sean at Google: “I could move to California.”

Joe at Google, after spending the week in the Bay Area attending the 2011 Sentiment Symposium and the Text Analytics News Conference: “I am already (in my mind) living in California and running the west coast operation.”

Stu and his well-used Camaro. While running a bit behind schedule on the way to Reputation.com, it is alleged the driver took advantage of the fast-moving California 101 freeway, the state’s liberal U-turn policy, certain optional passing strategies based on scenes from action and/or science fiction films, and his passengers’ stomachs.

Joe at Facebook. Joe Delfino got us this meeting. Joe gets meetings. Joe is a meeting-getting animal. We like Joe.

When my son saw this picture of his Dad at Facebook on Facebook, he said: “Wow Dad; you look really happy!” I sure was happy. We had come from Google feeling deeply engaged by one of the greatest companies in the history of capitalism, and we were sitting in the lobby of another. We had lunch with a gracious host at the company cafeteria and a demo with a diverse group of Facebook sentiment analysts. After years of academic presentations, the freedom to present in jeans and a QDAP t-shirt was a perk I could probably get used to. The meme “west coast office” was heard frequently as we blazed out of Palo Alto and headed for Redwood City.

After the long day in Silicon Valley, the team got stuck in 101 rush-hour traffic, slightly grouchy and despondent, but made it to a wonderful restaurant, Burma Superstar, in the Pacific Heights neighborhood for beer, food, and good company near a place where a Hobbit had been spied. By the time we had returned the Camaro and made it to the train to the SFO terminal for our red-eye, we all realized the magnitude of the day we had. It was a huge lift for our confidence and an exciting glimpse into where Texifter is going. It is nearly certain that Texifter will be back on the West Coast soon.
DiscoverText is rolling out an addition to its analytical toolkit: random sampling. The Web service already offers an array of tools for text analytics and rigorous, team-based qualitative data analysis. These functions include the ability to code and annotate text, measure inter-rater reliability, adjudicate coder validity, attach memos to text, cluster duplicate and near-duplicate documents, share documents, and classify text using an active-learning naive Bayesian classifier.

While still in beta, random sampling is a key new addition. After DiscoverText users amass extraordinary amounts of social media data (for example, via the public Twitter API, the GNIP Powertrack, or the Facebook Social Graph), they can now more easily extract a random sample for analysis. The user decides the size of the sample in order to accommodate iteration, experimentation, and other scientific methods. The option is streamlined into the dataset creation process: on the new dataset creation page, you see a sample size prompt.

This additional method of data preparation and analysis augments current information retrieval techniques, such as search with advanced filtering. It also builds out our framework for expanding the available NLP methods, from straightforward Bayesian classification, which aims to analyze substantial quantities of data in their original bulk form, to a menu of computationally intensive methods that can iterate more quickly and effectively against random data samples. For example, the LDA topic model tool we are releasing will be faster and more effective against smaller random samples (a rough sketch of the basic sampling idea appears at the end of this post). This new feature thus supports both an additional analytical approach and the opportunity to easily compare results between competing (or complementary) analytic methods. We look forward to experimenting with this new tool and hearing about how random sampling will enhance the research of our users and users to come.

Special Note to DT Users: We need to turn this feature on one account at a time while we are testing it. Drop us a line if you want to try the tool. We’ll keep you posted on the launch as more dataset modifications are pushed live. As always, if you have any questions, feel free to email us anytime at email@example.com. Your feedback is crucial. Sign up and try it out for yourself at discovertext.com.
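For readers who want the flavor of what the new option does, here is a minimal sketch of uniform random sampling over a document archive. This is illustrative Python only, not DiscoverText’s actual implementation; the function name, parameters, and toy archive are all hypothetical:

```python
import random

def draw_random_sample(documents, sample_size, seed=None):
    """Return a uniform random sample of items from a larger archive.

    documents   -- list of text items (e.g., tweets or Facebook updates)
    sample_size -- the user-chosen size of the new dataset
    seed        -- optional seed so a given sample can be reproduced
    """
    rng = random.Random(seed)
    # A sample can never be larger than the archive it is drawn from.
    if sample_size >= len(documents):
        return list(documents)
    return rng.sample(documents, sample_size)

# Hypothetical usage: pull 1,000 items from a 100,000-item toy archive,
# then hand the manageable sample to a slower, computationally
# intensive method such as LDA topic modeling.
archive = [f"tweet {i}" for i in range(100_000)]
sample = draw_random_sample(archive, 1_000, seed=42)
print(len(sample))  # 1000
```

The key design point is that drawing the sample is cheap relative to the analysis that follows, so a researcher can iterate: draw, analyze, adjust, and redraw, rather than paying the full computational cost of the bulk archive on every pass.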