Next up: “Fake News & Other AI Challenges” in Vienna. Texifter Founder & CEO Dr. Stuart W. Shulman will present new applications of the DiscoverText platform to the problem of humans and machines learning.
Contributors: John Priebe, Mark Hoy, and Stu Shulman
Note: This is the third in a series of posts about ongoing work initiated by @stuartwshulman for a keynote talk at #screentimebu2018 titled “Fear and Loathing in American Politics: Emotion Detection in Twitter Data,” updated to reflect ongoing research.
The Project – First Person Expressions of Political Fear
In Part 1 of this blog series, we examined how fear can function as a profound political force, shaping societal discourse around such topics as the Cold War and the events of September 11, 2001.
A wide range of topics can instill fear. Sensing an increase in fearful American political discourse, we initiated a project to study tweets specifically related to first person expressions of fear about current U.S. politics. Some examples of tweets that fit this categorization are listed here:
The methodological approach to the research was discussed in Part 2. This third blog post in the series explores how our patented CoderRank system can be used to validate the human coders’ work and create Gold Standard training sets for machine learning on DiscoverText.
Twitter data was collected through Texifter’s web-based DiscoverText service, which provides real-time access to the ongoing stream of tweets. The Twitter data was licensed by Texifter and stored in DiscoverText, which can perform advanced text analytics to search, filter, code, and machine classify the data. DiscoverText’s analytics solutions are designed specifically for collecting and cleaning up messy survey, public comment, email, Twitter, and other text data streams.
Coding the Tweets by Several Human Coders
Humans are good at some things and computers are good at others. A consistent back and forth between humans and machines increases the ability of both to learn and classify unstructured textual data. Using both, accurate coding with high levels of inter-rater reliability and validity is possible. To this end, we assembled a team of coders to read each tweet and then classify it into one of two categories: a first person expression of political fear, or not.
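DiscoverText reports inter-rater reliability natively; purely to illustrate the underlying idea, the sketch below computes raw agreement and pairwise Cohen’s kappa for a handful of hypothetical binary labels (the coder names and label vectors here are invented for the example, not drawn from our project data).

```python
from itertools import combinations

# Hypothetical labels: 1 = "1st Person Expression of Fear", 0 = "Not",
# one list of labels per coder over the same set of tweets.
coder_labels = {
    "coder_a": [1, 0, 1, 1, 0, 0, 1, 0],
    "coder_b": [1, 0, 1, 0, 0, 0, 1, 0],
    "coder_c": [1, 1, 1, 1, 0, 0, 1, 0],
}

def cohens_kappa(x, y):
    """Chance-corrected agreement between two coders on binary labels."""
    n = len(x)
    observed = sum(a == b for a, b in zip(x, y)) / n
    # Expected agreement if both coders labeled at random with their own base rates.
    p1_x, p1_y = sum(x) / n, sum(y) / n
    expected = p1_x * p1_y + (1 - p1_x) * (1 - p1_y)
    return (observed - expected) / (1 - expected) if expected < 1 else 1.0

for (name_a, a), (name_b, b) in combinations(coder_labels.items(), 2):
    agreement = sum(u == v for u, v in zip(a, b)) / len(a)
    print(f"{name_a} vs {name_b}: "
          f"raw agreement {agreement:.2f}, kappa {cohens_kappa(a, b):.2f}")
```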
The rules to determine the coding for a tweet are fairly simple. The tweet must grammatically be in the first person. Therefore, a tweet that merely quotes another person or is a retelling of something someone else said is not a first person statement. The tweet must also relate to a political topic. Finally, the tweet must express the emotion of fear. The presence of certain words or expressions, such as “fear”, “I fear”, “I am afraid of”, “scared”, or “terrifies me” are good clues, but the coder must carefully read and understand the tweet to accurately code it.
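The cue words above lend themselves to a simple keyword filter, though, as noted, keywords alone are not enough for a final coding decision. As a rough sketch (the phrase list and function name are ours, not part of DiscoverText), such a filter might only flag candidate tweets for human review:

```python
import re

# Hypothetical cue phrases; first person fear is often, but not always, signaled this way.
FEAR_CUES = [
    r"\bI fear\b",
    r"\bI am afraid of\b",
    r"\bI'm afraid\b",
    r"\bscared\b",
    r"\bterrifies me\b",
]
FEAR_PATTERN = re.compile("|".join(FEAR_CUES), re.IGNORECASE)

def flag_candidate(tweet: str) -> bool:
    """Flag tweets containing a fear cue for human review. This is a pre-filter,
    not a coding decision: the coder still checks first person and political topic."""
    return bool(FEAR_PATTERN.search(tweet))

print(flag_candidate("There is so much disagreement and anger... I fear for our country."))  # True
print(flag_candidate("She said she was afraid, quoting the senator."))  # False (no cue match)
```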
DiscoverText is designed to record the speed and accuracy of coding over time because we have found that efficient coders are both fast and accurate. Validation (also referred to as adjudication) is the process of deciding whether the coders’ work is accurate and thereby suitable as training data for machine learning. The validation process is used to confirm the validity of individual coder selections and resolve discrepancies or differences among coders. This capability is especially useful during training sessions when reviewing and adjudicating the work of coders to ensure they understand the requirements of the coding task. Based on the results of adjudication, DiscoverText can statistically analyze how often coders are correct in their assumptions about the text.
As seen in Figure 1, a dataset with 2,000 items (tweets) has been coded by 6 coders. Each tweet was coded as either a “1st Person Expression of Fear” or “Not”.
Figure 1. A dataset with 2,000 data items (tweets) has been coded by 6 coders.
The validation process begins by selecting the validation tool in DiscoverText (Figure 2). You can optionally choose to validate only those data units that more than one coder has coded, or to exclude data units where all of the coders are in agreement. The second option reduces the number of data units to be validated; with a team of accurate coders, the validation outcome is unlikely to differ from a unanimous coding decision.
Figure 2. Beginning the validation process.
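To give a feel for that second option, the following sketch (our own illustration, using invented field names rather than any DiscoverText export format) keeps only the data units whose coders disagree:

```python
from collections import defaultdict

# Hypothetical coding records: one row per (unit, coder) decision.
codings = [
    {"unit_id": 1, "coder": "coder_a", "code": "fear"},
    {"unit_id": 1, "coder": "coder_b", "code": "fear"},
    {"unit_id": 2, "coder": "coder_a", "code": "fear"},
    {"unit_id": 2, "coder": "coder_b", "code": "not"},
]

codes_by_unit = defaultdict(set)
for row in codings:
    codes_by_unit[row["unit_id"]].add(row["code"])

# Units where at least two coders chose different codes; unanimous units are skipped.
units_needing_validation = [uid for uid, codes in codes_by_unit.items() if len(codes) > 1]
print(units_needing_validation)  # [2]
```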
During the validation process, each data unit (tweet) is displayed, along with the code that each coder selected and what percentage of the coders decided to code the unit as a first person expression of fear (or not). The person performing the validation then decides whether the code is valid or not valid. In Figure 3, five coders determined that the tweet “There is so much disagreement and anger… I fear for our country.” is a first person expression of fear; the validator agrees with this coding and selects “Valid”.
Figure 3. Adjudicating a tweet. Five coders decided that this tweet expresses a first person fear.
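The percentage shown during validation is simply the share of coders who applied the fear code to that unit. A minimal sketch (again with names of our own choosing) of that calculation:

```python
def fear_share(codes):
    """Fraction of coders who applied the '1st Person Expression of Fear' code."""
    return sum(1 for c in codes if c == "fear") / len(codes)

# Five of five coders chose "fear" for the example tweet in Figure 3.
print(f"{fear_share(['fear'] * 5):.0%}")  # 100%
```

The human validator still makes the final call; the percentage only summarizes the coders’ choices.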
Sometimes the validator decides that one or more coders were incorrect in their assessment of a particular tweet and marks the coding as “Not valid” (Figure 4).
Figure 4. Adjudicating a tweet. One coder decided that this tweet was not a first person expression of fear, which through adjudication was determined to be an invalid coding.
Viewing the Adjudication Report
Once the coders’ work has been adjudicated, the results can be viewed in a report. It is then possible to rank the coders by their level of accuracy in order to find, retain, or reward the best coders.
The adjudication report (Figures 5, 6) illustrates how often a coding choice was deemed valid or invalid and provides a breakdown by code and by coder. While researchers can decide for themselves what level of accuracy is acceptable, in this data set one coder is clearly less accurate than the others, selecting the valid code only 65.2% of the time (Figure 7).
Figure 5. First page of the adjudication report.
Figure 6. Second page of the adjudication report. Names of coders have been obscured for anonymity.
Figure 7. Third page of the adjudication report. The third coder (circled in green) was the most accurate coder. The sixth coder (circled in red) was clearly the least accurate coder as determined by validation. Names of coders have been obscured for anonymity.
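The accuracy figures in the report amount to per-coder rates of validated decisions. As an illustration only (the coder names and tallies below are invented, not the project’s actual numbers), this sketch ranks coders by the share of their coding choices marked valid during adjudication:

```python
# Hypothetical adjudication tallies: (valid decisions, invalid decisions) per coder.
adjudicated = {
    "coder_1": (520, 45),
    "coder_2": (498, 61),
    "coder_3": (540, 12),
}

def accuracy(valid, invalid):
    """Share of a coder's adjudicated decisions that were marked valid."""
    return valid / (valid + invalid)

ranking = sorted(adjudicated.items(), key=lambda kv: accuracy(*kv[1]), reverse=True)
for coder, (valid, invalid) in ranking:
    print(f"{coder}: {accuracy(valid, invalid):.1%} of decisions validated")
```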
This blog series examines how first person expressions of political fear can be identified using a combination of techniques, including Boolean search strings, human coding, and machine learning. DiscoverText was used to collect and sample data, as well as to manage the annotation project. To date we have over 22,000 labeled items, and our final post will explore whether or not the end product (a first person fear detector) works in the context of American politics.