Data scientists working on text analytics know that cleaning data can be time-consuming. Users of DiscoverText build reusable custom machine classifiers, or “sifters,” to find the most (or least) relevant items before using other classifiers to sort items into topic, sentiment, and other categories. DiscoverText combines hybrid data science methods (measurement, adjudication, iteration, replication) with established e-discovery text analytics tools to shorten a process that used to take weeks or months when words were sorted in spreadsheets. Our machine-learning sifters are created in hours, or even just a few minutes, using crowdsourcing. We offer an API and support technical integrations with Twitter and SurveyMonkey. Academics trust DiscoverText to help them do better, more transparent research, resulting in more scholarly publications. Legal teams use our document redaction capability to remove names, metadata, email addresses, and other sensitive information to produce Bates-stamped and spreadsheet-indexed PDF collections.
Boolean defined search, n-grams, word clouds, and custom topic dictionaries are power tools for text analysis and machine learning
Discover central topics as well as elusive but valuable concepts that are unexpected or rare. Use this information to train machine-learning classifiers to recognize relevant text and social media data. Jump into data using an interactive word CloudExplorer or build a mini topic dictionary using “defined” search. Try our new list view to see the top 300 bigrams and trigrams in your data.
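For readers curious what a ranked list of bigrams and trigrams involves, here is a minimal Python sketch. It is not DiscoverText's implementation: the function names `tokenize` and `top_ngrams`, the simple regex tokenizer, and the sample documents are all assumptions made for illustration. Only the default cutoff of 300 comes from the description above.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def top_ngrams(documents, n, k=300):
    """Return the k most frequent n-grams across all documents as (ngram, count) pairs."""
    counts = Counter()
    for doc in documents:
        tokens = tokenize(doc)
        # Slide a window of length n over each document separately,
        # so no n-gram spans two documents.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)

# Hypothetical sample data for illustration only.
docs = [
    "The service was slow and the service desk was closed",
    "Slow service again today",
]
bigrams = top_ngrams(docs, 2)
trigrams = top_ngrams(docs, 3)
```

Each result is a list of tuples such as `(("the", "service"), 2)`, sorted from most to least frequent. A production tool would likely also handle stop words, punctuation, and non-English text, which this sketch leaves out.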
Create gold-standard training sets by labeling your training data accurately and reliably using our state-of-the-art collaborative annotation system. Then use our trusted, multilingual machine-learning web service (uClassify) to create and apply your own custom-trained text classifiers. Please take the time to check out the work being done by the large and growing uClassify community.