Digital Humanities at Georgia Southern

Text Analytics Tools

Text Analytics software can be used to classify, sort, and extract information from text to identify patterns, relationships, sentiments, and other actionable knowledge.



Voyant 2: Voyant 2 is a suite of text analysis tools that works on pasted text, documents from the web, or uploaded files. You can also read the blog of one of its creators, Stéfan Sinclair, here. Note that not all of the Voyant tools appear on the main interface; there is a list of other Voyant tools here.

Natural Language Toolkit (NLTK): NLTK is a leading platform for building Python programs to work with human language data.

TAPoR 3: The Text Analysis Portal for Research (version 3) highlights Voyant as a tool, but it also has a set of curated lists of other resources. It is currently the best consolidation site for text analysis on the web.

twXplorer: Twitter has made analysis harder than it used to be, but twXplorer is still useful.

Umigon: Umigon analyzes Twitter for "sentiments" (positive, negative, neutral) and is useful for looking at trends.  It is probably the best opinion mining tool available for Twitter.

Netlytic: Netlytic works across social media platforms (including YouTube, RSS, and Instagram) and is good for data visualization.

Wordseer: Wordseer is an older text analysis project that relies on marked-up documents. It is still useful for a number of things, but it requires your data to be structured in a certain way.
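
To make the idea of "sentiment" labels (positive, negative, neutral) mentioned in the Umigon entry more concrete, here is a minimal Python sketch of lexicon-based scoring. It is not how Umigon or any other tool listed here works internally; the word lists and the function name classify_sentiment are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mini-lexicons, for illustration only; real tools use far larger resources.
POSITIVE = {"good", "great", "love", "useful", "happy"}
NEGATIVE = {"bad", "awful", "hate", "broken", "sad"}

def classify_sentiment(text):
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

posts = [
    "I love this great tool",
    "The site is broken and I hate it",
    "The library opens at nine",
]
for post in posts:
    print(classify_sentiment(post), "-", post)
```

A simple count like this ignores negation, sarcasm, and context, which is one reason dedicated opinion mining tools exist.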


Topic Modeling:

Mallet: Topic modeling analyzes clusters of words occurring together in order to refine or develop a topic for analysis.  Mallet is one of the best packages for this.

jsLDA: The goal of this project is to create a browser alternative to Mallet for users who cannot run executable files.
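
As a rough illustration of what "words occurring together" means, the Python sketch below counts how often pairs of words appear in the same document. This is only a co-occurrence count under simple assumptions (lowercased, letters-only tokens); it is not the statistical model that Mallet or jsLDA fits, and the function name cooccurrence_counts and the sample documents are made up for the example.

```python
from collections import Counter
from itertools import combinations
import re

def cooccurrence_counts(documents):
    """Count, for each pair of distinct words, how many documents contain both."""
    counts = Counter()
    for doc in documents:
        # Unique words per document, sorted so each pair has one canonical order.
        words = sorted(set(re.findall(r"[a-z]+", doc.lower())))
        counts.update(combinations(words, 2))
    return counts

# Made-up sample documents, for illustration only.
docs = [
    "whale ship sea captain",
    "ship sea storm",
    "garden flower rain",
]
pair_counts = cooccurrence_counts(docs)
print(pair_counts[("sea", "ship")])      # pairs are stored in alphabetical order
print(pair_counts.most_common(3))
```

Pairs that recur across many documents hint at a shared theme; topic modeling packages go further by grouping such words into topics automatically.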