Text Analytics Spells “Big Savings”

Text analytics and natural language processing are powerful concepts that are increasingly within organizations’ grasp. Many of the techniques for mining text to extract new information have existed since the mid-1980s, but with the rise of the data scientist the barrier to entry has dropped dramatically. Before we talk about how text analytics might be useful to your organization, let’s establish a quick baseline of understanding.

What is Text Analytics?

Text analytics is roughly synonymous with text mining and text data mining. Technically it is not related to biblio-wizardry or vocabu-sorcery, but I’d still like to think there’s some magic left in the world. The whole idea behind text analytics is taking a body of text and extracting valuable, discrete, or new information. Think about your business, then think about how much of a paper trail there is: emails, contracts, invoices, industry publications, etc. Most organizations have an absolute mountain of text that is likely providing little value right now beyond its original intended purpose.

  (See more about turning data into insights: /data-activation-when-your-data-hands-you-lemons/)

What about Natural Language Processing?

Natural Language Processing is a subset of text analytics that deals with aspects of language such as identifying parts of speech, disambiguation, sentiment analysis, and the other vagaries of human language that computers will soon be better at understanding than we are. I’m afraid, though, that no amount of context clues can help me understand modern slang (https://thoughtcatalog.com/january-nelson/2018/09/millennial-slang/). I used to be cool, but now I’m just a data geek.

Text Analytics and Machine Learning

As you’d expect in the new frontier of data jiggery, there are quite a few different approaches to text analytics. Some of the more interesting approaches use machine learning to train a model on an existing corpus of text and apply that model to related text. Perhaps we’re looking to extract entities by identifying law firm names in a body of legal documents. Maybe we’re trying to measure a customer’s sentiment during a customer service call by identifying speech patterns and word choice. Maybe we’re trying to determine whether two historical works were actually written by the same author, or whether they’ve just been attributed to the same person. These are exciting use cases, and I doubt you have to think hard before you come up with something applicable to your own organization.

A Real World Example

UDig is working with an association that publishes scholarly articles. They want to improve their ability to use the abstracts of the works to automatically match new content with specific peer reviewers. A high-level explanation of our approach to tackling the challenge follows.

First, we take the massive corpus of abstracts and do some simple pre-processing. We do things like remove stop words (“the”, “and”, etc.) and stem words (e.g., change “monitoring” to “monitor”). Next, we calculate a metric called TF-IDF. TF-IDF (which stands for “term frequency–inverse document frequency”) essentially counts the appearances of a particular word in a document and then penalizes the “score” for the word if it appears in many different documents. For example, the word “the” (if it weren’t already removed by our stop word elimination) would appear quite frequently in a single document, but because it appears in every document, it gets penalized until it counts for nothing. Conversely, if one article happens to be about “biblio-wizardry”, and only two other documents contain the term “biblio-wizardry”, we can start to assume those texts might be related, particularly as we assess other common terms across the documents.
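The pre-processing and scoring steps described above can be sketched in a few lines of Python. This is only an illustration, not the implementation used on the project: the stop-word list, the crude `naive_stem` helper, and the sample abstracts are all made up for the example, and the IDF variant used here, log(N / document frequency), is one common choice among several.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list; real projects use a much longer one.
STOP_WORDS = {"the", "and", "a", "of", "in", "to", "is", "for"}

def naive_stem(word):
    """Very crude suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, split on whitespace, trim punctuation, drop stop words, stem."""
    tokens = (t.strip(".,;:!?\"'()") for t in text.lower().split())
    return [naive_stem(t) for t in tokens if t and t not in STOP_WORDS]

def tf_idf(documents):
    """Return one {term: score} dict per document."""
    tokenized = [preprocess(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            # Term frequency times the penalty for being common everywhere.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical abstracts, invented for this example.
abstracts = [
    "Monitoring the effects of biblio-wizardry",
    "Monitoring vocabu-sorcery in the wild",
    "Biblio-wizardry and the monitoring of spells",
]
scores = tf_idf(abstracts)
for abstract, doc_scores in zip(abstracts, scores):
    print(abstract, doc_scores)
```

In this toy corpus, “monitor” appears in every abstract, so its score is zero everywhere, while rarer terms keep a positive score. That is the penalty described above in action.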

In this case, ranking scholarly articles using TF-IDF gives us a pretty good idea of when two documents are related and when they have little to do with each other. From there, we can take these terms and marry them up with peer reviewers. If we discover that one person has a penchant for reviewing articles about “biblio-wizardry” but never touches the (frankly more profane) “vocabu-sorcery”, we know how to route new abstracts as they come in by applying the same technique.
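One way to picture the routing step is to treat each abstract as a sparse vector of TF-IDF weights and compare a new abstract against a per-reviewer profile. The article does not say how the matching is actually done, so the use of cosine similarity here is an assumption, and the reviewer names, weights, and helper functions are all hypothetical.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def reviewer_profile(reviewed_vectors):
    """Sum the term weights of every abstract a reviewer has handled."""
    profile = Counter()
    for vector in reviewed_vectors:
        profile.update(vector)  # adds weights term by term
    return dict(profile)

def rank_reviewers(new_vector, profiles):
    """Return (reviewer, similarity) pairs, best match first."""
    ranked = [
        (name, cosine_similarity(new_vector, profile))
        for name, profile in profiles.items()
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical TF-IDF weights for abstracts each reviewer handled before.
profiles = {
    "reviewer_a": reviewer_profile([
        {"biblio-wizardry": 0.4, "library": 0.1},
        {"biblio-wizardry": 0.3, "spell": 0.2},
    ]),
    "reviewer_b": reviewer_profile([
        {"vocabu-sorcery": 0.5, "spell": 0.1},
    ]),
}

# Hypothetical weights for a newly submitted abstract.
new_abstract = {"biblio-wizardry": 0.35, "spell": 0.05}
ranking = rank_reviewers(new_abstract, profiles)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the new abstract is dominated by “biblio-wizardry”, the reviewer whose history is full of that term ranks first. Summing vectors into a profile is the simplest option; averaging, or comparing against each past abstract individually, would be equally reasonable choices.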

How achievable is this?

The possibilities for text analytics are endless. While it can be challenging to extract the information, and no two text analytics projects look the same, I believe there is an absolute treasure trove of value to be discovered. From automating discrete data identification to gaining a more holistic view of your customers, text analytics is worth investigating.

 

 
