Text Analytics – an Important AI Building Block for Business Applications

“Knowledge is power only if man knows what facts not to bother with”, Robert Staughton Lynd

It will surprise few of us to learn that the world we live in is swamped with data: virtually every aspect of our daily activities, such as browsing the Internet or reading the news, directly or indirectly involves the use of data. What some of us may not know is that the aggregate volume of data is growing at an exponential rate. The data we have to deal with is becoming so large that traditional analytical techniques are increasingly ineffective. This phenomenon of exponential data growth is commonly referred to as Big Data.

Figure 1: Big Data

As early as 2001, analyst Doug Laney (then at META Group, which Gartner later acquired) described three major challenges associated with the explosion of information in the Big Data era, namely Volume, Velocity, and Variety. In essence, the three V’s describe data that is increasing in its sheer quantity (Volume), in the rate at which it is generated (Velocity), and in the range of sources and formats in which it exists (Variety). There is simply too much information for businesses to handle with traditional methods. Thankfully, opportunities to develop advanced analytical techniques for Big Data are growing rapidly as well.

Among the analytical techniques being researched and developed, text analytics is quickly gaining attention. Text analytics, or text mining, is commonly defined as the analysis of data in natural language texts. It is similar to data mining, except that its data sources are largely unstructured or semi-structured documents. The primary purpose of text analytics is to derive high-quality information from text-based documents; it draws on techniques and methodologies from related fields such as data mining, machine learning, information retrieval and corpus-based computational linguistics.

Today, text analytics is gradually becoming extremely useful for various business applications, such as competitive analysis and improving the quality of market intelligence. This is mainly due to two reasons: firstly, a large part of corporate information (by common estimates, approximately 80%) is naturally available in textual formats; secondly, text analytics technology has gained serious attention from computing experts because it is potentially useful for many functions. There are several specialties within text analytics, each with a different objective. They are as follows:

  • document classification: group and classify documents into predefined categories (or classes)
  • information retrieval: obtain documents in response to a query
  • clustering/organisation of documents: unsupervised process through which documents are classified into groups called clusters
  • information extraction: involves identification of certain entities in the text, their extraction and representation in a pre-specified format
  • detection of co-occurrence in documents: identify concepts that are commonly addressed together in documents
  • identification of trends in data: involves the detection of news/articles concerning the emergence of new products or companies
  • text summarisation: selection of representative sentences to summarise the document
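
To make one of these specialties concrete, information retrieval can be sketched in a few lines of Python. This is only a minimal illustration, assuming a simple TF-IDF style score; the function names and the sample documents are hypothetical and not taken from any particular product.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_documents(query, documents):
    """Return document indices ordered by a simple TF-IDF score for the query."""
    tokenized = [Counter(tokenize(doc)) for doc in documents]
    n_docs = len(documents)
    query_terms = set(tokenize(query))
    scores = []
    for index, counts in enumerate(tokenized):
        score = 0.0
        for term in query_terms:
            # Number of documents that contain the term at least once.
            doc_freq = sum(1 for c in tokenized if term in c)
            if doc_freq == 0:
                continue
            # Smoothed inverse document frequency: rarer terms weigh more.
            idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
            score += counts[term] * idf
        scores.append((score, index))
    # Highest score first; ties fall back to the original document order.
    return [index for _, index in sorted(scores, key=lambda s: (-s[0], s[1]))]


# Hypothetical sample documents, for illustration only.
documents = [
    "Quarterly report on competitor pricing and market share",
    "Customer complaint about delayed delivery",
    "Survey of customer satisfaction with delivery times",
]
ranking = rank_documents("customer complaint", documents)
print(ranking)  # [1, 2, 0]
```

The complaint document ranks first because it matches both query terms, and the rarer term ("complaint") carries more weight than the more common one ("customer").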

Among all real-world applications of text analytics, trend analysis is easily one of the most popular. As the name suggests, it involves studying a particular entity (such as the type of customer service request or technical issue raised) and how it changes over a period of time. However, trend analysis need not be limited to data collected internally at a company: popular social networking platforms like LinkedIn, Facebook and Twitter employ extensive text analytics techniques to determine the trends on their respective platforms, where the trending entity can be a person’s name together with a hobby or achievement, an event, a place or a topic. For instance, Twitter gives its users a list of popular topics, or trending topics, around the world and also allows users to zoom in to a particular locality to find trending topics within that place. Marking keywords with a hashtag (the # symbol placed directly before a word or phrase) is also conventional among Twitter users, serving as an effective marker that allows for easy searching of tweets linked to a given trend.
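
As a rough illustration of hashtag-based trend analysis, the sketch below counts hashtags per day so that the change in a topic’s popularity over time becomes visible. It is a simplified assumption of how such counting might work, not a description of how any platform actually computes its trends; the function name and the sample posts are hypothetical.

```python
import re
from collections import Counter, defaultdict

# A hashtag is taken to be '#' followed by word characters (a simplification).
HASHTAG = re.compile(r"#(\w+)")


def hashtag_counts_by_day(posts):
    """Map each day to a Counter of the hashtags seen in that day's posts."""
    by_day = defaultdict(Counter)
    for day, text in posts:
        # Lowercase so that '#Launch' and '#launch' count as the same topic.
        by_day[day].update(tag.lower() for tag in HASHTAG.findall(text))
    return by_day


# Hypothetical sample posts as (day, text) pairs, for illustration only.
posts = [
    ("2024-01-01", "Loving the new phone #launch"),
    ("2024-01-02", "Queues everywhere #Launch #city"),
    ("2024-01-02", "Battery life is great #launch"),
]
counts = hashtag_counts_by_day(posts)
print(counts["2024-01-01"]["launch"])  # 1
print(counts["2024-01-02"]["launch"])  # 2
```

Comparing the counts for the same hashtag across consecutive days gives a crude measure of whether a topic is rising or fading.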

Figure 2: Trends within a Twitter analytics dashboard

Given the wide-ranging uses of text analytics described so far, it is not hard to see its attractiveness for business applications. Effective applications of text analytics can help an organisation obtain potentially valuable business insights from text-based documents such as emails, Word documents, and even comments on social media platforms. In addition, effective and reliable market intelligence allows for accurate and confident decision-making in marketing, such as identifying market opportunities, developing market penetration strategies and defining relevant marketing metrics.

The enormous potential of text analytics for improving business insights and strategies has received acclaim from various business leaders around the globe, one of whom is Jeff Catlin, CEO of Lexalytics:

“Text analytics is invaluable to anyone facing profound information overload. Instead of pouring through reams of online information, users can leverage text analytics to monitor and proactively adjust to the latest conditions ‘on the ground’. In the case of ExecDex, any business leader or professional concerned about public and market perceptions can easily see where their executive ranks. Not only that, but information on themes and other entities linked to that executive are shown at-a-glance, providing much deeper understanding and awareness of the issues shaping that perception.”

As it stands now, the business applications of text analytics are wide-ranging, and they include:

  1. Knowledge and Human Resource Management
  2. Customer Relationship Management and Market Analysis
  3. Natural Language Processing and Multilingual Aspects


1. Knowledge and Human Resource Management

Competitive Intelligence:

Many companies need to organise and adapt their strategies according to market demands and the opportunities available, and this in turn requires them to obtain, manage and analyse enormous amounts of data pertaining to themselves, the market and their competitors. Manually compiling documents into reports according to a user’s needs and preferences is very labour intensive, and the problem is further amplified when the reports need to be updated frequently.

Thus, the aim of Competitive Intelligence is to select only the relevant information by reading this data automatically, which is essentially a business application of text analytics. Once the material has been collected, it is classified into the relevant categories and subsequently analysed to answer specific questions that are crucial to company strategy.

Extraction Transformation Loading:

Business applications of text analytics have also found their way into Extraction, Transformation, Loading (ETL). In this context, ETL aims to file unstructured data into categories and structured fields. The data can come from any source, so text analytics is useful for extracting it and classifying it into the relevant categories for subsequent analysis.
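
The idea of filing unstructured text into structured fields can be sketched as follows. This is a minimal example under stated assumptions: the regular expressions are deliberately simplified, and the field names and the sample message are hypothetical.

```python
import re

# Simplified patterns, assumed for illustration; real data needs stricter rules.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_record(text):
    """Extract a few structured fields from a free-text message."""
    email = EMAIL.search(text)
    date = ISO_DATE.search(text)
    return {
        "email": email.group(0) if email else None,
        "date": date.group(0) if date else None,
        "text": text.strip(),
    }


# Hypothetical sample message, for illustration only.
message = "Order placed on 2024-03-05 by jane.doe@example.com, please confirm."
record = extract_record(message)
print(record["email"])  # jane.doe@example.com
print(record["date"])   # 2024-03-05
```

Once each message has been reduced to a record like this, the records can be loaded into a table or database and analysed with conventional tools.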

Human Resource Management:

Text analytics techniques are also used to manage human resources, mainly for analysing staff opinions, monitoring the level of employee satisfaction, and reading and storing CVs for the selection of new personnel.

Figure 3: A CV Document Analyser Interface

2. Customer Relationship Management and Market Analysis

Customer Relationship Management (CRM):

In the context of CRM, huge volumes of unstructured text documents (relating to customer experience management, customer relationship management, and customer service quality) are produced from a variety of sources in contact centres. Text analytics is invaluable for automating the analysis of such documents (for example, feedback surveys), thereby increasing the overall efficiency of evaluating them.
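
One simple way to automate a first pass over feedback surveys is lexicon-based sentiment scoring, sketched below. The word lists are tiny, hypothetical examples chosen for illustration; a real system would need a far richer lexicon or a trained model.

```python
import re

# Hypothetical, deliberately tiny sentiment lexicons (assumptions for this sketch).
POSITIVE = {"great", "helpful", "fast", "friendly"}
NEGATIVE = {"slow", "rude", "broken", "late"}


def score_feedback(comment):
    """Return (positive word count) minus (negative word count)."""
    words = re.findall(r"[a-z']+", comment.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def label_feedback(comment):
    """Label a comment as positive, negative or neutral from its score."""
    score = score_feedback(comment)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(label_feedback("The agent was friendly and helpful"))       # positive
print(label_feedback("Delivery was late and the box was broken"))  # negative
print(label_feedback("I called on Monday"))                        # neutral
```

Even a crude labelling like this lets a contact centre route negative comments for prompt follow-up instead of reading every survey by hand.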

Figure 4: Application of Text Analysis in CRM

Market Analysis:

Text analytics is used mainly to analyse competitors, monitor customers’ opinions, identify potential new customers, and assess a company’s image through the analysis of press reviews and other relevant sources. Companies are increasingly looking to apply this technique to derive useful operational and business insights in order to improve their market prospects.

3. Natural Language Processing and Multilingual Aspects

Questioning in Natural Language:

The most important business application of the linguistic capabilities developed within text analytics is the construction of websites that support natural-language questioning, that is, systems that let visitors ask questions in everyday language. Companies that conduct a significant part of their business on the web in particular need to apply text analytics techniques so that their web pages cater to customers’ needs as closely as possible.

Multilingual Applications of Natural Language Processing:

In Natural Language Processing, text analytics applications are also common, and they are often characterised by multilingualism. One example is the use of text analytics techniques to identify and analyse web pages published in different languages.
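
A first step in analysing pages published in different languages is working out which language each page is written in. The sketch below guesses the language by counting common function words; the word lists are small, hypothetical samples, and this heuristic is only one of several possible approaches.

```python
import re

# Small, hypothetical lists of common function words (assumptions for this sketch).
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in"},
    "french": {"le", "la", "et", "est", "de", "les", "des"},
    "german": {"der", "die", "und", "ist", "das", "nicht"},
}


def guess_language(text):
    """Return the language whose function words occur most often, or None."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        language: sum(word in vocabulary for word in words)
        for language, vocabulary in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_language("The price of the product is high"))  # english
print(guess_language("Le prix de la voiture est élevé"))    # french
print(guess_language("Der Preis ist nicht hoch"))           # german
```

Once the language is known, each page can be passed to the tokeniser, lexicon or model appropriate for that language.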

Data means information, and knowledge is power. However, the knowledge obtained from data is contingent on our ability to collect, extract and utilise it to our advantage. With analytical tools being developed alongside the exponential increase in data, it is possible to cope with the sheer amount of data being generated. In particular, text analytics is intrinsically valuable for business applications in handling and managing data, and it will continue to thrive.