Deng Shasha: Let the Computer Understand Social Media Better, When Artificial Intelligence Meets Linguistics

Time: February 22, 2019

The rapid growth of social media and online communities has dramatically changed how communication takes place. Everyone has become a publisher and disseminator of online information. Online text can therefore be regarded as data generated by humans acting as subjective, intelligent sensors: it records people's views of the world after they have observed and digested it. Text analytics is thus a good way to mine people's views on a given matter and the behaviors behind them.

At present, the biggest challenge in applying text data analysis is how to use existing, imperfect natural language processing technologies (including information technology, machine learning, and data technology) to create value for enterprises and society. Figure 1 shows the relationship between the maturity of text analysis technologies and their business value. Compared with topic analysis and emotion analysis, the analysis of speech acts and interaction, user modeling, and emerging user instances bring more business value to enterprises. Sense-making, for which information quality and coherence are significant, could be used proactively to maximize the business value of social media.

When Artificial Intelligence Meets Linguistics

Sense-making is an information-processing task that serves as a critical prerequisite for decision-making. Despite their various benefits, social media technologies present two important challenges for sense-making: the need for greater coherence and better understanding of actions. Enhanced coherence and a better understanding of the actions inherent in language are both critical for comprehending communicative context and intentions (Te’eni 2006), and for facilitating sense-making.

Three important aspects of language are semantics, syntax, and pragmatics (Winograd and Flores 1986). Numerous prior technologies that support analysis of computer-mediated communication content have emphasized the semantics of language with particular focus on topics and sentiments of discussion; that is, what people are saying (Abbasi and Chen 2008). The Language-Action Perspective (LAP) emphasizes pragmatics; not what people say, but rather, what people do with language (Winograd and Flores 1986). LAP’s principles are based on several important theories, including Speech Act Theory (Searle 1969). Speech Act Theory (SAT) emphasizes the ordinary speaking view of language, where language is one category of human action: an open collection of speech acts. Language is a social fact and its primary function is to promote sense-making in social interactions (Lyytinen 1985; Kuo and Yin 2011).
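The difference between what a message says and what it does can be illustrated with a toy labeler. This is only a sketch: the cue patterns, the label set (loosely following Searle's categories of assertive, directive, commissive, and expressive acts), and the function name `label_speech_act` are assumptions made for illustration, not the method used in the research described here, which would typically learn such cues from labeled data.

```python
import re

# Hypothetical surface cues, checked in order. Illustrative only.
CUES = [
    ("directive", re.compile(r"\b(please|could you|can you|should)\b|\?$", re.I)),
    ("commissive", re.compile(r"\b(i will|i'll|we will|i promise)\b", re.I)),
    ("expressive", re.compile(r"\b(thanks|thank you|sorry)\b", re.I)),
]


def label_speech_act(message: str) -> str:
    """Return a coarse speech act label for one message."""
    text = message.strip()
    for label, pattern in CUES:
        if pattern.search(text):
            return label
    # Default: treat the message as a statement of fact or opinion.
    return "assertive"


# Two messages on the same topic (semantics) that perform different acts (pragmatics).
statement = label_speech_act("The report is late.")
request = label_speech_act("Could you send the report?")
```

Both example messages share a topic, so a topic-based analysis would treat them alike, while the labeler separates a statement from a request.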

Let the Computer Understand Social Media Better

The conventional information systems perspective stresses the content of messages rather than the participants’ interactive behavior and its effects. In contrast, LAP emphasizes “what people do by communicating, how language is used to create a common basis for communication partners, and how their activities are coordinated through language”. LAP principles may provide important insights for the design and development of text analytics tools capable of improving sense-making from online discourse.

We propose a LAP-based framework for analyzing online discourse (see Figure 2). The framework is predicated on the notion that methods that employ LAP principles can be used to facilitate enhanced (1) conversation disentanglement, (2) coherence analysis, and (3) message speech act classification. Moreover, these three components can collectively improve sense-making capabilities by providing an enhanced representation of coherence relations and communication actions through the use of Speech Act Trees. The proposed framework and related research hypotheses are discussed in the remainder of the section.

Figure 2: A LAP-based Framework to Support Sense-making in Online Discourse
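One way to picture how the outputs of the three components could be combined into a Speech Act Tree is the minimal sketch below. It assumes each message already carries a conversation id (from disentanglement), a reply-to link (from coherence analysis), and a speech act label (from classification). The class names, field names, and sample messages are hypothetical and are not taken from the system described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Message:
    msg_id: str
    author: str
    text: str
    conversation: str        # assumed output of conversation disentanglement
    reply_to: Optional[str]  # assumed output of coherence analysis
    speech_act: str          # assumed output of speech act classification


@dataclass
class Node:
    message: Message
    children: List["Node"] = field(default_factory=list)


def build_trees(messages: List[Message]) -> Dict[str, List[Node]]:
    """Group messages by conversation and link each reply under its parent."""
    nodes = {m.msg_id: Node(m) for m in messages}
    roots: Dict[str, List[Node]] = {}
    for m in messages:
        parent = nodes.get(m.reply_to) if m.reply_to else None
        if parent is not None and parent.message.conversation == m.conversation:
            parent.children.append(nodes[m.msg_id])
        else:
            # No usable parent: the message starts a tree in its conversation.
            roots.setdefault(m.conversation, []).append(nodes[m.msg_id])
    return roots


def render(node: Node, depth: int = 0) -> List[str]:
    """Return indented lines showing who did what in reply to whom."""
    m = node.message
    lines = ["  " * depth + f"[{m.speech_act}] {m.author}: {m.text}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical interleaved messages from two conversations.
messages = [
    Message("m1", "ann", "Can someone review the draft?", "c1", None, "directive"),
    Message("m2", "bob", "I will do it today.", "c1", "m1", "commissive"),
    Message("m3", "cat", "Lunch at noon?", "c2", None, "directive"),
    Message("m4", "ann", "Thanks, Bob.", "c1", "m2", "expressive"),
]
trees = build_trees(messages)
```

Calling `render(trees["c1"][0])` yields an indented outline of the first conversation in which each line shows the act performed and its position in the reply structure, while the unrelated message in conversation `c2` is kept in its own tree.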

Lab Experiment and Field Experiment

LTAS, the text analytics system we developed based on the framework, has three major components: conversation disentanglement, coherence analysis, and speech act classification. We rigorously evaluated the system in a series of experiments that demonstrate the utility of each individual component (and of the underlying framework) in comparison with existing benchmark methods.

System performance does not always correlate with user performance, especially for complex tasks (Turpin and Scholer 2006). We therefore evaluated the effectiveness of the Speech Act Trees (SATrees) generated by LTAS in assisting users with sense-making. Four questions were used in the experiment. As the results of the experiment show, LTAS facilitated demonstrably better sense-making than the comparison methods, allowing users to better understand the actions, situated actions, and symbolic actions inherent in online group discussion.

Furthermore, the results of a user experiment involving hundreds of practitioners, and a four-month field experiment in a large organization, underscore the enhanced sense-making capabilities afforded by text analytics grounded in LAP principles. The results have important implications for online sense-making and social media analytics.