Filtering COMPS Chat Transcripts for Computer Modeling Using Common Vocabulary
Arts and Sciences
Computing and Information Sciences
The Computer Mediated Problem Solving (COMPS) project aims to create a web-delivered problem-solving environment for collaborative student learning. One planned feature is real-time computer monitoring of chat dialogs, which will inform instructors of the status and productivity of student discussions. This work addresses the challenges of preparing typed chat from a variety of student exercises for computer analysis. Computer identification of productive chat will rely on a vocabulary of common English words that are not tied to any specific student exercise. Here we report on software and data resources for maintaining lexicons harvested from chat transcripts. This software aids in discovering vocabulary common to diverse chatting milieus and makes the vocabularies available for research and for machine processing of the text. Typed chat also presents lexical challenges, among them words stretched looooonger or *starred* for emphasis, along with rampant spelling errors and abbreviations. Algorithms developed to address these issues are presented here.
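To make the lexical challenges concrete, the following is a minimal Python sketch of how stretched and starred words might be normalized and how a vocabulary shared across transcripts might be found. It is not the COMPS software. The function names `normalize_token` and `common_vocabulary`, the toy lexicon, and the rule of collapsing letter runs to two letters and then to one are all assumptions made for illustration. Spelling errors and abbreviations are not handled.

```python
import re

# Three or more repeats of one character, e.g. "ooooo" in "looooonger".
_STRETCH = re.compile(r"(.)\1{2,}")
# A word wrapped in asterisks for emphasis, e.g. "*starred*".
_STARRED = re.compile(r"\*(\w+)\*")
# Crude tokenizer (an assumption): word characters, asterisks, apostrophes.
_TOKEN = re.compile(r"[\w*']+")


def normalize_token(token, lexicon):
    """Lowercase a token, strip emphasis stars, and undo letter stretching.

    A stretched run is collapsed to two letters, then to one, and the first
    candidate found in the lexicon wins. Tokens that still do not match are
    returned lowercased and unstarred but otherwise unchanged.
    """
    token = _STARRED.sub(r"\1", token.lower())
    if token in lexicon:
        return token
    for keep in (2, 1):
        candidate = _STRETCH.sub(lambda m: m.group(1) * keep, token)
        if candidate in lexicon:
            return candidate
    return token


def common_vocabulary(transcripts, lexicon):
    """Return the normalized words that occur in every transcript."""
    vocabularies = [
        {normalize_token(tok, lexicon) for tok in _TOKEN.findall(text)}
        for text in transcripts
    ]
    return set.intersection(*vocabularies) if vocabularies else set()


# Hypothetical lexicon and transcripts, for illustration only.
lexicon = {"i", "think", "this", "is", "so", "right",
           "the", "answer", "longer", "cool"}
transcripts = [
    "I think this is sooo *right*",
    "i THINK the answer is looooonger",
]
shared = common_vocabulary(transcripts, lexicon)
```

Trying the two-letter collapse before the one-letter collapse lets "coool" resolve to "cool" rather than "col", while "looooonger" still falls through to "longer". A real lexicon would also need a policy for words where both candidates are valid.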
Bouman, Nathaniel I., "Filtering COMPS Chat Transcripts for Computer Modeling Using Common Vocabulary" (2017). Summer Interdisciplinary Research Symposium. 5.