Large Language Models (LLMs) have been touted as extremely effective for representing and generating language, with ChatGPT as a prime example. However, how do LLMs hold up against popular community speech patterns, such as code-switching between multiple languages? What challenges arise for spoken languages without a standardized orthography, such as Cypriot Greek? This talk will outline the speaker’s experience tackling both questions and survey current approaches from other researchers in NLP and computational linguistics.
Rebecca Pattichis is currently working with Dr. Spyros Armostis at the University of Cyprus to develop a semi-automatic pipeline for converting unnormalized Cypriot Greek text into its International Phonetic Alphabet representation. She completed her MS in Computer Science as a Google DeepMind Fellow under Dr. Nanyun Peng at the University of California, Los Angeles, where she worked on analyzing transcribed bilingual speech patterns. She completed her BS with Honors under Dr. Christopher Manning at Stanford University, where she specialized in Artificial Intelligence and developed her interest in LLMs for code-switching tasks. Her research interests lie in centering multilingualism to analyze how popular language models’ assumptions may break, as well as in collaborating with community linguists to develop contextual language technologies.