Towards AI Models That Can Visually Understand the World’s Cultures
Graham Neubig, Carnegie Mellon University
Fri, 11/22 · 4:30 pm–6:00 pm · 006 Friend Center
Center for Digital Humanities
Neubig will discuss a new frontier in AI: vision-language models that understand the world’s cultures. First, he will talk through the training of multilingual, multimodal, multicultural models that understand images and text and are better able to answer culture-specific questions about multimodal data. Next, he will discuss work on “image transcreation,” in which models transform images to make them more relevant to a particular culture. This work has applications in several areas, such as the cultural localization of educational materials (to accompany translated text). The talk will focus on examples from the African context and the challenges we currently face there.
Graham Neubig is an associate professor at the Language Technologies Institute of Carnegie Mellon University. His research focuses on natural language processing, with a particular interest in the fundamentals, applications, and understanding of large language models for tasks such as question answering, code generation, and multilingual applications. His ultimate goal is for every person in the world to be able to communicate with each other, and with computers, in their own language. He also contributes to making NLP research more accessible through open publishing of research papers, advanced NLP course materials and video lectures, and open-source software, all of which are available on his website.
Series co-sponsors and supporters include the Africa World Initiative, the Program in African Studies, and the Princeton African Humanities Colloquium.