The Multilingual Multimodal Travelers App
College:
The Dorothy and George Hennings College of Science, Mathematics, and Technology
Major:
Computer Science
Faculty Research Advisor(s):
Yulia Kumar, J. Jenny Li
Abstract:
This study examines "The Multilingual Eyes Multimodal Traveler’s App" (MEMTA), an application at the intersection of travel technology and Artificial Intelligence (AI). MEMTA distinguishes itself through the integration of advanced AI technologies, including multimodal Large Language Models (LLMs) such as GPT-4, YOLOv8 object detection, and the Whisper API, to provide navigational assistance and situational awareness for a diverse user base that includes tourists and visually impaired individuals.
The core of our study is an assessment of MEMTA's capabilities in real-time multilingual translation, pronunciation, and context awareness. The investigation shows how MEMTA combines these AI components to improve the user experience across varied geographical settings: it interprets the visual world into actionable insights and bridges language barriers, facilitating communication and interaction in multilingual contexts.
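To make the described fusion of components concrete, the sketch below illustrates one plausible way object detections (e.g., from YOLOv8) could be summarized into a multilingual scene-description prompt for an LLM such as GPT-4. This is a minimal illustrative sketch, not MEMTA's actual implementation; the function name, the 0.5 confidence threshold, and the prompt wording are all assumptions.

```python
# Hypothetical sketch: turning object-detection output into an LLM prompt
# requesting a scene description in the traveler's language.
# Names, threshold, and prompt text are illustrative assumptions,
# not taken from the MEMTA implementation.

def build_scene_prompt(detections, target_lang):
    """Summarize (label, confidence) detections into a prompt asking an LLM
    for a brief scene description in target_lang (e.g. "Spanish")."""
    # Keep only confident detections and count repeated labels.
    counts = {}
    for label, conf in detections:
        if conf >= 0.5:  # assumed confidence cutoff
            counts[label] = counts.get(label, 0) + 1
    scene = ", ".join(
        f"{n} {label}" + ("s" if n > 1 else "")
        for label, n in sorted(counts.items())
    )
    return (
        "You are a travel assistant for a visually impaired user. "
        f"The camera sees: {scene}. "
        f"Describe the scene briefly in {target_lang}."
    )

# Example with YOLOv8-style (label, confidence) pairs:
prompt = build_scene_prompt(
    [("person", 0.91), ("person", 0.88), ("bus", 0.76), ("dog", 0.32)],
    "Spanish",
)
```

In a full pipeline, the returned prompt would be sent to the LLM, and the response could be synthesized to speech alongside Whisper-transcribed user queries; those API calls are omitted here to keep the sketch self-contained.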
Moreover, this study extends the understanding of MEMTA's applications beyond conventional travel assistance, identifying potential in sectors such as robotics, virtual reality, and military operations. The exploration of these fields highlights MEMTA's role in advancing human-AI interaction, offering solutions to complex challenges, and improving operational efficiency and safety in high-stakes environments.
Through this exploration, the study contributes novel insights into the fields of AI-enhanced travel, assistive technologies, and the broader scope of human-AI interaction.