Navigating Adversarial Prompts to Secure Large Language Models

Grant: LSAMP

Dejaun Gayle

College:
The Dorothy and George Hennings College of Science, Mathematics, and Technology

Major:
Computer Science

Faculty Research Advisor(s):
Yulia Kumar

Abstract:
This study examines how well state-of-the-art large language models (LLMs), such as ChatGPT, Microsoft Copilot, and Gemini, and AI-driven image generators such as DALL-E 3 withstand adversarial prompts. These models can be manipulated, inadvertently or intentionally, into generating responses or images that violate the ethical and security guidelines established by their developers. The study uses prompt-engineering and 'jailbreaking' techniques to uncover subtle yet significant vulnerabilities, and in doing so presents a methodology for robustness testing of AI systems. The approach highlights the need for stronger AI defenses and sheds light on the interplay between AI innovation and ethical integrity.

At the heart of these findings is a call for proactive, ongoing improvement of AI technologies to keep them secure. By identifying current shortcomings and vulnerabilities, this research contributes to the wider discourse on responsible AI use and emphasizes the need for robust ethical frameworks and advanced security protocols. The researchers propose practical strategies to fortify AI models against adversarial threats, with the goal of a digital ecosystem in which ethical compliance and digital security are paramount. Future directions for this research include refining these models further and incorporating new data modalities such as voice and video.

