MIDS Capstone Project Spring 2024

Juniper: Privacy Interface for Large Language Model Interaction

Summary

Our project, JUNIPER, is a proxy interface that lets individuals and organizations unlock the potential of Large Language Models (LLMs) while preserving user and organizational privacy. As an initial proof of concept, it is applied to medical diagnosis, a domain where large quantities of private data are present. With JUNIPER, individuals and companies can use the power of LLMs without fearing leaks of sensitive data.

Mission Statement

At our core, we are committed to enabling individuals and organizations to unlock the vast potential of Large Language Models while steadfastly upholding the paramount importance of privacy.

Background

Privacy breaches and data exposure are significant concerns with the use of LLMs. Users might unintentionally share sensitive information in their LLM prompts, which can be accessed by LLM providers and potentially utilized elsewhere. Organizations are subject to strict regulations such as GDPR, which dictate the handling of personal data. Furthermore, traditional data anonymization methods, while intended to protect privacy, can sometimes compromise the effectiveness of downstream tasks.

MVP

Our MVP focuses on three core objectives essential for the effectiveness and user-friendliness of our system. First, private data in prompts is redacted or replaced before the prompts reach an external LLM provider such as OpenAI, supporting privacy and compliance with regulations. Second, the integrity of the diagnostic process is maintained: the diagnosis produced from a treated prompt should be consistent with the one produced from the original prompt, which bolsters confidence in system accuracy. Third, user autonomy is prioritized by allowing users to review and modify treated prompts before they are sent, which empowers users and enhances trust. By addressing these aspects, our MVP aims to deliver a comprehensive and user-centric solution for medical diagnosis while upholding privacy and accuracy standards.
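
To make the first objective concrete, here is a minimal sketch of the redact-then-restore idea. It is an illustration under stated assumptions, not JUNIPER's actual implementation: the regular expressions, the placeholder format, and the function names `redact` and `restore` are all hypothetical, and a real system would rely on a dedicated PII detector rather than a handful of patterns.

```python
import re

# Hypothetical patterns for illustration only; they cover just three
# simple kinds of identifiers and are not a complete PII detector.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt):
    """Replace detected values with numbered placeholders.

    Returns the treated prompt and a placeholder-to-original mapping
    that stays on the user's side and is never sent to the LLM.
    """
    mapping = {}
    for label, pattern in PATTERNS.items():

        def substitute(match, label=label):
            value = match.group(0)
            # Reuse the same placeholder when a value appears again.
            for placeholder, original in mapping.items():
                if original == value:
                    return placeholder
            count = len([k for k in mapping if k.startswith(f"<{label}_")])
            placeholder = f"<{label}_{count + 1}>"
            mapping[placeholder] = value
            return placeholder

        prompt = pattern.sub(substitute, prompt)
    return prompt, mapping


def restore(text, mapping):
    """Put the original values back into text returned by the LLM."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


if __name__ == "__main__":
    original = "Patient jane.doe@example.com, phone 555-123-4567, reports chest pain."
    treated, mapping = redact(original)
    print(treated)
    print(restore(treated, mapping) == original)
```

Because the treated prompt and the mapping are ordinary values, a user interface could display the treated prompt for review and editing before anything is sent, which is one way the third objective could be supported.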

Data Sources

We used multiple datasets to address the complex problem of privacy-preserving computation for LLMs.

Motivation
Microsoft Presidio architecture
RAG & two tower architecture

Video

JUNIPER: Privacy Interface for Large Language Model Interaction



Last updated: April 17, 2024