Breakdown of Video's Main Components

Peek into Prototyping

This project was broken into four major components. Three are explained below; the fourth, audio recording, was handled by PvRecorder as a temporary local solution.
Mockup
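
Audio recording is not covered in detail here, but for the curious, a minimal sketch of a local capture step with PvRecorder might look like the following. The function names, the 512-sample frame length, and the fixed-duration loop are illustrative assumptions, not the project's actual code; the sketch also assumes PvRecorder's usual 16 kHz, 16-bit mono output.

```python
import struct
import wave

SAMPLE_RATE = 16000  # assumption: PvRecorder delivers 16 kHz, 16-bit mono samples


def save_frames_as_wav(frames, path, sample_rate=SAMPLE_RATE):
    """Write a list of int16 sample frames to a mono 16-bit WAV file.

    Returns the number of samples written.
    """
    samples = [sample for frame in frames for sample in frame]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return len(samples)


def record_seconds(seconds, path, frame_length=512):
    """Record roughly `seconds` of audio from the default microphone (hypothetical helper)."""
    # Imported lazily so the WAV helper above works without the package installed.
    from pvrecorder import PvRecorder  # third-party: pip install pvrecorder

    recorder = PvRecorder(frame_length=frame_length, device_index=-1)
    frames = []
    recorder.start()
    try:
        for _ in range(int(seconds * SAMPLE_RATE / frame_length)):
            frames.append(recorder.read())  # one frame = list of int16 samples
    finally:
        recorder.stop()
        recorder.delete()
    return save_frames_as_wav(frames, path)
```

Calling `record_seconds(10, "appointment.wav")` would then produce a file that a speech-to-text model can consume.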

Customizing ZeroWidth LLM RAG

I selected GPT-4 as the base model because its newly raised context window of 128,000 tokens can accommodate lengthy appointments. I then instructed it to act as a medical translator and specified which knowledge sets to pull from. As my source, I embedded the American Association of Diabetes Glossary of Terms, splitting it into 6 knowledge sets to help with search speed and accuracy.
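
To make the idea of knowledge sets concrete, here is a rough sketch of partitioning a glossary into six sets and looking up the terms that appear in a transcript. It is not ZeroWidth's API: the alphabetical split, the function names, and the plain keyword match (standing in for embedding-based search) are all assumptions made for illustration.

```python
def split_into_knowledge_sets(glossary, n_sets=6):
    """Partition a {term: definition} glossary into n_sets alphabetical chunks.

    Assumption: an even alphabetical split; the real split may differ.
    """
    terms = sorted(glossary, key=str.lower)
    base, extra = divmod(len(terms), n_sets)
    knowledge_sets, start = [], 0
    for i in range(n_sets):
        # The first `extra` sets take one additional term each.
        end = start + base + (1 if i < extra else 0)
        knowledge_sets.append({t: glossary[t] for t in terms[start:end]})
        start = end
    return knowledge_sets


def find_terms(transcript, knowledge_sets):
    """Return definitions for glossary terms mentioned in the transcript.

    Uses a case-insensitive substring match as a simple stand-in
    for the embedding search a RAG system would perform.
    """
    text = transcript.lower()
    hits = {}
    for knowledge_set in knowledge_sets:
        for term, definition in knowledge_set.items():
            if term.lower() in text:
                hits[term] = definition
    return hits


def build_prompt(transcript, hits):
    """Assemble a hypothetical prompt that casts the model as a medical translator."""
    context = "\n".join(f"- {term}: {definition}" for term, definition in hits.items())
    return (
        "You are a medical translator. Explain the appointment in plain language.\n"
        f"Glossary entries:\n{context}\n\nTranscript:\n{transcript}"
    )
```

Smaller sets mean each lookup scans fewer entries, which is one plausible reading of why splitting the glossary helps with search speed.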

Mockup

OpenAI Whisper: Speech-to-Text

Whisper is a deep learning model trained on 680,000 hours of multilingual audio data. Because accuracy is the most critical factor in medical settings, I chose this model to transcribe the original audio recording.
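
A minimal sketch of the transcription step, assuming the open-source `openai-whisper` package and its `base` checkpoint (the model size the project actually used is not stated). The `transcribe` and `format_segments` helpers are hypothetical names for illustration.

```python
def format_segments(segments):
    """Turn Whisper-style segments into '[MM:SS] text' lines."""
    lines = []
    for segment in segments:
        minutes, seconds = divmod(int(segment["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {segment['text'].strip()}")
    return lines


def transcribe(audio_path, model_name="base"):
    """Transcribe an audio file; returns (full_text, timestamped_lines)."""
    # Imported lazily so format_segments can be used without the package.
    import whisper  # third-party: pip install openai-whisper (requires ffmpeg)

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["text"].strip(), format_segments(result["segments"])
```

Timestamped lines make it easier to line up a plain-language explanation with the moment in the appointment where a term was spoken.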

Mockup

Multimedia Display Interface

Streamlit is an open-source, Python-based framework that lets users build an application with minimal code and supports multimedia. Because I wanted the tool to be accessible to people of all socioeconomic and cultural groups, I needed multimedia ways of presenting information.
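
As a rough illustration of how little code such an interface needs, the sketch below lays out an audio player, a transcript, and expandable term explanations. The page layout, the labels, and the `render_appointment` helper are assumptions; the Streamlit module is passed in as `st` so the function stays easy to test.

```python
def render_appointment(st, transcript, explanations, audio_path=None):
    """Render one appointment using the Streamlit module passed in as `st`.

    `explanations` maps a medical term to its plain-language definition.
    """
    st.title("Appointment summary")  # hypothetical page title
    if audio_path:
        st.audio(audio_path)  # playback of the original recording
    st.subheader("Transcript")
    st.write(transcript)
    st.subheader("Terms explained")
    for term, definition in explanations.items():
        # Each term gets a collapsible section to keep the page uncluttered.
        with st.expander(term):
            st.write(definition)
```

In an app script, this would be used by adding `import streamlit as st`, calling `render_appointment(st, transcript, explanations, "appointment.wav")`, and launching the file with `streamlit run`.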

Still curious?

Reach out to learn more about the project. I promise we've got lots more in store :)