PatientPrompt: ThoughtTree’s Powerful Architecture

ThoughtTree’s infrastructure is engineered upon our state-of-the-art Large Language Model technology.

Background

Related Works

MedPrompt and MedPrompt+: These leverage dynamic few-shot prompting, self-consistency ensembling, and output verification to enhance QA accuracy in medical contexts.
Med-PaLM 1 and 2: Developed by Google, these models integrate fine-tuning with medical datasets to improve domain-specific reasoning, surpassing generic LLMs on clinical benchmarks.

While these frameworks excel in structured QA tasks, they do not sufficiently address long-form clinical text generation, a gap that PatientPrompt seeks to fill.

Introducing PatientPrompt

To address this gap, we introduce PatientPrompt, a framework optimized for generating structured, clinically relevant responses from long-form clinical session notes. The DSM-5 TR serves as the authoritative classification system for psychiatric disorders, providing structured diagnostic criteria, disorder definitions, and symptomatology. PatientPrompt integrates multiple retrieval-augmented generation (RAG) modules, each aligned with a specific subdomain within Section II of the DSM-5 TR. These RAGs leverage a structured retrieval process to ensure clinically coherent outputs by combining diagnostic definitions with supporting peer-reviewed studies. To enhance retrieval precision, we maintain a curated vector store of published research, filtered based on relevance, citation impact, and clinical significance.
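
To make the architecture concrete, here is a minimal sketch of subdomain-aligned retrieval over a curated vector store. The subdomain names, filter thresholds, and retriever stubs are illustrative assumptions, not our production implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Study:
    title: str
    relevance: float          # similarity to the query, 0..1
    citations: int            # citation count as a proxy for impact
    clinically_reviewed: bool # clinical significance flag

def curate(studies: List[Study], min_relevance: float = 0.75,
           min_citations: int = 10) -> List[Study]:
    """Keep only studies that pass the relevance, impact, and clinical filters."""
    return [s for s in studies
            if s.relevance >= min_relevance
            and s.citations >= min_citations
            and s.clinically_reviewed]

# One retrieval callable per DSM-5 TR Section II subdomain (names are illustrative).
RAG_MODULES: Dict[str, Callable[[str], List[Study]]] = {
    "depressive_disorders": lambda q: [],        # placeholder retrievers
    "anxiety_disorders": lambda q: [],
    "trauma_and_stressor_related": lambda q: [],
}

def route_and_retrieve(query: str, subdomain: str) -> List[Study]:
    """Send the query to the RAG module aligned with the detected subdomain."""
    retriever = RAG_MODULES[subdomain]
    return curate(retriever(query))
```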

Why This Matters to You

If you or a loved one has ever faced uncertainty in receiving a mental health diagnosis, you know how important accurate, evidence-backed assessments are. Traditional diagnosis methods often depend on subjective interpretations, leaving room for inconsistencies. PatientPrompt is designed to bridge this gap by leveraging cutting-edge AI technology to provide structured, transparent, and clinically validated insights.

With PatientPrompt, mental health professionals can make better-informed decisions, backed by peer-reviewed data and the latest DSM-5 TR criteria. This means:

  • More clarity on possible diagnoses and alternatives.
  • A deeper understanding of comorbid conditions and their implications.
  • Transparent explanations backed by session notes and clinical standards.
  • Increased confidence in the diagnostic process for both providers and patients.

At its core, PatientPrompt is about empowering individuals and their care teams with reliable, structured information that enhances mental health treatment and outcomes.

Methodology

PatientPrompt employs the following methodology:

  1. Long-Contextual Hierarchical Summarizer
  2. Large Embedding Model (LEM) for Contextual Few-Shot Prompting
  3. Tree-of-Thoughts with Symptom-Consistency (SC-ToT)
  4. Corpus-of-Standards (CoS)
  5. Query Transformations for Enhanced Retrieval
  6. ReAct Agent with Revisional-Relevance Mechanism

Long-Contextual Hierarchical Summarizer

To manage large clinical session notes, PatientPrompt employs a hierarchical summarization mechanism that segments input data based on topic coherence and clinical relevance. This ensures that retrieval and reasoning components operate on contextually segmented, high-relevance data, mitigating information loss and reducing processing overhead.
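
As a rough illustration, the hierarchy can be sketched as follows, assuming the note has already been split into topically coherent segments and that a generic `summarize` callable wraps the underlying LLM; the pairwise grouping and character budget are simplifying assumptions.

```python
from typing import Callable, List

def hierarchical_summarize(segments: List[str],
                           summarize: Callable[[str], str],
                           max_chars: int = 4000) -> str:
    """Summarize each coherent segment, then recursively summarize the
    summaries until the combined text fits the working context budget."""
    summaries = [summarize(seg) for seg in segments]
    combined = "\n".join(summaries)
    if len(combined) <= max_chars or len(summaries) == 1:
        return summarize(combined)
    # Group neighboring summaries and recurse one level up the hierarchy.
    grouped = ["\n".join(summaries[i:i + 2]) for i in range(0, len(summaries), 2)]
    return hierarchical_summarize(grouped, summarize, max_chars)
```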

Large Embedding Model (LEM) for Contextual Few-Shot Prompting

PatientPrompt incorporates dynamic few-shot prompting for its Large Embedding Model (LEM), drawing on the methodology of the state-of-the-art paper Making Text Embedders Few-Shot Learners. Semantically similar data is supplied as examples not only in our prompt engineering but also to the LEM itself during retrieval. This allows the system to dynamically adjust retrieval strategies based on contextual relevance, ensuring that examples remain clinically pertinent across diverse patient cases.
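
Conceptually, this means selecting the most similar prior cases and prepending them to the text before it is embedded. The sketch below assumes a generic `embed` callable and an illustrative example template; it is not the exact prompt format from the paper or from our system.

```python
import math
from typing import Callable, List, Sequence, Tuple

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-9)

def select_examples(query: str,
                    pool: List[Tuple[str, str]],          # (note excerpt, target output)
                    embed: Callable[[str], List[float]],
                    k: int = 3) -> List[Tuple[str, str]]:
    """Pick the k pool examples most semantically similar to the query."""
    q_vec = embed(query)
    scored = sorted(pool, key=lambda ex: cosine(embed(ex[0]), q_vec), reverse=True)
    return scored[:k]

def few_shot_embed(query: str,
                   examples: List[Tuple[str, str]],
                   embed: Callable[[str], List[float]]) -> List[float]:
    """Prepend in-context examples to the query before embedding, so the
    embedder itself is steered by clinically similar cases."""
    shots = "\n".join(f"Example note: {x}\nExample target: {y}" for x, y in examples)
    return embed(f"{shots}\nQuery: {query}")
```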

Retrieval Augmented Tree-of-Thoughts with Symptom-Consistency (SC-ToT)

Rather than employing standard Chain-of-Thought reasoning, PatientPrompt utilizes a prompt engineering framework consisting of:

  • Tree-of-Thoughts (ToT) decomposes the problem into a series of intermediate steps, or “thoughts”. Instead of following a single path, the model explores multiple potential steps at each stage, forming a tree of possible reasoning paths. Each branch is evaluated on its potential to lead to a successful solution, so the model can backtrack and explore alternative paths as needed.
  • Self-Consistency with Chain-of-Thought (SC-CoT) Reasoning samples a variety of linear reasoning paths through temperature sampling and aggregates them to choose the most consistent answer. This technique captures multiple valid ways to solve a problem, reducing reliance on a single, and possibly incorrect, trajectory. We have built upon this methodology to reduce unintended biases from the DSM-5 and to take a more holistic approach to patient diagnostics.
  • Retrieval Augmented Thoughts (RAT) iteratively revises the reasoning path with contextual retrieval: the model retrieves pertinent information based on the original prompt, the current reasoning step, and previous steps. The retrieved data is used to refine the current step, ensuring accuracy and coherence. This process repeats for each step to build a contextually grounded final response.
  • Symptom-Consistency is how we mitigate unknown biases introduced by using the DSM-5: we (a) cross-validate that the patient’s symptoms align with the Diagnostic Criteria of each candidate disorder and (b) avoid over-diagnosing patients who do not display signs of any known disorder or have no symptoms at all.
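
A minimal sketch of how these pieces interact is shown below. The `propose`, `score`, `retrieve`, `revise`, `conclude`, and `symptoms_match` callables stand in for model- and corpus-specific components; the beam width and depth are illustrative choices, not our production settings.

```python
from collections import Counter
from typing import Callable, List

def sc_tot_diagnose(prompt: str,
                    propose: Callable[[str], List[str]],    # expand a state into candidate "thoughts"
                    score: Callable[[str], float],          # value of a partial reasoning path
                    retrieve: Callable[[str], str],         # RAT: fetch evidence for the current path
                    revise: Callable[[str, str], str],      # refine a path with retrieved evidence
                    conclude: Callable[[str], str],         # read a diagnosis off a finished path
                    symptoms_match: Callable[[str], bool],  # Symptom-Consistency gate
                    depth: int = 3, beam: int = 3) -> str:
    """Explore a beam of reasoning paths, revise each step with retrieval,
    then vote across paths and keep only symptom-consistent diagnoses."""
    paths = [prompt]
    for _ in range(depth):
        candidates = []
        for path in paths:
            for thought in propose(path):
                extended = f"{path}\n{thought}"
                extended = revise(extended, retrieve(extended))  # RAT-style revision
                candidates.append(extended)
        paths = sorted(candidates, key=score, reverse=True)[:beam]  # keep the best branches
    votes = Counter(conclude(p) for p in paths)                     # SC-style aggregation
    for diagnosis, _ in votes.most_common():
        if symptoms_match(diagnosis):                               # Symptom-Consistency check
            return diagnosis
    return "No diagnosis supported by the reported symptoms"
```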

Combining these frameworks into one system allows PatientPrompt to produce sampled, aggregated reasoning paths that are contextually grounded. We employ this prompt engineering for the following (a sketch of the resulting output structure follows the list):

  • Generating multiple diagnostic hypotheses and ranking them based on majority-aligned confidence scores.
  • Providing alternative diagnoses to explore differential diagnostic possibilities.
  • Returning ICD-10 insurance codes for each probable diagnosis.
  • Extracting DSM-5 TR diagnostic criteria for each diagnosis.
  • Citing session notes and other supporting evidence for diagnostic justification.
  • Identifying and considering comorbid conditions based on DSM-5 TR relationships.
  • Drawing on our 50,000+ indexed studies to provide supporting evidence for treatment recommendations.
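
As an illustration only, the structured output described above might be represented with a schema along these lines; the field names are hypothetical rather than our actual API.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DiagnosticHypothesis:
    """One candidate diagnosis produced by the SC-ToT pipeline (illustrative schema)."""
    name: str                          # DSM-5 TR disorder name
    confidence: float                  # majority-aligned confidence across reasoning paths
    icd10_code: str                    # code used for insurance billing
    dsm5_tr_criteria: List[str]        # diagnostic criteria extracted from Section II
    session_note_citations: List[str]  # supporting excerpts from the session notes
    comorbidities: List[str]           # related conditions flagged via DSM-5 TR relationships
    supporting_studies: List[str]      # references drawn from the indexed research corpus
    alternatives: List[str] = field(default_factory=list)  # differential diagnoses
```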

Corpus-of-Standards (CoS)

PatientPrompt integrates a Corpus-of-Standards (CoS), a structured indexing system that refines retrieval by enforcing domain-specific constraints. The corpus pre-filters search results, eliminating incorrect retrievals and improving response relevance. For our Mental Health CoS, we use Section II of the DSM-5 TR for the diagnostic attributes of nearly 300 disorders. We also incorporate over 50,000 peer-reviewed psychology studies from the past 25 years so that each output is backed by the most up-to-date research.
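
A simplified sketch of this kind of pre-filtering, assuming each indexed document carries basic metadata; the fields and thresholds are illustrative assumptions, not the actual CoS schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class CorpusDoc:
    text: str
    source: str          # e.g. "DSM-5 TR Section II" or a journal name
    disorder: str        # disorder the document is indexed under
    year: int
    peer_reviewed: bool

def cos_prefilter(docs: List[CorpusDoc], disorder: str,
                  earliest_year: int = 2000) -> List[CorpusDoc]:
    """Apply Corpus-of-Standards constraints before semantic search runs,
    so the retriever only ever sees in-scope, clinically vetted documents."""
    return [d for d in docs
            if d.disorder == disorder
            and (d.source == "DSM-5 TR Section II"
                 or (d.peer_reviewed and d.year >= earliest_year))]
```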

Query Transformations for Enhanced Retrieval

To optimize information retrieval, PatientPrompt applies multiple query transformation techniques, including:

Step Back Prompting: Adjusting query granularity to enhance retrieval precision.
Query Decomposition: Breaking complex queries into simpler sub-queries to improve interpretability.
Query Rewriting: Reformulating queries to align with retriever-specific optimization heuristics.
Multi-Query Expansion: Generating multiple reworded versions of a query to encapsulate semantic variations.
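
In practice, each transformation is itself a prompt to the model. The sketch below assumes a generic `llm` callable; the templates shown are illustrative rather than our production prompts.

```python
from typing import Callable, List

def step_back(query: str, llm: Callable[[str], str]) -> str:
    """Step Back Prompting: ask for a broader question to adjust retrieval granularity."""
    return llm(f"Rewrite this as a broader, more general question:\n{query}")

def decompose(query: str, llm: Callable[[str], str]) -> List[str]:
    """Query Decomposition: split a complex query into simpler sub-queries (one per line)."""
    return llm(f"Break this question into simpler sub-questions, one per line:\n{query}").splitlines()

def rewrite(query: str, llm: Callable[[str], str]) -> str:
    """Query Rewriting: reformulate the query to match how the corpus is phrased."""
    return llm(f"Rewrite this query using clinical terminology likely to appear in the corpus:\n{query}")

def multi_query(query: str, llm: Callable[[str], str], n: int = 4) -> List[str]:
    """Multi-Query Expansion: generate n reworded variants to cover semantic variations."""
    return llm(f"Give {n} different rewordings of this query, one per line:\n{query}").splitlines()[:n]
```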

ReAct Agent with Revisional-Relevance Mechanism

Each step within the SC-ToT framework incorporates a ReAct (Reasoning + Acting) agent with a Revisional-Relevance mechanism. This agent iteratively revises outputs based on retrieved evidence, ensuring that responses align with DSM-5 TR diagnostic standards and minimizing inconsistencies. We rank our entire CoS by relevance at each step so the agent always works from the most pertinent material.
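
A minimal sketch of such a loop, assuming a generic `llm` callable and a `search_cos` function that returns scored passages from the Corpus-of-Standards; the convergence check and turn limit are simplifying assumptions rather than the production mechanism.

```python
from typing import Callable, List, Tuple

def react_revise(question: str,
                 draft: str,
                 llm: Callable[[str], str],
                 search_cos: Callable[[str], List[Tuple[str, float]]],  # (passage, relevance score)
                 max_turns: int = 3) -> str:
    """ReAct-style loop: reason about the draft, act by querying the CoS,
    then revise the draft against the most relevant evidence (Revisional-Relevance)."""
    for _ in range(max_turns):
        thought = llm(f"Question: {question}\nDraft answer: {draft}\n"
                      "What evidence from the DSM-5 TR or the literature should be checked next?")
        hits = sorted(search_cos(thought), key=lambda h: h[1], reverse=True)[:3]
        evidence = "\n".join(passage for passage, _ in hits)
        revised = llm(f"Revise the draft so it is consistent with this evidence:\n"
                      f"{evidence}\nDraft: {draft}")
        if revised.strip() == draft.strip():   # converged: no further revisions needed
            break
        draft = revised
    return draft
```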

Conclusion

PatientPrompt represents an advancement in prompt engineering, extending capabilities beyond structured QA tasks to complex, long-context reasoning in specialized clinical settings, with a corpus of published material contextually grounding each response. With each generation, we ensure we are delivering the most up-to-date research, empowering our users to provide the best patient care. We are committed to continually improving our systems so that our clients are only supplied with the best.