On Wednesdays, We Ask Questions: Optimizing "Active Listening" in Automated Legal Triage and Referral
arXiv:2606.00272v1 | Announce Type: new
Abstract
The FETCH classifier employs a cost-effective ensemble of Large Language Models (LLMs) to generate follow-up queries designed to pinpoint the most appropriate match for an applicant's legal issue. This study presents an evaluation of this follow-up question methodology within FETCH, combining insights from expert attorneys with LLM-assisted analysis. Our findings indicate that while economical LLMs are effective for classification duties, producing high-quality, plain-language questions in this context demands a more advanced and expensive model.
Through consultations with legal intake personnel, we developed a rubric for assessing the quality of classification questions in legal intake scenarios. The research reveals that relying solely on prompt engineering is insufficient for enhancing the utility of these questions for intake purposes. Furthermore, we observed a discrepancy between ratings provided by LLMs acting as judges and those given by humans.
By integrating a single high-cost model, GPT-5, into the system, the classifier demonstrated an improved ability to draw out pertinent information from applicants seeking legal assistance. This enhancement resulted in greater accuracy for classification tasks. However, the study also highlighted inconsistent fact-finding across various legal categories, including domestic violence. These inconsistencies conflict with established family law screening protocols, underscoring the importance of implementing specialized screening modules for specific areas of law.
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




