arXiv

Solving Zebra Puzzles Using Constraint-Guided Multi-Agent Systems


Original: arXiv:2407.03956v3

Abstract: Previous studies have sought to improve the logic-puzzle-solving capabilities of Large Language Models (LLMs) with methods such as chain-of-thought prompting or the integration of symbolic representations. These frameworks often fall short on intricate logical problems such as Zebra puzzles, primarily because converting natural language clues into formal logical statements is inherently complex. To address this, we present ZPS, a multi-agent system that combines LLMs with a standard theorem prover. The system handles puzzle resolution in three ways: it decomposes the problem into smaller, more tractable components, it produces SMT (Satisfiability Modulo Theories) code that a theorem prover then solves, and it uses iterative feedback among agents to refine their outputs. We also developed an automated grader for grid puzzles to verify solution accuracy, and demonstrated its reliability through a user study. Our methodology yielded performance gains across all three LLMs evaluated, with GPT-4 achieving a 166% increase in the number of fully correct solutions.
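To make the "clues to SMT code" step concrete, here is a minimal sketch in Python. It is not the ZPS encoding, which the abstract does not specify. The three-house puzzle, the variable names, and the helpers `to_smtlib` and `brute_force_solutions` are all assumptions made for illustration. Each entity is modelled as an integer house position, `to_smtlib` emits SMT-LIB text that could be handed to a solver, and a brute-force search over permutations stands in for the theorem prover so the example runs with the standard library alone.

```python
from itertools import permutations

# Hypothetical toy puzzle (not from the paper): three houses, numbered 1..3.
HOUSES = (1, 2, 3)
CATEGORIES = {
    "owner": ("alice", "bob", "carol"),
    "pet": ("cat", "dog", "zebra"),
}

# Each clue is stored twice: as an SMT-LIB assertion body and as an
# equivalent Python predicate over a {name: house} assignment.
CLUES = [
    ("(= alice 1)", lambda v: v["alice"] == 1),                      # Alice lives in house 1
    ("(= dog (+ alice 1))", lambda v: v["dog"] == v["alice"] + 1),   # dog is just right of Alice
    ("(= carol zebra)", lambda v: v["carol"] == v["zebra"]),         # Carol owns the zebra
]


def to_smtlib():
    """Render the puzzle as SMT-LIB 2 text (one Int per entity)."""
    lines = ["(set-logic QF_LIA)"]
    for names in CATEGORIES.values():
        for name in names:
            lines.append(f"(declare-const {name} Int)")
            lines.append(f"(assert (and (>= {name} 1) (<= {name} {len(HOUSES)})))")
        # Entities in the same category occupy different houses.
        lines.append(f"(assert (distinct {' '.join(names)}))")
    lines += [f"(assert {smt})" for smt, _ in CLUES]
    lines += ["(check-sat)", "(get-model)"]
    return "\n".join(lines)


def brute_force_solutions():
    """Stand-in for the prover: enumerate assignments and keep those
    that satisfy every clue. Only feasible for tiny puzzles."""
    owners, pets = CATEGORIES["owner"], CATEGORIES["pet"]
    solutions = []
    for owner_pos in permutations(HOUSES):
        for pet_pos in permutations(HOUSES):
            assignment = {**dict(zip(owners, owner_pos)), **dict(zip(pets, pet_pos))}
            if all(check(assignment) for _, check in CLUES):
                solutions.append(assignment)
    return solutions


if __name__ == "__main__":
    print(to_smtlib())
    print(brute_force_solutions())
```

Keeping the SMT-LIB string and the Python predicate side by side for each clue is a design choice of this sketch: it lets the generated constraints be sanity-checked without a solver installed. In a real pipeline the emitted text would be passed to an SMT solver instead.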


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
