arXiv

Beyond the Literal: Decomposing Pragmatic Intent in Multimodal Meme Understanding

Title: Moving Beyond the Surface: Disentangling Pragmatic Intent in Multimodal Meme Interpretation

Abstract: Large Vision Language Models (LVLMs) frequently default to describing the visual elements of a meme or sarcastic post when asked what it means, rather than capturing the author’s intended message. This limitation arises because standard instruction tuning entangles the literal content of a post with its pragmatic significance, allowing surface details to dominate the final output. To address this, we recast meme comprehension as a problem of separating literal content from pragmatic intent. We introduce Intent Projection, a framework that disentangles these two signals at the representation, output, and objective levels within a single LVLM backbone.

At the representation level, an orthogonal projection module removes the dominant unimodal directions from the fused image-text representation, preserving only the pragmatic residual. In parallel, a surface-real affect classifier provides the decoder with a discrete tag identifying the polarity gap. At the output level, the framework enforces structured reasoning chains; at the objective level, it employs a contrastive reward that explicitly penalizes responses that merely restate the literal description. Evaluated across six multimodal benchmarks, Intent Projection consistently surpasses open-source baselines and narrows the performance gap with proprietary models. The largest improvements appear on high-divergence posts, where literal collapse is most damaging to understanding.
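
The orthogonal projection step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the dominant unimodal directions are estimated, so `image_dir` and `text_dir` below are random placeholders, and the function name `project_out`, the embedding size, and the use of QR orthonormalization are all assumptions made for illustration. The sketch only shows the generic operation of removing a subspace from a batch of fused image-text embeddings and keeping the residual.

```python
import numpy as np


def project_out(fused, directions):
    """Remove the subspace spanned by `directions` from each row of `fused`.

    fused:      (n, d) array of fused embeddings (hypothetical).
    directions: (k, d) array of directions to remove, assumed linearly
                independent with k <= d.
    """
    # Orthonormalize the directions; q has shape (d, k) with orthonormal columns.
    q, _ = np.linalg.qr(directions.T)
    # Subtract each row's component inside the subspace, leaving the residual.
    return fused - (fused @ q) @ q.T


rng = np.random.default_rng(0)
d = 8                                          # toy embedding size (assumption)
image_dir = rng.normal(size=d)                 # placeholder "image-only" direction
text_dir = rng.normal(size=d)                  # placeholder "text-only" direction
directions = np.stack([image_dir, text_dir])   # shape (2, d)

fused = rng.normal(size=(4, d))                # four toy fused embeddings
residual = project_out(fused, directions)

# The residual has no component along either removed direction.
print(np.allclose(residual @ directions.T, 0.0))
```

Orthonormalizing before projecting matters because the two unimodal directions are generally not orthogonal to each other; subtracting each one independently would leave a small component behind.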


Source: arXiv Generated at: 2026-06-03 00:00:00 UTC
