Using Text-Based Causal Inference to Disentangle Factors Influencing Online Review Ratings
Title: Leveraging Text-Based Causal Inference to Isolate Determinants of Online Review Scores
Abstract
Online reviews are a critical source of information about how consumers perceive specific dimensions of a product or service. Although aspect-based sentiment analysis has made significant strides in extracting these facets from review text, little research has quantified how much each aspect contributes to the overall perception. The gap exists largely because aspects are highly correlated with one another, which makes individual effects hard to isolate. To address this challenge, this study proposes a methodology built on recent developments in text-based causal analysis, specifically the CausalBERT framework, to separate the impact of individual factors on overall review scores.
We introduce three major enhancements to the CausalBERT model: temperature scaling, which improves the calibration of treatment assignment estimates; hyperparameter optimization, which reduces overadjustment for confounders; and interpretability techniques that characterize the confounds the model identifies. In our approach, textual mentions within reviews serve as proxies for real-world attributes. We evaluate the method on real-world and semi-synthetic datasets comprising more than 600,000 reviews of U.S. K-12 schools. Our findings indicate that the proposed modifications yield more reliable estimates and that perceptions of school administration and benchmark performance are primary drivers of overall school ratings.
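To make the temperature-scaling idea concrete, the following is a minimal sketch of the generic technique, not the authors' implementation. It assumes a binary treatment (for example, whether a review mentions a given aspect) and a model that outputs one raw logit per review. The function names, the grid-search bounds, and the toy validation data are all hypothetical and chosen only for illustration. A single scalar temperature is fitted on held-out data by minimizing negative log-likelihood, and the logits are then divided by it before the sigmoid.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def nll(logits, labels, temperature):
    """Mean binary negative log-likelihood of labels under sigmoid(logit / T)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, labels):
        p = min(max(sigmoid(z / temperature), eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(logits)


def fit_temperature(logits, labels, low=0.05, high=20.0, steps=400):
    """Grid-search a scalar T on a log-spaced grid (bounds are assumptions).

    T = 1.0 is the starting candidate, so the result is never worse than
    the uncalibrated model on the data used for fitting.
    """
    best_t, best_loss = 1.0, nll(logits, labels, 1.0)
    for i in range(steps + 1):
        t = low * (high / low) ** (i / steps)
        loss = nll(logits, labels, t)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t


def calibrate(logits, temperature):
    """Return calibrated treatment probabilities for the given logits."""
    return [sigmoid(z / temperature) for z in logits]


# Hypothetical held-out logits from an overconfident treatment model,
# with the observed treatment indicators (two confident mistakes included).
val_logits = [4.0, 3.5, -4.0, 3.0, -3.5, -3.0, 4.5, -4.5]
val_labels = [1, 0, 0, 1, 1, 0, 1, 0]

T = fit_temperature(val_logits, val_labels)
print("fitted temperature:", round(T, 3))
print("NLL before:", round(nll(val_logits, val_labels, 1.0), 4))
print("NLL after: ", round(nll(val_logits, val_labels, T), 4))
print("calibrated probabilities:", [round(p, 3) for p in calibrate(val_logits, T)])
```

A fitted temperature above 1 softens overconfident probabilities and pulls them away from 0 and 1. This matters for any downstream estimator that reweights by treatment probability, because extreme probabilities produce unstable weights. Since dividing by a positive scalar does not change the ordering of the logits, the model's ranking of reviews is unaffected.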
Source: arXiv Generated at: 2026-06-04 00:00:00 UTC



