Avoiding Structural Failure Modes in Tabular Fair SSL: Online Primal-Dual Allocation under Confidence Gating
Title: Preventing Structural Failure Modes in Tabular Fair Semi-Supervised Learning: Online Primal-Dual Allocation with Confidence Gating
Abstract:
While semi-supervised learning (SSL) facilitates prediction using sparse labeling, critical tabular domains such as healthcare, credit scoring, and recidivism assessment demand rigorous statistical fairness assurances. Through a diagnostic stress test, we uncover a structural conflict inherent in tabular fair SSL. Specifically, when employing confidence-gated pseudo-labeling, moment-matching fairness regularizers can induce two distinct failure modes: "Masking Collapse," where fairness constraints degrade model confidence, thereby starving the system of pseudo-labels; and "Trivial Saturation," where the model drifts toward constant predictions. To address this, we introduce Online Primal-Dual Allocation (OPDA), an online control mechanism that dynamically schedules fairness and entropy-based stability penalties. This scheduling is driven by signals regarding violation levels, risk, and pseudo-label health, eliminating the need for dataset-specific selection of a fixed fairness weight within this diagnostic framework.
Evaluations on three standard tabular benchmarks (Adult, ACSIncome, and COMPAS) show that OPDA mitigates the degenerate regimes observed under static weighting and under simple single-signal adaptive baselines. On Adult and COMPAS, OPDA reaches non-degenerate operating points that are competitive with the empirical static-$\lambda$ frontier. On ACSIncome, it preserves utility while offering a broader fairness-utility spread. Comparisons with OPDA-lite indicate that the full controller primarily shifts the operating point toward higher utility on ACSIncome, whereas on Adult the comparison illustrates the fairness-utility trade-off between the two variants. These findings position OPDA as a calibration-free controller that maintains non-degenerate operating points in tabular fair SSL without per-dataset tuning of a fixed fairness weight.
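To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of confidence gating, a moment-matching fairness gap, and a generic projected dual-ascent update for a fairness weight. It is an illustration under stated assumptions, not the OPDA controller itself: the abstract does not specify the update rule, so the function names, the threshold `tau`, the tolerance `eps`, the step size `lr`, and the cap `lam_max` are all hypothetical, and the risk and pseudo-label-health signals are omitted.

```python
import numpy as np


def confidence_mask(probs, tau=0.95):
    """Confidence gate: keep unlabeled rows whose top class probability is >= tau.

    `tau` is a hypothetical threshold chosen for illustration.
    """
    return probs.max(axis=1) >= tau


def parity_gap(scores, group):
    """Moment-matching violation: absolute gap in mean score between two groups.

    Assumes a binary group indicator with both groups present.
    """
    return abs(scores[group == 1].mean() - scores[group == 0].mean())


def dual_step(lam, violation, eps=0.05, lr=0.5, lam_max=10.0):
    """Generic projected dual ascent on a fairness weight (not the paper's rule).

    The weight grows while the violation exceeds the tolerance `eps`,
    shrinks otherwise, and is clipped to [0, lam_max].
    """
    return float(np.clip(lam + lr * (violation - eps), 0.0, lam_max))


# Toy batch: predicted class probabilities and a binary group indicator.
probs = np.array([[0.97, 0.03], [0.60, 0.40], [0.02, 0.98]])
group = np.array([0, 0, 1])

mask = confidence_mask(probs)          # which rows would receive pseudo-labels
mask_rate = mask.mean()                # a low rate would signal pseudo-label starvation
gap = parity_gap(probs[:, 1], group)   # gap in mean positive-class score
lam = dual_step(0.0, gap)              # weight increases because gap > eps
```

In this sketch, a falling `mask_rate` is the kind of signal that would indicate the "Masking Collapse" regime described above; how OPDA combines such signals with the violation and risk terms is not stated in the abstract.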
Source: arXiv
Generated at: 2026-06-02 00:00:00 UTC





