Hybrid Imbalanced Regression Through Unified Data-Level and Algorithm-Level Balancing
Title: A Unified Approach to Hybrid Imbalanced Regression via Combined Data and Algorithmic Strategies
Abstract: In machine learning, imbalanced learning poses a significant hurdle, as the presence of underrepresented target values can skew model behavior and diminish accuracy on infrequent yet crucial instances. While this issue has been thoroughly investigated in classification tasks, imbalanced regression has received comparatively less attention. Current solutions typically confine themselves to either data-level balancing, which risks introducing noise and overfitting, or algorithm-level balancing, which frequently struggles with complex target distributions. To overcome these constraints, we introduce a cohesive hybrid framework that merges data- and algorithm-level balancing techniques into a pipeline compatible with any regressor. This framework operates through five distinct phases: first, adaptive bin partitioning dynamically divides the target space according to local linear coherence; second, it employs target-conditioned representation learning via a Conditional Variational Autoencoder; third, it executes multistage data-level balancing by clustering features in the feature space and oversampling minority clusters; fourth, it applies algorithm-level balancing through a novel Latent-Density Weighted Loss (LDWL), which prioritizes rare samples within both latent and target spaces; and fifth, it utilizes attention-based gated fusion to finalize the regression. Benchmark dataset evaluations reveal that this framework consistently enhances predictive accuracy relative to both standalone regressors and current imbalanced regression methods.
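To make the idea of algorithm-level balancing concrete, the following is a minimal sketch of generic inverse-density sample weighting for a regression loss. It is not the paper's LDWL, whose formula is not given in the abstract: the equal-width binning, the function names `inverse_density_weights` and `weighted_mse`, and the parameters `n_bins` and `eps` are all illustrative assumptions, and the sketch weights by target-space density only, whereas LDWL is described as also using latent-space density.

```python
import numpy as np


def inverse_density_weights(y, n_bins=10, eps=1e-8):
    """Weight each sample by the inverse of its target-bin frequency.

    Assumption: equal-width bins over the target range. The abstract
    describes adaptive bin partitioning, which is not reproduced here.
    """
    y = np.asarray(y, dtype=float)
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    # Use interior edges only so bin indices fall in 0..n_bins-1.
    bin_idx = np.digitize(y, edges[1:-1])
    counts = np.bincount(bin_idx, minlength=n_bins)
    # Empirical density of the bin each sample falls into.
    density = counts[bin_idx] / len(y)
    weights = 1.0 / (density + eps)
    # Normalize so the weights average to 1 and the loss scale is preserved.
    return weights / weights.mean()


def weighted_mse(y_true, y_pred, weights):
    """Mean squared error with per-sample weights."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(weights * (y_true - y_pred) ** 2))


# Nine common targets and one rare target: the rare one gets a larger weight.
y = np.array([1.0] * 9 + [10.0])
w = inverse_density_weights(y)
```

Under these assumptions, an error on the rare target contributes more to the loss than an equal error on a common target, which is the general effect that density-based reweighting aims for.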
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC




