arXiv

An Empirical Study of Data Scale, Model Complexity, and Input Modalities in Visual Generalization

Abstract:

While modern deep neural networks have demonstrated remarkable capabilities in computer vision thanks to their extensive parameter counts and complex nonlinear hierarchical designs, traditional statistical learning theories struggle to account for their generalization capabilities. We focus on three fundamental, controllable variables that likely influence visual generalization: data scale, model complexity, and input modalities. This paper presents an empirical investigation into how these factors impact generalization performance.

Our study begins with a preliminary experiment on a one-dimensional nonlinear function, in which we vary the number of training samples and the polynomial degree to isolate the effects of data scale and model complexity. The primary experiments then evaluate model performance on the CIFAR-10 and CIFAR-100 datasets while varying the training set size, the network architecture, and the type of input data.
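As a rough illustration of the preliminary experiment described above, the following sketch fits polynomials of different degrees to noisy samples of a one-dimensional nonlinear function and reports held-out error. The target function, noise level, sample counts, and degrees are all assumptions chosen for illustration; the abstract does not specify the ones used in the paper, and `heldout_mse` is a hypothetical helper name.

```python
import numpy as np

def target(x):
    # Hypothetical nonlinear target; the paper's actual function is not stated here.
    return np.sin(2 * np.pi * x)

def heldout_mse(n_train, degree, noise=0.1, seed=0):
    """Fit a polynomial of the given degree and return MSE on a clean test grid."""
    rng = np.random.default_rng(seed)
    x_train = rng.uniform(0.0, 1.0, n_train)
    y_train = target(x_train) + rng.normal(0.0, noise, n_train)
    # Least-squares polynomial fit (degree controls model complexity).
    model = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    x_test = np.linspace(0.0, 1.0, 200)
    return float(np.mean((model(x_test) - target(x_test)) ** 2))

# Sweep data scale (rows) against model complexity (columns).
for n in (20, 200, 2000):
    row = [heldout_mse(n, d) for d in (1, 3, 9)]
    print(n, ["%.5f" % v for v in row])
```

Reading the printed grid row by row shows how error changes with model complexity at a fixed sample count, and column by column how it changes with data scale at a fixed degree.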

Our findings indicate that expanding the training dataset consistently enhances generalization. In contrast, adjustments to model complexity do not yield reliable improvements. Furthermore, the removal of color information leads to a decline in performance, whereas the inclusion of explicit prior features—such as gradients, edges, and wavelets—produces mixed results that vary depending on the specific model architecture. This work offers a comprehensive empirical examination of the interplay between data scale, model complexity, input modalities, and visual generalization.
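The input-modality variations mentioned above (removing color, adding an explicit gradient feature) could be constructed along the following lines. This is a minimal sketch under stated assumptions: the luma weights, the finite-difference gradient, and the choice to append the feature as an extra channel are illustrative and are not confirmed as the paper's preprocessing. The function names are hypothetical.

```python
import numpy as np

def to_grayscale(img):
    # img: (H, W, 3) float array in [0, 1]. ITU-R BT.601 luma weights (an assumption).
    weights = np.array([0.299, 0.587, 0.114])
    return img @ weights

def gradient_magnitude(gray):
    # Finite-difference gradients along rows and columns, combined into a magnitude map.
    gy, gx = np.gradient(gray)
    return np.hypot(gx, gy)

def with_gradient_channel(img):
    # Append the gradient magnitude as a fourth input channel (hypothetical design).
    grad = gradient_magnitude(to_grayscale(img))
    return np.concatenate([img, grad[..., None]], axis=-1)

# A random stand-in for one 32x32 RGB CIFAR image.
rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
print(to_grayscale(img).shape)           # (32, 32)
print(with_gradient_channel(img).shape)  # (32, 32, 4)
```

A network consuming the augmented input would need its first layer widened to four channels, while the grayscale variant would need it narrowed to one.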

Code and experimental logs are available at: https://github.com/zlyd-CV/DeepLearning-Empirical-Studies.


Source: arXiv Generated at: 2026-06-04 00:00:00 UTC
