The Impact of Temporal Granularity on Socio-Demographic Inference from Household Load Profiles
Smart meter recordings can expose intimate socio-demographic details about households, raising significant privacy concerns. Although the risk of such inference has been demonstrated at individual temporal resolutions, the effect of resolution itself on inference accuracy has not been thoroughly investigated. This study fills that gap by examining how temporal granularity, ranging from 15-minute intervals up to weekly aggregates, affects the predictability of eight socio-demographic traits. The analysis uses one year of energy consumption records from 1,589 households.
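To make the notion of temporal granularity concrete, the following is a minimal sketch of how a 15-minute load profile could be aggregated into coarser resolutions. The function name `downsample`, the resolution labels, and the choice to sum readings and drop incomplete trailing blocks are illustrative assumptions, not the study's actual preprocessing.

```python
from typing import List

# Number of 15-minute readings per block at each target resolution.
# Assumption: the raw data arrives at 15-minute resolution, in kWh.
READINGS_PER_BLOCK = {
    "15min": 1,
    "1h": 4,
    "1d": 96,
    "7d": 672,
}


def downsample(readings: List[float], resolution: str) -> List[float]:
    """Sum consecutive 15-minute energy readings into coarser blocks.

    Trailing readings that do not fill a complete block are dropped.
    """
    size = READINGS_PER_BLOCK[resolution]
    n_blocks = len(readings) // size
    return [sum(readings[i * size:(i + 1) * size]) for i in range(n_blocks)]


if __name__ == "__main__":
    # Two weeks of constant consumption, 0.25 kWh per 15-minute interval.
    two_weeks = [0.25] * (2 * 672)
    for res in READINGS_PER_BLOCK:
        print(res, len(downsample(two_weeks, res)))
```

Summing preserves total energy per block, which is the natural choice for consumption data; averaging would be the analogous choice for power readings.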
To ensure robustness, we developed an evaluation framework in which machine learning classifiers are trained on full-year data but tested on individual, arbitrarily chosen weeks. This forces the models to generalize across both seasonal shifts and weekly patterns. The investigation yields three primary conclusions.
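The test side of such an evaluation could be sketched as follows: one arbitrary week is drawn from a household's full-year profile and used as the classifier's input at test time. The helper `sample_test_week`, the calendar-aligned week boundaries, and the 52-week year are assumptions made for illustration; the source does not specify how weeks are selected.

```python
import random
from typing import List, Tuple

READINGS_PER_WEEK = 7 * 96  # 15-minute readings in one week
WEEKS_PER_YEAR = 52         # assumption: the year is cut into 52 aligned weeks


def sample_test_week(
    year: List[float], rng: random.Random
) -> Tuple[int, List[float]]:
    """Pick one arbitrary week from a full-year 15-minute load profile.

    Returns the zero-based week index and the readings of that week.
    """
    if len(year) < WEEKS_PER_YEAR * READINGS_PER_WEEK:
        raise ValueError("expected at least 52 weeks of 15-minute readings")
    week_index = rng.randrange(WEEKS_PER_YEAR)
    start = week_index * READINGS_PER_WEEK
    return week_index, year[start:start + READINGS_PER_WEEK]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draw is reproducible
    profile = [0.1] * (WEEKS_PER_YEAR * READINGS_PER_WEEK)
    index, week = sample_test_week(profile, rng)
    print(index, len(week))
```

Because the sampled week can fall in any season, a model that only memorizes, say, winter heating patterns would be penalized under this kind of test.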
First, although coarser temporal resolution generally lowers predictive accuracy, the results show two distinct performance plateaus: accuracy remains stable between 15 minutes and one hour, and again between one day and seven days. These plateaus suggest viable pathways for data minimization that preserve analytical utility.
Second, interpretable handcrafted features and features generated by the tsfresh library perform competitively against embeddings derived from CNN-based autoencoders. Furthermore, XGBoost consistently outperformed the other classifiers tested.
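As an illustration of what interpretable handcrafted features might look like, the sketch below computes a handful of summary statistics from a load profile. The specific features, the name `handcrafted_features`, and the definition of "night" as the first quarter of each day are assumptions chosen for the example; the study's actual feature set is not listed here.

```python
from statistics import mean, pstdev
from typing import Dict, List


def handcrafted_features(readings: List[float], per_day: int) -> Dict[str, float]:
    """Compute a few interpretable features from a load profile (kWh).

    `per_day` is the number of readings per day (96 for 15-minute data,
    24 for hourly, 1 for daily). The profile must cover whole days.
    """
    if per_day <= 0 or not readings or len(readings) % per_day != 0:
        raise ValueError("profile must contain a whole number of days")

    avg = mean(readings)
    peak = max(readings)
    total = sum(readings)
    day_starts = range(0, len(readings), per_day)
    daily_totals = [sum(readings[i:i + per_day]) for i in day_starts]

    # Assumption: "night" is the first quarter of each day (00:00 to 06:00).
    night_len = per_day // 4
    night = sum(sum(readings[i:i + night_len]) for i in day_starts)

    return {
        "mean": avg,
        "std": pstdev(readings),
        "peak": peak,
        "peak_to_mean": peak / avg if avg > 0 else 0.0,
        "mean_daily_total": mean(daily_totals),
        "night_share": night / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    # Two days of hourly data: 1 kWh per hour, then 2 kWh per hour.
    print(handcrafted_features([1.0] * 24 + [2.0] * 24, per_day=24))
```

A feature dictionary of this kind could then be passed, one row per household, to a tabular classifier such as XGBoost. Note that with daily data (`per_day=1`) the night window is empty and `night_share` is always 0, a small example of how intra-day features lose their signal at coarse resolutions.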
Third, an analysis of feature importance reveals a divergence between static and dynamic attributes. Static characteristics, such as dwelling size, can be accurately inferred even from coarse-grained data. In contrast, dynamic behaviors, such as swimming pool usage, necessitate fine-grained temporal signals for reliable identification.
Ultimately, this research offers fresh perspectives on the balance between privacy and utility in smart metering systems. It demonstrates how the interplay of temporal resolution, feature extraction techniques, and classifier selection collectively determines the success of socio-demographic inference.
Source: arXiv



