[IPOL announce] new article: Incidence of the Sample Size Distribution on One-Shot Federated Learning
announcements about the IPOL journal
announce at list.ipol.im
Sun Feb 12 23:53:40 CET 2023
A new article is available in IPOL: http://www.ipol.im/pub/art/2023/440/
Marie Garin and Gonzalo Iñaki Quintana,
Incidence of the Sample Size Distribution on One-Shot Federated Learning,
Image Processing On Line, 13 (2023), pp. 57–64.
https://doi.org/10.5201/ipol.2023.440
Abstract
Federated Learning (FL) is a learning paradigm where multiple nodes
collaboratively train a model by only exchanging updates or parameters.
This makes it possible to keep the data local, thereby enhancing
privacy (a claim that requires nuance, e.g. language models are known
to memorize training data). Depending on the application, the number
of samples held by each node can vary widely, which can affect both
training and final performance. This work studies the impact of the
per-node sample size distribution on the mean squared error (MSE) of the
one-shot federated estimator. We focus on one-shot aggregation of
statistical estimations made across disjoint, independent and
identically distributed (i.i.d.) data sources, in the context of
empirical risk minimization. In distributed learning, it is well
known that, with m nodes in total, each node should contain at least
m samples to match the performance of centralized training. In a
federated scenario this result still holds, but now applies to the
mean of the per-node sample size distribution. The demo makes it
possible to visualize this
effect, as well as to compare the behavior of the FESC (Federated
Estimation with Statistical Correction) algorithm, a weighting scheme
that depends on the local sample size, with the classical federated
estimator and the centralized one, for a large collection of
distributions, numbers of nodes, and feature space dimensions.
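The abstract contrasts the classical one-shot federated estimator (a uniform average of local estimates) with a weighting that depends on the local sample size. A minimal sketch of this contrast for mean estimation follows; it is illustrative only, assuming Gaussian data, the particular `sizes` list, and simple n_i/N weights (not the FESC correction itself):

```python
# Illustrative sketch (not the authors' code): one-shot federated
# estimation of a mean across nodes with heterogeneous sample sizes.
import random
from statistics import mean

random.seed(0)
TRUE_MEAN = 2.0
sizes = [2, 3, 5, 50, 200]  # assumed heterogeneous per-node sample sizes
data = [[random.gauss(TRUE_MEAN, 1.0) for _ in range(n)] for n in sizes]

local = [mean(d) for d in data]  # each node's local estimate
N = sum(sizes)

# Classical one-shot federated estimator: uniform average of local estimates.
uniform = mean(local)
# Sample-size-aware weighting: weight each node by n_i / N.
weighted = sum(n / N * est for n, est in zip(sizes, local))
# Centralized estimator: pool all samples.
centralized = mean([x for d in data for x in d])
```

For the sample mean, the n_i/N weighting recovers the centralized estimate exactly, while the uniform average gives small nodes the same influence as large ones, inflating the MSE when the sample size distribution is skewed; the FESC algorithm studied in the article applies a more elaborate sample-size-dependent weighting.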
More information about the announce mailing list