dc.contributor.advisor: Gudiño-Mendoza, Gema B.
dc.contributor.author: Durán-González, Erika S.
dc.date.accessioned: 2025-06-03T18:52:45Z
dc.date.accessioned: 2026-04-28T16:03:56Z
dc.date.available: 2025-06-03T18:52:45Z
dc.date.available: 2026-04-28T16:03:56Z
dc.date.issued: 2025-05
dc.identifier.citation: Durán-González, E. S. (2025). A Comparative Analysis of Algorithms to Address the Imbalanced Dataset Problem in Federated Learning. Trabajo de obtención de grado, Maestría en Ciencia de Datos. Tlaquepaque, Jalisco: ITESO.
dc.identifier.uri: https://hdl.handle.net/20.500.12032/187365
dc.description.abstract: Traditional training of Machine Learning (ML) algorithms requires data collected from various devices to be transferred to a central server, which poses potential security and data-privacy risks. Another critical issue in machine learning is class imbalance, which arises when certain classes are underrepresented and can lead to suboptimal performance, particularly on minority-class data. Approaches such as oversampling, undersampling, and synthetic data creation have been developed to overcome this problem in machine learning. Federated Learning (FL) is a promising privacy-preserving Artificial Intelligence (AI) framework that addresses the challenges of traditional machine learning training. Class imbalance may also occur in federated learning, but the approaches mentioned above are not directly applicable, because the class distribution is kept unknown to protect privacy. Several federated learning algorithms have been developed to address this problem. This thesis implements and compares three federated learning algorithms designed to address the class imbalance problem: Combinatorial Upper Confidence Bounds (CUCB), CLass IMBalance Federated Learning (CLIMB), and Federated Feature Distillation (FedFed). Three data distributions were tested: label imbalance, quantitative imbalance, and double imbalance. To provide common ground for comparison, all implementations use the same dataset and data pre-processing, the same neural network model, and the same training hyperparameters. The results showed that CUCB had the best convergence rate, which is due to the algorithm inferring the data distribution from the test dataset. CLIMB addresses the mismatch between local and global imbalance, which makes the algorithm more robust, and it exhibited the best performance in all data distributions. FedFed did not perform as anticipated, despite using the latest advancements in generative AI. Further work should explore this implementation in a more complex environment, for example with a larger number of clients.
dc.language.iso: eng
dc.publisher: ITESO
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es
dc.subject: Federated Learning
dc.subject: Data Privacy
dc.subject: Imbalanced Dataset
dc.subject: Algorithm Design and Analysis
dc.title: A Comparative Analysis of Algorithms to Address the Imbalanced Dataset Problem in Federated Learning
dc.type: info:eu-repo/semantics/masterThesis
dc.type.version: info:eu-repo/semantics/acceptedVersion


Files in this item

File: ITESO_MAF_MScThesis_ED.pdf
Size: 2.612 Mb
Format: application/pdf


Except where otherwise noted, this item's license is described as https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es
