Federated Learning on Multilabel Evolving Data Streams

For details, see the paper: IEEE Internet of Things Journal - Federated Learning on Multilabel Evolving Data Streams

Abstract

Multilabel classification in distributed evolving data stream environments presents significant challenges, including handling distributed concept drift and label dependencies. In this study, we introduce two novel solutions that employ federated learning (FL) problem transformation techniques to tackle these challenges. Our first approach is an error-driven micro-cluster-based learning strategy that adapts micro-clusters to the evolving data distributions, enabling it to handle concept drift originating from different clients. Our second approach is a graph-based method that uses graph centrality to capture label dependency and correlation in distributed multilabel data streams. Experimental evaluations show that the proposed solutions outperform state-of-the-art methods on multilabel classification metrics. This study highlights the potential of FL for overcoming the challenges of distributed multilabel data stream classification.
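To make the idea of "graph centrality over labels" more concrete, here is a minimal sketch that builds a label co-occurrence graph and scores each label by its normalised weighted degree. This is an illustration only, not the method from the paper: the function name `label_degree_centrality`, the choice of degree centrality, and the toy label matrix are all assumptions made for this example.

```python
from itertools import combinations


def label_degree_centrality(label_rows):
    """Weighted degree centrality of each label in a co-occurrence graph.

    label_rows: list of equal-length 0/1 lists, one list per instance.
    Hypothetical helper for illustration; not part of this repository.
    """
    n_labels = len(label_rows[0])
    # weights[i][j] counts the instances in which labels i and j are both active.
    weights = [[0] * n_labels for _ in range(n_labels)]
    for row in label_rows:
        active = [i for i, bit in enumerate(row) if bit]
        for i, j in combinations(active, 2):
            weights[i][j] += 1
            weights[j][i] += 1
    # Each edge is stored twice in the symmetric matrix, so halve the sum.
    total = sum(sum(r) for r in weights) / 2
    if total == 0:
        return [0.0] * n_labels
    # A label's score is its summed edge weight divided by the total edge weight.
    return [sum(r) / total for r in weights]


# Toy label matrix (assumed data): 4 instances, 3 labels.
Y = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]
centrality = label_degree_centrality(Y)
print(centrality)
```

In this toy matrix, label 0 co-occurs with both other labels and therefore receives the highest score.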

Quick Start

1. Install Dependencies

We recommend using conda to set up the environment.

conda create --name venv python=3.11 -y
conda activate venv

pip install -r requirement.txt

2. Run Tests (Optional)

python tests/test_utils.py

3. Run the Code

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| `--dataset` | yelp | Dataset name |
| `--clients` | 5 | Number of federated clients |
| `--features` | 671 | Number of features |
| `--labels` | 5 | Number of labels |
| `--max_mc` | 500 | Max micro-clusters per client |
| `--global_mc` | 500 | Max global micro-clusters |
| `--percent_init` | 0.15 | Initial data percentage |
| `--run_type` | fed | Run mode: `fed` (recommended) |
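For readers who want to see how these flags and defaults fit together, the following is a sketch of an `argparse` parser that mirrors the parameter table. It is an assumption about how such a command line could be declared, not a copy of `main.py`. The function name `build_parser`, the help strings, and the argument types are illustrative.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the parameter table; main.py may differ."""
    parser = argparse.ArgumentParser(
        description="Federated learning on multilabel evolving data streams"
    )
    parser.add_argument("--dataset", default="yelp", help="Dataset name")
    parser.add_argument("--clients", type=int, default=5,
                        help="Number of federated clients")
    parser.add_argument("--features", type=int, default=671,
                        help="Number of features")
    parser.add_argument("--labels", type=int, default=5,
                        help="Number of labels")
    parser.add_argument("--max_mc", type=int, default=500,
                        help="Max micro-clusters per client")
    parser.add_argument("--global_mc", type=int, default=500,
                        help="Max global micro-clusters")
    parser.add_argument("--percent_init", type=float, default=0.15,
                        help="Initial data percentage")
    parser.add_argument("--run_type", default="fed",
                        help="Run mode (fed is recommended)")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--dataset", "scene", "--clients", "3"])
print(args.dataset, args.clients, args.max_mc)
```

Flags that are not passed fall back to the defaults listed in the table, which is why `args.max_mc` is still 500 here.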

Usage Examples

python main.py --dataset yelp --clients 5 --run_type fed

# Scale up with more clients  
python main.py --dataset yelp --clients 10 --run_type fed
python main.py --dataset yelp --clients 20 --run_type fed

# Different datasets
python main.py --dataset scene --clients 3 --run_type fed
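The documented defaults for `--features` and `--labels` (671 and 5) match the Yelp row of the Dataset Statistics table, while Scene is listed with 294 features and 6 labels. Whether `main.py` infers these values from the data is not stated here, so the sketch below passes them explicitly. The helper `build_command` and the `DATASET_SHAPES` mapping are hypothetical names introduced for this example.

```python
import sys

# (features, labels) copied from the Dataset Statistics table in this README.
DATASET_SHAPES = {
    "yelp": (671, 5),
    "scene": (294, 6),
}


def build_command(dataset, clients, run_type="fed"):
    """Return an argument list for main.py with explicit feature/label counts."""
    features, labels = DATASET_SHAPES[dataset]
    return [
        sys.executable, "main.py",
        "--dataset", dataset,
        "--clients", str(clients),
        "--features", str(features),
        "--labels", str(labels),
        "--run_type", run_type,
    ]


# Print the commands for a sweep over client counts instead of running them.
for clients in (5, 10, 20):
    print(" ".join(build_command("yelp", clients)[1:]))
```

Each list can be handed to `subprocess.run(cmd, check=True)` from the repository root to launch the runs one after another.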

The data_preprocessed/ folder contains:

- yelp.npy: Yelp multi-label dataset
- scene.npy: Scene multi-label dataset
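The `.npy` files can be read with `numpy.load`. Their internal layout is not documented here. As a rough illustration of how `--percent_init` and `--clients` might shape a run, the sketch below reserves an initial fraction of a stream for model initialisation and deals the remainder to clients in round-robin order. Both the round-robin assignment and the function name `split_stream` are assumptions for this example, and the repository may partition data differently.

```python
def split_stream(instances, percent_init=0.15, clients=5):
    """Split a stream into an initial batch and per-client shards.

    Hypothetical helper: the first `percent_init` fraction is kept for
    initialisation, and the rest is dealt to clients in round-robin order.
    """
    n_init = round(len(instances) * percent_init)
    init, stream = instances[:n_init], instances[n_init:]
    # Client c receives every `clients`-th instance, starting at offset c.
    shards = [stream[c::clients] for c in range(clients)]
    return init, shards


# Toy stream of 20 instance indices (assumed data).
init, shards = split_stream(list(range(20)), percent_init=0.15, clients=5)
print(init)
print(shards)
```

Round-robin assignment keeps the arrival order within each shard, which matters when the stream is expected to drift over time.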

Dataset Statistics

| Dataset | Instances | Features | Feature Type | Labels | Cardinality | Link |
| --- | --- | --- | --- | --- | --- | --- |
| Emotions | 593 | 72 | numeric | 6 | 1.868 | Emotions |
| Birds | 645 | 260 | numeric | 19 | 1.014 | Birds |
| Enron | 1,702 | 1,001 | nominal | 53 | 3.378 | Enron |
| Image | 2,000 | 294 | numeric | 5 | 1.236 | Image |
| Yeast | 2,417 | 103 | numeric | 14 | 4.237 | Yeast |
| Scene | 2,407 | 294 | nominal | 6 | 1.074 | Scene |
| Slashdot | 3,782 | 1,079 | nominal | 22 | 1.181 | Slashdot |
| Tmc2007-500 | 28,600 | 500 | nominal | 22 | 2.220 | Tmc2007-500 |
| Yelp | 10,810 | 671 | nominal | 5 | 1.638 | Yelp |
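Assuming the Cardinality column follows the standard multilabel definition, it is the average number of active labels per instance. The sketch below computes that value, along with label density (cardinality divided by the number of labels), for a small binary label matrix. The helper names and the toy data are illustrative and are not taken from this repository.

```python
def label_cardinality(label_rows):
    """Average number of active labels per instance."""
    return sum(sum(row) for row in label_rows) / len(label_rows)


def label_density(label_rows):
    """Label cardinality normalised by the number of labels."""
    return label_cardinality(label_rows) / len(label_rows[0])


# Toy label matrix (assumed data): 4 instances, 3 labels.
Y = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]
print(label_cardinality(Y))  # 7 active labels over 4 instances
print(label_density(Y))
```

Under this definition, a cardinality close to 1, as listed for Birds and Scene, means most instances carry a single label.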

Citation

If you find this code useful, please consider starring the repository ⭐ and citing the paper:

@ARTICLE{11098479,
  author={Lamptey, Khalid Odartey and Ayekai, Browne Judith and Ud Din, Salah},
  journal={IEEE Internet of Things Journal}, 
  title={Federated Learning on Multilabel Evolving Data Streams}, 
  year={2025},
  volume={12},
  number={20},
  pages={42103-42115},
  keywords={Streams;Federated learning;Multi label classification;Concept drift;Distributed databases;Accuracy;Training;Machine learning algorithms;Decision trees;Data models;Concept drift;data streams;federated learning (FL);multilabel classification;prototype-learning},
  doi={10.1109/JIOT.2025.3592954}}

Repository: kholam/FedMuL

Folders and files

Latest commit

History

Repository files navigation

Federated Learning on Multilabel Evolving Data Streams

Abstract

Quick Start

1. Install Dependencies

2. Run Tests (Optional)

3. Run the Code

Parameters

Usage Examples

Dataset Statistics

Citation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages