Official implementation of the paper:
Lim, H., Li, Q., Yang, S., & Kim, J. (2025). A BERT-Based Multi-Embedding Fusion Method Using Review Text for Recommendation. Expert Systems, 42(5), e70041. https://doi.org/10.1111/exsy.70041
This repository provides the official implementation of MFNR (Multi-Embedding Fusion Network for Recommendation), which enhances review-based recommendation by combining multiple pre-trained language models. MFNR integrates BERT and RoBERTa embeddings to capture richer user and item features from textual reviews. Experiments on Amazon and Goodreads datasets show that MFNR outperforms baseline models with an average improvement of 9.18% in RMSE and 14.81% in MAE, demonstrating the effectiveness of the multi-embedding fusion approach for recommender systems.
- Python 3.10
- pandas==2.2.3
- numpy==1.26.4
- ipykernel==6.29.5
- scikit-learn==1.5.2
- transformers==4.44.2
- torch==2.4.1
- torchvision==0.19.1
- pyarrow==17.0.0
Install dependencies:
pip install -r requirements.txt
Below is the project structure for quick reference.
├── data/ # Dataset directory
│ ├── raw/ # Original (unprocessed) datasets
│ └── processed/ # Preprocessed data for training/evaluation
│
├── model/ # Model definitions and checkpoints
│
├── src/ # Core source code
│ ├── bert.py # BERT-based embedding and feature extraction module
│ ├── config.yaml # Model and training configuration file
│ ├── path.py # Path and directory management utilities
│ └── utils.py # Helper functions (data loading, metrics, etc.)
│
├── main.py # Entry point for model training and evaluation
│
├── requirements.txt # Python package dependencies
│
├── README.md # Project documentation
│
└── .gitignore # Git ignore configuration
MFNR (Multi-Embedding Fusion Network for Recommendation) is a review-based recommendation model that leverages BERT and RoBERTa to extract rich semantic features from textual reviews.
The model consists of two parallel networks:
- UPM (User Preference Modeling): captures user preferences from reviews written by users.
- IFM (Item Feature Modeling): extracts item features from reviews written about items.
Both networks share the same architecture but focus on different perspectives. The latent representations generated by UPM and IFM are combined and passed through a rating prediction network, which models nonlinear user–item interactions to predict ratings.
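The two-tower layout described above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's implementation: the class names `ReviewTower` and `MFNRSketch`, the layer sizes, and the use of concatenation for both the BERT/RoBERTa fusion and the UPM/IFM combination are all assumptions, and random tensors stand in for the pre-computed review embeddings.

```python
import torch
from torch import nn


class ReviewTower(nn.Module):
    """One tower, reused for UPM and IFM (hypothetical layer sizes)."""

    def __init__(self, bert_dim=768, roberta_dim=768, latent_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(bert_dim + roberta_dim, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, bert_emb, roberta_emb):
        # Assumption: the two embeddings are fused by concatenation.
        fused = torch.cat([bert_emb, roberta_emb], dim=-1)
        return self.mlp(fused)


class MFNRSketch(nn.Module):
    """UPM and IFM towers followed by a rating prediction network."""

    def __init__(self, latent_dim=64):
        super().__init__()
        self.upm = ReviewTower(latent_dim=latent_dim)  # user-side reviews
        self.ifm = ReviewTower(latent_dim=latent_dim)  # item-side reviews
        self.predictor = nn.Sequential(
            nn.Linear(2 * latent_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, user_bert, user_roberta, item_bert, item_roberta):
        user_latent = self.upm(user_bert, user_roberta)
        item_latent = self.ifm(item_bert, item_roberta)
        # Assumption: user and item representations are combined by concatenation.
        joint = torch.cat([user_latent, item_latent], dim=-1)
        return self.predictor(joint).squeeze(-1)


torch.manual_seed(0)
model = MFNRSketch()
# Random stand-ins for a batch of 4 user/item review embeddings.
batch = [torch.randn(4, 768) for _ in range(4)]
ratings = model(*batch)
print(tuple(ratings.shape))
```

Both towers are instances of the same class, which mirrors the statement that UPM and IFM share an architecture while keeping separate weights for the user and item perspectives.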
Create a virtual environment and install all dependencies:
conda create -n mfnr python=3.10
conda activate mfnr
pip install -r requirements.txt
Place your dataset in the data/raw/ folder and preprocess it:
# Example: preprocess raw data and save to data/processed/
python src/utils.py --mode preprocess
Run the training script with the configuration file:
python main.py --config src/config.yaml --mode train
After training, evaluate using the saved checkpoint:
python main.py --config src/config.yaml --mode test
The performance of MFNR was evaluated on four real-world review datasets: Industrial and Scientific, Musical Instruments, Prime Pantry, and Goodreads. Results show that the proposed model consistently outperforms existing baselines in both RMSE and MAE metrics.
| Model | Industrial and Scientific RMSE | Industrial and Scientific MAE | Musical Instruments RMSE | Musical Instruments MAE | Prime Pantry RMSE | Prime Pantry MAE | Goodreads RMSE | Goodreads MAE |
|---|---|---|---|---|---|---|---|---|
| MF | 1.430 | 1.342 | 1.433 | 1.344 | 1.421 | 1.331 | 1.246 | 0.967 |
| PMF | 1.277 | 1.245 | 1.308 | 1.270 | 1.280 | 1.239 | 1.222 | 0.960 |
| HFT | 1.275 | 1.056 | 1.271 | 1.068 | 1.213 | 1.015 | 1.121 | 0.875 |
| DeepCoNN | 1.231 | 0.929 | 1.171 | 0.916 | 1.142 | 0.904 | 1.113 | 0.778 |
| NARRE | 1.126 | 0.843 | 1.104 | 0.841 | 1.090 | 0.842 | 0.976 | 0.749 |
| DAML | 1.102 | 0.839 | 1.099 | 0.833 | 1.033 | 0.826 | 0.969 | 0.739 |
| AENAR | 1.100 | 0.836 | 1.098 | 0.831 | 1.028 | 0.817 | 0.968 | 0.738 |
| SAFMR | 1.100 | 0.833 | 1.097 | 0.830 | 1.023 | 0.802 | 0.966 | 0.735 |
| Proposed (MFNR) | 1.087 | 0.803 | 1.094 | 0.823 | 1.017 | 0.763 | 0.965 | 0.724 |
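For readers who want to reproduce this kind of comparison, the snippet below sketches how RMSE, MAE, and the relative improvement over a baseline can be computed. The function names and the toy ratings are illustrative assumptions, not the repository's code; only the two Prime Pantry MAE values (SAFMR 0.802, MFNR 0.763) are taken from the table above. Lower values are better for both metrics.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted ratings."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def relative_improvement(baseline, proposed):
    """Percentage reduction in error relative to the baseline."""
    return (baseline - proposed) / baseline * 100


# Toy ratings, made up purely for illustration.
y_true = [5.0, 3.0, 4.0, 1.0]
y_pred = [4.0, 3.0, 5.0, 3.0]
print(f"RMSE: {rmse(y_true, y_pred):.4f}")
print(f"MAE:  {mae(y_true, y_pred):.4f}")

# Prime Pantry MAE from the table: SAFMR 0.802 vs. MFNR 0.763.
gain = relative_improvement(0.802, 0.763)
print(f"MAE improvement over SAFMR on Prime Pantry: {gain:.2f}%")
```

This single-baseline comparison is not the averaged improvement figure reported in the paper; it only shows the arithmetic for one table cell.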
If you find this work useful in your research, please cite our paper:
@article{lim2025bert,
title = {A BERT-Based Multi-Embedding Fusion Method Using Review Text for Recommendation},
author = {Lim, H. and Li, Q. and Yang, S. and Kim, J.},
journal = {Expert Systems},
volume = {42},
number = {5},
pages = {e70041},
year = {2025},
publisher = {Wiley},
doi = {10.1111/exsy.70041}
}
For questions, collaborations, or feedback, please contact:
Qinglong Li (이청용)
Assistant Professor, Division of Computer Engineering, Hansung University
Email: leecy@hansung.ac.kr
Last updated: October 2025
