A BERT-Based Multi-Embedding Fusion Method Using Review Text for Recommendation

Official implementation of the paper:

Lim, H., Li, Q., Yang, S., & Kim, J. (2025). A BERT-Based Multi-Embedding Fusion Method Using Review Text for Recommendation. Expert Systems, 42(5), e70041. https://doi.org/10.1111/exsy.70041

Overview

This repository provides the official implementation of MFNR (Multi-Embedding Fusion Network for Recommendation), which enhances review-based recommendation by combining multiple pre-trained language models. MFNR integrates BERT and RoBERTa embeddings to capture richer user and item features from textual reviews. Experiments on Amazon and Goodreads datasets show that MFNR outperforms baseline models with an average improvement of 9.18% in RMSE and 14.81% in MAE, demonstrating the effectiveness of the multi-embedding fusion approach for recommender systems.

Requirements

Python 3.10
pandas==2.2.3
numpy==1.26.4
ipykernel==6.29.5
scikit-learn==1.5.2
transformers==4.44.2
torch==2.4.1
torchvision==0.19.1
pyarrow==17.0.0

Install dependencies:

pip install -r requirements.txt

Repository Structure

Below is the project structure for quick reference.

├── data/                        # Dataset directory
│   ├── raw/                     # Original (unprocessed) datasets
│   └── processed/               # Preprocessed data for training/evaluation
│
├── model/                       # Model definitions and checkpoints
│
├── src/                         # Core source code
│   ├── bert.py                  # BERT-based embedding and feature extraction module
│   ├── config.yaml              # Model and training configuration file
│   ├── path.py                  # Path and directory management utilities
│   └── utils.py                 # Helper functions (data loading, metrics, etc.)
│
├── main.py                      # Entry point for model training and evaluation
│
├── requirements.txt             # Python package dependencies
│
├── README.md                    # Project documentation
│
└── .gitignore                   # Git ignore configuration

Model Description

MFNR is a review-based recommendation model that uses BERT and RoBERTa to extract semantic features from textual reviews.

The model consists of two parallel networks:

UPM (User Preference Modeling): captures a user's preferences from the reviews that user has written.
IFM (Item Feature Modeling): extracts an item's features from the reviews written about that item.

Both networks share the same architecture but focus on different perspectives. The latent representations generated by UPM and IFM are combined and passed through a rating prediction network, which models nonlinear user–item interactions to predict ratings.
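The two-tower layout described above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's actual code: the class names ReviewTower and RatingSketch, the layer sizes, and the choice to fuse the BERT and RoBERTa embeddings by concatenation are all assumptions, and random tensors stand in for the real review embeddings (768 is the hidden size of the base BERT and RoBERTa checkpoints).

```python
import torch
from torch import nn


class ReviewTower(nn.Module):
    """One tower (usable as UPM or IFM): maps two review embeddings to a latent vector."""

    def __init__(self, bert_dim: int = 768, roberta_dim: int = 768, latent_dim: int = 64):
        super().__init__()
        # Assumption: fusion by concatenation followed by a small MLP.
        self.mlp = nn.Sequential(
            nn.Linear(bert_dim + roberta_dim, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, bert_emb: torch.Tensor, roberta_emb: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([bert_emb, roberta_emb], dim=-1)
        return self.mlp(fused)


class RatingSketch(nn.Module):
    """Two towers with the same architecture, followed by a rating prediction MLP."""

    def __init__(self, latent_dim: int = 64):
        super().__init__()
        self.upm = ReviewTower(latent_dim=latent_dim)  # user side
        self.ifm = ReviewTower(latent_dim=latent_dim)  # item side
        self.predictor = nn.Sequential(
            nn.Linear(2 * latent_dim, 64),
            nn.ReLU(),  # nonlinearity for user-item interactions
            nn.Linear(64, 1),
        )

    def forward(self, user_bert, user_roberta, item_bert, item_roberta):
        user_vec = self.upm(user_bert, user_roberta)
        item_vec = self.ifm(item_bert, item_roberta)
        joint = torch.cat([user_vec, item_vec], dim=-1)
        return self.predictor(joint).squeeze(-1)  # one predicted rating per pair


torch.manual_seed(0)
batch = 4
model = RatingSketch()
# Random stand-ins for the user-side and item-side BERT/RoBERTa embeddings.
inputs = [torch.randn(batch, 768) for _ in range(4)]
ratings = model(*inputs)
print(ratings.shape)
```

The two towers are separate instances of the same class, which mirrors the statement that UPM and IFM share an architecture but not a perspective. Refer to src/bert.py and main.py for the actual fusion and prediction layers.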

How to Run

Environment Setup

Create a conda environment and install the dependencies:

conda create -n mfnr python=3.10
conda activate mfnr
pip install -r requirements.txt

Data Preparation

Place your dataset in the data/raw/ folder and preprocess it:

# Example: preprocess raw data and save to data/processed/
python src/utils.py --mode preprocess
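
The preprocessing performed by src/utils.py is not described here. As a rough sketch of the kind of grouping the model needs (reviews written by each user for UPM, reviews written about each item for IFM), the snippet below uses only the standard library. The record layout and the field names user_id, item_id, text, and rating are assumptions for illustration, as is the function name group_reviews.

```python
from collections import defaultdict

# Hypothetical records; the field names are assumptions, not the repository's schema.
reviews = [
    {"user_id": "u1", "item_id": "i1", "text": "Sturdy and accurate.", "rating": 5.0},
    {"user_id": "u1", "item_id": "i2", "text": "Stopped working after a week.", "rating": 2.0},
    {"user_id": "u2", "item_id": "i1", "text": "Good value for the price.", "rating": 4.0},
]


def group_reviews(records):
    """Collect review texts per user (UPM input) and per item (IFM input)."""
    by_user = defaultdict(list)
    by_item = defaultdict(list)
    for record in records:
        by_user[record["user_id"]].append(record["text"])
        by_item[record["item_id"]].append(record["text"])
    return dict(by_user), dict(by_item)


user_docs, item_docs = group_reviews(reviews)
print(len(user_docs["u1"]), len(item_docs["i1"]))
```

Each user and each item ends up with a list of review texts that could then be encoded by the language models.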

Train the Model

Run the training script with the configuration file:

python main.py --config src/config.yaml --mode train

Evaluate the Model

After training, evaluate using the saved checkpoint:

python main.py --config src/config.yaml --mode test
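
Both commands share the same entry point and differ only in the --mode value. A command-line interface of this shape could be parsed with argparse roughly as below. Only the --config and --mode flags and the train/test values come from the commands above; the default value, help texts, and the function name build_parser are assumptions, and the real handling in main.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags used in the commands above (illustrative)."""
    parser = argparse.ArgumentParser(description="Train or evaluate a model (sketch).")
    parser.add_argument("--config", default="src/config.yaml",
                        help="Path to the configuration file.")
    parser.add_argument("--mode", choices=["train", "test"], required=True,
                        help="Run training or evaluation.")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--config", "src/config.yaml", "--mode", "test"])
print(args.config, args.mode)
```

Restricting --mode with choices makes a mistyped mode fail immediately with a usage message instead of silently doing nothing.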

Experimental Results

MFNR was evaluated on four real-world review datasets: three Amazon categories (Industrial and Scientific, Musical Instruments, Prime Pantry) and Goodreads. It achieves the lowest RMSE and MAE of all compared models on every dataset (lower is better for both metrics).

Model	Industrial and Scientific RMSE	Industrial and Scientific MAE	Musical Instruments RMSE	Musical Instruments MAE	Prime Pantry RMSE	Prime Pantry MAE	Goodreads RMSE	Goodreads MAE
MF	1.430	1.342	1.433	1.344	1.421	1.331	1.246	0.967
PMF	1.277	1.245	1.308	1.270	1.280	1.239	1.222	0.960
HFT	1.275	1.056	1.271	1.068	1.213	1.015	1.121	0.875
DeepCoNN	1.231	0.929	1.171	0.916	1.142	0.904	1.113	0.778
NARRE	1.126	0.843	1.104	0.841	1.090	0.842	0.976	0.749
DAML	1.102	0.839	1.099	0.833	1.033	0.826	0.969	0.739
AENAR	1.100	0.836	1.098	0.831	1.028	0.817	0.968	0.738
SAFMR	1.100	0.833	1.097	0.830	1.023	0.802	0.966	0.735
Proposed (MFNR)	1.087	0.803	1.094	0.823	1.017	0.763	0.965	0.724
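
For reference, the two metrics in the table and a relative-improvement figure can be computed as in the sketch below. The toy ratings are invented for illustration, the function names are assumptions rather than the helpers in src/utils.py, and the way the paper averages improvements across baselines and datasets is not reproduced here.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline, proposed):
    """Percentage reduction in error relative to a baseline (lower error is better)."""
    return (baseline - proposed) / baseline * 100.0


# Toy ratings, invented for illustration only.
y_true = [5.0, 3.0, 4.0, 1.0]
y_pred = [4.0, 3.0, 5.0, 3.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred))

# MAE on Prime Pantry, taken from the table: SAFMR 0.802 vs. MFNR 0.763.
gain = relative_improvement(0.802, 0.763)
print(round(gain, 2))
```

With the table values above, MFNR reduces MAE on Prime Pantry by about 4.86% relative to SAFMR, the strongest baseline on that dataset.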

Citation

If you find this work useful in your research, please cite our paper:

@article{lim2025bert,
  title     = {A BERT-Based Multi-Embedding Fusion Method Using Review Text for Recommendation},
  author    = {Lim, H. and Li, Q. and Yang, S. and Kim, J.},
  journal   = {Expert Systems},
  volume    = {42},
  number    = {5},
  pages     = {e70041},
  year      = {2025},
  publisher = {Wiley},
  doi       = {10.1111/exsy.70041}
}

Contact

For questions, collaborations, or feedback, please contact:
Qinglong Li (이청용)
Assistant Professor, Division of Computer Engineering, Hansung University
Email: leecy@hansung.ac.kr

Last updated: October 2025
