
KG-WISE

There is a growing demand for efficient graph neural networks (GNNs) during inference, especially for real-time applications on large knowledge graphs (KGs). GNN inference queries on KGs are computationally expensive and vary in complexity, as each query involves a different number of target nodes associated with subgraphs of varying densities, structures, and sizes. GNN inference acceleration methods aim to instantiate a smaller, more efficient model from a given trained GNN model, using approaches such as pruning, quantization, or knowledge distillation. However, these methods are not optimized for KGs and do not dynamically adjust the model reduction based on the complexity of the inference query. Moreover, they store the entire trained GNN model as a monolithic file, which leads to inefficiencies in both storage and inference with large KGs. This paper introduces KG-WISE, a scalable storage and inference system that decomposes GNN models trained on KGs into lower-granularity components. The KG-WISE storage mechanism enables partial model loading based on the KG structure and model components. KG-WISE proposes a query-aware inference method that dynamically adjusts the reduced model based on the inference query complexity. Our evaluation spans six real KGs with up to 42 million nodes and 166 million edges from diverse domains, with GNN models for node classification and link prediction. Our experiments show that KG-WISE outperforms state-of-the-art systems by reducing inference time by up to 90% and memory usage by up to 80% while achieving comparable or better model performance.

Fig.1: KG-WISE retrieves an LLM-extracted subgraph based on the user query, loads the decomposed GNN model from the KV store, and performs on-demand inference using optimized sparse tensor operations.

Prerequisites

1. Virtuoso Installation

Please follow the steps provided here to install and set up Virtuoso.
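If you prefer a containerized setup, one common alternative is OpenLink's official Docker image. This is a minimal sketch, not part of the KG-WISE instructions; the password and ports follow the image's documented defaults:

# Minimal sketch: run Virtuoso via Docker (DBA_PASSWORD value is illustrative)
docker run -d --name virtuoso \
  -e DBA_PASSWORD=dba \
  -p 8890:8890 -p 1111:1111 \
  openlink/virtuoso-opensource-7
# The SPARQL endpoint is then served at http://localhost:8890/sparql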

2. Data Ingestion

Ingest your knowledge graph (KG) into the Virtuoso engine (a bulk-loading sketch is shown below).
To replicate the DBLP NC experiment from the paper, you can use the DBLP KG from here.
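As a minimal ingestion sketch, Virtuoso's bulk loader can load RDF files from a directory listed under DirsAllowed in virtuoso.ini; the paths, credentials, and graph URI below are placeholders:

# Bulk-load RDF files into a named graph (paths and graph URI are placeholders)
isql 1111 dba dba exec="ld_dir('/path/to/rdf', '*.nt', 'http://example.org/your-graph');"
isql 1111 dba dba exec="rdf_loader_run();"
isql 1111 dba dba exec="checkpoint;"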

Installation

  • Clone the KG-WISE repo.
  • Create the KGWISE Conda environment (Python 3.8) and install the pip requirements (a consolidated sketch follows below).
  • Activate the KGWISE environment:
conda activate KGWISE
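Put together, the environment setup might look like this; the repository URL and the requirements.txt location are assumptions inferred from the repo name:

# Sketch of the full setup (repo URL and requirements path are assumptions)
git clone https://github.com/CoDS-GCS/KG-WISE.git
cd KG-WISE
conda create -n KGWISE python=3.8
conda activate KGWISE
pip install -r requirements.txt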

Running DBLP Experiment

  1. Set the endpoint of your RDF engine and the graph URI in Run_DBLP.sh (an illustrative example follows the commands below).
  2. Execute:
chmod +x Run_DBLP.sh
./Run_DBLP.sh
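For reference, the values set in Run_DBLP.sh might look like the following; the variable names are illustrative (check the script for the actual ones), and http://localhost:8890/sparql is Virtuoso's default SPARQL endpoint:

# Illustrative configuration only; see Run_DBLP.sh for the real variable names
endpoint="http://localhost:8890/sparql"
graph_uri="http://example.org/dblp"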

OR

Download the ready-to-use datasets below.

Download KGTOSA NC datasets

  • MAG_42M_PV_FG
  • MAG_42M_PV_d1h1
  • DBLP-15M_PV_FG
  • DBLP-15M_PV_d1h1
  • YAGO4-30M_PC_FG
  • YAGO4-30M_PC_d1h1
Download KGTOSA LP datasets

  • YAGO3-10_FG_d2h1
  • WikiKG2_FG_d2h1
  • DBLP2023-010305_FG_d2h1
LLM-Based KG Sampling

To run an example of LLM-based sampling, use the following command:

# Get an LLM-based SPARQL query template for the task on the KG
python SparqlMLaasService/LLM_subgraph_sampler_rag.py --example DBLP_NC

Examples are provided for the following KG datasets: DBLP, YAGO4, MAG, YAGO310, and WikiKG. You can follow any of these examples for your custom KG; include the schema of your KG along with the frequency of its triples (see the sketch below).
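If you need the triple frequencies for your custom KG, one way to collect them is a simple aggregate query against your SPARQL endpoint; the endpoint address below is Virtuoso's default and an assumption:

# Count triples per predicate to build a schema-frequency summary
curl -G 'http://localhost:8890/sparql' \
  --data-urlencode 'query=SELECT ?p (COUNT(*) AS ?freq) WHERE { ?s ?p ?o } GROUP BY ?p ORDER BY DESC(?freq)' \
  --data-urlencode 'format=text/csv'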

Train the Model and Decompose

# Train KG-WISE and decompose into the KV store
python GNNaaS/models/RGCN_Train.py --dataset_name <DatasetName> --n_classes <NumClasses>

Perform KG-WISE inference:

python GNNaaS/models/wise_ssaint.py --dataset_name <DatasetName>

Perform baseline Graph-SAINT inference:

python GNNaaS/models/Graph-SAINT.py --dataset_name <DatasetName>

Note:

To add your own knowledge graphs, upload them to your RDF engine, configure the endpoint in Constants.py, and add an API key for LLM_subgraph_sampler_rag.py.
