A modular framework for cost-efficient inference in complex multi-modal tasks using Foundation Model Programs. This system enables dynamic routing of sub-tasks to the most appropriate foundation model backends, optimizing the trade-off between accuracy and computational cost.
This repository implements a neurosymbolic execution system where user queries are compiled into Python-like Foundation Model Programs. Each program is composed of generic neural functions (e.g., object detection, VQA) and dynamically chooses among multiple backend models during inference.
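To make this concrete, here is a hedged sketch of what a compiled Foundation Model Program might look like. The `ImagePatch` API (`find`, `simple_query`, `horizontal_center`) is an assumption loosely based on `execution/image_patch.py`; the exact interface may differ.

```python
from execution.image_patch import ImagePatch  # assumed location of the class

# Hypothetical compiled program for the query:
#   "What color is the cup to the left of the laptop?"
# Each neural call (find, simple_query) is dispatched to a backend
# chosen by the routing policy at inference time.

def execute_query(image):
    patch = ImagePatch(image)           # wrap the full image
    laptops = patch.find("laptop")      # object-detection backend
    if not laptops:
        return patch.simple_query("What color is the cup?")  # VQA fallback
    cups = patch.find("cup")
    # Spatial reasoning: keep only cups left of the first detected laptop.
    left_cups = [c for c in cups
                 if c.horizontal_center < laptops[0].horizontal_center]
    if not left_cups:
        return "no cup found"
    return left_cups[0].simple_query("What color is this cup?")  # VQA backend
```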
Key Features:
- Streaming VQA support for binary and open-form tasks
- Online policy learning for dynamic backend selection
- Structured REINFORCE + Thompson Sampling for cost-aware inference (see the sketch below)
- Composable program interface for multi-modal reasoning
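As a rough illustration of how cost-aware routing can work, the sketch below pairs a REINFORCE policy-gradient update, whose reward penalizes backend cost, with Thompson Sampling for exploration. All names here (`policy`, `reinforce_step`, `thompson_select`) are hypothetical simplifications; the actual logic lives in `routing/algorithms.py`.

```python
import numpy as np
import torch

# Minimal sketch of cost-aware backend selection, NOT the repository's
# exact code. `policy` is any torch module mapping a query embedding to
# backend logits; lambda_ plays the role of --cost_weighting.

def reinforce_step(policy, optimizer, query_emb, backend_idx,
                   accuracy, cost, lambda_):
    """One REINFORCE update: reward = task accuracy minus weighted cost."""
    logits = policy(query_emb)
    log_prob = torch.log_softmax(logits, dim=-1)[backend_idx]
    reward = accuracy - lambda_ * cost         # accuracy/cost trade-off
    loss = -reward * log_prob                  # policy-gradient surrogate loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

def thompson_select(successes, failures):
    """Pick a backend by sampling per-backend Beta posteriors (exploration)."""
    samples = np.random.beta(successes + 1, failures + 1)
    return int(samples.argmax())
```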
Use Cases:
- Visual Question Answering (VQA)
- Multi-modal scene understanding
- Cost-constrained decision-making for FM workflows
Project Structure:
```
📁FMP
├── 📁config                  # YAML configuration files for model setup and routing parameters
│   ├── binary_vqa.yaml       # FM backend and hyperparameter configurations for Streaming Binary VQA experiments
│   ├── binary_vqa_supp.yaml  # FM backend and hyperparameter configurations for Streaming Binary VQA experiments (supplementary)
│   └── open_form_vqa.yaml    # FM backend and hyperparameter configurations for Streaming Open-Form VQA experiments
├── 📁data                    # Input data and saved retrieval images
│   ├── binary_vqa.json       # Dataset annotations and queries for the Streaming Binary VQA task
│   └── 📁open_form_vqa       # Dataset annotations and queries for the Streaming Open-Form VQA task
├── 📁execution               # Core execution modules
│   ├── backend.py            # Routing backend & program execution engine
│   ├── image_patch.py        # ImagePatch class & spatial reasoning
│   ├── models.py             # ObjectDetection, VQA, and LLM wrappers
│   ├── modules.py            # Initialization & model pooling
│   └── utils.py              # Shared utilities (e.g., transforms, loaders)
├── 📁routing                 # Routing algorithms & networks
│   ├── __init__.py
│   ├── algorithms.py         # Reinforcement learning routing logic
│   └── networks.py           # Neural network models for routing decisions
├── api.key                   # API key for remote models (excluded via .gitignore)
├── environment.yml           # Conda environment specification
├── main_verification.py      # Entry point for the Streaming Binary VQA task
├── main_vqa.py               # Entry point for the Streaming Open-Form VQA task
├── README.md                 # Project documentation
├── utils_verification.py     # Task-specific utilities for Streaming Binary VQA
├── utils_vqa.py              # Task-specific utilities for Streaming Open-Form VQA
└── utils.py                  # Shared utility functions
```
Installation:
```bash
git clone https://github.com/yourusername/FMP.git
cd FMP
conda env create -f environment.yml
conda activate FMP
```

Download COCO:
```bash
mkdir -p data/coco && cd data/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
unzip train2017.zip && unzip val2017.zip
```

Download the binary VQA and open-form VQA benchmarks:
```bash
cd data
wget https://utexas.box.com/shared/static/rdcykkjg41i2rfo7itna4tvrygbmzkct.json -O binary_vqa.json
wget https://utexas.box.com/shared/static/p62xf6oqrp92zeboj8kyne21mbeej6pv.zip -O open_form_vqa.zip && unzip open_form_vqa.zip
```

Run Streaming Binary VQA:
```bash
python main_verification.py \
    --cost_weighting 0.001 0.003 0.005 \
    --config config/binary_vqa.yaml \
    --data data/binary_vqa.json \
    --log logs/
```

| Flag | Description |
|---|---|
| `--cost_weighting` | Trade-off between accuracy and cost. |
| `--config` | Program + backend routing YAML. |
| `--data` | JSON file with query/image annotations. |
| `--log` | Logging directory. |
Run Streaming Open-Form VQA:
```bash
python main_vqa.py \
    --cost_weighting 0.001 0.003 0.005 \
    --config config/open_form_vqa.yaml \
    --data data/open_form_vqa \
    --log logs/
```

| Flag | Description |
|---|---|
| `--cost_weighting` | Trade-off between accuracy and cost. |
| `--config` | Program + backend routing YAML. |
| `--data` | Root directory of the open-form VQA data. |
| `--log` | Logging directory. |
| `--type` | Query categories to evaluate. Defaults to all. |
Configuration Notes:
- Backend model selection, routing policy, and program control flow are defined in `config/` (a hypothetical config sketch follows below).
- Supported backends:
  - MMDetection (object detection)
  - HuggingFace / vLLM (LLMs and VLMs)
  - AgentLego (tool-based execution)
  - OpenAI / remote APIs
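For orientation, a routing configuration might look roughly like the sketch below. Every key, backend name, and value here is hypothetical; the authoritative schemas are the files in `config/`.

```yaml
# Hypothetical config sketch -- see config/*.yaml for the real schema.
cost_weighting: [0.001, 0.003, 0.005]
backends:
  object_detection:
    - name: mmdet_detector    # MMDetection backend (hypothetical name)
      cost: 1.0
  vqa:
    - name: hf_vlm_small      # HuggingFace/vLLM backend (hypothetical name)
      cost: 1.0
    - name: openai_remote     # remote API backend (hypothetical name)
      cost: 10.0
routing:
  algorithm: structured_reinforce   # cf. routing/algorithms.py
  exploration: thompson_sampling
```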