A FastAPI-based service for managing and processing images with optional AI-powered caption generation.
Features:
- Image management (upload, list, retrieve)
- Image cropping with customizable target sizes
- Optional AI-powered caption generation
- Image export functionality
- RESTful API endpoints
Requirements:
- Python 3.12+
- FastAPI
- Pillow
- PyTorch (optional, for AI caption generation)
- Unsloth (optional, for AI caption generation)
Installation:
- Clone the repository:
    git clone <repository-url>
    cd gallery-project
- Create a virtual environment (recommended):
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt

On Windows with WSL:
    wsl
    sudo mount -t drvfs D: /mnt/d
    cd
    source unsloth/bin/activate
    python /mnt/c/playground/imageGalleryServer/main.py
The service uses the following configuration:
- IMAGES_DIR: Directory where images are stored (default: /Users/stuartleal/gallery-project/images)
- IMAGES_PER_PAGE: Number of images per page in pagination (default: 10)
- CAPTION_GENERATOR: Type of caption generator to use (DUMMY or UNSLOTH)
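As a rough illustration, the three settings above could be grouped as follows. This is a sketch, not the project's actual config.py: the `Settings` dataclass and the enum's string values are assumptions, and the real module may simply use module-level constants.

```python
from dataclasses import dataclass
from enum import Enum
from pathlib import Path


class CaptionGeneratorType(Enum):
    """The two caption generator modes named in this README."""
    DUMMY = "dummy"      # string values are illustrative assumptions
    UNSLOTH = "unsloth"


@dataclass(frozen=True)
class Settings:
    # Hypothetical container for the documented settings and their defaults.
    images_dir: Path = Path("/Users/stuartleal/gallery-project/images")
    images_per_page: int = 10
    caption_generator: CaptionGeneratorType = CaptionGeneratorType.DUMMY


settings = Settings()
```

Grouping the values in a frozen dataclass keeps the defaults in one place and prevents accidental mutation at runtime.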
- GET /images: List images with pagination
  - Query parameters:
    - page: Page number (default: 1)
    - page_size: Images per page (default: 10)
  - Returns: List of images with metadata
- GET /images/{image_id}: Get a specific image
  - Returns: Image file
- GET /images/{image_id}/caption: Get image caption
  - Returns: Caption text
- POST /images/{image_id}/caption: Save image caption
  - Body: {"caption": "string"}
- POST /images/{image_id}/generate-caption: Generate caption using AI
  - Query parameters:
    - prompt: Optional prompt for caption generation
  - Returns: Generated caption
- GET /images/{image_id}/preview/{target_size}: Get image preview
  - Returns: Scaled image preview
- POST /images/{image_id}/crop: Crop image
  - Body: { "targetSize": number, "normalizedDeltas": { "x": number, "y": number } }
  - Returns: Cropped image
- POST /api/export-images: Export selected images
  - Body: { "imageIds": ["string"] }
  - Returns: ZIP file containing selected images and their captions
The service supports two modes for caption generation:
- Dummy Mode: Generates simple, predefined captions
  - No additional dependencies required
  - Fast and lightweight
  - Good for testing and development
  - Example: "A picture of something" or "A picture of {prompt}"
- AI Mode: Uses Unsloth's Llama 3.2 Vision model
  - Requires NVIDIA or Intel GPU
  - More sophisticated captions
  - Higher resource requirements
  - Dependencies:
    - PyTorch
    - Unsloth
    - Transformers
    - Accelerate
To switch between modes, modify the CAPTION_GENERATOR setting in config.py:
    # For dummy mode (default)
    CAPTION_GENERATOR = CaptionGeneratorType.DUMMY
    # For AI mode
    CAPTION_GENERATOR = CaptionGeneratorType.UNSLOTH

Project structure:
gallery-project/
├── image_server/
│ ├── main.py # Main FastAPI application
│ ├── caption_generator.py # Caption generation logic
│ ├── config.py # Configuration settings
│ └── requirements.txt # Python dependencies
├── images/ # Image storage directory
└── README.md # This documentation
- Start the server:
    cd image_server
    python main.py
- The server will start at http://localhost:4322
- Create new endpoints in main.py
- Add corresponding models in the Models section
- Implement business logic in separate modules
- Update documentation
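As one possible reading of "business logic in separate modules", the pagination behind GET /images could live in a plain helper that an endpoint in main.py calls. This is a sketch under that assumption; `Page` and `paginate` are hypothetical names, not code from the project.

```python
from dataclasses import dataclass
from math import ceil
from typing import Sequence


@dataclass(frozen=True)
class Page:
    items: list[str]
    page: int
    page_size: int
    total: int
    total_pages: int


def paginate(items: Sequence[str], page: int = 1, page_size: int = 10) -> Page:
    """Return the 1-indexed `page` of `items`; pages past the end are empty."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    return Page(
        items=list(items[start:start + page_size]),
        page=page,
        page_size=page_size,
        total=len(items),
        total_pages=ceil(len(items) / page_size),
    )
```

Keeping the helper free of any web framework imports means it can be unit tested with pytest without starting the server.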
- Install test dependencies:
    pip install -r requirements-test.txt
- Run tests:
    pytest

To contribute:
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
[Your chosen license]
For support, please open an issue or contact [your contact information].