fix(backend): Backend Docker build fixes for AutoModelForImageTextToText #642
This PR fixes ModuleNotFoundError: AutoModelForImageTextToText that was breaking backend deployments.
## Changes
1. **Add transformers[vision] extra (pyproject.toml + poetry.lock)**
   - Changed: `transformers (>=4.46.0)` → `transformers[vision] (>=4.46.0)`
   - Reason: Docling's CodeFormulaModel requires vision-text model dependencies
2. **Preserve numpy._core.tests (backend/Dockerfile.backend)**
   - Added exclusion: `! -path "*/numpy/*"` to tests cleanup
   - Reason: numpy._core.tests is a required module, not test code
   - Was causing cascading import failures:
     - numpy.testing → numpy._core.tests._natype
     - scipy → numpy
     - sklearn → scipy
     - transformers → sklearn
   - Result: AutoModelForImageTextToText import failed
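The effect of the exclusion can be sketched in Python. This is only an illustration: the full cleanup command in `backend/Dockerfile.backend` is not shown here, so the function name `removable_test_dirs` and the directory layout in `demo` are hypothetical. Only the `*/numpy/*` pattern comes from the change itself.

```python
import fnmatch
import tempfile
from pathlib import Path


def removable_test_dirs(root: Path, keep_glob: str = "*/numpy/*") -> list[Path]:
    """Return directories named 'tests' under root, skipping paths matching keep_glob."""
    found = []
    for path in root.rglob("tests"):
        if not path.is_dir():
            continue
        # Like find's -path, the glob is matched against the whole path,
        # and '*' may span directory separators.
        if fnmatch.fnmatch(path.as_posix(), keep_glob):
            continue
        found.append(path)
    return sorted(found)


def demo() -> list[str]:
    """Build a throwaway site-packages tree (hypothetical layout) and list what would be removed."""
    with tempfile.TemporaryDirectory() as tmp:
        site = Path(tmp) / "site-packages"
        (site / "numpy" / "_core" / "tests").mkdir(parents=True)
        (site / "somepkg" / "tests").mkdir(parents=True)
        return [p.relative_to(site).as_posix() for p in removable_test_dirs(site)]


if __name__ == "__main__":
    print(demo())
```

In this sketch `numpy/_core/tests` survives the cleanup while `somepkg/tests` is still selected for removal.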
## Testing
Validated locally with ARM64 build:
```bash
docker build -f backend/Dockerfile.backend -t backend:test .
docker run --rm backend:test python -c \
"from transformers import AutoModelForImageTextToText; print('✓')"
```
Output: ✓
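For a failure that is easier to localize, each link of the import chain listed above could be checked separately inside the container. The following is a minimal sketch, not part of this PR; `check_imports` and `IMPORT_CHAIN` are hypothetical names.

```python
import importlib


def check_imports(module_names):
    """Try each import in order; map module name to None on success or the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # report any import-time failure, not only ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


# The chain from the PR description, innermost dependency first.
IMPORT_CHAIN = ["numpy.testing", "scipy", "sklearn", "transformers"]

if __name__ == "__main__":
    for name, error in check_imports(IMPORT_CHAIN).items():
        print(f"{name}: {'ok' if error is None else error}")
```

The first module that reports an error shows where the cascade starts.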
## Fixes
- Resolves AutoModelForImageTextToText import errors
- Fixes 50+ failed deployments caused by missing vision dependencies
- Prevents accidental deletion of required numpy modules
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## 🚀 Development Environment Options
This repository supports Dev Containers for a consistent development environment.
### Option 1: GitHub Codespaces (Recommended)
Create a cloud-based development environment:
### Option 2: VS Code Dev Containers (Local)
Use Dev Containers on your local machine:
### Option 3: Traditional Local Setup
Set up the development environment manually:

    # Clone the repository
    git clone https://github.com/manavgup/rag_modulo.git
    cd rag_modulo
    git checkout fix/backend-docker-build
    # Initialize development environment
    make dev-init
    make dev-build
    make dev-up
    make dev-validate

### Available Commands
Once in your development environment:

    make help          # Show all available commands
    make dev-validate  # Validate environment setup
    make test-atomic   # Run atomic tests
    make test-unit     # Run unit tests
    make lint          # Run linting

### Services Available
When running
This automated message helps reviewers quickly set up the development environment.
## Code Review - PR #642: Backend Docker Build Fixes
Thank you for this well-documented PR! This addresses a critical deployment issue with a focused, surgical fix. Here's my comprehensive review:
### ✅ Strengths
1. Excellent Problem Analysis
2. Minimal, Targeted Changes
3. Strong Documentation
### 🔍 Technical Analysis
Change 1: Add
…tion errors
This PR fixes Pydantic validation errors that were occurring when the SKIP_AUTH secret was empty.
## Problem
When SKIP_AUTH secret is not set or empty, the backend receives an empty string '', causing:
```
Input should be a valid boolean, unable to interpret input
[type=bool_parsing, input_value='', input_type=str]
```
This was causing backend deployments to fail during the Code Engine application startup.
## Solution
Added default value 'false' to SKIP_AUTH environment variable:
**Before**:
```yaml
SKIP_AUTH: ${{ secrets.SKIP_AUTH }}
```
**After**:
```yaml
SKIP_AUTH: ${{ secrets.SKIP_AUTH || 'false' }}
```
Now when the secret is empty, the backend receives 'false' instead of '', which Pydantic can parse as a boolean.
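The fallback can be illustrated in Python. This is a sketch under stated assumptions: `parse_bool` is a simplified stand-in for Pydantic's boolean parsing (the accepted strings below are assumed, not copied from the library), and `resolve_skip_auth` is a hypothetical helper that mimics the `||` expression, which falls through to `'false'` because an empty string is falsy.

```python
# Assumed sets of accepted strings; a simplified stand-in, not Pydantic's actual code.
TRUE_STRINGS = {"1", "on", "t", "true", "y", "yes"}
FALSE_STRINGS = {"0", "off", "f", "false", "n", "no"}


def parse_bool(raw: str) -> bool:
    """Interpret a string as a boolean, rejecting anything unrecognized (including '')."""
    normalized = raw.lower()
    if normalized in TRUE_STRINGS:
        return True
    if normalized in FALSE_STRINGS:
        return False
    raise ValueError(f"unable to interpret {raw!r} as a boolean")


def resolve_skip_auth(secret: str) -> str:
    """Mimic `secrets.SKIP_AUTH || 'false'`: an empty secret falls back to 'false'."""
    return secret or "false"


if __name__ == "__main__":
    for secret in ["", "true", "false"]:
        value = resolve_skip_auth(secret)
        print(f"{secret!r} -> {value!r} -> {parse_bool(value)}")
```

Without the fallback, `parse_bool("")` raises, which corresponds to the `bool_parsing` error shown above.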
## Testing
This fix will be validated in the next deployment workflow run. Expected behavior:
- If SKIP_AUTH secret is set: uses that value
- If SKIP_AUTH secret is empty/unset: defaults to 'false'
- Backend starts successfully without Pydantic validation errors
## Related
- Part of deployment fixes series (breaking down PR #641)
- Related to PR #642 (backend Docker fixes)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…odeengine
This PR updates the GitHub Actions workflow to use the correct backend Dockerfile.
## Problem
The workflow was using `Dockerfile.codeengine` which:
- Used `poetry install` that pulled CUDA PyTorch from poetry.lock (6-8GB NVIDIA libs)
- Caused massive Docker image bloat
- Led to deployment failures
## Solution
Changed the workflow to use `backend/Dockerfile.backend` which:
- Parses `pyproject.toml` directly with pip
- Uses CPU-only PyTorch index `--extra-index-url https://download.pytorch.org/whl/cpu`
- Significantly reduces image size
- Works with the fixes from PR #642 (transformers[vision] + numpy cleanup)
**Before**:
```yaml
file: ./Dockerfile.codeengine
```
**After**:
```yaml
file: ./backend/Dockerfile.backend
```
## Changes
- `.github/workflows/deploy_complete_app.yml` (line 215): Updated Dockerfile path
## Testing
This fix will be validated in the CI pipeline. Expected behavior:
- ✅ **Builds use correct Dockerfile**: backend/Dockerfile.backend
- ✅ **CPU-only PyTorch**: No CUDA libraries in image
- ✅ **Smaller image size**: ~500MB vs 6-8GB
- ✅ **Successful deployment**: No import errors
## Type of Change
- [x] Bug fix (non-breaking change which fixes an issue)
- [x] Deployment fix
## Related PRs
This is part of the focused PR strategy to replace PR #641:
- **PR #642**: Backend Docker fixes (transformers[vision] + numpy cleanup)
- **PR #643**: SKIP_AUTH default value fix
- **PR #644** (this PR): Workflow Dockerfile path fix
## Checklist
- [x] Code follows the style guidelines of this project
- [x] Change is focused and addresses a single issue
- [x] Commit message follows conventional commits format
- [x] No breaking changes introduced
- [x] CI workflows will validate the change
---
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
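One way to check the "no CUDA libraries in image" expectation is to list the installed distributions inside the built container. This is a minimal sketch and not part of the workflow: it assumes CUDA wheels are published under names beginning with `nvidia-` (for example `nvidia-cublas-cu12`), and both function names are hypothetical.

```python
from importlib import metadata


def cuda_distributions(names):
    """Return the names that look like NVIDIA CUDA wheels (assumed 'nvidia-' prefix)."""
    return sorted(
        n for n in names
        # Treat '_' and '-' alike, since package metadata may use either separator.
        if n.lower().replace("_", "-").startswith("nvidia-")
    )


def installed_distribution_names():
    """Names of all distributions visible to the current interpreter."""
    return [
        dist.metadata["Name"]
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    ]


if __name__ == "__main__":
    offenders = cuda_distributions(installed_distribution_names())
    print("CUDA packages found:", offenders if offenders else "none")
```

An empty result is consistent with a CPU-only install, though it does not by itself prove that the image is small.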
## Summary
Fixes `ModuleNotFoundError: AutoModelForImageTextToText` that was causing 50+ failed backend deployments.
This PR contains two critical Docker build fixes:
## Changes
### 1. Add transformers[vision] extra
Files: `pyproject.toml`, `poetry.lock`
Why: Docling's CodeFormulaModel requires `transformers[vision]` to access vision-text model dependencies (pillow, torchvision, etc.)
### 2. Preserve numpy._core.tests
File: `backend/Dockerfile.backend`
Why: `numpy._core.tests` is a required module (not test code) that was being deleted by cleanup, causing cascading import failures:
- `numpy.testing` imports `numpy._core.tests._natype`
- `scipy` imports `numpy`
- `sklearn` imports `scipy`
- `transformers` imports `sklearn`
## Testing
✅ Local validation with ARM64 build:
Output: ✓
🤖 Generated with Claude Code