# AI-Generated Alt Text Implementation Guide ## Overview This system generates accessibility-compliant alt text for recipe images in both German and English using local Ollama vision models. Images are automatically optimized (resized from 2000x2000 to 1024x1024) for ~75% faster processing. ## Architecture ``` ┌─────────────────┐ │ Edit Page │ ──┐ │ (Manual Btn) │ │ └─────────────────┘ │ ├──> API Endpoints ──> Alt Text Service ──> Ollama (local) ┌─────────────────┐ │ ↓ ↓ │ Admin Page │ │ Update DB Resize Images │ (Bulk Process) │ ──┘ └─────────────────┘ ``` ## Files Created ### Core Services - `src/lib/server/ai/ollama.ts` - Ollama API wrapper - `src/lib/server/ai/alttext.ts` - Alt text generation logic (DE/EN) - `src/lib/server/ai/imageUtils.ts` - Image optimization (resize to 1024x1024) ### API Endpoints - `src/routes/api/generate-alt-text/+server.ts` - Single image generation - `src/routes/api/generate-alt-text-bulk/+server.ts` - Batch processing ### UI Components - `src/lib/components/GenerateAltTextButton.svelte` - Reusable button component - `src/routes/admin/alt-text-generator/+page.svelte` - Bulk processing admin page ## Setup Instructions ### 1. Environment Variables Add to your `.env` file: ```bash OLLAMA_URL="http://localhost:11434" ``` ### 2. Install/Verify Dependencies ```bash # Sharp is already installed (for image resizing) pnpm list sharp # Verify Ollama is running ollama list ``` ### 3. Ensure Vision Model is Available You have `gemma3:latest` installed. If not: ```bash ollama pull gemma3:latest ``` ## Usage ### Option 1: Manual Generation (Edit Page) Add the button component to your edit page where images are managed: ```svelte ``` ### Option 2: Bulk Processing (Admin Page) Navigate to: **`/admin/alt-text-generator`** Features: - View statistics (total images, missing alt text) - Check Ollama status - Process in batches (configurable size) - Filter: "Only Missing" or "All (Regenerate)" ### Option 3: Programmatic API ```typescript // POST /api/generate-alt-text const response = await fetch('/api/generate-alt-text', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ shortName: 'brot', imageIndex: 0, modelName: 'gemma3:latest' // optional }) }); const { altText } = await response.json(); // altText = { de: "...", en: "..." } ``` ## How It Works ### Image Processing Flow 1. **Input**: 2000x2000px WebP image (~4-6MB) 2. **Optimization**: Resized to 1024x1024px JPEG 85% quality (~1-2MB) - Maintains aspect ratio - Reduces processing time by ~75-85% 3. **Encoding**: Converted to base64 4. **AI Processing**: Sent to Ollama with context 5. **Output**: Alt text generated in both languages ### Alt Text Generation **German Prompt:** ``` Erstelle einen prägnanten Alt-Text (maximal 125 Zeichen) für dieses Rezeptbild. Rezept: Brot Kategorie: Brot Stichwörter: Sauerteig, Roggen Beschreibe NUR das SICHTBARE: Aussehen, Farben, Präsentation, Textur. ``` **English Prompt:** ``` Generate a concise alt text (maximum 125 characters) for this recipe image. Recipe: Bread Category: Bread Keywords: Sourdough, Rye Describe ONLY what's VISIBLE: appearance, colors, presentation, texture. ``` ### Database Updates Updates are saved to: - `recipe.images[index].alt` - German alt text - `recipe.translations.en.images[index].alt` - English alt text Arrays are automatically synchronized to match indices. ## Performance ### Image Optimization Impact | Metric | Original (2000x2000) | Optimized (1024x1024) | Improvement | |--------|---------------------|----------------------|-------------| | File Size | ~12-16MB base64 | ~1-2MB base64 | 75-85% smaller | | Processing Time | ~4-6 seconds | ~1-2 seconds | 75-85% faster | | Memory Usage | High | Low | Significant | ### Batch Processing - Processes images sequentially to avoid overwhelming CPU - Configurable batch size (default: 10 recipes at a time) - Progress tracking with success/fail counts ## Automatic Resizing **Question**: Does Ollama resize images automatically? **Answer**: Yes, but manual preprocessing is better: - **Ollama automatic**: Resizes to 224x224 internally - **Manual preprocessing**: Resize to 1024x1024 before sending - Reduces network overhead - Lowers memory usage - Faster inference - Better quality (more pixels than 224x224) Sources: - [Ollama Vision Models Blog](https://ollama.com/blog/vision-models) - [Optimize Image Resolution for Ollama](https://markaicode.com/optimize-image-resolution-ollama-vision-models/) - [Llama 3.2 Vision](https://ollama.com/library/llama3.2-vision) ## Integration with Image Upload To auto-generate alt text when images change, add to your image upload handler: ```typescript // After successful image upload: if (newImageUploaded) { await fetch('/api/generate-alt-text', { method: 'POST', body: JSON.stringify({ shortName: recipe.short_name, imageIndex: recipe.images.length - 1 // Last image }) }); } ``` ## Troubleshooting ### Ollama Not Available ```bash # Check if Ollama is running curl http://localhost:11434/api/tags # Start Ollama ollama serve # Verify model is installed ollama list | grep gemma3 ``` ### Alt Text Quality Issues 1. **Too generic**: Add more context (tags, ingredients) 2. **Too long**: Adjust max_tokens in `alttext.ts` 3. **Wrong language**: Check prompts in `buildPrompt()` function 4. **Low accuracy**: Consider using larger model (90B version) ### Performance Issues 1. **Slow processing**: Already optimized to 1024x1024 2. **High CPU**: Reduce batch size in admin page 3. **Memory errors**: Lower `maxWidth`/`maxHeight` in `imageUtils.ts` ## Future Enhancements - [ ] Queue system for background processing - [ ] Progress websocket for real-time updates - [ ] A/B testing different prompts - [ ] Fine-tune model on recipe images - [ ] Support for multiple images per recipe - [ ] Auto-generate on upload hook - [ ] Translation validation (check DE/EN consistency) ## API Reference ### POST /api/generate-alt-text Generate alt text for a single image. **Request:** ```json { "shortName": "brot", "imageIndex": 0, "modelName": "llava-llama3:8b" } ``` **Response:** ```json { "success": true, "altText": { "de": "Knuspriges Sauerteigbrot mit goldbrauner Kruste", "en": "Crusty sourdough bread with golden-brown crust" }, "message": "Alt text generated and saved successfully" } ``` ### POST /api/generate-alt-text-bulk Batch process multiple recipes. **Request:** ```json { "filter": "missing", // "missing" or "all" "limit": 10, "modelName": "llava-llama3:8b" } ``` **Response:** ```json { "success": true, "processed": 25, "failed": 2, "results": [ { "shortName": "brot", "name": "Sauerteigbrot", "processed": 1, "failed": 0 } ] } ``` ### GET /api/generate-alt-text-bulk Get statistics about images. **Response:** ```json { "totalWithImages": 150, "missingAltText": 42, "ollamaAvailable": true } ``` ## Testing ```bash # Test Ollama connection curl http://localhost:11434/api/tags # Test image generation (replace with actual values) curl -X POST http://localhost:5173/api/generate-alt-text \ -H "Content-Type: application/json" \ -d '{"shortName":"brot","imageIndex":0}' # Check bulk stats curl http://localhost:5173/api/generate-alt-text-bulk ``` ## License & Credits - Uses [Ollama](https://ollama.com/) for local AI inference - Image processing via [Sharp](https://sharp.pixelplumbing.com/) - Vision model: Gemma3 (better German language support)