Files
homepage/docs/AI_ALT_TEXT_IMPLEMENTATION.md
Alexander Bocken 4ffc0940ef
All checks were successful
CI / update (push) Successful in 1m10s
feat: add AI-powered alt text generation for recipe images
- Implement local Ollama integration for bilingual (DE/EN) alt text generation
- Add image management UI to German edit page and English translation section
- Update Card and recipe detail pages to display alt text from images array
- Include GenerateAltTextButton component for manual alt text generation
- Add bulk processing admin page for batch alt text generation
- Optimize images to 1024x1024 before AI processing for 75% faster generation
- Store alt text in recipe.images[].alt and translations.en.images[].alt
2026-01-05 17:28:19 +01:00

8.1 KiB

AI-Generated Alt Text Implementation Guide

Overview

This system generates accessibility-compliant alt text for recipe images in both German and English using local Ollama vision models. Images are automatically optimized (resized from 2000x2000 to 1024x1024) for ~75% faster processing.

Architecture

┌─────────────────┐
│   Edit Page     │ ──┐
│  (Manual Btn)   │   │
└─────────────────┘   │
                      ├──> API Endpoints ──> Alt Text Service ──> Ollama (local)
┌─────────────────┐   │         ↓                    ↓
│   Admin Page    │   │    Update DB          Resize Images
│  (Bulk Process) │ ──┘
└─────────────────┘

Files Created

Core Services

  • src/lib/server/ai/ollama.ts - Ollama API wrapper
  • src/lib/server/ai/alttext.ts - Alt text generation logic (DE/EN)
  • src/lib/server/ai/imageUtils.ts - Image optimization (resize to 1024x1024)

API Endpoints

  • src/routes/api/generate-alt-text/+server.ts - Single image generation
  • src/routes/api/generate-alt-text-bulk/+server.ts - Batch processing

UI Components

  • src/lib/components/GenerateAltTextButton.svelte - Reusable button component
  • src/routes/admin/alt-text-generator/+page.svelte - Bulk processing admin page

Setup Instructions

1. Environment Variables

Add to your .env file:

OLLAMA_URL="http://localhost:11434"

2. Install/Verify Dependencies

# Sharp is already installed (for image resizing)
pnpm list sharp

# Verify Ollama is running
ollama list

3. Ensure Vision Model is Available

You have gemma3:latest installed. If not:

ollama pull gemma3:latest

Usage

Option 1: Manual Generation (Edit Page)

Add the button component to your edit page where images are managed:

<script>
  import GenerateAltTextButton from '$lib/components/GenerateAltTextButton.svelte';

  // In your image editing section:
  let shortName = data.recipe.short_name;
  let imageIndex = 0; // Index of the image in the images array
</script>

<!-- Add this near your image upload/edit section -->
<GenerateAltTextButton {shortName} {imageIndex} />

Option 2: Bulk Processing (Admin Page)

Navigate to: /admin/alt-text-generator

Features:

  • View statistics (total images, missing alt text)
  • Check Ollama status
  • Process in batches (configurable size)
  • Filter: "Only Missing" or "All (Regenerate)"

Option 3: Programmatic API

// POST /api/generate-alt-text
const response = await fetch('/api/generate-alt-text', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    shortName: 'brot',
    imageIndex: 0,
    modelName: 'gemma3:latest' // optional
  })
});

const { altText } = await response.json();
// altText = { de: "...", en: "..." }

How It Works

Image Processing Flow

  1. Input: 2000x2000px WebP image (~4-6MB)
  2. Optimization: Resized to 1024x1024px JPEG 85% quality (~1-2MB)
    • Maintains aspect ratio
    • Reduces processing time by ~75-85%
  3. Encoding: Converted to base64
  4. AI Processing: Sent to Ollama with context
  5. Output: Alt text generated in both languages

Alt Text Generation

German Prompt:

Erstelle einen prägnanten Alt-Text (maximal 125 Zeichen) für dieses Rezeptbild.
Rezept: Brot
Kategorie: Brot
Stichwörter: Sauerteig, Roggen

Beschreibe NUR das SICHTBARE: Aussehen, Farben, Präsentation, Textur.

English Prompt:

Generate a concise alt text (maximum 125 characters) for this recipe image.
Recipe: Bread
Category: Bread
Keywords: Sourdough, Rye

Describe ONLY what's VISIBLE: appearance, colors, presentation, texture.

Database Updates

Updates are saved to:

  • recipe.images[index].alt - German alt text
  • recipe.translations.en.images[index].alt - English alt text

Arrays are automatically synchronized to match indices.

Performance

Image Optimization Impact

Metric Original (2000x2000) Optimized (1024x1024) Improvement
File Size ~12-16MB base64 ~1-2MB base64 75-85% smaller
Processing Time ~4-6 seconds ~1-2 seconds 75-85% faster
Memory Usage High Low Significant

Batch Processing

  • Processes images sequentially to avoid overwhelming CPU
  • Configurable batch size (default: 10 recipes at a time)
  • Progress tracking with success/fail counts

Automatic Resizing

Question: Does Ollama resize images automatically?

Answer: Yes, but manual preprocessing is better:

  • Ollama automatic: Resizes to 224x224 internally
  • Manual preprocessing: Resize to 1024x1024 before sending
    • Reduces network overhead
    • Lowers memory usage
    • Faster inference
    • Better quality (more pixels than 224x224)

Sources:

Integration with Image Upload

To auto-generate alt text when images change, add to your image upload handler:

// After successful image upload:
if (newImageUploaded) {
  await fetch('/api/generate-alt-text', {
    method: 'POST',
    body: JSON.stringify({
      shortName: recipe.short_name,
      imageIndex: recipe.images.length - 1 // Last image
    })
  });
}

Troubleshooting

Ollama Not Available

# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama
ollama serve

# Verify model is installed
ollama list | grep gemma3

Alt Text Quality Issues

  1. Too generic: Add more context (tags, ingredients)
  2. Too long: Adjust max_tokens in alttext.ts
  3. Wrong language: Check prompts in buildPrompt() function
  4. Low accuracy: Consider using larger model (90B version)

Performance Issues

  1. Slow processing: Already optimized to 1024x1024
  2. High CPU: Reduce batch size in admin page
  3. Memory errors: Lower maxWidth/maxHeight in imageUtils.ts

Future Enhancements

  • Queue system for background processing
  • Progress websocket for real-time updates
  • A/B testing different prompts
  • Fine-tune model on recipe images
  • Support for multiple images per recipe
  • Auto-generate on upload hook
  • Translation validation (check DE/EN consistency)

API Reference

POST /api/generate-alt-text

Generate alt text for a single image.

Request:

{
  "shortName": "brot",
  "imageIndex": 0,
  "modelName": "llava-llama3:8b"
}

Response:

{
  "success": true,
  "altText": {
    "de": "Knuspriges Sauerteigbrot mit goldbrauner Kruste",
    "en": "Crusty sourdough bread with golden-brown crust"
  },
  "message": "Alt text generated and saved successfully"
}

POST /api/generate-alt-text-bulk

Batch process multiple recipes.

Request:

{
  "filter": "missing",  // "missing" or "all"
  "limit": 10,
  "modelName": "llava-llama3:8b"
}

Response:

{
  "success": true,
  "processed": 25,
  "failed": 2,
  "results": [
    {
      "shortName": "brot",
      "name": "Sauerteigbrot",
      "processed": 1,
      "failed": 0
    }
  ]
}

GET /api/generate-alt-text-bulk

Get statistics about images.

Response:

{
  "totalWithImages": 150,
  "missingAltText": 42,
  "ollamaAvailable": true
}

Testing

# Test Ollama connection
curl http://localhost:11434/api/tags

# Test image generation (replace with actual values)
curl -X POST http://localhost:5173/api/generate-alt-text \
  -H "Content-Type: application/json" \
  -d '{"shortName":"brot","imageIndex":0}'

# Check bulk stats
curl http://localhost:5173/api/generate-alt-text-bulk

License & Credits

  • Uses Ollama for local AI inference
  • Image processing via Sharp
  • Vision model: Gemma3 (better German language support)