This question from @Fabio Genovese _ArtigianoDelSoftware_ comes up more often than you’d think, especially for teams with visual-heavy documentation:
“Can Rovo analyze images in KB articles and describe diagrams for users?”
Short answer: Not reliably (yet).
What Rovo can do today
Rovo can:
- Reference KB articles that contain images
- Surface those articles in responses
- Preserve images for users to view in the original content
So users can still:
- See manuals
- Follow diagrams
- Access visual documentation
What Rovo cannot do (today)
Rovo does not reliably interpret image content. That means you cannot count on it to:
- Read diagrams
- Understand flow arrows or visual relationships
- Extract meaning from labels inside images
- Accurately describe what’s happening in a visual
If it does describe an image, it’s usually based on:
- Surrounding text
- Captions
- Context clues
That is inference from nearby text, not true image understanding.
Why this matters for your use case
In Fabio’s scenario:
- Users rely on diagrams
- Images carry key meaning
- The goal is to explain those visuals
Right now:
- Rovo can point users to the right place
- But it cannot interpret the image for them
What customers are doing as a workaround
Teams are getting good results by making images more “AI-readable”:
1. Add descriptive captions
- Explain what the image shows
- Include key steps or relationships
2. Use alt text strategically
- Treat it as searchable, meaningful content
- Not just accessibility filler
3. Pair diagrams with short explanations
- Even 1–2 sentences make a big difference
4. Store critical visuals with supporting text
- Ensure Rovo has something to index
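If you want to find which articles need this treatment first, a small audit script can help. The sketch below is only an illustration: it assumes you can export an article as plain HTML with standard <img> tags, and the names ImageAudit, audit_images, and the four-word minimum are hypothetical choices, not anything Rovo or Confluence provides. Confluence's own storage format may represent images differently, so the tag and attribute names would need adjusting.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Collect <img> tags whose alt text is missing or very short."""

    def __init__(self, min_alt_words=4):
        super().__init__()
        # Assumption: fewer than 4 words is unlikely to describe a diagram.
        self.min_alt_words = min_alt_words
        self.flagged = []  # list of (src, reason) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        # A valueless or absent alt attribute is treated as empty text.
        alt = (attributes.get("alt") or "").strip()
        if not alt:
            self.flagged.append((src, "missing alt text"))
        elif len(alt.split()) < self.min_alt_words:
            self.flagged.append((src, "alt text too short"))


def audit_images(html, min_alt_words=4):
    """Return the images in an HTML string that need better alt text."""
    parser = ImageAudit(min_alt_words)
    parser.feed(html)
    parser.close()
    return parser.flagged


# Hypothetical exported article body.
sample = """
<p>Reset flow:</p>
<img src="reset-flow.png">
<img src="login.png" alt="Login">
<img src="escalation.png" alt="Ticket moves from Tier 1 to Tier 2 after 24 hours without a response">
"""

for src, reason in audit_images(sample):
    print(f"{src}: {reason}")
```

On the sample above, this flags reset-flow.png and login.png and leaves escalation.png alone. A word count is a crude proxy for quality, so treat the output as a to-do list for a human editor rather than a pass/fail check.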