Image input

IsonForge supports multimodal vision via the underlying PRME 26.1 backend. You can attach images to a prompt, and the agent reasons over them the same way it reasons over text.

Attach an image

One-shot from the command line:

isonforge --image ./screenshot.png "what's wrong with this UI?"

Multiple images:

isonforge --image ./before.png --image ./after.png \
  "describe the diff between these two states"

Inside the REPL:

/image ./error-screenshot.png
> what does this error mean?

The image is base64-encoded and attached to your next message.

Constraints

Format: PNG or JPG only.
Max size: 8 MB per image.
Max images per turn: a few in practice; very high counts may exhaust the context window.

Files over 8 MB and files with unsupported extensions are skipped with a warning.
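
Because a skipped file only produces a warning, a script may want to check its inputs before calling isonforge. A minimal POSIX shell sketch of such a pre-flight check follows. check_image is a hypothetical helper, not part of IsonForge, and it makes three assumptions the text above does not settle: extensions are matched case-insensitively, .jpeg is not accepted, and "8 MB" means 8 * 1024 * 1024 bytes.

```shell
# check_image FILE
# Hypothetical pre-flight check; succeeds only if FILE looks attachable.
# Assumptions: case-insensitive extension match, ".jpeg" not accepted,
# and 8 MB taken as 8 * 1024 * 1024 bytes.
check_image() {
  file=$1
  max_bytes=$((8 * 1024 * 1024))
  lower=$(printf '%s' "$file" | tr '[:upper:]' '[:lower:]')
  case "$lower" in
    *.png | *.jpg) ;;
    *)
      echo "skip: unsupported extension: $file" >&2
      return 1
      ;;
  esac
  if [ ! -f "$file" ]; then
    echo "skip: not a regular file: $file" >&2
    return 1
  fi
  # The arithmetic expansion strips the padding some wc builds emit.
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -gt "$max_bytes" ]; then
    echo "skip: larger than 8 MB: $file" >&2
    return 1
  fi
}
```

For example, check_image ./screenshot.png && isonforge --image ./screenshot.png "what's wrong with this UI?" only runs the prompt when the file passes.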

What it's good at

UI screenshots. "This page looks broken - what's wrong?"
Error dialogs. "What does this error mean? Should I worry?"
Diagrams. "Explain this architecture diagram."
Whiteboards. "Read what's on this whiteboard and turn it into a markdown spec."
Code screenshots. These work, though pasting the code as text is always better.
Mockups. "Build this UI based on the mockup."

What it struggles with

Tiny text in dense screenshots. Crop to the relevant section instead of sending the full screen.
Very long screenshots / scroll captures. Break them into chunks.
Handwriting in many languages. Latin script, Bahasa Indonesia, and English work well; mixed scripts and stylized fonts are harder.
Photos of code at an angle. Take a screenshot or paste the text instead.
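
Breaking a long capture into chunks comes down to choosing vertical crop windows. The POSIX shell sketch below only computes those windows and leaves the actual cropping to whatever image tool you use. chunk_windows is a hypothetical helper, and the chunk height and overlap are arbitrary illustration values, not IsonForge recommendations. A small overlap keeps a line of text that straddles a cut readable in at least one chunk.

```shell
# chunk_windows HEIGHT CHUNK OVERLAP
# Hypothetical helper: prints one "top height" pair per line so that the
# windows cover pixel rows 0..HEIGHT-1, each sharing OVERLAP rows with
# the previous window.
chunk_windows() {
  total=$1
  chunk=$2
  overlap=$3
  # Reject parameters that would never advance (and so loop forever).
  if [ "$chunk" -le 0 ] || [ "$overlap" -lt 0 ] || [ "$overlap" -ge "$chunk" ]; then
    echo "chunk_windows: need chunk > overlap >= 0" >&2
    return 2
  fi
  top=0
  while [ "$top" -lt "$total" ]; do
    height=$chunk
    # Clamp the last window to the bottom edge of the image.
    if [ $((top + height)) -gt "$total" ]; then
      height=$((total - top))
    fi
    echo "$top $height"
    # Stop once this window reaches the bottom edge.
    if [ $((top + height)) -ge "$total" ]; then
      break
    fi
    top=$((top + chunk - overlap))
  done
}
```

For a 5000-pixel-tall capture, chunk_windows 5000 2000 200 prints three windows: "0 2000", "1800 2000", and "3600 1400".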

Pattern: paired image + repo

/image ./design.png
> implement this design as a React component. Match the spacing exactly.
  Look at our existing components in src/components/ for style conventions.

The agent uses the image as the visual spec, reads the existing components for style conventions, and writes the new component.

Pattern: error triage

isonforge --image ./browser-error.png "fix this error"

The agent reads the error text from the screenshot, traces it to the relevant file in the codebase, and proposes a fix.

Where images go

Image data lives in your session's messages. It's saved to ~/.isonforge/sessions/<id>.json as base64. This can make session files large - a few MB if you attach multiple high-res images.
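
Base64 turns every 3 bytes into 4 characters, so an attached image takes up roughly a third more space in the session file than it does on disk. A back-of-the-envelope sketch in POSIX shell follows, assuming standard padded base64 without line breaks and ignoring the surrounding JSON. b64_size is a hypothetical helper, not an IsonForge command.

```shell
# b64_size BYTES
# Hypothetical helper: prints the length in characters of the padded
# base64 encoding of BYTES bytes, i.e. 4 * ceil(BYTES / 3).
b64_size() {
  echo $(( ($1 + 2) / 3 * 4 ))
}
```

An image at the 8 MB limit (taken here as 8388608 bytes) comes out at 11184812 characters, so a handful of large attachments adds up quickly.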

/clear clears the session including image data. /export includes images in the export.

Not yet supported

Video (frame-by-frame).
Audio.
PDF as image (extract pages first with pdftoppm).
SVG (rasterize first with rsvg-convert).
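
The two conversions mentioned above can be wrapped as follows. This is a sketch, assuming pdftoppm (from Poppler) and rsvg-convert (from librsvg) are installed. pdf_to_pngs and svg_to_png are hypothetical wrapper names, and the 150 DPI resolution is an arbitrary choice.

```shell
# pdf_to_pngs FILE.pdf PREFIX
# Hypothetical wrapper: renders each page to PREFIX-1.png, PREFIX-2.png, ...
# (pdftoppm may zero-pad the page number for longer documents).
pdf_to_pngs() {
  pdftoppm -png -r 150 "$1" "$2"
}

# svg_to_png IN.svg OUT.png
# Hypothetical wrapper: rasterizes an SVG at its intrinsic size.
svg_to_png() {
  rsvg-convert -o "$2" "$1"
}
```

The resulting PNG files can then be attached with --image like any other image, subject to the size limit in Constraints.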

Privacy

When you attach an image, the binary data is sent to the IsonAI backend for inference. Like any prompt, the data flows through the gateway. For sensitive screenshots (PII, internal dashboards), make sure your IsonAI deployment and retention policy match your compliance requirements.