Thinking mode

Thinking mode enables a reasoning pass before the model produces its visible answer. This trades time-to-first-token for substantially better multi-step reasoning, planning, and tool-use decisions.

Toggle

In the REPL:

/thinking on
/thinking off
/thinking          # toggle current state

The setting persists for the current session only. Restarting the CLI falls back to the server default.

Whether thinking defaults to ON or OFF is set server-side. IsonForge ships with thinking ON by default.

When to use

Multi-step refactors that touch many files.
Debugging where the root cause isn't obvious.
Architecture decisions ("should this be a queue or a webhook?").
Long planning chains (5+ tool calls).
Anything where being right matters more than being fast.

When to turn off

Quick single-file edits ("rename this variable everywhere").
Tight loops on print-mode pipelines where latency matters.
Status queries ("what's the diff?").
Trivial syntax fixes.

What you see

When thinking is on, IsonForge shows a live "🧠 Thinking" panel above the output:

┌─ 🧠 Thinking ─┐
│ Looking at auth/session.py first to understand the current...
│ The constructor takes a TTL config. I'll need to preserve...
│ Redis with pipelining would be cleanest. Let me check if...
└──────────────┘

The panel shows a rolling 12-line window so it doesn't fill your terminal. When the model finishes reasoning and starts producing the visible answer, the panel collapses into a "Thinking..." entry in your scrollback, which you can expand later via /sessions.

Latency

Thinking typically adds 30-60 seconds on tool-heavy turns, and sometimes more when the reasoning is deep. That is the cost: the model does more compute before responding.

For interactive sessions this is usually worth it. For batch scripts in -p mode where the same task runs 100 times, consider --effort low instead, which turns off thinking and uses lower sampling.
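
To get a feel for what this means in batch use, here is a small arithmetic sketch in shell. It multiplies the 30-60 second range quoted above by the 100-run example. The function name added_latency_minutes is made up for illustration, and the estimate assumes the runs execute one after another and that every run pays the full thinking overhead.

```shell
#!/bin/sh
# Rough estimate only: total extra wall-clock minutes that thinking adds
# to a sequential batch. added_latency_minutes is a hypothetical helper,
# not part of IsonForge.
# Usage: added_latency_minutes RUNS EXTRA_SECONDS_PER_RUN
added_latency_minutes() {
    echo $(( $1 * $2 / 60 ))
}

added_latency_minutes 100 30   # prints 50  (low end of the range)
added_latency_minutes 100 60   # prints 100 (high end of the range)
```

Under those assumptions, a 100-run batch spends somewhere between 50 and 100 extra minutes on thinking, which is why turning it off for repetitive pipelines can pay off.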

Programmatic control

Print mode:

# Default depends on server, override explicitly:
ISONFORGE_THINKING=1 isonforge -p "complex task"
ISONFORGE_THINKING=0 isonforge -p "quick task"

There is currently no thinking field in settings.json. Use the environment variable or the session toggle instead.
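
If a script mixes complex and quick prompts, the per-command override can be wrapped in a small helper. The following is a minimal sketch, not an official interface: only ISONFORGE_THINKING and the -p flag come from this page, while run_with_thinking and FORGE_BIN are hypothetical names introduced for the example (FORGE_BIN exists only so the binary can be swapped out when testing).

```shell
#!/bin/sh
# Sketch: run one print-mode prompt with thinking forced on or off.
# FORGE_BIN and run_with_thinking are hypothetical names for this example.
FORGE_BIN="${FORGE_BIN:-isonforge}"

# Usage: run_with_thinking on|off PROMPT
run_with_thinking() {
    mode="$1"
    prompt="$2"
    case "$mode" in
        on)  flag=1 ;;
        off) flag=0 ;;
        *)   echo "usage: run_with_thinking on|off PROMPT" >&2
             return 2 ;;
    esac
    # The assignment prefix applies the variable to this one invocation,
    # mirroring the ISONFORGE_THINKING=1 isonforge -p "..." form above.
    ISONFORGE_THINKING="$flag" "$FORGE_BIN" -p "$prompt"
}

# Example calls (commented out so that sourcing this file runs nothing):
# run_with_thinking on  "complex task"
# run_with_thinking off "quick task"
```

Setting the variable per call rather than exporting it once keeps each step's choice explicit, so a later step never silently inherits an earlier override.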

Interaction with effort levels

The --effort flag controls thinking too:

Effort	Thinking
`low`	OFF
`medium`	follows server default
`high`	ON
`xhigh`	ON + more output tokens
`max`	ON + use full token budget

See Effort levels.
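
For scripts that need to know up front whether a given effort level implies thinking, the table can be restated as a lookup. This is only an illustrative sketch of the mapping listed above, not how IsonForge implements it; thinking_for_effort and its output strings are hypothetical, and it deliberately ignores the token-budget differences between high, xhigh, and max.

```shell
#!/bin/sh
# Sketch: echo the thinking state that the effort table lists for a level.
# thinking_for_effort is a hypothetical helper, not an IsonForge command.
thinking_for_effort() {
    case "$1" in
        low)            echo "off" ;;
        medium)         echo "server-default" ;;
        high|xhigh|max) echo "on" ;;
        *)              echo "unknown effort level: $1" >&2
                        return 1 ;;
    esac
}

thinking_for_effort low      # prints off
thinking_for_effort medium   # prints server-default
thinking_for_effort max      # prints on
```

A batch script could use a check like this to warn when a latency-sensitive step is about to run at an effort level that turns thinking on.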

Mid-stream abort

If you press Ctrl+C during the thinking phase, IsonForge persists what it has captured so far as a "Thinking..." entry in scrollback, so no reasoning is lost.

Caveats

The reasoning text is the model's internal chain-of-thought. It can be wrong, contradict the final answer, or contain tentative ideas. Don't rely on it as the answer; the visible answer that follows is what counts.
Long reasoning chains consume tokens against your context window. When heavy thinking combines with heavy tool output, running /compact becomes worthwhile.
Reasoning output is persisted to session files. /export includes it. If you share an export, you share the thinking.