Costs & Usage

AI is not priced like traditional SaaS. Every conversation consumes tokens; every token has a cost. This section helps you track the spend and shrink it.


Three Things to Know

1. realvco Subscription ≠ AI Usage Fees

  • realvco monthly fee: pays for the host and operations
  • AI usage fees: paid to OpenAI / Anthropic / Google for API calls

These are billed separately. A realvco subscription gets you the host and companion framework; the AI API keys are yours (or procured via realvco).

2. Tokens Are the Unit — Not Message Count

  • 1 Chinese character ≈ 1.5–2 tokens
  • 1 English word ≈ 1.3 tokens
  • AI replies also count (output tokens typically cost 3–5× as much as input tokens)
  • Longer conversations compound — each request includes the full history

A typical short exchange (a 100-word question and a 300-word answer) is roughly 1,000 tokens.
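To see why longer conversations compound, here is a minimal sketch that totals the tokens billed over a conversation when every request re-sends the full history. The 250 / 750 split of a roughly 1,000-token exchange is an assumption for illustration, and the function name `conversation_tokens` is hypothetical; system prompts and tool calls are ignored.

```python
def conversation_tokens(turns, user_tokens, reply_tokens):
    """Total (input, output) tokens billed over a conversation.

    Assumes each request carries every earlier message plus the new
    user message, and that every turn has the same size.
    """
    total_input = 0
    total_output = 0
    history = 0  # tokens from all earlier turns
    for _ in range(turns):
        total_input += history + user_tokens   # full history is re-sent
        total_output += reply_tokens
        history += user_tokens + reply_tokens  # this turn joins the history
    return total_input, total_output


# Ten turns of an assumed 250-token question and 750-token answer.
total_in, total_out = conversation_tokens(10, 250, 750)
print(total_in, total_out)  # 47500 7500
```

Under these assumptions, ten exchanges bill 55,000 tokens in total rather than the 10,000 a per-message count would suggest, and almost all of the difference is re-sent history.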

3. Models Vary Wildly in Price

Same task, different costs — the gap can exceed 10× between models. See the pre-installed model list below for per-model pricing. Daily chat rarely needs the premium tier.


Pre-installed Model List

Once you’ve bought pre-installed API credit, realvco installs 4 OpenRouter models in Rose’s OpenClaw container ahead of time. Type /model <alias> mid-chat to switch models instantly (the full alias mechanism is covered in OpenClaw Overview). The four models are:

Alias | Model | Input / 1M tokens | Output / 1M tokens | Best for
----- | ----- | ----------------- | ------------------ | --------
gm | OpenAI GPT-5.4 Mini | live pricing | live pricing | Daily steady workhorse / settings & adjustments / config & env checks / general OpenClaw / Hermes-Agent operations
ds | DeepSeek V4 Pro | $0.435 | $0.87 | Heavy tasks & long context / complex logs / multi-step debugging
hk | Anthropic Claude Haiku 4.5 | $1.00 | $5.00 | High-stakes gatekeeping / Claude-style steady judgment
gf | Google Gemini 3.5 Flash | live pricing | live pricing | Multi-modal & high-tier backup / more expensive / final option

Pricing note: prices for the 2 models with public pricing (ds / hk) are OpenRouter’s numbers as of 2026-05-26. The 2 new / version-bumped models (gm / gf) follow live OpenRouter pricing; check the Usage sub-tab in admin-panel for the real-time numbers. Capabilities (and cost) rise as you go down the list.
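As a rough illustration of how the per-million prices in the table translate into the cost of one request, the sketch below uses only the two published prices (ds and hk). The names `PRICES` and `request_cost` are hypothetical, and gm / gf are left unpriced because the table defers them to live pricing.

```python
# USD per 1M tokens as (input, output), copied from the table above.
# gm and gf follow live OpenRouter pricing, so no number is assumed here.
PRICES = {
    "gm": None,
    "ds": (0.435, 0.87),
    "hk": (1.00, 5.00),
    "gf": None,
}


def request_cost(alias, input_tokens, output_tokens):
    """Cost in USD of one request on the given model alias."""
    price = PRICES[alias]
    if price is None:
        raise ValueError(f"{alias} uses live pricing; check the Usage sub-tab")
    in_price, out_price = price
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# One short exchange, assumed to be 250 input and 750 output tokens.
for alias in ("ds", "hk"):
    print(alias, f"${request_cost(alias, 250, 750):.6f}")
```

For this assumed exchange the hk request costs about five times the ds request, because the output price dominates when replies are longer than questions.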

Model escalation rule: switch after two stuck tries

The default model, gm, handles most everyday work. When something isn’t working, don’t keep forcing the same model; pick the right model by task type:

  1. Default GPT-5.4 Mini (/model gm) — daily steady workhorse: settings, config / env checks, general OpenClaw and Hermes-Agent operations.
  2. Complex logs, multi-step debugging, long context → switch to DeepSeek V4 Pro (type /model ds) — first pick for heavy tasks.
  3. Money, compliance, anything needing Claude-style steady judgment → switch to Claude Haiku 4.5 (type /model hk) — high-stakes gatekeeper.
  4. Need vision / multi-modal, or everything else failed → switch to Gemini 3.5 Flash (type /model gf) — more expensive, save it as final backup.

The “two-try rule”: if a model gets stuck twice, step to the next tier; don’t burn more than two attempts on the same one.

How to switch: type /model <alias> (e.g. /model ds, /model hk) mid-chat with Rose, then send — the next reply uses the new model.

Why this works: models have different strengths — settings work is fine on gm; long-context tasks like log triage run more steadily on ds; “can’t get this wrong” scenarios (money / compliance) are safest on hk; gf is reserved for multi-modal needs and the cases where everything else already failed.
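The task-type routing and the two-try rule can be sketched as a small lookup plus an escalation step. This is an illustration only: the task-type keys and function names are hypothetical, and treating "next tier" as the next alias in the list order gm, ds, hk, gf is an assumption based on the ordering above.

```python
# Aliases in list order, cheapest tier first (assumed escalation order).
ESCALATION_ORDER = ["gm", "ds", "hk", "gf"]

# Hypothetical task-type labels mapped to the first-pick alias.
FIRST_PICK = {
    "everyday": "gm",      # settings, config / env checks
    "heavy": "ds",         # complex logs, multi-step debugging, long context
    "high_stakes": "hk",   # money, compliance
    "multimodal": "gf",    # vision, or last resort
}

MAX_STUCK_TRIES = 2


def next_model(current, stuck_tries):
    """Alias to use next: stay put until the model has been stuck twice."""
    if stuck_tries < MAX_STUCK_TRIES:
        return current
    idx = ESCALATION_ORDER.index(current)
    # The last tier has nowhere further to escalate.
    return ESCALATION_ORDER[min(idx + 1, len(ESCALATION_ORDER) - 1)]


def switch_command(alias):
    """Chat command that switches Rose to the given alias."""
    return f"/model {alias}"


print(switch_command(FIRST_PICK["heavy"]))  # /model ds
print(switch_command(next_model("gm", 2)))  # /model ds
```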

How do Ada / Vi (Hermes-Agent) pick their model? Hermes-Agent doesn’t use OpenClaw’s fixed alias list. On first use, pick a model from Ada / Vi’s Settings sub-tab in admin-panel — the available models follow the OpenRouter model pool.

Customers on the older default model set: Rose was originally shipped with a different 4-model set (km / mm / sn / op, now-retired models). The current default is gm / ds / hk / gf — to move to the new set, re-pull the default in Version Upgrade.


Top 5 Quick Wins

If costs feel high today, work through these in order:

  1. Downshift daily chat to a cheap model: Rose’s default is gm (GPT-5.4 Mini). If you switched your primary model to a heavier one such as hk (Claude Haiku 4.5) or gf (Gemini 3.5 Flash), switching back to gm can cut the cost several times over
  2. Enable context compression — long conversations auto-summarize old turns, cutting history carried per request
  3. Cap response length — set maxTokens so the AI stops writing novels
  4. Set a monthly budget cap — stop before costs spiral
  5. Route high-volume work to Ada / Vi (Hermes-Agent): they are faster and cheaper than Rose, so split the traffic between them and Rose

Each has step-by-step instructions in Cost Optimization.
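To show how quick wins 3 and 4 reinforce each other, the sketch below treats maxTokens as the upper bound on output tokens and checks a request's worst-case cost against a monthly cap before sending it. `BudgetGuard` and its methods are hypothetical and are not realvco's implementation; the prices in the usage lines are the hk numbers from the model table.

```python
from dataclasses import dataclass


@dataclass
class BudgetGuard:
    """Tracks monthly spend and rejects requests that could exceed the cap."""
    monthly_cap_usd: float
    spent_usd: float = 0.0

    def worst_case_cost(self, input_tokens, max_tokens, in_price, out_price):
        # Prices are USD per 1M tokens; max_tokens bounds the reply length.
        return (input_tokens * in_price + max_tokens * out_price) / 1_000_000

    def allow(self, input_tokens, max_tokens, in_price, out_price):
        cost = self.worst_case_cost(input_tokens, max_tokens, in_price, out_price)
        return self.spent_usd + cost <= self.monthly_cap_usd

    def record(self, cost_usd):
        # Call with the actual billed cost once the reply has arrived.
        self.spent_usd += cost_usd


guard = BudgetGuard(monthly_cap_usd=10.0, spent_usd=9.99)
print(guard.allow(2000, 500, 1.00, 5.00))   # True: a capped reply still fits
print(guard.allow(2000, 4000, 1.00, 5.00))  # False: a long reply could overshoot
```

A lower maxTokens shrinks the worst case, so the same remaining budget admits more requests before the cap is reached.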