inference and long-running agents

Agent and inference lab.

focus

Wexpro Labs works on the systems around models: lower-cost inference below them, agents that keep working above them, and local tools that make existing agent workflows cheaper and more reliable.

01 compress context

02 serve models

03 run agents

04 ship probes

products

01 Workflow Automation

Done-for-you helpers for repetitive back-office work across portals, emails, PDFs, spreadsheets, and legacy apps.

02 Context Compression

A local background token saver for AI agents reading JSON, JSONL, CSV, and TSV.

03 inference

Lower latency and cost at the runtime layer.

04 long-running agents

Agents that keep state, resume work, and finish multi-step jobs.

05 experiments

Applied probes that turn model capability into working behavior.
