NEXOCLIP

The AI growth engine for streamers.

Drop a VOD. AI scores every clip on virality, generates the hook, routes it to the right streamer's brand, and queues publish with an undo window. The editor's still there when you need it — most mornings you won't.

Why it's different

AI scoring on every clip

Viral score, hook strength, caption readability, dead-air risk — computed from the multimodal signals NexoClip already collects (rescore, motion, face presence, words-per-sec). The operator picks the AI's top 3 and ships.

Hook generator, 5 tones

One click, 5 viral title candidates in the streamer's voice. Tone presets: aggressive, Gen Z, corporate, curious, default. Click a candidate to drop it into the title overlay.

Intelligence timeline

Per-second markers under the preview: audio peaks, scene cuts, laughter reactions, chat-heat spikes, face-emotion changes. Click any marker to seek. Spot the moment that goes viral before you watch the clip.

Voice-marker triggers

Streamers say "clipea esto" (clip the next 30s) or "clipeaste eso" (clip the previous 60s) on stream as natural verbal bookmarks. Trigger phrases can be customized per brand kit.
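
As a rough illustration of how a spoken phrase could become a clip window, here is a minimal sketch. It assumes the transcript arrives as (word, start-second) pairs and that the window is anchored at the first word of the phrase; `Trigger` and `find_marker_windows` are hypothetical names, not NexoClip's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    phrase: str          # spoken phrase, lowercase
    offset_start: float  # seconds relative to the phrase timestamp
    offset_end: float


# The two phrases named above: next 30s and previous 60s.
TRIGGERS = [
    Trigger("clipea esto", 0.0, 30.0),
    Trigger("clipeaste eso", -60.0, 0.0),
]


def find_marker_windows(words, triggers=TRIGGERS, vod_duration=None):
    """Return sorted (start, end) windows for each trigger phrase found.

    `words` is a list of (word, start_sec) tuples in spoken order
    (an assumed transcript shape).
    """
    tokens = [w.lower().strip(".,!?¡¿") for w, _ in words]
    windows = []
    for trig in triggers:
        parts = trig.phrase.split()
        n = len(parts)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == parts:
                t = words[i][1]  # anchor at the first word of the phrase
                start = max(0.0, t + trig.offset_start)
                end = t + trig.offset_end
                if vod_duration is not None:
                    end = min(end, vod_duration)
                if end > start:
                    windows.append((start, end))
    return sorted(windows)
```

A per-brand-kit phrase list would simply be a different `triggers` argument.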

Per-speaker brand kits

Multi-streamer VOD? Speaker diarization routes each clip to the right host's colors, fonts, handles, and captions. Speaker identities persist across VODs via embedding match. The differentiator most clipping tools don't ship.
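
One common way to do this kind of embedding match is cosine similarity against stored reference vectors. The sketch below assumes that approach; the function names and the 0.75 threshold are illustrative, not values taken from NexoClip.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_speaker(embedding, known, threshold=0.75):
    """Return the id of the closest known speaker, or None if none is close enough.

    `known` maps speaker id -> reference embedding (hypothetical storage shape).
    """
    best_id, best_score = None, threshold
    for speaker_id, reference in known.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_id, best_score = speaker_id, score
    return best_id
```

A `None` result would mean the voice is new, so the clip has no brand kit to resolve yet.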

Local GPU transcription

faster-whisper runs on your own GPU. Stream audio never leaves your machine for transcription — only the LLM caption-generation step calls out to Anthropic.

Auto-publish with undo

Trusted brand kits queue with a scheduled-for + undo window. Untrusted kits land in the inbox grouped by VOD/speaker. Same flow either way.
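
The trusted/untrusted split and the undo window can be pictured with a small sketch. Everything here is an assumption for illustration: the names, the one-hour delay, the 30-minute undo window, and the rule that a clip publishes only once both the scheduled time and the undo window have passed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class QueuedPublish:
    clip_id: str
    scheduled_for: datetime
    undo_until: datetime
    cancelled: bool = False


def route_clip(clip_id, kit_trusted, now,
               delay=timedelta(hours=1), undo_window=timedelta(minutes=30)):
    """Trusted kits get a queue entry; untrusted kits return None (inbox)."""
    if not kit_trusted:
        return None
    return QueuedPublish(clip_id, now + delay, now + undo_window)


def undo(item, now):
    """Cancel a queued publish if the undo window is still open."""
    if item.cancelled or now >= item.undo_until:
        return False
    item.cancelled = True
    return True


def is_due(item, now):
    """A clip publishes once it is scheduled and can no longer be undone."""
    return not item.cancelled and now >= max(item.scheduled_for, item.undo_until)
```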

Native to AI agents

Every action is an MCP tool. Agents can ingest a VOD, score clips, pick winners, and publish — without a browser session. Built for the era where the operator is half-human, half-agent.

The growth loop

The one thing a streamer (or their agency) touches each morning:

  1. ingest: Drop VOD · watch Drive · pull from platform

    Drag-drop, OBS-to-Drive auto-watch, or Twitch / Kick VOD pull. Single ingest endpoint, three sources.

  2. diarize + transcribe: Speakers labeled, words timestamped

    pyannote-audio + faster-whisper, both on your GPU. Audio never leaves your machine.

  3. detect + score: Multimodal candidate-finding + AI scoring

    Voice markers, chat heat, audio peaks, scene cuts → candidate windows. Each candidate gets a viral score, hook strength, and dead-air risk.

  4. cut + brand: Vertical clips routed to the right host

    ffmpeg cuts each window, smart-crops 9:16 around the active face, applies the resolved speaker's brand kit.

  5. hook + variants: Title + caption + hashtags per platform

    Claude generates 5 viral-hook titles per tone preset, captions per persona, hashtags per platform. Operator picks one or skips and accepts the AI default.

  6. ship: Auto-publish with undo · or manual review

    Trusted brand kits queue with a scheduled-for + undo window. Untrusted kits land in the inbox grouped by VOD/speaker. Same flow either way.
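
To make the cut + brand step above more concrete, here is a hedged sketch of how a 9:16 crop around a face could be turned into an ffmpeg invocation. `build_cut_command` and its parameters are hypothetical; only the ffmpeg options (`-ss`, `-i`, `-t`, `-vf crop=w:h:x:y`) are standard. It assumes a single fixed face position per clip, which is simpler than tracking a moving face.

```python
def build_cut_command(src, dst, start, end, face_center_x, src_width, src_height):
    """Build an ffmpeg argv that cuts [start, end) and crops to 9:16 around a face.

    The crop keeps the full frame height and centers horizontally on
    `face_center_x`, clamped so the crop box stays inside the frame.
    """
    crop_h = src_height
    crop_w = int(src_height * 9 / 16) // 2 * 2    # even width for common encoders
    crop_w = min(crop_w, src_width // 2 * 2)      # source narrower than 9:16
    x = int(face_center_x - crop_w / 2)
    x = max(0, min(x, src_width - crop_w))
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.3f}",          # seek before decoding the input
        "-i", src,
        "-t", f"{end - start:.3f}",     # clip duration
        "-vf", f"crop={crop_w}:{crop_h}:{x}:0",
        dst,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; brand-kit overlays would be additional filters, not shown here.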

Frequently asked

Why call it a "growth engine" instead of a clip editor?
Clip editors give you a timeline and trim handles. NexoClip is upstream of editing: AI scores every clip on virality dimensions, generates the hook, routes to the right brand kit, and queues publish with an undo window. The operator's job shifts from "find and cut clips" to "pick the AI's top 3 and let it ship". The editor's still there when you need it — most mornings you won't.
Who is NexoClip for?
Streamers who want clips ready by morning; multi-host stream collectives that need per-streamer branding within one VOD; agencies running clipping for several creators at once; and agents or MCP clients that want to drive a clipping pipeline programmatically (we're the only stream-growth OS built agent-first).
Do I need a GPU?
A consumer GPU (RTX 4060+ recommended) handles Whisper + pyannote comfortably for ~4hr VODs. CPU-only mode works for short clips (set NEXOCLIP_WHISPER_DEVICE=cpu) but is much slower.
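
As an illustration of how the NEXOCLIP_WHISPER_DEVICE switch might map onto faster-whisper settings, here is a small sketch. The `cuda` fallback when the variable is unset, the accepted values, and the compute types are assumptions; `whisper_settings` is a hypothetical helper.

```python
import os


def whisper_settings(env=None):
    """Resolve faster-whisper device settings from NEXOCLIP_WHISPER_DEVICE."""
    env = os.environ if env is None else env
    device = env.get("NEXOCLIP_WHISPER_DEVICE", "cuda").strip().lower()
    if device not in {"cpu", "cuda", "auto"}:
        raise ValueError(f"unsupported NEXOCLIP_WHISPER_DEVICE: {device!r}")
    # int8 keeps CPU inference tolerable; float16 is a typical GPU choice.
    compute_type = {"cpu": "int8", "cuda": "float16", "auto": "default"}[device]
    return {"device": device, "compute_type": compute_type}


# Usage with faster-whisper (requires the package and a model download):
# from faster_whisper import WhisperModel
# model = WhisperModel("small", **whisper_settings())
```
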
Which LLM does NexoClip use?
Anthropic Claude exclusively for the live surface. The router supports adding more providers later without code changes, but Claude is what ships — the structured-output reliability is essential for the hook generator, variant generator, and viral-moment selector.
How do I integrate NexoClip into an agent workflow?
Run nexoclip mcp serve --token <api-token> to expose the MCP server over stdio. Claude Code, Cursor, and other MCP clients will see tools for listing streams, kicking off pipelines, fetching clips, and managing brand kits. The same tenant token gates the JSON REST API at /streams, /clips, etc.
What about retention and data privacy?
Retention windows are set per tenant and are configurable (defaults: 30 days for VODs, 90 for clips, 365 for transcripts). A daily sweeper hard-deletes artifacts past their cutoff; there is no soft-delete bin. Audio + video stay on your machine for transcription; only the LLM caption-generation step calls out to Anthropic.
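
The sweeper's cutoff logic can be sketched roughly as follows. The defaults are the ones stated in the answer; the artifact shape (`id`, `kind`, `created_at`) and the function name are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Default retention per artifact kind, in days (from the answer above).
DEFAULT_RETENTION_DAYS = {"vod": 30, "clip": 90, "transcript": 365}


def expired_artifact_ids(artifacts, now, retention_days=DEFAULT_RETENTION_DAYS):
    """Return ids of artifacts older than their kind's retention window."""
    expired = []
    for artifact in artifacts:
        cutoff = now - timedelta(days=retention_days[artifact["kind"]])
        if artifact["created_at"] < cutoff:
            expired.append(artifact["id"])
    return expired


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
artifacts = [
    {"id": "v1", "kind": "vod", "created_at": datetime(2025, 4, 1, tzinfo=timezone.utc)},
    {"id": "c1", "kind": "clip", "created_at": datetime(2025, 4, 1, tzinfo=timezone.utc)},
    {"id": "t1", "kind": "transcript", "created_at": datetime(2024, 5, 1, tzinfo=timezone.utc)},
]
to_delete = expired_artifact_ids(artifacts, now)
```

A tenant override would be a different `retention_days` mapping passed to the same function.
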
Is there an API I can call directly?
Yes. OpenAPI spec + interactive Swagger UI. Issue a token via nexoclip tokens issue --tenant <id> --scope full and pass it as Authorization: Bearer <token>.
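
A call against the REST API could look roughly like this, using only the Python standard library. The base URL is a placeholder and `build_request` is a hypothetical helper; only the `/streams` path and the `Authorization: Bearer <token>` header come from the answers above.

```python
import urllib.request


def build_request(base_url, path, token):
    """Build an authenticated GET request for a NexoClip REST endpoint."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


# Usage (placeholder host; needs a running server and a real token):
# import json
# req = build_request("http://localhost:8000", "/streams", "<token>")
# with urllib.request.urlopen(req) as resp:
#     streams = json.load(resp)
```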

NexoClip · multi-tenant SaaS for VOD-to-clip workflows