Image input
Send images to the model — paste them with Ctrl+V in the CLI, or pass URLs, data URIs, file paths, and base64 to session.prompt() in the SDK.
harnext accepts multimodal input. Interactively you paste an image from the clipboard; programmatically you pass images alongside the text prompt and harnext resolves each to the base64 form providers expect.
SDK: session.prompt(text, images)
session.prompt() takes an optional second argument: a list of ImageInput values. The union has three members, which together cover four kinds of source (URL, data: URI, file path, and raw base64):
```typescript
type ImageInput =
  | ImageContent                        // { type: 'image', data: '<base64>', mimeType?: string }
  | { url: string; mimeType?: string }  // http(s) URL or data: URI
  | string;                             // http(s) URL, data: URI, or local file path
```

```typescript
import { createAgentSession, type ImageInput } from '@harnext/core';

const { session } = await createAgentSession({ provider, modelId });

await session.prompt('describe this', [
  { url: 'https://example.com/cat.png' },                   // http(s) → fetched
  'data:image/png;base64,iVBORw0KGgo…',                     // data: URI
  '/path/to/local.jpg',                                     // file path → read
  { type: 'image', data: '<b64>', mimeType: 'image/png' },  // raw base64
]);
```

How resolution works
resolveImages(inputs) (and its single-input counterpart resolveImage(input)) normalize every form to ImageContent: http(s) URLs are fetched, file paths are read, data: URIs are decoded, and the bytes are base64-encoded.
- MIME type is taken from the data: URI, the HTTP response, or the file extension, or from the explicit mimeType you pass.
- Base64-only transport. pi-ai carries images as base64; each provider transform then emits the per-API shape. You always hand harnext source images; it deals with the wire format.
- Size cap. MAX_IMAGE_BYTES (20 MB) bounds a single image, so an oversized input fails fast rather than at the provider.
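The resolution rules above can be illustrated with a small standalone sketch. It is not harnext's implementation: every helper name here (classifySource, pickMimeType, decodedByteLength, assertWithinCap) is hypothetical, the extension table is an assumed subset, "20 MB" is read as 20 * 1024 * 1024 bytes, and letting an explicit mimeType take precedence is an assumption the page does not confirm.

```typescript
// Hypothetical sketch only: none of these helpers are harnext APIs.
type SourceKind = 'url' | 'data-uri' | 'file';

// Assumption: "20 MB" means 20 * 1024 * 1024 bytes. The real constant is
// MAX_IMAGE_BYTES, exported from @harnext/core.
const ASSUMED_MAX_IMAGE_BYTES = 20 * 1024 * 1024;

// Assumed extension table; the formats harnext recognises may differ.
const MIME_BY_EXTENSION: Record<string, string> = {
  png: 'image/png',
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  gif: 'image/gif',
  webp: 'image/webp',
};

// Decide how a plain string input would have to be loaded.
function classifySource(input: string): SourceKind {
  if (/^https?:\/\//i.test(input)) return 'url';
  if (/^data:/i.test(input)) return 'data-uri';
  return 'file';
}

// Extract the media type from a data: URI such as data:image/png;base64,...
function mimeFromDataUri(uri: string): string | undefined {
  const match = /^data:([^;,]+)[;,]/i.exec(uri);
  return match?.[1]?.toLowerCase();
}

// Look the file extension up in the assumed table.
function mimeFromExtension(path: string): string | undefined {
  const dot = path.lastIndexOf('.');
  if (dot < 0) return undefined;
  return MIME_BY_EXTENSION[path.slice(dot + 1).toLowerCase()];
}

// Assumed precedence: an explicit mimeType wins, otherwise use the source.
function pickMimeType(input: string, explicit?: string): string | undefined {
  if (explicit) return explicit;
  const kind = classifySource(input);
  if (kind === 'data-uri') return mimeFromDataUri(input);
  if (kind === 'file') return mimeFromExtension(input);
  // For a URL the type would come from the HTTP response after fetching,
  // which this offline sketch does not do.
  return undefined;
}

// Number of bytes a base64 string decodes to, without decoding it.
function decodedByteLength(base64: string): number {
  const unpadded = base64.replace(/=+$/, '');
  return Math.floor((unpadded.length * 3) / 4);
}

// Fail fast when the decoded image would exceed the cap.
function assertWithinCap(base64: string, maxBytes: number = ASSUMED_MAX_IMAGE_BYTES): void {
  const size = decodedByteLength(base64);
  if (size > maxBytes) {
    throw new Error(`image is ${size} bytes, over the ${maxBytes}-byte cap`);
  }
}
```

In real code you would call resolveImages or resolveImage from @harnext/core rather than reimplementing any of this; the sketch only makes the order of decisions concrete.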
Exports
From @harnext/core: resolveImages, resolveImage, the ImageInput and ImageContent types, and MAX_IMAGE_BYTES. Resolve and validate ahead of time if you want to surface errors before prompting.
CLI: paste with Ctrl+V
At the interactive prompt, Ctrl+V grabs an image from the system clipboard, base64-encodes it, and attaches it to your next message. A 🖼 N chip in the footer shows how many images are pending, and image-only prompts (empty text) are allowed — paste and press enter.
Clipboard tools per OS
| OS | Tool used |
|---|---|
| Linux | xclip (X11) or wl-paste (Wayland) |
| macOS | pngpaste, falling back to pbpaste |
| Windows | PowerShell clipboard access |
- No image on the clipboard? Ctrl+V pastes the clipboard text instead, as usual.
- No clipboard tool installed? harnext prints a one-time hint on how to install one, so the failure is never silent.
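As a rough illustration of the table above, the per-OS choice could be expressed as an ordered list of tools to try. This is a sketch, not harnext's code: clipboardToolCandidates is a hypothetical helper, the platform strings follow Node's process.platform values, 'powershell' as an executable name is an assumption, and using WAYLAND_DISPLAY to prefer wl-paste on Linux is an assumed heuristic the page does not state.

```typescript
// Hypothetical helper, not a harnext API: lists the clipboard tools from
// the table in the order they might be tried.
type Platform = 'linux' | 'darwin' | 'win32';

function clipboardToolCandidates(
  platform: Platform,
  env: Record<string, string | undefined> = {},
): string[] {
  // macOS: pngpaste first, falling back to pbpaste.
  if (platform === 'darwin') return ['pngpaste', 'pbpaste'];
  // Windows: PowerShell clipboard access (executable name assumed).
  if (platform === 'win32') return ['powershell'];
  // Linux. Assumption: a set WAYLAND_DISPLAY marks a Wayland session,
  // so wl-paste is tried before xclip there, and xclip first otherwise.
  return env['WAYLAND_DISPLAY'] ? ['wl-paste', 'xclip'] : ['xclip', 'wl-paste'];
}
```

In a Node program the arguments would typically be process.platform and process.env. Only the three platforms in the table are covered.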
Source
Multimodal input shipped in QualityUnit/harnext#48. See the announcement post for the why, and the API reference for the exported types.