Latest · March 2025

v2.1.0

Adds webhook signature verification, a new cost-tracking hook, and performance improvements to the streaming API. No breaking changes from v2.0.0.

New Features

[feat] Webhook signature verification: New verifyWebhook() export from @vault/sdk/webhooks. Validates HMAC-SHA256 signatures on incoming webhook payloads.
[feat] onTokenUsage hook: New lifecycle hook that fires after every inference response with token counts and estimated cost. Use it to track cost without manual calculation.
[feat] vault.tokenize(): Counts tokens for a prompt without running inference. Useful for checking prompt length before making expensive requests.
[feat] vault-3-mini model: Ultra-low-latency model optimized for edge deployments. 32k context, significantly faster than vault-3-turbo.
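
For readers unfamiliar with what HMAC-SHA256 signature validation involves, the following is a minimal sketch of the general technique using Node's built-in crypto module. It is not the SDK's verifyWebhook() implementation. The helper name verifySignature, the hex encoding of the signature, and the choice to sign the raw request body are all assumptions made for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";
import { Buffer } from "node:buffer";

// Hypothetical helper illustrating HMAC-SHA256 verification in general.
// Assumptions: the sender signs the raw body and transmits the digest as hex.
export function verifySignature(
  rawBody: string,
  signatureHex: string,
  secret: string,
): boolean {
  // Recompute the digest over the exact bytes that were received.
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHex, "hex");

  // timingSafeEqual throws when lengths differ, so reject those up front.
  if (received.length !== expected.length) return false;

  // Constant-time comparison avoids leaking how many bytes matched.
  return timingSafeEqual(expected, received);
}
```

Whatever helper is used, verification should run against the unparsed request body, since re-serializing parsed JSON can change the bytes and invalidate the signature.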

Improvements

[perf] Streaming throughput: Reduced internal buffering in the async generator. First-token latency improved by ~15% on vault-3-turbo.
[dx] Better error messages: VaultError now includes the raw response body when available, making it easier to diagnose malformed requests.
[dx] TypeScript 5.4 support: Updated type definitions to take advantage of inferred type predicates and improved const type parameters.

Bug Fixes

[fix] Stream cancellation: Cancelling a stream mid-flight no longer throws an unhandled promise rejection in Node.js 20+.
[fix] Retry on 503: The retry logic was skipping 503 responses. It now retries them with exponential backoff.
[fix] Workspace header: The X-Vault-Workspace header was omitted when workspace was set to the string "default". It is now always sent.
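
To make the 503 fix concrete, here is a rough sketch of retrying with exponential backoff. It is an illustration of the general pattern, not the SDK's internal retry logic. The function names, the 250 ms base delay, the 8 s cap, and the retry limit of 3 are hypothetical values chosen for the example.

```typescript
// Statuses treated as transient in this sketch. Only 503 is named in these
// notes; the SDK's full list is not stated, so nothing else is assumed here.
const RETRYABLE_STATUSES = new Set<number>([503]);

export function isRetryable(status: number): boolean {
  return RETRYABLE_STATUSES.has(status);
}

// Delay doubles with each attempt (0-based) and is capped at capMs.
export function backoffDelayMs(attempt: number, baseMs = 250, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Calls send() until it returns a non-retryable status or retries run out.
// The sleep parameter is injectable so callers can skip real waiting in tests.
export async function withRetry<T extends { status: number }>(
  send: () => Promise<T>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const response = await send();
    if (!isRetryable(response.status) || attempt >= maxRetries) return response;
    await sleep(backoffDelayMs(attempt));
  }
}
```

Production retry loops often add random jitter to each delay so that many clients do not retry in lockstep; it is omitted here to keep the delays deterministic.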

Deprecations

[deprecated] vault.complete(): Renamed to vault.infer() for consistency. vault.complete() still works but will be removed in v3.0.0.