Payload schemas

Function-call entrypoints carry an optional args_schema (and return_schema) — JSON Schema fragments that describe the inbound argument payload and the return value. mvmforge uses them in three places:

  1. Build-time validation. The host walks the schema for secret-shaped field names and rejects anything matching the closed token list (token, password, secret, apikey, …) with E_SECRET_IN_SCHEMA.
  2. Wrapper-side runtime validation. Once the upstream-mvm factory is wired, the wrapper validates the decoded inbound payload against args_schema before dispatching, and the return value against return_schema before encoding the reply. This is the user-visible payoff: a typed function gets a typed wire contract for free.
  3. Documentation surface. The canonical IR is the single source of truth for what a function accepts. Generated docs, type stubs for callers in other languages, and OpenAPI-shaped exports all read from args_schema / return_schema when present.
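
As a rough illustration of the build-time check in item 1, the sketch below walks a schema and collects property names that look secret-shaped. It is not mvmforge's implementation. find_secret_fields is a hypothetical helper, SECRET_TOKENS holds only the four tokens quoted above (the real closed list is longer), and substring matching after stripping underscores and hyphens is an assumption about how "secret-shaped" is decided.

```python
# Only the tokens named in the text; the real closed list is longer.
SECRET_TOKENS = ("token", "password", "secret", "apikey")


def find_secret_fields(schema, path=""):
    """Return dotted paths of property names that contain a secret token."""
    hits = []
    if not isinstance(schema, dict):
        return hits  # e.g. additionalProperties: false, or a missing keyword
    for name, sub in schema.get("properties", {}).items():
        field_path = f"{path}.{name}" if path else name
        # Assumed normalization: case-insensitive, ignore "_" and "-".
        normalized = name.lower().replace("_", "").replace("-", "")
        if any(tok in normalized for tok in SECRET_TOKENS):
            hits.append(field_path)
        hits.extend(find_secret_fields(sub, field_path))
    # Descend into the nested-schema keywords that appear on this page.
    for key in ("items", "additionalProperties"):
        hits.extend(find_secret_fields(schema.get(key), path))
    for key in ("oneOf", "prefixItems"):
        for sub in schema.get(key, []):
            hits.extend(find_secret_fields(sub, path))
    return hits


example = {
    "type": "object",
    "properties": {
        "user": {"type": "string"},
        "api_key": {"type": "string"},
        "auth": {
            "type": "object",
            "properties": {"Password": {"type": "string"}},
        },
    },
}
offending = find_secret_fields(example)  # ["api_key", "auth.Password"]
```

A host built along these lines would report E_SECRET_IN_SCHEMA whenever the returned list is non-empty.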

You can populate these fields three ways, listed below from least to most manual. An explicitly passed schema takes precedence over auto-derivation.

1. Auto-derived from your signature (the default)

If your decorated function has typed parameters and a typed return, mvmforge derives the schema for you. No extra import, no runtime dependency on pydantic or zod.

Python:

    import mvm as mv

    @mv.func(name="adder", module="adder")
    async def add(a: int, b: int) -> int:
        return a + b

TypeScript:

    import { func } from "mvm-sdk";

    export const add = func({ name: "adder", module: "adder" })(
      function add(a: number, b: number): number {
        return a + b;
      },
    );

After mvmforge emit, the canonical IR carries:

{
"args_schema": {
"type": "object",
"properties": {
"a": {"type": "integer"},
"b": {"type": "integer"}
},
"required": ["a", "b"],
"additionalProperties": false
},
"return_schema": {"type": "integer"}
}

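The IR above is what the Python version produces: Python int maps to JSON Schema integer. TypeScript number maps to number, so the TypeScript version as written yields {"type": "number"} for a, b, and the return value. Use bigint in TypeScript if you want integer.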
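
To make the wrapper-side validation step concrete, here is a minimal sketch of checking a decoded payload against the args_schema shown above. The wrapper's real validator is not described on this page. validate is a hypothetical helper that covers only the keywords used in this example (type integer/object, properties, required, additionalProperties: false), and its error messages are invented for illustration.

```python
ARGS_SCHEMA = {
    "type": "object",
    "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
    "required": ["a", "b"],
    "additionalProperties": False,
}


def validate(value, schema):
    """Return a list of error strings; an empty list means the value conforms."""
    kind = schema.get("type")
    if kind == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        ok = isinstance(value, int) and not isinstance(value, bool)
        return [] if ok else [f"expected integer, got {type(value).__name__}"]
    if kind == "object":
        if not isinstance(value, dict):
            return [f"expected object, got {type(value).__name__}"]
        errors = []
        props = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"missing required field: {name}")
        for name, item in value.items():
            if name in props:
                errors.extend(f"{name}: {e}" for e in validate(item, props[name]))
            elif schema.get("additionalProperties") is False:
                errors.append(f"unexpected field: {name}")
        return errors
    return []  # keywords outside this sketch are not checked


ok = validate({"a": 1, "b": 2}, ARGS_SCHEMA)        # []
bad_type = validate({"a": 1, "b": "2"}, ARGS_SCHEMA)  # ["b: expected integer, got str"]
extra = validate({"a": 1, "b": 2, "c": 3}, ARGS_SCHEMA)  # ["unexpected field: c"]
```

The same function applied to the return value and return_schema would cover the outbound half of the contract.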

Supported types

mvmforge derives schemas for a closed subset, intentionally narrow so the resulting IR remains simple and language-portable. Anything outside this subset aborts derivation per function — your function ships, but args_schema stays unset and the wrapper falls through to “no schema, validate nothing extra.”

Python                        TypeScript                  JSON Schema
int                           bigint                      {"type": "integer"}
float                         number                      {"type": "number"}
str                           string                      {"type": "string"}
bool                          boolean                     {"type": "boolean"}
None                          null                        {"type": "null"}
list[X] / List[X]             Array<X> / X[]              {"type": "array", "items": <X>}
tuple[X, Y] / Tuple[X, Y]     [X, Y] (tuple type)         array with prefixItems
dict[str, X] / Dict[str, X]   Record<string, X>           {"type": "object", "additionalProperties": <X>}
X | None / Optional[X]        X | null / X | undefined    {"oneOf": [<X>, {"type": "null"}]}
X | Y / Union[X, Y]           X | Y (union)               {"oneOf": [<X>, <Y>]}
Literal["a", "b"]             "a" | "b"                   {"enum": ["a", "b"]}

Anything else — custom classes, type-aliased imports from another module, generic constraints, structural objects — aborts derivation silently. Use one of the explicit paths below if you need those.

When derivation aborts

The extractor is all-or-nothing per function for args_schema. It either produces a complete schema with required covering every non-default parameter, or it produces nothing. A single unannotated parameter or unsupported type aborts the entire args_schema for that function. return_schema is independent — an annotated return type still derives even if a parameter aborts the args side.

Aborting is silent and never an error. To debug, run mvmforge doctor — it warns when a function has annotated parameters but no extracted schema.
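
The all-or-nothing rule can be sketched as follows. This is a simplified model, not the extractor itself: derive is a hypothetical helper, it handles only four primitive annotations, and it inspects live function objects, which the real tool may not do. It shows the three behaviours described above: one bad parameter drops the whole args_schema, required lists only parameters without defaults, and return_schema is derived independently.

```python
import inspect

# Deliberately tiny type table; enough to show the control flow.
PRIMITIVES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def derive(fn):
    """Return (args_schema, return_schema); either may be None."""
    sig = inspect.signature(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        json_type = PRIMITIVES.get(param.annotation)
        if json_type is None:
            # Unannotated or unsupported: abort the whole args side silently.
            props = None
            break
        props[name] = {"type": json_type}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    args_schema = None
    if props is not None:
        args_schema = {
            "type": "object",
            "properties": props,
            "required": required,
            "additionalProperties": False,
        }
    # The return side does not depend on whether the args side aborted.
    ret = PRIMITIVES.get(sig.return_annotation)
    return_schema = {"type": ret} if ret else None
    return args_schema, return_schema


def scale(x: float, factor: int = 2) -> float:
    return x * factor


def label(value, prefix: str) -> str:  # "value" is unannotated
    return f"{prefix}{value}"


scale_args, scale_ret = derive(scale)  # full schema, required == ["x"]
label_args, label_ret = derive(label)  # (None, {"type": "string"})
```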

To opt out of derivation entirely (for tests, CI parity checks, etc.) set MVMFORGE_NO_SIGNATURE_EXTRACTION=1.

2. Pydantic / zod (for richer shapes)

When your shape needs custom validators, regex constraints, branded types, or anything else outside the closed table, reach for the runtime helpers:

Python:

    import mvm as mv
    from pydantic import BaseModel, Field

    class AddArgs(BaseModel):
        a: int = Field(ge=0)
        b: int = Field(ge=0)

    @mv.func(
        name="adder",
        args_schema=mv.derive_schema(AddArgs),
    )
    async def add(a: int, b: int) -> int:
        return a + b

TypeScript:

    import { func, zodSchema } from "mvm-sdk";
    import { z } from "zod";

    export const add = func({
      name: "adder",
      argsSchema: await zodSchema(z.object({
        a: z.number().int().nonnegative(),
        b: z.number().int().nonnegative(),
      })),
    })(function add(a: number, b: number): number {
      return a + b;
    });

mv.derive_schema(...) requires the mvm[schema] extra (pydantic). zodSchema(...) requires zod and zod-to-json-schema as runtime dependencies.

3. Hand-authored

When even pydantic / zod is too heavy — or you want the IR to be the source of truth and your function to track it — pass a JSON Schema dict / object literal directly:

    import mvm as mv

    @mv.func(
        name="adder",
        args_schema={
            "type": "object",
            "properties": {
                "a": {"type": "integer"},
                "b": {"type": "integer"},
            },
            "required": ["a", "b"],
        },
        return_schema={"type": "integer"},
    )
    async def add(a: int, b: int) -> int:
        return a + b

Hand-authored schemas always win over auto-derivation.

Cross-SDK byte-identity

Workloads with explicit args_schema= / argsSchema: produce byte-identical canonical IR across the Python and TypeScript SDKs when the schema dict is the same.

Auto-derived schemas are tied to the source language: Python int and TypeScript number map to different JSON Schema types (integer vs number). To get cross-SDK byte-identity for an auto-derived workload, either:

  • Use type pairings that map to the same schema (Python float / TypeScript number → both number), or
  • Author a “Python workload via the TypeScript SDK” (language: "python", source files in .py) so both SDKs feed the same Python module to the host’s extractor.

This is why the tests/corpus/function-app-auto-schema/ corpus entry declares language="python" from both SDKs — the host walks the same adder.py regardless of which SDK authored the manifest.
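
A small sketch of what byte-identity means in practice. The canonical encoding mvmforge uses is not specified on this page, so canonical_bytes below assumes one common choice (sorted keys, no insignificant whitespace, UTF-8). The helper and the four IR fragments are hypothetical. The point is only that identical schema content serializes to identical bytes regardless of key order, while integer vs number does not.

```python
import json


def canonical_bytes(ir: dict) -> bytes:
    # Assumed canonical form: sorted keys, compact separators, UTF-8.
    return json.dumps(ir, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Same explicit schema, written with different key order in each SDK.
explicit_py = {
    "args_schema": {
        "type": "object",
        "properties": {"a": {"type": "integer"}},
        "required": ["a"],
    }
}
explicit_ts = {
    "args_schema": {
        "required": ["a"],
        "properties": {"a": {"type": "integer"}},
        "type": "object",
    }
}

# Auto-derived return schemas for `-> int` (Python) and `: number` (TypeScript).
derived_py = {"return_schema": {"type": "integer"}}
derived_ts = {"return_schema": {"type": "number"}}

same_explicit = canonical_bytes(explicit_py) == canonical_bytes(explicit_ts)  # True
same_derived = canonical_bytes(derived_py) == canonical_bytes(derived_ts)     # False
```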

See also