Options reference
Every option that any ext-infer method accepts, in one table per
method. For conceptual context on individual options, follow the
links in the rightmost column.
Model::load($path, $options)
The second argument is an associative array. Keys are kept as snake_case strings (like PHP ini settings) because load-time tuning is rare and the array form composes well with config arrays loaded from disk.
| Key | Type | Default | See |
|---|---|---|---|
| n_gpu_layers | int | 0 | Performance tuning |
| use_mmap | bool | true | Performance tuning |
| use_mlock | bool | false | Performance tuning |
| embedding | bool | false | Embeddings |
| pooling | string | 'unspecified' | Embeddings |
Validation rules
- Unknown keys are not rejected; they are silently ignored. This is
  deliberate (forward-compatibility for callers loading config from
  files), but it means typos will be silent. If you suspect a typo,
  `var_dump` the options array and compare each key against the table
  above before reporting a bug.
- Type mismatches are rejected, with a clear message:
  `invalid option n_gpu_layers: expected integer`.
- Negative integers and out-of-range values for `n_gpu_layers` are
  rejected: `invalid option n_gpu_layers: must be non-negative`.
- `pooling` accepts only the six strings listed in Embeddings → Pooling.
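Because unknown keys are silently ignored, a caller-side check can surface typos before the array reaches Model::load. The following is a minimal sketch in plain PHP, assuming the five keys in the table above are the complete set. `KNOWN_LOAD_KEYS` and `findUnknownLoadKeys()` are hypothetical names for illustration and are not part of ext-infer.

```php
<?php
declare(strict_types=1);

// Keys copied from the Model::load table above (assumed to be complete).
const KNOWN_LOAD_KEYS = ['n_gpu_layers', 'use_mmap', 'use_mlock', 'embedding', 'pooling'];

/**
 * Hypothetical helper: report option keys that Model::load would ignore.
 *
 * @param array<string, mixed> $options
 * @return array<string, ?string> unknown key => closest known key, or null if nothing is close
 */
function findUnknownLoadKeys(array $options): array
{
    $unknown = [];
    foreach (array_keys($options) as $key) {
        $key = (string) $key;
        if (in_array($key, KNOWN_LOAD_KEYS, true)) {
            continue;
        }
        // Suggest the known key with the smallest edit distance.
        $best = null;
        $bestDistance = PHP_INT_MAX;
        foreach (KNOWN_LOAD_KEYS as $candidate) {
            $distance = levenshtein($key, $candidate);
            if ($distance < $bestDistance) {
                $best = $candidate;
                $bestDistance = $distance;
            }
        }
        // A distance above 3 is treated as "not a typo of anything we know".
        $unknown[$key] = $bestDistance <= 3 ? $best : null;
    }
    return $unknown;
}
```

Running the helper on a config array loaded from disk and logging a non-empty result keeps the forward-compatible behavior while making typos visible.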
Model::chat($prompt, ...)
Named arguments, not an array. PHP 8.0+ named-argument syntax uses the
parameter name verbatim, so you write maxTokens: 256 (camelCase, the
usual PHP convention for parameter names).
| Argument | Type | Default | See |
|---|---|---|---|
| $prompt | \Displace\Infer\Prompt | required | Prompts |
| maxTokens | int | 128 | Chat completions |
| nCtx | int | 2048 | Chat completions |
| temperature | float | 0.0 | Chat completions |
| seed | int | 1234 | Chat completions |
Behavior
- `temperature = 0.0` is greedy (deterministic). `> 0.0` samples, controlled by `seed`.
- `seed` is only consulted when `temperature > 0`.
- `maxTokens` caps generation. Hitting it sets `Response::finishReason()` to `'length'`.
- `nCtx` is the context window for this call. If the rendered prompt exceeds it, `InferenceException` is raised before generation starts.
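The arguments and behaviors above can be combined into a single call, sketched below. It rests on several assumptions not stated on this page: that `Model` and `InferenceException` live in the same `\Displace\Infer` namespace as `Prompt`, that `Model::load()` is a static factory, and that `chat()` is an instance method. `askOnce()` and the model path in the usage note are hypothetical. The function returns null when the extension is not loaded, so the file can be parsed and run without it.

```php
<?php
declare(strict_types=1);

use Displace\Infer\InferenceException; // assumed namespace
use Displace\Infer\Model;              // assumed namespace
use Displace\Infer\Prompt;

/**
 * Hypothetical wrapper around one sampled chat call.
 *
 * @return array{text: string, truncated: bool}|null null if ext-infer is missing
 *                                                   or the prompt did not fit in nCtx
 */
function askOnce(string $modelPath, string $question): ?array
{
    if (!class_exists(Model::class)) {
        return null; // ext-infer is not loaded
    }

    $model = Model::load($modelPath, ['n_gpu_layers' => 0]); // assumed static
    $prompt = Prompt::system('You are a concise assistant.')->withUser($question);

    try {
        // temperature > 0 switches from greedy decoding to sampling,
        // which is the only case in which seed is consulted.
        $response = $model->chat(
            $prompt,
            maxTokens: 256,
            nCtx: 2048,
            temperature: 0.7,
            seed: 42,
        );
    } catch (InferenceException $e) {
        return null; // e.g. the rendered prompt exceeded nCtx
    }

    return [
        'text' => $response->text(),
        // 'length' means generation stopped at the maxTokens cap.
        'truncated' => $response->finishReason() === 'length',
    ];
}
```

A caller would invoke it as `askOnce('/path/to/model.gguf', 'What is mmap?')` and retry with a larger maxTokens when `truncated` is true.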
Model::raw($prompt, ...)
Same named-argument shape as chat() plus addBos.
| Argument | Type | Default | See |
|---|---|---|---|
| $prompt | string | required | Raw completions |
| maxTokens | int | 128 | Chat completions |
| nCtx | int | 2048 | Chat completions |
| temperature | float | 0.0 | Chat completions |
| seed | int | 1234 | Chat completions |
| addBos | bool | true | Raw completions → addBos |
Model::embed($text)
Just the text. Pooling and embedding-mode are configured at load time
(see Model::load above).
| Argument | Type | Default | See |
|---|---|---|---|
| $text | string | required | Embeddings |
Embedding math
Embedding is read-only; the math methods return new instances rather
than mutating.
| Method | Returns |
|---|---|
| vector() | list<float> |
| dimensions() | int |
| norm() | float |
| normalize() | new Embedding |
| cosineSimilarity(Embedding $other) | float (in [-1, 1]) |
cosineSimilarity throws InferenceException on a dimension mismatch;
see Embeddings → vector math.
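For readers who want the arithmetic spelled out, the sketch below computes the same quantities on plain float arrays using the standard textbook definitions (Euclidean norm, unit-length scaling, dot product over the product of norms). It is not the extension's implementation: the function names are hypothetical, and `InvalidArgumentException` stands in for the `InferenceException` that cosineSimilarity raises on a dimension mismatch. The zero-vector checks are an added assumption; this page does not say how the extension handles that case.

```php
<?php
declare(strict_types=1);

/** @param list<float> $v */
function vectorNorm(array $v): float
{
    $sumOfSquares = 0.0;
    foreach ($v as $x) {
        $sumOfSquares += $x * $x;
    }
    return sqrt($sumOfSquares);
}

/**
 * Returns a new array scaled to unit length; the input is not modified.
 *
 * @param list<float> $v
 * @return list<float>
 */
function normalizeVector(array $v): array
{
    $norm = vectorNorm($v);
    if ($norm === 0.0) {
        throw new InvalidArgumentException('cannot normalize a zero vector');
    }
    return array_map(static fn (float $x): float => $x / $norm, $v);
}

/**
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineOf(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        // Stand-in for the InferenceException described above.
        throw new InvalidArgumentException('dimension mismatch');
    }
    $dot = 0.0;
    foreach ($a as $i => $x) {
        $dot += $x * $b[$i];
    }
    $denominator = vectorNorm($a) * vectorNorm($b);
    if ($denominator === 0.0) {
        throw new InvalidArgumentException('cosine is undefined for a zero vector');
    }
    // Clamp to [-1, 1] to absorb floating-point rounding.
    return max(-1.0, min(1.0, $dot / $denominator));
}
```

Orthogonal vectors give 0.0 and parallel vectors give a value at or very near 1.0, which matches the [-1, 1] range listed in the table.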
Prompt
Static factories + immutable with* builders.
| Method | Returns |
|---|---|
| Prompt::system($content) | new Prompt |
| Prompt::user($content) | new Prompt |
| withSystem($content) | new Prompt |
| withUser($content) | new Prompt |
| withAssistant($content) | new Prompt |
| messages() | list<Message> |
| lastRole() | ?string |
| count() | int |
| isEmpty() | bool |
See Prompts for the immutability semantics.
Response
Read-only. Six getters.
| Method | Returns |
|---|---|
| text() | string |
| reasoning() | ?string |
| answer() | string |
| hasReasoning() | bool |
| finishReason() | string: 'eos', 'length', or 'stop' |
| tokensGenerated() | int |
See Chat completions → Inspecting a Response.
Environment
Not strictly an option, but bears mentioning here:
| Variable | Effect |
|---|---|
| EXT_INFER_LOG=1 | Restore llama.cpp’s verbose stderr logging (silenced by default). |