HTMLNLM

NETWORK TOPOLOGY (RWKV-v6)

Vocabulary Capacity 2048

Hidden Dimension (D) 256

Recurrent Layers (L) 4

OOMB Context Chunk 128

Optimizer

Total Ternary Parameters 0.00M

CORPUS INGESTION & BPE

Drop .txt corpus here or click
Requires uncompressed UTF-8 text

SYSTEM TELEMETRY

OOMB CHUNK-RECURRENT LOSS

Global Step 0

Current Loss 0.0000

Throughput 0.0 tok/sec

QUANTIZATION DISTRIBUTION (b1.58)

-1 (33%) 0 (34%) +1 (33%)
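
The panel above reports how the weights split across the three ternary values. As a rough illustration, the sketch below assumes the absmean rule commonly associated with b1.58 quantization (scale by the mean absolute weight, round, clip to -1, 0, +1); the source does not say which rule this app uses. The names `ternarize` and `distribution` are hypothetical.

```python
from collections import Counter


def ternarize(weights, eps=1e-8):
    """Quantize floats to {-1, 0, +1} using an assumed absmean scale."""
    # Assumption: scale is the mean absolute weight; eps avoids division by zero.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def distribution(quantized):
    """Return the fraction of weights at each ternary value."""
    counts = Counter(quantized)
    total = len(quantized)
    return {value: counts.get(value, 0) / total for value in (-1, 0, 1)}


if __name__ == "__main__":
    q, scale = ternarize([0.9, -0.8, 0.05, 0.0, -0.1, 1.2])
    print(q, distribution(q))
```

A split close to one third per value, as shown in the panel, is what this rule tends to produce for roughly symmetric weight distributions.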

OPTIMIZER (QUINTIC MUON)

Learning Rate Base 0.020

REAL-TIME INFERENCE SAMPLING

Model not allocated. Proceed to ARCHITECTURE tab.

GROUP RELATIVE POLICY OPTIMIZATION

Runs critic-free reinforcement learning: the model generates a cohort of responses, scores each against the target reward condition, and updates its parameters using z-score-normalized advantages under an approximate KL-divergence constraint.
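
The advantage computation described above can be sketched as follows. This is a minimal illustration, not the app's code: it assumes a binary reward (1.0 when the target substring appears in a response, else 0.0) and uses one common low-variance KL estimator, exp(d) - d - 1 with d = log p_ref - log p_policy. The source only says the KL term is approximate. All function names are hypothetical.

```python
import math


def substring_reward(response, target):
    """Assumed reward: 1.0 if the target substring occurs in the response."""
    return 1.0 if target in response else 0.0


def group_advantages(rewards, eps=1e-6):
    """Z-score each reward against its own cohort; no learned critic is needed."""
    mean = sum(rewards) / len(rewards)
    variance = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(variance)
    # When every reward is equal, all advantages are zero and no update signal remains.
    return [(r - mean) / (std + eps) for r in rewards]


def approx_kl(logp_policy, logp_ref):
    """Per-token KL estimate exp(d) - d - 1, which is never negative."""
    d = logp_ref - logp_policy
    return math.exp(d) - d - 1.0


if __name__ == "__main__":
    cohort = ["the cat sat", "a dog ran", "birds fly", "my cat naps"]
    rewards = [substring_reward(r, "cat") for r in cohort]
    print(rewards, group_advantages(rewards))
```

Because advantages are measured relative to the cohort mean, a cohort in which every response earns the same reward contributes no gradient, which is one reason to keep the cohort size above two or three.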

Prompt Condition

Target Reward Substring

REINFORCEMENT METRICS

Epochs 0

Cohort Size 4

Avg Reward 0.000

COHORT TRAJECTORIES

Awaiting GRPO initialization...

DECODING SAMPLER

Temperature 0.80

Top-P Nucleus 0.90

Max Output Tokens 200

Context Sequence
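
The Temperature and Top-P Nucleus controls above correspond to a standard sampling procedure, sketched below under the usual definitions: logits are divided by the temperature, converted to probabilities, and the sample is drawn from the smallest set of top tokens whose cumulative probability reaches top-p. The function names are hypothetical and the app's own sampler may differ in detail.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; temperature must be greater than zero."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus_indices(probs, top_p):
    """Smallest set of highest-probability tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for index in order:
        kept.append(index)
        mass += probs[index]
        if mass >= top_p:
            break
    return kept, mass


def sample_token(logits, temperature=0.8, top_p=0.9, rng=random):
    """Draw one token index from the renormalized nucleus."""
    probs = softmax_with_temperature(logits, temperature)
    kept, mass = nucleus_indices(probs, top_p)
    threshold = rng.random() * mass
    running = 0.0
    for index in kept:
        running += probs[index]
        if threshold < running:
            return index
    return kept[-1]  # guard against floating-point round-off


if __name__ == "__main__":
    print(sample_token([2.0, 1.0, 0.5, -1.0]))
```

Lower temperatures sharpen the distribution before the nucleus is cut, so the two settings interact: at 0.80 the nucleus for a given top-p usually contains fewer tokens than it would at 1.0.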

RUNTIME OUTPUT

VIRTUAL MACHINE STATE PERSISTENCE

Master weights held in JavaScript Float32Arrays are serialized to binary. The IndexedDB path stores native ArrayBuffers directly, which sidesteps the call-stack limits of Base64 encoding. Muon momentum buffers are discarded, halving the storage footprint.
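
The storage trade-off described above can be illustrated with a small sketch. The app itself runs in JavaScript; this Python version only mirrors the idea, with float32 buffers packed as raw bytes and the momentum buffer made optional. It assumes one momentum value per weight, which is what makes dropping it halve the payload. The function names and the payload layout are hypothetical.

```python
import base64
from array import array


def serialize_checkpoint(weights, momentum=None):
    """Pack float32 buffers into raw bytes; momentum is skipped when None."""
    payload = {"weights": array("f", weights).tobytes()}
    if momentum is not None:
        payload["momentum"] = array("f", momentum).tobytes()
    return payload


def payload_size(payload):
    """Total number of bytes across all stored buffers."""
    return sum(len(buffer) for buffer in payload.values())


def restore_weights(payload):
    """Rebuild the float32 weights from their raw bytes."""
    restored = array("f")
    restored.frombytes(payload["weights"])
    return list(restored)


def base64_size(payload):
    """Size the same buffers would take as Base64 text, for comparison."""
    return sum(len(base64.b64encode(buffer)) for buffer in payload.values())


if __name__ == "__main__":
    weights = [0.5, -1.25, 2.0, 0.0]
    full = serialize_checkpoint(weights, momentum=[0.0] * len(weights))
    slim = serialize_checkpoint(weights)
    print(payload_size(full), payload_size(slim), base64_size(slim))
```

Besides the call-stack issue the panel mentions, Base64 text is also about a third larger than the raw bytes it encodes, so binary storage is smaller as well.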

LOCAL INDEXEDDB

EXTERNAL I/O (JSON)