– a full‑stack, open‑source foundation model that:
| Modality | Input → Tokenizer | Token Length (max) | Embedding Dim |
|----------|-------------------|--------------------|---------------|
| Text | Byte‑Pair Encoding (BPE, 50 k vocab) | 2 048 | 4 096 |
| Image | Patch‑ify (16×16) → ViT‑style tokens | 1 024 | 4 096 |
| Video | Temporal patches (4‑frame clips) → 3D tokens | 2 048 | 4 096 |
| Audio | Log‑Mel spectrogram patches → 64‑ms frames | 1 024 | 4 096 |
| Tabular | Feature‑wise tokenisation (categorical + float) | 512 | 4 096 |
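
To make the image row of the table concrete, here is a minimal sketch of how 16×16 patch tokenisation relates to the 1 024‑token limit. It is illustrative only and not the project's implementation. The names `ModalitySpec`, `SPECS`, `image_token_count`, `patchify`, and `fits` are hypothetical. The sketch assumes non‑overlapping patches and a single‑channel image stored as a list of rows, and it omits the learned projection into the 4 096‑dimensional embedding space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModalitySpec:
    max_tokens: int  # "Token Length (max)" column
    embed_dim: int   # "Embedding Dim" column


# Values copied from the table; the dictionary layout itself is an assumption.
SPECS = {
    "text": ModalitySpec(2048, 4096),
    "image": ModalitySpec(1024, 4096),
    "video": ModalitySpec(2048, 4096),
    "audio": ModalitySpec(1024, 4096),
    "tabular": ModalitySpec(512, 4096),
}

PATCH = 16  # image patch side length, from the table


def image_token_count(height, width, patch=PATCH):
    """Count non-overlapping patch tokens for a height x width image."""
    if height % patch or width % patch:
        raise ValueError("image sides must be multiples of the patch size")
    return (height // patch) * (width // patch)


def patchify(image, patch=PATCH):
    """Split a 2-D grid (list of rows) into flattened patches, row-major."""
    height, width = len(image), len(image[0])
    image_token_count(height, width, patch)  # validates divisibility
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return patches


def fits(modality, n_tokens):
    """Check a token count against the modality's maximum length."""
    return n_tokens <= SPECS[modality].max_tokens
```

Under these assumptions, a 512×512 image yields `image_token_count(512, 512) == 1024` tokens, which exactly fills the image budget, while a 528×512 image yields 1 056 tokens and would need resizing or cropping first. The 512×512 resolution is an inference from the table's numbers, not something the source states.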