by zengxiao-he · LLM Plugin · ★ 69
Indexed by AgentSkillsHub · Auto-synced every 8h
Tessera

A small, from-scratch LLM stack built around one goal: distill a large teacher into a small student, then serve that student efficiently. Keeping that goal end-to-end means touching most of the pieces that matter in practice (custom GPU kernels, sharded training, an inference engine, quantization, and a serving front end) without any of it being a toy.

It runs and is unit-tested on a laptop (CPU or Apple MPS). The Triton/CUDA kernels are written for NVIDIA GPUs; on anything else the model transparently falls back to a torch reference, and the kernels are checked against that reference whenever a GPU is available.

What's in it

Training side: Decoder transformer with RMSNorm, RoPE, grouped-query attention and SwiGLU …
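The teacher-to-student goal described above is commonly trained with a temperature-softened KL divergence between the two models' output distributions. The following is a minimal pure-Python sketch of that general technique, not Tessera's actual loss: the function names `log_softmax` and `distillation_loss`, the default temperature, and the T² scaling are all assumptions made for illustration, since the listing does not show the project's training code.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of a list of floats after dividing by temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    # log-sum-exp with the max subtracted to avoid overflow
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared.

    Hypothetical helper for illustration; a real training loop would compute this
    per token over batched tensors and usually mix it with a cross-entropy term.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (target) distribution
    log_q = log_softmax(student_logits, temperature)  # student distribution
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [0.1, 0.2, 0.3]
    print(distillation_loss(teacher, student))
```

The loss is zero when the student reproduces the teacher's logits (up to a constant shift) and positive otherwise, so minimizing it pulls the student's distribution toward the teacher's.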
| Field | Value |
| --- | --- |
| Stars | 69 |
| Forks | 0 |
| Language | Python |
| Category | LLM Plugin |
| Quality Score | 32.3/100 |
| Last Updated | 2026-06-05 |
| Created | 2026-06-05 |
| Platforms | python |
| Est. Tokens | ~7k |
Looking for a tessera alternative? If you're comparing tessera with other LLM plugin tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.
Bilingual (Chinese + English) ML / LLM / diffusion / agent interview cheat sheets for the AI autumn campus recruiting season (秋招), generated by ARIS /interview
The fastest way to build and start training your own LLM. CLI tool that scaffolds production-ready PyTorch tra
The missing linter and lsp for AI coding assistants. Validate CLAUDE.md, AGENTS.md, SKILL.md, hooks, MCP. Plug
🦀 Prevents outdated Rust code suggestions from AI assistants. This MCP server fetches current crate docs, use
User friendly CLI tool for AI tasks. Stop thinking about LLMs and prompts, start getting results!
Config-driven CLI tool that compresses command output before it reaches an LLM context
tessera describes itself as "From teacher to tiles": a from-scratch LLM distillation & serving engine with custom Triton/CUDA kernels, FSDP distillation, paged-KV continuous batching, speculative decoding, a Rust gateway, a JAX oracl… It is categorized as an LLM Plugin with 69 GitHub stars.
tessera is primarily written in Python. It covers topics such as cuda, flash-attention, fsdp.
You can find installation instructions and usage details in the tessera GitHub repository at github.com/zengxiao-he/tessera. The project has 69 stars and 0 forks.
The top alternatives to tessera on Agent Skills Hub include ARIS-in-AI-Offer, create-llm, agnix. Each offers a different approach to the same problem space — compare them side-by-side by stars, quality score, and community activity.