by harleyszhang · Agent Tool · ★ 114
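llmprofiler: theoretical performance analysis tools for LLMs, supporting parameter, FLOPs, memory, and latency analysis.

Main features

- Supports the qwen2.5 and qwen3 dense model series.
- Supports tensor-parallel inference mode.
- Supports hardware such as … and mainstream decoder-only autoregressive models; more can be added in the configuration file.
- Supports performance-bottleneck analysis: whether different … are … or …, and the performance bottleneck of ….
- Supports output of parameter count, compute (FLOPs), memory, and … for each layer and for the whole model.
- For inference, computes memory and latency separately for the prefill and decode stages, as well as the theoretical maximum supported …, and so on.
- Supports setting compute efficiency and memory-read efficiency (these may differ between inference frameworks; once they are set, the tool can estimate actual values).
- Formatted output of the theoretical inference performance analysis results.

How to use

Call the … function in the … file directly and pass the relevant parameters.

    def llmprofile(modelname="llama-13b",
                   gpuname: str = "v100-sxm-32gb",
                   bytesperparam: int = BYTESFP16,
                   bs: int = 1,
                   seqlen: int = 522,
                   generatelen=1526,
                   dszero: int = 0,
                   dpsize: int = 1,
                   tpsize: int = 1,
                   ppsize: int = 1,
                   spsize: int = 1,
                   layernormdtypebytes: int = BYTESFP16,
                   kvcachebytes: int = BYTESFP16,
                   flopsefficiency: float = FLOPSEFFICIENCY,
                   hbmmemoryefficiency: float = HBMMEMORYEFFICIENCY,
                   intranodememoryefficiency=INTRANODEMEMORYEFFICIENCY,
                   internodememoryefficiency=INTERNODEMEMORYEFFICIENCY,
                   mode: str = "inference",
                   ) -> dict:
        """format print dicts of the total floating-point operations, MACs,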
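The kind of estimate described above (separate prefill and decode latency, adjustable compute and memory-read efficiency, and a compute-versus-memory bottleneck check) can be sketched roughly as follows. This is not llm_counts code: `Hardware`, `estimate_stage`, and the hardware numbers are hypothetical names and illustrative values chosen for this example, and the model is deliberately simplified (about 2 FLOPs per parameter per token, weights read once per forward pass, attention FLOPs and KV-cache traffic ignored).

```python
from dataclasses import dataclass


@dataclass
class Hardware:
    """Hypothetical hardware description (illustrative numbers, not a real GPU spec)."""
    peak_flops: float      # peak compute throughput in FLOP/s
    hbm_bandwidth: float   # peak memory bandwidth in bytes/s


def estimate_stage(num_params: float,
                   num_tokens: int,
                   bytes_per_param: int,
                   hw: Hardware,
                   flops_efficiency: float = 1.0,
                   hbm_memory_efficiency: float = 1.0) -> dict:
    """Roofline-style latency estimate for one forward pass.

    Assumptions (simplifications, not the tool's actual formulas):
    - a dense model costs about 2 * num_params FLOPs per processed token;
    - all weights are read from memory once per forward pass;
    - attention FLOPs and KV-cache reads are ignored.
    """
    flops = 2.0 * num_params * num_tokens
    weight_bytes = num_params * bytes_per_param

    # Achievable throughput is the peak scaled by an efficiency factor in (0, 1].
    compute_time = flops / (hw.peak_flops * flops_efficiency)
    memory_time = weight_bytes / (hw.hbm_bandwidth * hbm_memory_efficiency)

    return {
        "compute_time_s": compute_time,
        "memory_time_s": memory_time,
        # The slower of the two resources determines the estimated latency.
        "latency_s": max(compute_time, memory_time),
        "bound": "compute" if compute_time >= memory_time else "memory",
    }


# Illustrative hardware: 100 TFLOP/s and 1 TB/s (round numbers, not a real device).
hw = Hardware(peak_flops=100e12, hbm_bandwidth=1e12)

# A 13B-parameter model stored in FP16 (2 bytes per parameter).
prefill = estimate_stage(13e9, num_tokens=512, bytes_per_param=2, hw=hw)
decode = estimate_stage(13e9, num_tokens=1, bytes_per_param=2, hw=hw)

print("prefill:", prefill["bound"], prefill["latency_s"])
print("decode: ", decode["bound"], decode["latency_s"])
```

Under these assumptions, prefill over 512 tokens comes out compute-bound, while single-token decode comes out memory-bound, because each decode step still has to read every weight. Lowering `flops_efficiency` or `hbm_memory_efficiency` below 1.0 moves the estimate toward what a particular inference framework might achieve in practice.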
| Metric | Value |
| --- | --- |
| Stars | 114 |
| Forks | 10 |
| Language | Python |
| Category | Agent Tool |
| Quality Score | 38.25/100 |
| Open Issues | 1 |
| Last Updated | 2025-07-11 |
| Created | 2023-07-26 |
| Platforms | python |
| Est. Tokens | ~505k |
Looking for an llm_counts alternative? If you're comparing llm_counts with other agent tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.
Enterprise-grade LLM automated deployment tool that makes AI servers truly "plug-and-play".
irresponsible innovation. Try now at https://chat.dev/
A command-line interface tool for serving LLM using vLLM.
WorkflowAI is an open-source platform where product and engineering teams collaborate to build and iterate on
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market
Large Language Models (LLMs) applications and tools running on Apple Silicon in real-time with Apple MLX.
llm_counts is a set of theoretical performance analysis tools for LLMs, supporting parameter, FLOPs, memory, and latency analysis. It is categorized as an Agent Tool with 114 GitHub stars.
llm_counts is primarily written in Python. It covers topics such as gpu-performance, llama, llm.
You can find installation instructions and usage details in the llm_counts GitHub repository at github.com/harleyszhang/llm_counts. The project has 114 stars and 10 forks, indicating an active community.
The top alternatives to llm_counts on Agent Skills Hub include LLMOne, LLM-VM, and vllm-cli. Each offers a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.