Zhipu Releases GLM-5.1 High-Speed Version with Output Speed of 400 Tokens/s

Zhipu has announced a high-speed API for GLM-5.1, 'GLM-5.1-highspeed'. The model reaches an output speed of 400 tokens/s, setting a new speed record among large-model API providers worldwide. The high-speed version is suited to scenarios that demand very low response latency, such as AI programming, real-time interaction, business decision-making, and real-time voice applications. It is available now to select enterprise clients on the Zhipu MaaS platform.