Audience

Developers, researchers, and engineers who need to parse and understand complex documents, layouts, and visual-text content accurately and at scale.

About GLM-OCR

GLM-OCR is a multimodal optical character recognition (OCR) model and open-source repository for accurate, efficient document understanding. It combines text and visual modalities in a unified encoder–decoder architecture derived from the GLM-V family. A visual encoder pre-trained on large-scale image–text data passes its output through a lightweight cross-modal connector into a GLM-0.5B language decoder. The model supports layout detection, parallel region recognition, and structured output for text, tables, formulas, and complex real-world document formats. Training uses a Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization, and the model is reported to achieve state-of-the-art results on major document understanding benchmarks.
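The flow described above (layout detection, then parallel region recognition, then structured output) can be pictured with a minimal Python sketch. This is not GLM-OCR's actual API: `detect_layout`, `recognize_region`, `parse_page`, `to_markdown`, and the `page` dictionary are hypothetical stand-ins that return canned data, and the thread pool is only one assumed way to run regions concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Region:
    """One layout region: a category plus a bounding box (x0, y0, x1, y1)."""
    kind: str          # "text", "table", or "formula"
    bbox: tuple
    content: str = ""


def detect_layout(page):
    # Hypothetical stand-in for the layout detection stage. A real detector
    # would predict regions from pixels; here the page already lists them.
    return [Region(kind, bbox) for kind, bbox in page["regions"]]


def recognize_region(page, region):
    # Hypothetical stand-in for the recognition model: looks up canned text
    # for the region instead of running a vision-language decoder.
    region.content = page["truth"][region.bbox]
    return region


def parse_page(page, max_workers=4):
    regions = detect_layout(page)
    # Regions are independent of each other, so they can be recognized
    # concurrently rather than one after another.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        done = list(pool.map(lambda r: recognize_region(page, r), regions))
    # Restore reading order: top to bottom, then left to right.
    done.sort(key=lambda r: (r.bbox[1], r.bbox[0]))
    return done


def to_markdown(regions):
    # Structured output: formulas become display math, everything else
    # (plain text, Markdown tables) is emitted as recognized.
    parts = []
    for r in regions:
        if r.kind == "formula":
            parts.append(f"$$\n{r.content}\n$$")
        else:
            parts.append(r.content)
    return "\n\n".join(parts)


# Toy page: regions are deliberately listed out of reading order.
page = {
    "regions": [
        ("table", (0, 200, 500, 400)),
        ("text", (0, 0, 500, 100)),
        ("formula", (0, 120, 500, 180)),
    ],
    "truth": {
        (0, 0, 500, 100): "Quarterly report",
        (0, 120, 500, 180): "E = mc^2",
        (0, 200, 500, 400): "| Q1 | Q2 |\n| --- | --- |\n| 10 | 12 |",
    },
}

regions = parse_page(page)
markdown = to_markdown(regions)
print(markdown)
```

Separating detection from recognition is what makes the second stage parallel: each region is decoded independently, and reading order is restored afterwards from the bounding boxes.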

Pricing

Starting Price:
Free
Free Version:
Free Version available.

Integrations

API:
Yes, GLM-OCR offers API access
No integrations listed.

Company Information

Z.ai
Founded: 2019
China
github.com/zai-org/GLM-OCR

Product Details

Platforms Supported:
Cloud
Training:
Documentation
Support:
Online

GLM-OCR Frequently Asked Questions

Q: What kinds of users and organization types does GLM-OCR work with?
Q: What languages does GLM-OCR support?
Q: Does GLM-OCR have an API?
Q: What type of training does GLM-OCR provide?
Q: How much does GLM-OCR cost?

GLM-OCR Product Features

OCR

Convert to PDF
Zone Selection Tool
ID Scanning
Multi-Language
Indexing
Metadata Extraction
Image Pre-processing
Text Editor
Multiple Output Formats
Batch Processing