Audience

Developers, researchers, and engineers who need to parse and understand complex documents, layouts, and visual-text content accurately and at scale

About GLM-OCR

GLM-OCR is an open source multimodal optical character recognition (OCR) model and repository for accurate, efficient document understanding. It combines text and visual modalities in a unified encoder–decoder architecture derived from the GLM-V family: a visual encoder pre-trained on large-scale image–text data and a lightweight cross-modal connector feed into a GLM-0.5B language decoder. The model supports layout detection, parallel region recognition, and structured output for text, tables, formulas, and complex real-world document formats. Training uses a Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization, and the model reports state-of-the-art results on major document understanding benchmarks.
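
The three stages named above (layout detection, parallel region recognition, structured output) can be pictured roughly as follows. This is a minimal sketch of the general pattern only, not GLM-OCR's code or API: `Region`, `detect_layout`, `recognize_region`, and `parse_page` are hypothetical names, and both model calls are replaced by placeholder functions so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Region:
    """One layout region. Fields are illustrative, not GLM-OCR's schema."""
    order: int  # reading order assigned by layout detection
    kind: str   # "text", "table", or "formula"
    crop: str   # stand-in for the cropped image data


def detect_layout(page: str) -> list[Region]:
    # Placeholder for a layout detector: returns three fixed regions.
    return [
        Region(0, "text", f"{page}:title"),
        Region(1, "table", f"{page}:table"),
        Region(2, "formula", f"{page}:formula"),
    ]


def recognize_region(region: Region) -> dict:
    # Placeholder for a per-region recognition call to the model.
    return {
        "order": region.order,
        "type": region.kind,
        "content": f"<recognized {region.crop}>",
    }


def parse_page(page: str, max_workers: int = 4) -> list[dict]:
    """Detect regions, recognize them concurrently, return ordered records."""
    regions = detect_layout(page)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(recognize_region, regions))
    # Sort explicitly so the output follows reading order.
    return sorted(results, key=lambda r: r["order"])


if __name__ == "__main__":
    for item in parse_page("page-1"):
        print(item)
```

Because each region is recognized independently, the work can be spread across workers, and the per-region records can then be assembled into one structured result for the page.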

Pricing

Starting Price:
Free
Free Version:
Free Version available.

Integrations

API:
Yes, GLM-OCR offers API access
No integrations listed.

Company Information

Z.ai
Founded: 2019
China
github.com/zai-org/GLM-OCR

Product Details

Platforms Supported:
Cloud
Training:
Documentation
Support:
Online

GLM-OCR Product Features

OCR

Convert to PDF
Zone Selection Tool
ID Scanning
Multi-Language
Indexing
Metadata Extraction
Image Pre-processing
Text Editor
Multiple Output Formats
Batch Processing