An enhanced version of the standard Unix strings(1) program that uses language models to automatically identify the language and character set of the text it extracts. It supports over 1400 languages, dozens of character encodings, and more than 4800 language/encoding pairs.

Features

  • text extraction
  • language identification
  • character-set identification
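
For context, the text-extraction step that classic strings(1) performs can be sketched in a few lines of Python. This is only an illustration of the baseline behavior (scanning binary data for runs of printable ASCII), not this project's implementation; the function name extract_strings and the min_len parameter are hypothetical, and the language and character-set identification that this tool adds are not shown.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of at least min_len printable ASCII bytes in data.

    Hypothetical helper for illustration; it mimics the default behavior of
    classic strings(1) and performs no language or encoding detection.
    """
    # Bytes 0x20 through 0x7e are the printable ASCII characters.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Made-up sample: three text runs separated by non-printable bytes.
sample = b"\x00\x01hello world\x00\xffabc\x00language models\x02"
print(extract_strings(sample))  # "abc" is dropped: shorter than min_len
```

A fixed ASCII scan like this misses text in multi-byte or non-Latin encodings, which is the gap that model-based language and character-set identification is meant to close.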
