Mguesser

April 15, 2008
Mguesser is a component of libmnogosearch that enables the identification of a text file's character set and language. It is available as a standalone application.
Version: 0.4
License: GPL
Platform: Linux
Supported Languages: English
Homepage: www.mnogosearch.org
Developed by: Alexander Barkov
Mguesser is a standalone program distributed as part of libmnogosearch, the library behind the mnoGoSearch search engine. It guesses the character set and language of a text file. It uses the technique described in "N-Gram-Based Text Categorization", the same approach implemented by the TextCat language guesser, which is written in Perl. Compared to TextCat, mguesser is significantly faster, especially when handling large texts.
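To make the N-gram idea concrete, here is a minimal Python sketch of the classic approach: build a ranked list of the most frequent character n-grams for each language, then pick the language whose ranking is closest to the document's. The "out-of-place" distance used here is the measure from the N-gram text categorization literature; mguesser's own scoring is not described on this page and may differ. The function names, the `max_n` and `top` defaults, and the underscore padding are assumptions made for illustration.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top=300):
    """Return the most frequent character n-grams (1..max_n), most frequent first."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # underscores mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed maximum penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(abs(i - rank[g]) if g in rank else max_penalty
               for i, g in enumerate(doc_profile))


def guess(text, profiles):
    """Return the label of the profile with the smallest out-of-place distance."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda label: out_of_place(doc, profiles[label]))
```

Here `profiles` is a dictionary mapping a label such as "en" to a profile built with `ngram_profile` from a training text in that language. A smaller distance means a better match, which is the opposite direction from a 0-to-1 similarity score.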

The mguesser package consists of an N-gram-based implementation written in C and a set of language and character set maps, found in the "maps" directory of the package. The software supports a variety of languages and character sets. New in the latest version, the "-d" command-line option loads language maps from a non-default directory, and the "-t" option sets the number of top n-grams printed to the output map. About 30 new maps have also been added in the latest release.

To use mguesser, feed plain text to its STDIN. Note that "almost text" formats such as HTML may not give accurate results. If need be, a command-line switch can tell the software that the input is HTML. Mguesser works best on texts of 500 bytes or more; shorter texts are usually not guessed accurately.
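One way to sidestep the HTML problem is to strip the markup before passing the data on. The following Python sketch is an assumption-laden illustration, not part of mguesser: it works on raw bytes, because the character set is not yet known, and decodes them as Latin-1 only so that every byte survives the round trip unchanged. This is only suitable for ASCII-compatible encodings. The helper names and the `MIN_BYTES` constant are hypothetical; the 500-byte figure is the one quoted above.

```python
from html.parser import HTMLParser

MIN_BYTES = 500  # minimum useful input size quoted in the description


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        # convert_charrefs=False: entity references are dropped, not decoded,
        # so no character outside the Latin-1 range can appear in the output.
        super().__init__(convert_charrefs=False)
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html: bytes) -> bytes:
    """Strip tags from HTML bytes without interpreting the character set."""
    parser = _TextExtractor()
    parser.feed(html.decode("latin-1"))  # byte-preserving decode
    parser.close()
    raw = " ".join(parser.parts).encode("latin-1")
    # bytes.split() only splits on ASCII whitespace, so multi-byte
    # sequences in the original encoding are left intact.
    return b" ".join(raw.split())


def long_enough(data: bytes) -> bool:
    """True if the input reaches the 500-byte size recommended above."""
    return len(data) >= MIN_BYTES
```

The cleaned bytes can then be written to mguesser's STDIN in place of the original HTML.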

To guess the language and character set of a text file, run the following command: "mguesser < text_file". The software displays how well the file corresponds to each language map, best match first. Each result is scored with a value between 0 and 1. To limit the number of results, use the "-n" command-line switch. For instance, "mguesser -n3 < text_file" displays the top 3 results.
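The command above can also be driven from a script. This Python sketch assumes an `mguesser` binary is on the PATH and uses only the invocation forms quoted above. The function names are hypothetical, and because the output format is not described here, the raw bytes are returned without parsing.

```python
import subprocess


def guess_command(top_n=None):
    """Build the argument list; the "-n3"-style limit is the form quoted above."""
    args = ["mguesser"]
    if top_n is not None:
        args.append(f"-n{top_n}")
    return args


def run_mguesser(path, top_n=None):
    """Feed the file to mguesser on STDIN and return its raw output as bytes."""
    with open(path, "rb") as fh:
        result = subprocess.run(guess_command(top_n), stdin=fh,
                                stdout=subprocess.PIPE, check=True)
    return result.stdout
```

Calling `run_mguesser("text_file", top_n=3)` corresponds to "mguesser -n3 < text_file".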

To make mguesser load language maps from a non-default directory, use the "-d/path/to/maps/" option. To load maps from several directories, pass a colon-separated list. To create a new language map, run "mguesser -p -c charset -l language < text_file". Mguesser then builds a map from text_file and prints it to STDOUT. For best results, use a high-quality source text, usually around 500 KB.
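The two options above can be wrapped in small helpers. This Python sketch assumes an `mguesser` binary on the PATH and relies only on the "-d", "-p", "-c" and "-l" forms quoted above. The function names and the `map_path` argument are hypothetical, and how map files should be named is not stated here.

```python
import subprocess


def maps_option(directories):
    """Build the -d option from one or more map directories (colon-separated)."""
    return "-d" + ":".join(directories)


def create_map_command(charset, language):
    """Argument list for printing a new language map to STDOUT."""
    return ["mguesser", "-p", "-c", charset, "-l", language]


def create_map(sample_path, map_path, charset, language):
    """Build a map from sample_path and save mguesser's STDOUT to map_path."""
    with open(sample_path, "rb") as src, open(map_path, "wb") as dst:
        subprocess.run(create_map_command(charset, language),
                       stdin=src, stdout=dst, check=True)
```

For example, `maps_option(["/opt/maps", "/home/user/maps"])` yields a single argument that can be appended to the guessing command.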

Finally, you can include mguesser in your own applications. See the main() function in guesser.c for the order of the guesser function calls.
Copyright © 1999-2025 Softpile Free Downloads