Softpile


Uplug

April 3, 2008
Uplug is a collection of tools for linguistic corpus processing, word alignment, and term extraction from parallel corpora.
Version: 0.2.0c
License: GPL
Operating System: Linux
Homepage: sourceforge.net
Developed by: Joerg Tiedemann
Several tools have been integrated in Uplug.

Pre-processing tools include a sentence splitter, a tokenizer, and external part-of-speech taggers and shallow parsers. The following external tools are used: the Grok system for English (tagging and chunking) and the morphological analyzer ChaSen for Japanese.
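
To illustrate what sentence splitting and tokenization involve, here is a minimal Python sketch. It is not Uplug's code: the function names `split_sentences` and `tokenize` and the regular expressions are assumptions made for this example, and real splitters also need to handle abbreviations, numbers, and language-specific rules.

```python
import re


def split_sentences(text):
    """Naive splitter (illustrative only): break after '.', '!' or '?'
    when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Naive tokenizer (illustrative only): runs of word characters,
    or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


if __name__ == "__main__":
    for sent in split_sentences("Uplug aligns words. It also splits sentences!"):
        print(tokenize(sent))
```

The output of such a step, one token list per sentence, is the kind of input a tagger or chunker would then annotate.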

Other tools, such as TreeTagger, can easily be added. Translated documents can be sentence-aligned using the length-based approach of Gale & Church. Words and phrases can be aligned using the clue alignment approach and GIZA++, a toolbox for training statistical alignment models.
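
The length-based idea can be sketched in a few lines of Python. This is a simplified illustration, not Uplug's implementation: the names `match_cost` and `align` are hypothetical, and the pattern priors and the constants `C` and `S2` are the values commonly quoted for the Gale & Church method, treated here as assumptions. The sketch takes sentence lengths in characters and uses dynamic programming to choose the cheapest sequence of 1-1, 1-0, 0-1, 2-1, 1-2, and 2-2 matches.

```python
import math

# Assumed prior probability of each pattern: (source sentences, target sentences).
PRIORS = {(1, 1): 0.89, (1, 0): 0.0099, (0, 1): 0.0099,
          (2, 1): 0.089, (1, 2): 0.089, (2, 2): 0.011}
C = 1.0   # assumed expected target characters per source character
S2 = 6.8  # assumed variance of that ratio


def match_cost(len_src, len_tgt, prior):
    """Negative log-probability of matching spans with these character lengths."""
    if len_src == 0 and len_tgt == 0:
        return 0.0
    mean = (len_src + len_tgt / C) / 2.0
    delta = (len_tgt - len_src * C) / math.sqrt(mean * S2)
    # Two-sided tail probability of a standard normal at |delta|.
    tail = math.erfc(abs(delta) / math.sqrt(2.0))
    return -math.log(max(tail, 1e-300)) - math.log(prior)


def align(src_lens, tgt_lens):
    """Return a list of (source indices, target indices) pairs."""
    n, m = len(src_lens), len(tgt_lens)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            # Extend the best path ending at (i, j) by every pattern.
            for (di, dj), prior in PRIORS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + match_cost(sum(src_lens[i:ni]),
                                            sum(tgt_lens[j:nj]), prior)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Walk back from the end to recover the chosen matches.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    print(align([20, 35, 10], [22, 33, 11]))
    print(align([30, 30], [60]))
```

Lengths can be obtained with `[len(s) for s in sentences]`. The method relies only on the assumption that long sentences tend to translate into long sentences, so it needs no dictionary or other language-specific resources.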

What's New in This Release:

· robust conversion of encodings in tag.pl/toktag.pl/chunk.pl
· added TreeTagger startup scripts for es and nl; replace "nbsp" with " "
· robust conversion between encodings in bitext-indexer.pl/opus-indexer.pl
· added startup scripts for spanish and dutch tree-tagger models
· updated startup scripts for other treetagger models according to latest TreeTagger distribution
· fixed hunalign (bug in converting alignment output to xml)
· added missing ';' at line 40 in Uplug.pm
Download size: 21.9M
Copyright © 1999-2017 Softpile Free Downloads