April 2, 2009

This software categorizes text records as complete, partial, or near duplicates.

Version: 0.05

License: Perl Artistic License

Platform: Linux

Supported Languages: English

Homepage: search.cpan.org

Developed by: Kim Ryan

Text::Record::Deduper is a Perl module for cleaning up duplicated text records. It detects complete, partial, and near duplicate records, so you can adapt the deduplication process to your data.

To get started, import the module and create a new instance of Text::Record::Deduper. From there, you can call the dedupe_file() method to remove entire lines that are duplicated, or configure the module to dedupe comma-separated records based on specific fields.
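The difference between whole-line and field-based deduplication can be illustrated with a short sketch. This is written in Python purely to show the idea; it is not the module's Perl API, and the function name `dedupe_records`, its parameters, and the sample records are all assumptions made for this example.

```python
def dedupe_records(lines, key_fields=None, separator=","):
    """Split lines into (uniques, dupes), keeping the first occurrence of each key.

    key_fields: 1-based field numbers that define a duplicate (hypothetical
    parameter); None means the entire line is compared.
    """
    seen = set()
    uniques, dupes = [], []
    for line in lines:
        if key_fields is None:
            key = line
        else:
            fields = line.split(separator)
            # A field missing from a short record is treated as an empty string.
            key = tuple(
                fields[n - 1] if n <= len(fields) else "" for n in key_fields
            )
        if key in seen:
            dupes.append(line)
        else:
            seen.add(key)
            uniques.append(line)
    return uniques, dupes


# Made-up sample data: surname, given name, city.
records = [
    "Smith,John,Sydney",
    "Smith,John,Melbourne",
    "Jones,Mary,Perth",
    "Smith,John,Sydney",
]

# Whole-line comparison: only the exact repeat is a duplicate.
whole_uniques, whole_dupes = dedupe_records(records)

# Field-based comparison on surname and given name: the city is ignored.
field_uniques, field_dupes = dedupe_records(records, key_fields=[1, 2])
```

With whole-line comparison only the repeated Sydney record is flagged, while keying on the first two fields also flags the Melbourne record as a partial duplicate.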

One of the most valuable features of Text::Record::Deduper is its ability to identify "near" duplicates by allowing for given name aliases. For example, if your records contain variations of the same name (such as Bob and Robert), you can configure the module to recognize these aliases and treat the matching records as duplicates.
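One way to picture alias-based matching is to map each given name to a canonical form before comparing records. The Python sketch below only illustrates that idea and does not reproduce the module's Perl interface; the alias table `NICKNAMES`, the function names, and the sample people are hypothetical.

```python
from collections import defaultdict

# Hypothetical alias table: nickname -> canonical given name.
NICKNAMES = {"Bob": "Robert", "Rob": "Robert"}


def canonical_name(given_name, aliases):
    """Return the canonical form of a given name, or the name itself if no alias exists."""
    return aliases.get(given_name, given_name)


def group_near_duplicates(records, aliases):
    """Group (surname, given_name) records whose canonical names match.

    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for surname, given_name in records:
        key = (surname, canonical_name(given_name, aliases))
        groups[key].append((surname, given_name))
    return {key: members for key, members in groups.items() if len(members) > 1}


# Made-up sample data.
people = [("Smith", "Bob"), ("Smith", "Robert"), ("Jones", "Rob"), ("Smith", "Mary")]
near_dupes = group_near_duplicates(people, NICKNAMES)
```

Here "Smith, Bob" and "Smith, Robert" land in the same group, while "Jones, Rob" stays on its own because no other Jones record shares the canonical name.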

Text::Record::Deduper can also generate reports and split your records into separate unique and duplicate files. Options to ignore case and leading/trailing white space help catch duplicates that differ only in capitalization or spacing.
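The effect of the case and white space options can be sketched as a normalization step applied to each value before comparison. The Python below is an illustration under assumed names (`normalize_field`, `split_uniques_and_dupes`, and the option names are hypothetical), not the module's own implementation.

```python
def normalize_field(value, ignore_case=False, ignore_whitespace=False):
    """Build the comparison key for a value according to the chosen options."""
    if ignore_whitespace:
        value = value.strip()  # drop leading/trailing white space only
    if ignore_case:
        value = value.casefold()
    return value


def split_uniques_and_dupes(values, **options):
    """Return (uniques, dupes); the first occurrence of each key counts as unique."""
    seen = set()
    uniques, dupes = [], []
    for value in values:
        key = normalize_field(value, **options)
        if key in seen:
            dupes.append(value)
        else:
            seen.add(key)
            uniques.append(value)
    return uniques, dupes


# Made-up sample data with differences in case and spacing only.
names = ["Ryan", " ryan ", "RYAN", "Kim"]
strict = split_uniques_and_dupes(names)
relaxed = split_uniques_and_dupes(names, ignore_case=True, ignore_whitespace=True)
```

Under the strict comparison all four values count as unique; with both options enabled, the second and third values are reported as duplicates of the first.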

Overall, if you work with large sets of text records and need a quick way to identify duplicates and clean up your data, Text::Record::Deduper is worth a look.

What's New

Version 0.05: N/A

Free Download 9.9K
