This software categorizes text records as complete, partial, or near duplicates.
To get started, import the module and create an instance of Text::Record::Deduper. The dedupe_file() method removes entire lines that are duplicated; the module can also be configured to dedupe comma-separated records based on specific fields.
A key feature of Text::Record::Deduper is its ability to identify "near" duplicates by allowing for given-name aliases. For example, if your records contain variations of the same name (such as Bob and Robert), you can configure the module to recognize these aliases and group the records accordingly.
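The alias idea can be illustrated with a short, language-neutral sketch in Python. This is not the Text::Record::Deduper API. The ALIASES table and the canonical_name() and record_key() helpers are hypothetical names chosen for illustration, and the sketch assumes simple comma-separated records with no quoting. Each given name is mapped to a canonical form before the comparison key is built, so records that differ only by a known alias produce the same key.

```python
# Illustrative sketch only; not the Text::Record::Deduper API.
# ALIASES, canonical_name and record_key are hypothetical names.
ALIASES = {"bob": "robert", "rob": "robert", "bill": "william"}


def canonical_name(name, aliases=ALIASES):
    """Map a given name to its canonical form, if an alias is known."""
    key = name.strip().lower()
    return aliases.get(key, key)


def record_key(record, key_fields, name_field, separator=","):
    """Build a comparison key from the chosen fields of one record.

    The field at index name_field is passed through the alias table;
    the other key fields are only trimmed and lower-cased.
    """
    fields = record.split(separator)
    return tuple(
        canonical_name(fields[i]) if i == name_field else fields[i].strip().lower()
        for i in key_fields
    )
```

Two records are then treated as near duplicates when their keys are equal, for example "Smith,Bob" and "Smith,Robert" with key fields 0 and 1 and the given name in field 1.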
Text::Record::Deduper can also generate reports and split your records into unique and duplicate files. Options to ignore case and leading/trailing white space help catch duplicates that differ only in capitalization or spacing.
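The unique/duplicate split and the normalization options can be sketched as follows. Again, this is a conceptual Python illustration and not the module's own code. The normalize() and split_records() functions and their parameter names are assumptions, and the sketch returns two lists where the module is described as writing two files.

```python
# Illustrative sketch only; not the Text::Record::Deduper API.
# normalize and split_records are hypothetical names.
def normalize(value, ignore_case=False, ignore_whitespace=False):
    """Return the form of a record used for comparison."""
    if ignore_whitespace:
        value = value.strip()  # leading/trailing white space only
    if ignore_case:
        value = value.lower()
    return value


def split_records(records, ignore_case=False, ignore_whitespace=False):
    """Split records into (uniques, dupes), keeping first occurrences."""
    seen = set()
    uniques, dupes = [], []
    for record in records:
        key = normalize(record, ignore_case, ignore_whitespace)
        if key in seen:
            dupes.append(record)  # original text is preserved
        else:
            seen.add(key)
            uniques.append(record)
    return uniques, dupes
```

The first occurrence of each record is kept as the unique one and later matches are classed as duplicates. The original text of every record is preserved; only the comparison key is normalized.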
Text::Record::Deduper is suited to large sets of text records in which duplicates need to be identified and removed quickly.
Version 0.05: N/A