Command-Line Corrupt Office 2007 Text Extractor extracts text from corrupted docx, xlsx and pptx files where the respective Office 2007 or 2010 programs fail to make this basic recovery. It also works on non-corrupt files.
Version: 2.0.1
Coded by Ccy, author of HaHa Zip, and built with Delphi Zip, Command-Line Corrupt Office 2007 Text Extractor will often recover text from corrupt Office 2007 docx, xlsx, and pptx files when the respective Office 2007 or 2010 programs cannot salvage even the basic text or data.
Operating System: Windows
Office 2007 files in the Office Open XML format are zipped collections of XML files.
These files can be corrupted in two ways: the zip structure can be damaged, or the XML files that contain the actual text or data and/or the formatting can be damaged. The unzipping module used in Office 2007 and 2010 appears to be more finicky than the InfoZip module used by Command-Line Corrupt Office 2007 Text Extractor.
Thus the underlying XML can often be extracted as raw material for this program even when the Office 2007 and 2010 programs cannot get at it.
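To illustrate why a more forgiving unzipper can still reach the XML, here is a minimal Python sketch. It is not the program's own code (which is written in Delphi), and the names salvage_members, inflate_prefix and build_damaged_sample are hypothetical. It assumes the damage is a missing central directory at the end of the archive, and it ignores the directory entirely, scanning instead for the local file headers that precede each member.

```python
import io
import struct
import zipfile
import zlib

# Fixed 30-byte part of a zip local file header (little-endian).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")
SIGNATURE = b"PK\x03\x04"


def inflate_prefix(data: bytes, start: int, chunk: int = 4096) -> bytes:
    """Inflate a raw deflate stream, keeping whatever decodes before any damage."""
    inflater = zlib.decompressobj(-zlib.MAX_WBITS)  # negative wbits = raw deflate
    out = bytearray()
    try:
        for i in range(start, len(data), chunk):
            out += inflater.decompress(data[i:i + chunk])
            if inflater.eof:  # reached the end of this member's stream
                break
    except zlib.error:
        pass  # damaged stream: keep the partial output
    return bytes(out)


def salvage_members(data: bytes) -> dict[str, bytes]:
    """Recover members by scanning for local headers (hypothetical helper).

    A sketch only: a signature that happens to occur inside compressed data
    could produce a bogus entry, which a real tool would need to filter out.
    """
    recovered = {}
    pos = data.find(SIGNATURE)
    while pos != -1 and pos + LOCAL_HEADER.size <= len(data):
        fields = LOCAL_HEADER.unpack_from(data, pos)
        method, stored_size = fields[3], fields[7]
        name_len, extra_len = fields[9], fields[10]
        name_start = pos + LOCAL_HEADER.size
        body_start = name_start + name_len + extra_len
        name = data[name_start:name_start + name_len].decode("utf-8", errors="replace")
        if name_len and method == 8:      # deflated member
            recovered[name] = inflate_prefix(data, body_start)
        elif name_len and method == 0:    # stored (uncompressed) member
            recovered[name] = data[body_start:body_start + stored_size]
        pos = data.find(SIGNATURE, pos + len(SIGNATURE))
    return recovered


def build_damaged_sample() -> bytes:
    """Build a one-member archive and cut off its central directory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", "<w:t>Quarterly report</w:t>")
    data = buf.getvalue()
    return data[:data.index(b"PK\x01\x02")]
```

Python's standard zipfile module refuses the truncated sample because it cannot find the end-of-archive record, while the header scan still returns the XML member. That is the same kind of gap between a strict and a lenient unzipper that the description above attributes to Office and this program.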
With regard to the other type of corruption, XML is by design a very unforgiving medium for file damage.
Judging from the errors returned when attempting to salvage text from corrupt docx and pptx files, or data from corrupt xlsx files, Office 2007 and 2010 appear to use a standard XML parser. Command-Line Corrupt Office 2007 Text Extractor, on the other hand, uses code that is more tolerant of XML errors.
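The contrast between a strict XML parser and a tolerant extractor can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's actual algorithm: the function names are hypothetical, the input is assumed to be WordprocessingML cut off partway through, and tags are simply stripped with regular expressions rather than parsed. Using html.unescape to decode entities is an approximation, since it also accepts HTML-only entity names.

```python
import html
import re
import xml.etree.ElementTree as ET

TAG = re.compile(r"<[^<>]*>")            # any complete tag
DANGLING_TAG = re.compile(r"<[^<>]*$")   # a tag cut off by truncation


def strict_parse_ok(xml_text: str) -> bool:
    """Return True only if a standard parser accepts the whole document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def extract_text_tolerantly(xml_text: str) -> str:
    """Pull readable text out of possibly damaged XML (hypothetical helper)."""
    with_breaks = xml_text.replace("</w:p>", "\n")  # paragraph end -> newline
    no_tags = TAG.sub("", with_breaks)              # drop every complete tag
    no_tags = DANGLING_TAG.sub("", no_tags)         # drop a half-written last tag
    return html.unescape(no_tags).strip()           # decode &amp; and similar


# A document that was cut off in the middle of its third paragraph.
TRUNCATED = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/'
    'wordprocessingml/2006/main"><w:body>'
    "<w:p><w:r><w:t>First line</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Tom &amp; Jerry</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>cut off he"
)
```

Here the strict parser rejects the truncated document outright, while the tag-stripping pass still returns the two complete paragraphs and the surviving fragment of the third.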
Version history:
Version 2.0.1
1.) New program name
2.) Extract text from pptx or xlsx to csv
Version 1.0.2
1.) Bug fixed. Adding one line only when continuous Tab found
2.) Bug fixed. Remove unwanted text content that parsing from token
3.) New icon of executable
Version 1.0.1
1.) First Built