This software is an open-source, Java-based implementation of the Internet Archive Wayback Machine.
The current Java version supports three access modes: "Archival URL" mode, "Proxy" mode, and "Domain Prefix" mode. In Archival URL mode, query and replay requests are expressed as URLs. This mode modifies archived content by rewriting its links and embedded-content references as Archival URLs, so that links and embedded content resolve back to the Wayback Machine.
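As a rough illustration of the rewriting described for Archival URL mode, the following sketch builds a replay URL and rewrites absolute links in a fragment of HTML. It is not the project's own rewriting code. The URL layout (replay prefix, then a 14-digit timestamp, then the original URL), the class and method names, the regular expression, and the localhost prefix are all assumptions made for the example.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the Wayback code base.
final class ArchivalUrlSketch {
    // Assumed timestamp format: 14 digits, e.g. 20050101120000.
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Naive pattern for absolute http(s) URLs in href/src attributes.
    private static final Pattern ABSOLUTE_LINK =
            Pattern.compile("\\b(href|src)=\"(https?://[^\"]+)\"");

    // Assumed layout: <replayPrefix>/<timestamp>/<originalUrl>
    static String toArchivalUrl(String replayPrefix, LocalDateTime capture, String originalUrl) {
        String prefix = replayPrefix.endsWith("/") ? replayPrefix : replayPrefix + "/";
        return prefix + capture.format(TIMESTAMP) + "/" + originalUrl;
    }

    // Rewrites every matched attribute so it points back at the replay server.
    static String rewriteLinks(String html, String replayPrefix, LocalDateTime capture) {
        Matcher m = ABSOLUTE_LINK.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String rewritten = m.group(1) + "=\""
                    + toArchivalUrl(replayPrefix, capture, m.group(2)) + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(rewritten));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        LocalDateTime capture = LocalDateTime.of(2005, 1, 1, 12, 0, 0);
        // prints http://localhost:8080/wayback/20050101120000/http://example.com/
        System.out.println(
                toArchivalUrl("http://localhost:8080/wayback", capture, "http://example.com/"));
    }
}
```

A real rewriter would also have to handle relative URLs, CSS, and JavaScript-generated links, which this sketch deliberately ignores.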
Proxy mode allows documents to be replayed within a client browser. Because the client itself requests additional URLs and embedded content, no server-side page markup or JavaScript is needed. This mode can also be used with a Firefox extension to navigate between versions of the current document, with the Wayback Machine server supplying images from the same time period.
Domain Prefix mode is similar to Archival URL mode, but uses a wildcard DNS scheme to rewrite URLs, so that all URL substitution occurs on the server. This mode is experimental.
The new Java version can be deployed in various configurations, from a standalone application on a single host holding all archived documents and indexes, up to a highly distributed system with indexes and archived content spread across numerous machines. In local, standalone mode, it automatically scans a specified location for new archived content, then indexes and serves that content.
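The scanning step in standalone mode can be pictured with a small polling sketch like the one below. It is not the project's implementation: the class name, the file-name suffixes checked, and the idea of remembering already-seen paths in memory are assumptions chosen only to show the general shape of "look in a directory, pick up what is new".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical sketch; not taken from the Wayback code base.
final class ArcDirectoryScanSketch {
    private final Path arcDir;
    private final Set<Path> seen = new HashSet<>();

    ArcDirectoryScanSketch(Path arcDir) {
        this.arcDir = arcDir;
    }

    // Assumption: finished ARC files end in ".arc" or ".arc.gz";
    // any other suffix is ignored.
    static boolean isArcFile(String name) {
        return name.endsWith(".arc") || name.endsWith(".arc.gz");
    }

    // Returns ARC files that were not present on any earlier call.
    List<Path> findNewArcFiles() throws IOException {
        List<Path> fresh = new ArrayList<>();
        try (Stream<Path> files = Files.list(arcDir)) {
            files.filter(Files::isRegularFile)
                 .filter(p -> isArcFile(p.getFileName().toString()))
                 .sorted()
                 .forEach(p -> {
                     if (seen.add(p)) {
                         fresh.add(p); // first time this path is seen
                     }
                 });
        }
        return fresh;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        ArcDirectoryScanSketch scanner = new ArcDirectoryScanSketch(dir);
        for (Path p : scanner.findNewArcFiles()) {
            System.out.println("would index: " + p);
        }
    }
}
```

Calling findNewArcFiles() repeatedly, for example from a scheduled task, would hand each new file to an indexer exactly once. A production scanner would also need to persist its state across restarts.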
To browse content archived by Heritrix as it is crawled, point the Wayback Machine at the directory where an instance of the Heritrix web crawler is writing its ARC output. Proxy mode may require special configuration of the client web browser, although that configuration is not complex. It does mean, however, that the replayed content will not be available at a global URL.
Version 1.4.2: N/A