Parse and filter HTML using a PHP class.
The library comprises multiple classes, including a markup parser class that can handle different types of markup documents, such as HTML, XML, and DTD files. The classes can be chained, so that each filter class consumes the document tokens produced by the main markup parser class and filters the document elements.
The markup validator filter class validates documents against a DTD and can optionally remove invalid tags and attributes. The safe HTML filter class processes HTML tags and data against whitelists, removing potentially harmful HTML tags and CSS that could be used in cross-site scripting (XSS) or cross-site request forgery (CSRF) attacks.
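To illustrate the whitelist idea in general terms, here is a minimal sketch that uses PHP's bundled DOM extension rather than this library's own classes. The function names filter_html and clean_node and the shape of the $allowed array are hypothetical, chosen only for this example; the library's actual API is not shown here and may work quite differently, since it operates on parser tokens. The sketch also assumes that only absolute http, https, and mailto URLs are acceptable in href attributes.

```php
<?php
// Hypothetical sketch of whitelist filtering; NOT the library's API.

/** Remove everything below $node that is not explicitly whitelisted. */
function clean_node(DOMNode $node, array $allowed): void
{
    // Copy the child list first: removing nodes mutates the live list.
    foreach (iterator_to_array($node->childNodes, false) as $child) {
        if ($child instanceof DOMText) {
            continue; // plain text is kept as-is
        }
        if (!($child instanceof DOMElement)) {
            $node->removeChild($child); // comments and other node types
            continue;
        }
        $tag = strtolower($child->nodeName);
        if (!isset($allowed[$tag])) {
            $node->removeChild($child); // drop the element and its contents
            continue;
        }
        foreach (iterator_to_array($child->attributes, false) as $attr) {
            $name = strtolower($attr->nodeName);
            $ok = in_array($name, $allowed[$tag], true);
            if ($ok && $name === 'href') {
                // Assumption: only absolute http(s) and mailto URLs are allowed.
                $ok = preg_match('#^(?:https?://|mailto:)#i', $attr->nodeValue) === 1;
            }
            if (!$ok) {
                $child->removeAttribute($attr->nodeName);
            }
        }
        clean_node($child, $allowed);
    }
}

/** Filter an HTML fragment against a tag => allowed-attributes whitelist. */
function filter_html(string $html, array $allowed): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $doc->loadHTML('<div>' . $html . '</div>');   // wrapper gives a known root
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);
    clean_node($root, $allowed);

    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Example whitelist: <p> with no attributes, <a> with href only.
$allowed = ['p' => [], 'a' => ['href']];
echo filter_html('<p onclick="steal()">Hi <script>alert(1)</script></p>', $allowed);
```

Listing what is allowed, rather than what is forbidden, means that any tag or attribute the whitelist does not mention is dropped by default.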
Once the HTML tokens have been filtered, they are reassembled into a well-formed, safe HTML document. In addition, the HTML links filter class can extract the links found in the HTML document.
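Link extraction of the kind described above can be sketched as follows, again with PHP's bundled DOM extension instead of the library's links filter class. The function name extract_links is hypothetical, and the sketch assumes that only the href attributes of anchor tags count as links.

```php
<?php
// Hypothetical sketch of link extraction; NOT the library's API.

/** Return the href values of all <a> elements, in document order. */
function extract_links(string $html): array
{
    if (trim($html) === '') {
        return []; // DOMDocument::loadHTML() rejects empty input
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

print_r(extract_links('<p><a href="https://example.com/a">A</a> <a href="/b">B</a></p>'));
```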
Finally, the DTD parser and CSS parser are utility classes used by other classes in the Secure HTML parser and filter library.
Overall, the Secure HTML parser and filter library provides a way to parse HTML and filter out insecure tags and CSS styles, guarding against attacks such as XSS and CSRF.
Version 2009.08.27: N/A