Download
For most projects, simply download the full source code and include it in your project.
However, the individual components can also be downloaded separately. This might be useful if you only need part of the functionality, or want an easy-to-manage version for making changes.
- Main: A simple wrapper class that ties together all of the other objects.
- Parser: Takes an HTML string and a reference to a DOM object (below). Every time a tag is opened or closed, or a text node is created, the parsed data is passed to the DOM object.
- DOM: As the parser finds tags, the DOM object builds up a clean internal representation of the document. It does not trust the parsed data as-is: everything is first passed through the HTML and CSS cleaners (below).
- Clean HTML: Before the DOM object adds a tag to its representation of the document, the tag is checked and cleaned. The tests on each tag include ensuring that it has the correct parent (e.g. a <ul> has <li> children), that it is on one of the allowed lists (one for inline elements and one for block-level elements), and that each attribute is allowed for that tag (e.g. "href" on an <a>).
- Clean CSS: Before the DOM object adds a style attribute to its representation of the document, the contents are parsed, checked, and cleaned. Each rule must be recognised, and its value should be appropriate.
- Output: Takes the DOM array and converts it into an HTML string.
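To make the flow between these components concrete, here is a minimal sketch in Python. It is not the project's own code: it uses Python's standard `html.parser` in place of the Parser component, and every name in it (`ALLOWED_TAGS`, `REQUIRED_PARENT`, `ALLOWED_CSS`, `Dom`, `clean`, and so on) is hypothetical, as are the tiny allow-lists. It only illustrates the idea of a parser feeding a DOM object that cleans each tag and style before storing it, followed by an output step.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-lists (assumptions for this sketch only).
ALLOWED_TAGS = {"p": set(), "ul": set(), "li": set(),
                "a": {"href"}, "span": {"style"}}
REQUIRED_PARENT = {"li": {"ul"}}
ALLOWED_CSS = {"color": {"red", "green", "blue"},
               "font-weight": {"bold", "normal"}}


def clean_css(style):
    """Clean CSS: keep only recognised rules with an allowed value."""
    kept = []
    for declaration in style.split(";"):
        name, sep, value = declaration.partition(":")
        name, value = name.strip().lower(), value.strip().lower()
        if sep and value in ALLOWED_CSS.get(name, ()):
            kept.append(f"{name}: {value}")
    return "; ".join(kept)


def clean_tag(tag, attrs, parent):
    """Clean HTML: return the cleaned attributes, or None to drop the tag."""
    if tag not in ALLOWED_TAGS:
        return None
    if tag in REQUIRED_PARENT and parent not in REQUIRED_PARENT[tag]:
        return None
    cleaned = {}
    for name, value in attrs:
        if name not in ALLOWED_TAGS[tag] or value is None:
            continue
        if name == "style":
            value = clean_css(value)
            if not value:
                continue
        # A real cleaner would also validate values such as URL schemes.
        cleaned[name] = value
    return cleaned


class Dom:
    """DOM: builds a clean tree from the events the parser reports."""

    def __init__(self):
        self.root = {"tag": None, "attrs": {}, "children": []}
        self.stack = [self.root]

    def tag_open(self, tag, attrs):
        parent = self.stack[-1]
        cleaned = clean_tag(tag, attrs, parent["tag"])
        if cleaned is None:
            return  # tag dropped; its text stays with the current parent
        node = {"tag": tag, "attrs": cleaned, "children": []}
        parent["children"].append(node)
        self.stack.append(node)

    def tag_close(self, tag):
        # Close the nearest open node with this name, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                return

    def text(self, data):
        self.stack[-1]["children"].append(data)


class Parser(HTMLParser):
    """Parser: passes every tag and text node to the DOM object."""

    def __init__(self, dom):
        super().__init__(convert_charrefs=True)
        self.dom = dom

    def handle_starttag(self, tag, attrs):
        self.dom.tag_open(tag, attrs)

    def handle_endtag(self, tag):
        self.dom.tag_close(tag)

    def handle_data(self, data):
        self.dom.text(data)


def output(node):
    """Output: convert the cleaned tree back into an HTML string."""
    if isinstance(node, str):
        return escape(node, quote=False)
    inner = "".join(output(child) for child in node["children"])
    if node["tag"] is None:
        return inner
    attrs = "".join(f' {k}="{escape(v)}"' for k, v in node["attrs"].items())
    return f"<{node['tag']}{attrs}>{inner}</{node['tag']}>"


def clean(html):
    """Main: a thin wrapper that ties the pieces together."""
    dom = Dom()
    parser = Parser(dom)
    parser.feed(html)
    parser.close()
    return output(dom.root)
```

With these assumed lists, `clean('<p onclick="x()">Hi <blink>there</blink></p>')` returns `<p>Hi there</p>`: the disallowed attribute and the unknown tag are dropped, while the text is kept. Dropping a tag but keeping its text is a choice made for this sketch; a real cleaner may prefer to discard the contents of some elements entirely.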