This document tests the ability of Apache Tika to extract content from an XHTML document.