Syntax Sugar #3 - Easily Parsing HTML
Monday, July 25, 2011 at 9:30AM
The PHP SimpleXML extension makes parsing and using XML documents in your code a piece of cake. Unfortunately, HTML is rarely well-formed XML, so SimpleXML usually cannot load it directly.
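Combining SimpleXML with DOMDocument lets us parse a reasonably badly formatted HTML document in very few lines of code. The trick is to load the source with DOMDocument::loadHTML(), which parses it as HTML rather than XML and tolerates dodgy markup. Setting DOMDocument::$strictErrorChecking to FALSE additionally tells the DOM extension not to throw exceptions on errors.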
<?php
// Some HTML string...
$html = file_get_contents("http://codeigniter.com");
// Create a new DOMDocument and set strictErrorChecking to FALSE
$dom = new DOMDocument();
$dom->strictErrorChecking = FALSE;
// Load the HTML into the DOMDocument
$dom->loadHTML($html);
// Load the DOMDocument into SimpleXML... and win!
$obj = simplexml_import_dom($dom);
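
Once the document is in SimpleXML, it can be queried like any other XML. The following is only a sketch of how that might look: the sample markup and the $links array are made up for illustration, and the libxml_use_internal_errors() calls are an addition that is not part of the snippet above, assumed here to keep the parse warnings that loadHTML() raises on broken markup out of the output.

```php
<?php
// Hypothetical sample markup: unclosed <li> and <p> tags, no <html> or <body>.
$html = '<ul><li><a href="/one">One</a><li><a href="/two">Two</a></ul><p>Unclosed paragraph';

// Assumption: collect libxml parse warnings internally instead of
// letting them surface as PHP warnings.
$previous = libxml_use_internal_errors(true);

// Same steps as above: a lenient DOMDocument, then the HTML parser.
$dom = new DOMDocument();
$dom->strictErrorChecking = FALSE;
$dom->loadHTML($html);

// Discard the collected warnings and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Load the DOMDocument into SimpleXML.
$obj = simplexml_import_dom($dom);

// Collect every link with an XPath query on the SimpleXML object.
$links = array();
foreach ($obj->xpath('//a') as $a) {
    $links[] = array(
        'href' => (string) $a['href'],
        'text' => trim((string) $a),
    );
}

print_r($links);
```

Casting to string matters here: the attribute and the element are both SimpleXMLElement objects until they are cast.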
Tagged: php, syntax sugar