These XML parsing functions require the expat library. However, because Apache 1.3.7 and
later is bundled with expat, this library is already installed on most machines. Therefore, PHP
enables these functions by default, and you don't need to explicitly configure PHP to support
XML.
expat parses XML documents and allows you to configure the parser to call functions when it
encounters different parts of the file, such as an opening or closing element tag or character
data (the text between tags). Based on the tag name, you can then choose whether to format
or ignore the data. This is known as event-based parsing and contrasts with DOM XML, which
uses a tree-based parser.
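To make the event-based model concrete, here is a minimal sketch that records each event the parser reports for a tiny document. The sample XML and the $events array are assumptions made for illustration, and closures are used as handlers only to keep the example short:

```php
<?php
// Hypothetical sample document, used only for illustration.
$doc = '<note><to>Ann</to><body>Hello</body></note>';

// Collects a readable record of each event, in the order the parser reports it.
$events = [];

$parser = xml_parser_create();
// Keep element names as written instead of folding them to uppercase.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

// Called for every opening and closing element tag.
xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$events) { $events[] = "start:$name"; },
    function ($p, $name) use (&$events) { $events[] = "end:$name"; }
);

// Called for the text between tags.
xml_set_character_data_handler(
    $parser,
    function ($p, $text) use (&$events) { $events[] = "text:$text"; }
);

// The third argument marks this as the final (here, the only) chunk.
if (!xml_parse($parser, $doc, true)) {
    die("Can't parse sample document");
}

print implode("\n", $events) . "\n";
```

The parser never builds a tree of the document; each handler sees one piece at a time and decides what to keep.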
A popular API for event-based XML parsing is SAX: Simple API for XML. Originally developed
only for Java, SAX has spread to other languages. PHP's XML functions follow SAX
conventions. For more on the latest version of SAX (SAX2), see SAX2 by David Brownell
(O'Reilly).
PHP supports two interfaces to expat: a procedural one and an object-oriented one. Since the
procedural interface practically forces you to use global variables to accomplish any
meaningful task, we prefer the object-oriented version. With the object-oriented interface, you
can bind an object to the parser and interact with the object while processing XML. This allows
you to use object properties instead of global variables.
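As a rough illustration of keeping state in object properties, the sketch below counts elements by name. The ElementCounter class, its method names, and the sample document are hypothetical. It also passes array callables to xml_set_element_handler( ) rather than calling xml_set_object( ); that substitution is a choice made for this sketch, not something taken from the text, and it produces the same binding of handlers to an object:

```php
<?php
// Hypothetical handler class: counts elements using an object property
// rather than a global variable.
class ElementCounter
{
    public $counts = [];

    public function startElement($parser, $name, $attrs)
    {
        if (!isset($this->counts[$name])) {
            $this->counts[$name] = 0;
        }
        $this->counts[$name]++;
    }

    public function endElement($parser, $name)
    {
        // Nothing to do on closing tags in this sketch.
    }
}

$counter = new ElementCounter;
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

// Array callables route each event to a method on $counter.
xml_set_element_handler(
    $parser,
    [$counter, 'startElement'],
    [$counter, 'endElement']
);

// Hypothetical sample document.
$doc = '<list><item>a</item><item>b</item></list>';
xml_parse($parser, $doc, true) or die("Can't parse sample document");

print_r($counter->counts);
```

Because the tallies live in $counter->counts, several parsers can run in the same script without sharing or clobbering each other's state.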
Here's an example application of expat that shows how to process an RSS feed and transform
it into HTML. For more on RSS, see Recipe 12.12. The script starts with the standard XML
processing code, followed by the objects created to parse RSS specifically:
$xml = xml_parser_create( );
$rss = new pc_RSS_parser;
xml_set_object($xml, $rss);
xml_set_element_handler($xml, 'start_element', 'end_element');
xml_set_character_data_handler($xml, 'character_data');
xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, false);
$feed = 'http://pear.php.net/rss.php';
$fp = fopen($feed, 'r') or die("Can't read RSS data.");
while ($data = fread($fp, 4096)) {
    xml_parse($xml, $data, feof($fp)) or die("Can't parse RSS data");
}
fclose($fp);
xml_parser_free($xml);
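The read loop above hands the document to xml_parse( ) in pieces and uses the third argument to flag the last piece. The following sketch isolates that pattern and reports more detail than a bare die( ) when parsing fails. The parse_in_chunks( ) helper, its chunk size, and the sample strings are hypothetical, and an in-memory string stands in for the remote feed:

```php
<?php
// Hypothetical helper: feeds a string to the parser in fixed-size chunks
// and returns null on success or a descriptive error message on failure.
function parse_in_chunks($xml_text, $chunk_size = 16)
{
    $parser = xml_parser_create();
    $length = strlen($xml_text);

    for ($offset = 0; $offset < $length; $offset += $chunk_size) {
        $chunk = substr($xml_text, $offset, $chunk_size);
        // Tell the parser when it is receiving the last chunk.
        $is_final = ($offset + $chunk_size >= $length);

        if (!xml_parse($parser, $chunk, $is_final)) {
            // Combine the parser's own message with the line it stopped on.
            return sprintf(
                '%s at line %d',
                xml_error_string(xml_get_error_code($parser)),
                xml_get_current_line_number($parser)
            );
        }
    }
    return null;
}

// Hypothetical inputs: one well-formed, one with a missing </title> tag.
$good = '<rss><channel><title>Example</title></channel></rss>';
$bad  = '<rss><channel><title>Example</channel></rss>';

var_dump(parse_in_chunks($good));

$error = parse_in_chunks($bad);
print "Parse failed: $error\n";
```

Tags may be split across chunk boundaries; the parser buffers incomplete input until the next call, which is why the final-chunk flag matters.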
After creating a new XML parser and an instance of the pc_RSS_parser class, configure the
parser. First, bind the object to the parser; this tells the parser to call the object's methods
instead of global functions. Then call xml_set_element_handler( ) and
xml_set_character_data_handler( ) to specify the method names the parser should
call when it encounters elements and character data. The first argument to both functions is
the parser instance; the other arguments are the function names. With