The PHP 5 DOM extension is well suited to creating and working with valid RSS feeds. I found an easy-to-follow step-by-step tutorial here:
http://xml-rss.de/xml-rss-feed-mit-php.htm
(It is in German, but I think the code can be understood in any language.) Maybe it helps someone understand the DOM better. But beware: it doesn't work on PHP 4!
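As a rough illustration of the idea in this note, here is a minimal sketch of building an RSS 2.0 feed with the PHP 5 DOM extension. It is not taken from the linked tutorial; the helper add_text_element() and all channel and item values are made-up assumptions.

```php
<?php
// Minimal RSS 2.0 feed built with the PHP 5 DOM extension (not domxml).
// add_text_element() is a hypothetical helper; all feed values are sample data.
function add_text_element(DOMDocument $doc, DOMNode $parent, $name, $text)
{
    $el = $doc->createElement($name);
    // A text node escapes &, < and > when the document is serialized
    $el->appendChild($doc->createTextNode($text));
    $parent->appendChild($el);
    return $el;
}

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;

$rss = $doc->createElement('rss');
$rss->setAttribute('version', '2.0');
$doc->appendChild($rss);

$channel = $doc->createElement('channel');
$rss->appendChild($channel);
add_text_element($doc, $channel, 'title', 'Example feed');
add_text_element($doc, $channel, 'link', 'http://example.com/');
add_text_element($doc, $channel, 'description', 'News & notes');

$item = $doc->createElement('item');
$channel->appendChild($item);
add_text_element($doc, $item, 'title', 'First post');
add_text_element($doc, $item, 'link', 'http://example.com/first');

$feed = $doc->saveXML();
echo $feed;
```

Building the text through createTextNode() rather than string concatenation leaves the escaping of special characters to the serializer.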
DOM XML (PHP 4) Functions
Deprecated functions
There are a number of functions that do not fit into the DOM standard and should no longer be used. These functions are listed in the following table. The function DomNode_append_child() has changed its behavior: it now adds a child instead of a sibling. If this breaks your application, use the non-standard function DomNode_append_sibling().
| Old function | New function |
|---|---|
| xmldoc | domxml_open_mem() |
| xmldocfile | domxml_open_file() |
| domxml_new_xmldoc | domxml_new_doc() |
| domxml_dump_mem | DomDocument_dump_mem() |
| domxml_dump_mem_file | DomDocument_dump_file() |
| DomDocument_dump_mem_file | DomDocument_dump_file() |
| DomDocument_add_root | DomDocument_create_element() followed by DomNode_append_child() |
| DomDocument_dtd | DomDocument_doctype() |
| DomDocument_root | DomDocument_document_element() |
| DomDocument_children | DomNode_child_nodes() |
| DomDocument_imported_node | No replacement. |
| DomNode_add_child | Create a new node with, e.g., DomDocument_create_element() and add it with DomNode_append_child(). |
| DomNode_children | DomNode_child_nodes() |
| DomNode_parent | DomNode_parent_node() |
| DomNode_new_child | Create a new node with, e.g., DomDocument_create_element() and add it with DomNode_append_child(). |
| DomNode_set_content | Create a new node with, e.g., DomDocument_create_text_node() and add it with DomNode_append_child(). |
| DomNode_get_content | Content is just a text node and can be accessed with DomNode_child_nodes(). |
| DomNode_set_content | Content is just a text node and can be added with DomNode_append_child(). |
Classes
The API of the module follows the DOM Level 2 standard as closely as possible. Consequently, the API is fully object-oriented. It is a good idea to have the DOM standard at hand when using this module. Although the API is object-oriented, there are many functions that can be called in a non-object-oriented way by passing the object to operate on as the first argument. These functions exist mainly to retain compatibility with older versions of the extension and should not be used when creating new scripts.
This API differs from the official DOM API in two ways. First, all class attributes are implemented as functions with the same name. Second, the function names follow the PHP naming convention. This means that a DOM function lastChild() is written as last_child().
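For comparison only, here is a minimal sketch of how the same accessors look in the later PHP 5 DOM extension, where DOM attributes are properties and keep their camelCase DOM names. It assumes PHP 5 or later with the DOM extension; the sample XML is made up, and the domxml spellings appear only in comments.

```php
<?php
// PHP 5 DOM extension (not domxml): DOM attributes are properties
// and keep the camelCase names from the DOM standard.
$doc = new DOMDocument();
$doc->loadXML('<chapter><title>Title</title><para>Text</para></chapter>');

$root     = $doc->documentElement;   // domxml: $doc->document_element()
$last     = $root->lastChild;        // domxml: $root->last_child()
$lastName = $last->nodeName;         // domxml: $last->node_name()

echo $lastName, "\n";
```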
This module defines a number of classes, which are listed - including their methods - in the following tables. Classes with an equivalent in the DOM standard are named DOMxxx.
| Class name | Parent classes |
|---|---|
| DomAttribute | DomNode |
| DomCData | DomNode |
| DomComment | DomCData : DomNode |
| DomDocument | DomNode |
| DomDocumentType | DomNode |
| DomElement | DomNode |
| DomEntity | DomNode |
| DomEntityReference | DomNode |
| DomProcessingInstruction | DomNode |
| DomText | DomCData : DomNode |
| Parser | Currently still called DomParser |
| XPathContext | |
Methods of the DomDocument class
| Method name | Function name | Remark |
|---|---|---|
| doctype | DomDocument_doctype() | |
| document_element | DomDocument_document_element() | |
| create_element | DomDocument_create_element() | |
| create_text_node | DomDocument_create_text_node() | |
| create_comment | DomDocument_create_comment() | |
| create_cdata_section | DomDocument_create_cdata_section() | |
| create_processing_instruction | DomDocument_create_processing_instruction() | |
| create_attribute | DomDocument_create_attribute() | |
| create_entity_reference | DomDocument_create_entity_reference() | |
| get_elements_by_tagname | DomDocument_get_elements_by_tagname() | |
| get_element_by_id | DomDocument_get_element_by_id() | |
| dump_mem | DomDocument_dump_mem() | not in the DOM standard |
| dump_file | DomDocument_dump_file() | not in the DOM standard |
| html_dump_mem | DomDocument_html_dump_mem() | not in the DOM standard |
| xpath_init | xpath_init | not in the DOM standard |
| xpath_new_context | xpath_new_context | not in the DOM standard |
| xptr_new_context | xptr_new_context | not in the DOM standard |
Methods of the DomElement class
| Method name | Function name | Remark |
|---|---|---|
| tagname | DomElement_tagname() | |
| get_attribute | DomElement_get_attribute() | |
| set_attribute | DomElement_set_attribute() | |
| remove_attribute | DomElement_remove_attribute() | |
| get_attribute_node | DomElement_get_attribute_node() | |
| set_attribute_node | DomElement_set_attribute_node() | |
| get_elements_by_tagname | DomElement_get_elements_by_tagname() | |
| has_attribute | DomElement_has_attribute() | |
Methods of the DomNode class
| Function name | Remark |
|---|---|
| DomNode_node_name() | |
| DomNode_node_value() | |
| DomNode_node_type() | |
| DomNode_last_child() | |
| DomNode_first_child() | |
| DomNode_child_nodes() | |
| DomNode_previous_sibling() | |
| DomNode_next_sibling() | |
| DomNode_parent_node() | |
| DomNode_owner_document() | |
| DomNode_insert_before() | |
| DomNode_append_child() | |
| DomNode_append_sibling() | not in the DOM standard. This function emulates the former behavior of DomNode_append_child(). |
| DomNode_remove_child() | |
| DomNode_has_child_nodes() | |
| DomNode_has_attributes() | |
| DomNode_clone_node() | |
| DomNode_attributes() | |
| DomNode_unlink_node() | not in the DOM standard |
| DomNode_replace_node() | not in the DOM standard |
| DomNode_set_content() | not in the DOM standard, deprecated |
| DomNode_get_content() | not in the DOM standard, deprecated |
| DomNode_dump_node() | not in the DOM standard |
| DomNode_is_blank_node() | not in the DOM standard |
Methods of the DomAttribute class
| Method name | Function name | Remark |
|---|---|---|
| name | DomAttribute_name() | |
| value | DomAttribute_value() | |
| specified | DomAttribute_specified() | |
Methods of the DomProcessingInstruction class
| Method name | Function name | Remark |
|---|---|---|
| target | DomProcessingInstruction_target() | |
| data | DomProcessingInstruction_data() | |
Methods of the Parser class
| Method name | Function name | Remark |
|---|---|---|
| add_chunk | Parser_add_chunk() | |
| end | Parser_end() | |
Methods of the XPathContext class
| Method name | Function name | Remark |
|---|---|---|
| eval | XPathContext_eval() | |
| eval_expression | XPathContext_eval_expression() | |
| register_ns | XPathContext_register_ns() | |
Methods of the DomDocumentType class
| Method name | Function name | Remark |
|---|---|---|
| name | DomDocumentType_name() | |
| entities | DomDocumentType_entities() | |
| notations | DomDocumentType_notations() | |
| public_id | DomDocumentType_public_id() | |
| system_id | DomDocumentType_system_id() | |
| internal_subset | DomDocumentType_internal_subset() | |
The class DomDtd is derived from DomNode. DomComment is derived from DomCData.
Examples
Many examples in this reference require an XML string. Instead of repeating this string in every example, it is placed in a file that each example includes. This include file is shown in the following example section. Alternatively, you could create an XML document and read it with DomDocument_open_file().
Example #1 Include file example.inc with XML string
<?php
$xmlstr = "<?xml version='1.0' standalone='yes'?>
<!DOCTYPE chapter SYSTEM '/share/sgml/Norman_Walsh/db3xml10/db3xml10.dtd'
[ <!ENTITY sp \"spanish\">
]>
<!-- lsfj -->
<chapter language='en'><title language='en'>Title</title>
<para language='ge'>
&sp;
<!-- comment -->
<informaltable ID='findme' language='&sp;'>
<tgroup cols='3'>
<tbody>
<row><entry>a1</entry><entry
morerows='1'>b1</entry><entry>c1</entry></row>
<row><entry>a2</entry><entry>c2</entry></row>
<row><entry>a3</entry><entry>b3</entry><entry>c3</entry></row>
</tbody>
</tgroup>
</informaltable>
</para>
</chapter>";
?>
Table of Contents
- DomAttribute::name — Returns the name of the attribute
- DomAttribute::set_value — Sets the value of an attribute
- DomAttribute::specified — Checks whether the attribute is specified
- DomAttribute::value — Returns the value of the attribute
- DomDocument::add_root — Adds a root node [deprecated]
- DomDocument::create_attribute — Creates a new attribute
- DomDocument::create_cdata_section — Creates a new CDATA node
- DomDocument::create_comment — Creates a new comment node
- DomDocument::create_element_ns — Creates a new element node with an associated namespace
- DomDocument::create_element — Creates a new element node
- DomDocument::create_entity_reference — Creates an entity reference
- DomDocument::create_processing_instruction — Creates a new PI node
- DomDocument::create_text_node — Creates a new text node
- DomDocument::doctype — Returns the document type
- DomDocument::document_element — Returns the root element node
- DomDocument::dump_file — Dumps the internal XML tree back into a file
- DomDocument::dump_mem — Dumps the internal XML tree back into a string
- DomDocument::get_element_by_id — Searches for an element with a certain id
- DomDocument::get_elements_by_tagname — Returns array with nodes with given tagname in document or empty array, if not found
- DomDocument::html_dump_mem — Dumps the internal XML tree back into a string as HTML
- DomDocument::xinclude — Substitutes XIncludes in a DomDocument Object
- DomDocumentType::entities — Returns list of entities
- DomDocumentType::internal_subset — Returns the internal subset
- DomDocumentType::name — Returns the name of the document type
- DomDocumentType::notations — Returns the list of notations
- DomDocumentType::public_id — Returns the public id of the document type
- DomDocumentType::system_id — Returns the system id of the document type
- DomElement::get_attribute_node — Returns the node of the given attribute
- DomElement::get_attribute — Returns the value of the given attribute
- DomElement::get_elements_by_tagname — Gets elements by tag name
- DomElement::has_attribute — Checks whether an attribute exists in the current node
- DomElement::remove_attribute — Removes an attribute
- DomElement::set_attribute_node — Adds a new attribute
- DomElement::set_attribute — Sets the value of an attribute
- DomElement::tagname — Returns the name of the current element
- DomNode::add_namespace — Adds a namespace declaration to a node
- DomNode::append_child — Adds a new child at the end of the children
- DomNode::append_sibling — Adds a new sibling to a node
- DomNode::attributes — Returns the list of attributes
- DomNode::child_nodes — Returns the children of the node
- DomNode::clone_node — Clones a node
- DomNode::dump_node — Dumps a single node
- DomNode::first_child — Returns the first child of the node
- DomNode::get_content — Gets the content of the node
- DomNode::has_attributes — Checks whether the node has attributes
- DomNode::has_child_nodes — Checks whether the node has children
- DomNode::insert_before — Inserts a new node as a child
- DomNode::is_blank_node — Checks whether the node is blank
- DomNode::last_child — Returns the last child of the node
- DomNode::next_sibling — Returns the next sibling of the node
- DomNode::node_name — Returns the name of the node
- DomNode::node_type — Returns the type of the node
- DomNode::node_value — Returns the value of the node
- DomNode::owner_document — Returns the document the node belongs to
- DomNode::parent_node — Returns the parent of the node
- DomNode::prefix — Returns the namespace prefix of the node
- DomNode::previous_sibling — Returns the previous sibling of the node
- DomNode::remove_child — Removes a child from the list of children
- DomNode::replace_child — Replaces a child
- DomNode::replace_node — Replaces a node
- DomNode::set_content — Sets the content of the node
- DomNode::set_name — Sets the name of the node
- DomNode::set_namespace — Sets the namespace of a node
- DomNode::unlink_node — Deletes a node
- DomProcessingInstruction::data — Returns the data of a ProcessingInstruction node
- DomProcessingInstruction::target — Returns the target of a ProcessingInstruction node
- DomXsltStylesheet::process — Applies the XSLT-Transformation on a DomDocument Object
- DomXsltStylesheet::result_dump_file — Dumps the result from a XSLT-Transformation into a file
- DomXsltStylesheet::result_dump_mem — Dumps the result from a XSLT-Transformation back into a string
- domxml_new_doc — Creates a new empty XML document
- domxml_open_file — Creates a DOM object from an XML file
- domxml_open_mem — Creates a DOM object from an XML document
- domxml_version — Gets the XML library version
- domxml_xmltree — Creates a tree of PHP objects from an XML document
- domxml_xslt_stylesheet_doc — Creates a DomXsltStylesheet object from a DomDocument object
- domxml_xslt_stylesheet_file — Creates a DomXsltStylesheet object from an XSL document in a file
- domxml_xslt_stylesheet — Creates a DomXsltStylesheet object from an XSL document in a string
- domxml_xslt_version — Gets the XSLT library version
- xpath_eval_expression — Evaluates the XPath location path in the given string
- xpath_eval — Evaluates the XPath location path in the given string
- xpath_new_context — Creates a new XPath context
- xpath_register_ns_auto — Registers the given namespace in the passed XPath context
- xpath_register_ns — Registers the given namespace in the passed XPath context
- xptr_eval — Evaluates the XPtr location path in the given string
- xptr_new_context — Creates a new XPath context
Hi all,
if you use xpath_eval() you get an XPath object with a type member variable, which tells you the type of the content that was found. Here are the values and the corresponding types:
1 = XPATH_NODESET (integer)
2 = XPATH_BOOLEAN (integer)
3 = XPATH_NUMBER (integer)
4 = XPATH_STRING (integer)
I think, but am not certain, that the remaining constants are:
0 = XPATH_UNDEFINED (integer)
5 = XPATH_POINT (integer)
6 = XPATH_RANGE (integer)
7 = XPATH_LOCATIONSET (integer)
I hope this helps some people.
Greetz,
Chris
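For readers on PHP 5, the same four result types can be seen without numeric type codes. This is a minimal sketch that assumes the PHP 5 DOM extension and its DOMXPath::evaluate() method rather than the xpath_eval() function discussed in this note; the sample XML is made up.

```php
<?php
// PHP 5 DOM extension: DOMXPath::evaluate() returns a typed PHP value
// where possible instead of an object with a numeric type member.
$doc = new DOMDocument();
$doc->loadXML('<row><entry>a1</entry><entry>b1</entry></row>');
$xpath = new DOMXPath($doc);

$nodes  = $xpath->evaluate('//entry');             // node-set -> DOMNodeList
$exists = $xpath->evaluate('boolean(//entry)');    // boolean  -> bool
$count  = $xpath->evaluate('count(//entry)');      // number   -> float
$first  = $xpath->evaluate('string(//entry[1])');  // string   -> string

var_dump($nodes instanceof DOMNodeList, $exists, $count, $first);
```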
Functions that fill an array passed by reference drive me crazy for one reason or another (a personal issue, I guess). So for any others like me, here's my modification (thanks to the original posters below for the base to work on!)
I haven't tested this on much more than simple XML files, so there are probably a few ways to break it. It could probably also be rewritten to be more efficient, but it's working quite well for me so far.
<?php
function xml2array($domnode)
{
    $nodearray = array();
    $domnode = $domnode->firstChild;
    while (!is_null($domnode))
    {
        $currentnode = $domnode->nodeName;
        // Reset per node so attributes do not leak into later siblings
        $elementarray = null;
        switch ($domnode->nodeType)
        {
            case XML_TEXT_NODE:
                // Keep non-blank text under the 'cdata' key
                if (!(trim($domnode->nodeValue) == "")) $nodearray['cdata'] = $domnode->nodeValue;
                break;
            case XML_ELEMENT_NODE:
                if ($domnode->hasAttributes())
                {
                    $elementarray = array();
                    $attributes = $domnode->attributes;
                    foreach ($attributes as $index => $domobj)
                    {
                        $elementarray[$domobj->name] = $domobj->value;
                    }
                }
                break;
        }
        if ($domnode->hasChildNodes())
        {
            // Recurse, then attach the attributes under the '@' key
            $nodearray[$currentnode][] = xml2array($domnode);
            if (isset($elementarray))
            {
                $currnodeindex = count($nodearray[$currentnode]) - 1;
                $nodearray[$currentnode][$currnodeindex]['@'] = $elementarray;
            }
        } else {
            if (isset($elementarray) && $domnode->nodeType != XML_TEXT_NODE)
            {
                $nodearray[$currentnode]['@'] = $elementarray;
            }
        }
        $domnode = $domnode->nextSibling;
    }
    return $nodearray;
}
?>
My short way of parsing an XML document, for example displaying the document in a structured form:
<?php
$indent = "";
$file = "semi.xml";
$showfile = file_get_contents("c:/Program Files/Apache Group/apache/htdocs/phpxml" . "/" . $file); // whatever path
// maybe the whole path is not important, look it up in other posts
$newstring = utf8_encode($showfile); // it's important!
if (!$domDocument = domxml_open_mem($newstring)) {
    echo "Couldn't load xml...";
    exit;
}
$rootDomNode = $domDocument->document_element();
print "<pre>";
printElements($rootDomNode);
print "</pre>";
function printElements($domNode)
{
    if ($domNode)
    {
        global $indent;
        if ($domNode->node_type() == XML_ELEMENT_NODE)
        {
            // Tags are written as entities so the browser shows them as text
            print "<br />" . $indent . "&lt;" . $domNode->node_name();
            if ($domNode->has_attributes())
            {
                $attributes = $domNode->attributes();
                foreach ($attributes as $domAttribute)
                {
                    print " $domAttribute->name=\"$domAttribute->value\"";
                }
            }
            print "&gt;";
            if ($domNode->has_child_nodes())
            {
                // Indent by two spaces for the children, then undo it
                $indent .= "  ";
                $nextNode = $domNode->first_child();
                printElements($nextNode);
                $indent = substr($indent, 0, strlen($indent) - 2);
                print "<br />" . $indent . "&lt;/" . $domNode->node_name() . "&gt;";
            }
            else
            {
                // A method call is not expanded inside a quoted string,
                // so the value is concatenated instead
                print $domNode->node_value() . "&lt;/" . $domNode->node_name() . "&gt;";
            }
        }
        $nextNode = $domNode->next_sibling();
        printElements($nextNode);
    }
}
?>
Re: websiterepairguys... Close but no cigar ;-)
As written it will not work if the repeated tags are somewhere other than the first node, i.e. the following will not work:
<nodes>
<node>onething</node>
<node>something</node>
<node>something</node>
</nodes>
You must store the new node name when you get a new sibling that doesn't match the previous and then it will work OK. Amended code:
function dom_to_array($domnode, &$array) {
    $domnode = $domnode->firstChild;
    // Name of the previous sibling; null when there are no children
    $myname = is_null($domnode) ? null : $domnode->nodeName;
    $x = 1;
    $first = true;
    while (!is_null($domnode)) {
        // nodeName is read-only, so the numbered name is kept in $key
        $key = $domnode->nodeName;
        if (!$first) {
            if ($key == $myname) {
                $key .= ($x++);
            } else {
                $myname = $key;
            }
        }
        $first = false;
        switch ($domnode->nodeType) {
            case XML_ELEMENT_NODE: {
                if (!$domnode->hasChildNodes()) {
                    $array[$key] = '';
                } else if ($domnode->firstChild->nodeType == XML_TEXT_NODE) {
                    $array[$key] = $domnode->firstChild->nodeValue;
                } else {
                    $array_ptr = &$array[$key];
                    dom_to_array($domnode, $array_ptr);
                }
                break;
            }
        }
        $domnode = $domnode->nextSibling;
    }
}
I tried using the dom_to_simple_array that the user jas posted above, but it didn't work very well.
The problem was that it didn't handle sibling nodes with the same name, such as:
<nodes>
<node>something</node>
<node>something</node>
</nodes>
Also, when it built child arrays from child nodes, it always inserted a wrapping array around the child, which isn't necessary. Here is the patched code:
function dom_to_array($domnode, &$array) {
    $domnode = $domnode->firstChild;
    // Only the first child's name is remembered for detecting repeats
    $myname = is_null($domnode) ? null : $domnode->nodeName;
    $x = 1;
    $first = true;
    while (!is_null($domnode)) {
        // nodeName is read-only, so the numbered name is kept in $key
        $key = $domnode->nodeName;
        if (!$first && $key == $myname) {
            $key .= ($x++);
        }
        $first = false;
        switch ($domnode->nodeType) {
            case XML_ELEMENT_NODE: {
                if (!$domnode->hasChildNodes()) {
                    $array[$key] = '';
                } else if ($domnode->firstChild->nodeType == XML_TEXT_NODE) {
                    $array[$key] = $domnode->firstChild->nodeValue;
                } else {
                    $array_ptr = &$array[$key];
                    dom_to_array($domnode, $array_ptr);
                }
                break;
            }
        }
        $domnode = $domnode->nextSibling;
    }
}
snippet of array produced by this:
[admin] => Array
(
[menu] => Array
(
[title] => Page Manager
[view] => list
)
[files] => Array
(
[filename] => modules/testmodule/testmodule.php
[filename1] => modules/testmodule/testmodule.xml
[filename2] => media/lang/en-us/templates/testmodule.tpl
)
)
If you want to subclass the domxml-classes, you have to use PHP5. It doesn't work with PHP4, and never will.
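A minimal sketch of what subclassing can look like on PHP 5, assuming the PHP 5 DOM extension and PHP 5.2 or later, where DOMDocument::registerNodeClass() is available. The class TitledElement and its method upperName() are hypothetical names used only for illustration.

```php
<?php
// Hypothetical subclass of the PHP 5 DOMElement class.
class TitledElement extends DOMElement
{
    // Example of an added method: the element name in upper case
    public function upperName()
    {
        return strtoupper($this->nodeName);
    }
}

$doc = new DOMDocument();
// Ask the document to create TitledElement objects instead of DOMElement
$doc->registerNodeClass('DOMElement', 'TitledElement');
$doc->loadXML('<chapter><title>Title</title></chapter>');

$root = $doc->documentElement;
echo $root->upperName(), "\n";
```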
If you are using Apache, instead of copying files around (iconv.dll for instance) you can add this to your httpd.conf:
LoadFile "d:/php/dlls/iconv.dll"
I placed this line before
LoadModule php4_module "d:/php/sapi/php4apache2.dll"
and it worked without copying any files. This also helps when updating PHP, since you don't have to search for files to copy again.
I recently developed a script for parsing DHL XML transaction responses. At first I found it a pain in the rear to parse the XML and set my variables, but it wasn't that hard once I figured it out. It goes something like this...
<?php
// Use with a class containing functions set_attributes() and
// set_data(). Use the following to set variables from the
// resulting xml. $node is a dom xml object - in the first call
// to loop, $node would be equal to the root document
// element.
function loop($node) {
    // set attribute tags here
    if ($node->has_attributes()) {
        $this->set_attributes($node);
    } // end if node has attributes
    if ($node->has_child_nodes()) {
        $this->loop($node->first_child());
    } // end if node has child
    else {
        $this->set_data($node);
    } // end if node has no child
    // get next sibling
    $node = $node->next_sibling();
    if ($node) {
        $this->loop($node);
    } // end if node
} // end function loop
?>
The code starts at the root element. If the element has attributes, it sets the attribute variables. It then recursively descends to the lowest-level element (one with no more children); once that level has been reached, the data variables are set. Next it moves to the next sibling of the element, if one exists. If there is no next sibling, the function returns and control goes back to the call that was handling the parent element, which is then checked for siblings. This process continues (as is usual with recursion) until control is back at the root element, which is the end of the document.
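The traversal described above can be sketched for PHP 5 as follows. This is an illustration under assumptions, not the poster's class: it uses the PHP 5 DOM extension instead of domxml, the function walk() and the sample XML are hypothetical, and results are collected in an array instead of calling set_attributes() and set_data().

```php
<?php
// Depth-first walk with the PHP 5 DOM extension (not domxml).
// walk() is a hypothetical helper; it records attributes first and
// element text only for elements without element children.
function walk(DOMElement $el, array &$out)
{
    foreach ($el->attributes as $attr) {
        $out[] = $el->nodeName . '/@' . $attr->name . '=' . $attr->value;
    }
    $hasElementChild = false;
    foreach ($el->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $hasElementChild = true;
            walk($child, $out);          // descend before visiting siblings
        }
    }
    if (!$hasElementChild) {
        $out[] = $el->nodeName . '=' . trim($el->textContent);
    }
}

$doc = new DOMDocument();
$doc->loadXML('<shipment id="42"><from>Berlin</from><to>Madrid</to></shipment>');

$out = array();
walk($doc->documentElement, $out);
print_r($out);
```

Siblings are visited with a loop here instead of a recursive call, so the recursion depth follows the depth of the document rather than the number of siblings.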
You can always use a SAX parser (expat), which saves memory (nothing is stored, since SAX is event driven), and use this neat code to produce an array structure of your XML file:
see http://fr2.php.net/manual/fr/function.xml-parse.php
comment by
tgrabietz at bupnet dot de
22-Sep-2004 05:05
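To show what event-driven parsing looks like, here is a minimal sketch using the expat-based XML Parser extension. It is not the code from the linked comment: it only counts elements and collects text, the sample XML and variable names are made up, and closures are used for the handlers, which assumes PHP 5.3 or later.

```php
<?php
// Event-driven (SAX-style) parsing: handlers run as the parser meets
// start tags and character data; no tree is kept in memory.
$xml = '<contacts><contact><name>John Doe</name></contact>'
     . '<contact><name>Mary Smiley</name></contact></contacts>';

$counts = array();   // element name => number of occurrences
$text   = '';        // all character data, concatenated

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$counts) {
        $counts[$name] = isset($counts[$name]) ? $counts[$name] + 1 : 1;
    },
    function ($parser, $name) {
        // nothing to do on end tags in this sketch
    }
);
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$text) {
        $text .= $data;
    }
);

if (!xml_parse($parser, $xml, true)) {
    die(xml_error_string(xml_get_error_code($parser)));
}
print_r($counts);
```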
i needed to have an easy way to create a multi-dimensional but EXTREMELY SIMPLE php array out of some XML text i'm receiving. NOT an object. just an ARRAY.
i found that as simple a request as this seemed to be, the new (php5) DOM functions do not provide this functionality.
even the SimpleXML functions are object-oriented, which doesn't work for some of my purposes (sending to a Smarty template variable for looping through, etc.) -- returning attributes as SimpleXMLElement objects instead of strings, etc.. i just wanted an ARRAY containing the data as STRINGS.
eli (http://www.hoktar.com) had submitted such code earlier, based on domxml/php4 calls. his function was called "domxml_xmlarray".
but when php5 came out, eli's comments at the bottom of the PHP site got erased. (fortunately, i had already saved his code.) no doubt, mine will too w/next version..
furthermore, as far as i can tell, no one has taken the cue to add something like eli's domxml_xmlarray function directly into the DOMDocument object (but it would be nice).
so i translated eli's code, now using the dom calls (instead of the older domxml calls), and renamed the function to "dom_to_simple_array()".
below is a script containing the function itself as well as an example of its use. just copy it to your server somewhere and execute it and it should work right off the bat if you are using php5.
thanks.
jeff stern
==================================================================
<?php
function dom_to_simple_array($domnode, &$array) {
    $array_ptr = &$array;
    $domnode = $domnode->firstChild;
    while (!is_null($domnode)) {
        // skip nodes whose text content is blank
        if (! (trim($domnode->nodeValue) == "") ) {
            switch ($domnode->nodeType) {
                case XML_TEXT_NODE: {
                    $array_ptr['cdata'] = $domnode->nodeValue;
                    break;
                }
                case XML_ELEMENT_NODE: {
                    $array_ptr = &$array[$domnode->nodeName][];
                    if ($domnode->hasAttributes() ) {
                        // attributes is a DOMNamedNodeMap, not an array,
                        // so it is iterated directly
                        $attributes = $domnode->attributes;
                        foreach ($attributes as $index => $domobj) {
                            $array_ptr[$index] = $array_ptr[$domobj->name] = $domobj->value;
                        }
                    }
                    break;
                }
            }
            if ( $domnode->hasChildNodes() ) {
                dom_to_simple_array($domnode, $array_ptr);
            }
        }
        $domnode = $domnode->nextSibling;
    }
}
# now, let's make a sample string containing some XML
$strXMLData = "<contacts>
<contact>
<name>
John Doe
</name>
<phone>
123-456-7890
</phone>
</contact>
<contact>
<name>
Mary Smiley
</name>
<phone>
567-890-1234
</phone>
</contact>
</contacts>";
# create a DOM tree xml object (hierarchical array) from
# this XML string
$domdoc = new DOMDocument;
$domdoc->loadXML($strXMLData);
# now simplify the DOM array into a very simple array structure
# first, create an empty array to be filled with your
# simplified array result..
$aData = array();
# now, pass the dom document and your empty array to the
# converter function.
dom_to_simple_array($domdoc, $aData);
# now $aData contains your simplified array, so print it out
?><html>
<body>
<p>there are <?php echo count($aData['contacts'][0]['contact']); ?>
contacts</p>
<p>the 2nd contact's phone number is
<?php echo $aData['contacts'][0]['contact'][1]['phone'][0]['cdata']; ?>
</p>
<hr />
<p>Here is the raw array structure:</p>
<pre>
<?php print_r($aData); ?>
</pre>
</body>
</html>
==================================================================
PHP4/DOMXML code is not compatible with the new PHP5/dom extension. While the conversion is quite straightforward, it can take a long time if domxml has been used widely. Moreover, it can be useful to have old PHP4 scripts ready for PHP5 as soon as possible, even if the server is still running PHP4. Since I have that kind of problem, I have written a small library to include in PHP4 scripts to enable them to run on PHP5. http://alexandre.alapetite.net/doc-alex/domxml-php4-php5/
It does not cover all the domxml functionality, but most of the main functions and can easily be extended. Tested with PHP4.3.7 and PHP5.0.0RC3 but I will try to keep it updated. I hope it can help.
When installing PHP --with-dom and --with-dom-xslt on a Red Hat 9.0 remember to install the following packages:
libxml
libxml2
libxml2-devel
libxslt
libxslt-devel
Then you will be spared error messages when trying to configure.
regards
SAM
Sorry, a bug in my code... I wrote the first version late at night!
The bug was in the "if ($ChildDomNode->has_child_nodes())" block: I didn't save the data for the child nodes of the child nodes. The bug has been fixed.
<?php
function getElementAttributes($DomNode, $elementName, $attriName)
{
    // Always return an array, even when nothing matches
    $nodeArray = array();
    $ChildDomNode = $DomNode->first_child();
    while ($ChildDomNode)
    {
        if ($ChildDomNode->node_type() == XML_ELEMENT_NODE)
        {
            if ($ChildDomNode->node_name() == $elementName)
            {
                if ($ChildDomNode->has_attributes())
                {
                    $Array = $ChildDomNode->attributes();
                    foreach ($Array AS $DomAttribute)
                    {
                        if ($DomAttribute->name() == $attriName)
                        {
                            $nodeArray[] = $DomAttribute->value();
                        }
                    }
                }
            }
            if ($ChildDomNode->has_child_nodes())
            {
                // Collect matches from the descendants as well
                $tmpArray = getElementAttributes($ChildDomNode, $elementName, $attriName);
                $nodeArray = array_merge($nodeArray, $tmpArray);
                unset($tmpArray);
            }
        }
        $ChildDomNode = $ChildDomNode->next_sibling();
    }
    return $nodeArray;
}
$file = "test3.xml";
$element = "pb";
$att = "id";
$DomDocument = domxml_open_file($file);
$RootDomNode = $DomDocument->document_element();
$array = getElementAttributes($RootDomNode, $element, $att);
echo "<pre>";
print_r($array);
echo "</pre>";
?>
Hey;
If you need to parse XML on an older version of PHP (e.g. 4.0) or if you can't get the expat extension enabled on your server, you might want to check out the Saxy and DOMIT! xml parsers from Engage Interactive. They're opensource and pure php, so no extensions or changes to your server are required. I've been using them for over a month on some projects with no problems whatsoever!
Check em out at:
DOMIT!, a DOM based xml parser, uses Saxy (included)
http://www.engageinteractive.com/redir.php?resource=3&target=domit
or
Saxy, a sax based xml parser
http://www.engageinteractive.com/redir.php?resource=4&target=saxy
Brad
This recursive function will iterate over a DOM object and display it as a nicely formatted XML structure. I used intuitive variable names to help learn more about the DOM functions and their return values.
<?php
function PrintDomTree($DomNode)
{
    $HasTag = 0; // set to 1 once a child element has been printed
    if ($ChildDomNode = $DomNode->first_child()) {
        static $depth = 0;
        // Non-breaking spaces keep the indentation visible in HTML output
        $whitespace = "\n<br>".str_repeat("&nbsp;", ($depth * 2));
        while ($ChildDomNode) {
            if ($ChildDomNode->node_type() == XML_TEXT_NODE) {
                echo trim($ChildDomNode->node_value());
            } elseif ($ChildDomNode->node_type() == XML_ELEMENT_NODE) {
                $HasTag = 1;
                echo $whitespace;
                // Tags are written as entities so the browser shows them as text
                echo "&lt;", $ChildDomNode->node_name();
                if ($ChildDomNode->has_attributes()) {
                    $Array = $ChildDomNode->attributes();
                    foreach ($Array AS $DomAttribute) {
                        echo " ", $DomAttribute->name(), "=\"", $DomAttribute->value(), "\"";
                    }
                }
                echo "&gt;";
                if ($ChildDomNode->has_child_nodes()) {
                    $depth++;
                    // Put the closing tag on its own line if children had tags
                    if (PrintDomTree($ChildDomNode)) {
                        echo $whitespace;
                    }
                    $depth--;
                }
                echo "&lt;/", $ChildDomNode->node_name(), "&gt;";
            }
            $ChildDomNode = $ChildDomNode->next_sibling();
        }
    }
    return $HasTag;
}
?>
If you're having trouble understanding how the DOM XML extension fits together, you may find the UML diagram here helps: http://www.phppatterns.com/index.php/article/articleview/38
When parsing "iso-8859-1" encoded XML files, use "utf8_decode" to recover node contents (libxml uses "UTF-8" as its internal encoding, so a conversion is needed).
--- BEGIN: mydata.xml ---
<?xml version="1.0" encoding="iso-8859-1"?>
...
--- END: mydata.xml---
--- BEGIN: myparser.php ---
<?php
...
$domxml = domxml_open_file("mydata.xml");
...
$content = utf8_decode(trim($node->content));
echo $content;
...
?>
--- END: myparser.php
-eof-
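The same effect can be observed with the PHP 5 DOM extension. This is a minimal sketch under assumptions: it uses DOMDocument instead of domxml, a made-up one-element document, and iconv() for the conversion (the iconv extension must be available) in place of utf8_decode().

```php
<?php
// The document declares iso-8859-1 and contains the single byte 0xE9 (e-acute).
$xml = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?><p>caf\xE9</p>";

$doc = new DOMDocument();
$doc->loadXML($xml);

// Node contents come back as UTF-8, whatever the declared encoding
$utf8 = $doc->documentElement->textContent;

// Convert back to ISO-8859-1 when single-byte output is needed
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);

echo strlen($utf8), ' ', strlen($latin1), "\n";
```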
