How to parse XML with PHP5

   
Treehouse
writes on July 26, 2011

Feeds are streams of content that people can use to share pieces of information across websites. PHP5’s SimpleXML functions dramatically simplify the process of turning those feeds into something useful for your web pages.

I recently worked on a widget that displayed a live feed of songs from our radio station’s playlist. The system that runs our radio station outputs XML, which is then fed to a database. A program turns that data into feeds that get pulled by other programs.
Live feed from our station's playlist

It’s common to use XML to share data between servers that use different technologies. The system that generates the playlist is proprietary, so XML provides an easy format through which this data can be shared.

What is XML?

The eXtensible Markup Language is a way to structure your data for sharing across sites. Some of the technologies that are crucial to the web, like RSS (Really Simple Syndication) and podcast feeds, are special flavors of XML. The beautiful thing about XML is that you can easily roll your own for anything you need.

XML is easy to create because it’s a lot like HTML…except you can make up your own tags. Let’s say, for example, that you’re putting together a feed for a list of songs playing at your own radio station. We’ll keep this simple, so we’ll just encode the name of the artist, the title of the song, plus the time when the song was played. We make up a couple of tags called <title> and <artist> and wrap each pair of them inside a <song> tag. We’ll create a dateplayed attribute for each song with the date and time the song was played. You might encode something like that in this manner.

<songs>
    <song dateplayed="2011-07-24 19:40:26">
        <title>I left my heart on Europa</title>
        <artist>Ship of Nomads</artist>
    </song>
    <song dateplayed="2011-07-24 19:27:42">
        <title>Oh Ganymede</title>
        <artist>Beefachanga</artist>
    </song>
    <song dateplayed="2011-07-24 19:23:50">
        <title>Kallichore</title>
        <artist>Jewitt K. Sheppard</artist>
    </song>
</songs>

There are some rules that you have to adhere to when creating XML data. If you’re familiar with XHTML, you’ll be right at home with some of these, but let’s review them:

  • XML is case sensitive, so <Title> is not the same as <title>.
  • All XML elements must have closing tags.
  • XML requires a single root element (the <songs> tag above serves as our root element).
  • Attribute values must be quoted.
  • Special characters must be encoded: & becomes &amp;, < becomes &lt; and > becomes &gt;.

XML is a bit stricter than HTML, but it is really easy to create and deal with.
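The rules above can be checked mechanically. Here is a minimal sketch, assuming PHP 7 or later with the libxml extension that SimpleXML depends on. The xml_errors() helper is a hypothetical name of my own, not part of PHP; it simply collects the parser’s complaints instead of letting them appear as warnings.

```php
<?php
// Hypothetical helper (not part of PHP): returns an empty array when the
// XML is well-formed, or a list of parser error messages when it is not.
function xml_errors(string $xml): array
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $messages = [];
    if (simplexml_load_string($xml) === false) {
        foreach (libxml_get_errors() as $error) {
            $messages[] = trim($error->message);
        }
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Well-formed: one root element, quoted attribute, encoded ampersand.
$good = '<songs><song dateplayed="2011-07-24 19:40:26"><title>Rock &amp; Roll</title></song></songs>';

// Broken: <Title> is closed by </title>, and XML is case sensitive.
$bad = '<songs><song><Title>Oh Ganymede</title></song></songs>';

var_dump(xml_errors($good) === []);    // bool(true)
var_dump(count(xml_errors($bad)) > 0); // bool(true)

// htmlspecialchars() can encode special characters before text goes into XML.
echo htmlspecialchars('Tom & Jerry <live>', ENT_XML1 | ENT_QUOTES, 'UTF-8'), "\n";
// prints: Tom &amp; Jerry &lt;live&gt;
```

Running a hand-written feed through a check like this before publishing it is one way to catch a forgotten closing tag or an unencoded ampersand early.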

Introducing SimpleXML

With SimpleXML, it’s as easy as reading the XML and then accessing its contents through an easy-to-read object. Assuming we’ve got our XML above saved as a file called songs.xml in the same folder as our PHP file, we can read the whole feed into an object with the following code.

<?php
    $mysongs = simplexml_load_file('songs.xml');
?>

That’s it! The file can even be the URL of a feed on the web rather than a file on your hard drive (as long as PHP’s allow_url_fopen setting is enabled). We now have an object that represents our file: the root <songs> element has been loaded into the $mysongs variable. If we want to output the name of the first artist in our list, we can refer to it like this:

<?php
    $mysongs = simplexml_load_file('songs.xml');
    echo $mysongs->song[0]->artist;
?>

Showing the first artist in our XML document

Notice that our XML tags are mapped as part of the object, so we can get to any element simply by typing its name. Remember that arrays are 0 indexed in PHP, so our first title would be our 0th title. Now, let’s output the third song title.

<?php
    $mysongs = simplexml_load_file('songs.xml');
    echo $mysongs->song[2]->title;
?>

Showing the third song's title
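One detail worth knowing about the examples above: what you get back from $mysongs->song[0]->artist is a SimpleXMLElement object, not a plain string. echo converts it for you, but comparisons and array keys usually want an explicit cast. The sketch below assumes PHP 5.3 or later (for count() on SimpleXML objects) and loads the same songs from a string with simplexml_load_string() so it runs without a songs.xml file.

```php
<?php
// The sample feed from above, inlined so the example is self-contained.
$xml = <<<XML
<songs>
    <song dateplayed="2011-07-24 19:40:26">
        <title>I left my heart on Europa</title>
        <artist>Ship of Nomads</artist>
    </song>
    <song dateplayed="2011-07-24 19:27:42">
        <title>Oh Ganymede</title>
        <artist>Beefachanga</artist>
    </song>
    <song dateplayed="2011-07-24 19:23:50">
        <title>Kallichore</title>
        <artist>Jewitt K. Sheppard</artist>
    </song>
</songs>
XML;

$mysongs = simplexml_load_string($xml);
if ($mysongs === false) {
    exit("Could not parse the XML\n");
}

// Element access returns an object, not a string.
$first = $mysongs->song[0]->artist;
var_dump($first instanceof SimpleXMLElement); // bool(true)

// Cast to string before comparing or storing the value.
$artist = (string) $first;
var_dump($artist === 'Ship of Nomads');       // bool(true)

// count() tells us how many <song> elements there are.
echo count($mysongs->song), "\n";             // prints: 3
```

The cast matters most when you store values in an array or a session, where you want the text itself rather than a reference into the parsed document.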

Working with Attributes

In order to get to our dates, we’ll need to know how to access attributes. The notation is slightly different than with tags, but just as easy. As a matter of fact, it works just like accessing an array element. Let’s see how you would take a look at the second song’s date.

<?php
    $mysongs = simplexml_load_file('songs.xml');
    echo $mysongs->song[1]['dateplayed'];
?>

Showing the second song's date played
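If you don’t know an element’s attribute names ahead of time, the attributes() method lets you loop over all of them. The sketch below also turns the dateplayed value into a DateTime object so it can be reformatted. It assumes the date format used in our sample feed (Y-m-d H:i:s) and PHP 5.3 or later for DateTime::createFromFormat().

```php
<?php
// A shortened copy of the sample feed, inlined so the example runs on its own.
$xml = <<<XML
<songs>
    <song dateplayed="2011-07-24 19:40:26">
        <title>I left my heart on Europa</title>
        <artist>Ship of Nomads</artist>
    </song>
    <song dateplayed="2011-07-24 19:27:42">
        <title>Oh Ganymede</title>
        <artist>Beefachanga</artist>
    </song>
</songs>
XML;

$mysongs = simplexml_load_string($xml);
$song = $mysongs->song[1];

// attributes() yields name => value pairs for every attribute on the element.
foreach ($song->attributes() as $name => $value) {
    echo $name, ' = ', $value, "\n"; // prints: dateplayed = 2011-07-24 19:27:42
}

// isset() checks whether an attribute exists before we rely on it.
if (isset($song['dateplayed'])) {
    // Cast to string, then parse using the format from our sample feed.
    $played = DateTime::createFromFormat('Y-m-d H:i:s', (string) $song['dateplayed']);
    if ($played !== false) {
        echo $played->format('g:i a'), "\n"; // prints: 7:27 pm
    }
}
```

Parsing the attribute into a DateTime is optional, but it makes it easy to show a friendlier time in the widget than the raw value from the feed.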

Making a list of songs

So now that we’ve got the basics of accessing elements, let’s write the code that reads our XML file and turns it into a complete list of our songs.

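<?php
    $mysongs = simplexml_load_file('songs.xml');
    // Double quotes inside a double-quoted string must be escaped as \"
    echo "<ul id=\"songlist\">\n";
    // Looping over the root element visits each <song> in turn
    foreach ($mysongs as $songinfo):
        $title=$songinfo->title;
        $artist=$songinfo->artist;
        $date=$songinfo['dateplayed']; // attributes use array notation
        echo "<li><div class=\"title\">",$title,"</div><div class=\"artist\">by ",$artist,"</div><time>",$date,"</time></li>\n";
    endforeach;
    echo "</ul>";
?>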

We’re using a foreach statement here to go through each song and then formatting the information as a simple HTML list. You can use that in a regular HTML document or use it as a widget to display a list of songs.

Generating a list of songs with simpleXML
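SimpleXML decodes entities for you, so a title stored as Rock &amp;amp; Roll in the feed comes back as “Rock & Roll”. If you print feed content straight into a page, it is safer to escape it again for HTML. Here is one way that could look, assuming PHP 7 or later. The render_song_list() function is a hypothetical helper of mine, and the sample data is made up for the illustration.

```php
<?php
// Hypothetical helper (not from the original code): builds the same list
// as above, but escapes every value before it is placed into the HTML.
function render_song_list(SimpleXMLElement $songs): string
{
    $html = "<ul id=\"songlist\">\n";
    foreach ($songs->song as $song) {
        // Cast to string first, then escape &, <, > and quotes for HTML.
        $title  = htmlspecialchars((string) $song->title, ENT_QUOTES, 'UTF-8');
        $artist = htmlspecialchars((string) $song->artist, ENT_QUOTES, 'UTF-8');
        $date   = htmlspecialchars((string) $song['dateplayed'], ENT_QUOTES, 'UTF-8');

        $html .= "<li><div class=\"title\">{$title}</div>"
               . "<div class=\"artist\">by {$artist}</div>"
               . "<time>{$date}</time></li>\n";
    }
    return $html . "</ul>";
}

// Made-up sample with an ampersand in the title.
$xml = '<songs><song dateplayed="2011-07-24 19:40:26">'
     . '<title>Rock &amp; Roll on Io</title><artist>Ship of Nomads</artist>'
     . '</song></songs>';

echo render_song_list(simplexml_load_string($xml)), "\n";
```

Returning the markup as a string instead of echoing it directly also makes the list easier to cache or drop into a template.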

Parsing a Flickr feed from a set

There are lots of XML feeds available for you to parse online. We can, for example, get a feed from a Flickr set for inclusion on a website. That way, when you update your Flickr set, the widget will automatically display this on your site. I’ve prepared a special set with some pictures of kittens. In order to get the XML for this feed, we can go to our page on Flickr and look for the XML icon at the bottom left of the screen.

My kittens set on Flickr

We want to examine the feed to determine its structure. You can right click on that feed’s link on the Flickr page and save it to your hard drive, then rename the file photoset.xml and open it up with your browser.

Reading the XML document

I’ve opened the XML file with Safari, which lets me take a look at the structure (Safari will normally try to push an RSS link to its built-in reader, so saving it, giving it an XML extension and opening it will let us view the structure). I can see that each photo is stored as an <entry> tag in our XML. Inside this tag, there are two <link> tags. The first one has a link to our image on Flickr. The second one has a link to a medium sized version of our image.

I’m going to modify the code slightly and make an adjustment on the second link to get a small thumbnail of all of our images. Before I do that, I’ll go back to Flickr and right click on the link to the XML file and this time copy it to the clipboard, then adjust the code:

<?php
    $mypix = simplexml_load_file('http://api.flickr.com/services/feeds/photoset.gne?set=72157627229375826&nsid=73845487@N00&lang=en-us');
    // Each photo in the feed is stored as an <entry> element
    foreach ($mypix->entry as $pixinfo):
        $title=$pixinfo->title;
        $link=$pixinfo->link['href']; // href of the first <link>
        // Swap the _b.jpg suffix on the second <link> for _s.jpg to get a thumbnail
        $image=str_replace("_b.jpg","_s.jpg",$pixinfo->link[1]['href']);
        echo "<a href=\"",$link,"\"><img src=\"",$image,"\" alt=\"",$title,"\" /></a>\n";
    endforeach;
?>

We’re going to use the <title> tag as the alt text of our <img> tag. We’re also going to use the first of the <link> tags as the href attribute of our anchor tag. Getting to our second link is a bit trickier. We can use array notation to get to the second link’s href attribute ($pixinfo->link[1]['href']). However, that gives us a much larger picture than we need, so I’m using a simple str_replace call to change the end of the image URL from _b.jpg to _s.jpg. That gives us thumbnails, and the result looks something like this.

The finished feed of kittens from Flickr
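Relying on the second <link> being the image works as long as the feed keeps its links in that order. A slightly more defensive sketch picks each link by an attribute instead of by position. Everything here is an assumption for illustration: the sample feed is a simplified, made-up stand-in for the real one (no namespaces, example.com URLs), the rel values “alternate” and “enclosure” are what I would expect to see on the links, and find_link() is a hypothetical helper. It needs PHP 7.1 or later for the nullable return type.

```php
<?php
// Simplified, made-up stand-in for a photo feed (not the real Flickr data).
$feed = <<<XML
<feed>
    <entry>
        <title>Kitten one</title>
        <link rel="enclosure" type="image/jpeg" href="http://example.com/photos/1_b.jpg"/>
        <link rel="alternate" type="text/html" href="http://example.com/photos/1/"/>
    </entry>
</feed>
XML;

// Hypothetical helper: returns the href of the first <link> whose rel
// attribute matches, or null when the entry has no such link.
function find_link(SimpleXMLElement $entry, string $rel): ?string
{
    foreach ($entry->link as $link) {
        if ((string) $link['rel'] === $rel) {
            return (string) $link['href'];
        }
    }
    return null;
}

$mypix = simplexml_load_string($feed);
foreach ($mypix->entry as $pixinfo) {
    $page  = find_link($pixinfo, 'alternate');
    $image = find_link($pixinfo, 'enclosure');
    if ($page === null || $image === null) {
        continue; // skip entries that lack either link
    }
    // Only replace the suffix at the very end of the URL.
    $thumb = preg_replace('/_b\.jpg$/', '_s.jpg', $image);
    echo $page, ' -> ', $thumb, "\n";
    // prints: http://example.com/photos/1/ -> http://example.com/photos/1_s.jpg
}
```

Anchoring the pattern with $ means a URL that happens to contain _b.jpg somewhere in the middle is left alone, which str_replace would not guarantee.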

Conclusion

SimpleXML is super easy and even fun to use to parse complicated feeds like the example from Flickr. Notice that it had no problems getting to the second instance of the <link> tag inside the <entry> tag. This would have been really difficult to do with previous versions of PHP. There’s a lot more to SimpleXML, which you can read about in the PHP manual. If you’re interested in an object-oriented approach, PHP5 also provides the SimpleXMLElement class.
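To give a taste of the SimpleXMLElement class, here is a minimal sketch that builds the object with new instead of simplexml_load_file() and then uses its xpath() method to jump straight to one song. The sample XML is a shortened copy of our songs feed, and the XPath query is just one illustrative way to search it.

```php
<?php
// A shortened copy of the songs feed, inlined so the example is self-contained.
$xml = <<<XML
<songs>
    <song dateplayed="2011-07-24 19:40:26">
        <title>I left my heart on Europa</title>
        <artist>Ship of Nomads</artist>
    </song>
    <song dateplayed="2011-07-24 19:27:42">
        <title>Oh Ganymede</title>
        <artist>Beefachanga</artist>
    </song>
</songs>
XML;

try {
    // The constructor throws an Exception when the XML cannot be parsed.
    $songs = new SimpleXMLElement($xml);
} catch (Exception $e) {
    exit('Could not parse the XML: ' . $e->getMessage() . "\n");
}

// xpath() returns an array of matching elements.
$matches = $songs->xpath('/songs/song[artist="Beefachanga"]/title');

if (!empty($matches)) {
    echo (string) $matches[0], "\n"; // prints: Oh Ganymede
}
```

The object you get back is the same kind of object the simplexml_load_* functions return, so everything shown earlier (property access, array notation for attributes, foreach) works on it unchanged.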

If you’d like to learn more PHP, feel free to head over to the Think Vitamin Membership PHP course.

58 Responses to “How to parse XML with PHP5”

  1. great post
    anyone can easily learn through this post
    thanks

  2. why don’t use PHP DOM, can you tell me the difference between simpleXML and PHP DOM. thanks

  3. Thank you very, very much Treehouse!
    This simple tutorial helped me way more than official (confusing) PHP documentation. So simple and instructive. God bless you…

    • Stephanie Hallberg on April 16, 2017 at 6:03 pm said:

      Update pls.. Code in tutorial gives
      Parse error: syntax error, unexpected ‘”‘, expecting ‘,’ or ‘;’ in C:\xampp\htdocs\project\_new\nyt\wireframe\_test\simplexml\easy.php on line 7

  4. Baroninn Vefhönnun on February 27, 2016 at 2:53 am said:

    Arg !
    The comment system took out some of my code , so the above code does not work …
    Please correct it for me “planetoftheweb”
    Thanks

  5. Baroninn Vefhönnun on February 27, 2016 at 2:50 am said:

    This is a great article on how to use xml for simple tasks. Thank you !

    If anyone is wondering why the list of songs and flickr feed isnt working its because you cannot use double-quotes inside double-quotes, (not that im an expert, and correct me if im wrong.)

    I had to modify the code to make it work , and below is the correct code that worked for me :
    List of songs :
    <?php
    $mysongs = simplexml_load_file('songs.xml');
    echo " “;
    foreach ($mysongs as $songinfo):
    $title=$songinfo->title;
    $artist=$songinfo->artist;
    $date=$songinfo[‘dateplayed’];
    echo “”,$title,”by “,$artist,””,$date,” “;
    endforeach;
    echo “”;
    ?>

    And here is the flickr-feed code :
    entry as $pixinfo):
    $title=$pixinfo->title;
    $link=$pixinfo->link[‘href’];
    $image=str_replace(“_b.jpg”,”_s.jpg”,$pixinfo->link[1][‘href’]);
    echo “ “;
    endforeach;
    ?>

    I also changed n to   to get the space betwene the photos, when using n it just generated n betwene the photos,
    Im not sure if this is outdated or if my server isnt allowing this code, but I had to modify the code.
    This is a great article, and following your guide, creating xml and php file in a folder on my server was nothing but amazing and simple. Just copy paste and then examine the code to understand what each element is doing.

    Thank you for creating this awesome tut on simplexml.
    I would like to learn alot more.

  6. You could certainly see your enthusiasm in the article you write.
    The world hopes for even more passionate
    writers such as you who are not afraid to mention how they believe.

    At all times follow your heart.

  7. How could you group arrays based on a particular value of an element. Say in the example you want to create an array of all songs (including artists) by dateplayed?

  8. Doesn’t work with SOAP-XMLs 🙁
    Returns $xml->count() == 0

  9. Great article, I have been trying hard for very long to find this solution. This is the best and perfect article I have read about XML parsing using PHP.

    I read the comments above and I am concerned about knowing the best way to parse the large xml data. I have a product data feed of over 300,000 in different categories, the biggest category feed has over 30000 products in it. What can be done about this?

  10. alert(‘hi ……………’)

  11. As the food rolls down the conveyor belt, it is in a simple
    pattern. Internet is the best solution for everything now parents
    can give dozens of new games to their kids without paying anything.
    If you want a real relationship then that’s probably not the
    best way to find it.

  12. Awesome! Thanks for putting this together. Really helped out with a project I’m currently working at!

  13. Its both homie

  14. antony492 on August 13, 2013 at 6:57 am said:

    Problem is when the file gets big, simple xml uses a lot of memory because it stores the whole file there. It is very inefficient, why would I want to store the whole file if I want one bit?

  15. Gero Calderón on August 6, 2013 at 10:01 pm said:

    Hi, anybody can help me? What if I want to get the content of the tag
    . What I must to do for get this info. Thank
    you.

  17. Why do use comma, shouldn’t it be a full stop ?

    n”;

  18. Andrew Wanyama on July 25, 2013 at 6:41 am said:

    Thanks

  19. Michael on July 6, 2013 at 4:27 am said:

    Pagination or previous and next links would be nice.

  20. Thiago Oliveira on May 11, 2013 at 11:14 pm said:

    Nice article! Thank you.

  21. sriharigoud2010 on May 7, 2013 at 2:02 am said:

    Nice tutorial

  22. Learn something new today thanks to you great stuff

  23. Nice tutorial!  Could have used that a couple months ago when I rebuilt my Comics page 🙂

  24. Japio Katwijk on August 6, 2011 at 6:04 pm said:

    This is no html5, but just php … losers

  26. Very good write up on practical working with both PHP and XML.

    Good on ya 🙂

  28. Nice article.  These two tips should help out people as well:

    1)  If you come across a XML feed with many nested elements xpath can come in handy because it goes straight to the part of the structure that you care about:

    foreach ($xml->xpath('//player') as $item) {
        echo $item->firstname;
    }
    http://php.net/manual/en/simplexmlelement.xpath.php

    2) simplexml does not use gzip compression when fetching.  Don’t waste your own bandwidth and the owners.  So use cURL with gzip compression if the file is not local, then pass it to simpleXML.

  30. I really enjoyed reading your article! Thanks it is very helpful to me! 🙂

  31. I want to run a IPN handler for a PayPal shopping cart system. Can I use any specific scripts to do this in the ssl folder? http://soundcloud.com/groups/hcg-activator-13

  32. Sachin Pethani on July 26, 2011 at 4:42 pm said:

    Hi Ray,

    I’m much interested in how to generate xml from php data structure.i.e. array

    will you recommendate any existing php library to build xml ?

    here data structure is not flat. I mean, array could be 1-d or n-dimensional.

    Thanks
    Sachin

    • If rolling my own generator proved too cumbersome, I would probably go with the XMLWriter functions then. http://www.php.net/manual/en/ref.xmlwriter.php. Caefer below has some good articles below on that subject.

  33. Ray, thank you so much! I remember trying to use SimpleXML a few years back (it looks as though it has been updated since then as it involved a lot more than just loading the XML file in one line).

    When I discovered it, I got really excited and then really dejected when it seemed to be almost impossible to use. I then pretty much cast it aside and forgot about it.

    Now I’m really looking forward to using it again.

  35. Hey guys…thanks for your comments. My jaw dropped the first time I read about this method since it’s something I have to deal with often. 

  37. Dude, great post. Simple and to the point and, as Jesudas said, you’ve taught XML in 10 mins!

  39. caefer on July 26, 2011 at 9:11 am said:

    While SimpleXML works for a lot of use cases you should keep in mind that it keeps a complete document object model (DOM) in memory which can easily kill your application when handling big XML data.

    An alternative that is harder to write but lighter and faster though not always applicable is XMLReader.

    Read about when to use it here:
    http://test.ical.ly/2010/06/29/working-with-php-and-xml-consider-xmlreader-and-xmlwriter/

    And about how to use it here:
    http://test.ical.ly/2011/03/08/simply-iterate-over-xml-with-plain-php-using-little-memory-and-cpu/

    • Hi Caefer,

      Those are some good articles and points on XMLReader. I really like SimpleXML’s ease of use, but you’re definitely right about the memory/performance. If you have an XML file that is extremely large, then you might want to look at an alternative way of parsing your XML. Sometimes, ease of programming is a good thing though.

  41. Wat hell..you teached me Xml in 10 min..

    Gr8.. writting ..

     
