Treehouse Blog

How to parse XML with PHP5

Feeds are streams of content that people can use to share pieces of information across websites. PHP5’s SimpleXML functions dramatically simplify the process of turning those feeds into something useful for your web pages.

I recently worked on a widget that displayed a live feed of songs from the playlist of one of our radio stations. The system that runs our radio station outputs XML, which is then fed to a database. A program turns that data into feeds that get pulled by other programs.

It’s common to use XML to share data between servers that use different technologies. The system that generates the playlist is proprietary, so XML provides an easy format through which this data can be shared.

What is XML?

The eXtensible Markup Language is a way to structure your data for sharing across sites. Some of the technologies that are crucial to the web, like RSS (Really Simple Syndication) and podcast feeds, are special flavors of XML. The beautiful thing about XML is that you can easily roll your own for anything you need.

XML is easy to create because it’s a lot like HTML, except you can make up your own tags. Let’s say, for example, that you’re putting together a feed for a list of songs playing at your own radio station. We’ll keep this simple, so we’ll just encode the name of the artist, the title of the song, and the time when the song was played. We make up a couple of tags called <title> and <artist> and wrap them both inside a <song> tag. We’ll create a dateplayed attribute for each song with the date and time the song was played. You might encode something like that in this manner.

<songs>
    <song dateplayed="2011-07-24 19:40:26">
        <title>I left my heart on Europa</title>
        <artist>Ship of Nomads</artist>
    </song>
    <song dateplayed="2011-07-24 19:27:42">
        <title>Oh Ganymede</title>
        <artist>Beefachanga</artist>
    </song>
    <song dateplayed="2011-07-24 19:23:50">
        <title>Kallichore</title>
        <artist>Jewitt K. Sheppard</artist>
    </song>
</songs>

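There are some rules that you have to adhere to when creating XML data. If you’re familiar with XHTML, you’ll be right at home with them: a document has exactly one root element, every tag must be closed, tags must be properly nested, tag names are case sensitive, and attribute values must be quoted.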

XML is a bit more strict than HTML, but it is really easy to create and deal with.
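If you want to see that strictness in action, here is a minimal sketch that asks PHP whether a string is well-formed XML. The is_well_formed() helper and the two sample strings are made up for illustration; simplexml_load_string() and the libxml_* functions are standard PHP.

```php
<?php
// Hypothetical helper (not from the article): true when $xml parses cleanly.
function is_well_formed($xml)
{
    // Collect parse errors internally instead of printing PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // simplexml_load_string() returns false when parsing fails.
    return $doc !== false;
}

$good = '<songs><song dateplayed="2011-07-24 19:27:42"><title>Oh Ganymede</title></song></songs>';
$bad  = '<songs><song><title>Oh Ganymede</song></songs>'; // <title> is never closed

var_dump(is_well_formed($good)); // bool(true)
var_dump(is_well_formed($bad));  // bool(false)
```

A browser would probably shrug off the unclosed tag in HTML; an XML parser refuses the whole document.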

Introducing simpleXML

With SimpleXML, it’s as easy as reading the XML and then accessing its contents through an easy-to-read object. Assuming we’ve got our XML above saved as a file called songs.xml in the same folder as our PHP file, we can read the whole feed into an object with the following code.

<?php
    $mysongs = simplexml_load_file('songs.xml');
?>

That’s it! The file can even be the URL of a feed on the web and not just a file on your hard drive (as long as your PHP configuration allows URLs to be opened like files). We now have an object that represents our file: the root <songs> element is held in the $mysongs variable. If we want to output the name of the first artist in our list, we can refer to it like this:

<?php
    $mysongs = simplexml_load_file('songs.xml');
    echo $mysongs->song[0]->artist;
?>

Notice that our XML tags are mapped as part of the object, so we can get to any element simply by typing its name. Remember that arrays are 0-indexed in PHP, so our first title would be our 0th title. Now, let’s output the third song title.

<?php
    $mysongs = simplexml_load_file('songs.xml');
    echo $mysongs->song[2]->title;
?>
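The examples above assume the file always loads. In practice simplexml_load_file() returns false when the file is missing or is not well-formed, so it is worth checking before you touch the result. Here is one way that might look. The load_feed() helper and the temporary sample file are assumptions made for this sketch, not part of SimpleXML.

```php
<?php
// Hypothetical helper, not part of SimpleXML: returns the parsed feed,
// or null when the file is missing or is not well-formed XML.
function load_feed($path)
{
    // Keep libxml's errors internal rather than printing warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_file($path);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $feed === false ? null : $feed;
}

// Write a small sample feed to a temporary file so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'songs');
file_put_contents($path, '<songs><song><title>Kallichore</title></song></songs>');

$feed = load_feed($path);
echo $feed === null ? "Could not load the feed\n" : $feed->song[0]->title . "\n";

// A path that does not exist takes the failure branch instead.
$missing = load_feed($path . '.does-not-exist');
echo $missing === null ? "Could not load the feed\n" : "Loaded\n";

unlink($path);
```

Returning null is just one choice; you could also throw an exception or fall back to a cached copy of the feed.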

Working with Attributes

In order to get to our dates, we’ll need to know how to access attributes. The notation is slightly different than with tags, but just as easy. As a matter of fact, it works just like accessing an array element. Let’s see how you would take a look at the second song’s date.

<?php
    $mysongs = simplexml_load_file('songs.xml');
    echo $mysongs->song[1]['dateplayed'];
?>
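One thing to watch out for: element and attribute lookups hand back SimpleXMLElement objects rather than plain strings. echo converts them for you, but comparisons and storage usually want an explicit cast. The following sketch uses an inline copy of one song from our feed so it runs on its own.

```php
<?php
$mysongs = simplexml_load_string(
    '<songs><song dateplayed="2011-07-24 19:27:42"><title>Oh Ganymede</title><artist>Beefachanga</artist></song></songs>'
);

$song = $mysongs->song[0];

// The attribute lookup returns an object, not a string.
var_dump($song['dateplayed'] instanceof SimpleXMLElement); // bool(true)

// Cast before comparing or storing the value.
$played = (string) $song['dateplayed'];
var_dump($played === '2011-07-24 19:27:42'); // bool(true)

// attributes() lets you loop over every attribute without knowing the names.
foreach ($song->attributes() as $name => $value) {
    echo $name, ' = ', $value, "\n"; // dateplayed = 2011-07-24 19:27:42
}

// isset() reports whether an attribute exists at all.
var_dump(isset($song['genre'])); // bool(false)
```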

Making a list of songs

So now that we’ve got the basics of accessing elements, let’s write the code to build a complete list of songs from our XML file.

<?php
    $mysongs = simplexml_load_file('songs.xml');
    echo "<ul id=\"songlist\">\n";
    foreach ($mysongs as $songinfo):
        $title=$songinfo->title;
        $artist=$songinfo->artist;
        $date=$songinfo['dateplayed'];
        echo "<li><div class=\"title\">",$title,"</div><div class=\"artist\">by ",$artist,"</div><time>",$date,"</time></li>\n";
    endforeach;
    echo "</ul>";
?>

We’re using a foreach statement here to go through each song and format its information as a simple HTML list. You can use that in a regular HTML document or use it as a widget to display a list of songs.
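If the feed comes from somewhere you don’t control, it’s a good idea to escape the values before dropping them into HTML. Here is a rough sketch of the same list with htmlspecialchars() applied. The render_song_list() function and the song title with an ampersand are made up for this example.

```php
<?php
// Hypothetical helper: turns a <songs> document into an escaped HTML list.
function render_song_list($xml)
{
    $songs = simplexml_load_string($xml);
    $html = "<ul id=\"songlist\">\n";
    foreach ($songs->song as $song) {
        // Escape every value taken from the feed before placing it in HTML.
        $title  = htmlspecialchars((string) $song->title, ENT_QUOTES, 'UTF-8');
        $artist = htmlspecialchars((string) $song->artist, ENT_QUOTES, 'UTF-8');
        $html .= "<li>" . $title . " by " . $artist . "</li>\n";
    }
    return $html . "</ul>";
}

// The parser decodes &amp; to a plain &, so it has to be re-escaped for HTML.
$xml = '<songs><song><title>Rock &amp; Roll on Io</title><artist>Ship of Nomads</artist></song></songs>';
echo render_song_list($xml);
```

Building the markup as a string and returning it, rather than echoing as you go, also makes the function easier to reuse in a template.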

Parsing a Flickr feed from a set

There are lots of XML feeds available for you to parse online. We can, for example, get a feed from a Flickr set for inclusion on a website. That way, when you update your Flickr set, the widget will automatically display this on your site. I’ve prepared a special set with some pictures of kittens. In order to get the XML for this feed, we can go to our page on Flickr and look for the XML icon at the bottom left of the screen.

We want to examine the feed to determine its structure. You can right click on that feed’s link on the Flickr page and save it to your hard drive, then rename the file photoset.xml and open it up with your browser.

I’ve opened the XML file with Safari, which lets me take a look at the structure (Safari will normally try to push an RSS link to its built-in reader, so saving it, giving it an XML extension and opening it will let us view the structure). I can see that each photo is stored as an <entry> tag in our XML. Inside this tag, there are two <link> tags. The first one has a link to our image on Flickr. The second one has a link to a medium sized version of our image.

I’m going to modify the code slightly and make an adjustment on the second link to get a small thumbnail of all of our images. Before I do that, I’ll go back to Flickr and right click on the link to the XML file and this time copy it to the clipboard, then adjust the code:

<?php
    $mypix = simplexml_load_file('http://api.flickr.com/services/feeds/photoset.gne?set=72157627229375826&nsid=73845487@N00&lang=en-us');
    foreach ($mypix->entry as $pixinfo):
        $title=$pixinfo->title;
        $link=$pixinfo->link['href'];
        $image=str_replace("_b.jpg","_s.jpg",$pixinfo->link[1]['href']);
        echo "<a href=\"",$link,"\"><img src=\"",$image,"\" alt=\"",$title,"\" /></a>\n";
    endforeach;
?>

We’re going to use the <title> tag as the alt text of our <img> tag. We’re also going to use the first of the <link> tags as the href attribute of our anchor tag. Getting to our second link will be a bit trickier. We can use the array notation to get to the second link’s href attribute ($pixinfo->link[1]['href']). However, that will give us a much larger picture than we’ll need. I’m using a simple str_replace function to change the end of our link URL from _b.jpg to _s.jpg. That will give us thumbnails.
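To try the link handling without hitting Flickr, you can run the same idea against a tiny inline document. Everything in the sample below is a simplified stand-in: the real feed has more elements and an XML namespace, and the example.com URLs are placeholders.

```php
<?php
// Simplified, hypothetical stand-in for one entry of the feed.
$feed = simplexml_load_string(
    '<feed>
        <entry>
            <title>Kitten</title>
            <link href="http://example.com/photos/1/" />
            <link href="http://example.com/images/1_b.jpg" />
        </entry>
    </feed>'
);

foreach ($feed->entry as $entry) {
    // Repeated tags behave like an array: [0] is the first <link>, [1] the second.
    $page  = (string) $entry->link[0]['href'];
    $large = (string) $entry->link[1]['href'];

    // Swap the size suffix; this assumes the image URL ends in _b.jpg.
    $thumb = str_replace('_b.jpg', '_s.jpg', $large);

    echo $page, ' -> ', $thumb, "\n";
}
```

Note that if a URL does not contain _b.jpg, str_replace() returns it unchanged, so you would get the large image rather than an error.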

Conclusion

SimpleXML is super easy and even fun to use to parse complicated feeds like the example from Flickr. Notice that it had no problems getting to the second instance of the <link> tag inside the <entry> tag. This would have been really difficult to do with previous versions of PHP. There’s a lot more to SimpleXML, which you can read about in the PHP manual. The object that simplexml_load_file gives you is an instance of the SimpleXMLElement class, so if you’re interested in a more object oriented approach, you can also create one directly with new SimpleXMLElement.

If you’d like to learn more PHP, feel free to head over to the Think Vitamin Membership PHP course.
