eZ Components - Feed
~~~~~~~~~~~~~~~~~~~~

.. contents:: Table of Contents

Introduction
============

Description
-----------

The purpose of the Feed component is to handle parsing and creating RSS and
ATOM feeds.


XML feeds overview
==================

XML feeds
---------

An XML feed is an XML document with a certain structure, which lists a series
of *entries* or *items*.

An example XML feed:

.. include:: tutorial/rss2_example.xml
   :literal:

This XML document describes an RSS2 feed, with channel elements *title*,
*link*, *description*, *language*, *pubDate* and *webMaster*. The XML document
also contains 2 entries (*item*), each one with the elements *title*, *link*,
*description*, *pubDate* and *guid*.

These elements are not the only ones present in RSS2 feeds, and some elements
are not required to be present.

The Feed component allows creating and parsing such XML documents. The feed
types supported by the Feed component are ATOM, RSS1 and RSS2.

Modules
-------

XML feeds are extensible through modules. A module has a namespace and certain
XML elements. An example of a feed module is iTunes, which allows creating and
parsing podcasts for the `iTunes`_ media player.

Example of feed (podcast) with the iTunes module:

.. include:: tutorial/podcast_example.xml
   :literal:

The elements of the iTunes module are the ones with ``itunes:`` in the element
name. They describe the feed (*subtitle*, *author*, *summary*, *image*,
*category*) and the items (called episodes) in the feed (*author*, *subtitle*,
*summary*, *duration*, *keywords*).

These iTunes elements are not the only ones present in iTunes podcasts.

Applications
------------

XML feeds can be used in many applications:

- **content aggregation** - blogs or journals can provide their content in an
  XML feed form. Subscribers to a feed are able to view content aggregated
  from multiple websites in one location, using an `aggregator`_ software
  program.
- **news** - websites can provide news in a feed format. The advantage is that
  users do not need to check a website or subscribe to newsletters, but
  instead can have news from multiple sources in their `aggregator`_ program.
- **podcasts** - XML feeds can have *enclosures*, which are links to media
  files (audio, video, pdf, etc). Some `aggregator`_ programs or the `iTunes`_
  media player can automatically download these media files when they become
  available.


Class overview
==============

An overview of the most important classes in the Feed component.

Base classes
------------

ezcFeed
  Defines a feed object of a specified type. Can be created from scratch or
  from an existing XML document (with autodetection of type). It can be
  generated into an XML document.

ezcFeedElement
  Base class for all feed element types.

ezcFeedModule
  Base class for all feed modules.

Supported feed types
--------------------

A feed has a type (e.g. RSS1, RSS2 or ATOM). The feed type defines which
processor is used to parse and generate that type.

The following feed processors are supported by the Feed component:

- **ATOM** (ezcFeedAtom) - |ATOM-specifications|_
- **RSS1** (ezcFeedRss1) - (RDF Site Summary) - |RSS1-specifications|_
- **RSS2** (ezcFeedRss2) - (Really Simple Syndication) - |RSS2-specifications|_

A new processor can be defined by creating a class which extends the class
ezcFeedProcessor and implements the interface ezcFeedParser. The new class
needs to be added to the supported feed types list by calling the
ezcFeed::registerFeed() method.
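
To make the idea of type autodetection more concrete, the sketch below guesses
the feed type from the root element of an XML document, using only PHP's
bundled DOM extension. The function name ``detectFeedType()`` is hypothetical
and the logic is an illustration only; it is not the detection code used by
ezcFeed.

```php
<?php
/**
 * Guess the feed type from the root element of an XML document.
 * Illustrative sketch only: this is NOT the detection code used by ezcFeed.
 *
 * @param string $xml The XML source of the feed.
 * @return string|null 'rss1', 'rss2', 'atom', or null if not recognized.
 */
function detectFeedType( $xml )
{
    $doc = new DOMDocument();
    // Suppress libxml warnings; a malformed document yields null.
    if ( !@$doc->loadXML( $xml ) )
    {
        return null;
    }

    $root = $doc->documentElement;
    switch ( $root->localName )
    {
        case 'rss':
            // Every <rss> root is treated as RSS2 here, whatever its version attribute.
            return 'rss2';
        case 'RDF':
            // RSS1 documents use <rdf:RDF> as the root element.
            return 'rss1';
        case 'feed':
            // ATOM documents use <feed> in the Atom namespace.
            return ( $root->namespaceURI === 'http://www.w3.org/2005/Atom' ) ? 'atom' : null;
        default:
            return null;
    }
}
```

A custom processor registered with the component would presumably need a
comparable check for its own root element, but how ezcFeed performs that
lookup internally is not described in this tutorial.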

Supported feed modules
----------------------

The following modules are supported by the Feed component:

- **Content** (ezcFeedContentModule) - |Content-specifications|_
- **CreativeCommons** (ezcFeedCreativeCommonsModule) -
  |CreativeCommons-specifications|_
- **DublinCore** (ezcFeedDublinCoreModule) - |DublinCore-specifications|_
- **Geo** (ezcFeedGeoModule) - |Geo-specifications|_
- **iTunes** (ezcFeedITunesModule) - |iTunes-specifications|_

A new module can be defined by creating a class which extends the class
ezcFeedModule. The new class needs to be added to the supported modules list
by calling the ezcFeed::registerModule() method.

Feed element types
------------------

The following element types are implemented in the Feed component (extending
the class ezcFeedElement):

- **item** (ezcFeedEntryElement) - specifies a feed item (entry) with the
  sub-elements *author*, *category*, *comments*, *content*, *contributor*,
  *copyright*, *description*, *enclosure*, *id*, *link*, *published*, *title*,
  *updated*, *source*
- **category** (ezcFeedCategoryElement) - specifies a category with the
  attributes *term*, *scheme*, *label* and *category* (for iTunes
  subcategories).
- **cloud** (ezcFeedCloudElement) - specifies an RSS2 cloud element with the
  attributes *domain*, *port*, *path*, *registerProcedure*, *protocol*
- **content** (ezcFeedContentElement) - specifies an ATOM content element with
  the attributes *text*, *type*, *language*, *src*
- **date** (ezcFeedDateElement) - specifies a date element with the attribute
  *date*. The date is stored as a DateTime object (assigning an integer
  timestamp or a formatted date works as well)
- **enclosure** (ezcFeedEnclosureElement) - specifies an RSS2 enclosure
  element with the attributes *url*, *type*, *length*. Converted to an
  ezcFeedLinkElement when generating an ATOM feed.
- **generator** (ezcFeedGeneratorElement) - specifies a generator element with
  the attributes *name*, *url*, *version*
- **id** (ezcFeedIdElement) - specifies an identifier element with the
  attributes *id*, *isPermaLink*
- **image** (ezcFeedImageElement) - specifies an image with the attributes
  *link*, *title*, *url* (RSS1 and RSS2), *description*, *width*, *height*
  (RSS2 only), *about* (RSS1 only)
- **link** (ezcFeedLinkElement) - specifies a link with the attributes *href*,
  *rel*, *hreflang*, *title*, *type*, *length*
- **person** (ezcFeedPersonElement) - specifies a person with the attributes
  *name*, *email*, *uri* (ATOM only)
- **skipDays** (ezcFeedSkipDaysElement) - specifies an RSS2 skipDays element
  with the attribute *days*
- **skipHours** (ezcFeedSkipHoursElement) - specifies an RSS2 skipHours
  element with the attribute *hours*
- **source** (ezcFeedSourceElement) - specifies a source element with the RSS2
  attributes *source* and *url*, and ATOM elements *title*, *description*,
  *copyright*, *author*, *contributor*, *updated*, *generator*, *image*,
  *icon*, *id*, *link*, *category*
- **text** (ezcFeedTextElement) - specifies a text with the attributes *text*,
  *type* (ATOM only), *language* (ATOM only)
- **textInput** (ezcFeedTextInputElement) - specifies an RSS1 and RSS2 text
  input element with the attributes *name*, *link*, *title*, *description*,
  *about* (RSS1 only)


How to create a feed
====================

This part of the tutorial will show you step by step how to create an RSS2
feed which handles the news on a website. Creating an ATOM or RSS1 feed is
similar, although some code needs to be changed.

See the `Feed creator example`_ for a sample application which can be used to
create simple XML feeds of any type (ATOM, RSS1 or RSS2) by providing a simple
text file as input.

The information which we want to show in the feed is:

- the *title* of the feed: ``eZ news feed``
- a *description* of the feed: ``This RSS feed contains news feeds for eZ
  Publish and eZ Components.``
- a *published* date (called *pubDate* in RSS2) for the feed: ``Wed, 05 Mar
  2008 14:28:45 +0000``
- an *author* (called *managingEditor* in RSS2): ``nospam@ez.no (Derick)``
  (this is the recommended way to specify an author in RSS2, with an email
  address and the name of the person in parentheses)
- a *link* to our website: ``http://ez.no/``
- the news *items* (a short story about each product release)

A news item can be defined by these elements (example for one news item):

- the *title* of the news item: ``eZ Components 2007.1 released``
- a short *description* of the news item: ``The new release of eZ Components
  include Workflow, Authentication...``
- a *published* date (called *pubDate* in RSS2) for the news item: ``Mon, 02
  Jul 2007 11:36:45 +0000``
- an *author* of the news item: ``nospam@ez.no (Derick)`` (this is the
  recommended way to specify an author in RSS2, with an email address and the
  name of the person in parentheses)
- a *link* to the detailed article on our website:
  ``http://ezcomponents.org/resources/news/news-2007-07-02`` (this will show
  in most `aggregator`_ programs as a ``Complete story`` link)

Step 1. Create a feed object
----------------------------

A feed object can be created by calling the constructor with the optional feed
type (``atom``, ``rss1`` or ``rss2``). In our case we create a generic feed::

    <?php
    $feed = new ezcFeed();
    ?>

The type of the resulting XML feed document will be specified in `Step 5.
Generate the XML feed`_.

Step 2. Add feed elements
-------------------------

We add the *title*, *description* and *published* date to the feed::

    <?php
    $feed->title = 'eZ news feed';
    $feed->description = 'This RSS feed contains news feeds for eZ Publish and eZ Components';
    $feed->published = 'Wed, 05 Mar 2008 14:28:45 +0000';
    ?>

Because some feed types support multiple *link* and *author* elements,
multiple elements of this type can be added to a feed (although this is not
fully supported by RSS2 |RSS2-specifications|_ or by `aggregator`_ programs).

The way to add an element which can appear multiple times, or an element which
supports attributes, is to call the method **add()** from ezcFeed or
ezcFeedElement classes. The attributes of the element are set as simple
properties (the way **$feed->title** is set).

So to add an *author* element to our feed, we will do::

    <?php
    $author = $feed->add( 'author' );
    $author->name = 'Derick';
    $author->email = 'nospam@ez.no';
    ?>

In the above example, $author is an object of type ezcFeedPersonElement. Other
properties can be set on it (*uri*) which are mainly used for ATOM feeds.

When generating an RSS2 feed, the generated XML element will look like this
(from the values added to the $author object above)::

    <managingEditor>nospam@ez.no (Derick)</managingEditor>

In ATOM it will look like this::

    <author>
      <name>Derick</name>
      <email>nospam@ez.no</email>
    </author>

To add a *link* element to our feed, we will do::

    <?php
    $link = $feed->add( 'link' );
    $link->href = 'http://ez.no/';
    ?>

In the above example, $link is an object of type ezcFeedLinkElement. Other
properties can be set on it (*title*, *rel*, *hreflang*, *type*, *length*)
which are mainly used for ATOM feeds.

Step 3. Add an item
-------------------

A feed *item* is an element which can appear multiple times, so it is added
via the method **add()** of ezcFeed::

    <?php
    $item = $feed->add( 'item' );
    ?>

Next we add the *title*, *description* and *published* elements to the news
item::

    <?php
    $item->title = 'eZ Components 2007.1 released';
    $item->description = 'The new release of eZ Components include Workflow, Authentication...';
    $item->published = 'Mon, 02 Jul 2007 11:36:45 +0000';
    ?>

We add the *link* and *author* elements to the news item in the same way as
for the feed::

    <?php
    $author = $item->add( 'author' );
    $author->name = 'Derick';
    $author->email = 'nospam@ez.no';

    $link = $item->add( 'link' );
    $link->href = 'http://ezcomponents.org/resources/news/news-2007-07-02';
    ?>

Step 4. Add more items
----------------------

To add more news items to our feed, we repeat step 3 as many times as needed.

Step 5. Generate the XML feed
-----------------------------

To create the XML feed from the $feed object, we call the **generate()**
method of ezcFeed::

    <?php
    $xml = $feed->generate( 'rss2' );
    ?>

After running this line, $xml will contain the XML string of the feed
(provided no exceptions were thrown due to required elements not being
present).

This is the string from $xml with 3 news items:

.. include:: examples/feed_creator/data/news.xml
   :literal:

Some elements were added automatically, namely *lastBuildDate* (current system
time at generation time), *generator* (``eZ Components Feed`` along with the
version of the Feed component (``dev``) and a link to this tutorial) and
*docs* (``http://www.rssboard.org/rss-specification`` - a link to the RSS2
|RSS2-specifications|_).

If you want to generate ATOM and RSS1 feed documents at this step, you can
call **generate()** with ``atom`` and ``rss1`` respectively as arguments. As
some elements are required for ATOM and RSS1, you might receive an exception.
In this case add the required elements and call **generate()** again.

Step 6. Save the XML feed to a file
-----------------------------------

The generated XML feed needs to be saved in a file in order to be made
accessible to users of your website::

    <?php
    // placeholder path; use the directory your web server exposes as /feeds/
    file_put_contents( '/path/to/webroot/feeds/news.xml', $xml );
    ?>

Assuming that our host is ``ez.no``, this will be the location of our newly
created XML feed: ``http://ez.no/feeds/news.xml``.

You can also output the XML directly, while setting the HTTP Content-Type
header::

    <?php
    $xml = $feed->generate( 'rss2' );
    header( 'Content-Type: ' . $feed->getContentType() . '; charset=utf-8' );
    echo $xml;
    ?>

Assuming that this script is kept in ``http://ez.no/feeds/news.php``, when
opening this URL in a web browser, the XML will be output with the
content-type *application/rss+xml*. If the browser is configured properly to
handle this content-type, it will open the feed aggregator software program.
Otherwise it will ask the user which application to use for that content-type.

Step 7. Feed validation
-----------------------

Use a `feed validator`_ to validate your newly created feed. Some warnings can
appear, but as long as the feed validates, it should be parseable by most
applications and `aggregator`_ programs.

Step 8. Make the XML feed accessible
------------------------------------

There are several ways to let the user know that a website provides an XML
feed, so that the user can save the feed link in their feed `aggregator`_.

Automatic feed discovery
````````````````````````

In the HTML source of every page add this line for RSS1 or RSS2 feeds (with
``href`` set to the URL of your feed)::

    <link rel="alternate" type="application/rss+xml" href="http://ez.no/feeds/news.xml" title="eZ news feed" />

Or this line for ATOM feeds::

    <link rel="alternate" type="application/atom+xml" href="http://ez.no/feeds/news.xml" title="eZ news feed" />

In modern browsers the user will be informed (usually via a small icon like
|feed icon| in one corner of the browser or in the address bar) that the
current page has a web feed. If the user clicks on this icon, their feed
aggregator client will start and save the link to the feed in its database (if
the user's system has a feed aggregator client and is configured to handle
``application/rss+xml`` and ``application/atom+xml`` content with the
aggregator).
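
When a site offers more than one feed, the discovery tags can be generated
rather than written by hand. The following is a minimal sketch using plain PHP
only; the helper name ``feedDiscoveryLink()`` and the ``news.atom`` URL are
assumptions made for illustration and are not part of the Feed component.

```php
<?php
/**
 * Build an HTML <link> element for automatic feed discovery.
 * Hypothetical helper, not part of the Feed component.
 *
 * @param string $feedType 'rss1', 'rss2' or 'atom'.
 * @param string $href     URL of the feed document.
 * @param string $title    Human-readable title shown by the browser.
 * @return string The <link> element as an HTML string.
 */
function feedDiscoveryLink( $feedType, $href, $title )
{
    // RSS1 and RSS2 share one media type; ATOM has its own.
    $mediaType = ( $feedType === 'atom' ) ? 'application/atom+xml' : 'application/rss+xml';

    // Escape attribute values so that characters like & and " stay valid HTML.
    return sprintf(
        '<link rel="alternate" type="%s" href="%s" title="%s" />',
        $mediaType,
        htmlspecialchars( $href, ENT_QUOTES, 'UTF-8' ),
        htmlspecialchars( $title, ENT_QUOTES, 'UTF-8' )
    );
}

// Example usage; the news.atom location is a made-up example URL.
echo feedDiscoveryLink( 'rss2', 'http://ez.no/feeds/news.xml', 'News (RSS)' ), "\n";
echo feedDiscoveryLink( 'atom', 'http://ez.no/feeds/news.atom', 'News (ATOM)' ), "\n";
```

Giving each tag a distinct title helps users tell the feeds apart in the
browser's feed menu.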

Multiple feeds can be added to the same page (for example you can provide ATOM
and RSS2 feeds). Note: some browsers might not recognize the non-standard
``application/rss+xml`` type and select the ATOM feed by default.

The *title* attribute of the *link* HTML tag can be used to differentiate
between multiple feeds (for example ``News``, ``Latest offers``, etc).

Link to the feed document
`````````````````````````

In the HTML source of every page (usually in the header and/or footer) add
this line for RSS1 or RSS2 feeds (with ``href`` set to the URL of your feed)::

    <a href="http://ez.no/feeds/news.xml">RSS feed</a>

Or this line for ATOM feeds::

    <a href="http://ez.no/feeds/news.xml">ATOM feed</a>

The user can drag this link to their feed `aggregator`_, where it will be
added to the aggregator's database of feeds.

It is customary to add the feed icon |feed icon| next to a feed link, so that
the user can find the feed link more easily on the page. See this Mozilla__
page for more information about the feed icon.

__ http://www.mozilla.org/foundation/feed-icon-guidelines/

.. |feed icon| image:: img/feed-icon-14x14.png

Feed creator example
--------------------

In the sub-directory ``Feed/docs/examples`` there is a **feed_creator**
application which can be used to create simple XML feeds from minimal text
files.

The structure of the text files accepted by this application is::

    Feed title
    Feed link
    Feed published date
    Feed author name
    Feed author email
    Feed description

    Item 1 title
    Item 1 link
    Item 1 published date
    Item 1 author name
    Item 1 author email
    Item 1 description

    Item 2 title
    Item 2 link
    Item 2 published date
    Item 2 author name
    Item 2 author email
    Item 2 description

    .. etc

An example of an input text file:

.. include:: examples/feed_creator/data/news.txt
   :literal:

The **feed_creator** application will read an input file with the above
structure and output an XML feed of the chosen type (``rss1``, ``rss2`` or
``atom``). An XML file will also be written in the same directory as the input
file, with the name of the input file plus the ``.xml`` extension.
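
To illustrate how such an input file maps to feed data, here is a small sketch
in plain PHP. It assumes six fields per record in the order feed title, link,
published date, author name, author email and description, and it ignores
blank lines. The function name ``parseFeedCreatorInput()`` and the array keys
are hypothetical; this is not the code of the **feed_creator** application.

```php
<?php
/**
 * Split feed_creator-style input into a feed record and item records.
 * Sketch under stated assumptions: six fields per record, blank lines ignored.
 *
 * @param string $text Contents of the input text file.
 * @return array Array with the keys 'feed' (first record or null) and
 *               'items' (list of the remaining records).
 */
function parseFeedCreatorInput( $text )
{
    $fields = array( 'title', 'link', 'published', 'authorName', 'authorEmail', 'description' );

    // Trim every line and keep only the non-empty ones.
    $lines = array_values( array_filter( array_map( 'trim', preg_split( '/\R/', $text ) ), 'strlen' ) );

    $records = array();
    foreach ( array_chunk( $lines, count( $fields ) ) as $chunk )
    {
        // An incomplete trailing record is skipped.
        if ( count( $chunk ) === count( $fields ) )
        {
            $records[] = array_combine( $fields, $chunk );
        }
    }

    // The first record describes the feed, the rest describe the items.
    return array( 'feed' => array_shift( $records ), 'items' => $records );
}
```

The resulting arrays could then be copied onto an ezcFeed object with the
property assignments and **add()** calls shown in `How to create a feed`_.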

Example of usage (current directory is the **feed_creator** directory)::

    php feed_creator.php rss2 data/news.txt

After running this command, the file ``data/news.xml`` will be created,
containing an RSS2 feed with the values read from ``data/news.txt``:

.. include:: examples/feed_creator/data/news.xml
   :literal:

See the section `Step 8. Make the XML feed accessible`_ for details on how to
provide access to the generated XML feed.


How to create an iTunes podcast
===============================

A podcast is a collection of media files (called episodes) distributed over
the Internet using XML feeds. The podcast doesn't contain the media files
themselves, but it contains links to them, plus information about the files
(meta-information) like creator, duration, category, copyright information,
etc.

The Feed component supports creating and parsing feeds which define podcasts.

This part of the tutorial will show you step by step how to create an RSS2
podcast with `iTunes`_ elements. The `iTunes`_ media player supports RSS2
feeds, so creating ATOM or RSS1 podcasts for iTunes is not recommended
(although possible).

The information which we want to show in the podcast is:

- the *title* of the feed: ``Flight of the RC plane``
- a *description* of the feed. This will also be used if we don't declare the
  *summary* iTunes element. The contents of this element are shown in
  `iTunes`_ in a separate window that appears when the "circled i" in the
  Description column is clicked: ``A podcast for fans of remote-control
  planes, with information about planes, competitions, tutorials and tips``
- an *author* (called *managingEditor* in RSS2): ``editor@rcplanes.example.com
  (Derick)`` (this is the recommended way to specify an author in RSS2, with
  an email address and the name of the person in parentheses)
- a *link* to our website: ``http://rcplanes.example.com/``
- the episodes or *items* (information about each episode, a link to the media
  file associated with the episode and iTunes meta-information for the media
  files)
- iTunes elements needed to be able to submit our podcast to `iTunes`_.

iTunes elements for the whole feed:

- one or more *category* elements (from the page `iTunes categories`_). Our
  podcast could be for example in the category ``Technology``, sub-category
  ``Gadgets``
- one or more *keywords*, separated by commas: ``RC planes,gadgets,flying``
- an *image* for our podcast. `iTunes`_ recommends an image of at least
  500x500 pixels: ``http://rcplanes.example.com/images/rc_plane_big.jpg``
- a *subtitle* for our podcast. This is shown in the Description column in
  `iTunes`_. Should be a very short description: ``Competitions, tutorials and
  tips for remote-control planes``.
- *explicit* status (some of our RC pilots are known to curse a lot): ``yes``
  (other values are ``no`` and ``clean``, with ``no`` being the default)

An episode can be defined by these elements (example for one episode):

- the *title* of the episode: ``Flying an RC plane indoors``
- a short *description* of the episode: ``In this episode, Derick talks about
  how to fly an RC plane in a big hall, around people working and throwing
  stuff at the plane.``
- an *author* of the episode: ``derick@rcplanes.example.com (Derick)`` (this
  is the recommended way to specify an author in RSS2, with an email address
  and the name of the person in parentheses)
- a *link* to the detailed article on our website:
  ``http://rcplanes.example.com/articles/fly-an-rc-plane-indoors.html``
- an *enclosure*, which is a link to a media file:
  ``http://rcplanes.example.com/media/003-flying-indoors.mp3``. Other
  information can also be provided as attributes for the *enclosure*, namely
  *length* in bytes and *type* (``audio/x-mp3`` in our case)
- the date and time of the episode, or *published* (called *pubDate* in RSS2).
  The `iTunes`_ program uses this timestamp to download the latest episode
  automatically: ``Fri, 04 Jan 2008 11:18:34 +0100``

In addition, these iTunes elements are added for each episode:

- *duration* of the episode: ``29:20`` (29 minutes 20 seconds). Other ways to
  specify the duration are ``S``, ``M:SS``, ``MM:SS``, ``H:MM:SS`` or
  ``HH:MM:SS`` (H = hours, M = minutes, S = seconds).
- one or more *keywords*, separated by commas: ``RC
  planes,office,flying,enemies``

See the section `How to create a feed`_ for detailed steps. This part of the
tutorial will concentrate on how to add the `iTunes`_ information to the feed.

Step 1. Create an RSS2 feed
---------------------------

We start with creating an RSS2 feed object, which we fill with *title*,
*description*, *author* and *link* elements::

    <?php
    $feed = new ezcFeed( 'rss2' );

    $feed->title = 'Flight of the RC plane';
    $feed->description = 'A podcast for fans of remote-control planes, with information about planes, competitions, tutorials and tips';

    $author = $feed->add( 'author' );
    $author->name = 'Derick';
    $author->email = 'editor@rcplanes.example.com';

    $link = $feed->add( 'link' );
    $link->href = 'http://rcplanes.example.com/';
    ?>

Step 2. Add iTunes feed elements
--------------------------------

The iTunes elements will be contained in an iTunes module. A module is added
to an ezcFeed or ezcFeedEntryElement object using the method **addModule()**.
A module is an object of class ezcFeedModule.

Next we will add *category*, *keywords*, *image*, *subtitle* and *explicit*
iTunes elements to our $feed object::

    <?php
    $iTunes = $feed->addModule( 'iTunes' );
    $iTunes->keywords = 'RC planes,gadgets,flying';
    $iTunes->explicit = 'yes';
    $iTunes->subtitle = 'Competitions, tutorials and tips for remote-control planes';

    // add an image for the podcast
    $image = $iTunes->add( 'image' );
    $image->link = 'http://rcplanes.example.com/images/rc_plane_big.jpg';

    // add the podcast in the category Technology->Gadgets
    $category = $iTunes->add( 'category' );
    $category->term = 'Technology';
    $subCategory = $category->add( 'category' );
    $subCategory->term = 'Gadgets';
    ?>

See the `iTunes categories`_ page for a list of the iTunes categories you can
add a podcast to. Note: some category names contain ``&`` which should be
encoded as ``&amp;`` (e.g. ``Society &amp; Culture``).

Step 3. Add an item
-------------------

Next we will add an episode (*item*) to our $feed object, and we will add the
*title*, *description*, *published*, *author*, *link* and *enclosure* elements
to it::

    <?php
    $item = $feed->add( 'item' );
    $item->title = 'Flying an RC plane indoors';
    $item->description = 'In this episode, Derick talks about how to fly an RC plane in a big hall, around people working and throwing stuff at the plane';
    $item->published = 'Fri, 04 Jan 2008 11:18:34 +0100';

    $author = $item->add( 'author' );
    $author->name = 'Derick';
    $author->email = 'derick@rcplanes.example.com';

    $link = $item->add( 'link' );
    $link->href = 'http://rcplanes.example.com/articles/fly-an-rc-plane-indoors.html';

    $enclosure = $item->add( 'enclosure' );
    $enclosure->url = 'http://rcplanes.example.com/media/003-flying-indoors.mp3';
    $enclosure->length = 49099054; // bytes
    $enclosure->type = 'audio/x-mp3';
    ?>

Step 4. Add iTunes item elements
--------------------------------

We will add the iTunes elements *duration* and *keywords* to our $item object
from the previous step::

    <?php
    $iTunes = $item->addModule( 'iTunes' );
    $iTunes->duration = '29:20';
    $iTunes->keywords = 'RC planes,office,flying,enemies';
    ?>

Step 5. Add more items
----------------------

To add more episodes to our podcast, we repeat steps 3 and 4 as many times as
needed.

Step 6. Generate the XML feed
-----------------------------

Follow these steps from the previous tutorial `How to create a feed`_:

- `Step 5. Generate the XML feed`_
- `Step 6. Save the XML feed to a file`_
- `Step 7. Feed validation`_
- `Step 8. Make the XML feed accessible`_

In the end, our podcast will look like this (with 3 episodes added):

.. include:: examples/feed_creator/data/podcast.xml
   :literal:

Step 7. Submit the podcast to the iTunes Store
----------------------------------------------

Follow the steps in the section `Submitting Your Podcast to the iTunes Store`_
from the iTunes |iTunes-specifications|_.
The iTunes |iTunes-specifications|_ contains other useful information you might need when you are creating podcasts. How to parse an XML feed ======================== An XML feed can be stored in a file, at an URL or in a string variable. By using the methods **parse()** (for files and URLs) or **parseContent()** (for string variables), an ezcFeed object can be created from an XML feed. This part of the tutorial will show you step by step how to parse an XML feed, read its elements, iterate over the items in the feed and read the items' elements. Step 1. Parse an XML feed ------------------------- If the XML feed is stored in a file or URL, use the static method **parse()** from ezcFeed to parse the feed and create an ezcFeed object out of it:: If the XML feed is stored in a string variable, use the static method **parseContent()** from ezcFeed to parse the feed and create an ezcFeed object out of it:: These exceptions can be thrown while parsing a feed: - ezcBaseFileNotFoundException - if the feed URL or file is not found - ezcFeedParseErrorException - if the XML content is broken, or not a feed, or unsupported feed type, or if RSS1 and RSS2 feeds are missing the element Step 2. Read the feed elements ------------------------------ At step 1 we created an ezcFeed object by parsing an XML feed. Next we will read the elements from the feed. Depending on the feed type, certain elements are present while others are not. The feed type can be retrieved via the method **getFeedType()** from ezcFeed:: getFeedType(); ?> This will return ``rss1``, ``rss2`` or ``atom``. It is always a good idea to read only those elements which are present in $feed. To test if an element is present, call **isset()** on it:: title ) ) { $title = $feed->title->__toString(); } ?> Another way to write this is (assuming that a missing *title* can be considered null):: title ) ? 
$feed->title->__toString() : null; ?> Depending on your application, you might need only some elements, let's say *title*, *description*, *author* and *link*. So to read those elements you will use:: title ) ? $feed->title->__toString() : null; $description = isset( $feed->description ) ? $feed->description->__toString() : null; if ( isset( $feed->author ) ) { $author = isset( $feed->author->name ) ? $feed->author->name : null; // RSS2 feeds usually have the author in this format: email_address (author_name) // the $author string can be parsed to extract the email and name. // // ATOM feeds contain the author's email in a separate element: $feed->author->email. // and in addition they have the uri element for authors: $feed->author->uri } $links = array(); if ( isset( $feed->link ) ) { foreach ( $feed->link as $link ) { $links[] = $link->href; } } ?> Because *link* can appear multiple times, we had to resort to this long code to get the links out of the feed. The $links array will be empty if no links are in the XML feed, or will contain all the links that appear on the feed-level. Step 3. Iterate over the feed items ----------------------------------- A feed can contain zero or more *item* elements. Let's say we want to extract the *title*, *description*, *published*, *author* and *link* of all items. This is how we will do it:: item as $item ) { $title = isset( $item->title ) ? $item->title->__toString() : null; $description = isset( $item->description ) ? $item->description->__toString() : null; $published = isset( $item->published ) ? $item->published->date->format( 'c' ) : null; if ( isset( $item->author ) ) { $author = isset( $item->author->name ) ? $item->author->name : null; // RSS2 feeds usually have the author in this format: email_address (author_name) // the $author string can be parsed to extract the email and name // // ATOM feeds contain the author's email in a separate element: $item->author->email. 
// and in addition they have the uri element for authors: $item->author->uri } $links = array(); if ( isset( $item->link ) ) { foreach ( $item->link as $link ) { $links[] = $link->href; } } $items[] = array( 'title' => $title, 'description' => $description, 'author' => $author, 'links' => $links ); } ?> The *published* element is an ezcFeedDateElement object encapsulating a DateTime object, so we return the date as a string with **format( 'c' )**. Other formats can be used also, see the documentation for `date_format()`_. How to parse an iTunes podcast ============================== See the section `How to parse an XML feed`_ for detailed steps. This part of the tutorial will concentrate on how to fetch the `iTunes`_ information from an RSS2 feed. Step 1. Parse an XML feed ------------------------- If the XML feed is stored in a file or URL, use the static method **parse()** from ezcFeed to parse the feed and create an ezcFeed object out of it:: If the XML feed is stored in a string variable, use the static method **parseContent()** from ezcFeed to parse the feed and create an ezcFeed object out of it:: Step 2. Read the feed elements ------------------------------ Depending on the feed type, certain elements are present while others are not. The feed type can be retrieved via the method **getFeedType()** from ezcFeed:: getFeedType(); ?> This will return ``rss1``, ``rss2`` or ``atom``. Depending on your application, you might need only some elements, let's say *title*, *description*, *author* and *link*. So to read those elements you will use:: title ) ? $feed->title->__toString() : null; $description = isset( $feed->description ) ? $feed->description->__toString() : null; if ( isset( $feed->author ) ) { $author = isset( $feed->author->name ) ? 
                  $feed->author->name : null;

        // RSS2 feeds usually have the author in this format: email_address (author_name)
        // the $author string can be parsed to extract the email and name
        //
        // ATOM feeds contain the author's email in a separate element: $feed->author->email.
        // and in addition they have the uri element for authors: $feed->author->uri
    }

    $links = array();
    if ( isset( $feed->link ) )
    {
        foreach ( $feed->link as $link )
        {
            $links[] = $link->href;
        }
    }
    ?>

Step 3. Read the iTunes feed elements
-------------------------------------

Before reading iTunes elements from the feed, we need to make sure that the
feed has the iTunes module. We use the method **hasModule()** from ezcFeed or
ezcFeedItem::

    <?php
    if ( $feed->hasModule( 'iTunes' ) )
    {
        // process the iTunes module
    }
    ?>

We can also use **isset()** to check if the iTunes module is present::

    <?php
    if ( isset( $feed->iTunes ) )
    {
        // process the iTunes module
    }
    ?>

This is how we fetch information from the iTunes module. Let's say we want to
fetch the *keywords*, *subtitle*, *image* and *category* elements::

    <?php
    if ( isset( $feed->iTunes ) )
    {
        $iTunes = $feed->iTunes;
        $keywords = isset( $iTunes->keywords ) ? $iTunes->keywords->__toString() : null;
        $subtitle = isset( $iTunes->subtitle ) ? $iTunes->subtitle->__toString() : null;
        $image = isset( $iTunes->image ) ? $iTunes->image->__toString() : null;

        $categories = array();
        if ( isset( $iTunes->category ) )
        {
            foreach ( $iTunes->category as $category )
            {
                $cat = array( 'term' => $category->term );
                if ( isset( $category->category ) )
                {
                    $cat['subCategory'] = $category->category->term;
                }
                $categories[] = $cat;
            }
        }
    }
    ?>

In iTunes, *category* is an element which can have sub-categories. The code
above reads each *category* value (the *text* attribute in the XML, exposed as
the *term* property) and the sub-category (if any) of each *category*. The
$categories array can look like this::

    array(
        0 => array( 'term' => 'Technology', 'subCategory' => 'Gadgets' ),
    );

if the iTunes elements appeared like this in the XML feed::

    <itunes:category text="Technology">
      <itunes:category text="Gadgets"/>
    </itunes:category>

Step 4. Iterate over the feed items
-----------------------------------

Let's say we want to extract the *title*, *description*, *published*, *author*
and *link* of all items. This is how we will do it::

    <?php
    $items = array();
    foreach ( $feed->item as $item )
    {
        $title = isset( $item->title ) ? $item->title->__toString() : null;
        $description = isset( $item->description ) ? $item->description->__toString() : null;
        $published = isset( $item->published ) ? $item->published->date->format( 'c' ) : null;

        // reset for every item, so an item without an author does not
        // inherit the author of the previous item
        $author = null;
        if ( isset( $item->author ) )
        {
            $author = isset( $item->author->name ) ? $item->author->name : null;

            // RSS2 feeds usually have the author in this format: email_address (author_name)
            // the $author string can be parsed to extract the email and name
            //
            // ATOM feeds contain the author's email in a separate element: $item->author->email.
            // and in addition they have the uri element for authors: $item->author->uri
        }

        $links = array();
        if ( isset( $item->link ) )
        {
            foreach ( $item->link as $link )
            {
                $links[] = $link->__toString();
            }
        }

        $media = array();
        if ( isset( $item->enclosure ) )
        {
            $enclosure = $item->enclosure[0];
            $media = array(
                'url'    => isset( $enclosure->url ) ? $enclosure->url : null,
                'length' => isset( $enclosure->length ) ? $enclosure->length : null,
                'type'   => isset( $enclosure->type ) ? $enclosure->type : null
            );
        }

        $items[] = array(
            'title'       => $title,
            'description' => $description,
            'published'   => $published,
            'author'      => $author,
            'links'       => $links,
            'media'       => $media
        );
    }
    ?>

After running the code, the $media array will contain the *url*, *length* and
*type* of the media file specified in the feed item (it stays empty if the item
has no enclosure). The $media array is added to the $items array to be
processed later by the application.

Step 5. Read the iTunes item elements
-------------------------------------

Let's say we want to fetch the *duration* of the iTunes module inside an item.
The code from the previous section is altered as follows::

    <?php
    // ...

    $media = array();
    if ( isset( $item->enclosure ) )
    {
        $enclosure = $item->enclosure[0];
        $media = array(
            'url'    => isset( $enclosure->url ) ? $enclosure->url : null,
            'length' => isset( $enclosure->length ) ? $enclosure->length : null,
            'type'   => isset( $enclosure->type ) ? $enclosure->type : null
        );

        if ( isset( $item->iTunes ) )
        {
            $iTunes = $item->iTunes;
            $media['duration'] = isset( $iTunes->duration ) ? $iTunes->duration : null;
        }
    }

    // ...
    ?>

After running the code, the $media array will contain the *url*, *length* and
*type* of the media file specified in the feed item, and the *duration* of the
media file taken from the iTunes module (if available).

Best practices
==============

This section lists some useful tips for handling feed documents.

Universal feed generator
------------------------

In order to generate all 3 feed types (ATOM, RSS1, RSS2) from the same ezcFeed
data, these elements must be added to an ezcFeed object:

- *author*
- *description*
- *id*
- at least one *item* with all required elements
- *link*
- *title*
- *updated*

And these elements must be added to an ezcFeedEntryElement object:

- *author*
- *description*
- *id*
- *link*
- *title*
- *updated*

This is a minimal script to be able to generate all 3 feed types from the same
ezcFeed data::

    <?php
    $feed = new ezcFeed();

    $author = $feed->add( 'author' );
    $author->name = "Indiana Jones";
    $author->email = "indy@example.com";

    $feed->description = "This feed shows Indiana Jones movie releases";
    $feed->id = "http://indy.example.com/";

    $link = $feed->add( 'link' );
    $link->href = "http://indy.example.com/";

    $feed->title = "Indiana Jones movie releases";
    $feed->updated = time();

    // add a feed item
    $item = $feed->add( 'item' );

    $author = $item->add( 'author' );
    $author->name = "Indiana Jones";
    $author->email = "indy@example.com";

    $item->description = "Indy meets ****** and has a hell of an adventure";
    $item->id = "http://indy.example.com/4";

    $link = $item->add( 'link' );
    $link->href = "http://indy.example.com/4";

    $item->title = "Indiana Jones and the Kingdom of the Crystal Skull";
    $item->updated = time();

    $atom = $feed->generate( 'atom' );
    $rss1 = $feed->generate( 'rss1' );
    $rss2 = $feed->generate( 'rss2' );
    ?>

Media type
----------

ATOM
````

All ATOM feeds must be identified with the ``application/atom+xml`` media type.
Use the **getContentType()** method after calling **generate( 'atom' )** on an
ezcFeed object to get this string, or use **ezcFeedAtom::CONTENT_TYPE**.

RSS1
````

All RSS1 feeds should be identified with the ``application/rss+xml`` media type
(although this media type is not a standard yet). Use the **getContentType()**
method after calling **generate( 'rss1' )** on an ezcFeed object to get this
string, or use **ezcFeedRss1::CONTENT_TYPE**.

RSS2
````

All RSS2 feeds should be identified with the ``application/rss+xml`` media type
(although this media type is not a standard yet). Use the **getContentType()**
method after calling **generate( 'rss2' )** on an ezcFeed object to get this
string, or use **ezcFeedRss2::CONTENT_TYPE**.

Extending the Feed component
============================

Register a new feed type
------------------------

A new feed type can be defined by creating a class which extends the class
ezcFeedProcessor and implements the interface ezcFeedParser. The new class
needs to be added to the supported feed types list by calling the
ezcFeed::registerFeed() function.
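
The idea behind registration can be pictured as a map from a feed type name to
a handler class, which is consulted when the type of a document has to be
detected. The sketch below is not the ezcFeed implementation: the names
demoFeedHandler, demoOpmlHandler and demoFeedRegistry are hypothetical and
exist only for this illustration. It assumes that every handler offers a static
canParse() method which inspects a DOMDocument::

    <?php
    // Hypothetical handler contract, for illustration only.
    interface demoFeedHandler
    {
        public static function canParse( DOMDocument $xml );
    }

    // Hypothetical handler: accepts documents whose root element is <opml>.
    class demoOpmlHandler implements demoFeedHandler
    {
        public static function canParse( DOMDocument $xml )
        {
            $root = $xml->documentElement;
            return $root !== null && $root->nodeName === 'opml';
        }
    }

    // Hypothetical registry: maps a feed type name to a handler class name.
    class demoFeedRegistry
    {
        private static $handlers = array();

        public static function register( $type, $class )
        {
            self::$handlers[$type] = $class;
        }

        // Returns the first registered type whose handler accepts $xml,
        // or null if no handler accepts it.
        public static function detect( DOMDocument $xml )
        {
            foreach ( self::$handlers as $type => $class )
            {
                if ( call_user_func( array( $class, 'canParse' ), $xml ) )
                {
                    return $type;
                }
            }
            return null;
        }
    }

    demoFeedRegistry::register( 'opml', 'demoOpmlHandler' );

    $doc = new DOMDocument();
    $doc->loadXML( '<opml version="2.0"><head/><body/></opml>' );
    $detected = demoFeedRegistry::detect( $doc );
    ?>

In this sketch $detected holds the string 'opml', because the only registered
handler accepts the document. Keeping canParse() static lets the registry ask
each class whether it recognises a document before any handler object is
created.
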

Example of new feed type::

    <?php
    // The class name myOpmlHandler and the value of CONTENT_TYPE are
    // assumptions chosen for illustration.
    class myOpmlHandler extends ezcFeedProcessor implements ezcFeedParser
    {
        const FEED_TYPE = 'opml';
        const CONTENT_TYPE = 'text/x-opml';

        // $container is the feed object which holds the feed elements
        public function __construct( $container )
        {
            $this->feedContainer = $container;
            $this->feedType = self::FEED_TYPE;
            $this->contentType = self::CONTENT_TYPE;
        }

        public function generate()
        {
            // write implementation here

            // should return XML as a string based on the feed elements from
            // $this->feedContainer which are accessed as $this->element_name
            // (example $this->title)
        }

        public static function canParse( DOMDocument $xml )
        {
            // write implementation here

            // should return true if this class can parse the $xml DOMDocument received
            // and false otherwise
        }

        public function parse( DOMDocument $xml )
        {
            // write implementation here

            $feed = new ezcFeed();
            // parse $xml and fill $feed with properties fetched from $xml

            return $feed;
        }
    }
    ?>

Example of how to use the feed type above::

    <?php
    // assumed argument order: feed type name, then handler class name
    ezcFeed::registerFeed( 'opml', 'myOpmlHandler' );

    $feed = new ezcFeed();
    // add properties for the OPML feed type to $feed

    $xml = $feed->generate( 'opml' );
    ?>

Register a new module
---------------------

A new module can be defined by creating a class which extends the class
ezcFeedModule. The new class needs to be added to the supported modules list by
calling the ezcFeed::registerModule() function.

Example of a new module type::

    <?php
    // The class name mySlashHandler and the method name isElementAllowed()
    // are assumptions chosen for illustration.
    class mySlashHandler extends ezcFeedModule
    {
        public function isElementAllowed( $name )
        {
            // write implementation here

            // should return true if the element $name is allowed at $this->level,
            // and false otherwise
        }

        public function add( $name )
        {
            // add the element $name to this module at level $this->level (feed-level or item-level)
        }

        public static function getModuleName()
        {
            return 'Slash';
        }

        public static function getNamespace()
        {
            return 'http://purl.org/rss/1.0/modules/slash/';
        }

        public static function getNamespacePrefix()
        {
            return 'slash';
        }
    }
    ?>

Example of how to use the module above::

    <?php
    // assumed arguments: module name, handler class name, namespace prefix
    ezcFeed::registerModule( 'Slash', 'mySlashHandler', 'slash' );

    $feed = new ezcFeed();
    $item = $feed->add( 'item' );
    $slash = $item->addModule( 'Slash' );
    // add properties for the Slash module to $slash

    $xml = $feed->generate( 'rss2' ); // or the feed type which is needed
    ?>

Specifications and RFCs
=======================

For a list of supported RFCs and specifications of the feed types and modules,
please see the `specifications`_ page.


.. _specifications: Feed_specifications.html

.. _feed validator: http://validator.w3.org/feed/

.. _RSS1-specifications: http://web.resource.org/rss/1.0/spec
.. _RSS2-specifications: http://www.rssboard.org/rss-specification
.. _ATOM-specifications: http://atompub.org/rfc4287.html

.. _Content-specifications: http://purl.org/rss/1.0/modules/content/
.. _CreativeCommons-specifications: http://backend.userland.com/creativeCommonsRssModule
.. _DublinCore-specifications: http://dublincore.org/documents/dces/
.. _Geo-specifications: http://www.w3.org/2003/01/geo/
.. _iTunes-specifications: http://www.apple.com/itunes/store/podcaststechspecs.html

.. _aggregator: http://en.wikipedia.org/wiki/List_of_feed_aggregators
.. _date_format(): http://php.net/manual/en/function.date-format.php
.. _iTunes categories: http://www.apple.com/itunes/store/podcaststechspecs.html#categories
.. _iTunes: http://www.apple.com/itunes/
.. _Submitting Your Podcast to the iTunes Store: http://www.apple.com/itunes/store/podcaststechspecs.html#submitting

.. |ATOM-specifications| replace:: RFC 4287
.. |RSS1-specifications| replace:: Specifications
.. |RSS2-specifications| replace:: Specifications

.. |Content-specifications| replace:: Specifications
.. |CreativeCommons-specifications| replace:: Specifications
.. |DublinCore-specifications| replace:: Specifications
.. |Geo-specifications| replace:: Specifications
.. |iTunes-specifications| replace:: Specifications


..
   Local Variables:
   mode: rst
   fill-column: 79
   End:
   vim: et syn=rst tw=79 nocin