File: news_parser_4.php

Role: ???
Content type: text/plain
Description: This contains the parser and article classes. Please use morever_api.php instead; it is the updated version of this class, with channel and category support. Thanks, Mike
Class: news_parser_4
Author: Carter Comunale
Date: 19 years ago
Size: 6,399 bytes



        File name: news_parser_4.php - the 4 indicates that php 4 or better is required. php3 port?
        Classes: article and news_xml_parser.
        Purpose: These two classes are intended to be used with the news feed site.
                        article is a simple object that represents a single article from the news feed.
                        news_xml_parser is an xml parser that creates article objects from the xml news feed.

        ToDo: Well, there is a mess of stuff that I could add to this, but for now I am just leaving it. You can add to it.
                   It would be nice to have more configuration stuff for the url that gets passed in.
                   moreover offers a ton of options, it would be nice if I handled it better.
                   check out
                   for details on building feed urls to pass the parser. They have category support in addition to the keyword
                   stuff I used in my example code (show_news.php) which should be with this file.

        Author: Carter Comunale (comments and suggestions are welcome)
        Date: 07/04/2001 (the 4th of July!)
        Modified Last By:         <your name here>
        Modified Last Date:      <the date you changed it>
        Note: Feel free to do whatever you want with this code; however, if you change it and make it better, send me a note.
                  I would like to know what you did :)

So you want to see it work? Copy and paste this url
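
If you just want to watch the parsing idea run without a live feed, here is a minimal, self-contained sketch of the same event-driven approach. The sample XML string, the array-based article, and the variable names are assumptions made up for illustration and are not part of this file; only the tag names are taken from the class below. It uses closures, so it needs a newer PHP than the class itself.

```php
<?php
// Hypothetical sample feed; the element names mirror the tags handled in cdata().
$sample = '<moreovernews>'
    . '<article><url>http://example.com/a</url>'
    . '<headline_text>First headline</headline_text></article>'
    . '<article><url>http://example.com/b</url>'
    . '<headline_text>Second headline</headline_text></article>'
    . '</moreovernews>';

$articles = array(); // finished articles
$current  = null;    // article currently being built
$tag      = '';      // name of the element that is currently open

$parser = xml_parser_create('UTF-8');
// With case folding on, tag names reach the handlers upper-cased.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, true);

xml_set_element_handler(
    $parser,
    // Start tag: remember the tag name and start a fresh article on ARTICLE.
    function ($p, $name, $attrs) use (&$current, &$tag) {
        $tag = $name;
        if ($name === 'ARTICLE') {
            $current = array('URL' => '', 'HEADLINE_TEXT' => '');
        }
    },
    // End tag: forget the tag name and store the article when ARTICLE closes.
    function ($p, $name) use (&$articles, &$current, &$tag) {
        $tag = '';
        if ($name === 'ARTICLE') {
            $articles[] = $current;
        }
    }
);

xml_set_character_data_handler(
    $parser,
    // Character data may arrive in several chunks, so append instead of assigning.
    function ($p, $text) use (&$current, &$tag) {
        if ($current !== null && isset($current[$tag])) {
            $current[$tag] .= $text;
        }
    }
);

if (!xml_parse($parser, $sample, true)) {
    die(sprintf("XML error: %s at line %d\n",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}

foreach ($articles as $a) {
    echo $a['HEADLINE_TEXT'], ' (', $a['URL'], ")\n";
}
```

The class below does the same thing with methods instead of closures, and guards each field with an "already set?" check rather than clearing the current tag when an element closes.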


// simple class to hold our news feed articles that we build
class article {
    var $article_id;
    var $url;
    var $headline_text;
    var $source;
    var $media_type;
    var $cluster;
    var $tagline;
    var $document_url;
    var $harvest_time;
    var $access_registration;
    var $access_status;

    function article() {
        // do nothing for now just be nice oo style.
    }
}

class news_xml_parser {
    var $xml_file;
    var $type;
    var $parser; // handle returned by xml_parser_create(); the methods use $this->parser
    var $news_objects;
    var $current_tag;
    var $current_article;

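    function news_xml_parser($xml_file) { // constructor
        $this->xml_file = $xml_file;
        $this->type = 'UTF-8';
        $this->parser = xml_parser_create($this->type);
        xml_parser_set_option($this->parser, XML_OPTION_CASE_FOLDING, true);
        xml_parser_set_option($this->parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');
        // Handler registration is not shown in this listing; assumed here so that
        // tag_open, tag_close and cdata below are actually called by the parser.
        xml_set_object($this->parser, $this);
        xml_set_element_handler($this->parser, 'tag_open', 'tag_close');
        xml_set_character_data_handler($this->parser, 'cdata');
    }
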
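    function parse() {
        if (!($fp = fopen($this->xml_file, 'r'))) {
            echo "Could not open {$this->xml_file} for parsing!\n";
            return false; // nothing to read, so stop here
        }
        while ($data = fread($fp, 4096)) {
            if (!($data = utf8_encode($data))) {
                echo 'ERROR'."\n";
            }
            if (!xml_parse($this->parser, $data, feof($fp))) {
                die(sprintf( "XML error: %s at line %d\n\n",
                    xml_error_string(xml_get_error_code($this->parser)),
                    xml_get_current_line_number($this->parser)));
            }
        }
        fclose($fp);
    }
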
    function tag_open($parser,$tag,$attributes) {
        $this->current_tag = $tag;
        switch ($tag) {
            case "MOREOVERNEWS": // start of a new xml file: create the array to hold the objects created
                // the array is seeded with a placeholder element, so articles start at index 1
                $this->news_objects = array (" ");
                break;

            case "ARTICLE": // when we get this tag, create a new article object
                $this->current_article = new article();
                break;
        }
    }

    function cdata($parser,$cdata) {
        // Each field is set only once: the current tag is not cleared when an element
        // closes, so the checks stop trailing whitespace from overwriting a value.
        switch ($this->current_tag) {
            case "URL":
                if (!$this->current_article->url) {
                    $this->current_article->url = $cdata;
                }
                break;

            case "HEADLINE_TEXT":
                if (!$this->current_article->headline_text) {
                    $this->current_article->headline_text = $cdata;
                }
                break;

            case "SOURCE":
                if (!$this->current_article->source) {
                    $this->current_article->source = $cdata;
                }
                break;

            case "MEDIA_TYPE":
                if (!$this->current_article->media_type) {
                    $this->current_article->media_type = $cdata;
                }
                break;

            case "CLUSTER":
                if (!$this->current_article->cluster) {
                    $this->current_article->cluster = $cdata;
                }
                break;

            case "TAGLINE":
                if (!$this->current_article->tagline) {
                    $this->current_article->tagline = $cdata;
                }
                break;

            case "DOCUMENT_URL":
                if (!$this->current_article->document_url) {
                    $this->current_article->document_url = $cdata;
                }
                break;

            case "HARVEST_TIME":
                if (!$this->current_article->harvest_time) {
                    $this->current_article->harvest_time = $cdata;
                }
                break;

            case "ACCESS_REGISTRATION":
                if (!$this->current_article->access_registration) {
                    $this->current_article->access_registration = $cdata;
                }
                break;

            case "ACCESS_STATUS":
                if (!$this->current_article->access_status) {
                    $this->current_article->access_status = $cdata;
                }
                break;
        }
    }

    function tag_close($parser,$tag) {
        switch ($tag) {
            case "ARTICLE": // when we get this tag, we are done with the current object, insert it into the array.
                array_push($this->news_objects, $this->current_article);
                break;
        }
    }

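    function free_parser() {
        // body is not shown in this listing; releasing the parser handle is assumed
        xml_parser_free($this->parser);
    }
}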
