Adding feeds with CG-FeedRead - a tutorial

By Murray Bourne, 19 Jan 2006

Someone recently asked for a recommended method for pulling feeds from news sources into a PHP-based website.

Update, 27 Feb 2007: There were some problems with the feeds from Bangladesh and the $dLimit variable. I have rewritten the tutorial where necessary.

Background

I use CG-FeedRead by David Chait on my Interactive Mathematics site to pull posts from the squareCircleZ blog into the homepage (you can see the links under the heading "Mathematics Blog" in the right column of the math site).

First step

Download the latest CG-FeedRead. It is called something like "WP 1.5.1.x Compatible...". It's free, but Chait appreciates donations.

The script works fine, but the documentation is rather hard to follow if you are a newbie.

CG FeedRead is designed to work with the WordPress blog engine, but here I am going to assume that we are independent of WordPress.

Installation

After you have downloaded the zip file, extract the files. You get lots of files, but will only need 4, from the \plugins\cg-plugins\ directory.

We'll assume the PHP page that you want to pull the feeds into is in the root directory of http://www.mysite.com/.

Create a directory /cg in your root directory (so it will be http://www.mysite.com/cg/). Upload the following 4 files from the zip into that /cg directory:

  • cg-feedread.php
  • helper_fns.php
  • uni_fns.php
  • XMLParser.php

[Update 14 Nov 2006: Please see David Chait's comment below, about uploading all of the files, and my comment following.]

Inside the cg/ directory, create a directory called /cache_feedread (so it will be http://www.mysite.com/cg/cache_feedread/). This holds a small cache of the feeds (this is much better than calling the feed every time a user goes to your PHP page). CHMOD the permissions for the cache directory to 764. (The script needs to be able to write the cached files. If 764 does not work, try a more permissive setting such as 775 or 777; the web server user needs both write and execute permission on the directory to create files in it.)
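As a rough sketch, the cache directory can also be created and checked from PHP instead of by hand. The helper name ensure_cache_dir and the 0775 mode are assumptions for illustration; they are not part of CG-FeedRead, and your host may need a different mode.

```php
<?php
// Hypothetical helper (not part of CG-FeedRead): make sure the cache
// directory exists and that PHP can write to it.
function ensure_cache_dir($dir, $mode = 0775)
{
    if (!is_dir($dir)) {
        // The third argument creates missing parent directories too.
        if (!@mkdir($dir, $mode, true)) {
            return false;
        }
    }
    // mkdir() is affected by the umask, so set the mode explicitly.
    @chmod($dir, $mode);
    return is_writable($dir);
}

// Example with the path used in this tutorial, relative to this script:
// var_dump(ensure_cache_dir(__DIR__ . "/cg/cache_feedread"));
```

If the function returns false, the "failed to save feed to disk" style of error is likely, and the permissions need another look.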

That's it for installing CG FeedRead.

The PHP page to display feeds (single feed only)

1. Start with the shell of an HTML document (you need all this so it displays properly in a browser), something like:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
  <meta http-equiv="content-type" 
      content="text/html; charset=utf-8">
  <title></title>
  </head>
  <body>
  </body>
</html>

2. Now, before the <!DOCTYPE HTML PUBLIC ... stuff at the very top, put the following:

<?php
  require_once("cg/cg-feedread.php");
?>

This tells the current page to load the cg-feedread script.

3. In the <body> section, put the following (this tells the cg-feedread script to go get the feed, process it and display it on this page):

<?php
$feedUrl = "http://www.intmath.com/blog/feed";
$maxItemsPerFeed = 4;
$showDetails = true;
$cacheName = "blog";
$filterCat='';
$tLimit = -1;
$dLimit = 10;
$noHTML = true;
$showTime = false;
$feedStyle = false;
$noTitle = true;
$showTimeGMT = true;
$titleImages = false;
$multiSiteTitle = false;
$makeRSS = false;
$rssLink="";

$feedOut = getSomeFeed($feedUrl, $maxItemsPerFeed, 
  $showDetails, $cacheName, $filterCat, $tLimit, $dLimit, 
  $noHTML, $showTime, $feedStyle, $noTitle, $showTimeGMT, 
  $titleImages, $multiSiteTitle, $makeRSS, $rssLink);

if ($feedOut)
  echo $feedOut;
?>

Note 1: I found the listing of these variables in the original files was quite confusing until I rewrote them like this. It takes more space, but at least you can see what each argument is.

Note 2: The line, $dLimit = 10; gives you the first 10 characters of the post. (But see the comments below - this needed an extra tweak to work.)
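If sixteen positional arguments feel error-prone, one option is to build the argument list from named options. The following is a minimal sketch: the helper feed_args is hypothetical (it is not part of CG-FeedRead), and the argument order and default values are simply copied from the listing above.

```php
<?php
// Hypothetical helper (not part of CG-FeedRead): build the positional
// argument list for getSomeFeed() from named options. The order and the
// default values are copied from the tutorial's single-feed listing.
function feed_args($feedUrl, $options = array())
{
    $defaults = array(
        'maxItemsPerFeed' => 4,
        'showDetails'     => true,
        'cacheName'       => 'blog',
        'filterCat'       => '',
        'tLimit'          => -1,
        'dLimit'          => 10,
        'noHTML'          => true,
        'showTime'        => false,
        'feedStyle'       => false,
        'noTitle'         => true,
        'showTimeGMT'     => true,
        'titleImages'     => false,
        'multiSiteTitle'  => false,
        'makeRSS'         => false,
        'rssLink'         => '',
    );
    // Keep only known option names, then let the overrides win.
    $known  = array_intersect_key($options, $defaults);
    $merged = array_merge($defaults, $known);
    // array_values() drops the names but keeps the order of $defaults.
    return array_merge(array($feedUrl), array_values($merged));
}

// Usage, assuming cg/cg-feedread.php has been loaded with require_once:
// $args = feed_args("http://www.intmath.com/blog/feed", array('dLimit' => 25));
// $feedOut = call_user_func_array('getSomeFeed', $args);
// if ($feedOut) echo $feedOut;
```

The call itself is unchanged; the helper only makes it harder to put a value in the wrong position.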

Example Results Page

You can see the results of the previous step at the top of: CG FeedRead Examples. Also on that page, following the squareCircleZ feed are the feeds from the 2 news sites (The New Nation and The ABC, Australia) that we will combine into one feed in the section below.

4. Save your page with a PHP extension (something like feedread.php will do), load it on your server (in the root directory - see above) and then call it in your browser (it will be www.mysite.com/feedread.php). All should work. (Good luck!)

The PHP page to display multiple feeds

This now answers the original request, for a way to pull multiple feeds into one page. The following is similar to what we had above, but allows for multiple feeds.

1. In the <body> section, this time use the following code:

<?php
// first, make an array of all the feeds you want mixed
$feeds = array ( 
"http://timesofindia.indiatimes.com/rssfeedsdefault.cms",
"http://abc.net.au/news/syndicate/offbeatrss.xml"
);

// decide how many total entries you want,
// sampled from how many PER FEED
$count = array(10, 5);
// 10 max output, 5 max sourced from each feed.
$showDetails = true;
$cacheName = "arrayOfFeeds";
$filterCat="";
$tLimit = -1;
$dLimit = 10;
$noHTML = true;
$showTime = false;
$feedStyle = false;
$noTitle = true;
$showTimeGMT = true;
$titleImages = false;
$multiSiteTitle = false;
$makeRSS = false;
$rssLink="";

$feedOut = getSomeFeed($feeds, $count, $showDetails, 
 $cacheName, $filterCat, $tLimit, $dLimit, $noHTML,  
 $showTime, $feedStyle, $noTitle, $showTimeGMT, 
 $titleImages, $multiSiteTitle, $makeRSS, $rssLink);

if ($feedOut)
 echo $feedOut;
?>

Notice the differences between this and the single feed example above. This time you need to supply an array of the feeds you want and the variables for the function are a bit different.

You can see the result of this step at the bottom of the example page under the heading "Mega Array": CG FeedRead Examples.
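To make the meaning of $count = array(10, 5) more concrete, here is a small sketch of one way such limits could be applied: take at most 5 items from each feed, pool them, and keep the 10 newest. This is only an illustration. The function name mix_items, the item layout and the newest-first ordering are assumptions, not CG-FeedRead's actual code.

```php
<?php
// Illustration only (not CG-FeedRead's implementation).
// $feeds: array of feeds, each an array of items shaped like
//         array('title' => ..., 'time' => unix timestamp).
function mix_items($feeds, $maxTotal, $maxPerFeed)
{
    $pool = array();
    foreach ($feeds as $items) {
        // Take no more than $maxPerFeed items from this feed.
        $pool = array_merge($pool, array_slice($items, 0, $maxPerFeed));
    }
    // Sort the pooled items newest first.
    usort($pool, function ($a, $b) {
        return $b['time'] - $a['time'];
    });
    // Cap the combined list at $maxTotal items.
    return array_slice($pool, 0, $maxTotal);
}
```

With two feeds and limits of 10 and 5, the pool never holds more than 10 items, so everything sampled is shown; with more feeds, the total cap starts to matter.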

2. Once again, save your page with a PHP extension (something like feedread.php), load it on your server (in the root directory - see above) and then call it in your browser. All should work. (Good luck!)

Tips

  • The first time you try the multifeed version you may get all sorts of messages in your browser window - this just indicates that the caching is going on. Refresh a few times and it should be neat.
  • If you get stray messages you don't want (like "CGFR: MultiFeed (2) processing...") just comment those lines out of the cg-feedread.php file (they are for error checking and it's okay to kill them). They look like:
    dbglog("CGFR: MultiFeed (2) processing...");
  • I have styled the example - see the HTML output to see how it can be done.
  • You can play with the variables in the function - refer to the feedreadReadme.htm page that comes with the download.
  • I have found that not all feeds work. For example, the XML feed from The New Nation site (http://nation.ittefaq.com/artman/publish/rss.xml) did not work, even though it appears to be properly formed XML. Maybe a setting I need to tweak. Update: I realise now it didn't work because it uses a Windows character encoding, not UTF-8. That's why I changed to the IndiaTimes feed.
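On the encoding tip above: if you fetch a feed yourself before handing it on, one possible workaround is to convert it to UTF-8 first. This is a sketch under assumptions: the helper feed_to_utf8 is hypothetical, it is not part of CG-FeedRead, and it guesses Windows-1252 whenever the bytes are not valid UTF-8.

```php
<?php
// Hypothetical pre-processing step (not part of CG-FeedRead).
// Assumes the text is Windows-1252 whenever it is not valid UTF-8.
function feed_to_utf8($bytes, $assumedCharset = 'Windows-1252')
{
    // With the /u modifier, preg_match() fails on invalid UTF-8 input.
    if (preg_match('//u', $bytes) === 1) {
        return $bytes; // already valid UTF-8, leave it alone
    }
    $converted = @iconv($assumedCharset, 'UTF-8', $bytes);
    // Fall back to the original bytes if the conversion fails.
    return ($converted === false) ? $bytes : $converted;
}
```

Note that if the feed's XML declaration names the old encoding, that declaration would also need updating before an XML parser reads the converted text.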

Good luck. Hope it works for you.


29 Comments on “Adding feeds with CG-FeedRead - a tutorial”

  1. syedmahm says:

    Oh! wonderful, I am through!
    No one before you could make this whole thing so easy for me. Thank you so much for all of this.

  2. Prashanth Narayanan says:

    very nice tutorial!
    had this running in 5 minutes! keep up the good work!
    -prash.

  3. stefan asemota says:

    thanks for the tutorial...CG Feedread seems to have a problem with special characters encoding.... is there a workaround for this?

  4. Murray says:

    Stefan - you'll notice that the feeds from squareCircleZ (at the top of the examples page) are reading special characters just fine, but the ones from the news feeds lower down on the page are having a problem.

    A lot of blog engines (and obviously news ones too) replace "&" with "&amp;" and this messes up the feeds and makes it appear that CG Feedread cannot handle special characters. If you have control of the output, try experimenting with any code that replaces "&" with "&amp;". If you are pulling from outside sources, you may need to play with the functions in the uni_fns.php file in CG Feedread.
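One common cause of this symptom is text that gets escaped twice, so that "&amp;" turns into "&amp;amp;". A minimal sketch, assuming you control the code that writes the text into HTML or into a feed (the function name escape_once is hypothetical):

```php
<?php
// Escape text for HTML output without double-encoding entities that are
// already present (the 4th argument, double_encode, is set to false).
function escape_once($text)
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8', false);
}

// The existing "&amp;" is left alone; the bare "&" is escaped.
echo escape_once("Fish &amp; Chips & more"), "\n";
```

This does not help with feeds from outside sources that are already double-encoded; those would need decoding on the reading side.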

  5. Larry Eeles says:

    This is a great tutorial.. that's the first I have got to work.

    The only thing is that I would like this to update when a new blog is posted and it doesn't seem to do this. Is there a way of doing this?

    Thanks Larry

  6. Murray says:

    Hi Larry. I'm glad that you found the tutorial useful. Maybe by now you have already seen your updated post in the page containing CGFeedread.

    This script caches the feed details on the server. This is so that the original blog is not suffering from unnecessary hits and expensive bandwidth. (Or worse, the owner of the blog may block you because you are hogging the pipe). I can't remember the original cache period (I think it is one hour) but you can change this setting in the script, near the top.

    When testing, I will drop this setting down to maybe 2 minutes, publish a new post in the blog, wait a few minutes, refresh the page where the feeds are displaying and if I see the details of the new post, I know all is working well. Then I set it back to 1 hour.
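For readers curious how a time-based cache of this kind works in general, here is a small sketch. It is an illustration, not CG-FeedRead's code; the function name cache_is_fresh is hypothetical.

```php
<?php
// Illustration of a time-based cache check (not CG-FeedRead's code).
// Returns true when $file exists and is younger than $maxAgeSeconds.
function cache_is_fresh($file, $maxAgeSeconds, $now = null)
{
    if (!is_file($file)) {
        return false; // nothing cached yet
    }
    if ($now === null) {
        $now = time();
    }
    // filemtime() is the time the cached copy was last written.
    return ($now - filemtime($file)) < $maxAgeSeconds;
}
```

While the cached copy is fresh, the page is served from disk and the remote feed is not contacted, which is why a new post can take up to the full cache period to appear.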

  7. Dave says:

    Zac - have you been able to get this script to work on a new page? I can get it to work, but $noHTML = true or to false does not display the HTML of the original post. Also, when I try to limit the length that is displayed ($dLimit) nothing happens. In your example you use "10" which is ignored.

  8. Murray says:

    Hi Dave

    How are things in Canada?

    The trick is to reduce the caching time so that whatever changes you make to the variables will show up more quickly. In the cg-feedread.php file, go to the line near the top and change it to something like
    $XML_CACHE_TIME = 20;
    Now after 20 seconds you should see the changes you have made to $noHTML (refresh after 20 seconds - it should display in all its HTML glory).

    As for $dLimit, the original function in Chait’s file has several variables fixed, which means any changes in your own file that calls those function variables will be ignored.

    In cg-feedread.php, change the function getSomeFeed by removing the equals bits so it looks something like...
    getSomeFeed($InUrl, $maxItemsPerFeed, $showDetails, $cacheName, $filterCat,...

    Now, when you feed values from your own feedread.php page to the function in cg-feedread.php, they should be effective.

    Don't forget to change back that $XML_CACHE_TIME variable or you may get blacklisted from the site(s) you are feeding from...

    Let me know how it goes.

    Update: This does not appear to work any more. See a later comment on this issue.
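As background on the "equals bits": in PHP, a default value in a function signature only applies when the caller leaves that argument out, and a value passed by the caller always takes precedence. So removing the defaults would not be expected to change what a full sixteen-argument call does, which may fit with the update above. The function below is a stand-in for illustration, not getSomeFeed(), and the default of 250 is an arbitrary value.

```php
<?php
// Stand-in function for illustration; it is not getSomeFeed().
function show_limit($dLimit = 250)
{
    return $dLimit;
}

var_dump(show_limit());    // argument omitted: the default (250) is used
var_dump(show_limit(10));  // argument supplied: the default is ignored
```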

  9. David says:

    Just to note, you can make things a tiny bit simpler by uploading the entire cg-plugins folder (and not messing around with which files to upload!), and whichever of the cg-ZZZZZ-plugin.php files you want in order to let WP control activation of the plugins (I normally have folks upload the entire thing, and then just activate the ones they want). So upload cg-feedread-plugin.php (or whatever it is called), and then you don't have to modify your theme to activate it.

    BTW, uploading the entire folder will become more and more important as my plugins share a lot of code, and I continue to 'factor' shared code out into individual files. It also allows you to play with other plugins without trying to figure out what relies on what... 😉

    -d

  10. Murray says:

    Thanks, David - I appreciate your input (and your coding!).

    Actually, I separated out the files I needed because I was having a conceptual problem understanding which files I needed for WordPress (as it turned out, none) and which ones I needed for my feedread situation (the four I mentioned).

    My usual approach with such situations is to strip away all that I don't need and then get the subset of files to work. But I see your point about uploading the lot, especially now that you are using code libraries.

  11. Alex Dichev says:

    Thank you!

    This was a very nice tutorial.
    I had a few pages reading feeds up and running in 10 minutes.

    Thanks again!

  12. Murray says:

    Hi Alex. I'm glad that you found the tutorial useful.

    Zac

  13. mickeyb says:

    Hi

    I think I followed your instructions to the letter. I have wordpress and made the link to the blog inside your piece of script.

    This is what I got. Only the title of the most recent blog appeared at the end of a lot of errors

    CGFR: Recaching blog ( )... CGFR: dealing with singlular-entry case...
    Warning: fopen(/home/channon/public_html/cg/cache_feedread/blog.DAT) [function.fopen]: failed to open stream: No such file or directory in /home/channon/public_html/cg/cg-feedread.php on line 809

    Warning: flock() expects parameter 1 to be resource, boolean given in /home/channon/public_html/cg/cg-feedread.php on line 810

    Warning: fwrite(): supplied argument is not a valid stream resource in /home/channon/public_html/cg/cg-feedread.php on line 812

    Warning: flock() expects parameter 1 to be resource, boolean given in /home/channon/public_html/cg/cg-feedread.php on line 813

    Warning: fclose(): supplied argument is not a valid stream resource in /home/channon/public_html/cg/cg-feedread.php on line 814

    Warning: fopen(/home/channon/public_html/cg/cache_feedread/blog.html) [function.fopen]: failed to open stream: No such file or directory in /home/channon/public_html/cg/cg-feedread.php on line 835
    CG-Feedread failed to save feed to disk -- couldn't write to the cache_feedread directory.
    Warning: fclose(): supplied argument is not a valid stream resource in /home/channon/public_html/cg/cg-feedread.php on line 847

    Well blogger me….
    OK.…

    can you help??

  14. mickeyb says:

    sorted that one out now

  15. Murray says:

    Hi Mickeyb
    Looks like you had a file permissions problem on the blog.DAT file. I'm glad you got it sorted out.

  16. mickeyb says:

    Hi again

    Am almost there. The only trouble I have now is that my CG-Feedreader links on my html home page won't update when new blogs come in no matter how much I refresh the page.

    The only way I can make it happen is to go into the cache-feedread folder and delete the blog.DAT and blog.html files.

    any idea what's going wrong?

  17. mickeyb says:

    ooops... didn't read the comments above. will try that solution

  18. mickeyb says:

    AAAAAaaarrrrrrrgh!

    now it has totally and utterly stopped working. It is putting the WHOLE blogs on my html page instead of just the head and the first few characters!!!!!

    what can have gone wrong. I've reloaded everything, killed off the blog Dat and html files but it still keeps coming back

    what can have gone wrong!!!!

  19. Murray says:

    Hi Mickey

    If you set (for example)
    $dLimit = 100;
    it will give you the first 100 characters of the post.

    Keep smiling!

    Update: Actually, this does not work! I'm not sure why. I have a workaround, though, which is not so bad.

    Open cg-feedread.php and paste this function into the top (underneath the comments, near where it has the feedread version number):

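    function limit_words($string, $limit)
    {
      // Return the first $limit words of $string, joined by single spaces.
      $numWords = 0;
      $result = "";
      if ($limit < 1 || !is_int($limit)) return $result;
      // Split on spaces, newlines and tabs (single backslashes, so that
      // the letters "n" and "t" are not treated as separators).
      $word = strtok($string, " \n\t");
      while ($word !== false && $numWords < $limit) {
        $result .= ($numWords > 0 ? " " : "") . $word;
        $numWords++;
        $word = strtok(" \n\t");
      }
      return $result;
    }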
    

    Now, about 2/3 of the way down the cg-feedread.php script, find the line that says:

    $itemDescription = cleanBadChars($itemDescription); // just in case...

    After that line, add these 2 lines (the first is a call to the function and the second adds the "..." at the end of each truncated post):

    $itemDescription = limit_words($itemDescription, $dLimit);
    $itemDescription .= "...";
    

    Now you can go back to your feedread.php file and set the $dLimit to whatever you like.

    It should work okay. Good luck.

    I have not extensively tested this - let me know if it breaks.
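A possible variation on this workaround, sketched here as an assumption rather than as part of Chait's script, is to add the "..." only when words were actually cut off. The function name truncate_words is hypothetical.

```php
<?php
// Variation on the word-limit workaround (an assumption, not Chait's
// code): keep the first $limit words and add "..." only if words were cut.
function truncate_words($text, $limit)
{
    if ($limit < 1) {
        return "";
    }
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if (count($words) <= $limit) {
        return implode(" ", $words); // short enough, no ellipsis
    }
    return implode(" ", array_slice($words, 0, $limit)) . "...";
}
```

It would be called in the same place, for example $itemDescription = truncate_words($itemDescription, $dLimit); with no separate line for the ellipsis.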

  20. mickeyb says:

    Hi zac

    that bit of script worked ok once I fiddled with it. I replaced

    ($string, " nt");

    with

    ($string, " %");

    because it was delivering the dlimit but with all the n and t letters missing from the text. so I thought I'd use a character that rarely appears in the intro of blogs

    the script had been working fine until last night. strange how it suddenly went wrong. all I had done was change themes on my blog but that can't have done the damage.

    cheers

    mickey

  21. Murray says:

    Wak!

    WordPress changed my "\n\t" into "nt" (because it was in a <pre> tag) and I hadn't noticed it.

    This treats new line and tab characters (as well as spaces) as word separators when counting the number of words; the words are joined with single spaces in the output.

    I have modified the code in my comment so it looks correct and can be copied and pasted.

    I see your example is working fine now.

  22. mickeyb says:

    Hi Zac

    Is it possible to get the blog on the home page to show the name of the blogger too apart from the first ten words

    also can latest comments be shown on the front?

    cheers

    Mickey

  23. Murray says:

    Hi Mickey

    Latest comments is easy. You just give to the function the feed for comments.

    See Comments feed for an example.

    To achieve this, I just gave it

    /comments/feed/

    instead of

    /feed/

    In your case, you would give it.

    .../wordpress/?feed=comments-rss2

    You will get the name of the person who commented.

    I'm afraid the name of the blogger will require tweaking Chait's script. Maybe you could contact him at his blog.

    Another option is the following...

    Feedread is great if you are pulling feeds from outside sources. But if you are pulling your own feeds, it really is overkill. On my mathematics site, I pull 3 posts from my blog on each page (see here for an example towards the bottom of the left column). I am not using Feedread at all there, I am accessing the database directly.

    Are you interested in that option?

  24. Phoenix says:

    I know this thread is like a year old. I don't know if anyone even looks at this anymore, but for the sake of keeping sanity I will post my question anyway.

    I have tried to limit the number of words that appear; really I only want the title to show up and nothing else. I read through the comments and did what Zac said, and I made the change that mickeyb pointed out.

    My site shows lots of the post. I only want 15 characters; however, when I refresh, nothing happens.

    Anyone care to help? I know it's old but it's worth a try....thanks

  25. Murray says:

    Hi Phoenix

    If you check it again now, maybe it will look how you want. The caching system of this script can be a trap - you make a change to the script and then nothing seems to change on the HTML page.

    When modifying, set the timeout ($XML_CACHE_TIME) to a very low value so changes appear immediately.

    Then remember to set it back!

    If it is not OK still, send the link and I'll have a look.
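While testing, another way to force a refresh is to delete the cached files, as mickeyb did by hand earlier in this thread. A rough sketch follows; the helper clear_feed_cache is hypothetical and not part of CG-FeedRead, and the .DAT and .html file names are taken from the error messages and comments above.

```php
<?php
// Hypothetical helper (not part of CG-FeedRead): delete the cached files
// for one cache name so the next page load fetches the feed again.
// Returns the number of files removed.
function clear_feed_cache($cacheDir, $cacheName)
{
    $removed = 0;
    foreach (array('.DAT', '.html') as $ext) {
        $file = rtrim($cacheDir, "/") . "/" . $cacheName . $ext;
        if (is_file($file) && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

// Example with the names used in this tutorial:
// clear_feed_cache(__DIR__ . "/cg/cache_feedread", "blog");
```

As with lowering $XML_CACHE_TIME, this is for testing only; clearing the cache on every page load would defeat its purpose.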

  26. Eric says:

    Zac,

    Are you still helping with the cg-feedreader?

    I am trying to get it to work on the main page of a site.
    I am getting this error:

    CGFR: Recaching blog ... CGFR: ERROR trying to read feed blog ! CGFR: XMLParser Error: 404 Not Found ... XMLParser Error: 404 Not Found

    Seems that it is more of a missing files issue than a plugin.

    Let me know if you can.

  27. Eric says:

    Zac,

    Fixed it. WordPress settings user error!

  28. Simon says:

    Hi - lovely description, thank you.

    Despite that, I'm having a tiny problem, in that before the content of my blog appears, I get an error message. I don't know PHP so I'm anxious not to poke around too much but I'm happy to poke at things if you tell me what to poke....... 🙂

    Alternatively, you mentioned to Mickey a couple of comments above that if one was pulling from one's own blog it was easy to access the database directly. If that's easy, I'm all for it! 🙂

    Simon

  29. Phil says:

    Fantastic, thanks for the guide. Saved heaps of time.
