I am working on an aggregator at the moment for an opt-in blog service.
Essentially, a handful of people submit their WP blogs to the aggregator for, well, aggregation.
I am currently using a slightly hacked version of FeedWordPress (http://feedwordpress.radgeek.com/).
What I want to do is, for each piece of syndicated content, go out to the original post and grab its latest comments and comment count.
Anyone have any opinions on the best way to do this?
Use a token that stores the latest item date read from each RSS feed, so that the next "aggregation" run only picks up newer items. Then:
a) Possibly recombine the items into a new RSS feed and publish that one for subscription.
b) Possibly repost each item to a new Twitter feed.
header:
A simple loop, using file-based storage for the cache and the last-datetime counter:
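A minimal sketch of such a loop in Python, since the code itself is not shown here. The JSON state file, the function names and the injectable fetch parameter are all assumptions made for illustration, and it assumes every feed item carries a pubDate:

```python
import json
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime
from pathlib import Path


def load_state(path):
    """Return {feed_url: ISO timestamp of the newest item seen}, or {} on first run."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_state(path, state):
    Path(path).write_text(json.dumps(state, indent=2))


def fetch_url(url):
    """Download a feed; swap this out for a cached or rate-limited fetcher."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def new_items(rss_xml, last_seen_iso=None):
    """Return (items newer than last_seen_iso, newest timestamp seen so far)."""
    last_seen = datetime.fromisoformat(last_seen_iso) if last_seen_iso else None
    newest = last_seen
    fresh = []
    for item in ET.fromstring(rss_xml).iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if last_seen is not None and published <= last_seen:
            continue  # already handled in an earlier run
        fresh.append({
            "title": item.findtext("title"),
            "url": item.findtext("link"),
            "published": published.isoformat(),
        })
        if newest is None or published > newest:
            newest = published
    return fresh, (newest.isoformat() if newest else None)


def aggregate(feed_urls, state_path, fetch=fetch_url):
    """One aggregation run over all feeds; returns the newly seen items."""
    state = load_state(state_path)
    collected = []
    for feed_url in feed_urls:
        items, newest = new_items(fetch(feed_url), state.get(feed_url))
        for item in items:
            # now store it somewhere or e.g. post it to your own blog via xmlrpc
            collected.append(item)
        if newest:
            state[feed_url] = newest
    save_state(state_path, state)
    return collected
```

Running aggregate() twice against an unchanged feed should return the items the first time and an empty list the second time, because the stored timestamp filters them out.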
The same loop is needed for the comments, whose feed has the form ($url)/feed. So for
each post you find above ($url), you run exactly the same loop again, but for ($url)/feed. That feed holds the comments: same loop, same procedure.
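As a rough sketch of the comment side in Python: the helper names below are made up for illustration, and the URL rule assumes the source blogs use pretty permalinks. A comment feed normally lists only the most recent comments, so the length of the parsed list is not necessarily the post's total comment count.

```python
import xml.etree.ElementTree as ET

# Namespace WordPress feeds use for the <dc:creator> (author) element.
DC_NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def comments_feed_url(post_url):
    """Derive the per-post comment feed, assuming pretty permalinks."""
    return post_url.rstrip("/") + "/feed/"


def parse_comments(rss_xml):
    """Turn a comment feed into a list of plain dicts, one per comment."""
    comments = []
    for item in ET.fromstring(rss_xml).iter("item"):
        comments.append({
            "guid": item.findtext("guid"),
            "author": item.findtext("dc:creator", default="", namespaces=DC_NS),
            "published": item.findtext("pubDate"),
            "content": item.findtext("description", default=""),
        })
    return comments
```

The guid is kept because it gives a stable key for telling later whether a comment has already been stored.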
The line "// now store it somewhere or e.g. post it to your own blog via xmlrpc" means that, at this point, you could call your own aggregator blog's XML-RPC interface to add either a new post or a new comment to an existing post.
1) Adding a new post is pretty standard; you could call this function:
and the post is entered in your aggregator blog. Now for the comment:
You have to check that the post is already present in your blog before adding a comment via XML-RPC. So do the check first, then add the comment (and actually, maybe do that check even before adding a new post).
Add a new comment via: wp.newComment (http://codex.wordpress.org/XML-RPC_wp)
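A hedged sketch of both calls from Python's standard xmlrpc.client follows. It uses metaWeblog.newPost for the post (the function originally meant here is not shown, so that choice is an assumption) and wp.newComment for the comment. The Credentials class, the helper names and the post_ids lookup table are hypothetical.

```python
import xmlrpc.client
from dataclasses import dataclass


@dataclass
class Credentials:
    blog_id: int
    username: str
    password: str


def connect(xmlrpc_url):
    """Open a proxy to the aggregator blog's xmlrpc.php endpoint."""
    return xmlrpc.client.ServerProxy(xmlrpc_url)


def push_post(server, creds, item, post_ids):
    """Create the post unless it was mirrored before; return the local post id.

    post_ids maps source URL -> local post id (a hypothetical lookup table
    that the caller keeps, e.g. in the same file-based storage).
    """
    if item["url"] in post_ids:
        return post_ids[item["url"]]
    post_id = server.metaWeblog.newPost(
        creds.blog_id, creds.username, creds.password,
        {"title": item["title"], "description": item["content"]},
        True,  # publish immediately
    )
    post_ids[item["url"]] = post_id
    return post_id


def push_comment(server, creds, post_id, comment):
    """Attach a comment to an existing local post via wp.newComment.

    Depending on the site's settings, WordPress may attribute the comment
    to the authenticated user rather than to the "author" given here.
    """
    return server.wp.newComment(
        creds.blog_id, creds.username, creds.password, post_id,
        {"content": comment["content"], "author": comment["author"]},
    )
```

Typical use would be server = connect("https://aggregator.example/xmlrpc.php"), with a placeholder URL, then push_post() for each new item followed by push_comment() for each of its comments, so the post check always happens before the comment is added.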
If you are close to the aggregator application's database, you can skip XML-RPC and make direct database calls instead: check whether the post is already present and grab its id, then check whether the comment is present. If it is not, store it; if it was updated, update it; if it was deleted, delete it (to be completely in sync).
(You need to delete comments and posts on your side as well; otherwise people will contact you asking to remove a post or comment, for all kinds of reasons, sometimes years later.) This leads to a decision on how long you want to keep aggregated posts and comments for external display.
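The store/update/delete decision above can be sketched as a small pure function in Python. The function name and the dict shapes are assumptions for illustration. It only gives correct deletes when the remote side is a complete snapshot of the post's comments; a feed that lists just the most recent comments is not enough to conclude that an older one was deleted.

```python
def plan_comment_sync(remote, local):
    """Compare two {comment_guid: content} mappings.

    remote: comments as they currently exist on the source blog
    local:  comments currently stored in the aggregator
    Returns three sorted lists of guids: (to_insert, to_update, to_delete).
    """
    to_insert = sorted(g for g in remote if g not in local)
    to_update = sorted(g for g in remote if g in local and remote[g] != local[g])
    to_delete = sorted(g for g in local if g not in remote)
    return to_insert, to_update, to_delete
```

Keeping the comparison separate from the database calls means the same plan can be applied through direct SQL or through XML-RPC.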
I would run this scheduled RSS checker and updater as a separate application, not inside the application that hosts the presentation layer, so that it can scale independently.
IMHO