Refresh external feeds only in cron?

Is there an easy way to ensure that external feeds (loaded via fetch_feed()) are only fetched by cron, and not when a regular user visits the site? This is for performance reasons: I want to minimize page load time.

The only times a feed should load during a normal request are probably when the cache is empty (the first time it is loaded) and maybe when a logged-in user visits the page (I will be the only one with a user account).

3 comments

  1. My recommendation would be to set up a wrapper for fetch_feed(). Call the wrapper function through WordPress’ cron and you shouldn’t have an issue.

    So something like:

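    function schedule_fetch_feeds() {
        $urls = array( 'http://blog1.url/feed', 'http://blog2.url/feed' );

        foreach ( $urls as $url ) {
            // Cron events are keyed by hook *and* arguments, so check each URL separately.
            if ( ! wp_next_scheduled( 'cron_fetch', array( $url ) ) ) {
                // The arguments for the hook must be passed as an array.
                wp_schedule_event( time(), 'hourly', 'cron_fetch', array( $url ) );
            }
        }
    }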
    
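    function fetch_feed_on_cron( $url ) {
        $feed = fetch_feed( $url );

        // fetch_feed() returns a WP_Error on failure; keep the old cached copy in that case.
        if ( is_wp_error( $feed ) ) {
            return;
        }

        // set_transient() overwrites an existing value, so no delete_transient() is needed.
        set_transient( "feed-" . $url, $feed, 60*60 );
    }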
    
    add_action( 'wp', 'schedule_fetch_feeds' );
    add_action( 'cron_fetch', 'fetch_feed_on_cron', 10, 1 );
    

    Keep in mind, I haven’t had a chance to test this yet! But it should create cron jobs to grab each of your feeds and store the feeds temporarily in transients. The transients have 1-hour expirations, because the cron should realistically be updating them every hour anyway.

    You can pull the feeds out of the transients using:

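    function get_cached_feed( $url ) {
        $feed = get_transient( "feed-" . $url );
        if ( false !== $feed ) return $feed;

        // Cache miss: fetch directly, and cache only a successful result.
        $feed = fetch_feed( $url );
        if ( ! is_wp_error( $feed ) ) {
            set_transient( "feed-" . $url, $feed, 60*60 );
        }
        return $feed;
    }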
    

    If the transient exists, the function grabs it and returns it. If it doesn’t, the function will grab it manually, cache it, and still return it.
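
    A whole SimplePie object can be a heavy thing to keep in a transient. One possible variation, as a rough and untested sketch, is to cache only the fields the template needs. The function name get_cached_feed_items(), the "feed-items-" key prefix and the five-item default are made up for illustration; get_items(), get_title() and get_permalink() are SimplePie methods.

    ```php
    function get_cached_feed_items( $url, $limit = 5 ) {
        // md5() keeps the transient name short no matter how long the URL is.
        $key   = 'feed-items-' . md5( $url );
        $items = get_transient( $key );
        if ( false !== $items ) {
            return $items;
        }

        $feed = fetch_feed( $url );
        if ( is_wp_error( $feed ) ) {
            return array(); // Nothing is cached on failure, so the next request retries.
        }

        // Keep plain arrays of strings instead of the SimplePie object itself.
        $items = array();
        foreach ( $feed->get_items( 0, $limit ) as $item ) {
            $items[] = array(
                'title' => $item->get_title(),
                'link'  => $item->get_permalink(),
            );
        }

        set_transient( $key, $items, 60 * 60 );
        return $items;
    }
    ```

    A template would then loop over the returned array and print each title and link, with only a database read on a cache hit.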

  2. Breaking my own rule and adding a second answer … but for a very specific reason …

    I took a closer look at the actual core code for fetch_feed() in response to Rarst’s comment on the original question:

    “holy grail here would be to make native fetch_feed() get all feeds asynchronously in cron and never when user loads front-end page”

    The actual function code is:

    function fetch_feed($url) {
        require_once  (ABSPATH . WPINC . '/class-feed.php');
    
        $feed = new SimplePie();
        $feed->set_feed_url($url);
        $feed->set_cache_class('WP_Feed_Cache');
        $feed->set_file_class('WP_SimplePie_File');
        $feed->set_cache_duration(apply_filters('wp_feed_cache_transient_lifetime', 43200, $url));
        do_action_ref_array( 'wp_feed_options', array( &$feed, $url ) );
        $feed->init();
        $feed->handle_content_type();
    
        if ( $feed->error() )
            return new WP_Error('simplepie-error', $feed->error());
    
        return $feed;
    }
    

    To save you the trouble of reading up on the SimplePie() object … the $feed->init() function first checks whether we’re caching the feed and, if so, retrieves it from the cache rather than re-requesting it from the original source.

    Each feed is cached for 43200 seconds (12 hours), which is the lifetime of the transient. You can adjust this up or down with the ‘wp_feed_cache_transient_lifetime’ filter.
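
    As a minimal sketch of that filter, assuming you want one feed refreshed hourly and the rest left at the default: the callback name my_feed_cache_lifetime() is hypothetical, and the URL is just the placeholder from the other answer. The filter receives the default lifetime and the feed URL, as the apply_filters() call in the core code above shows.

    ```php
    // Hypothetical callback: receives the default lifetime (in seconds) and the feed URL.
    function my_feed_cache_lifetime( $seconds, $url ) {
        // Assumed example: cache this one feed for an hour instead of 12 hours.
        if ( 'http://blog1.url/feed' === $url ) {
            return 60 * 60;
        }

        // Every other feed keeps whatever lifetime was passed in.
        return $seconds;
    }

    // Priority 10, and 2 accepted arguments so the URL is passed along.
    add_filter( 'wp_feed_cache_transient_lifetime', 'my_feed_cache_lifetime', 10, 2 );
    ```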

    To Address the Original Question

    Feeds aren’t re-fetched via cron. They’re fetched once, then cached for future use. The only time a feed is fetched while a user is loading a page is when the cache is empty, either on the very first request or after the cached copy has expired. For a high-traffic site, this should be relatively rare.

    So if you’re facing performance issues related to feeds, you’ve probably got some other issue going on.

  3. How about storing the feeds in the options table, keeping them updated with the native Linux cron, and just reading them from the database on page loads? That way page load times won’t be affected at all.

    Edit: All right! You may be familiar with how the alternate cron works in WordPress: it creates another process and then continues with its own work. Adapting that approach, together with transients to decide when to initiate such a request, seems like a good solution for your case.
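
    If you go the native Linux cron route, one common arrangement (sketched here under assumptions, not tested against this setup) is to switch off the cron that WordPress triggers on page loads and have the system cron request wp-cron.php instead. The domain example.com and the 15-minute interval are placeholders.

    ```php
    // In wp-config.php: stop WordPress from spawning its cron on page loads.
    define( 'DISABLE_WP_CRON', true );

    // Then a system crontab entry (shown as a comment, since it is not PHP)
    // calls wp-cron.php on a fixed schedule instead:
    //
    // */15 * * * * wget -q -O - "https://example.com/wp-cron.php?doing_wp_cron" >/dev/null 2>&1
    ```

    Scheduled hooks such as a feed refresh would then run from that request rather than during a visitor's page load.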