JSON: schedule creation of json file

After some research I’ve managed to create a JSON file which I use as input for Twitter’s typeahead (prefetch). My website has several hundred events and several thousand artists (both custom post types).

The JSON files are created using the function below (I’ve only posted one function, since the artists function is identical apart from the post type). Since we keep updating the events and artists, and since we have so many of both, I was wondering how best to run these functions. My idea was to schedule this function, so it will run once every day or so. After some more research I should be able to fix this (using this post).


Now my question is whether this is good practice. I thought it would be great to schedule the functions to run every night (we have over 8,000 artists, so I don’t want the functions to slow down the website for visitors), but maybe there’s another efficient way of building such large datasets.

The function:

function json_events() {

  $args = array(
    'post_type'      => 'events',
    'posts_per_page' => -1,
    'post_status'    => 'publish'
  );

  $query = new WP_Query($args);

  $json = array();

  while ($query->have_posts()): $query->the_post();
    $json[] = get_the_title();
  endwhile;
  wp_reset_postdata(); // wp_reset_query() is only needed after query_posts()

  // Note: a relative path, so the file lands in whatever directory PHP runs from
  $fp = fopen('events.json', 'w');
  fwrite($fp, json_encode($json));
  fclose($fp);
}
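
For reference, this is roughly how I imagine scheduling it with WP-Cron, based on that post (the hook name json_events_daily is just a placeholder I made up):

```php
// Register a daily WP-Cron event if it isn't scheduled yet.
// 'json_events_daily' is a made-up hook name for illustration.
add_action( 'wp', function () {
    if ( ! wp_next_scheduled( 'json_events_daily' ) ) {
        wp_schedule_event( time(), 'daily', 'json_events_daily' );
    }
} );

// Run the generator when the cron event fires.
add_action( 'json_events_daily', 'json_events' );
```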


1 comment

  1. My idea was to schedule this function, so it will run once every day
    or so. After some more research I should be able to fix this (using
    this post).

    Note that with WP-Cron set to a daily schedule you can’t control exactly when the file creation happens. WP-Cron only fires on page loads, so if no one visits your site during the night, the first visitor in the morning will trigger the job and experience a very long page load.
    There are workarounds; one I’ve sometimes used is to create a function hooked into the shutdown action and, from that function, call the URL that runs the heavy work via cURL.
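
    A minimal sketch of that workaround. Instead of raw cURL this uses WordPress’s wp_remote_post with blocking disabled, so the current request isn’t delayed; the trigger URL and the once-a-day transient check are assumptions to adapt to your setup:

    ```php
    // On shutdown, fire a non-blocking request to the URL that builds the json.
    add_action( 'shutdown', function () {
        if ( get_transient( 'json_build_done' ) ) {
            return; // already built within the last day
        }
        set_transient( 'json_build_done', 1, DAY_IN_SECONDS );
        // '?build_json=1' is a hypothetical endpoint that runs the heavy work
        wp_remote_post( home_url( '/?build_json=1' ), array(
            'timeout'  => 0.01,
            'blocking' => false,
        ) );
    } );
    ```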

    However, if you have access to the server’s cron table, it’s better to use that instead.

    Another reason: search engine crawlers trigger WP-Cron too, so there’s a chance the slow work starts during a crawler’s visit, and the resulting slow page load can hurt your SEO.
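
    With real cron, the usual pattern is to disable WP-Cron’s page-load trigger and call wp-cron.php on a fixed schedule (the URL and time below are examples):

    ```
    # in wp-config.php, stop WP-Cron from running on page loads:
    # define('DISABLE_WP_CRON', true);

    # crontab entry: run WP-Cron every night at 03:00
    0 3 * * * curl -s http://example.com/wp-cron.php?doing_wp_cron > /dev/null 2>&1
    ```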

    .. but maybe there’s another efficient way of building such large
    datasets.

    If you really need to refresh the entire file content every day, there’s not much you can do. But if your artists/events post titles don’t all change every day, you can:

    1. Run the file creation once
    2. Hook into save_post, post_updated and delete_post so that when a new artist or event is added, the prefetch file is incremented (i.e. the new title is appended instead of rebuilding the entire file). When a title is updated, or a post is deleted (or trashed/restored), recreate the file from scratch.

    This way you don’t need to run any heavy scheduled task, and the website won’t slow down.

    I’ve written a simple plugin that does the incremental saving. It stores the JSON files in the WP uploads directory, under a /json subfolder.

    <?php 
    /**
     * Plugin Name: Json Posts Incremental
     * http://wordpress.stackexchange.com/questions/113198/json-schedule-creation-of-json-file
     * Author: G. M.
     * Author URI: http://wordpress.stackexchange.com/users/35541/
     */
    
    class Json_Posts_Incremental {
    
      static $types = array('artists', 'events');
    
      static $path;
    
      static $url;
    
      static function init() {
        self::set_path();
        add_action('admin_init', array( __CLASS__, 'build_all') );
        add_action('delete_post', array( __CLASS__, 'update'), 20, 1 );
        add_action('post_updated', array( __CLASS__, 'update'), 20, 3 );
      }
    
      static function build_all() {
        // On every admin page load, create any missing json file
        // ( build() returns early when the file already exists )
        foreach ( self::$types as $type ) self::build($type, false);
      }
    
      // Runs on 'post_updated' (3 args) and on 'delete_post' (1 arg)
      static function update( $post_ID = 0, $post_after = null, $post_before = null) {
        if ( ! empty($post_after) && ! empty($post_before) ) {
          // brand new post: just append its title to the file
          $new = $post_before->post_status == 'auto-draft' || $post_before->post_status == 'new';
          if ( $new ) return self::increment($post_ID, $post_after, false);
          // neither the old nor the new status is 'publish': nothing to do
          $skip = $post_after->post_status != 'publish' && $post_before->post_status != 'publish';
          if ( $skip ) return;
          $trash = ( $post_after->post_status == 'trash' && $post_before->post_status == 'publish' ) || // trash
            ( $post_after->post_status == 'publish' && $post_before->post_status == 'trash' ); // restore
          // title unchanged and no trash/restore: the file is still accurate
          if ( ! $trash && ( $post_after->post_title == $post_before->post_title ) ) return;
        } else {
          // 'delete_post' passes only the ID, so fetch the post ourselves
          $post_after = get_post($post_ID);
          if ( ! is_object($post_after) || ! isset($post_after->post_type) || $post_after->post_status == 'trash' ) return;
        }
        // anything else requires a full rebuild of the file for this post type
        if ( in_array($post_after->post_type, self::$types) ) self::build( $post_after->post_type, true );
      }
    
      // Append a single title to the json file without rebuilding it
      static function increment( $post_ID, $post, $update ) {
        if ( $update || ! in_array($post->post_type, self::$types) ) return;
        $file = trailingslashit(self::$path) . $post->post_type . '.json';
        $content = file_exists($file) ? file_get_contents($file) : '[]';
        $content = rtrim($content, ']');       // strip the closing bracket
        if ( $content != '[') $content .= ','; // comma needed unless the array was empty
        $content .= json_encode($post->post_title) . ']';
        self::write($file, $content);
      }
    
      static function get_json( $type ) {
        $file = trailingslashit(self::$path) . $type . '.json';
        return ( file_exists($file) ) ? file_get_contents($file) : '';
      }
    
      static function get_json_file( $type ) {
        $file = trailingslashit(self::$path) . $type . '.json';
        return ( file_exists($file) ) ? $file : '';
      }
    
      static function get_json_url( $type ) {
        $file = trailingslashit(self::$path) . $type . '.json';
        return ( file_exists($file) ) ? trailingslashit(self::$url) . $type . '.json' : '';
      }   
    
      // (Re)create the whole json file for a post type;
      // an existing file is only overwritten when $force is true
      private static function build( $type = '', $force = false ) {
        if ( ! in_array($type, self::$types) ) return;
        $file = trailingslashit(self::$path) . $type . '.json';
        if ( file_exists($file) && ! $force ) return;
        $titles = array();
        $posts = get_posts("post_type=$type&nopaging=1");
        if ( ! empty($posts) ) {
          foreach ( $posts as $post )
            $titles[] = apply_filters('the_title', $post->post_title);
          $content = json_encode( $titles );
          self::write($file, $content);
        }
      }
    
      private static function set_path() {
        $upload_dir = wp_upload_dir();
        self::$path = $upload_dir['basedir'] . '/json';
        self::$url = $upload_dir['baseurl'] . '/json';
        if ( ! file_exists(self::$path) ) mkdir(self::$path, 0775);
      }
    
      private static function write( $file = '', $content = '' ) {
        $fp = fopen($file, 'w');
        fwrite($fp, $content);
        fclose($fp);
      }
    
    }
    
    add_action('init', array('Json_Posts_Incremental', 'init') );
    

    To retrieve the json file URL, e.g. to pass it to JS, you can use Json_Posts_Incremental::get_json_url($type), where $type is the post type, e.g. events.

    Other utility functions are:

    Json_Posts_Incremental::get_json_file($type) to retrieve the file path.

    Json_Posts_Incremental::get_json($type) to retrieve the json content as string.
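
    For example, in a theme you might expose the URLs to the script that initializes typeahead like this (the script handle and file path are placeholders):

    ```php
    // Pass the generated json URLs to the front-end script.
    add_action( 'wp_enqueue_scripts', function () {
        wp_enqueue_script(
            'typeahead-init',
            get_template_directory_uri() . '/js/typeahead-init.js',
            array(),
            null,
            true
        );
        wp_localize_script( 'typeahead-init', 'jsonUrls', array(
            'events'  => Json_Posts_Incremental::get_json_url( 'events' ),
            'artists' => Json_Posts_Incremental::get_json_url( 'artists' ),
        ) );
    } );
    ```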
