In Which Contexts are Plugins Responsible for Data Validation/Sanitization?

I want to make sure all of the data in my plugins/themes is handled securely before entering the database and before being output to the browser. My problem is that there are situations where the API handles the sanitization for you — like when saving post meta fields — and others where the plugin/theme author is wholly responsible for doing it — like when saving custom settings.

For the scope of this question, I’m not concerned about validating data at the domain level — e.g., checking that an Age field on a form is between 0 and 120, or that an e-mail address is valid. I’m only concerned about security — e.g., escaping SQL queries to avoid SQL injection when saving to the database, or sanitizing data that’s output to HTML templates to avoid XSS.

For output sanitization, I know that you always need to use functions like esc_html() and esc_attr() when echoing variables into HTML templates. But what about template tags? Do they all sanitize their output already? If so, for which context (general HTML, tag attributes, etc.)? Some functions have variants for different contexts (like the_title_attribute()), but most don’t.

For input sanitization, I know that I need to use $wpdb->prepare() when making manual queries, but what about when using the Settings API to create a plugin settings page, or saving post meta fields for a custom post type?

Right now I’ve just been digging through Core and reading tutorials every time I use a function to find out whether it sanitizes or not, but that’s error-prone and time-consuming. I’m hoping to find some kind of comprehensive list of all the possible situations and whether or not the API handles each one, e.g.:

API validates/sanitizes

  • Saving post meta with update_post_meta()
  • Saving user meta with update_user_meta()
  • Outputting a post title – use the contextually appropriate variant of the_title()
  • etc

You have to manually validate/sanitize

  • Saving plugin options with the Settings API. Pass a callback as the 3rd parameter of register_setting().
  • Direct database queries: Wrap the query in $wpdb->prepare().
  • Outputting variables in HTML. Use esc_attr(), esc_html(), etc
  • etc

I’d also be interested to understand why the API provides it in certain situations, but not others. I’m assuming it has something to do with the unknown nature of the data, but would love to hear a thorough explanation.

2 comments

  1. There are two concepts here:

    • validation – making sure data is valid, i.e. an integer is an integer, a date is a date (in the right format etc). This should be done just before saving the data.
    • sanitisation – making the data safe for use in the current context (e.g. escaping SQL queries, or escaping HTML on output).

    Validation is, almost universally, solely down to you. You know what data you are asking from a user, and you know what data you are expecting – WordPress doesn’t. Validation would be performed, for example, on the save_post hook before saving it to the database with update_post_meta, or it might be done through specifying a callback function in the Settings API, called just before WordPress saves the data.
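
    As a rough sketch of the Settings API route described above, the callback could look something like this. The option name, settings group, and the 1 to 100 range are hypothetical, chosen only for illustration; the array form of the third argument with a 'sanitize_callback' key assumes WordPress 4.7 or later (older versions take the callback itself as the third argument).

    ```php
    <?php
    // Hypothetical option and group names, for illustration only.
    add_action( 'admin_init', function () {
        register_setting(
            'myplugin_options_group', // hypothetical settings group
            'myplugin_max_items',     // hypothetical option name
            array(
                'type'              => 'integer',
                'sanitize_callback' => 'myplugin_validate_max_items',
                'default'           => 10,
            )
        );
    } );

    /**
     * Runs just before WordPress saves the option.
     * Keeps the previously stored value when the input is out of range.
     */
    function myplugin_validate_max_items( $input ) {
        $value = absint( $input ); // coerce to a non-negative integer

        if ( $value < 1 || $value > 100 ) {
            // Report the problem on the settings page.
            add_settings_error(
                'myplugin_max_items',
                'myplugin_max_items_range',
                'Max items must be between 1 and 100.'
            );
            return get_option( 'myplugin_max_items', 10 );
        }

        return $value;
    }
    ```

    The same kind of check would sit in a save_post handler, just before the call to update_post_meta().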

    Sanitisation is a bit more mixed. When dealing with data that WordPress natively knows about (e.g. a post’s title), you can be sure that WordPress has already made the data safe. However, ‘safe’ depends on context: what is safe for use in the body of a page is not necessarily safe inside an element attribute. Hence WordPress has different functions for different contexts (e.g. the_title(), the_title_rss(), the_title_attribute()), so you need to use the right one.

    For the most part, your plug-in will deal with data such as post meta, or maybe event data from a custom table. WordPress doesn’t know what this data is or what it is for, so it certainly doesn’t know how to make it safe. This is up to you. In particular, use esc_url(), esc_attr(), esc_textarea() etc. to prevent malicious input from embedding code in the page. Since WordPress knows that next_posts() is supposed to print a URL to the page, it applies esc_url(). With post meta, say, it doesn’t know that the value is a URL, or what you want to do with it (if printing, use esc_url(); if redirecting, use esc_url_raw()). If in doubt, err on the side of caution and escape it yourself, and do this as late as possible.
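
    To make the "escape late, per context" advice concrete, here is a minimal sketch of printing post meta in a template. The meta keys are hypothetical, and the snippet assumes it runs inside The Loop so that get_the_ID() returns the current post.

    ```php
    <?php
    // Hypothetical meta keys; WordPress has no idea what these values contain.
    $website = get_post_meta( get_the_ID(), '_myplugin_website', true );
    $label   = get_post_meta( get_the_ID(), '_myplugin_label', true );

    if ( $website ) {
        // Escape at the point of output, once per context.
        printf(
            '<a href="%1$s" title="%2$s">%3$s</a>',
            esc_url( $website ), // URL context
            esc_attr( $label ),  // attribute context
            esc_html( $label )   // element text context
        );
    }
    ```

    If the same value were used for a redirect instead of being printed, it would be passed through esc_url_raw() rather than esc_url(), e.g. wp_redirect( esc_url_raw( $website ) ); followed by exit;.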

    Finally, what about saving data? Do you need to make it safe then? As mentioned, you do need to make sure the data is valid. But if you are using the WordPress API (wp_insert_post(), update_post_meta() etc.) then you don’t need to sanitise the data: when saving, the only sanitising required is escaping the SQL statements, and WordPress does this for you. If you are running direct SQL statements (say, to read/write data from a custom table) then you should use the $wpdb class to help you sanitise your queries.
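
    For the custom-table case, a sketch along these lines shows the two usual $wpdb patterns. The table name, columns, and sample values are made up for the example; only the $wpdb methods themselves are part of WordPress.

    ```php
    <?php
    global $wpdb;

    // Hypothetical custom table and sample input values.
    $table    = $wpdb->prefix . 'myplugin_events';
    $venue_id = 42;
    $status   = 'published';
    $title    = "O'Reilly meetup";

    // Read: prepare() escapes and quotes each placeholder value (%d, %s).
    $events = $wpdb->get_results(
        $wpdb->prepare(
            "SELECT id, title FROM {$table} WHERE venue_id = %d AND status = %s",
            $venue_id,
            $status
        )
    );

    // Write: insert() expects raw, unescaped values plus one format per column.
    $wpdb->insert(
        $table,
        array(
            'title'    => $title,
            'venue_id' => $venue_id,
        ),
        array( '%s', '%d' )
    );
    ```

    Note that prepare() only protects the placeholder values; the table name is interpolated directly, so it should never come from user input.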

    I wrote this blog post on data sanitisation and validation which you might find helpful – in it I talk about what’s expected of you in this respect.

  2. Not sure it’s that thorough, but with any plugin or theme, user input should be sanitized. Database operations should be done by using the $wpdb-> methods. All $_GET and $_POST data should be sanitized.

    This is more of a general PHP best practice than anything specific to WordPress.

    So, in conclusion: if there is a WordPress function, use it; if not, sanitize your variables and inputs yourself.

    If I was too vague, please ask a more specific question.