What is WordPress file encoding?

I noticed that WordPress files use UNIX end-of-line and UTF-8 without BOM. Is this a standard that should be used? Is there anything else to pay attention to?

I noticed that FileZilla fails to transfer files with Macintosh line endings properly to Windows or Linux servers: the line endings go missing, and with a different configuration they are sometimes doubled. I’d like to avoid that if possible.
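
One way to sidestep conversion problems during transfer is to normalize every file to Unix line endings before uploading. The following is a minimal sketch in Python, not a description of what FileZilla or WordPress does; the function names `normalize_to_lf` and `normalize_file` are made up for illustration.

```python
def normalize_to_lf(data: bytes) -> bytes:
    """Convert CRLF (DOS) and lone CR (classic Mac) line endings to LF (Unix)."""
    # Replace CRLF first, so that a CRLF pair does not turn into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def normalize_file(path: str) -> bool:
    """Rewrite the file at `path` with LF endings; return True if it changed."""
    # Work in binary mode so Python itself does not translate line endings.
    with open(path, "rb") as handle:
        original = handle.read()
    converted = normalize_to_lf(original)
    if converted != original:
        with open(path, "wb") as handle:
            handle.write(converted)
    return converted != original
```

Operating on raw bytes leaves the file's character encoding untouched, since the CR and LF bytes never occur inside a multi-byte UTF-8 sequence.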

2 comments

  1. UNIX line endings (\n) and UTF-8 are just common coding conventions. As far as I can see, they are not even mentioned in the Coding Standards. Most (all?) core PHP files are just plain US-ASCII. Try to follow that path to keep your files as compatible as possible.

    If you use UTF-8, add a line at the start of a PHP file with an Encoding Cookie or file variable:

    <?php # -*- coding: utf-8 -*-
    

    Many editors (Emacs, Vim, Sublime, SciTE) will use that to set the proper encoding automatically.

    I have also seen:

    <?php # -*- coding: utf-8-unix -*-
    

    But I am not sure how compatible it is.

  2. PHP itself does not care whether you use DOS or UNIX style line endings in the PHP code sections: between tokens a line ending is just whitespace, so either style is ignored. (Inside string literals and heredocs the line endings are kept as written.)

    PHP interpreters do not know about Byte Order Marks, so a BOM at the start of a PHP file sits before the opening <?php tag and is sent as output when the file is executed. This is usually undesirable (it can, for example, trigger “headers already sent” warnings), so PHP files in general should not contain a BOM.

    PHP files don’t care about encoding. In most cases they simply produce output as given, so anything outside the PHP sections can be more or less anything, including straight binary output for all PHP cares. For maximum compatibility and portability, UTF-8 without a BOM is recommended.

    WordPress uses Unix style line endings by preference; however, submissions to the plugin repository are not checked for one style or the other, so use whichever you like. Submissions to the theme repository are checked for mixed line endings, meaning a file that contains both Unix and DOS style line endings. This often happens when copying and pasting code in certain Mac editors. Mixed line endings can cause problems in Subversion repositories, so the theme uploader rejects such files. Plugin authors, who have direct access to their SVN repositories, will get similar failures from SVN when committing mixed-up files themselves.
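
The two pitfalls described in this answer, a leading BOM and mixed line endings, can be checked locally before uploading. What follows is only a rough sketch in Python under my own assumptions; it is not the check the theme uploader or Subversion actually runs, and the names `has_utf8_bom`, `line_ending_styles` and `check` are hypothetical.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # the byte sequence of a UTF-8 Byte Order Mark


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with a UTF-8 BOM."""
    return data.startswith(UTF8_BOM)


def line_ending_styles(data: bytes) -> set:
    """Return the set of line-ending styles found: 'CRLF', 'LF', 'CR'."""
    styles = set()
    crlf = data.count(b"\r\n")
    if crlf:
        styles.add("CRLF")
    # Any LF or CR beyond those that belong to a CRLF pair stands alone.
    if data.count(b"\n") > crlf:
        styles.add("LF")
    if data.count(b"\r") > crlf:
        styles.add("CR")
    return styles


def check(data: bytes) -> list:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    if has_utf8_bom(data):
        problems.append("starts with a UTF-8 BOM")
    styles = line_ending_styles(data)
    if len(styles) > 1:
        problems.append("mixed line endings: " + ", ".join(sorted(styles)))
    return problems
```

For example, `check(open("functions.php", "rb").read())` (the file name is just a placeholder) returns an empty list for a file that uses a single line-ending style and has no BOM.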