Apache RewriteRule: Is the pattern “.” a special case?

I have seen numerous RewriteRule examples along the lines of the following:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

The intention here is that URIs that are not backed by a real file will be handled by a CMS, hooked into via /index.php. WordPress, for example, uses this technique in its standard(?) .htaccess file.
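
As a rough illustration only, the three directives can be modeled in Python. This sketch assumes an .htaccess (per-directory) context, where the leading slash is stripped before the pattern is tested, and it approximates REQUEST_FILENAME as the document root joined with the path. The names `route` and `demo` are hypothetical and not part of Apache.

```python
import os
import re
import tempfile


def route(doc_root: str, url_path: str) -> str:
    """Return the path served under this simplified front-controller model."""
    # Assumption: per-directory context, so the leading slash is stripped.
    subject = url_path.lstrip("/")
    # RewriteRule pattern "." is tested first; it needs at least one character.
    if re.search(r".", subject) is None:
        return url_path
    # RewriteCond %{REQUEST_FILENAME} !-f and !-d: skip real files and directories.
    filename = os.path.join(doc_root, subject)
    if os.path.isfile(filename) or os.path.isdir(filename):
        return url_path
    # Everything else is handed to the CMS entry point.
    return "/index.php"


def demo() -> dict:
    """Build a throwaway document root and route a few sample paths."""
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, "style.css"), "w").close()
        os.mkdir(os.path.join(root, "images"))
        paths = ("/style.css", "/images", "/blog/hello", "/")
        return {p: route(root, p) for p in paths}
```

In this model, only `/blog/hello` is sent to `/index.php`; the existing file, the existing directory, and the empty subject produced by `/` are left alone.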

However, the regexp doesn’t make any sense. Without getting into a discussion of what subject-string is being matched against (but see “What is matched?” in the official Apache RewriteRule docs), the regexp . should only match the very first character! However, it behaves identically to .*, that is, matching the entire subject-string. Setting RewriteLogLevel 5, I get the same results and log entries for . and .*.

I can only conclude that a RewriteRule with the pattern . is special-cased and matches the whole string (a la .*), rather than processed via the RegExp engine as-is. However, I haven’t seen this documented anywhere. Is the documentation deficient, or am I just totally missing something?

Thanks!

2 comments

  1. Oh, duh… I hate it when I realize I’m just being lame.

    I’m approaching mod_rewrite from the mindset of a programmer. In every regexp library I know of, substitution replaces only the matched part of the subject string. For example:

    $ perl
    $hello = "hello";
    $hello =~ s/lo/p/;
    print "$hello\n";
    ^D
    

    Displays

    help
    

    In mod_rewrite, however, we aren’t editing the string in place. If the pattern matches the URI at all, the entire URI is replaced with the Substitution. Therefore, a mere match (“.”) is enough to replace the whole string with /index.php.

    The Apache docs sort of state this, but not so forcefully as to correct someone who’s in the wrong headspace:

    The Substitution of a rewrite rule is
    the string that replaces the original
    URL-path that was matched by Pattern.

  2. By default, regular expressions match the input string anywhere.
    Your assumption that . should only match the very first character is false: it will match any character, the first one, the last one, the 24th char, it doesn’t matter: any non-empty string will match. .* has a subtle difference: it will match any string, even zero length ones.

    To get the behavior you expected, anchor the pattern with ^ (match at the beginning of the string) and $ (match at the end).

    Summary:

    . will match any string having at least one character
    .* will match any string, even empty
    ^.$ will match only a one character long string
    ^.*$ will match any string in single-line mode, or any line in multi-line mode
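
The points made in both comments can be checked with a small Python sketch. It is only an approximation: Python's `re` module is not the regex engine Apache uses, although these four simple patterns behave the same way in both. The helper names `rewrite`, `substitute`, and `match_table` are hypothetical, and `rewrite` is a deliberately simplified model of a RewriteRule, not Apache's implementation.

```python
import re

SAMPLES = ["", "a", "blog/post"]
PATTERNS = [".", ".*", "^.$", "^.*$"]


def match_table() -> dict:
    """For each pattern, record whether an unanchored search finds a match."""
    return {p: [bool(re.search(p, s)) for s in SAMPLES] for p in PATTERNS}


def substitute(pattern: str, replacement: str, text: str) -> str:
    """Ordinary s/// behaviour: only the first matched part is replaced."""
    return re.sub(pattern, replacement, text, count=1)


def rewrite(pattern: str, substitution: str, path: str) -> str:
    """Simplified RewriteRule model: any match replaces the whole path."""
    if re.search(pattern, path):
        return substitution
    return path


if __name__ == "__main__":
    for pattern, results in match_table().items():
        print(pattern, results)
    print(substitute("lo", "p", "hello"))            # help
    print(rewrite(".", "/index.php", "blog/post"))   # /index.php
```

The only row where `.` and `.*` differ is the empty string, which is why the two patterns look identical in the rewrite log for any real request path.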