Redirect loop on attempted web proxy

I have an Apache server running WordPress in the web root (/var/www/html). In my access_log I have been seeing many entries of the form:

98.209.16.114 - - [15/Feb/2013:21:19:51 -0500] "GET http://www.twitter.com HTTP/1.1" 301 - "-" "curl/7.28.1"
98.209.16.114 - - [15/Feb/2013:21:19:51 -0500] "GET http://www.twitter.comhttp/www.twitter.com HTTP/1.1" 301 - "-" "curl/7.28.1"
98.209.16.114 - - [15/Feb/2013:21:19:51 -0500] "GET http://www.twitter.comhttphttp/www.twitter.comhttp/www.twitter.com HTTP/1.1" 301 - "-" "curl/7.28.1"
98.209.16.114 - - [15/Feb/2013:21:19:52 -0500] "GET http://www.twitter.comhttphttphttp/www.twitter.comhttphttp/www.twitter.comhttp/www.twitter.com HTTP/1.1" 301 - "-" "curl/7.28.1"

where www.twitter.com can be replaced with any number of odd domains external to my own.


EDIT: the copied lines include curl because I was testing this phenomenon from my own command line.

The pertinent lines in my httpd.conf file are:

NameVirtualHost *:80

<VirtualHost *:80>
  DocumentRoot /var/www/html
  ServerName www.mydomain.com
  <Directory /var/www/html>
    AllowOverride All
  </Directory>
</VirtualHost>

<VirtualHost *:80>
  ServerName mydomain.com
  RewriteEngine On
  RewriteRule ^/(.*) http://www.mydomain.com/$1 [L,R=301]
</VirtualHost>

and the .htaccess file in the WordPress directory looks like:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>

This happens tens of times per day. A few questions I have are:

  • Should I be worried about so many proxy attempts? I’ve turned off mod_proxy, and while HTTP requests loop, HTTPS requests seem to return either 301 or 400.
  • Is there any sort of performance loss I should be worried about?
  • How do I fix the darn thing?

Let me know what other information you need.

1 comment

  1. The only modules Apache needs loaded for WordPress are the following. This list excludes the PHP module, which I explain below:

    LoadModule authz_host_module modules/mod_authz_host.so
    LoadModule log_config_module modules/mod_log_config.so
    LoadModule expires_module modules/mod_expires.so
    LoadModule deflate_module modules/mod_deflate.so
    LoadModule env_module modules/mod_env.so
    LoadModule setenvif_module modules/mod_setenvif.so
    LoadModule mime_module modules/mod_mime.so
    LoadModule autoindex_module modules/mod_autoindex.so
    LoadModule dir_module modules/mod_dir.so
    LoadModule alias_module modules/mod_alias.so
    LoadModule rewrite_module modules/mod_rewrite.so
    LoadModule negotiation_module modules/mod_negotiation.so
    LoadModule headers_module modules/mod_headers.so
    

    Depending on whether you run PHP through FastCGI (PHP-FPM) or mod_php (prefork), include one of the following:

    #---- For PHP-FPM, enable both lines below and disable php5_module
    LoadModule fastcgi_module modules/mod_fastcgi.so
    LoadModule actions_module modules/mod_actions.so
    #---- For the standard php5 module, enable the line below and disable the two above
    #LoadModule php5_module  modules/libphp5.so
    

    The redirects in your log output look more like a badly configured mod_rewrite rule, or a 301 redirect plugin in WordPress that conflicts with the rules in Apache.

    As a general rule of thumb, when we host name-based virtual hosts (NameVirtualHost), we create a default vhost with
    ServerName default

    and set its document root to a directory that contains a single HTML file, plus a rewrite rule that sends all traffic to index.html (the file just prints "hosted by"). Apache hands any request whose Host does not match another vhost to the first vhost it loaded, so this default vhost has to come first. That way you filter out all of the garbage traffic that isn’t addressed to your domain, and because the page is static HTML, serving it never invokes PHP.
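
    The default vhost described above could be sketched roughly as follows. This is an untested sketch, not a drop-in config: the /var/www/default path is a hypothetical placeholder, and the Order/Allow lines assume the same Apache 2.2 syntax used elsewhere in this thread (Apache 2.4 would use "Require all granted" instead).

    # Must be the first <VirtualHost> Apache loads for *:80 so it acts as the catch-all.
    <VirtualHost *:80>
            ServerName default
            # Hypothetical directory holding only a static index.html
            DocumentRoot /var/www/default
            <Directory /var/www/default>
                    AllowOverride None
                    Order allow,deny
                    Allow from all
            </Directory>
            RewriteEngine On
            # Send every request to the static page; no PHP is involved
            RewriteRule ^ /index.html [L]
    </VirtualHost>

    The real site's vhosts (www.mydomain.com and mydomain.com) would then be listed after this one.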

    That way your vhost will only handle requests for the domain that it hosts. That still doesn’t fix the 301s that are happening. If you can share your mod_rewrite configuration, perhaps there is something in there that we can help with.

    The biggest problem I’ve seen in the community with SSL traffic is when developers have to integrate with load balancers where SSL is terminated on the load balancer instead of on Apache. In that setup you can no longer use a rewrite condition and rule that checks whether HTTPS is off and issues a 301 to the secure location. The load balancer terminates SSL and sends plain HTTP (port 80) traffic to your web host, so Apache always thinks the request is unencrypted, even though the client reached a secured endpoint.
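
    One common way around this is to test a header set by the load balancer rather than %{HTTPS}. A minimal sketch, assuming your load balancer adds an X-Forwarded-Proto header (many do, but verify yours) and using www.example.com as a placeholder host name:

    RewriteEngine On
    # Assumption: the load balancer sets "X-Forwarded-Proto: https" on requests that arrived over SSL.
    # Redirect only when that header is missing or has another value.
    RewriteCond %{HTTP:X-Forwarded-Proto} !=https
    RewriteRule ^ https://www.example.com%{REQUEST_URI} [L,R=301]

    If the load balancer does not send this header, the condition always matches and the redirect itself will loop, so confirm the header is present before enabling the rule.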

    One additional note on your vhost: it’s better not to use .htaccess files unless you ABSOLUTELY have to. For WordPress, a vhost with a Directory section should look something like this for optimal performance.

    <VirtualHost *:80>
            ServerName www.example.com
            DocumentRoot /path/to/doc_root
            <Directory /path/to/doc_root/>
                    AllowOverride None
                    Options +SymLinksIfOwnerMatch +MultiViews -Indexes
                    Order allow,deny
                    Allow from all
                    RewriteEngine On
                    RewriteRule ^index\.php$ - [L]
                    RewriteCond %{REQUEST_FILENAME} !-f
                    RewriteCond %{REQUEST_FILENAME} !-d
                    RewriteRule ^.*$ index.php [L]
            </Directory>
    </VirtualHost>
    

    I say optimal performance because, with AllowOverride None, Apache no longer looks for and reads a .htaccess file on every single request. Instead it reads the rules once as part of the vhost config and keeps them in memory, so you save a large amount of IOPS. You also gain some security by not relying on .htaccess files and overrides.

    Finally, RewriteBase is unnecessary here and sometimes just causes problems, unless you host WordPress under a specific alias like /blog/. Based on your .htaccess file, your site is on the base domain, so there is no need for that directive. For reference, see the rewrite rules I have in the vhost config above.
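
    For contrast, here is roughly what the rules look like when RewriteBase does matter. This sketch assumes WordPress is installed under a hypothetical /blog/ alias and that the rules live in a .htaccess file inside that directory:

    <IfModule mod_rewrite.c>
    RewriteEngine On
    # Assumption: WordPress is served from /blog/ rather than the site root
    RewriteBase /blog/
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /blog/index.php [L]
    </IfModule>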