I have a domain like domaindev.com. I have set the robots.txt file on this domain to block everything:
User-agent: *
Disallow: /
Just like that. So that blocks everything from being crawled.
Now here is where it gets interesting. We also have several subdomains of domaindev.com hosted on our server, and I want to block all of those subdomains from being crawled as well. I want a simple way to block any existing subdomain and any new subdomain we add. Is there a line I can add to the robots.txt on www.domaindev.com that will prevent crawling of any subdomain under domaindev.com?
Or is the best way to make a default robots.txt file and just drop it into each subdomain's folder manually?
I’d really like a definitive solution so I don’t have to keep doing this for every new subdomain.
We use WordPress, and in wp-admin we have set it to discourage search engines from crawling our sites. But somehow these sites are still finding their way into Google.
How do I go about blocking all of them?
I’ve searched the site and found this line to add to my .htaccess file:
Header set X-Robots-Tag "noindex, nofollow"
I’m going to do that as well.
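For reference, here is a minimal sketch of how that header might be placed in a .htaccess file at the root of each site, assuming Apache with mod_headers enabled; the <IfModule> wrapper is only a safeguard in case the module is missing:

    # .htaccess (sketch): ask crawlers not to index or follow anything on this host.
    # Assumes Apache's mod_headers module is enabled.
    <IfModule mod_headers.c>
        Header set X-Robots-Tag "noindex, nofollow"
    </IfModule>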
I also saw that it is part of the standard that each subdomain would need its own robots.txt file.
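For example, crawlers request robots.txt separately for each hostname, so each of these URLs would have to serve its own copy (sub1 and sub2 are just placeholder names):

    http://www.domaindev.com/robots.txt
    http://sub1.domaindev.com/robots.txt
    http://sub2.domaindev.com/robots.txt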
Put an Alias directive in your httpd.conf file, outside of any VirtualHost section, to catch any robots.txt requests.
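Here is a minimal sketch of that approach, assuming Apache 2.4 and that the shared blocking file lives at /var/www/shared/robots.txt (the path is just a placeholder):

    # httpd.conf (sketch): serve one shared, disallow-all robots.txt for every host.
    # Defined outside any <VirtualHost> block so it is inherited by all sites on the server.
    # The target file would contain the same two lines as above:
    #     User-agent: *
    #     Disallow: /
    Alias /robots.txt /var/www/shared/robots.txt

    <Directory "/var/www/shared">
        Require all granted
    </Directory>

With this in place, a request for robots.txt on any existing or future subdomain is answered by the same disallow-all file, without dropping a copy into each subdomain folder.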