If you go to the WordPress admin and then Settings -> Privacy, there are two options asking whether you want to allow your blog to be searched through by search engines, one of which is this option:
I would like to block search engines,
but allow normal visitors
How does WordPress actually block search bots/crawlers from searching through the site when it is live?
According to the Codex, it's just robots meta tags, robots.txt and suppression of pingbacks:

These are "guidelines" that all friendly bots will follow. A malicious spider searching for e-mail addresses or forms to spam into will not be affected by these settings.
With a robots.txt (if WordPress is installed at the site root),
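presumably containing just:

```
# Ask every crawler to stay away from the entire site
User-agent: *
Disallow: /
```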
or with the robots meta tag (from here),
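which WordPress prints in the page head as something along the lines of:

```html
<meta name='robots' content='noindex,nofollow' />
```

(the exact content value has varied between WordPress versions, but the intent is the same: ask crawlers not to index the page or follow its links).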
You can’t actually block bots and crawlers from searching through a publicly available site; if a person with a browser can see it, then a bot or crawler can see it (caveat below).
However, there is something called the Robots Exclusion Standard (or robots.txt standard), which allows you to indicate to well-behaved bots and crawlers that they shouldn't index your site. This site, as well as Wikipedia, provides more information.
The caveat to the comment above (that a bot can see whatever you can see in your browser) is this: most simple bots do not include a JavaScript engine, so anything the browser renders as a result of JavaScript code will probably not be seen by a bot. I would suggest that you don't use this as a way to avoid indexing, since the robots.txt standard works regardless of whether JavaScript is present to render your page.
One last comment: bots are free to ignore this standard. Those bots are badly behaved. The bottom line is that anything that can read your HTML can do what it likes with it.
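To make "well behaved" concrete: a friendly crawler downloads /robots.txt first and skips whatever the rules disallow. A deliberately naive PHP sketch of that check (is_disallowed is a made-up helper for illustration; real crawlers use a proper parser, and this one ignores Allow rules, wildcards and per-bot groups):

```php
<?php
// Naive illustration of how a "friendly" crawler honours robots.txt:
// fetch it once, then skip any path covered by a Disallow rule in the
// "User-agent: *" group.
function is_disallowed( string $siteUrl, string $path ): bool {
    $rules = @file_get_contents( rtrim( $siteUrl, '/' ) . '/robots.txt' );
    if ( false === $rules ) {
        return false; // no robots.txt: nothing is disallowed
    }

    $inWildcardGroup = false;
    foreach ( preg_split( '/\R/', $rules ) as $line ) {
        $line = trim( preg_replace( '/#.*$/', '', $line ) ); // strip comments
        if ( 0 === stripos( $line, 'User-agent:' ) ) {
            $inWildcardGroup = ( '*' === trim( substr( $line, 11 ) ) );
        } elseif ( $inWildcardGroup && 0 === stripos( $line, 'Disallow:' ) ) {
            $prefix = trim( substr( $line, 9 ) );
            if ( '' !== $prefix && 0 === strpos( $path, $prefix ) ) {
                return true; // a rule covers this path
            }
        }
    }
    return false;
}

// Example: a polite crawler would skip /wp-admin/ on most WordPress sites.
var_dump( is_disallowed( 'https://example.com', '/wp-admin/' ) );
```

A badly behaved bot simply never runs a check like this, which is why robots.txt is a request, not a lock.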
I don't know for sure, but it probably generates a robots.txt file which specifies rules for search engines.
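For what it's worth, when there is no physical robots.txt on disk, WordPress answers requests for /robots.txt dynamically based on the same privacy setting (stored as the blog_public option). A condensed sketch of that kind of handler (sketch_do_robots is a made-up name; do_robots(), blog_public and the robots_txt filter are the real WordPress pieces, but the body here is simplified from memory rather than copied from core):

```php
<?php
// Condensed sketch of the kind of handler WordPress uses when /robots.txt
// is requested and no physical file exists on disk. Not the exact source.
function sketch_do_robots() {
    header( 'Content-Type: text/plain; charset=utf-8' );

    $public = get_option( 'blog_public' ); // the Settings -> Privacy choice
    $output = "User-agent: *\n";

    if ( '0' == $public ) {
        // "I would like to block search engines, but allow normal visitors"
        $output .= "Disallow: /\n";
    } else {
        // Site is public: only ask crawlers to skip the admin area.
        $output .= "Disallow: /wp-admin/\n";
    }

    // Plugins and themes can still change the rules via this filter.
    echo apply_filters( 'robots_txt', $output, $public );
}
```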
Using a Robots Exclusion file.
Example:
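Something along these lines (an illustration of the format, not a file WordPress writes for you):

```
# Ask every well-behaved crawler to skip the listed paths
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login.php

# Ask one specific crawler to stay away from the whole site
User-agent: BadBot
Disallow: /
```

A Disallow: / rule under User-agent: * is effectively what the "block search engines" setting amounts to: every cooperating crawler is asked to skip every URL on the site.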