I’ve been researching the best ways to tell search engine robots exactly which parts of your WordPress site they should index and which parts to avoid. Robots.txt is one obvious way to go about this, but there’s no shortage of conflicting opinions about what to block. The goal here is mainly to avoid duplicate content in Google, which can toss pages into the Supplemental Index. But I’ve seen Yahoo index some pretty bone-headed pages before, so there are multiple reasons to do something with your robots.txt file.
Then I found a really interesting little chunk of code you can stick in your header.php file. It tells the bots to follow links in every case, but only index single posts, static pages and the home page:
<?php // Index single posts, static pages and the home page; noindex everything else. Links are followed either way. ?>
<?php if ( is_single() || is_page() || is_home() ) { ?>
<meta name="googlebot" content="index,noarchive,follow,noodp" />
<meta name="robots" content="index,follow" />
<meta name="msnbot" content="index,follow" />
<?php } else { ?>
<meta name="googlebot" content="noindex,noarchive,follow,noodp" />
<meta name="robots" content="noindex,follow" />
<meta name="msnbot" content="noindex,follow" />
<?php } ?>
At this point I think I’m just going to leave that for a bit and see what happens. I’m not wild about messing with robots.txt – I’m not confident I know what I’m doing with it. I do have my wp-admin folder blocked, but that’s it. I can’t figure out why anyone would want pages from wp-content indexed either, but some people recommend leaving them in so… yeah, I’m just going to see what this code does.
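For reference, a robots.txt that blocks only the wp-admin folder comes down to two lines. This is a minimal sketch, assuming WordPress is installed at the site root; if yours lives in a subdirectory, the path would need that prefix.

```
# Applies to every crawler that honors robots.txt
User-agent: *
# Assumed path: WordPress installed at the site root
Disallow: /wp-admin/
```

The file goes in the top-level directory of the site, so it is served at /robots.txt.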