Block Low-Quality Pages from Search Engine Crawling Using Robots.txt

This is a small SEO quality tip that probably won’t have much direct impact on any site’s ranking, but it improves the overall quality score of the site in the eyes of search engines.

The tip is to block all low-quality and unwanted pages from search engines using robots.txt. For example, I had a lot of low-quality pages on MWolk, mostly from free proxy submissions, which contained just a link and very little content, and these pages made up 50% of the total number of my pages indexed in Google. Google often deindexes such pages, and there was a risk that their deindexing would eventually affect the higher-quality pages of the site. On top of that, I wasn’t getting any search engine traffic through these pages anyway, so I blocked all of them through robots.txt.

If you don’t know how to use robots.txt, check whether your site’s root folder has a file named robots.txt. If it doesn’t, create one.

Here is a sample robots.txt file:
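The check-and-create step can be scripted. The following is only a minimal sketch in Python: the function name `ensure_robots_txt` and the default rules are my own assumptions for illustration, and the default content (an empty Disallow) simply allows all bots to crawl everything until you add your own rules.

```python
from pathlib import Path

# Assumed starting content: an empty Disallow allows every bot to crawl everything.
DEFAULT_RULES = "User-agent: *\nDisallow:\n"


def ensure_robots_txt(site_root: Path, content: str = DEFAULT_RULES) -> bool:
    """Create robots.txt in site_root if it is missing.

    Returns True if a new file was written, False if one already existed.
    An existing file is never overwritten.
    """
    robots_path = site_root / "robots.txt"
    if robots_path.exists():
        return False
    robots_path.write_text(content, encoding="utf-8")
    return True
```

Calling `ensure_robots_txt(Path("/path/to/your/site/root"))` with your own web root creates the file only when it is missing, so it is safe to run more than once.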

User-agent: spambot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /proxies/
Disallow: /proxy_sites/
Disallow: /example.php

Here I am allowing all bots to crawl my site, except for a bot named spambot, which is blocked from the whole site. If you are not sure what a bot’s user-agent name is, you are better off not blocking any bot, since you could accidentally block even the Google, Yahoo and MSN bots from crawling your site.

I have set a crawl delay of 10, which asks bots to wait about 10 seconds between requests and is on the slow side. You can lower it to 5 or 1 to get your site crawled faster. Many bots ignore the Crawl-delay directive altogether.
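To get a feel for what a crawl delay means in practice, here is a rough sketch using Python's standard `urllib.robotparser` module. It assumes the bot honors the directive and treats the value as seconds between requests, which not every bot does. The helper name `max_fetches_per_day` and the bot name `AnyBot` are made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

SECONDS_PER_DAY = 24 * 60 * 60

# Trimmed-down rules, just enough to carry the Crawl-delay line.
SAMPLE = """\
User-agent: *
Crawl-delay: 10
Disallow: /proxies/
"""


def max_fetches_per_day(robots_txt, user_agent):
    """Upper bound on daily requests for a bot that obeys Crawl-delay.

    Returns None when no crawl delay applies to the given bot.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network access
    delay = parser.crawl_delay(user_agent)
    if not delay:
        return None
    return SECONDS_PER_DAY // delay


print(max_fetches_per_day(SAMPLE, "AnyBot"))  # 86400 // 10 = 8640
```

With a delay of 10, an obedient bot can fetch at most 8640 pages a day. Lowering the delay to 1 raises that ceiling to 86400.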

Then I have blocked two paths, /proxies/ and /proxy_sites/, and one file, /example.php, so that search engines don’t crawl and index their content.
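Before uploading rules like these, it can help to test them locally. The sketch below feeds a copy of the sample rules (one group per bot, groups separated by a blank line) to Python's standard `urllib.robotparser` module and lists which paths each bot is refused. The helper names, the test paths, and the bot name `SomeSearchBot` are hypothetical. Real search engines use their own parsers, so treat this as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Copy of the sample rules: one group per bot, groups separated by a blank line.
SAMPLE_ROBOTS_TXT = """\
User-agent: spambot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /proxies/
Disallow: /proxy_sites/
Disallow: /example.php
"""


def build_parser(robots_txt):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def blocked_paths(parser, user_agent, paths):
    """Return the subset of paths that the given bot may not fetch."""
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS_TXT)
    test_paths = ["/", "/proxies/list.html", "/proxy_sites/a", "/example.php", "/blog/post.html"]
    for bot in ("spambot", "SomeSearchBot"):
        print(bot, "is blocked from:", blocked_paths(rules, bot, test_paths))
```

In this check, spambot is refused every path, while any other bot is refused only the two proxy folders and /example.php.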
