Robots.txt is a set of instructions that tells search engine robots which paths or content on your blog or site they may or may not crawl. It works like a filter. Every Blogspot site has a default robots.txt like the one below:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://blogname.blogspot.com/feeds/posts/default?orderby=UPDATED

If you want to see yours, go to your browser address bar and type http://blogname.blogspot.com/robots.txt (replacing blogname with your own blog's name).
 


What Do These Lines Mean?

User-agent: Mediapartners-Google
  • The Google AdSense robot. If you use AdSense on your blog, this robot crawls your content so the right ads are displayed on your pages.
Disallow:
  • This directive tells a robot which pages, posts, or categories not to visit. Since there is no path after it here, the AdSense robot may crawl all of your content.
User-agent: *
  • This rule applies to all search engine robots.
Disallow: /search
  • Robots may not crawl anything under the /search folder, such as /search/label and /search?updated-max pages. Why? Because a label page is not a real content URL; Google wants visitors to find topics through the search box rather than through label or category pages, and blocking them also avoids duplicate content (see the example after this list).
Allow: /
  • All other pages may be crawled, except the paths disallowed above.
Sitemap: http://blogname.blogspot.com/feeds/posts/default?orderby=UPDATED
  • Your blog's feed address, which search engines use as a sitemap.
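For instance, here is a minimal sketch of a custom robots.txt that keeps the default rules but blocks one extra page. The /p/private-page.html path is purely a hypothetical example, so substitute a real path from your own blog:

User-agent: *
# Keep the default rule that hides Blogger's search/label pages
Disallow: /search
# Hypothetical extra rule: block a single static page from all robots
Disallow: /p/private-page.html
Allow: /

Sitemap: http://blogname.blogspot.com/feeds/posts/default?orderby=UPDATED

Any robot matching * will now skip both /search and the blocked page while still crawling everything else.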


How To Add A Custom Robots.txt File In Blogger:

  • Open the Sitemap Generator (by Labnol) and type the full address of your blog.
  • Click on the Create Sitemap button and this tool will generate the required text for your sitemap. Copy the entire text (a sketch of typical output appears after this list).

  • Now go to your Blogger Dashboard > Settings > Search Preferences, enable Custom Robots.txt, and paste the text you copied in the above step.
  • Search engines will now automatically discover your XML sitemap files via the robots.txt file, and you won't have to ping them manually.
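After pasting, the custom robots.txt will look something like the sketch below. This is only an assumption based on the default feed address shown earlier; the generator's exact output (for example, the feed URL and paging parameters) may differ:

User-agent: *
Disallow: /search
Allow: /

Sitemap: http://blogname.blogspot.com/feeds/posts/default?orderby=UPDATED

Once saved, you can confirm the change took effect by visiting http://blogname.blogspot.com/robots.txt again.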
