Today we are going to look at an essential concept in SEO: Robots.txt. We will cover the importance of Robots.txt, its uses, and the sites that offer to generate a Robots file for you. Usually the search engines visit our web site, crawl our pages and index our content. How do we communicate with the search engines? How do we tell a search engine not to index a particular page or piece of content? Both can be done with the Robots file. First of all we will see a
diagrammatic explanation of Robots.txt.

What is Robots.txt?
Webconfs defines Robots.txt as follows: “is a text (not html) file you put on your site to tell search robots which pages you would like them not to visit. Robots.txt is by no means mandatory for search engines but generally search engines obey what they are asked not to do. It is important to clarify that robots.txt is not a way of preventing search engines from crawling your site (i.e. it is not a firewall, or a kind of password protection) and the fact that you put a robots.txt file is something like putting a note “Please, do not enter” on an unlocked door – e.g. you cannot prevent thieves from coming in but the good guys will not open the door and enter. That is why we say that if you have really sensitive data, it is too naïve to rely on robots.txt to protect it from being indexed and displayed in search results.
The location of robots.txt is very important. It must be in the main directory because otherwise user agents (search engines) will not be able to find it – they do not search the whole site for a file named robots.txt. Instead, they look first in the main directory (i.e. http://mydomain.com/robots.txt) and if they don’t find it there, they simply assume that this site does not have a robots.txt file and therefore they index everything they find along the way. So, if you don’t put robots.txt in the right place, do not be surprised that search engines index your whole site.
The concept and structure of robots.txt was developed more than a decade ago and if you are interested to learn more about it, visit http://www.robotstxt.org/ or you can go straight to the Standard for Robot Exclusion because in this article we will deal only with the most important aspects of a robots.txt file. Next we will continue with the structure of a robots.txt file.”
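To make the location rule quoted above concrete, here is a minimal Python sketch that works out where a crawler would look for robots.txt, given any page on a site. The function name `robots_url` and the example page URL are assumptions made for illustration; they do not come from the quoted article.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path with /robots.txt,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page URL on the example domain used in the quote.
print(robots_url("http://mydomain.com/blog/2009/10/post.html"))
# prints http://mydomain.com/robots.txt
```

Whatever the page path is, the result always points at the main directory, which is why a robots.txt placed in a subdirectory is never found.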
Structure of Robots.txt File
User-agent:
Disallow:
“User-agent:” names the search engine crawler that a rule applies to, and “Disallow:” lists the files and directories to be excluded from indexing. In addition to “User-agent:” and “Disallow:” entries, you can include comment lines by putting the # sign at the beginning of the line.
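As a small sketch of how these entries work together, the Python snippet below feeds a sample robots.txt to the standard library's `urllib.robotparser` and asks whether two URLs may be fetched. The file contents, the directory names `/tmp/` and `/private/`, and the crawler name `ExampleBot` are all hypothetical, chosen only to illustrate the structure.

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt: one comment line, one User-agent entry,
# and two Disallow entries. "*" means the rules apply to every crawler.
ROBOTS_TXT = """\
# Keep all crawlers out of the temporary and private areas
User-agent: *
Disallow: /tmp/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up crawler name; it falls under the "*" rules.
print(parser.can_fetch("ExampleBot", "http://mydomain.com/index.html"))      # True
print(parser.can_fetch("ExampleBot", "http://mydomain.com/private/a.html"))  # False
```

The comment line is ignored by the parser, and any path that starts with a disallowed prefix is reported as off limits to a well-behaved crawler.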
Some Famous Tools For Generating Robots.txt



This entry was posted on Tuesday, October 13th, 2009 at 1:26 am and is filed under SEO & SEM News.





