Harnessing the Power of Robots.txt


Once we have a website up and running, we need to make sure that all visiting search engines can access all the pages we want them to look at.

Sometimes we may want search engines to stay out of certain parts of the site, or even ban particular search engines from the site altogether.

This is where a simple text file called robots.txt comes in. In its most basic form it is only two lines long.

Robots.txt resides in your website's main directory, so that bots can find it at /robots.txt (on many Linux hosting accounts this is your /public_html/ directory), and looks something like the following:

User-agent: *
Disallow:

The first line names the bot the rule applies to (* matches every bot). The second line says which parts of the site that bot is not allowed to visit; leaving it empty, as above, lets the bot in everywhere.
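
As a rough illustration of how a well-behaved bot reads these two lines, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name "examplebot", the /private/ path and the build_parser helper are made up for this example and are not part of the article.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The two-line file shown above: every bot, nothing disallowed.
allow_all = build_parser("User-agent: *\nDisallow:\n")

# Hypothetical variant: every bot, but the /private/ directory is off limits.
block_private = build_parser("User-agent: *\nDisallow: /private/\n")

# "examplebot" is an invented bot name used only for this sketch.
print(allow_all.can_fetch("examplebot", "/private/page.html"))      # True
print(block_private.can_fetch("examplebot", "/private/page.html"))  # False
print(block_private.can_fetch("examplebot", "/index.html"))         # True
```

Keep in mind that robots.txt is only a request: it works because cooperative bots run a check like this before fetching a page.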

If you want to handle multiple bots, simply repeat the above lines for each one, separated by a blank line.
For example:

User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /

This will allow Google (user-agent name GoogleBot) to visit every page and directory, while at the same time banning Ask Jeeves from the site completely.
To find a reasonably up-to-date list of robot user-agent names, visit...
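
The two-bot example above can be checked the same way. This is a minimal sketch with Python's standard urllib.robotparser; the user-agent names are taken from the article as written, while "someotherbot" and the page path are invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# The multi-bot file from the example above, held in a string.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow:

User-agent: askjeeves
Disallow: /
"""

multi_parser = RobotFileParser()
multi_parser.parse(ROBOTS_TXT.splitlines())

# "someotherbot" and the path are hypothetical; they show what happens
# to a bot that has no section of its own in the file.
results = {
    bot: multi_parser.can_fetch(bot, "/any/page.html")
    for bot in ("googlebot", "askjeeves", "someotherbot")
}
print(results)
```

Here googlebot is allowed, askjeeves is refused, and the unlisted bot is allowed too, because a bot with no matching section and no * section has no restrictions. To keep unlisted bots out, you would add a "User-agent: *" section with its own Disallow line.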
