What the robots.txt file is and how to tell search engines what they can and cannot see on your website


When you have a website, it is normal to put a lot of effort into pleasing the search engines so that they reward you with better positions in their results. It happens to all of us.

But did you know there is a way to tell them where they can go and where they cannot? That's right: however much power they have, you can have the last word, and all thanks to the robots.txt file.

What does it consist of? What advantages does it have? How can you create one? We answer all of those questions in the lines below.

Get ready, because today you tell Google “I'm in charge here”… And the best part is that it will thank you for it. 😉

What is the robots.txt file?

If you go to its Wikipedia page, this is the explanation you will find:

A robots.txt file on a website works as a request that specified robots ignore specific files or directories when they crawl the site.

Or, in other words, it is a file read by the robots that index web pages, telling them which parts of the site they must ignore. That is its main function but, as we will see later, it can do much more.

Why is the robots.txt file so important?

Although it is an optional file, our advice is that your website should have one. The reason? All the advantages it offers:

What do you think? Is it worth having one or not? We have no doubt… A resounding yes.

How to create your own robots.txt file

Creating the robots.txt file is not complicated at all but, before showing you how to do it, it is important that you know these aspects:

Once this is clear, it is time to get to work. Creating the .txt file is very simple: just open Notepad (or any similar program), add the restrictions you want and save it under the name robots.txt.

1. The most important commands

The way the file “speaks” to the search engine spiders must meet some requirements set out in the Robots Exclusion Protocol:

  • You must only use the allowed commands.
  • Robots distinguish between lowercase and uppercase letters (in paths in particular), punctuation and spaces, so you must respect them.
  • To add a comment, use the hash sign (#).
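These requirements can be checked mechanically. The following is a minimal sketch in Python, not an official validator: the `lint_robots` function and the `ALLOWED` set are hypothetical names, and the set only covers the four commands described in this article (real crawlers may accept others). It drops comments after the hash sign and flags lines that use an unknown command.

```python
# Commands covered in this article. Real crawlers may accept more
# (this set is an assumption of the sketch, not an official list).
ALLOWED = {"user-agent", "disallow", "allow", "sitemap"}


def lint_robots(text):
    """Return a list of (line_number, message) problems found in text."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        # Everything after '#' is a comment and is ignored.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank or comment-only line
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        # Command names are compared without regard to case here;
        # the path after the colon is left untouched.
        name = line.split(":", 1)[0].strip().lower()
        if name not in ALLOWED:
            problems.append((number, f"unknown command '{name}'"))
    return problems


sample = """# block the images directory
User-agent: *
Disallow: /imagenes
Dissallow: /privado
"""

for number, message in lint_robots(sample):
    print(f"line {number}: {message}")
```

In this sample only the misspelled `Dissallow` line is reported; the comment line and the valid commands pass.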

And now, the main commands are:

  • User-agent: it is mandatory and indicates which search engine robot must follow the rules (each search engine publishes the name of its robot).
  • Disallow: with this command you indicate the directory or URL that must not be crawled.
  • Allow: it is used to override Disallow and allow access to a subdirectory of a blocked directory.
  • Sitemap: if you have several sitemap files, this command indicates which one the robot should crawl. It is optional.

Apart from that, you can include certain characters in the commands to fine-tune them:

  • Asterisk (*): it is a wildcard, a way of saying “anything goes here”. For example, if you have your image galleries organized in directories and want to prevent them from being indexed, you would use “/galeria*/”.
  • Dollar ($): it refers to the end of a web address. For example, “/*.aspx$” prevents files ending with that extension from being crawled.
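To see how these two characters behave, here is a minimal sketch in Python that translates a robots.txt path pattern into a regular expression. The function names `pattern_to_regex` and `matches` and the sample URLs are hypothetical, and the sketch assumes the interpretation used by the major search engines: the asterisk matches any run of characters and the dollar anchors the end of the address.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    # A trailing '$' means the address must end right there.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def matches(pattern, path):
    """Return True if the path is covered by the pattern."""
    return pattern_to_regex(pattern).search(path) is not None


print(matches("/galeria*/", "/galeria-2020/foto.jpg"))
print(matches("/*.aspx$", "/tienda/inicio.aspx"))
print(matches("/*.aspx$", "/tienda/inicio.aspx?id=3"))
```

The first two calls match. The third does not, because the address continues after the extension. Note that without the asterisk, a pattern such as “/.aspx$” would only cover the single address “/.aspx”.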

As you can see, this opens up some very interesting possibilities.

Example of a robots.txt file

The time has come to see this tool “in action”, so we have created an example robots.txt file that you could use on any website:

User-Agent: *
Disallow: /imagenes
Allow: /imagenes/fotografias/
Sitemap: https://tupaginaweb.com/sitemap.xml

Let's see what each line means:

  1. It is a way to indicate that all robots must follow the rules.
  2. We prevent them from crawling a specific directory.
  3. We tell them that, within the blocked directory, they can still index a specific subdirectory.
  4. The address of the website's sitemap.

And remember that you can fine-tune it further with the asterisk and the dollar sign.
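The effect of the example can be simulated with a short Python sketch. It is a simplification under stated assumptions: `parse_rules` and `is_allowed` are hypothetical helpers, the file is treated as a single group for all robots, wildcards are not handled, and conflicts are resolved by letting the longest matching path win (the behavior Google documents; some parsers apply the first matching rule instead, so results can differ).

```python
EXAMPLE = """User-Agent: *
Disallow: /imagenes
Allow: /imagenes/fotografias/
Sitemap: https://tupaginaweb.com/sitemap.xml
"""


def parse_rules(text):
    """Collect (kind, path) pairs from the Allow and Disallow lines."""
    rules = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        name, value = (part.strip() for part in line.split(":", 1))
        if name.lower() in ("allow", "disallow") and value:
            rules.append((name.lower(), value))
    return rules


def is_allowed(rules, path):
    """Longest matching path wins; Allow wins a tie; no match means allowed."""
    best = ("allow", "")
    for kind, prefix in rules:
        if path.startswith(prefix):
            longer = len(prefix) > len(best[1])
            tie_for_allow = len(prefix) == len(best[1]) and kind == "allow"
            if longer or tie_for_allow:
                best = (kind, prefix)
    return best[0] == "allow"


rules = parse_rules(EXAMPLE)
for path in ("/imagenes/logo.png", "/imagenes/fotografias/playa.jpg", "/blog/"):
    print(path, "allowed" if is_allowed(rules, path) else "blocked")
```

With these rules, the logo inside /imagenes is blocked, the photo inside /imagenes/fotografias/ is allowed, and /blog/ is allowed because no rule mentions it.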

Check that the robots.txt file has no errors

However optional it may be, once you have it, it is important to be certain that it does what you want it to do, which is easy to check if you know Google Webmaster Tools. Specifically, you need to go to Search Console, one of the tools in the suite that the big G puts at your disposal.

Once inside, you have two options (both in the “Crawl” menu):

  • Fetch as Google: just click “Fetch and Render” and it shows you how the search engine sees your website.
  • robots.txt Tester: as its name indicates, it is a place to check that everything in the file is correct.

Whichever option you choose, it is very important that the result shows zero errors because, if something goes wrong in robots.txt, something goes wrong on your site.

Does your website have a robots.txt file?

Or is it one of those where the search engines walk in and make themselves at home? If you have something to tell us about the robots.txt file, want to share your experience creating it or have any questions, do not hesitate to write to us in the comments below.

We want you to tell us what is going through your head on this topic so… go for it! 😉