How To Create And Configure Your Robots.txt File
If you’re not taking this file seriously at the moment, you need to start.
Robots.txt isn’t just another file; it can make or break the website you’ve been running for so long. Configured correctly, it guides the search engine bots and helps improve your organic traffic.
Configured badly, it can block those same bots from accessing your site and become a barrier between your website and its organic traffic.
So don’t skim this tutorial. Read it, learn it, and I’ll help you understand how to create and configure your Robots.txt file.
What’s a Robots.txt file?
If you’re not sure why this file exists, go through this part first. Otherwise, you can skip ahead to the main tutorial.
Robots.txt is a standard followed by the major search engines. Every website should have one if its owner wants the search engines to access only part of the site rather than all of it.
Search engine bots are constantly moving around the internet, finding new webpages, scanning changes to older ones, and indexing them in their databases according to their ranking algorithms.
Since we both own a blog, we want it indexed in the search engines as quickly as possible. After mastering content strategy, web development, SEO, and every other aspect of a website’s success, the Robots.txt file is the next thing to touch, and it can open a wonderful path.
It is a simple text file that can be created with the Notepad program, and it holds a few directives, as defined by the robots exclusion standard.
Whenever a bot arrives at your site to crawl it and consider indexing its webpages, it first checks the rules written in this special file.
If a web directory, path, or webpage is allowed according to that file, the bot proceeds as usual. But if the webmaster doesn’t want the bot to index a few grey areas, it won’t go into those locations.
Note – This standard is followed by all legitimate search engines, and their bots respect it. That doesn’t mean every bot on the internet follows these rules. But Google does, and for most of us that’s all we should care about.
How to Create a Robots.txt file
You just need to start your computer and open a new text file in Notepad, or whatever similar program your computer has.
Now save that file with the exact name and extension ‘robots.txt’, and you’re done creating the file.
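Keep in mind that search engines only look for this file at the root of your domain, so when you upload it later it needs to end up at an address like the one below (example.com is just a placeholder for your own domain):
https://example.com/robots.txt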
But, the work isn’t over yet.
That empty file won’t work any wonders on its own; you need to add a few rules to it. Don’t worry, I’m coming to them, and this is the real part that needs your complete attention.
User-agent: googlebot
Disallow: /cgi-bin
This is the first set of instructions you can add to the file at your end. It lets Google’s bot crawl the folders and files of your website, except anything under the ‘cgi-bin’ folder.
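To make that concrete, here is roughly how Googlebot would treat two URLs under that rule (the domain and paths are made up for illustration):
https://example.com/cgi-bin/form-handler.pl – skipped, because it falls under /cgi-bin
https://example.com/about/ – crawled as usual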
Getting the point?
Let’s move onto the advanced part.
How to configure your Robots.txt file
Before we get started, I want you to know that you need to write these rules exactly as shown here. I’ll leave you a link to the complete, official source of information on this topic, and I recommend tallying things from there.
The two basic directives, or rules:
User-agent – with this line, you name the specific search engine bot the rules apply to, or target all of them at once.
Disallow – this tells the bot that the given file or folder path is off limits and shouldn’t be crawled or indexed.
Example:
User-agent: *
Disallow: /images/
Now, this code addresses every bot, no matter which search engine it comes from, and lets them access and index your whole website except the ‘/images/’ folder.
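Two edge cases are worth remembering: an empty Disallow value blocks nothing at all, while a single slash blocks the entire site. For instance, this small sketch would keep every bot out of the whole website:
User-agent: *
Disallow: /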
Also, it is usually better to use ‘*’, i.e. to write the rules for all search engine bots at once, rather than single out only a few of them. But just to update your knowledge, the following are the bot names of popular search engines, and right after the list you’ll find a sketch of how to target one of them.
- Googlebot – for Google
- Googlebot-News – for Google News
- Googlebot-Image – for Google Images
- Bingbot – for Bing (remember this search engine from Microsoft?)
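If you ever do want different rules for different crawlers, you give each bot its own block. A minimal sketch, assuming you had a folder of private photos you wanted hidden from image search (the folder name is purely a placeholder):
User-agent: Googlebot-Image
Disallow: /private-photos/

User-agent: *
Disallow: /cgi-bin/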
Within this file, you can even type in the URL of your website’s sitemap, which helps the bots find it quickly. If you’re not aware of its benefit, it helps get the new webpages of your website indexed as soon as possible.
For a normal WordPress blog, you can use the following set of rules within that file.
Sitemap: https://85ideas.com/sitemap_index.xml

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Disallow: /archives/
Disallow: /*?*
Disallow: /comments/feed/
Disallow: /*?replytocom
Disallow: /wp-*

User-agent: Googlebot-Image
Allow: / # example rule so this group isn’t left empty: it lets Google’s image bot crawl everything
This is just a sample; yours needs to carry your own blog’s sitemap URL and whatever other rules fit the conditions at your end.
If you don’t want to do this by yourself and want a better-looking solution, you should use the WordPress SEO plugin by Yoast, as it lets you handle the related settings through simple checkboxes, with all the detailed information you need right there.
Once the file is ready, you simply need to save it and upload it to the root folder of your hosting server, through the File Manager app in cPanel or any FTP manager you use.
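After uploading, it’s worth confirming the file is actually reachable from the outside. A quick check from the command line looks like this (swap in your own domain for the placeholder):
curl https://example.com/robots.txt
Opening the same URL in your browser works just as well.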
Here is the official page on this matter from Google, with the complete information you could wish to know. Do check it out before proceeding.
Also, make sure you have a complete backup of the website with you, and keep a backup of the existing Robots.txt file if there already is one.
Do let me know if you face any issues or want to know anything in more detail. Don’t forget to share this important stuff with your fellow blogger friends. Peace.