Tag: Robots Txt
How to Prevent Duplicate Content with Effective Use of the Robots.txt and Robots Meta Tag
Posted on Jun.30, 2009, under Web Traffic
Duplicate content is one of the problems that we regularly come across as part of the search engine optimization services we offer. If the search engines determine that your site contains duplicate or near-duplicate content, they may penalize it or even exclude it from their results. Fortunately, the problem is easily rectified.
Your primary weapon against duplicate content is the Robots Exclusion Protocol, which has now been adopted by all the major search engines.
There are two [...]
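As a rough illustration of the robots meta tag named in the title, here is a minimal Python sketch that reads the directives from a page's meta tag. The sample page and the names RobotsMetaParser and robots_directives are made up for this example; a "noindex" directive is the usual way to ask engines to leave a duplicate page out of their index.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without a value.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                if part.strip():
                    self.directives.append(part.strip().lower())


def robots_directives(html):
    """Return the list of robots meta directives in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page, assumed only for this sketch.
SAMPLE_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body>Printer-friendly copy</body></html>"
)

found = robots_directives(SAMPLE_PAGE)
print(found)
```

A check like this can be run over a site's printer-friendly or archive pages to confirm that the copies you want kept out of the index really carry the tag.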
Search Engine Spiders Lost Without Guidance – Post This Sign!
Posted on Jun.06, 2009, under Web Traffic
The robots.txt file implements an exclusion standard that web crawlers/robots consult to learn which files and directories you want them to stay OUT of on your site. Compliance is voluntary: not all crawlers/bots follow the exclusion standard, and some will continue crawling your site anyway. I like to call them “Bad Bots” or trespassers. We block them by IP exclusion, which is another story entirely.
This is a very simple overview of robots.txt basics for webmasters. For a complete and thorough lesson, [...]
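To make the basics concrete, here is a small sketch using Python's standard urllib.robotparser module, which behaves the way a well-mannered crawler does. The robots.txt contents, the bot name ExampleBot and the example.com URLs are assumptions invented for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the folder names are made up.
SITE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def load_rules(robots_txt):
    """Parse robots.txt text into a RobotFileParser."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


site_rules = load_rules(SITE_ROBOTS_TXT)

# A polite crawler asks before fetching each URL.
blocked = site_rules.can_fetch("ExampleBot", "http://www.example.com/private/report.html")
allowed = site_rules.can_fetch("ExampleBot", "http://www.example.com/index.html")
print(blocked, allowed)
```

Note that this only shows what a compliant robot would do. A "Bad Bot" never runs a check like this, which is why it has to be blocked by other means.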
The Role of the Robots.txt File to Improve Site Ranking!
Posted on Jun.02, 2009, under Web Traffic
Not many webmasters take the time to use a robots.txt file for their website. For search engine spiders that consult robots.txt to see which directories to search through, the file can be very helpful in keeping the spiders indexing your actual pages and not other information, such as your stats!
The robots.txt file is useful in keeping spiders from accessing folders and files in your hosting directory that are totally unrelated to your actual [...]
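If you keep a list of the folders you want spiders to skip, a few lines of Python can write the file for you. This is only a sketch: the helper name build_robots_txt and the folder names "stats" and "cgi-bin" are assumptions for the example, not names taken from any particular host.

```python
def build_robots_txt(disallowed_dirs, user_agent="*"):
    """Return robots.txt text that asks `user_agent` to skip each folder."""
    lines = [f"User-agent: {user_agent}"]
    for folder in disallowed_dirs:
        # Normalise "stats", "/stats" and "stats/" to "/stats/".
        lines.append("Disallow: /" + folder.strip("/") + "/")
    return "\n".join(lines) + "\n"


# Hypothetical folders that hold nothing worth indexing.
robots_body = build_robots_txt(["stats", "cgi-bin"])
print(robots_body)
```

The resulting text would be saved as robots.txt in the site's root directory.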
Link Exchange Tips, No Tricks
Posted on May.29, 2009, under Web Traffic
Use text links, avoid image links.
If you do use image links, always put your keywords in the alt attributes.
Put your prime keywords in text links and always ensure that you include a short description of your website/page of at least 20-30 words, or more if allowed.
If you are not allowed to add descriptions to text links, try to put a short description in the title attribute. E.g.: <a href="link-exchange.html" title=" [...]
Robots.txt or how to get your site properly spidered, crawled, indexed by bots
Posted on Mar.27, 2009, under Web Design
So you heard someone stressing the importance of the
robots.txt file, or noticed in your website’s logs that the
robots.txt file is causing an error, or that it is somehow at the
very top of your most visited pages, or you read some article
about the death of the robots.txt file and about how you should
not bother with it ever again. Or maybe you never heard of the
robots.txt file but are intrigued by all that talk about
spiders, robots and crawlers. In this article, I [...]
How to Keep Robots Out of Your Web Site
Posted on Feb.02, 2009, under Web Design
THE ROBOTS.TXT FILE
You know that search engines were created to help people find information quickly on the Internet. The search engines acquire much of their information through robots (also known as spiders or crawlers) that look for web pages for them.
These robots explore the web, looking for and recording all kinds of information. They usually start from URLs submitted by users, links they find on web sites, sitemap files or the [...]
Harnessing the Power of Robots.txt
Posted on Jan.28, 2009, under Web Design
Once we have a website up and running, we need to make sure that all visiting search engines can access all the pages we want them to look at.
Sometimes, we may want search engines not to index certain parts of the site, or even to ban certain search engines from the site altogether.
This is where a simple little two-line text file called robots.txt comes in.
Robots.txt resides in your website's main directory (on many Linux hosting accounts this is your /public_html/ directory), [...]
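Both cases mentioned above, hiding part of the site from everyone and banning one engine altogether, can be sketched in a single file and checked with Python's standard urllib.robotparser. The robot names BadBot and GoodBot, the /stats/ folder and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one named robot is banned from the whole site,
# every other robot is only kept out of /stats/.
ENGINE_ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /stats/
"""

engine_rules = RobotFileParser()
engine_rules.parse(ENGINE_ROBOTS_TXT.splitlines())

home = "http://www.example.com/index.html"
stats = "http://www.example.com/stats/today.html"

banned_home = engine_rules.can_fetch("BadBot", home)
other_home = engine_rules.can_fetch("GoodBot", home)
other_stats = engine_rules.can_fetch("GoodBot", stats)
print(banned_home, other_home, other_stats)
```

The first two lines of that file on their own are the kind of two-line robots.txt that bans a single engine; the second pair of lines is the kind that hides one folder from everybody else.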
Tips to Protect Your Downloads or Products
Posted on Jan.14, 2009, under Web Design
1. Upload a robots.txt file to your root directory and list in it the folder where you keep your downloads.
More information on how to set robots.txt: http://www.webmasters-central.com/wp/se/robotstxt.shtml
2. Set the permissions of the download folder to 711 OR upload an index file to that folder. Either way, visitors can no longer browse a listing of the folder's contents.
For example, create a folder named ‘test’. By default it will usually have permissions 755 or 777. Put some files in it, such as test.htm and test1.htm.
Now type the URL of the folder – [...]
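The permission change in step 2 can also be scripted. Below is a minimal sketch for a POSIX system (Linux or similar), using a throwaway temporary directory in place of a real web root; the helper name restrict_listing is made up for this example. Mode 711 removes the read bit for group and others, which is what normally stops the server from producing a folder listing.

```python
import os
import stat
import tempfile


def restrict_listing(folder):
    """Set `folder` to mode 711 and return the resulting mode as an octal string."""
    os.chmod(folder, 0o711)
    return oct(stat.S_IMODE(os.stat(folder).st_mode))


# A temporary directory stands in for the web root in this sketch.
with tempfile.TemporaryDirectory() as web_root:
    downloads = os.path.join(web_root, "test")
    os.mkdir(downloads)
    new_mode = restrict_listing(downloads)

print(new_mode)
```

On a real host you would point restrict_listing at the actual download folder, or simply run chmod 711 on it from the shell or your FTP client.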
The robots.txt file
Posted on Dec.16, 2008, under Web Design
Since the beginning of the Internet there has been a need to index
the Web, and many robots have been built for this purpose. You
already know the famous Googlebot, which indexes the Web to keep
track of URLs and build a scheme out of them (link popularity
algorithm…).
There are not many ways to scan a website, but some pages of a
website might not need to be crawled, for reasons such as
privacy…
A Standard for Robot Exclusion has been created and now
robots from search [...]
The robots.txt file
Posted on Nov.26, 2008, under Web Design
Since the beginning of the Internet there has been a need to index
the Web, and many robots have been built for this purpose. You
already know the famous Googlebot, which indexes the Web to keep
track of URLs and build a scheme out of them (link popularity
algorithm…).
There are not many ways to scan a website, but some pages of a
website might not need to be crawled, for reasons such as
privacy…
A Standard for Robot Exclusion has been created and now
robots from search [...]
