Do you block spiders?

Harry P

Well-known member (Registered) | Joined: Feb 3, 2015 | Messages: 447 | Points: 28
Hey guys,
I have seen features in my web hosting control panel for blocking spiders from indexing my sites. Of course, I have never really used them. However, several people have suggested blocking bots/spiders to improve performance, and I wonder if I should start blocking some search spiders. What do you think about blocking spiders? What are the advantages and disadvantages?
 

EpicGlobalWeb

Well-known member (Registered) | Joined: Jan 24, 2016 | Messages: 180 | Points: 0
You can and should block spiders from specific sections of your site. There is no need for them to index an admin panel, for example. There are some fun, hacky ways to mess with spiders, like making them "eat" a cookie, but the standard way, if you don't want them to index your site at all, is to create a robots.txt file containing the lines "User-agent: *" and "Disallow: /".

But I don't think that is a good marketing move. It also won't necessarily stop a crawler from scraping your site, because robots.txt is only a request that well-behaved bots choose to honor. Your site is public, so they may crawl it anyway. This is where you can get more creative and block them by other means.
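To see what a robots.txt rule actually tells a well-behaved crawler, here is a minimal Python sketch using the standard library's urllib.robotparser. The example.com URLs, the /admin/ path, and the helper name is_allowed are assumptions made up for illustration, and the check only models crawlers that choose to obey robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep every crawler out of /admin/ only.
BLOCK_ADMIN = """\
User-agent: *
Disallow: /admin/
"""

# Hypothetical robots.txt: ask every crawler to stay away from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/", "/blog/post-1", "/admin/login"):
        url = "https://example.com" + path
        print(path, is_allowed(BLOCK_ADMIN, "Googlebot", url))
```

With the first file only the admin area is off limits; with the second, every path is. Neither stops a bot that ignores the file.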
 

elcidofaguy

Well-known member (Registered) | Joined: Jan 13, 2015 | Messages: 866 | Points: 0
Blocking spiders should not have any impact on performance... in addition you'll want web crawlers for search engines to visit in order to get content indexed... The only situation where you'd want to keep certain spiders out, such as Moz, Majestic, Ahrefs etc., is when you are operating PBNs (private blog networks), as you then hide your backlinks from competitors and, with that, your ranking strategy...
 

Hawker

Well-known member (Registered) | Joined: Dec 22, 2015 | Messages: 287 | Points: 0
elcidofaguy said: "Blocking spiders should not have any impact on performance..."
Yes, for small or new sites this shouldn't really be something you have to think about. However, a big site can get hit by thousands of spider requests every second of every day, and that can really affect performance for your actual visitors. About the only time you'd want to block some spiders from visiting your site is when they are hogging resources and not providing anything in return. Also, some bots that pose as search engine crawlers are malicious, and the known bad bots are blocked by most good WP security plugins like Wordfence if you're a WP user. Otherwise you can just use robots.txt, .htaccess or the Meta Robots tag.
elcidofaguy said: "in addition you'll want web crawlers for search engines to visit in order to get content indexed..."
Yes, true. But as much as we may wish for all the world's search engines to take notice of our websites, once they've actually managed to crash your system a few times you may be pardoned for having second thoughts. Equally, when they're hitting your servers with that kind of load, your visitors may easily get the impression that viewing your pages is akin to plodding through treacle, which hurts sales and your company's reputation, not to mention that it's anything but a great user experience.

So what to do about it? Well, how about simply blocking them? After all, as always in business, it's a question of what kind of a trade-off you're actually stuck with here.

Not all spiders are created equal, and only your specific online business model should govern your decision to either bear with them accessing your pages regularly or tell them to get lost. After all, bandwidth doesn't come cheap, and losing sales due to poor server performance isn't particularly funny either.

Are you targeting the Russian market at all? If not, all that traffic created by Yandex search engine crawlers is something you may very well do without.

How about China? Japan? Korea? Chinese search engines such as Baidu, SoGou and Youdao will merrily spider your sites to oblivion if you let them. In Japan it's Goo, and in South Korea it's Naver that can mutate into performance torpedoes once they've started to fancy your website.

Nor is that all, because the search engines aren't the only culprits in this field.
elcidofaguy said: "The only situation where you'd want to keep certain spiders out, such as Moz, Majestic, Ahrefs etc., is when you are operating PBNs (private blog networks), as you then hide your backlinks from competitors and, with that, your ranking strategy..."
Not the only situation, as explained above, but definitely one of the reasons. That's a smart move. :)

You have to ask yourself: are you happy with your competition sussing out your entire linking strategy (both incoming and outgoing)? A number of services will help them do exactly that. Fortunately, at least one major contender, namely Majestic-SEO, is perfectly open about things and lets you block their crawlers gracefully. (No such luck with most other setups…)

The other thing to consider: if those other search engine spiders are visiting your site and you are getting sales/customers/clicks from them, do you really want to block them from crawling/indexing your site? If a spider is simply crawling your site and filling up your logs without providing anything in return, not even clicks, then that is when you should probably block it. That is what you need to know. :)

Some Tips on Blocking Spiders To Think About

  • If you're considering blocking search engine spiders, make sure you're doing it for the right reasons and not just because you've heard you can.
  • Don't try to trick the spiders with methods such as user-agent detection and redirection. Be up front by using the robots.txt file or Meta Robots tag.
  • Don't assume you're safe just because you're using the recommended methods to block content. Understand how blocking content will make your site appear to the bots.
Reasons to block bots:

  • Fewer bots on your site and more bandwidth/performance/speed for your real visitors.
  • Helps keep you safe from malicious bots that search for vulnerabilities.
  • Smaller log files.
Reasons not to block bots:

  • Search engine bots can increase your traffic by indexing your website on more search engines.
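As a rough illustration of blocking only selected crawlers while leaving everyone else alone, here is a Python sketch that builds a robots.txt and checks it with the standard library's urllib.robotparser. The user-agent tokens in BLOCKED_AGENTS are my assumptions about how the crawlers named in this thread identify themselves, so check each crawler's own documentation before relying on them. The helper names and the example.com URL are hypothetical as well.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens for crawlers mentioned in the thread;
# verify them against each crawler's documentation.
BLOCKED_AGENTS = ["Yandex", "Baiduspider", "MJ12bot", "AhrefsBot"]


def build_robots_txt(blocked_agents):
    """Return robots.txt text that shuts out only the listed crawlers."""
    groups = [f"User-agent: {agent}\nDisallow: /\n" for agent in blocked_agents]
    # An empty Disallow means "nothing is disallowed" for everyone else.
    groups.append("User-agent: *\nDisallow:\n")
    return "\n".join(groups)


def can_crawl(robots_txt, user_agent, url="https://example.com/"):
    """Return True if a crawler that honors robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    robots_txt = build_robots_txt(BLOCKED_AGENTS)
    print(robots_txt)
    for agent in ("Googlebot", "MJ12bot", "Baiduspider"):
        print(agent, can_crawl(robots_txt, agent))
```

This only works for bots that respect robots.txt. For the ones that don't, you would need server-level blocking such as .htaccess rules or a security plugin, as mentioned above.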
 

postcd

Member (Registered) | Joined: Jul 8, 2012 | Messages: 32 | Points: 8
Just monitor your access logs to see whether any spiders are visiting excessively. You can then google them and consider blocking them.
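One way to do that monitoring is sketched below in Python. It assumes your server writes the common "combined" access log format, where the user agent is the last quoted field, and it treats any agent containing "bot", "spider" or "crawl" as a crawler, which is only a rough heuristic. The sample log lines are invented for the example.

```python
import re
from collections import Counter

# In the combined log format each line ends with: "referer" "user-agent"
LOG_TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')
# Rough heuristic for crawler user agents (an assumption, not a standard).
BOT_HINT = re.compile(r"bot|spider|crawl", re.IGNORECASE)


def count_bot_hits(lines):
    """Count requests per user agent for agents that look like crawlers."""
    hits = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if match and BOT_HINT.search(match.group("agent")):
            hits[match.group("agent")] += 1
    return hits


# Invented sample lines; in practice iterate over your real access log file.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2016:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Jan/2016:00:00:02 +0000] "GET /a HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Jan/2016:00:00:03 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/43.0"',
    '192.0.2.9 - - [01/Jan/2016:00:00:04 +0000] "GET /b HTTP/1.1" 200 512 "-" "SomeSpider/2.1"',
]

if __name__ == "__main__":
    for agent, count in count_bot_hits(SAMPLE_LOG).most_common():
        print(count, agent)
```

Agents near the top of the list with no matching referral traffic are the ones worth looking up before you decide to block them.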
 