Does it differentiate between Robot Visitors & Humans?

General discussions about Advanced Log Analyzer
Guest

Post by Guest »

Small website here. I don't want a disproportionately large number of robot visitors to distort the log data, so does Abacre do this? Any help appreciated, thanks a lot.

Abacre
Site Admin
Posts: 1223
Joined: Mon Jan 31, 2005 5:32 pm

Post by Abacre »

Hi,

This is a good question. Here is a tip on how to solve it.

1. Note that the names of robots are recorded in the "user agent" field
of the log file. So:

2. Periodically check the "Last 30 visitors paths" report or, even
better, the "Most common user agents" report. Set the "limit"
parameter (number of rows in the report table) rather high, e.g. 250.
You will see that robots have names like the following:

Code: Select all

JennyBot/0.1
Slurp/si (slurp@inktomi.com; http://www.inktomi.com/slurp.html)
ArchitextSpider


Sometimes new robots appear each day.

3. Go to the Site tab and check "Show advanced options". Then add a filter
for each robot. For example, to eliminate
Slurp/si (slurp@inktomi.com; http://www.inktomi.com/slurp.html)

you may simply add the filter:

Code: Select all

parameter: agent
condition: includes
value: Slurp
action: skip_line


Note: you don't need to use the whole
Slurp/si (slurp@inktomi.com; http://www.inktomi.com/slurp.html)
as the value. Just use Slurp.

So create a filter for each robot you find. The most common robot names
include Bot, Spider, or bot (filters are case sensitive).

So adding

Code: Select all

parameter: agent
condition: includes
value: Bot
action: skip_line


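will eliminate not only JennyBot/0.1 but also any other robot whose user agent contains "Bot". Because filters are case sensitive, robots that spell it in lowercase, such as Googlebot, need a separate filter with the value bot.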
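
To see how a set of case-sensitive "includes" filters behaves, here is a minimal Python sketch. It is not Advanced Log Analyzer code: the function name skipped_by_includes and the sample list are made up for illustration, and it only assumes that "includes" means a case-sensitive substring match, as described above.

```python
# Hypothetical model of "includes" + skip_line filters (not the program's own code).
# Assumption: "includes" is a case-sensitive substring test on the user agent.

def skipped_by_includes(user_agent, values):
    """Return True if any filter value occurs in the user agent (case sensitive)."""
    return any(value in user_agent for value in values)

# User agents copied from the examples in this post.
agents = [
    "JennyBot/0.1",
    "Slurp/si (slurp@inktomi.com; http://www.inktomi.com/slurp.html)",
    "ArchitextSpider",
    "Mozilla/4.0 (compatible; MSIE 5.01; Windows",
]

# One filter value per robot name fragment.
filters = ["Slurp", "Bot", "Spider"]

# Keep only the lines that no filter skips.
kept = [ua for ua in agents if not skipped_by_includes(ua, filters)]
print(kept)
```

With these sample values only the browser line should remain. A lowercase "jennybot" would slip past the "Bot" filter, which is why a second filter with the value bot is worth adding.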

Second variant: add just one filter instead of several:

Code: Select all

parameter: agent
condition: not_includes
value: Mozilla
action: skip_line


Normally all web browsers (Opera, IE, Mozilla) have a user agent like this:
Mozilla/4.0 (compatible; MSIE 5.01; Windows
So you may use the filter above. But note that it will also eliminate
some download managers. For example, the GetRight download manager has the
following user agent:
GetRight/4.3
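
The trade-off of this second variant can be sketched the same way. Again this is only an illustration in Python, not the program's own logic; skipped_by_not_includes is a made-up name, and the sketch assumes "not_includes" is a case-sensitive "substring is absent" test.

```python
# Hypothetical model of the single "not_includes Mozilla" filter
# (an assumption about its behavior, not the program's own code).

def skipped_by_not_includes(user_agent, value="Mozilla"):
    """Return True if the user agent lacks the value, so the line is skipped."""
    return value not in user_agent

# User agents copied from the examples in this post.
agents = [
    "Mozilla/4.0 (compatible; MSIE 5.01; Windows",
    "JennyBot/0.1",
    "GetRight/4.3",
]

for ua in agents:
    status = "skip" if skipped_by_not_includes(ua) else "keep"
    print(status, ua)
```

The browser line is kept, while both the robot and the GetRight download manager are skipped. Any robot that puts Mozilla in its user agent would pass this filter, so it may still be worth checking the "Most common user agents" report afterwards.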
