There are many ways to tell whether a request was sent by a bot or crawler. One of them is to check the user agent string.

A normal browser's user agent string usually contains only the browser name, version, and platform. Crawlers tend to have some distinguishing characteristics.

For example, some of them contain a domain name, so we can ignore those requests when recording visitor information.

Here are two functions that can filter out most crawler requests by looking at the user agent string.

 
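// Returns true if $str contains something that looks like a domain name,
// e.g. "www.example.com".
function containsDomain($str) {
    $pattern = '/([\da-z\.-]+)\.([a-z\.]{2,6})/i';
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    return preg_match($pattern, $str) === 1;
}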
 
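// Returns true if the user agent looks like a crawler and should be excluded.
function uaExcludeFilter($ua) {
    // Keywords that commonly appear in crawler user agent strings.
    $keywords = array("crawler", "spider", "twitterbot", "favicon", "semrush");
    foreach ($keywords as $keyword) {
        // stripos() is case-insensitive, so no strtolower() call is needed.
        if (stripos($ua, $keyword) !== false) {
            return true;
        }
    }

    // Otherwise fall back to the domain name check.
    return containsDomain($ua);
}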
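
To see how the filter behaves, here is a small sketch that runs it against a few sample user agent strings. The sample strings are only illustrative (real ones vary by browser and version), and recordVisit() in the closing comment is a hypothetical helper, not something defined in this post. The two functions are repeated in compact form so the snippet runs on its own; drop them if they are already loaded.

```php
<?php
// Compact copies of the two functions, so this sketch is self-contained.
function containsDomain($str) {
    return preg_match('/([\da-z\.-]+)\.([a-z\.]{2,6})/i', $str) === 1;
}

function uaExcludeFilter($ua) {
    foreach (array('crawler', 'spider', 'twitterbot', 'favicon', 'semrush') as $keyword) {
        if (stripos($ua, $keyword) !== false) {
            return true;
        }
    }
    return containsDomain($ua);
}

// Illustrative samples: a desktop browser, a crawler that advertises a URL,
// and a crawler caught by the keyword list.
$samples = array(
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:115.0) Gecko/20100101 Firefox/115.0',
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'Twitterbot/1.0',
);

foreach ($samples as $ua) {
    echo (uaExcludeFilter($ua) ? 'skip   ' : 'record ') . $ua . "\n";
}

// In a request handler this might look like the following
// (recordVisit() is a hypothetical helper):
//
// $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// if (!uaExcludeFilter($ua)) {
//     recordVisit($ua);
// }
```

Keep in mind that this is a heuristic. A crawler that sends a browser-like user agent will pass the filter, and any user agent that happens to contain a domain-like token will be excluded.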