@Exadra37
Last active August 17, 2016 05:42
Class that reads the user agent from the global var $_SERVER and detects whether it is a bot, spider or crawler, based on HTTP_USER_AGENT and a list of web sniffer names or types.
<?php
/**
* UserAgentDetector
*
* @author Exadra37
* @license Free to use at your own risk. Provided as is, without any warranty that it will work for you.
* @link http://presta-hosting.com
* @version 1.0.1
* @since v.1.0.0 - 09/03/2014
* v.1.0.1 - 09/03/2014 - fix typos
*
*/
class UserAgentDetector
{
/**
* $userAgent
* - user agent string if present in the global var $_SERVER, otherwise null
*
* @var string|null
*
* @access protected
*/
protected $userAgent;
/**
* $crawlersNames
* - pipe-separated list of crawler, bot and spider names, used as regex alternatives
*
* @var string
*
* @access public
*/
public $crawlersNames;
/**
* $botTypes
* - list of types of web sniffers like bot, crawl, spider and slurp
*
* @var string
*
* @access public
*/
public $botTypes;
public function __construct()
{
$this->userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : null;
// List of crawlers, bots, spiders.
// Add as many as you want from an online list, like this one:
// - http://www.robotstxt.org/db.html
// Concatenated so that no line break ends up inside the regex alternatives.
$this->crawlersNames = 'Google|msnbot|Rambler|Yahoo|AbachoBOT|accoona|'
. 'AcioRobot|ASPSeek|CocoCrawler|Dumbot|FAST-WebCrawler|'
. 'GeonaBot|Gigabot|Lycos|MSRBOT|Scooter|AltaVista|IDBot|eStyle|Scrubby';
// List of web sniffers
$this->botTypes = 'bot|crawl|slurp|spider';
}
/**
* Detects whether the user agent is a bot, spider or crawler, based on the HTTP_USER_AGENT and a list of crawler names.
*
* Inspired by http://wanderr.com/jay/detect-crawlers-with-php-faster/2009/04/08/
*
* @author Exadra37
*
* @link http://presta-hosting.com
*
* @return boolean - returns true if it is a known crawler, otherwise returns false.
*/
public function isSpecificCrawler()
{
return !empty($this->userAgent) ? preg_match("/{$this->crawlersNames}/", $this->userAgent) > 0 : false;
}
/**
* Simple detection of whether the user agent is a bot, spider, crawler or slurp, based only on the HTTP_USER_AGENT.
*
* Inspired by http://stackoverflow.com/a/15047834
*
* @author Exadra37
*
* @link http://presta-hosting.com
*
* @return boolean - returns true if it is a bot, otherwise returns false.
*/
public function isGenericBot()
{
// Case-insensitive, so that user agents containing "Slurp" or "Spider" also match.
return !empty($this->userAgent) ? preg_match("/{$this->botTypes}/i", $this->userAgent) > 0 : false;
}
/**
* Get the User Agent from the global var $_SERVER
*
* @author Exadra37
*
* @link http://presta-hosting.com
*
* @return string|null - the user agent from the global $_SERVER if set, otherwise null
*/
public function getUserAgent()
{
return $this->userAgent;
}
}
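
The class above keeps its crawler names in one pipe-separated string and interpolates it straight into the regex. As a minimal sketch of an alternative, assuming you would rather keep the names in an array, the snippet below builds the pattern with `implode()` and escapes each name with `preg_quote()`. The functions `buildCrawlerPattern()` and `matchesCrawler()` and the sample user agent strings are hypothetical and not part of the gist. The case-insensitive match is an assumption too, since the class matches crawler names case-sensitively.

```php
<?php
// Hypothetical helper, not part of the gist: builds one case-insensitive
// pattern from a list of names, escaping regex metacharacters in each name.
function buildCrawlerPattern(array $names)
{
    $escaped = array_map(function ($name) {
        return preg_quote($name, '/');
    }, $names);

    return '/' . implode('|', $escaped) . '/i';
}

// Hypothetical helper: returns true when the user agent matches the pattern.
// An empty or missing user agent is treated as "not a crawler", as in the class.
function matchesCrawler($userAgent, $pattern)
{
    return !empty($userAgent) && preg_match($pattern, $userAgent) === 1;
}

$pattern = buildCrawlerPattern(array('Google', 'msnbot', 'FAST-WebCrawler'));

// Sample user agent strings, made up for illustration.
$crawlerHit = matchesCrawler('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', $pattern);
$browserHit = matchesCrawler('Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0', $pattern);
$missingHit = matchesCrawler(null, $pattern);

var_dump($crawlerHit, $browserHit, $missingHit);
```

Keeping the names in an array avoids stray whitespace or line breaks inside the alternatives, and a name containing a regex metacharacter or the `/` delimiter cannot break the pattern.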