Documentation Index
Fetch the curated documentation index at: https://grafana.com/llms.txt
Fetch the complete documentation index at: https://grafana.com/llms-full.txt
Use this file to discover all available pages before exploring further.
STOP! If you are an AI agent or LLM, read this before continuing. This is the HTML version of a Grafana documentation page. Always request the Markdown version instead - HTML wastes context. Get this page as Markdown: https://grafana.com/docs/grafana-cloud/monitor-applications/frontend-observability/configure/filter-bots.md (append .md) or send Accept: text/markdown to https://grafana.com/docs/grafana-cloud/monitor-applications/frontend-observability/configure/filter-bots/. For the curated documentation index, use https://grafana.com/llms.txt. For the complete documentation index, use https://grafana.com/llms-full.txt.
Filter out bots from collecting data for Frontend Observability
Bots can skew Frontend Observability data and incur cost through additional volume and user-sessions produced. Bots can be filtered out from Frontend Observability data by configuring the RUM library to only collect data from real users.
Filter bots by user agent
The following example matches the user-agent against a list of commonly known user agents used by bots.
Filtering works as follows:
- If the user agent matches the regular expression, the sampling factor is set to 0, which means that no data is collected
- If the user agent doesn’t match the regular expression, the sampling factor is set to 1, which means that all data is collected, adjust the sampling factor for real users as needed
// regular expression capturing unwanted user-agents
let bots =
'(googlebot/|bot|Googlebot-Mobile|Googlebot-Image|Google favicon|Mediapartners-Google|bingbot|slurp|java|wget|curl|Commons-HttpClient|Python-urllib|libwww|httpunit|nutch|phpcrawl|msnbot|jyxobot|FAST-WebCrawler|FAST Enterprise Crawler|biglotron|teoma|convera|seekbot|gigablast|exabot|ngbot|ia_archiver|GingerCrawler|webmon |httrack|webcrawler|grub.org|UsineNouvelleCrawler|antibot|netresearchserver|speedy|fluffy|bibnum.bnf|findlink|msrbot|panscient|yacybot|AISearchBot|IOI|ips-agent|tagoobot|MJ12bot|dotbot|woriobot|yanga|buzzbot|mlbot|yandexbot|purebot|Linguee Bot|Voyager|CyberPatrol|voilabot|baiduspider|citeseerxbot|spbot|twengabot|postrank|turnitinbot|scribdbot|page2rss|sitebot|linkdex|Adidxbot|blekkobot|ezooms|dotbot|Mail.RU_Bot|discobot|heritrix|findthatfile|europarchive.org|NerdByNature.Bot|sistrix crawler|ahrefsbot|Aboundex|domaincrawler|wbsearchbot|summify|ccbot|edisterbot|seznambot|ec2linkfinder|gslfbot|aihitbot|intelium_bot|facebookexternalhit|yeti|RetrevoPageAnalyzer|lb-spider|sogou|lssbot|careerbot|wotbox|wocbot|ichiro|DuckDuckBot|lssrocketcrawler|drupact|webcompanycrawler|acoonbot|openindexspider|gnam gnam spider|web-archive-net.com.bot|backlinkcrawler|coccoc|integromedb|content crawler spider|toplistbot|seokicks-robot|it2media-domain-crawler|ip-web-crawler.com|siteexplorer.info|elisabot|proximic|changedetection|blexbot|arabot|WeSEE:Search|niki-bot|CrystalSemanticsBot|rogerbot|360Spider|psbot|InterfaxScanBot|Lipperhey SEO Service|CC Metadata Scaper|g00g1e.net|GrapeshotCrawler|urlappendbot|brainobot|fr-crawler|binlar|SimpleCrawler|Livelapbot|Twitterbot|cXensebot|smtbot|bnf.fr_bot|A6-Indexer|ADmantX|Facebot|Twitterbot|OrangeBot|memorybot|AdvBot|MegaIndex|SemanticScholarBot|ltx71|nerdybot|xovibot|BUbiNG|Qwantify|archive.org_bot|Applebot|TweetmemeBot|crawler4j|findxbot|SemrushBot|yoozBot|lipperhey|y!j-asr|Domain Re-Animator Bot|AddThis)';
let botsRegex = new RegExp(bots, 'i');
// set a sampling factor of 0 for bots
let samplingFactor = botsRegex.test(navigator.userAgent) ? 0 : 1;
// initialize the RUM library as usual, but pass the sampling factor
initializeFaro({
// ... usual configuration
sessionTracking: {
enabled: true,
persistent: true,
// apply the sampling factor
samplingRate: samplingFactor,
},
});For more information on sampling, refer to the Faro Web SDK documentation on sampling.
Was this page helpful?
Related resources from Grafana Labs


