Facet Bot Blocker

Blocks bot requests that excessively probe faceted search parameters to protect site performance and SEO.

facet_bot_blocker
522 sites
drupal.org

Install

Drupal 11, 10 v1.0.2
composer require 'drupal/facet_bot_blocker:^1.0'

Overview

The Facet Bot Blocker module protects Drupal sites from malicious bots and crawlers that abuse faceted search functionality. When bots continuously request deeper levels of facet parameters (e.g., ?f[0]=color:red&f[1]=size:large&f[2]=brand:example), they can cause significant performance degradation and create SEO issues by generating excessive indexed URLs.

This module intercepts HTTP requests early in Drupal's request handling via an event subscriber. When a request contains more facet parameters than the configured limit, the module immediately returns a 403 Forbidden or 410 Gone response with a customizable message, so the request does not go on to trigger search queries or page rendering.
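
The check described above can be sketched in a few lines. This is a language-neutral illustration in Python, not the module's PHP code: the function names, the `can_bypass` flag, and the assumption that facet keys always look like `f[N]` (as in the example URL) are hypothetical.

```python
import re
from urllib.parse import parse_qsl

# Assumption: facet keys look like f[0], f[1], ... or f[], as in the example URL.
FACET_KEY = re.compile(r"^f\[\d*\]$")

ALLOWED_BLOCK_STATUSES = (403, 410)


def count_facets(query_string: str) -> int:
    """Count query parameters whose key matches the assumed f[N] pattern."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    return sum(1 for key, _ in pairs if FACET_KEY.match(key))


def decide(query_string: str, limit: int, blocked_status: int = 410,
           can_bypass: bool = False):
    """Return the HTTP status to send, or None to let the request continue."""
    if blocked_status not in ALLOWED_BLOCK_STATUSES:
        raise ValueError("blocked_status must be 403 or 410")
    if can_bypass:
        # Hypothetical stand-in for a bypass permission check.
        return None
    if count_facets(query_string) > limit:
        return blocked_status
    return None
```

With a limit of 2, `decide("f[0]=color:red&f[1]=size:large&f[2]=brand:example", limit=2)` returns the block status because the request carries three facet parameters; with a limit of 3 the same request is allowed through.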

For high-traffic sites, the module optionally integrates with Memcache or Redis to store configuration and tracking metrics in memory rather than the database, ensuring minimal performance overhead even during bot attacks.

Features

  • Blocks HTTP requests that contain more facet query parameters (f[]) than a configurable limit
  • Choice between HTTP 410 Gone or HTTP 403 Forbidden response codes for blocked requests
  • Customizable HTML message displayed to blocked requesters
  • Dashboard with real-time metrics including blocked/allowed request counts, percentage of blocked requests, and details about the last blocked request (IP, path, user agent)
  • Bypass permission allowing authenticated users or specific roles to skip blocking
  • Automatic caching of configuration values when Memcache or Redis modules are installed for improved performance in high-traffic environments
  • Lightweight event subscriber with priority 101 for early request interception
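
The dashboard metrics listed above (blocked and allowed counts, blocked percentage, last blocked request) amount to a small set of counters. The sketch below is a hypothetical in-memory stand-in written in Python; the class and field names are illustrative, and it assumes the percentage is computed over all counted requests, which the module's documentation does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metrics:
    """Illustrative counters; the real module keeps its metrics in a cache backend."""
    blocked: int = 0
    allowed: int = 0
    last_blocked: Optional[dict] = None

    def record(self, was_blocked: bool, ip: str = "", path: str = "",
               user_agent: str = "") -> None:
        """Update the counters and remember details of the latest blocked request."""
        if was_blocked:
            self.blocked += 1
            self.last_blocked = {"ip": ip, "path": path, "user_agent": user_agent}
        else:
            self.allowed += 1

    def blocked_percentage(self) -> float:
        """Share of counted requests that were blocked (assumed denominator: all)."""
        total = self.blocked + self.allowed
        return 100.0 * self.blocked / total if total else 0.0
```

Because counters like these must survive between requests, they need shared storage, which is why a persistent cache backend matters for the dashboard.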

Use Cases

Protecting search-heavy e-commerce sites

E-commerce sites with extensive product catalogs often implement faceted search allowing filtering by multiple attributes (color, size, brand, price range, etc.). Bots can exploit this by systematically combining all possible filter values, generating millions of URLs. This module prevents such abuse by limiting the number of simultaneous filter parameters a single request can contain.
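
To see how quickly filter combinations grow, and how a depth limit caps them, the count can be worked out directly. The following Python sketch is a simplified model with made-up facet sizes: it assumes at most one value per facet and ignores parameter ordering, so real URL counts are typically higher.

```python
from itertools import combinations
from math import prod


def facet_url_count(values_per_facet, max_facets=None):
    """Count distinct filter combinations using 1..max_facets facets.

    values_per_facet: number of selectable values for each facet.
    max_facets: depth limit; None means every facet may be combined.
    """
    n = len(values_per_facet)
    depth = n if max_facets is None else min(max_facets, n)
    total = 0
    for k in range(1, depth + 1):
        # Every choice of k facets contributes the product of their value counts.
        for chosen in combinations(values_per_facet, k):
            total += prod(chosen)
    return total


# Hypothetical catalog: 10 colors, 6 sizes, 50 brands, 5 price ranges.
catalog = [10, 6, 50, 5]
unlimited = facet_url_count(catalog)
limited = facet_url_count(catalog, max_facets=2)
```

For this four-facet example, `unlimited` is 23561 and `limited` is 1261. Each additional facet multiplies the unlimited total, which is how larger catalogs reach millions of crawlable URLs.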

Mitigating SEO pollution from bot-generated URLs

When bots create requests with many facet combinations, these URLs may get indexed by search engines, diluting your site's SEO value with low-quality or duplicate content pages. Using the 410 Gone response option signals to search engines that these URLs should be removed from their index.

Reducing server load from aggressive crawlers

Aggressive bot crawling of faceted search pages can consume significant server resources, as each request may trigger database queries to filter results. By rejecting requests in an event subscriber, before the search query and page rendering run, this module minimizes the resources that abusive traffic consumes.

Protecting Solr/Elasticsearch backend from overload

Sites using Search API with Solr or Elasticsearch backends can experience search service overload when bots make excessive faceted queries. This module prevents those queries from ever reaching the search backend by blocking them early in the request cycle.

Tips

  • Start with a conservative facet limit (e.g., 2-3) and monitor the dashboard to see how many requests are being blocked. Adjust the limit based on your findings.
  • Use HTTP 410 Gone response code if SEO is a concern, as this helps search engines remove bot-generated URLs from their index faster than 403 Forbidden.
  • Grant the 'Bypass Facet Bot Blocker blocks' permission to authenticated users or editor roles who may legitimately need to use deep faceted filtering.
  • Install Memcache or Redis on production sites: the dashboard's request counters depend on one of them, and cached configuration reduces the module's overhead.
  • Review the 'Last blocked User Agent' in the dashboard to identify specific bots that are most aggressively crawling your faceted search pages.
  • Consider combining this module with other bot protection measures like rate limiting or CAPTCHA for comprehensive protection.

Technical Details

Admin Pages

Facet Bot Blocker Settings /admin/config/system/facet-bot-blocker

Configure the module's blocking behavior including the facet parameter limit threshold, HTTP response code for blocked requests, and the custom HTML message displayed to blocked users.

Facet Bot Blocker Dashboard /admin/reports/facet-bot-blocker

View real-time statistics and metrics about blocked and allowed faceted search requests. This dashboard provides visibility into how effectively the module is protecting your site from excessive bot crawling.

Permissions

Administer Facet Bot Blocker settings

Allow users to modify the settings of the Facet Bot Blocker module including the facet limit, response code, and blocked message. This permission is restricted and should only be granted to trusted administrators.

Access Facet Bot Blocker dashboard

Allow users to view the dashboard reports page showing blocked and allowed request statistics.

Bypass Facet Bot Blocker blocks

Allow users to bypass the blocking mechanism entirely. Requests from users with this permission will never be blocked regardless of how many facet parameters they contain. Useful for authenticated editors or administrators who may legitimately use deep faceted filtering.

Troubleshooting

Dashboard shows 0 blocked/allowed requests even though the module is enabled

Metrics tracking requires either the Memcache or the Redis module. Without one of them, the dashboard cannot persist request counters between page loads. Install drupal/memcache or drupal/redis to enable metrics tracking.

Legitimate users are being blocked when using faceted search

Increase the facet parameter limit in the module settings. Analyze your site's legitimate faceted search usage to determine the maximum number of simultaneous facet filters users typically apply, then set the limit slightly above that number.
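
One way to ground that analysis is to measure facet depth in URLs taken from traffic you already trust, such as requests from logged-in users. The Python sketch below is illustrative only: the helper names and the `headroom` parameter are hypothetical, and it assumes facet keys follow the `f[N]` pattern.

```python
import re
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

# Assumption: facet keys look like f[0], f[1], ... or f[].
FACET_KEY = re.compile(r"^f\[\d*\]$")


def facet_depth(url: str) -> int:
    """Number of facet parameters in a single request URL."""
    query = urlsplit(url).query
    pairs = parse_qsl(query, keep_blank_values=True)
    return sum(1 for key, _ in pairs if FACET_KEY.match(key))


def depth_histogram(urls) -> Counter:
    """Map facet depth -> number of requests observed at that depth."""
    return Counter(facet_depth(u) for u in urls)


def suggest_limit(urls, headroom: int = 1) -> int:
    """Deepest observed legitimate depth plus some headroom."""
    depths = [facet_depth(u) for u in urls]
    return (max(depths) if depths else 0) + headroom
```

Feeding it URLs extracted from your access logs shows how deep real users go; the histogram also reveals whether a handful of outliers is driving the maximum, in which case a lower limit combined with the bypass permission may fit better.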

Blocked requests are still appearing in server access logs

This is expected behavior. The module blocks requests after the web server has logged the request but before Drupal fully processes it. The module prevents database queries and content rendering, not the initial HTTP connection.

Configuration changes are not taking effect immediately

When using Memcache or Redis, configuration values are cached. Clear the Drupal cache (drush cr) after making configuration changes to ensure new values are used.

Security Notes

  • The 'Administer Facet Bot Blocker settings' permission is marked as restricted and should only be granted to trusted administrators, as it controls security-relevant blocking behavior.
  • The custom blocked message field accepts HTML content. Ensure only trusted administrators can modify this setting to prevent XSS vulnerabilities.
  • The module stores the IP address of blocked requests in the cache. Check this against your data retention policies if the information is kept long-term.