Wazuh ships with over 3,000 built-in rules. That sounds like plenty, until you need to detect something that isn't in the default ruleset. An obscure web shell signature in an IIS log. A WordPress scan hitting xmlrpc.php 40 times in 30 seconds from a single IP. An HAProxy log that's JSON-wrapped inside a Docker container and doesn't match any built-in decoder. When those moments arrive, you either write custom decoders and rules or you miss the alert.
This tutorial is built on Bayu Sangkaya's open-source repository, wazuh-custom-rules-and-decoders — production-grade decoders, rules, integrations, and active-response scripts from real SOC deployments. Bayu also maintains materi_wazuh, the best Wazuh training curriculum I've found in the Indonesian infosec community.
Everything in this tutorial is built on Bayu Sangkaya's work. The decoder and rule examples are lifted directly from his repository; I'm explaining them, not inventing them. If this article helps you, drop a star on his repo. Open-source security tools live and die on community support.
Why Built-In Rules Aren't Enough
Wazuh's default ruleset is broad: SSH brute force, Windows Event Log anomalies, file integrity changes, vulnerability scans. But "broad" is not the same as "complete." The defaults are designed to cover common attack patterns across generic environments. Your environment is not generic.
Here's what breaks: a web shell is dropped into /var/www/html/ on a Linux server. Wazuh's FIM rules fire: rule 554, "File added to the system." Level 5. Low severity, buried among every other file-creation event. No escalation, no correlation, no mention of "web shell." A human has to see rule 554, check the filename, recognize .php in the web root, and manually escalate. At 3 AM, that human doesn't exist.
A custom rule solves this. It watches for FIM events on files with scripting extensions such as .php, .aspx, and .jsp, and fires at level 12. It watches for file content changes containing eval, passthru, or shell_exec and fires at level 15. It tags the alert with MITRE T1505.003 (Server Software Component: Web Shell). Now you don't need a human to connect the dots; the rule does it at machine speed.
Custom rules turn "something happened" into "this specific threat happened, at this severity, on this host, right now." That's the difference between a log aggregator and a SIEM.
Decoders vs. Rules: The Two-Layer Model
Before writing anything, understand the data flow inside Wazuh's analysis engine:
Raw Log → [ Pre-decoding ]     → [ Decoders ]     → [ Rules ]          → Alert
          extract timestamp,     extract fields     match conditions,
          hostname, program      from raw text      assign severity,
          name first                                generate alert
Decoders parse raw log lines into structured fields. A raw HAProxy log like 192.168.1.10:54321 [15/Jun/2026:14:22:10] frontend_http backend_servers/server01 is meaningless to a rule engine. A decoder extracts the source IP, timestamp, frontend name, and backend name into named fields: $srcip, $accept_date, $frontend_name, $backend_name. After decoding, rules can reference these fields by name instead of regex-matching the same raw string over and over.
Rules evaluate decoded fields against conditions and fire alerts. They check if a field matches a pattern, if the same source IP triggered a certain rule multiple times in a time window, or if a decoded program name equals a suspicious value. Rules also assign severity (level 0–15), map to MITRE ATT&CK, and route alerts to specific groups for reporting.
The two are tightly coupled: a rule's <decoded_as> tag tells Wazuh which decoder must match first. If your decoder doesn't fire, your rule, no matter how perfectly written, will never trigger.
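To make the coupling concrete, here is a minimal Python sketch of the two layers. It is not how Wazuh is implemented; the DECODER and RULE dictionaries, the rule ID 100001, and the helper names are hypothetical stand-ins that only mirror the flow described above.

```python
import re

# Hypothetical stand-in for a decoder: a regex plus an ordered list of field names.
DECODER = {
    "name": "haproxy",
    "regex": re.compile(r"(\d+\.\d+\.\d+\.\d+):(\d+) \[(\S+)\] (\S+) (\S+)/(\S+)"),
    "order": ["srcip", "srcport", "accept_date",
              "frontend_name", "backend_name", "server_name"],
}

# Hypothetical stand-in for a rule: it only looks at logs decoded as "haproxy".
RULE = {
    "id": 100001,
    "decoded_as": "haproxy",
    "field": "frontend_name",
    "pattern": re.compile(r"^frontend_http$"),
}


def decode(raw_log):
    """Return (decoder_name, fields), or (None, {}) when the decoder does not match."""
    match = DECODER["regex"].search(raw_log)
    if match is None:
        return None, {}
    return DECODER["name"], dict(zip(DECODER["order"], match.groups()))


def evaluate(decoder_name, fields):
    """Return the rule ID if the rule fires, otherwise None."""
    if decoder_name != RULE["decoded_as"]:
        return None  # decoder gate failed, so the rule is never considered
    value = fields.get(RULE["field"], "")
    return RULE["id"] if RULE["pattern"].search(value) else None


line = "192.168.1.10:54321 [15/Jun/2026:14:22:10] frontend_http backend_servers/server01"
name, fields = decode(line)
fired = evaluate(name, fields)
missed = evaluate(*decode("unrelated log line"))
print(name, fields["srcip"], fired, missed)
```

The second call shows the failure mode from the paragraph above: when the decoder does not match, the rule is never evaluated, however well its pattern is written.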
Bayu Sangkaya's Repository Structure
Here's what's in the repo and where each piece goes on your Wazuh manager.
wazuh-custom-rules-and-decoders/
├── decoders/ # Custom XML decoders
│ ├── haproxy_decoder.xml # HAProxy log parsing (Docker + plain)
│ └── webshell_command_decoder.xml # Web shell network connection parser
├── rules/ # Custom XML rules
│ ├── 500500-webshell-rules.xml # Web shell detection (FIM + auditd + Sysmon)
│ ├── 841101-wpscan_rules.xml # WordPress scan detection (frequency-based)
│ ├── 500554-judol_rules.xml # Judol phishing kit detection
│ ├── 800001-haproxy_rules.xml # HAProxy attack detection (14 KB, extensive)
│ ├── 100620-misp_rules.xml # MISP threat feed integration
│ ├── 100625-opencti_rules.xml # OpenCTI threat feed integration
│ ├── Openbao-rules.xml # OpenBao secret manager monitoring
│ └── active-response.xml # Active response trigger rules
├── ESET-integration/ # ESET antivirus log forwarder
│ ├── 420010-eset_rules.xml # Rules for ESET detection events
│ ├── eset_logcollector.py # Python log collector daemon
│ └── eset_daemon.service # Systemd service unit
├── integration/ # Custom integration scripts
│ ├── custom-misp.py # MISP threat intelligence integration
│ ├── custom-thehive.py # TheHive case management integration
│ ├── custom-dfir_iris.py # DFIR-IRIS case management integration
│ └── custom-telegram.py # Telegram alert notification
├── active-response/ # Automated response scripts
│ ├── quarantine-malware.sh # Linux malware quarantine
│ ├── quarantine-webshell.sh # Linux web shell quarantine
│ ├── remove-malware.py # Cross-platform malware removal
│ └── remove-malware.exe # Windows malware removal binary
├── sysmon/ # Sysmon configuration & rules
│ ├── sysmonconfig.xml # Full Sysmon config (300 KB)
│ └── 255000-sysmon_rules.xml # Sysmon event correlation rules
├── audit/ # Auditd rules
│ └── 10-webshell.rules # Auditd syscall rules for web shells
├── browser-monitoring/ # Browser history monitoring
│ ├── browser-history-monitor.py # Python history collector
│ ├── installer-script.sh # Linux installer
│ └── windows-installer.ps1 # Windows installer
├── install-agent.sh # Interactive Linux agent installer
├── install-agent.ps1 # Interactive Windows agent installer
└── README.md
The naming convention tells you where things go: files in decoders/ land in /var/ossec/etc/decoders/ on the manager. Files in rules/ land in /var/ossec/etc/rules/. Integration scripts go to /var/ossec/integrations/. Active-response scripts go to /var/ossec/active-response/bin/ on the agent side.
The numeric prefixes on rule files (500500, 841101, 800001) are rule ID ranges. Wazuh reserves IDs below 100000 for its own ruleset and documents 100000 to 120000 as the range for custom rules. Bayu's numbering sits well above that and is deliberate: 500500 for web shells sits in a distinct namespace, 841101 for WPScan maps to his WAF rule range, and 800001 starts the HAProxy range. Don't reuse these IDs in your own custom rules unless you merge carefully.
Installation: Clone, Copy, Configure
Getting Bayu's rules and decoders onto your Wazuh manager takes five steps.
# 1. Clone the repository
cd /tmp
git clone https://github.com/bayusky/wazuh-custom-rules-and-decoders.git
cd wazuh-custom-rules-and-decoders
# 2. Copy decoders and rules to Wazuh directories
sudo cp decoders/*.xml /var/ossec/etc/decoders/
sudo cp rules/*.xml /var/ossec/etc/rules/
# 3. Copy integrations and active-response scripts
sudo cp -r integration/* /var/ossec/integrations/
sudo cp active-response/quarantine-malware.sh /var/ossec/active-response/bin/
sudo cp active-response/quarantine-webshell.sh /var/ossec/active-response/bin/
sudo cp active-response/remove-malware.py /var/ossec/active-response/bin/
# 4. Fix permissions — Wazuh runs as user 'wazuh'
sudo chown -R wazuh:wazuh /var/ossec/etc/decoders/
sudo chown -R wazuh:wazuh /var/ossec/etc/rules/
sudo chown -R wazuh:wazuh /var/ossec/integrations/
sudo chmod 750 /var/ossec/integrations/*.py
sudo chmod 750 /var/ossec/active-response/bin/*.sh
sudo chmod 750 /var/ossec/active-response/bin/*.py
# 5. Restart the Wazuh manager to load new decoders and rules
sudo systemctl restart wazuh-manager
Step 2 is critical: Wazuh's analysis engine reads every XML file in /var/ossec/etc/rules/ and /var/ossec/etc/decoders/ at startup. If any file has a syntax error (a missing closing tag, an unescaped &, a duplicate rule ID), the entire analysis engine fails to load. Your SIEM goes silent. No warning in the dashboard, no error visible in the UI. You only catch it by checking /var/ossec/logs/ossec.log.
Verify the restart succeeded:
# Check that analysisd is running
sudo /var/ossec/bin/wazuh-control status | grep analysisd
# Look for rule loading errors
sudo grep -i "error\|fail\|syntax" /var/ossec/logs/ossec.log | tail -20
# Count how many rules are loaded (should be >3000)
sudo grep -c "Rule loaded" /var/ossec/logs/ossec.log
If the rule count is lower than expected, or if you see "ERROR: Rule 'xxxxx' has duplicate id", you have an ID collision. Remove the conflicting file, fix the duplicate, and restart.
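Collisions are easier to catch before the restart than after it. The sketch below is one way to do that, assuming rule files live as *.xml in a single directory. The function name and the demo file names are hypothetical, and the regex scan is deliberately naive: it also counts commented-out rules and rules that duplicate an ID on purpose.

```python
import re
import tempfile
from collections import defaultdict
from pathlib import Path

# Matches the id attribute of an opening <rule ...> tag.
RULE_ID = re.compile(r'<rule\s+[^>]*\bid="(\d+)"')


def find_duplicate_rule_ids(rules_dir):
    """Map each rule ID that appears more than once to the files defining it."""
    seen = defaultdict(list)
    for path in sorted(Path(rules_dir).glob("*.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for rule_id in RULE_ID.findall(text):
            seen[int(rule_id)].append(path.name)
    return {rule_id: files for rule_id, files in seen.items() if len(files) > 1}


# Demo on throwaway files; on a manager you would pass /var/ossec/etc/rules.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a_rules.xml").write_text(
        '<group name="a,"><rule id="500500" level="12"></rule></group>')
    Path(tmp, "b_rules.xml").write_text(
        '<group name="b,"><rule id="500500" level="5"></rule>'
        '<rule id="600001" level="3"></rule></group>')
    duplicates = find_duplicate_rule_ids(tmp)

print(duplicates)
```

Running it against a staging copy of the rules directory before copying files into place keeps the check away from the production manager.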
Registering Custom Rules in ossec.conf
On a default Wazuh 4.x install, copying files into the directories is usually enough, because the <ruleset> block in /var/ossec/etc/ossec.conf already loads the user-defined directories. It is still worth confirming that those entries are present, and knowing how to include or exclude individual files:
<ossec_config>
<ruleset>
<!-- Default ruleset (already present) -->
<decoder_dir>ruleset/decoders</decoder_dir>
<rule_dir>ruleset/rules</rule_dir>
<list>etc/lists/audit-keys</list>
<list>etc/lists/security-eventchannel</list>
<!-- User-defined ruleset: auto-loads every XML file in these directories -->
<decoder_dir>etc/decoders</decoder_dir>
<rule_dir>etc/rules</rule_dir>
<!-- Alternative: load Bayu Sangkaya's rule files one at a time -->
<!-- <rule_include>etc/rules/500500-webshell-rules.xml</rule_include> -->
<!-- Only load the rules you actually need -->
<!-- Use <rule_exclude> to skip unused files and reduce processing overhead -->
</ruleset>
</ossec_config>
If the user-defined <rule_dir> and <decoder_dir> entries are missing, Wazuh ignores the files you copied, so check for them before changing anything else. If you would rather load files individually, replace the directory entries with one <rule_include> or <decoder_include> line per file.
Decoder Syntax: The Parsing Layer
A decoder tells Wazuh how to extract structured fields from an unstructured log line. The syntax is XML and the logic follows a parent-child hierarchy.
Core Decoder Elements
<decoder name="my_decoder">
<program_name>sshd</program_name> <!-- Match on syslog program name -->
<parent>ossec</parent> <!-- Inherit from parent decoder -->
<prematch offset="after_parent">^Failed</prematch> <!-- Coarse pattern to trigger -->
<regex offset="after_prematch" type="pcre2">(\S+) from (\S+)</regex> <!-- Field extraction -->
<order>user, srcip</order> <!-- Map regex groups to field names -->
</decoder>
name: A unique identifier. Rules reference decoders by this name via their <decoded_as> tag. If two decoders share the same name, Wazuh merges their fields, which is useful for multi-stage parsing of complex logs.
program_name: Matches against the syslog program field (the tag that follows the hostname in a standard syslog line). If your log comes in as Jan 15 10:30:00 webserver haproxy[1234]: 192.168..., the program_name is haproxy. This is the fastest way to route a log to the right decoder: if it doesn't match, Wazuh skips the decoder entirely.
parent: Specifies which decoder this one extends. A child decoder only fires after its parent has matched. This is how you build multi-stage parsing chains: the parent decoder does a coarse match ("this is an HAProxy log"), and children extract specific fields in sequence. The parent value can be another decoder's name or a built-in like ossec, json, or syslog.
prematch: A coarse regex that must match for the decoder to activate. The offset attribute controls where matching starts: after_parent means "search the remainder of the log after the parent decoder consumed its portion." Use prematch as a fast gate: if the log doesn't contain a specific keyword, skip the expensive full regex.
regex: The field extraction regex. Each capture group (...) maps to a field in the <order> list. The type="pcre2" attribute enables PCRE2 syntax (lookaheads, named groups, Unicode classes). Without it, Wazuh uses OS_regex (a simpler, faster engine). For production rules, prefer PCRE2; it supports constructs that OS_regex cannot express.
order: A comma-separated list of field names, one per regex capture group. After matching, Wazuh stores the values as $fieldname variables accessible in rules. The order must match the capture groups exactly: if your regex has 4 groups and your order lists 3 fields, the fourth group's value is silently discarded.
type: Optional. Setting <type>web-log</type> tells Wazuh to apply web-specific post-processing (URL normalization, query string parsing). Use this for Apache, Nginx, or HAProxy access logs.
use_own_name: Rare but important. When set to true, a child decoder reports its own name as the decoded_as value instead of inheriting the parent's. Critical for Bayu's HAProxy Docker decoder, where the parent is json but rules need to reference haproxy-docker.
Parent-Child Chain Example (HAProxy)
Bayu's HAProxy decoder demonstrates a real multi-stage chain. Here's a simplified version:
<!-- Stage 1: Identify this as an HAProxy log -->
<decoder name="haproxy">
<program_name>haproxy</program_name>
<prematch>\d+.\d+.\d+.\d+:\d+ \S+ \S+</prematch>
</decoder>
<!-- Stage 2: Extract client info -->
<decoder name="haproxy1">
<parent>haproxy</parent>
<regex type="pcre2">(\d+.\d+.\d+.\d+):(\d+) \[(\S+)\] (\S+) (\S+)/(\S+) (\S+)</regex>
<order>srcip, srcport, accept_date, frontend_name, backend_name, server_name, timer</order>
</decoder>
<!-- Stage 3: Extract termination and connection stats -->
<decoder name="haproxy1">
<parent>haproxy</parent>
<regex type="pcre2">- - (\S+) (\d+/\d+/\d+/\d+/\d+) (\d+)/(\d+)</regex>
<order>termination_state, connections, server_queue, backend_queue</order>
</decoder>
<!-- Stage 4: Extract headers and URL -->
<decoder name="haproxy1">
<parent>haproxy</parent>
<type>web-log</type>
<regex type="pcre2">\{(.*)\} "(.*)"</regex>
<order>headers, url</order>
</decoder>
Notice that all three children share the name haproxy1: they all extend the same parent haproxy, and their fields accumulate. After all four stages run, a single HAProxy log has been parsed into $srcip, $srcport, $frontend_name, $backend_name, $termination_state, $url, and more, all available to rules as named fields.
HAProxy logs can be 200+ characters. A single regex matching everything would be unreadable, unmaintainable, and brittle: one format change breaks the entire parser. By breaking extraction into stages, you can add new fields without touching existing regexes. If HAProxy adds a new log format field, you add one child decoder and the rest keep working.
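The accumulation behaviour can be sketched outside Wazuh in a few lines of Python. This is an illustration only: the STAGES list and decode_in_stages are hypothetical names, Python's re module stands in for PCRE2, and the dots in the IP pattern are escaped here even though the decoders above leave them unescaped.

```python
import re

# Each hypothetical stage mirrors one child decoder: a regex and its <order> list.
STAGES = [
    (re.compile(r"(\d+\.\d+\.\d+\.\d+):(\d+) \[(\S+)\] (\S+) (\S+)/(\S+) (\S+)"),
     ["srcip", "srcport", "accept_date", "frontend_name",
      "backend_name", "server_name", "timer"]),
    (re.compile(r"- - (\S+) (\d+/\d+/\d+/\d+/\d+) (\d+)/(\d+)"),
     ["termination_state", "connections", "server_queue", "backend_queue"]),
    (re.compile(r'\{(.*)\} "(.*)"'),
     ["headers", "url"]),
]


def decode_in_stages(log_line):
    """Run every stage against the same line and merge whatever each one extracts."""
    fields = {}
    for regex, order in STAGES:
        match = regex.search(log_line)
        if match:  # a stage that fails leaves the earlier fields intact
            fields.update(zip(order, match.groups()))
    return fields


sample = ('192.168.1.10:54321 [15/Jun/2026:14:22:10.123] frontend_http '
          'backend_servers/server01 0/0/1/45/46 200 1234 - - ---- 1/1/0/0/0 0/0 '
          '{} "GET /api/health HTTP/1.1"')
fields = decode_in_stages(sample)
truncated = decode_in_stages(sample.split(" - - ")[0])  # only stage 1 can match
print(len(fields), len(truncated), fields["url"])
```

The truncated line still yields the seven stage-one fields, which is the practical benefit of staging: a format change in one part of the log costs you only the fields from that stage.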
Rule Syntax: The Detection Layer
Rules are where detection logic lives. They evaluate decoded fields, apply thresholds, and generate alerts.
<group name="attack_category, platform,">
<rule id="100001" level="10">
<decoded_as>json</decoded_as> <!-- Which decoder must match -->
<if_sid>530</if_sid> <!-- Parent rule that must fire first -->
<field name="data.virus">yes</field> <!-- Field match (OS_regex by default) -->
<field name="data.action" type="pcre2">alert</field> <!-- Regex field match -->
<description>Antivirus alert: $(data.virus)</description>
<mitre>
<id>T1204</id> <!-- MITRE ATT&CK technique -->
</mitre>
<group>pci_dss_11.4,nist_800_53_SI.4,</group> <!-- Compliance mapping -->
</rule>
</group>
Key Rule Elements
<group>: Wraps related rules. The name attribute is a comma-separated list of tags used for filtering in the dashboard. Multiple groups can exist in one file. The trailing comma is intentional and follows the convention used throughout Wazuh's own ruleset.
id: Unique numeric identifier. This is how rules reference each other. Wazuh's built-in rules use IDs below 100000; custom rules start at 100000. Pick a range and stay in it to avoid collisions with future Wazuh updates.
level: Severity 0–15. The most consequential decision in a rule:
Level   Meaning                   When to Use
──────  ────────────────────────  ──────────────────────────────────────────
0–4     Ignored / Informational   System events, heartbeat
5–6     Low                       User login success, config change
7–9     Medium                    Failed login, port scan, policy violation
10–11   High                      Malware detected, suspicious process
12–13   Critical                  Web shell, privilege escalation, C2 beacon
14–15   Emergency                 Active compromise confirmed, data exfil
<decoded_as>: The decoder name that must match before this rule evaluates. If the log wasn't decoded as json, a rule with <decoded_as>json</decoded_as> never fires. This is your primary routing mechanism.
<if_sid>: A parent rule ID that must fire first. The web shell rule 500500 uses <if_sid>554</if_sid>; rule 554 is the built-in "File added to the system" FIM rule. Rule 500500 only evaluates if 554 has already matched. This creates a dependency chain: 554 fires on any file creation, then 500500 checks if the file extension looks like a script. A parallel chain handles modifications: 550 fires on any checksum change, 500501 checks the extension, and 500502 checks if the changed content contains web shell functions.
<if_matched_sid>: Different from if_sid. This is used with frequency and timeframe for correlation rules. Example: "If rule 841101 (WPScan detected) fired 14 times from the same source IP in 30 seconds, escalate to rule 841151."
<field>: Matches a decoded field against a pattern. The name attribute references a decoded field like srcip, url, or win.eventdata.commandLine. Without type, Wazuh uses OS_regex (faster but less capable); type="osregex" selects the same engine explicitly. With type="pcre2", it does PCRE2 regex matching.
<match>: Searches the entire original log line. Less efficient than <field> but necessary when you need to match something outside the decoded fields, or when no decoder is available.
<description>: The alert text. Use $(field_name) to inject decoded field values. Good descriptions answer "what happened, to what, and why should I care." Bad descriptions say "Rule 500500 fired."
frequency and timeframe: Correlation counters. frequency="14" timeframe="30" means "this rule fires once the rule named in <if_matched_sid> has matched 14 times within 30 seconds." Always paired with <if_matched_sid>, and usually with <same_source_ip /> (or one of the other same_* constraints in the Wazuh rules syntax reference).
<mitre>: MITRE ATT&CK technique IDs. Add at least one ID per rule. When your alert fires, the Wazuh dashboard automatically maps it to the ATT&CK matrix. This is free context, so use it.
Example 1: Web Shell Detection (The Full Stack)
This is Bayu's flagship detection: a multi-layered web shell detector covering file creation, file modification, file content analysis, command execution, and network connections. It spans a decoder and rules across Linux, Windows, auditd, and Sysmon.
The Decoder
<decoder name="network-traffic-child">
<parent>ossec</parent>
<prematch offset="after_parent">^output: 'webshell connections':</prematch>
<regex offset="after_prematch" type="pcre2">(\d+.\d+.\d+.\d+):(\d+)\|(\d+.\d+.\d+.\d+):(\d+)</regex>
<order>local_ip, local_port, foreign_ip, foreign_port</order>
</decoder>
Line-by-line:
- Name: network-traffic-child, a child decoder for network connection logs. The parent is ossec, meaning it inherits from Wazuh's built-in log parser (extracting timestamp, hostname, etc.).
- Pre-match: Searches for the literal string output: 'webshell connections': in the portion of the log that remains after the parent decoder's match. The caret ^ anchors the match to the start of that remainder, which is important for precision and avoids matching internal substrings.
- Regex: Four capture groups extracting local_ip:local_port|foreign_ip:foreign_port. The pipe | is a literal delimiter in the log format, escaped as \| in the pattern. Since the regex uses PCRE2, the unescaped dot . matches any character, not only the dots in an IP address; be aware of this if you need strict IP validation.
- Order: Maps the four groups to the named fields $local_ip, $local_port, $foreign_ip, $foreign_port, directly usable in rules.
This decoder is designed to parse output from a custom script that runs on web servers and logs when a web process (like w3wp.exe or php-fpm) opens a network connection, a strong indicator of a web shell phoning home.
The Rules
Bayu's web shell rules are the most thorough in the repo. Here are three key ones:
<rule id="500500" level="12">
<if_sid>554</if_sid>
<field name="file" type="pcre2">(?i).php$|.phtml$|.php3$|.php4$|
.php5$|.phps$|.phar$|.asp$|.aspx$|.jsp$|.cshtml$|.vbhtml$</field>
<description>[File creation]: Possible web shell scripting file
($(file)) created</description>
<mitre>
<id>T1105</id>
<id>T1505</id>
</mitre>
</rule>
How it works: Rule 554 ("File added to the system") fires for every new file. Rule 500500 inherits from 554, then checks if the filename matches a web scripting extension. The (?i) flag makes the match case-insensitive, so .PHP, .Php, and .php all match. Level 12 means this goes straight to the SOC queue.
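One detail of the extension pattern is worth checking for yourself: the dots are unescaped, so they match any character. The sketch below uses Python's re module as a stand-in for PCRE2 (the two agree on this point) with a shortened version of the pattern. The strict variant and the file paths are hypothetical, not taken from the repository.

```python
import re

# Shortened pattern in the style of the rule: the dot is a wildcard, not a period.
loose = re.compile(r"(?i).php$|.aspx$|.jsp$")
# Hypothetical stricter variant: escaped dot and one grouped alternation.
strict = re.compile(r"(?i)\.(php|aspx|jsp)$")

paths = [
    "/var/www/html/shell.PHP",     # real script extension, upper case
    "/var/www/html/notes_php",     # no extension at all
    "/var/www/html/index.html",    # harmless static file
]
results = {p: (bool(loose.search(p)), bool(strict.search(p))) for p in paths}
for path, (loose_hit, strict_hit) in results.items():
    print(f"{path}: loose={loose_hit} strict={strict_hit}")
```

The loose form errs toward extra alerts rather than missed ones, which may be acceptable for a web root; escaping the dot is a small change if the extra matches become noise.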
But file creation alone isn't enough. An attacker might modify an existing legitimate PHP file to add a one-liner web shell:
<!-- Detects modification of scripting files -->
<rule id="500501" level="12">
<if_sid>550</if_sid>
<field name="file" type="pcre2">(?i).php$|.phtml$|.asp$|.aspx$|.jsp$</field>
<description>[File modification]: Possible web shell content
added in $(file)</description>
<mitre><id>T1105</id><id>T1505</id></mitre>
</rule>
<!-- Escalates if the modification contains web shell functions -->
<rule id="500502" level="15">
<if_sid>500501</if_sid>
<field name="changed_content" type="pcre2">(?i)passthru|exec|eval|
shell_exec|assert|str_rot13|system|phpinfo|base64_decode|chmod|
mkdir|fopen|fclose|readfile|show_source|proc_open|pcntl_exec|
execute|WScript.Shell|WScript.Network|FileSystemObject|Adodb.stream</field>
<description>[File Modification]: File $(file) contains a
web shell</description>
<mitre><id>T1105</id><id>T1505.003</id></mitre>
</rule>
Rule 500501 inherits from 550 ("File integrity checksum changed"). Rule 500502 inherits from 500501, giving a three-level chain: 550 → 500501 → 500502. If a scripting file was modified AND the changed content contains eval, passthru, base64_decode, or any of the 19 other indicators in the list, it fires at level 15: maximum severity, wake-someone-up territory. The MITRE mapping narrows from general T1505 to specific T1505.003 (Web Shell).
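The chain logic can be traced with a short Python sketch. It is a simplified model, not Wazuh's engine: the extension pattern is a tidied-up version of the one in rule 500501, SHELL_FUNCS holds only a subset of the indicators from rule 500502, and classify_fim_change and the sample events are hypothetical.

```python
import re

SCRIPT_EXT = re.compile(r"(?i)\.(php|phtml|asp|aspx|jsp)$")
# Subset of the indicator list, for illustration only.
SHELL_FUNCS = re.compile(r"(?i)passthru|shell_exec|eval|base64_decode|proc_open")


def classify_fim_change(event):
    """Return the rule IDs that would fire for a modified file, generic first."""
    fired = [550]  # every checksum change matches the built-in parent
    if not SCRIPT_EXT.search(event["file"]):
        return fired  # 500501 fails, so 500502 is never evaluated
    fired.append(500501)
    if SHELL_FUNCS.search(event.get("changed_content", "")):
        fired.append(500502)
    return fired


webshell = classify_fim_change(
    {"file": "/var/www/html/index.php", "changed_content": "<?php eval($_POST['c']); ?>"})
benign_edit = classify_fim_change(
    {"file": "/var/www/html/index.php", "changed_content": "echo 'hello';"})
not_a_script = classify_fim_change(
    {"file": "/etc/hosts", "changed_content": "eval"})
print(webshell, benign_edit, not_a_script)
```

The last case shows why the chain matters: the word eval in a non-script file never reaches the level 15 rule. Keep in mind that short tokens in the full list match inside longer words (exec matches curl_exec, for example), so expect some level 15 alerts on legitimate code edits.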
But what about web shells that don't drop files? What if the attacker exploits a command injection in an existing application? That's where the Sysmon and auditd rules come in:
<rule id="500530" level="12">
<if_sid>61603</if_sid>
<field name="win.eventdata.parentImage" type="pcre2">(?i)w3wp\.exe</field>
<field name="win.eventdata.parentUser" type="pcre2">
(?i)IIS\sAPPPOOL\\\\DefaultAppPool</field>
<description>[Command execution ($(win.eventdata.commandLine))]:
Possible web shell attack detected</description>
<mitre><id>T1505.003</id><id>T1059.004</id></mitre>
</rule>
This fires when Wazuh rule 61603 (the built-in rule for Sysmon Event ID 1, process creation) shows a parent process of w3wp.exe (IIS worker process) running as IIS APPPOOL\DefaultAppPool: the exact conditions of a web server spawning a child process. That's not normal behavior unless you've explicitly configured IIS to execute CGI scripts. Combined with the command line in the alert description $(win.eventdata.commandLine), an analyst sees exactly what command the attacker ran.
This layered approach (FIM for file drops, content analysis for injections, Sysmon/auditd for command execution, and network connection monitoring for C2) means a web shell has to evade four independent detection layers to go unnoticed. Most don't.
Example 2: WPScan Detection with Frequency Correlation
WordPress scanning is noisy by nature: a single WPScan run generates dozens of 404s and 403s. A naive rule that fires on every wp-admin hit creates alert fatigue. Bayu's approach uses frequency-based correlation: fire a moderate, level 7 alert on each scan signature, then escalate only when the pattern repeats from the same source IP.
<group name="wpscan,web,accesslog,">
<!-- Base rule: fires on any WP-specific URL with a 4xx response -->
<rule id="841101" level="7">
<if_sid>800001,800002,31100</if_sid>
<id>^4</id>
<url>wp-includes|wp-login|wp-admin|wp-|wordpress|xmlrpc.php</url>
<description>WP scanning detected</description>
<group>attack,pci_dss_6.5,pci_dss_11.4,</group>
</rule>
<!-- Escalation: fires when base rule hits 14x from same IP in 30s -->
<rule id="841151" level="10" frequency="14" timeframe="30">
<if_matched_sid>841101</if_matched_sid>
<same_source_ip />
<description>Multiple WP scan detected from same source ip.</description>
<mitre><id>T1595.002</id></mitre>
<group>web_scan,recon,attack,</group>
</rule>
</group>
How the base rule (841101) works:
- <if_sid>800001,800002,31100</if_sid>: The rule inherits from three parent rules. Rules 800001 and 800002 are Bayu's own HAProxy access log rules (covering HTTP and HTTPS). Rule 31100 is Wazuh's built-in web access log rule. This means the rule works whether your web logs come through HAProxy or directly from Apache/Nginx: one rule, multiple log sources.
- <id>^4</id>: Matches HTTP 4xx response codes. WPScan probes paths that mostly return 403 Forbidden or 404 Not Found. A 200 OK on xmlrpc.php is normal; a 404 on wp-content/plugins/revslider/ isn't.
- <url>: Matches URLs containing WordPress-specific keywords. The pipe-separated list is implicitly an OR match. wp- catches all REST API and plugin paths.
How the escalation rule (841151) works:
- frequency="14" timeframe="30": Rule 841101 must fire at least 14 times within 30 seconds.
- <if_matched_sid>841101</if_matched_sid>: Counts occurrences of rule 841101. Different from <if_sid>, which requires the parent to fire once. if_matched_sid with frequency means "count how many times the parent fired."
- <same_source_ip />: All 14 matches must come from the same IP. Without this, 14 different IPs hitting WordPress paths (normal internet noise) would trigger a false positive.
The escalation fires at level 10, high enough to warrant investigation, not high enough to wake anyone up. It maps to MITRE T1595.002 (Active Scanning: Vulnerability Scanning), giving your SOC the exact technique to start their investigation.
14 matches in 30 seconds works for Bayu's environment. In yours, a single WPScan run might generate 30 requests in 20 seconds, or 5 requests in 60 seconds. Watch your logs during a test scan and set frequency to roughly half the scan volume: enough to catch real scans without false positives from bots casually probing xmlrpc.php.
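When tuning those two numbers, it can help to replay timestamps from a test scan through a small model of the counter. The sketch below is a plain sliding window keyed by source IP. The FrequencyCorrelator class is hypothetical, and Wazuh's exact counting and reset behaviour is an assumption here, not something this code reproduces faithfully.

```python
from collections import defaultdict, deque


class FrequencyCorrelator:
    """Sliding-window counter approximating frequency/timeframe with same_source_ip."""

    def __init__(self, frequency, timeframe):
        self.frequency = frequency
        self.timeframe = timeframe
        self.hits = defaultdict(deque)  # source IP -> timestamps of base-rule hits

    def observe(self, srcip, timestamp):
        """Record one base-rule hit; return True when the escalation would fire."""
        window = self.hits[srcip]
        window.append(timestamp)
        # Drop hits that have aged out of the timeframe.
        while window and timestamp - window[0] > self.timeframe:
            window.popleft()
        if len(window) >= self.frequency:
            window.clear()  # assumed behaviour: start counting again after escalating
            return True
        return False


correlator = FrequencyCorrelator(frequency=14, timeframe=30)
# One scanner sending a request every second.
scanner = [correlator.observe("203.0.113.5", t) for t in range(14)]
# Fourteen unrelated IPs, one request each: background internet noise.
noise = [correlator.observe(f"198.51.100.{i}", i) for i in range(14)]
# One slow prober sending a request every five seconds.
slow = [correlator.observe("192.0.2.77", t * 5) for t in range(14)]
print(scanner.index(True), any(noise), any(slow))
```

Only the fast scanner escalates, on its fourteenth hit. Swapping in timestamps from your own access logs gives a quick read on whether a candidate frequency and timeframe would have caught a real scan.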
Example 3: HAProxy Docker JSON Decoder
HAProxy inside Docker outputs JSON-formatted logs. Wazuh's built-in HAProxy decoder expects plain-text syslog format and fails silently on JSON. Bayu wrote a decoder that handles both formats:
<decoder name="haproxy-docker">
<parent>json</parent>
<use_own_name>true</use_own_name>
<prematch offset="after_parent">^log":"(\d+.\d+.\d+.\d+):(\d+)
\[(\S+)\] (\S+) (\S+)/(\S+)</prematch>
<regex offset="after_parent" type="pcre2">^log":"(\d+.\d+.\d+.\d+):
(\d+) \[(\S+)\] (\S+) (\S+)/(\S+) (\d+/\d+/\d+/\d+/\d+) (\S+) (\S+)
- - (\S+) (\d+/\d+/\d+/\d+/\d+) (\d+)/(\d+) \{(.*)\} \\"(\w+ \S+)</regex>
<order>srcip, srcport, accept_date, frontend_name, backend_name,
server_name, timer, id, response_length, termination_state,
connections, server_queue, backend_queue, headers, url</order>
</decoder>
Why this decoder exists: In Docker, HAProxy logs go to stdout and get wrapped in JSON by the logging driver. The raw log looks something like:
{"log":"192.168.1.10:54321 [15/Jun/2026:14:22:10.123] frontend_http
backend_servers/server01 0/0/1/45/46 200 1234 - - ---- 1/1/0/0/0
0/0 { } \"GET /api/health HTTP/1.1\"","stream":"stdout","time":
"2026-06-15T14:22:10Z"}
The json parent decoder extracts the top-level JSON structure. Then haproxy-docker inherits from it and searches for the log field, which contains the actual HAProxy log line. The use_own_name flag is critical: without it, rules that reference <decoded_as>haproxy-docker</decoded_as> would fail because Wazuh would use the parent's name (json) instead.
The regex extracts 15 fields in a single pass, from source IP and port through to the HTTP method and URL. This is only feasible because the parent json decoder already handled the JSON parsing, so the child only deals with the raw HAProxy portion.
If you're running any application in Docker with JSON logging, your logs arrive double-encoded. The outer layer is JSON (container runtime), and the inner layer is whatever format the application produces (syslog, CEF, custom). A common mistake is writing a decoder that tries to handle both layers in one regex. Instead, always use json as the parent decoder, then write a child that handles just the inner format. This is exactly what Bayu did.
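The two-layer idea is easy to prototype before writing the decoder XML. The following Python sketch peels the layers in the same order: JSON first, then a regex over the inner line only. The INNER pattern is a reduced, hypothetical version of the decoder regex that extracts eight fields, and the sample record is built with json.dumps so the escaping is guaranteed to be valid.

```python
import json
import re

# Reduced pattern for the inner HAProxy line only; the JSON wrapper is already gone.
INNER = re.compile(
    r'^(\d+\.\d+\.\d+\.\d+):(\d+) \[(\S+)\] (\S+) (\S+)/(\S+) .* "(\w+) (\S+)')
ORDER = ["srcip", "srcport", "accept_date", "frontend_name",
         "backend_name", "server_name", "method", "url"]


def decode_docker_haproxy(raw):
    """Return the extracted fields, or None if either layer fails to parse."""
    try:
        outer = json.loads(raw)                  # layer 1: container runtime JSON
    except json.JSONDecodeError:
        return None
    if not isinstance(outer, dict):
        return None
    match = INNER.search(outer.get("log", ""))   # layer 2: application format
    return dict(zip(ORDER, match.groups())) if match else None


inner_line = ('192.168.1.10:54321 [15/Jun/2026:14:22:10.123] frontend_http '
              'backend_servers/server01 0/0/1/45/46 200 1234 - - ---- 1/1/0/0/0 0/0 '
              '{} "GET /api/health HTTP/1.1"')
raw = json.dumps({"log": inner_line, "stream": "stdout",
                  "time": "2026-06-15T14:22:10Z"})
result = decode_docker_haproxy(raw)
print(result["srcip"], result["method"], result["url"])
```

Because the regex never sees quotes escaped by the JSON layer, it stays much shorter than a single pattern that tries to handle both layers at once.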
Testing with wazuh-logtest
Never restart the Wazuh manager to test a new rule. A syntax error kills the analysis engine, and all agents go silent until you fix it. Use wazuh-logtest instead: it evaluates your rules and decoders in an isolated test session without generating alerts in the running manager.
# Start an interactive logtest session
sudo /var/ossec/bin/wazuh-logtest
# Optional: -d enables debug output (it does not start a daemon)
sudo /var/ossec/bin/wazuh-logtest -d
Once connected, paste a raw log line and press Enter. The tool outputs the decoded fields, matched rules, and final alert, all without touching the production engine.
** Pasting a FIM event where shell.php was added to /var/www/html
** Phase 1: Completed pre-decoding
** Phase 2: Completed decoding
name: 'syscheck'
file: '/var/www/html/shell.php'
** Phase 3: Completed filtering (rules)
Rule id: '554' fired → Level 5
Rule id: '500500' fired → Level 12
Description: [File creation]: Possible web shell scripting file
(/var/www/html/shell.php) created
MITRE: T1105, T1505
** Alert to be generated
What to look for in logtest output:
- Phase 1 completed: Pre-decoding worked; timestamp, hostname, and program name were extracted.
- Phase 2 completed: Your decoder matched and extracted fields. If Phase 2 shows no decoder match, your <program_name> or <prematch> regex is wrong.
- Phase 3 completed: Your rule(s) fired. Check that the fired rule ID matches what you expected. If a parent rule fired but your child didn't, the child's <field> or <match> condition isn't satisfied.
Testing frequency-based rules: Paste the same log line repeatedly, incrementing the count. After reaching the frequency threshold within the timeframe, the escalation rule should fire. If it doesn't, check that you're not exceeding the timeframe between pastes; logtest respects real time, not simulated time.
logtest only sees rules and decoders that are in the standard directories (/var/ossec/etc/rules/, /var/ossec/etc/decoders/). If you copied files to a test directory and ran logtest with custom paths, it won't find them. Copy test rules to the production directories, run logtest, then remove them if they fail. Don't leave a broken file in place: as described earlier, a file that fails to parse stops the analysis engine from loading on the next restart.
Common Mistakes and How to Fix Them
These are the issues I've hit repeatedly, along with the exact fix for each.
1. XML Syntax Errors: The Silent Killer
Symptom: Manager restarts, appears healthy, but analysisd doesn't load. No alerts appear. /var/ossec/logs/ossec.log contains "ERROR: Could not load rules."
Root cause: Unescaped XML characters, a missing closing tag, or a duplicate rule ID. & must become &amp;. < inside a regex must become &lt;.
Fix: Validate XML before deployment:
sudo apt install libxml2-utils -y
xmllint --noout /var/ossec/etc/rules/your-rule-file.xml
If xmllint complains, fix the error before deploying. A single unescaped ampersand in a regex like &search= will break the entire ruleset.
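One caveat: a rule file with several top-level <group> elements is valid for Wazuh but not a well-formed XML document on its own, so a strict parser may reject it for that reason alone. A possible workaround, sketched below in Python, is to wrap the file contents in a synthetic root element before parsing. The check_rule_xml helper is hypothetical, and the sketch assumes the file has no XML declaration at the top.

```python
import xml.etree.ElementTree as ET


def check_rule_xml(text):
    """Return None if the text parses as XML fragments, else the parser's message."""
    try:
        # Wrapping on the same line keeps reported line numbers aligned with the file.
        ET.fromstring(f"<root>{text}</root>")
    except ET.ParseError as exc:
        return str(exc)
    return None


good = ('<group name="web,"><rule id="100001" level="5">'
        '<match>a &amp; b</match><description>ok</description></rule></group>')
bad = good.replace("&amp;", "&")          # simulate an unescaped ampersand
unclosed = good.replace("</rule>", "")    # simulate a missing closing tag

print(check_rule_xml(good))
print(check_rule_xml(good + good))        # two top-level groups are accepted
print(check_rule_xml(bad))
print(check_rule_xml(unclosed))
```

To check a real file, pass in the result of reading it as text. The check covers well-formedness only; it knows nothing about Wazuh's rule schema or duplicate IDs.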
2. Rule ID Collisions
Symptom: "ERROR: Rule '500500' has duplicate id."
Root cause: You copied Bayu's rules but also have your own rules that start at 500000.
Fix: Choose a unique ID range. Use 600000+ for your own custom rules, or rename Bayu's rules to a range that doesn't conflict. If you rename, update every <if_sid> and <if_matched_sid> that references the old IDs.
3. Decoder Never Fires
Symptom: logtest shows "Phase 2 completed" but your decoder's fields aren't extracted. Rules with <decoded_as>my_decoder</decoded_as> never fire.
Root cause: The <program_name> doesn't match, or the <prematch> regex doesn't match the log format, or the child decoder references a parent that doesn't exist.
Fix: In logtest, look at which decoder did match in Phase 2. If the log was decoded as syslog instead of haproxy, your program_name or prematch is wrong. Check the raw log carefully: is HAProxy running under a different process name? Is the log format slightly different from what the regex expects?
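A quick way to see what program name a raw line actually carries is to pull it out of the syslog header yourself. The sketch below approximates a classic syslog header with a Python regex; it is not Wazuh's pre-decoder, and the sample log lines are made up for illustration. If it returns None, the line has no syslog header at all (a JSON-wrapped Docker log, for example), so a <program_name> match has nothing to work with.

```python
import re
from typing import Optional

# Approximation of a classic syslog header: "Mon DD HH:MM:SS host program[pid]: "
SYSLOG_HEADER = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<hostname>\S+)\s"
    r"(?P<program>[^\s\[:]+)(?:\[\d+\])?:\s"
)


def program_name(line: str) -> Optional[str]:
    """Return the program name from a syslog-style line, or None if there is no header."""
    match = SYSLOG_HEADER.match(line)
    return match.group("program") if match else None


# Hypothetical sample lines.
syslog_line = "Dec 10 01:02:03 web01 haproxy[1234]: 10.0.0.5:51234 frontend backend/srv 200 512"
docker_line = '{"log":"10.0.0.5:51234 frontend backend/srv 200 512","stream":"stdout"}'

print(program_name(syslog_line))  # haproxy
print(program_name(docker_line))  # None
```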
4. PCRE2 Regex Doesn't Match
Symptom: A <field type="pcre2"> never matches, even though the field clearly contains the expected value.
Root cause: A PCRE2 pattern does exactly what it says, which is not always what you meant. An unescaped dot . matches any character, including backslashes and special chars, so write \. when you mean a literal dot. If your field contains backslashes (like Windows paths), you need to double-escape: \\\\ in XML becomes \\ in the regex, which matches a single backslash.
Fix: Test your regex separately before putting it in a rule. A quick Python one-liner:
python3 -c "import re; print(bool(re.search(r'(?i)w3wp\\.exe', 'C:\\\\Windows\\\\System32\\\\inetsrv\\\\w3wp.exe')))"
If this prints False, your regex is broken. Tweak it until it prints True, then paste the corrected pattern into your rule XML. Python's re module is not PCRE2, but for simple patterns like this one the two behave the same.
5. Frequency Rules Never Escalate
Symptom: The base rule fires repeatedly, but the escalation rule never triggers.
Root cause: (a) The timeframe is too short for your log volume. (b) You used <if_sid> instead of <if_matched_sid>. (c) Missing <same_source_ip /> when logs come from different IPs.
Fix: (a) Increase timeframe incrementally and test. (b) For correlation rules, always use <if_matched_sid> with frequency; <if_sid> is for simple parent-child chains. (c) Add <same_source_ip /> if you're correlating by attacker IP.
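To build intuition for how frequency and timeframe interact, here is a simplified sliding-window model of a rule that uses <same_source_ip />. It is a sketch of the idea only, not Wazuh's implementation: the function name, the sample events, and the choice to reset the counter after each escalation are all assumptions, and Wazuh's exact counting may differ.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, List, Tuple


def escalations(events: Iterable[Tuple[float, str]], frequency: int,
                timeframe: float) -> List[Tuple[float, str]]:
    """Return (timestamp, srcip) for each event that brings one IP to `frequency` hits.

    Events must be sorted by timestamp. Hits are counted per source IP.
    """
    windows: Dict[str, Deque[float]] = defaultdict(deque)
    fired: List[Tuple[float, str]] = []
    for ts, srcip in events:
        window = windows[srcip]
        window.append(ts)
        # Drop hits that have fallen out of the timeframe.
        while ts - window[0] > timeframe:
            window.popleft()
        if len(window) >= frequency:
            fired.append((ts, srcip))
            window.clear()  # assumption: start counting again after an escalation
    return fired


# Hypothetical base-rule hits: (seconds, source IP).
events = [
    (0, "203.0.113.7"), (5, "198.51.100.9"), (10, "203.0.113.7"),
    (50, "203.0.113.7"), (55, "203.0.113.7"), (58, "203.0.113.7"),
]

print(escalations(events, frequency=3, timeframe=30))  # [(58, '203.0.113.7')]
print(escalations(events, frequency=3, timeframe=5))   # []
```

With a 30-second timeframe, the three hits at 50, 55, and 58 seconds land in one window and escalate. With a 5-second timeframe the same traffic never does, which is cause (a) in miniature.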
Beyond Rules: The Full Repo
Bayu's repository goes deeper than just decoders and rules. Here's what else you should explore:
- ESET Integration: A Python daemon that collects ESET antivirus logs and forwards them to Wazuh, paired with 400 KB of detection rules (that's 400 KB of XML, one of the largest single rule files in any open-source Wazuh repo).
- Active Response Scripts: quarantine-webshell.sh and remove-malware.py can automatically isolate compromised hosts when a web shell or malware rule fires. Wire these to rules 500500–500502 for automated web shell containment.
- Integration Scripts: Custom connectors for MISP (threat intelligence), TheHive (case management), DFIR-IRIS (incident response), and Telegram (alerting). These replace Wazuh's built-in integrations with more flexible Python alternatives.
- Browser Monitoring: A Python script that collects browser history from Chrome, Firefox, and Edge, useful for detecting phishing victims who visited known-malicious URLs.
- Sysmon Config: A 300 KB Sysmon configuration with granular event logging. Deploy it alongside the 255000-sysmon rules for deep Windows process monitoring.
Further Learning
Bayu Sangkaya's materi_wazuh repository is a full Wazuh training curriculum covering architecture, agent deployment, rule and decoder writing, integration development, and SOAR pipeline construction. If you're building a SOC on Wazuh, whether in Indonesia or anywhere else, start there.
For deep dives into decoder and rule syntax, Wazuh's official documentation is the reference: Custom Rules and Decoders and the Ruleset XML Syntax reference. The official docs are thorough but dry, so pair them with Bayu's practical examples for the fastest learning curve.
If you find Bayu's work valuable, consider supporting him on Ko-fi or Trakteer. Open-source security maintainers rarely get compensated for the detection logic that protects thousands of organizations, and a coffee goes a long way.
The best way to learn custom rules is to write them. Pick a log source you currently monitor manually (a login failure pattern, a port scan, an unusual DNS query) and write a decoder for it, then a rule, then test it with logtest. Start with level 5, tune it until it catches what you want without false positives, then escalate to level 10. The gap between "I read about custom rules" and "I wrote a rule that caught a real attack" is about three hours of focused work. Close it.
References
Bayu Sangkaya, Wazuh Custom Rules and Decoders. https://github.com/bayusky/wazuh-custom-rules-and-decoders
Bayu Sangkaya, Materi Wazuh (Training). https://github.com/bayusky/materi_wazuh
Bayu Sangkaya, LinkedIn. https://www.linkedin.com/in/bayu-sangkaya/
Wazuh Documentation, Rules & Decoders. https://documentation.wazuh.com/current/user-manual/ruleset/
Wazuh, Ruleset XML Syntax. https://documentation.wazuh.com/current/user-manual/ruleset/ruleset-xml-syntax/
Wazuh, Custom Rules & Decoders. https://documentation.wazuh.com/current/user-manual/ruleset/custom.html