Splunk App for McAfee/SkyHigh Web Gateway

Need support? splunk@compek.net

Version: 5.0.10

Date: 23 Jun 2025

Splunk App for McAfee Web Gateway or SkyHigh Web Gateway allows rapid insights and operational visibility into McAfee Web Gateway (MWG) and McAfee Web Gateway Cloud Service (WGCS) deployments

About
Where to install this App
Quick Start
Get Data In
Overview of Sourcetypes and Log Formats
Configure a custom log format (mcafee:webgateway:custom) on MWG
Upgrade from 4.x to 5.x
Upgrade from 3.07
Configuration examples

Local file monitor
Local UDP/TCP input
Syslog UDP/TCP
Syslog TCP+TLS
Syslog to multiple destinations
Configure Universal Forwarder (UF) to run directly on MWG and send logs to indexer
Log pushing from MWG to a log server
Log pulling from MWG
Log pulling from SSE/WGCS
Splunk Connect for Syslog (SC4S)
Disable rsyslog/journald rate-limiting
Syslog-NG configuration
Host extraction
Segmentation
NEW: Interactive configuration builder

NEW: Security considerations
Onboarding checklist
Detailed description of the mcafee:webgateway:custom Log Format
Other logs

Audit log
/var/log/messages
/var/log/secure
/opt/mwg/log/mwg-errors

NEW: Self-Monitoring
Next Steps / Action Plan
FAQ
Dashboard Views

Summary
Summary+
Easy Search
Raw Search
Search+
URL Filter
URL Filter+
Traffic
Traffic+
Mediatypes
Malware
Protocols
Connections
Applications
User-Agents
User-Agents+
Performance
Network
Authentication
Uploads
Risk
DNS
NEW: DoH
Rules
HTTP
Headers
SSL
Security_posture
NEW: Certificates
Anomalies
Errors
NEW: MWG-Errors
NEW: Monitoring
Audit
Audit - Timeline
NEW: Unfiltered Threats
Help
Troubleshooting

NEW: Dashboard Customization
Troubleshooting
Summary of changes
Contributors, Attributions
Copyright
Disclaimer
Contact, Support and Feedback
Additional information

NEW: Install Syslog-NG on MWG/SWG
mwg_xml2txt script
dump_logging_fields
NEW: List of Counters

About

This Splunk App for McAfee Web Gateway provides rapid insights and operational visibility into McAfee Web Gateway (MWG) and McAfee Web Gateway Cloud Service (WGCS) deployments. It provides field extraction and CIM field mapping for all available types of access logs (default and custom McAfee Web Gateway logs, McAfee Web Gateway Cloud Service logs) and facilitates fast incident response and troubleshooting. This app is designed for security administrators, CISOs, or security personnel dedicated to taking security seriously.

In 2022, McAfee Web Gateway (MWG) was rebranded as SkyHigh Secure Web Gateway (SWG). The App and its sourcetypes will keep the McAfee name for some time to preserve the existing App ID.

List of abbreviations used in this document:

Abbreviation	Meaning
MWG	McAfee Web Gateway
SWG	SkyHigh/Secure Web Gateway
WGCS	McAfee/SkyHigh Web Gateway Cloud Service
UF	Splunk Universal Forwarder

Product Compatibility:

Product	Version(s)
Splunk Enterprise	6.6+, 7.x, 8.x, 9.x
Splunk Cloud	all versions, both Classic and Victoria
Splunk CIM	4.x, 5.x
MWG/SWG	7.6+, 8.x, 9.x, 10.x, 11.x, 12.x
WGCS	API version 5-12

Currently there are 85 different charts and tables grouped into 22 views

  Applications
      Applications by Hits
      Applications by Volume
      Top Blocked Applications by Hits
      Top Applications by Volume
      Top Applications by Hits
      Top Application Statistics
  Audit
      Failed Logins
      Activity by Action
      Activity by Source_Type
      Activity by User
      User Activity by Appliance
  Authentication
      Top IP by Failed Auth
      Top User-Agents by Failed Auth
      Top Destination Hosts by Failed Auth
      Top User-Agents + IPs by Failed Auth
      Top User-Agents + DestHost by Failed Auth
      Top IPs + DestHost by Failed Auth
      Top IPs + User-Agent + DestHost by Failed Auth
      Multiple Logins from diff IPs
      Multiple Usernames coming from a single IP
      Authentication Method Statistics
  Connections
      Long running transactions
  DNS
      Timechart DNS resolution time
      Timechart DNS resolution time distribution (including Cached)
      Timechart DNS resolution time distribution (excluding Cached)
      DNS distribution (1ms - 200ms)
      DNS distribution (all)
  Errors
      Error Analysis
  HTTP
      Timechart HTTP Method
      HTTP Method Statistics
      HTTP Request Headers Statistics
      HTTP Response Headers Statistics
  Easy Search
      Status Code Overview
      Web Usage by URL Category
      Web Usage by URL Category Area Graph
      Top User-Agents
      Users + IPs
      IP Addresses by Hits Graph
      Top Hosts by Hits
      Top Blocked Domains by Hits
      Top Rules by Hits
      Events
  Malware
      Malware
      Top Users by blocked Malware
  Media Types
      Media Types
      Top Media Types by Volume
      Top Media Types by Hits
      EXE Uploads/Downloads
      Macro Uploads/Downloads
      EXE and Macro Uploads/Downloads with Magic Bytes Mismatch
      Encrypted Files
  Network
      Top unreachable Servers
  Performance
      Connect to Server Latency
      Total Transaction Duration distribution
      Client-Side Latency
      DNS resolution Latency distribution
      Time in Externals Distribution
  Protocols
      Protocols by Hits
      Protocols by Hits (Percent)
      Protocols by Volume
      Protocols by Volume (Percent)
  Potential Risks
      Top SRC with high Ratio of High Risk Requests
      Unusual Ports
      Requests to IP Addresses
      CONNECT Requests to IP Addresses
      Very long URLs
      Very large request and response Headers
      Non-resolvable Domains, potential DGA (Domain Generation Algorithm)
  Rules
      Top Rules
      Block Rules Overview
      Top Block Rules
      Rule Complexity/Performance
      Slowest Rule Execution
      Time in Rule Engine Distribution
      Time in Rule Engine over Time
  Security Posture
      Content Scan is possible Ratio
  SSL
      SSL Versions by Hits (Server)
      SSL Versions by Hits (Client)
      SSL Ciphers by Hits (Server)
      SSL Ciphers by Hits (Client)
      SSL KeyExchangeBits by Hits (Server)
      SSL KeyExchangeBits by Hits (Client)
      SSL Ciphers (Server)
      SSL Versions (Server)
      Client Certificate Requested
      SSL-related blocks
      Expired Certificate
      Certificate Issuers
  Summary
      Requests / Block Ratio
      Traffic Overview
  Traffic
      Top Inbound Traffic by Source
      Top Inbound Traffic by Destination
      Top Outbound Traffic by Source
      Top Outbound Traffic by Destination
  Uploads
      Uploads
  URL Filter
      URL Categories
      Blocked by URL Filter or by Web Reputation
      Top URL Categories by Volume
      Top URL Categories by Hits
      Geolocation Stats
      High Risk Destinations
      Not categorized Domains - Chart
      Top not categorized Domains - Table
  User-Agents
      User-Agent Statistics

Where to install this App

Instance	App for McAfee Web Gateway	Add-on for McAfee Web Gateway
Standalone (all-in-one) Splunk	+	-
Splunk Cloud	+	-
On-prem Search Head	+	-
On-prem Indexer	-	+
Syslog/Log Server with Universal Forwarder	-	+
SkyHigh Logging Client	-	+

Quick Start

If you are upgrading from version 4.x, read "Upgrade from 4.x to 5.x" first.

Install Splunk directly on MWG and configure it to monitor the local log folder:

Configure a custom log format (mcafee:webgateway:custom) on MWG
Install Splunk on the same MWG
Install Splunk App for McAfee Web Gateway on Splunk
CLI: Allow Splunk to read splunk.log: setfacl -m u:splunk:rx /opt/mwg/log/user-defined-logs
Configure a local file monitor

Step-by-step walkthrough: https://youtu.be/96oRco3MTu0

Configure MWG to send logs via TCP to Splunk

Configure a custom log format (mcafee:webgateway:custom) on MWG
Configure MWG to send events via UDP/TCP
Install Splunk App for McAfee Web Gateway on Splunk
Configure Splunk network input to accept logs from MWG

Step-by-step walkthrough: https://youtu.be/vYy6ddpGkNw

Get Data In

MWG can write logs to the hard disk and/or send them via Syslog. Splunk can read log files locally, receive them via a network input (Syslog or raw UDP/TCP stream), or receive them from a UF installed on a log server or on MWG itself. Combined, these options give many possible ways to get MWG logs into Splunk:

Method / Link to configuration example	Description	Real time
Local file monitor	Splunk is installed directly on MWG and monitors the log file folder	Yes, up to 30 sec delay
Local UDP/TCP input	Splunk is installed directly on the MWG and gets log files sent using Syslog	yes
Syslog UDP/TCP	MWG sends logs via UDP/TCP to syslog collector or directly to Splunk	yes
Syslog TCP+TLS	MWG sends logs via TCP, encrypted with TLS, to syslog collector or directly to Splunk	yes
UF	Install UF on MWG to monitor log file folder	yes, up to 30 sec delay
Log pushing from MWG to a log server	Use pushing (FTP/FTPS/SCP/SFTP/HTTP/HTTPS) from MWG to a log server	no
Log pulling from MWG	Pulling logs from MWG via API, scp or rsync	no
Log pulling from SSE/WGCS	Pulling logs via SSE/WGCS API	no, up to several minutes delay
Splunk Connect for Syslog (SC4S)	MWG sends events via UDP/TCP to SC4S, which forwards them to Splunk HEC	yes

Installing UF directly on MWG and configuring it to forward events to the Splunk indexer is the recommended and most reliable method.

Further considerations:

Local input (monitoring local log files) is the simplest method and is suited to testing or small environments.
Syslog over UDP is typically not recommended because of potential packet loss, except when the syslog server is within the same uncongested network segment (e.g. a separate management segment).
If possible, install the syslog collector/server on the same VLAN/network as MWG. Avoid unreliable links (WiFi/WAN) and firewalls (especially with DPI/IDS) between MWG and the syslog collector.
For large environments Splunk doesn't recommend sending syslog directly to Splunk indexers, and suggests using an intermediate syslog server instead.
McAfee doesn't support installing UF directly on MWG, but it's a good option in some situations.
For a large environment, use one of Splunk's validated architecture designs.
For more details on syslog collector location, UDP vs TCP etc. read https://splunk.github.io/splunk-connect-for-syslog/main/architecture/
HOWTO: Configure a McAfee Web Gateway (MWG) syslog to send TLS-secured data to Splunk https://youtu.be/-nSkYdDQA00
HOWTO: Splunk App for McAfee Web Gateway (MWG) - send logs to Splunk - step by step configuration: https://youtu.be/vYy6ddpGkNw

Overview of Sourcetypes and Log Formats

Several log formats can be used. Compare your logs with the examples below to identify the format currently in use.

On-premise Web Gateway

Log Format	Sourcetype	# of MWG fields	# of CIM fields	Average log line length (HTTPS Scanner enabled)	Comment/Example
Custom Log (recommended)	mcafee:webgateway:custom	50-100	50-100	~600-1800 Bytes	This custom modular log format allows for flexible addition or removal of logging fields as needed. It provides comprehensive Common Information Model (CIM) coverage and deep insights for analytics and rapid troubleshooting. Despite the significantly larger amount of provided information, the log size remains largely unchanged. In fact, this new format achieves up to 3 times higher information density compared to the default log format. Starting from version 5.0.0 of the app, an updated log format was introduced that provides significantly improved search (up to 30 times) and reporting (up to 100 times) performance by leveraging TERM and PREFIX directives: 2021-02-26 14:36:46 -0600 s=200 ac=allowed src=192.168.2.1 p=https m=GET d=safebrowsing.googleapis.com dp=443 bi=563 bo=4156 dur=38 rt=17 up="/v4/threatListUpdates" ua="FF86-10.0" c=it dip=142.250.185.n ckex=112 skex=112 cntx scc=1302 ssc=1302 sslcp=1.3 sslsp=1.3 sslicn="GTS CA 1O1,GlobalSign" sslcn="upload.video.google.com" crtdays=-52 mbmismatch ctmt0 rul="L" rnf=41 rne=104 srcp=62407 conrt=0 bfc=524 btc=4418 tunnel psrcip=192.168.2.1n psrcp=42550 rqv=2.0 rsv=2.0 r=0 tdns=0 tcon=0 tre=34 text=34 t=18.18.22.11.15 Old versions of the app (3.x and 4.x) provided a slightly different format, that doesn't allow TERM/PREFIX benefits: 2021-02-26 14:40:23 +0100 204 allowed 192.168.2.1 https GET example.com 443 775/58 88/1 up="/test" ua="FF86-10.0" a="Google" c="wa" dip=142.250.185.nn kex=112/112 cntx sccc=1302/1302 sslp=1.3/1.3 sslicn="GTS CA 1O1,GlobalSign" sslcn="example.com" crtdays=-66 ctmt0 rul="L" rn=13/44 srcp=63298 conrt=0 b=744/239 psrcip=192.168.2.1 psrcp=20010 piv=2.0/2.0 r=0 t=0/0/86/87/56/56/3/4/28
Minimal Log	mcafee:webgateway:minimal	6	8	~45-55 Bytes	Minimal log format that contains only the 6 most important fields: status, src, dest, bytes_in, category, reputation. There is no timestamp; DATETIME_CONFIG = CURRENT is used instead. This format provides the most important statistics with the shortest possible event length and is intended for use with the Splunk Free license (500 MB/day, ~10,000,000 events/day). 302 192.168.1.10 maps.google.com 667 cm -38
Default Access Log	mcafee:webgateway:default	14	17	~700 Bytes	The default log format, which has a fixed structure, provides only a minimal subset of fields. Use it only if no MWG modification is possible. [26/Feb/2021:14:40:23 +0100] "" 192.168.2.1 200 "GET https://example.com/test&adk=1473563476 HTTP/2.0" "Web Ads" "Minimal Risk" "image/gif" 286 538 "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0" "" "0" "Google"
Legacy log format for the Splunk App v.3.0.7	MWGaccess3	26	27	~650 Bytes	Customized log format with a fixed structure, provides more fields than the default log, including some timings and transferred bytes. Wasteful information like the User-Agent string is shortened. Consider it obsolete. [26/Feb/2021:14:40:23 +0100] status="200/0" srcip="192.168.2.1" user="" profile="-" dstip="-" dhost="example.com" urlp="443" proto="HTTPS/https" mtd="GET" urlc="Web Ads" rep="0" mt="image/gif" mlwr="-" app="Google" bytes="538/539/289/286" ua="FF86.0-10.0" lat="0/0/59/434" rule="Last Rule" url="https://example.com/test&adk=1473563476"
Modified legacy log derived from MWGaccess3	mcafee:wg:kv	26	27	650-850 Bytes	Modified MWGaccess3 log format with a fixed structure included with the Splunk Add-on for McAfee Web Gateway , provides more fields than the default log, including some timings and transferred bytes. Wasteful information like the User-Agent string is shortened. Added sha2 hash and a CN name of the SSL certificate, a Cache-Control header, file name, a reputation level. Consider it obsolete. [10/Mar/2024:15:16:52 +0100] status="200/0" srcip="192.168.2.1" dhost="web.de" destip="82.165.229.83" urlp="443" proto="HTTPS/https" mtd="GET" urlc="Portal Sites" rep="0" mt="text/html" bytes="69/189/345936/345564" ua="curl/8.4.0" lat="0/0/397/397" rule="Last Rule" url="https://web.de/" rep_level="Minimal Risk" cache_control="no-cache, no-store, must-revalidate" ssl_cert_sha2="12695b9b9d0c190b01674492fcf898f91ba85d996dbafe8651e1ac41482f5907" ssl_cert_name="*.web.de"
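To illustrate the structure of the mcafee:webgateway:custom format, here is a minimal Python sketch that splits a shortened copy of the sample line from the Custom Log row above into its timestamp, its key=value fields, and its bare tokens without a value (such as cntx or tunnel). The function name parse_custom_line and the regular expression are illustrative assumptions and not part of the app; in Splunk the app's own extractions do this work.

```python
import re

# Either key=value (value quoted or unquoted) or a bare token such as "cntx" or "tunnel"
TOKEN = re.compile(r'(\w+)=("[^"]*"|\S+)|(\w+)')

def parse_custom_line(line):
    """Split a custom-format line into (timestamp, fields, flags)."""
    # The first three space-separated parts form the timestamp: date, time, UTC offset
    date, time, tz, rest = line.split(" ", 3)
    fields, flags = {}, []
    for key, value, flag in TOKEN.findall(rest):
        if key:
            fields[key] = value.strip('"')
        else:
            flags.append(flag)
    return f"{date} {time} {tz}", fields, flags

# Shortened version of the sample line from the table above
sample = ('2021-02-26 14:36:46 -0600 s=200 ac=allowed src=192.168.2.1 p=https m=GET '
          'd=safebrowsing.googleapis.com dp=443 bi=563 bo=4156 '
          'up="/v4/threatListUpdates" ua="FF86-10.0" cntx tunnel')

timestamp, fields, flags = parse_custom_line(sample)
print(timestamp, fields["s"], fields["d"], flags)
```

Because every field carries its own key, fields can be added or removed on MWG without shifting the position of the others, which is what makes the format modular.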

SSE / Web Gateway Cloud Service (WGCS)

The WGCS log format provides a subset of the required fields. There are several API versions:

Log Format	Sourcetype	# of MWG fields	# of CIM fields	Average log line length (HTTPS Scanner enabled)	Comment/Example
WGCS API version 5	skyhigh:webgateway:csv or mcafee:webgateway:wgcs_v5	28	28	~300-400 Bytes	"user_id","username","source_ip","http_action","server_to_client_bytes","client_to_server_bytes","requested_host","requested_path","result","virus","request_timestamp_epoch","request_timestamp","uri_scheme","category","media_type","application_type","reputation","last_rule","http_status_code","client_ip","location","block_reason","user_agent_product","user_agent_version","user_agent_comment","process_name","destination_ip","destination_port" "-1","142.250.185.nn","142.250.185.nn","GET","206","1040","example.com","/test","OBSERVED","","1626329868","2021-07-15 06:17:48","https","Business, Software/Hardware","application/x-empty","","Minimal Risk","Internal Request handled","200","8.65.16.n","","","Other","","","","78.47.250.n","443"
WGCS API version 6	skyhigh:webgateway:csv	28	28	~300-400 Bytes	No new fields are introduced. All fields from versions 1 – 5 are downloaded. Starting with API version 6, an error message is sent with the response to a download request that has timed out.
WGCS API version 7	skyhigh:webgateway:csv	34	28	~400-450 Bytes	All fields from versions 1 – 6 are downloaded, plus these fields: pop_country_code referer ssl_scanned av_scanned_up av_scanned_down rbi
WGCS API version 8	skyhigh:webgateway:csv	40	30	~400-500 Bytes	All fields from versions 1 – 7 are downloaded, plus these fields: dlp client_system_name filename pop_egress_ip pop_ingress_ip proxy_port
WGCS API version 9	skyhigh:webgateway:csv	40	30	~450-600 Bytes	With this header, no new fields are added. All fields from versions 1 – 8 are downloaded.
WGCS API version 10	skyhigh:webgateway:csv	40	30	~450-600 Bytes	With this header, all fields from versions 1 – 9 are downloaded, plus these fields: mw_probability discarded_host ssl_client_prot ssl_server_prot
WGCS API version 11	skyhigh:webgateway:csv	41	30	~450-600 Bytes	With this header, fields from versions 1 – 10 are downloaded, plus this field: domain_fronting_url
WGCS API version 12	skyhigh:webgateway:csv	41	30	~450-600 Bytes	With this header, fields from versions 1 – 11 are downloaded, plus these fields: Downloaded for firewall traffic: domain_name client_host_name host_os_name scp_policy_name process_exe_path Downloaded for Private Access traffic: virus
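The WGCS export is quoted CSV, and some values (the category in the API version 5 example above, for instance) contain commas, so splitting on "," is not safe. The following Python sketch parses the header and data row from that example with the standard csv module. The function name parse_wgcs is an illustrative assumption; the field names come from the sample itself.

```python
import csv
import io
from datetime import datetime, timezone

# Header and data row copied from the API version 5 example in the table above
RAW = '''"user_id","username","source_ip","http_action","server_to_client_bytes","client_to_server_bytes","requested_host","requested_path","result","virus","request_timestamp_epoch","request_timestamp","uri_scheme","category","media_type","application_type","reputation","last_rule","http_status_code","client_ip","location","block_reason","user_agent_product","user_agent_version","user_agent_comment","process_name","destination_ip","destination_port"
"-1","142.250.185.nn","142.250.185.nn","GET","206","1040","example.com","/test","OBSERVED","","1626329868","2021-07-15 06:17:48","https","Business, Software/Hardware","application/x-empty","","Minimal Risk","Internal Request handled","200","8.65.16.n","","","Other","","","","78.47.250.n","443"'''

def parse_wgcs(text):
    """Return one dict per CSV data row, keyed by the header names."""
    return list(csv.DictReader(io.StringIO(text)))

rows = parse_wgcs(RAW)
event = rows[0]

# The category contains a comma inside quotes; the csv module keeps it in one field
print(len(event), event["category"])

# In this sample the epoch value, read as UTC, matches the human-readable timestamp
utc = datetime.fromtimestamp(int(event["request_timestamp_epoch"]), tz=timezone.utc)
print(utc.strftime("%Y-%m-%d %H:%M:%S"), event["request_timestamp"])
```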

Configure a custom log format (mcafee:webgateway:custom) on MWG

Extract the file Splunk_Log_XXXXXX.xml (where XXXXXX is the version) from the MWG folder of the application package.
Import Splunk_Log_XXXXXX.xml file in MWG into the Default Log Handler: Policies > Rule Sets > Log Handler, right click on "Default" and select Add > Rule Set from Library
In the new window that appears, click on the "Import from file" button, then choose the xml file and click OK.

Import a new Rule Set from file into McAfee/SkyHigh Web Gateway

Click "Auto-Solve Conflicts...", select "Solve by referring to existing objects" and click OK to import the RuleSet.

Auto-Solving conflicts when importing a RuleSet in McAfee/SkyHigh Web Gateway

If MWG cannot resolve external hostnames, disable the DNS RuleSet.
If MWG cannot query the Online URL Database, disable the URL Categorization and Geolocation rules.
If any of the imported RuleSets/Rules are marked red, some properties such as Header.Request.GetAll (available on MWG 10.x+) are not available in the current MWG version. Delete these rules or upgrade MWG to the latest 10.x+ version. If a TLS RuleSet is shown in red, it needs to be modified as described below in the Troubleshooting section.

The log configuration has a modular structure: you can send just a preconfigured minimal set of fields or select any subset of the available fields. The log ruleset contains several parts (see numbering on the next screenshot):

Required rulesets for CIM-conforming logging.
Web Data Model ruleset, where a log line is built from the previously prepared fields.
Additional rulesets where other fields are added as needed.
The DEBUG ruleset that helps to verify that the log lines are built correctly.
Write Splunk.log - final log line modifications, performance monitoring of the Splunk ruleset itself and writing the Splunk log to the hard disk.
Send via Syslog.
RuleSet Library - optional templates that can be copied into the appropriate Policy Rule Sets (Opener, Media Type Filter etc.) to obtain information that is usually not available in the logging cycle.

Additional templates to get access to internal properties in McAfee/SkyHigh Web Gateway that are otherwise not available in the logging cycle

Here are the most important modifications that you can make in the additional rulesets (block of RuleSets #3 on the previous screenshot).

Ruleset	Possible modifications
Splunk	Domains not to log - some domains can be excluded from logging completely.
Set Timestamp	choose the right timestamp. The ISO format with a time zone is selected by default. Other options are ToGMT, ISO8601, unix epoch and ToWebReporter formats. If you change the timestamp format on MWG then you have to adjust the TIME_FORMAT setting in local/props.conf on Splunk Indexer.
Client IP	Connection.IP property is used by default. Deselect it and select Client.IP if you have downstream proxies or loadbalancer between the client and MWG.
URL Categories	add internal domains to the "internal Domains" list to prevent them from being shown as "uncategorized"
Headers	on MWG older than version 10.x some rules will be marked in red if they are not compatible - delete them or upgrade MWG to the newest 10.x version or later.
TLS	disable this ruleset if HTTPS Scanner is not enabled
-	To get the correct Rule statistics you must create one last ruleset with a rule named "Last Rule" which is applied to all cycles (Request, Response, Embedded).
RuleSet Library	Opener, Hashes/Body, Malware, Media Type, Uploads - to get some of the required information, additional rules need to be placed in the corresponding Policy Rule Sets. If you skip this step, some tables and graphs will be empty.
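The Set Timestamp row above notes that the timestamp format on MWG and the TIME_FORMAT setting on the indexer must agree. As a rough illustration of why, the Python sketch below parses the leading timestamp of the sample custom log line with a strptime pattern. The pattern shown is an assumption inferred from the sample line, and parse_event_time is a hypothetical helper; check local/props.conf for the value the app actually uses.

```python
from datetime import datetime, timezone

# Assumed pattern for the default timestamp, inferred from the sample line
# "2021-02-26 14:36:46 -0600 ..."; not taken from the app's props.conf
ISO_WITH_TZ = "%Y-%m-%d %H:%M:%S %z"

def parse_event_time(line, fmt=ISO_WITH_TZ, tokens=3):
    """Parse the leading timestamp; `tokens` is how many space-separated parts it spans."""
    stamp = " ".join(line.split(" ")[:tokens])
    return datetime.strptime(stamp, fmt)

ts = parse_event_time("2021-02-26 14:36:46 -0600 s=200 ac=allowed src=192.168.2.1")
print(ts.isoformat())
print(ts.astimezone(timezone.utc).isoformat())

# A line that starts with a unix epoch no longer fits the old pattern,
# which mirrors what happens when MWG and the indexer disagree
try:
    parse_event_time("1614371806 s=200 ac=allowed", tokens=1)
except ValueError as err:
    print("pattern mismatch:", err)
```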

Create a "Last Rule Set" with an empty "Last Rule" as the bottom-most rule set in the Rule Sets tree:
Best Practice: add a Last RuleSet and Last Rule to McAfee/SkyHigh Web Gateway to get better rule statistics


Copy the rules to the Certificate Verification Rule Set to be able to log information about certificate parameters:

Upgrade from 4.x to 5.x

Version 5.x supports all previous log formats and introduces a new format for faster searches (speedup up to 30 times) and faster reports (speedup up to 100 times).
If you choose to keep the old log format on SWG and forgo the speedup, just upgrade the app; no additional steps are required.
To take advantage of the speedup introduced by the new log format, follow these steps to upgrade the log handler on SWG and the Splunk app:
- Review SPL searches in Splunk that use raw extractions and modify them for the new log format.
- If there are custom parsing and extractions on intermediate heavy forwarders or Cribl, check and modify them as necessary.
- Upgrade the app on the search head(s)
- Next, upgrade the log configuration on SWG:
  - Import the provided log format configuration (e.g., a file named "2023-10-26_13-10_Splunk_new_format.xml") from the app package in the MWG folder.
  - Optional, for highly modified rulesets: use the provided mwg_xml2txt and dump_logging_fields scripts to find differences between the old and new versions of the logging rulesets.
  - To view all available fields use `index_and_sourcetype`| fieldsummary | fields field count | sort field
  - Modify the new log format configuration as needed, all fields that were used before should be also enabled in the new ruleset.
  - Double check the timestamp in the old and new log configuration.
  - Disable the old log format ruleset, enable the new log ruleset, save changes.
  - Check all your searches, dashboards and reports using a time range with a new log format (e.g. "last 15 minutes") and make sure they work as expected. If not, find which fields are missing and enable them in the log configuration ruleset.
Finally, test everything in a test or staging environment. If you need support, send an email to splunk@compek.net

Upgrade from 3.07

Create a backup of MWG config, export MWG Log Rules, backup your current app.
Check if there are any custom changes in the old MWG Log Rules or in the app.
Check which sourcetype is currently used: MWGaccess3 or "default". MWGaccess3 works with the new version without any changes; "default" is now named "mcafee:webgateway:default".
Upgrade the app via GUI or CLI.
Follow the installation instructions for version 4.x.x.
Modify the "index_and_sourcetype" macro to include an index and a sourcetype (e.g. 'index=proxy AND sourcetype="mcafee:webgateway:custom"')
It is recommended to switch from default or MWGaccess3 to the new mcafee:webgateway:custom log format.

Configuration examples

Local file monitor

MWG UI: Configure a custom log format (mcafee:webgateway:custom)
CLI: Allow Splunk to read splunk.log: setfacl -m u:splunk:rx /opt/mwg/log/user-defined-logs (use user splunkfwd for Universal Forwarder 9.1.0 and later)
Splunk UI: Settings > Data inputs > Files & directories > New local File & Directory
Browse to or type in: /opt/mwg/log/user-defined-logs/splunk.log/splunk.log
Press Next
Sourcetype: Select > mcafee:webgateway:custom
App Context: McAfee Web Gateway
Host: Constant value
Index: leave on "Default" or create a new index, for example "proxy"
Press Preview and review options
Press Submit

Local UDP/TCP input

Instead of letting Splunk read local splunk.log, events can be sent to a local Splunk instance via a local network interface or even loopback interface, without writing events to the hard disk (i.e. "Write Splunk Log" Rule Set can be disabled).

MWG UI:

Configure a custom log format (mcafee:webgateway:custom) on MWG, enable "Send via Syslog" RuleSet, optionally disable "Write Splunk Log" RuleSet.
Modify rsyslog.conf: Configuration > File Editor > [Hostname] > rsyslog.conf

Find the following line *.info;mail.none;authpriv.none;cron.none /var/log/messages and modify it to *.info;daemon.!=info;mail.none;authpriv.none;cron.none -/var/log/messages.
If the receiving syslog server or indexer can determine the sending host on its own, the syslog header can be removed. Add $template msg_only,"%msg:2:$%"
Add if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then @@hostname-or-IP-address-of-the-local-MWG:6514;msg_only. Modify the port as needed. Alternatively use if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then @127.0.0.1:6514;msg_only
Add $MaxMessageSize 100k Important: In order for this directive to work correctly, it must be placed right at the top of rsyslog.conf (before any input is defined).
Add $SystemLogRateLimitInterval 0
Add $SystemLogRateLimitBurst 0
Add $imjournalRatelimitInterval 0
Add $imjournalRatelimitBurst 0
More information: https://kcm.trellix.com/corporate/index?page=content&id=KB77988

Verify that the syslog prefix is "mwg" under Configuration > Appliances > Syslog > Log Prefix

$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$ModLoad imklog # reads kernel messages (the same are read from journald)
$WorkDirectory /var/lib/rsyslog
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$IncludeConfig /etc/rsyslog.d/*.conf
$ActionName messages
*.info;daemon.!=info;mail.none;authpriv.none;cron.none -/var/log/messages
authpriv.*                                              /var/log/secure
mail.*                                                  -/var/log/maillog
cron.*                                                  /var/log/cron
*.emerg                                                 :omusrmsg:*
uucp,news.crit                                          /var/log/spooler
local7.*                                                /var/log/boot.log
$template msg_only,"%msg:2:$%"
if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then  @127.0.0.1:6514;msg_only
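To make the last two lines of the configuration above more concrete, the following Python sketch emulates what they do to a single message: the filter keeps only events from program mwg with facility daemon and severity info, and the msg_only template forwards the message text from its second character to the end, dropping the syslog header. The regular expression, the host name mwg1 and the function name msg_only are illustrative assumptions; the sketch handles only the simple traditional syslog layout and is not how rsyslog itself is implemented.

```python
import re

# <PRI>timestamp host tag:msg  (traditional syslog layout, simplified for illustration)
SYSLOG = re.compile(r"<(\d+)>(\w{3} +\d+ [\d:]{8}) (\S+) ([^:\s]+):(.*)")

def msg_only(raw):
    """Return what the "%msg:2:$%" template would forward, or None if the filter does not match."""
    m = SYSLOG.match(raw)
    if not m:
        return None
    pri, _stamp, _host, program, msg = m.groups()
    facility, severity = divmod(int(pri), 8)   # PRI = facility * 8 + severity
    # Same condition as the rsyslog filter: program 'mwg', facility daemon (3), severity info (6)
    if program != "mwg" or facility != 3 or severity != 6:
        return None
    # Positions in the property replacer are 1-based, so 2..$ skips the first character
    # (the space that follows the tag)
    return msg[1:]

raw = "<30>Feb 26 14:36:46 mwg1 mwg: 2021-02-26 14:36:46 -0600 s=200 ac=allowed"
print(msg_only(raw))
```

With the header removed, the receiver sees a line that starts directly with the event timestamp, which is why the sending host then has to be determined on the receiving side.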

Splunk UI:

Settings > Data inputs > UDP > New Local UDP
Port: the port from the previous step, for example 6514
Press Next
Sourcetype: Select > mcafee:webgateway:custom
App Context: McAfee Web Gateway
Host Method: IP or DNS
Index: leave on "Default" or create a new index, for example "proxy"
Press Preview and review options
Press Submit

Syslog UDP/TCP

MWG UI:

Configure a custom log format (mcafee:webgateway:custom) on MWG, enable "Send via Syslog" RuleSet, optionally disable "Write Splunk Log" RuleSet.
Modify rsyslog.conf: Configuration > File Editor > [Hostname] > rsyslog.conf

Find the following line *.info;mail.none;authpriv.none;cron.none /var/log/messages and modify it to *.info;daemon.!=info;mail.none;authpriv.none;cron.none -/var/log/messages.
If the receiving syslog server or indexer can determine the sending host on its own, the syslog header can be removed. Add $template msg_only,"%msg:2:$%"
Add if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then @@hostname-or-IP-address-of-the-remote-splunk:6514;msg_only. Modify the port as needed.
Add $MaxMessageSize 100k Important: In order for this directive to work correctly, it must be placed right at the top of rsyslog.conf (before any input is defined).
Add $SystemLogRateLimitInterval 0
Add $SystemLogRateLimitBurst 0
Add $imjournalRatelimitInterval 0
Add $imjournalRatelimitBurst 0
More information: https://kcm.trellix.com/corporate/index?page=content&id=KB77988
Additionally, the rsyslog author recommends switching from the slow imjournal to imuxsock/imklog, as described in https://kcm.trellix.com/corporate/index?page=content&id=KB92256

Verify that the syslog prefix is "mwg" under Configuration > Appliances > Syslog > Log Prefix

$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$ModLoad imklog # reads kernel messages (the same are read from journald)
$WorkDirectory /var/lib/rsyslog
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
$IncludeConfig /etc/rsyslog.d/*.conf
$ActionName messages
*.info;daemon.!=info;mail.none;authpriv.none;cron.none -/var/log/messages
authpriv.*                                              /var/log/secure
mail.*                                                  -/var/log/maillog
cron.*                                                  /var/log/cron
*.emerg                                                 :omusrmsg:*
uucp,news.crit                                          /var/log/spooler
local7.*                                                /var/log/boot.log
$template msg_only,"%msg:2:$%"
if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then  @@server:6514;msg_only

Splunk UI:

Settings > Data inputs > TCP > New Local TCP
Port: a port from a previous step, for example 6514
Press Next
Sourcetype: Select > mcafee:webgateway:custom
App Context: McAfee Web Gateway
Host Method: IP or DNS (if Splunk can resolve IP address of MWG)
Index: create a new index, for example "proxy"
Press Preview and review options
Press Submit
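Before changing rsyslog on MWG, it can help to confirm that the new TCP input accepts events at all. The Python sketch below sends a single test line in the custom format over plain TCP. The helper names sample_event and send_test_event and the field values are illustrative assumptions; the host and port must be replaced with your own Splunk host and the port configured above.

```python
import socket
from datetime import datetime

def sample_event(now=None):
    """Build a test line with a current timestamp so it shows up in recent-time searches."""
    now = now or datetime.now().astimezone()
    return (now.strftime("%Y-%m-%d %H:%M:%S %z")
            + " s=200 ac=allowed src=192.168.2.1 p=https m=GET d=example.com dp=443")

def send_test_event(host, port, line, timeout=5.0):
    """Send one newline-terminated event over plain TCP and return the number of bytes sent."""
    payload = line.rstrip("\n").encode("utf-8") + b"\n"
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)
```

A call such as send_test_event("splunk.example.com", 6514, sample_event()), with a hypothetical host name, should produce one event that is searchable with the chosen index and sourcetype. If nothing arrives, check the input, the port and any firewall in between before looking at the MWG side.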

Syslog TCP+TLS

$DefaultNetstreamDriver gtls
$DefaultNetstreamDriverCAFile /etc/rsyslog.d/certs/example.com.ca.pem
$DefaultNetstreamDriverCertFile /etc/rsyslog.d/certs/mwg.example.com.pem
$DefaultNetstreamDriverKeyFile /etc/rsyslog.d/certs/mwg.example.com.key

#$ActionSendStreamDriverAuthMode x509/name
$ActionSendStreamDriverAuthMode anon
#$ActionSendStreamDriverPermittedPeer splunk.example.com
$ActionSendStreamDriverMode 1

Configure a McAfee Web Gateway (MWG) syslog to send TLS-secured data to Splunk

Syslog to multiple destinations

Syslog (6, User-Defined.logLine)

Syslog (5, User-Defined.logLine)

# exclude both daemon.notice and daemon.info:
*.info;mail.none;daemon.!=info;daemon.!=notice;authpriv.none;cron.none -/var/log/messages

$ActionQueueFileName fwdRule1
$ActionQueueMaxDiskSpace 1g
$ActionQueueSaveOnShutdown on
$ActionQueueType LinkedList
$ActionResumeRetryCount -1
# use the expression format instead of "traditional" severity- and facility-based selectors: a selector like daemon.info matches all messages of the specified priority and HIGHER, which can lead to duplicated events
if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then @@syslog1

$ActionQueueFileName fwdRule2
$ActionQueueMaxDiskSpace 1g
$ActionQueueSaveOnShutdown on
$ActionQueueType LinkedList 
$ActionResumeRetryCount -1
if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'notice' then @@syslog2
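The reason for preferring exact-match expressions over traditional selectors can be illustrated with a small simulation. This is a minimal Python sketch and not rsyslog code: the function names are hypothetical, and the rule set (info to syslog1, notice to syslog2) mirrors the example above.

```python
# Standard syslog severities, ordered from most to least severe.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def traditional_selector_matches(selector_severity: str, event_severity: str) -> bool:
    """Emulate a selector such as daemon.info: that severity AND everything more severe."""
    return SEVERITIES.index(event_severity) <= SEVERITIES.index(selector_severity)


def expression_matches(wanted_severity: str, event_severity: str) -> bool:
    """Emulate $syslogseverity-text == '...': exactly one severity."""
    return event_severity == wanted_severity


def destinations(event_severity: str, use_expressions: bool) -> list:
    """Return the destinations that a daemon event of the given severity would reach."""
    match = expression_matches if use_expressions else traditional_selector_matches
    rules = [("info", "syslog1"), ("notice", "syslog2")]
    return [dest for severity, dest in rules if match(severity, event_severity)]


print(destinations("notice", use_expressions=False))  # ['syslog1', 'syslog2'] (duplicate)
print(destinations("notice", use_expressions=True))   # ['syslog2']
```

With traditional selectors a notice event is forwarded to both destinations, while the expression-based rules deliver it exactly once.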

Configure Universal Forwarder (UF) to run directly on MWG and send logs to indexer

MWG UI: Configure a custom log format (mcafee:webgateway:custom) on MWG, leave "Send via Syslog" RuleSet disabled
CLI: Allow Splunk to read splunk.log: setfacl -m u:splunkfwd:rx /opt/mwg/log/user-defined-logs (use user splunk for Splunkforwarder < 9.1.0)
Install UF on MWG
Install Add-on for McAfee Web Gateway (https://splunkbase.splunk.com/app/5452/)

Create a file $SPLUNK_HOME/etc/apps/<app>/local/inputs.conf with the following content (modify as needed):

[monitor:///opt/mwg/log/user-defined-logs/splunk.log/splunk*]
sourcetype = mcafee:webgateway:custom
# index = proxy

Create outputs.conf:

[tcpout]
defaultGroup=splunk

[tcpout:splunk]
server=splunk:9997

Create limits.conf, etc. configuration as needed.

Log pushing from MWG to a log server

MWG UI: Configure a custom log format (mcafee:webgateway:custom) on MWG, leave "Send via Syslog" RuleSet disabled
MWG UI: Policy > Settings > Engines > File System Logging > Splunk Log > Settings for Rotation, Pushing and Deletion > Enable specific settings for user defined log:
- Configure Auto Rotation as needed
- Configure Auto Deletion as needed
- Configure Auto Pushing: enable auto pushing, set destination server, enable pushing log files directly after rotation
On a receiving log server: use Splunk file monitor or UF

Log pulling from MWG

MWG UI: Configure a custom log format (mcafee:webgateway:custom) on MWG, leave "Send via Syslog" RuleSet disabled
Using a script, API or other method pull logs from MWG to Splunk

Log pulling from SSE/WGCS

McAfee Web Gateway Cloud Service (WGCS) or SkyHigh SSE provides the log with a reduced set of fields, so only a subset of views will work properly.

There are several ways to pull SSE/WGCS logs:

Recommended: Use Logging Client: https://success.myshn.net/MVISION_Cloud_for_Unified_Cloud_Edge/Unified_Cloud_Edge_Logging_Client/Download_and_install_the_Logging_Client
Configure log pulling using a shell or PowerShell script https://communitym.trellix.com/t5/Web-Gateway-Cloud-Service/Sending-WGCS-logs-to-on-premise-Splunk/td-p/622784
Use SkyHigh Content Security Reporter (CSR) to download SSE/WGCS logs, configure post-processing to move processed log files to some directory and use Universal Forwarder to monitor this directory.

When reading WGCS logs, use [monitor:// and not [batch://, because batch seems to delete logs too early. Use a separate Scheduled Task (schtasks /create /tn "Delete old SSE Logs" /tr "C:\scripts\delete_old_sse_logs.bat" /sc HOURLY ) to delete old logs, for example ForFiles /p C:\SSE_Logs /d -1 /c "cmd /c del /q @file"
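Where a Python interpreter is available on the log server, the cleanup can also be scripted in a platform-independent way. The following is a minimal sketch under stated assumptions: the function names and the 24-hour threshold are illustrative, and the age check uses the modification time, which only roughly corresponds to the date-based ForFiles /d -1 selection.

```python
import os
import tempfile
import time
from pathlib import Path


def delete_old_files(directory: str, max_age_seconds: float = 86400.0) -> list:
    """Delete regular files in `directory` that were last modified more than `max_age_seconds` ago."""
    cutoff = time.time() - max_age_seconds
    removed = []
    for entry in Path(directory).iterdir():
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            entry.unlink()
            removed.append(entry.name)
    return sorted(removed)


def demo() -> tuple:
    """Create one old and one fresh file in a temporary directory, then clean up."""
    with tempfile.TemporaryDirectory() as tmp:
        old, new = Path(tmp, "old.log"), Path(tmp, "new.log")
        old.write_text("x")
        new.write_text("y")
        two_days_ago = time.time() - 2 * 86400
        os.utime(old, (two_days_ago, two_days_ago))  # backdate the first file
        removed = delete_old_files(tmp)
        remaining = sorted(p.name for p in Path(tmp).iterdir())
    return removed, remaining


print(demo())  # (['old.log'], ['new.log'])
```

In production the function would be called with the monitored directory (C:\SSE_Logs in the example above) from a scheduled job.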

An example of inputs.conf:

[monitor://C:\SSE_Logs]
sourcetype = skyhigh:webgateway:csv
# index = proxy
crcSalt = <SOURCE>

HOWTO: SkyHigh Web Gateway Cloud (SSE) integration with Splunk Cloud - step by step configuration: https://www.youtube.com/watch?v=1vCbwz6uKB0

Splunk Connect for Syslog (SC4S)

Recently, Splunk has started recommending the use of SC4S as a syslog collector. SC4S is built on syslog-ng and automatically recognizes MWG/SWG events if an event contains the 'mwg: ' prefix. Therefore, the custom template 'msg_only' (which strips the syslog header) cannot be used. Apply the following steps to send events via SC4S:

Ensure that the 'mwg' log prefix is configured (it is set by default) under Configuration > Syslog
By default, SC4S assigns the source 'mcafee:wg', sourcetype 'mcafee:wg:kv' and index 'netproxy'. You need to modify at least the sourcetype by adding the following lines to /opt/sc4s/local/context/splunk_metadata.csv:
```
# rewrite a sourcetype (mcafee:webgateway:custom, mcafee:webgateway:default, etc.)
mcafee_wg,sourcetype,mcafee:webgateway:default
# rewrite an index if required
# mcafee_wg,index,proxy
```
Events arriving from SC4S have a "mwg: " prefix (e.g. "mwg: [09/Dec/2023:..." or "mwg: 2023-11-09 ..."), which breaks parsing. To fix it, either remove the 'mwg: ' prefix (using SEDCMD or TRANSFORMS) or modify TIME_PREFIX and EXTRACT (use only one of the following methods):
- Using SEDCMD (recommended): add the following line to the sourcetype configuration, either via the UI or in local/props.conf:
```
SEDCMD-0_delete_syslog_prefix = s/^mwg: //
```
- Alternative configuration using TRANSFORMS:
  local/props.conf:
```
TRANSFORMS-0_delete_syslog_prefix = 0_delete_syslog_prefix
```
  local/transforms.conf:
```
[0_delete_syslog_prefix]
REGEX = ^mwg: (.*)
DEST_KEY = _raw
FORMAT = $1
```
- By modifying extractions and TIME_PREFIX (not recommended): in all extractions that start with "^", replace the leading "^" with "^mwg: ". TIME_PREFIX must also be modified by prepending "^mwg: ".
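The effect of the prefix removal can be checked offline before touching props.conf. The sketch below applies the same regular expression as the SEDCMD above to two made-up sample events; the function name and the event contents are illustrative only.

```python
import re

# Same pattern as the SEDCMD: remove a leading "mwg: " and nothing else.
SYSLOG_PREFIX = re.compile(r"^mwg: ")


def strip_prefix(raw_event: str) -> str:
    """Return the event without the leading 'mwg: ' prefix (unchanged if it is absent)."""
    return SYSLOG_PREFIX.sub("", raw_event, count=1)


with_prefix = "mwg: 2023-11-09 10:00:00 +0100 s=200 ac=allowed"
without_prefix = '2023-11-09 10:00:00 +0100 s=200 up="/mwg: test"'

print(strip_prefix(with_prefix))     # 2023-11-09 10:00:00 +0100 s=200 ac=allowed
print(strip_prefix(without_prefix))  # unchanged, "mwg: " is not at the start
```

Because the pattern is anchored with "^", an occurrence of "mwg: " inside a field value is left untouched.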

Disable rsyslog/journald rate-limiting

Best practices: disable journald completely as described in https://kcm.trellix.com/corporate/index?page=content&id=KB92256

McAfee Web Gateway is based on RedHat/CentOS 7 and inherits some settings that rate-limit syslog. Read https://www.ibm.com/support/pages/how-disable-rsyslog-rate-limiting and https://access.redhat.com/solutions/1417483 to modify or disable rate-limiting in /etc/rsyslog.conf (using MWG UI) and /etc/systemd/journald.conf.

$SystemLogRateLimitInterval 0
$SystemLogRateLimitBurst 0
$imjournalRatelimitInterval 0
$imjournalRatelimitBurst 0

RateLimitInterval=0
RateLimitBurst=0

Syslog-NG configuration

Use the following configuration for syslog-ng (on the receiving side):

network
flags(no-parse)
# https://axoflow.com/syslog-over-udp-kernel-syslog-ng-tuning-avoid-losing-messages/
so-rcvbuf(32MiB)          # Sets the receive buffer size
log-iw-size(250k)         # Sets the size of the initial window for flow control
log-fetch-limit(10k)      # Sets the number of messages fetched from the source

Host extraction

Correct extraction of the host field is very important. Unfortunately, the default methods of host extraction have some downsides:

If not explicitly set, "$decideOnStartup" from inputs.conf or the system hostname is used to set the 'host' value
If not set explicitly, the host value for the same machine can end up as an IP address, a short hostname or an FQDN
The host name can appear in upper or lower case
If Splunk or the syslog server is configured to obtain the host name of the sending host via reverse DNS resolution of its IP address and no DNS server is available, it falls back to the IP address
If there is a load balancer in between, the host field can contain a wrong value
A syslog header can contain, for example, mwg (short name), MWG (short name in upper case), an IP address or mwg.example.com (FQDN), depending on the configuration
The syslog-ng documentation mentions that "it is not recommended to resolve hostnames in syslog-ng"

To summarize: it is better to set the host value explicitly than to rely on heuristics that can produce several host values for the same machine. With syslog, either use local resolution with a hosts file and host_segment/host_regex on the syslog receiver, or send the host name of the MWG directly within the event and extract it during ingestion. The latter allows the syslog header to be disabled directly on the MWG by defining the "msg_only" rsyslog template as described in the Syslog UDP/TCP section:

[mcafee:webgateway:custom]
TRANSFORMS-extract_host_from_event = extract_host_from_event

[extract_host_from_event]
REGEX = \shost=(\S+)
FORMAT = host::$1
DEST_KEY = MetaData:Host
# LOOKAHEAD = 40960

The host field must be placed in the event before long fields such as the URL path or URL query. Otherwise these potentially long fields can push the host field beyond the first 4096 characters of the event, which is the default value of the LOOKAHEAD setting in transforms.conf that specifies how far Splunk looks into an event for index-time fields. To log the host field, enable the preconfigured rule "Add host field" under Policy > Log Handler > Default > Splunk > Web Data Model.
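The interaction between field order and LOOKAHEAD can be reproduced with a few lines of Python. This is a rough emulation and not Splunk's implementation: the function name is hypothetical, the sample events are made up, and the limit is applied to characters for simplicity.

```python
import re

HOST_REGEX = re.compile(r"\shost=(\S+)")  # same pattern as in the transforms.conf stanza above
DEFAULT_LOOKAHEAD = 4096                  # default limit mentioned in the text


def extract_host(raw_event: str, lookahead: int = DEFAULT_LOOKAHEAD):
    """Search only the first `lookahead` characters, roughly like an index-time transform."""
    match = HOST_REGEX.search(raw_event[:lookahead])
    return match.group(1) if match else None


long_path = "/" + "a" * 5000  # a URL path longer than the default LOOKAHEAD
host_first = f'2021-02-26 14:36:46 -0600 s=200 host=mwg1 up="{long_path}"'
host_last = f'2021-02-26 14:36:46 -0600 s=200 up="{long_path}" host=mwg1'

print(extract_host(host_first))                 # mwg1
print(extract_host(host_last))                  # None, the field is beyond the limit
print(extract_host(host_last, lookahead=40960)) # mwg1
```

The last call corresponds to enabling the commented-out LOOKAHEAD = 40960 line, which is an alternative when the field order cannot be changed.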

Disabling the syslog header has several benefits:

Usually a syslog header contains an incorrect timestamp that may differ from the MWG timestamp.
All information in syslog headers except "host" is usually useless.
There's no need to add a useless syslog header on MWG and then remove it again on Splunk.
A short log line without the syslog header is easier to read.
You can save between 5 and 15% of the license cost.

Segmentation

To use the TERM and PREFIX directives with strings that contain double hyphens (such as punycode domains like xn--bcher-kva.de), a custom segmenters.conf file needs to be applied on the indexer:

local/props.conf:

SEGMENTATION = indexing_without_double_dash

local/segmenters.conf

[indexing_without_double_dash]
# These are default MAJOR segmenters, but without "--" to allow TERM/PREFIX work with punycode domains (xn--bcher-kva.de) and strings with more than one dash (hp--community.force.com)
# https://community.splunk.com/t5/Splunk-Search/TERM-and-PREFIX-cannot-find-string-with-two-dashes/td-p/679366
# https://ideas.splunk.com/ideas/EID-I-2226
# default:
# MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + %21 %26 %2526 %3B %7C %20 %2B %3D -- %2520 %5D %5B %3A %0A %2C %28 %29
MAJOR = [ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + %21 %26 %2526 %3B %7C %20 %2B %3D %2520 %5D %5B %3A %0A %2C %28 %29

# default values need to be explicitly specified:
MINOR = / : = @ . - $ # % \\ _
INTERMEDIATE_MAJORS = false

Test:

Test condition: visit hp--community.force.com (active domain with two dashes) via the proxy to get it logged.
Test SPL: `index_and_sourcetype` TERM(d=hp--community.force.com)
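Why the "--" breaker interferes with TERM can be shown with a strongly simplified model of major segmentation. The sketch below is not Splunk's tokenizer: it only splits on whitespace and a subset of the MAJOR breakers listed above (the URL-encoded breakers are omitted), and the function name and sample event are illustrative.

```python
import re

# Subset of the MAJOR breakers from segmenters.conf above.
BASE_MAJOR = ["[", "]", "<", ">", "(", ")", "{", "}", "|", "!", ";", ",", "'", '"', "*", "&", "?", "+"]


def major_segments(text: str, double_dash_is_major: bool) -> list:
    """Split text on whitespace and MAJOR breakers (simplified model of index-time segmentation)."""
    breakers = BASE_MAJOR + (["--"] if double_dash_is_major else [])
    pattern = "|".join(re.escape(b) for b in breakers) + r"|\s+"
    return [token for token in re.split(pattern, text) if token]


event = "s=200 d=hp--community.force.com dp=443"

print(major_segments(event, double_dash_is_major=True))
# ['s=200', 'd=hp', 'community.force.com', 'dp=443']
print(major_segments(event, double_dash_is_major=False))
# ['s=200', 'd=hp--community.force.com', 'dp=443']
```

With the default breakers the token d=hp--community.force.com never reaches the index as a whole, so TERM(d=hp--community.force.com) has nothing to match. Without "--" the token stays intact.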

Interactive Configuration Builder

An interactive configuration builder provides an easy way to prepare configuration snippets for both sending (Web Gateway) and receiving (rsyslog, syslog-ng, Splunk UF or indexer) sides.

Configuration parameter	Option(s)	Comment
Syslog destination (hostname or IP)
Syslog destination port
Syslog Transport (UDP/TCP)	OR
Strip a syslog header.	OR	Send a message only, without a syslog header (that includes a host name). The host name needs to be provided in the event itself (preferred, read Host extraction) or can be resolved on the receiver via rDNS.

# rsyslog configuration, optimized for Web Gateway, for more info see https://proxy-test.com/Splunk_App_for_SkyHigh_Secure_Web_Gateway_README.html
$MaxMessageSize 32k
$SystemLogRateLimitInterval 0
$SystemLogRateLimitBurst 0
# use imuxsock instead of imjournal (see https://kcm.trellix.com/corporate/index?page=content&id=KB92256)
$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$ModLoad imklog # reads kernel messages (the same are read from journald)
$WorkDirectory /var/lib/rsyslog # Where to place auxiliary files
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat # Use default timestamp format
$template msg_only,"%msg:2:$%" # An alternative format, without a syslog header
$IncludeConfig /etc/rsyslog.d/*.conf

# Log anything (except Web Gateway access log and mail) of level info or higher.
*.info;daemon.!=info;mail.none;authpriv.none;cron.none -/var/log/messages
authpriv.*                                              /var/log/secure 
mail.*                                                  -/var/log/maillog
cron.*                                                  /var/log/cron
*.emerg                                                 :omusrmsg:*
uucp,news.crit                                          /var/log/spooler
local7.*                                                /var/log/boot.log

$ActionName messages
$ActionQueueFileName WebGateway # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g     # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on   # save messages to disk on shutdown
$ActionQueueType LinkedList     # run asynchronously
$ActionResumeRetryCount -1      # infinite retries if host is down

if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then @syslogserver:514

# rsyslog input configuration optimized for receiving of syslog messages from Web Gateway
# for more info see https://proxy-test.com/Splunk_App_for_SkyHigh_Secure_Web_Gateway_README.html
module(load="imtcp")
template (name="tpl_skyhigh_web_gateway" type="string" string="/opt/syslog/skyhigh_web_gateway/%FROMHOST%/%$year%-%$month%-%$day%.log")
template (name="tpl_FormatMsgOnly"       type="string" string="%rawmsg%\n")

ruleset(name="rule_SkyHigh_Web_Gateway"){
  action(type="omfile" dynaFile="tpl_skyhigh_web_gateway" fileGroup="splunk" dirGroup="splunk" dirCreateMode="0770" fileCreateMode="0660")
  stop
}
input(type="imtcp" port="514" ruleset="rule_SkyHigh_Web_Gateway")

# syslog-ng input configuration optimized for receiving of syslog messages from Web Gateway
# for more info see https://proxy-test.com/Splunk_App_for_SkyHigh_Secure_Web_Gateway_README.html
# https://axoflow.com/syslog-over-udp-kernel-syslog-ng-tuning-avoid-losing-messages/

source s_skyhigh_web_gateway {
  network(
    ip(0.0.0.0)
    port(514)
    transport("tcp")
    so-rcvbuf(32MiB)          # Sets the receive buffer size
    log-iw-size(250k)         # Sets the size of the initial window for flow control
    log-fetch-limit(10k)      # Sets the number of messages fetched from the source
    flags(no-parse)
  );
};
destination d_skyhigh_web_gateway {
  file("/opt/syslog/skyhigh_web_gateway/$FULLHOST/$C_YEAR-$C_MONTH-$C_DAY.log"
    create-dirs(yes)
    flush-lines(0)
    dir-perm(0755)      perm(0644)
    dir-owner("splunk") dir-group("splunk")
    owner("splunk")     group("splunk")
    
  );
};
log {
  source(s_skyhigh_web_gateway);
  destination(d_skyhigh_web_gateway);
};

# local/inputs.conf for UF/indexer with installed rsyslog/syslog-ng
[monitor:///opt/syslog/skyhigh_web_gateway/*/*.log]
sourcetype = mcafee:webgateway:custom
# index = proxy
# enable host_segment if rsyslog/syslog-ng used to extract a hostname of web gateway from a syslog header
# host_segment = 4
  
# local/props.conf for HF or indexer
[mcafee:webgateway:custom]
TRANSFORMS-extract_host_from_event = extract_host_from_event

# local/transforms.conf for HF or indexer
[extract_host_from_event]
REGEX = \shost=(\S+)
FORMAT = host::$1
DEST_KEY = MetaData:Host

Security considerations

Least privileges

Starting with UF version 9.x, the Splunk forwarder runs with AmbientCapabilities=CAP_DAC_READ_SEARCH, which allows the service to read any file on the system (incl. /etc/shadow etc.). https://docs.splunk.com/Documentation/Forwarder/latest/Forwarder/Installleastprivileged.

Important: Linux ACLs still apply to the user "splunkfwd" itself, so this user continues to get a "permission denied" error when it tries to access files it is not allowed to read. The Splunk forwarder service, on the other hand, runs as "splunkfwd" but with the CAP_DAC_READ_SEARCH capability and can therefore read any file.

If this capability is too permissive, you can disable it in /etc/systemd/system/SplunkForwarder.service and grant the UF read-only access using classic Linux permissions:

Command	Comment
setfacl -m u:splunkfwd:rx /opt/mwg/log/user-defined-logs	Allow the user splunkfwd to read user-defined logs
usermod -aG adm splunkfwd	Add the user splunkfwd to the adm group to allow reading /var/log/messages and /var/log/secure
usermod -aG mwg splunkfwd	Add the user splunkfwd to the mwg group to allow reading various proxy logs. This grants more permissions than the setfacl method.

Deployment Server

In a high-security environment, you can consider avoiding direct connections between the UF and the deployment server. Instead, opt to push all configurations using alternative methods.

Onboarding checklist

Check	Expected Result	Conditions/Causes	Comment
Timestamp and Timezone	Timestamp and timezone are correct, there are no "future" events		| eval diff=_indextime - _time
Index	Index is correct		Use a separate index for proxy events
Sourcetype	sourcetype is correct
Host extraction	Host extraction is correct	Syslog	Don't rely on rDNS, it decreases performance and can fail. Hosts server1, SERVER1, server1.example.com, 10.20.30.nn can be the same host, but are different hosts from Splunk's point of view.
Integrity	All events reach Splunk, no events are lost	Syslog, high log rate	useACK, rsyslog: disk queue
Truncation	Long log lines aren't truncated	rsyslog: MaxMessageSize, syslog-ng: log_msg_size, syslog via UDP, Splunk: TRUNCATE	test-link
Logging delay	Low logging delay		| eval diff=_indextime - _time
Log integrity in case of network interruption	Short network interruptions shouldn't lead to loss of events		useACK, rsyslog: disk queue
Secure transfer	Log transferred via TLS, Certificate validation, mTLS
Multiline	There are no multiline proxy events
Duplicates	There are no duplicate events
Parsing	All events parsed correctly, action/src/dest fields are always present
Settings location	All settings are placed inside of MWG App or TA	Settings can be placed in a wrong app if GUI is used	Use btool to verify.

You can access these and additional onboarding tips and checks using the Data Onboarding Checklist Splunk app available on Splunkbase.

Detailed description of the mcafee:webgateway:custom Log Format

Why the new log format? Neither the default nor the previously used MWGaccess3 log formats provide enough information for SIEM to be useful. These legacy formats provide very limited information about downloading and uploading risky files. Many SIEM correlation rules will not work properly if a transferred file is embedded as a part of a composite object (zip, iso, docx, etc.) or has different/faked media-type header or extension.

The new log format provides the following use cases among many others:

Even if a transaction was allowed, detect all potentially dangerous objects and log their true media-type, hash and size.
Even if a transaction was white-listed and not checked for the Web-Reputation and URL-Categorization - these checks are still performed in the Log Cycle after the transaction has been completed and the log event will contain them.
It performs a DNS lookup of dest_host, and if there is more than one IP, does a reverse DNS lookup of URL.Destination.IP to detect fast-flux C&C Servers.

The custom log format (mcafee:webgateway:custom) consists of several parts:

Timestamp
Fixed set of fields: status, action, client_ip, url_protocol, http_method, dest, dest_port, bytes_out/bytes_in, duration/response_time. These fields have no field prefix - Splunk extracts them based on the log structure.
Variable set of fields: they are included in the log only if they are enabled AND exist. For example, a URL path will not be included for this URL: https://www.example.com/. These fields have either a short field prefix (for example up=) or consist of a single string (i.e. "tunnel") and can exist in any part of the log line, their order is not important. Any of the variable fields can be enabled and disabled on the MWG at any time, without the need to modify anything on the Splunk side. You can enable conditional logging for these fields, for example a query string can be logged only for some subset of categories, certificate information (Issuer, Common Name, Subject Alternative Names etc.) - only for suspicious transactions etc.

2021-02-26 14:36:46 -0600 200 allowed 192.168.2.1 https GET safebrowsing.googleapis.com 443 563/4156 38/17 up="/v4/threatListUpdates" ua="FF86-10.0" c="it" dip=142.250.185.n kex=112/112 cntx sccc=1302/1302 sslp=1.3/1.3 sslicn="GTS CA 1O1,GlobalSign" sslcn="upload.video.google.com" crtdays=-52 mbmismatch ctmt0 rul="L" rn=41/104 srcp=62407 conrt=0 b=524/4418 tunnel psrcip=192.168.2.1 psrcp=42550 piv=2.0/2.0 r=0 t=0/0/34/34/18/18/22/11/11
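The structure described above (timestamp, positional fixed fields, then variable fields in any order) can be illustrated with a small parser. This is a hedged Python sketch and not the app's actual extraction logic: the function name and regular expressions are assumptions, the sample is a shortened version of the line above, and the two slash-separated pairs are kept as raw strings.

```python
import re

# Timestamp plus the fixed, positional fields; everything after them is the variable part.
FIXED = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}) "
    r"(?P<status>\d{3}) (?P<action>\S+) (?P<client_ip>\S+) (?P<url_protocol>\S+) "
    r"(?P<http_method>\S+) (?P<dest>\S+) (?P<dest_port>\d+) "
    r"(?P<bytes_pair>\d+/\d+) (?P<time_pair>\d+/\d+)\s*(?P<variable>.*)$"
)
# A variable field is either prefix=value (value optionally quoted) or a bare flag like "tunnel".
VARIABLE = re.compile(r'(?P<key>\w+)=(?P<value>"[^"]*"|\S+)|(?P<flag>\S+)')


def parse_v4(line: str) -> dict:
    """Parse one line of the version 4 custom format into a flat dictionary."""
    fixed = FIXED.match(line)
    if fixed is None:
        raise ValueError("line does not match the fixed part of the format")
    event = fixed.groupdict()
    rest = event.pop("variable")
    flags = []
    for m in VARIABLE.finditer(rest):
        if m.group("key"):
            event[m.group("key")] = m.group("value").strip('"')
        else:
            flags.append(m.group("flag"))
    event["flags"] = flags
    return event


SAMPLE = ('2021-02-26 14:36:46 -0600 200 allowed 192.168.2.1 https GET '
          'safebrowsing.googleapis.com 443 563/4156 38/17 up="/v4/threatListUpdates" '
          'sslicn="GTS CA 1O1,GlobalSign" cntx tunnel r=0')
parsed = parse_v4(SAMPLE)
print(parsed["action"], parsed["dest"], parsed["up"], parsed["flags"])
```

Because variable fields are matched by their prefix and not by position, fields can be enabled or disabled on the MWG without changing the parser.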

Starting from version 5.0.0 of the app, an updated log format was introduced that provides significantly improved search (up to 30 times) and reporting (up to 100 times) performance by leveraging TERM and PREFIX directives:

Search/reporting performance using normal search:
Search/reporting performance using accelerated new log format and TERM/PREFIX:

The change between version 4 and version 5 is essentially the addition of a field prefix to every value, enabling the use of the PREFIX directive. The new log format is about 10-15% longer, which should be taken into account for license usage:

2021-02-26 14:36:46 -0600 s=200 ac=allowed src=192.168.2.1 p=https m=GET d=safebrowsing.googleapis.com dp=443 bi=563 bo=4156 dur=38 rt=17 up="/v4/threatListUpdates" ua="FF86-10.0" c=it dip=142.250.185.n ckex=112 skex=112 cntx scc=1302 ssc=1302 sslcp=1.3 sslsp=1.3 sslicn="GTS CA 1O1,GlobalSign" sslcn="upload.video.google.com" crtdays=-52 mbmismatch ctmt0 rul="L" rnf=41 rne=104 srcp=62407 conrt=0 bfc=524 btc=4418 tunnel psrcip=192.168.2.1 psrcp=42550 rqv=2.0 rsv=2.0 r=0 tdns=0 tcon=0 tre=34 text=34 t=4.18.18.22.11
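The relationship between the two formats can be made concrete by rewriting the fixed part of a version 4 line in version 5 style. The sketch below is illustrative only: the function name is hypothetical, and the prefixes are inferred by comparing the two sample lines in this section, not taken from the MWG rule set.

```python
# Prefixes inferred by comparing the version 4 and version 5 sample lines.
FIXED_PREFIXES = ["s", "ac", "src", "p", "m", "d", "dp"]


def fixed_part_to_v5(v4_line: str) -> str:
    """Rewrite the timestamp and the fixed fields of a version 4 line in version 5 style."""
    tokens = v4_line.split(" ")
    timestamp, fields = tokens[:3], tokens[3:12]
    if len(fields) < 9:
        raise ValueError("not enough fixed fields")
    prefixed = [f"{prefix}={value}" for prefix, value in zip(FIXED_PREFIXES, fields[:7])]
    bytes_in, bytes_out = fields[7].split("/")       # 563/4156 becomes bi=563 bo=4156
    duration, response_time = fields[8].split("/")   # 38/17 becomes dur=38 rt=17
    prefixed += [f"bi={bytes_in}", f"bo={bytes_out}", f"dur={duration}", f"rt={response_time}"]
    return " ".join(timestamp + prefixed)


v4 = ("2021-02-26 14:36:46 -0600 200 allowed 192.168.2.1 https GET "
      "safebrowsing.googleapis.com 443 563/4156 38/17 tunnel")
print(fixed_part_to_v5(v4))
```

Every value now carries its own prefix, so a token such as d=safebrowsing.googleapis.com identifies the field and the value at the same time, which is what the TERM and PREFIX directives rely on.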

You can download a cheat sheet for the new log format with usage examples: https://proxy-test.com/swg_cheatsheet.pdf

The app (version 5 and later) supports both the old and the new format. If you want to speed up search and reporting and greatly reduce the load on the search head, it is recommended to configure the new log format on the SWG.

Logging of URL fields: Instead of logging a URL as-is (e.g. http://www.example.com/wp/3?id=4), MWG splits the URL into usable parts (URL protocol, domain, URL path, URL query), which are used on the Splunk side to rebuild the URL. This saves a lot of processing and produces better results, in particular:

no need to parse url each time
correct domain extraction follows the Public Suffix List, so the domain field takes multi-label public suffixes into account instead of simply using the last two labels of the host name (for example, for the hostname www.bbc.co.uk the domain is bbc.co.uk and not co.uk).
accelerated search using TERM/PREFIX has issues/limitations when applied to unparsed URLs
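Rebuilding a URL from separately logged components is straightforward. The following Python sketch only illustrates the idea and is not the app's search-time logic: the function name is hypothetical, and omitting the port when it is the protocol default is an assumption.

```python
from urllib.parse import urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def rebuild_url(protocol: str, host: str, port: int, path: str = "", query: str = "") -> str:
    """Reassemble a URL from separately logged components."""
    # Keep the port only if it differs from the protocol default.
    netloc = host if DEFAULT_PORTS.get(protocol) == port else f"{host}:{port}"
    return urlunsplit((protocol, netloc, path or "/", query, ""))


print(rebuild_url("http", "www.example.com", 80, "/wp/3", "id=4"))
# http://www.example.com/wp/3?id=4
print(rebuild_url("https", "www.example.com", 443))
# https://www.example.com/
```

If the query string is not logged (the default), the rebuilt URL simply ends after the path.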

Default settings exclude logging of the URL query string (portion of the url after the question mark "?"). Enable it in the Web Data Model ruleset if required. Note that logging the query string greatly increases the length of log lines, potentially bloating TSIDX and leading to compromised search performance, heightened storage needs, and increased license usage. Conversely, enabling query string logging proves beneficial in numerous scenarios. Choose to enable it for each request or selectively as needed.

An excerpt of the 100 most useful fields is provided below. MWG has about 900 properties that can be used for logging.

Description of logged fields

MWG field	CIM field	Comment

Timestamp

Property	Example	TIME_FORMAT / Comment
DateTime.ToISOString	2010-03-22 11:45:12	%Y-%m-%d %H:%M:%S
DateTime.ToISOString with Milliseconds	2010-03-22 11:45:12.123	%Y-%m-%d %H:%M:%S.%3N
DateTime.ToISOString with Milliseconds and timezone	2010-03-22 11:45:12.123 -0600	%Y-%m-%d %H:%M:%S.%3N %z
DateTime.ToISOString and timezone	2010-03-22 11:45:12 -0600	%Y-%m-%d %H:%M:%S %z
DateTime.ToGMTString	Mon, 22 March 2010 11:45:36 GMT	%a, %d %B %Y %H:%M:%S %Z
DateTime.ToISO8601String	2016-01-26T11:45:36.695Z	this time format can produce unexpected output, don't use it
DateTime.ToNumber	Unix epoch time - 1512915182	%s
DateTime.ToWebReporterString	[29/Oct/2010:14:28:15 +0000]	\[%d/%b/%Y:%H:%M:%S %z\]
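A timestamp format can be sanity-checked outside Splunk before it is configured. The sketch below uses Python's strptime with near-equivalents of some TIME_FORMAT values from the table. Note the assumptions: Splunk's %3N has no direct Python counterpart, so %f is used for milliseconds, and month names are parsed with the default English locale.

```python
from datetime import datetime

# Sample value -> Python strptime format (near-equivalents of the TIME_FORMAT column).
FORMATS = {
    "2010-03-22 11:45:12": "%Y-%m-%d %H:%M:%S",
    "2010-03-22 11:45:12.123": "%Y-%m-%d %H:%M:%S.%f",
    "2010-03-22 11:45:12 -0600": "%Y-%m-%d %H:%M:%S %z",
    "[29/Oct/2010:14:28:15 +0000]": "[%d/%b/%Y:%H:%M:%S %z]",
}


def parse_timestamp(text: str, fmt: str) -> datetime:
    """Parse a logged timestamp; raises ValueError if the format does not fit."""
    return datetime.strptime(text, fmt)


for sample, fmt in FORMATS.items():
    print(sample, "->", parse_timestamp(sample, fmt).isoformat())
```

Formats that include the timezone (%z) yield timezone-aware values, which avoids ambiguity when gateways are located in different time zones.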

Connection.IP / Client.IP	src	Client.IP takes the value of the X-Forwarded-For header
Authentication.UserName	user	
Message.TemplateName, Block.ID, Response.StatusCode, Protocol.FailureDescription, BytesFromServer, Command.Name, Action.Names	action	The action taken by the proxy: allowed, blocked, error or auth. Various MWG properties are used to calculate the correct action field.
URL	url	Don't enable it; Splunk builds the URL from the URI components

URL.Categories

Other logs

Audit log

The audit log (/opt/mwg/log/audit/audit.log) contains all changes and activity made by administrator(s) using the UI or the REST interface. Audit events can be sent to Splunk using a UF or a custom syslog configuration. Almost 70 actions are mapped to the Authentication and Change CIM Data Models:

Action	action	change_type	object_category
ACTIVATE_LICENSE_FILE	modified		license
ADDED_ADMINROLE	added	AAA	role
ADDED_APPLIANCE	added		appliance
ADDED_CONTENT	added	filesystem	config
ADDED_GROUP_ROLE_MAPPING	added	AAA	role
ADDED_RULES	added		config
ADDED_SYSTEM_FILES	added	filesystem	file
ADDED_TEMPLATE_DIRECTORIES	added	filesystem	directory
AUTHENTICATE_WITH_EXTERNAL_SERVER	success
BACKUP_TRIGGERED	created		backup
CREATED_NEW_LIST	added		config
CREATED_NEW_RULE	added		config
CREATED_NEW_RULEGROUP	added		config
CREATED_NEW_SETTINGS	added		config
CREATED_NEW_USER	added	AAA	user
CREATED_NEW_USER_DEFINED_PROPERTY	added		config
DASHBOARD_DATA_RESET	deleted
DATE_CHANGED	modified		config
DELETED_ADMINROLE	deleted	AAA	role
DELETED_APPLIANCE	deleted		appliance
DELETED_CONTENT	deleted		config
DELETED_LIST	deleted		config
DELETED_LOG_HANDLER	deleted		config
DELETED_RULE	deleted		config
DELETED_RULE_GROUP	deleted		config
DELETED_RULES	deleted		config
DELETED_SETTINGS	deleted		config
DELETED_TEMPLATE_DIRECTORIES	deleted		directory
DELETED_TEMPLATE_FILES	deleted		file
DELETED_USER	deleted	AAA	user
DELETED_USER_DEFINED_PROPERTY	deleted		config
EXPORT_PRIVATE_KEY	read		config
FILE_DOWNLOAD	read		file
FILE_UPLOAD	added	filesystem	file
FILES_DELETE	deleted	filesystem	file
FORCED_USER_LOGOUT	logout
JOINED_NTLM	modified		config
LEFT_NTLM	modified		config
MODIFIED_ADMINROLE	modified		role
MODIFIED_APPLIANCE_SETTINGS	modified		config
MODIFIED_CLUSTER_CONFIGURATION	modified		config
MODIFIED_CONTENT	modified		config
MODIFIED_CATALOG	modified		config
MODIFIED_GROUP_ROLE_MAPPING	modified		role
MODIFIED_LIST	modified		config
MODIFIED_NTLM	modified		config
MODIFIED_RULE	modified		config
MODIFIED_RULE_GROUP	modified		config
MODIFIED_SETTINGS	modified		config
MODIFIED_SYSTEM_FILES	modified	filesystem	file
MODIFIED_TEMPLATE_FILES	modified	filesystem	file
MODIFIED_USER	modified	AAA	user
MODIFIED_USER_DEFINED_PROPERTY	modified		config
MOVED_RULE_GROUPS	modified		config
MOVED_RULES	modified		config
REORDERED_CONTENT	modified		config
RESTORE_FAILED	modified		config
RESTORE_STARTED	pending		config
RESTORE_SUCCEDED	modified		config
SAVING_FAILED	read		config
SYSTEM_LIST_UPDATE	modified		config
TRIGGER_ACTION	pending		config
USER_LOGIN	success
USER_LOGIN_FAILED	failure
USER_LOGOUT	logout
USER_TIMED_OUT	timeout

Audit.log can be sent to Splunk using either the UF or Syslog.

using UF

This method sends audit events "as is" - multiline, with all details.

inputs.conf:

[monitor:///opt/mwg/log/audit/audit*]
#index=proxy_audit
sourcetype=mcafee:webgateway:audit

using syslog (2 methods):

using rsyslog file monitor:

This method sends audit events "as is" - multiline, with all details.

create a new file /etc/rsyslog.d/swg_audit_log.conf with the following content:

module(load="imfile")

# exclude this facility.severity in rsyslog.conf: local5.!=info
input(type="imfile"
      File="/opt/mwg/log/audit/audit.log"
      Tag="swg_audit_log"
      Facility="local5"
      Severity="info")

template(name="msg_only_udp" type="string" string="%msg%")
# template(name="msg_only_tcp" type="string" string="%msg:2:$%")

if $programname == "swg_audit_log" then {
    action(type="omfwd"
           Target="splunk.server"
           Port="10514"
           Protocol="udp"
           Template="msg_only_udp")
}

modify rsyslog.conf via UI to exclude local5.info: add local5.!=info to the line with /var/log/messages
restart rsyslog: systemctl restart rsyslog

Write Audit log to syslog:

This method produces one-line events (easier to read, but with fewer details)
- Configuration > Log File Manager > Settings for the Audit Log: enable "Write audit log to syslog" checkbox.
- Add to rsyslog.conf using UI (modify syslog-server and port):
```
$template msg_only_udp,"%msg%" # An alternative format, without a syslog header, for UDP
if $programname == 'mwg' and $syslogfacility-text == 'auth' and $syslogseverity-text == 'info' then @splunk-server:10514;msg_only_udp
```
- modify rsyslog.conf via UI to exclude auth.info: add auth.!=info to the line with /var/log/messages
- restart rsyslog: systemctl restart rsyslog

/var/log/messages log

/var/log/messages is a system log file that records various system messages and events.

To send it using UF (recommended):

add a user splunk (for Splunk) or splunkfwd (for Splunk Forwarder) to the adm group (using vigr or by modifying /etc/group): adm:x:4:mwgc,tomcat,splunk

create a monitor stanza in local/inputs.conf:

[monitor:///var/log/messages*]
# host = your_host
# index = proxy
sourcetype = linux_messages_syslog

systemctl restart SplunkForwarder

To send it using syslog:

create a new file /etc/rsyslog.d/swg_var_log_messages_log.conf with the following content (take the log string from the rsyslog.conf (UI), modify splunk host and port):

$template msg_only_udp,"%msg%" # An alternative format, without a syslog header, for UDP
*.info;daemon.!=info;daemon.!=debug;daemon.!=notice;mail.none;authpriv.none;cron.none @splunk:10614;msg_only_udp

The previous setup sends a mix of various syslog severities and facilities to the receiver, making it more difficult to filter and separate the syslog stream. An alternative setup will read from /var/log/messages and send the logs using a custom severity/facility combination:

###############################################
# var_log_messages.conf – dedicated rule set  #
###############################################

module(load="imfile")                       # <- load only once globally

# very small template: send the raw line only
template(name="mwgOnlyMsg" type="string" string="%msg%")

# ---------- ruleset that talks to the log-collector -----------------
ruleset(name="var_log_messages"){
    action( type="omfwd"
            target="syslog.example.com"
            port="514"
            protocol="udp"
            template="mwgOnlyMsg"
            # Action.SendingQueue.Size="10000"
            Action.ResumeRetryCount="-1"
    )
    stop                                     # absolutely nothing else
}

# ---------- file monitor ------------------------------------------------
input(
    type="imfile"
    File="/var/log/messages" 
    Tag="var_log_messages"        # colon will be auto-appended
    Facility="local7"
    Severity="info"
    addMetadata="on"                         # if you need %$!metadata!filename%
    ruleset="var_log_messages"                  # every line -> out
)

restart rsyslog: systemctl restart rsyslog
assign it to default sourcetype linux_messages_syslog

/var/log/secure log

/var/log/secure is a system log file that contains security-related events and authentication information, for example:

pam_unix(sshd:session): session opened for user root by (uid=0)
pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.1.2.3  user=root

To send it using UF (recommended):

add a user splunk (for Splunk) or splunkfwd (for Splunk Forwarder) to the adm group (using vigr or by modifying /etc/group): adm:x:4:mwgc,tomcat,splunk

create a monitor stanza in local/inputs.conf:

[monitor:///var/log/secure*]
# host = your_host
# index = proxy_audit
sourcetype = linux_secure
# sourcetype = mcafee:webgateway:secure

systemctl restart SplunkForwarder

To send it using syslog:

create a new file /etc/rsyslog.d/swg_var_log_secure_log.conf with the following content (modify splunk host and port):

$template msg_only_udp,"%msg%" # An alternative format, without a syslog header, for UDP
authpriv.* @splunk:10714;msg_only_udp

restart rsyslog: systemctl restart rsyslog
assign it to default sourcetype linux_secure

An example of usage:

source="/var/log/secure" host="prx*" index="proxy" sourcetype="linux_secure" authentication fail* | rex "(?<failure_count>\d+) more authentication failures" | eval failure_count=if(isnotnull(failure_count),failure_count,1) | stats sum(failure_count) AS count values(host) AS host by process rhost user
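The counting logic of this search (a capture group named failure_count, with a fallback of 1 when the group does not match) can be emulated in Python to verify the regular expression against real log lines. The function name and the sample messages are illustrative and modeled on the /var/log/secure excerpt above.

```python
import re

FAILURES = re.compile(r"(?P<failure_count>\d+) more authentication failures")


def failure_count(message: str) -> int:
    """Return N for 'N more authentication failures', or 1 for a single failure line."""
    match = FAILURES.search(message)
    return int(match.group("failure_count")) if match else 1


messages = [
    "pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.1.2.3  user=root",
    "PAM 2 more authentication failures; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.1.2.3  user=root",
]
total = sum(failure_count(m) for m in messages)
print(total)  # 3
```

Summing the per-event counts mirrors the stats sum(failure_count) step of the search.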

/opt/mwg/log/mwg-errors

The folder /opt/mwg/log/mwg-errors contains various types of logs:

Log name	Log type	Comment
mwg-core	text, single line	mwg-core logging
mwg-coordinator	text, single line	mwg-coordinator logging
mwg-ui	text, multi line	mwg-ui Tomcat logging
mwg-logmanager	text, single line	mwg-logmanager logging
mwg-uideserialization	text, single line	mwg-ui deserialization logging
mwg-sysconfd	text, single line	mwg-sysconfd logging
mwg-monitor	text, single line	mwg-monitor logging
mwg-saas-connector	text, single line	mwg-saas connector logging
*.bin	binary	cannot be parsed. Can be excluded, but the presence of such logs is also a good hint about potential issues.

To send it using UF (recommended):

add a user splunk (for Splunk) or splunkfwd (for Splunk Forwarder) to the mwg group (using vigr or by modifying /etc/group): mwg:x:199:tomcat,splunk

create a monitor stanza in local/inputs.conf:

  
[monitor:///opt/mwg/log/mwg-errors/mwg*log]
sourcetype = mcafee:webgateway:mwg_errors
# index = proxy_audit

systemctl restart SplunkForwarder

To send it using syslog:

Create a file /etc/rsyslog.d/swg_opt_mwg_log_mwg_errors.conf with the following content:

#############################################
# swg_opt_mwg_log_mwg_errors.conf – dedicated rule set  #
#############################################

module(load="imfile")                       # <- load only once globally

# very small template: send the raw line only
template(name="mwgOnlyMsg" type="string" string="%msg%")

# ---------- ruleset that talks to the log-collector -----------------
ruleset(name="mwgErrors_out"){
    action( type="omfwd"
            target="syslog.example.com"
            port="514"
            protocol="udp"
            template="mwgOnlyMsg"
            Action.SendingQueue.Size="10000"
            Action.ResumeRetryCount="-1"
    )
    stop                                     # absolutely nothing else
}

# ---------- file monitor ------------------------------------------------
input(
    type="imfile"
    File="/opt/mwg/log/mwg-errors/mwg-co*.errors.log" # monitor both mwg-core and mwg-coordinator
    Tag="swg_mwg-errors_mwg-core_log"        # colon will be auto-appended
    Facility="local6"
    Severity="info"
    addMetadata="on"                         # if you need %$!metadata!filename%
    ruleset="mwgErrors_out"                  # every line -> out
)

systemctl restart rsyslog

Self-Monitoring

SWG offers approximately 300 system properties and counters that can be collected from all appliances and analyzed. These include:

System Details: Information about CPU load, memory usage, disk usage, and more.
Performance Statistics: Metrics such as the number of requests.
Network Statistics: Data on bytes sent/received, close waits, etc.
SWG version and Antivirus modules version
Proxy Statistics: Details like the current number of connections and client count.
and many others

These statistics can be collected and sent to Splunk along with the access log. A scheduled rule engine trigger or a cron job can be used to perform a request to a non-existent domain called ‘reporting.test’. The monitoring data is sent to Splunk as part of the URL path, for example:
2024-03-09 16:21:04 +0100 s=403 ac=blocked src=255.255.255.255 p=http m=- d=reporting.test dp=80 bi=0 bo=0 dur=0 rt=0 up="/hostname_proxy24_ProxyIP_10.20.30.40_MWGVersion_12.2.5_MWGBuildNumber_47878___Lic336_CPULoad4_CPUIdle94_MemFree13645803520_MemUsed7259697152_...."
In the example above, the proxy hostname is proxy24, the IP is 10.20.30.40, the software version is 12.2.5, the build is 47878, the license has 336 days remaining, and so on.
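How such a URL path breaks down into counters can be sketched in Python. This is only an illustration under assumptions: the path layout (underscore-separated key/value pairs, then "___", then NameNumber tokens) is inferred from the single example above, the truncated tail of that example is left out, and the real field extraction is done by the app's own configuration, not by this code.

```python
import re

# Sample taken from the example above, without the truncated tail.
SAMPLE = (
    "/hostname_proxy24_ProxyIP_10.20.30.40_MWGVersion_12.2.5_MWGBuildNumber_47878"
    "___Lic336_CPULoad4_CPUIdle94_MemFree13645803520_MemUsed7259697152"
)


def parse_monitoring_path(path: str) -> dict:
    """Split a self-monitoring URL path into a dict of properties and counters."""
    head, _, tail = path.lstrip("/").partition("___")
    # First part: alternating key and value tokens (assumes values contain no underscore).
    tokens = head.split("_")
    fields = dict(zip(tokens[0::2], tokens[1::2]))
    # Second part: counter name immediately followed by its numeric value.
    for name, value in re.findall(r"([A-Za-z]+)(\d+)(?=_|$)", tail):
        fields[name] = int(value)
    return fields


for key, value in parse_monitoring_path(SAMPLE).items():
    print(f"{key} = {value}")
```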

A pre-configured rule set is provided with the app package (located in the MWG folder). Consider it as a lightweight alternative for full-fledged monitoring with SNMP but without installing and configuring any software besides Splunk. This self-monitoring is especially useful for a PoC and quick troubleshooting.

A full list of all available counters can be viewed here: List of Counters.

The statistic counters can be sent from Web Gateway to Splunk every 60 seconds via the already configured mcafee:webgateway:custom sourcetype, with no need to configure SNMP, firewall rules etc. The counters are sent along with other events in the same mcafee:webgateway:custom sourcetype. The following three steps are all that is needed to enable it:

Import a self-monitoring RuleSet from the MWG folder of the app: Policy > RuleSets (under Policy RuleSets, not in Log Handler) > Add > Top Level RuleSet > Import Rule Set from Rule Set Library > Import from file
Place the Monitoring RuleSet as a first top-level rule set
Configure a periodic rule engine trigger (Configuration > Appliances > [each appliance] > Proxies (HTTP..) > Advanced Settings > Periodic Rule Engine Trigger > "http://reporting.test", Trigger Interval: 60 seconds). Alternatively, you can create a cron job on each appliance: * * * * * curl -x proxyip:proxyport http://reporting.test/ >/dev/null 2>&1

The ruleset can be modified to include other counters as needed.

The self-monitoring will be extended in future versions, so check for updates if you find this feature useful.

Next steps / Action Plan

You want to:	Action
complete setup	Double-check the Onboarding checklist. If the timestamp format was modified, adjust the validation regex in the Splunk RuleSet > DEBUG > Verify Log Structure and report if it is not correct. Search for logging errors: LOGERR1 OR LOGERR2 OR LOGERR3
use non-default index	Modify the "index_and_sourcetype" macro to include an index (e.g. 'index=proxy AND sourcetype="mcafee:webgateway:custom"')
improve search speed and speed up reporting	Upgrade to version 5.x and switch to the new log format. App version 5.x supports all previous log formats and introduces a new format for faster searches (up to 30 times faster) and faster reports (up to 100 times faster).
implement Common Information Model (CIM)	Install Splunk Common Information Model (CIM) App
import new version of the Splunk Logging Ruleset but keep all modifications	Use the mwg_xml2txt and dump_logging_fields scripts to see the differences between versions.
build accelerated DM	Don't put highly variable strings like uri_path, uri_query, url in an accelerated DM unless you really need them
improve proxy performance, find causes of high latency	Check errors, web cache (should be disabled!), timers (esp. DNS)
configure data retention	Configure frozenTimePeriodInSecs TBD
implement some GDPR requirements	Check if personally identifiable information (PII) should be removed, encrypted, obfuscated or masked. TBD
investigate a breach/incident	Create a copy of all relevant events (also from other sources) to avoid aging it out. TBD
implement a 4-eyes principle	It can be implemented either on the proxy side or using Splunk. TBD
mask/obfuscate some fields	It can be implemented either on the proxy side or using Splunk. TBD
send events to other destinations besides Splunk	Modify rsyslog.conf or use "Route and filter data". TBD
customize or create own views and reports	Dashboard Customization
add new fields	First, check if the required field is already available. Send me an email, so I can include it in the log template. If the field is too specific, consider creating a new ruleset in the Splunk ruleset and putting all new fields there - this greatly simplifies an update/migration. To benefit from the PREFIX/TERM acceleration, use the key=value format if the value can't contain any major breakers.
exclude some events from search	Create a macro to exclude some sources, destinations or user-agents and add it to a query
exclude some events from logging	On MWG: Modify existing list "Domains not to log" or create own excluding rules
improve search performance	rewrite queries to use base search (be aware of base search restrictions); rewrite queries to use an accelerated data model; use TERM/PREFIX (implemented in version 5.x) [1]; add to the index_and_sourcetype macro: DIRECTIVES(REQUIRED_EVENTTYPES(eventtypes=""),REQUIRED_TAGS(tags="")) - helps on a busy SH with many installed TAs [1]
correctly log FTP/FTPoverHTTP connections	Due to the nature of FTP requests, the MWG events don't correctly reflect connection type. This requires more work, both on MWG and on Splunk side. TBD
use TERM/PREFIX for fields with major breakers like User-Agent	Build a ua field on SWG without major breakers using String.ReplaceAllMatches( field, regex([^\w\-]+),"_")
work with IPv6 addresses	TBD
you have an idea how to improve this app or need support	Write an email to splunkcompek.net
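The "use TERM/PREFIX for fields with major breakers" row builds a ua field in which every run of characters outside [\w\-] is replaced by an underscore. The effect of that regex can be previewed with a small Python sketch. It is only an illustration: the function name is hypothetical, and Python's regex engine may differ in detail from the one behind String.ReplaceAllMatches on SWG (re.ASCII is used here so that \w means [A-Za-z0-9_]).

```python
import re

# Same character class as in the table row above: everything except word characters and "-".
_NOT_ALLOWED = re.compile(r"[^\w\-]+", re.ASCII)


def sanitize_for_term(value: str) -> str:
    """Replace each run of disallowed characters with a single underscore."""
    return _NOT_ALLOWED.sub("_", value)


print(sanitize_for_term("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
print(sanitize_for_term("curl/7.68.0"))
```

The resulting value contains no spaces, parentheses, semicolons or quotes, so it can be logged in key=value form without major breakers.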

FAQ

Q: The default dashboard filter (user, src, destination and user-agent) is not sufficient, how to add other filter conditions? A: You can use any of the default input fields to add your own SPL, for example enter * block_id=80 in the "User" input field to show all matching events where block_id equals 80. See also Dashboard Customization.
Q: MWG or SWG? A: MWG underwent several acquisitions:
- initially named Webwasher (-2004)
- Cyber Guard Webwasher (2004-2006)
- Secure Computing Webwasher (2006-2008)
- McAfee Web Gateway (2008-2013)
- Intel Security Web Gateway (SWG) (2013-2017)
- then again as McAfee Web Gateway (2017-2022)
- now SkyHigh Security Web Gateway (SWG) (2022-).
Q: You want to extend/modify/improve the app/documentation or get a beta build. A: Send your requests to splunkcompek.net

Dashboard Views

Summary

Required fields: action, bytes_in, bytes_out, http_user_agent

Summary+

Starting with version 5 of the app, new accelerated views are introduced. They are based on tstats/TERM/PREFIX and provide a significant speedup. They only work with the new mcafee:webgateway:custom log format. The new accelerated views have a plus suffix (e.g. Summary+).

Easy search

This view provides most common searches based on user, src ip, dest, action and user-agent input.

Required fields: status, category, block_id, url_domain, rule, action, bytes_in, bytes_out, http_user_agent, bytes_to_client, bytes_from_client, web_reputation, block_reason

Raw_Search

This is just a raw splunk prompt with a prefilled `index_and_sourcetype` macro.

Search+

This is an accelerated (based on PREFIX) table of most important fields with a drilldown to a "normal" SPL search.

URL Filter

Required fields: action, bytes_in, bytes_out, http_user_agent, category, web_reputation, user, geolocation

URL Filter+

Accelerated version of the URL Filter view

Traffic

Required fields: action, bytes_in, bytes_out, bytes_from_client, bytes_to_client, url_domain

Traffic+

Accelerated version of the Traffic view

Media Types

There can be several media types in a web transaction. For example when a user uploads a zip file that contains pictures, audio and executables, we'll see application/zip in the Request Cycle, various image/*, audio/* and application/executable in the embedded cycle of Request Cycle and text/html in the Response cycle. Which of them should be logged?
Additionally, if the server sends a wrong Content-Type header, we want to check it with MediaType.EnsuredTypes and report Magic Bytes Mismatch. Most web transactions are downloads, so it is common practice to log the content type of the response.
The logging ruleset supplied with the app has a "MediaType Watchlist" that can be used to track some potentially dangerous media types.

Magic Bytes Mismatch: the web server presents a Content-Type header that doesn't match the content of the file. Example: http://www.csm-testcenter.org/download/media_type/video.html should produce the following log entry: mbmismatch ct=text/html mte=video/mpeg
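The idea behind a magic bytes mismatch can be shown with a toy Python sketch. It is not how MediaType.EnsuredTypes works internally: the signature table below is a tiny hand-picked assumption (SWG uses its own, far larger media type detection), and the function names as well as the plain "ct=..." fragment for the non-mismatch case are hypothetical.

```python
# A few well-known file signatures; a deliberately tiny table for illustration only.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x00\x00\x01\xba", "video/mpeg"),  # MPEG program stream pack header
    (b"\x00\x00\x01\xb3", "video/mpeg"),  # MPEG video sequence header
]


def ensured_type(body: bytes):
    """Return the media type implied by the leading bytes, or None if unknown."""
    for magic, media_type in SIGNATURES:
        if body.startswith(magic):
            return media_type
    return None


def log_fragment(declared_content_type: str, body: bytes) -> str:
    """Build a log fragment and flag a mismatch between header and content."""
    declared = declared_content_type.split(";")[0].strip().lower()
    detected = ensured_type(body)
    if detected is not None and detected != declared:
        return f"mbmismatch ct={declared} mte={detected}"
    return f"ct={declared}"


# A server claims text/html but actually delivers an MPEG stream.
print(log_fragment("text/html; charset=utf-8", b"\x00\x00\x01\xba" + b"\x00" * 8))
```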

Required fields: action, bytes_in, bytes_out, http_user_agent, category, web_reputation, user, content_type, http_content_type, file_name, media_types_not_ensured, media_type_ensured, magic_bytes_mismatch

Malware

Required fields: action, http_user_agent, category, web_reputation, malware, bytes_in, url, malware_file_name, malware_file_hash, block_id

Protocols

Required fields: action, bytes_in, bytes_out, http_proto_version, protocol, req_http_version, url_protocol, resp_http_version, http_user_agent

Protocols+

Accelerated version of the Protocols view

Connections

Required fields: action, bytes_in, bytes_out, url_domain, connection_runtime, web_reputation, category, user, http_user_agent, url, dest_ip, dest_port, proxy_src_ip, proxy_src_port

Applications

Required fields: action, bytes_in, bytes_out, http_user_agent, user, app

User-agents

Required fields: action, bytes_in, bytes_out, http_user_agent, user, web_reputation, web_reputation_risk, url_protocol, url_domain, bytes_to_client, bytes_from_client, status, status_category, rule, malware, malware_probability, file_name, file_hash, contains_macro, contains_exe, category, block_reason, block_id

User-agents+

Accelerated version

Performance

Required fields: user, http_user_agent, t_dns, duration, t_externals, http_method, client_side_latency, bytes_out, bytes_in, t_lrlsc, t_connect, latency

Network

Required fields: user, status, t_dns, duration, http_user_agent, action

Authentication

Required fields: user, http_user_agent, status, action, authentication_method

Uploads

Overview of uploads. Requires copying the Media Type Filter / Track Uploads ruleset from Splunk/RuleSet Library.

Required fields: web_reputation, url, bytes_in, bytes_out, action, category, file_name, ensured_mediatypes, http_user_agent, content_type

Potential Risks

Required fields: dest, dest_port, user, http_user_agent, url, http_method, web_reputation, action, url_domain, status, resp_headers_length, url_length, dest_ip, content_disposition, category, url_protocol, req_headers_length, error_template

DNS

Required fields: t_dns, user, http_user_agent, action

DoH

Statistics on DNS-over-HTTPS (DoH). To enable it, copy the RuleSet "DNS-over-HTTPS" from Policy > RuleSets > Log Handler > Splunk > RuleSet Library to the policy rule sets, after HTTPS Scanner and Media Type Filter.

Required fields: user, http_user_agent, DoH, action

Rules

Required fields: t_rule_engine, user, http_user_agent, action, url, t_externals, rules_fired, rules_evaluated, duration, bytes_in, bytes_out

HTTP

Required fields: user, http_user_agent, action, http_method, resp_headers, req_headers

Headers

The HTTP protocol can be used by malware to communicate with C&C, blending in the normal web traffic generated by benign applications like browsers. However, most enterprise security solutions don’t analyze all parts of the HTTP protocol and even if they do, only partial information can be logged: either a small subset of headers (like User-Agent, X-Forwarded-For, Referer, etc.) or header names must be configured explicitly to be logged. Neither of these methods allows logging of all or unknown headers.

Fortunately, recent MWG/SWG versions close this security gap by allowing all HTTP headers to be logged. The rule-based policy logic makes it possible to apply such deep logging to suspicious transactions only, significantly reducing log volume.

Possible use cases:

log transactions with very long headers to check if they are used as a covert channel
log specific headers like Upgrade, Connection etc. to monitor use of Websockets
fingerprinting suspicious transactions

Enabling the collection of header information on the SWG side: Some headers like User-Agent, Content-Type, Referer, Content-Length, Content-Disposition are already logged, therefore use the Headers ruleset to log specific or all headers.

Warning: Be aware of potential issues when headers contain special characters like the equal sign, quotes, the "less than" character etc., which can break parsing and even introduce security vulnerabilities. If unsure, start with a sanitized version, where all characters except [a-zA-Z0-9:_\.] are replaced.
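What such a sanitized version looks like can be previewed with a short Python sketch. It rests on assumptions: the replacement character "_" and the per-character replacement are choices made for this illustration (the text above does not say what the characters are replaced with), and the function name is hypothetical.

```python
import re

# Everything outside the allowed set [a-zA-Z0-9:_.] is considered unsafe for logging.
_UNSAFE = re.compile(r"[^a-zA-Z0-9:_.]")


def sanitize_header(raw: str, replacement: str = "_") -> str:
    """Replace every character outside the allowed set, one by one."""
    return _UNSAFE.sub(replacement, raw)


# Quotes, "=", "<" and ">" can no longer break key=value parsing downstream.
print(sanitize_header('X-Token: a="b"<c>'))
```

Note that the hyphen is outside the allowed set too, so header names such as X-Token are rewritten as well.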

Applying conditional criteria for header collection to specific transactions only: avoid enabling header logging without a clear purpose, as it can do more harm than provide benefits. Enable header logging for suspicious transactions that require investigation.

Log example with request header information: TBD

Working with a Header View: TBD

Next step: configure the response header collection

Required fields: user, http_user_agent, action, hrq_*, host_header, len_hrq_*

SSL

Required fields: user, http_user_agent, ssl_client_cipher, ssl_client_kex_bits, ssl_client_protocol, ssl_server_cipher, ssl_server_kex_bits, ssl_server_protocol

Security posture

Required fields: url_protocol, user, tunnel_enabled, ssl_client_context_is_applied, http_user_agent, action

Certificates

Required fields: user, ssl_subject_common_name, reverse_dns, http_user_agent, web_reputation, ssl_server_cert_chain_issuer_cns, ssl_cert_days_expired, block_id, category, ssl_server_handshake_cert_requested, ssl_cert_valid_days

Errors

Required fields: url_protocol, url, status, protocol_failure_description, http_method, error_id, dest_port, bytes_in, bytes_out, block_id, action, web_reputation, error_template, error_message, t_lslrs, t_dns, t_connect, src_ip, src_port, http_user_agent, connection_runtime

MWG-Errors

Dashboard for logs in /opt/mwg/log/mwg-errors: mwg-core.errors.log, mwg-coordinator.errors.log etc.

Monitoring

Centralized monitoring of all your Web Gateway appliances. More details: Self-Monitoring

Audit

Requires the mcafee:webgateway:audit sourcetype.

Audit - Timeline

Provides a visual timeline of failed logins, changes, uploads/downloads and other actions.

Requires Timeline visualization App https://classic.splunkbase.splunk.com/app/3120/

Audit Log visualisation for McAfee/SkyHigh Web Gateway: successful/failed logins, changes made, login duration, logout reason (logout or timeout)

Unfiltered Threats

Shows allowed transactions to risky sites (URL category like Malicious Sites, or reputation Medium or High Risk) or downloads of malware. This view requires the new log format (with s=200 src= dest=) introduced in version 5.0.0.

Help

Troubleshooting

This is a hidden view intended to help with initial setup and troubleshooting. It contains the following tables:

Sourcetype presence
Sourcetype presence + related Apps (this search uses REST and works with the enterprise license only)
Sourcetype detection
Presence of important fields

Dashboard Customization

Sometimes it is necessary to add your own input elements, for example a dropdown list of indexes or a group of hosts. This can easily be done using sed. Each view (starting with version 5.0.6) contains a placeholder line that can be used to insert your input element. In the following example we add an input element and modify the search queries in the SPL code to use a new token.

Prepare a text file (for example /tmp/textblock.txt) with an input element:

    <input type="dropdown" token="index_and_sourcetype_macro" searchWhenChanged="true">
      <label>Index</label>
      <choice value="`index_and_sourcetype_prod`">Prod</choice>
      <choice value="`index_and_sourcetype_nonprod`">Non-Prod</choice>
      <default>`index_and_sourcetype_prod`</default>
    </input>

Run the following commands line by line:

su - splunk # change to splunk user
mkdir -p $SPLUNK_HOME/etc/apps/McAfeeWebGateway/local/data/ui/views # create a local folder
cp $SPLUNK_HOME/etc/apps/McAfeeWebGateway/default/data/ui/views/*xml $SPLUNK_HOME/etc/apps/McAfeeWebGateway/local/data/ui/views # copy views from default to the local folder
cd $SPLUNK_HOME/etc/apps/McAfeeWebGateway/local/data/ui/views # cd to the local folder

Some views, like "audit", "audit_timeline" and "mwg_errors" (and maybe other views in the future) use different macros, you can skip them when adding a snippet:

for file in *xml; do
  if [ "$file" != "mwg_errors.xml" ] && [ "$file" != "audit.xml" ] && [ "$file" != "audit_timeline.xml" ]; then
    sed -i '/<!-- Placeholder for additional inputs -->/r /tmp/textblock.txt' "$file"
    sed -i 's/`index_and_sourcetype`/$index_and_sourcetype_macro$/g' "$file"
  fi
done

Do debug/refresh

If you just need to add the same text block for all views use this sed command instead:

for file in *xml; do sed -i '/<!-- Placeholder for additional inputs -->/r /tmp/textblock.txt' "$file"; done # add a textblock to each view
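Before changing the files, the intended effect of the customization loop (insert the text block after the placeholder line, then replace the `index_and_sourcetype` macro with the new token) can be previewed on a string. The following Python sketch is only a dry-run aid under assumptions: the function name and the sample view are hypothetical, and it does not read or write any Splunk files.

```python
PLACEHOLDER = "<!-- Placeholder for additional inputs -->"
MACRO = "`index_and_sourcetype`"
TOKEN = "$index_and_sourcetype_macro$"


def customize_view(xml_text: str, block: str) -> str:
    """Insert block after each placeholder line, then swap the macro for the token."""
    lines = []
    for line in xml_text.splitlines():
        lines.append(line)
        if PLACEHOLDER in line:
            lines.extend(block.splitlines())
    return "\n".join(line.replace(MACRO, TOKEN) for line in lines) + "\n"


# Hypothetical, heavily shortened view and input element.
view = (
    "<form>\n"
    "  <!-- Placeholder for additional inputs -->\n"
    "  <query>`index_and_sourcetype` action=blocked</query>\n"
    "</form>\n"
)
block = '<input type="dropdown" token="index_and_sourcetype_macro"/>\n'
print(customize_view(view, block))
```

Macros with a longer name, such as `index_and_sourcetype_prod` inside the inserted block, are left untouched because the closing backtick is part of the search string.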

Troubleshooting

Has the corresponding MWG Logging RuleSet been imported?
Are some charts and tables empty? - Check that the required fields and values are collected by the Splunk Rule Set in the Logging Cycle, activate them as needed.
Does a "Last Rule" exist on MWG?
Were the supplement rules copied in the Policy Rule Sets?
Is Splunk getting any input?
Does a search for index=* (sourcetype=mcafee:* OR sourcetype=MWGaccess3) output raw events?
Does Splunk recognize timestamps correctly?
If sent via Syslog - was the Syslog header part correctly removed?
Are there any errors in $SPLUNK_HOME/var/log/splunk/splunkd.log?
Problem: Events are not parsed correctly because of an extra space character before the timestamp. Solution: modify the log template in MWG to $template msg_only,"%msg:2:$%"
Problem: Events are not parsed correctly because first character(s) of the timestamp is cut off. Solution: modify the log template in MWG to $template msg_only,"%msg:1:$%" or even $template msg_only,"%msg:$%"
Problem: Imported Splunk RuleSet has some RuleSets marked red - some properties like Header.Request.GetAll are available only on new MWG versions (10+) and rules containing such "unknown" properties will be marked red if imported on older MWG versions. Just delete these rules or upgrade the MWG to the newest 10+ version.

If a TLS Ruleset is shown red, modify it as follows: delete the second condition "SSL.Server.Certificate.SignatureMethod is not in list null" and replace it with "SSL.Server.Certificate.SignatureMethod is not in list Safe Signature Algorithms". Safe Signature Algorithms is a McAfee-supplied list that should already be present in recent MWG versions.

If the list "Safe Signature Algorithms" is not present, create it as follows:
Problem: The rule value is empty, therefore Rule Statistics doesn't work on MWG 11.0-11.0.2. Answer: this is a bug in the Map.GetStringValue function that was fixed in MWG 11.1. Please update your MWG or temporarily disable the Log Handler > Splunk > Rules > "Rules.CurrentRule.Name (short if exists in Rule Map)" rule.

Problem: Some fields are not extracted correctly or are missing. Answer: The configuration from another app can override or suppress the intended field extraction. btool doesn't work well in such situations, so use this SPL on the Search Head (replace the sourcetype as needed):

| rest splunk_server=local /servicesNS/-/-/configs/conf-props search="eai:acl.app=*" 
| search title="mcafee:webgateway:custom" (title!=null) (eai:acl.app!=null)
| rename eai:acl.app as app, eai:acl.perms.read as read, eai:acl.sharing as sharing, eai:acl.perms.write as write
| fields - updated published id eai* null*
| fields title author splunk_server app read write sharing **
| eval title="[".title."]"
| foreach * [eval title=if("<<FIELD>>"="author" OR "<<FIELD>>"="splunk_server" OR "<<FIELD>>"="app" 
OR "<<FIELD>>"="read" OR "<<FIELD>>"="write"  OR "<<FIELD>>"="sharing" 
OR "<<FIELD>>"="title" OR '<<FIELD>>'="",title,mvappend(title,"<<FIELD>>"." = ".'<<FIELD>>'." "))]
| fields title author splunk_server app read write sharing
| search title=**

Summary of changes

5.0.10 - improved views: mwg_errors, traffic, unfiltered_threats. New (hidden) view "troubleshooting" for initial setup and debugging (contains the following tables: Sourcetype presence, Sourcetype presence + related Apps, Sourcetype detection). The eventtype "Web" renamed to "mcafee_webgateway_Web". New version of Log Template - v0.14 - fixed extraction of Referer.Domain, prepared NextHopProxy.Address field, prepared Authentication.Method from RawCredentials to detect insecure Basic-over-NTLM Authentication.
5.0.9 - fixed a typo in the sourcetype name "mcafee:webgateway:mwg_errors" (mwg-errors -> mwg_errors)
5.0.8 - added searchWhenChanged=true to all inputs, visual improvements in Errors and MWG_Errors views, fixed a search error in Errors view, fixed macro "index_and_sourcetype_mwg_errors", added a sourcetype definition for "mcafee:webgateway:mwg-errors".
5.0.7 - added mwg_errors and unfiltered threats views, added new columns to Traffic and Traffic5 views, new macro "index_and_sourcetype_mwg_errors".
5.0.6 - new version of the log template: improved creation of ua2 field (for accelerated search). Added accelerated view User-Agents+. Each view has a placeholder to simplify custom modifications. Improved Errors-view. Improved Traffic and Traffic+ statistics. Minor fixes.
5.0.5 - added search macro for Audit_Timeline, clarified configuration options for the least-privileged splunkfwd user on the UF and other security options.
5.0.4 - added new views: Search+ (accelerated) Audit-Timeline and Bad_Reputation. Minor fixes in Monitoring and Authentication views. Added Sparklines to Monitoring view. Added an option to switch between Bytes/MB/GB to Overview page. Added drilldowns to URL (accelerated) view. props.conf - fixed extraction of the AuthMethod field. Added documentation about handling of punycode domains using custom segmenters.conf. Search renamed to Raw_Search to avoid overlapping with other savedsearches. transforms.conf: in the rewrite_host_from_host_field extraction - the field name called now swg and not host to avoid accidental overwriting of the host field. Improved documentation. Added a new improved version of the logging template.
5.0.3 - added parsing of SSE/WGCS logs up to API version 12.
5.0.2 - added documentation about logging of /var/log/messages and /var/log/audit, fixed missing tokens in protocols view.
5.0.1 - minor fixes.
5.0.0 - New major release, backwards compatible with old 4.x versions and old versions of log. This version provides major speedup (up to 100-fold) of reports using PREFIX (requires Splunk 8 and above). To use this new mode a slight log format modification required, read README for details.
4.0.14 - added interactive online configuration builder and new views: monitoring, DoH and certificates. Added experimental support for DoH (DNS over HTTPS) and Client-Hints. Improved documentation.
4.0.13 - added HTTP headers analysis view, new MWG Logging template, a supplemental script to compare MWG Logging templates to facilitate logging template upgrades. Improved documentation to include more best practices.
4.0.12 - an internal release
4.0.11 - added a lookup of executables that can be used for download and exfiltration (https://lolbas-project.github.io/). Fixed a TIME_PREFIX for wgcs_v5
4.0.10 - fixed extraction of authentication_method, authentication_realm, auth_failure_message and auth_failure_id fields (Thank you ML!)
4.0.9 - improved WGCS regexes, now URL, rule name and User-Agent fields that contain quote character(s) are parsed correctly. Improved a TIME_PREFIX to fix parsing errors. New CIM fields added. Added distsearch.conf to enable replication of macros.
4.0.8 - added sc_admin role to default.meta
4.0.7 - support for MWG audit log, feedback form, and a new auth method statistics view
4.0.6 - better README with more examples, global export in default.meta, MWG Log has autorotation/autodeletion enabled in case it is not enabled globally
4.0.5 - added parsing of McAfee Web Gateway Cloud Service (WGCS) Logs
4.0.4 - applied required changes to maintain compatibility with Splunk Cloud (use jQuery 3.5), improved documentation, minor fixes
4.0.3 - added Security Posture view, minor fixes
4.0.2 - improved Error Analysis view, minor fixes
4.0.1 - new major release, new log format, better documentation, new views: SSL, Errors, Uploads
3.0.7 - committed changes in props.conf and transforms.conf by Myron Davis, added a contributors section in README, clarifications for the installation process in README
3.0.6 - enabled Splunk CIM (Common Information Model) version 4, by Myron Davis, compatibility with Splunk App for Enterprise Security, by Myron Davis, renamed App folder from AppForMcAfeeWebGateway to McAfeeWebGateway to match it with the app ID
3.0.5 - The App package now includes a step-by-step installation instruction with screenshots, the log structure has been reordered to avoid overwriting of parameters
3.0.4 - Introduced new short log format, many redundant fields removed, cleanup, faster search, and some panels were merged. This new major version isn't compatible with the version 2.xx

Contributors/Attributions

Thanks to Myron Davis for a lot of suggestions, enabling CIM, compatibility for Enterprise Security App
Thanks to Simon B.
Thanks to the McAfee/SkyHigh Community Forum

Copyright

This App, documentation and MWG logging ruleset are licensed under Creative Commons BY-ND 3.0

Disclaimer

Test everything before using it in production.
Everything you do with this App is your own responsibility.

Contact, Support and Feedback

E-Mail: splunkcompek.net
Splunk Answers

Additional information

Install syslog-ng on MWG/SWG

It is possible to install syslog-ng (for example from EPEL) directly on MWG, for example for testing, but removing rsyslog breaks dependencies, as mwg itself and other packages depend on rsyslog. You can follow the steps below, but be aware that this is not supported and can lead to various problems! See also https://www.syslog-ng.com/products/open-source-log-management/3rd-party-binaries.aspx and https://www.syslog-ng.com/community/b/blog/posts/installing-latest-syslog-ng-on-rhel-and-other-rpm-distributions

rpm -e --nodeps rsyslog

ps aux|grep [r]syslog

yum install https://download-ib01.fedoraproject.org/pub/epel/7/x86_64/Packages/s/syslog-ng-3.5.6-3.el7.x86_64.rpm \
            https://download-ib01.fedoraproject.org/pub/epel/7/x86_64/Packages/i/ivykis-0.36.2-2.el7.x86_64.rpm \
            https://download-ib01.fedoraproject.org/pub/epel/7/x86_64/Packages/e/eventlog-0.2.13-4.el7.x86_64.rpm

systemctl enable syslog-ng --now

systemctl status syslog-ng

mwg_xml2txt script

The MWG Splunk Logging RuleSet is quite complex. Most customers modify it to accommodate their own needs. Use this script to find all modifications when importing a new version of the RuleSet.

Usage:
Step 1: convert XML to TXT and compare them
perl mwg_xml2txt.pl old_ruleset.xml > old_ruleset.txt
perl mwg_xml2txt.pl new_ruleset.xml > new_ruleset.txt
vimdiff old_ruleset.txt new_ruleset.txt

vimdiff compares the TXT files and highlights differences in lists and rules using color output. A difference can be a simple change, like a rule being enabled or disabled, but it can also be a more complex modification; in that case use Step 2 to do a direct XML comparison.

Tip: press zR inside of vimdiff to unfold all sections.

Step 2: Identify differences and optionally extract the corresponding XML section for comparison
export a single rule from xml ruleset (replace RuleName with an actual Rule Name that you want to extract)
perl -0777 -e '$a=<>; ($rule)=$a=~m/(\QRuleName\E.*?<\/rule>)/ms; print "$rule"' ruleset_old.xml > rule_old.txt
perl -0777 -e '$a=<>; ($rule)=$a=~m/(\QRuleName\E.*?<\/rule>)/ms; print "$rule"' ruleset_new.xml > rule_new.txt
vimdiff rule_old.txt rule_new.txt

After Step 1 you'll see output similar to the example below. [true] or [false] indicates whether the rule is enabled or disabled. The 6-character string at the end of each line is the first 6 characters of the MD5 hash of the entire rule block, so even a small modification will be highlighted.

Rules.CurrentRule.Name (short if exists in Rule Map) [true] 250e76
Rules.CurrentRule.Name (if doesnt exist in Rule Map) [true] a1b8cf
------------------------------------------------------------------
Rules.CurrentRule.Name (Last Rule) [false] 5a7005
Rules.CurrentRule.Name [false] a568cc
Number of FiredRules / EvaluatedRules (based on Last Rule presence) [false] 7b4f37
Number of FiredRules / EvaluatedRules (based on loghandler position) [true] ebfaf3

Rules.CurrentRule.Name (short if exists in Rule Map) [true] 250e76
Rules.CurrentRule.Name (short if exists in Rule Map) [false] f3a8a4
Rules.CurrentRule.Name (if doesnt exist in Rule Map) [false] 181bd9
Rules.CurrentRule.Name (Last Rule) [false] 5a7005
Rules.CurrentRule.Name [false] a568cc
Number of FiredRules / EvaluatedRules (based on Last Rule presence) [false] 7b4f37
Number of FiredRules / EvaluatedRules (based on loghandler position) [true] ebfaf3
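
The idea behind the 6-character fingerprint can be illustrated with standard command-line tools. The sketch below is an approximation, not the script's exact algorithm: it applies the same kind of ID normalization with sed and hashes the result with md5sum, but it hashes whatever text is piped in, so its values will not necessarily match those printed by mwg_xml2txt.pl. The function name rule_fingerprint is hypothetical, and md5sum is assumed to be available (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical illustration of the fingerprint idea used by mwg_xml2txt.pl:
# replace numeric IDs with XXX, then keep the first 6 hex digits of the MD5.
rule_fingerprint() {
  sed -E \
    -e 's/(id=")[0-9]+"/\1XXX"/g' \
    -e 's/(propertyId=")[0-9]+"/\1XXX"/g' \
    -e 's/(id="com\.scur\.type\.[A-Za-z0-9_]+\.)[0-9]+"/\1XXX"/g' \
    -e 's/(id="com\.scur\.type\.complex\.[A-Za-z0-9_]+\.)[0-9]+"/\1XXX"/g' \
    -e 's/(com\.scur\.engine\.[A-Za-z0-9_]+\.)[0-9]+/\1XXX/g' \
    | md5sum | cut -c1-6
}

# Example usage: the first two commands print the same fingerprint because
# only the numeric id differs; the third differs because enabled changed.
# printf '<rule id="5820" enabled="true" name="Domains not to log">' | rule_fingerprint
# printf '<rule id="9999" enabled="true" name="Domains not to log">' | rule_fingerprint
# printf '<rule id="5820" enabled="false" name="Domains not to log">' | rule_fingerprint
```

Normalizing IDs first matters because exported rulesets can carry different numeric IDs for otherwise identical rules; without it, nearly every line would appear changed.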



#!/usr/bin/perl
use strict;
use warnings;
my $version = "0.3 17.Oct.2022 by PP";
use Digest::MD5 qw(md5_hex);
# <list version="1.0.3.464" mwg-version="11.2.4-42436" name="Authentication UserGroups to log" id="com.scur.type.string.483"
#    <listEntry>
#       <entry>application/vnd.ms-excel.addin.macroEnabled.12</entry>
#       <description>MS Office 2007 Excel addin (macro-enabled)</description>
#    </listEntry>
#
#<list version="1.0" mwg-version="11.1.4-40769" name="Map" id="com.scur.type.complex.maptype.321" typeId="com.scur.type.complex.maptype" classifier="Other" systemList="false" structuralList="false" defaultRights="2">
#        <description></description>
#        <content>
#          <listEntry>
#            <complexEntry defaultRights="2">
#              <configurationProperties>
#                <configurationProperty key="key" type="com.scur.type.string" encrypted="false" value="test"/>
#                <configurationProperty key="value" type="com.scur.type.string" encrypted="false" value="OK"/>
#              </configurationProperties>
#
#   <ruleGroup id="4122" defaultRights="2" name="Splunk" enabled="true" cycleRequest="true" cycleResponse="true" cycleEmbeddedObject="true" cloudSynced="false">
#      <rule id="5820" enabled="true" name="Domains not to log">
# usage: 
# Step 1: convert XML to TXT and compare them
# perl mwg_xml2txt.pl old_ruleset.xml > old_ruleset.txt
# perl mwg_xml2txt.pl new_ruleset.xml > new_ruleset.txt
# vimdiff old_ruleset.txt new_ruleset.txt
# 
# vimdiff compares the TXT files and highlights differences in lists and rules in color. A difference can be as simple as
# enabled vs disabled, or it can be a more complex modification; in that case, use Step 2 to compare the XML directly.
#
# Step 2: identify differences and optionally extract corresponding XML section for comparison
# export a single rule from each XML ruleset (replace RuleName with the actual rule name):
# perl -0777 -e '$a=<>; ($rule)=$a=~m/(\QRuleName\E.*?<\/rule>)/ms; print "$rule"' old_ruleset.xml > rule_old.txt
# perl -0777 -e '$a=<>; ($rule)=$a=~m/(\QRuleName\E.*?<\/rule>)/ms; print "$rule"' new_ruleset.xml > rule_new.txt
# vimdiff rule_old.txt rule_new.txt

my $line=1;
my $xml = undef;

open (my$fh, '<', $ARGV[0]) or die "cannot open file: $!";
{ 
  local $/=undef;
  $xml = <$fh>;
}
close $fh;

my @lists=$xml=~m/<list [^<]+ name="([^"]+)"/g;
foreach my $list_name (sort @lists){
  my $list = undef;
  # \Q...\E keeps special characters in list names (parentheses, dots, +) from being read as regex syntax
  if($xml =~/(<list [^\n]+ name="\Q$list_name\E" [^\n]+com\.scur\.type\.complex\.maptype.+?<\/list>)/ms){ # map has other structure
    $list = $1;
    #print "$list_name\n$list\n\n"; 
    my @entries = $list =~ m/ key="key"[^\n]+value="([^\n]+\n[^\n]+value="[^"]+)"/msg;
    s/([^"]+)".*\n.*"([^"]*)/$1 - $2/msg for @entries; # remove anything except key-value
    print "$list_name\n  ".(join "\n  ",sort @entries)."\n\n";
  }elsif($xml =~/(<list [^\n]+ name="\Q$list_name\E".+?<\/list>)/ms){ 
    $list = $1;
    #print "$list_name\n$list\n\n"; 
    my @entries = $list =~ m/<entry>([^<]+)<\/entry>/msg;
    print "$list_name\n  ".(join "\n  ",sort @entries)."\n\n";
  }else{ 
    die "cannot find list" 
  };
}

while(<>){
  #print "$line: $_";
  $line++;
  next if /<ruleGroups\/?>/;
  my($ruleid,$string,$offset,$name,$enabled,$rule_block)=(undef,undef,undef,undef,undef,undef);
  if(/^(\s*)<ruleGroup/){
    $offset=$1;
    if(/ name="([^"]+)"/){$name=$1};
    if(/ rule="([^"]+)"/){$ruleid=$1};
    if(/ enabled="([^"]+)"/){$enabled=$1};
    if(/^(.*)$/){$string=$1};
    #print "$offset $name [$enabled]\n"
    ($rule_block) = $xml =~ /(\Q$string\E.*?<rule )/ms;
    # check before normalizing, otherwise the substitutions run on an undefined value
    if(not defined $rule_block){die "Rule block not defined for $string"};
    # replace numeric IDs with XXX so that renumbering alone does not change the hash
    $rule_block =~ s/(id=")\d+"/$1XXX"/msg;
    $rule_block =~ s/(propertyId=")\d+"/$1XXX"/msg;
    $rule_block =~ s/(id="com\.scur\.type\.\w+\.)\d+"/$1XXX"/msg;
    $rule_block =~ s/(id="com\.scur\.type\.complex\.\w+\.)\d+"/$1XXX"/msg;
    $rule_block =~ s/(com\.scur\.engine\.\w+\.)\d+/$1XXX/msg;
    print "$offset $name [$enabled] ".substr((md5_hex($rule_block)),0,6)."\n"

  }elsif(/^(\s*)<rule id=/){
    $offset=$1;
    if(/ name="([^"]+)"/){$name=$1};
    if(/ rule="([^"]+)"/){$ruleid=$1};
    if(/ enabled="([^"]+)"/){$enabled=$1};
    if(/^(.*)$/){$string=$1};
    ($rule_block) = $xml =~ /(\Q$string\E.*?<\/rule>)/ms;
    # check before normalizing, otherwise the substitutions run on an undefined value
    if(not defined $rule_block){die "Rule block not defined for $string"};
    # replace numeric IDs with XXX so that renumbering alone does not change the hash
    $rule_block =~ s/(id=")\d+"/$1XXX"/msg;
    $rule_block =~ s/(propertyId=")\d+"/$1XXX"/msg;
    $rule_block =~ s/(id="com\.scur\.type\.\w+\.)\d+"/$1XXX"/msg;
    $rule_block =~ s/(id="com\.scur\.type\.complex\.\w+\.)\d+"/$1XXX"/msg;
    $rule_block =~ s/(com\.scur\.engine\.\w+\.)\d+/$1XXX/msg;
    print "$offset $name [$enabled] ".substr((md5_hex($rule_block)),0,6)."\n"
  }    
}

dump_logging_fields script

Use the following script to print the fields configured in the Splunk log handler. This is useful when you migrate to a new Logging RuleSet and want to compare which fields are enabled in the old and the new ruleset.

Example of usage:

./dump_logging.pl 2023-10-26_13-10_Splunk.xml

Output:

com.scur.engine.datetimefilter.datetime.toisostring s= ac= src= p= m= d= dp= bi= bo= dur= rt= up="" ua="" a="" c= ct="" u= ud= exe= macro= mbmismatch ctmt0 ctemt mte= mtne="" cl= contentdisp='' ckex= skex= ccert cntx scc= ssc= sslcp= sslsp= tcn sslsm= DoH="" emb= embl= rqcompst rscompst encrypt= file_watchlist="" mlwrp= malware="" malware_file_name="" malware_file_hash="" stream rul="" rul="" rnf= rne= crt= bfc= btc= tunnel rqv= rsv= uploads="" pfail="" tmplt="" errid= bid= r= geo= tdns= tcon= tre= text= t=.... LOGERR1 LOGERR2 LOGERR3com.scur.engine.stringfilter.string.replaceallmatches tl=
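
To compare the fields of two rulesets, the output of the script can be saved once per ruleset and the two results compared token by token. The following is a rough sketch, not part of the app: the function names fields_to_list and compare_fields are hypothetical, and it assumes that fields are separated by spaces, so a field whose text itself contains a space would be split into several tokens. Duplicate tokens are reported only once.

```shell
#!/bin/sh
# Hypothetical helpers for comparing two saved outputs of the dump script.

# Split a saved output line into one token per line, sorted and de-duplicated.
fields_to_list() {
  tr -s ' ' '\n' < "$1" | grep -v '^$' | LC_ALL=C sort -u
}

# Print the tokens that appear in only one of the two files.
compare_fields() {
  old_list=$(mktemp)
  new_list=$(mktemp)
  fields_to_list "$1" > "$old_list"
  fields_to_list "$2" > "$new_list"
  echo "Only in old:"
  LC_ALL=C comm -23 "$old_list" "$new_list"
  echo "Only in new:"
  LC_ALL=C comm -13 "$old_list" "$new_list"
  rm -f "$old_list" "$new_list"
}

# Example usage (file names are placeholders):
# ./dump_logging.pl old_Splunk.xml > old_fields.txt
# ./dump_logging.pl new_Splunk.xml > new_fields.txt
# compare_fields old_fields.txt new_fields.txt
```

Sorting the tokens discards the order of fields in the log line; if the order matters, compare the two saved lines directly instead.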

#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
my $version = "0.3 26.Oct.2023 by PP";
# This script reads a single Logging RuleSet (exported from the SWG UI) and outputs a line with the logging fields
# USAGE: ./dump_logging_ruleset.pl LoggingRuleset.xml
use XML::LibXML;
use JSON;
use Data::Dumper;
use utf8;
#use open ":std", ":encoding(UTF-8)";
binmode(STDIN, ":utf8");
binmode(STDOUT, ":utf8");

my $array_counter = -1; # -1 because it is incremented at the start of the loop
my $array_counter_list = -1;

my $filename = $ARGV[0];
my %hash = ();
my $result = undef;
my $logline_id = undef;

my $dom = XML::LibXML->load_xml(location => $filename);

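# get the ID of User-Defined.logLine
foreach my $property( $dom->findnodes('/libraryContent/userDefinedPropertys/userDefinedProperty')){
 if($property->findvalue('@name') eq "User-Defined.logLine"){
   $logline_id = $property->findvalue('@id');
 }
}
# stop early with a clear message instead of building XPath expressions from an undefined ID
die "User-Defined.logLine property not found in $filename\n" unless defined $logline_id;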

$result = walk_rules("/libraryContent/ruleGroup/rules/rule", 0, "top_rules");
$result = walk_rulesets("/libraryContent/ruleGroup/ruleGroups/ruleGroup");

sub walk_rules{
  my $path = shift;
  my $ruleset_counter = shift;
  my $level = shift;
  $array_counter_list = -1;
  foreach my $rule ($dom->findnodes($path)) {
    if( $rule->findvalue('@enabled') eq "true" ){
      $array_counter_list++;
      $hash{$level}[$ruleset_counter]{rules}[$array_counter_list]{"name"}= $rule->findvalue('@name');

      # print all direct assignments of logline=xxx
      foreach my $p5 ($rule->findnodes('immediateActionContainers/setActionContainer[@propertyId='.$logline_id."]")){
        my @nodes = $p5->findnodes('expressions/setExpression');
        if( scalar @nodes == 1 ){
          print $p5->findvalue('expressions/setExpression/parameter/value/propertyInstance/@propertyId');
        }
      }

      # print all fields (logline=logline+field=...)
      foreach my $p ($rule->findnodes('immediateActionContainers/setActionContainer[@propertyId='.$logline_id."]")){
        foreach my $p2 ($p->findnodes('expressions/setExpression/parameter/value/propertyInstance[@propertyId='.$logline_id."]")){
          foreach my $p3 ( $p2->findnodes('../../../../setExpression/parameter/value/stringValue')){
            print $p3->findvalue('@value');
          }
        }
      }
    }
  }
}

sub walk_rulesets{
  my $path = shift;
  $array_counter = -1;
  foreach my $ruleGroup ($dom->findnodes($path)) {
    if( $ruleGroup->findvalue('@enabled') eq "true" ){
      $array_counter++;
      my $ruleGroup_name = $ruleGroup->findvalue('@name');
      my $ruleGroup_id    = $ruleGroup->findvalue('@id');
      $hash{rulesets}[$array_counter]{name}= $ruleGroup->findvalue('@name');
      my $nested_path = $path.'[@id="'.$ruleGroup_id.'"]/rules/rule';
      walk_rules($nested_path, $array_counter, "rulesets");
    }
  }
}
#print Dumper %hash;
#use utf8;
#my $json = encode_json(\%hash);
#my $json = JSON->new->utf8->pretty->encode(\%hash);

List of SWG Counters

AMJobQueueLength
AMLoad
AMPrivateMemory
AMUsed
AMUsedPhys
ApplHighRisk
ApplicationMemoryUsage
ApplMediumRisk
ApplMinimalRisk
ApplUnverified
AuthNTLMCacheRequests
AuthUserCacheRequests
BlockedByAntiMalware
BlockedByApplControl
BlockedByDCC
BlockedByDLPMatch
BlockedByMATD
BlockedByMediaFilter
BlockedByURLFilter
Categories
CertExpired
CertNameMismatch
CertRevoked
CertSelfSigned
CertUnresolvable
CertWildCardMatch
ClientCount
CloseWaits
CloudEnc.DecryptionBytesAll
CloudEnc.DecryptionErrorsAll
CloudEnc.DecryptionHitsAll
CloudEnc.EncryptionBytesAll
CloudEnc.EncryptionErrorsAll
CloudEnc.EncryptionHitsAll
ConnectedSockets
ConnectionsBlocked
ConnectionsLegitimate
CoordLoad
CoordPrivateMemory
CoordUsed
CoordUsedPhys
CoreLoad
CorePrivateMemory
CoreThreads
CoreUsed
CoreUsedPhys
CPUIdle
CPUIOWait
CPULoad
CPULoadRaw
CPUSystem
CPUUser
DCCCalled
DCCUncategorized
DXLEventsReceived
DXLEventsSent
DXLRequestErrors
DXLRequestsSent
DXLServiceCalls
DXLTraffic
eDirectoryRequestProcTime
eDirectoryRequests
FilesystemUsage
FirstSentFirstReceivedClient
FirstSentFirstReceivedServer
FtpBytesFromServer
FtpBytesToServer
FtpRequests
FtpTraffic
GTIFileRepCloudLookupDone
GTIRequestSentToCloud
HandleConnectToServer
HarddiskUsage
Http2BytesFromClient
Http2BytesFromServer
Http2BytesToClient
Http2BytesToServer
Http2Requests
Http2Traffic
HttpBytesFromClient
HttpBytesFromServer
HttpBytesToClient
HttpBytesToServer
HttpConnectionsFromClientPerCustomer
HttpRequests
HttpsBytesFromClient
HttpsBytesFromServer
HttpsBytesToClient
HttpsBytesToServer
HttpsRequests
HttpsTraffic
HttpTraffic
ICAPClientActiveConnections
ICAPReqmodRequests
ICAPReqmodTraffic
ICAPRespmodRequests
ICAPRespmodTraffic
IfpRequests
KerberosRequests
LastSentFirstReceivedServer
LastSentLastReceivedClient
LastSentLastReceivedServer
LDAPRequestProcTime
LDAPRequests
LoadPerCPU
MalwareDetected
MATDInfected
MATDRequests
MATDScanTime
MemConsumed
MemFree
MemMallocChunks
MemMallocKBytesUsed
MemMMBlocks
MemMMBytesUsed
MemoryUsage
MemUsed
MT.Archive
MT.Audio
MT.Database
MT.Document
MT.Executable
MT.Image
MT.Text
MT.Video
NetworkBytesReceived
NetworkBytesSent
NTLMAgentRequestProcTime
NTLMAgentRequests
NTLMRequestProcTime
NTLMRequests
OTPSendProcTime
OTPSendRequests
OTPVerifyProcTime
OTPVerifyRequests
PrivDecryptOK
PrivEncryptOK
PrivKeyOpDuration
RADIUSRequestProcTime
RADIUSRequests
RawTCPTraffic
RepHighRisk
RepMediumRisk
RepMinimalRisk
RepUnverified
ReputationNeutral
ReputationSuspicious
ReputationTrusted
ReputationUnverified
ResolveHostViaDNS
SMCached
SOCKSHTTPRequests
SOCKSHTTPSRequests
SOCKSHTTPSTraffic
SOCKSHTTPTraffic
SOCKSUDPConnections
SOCKSUDPTraffic
SOCKSUnFilteredRequests
SOCKSUnFilteredTraffic
SOCKSv4Requests
SOCKSv4Traffic
SOCKSv5Requests
SOCKSv5Traffic
SSLIssuedCertificate
SSLSessionClientHit
SSLSessionClientMiss
SSLSessionServerHit
SSLSessionServerMiss
SSO.AllLogins
SSO.IncorrectTokens
StatDBSize
SwapFree
SwapUsed
TCPProxyConnections
TimeConsumedByGTIFileRepCloudLookup
TimeConsumedByGTIFileRepCloudLookup_0000_25
TimeConsumedByGTIURLCloudLookup
TimeConsumedByGTIURLCloudLookup_0000_25
TimeConsumedByGTIURLCloudLookup_0026_50
TimeConsumedByGTIURLCloudLookup_0051_75
TimeConsumedByGTIURLCloudLookup_0076_100
TimeConsumedByGTIURLCloudLookup_0101_150
TimeConsumedByGTIURLCloudLookup_0151_200
TimeConsumedByGTIURLCloudLookup_0201_250
TimeConsumedByGTIURLCloudLookup_2001_2500
TimeConsumedByGTIURLRating
TimeConsumedByGTIURLRating_0000_25
TimeConsumedByGTIURLRating_0026_50
TimeConsumedByGTIURLRating_0051_75
TimeConsumedByGTIURLRating_0076_100
TimeConsumedByGTIURLRating_0101_150
TimeConsumedByGTIURLRating_0151_200
TimeConsumedByGTIURLRating_0201_250
TimeConsumedByGTIURLRating_2001_2500
TimeConsumedByGTIURLRatingSync
TimeConsumedByGTIURLRatingSync_0000_25
TimeConsumedByRuleEngine
TimeForRegex
TimeForTransaction
UserDBRequests
WebCacheDiskUsage
WebCacheHits
WebCacheMisses
WebCacheObjectsCount
WebCacheReadNotCacheable
WorkingQueueLength
XmppClients
XmppRequests
XmppTraffic