Version: 4.0.11
Date: 20 October 2022
This Splunk App for McAfee Web Gateway provides rapid insights and operational visibility into McAfee Web Gateway (MWG) and McAfee Web Gateway Cloud Service (WGCS) deployments. It provides field extraction and CIM field mapping for all available types of access logs (default and custom McAfee Web Gateway logs, McAfee Web Gateway Cloud Service) and facilitates fast incident response and troubleshooting.
In 2022, McAfee Web Gateway (MWG) was renamed to Skyhigh Secure Web Gateway (SWG).
List of abbreviations used in this document:
Abbreviation | Meaning |
---|---|
MWG | McAfee Web Gateway |
WGCS | McAfee Web Gateway Cloud Service |
UF | Splunk Universal Forwarder |
Product Compatibility:
Product | Version(s) |
---|---|
Splunk | 6.6+, 7.x, 8.x, 9.x |
MWG | 7.6+, 8.x, 9.x, 10.x, 11.x |
WGCS | API v5 |
Currently there are 85 different charts and tables grouped in 22 views:
Applications Applications by Hits Applications by Volume Top Blocked Applications by Hits Top Applications by Volume Top Applications by Hits Top Application Statistics Audit Failed Logins Activity by Action Activity by Source_Type Activity by User User Activity by Appliance Authentication Top IP by Failed Auth Top User-Agents by Failed Auth Top Destination Hosts by Failed Auth Top User-Agents + IPs by Failed Auth Top User-Agents + DestHost by Failed Auth Top IPs + DestHost by Failed Auth Top IPs + User-Agent + DestHost by Failed Auth Multiple Logins from diff IPs Multiple Usernames coming from a single IP Authentication Method Statistics Connections Long running transactions DNS Timechart DNS resolution time Timechart DNS resolution time distribution (including Cached) Timechart DNS resolution time distribution (excluding Cached) DNS distribution (1ms - 200ms) DNS distribution (all) Errors Error Analysis HTTP Timechart HTTP Method HTTP Method Statistics HTTP Request Headers Statistics HTTP Response Headers Statistics Easy Search Status Code Overview Web Usage by URL Category Web Usage by URL Category Area Graph Top User-Agents Users + IPs IP Addresses by Hits Graph Top Hosts by Hits Top Blocked Domains by Hits Top Rules by Hits Events Malware Malware Top Users by blocked Malware Media Types Media Types Top Media Types by Volume Top Media Types by Hits EXE Uploads/Downloads Macro Uploads/Downloads EXE and Macro Uploads/Downloads with Magic Bytes Mismatch Encrypted Files Network Top unreachable Servers Performance Connect to Server Latency Total Transaction Duration distribution Client-Side Latency DNS resolution Latency distribution Time in Externals Distribution Protocols Protocols by Hits Protocols by Hits (Percent) Protocols by Volume Protocols by Volume (Percent) Potential Risks Top SRC with high Ratio of High Risk Requests Unusual Ports Requests to IP Addresses CONNECT Requests to IP Addresses Very long URLs Very large request and response Headers 
Non-resolvable Domains, potential DGA (Domain Generation Algorithm) Rules Top Rules Block Rules Overview Top Block Rules Rule Complexity/Performance Slowest Rule Execution Time in Rule Engine Distribution Time in Rule Engine over Time Security Posture Content Scan is possible Ratio SSL SSL Versions by Hits (Server) SSL Versions by Hits (Client) SSL Ciphers by Hits (Server) SSL Ciphers by Hits (Client) SSL KeyExchangeBits by Hits (Server) SSL KeyExchangeBits by Hits (Client) SSL Ciphers (Server) SSL Versions (Server) Client Certificate Requested SSL-related blocks Expired Certificate Certificate Issuers Summary Requests / Block Ratio Traffic Overview Traffic Top Inbound Traffic by Source Top Inbound Traffic by Destination Top Outbound Traffic by Source Top Outbound Traffic by Destination Uploads Uploads URL Filter URL Categories Blocked by URL Filter or by Web Reputation Top URL Categories by Volume Top URL Categories by Hits Geolocation Stats High Risk Destinations Not categorized Domains - Chart Top not categorized Domains - Table User-Agents User-Agent Statistics
Instance | App for McAfee Web Gateway | Add-on for McAfee Web Gateway |
---|---|---|
Standalone (all-in-one) Splunk | + | - |
Search Head | + | - |
Indexer | - | + |
Syslog/Log Server with Universal Forwarder | - | + |
MWG can write logs to disk and/or send them via Syslog. Splunk can read log files locally, receive them via a network input (Syslog or a raw UDP/TCP stream), or get them from a UF installed on a log server or on MWG itself. Combined, these options yield many possible ways to get MWG logs into Splunk:
Method / Link to configuration example | Description | Real time |
---|---|---|
Local file monitor | Splunk is installed directly on MWG and monitors the log file folder | Yes, up to 30 sec delay |
Local UDP/TCP input | Splunk is installed directly on MWG and receives logs sent via Syslog | Yes |
Syslog UDP/TCP | MWG sends logs via UDP/TCP to a syslog collector or directly to Splunk | Yes |
Syslog TCP+TLS | MWG sends logs via TCP, encrypted with TLS, to a syslog collector or directly to Splunk | Yes |
UF | Install a UF on MWG to monitor the log file folder | Yes, up to 30 sec delay |
Log pushing from MWG to a log server | Use pushing (FTP/FTPS/SCP/SFTP/HTTP/HTTPS) from MWG to a log server | No |
Log pulling from MWG | Pulling logs from MWG via API, scp or rsync | No |
Log pulling from WGCS | Pulling logs via the WGCS API | No |
Installing a UF directly on MWG and configuring it to forward events to the Splunk indexer is the recommended and most reliable method.
Further consideration:
Log Format | Sourcetype | # of MWG fields | # of CIM fields | Average log line length (HTTPS Scanner enabled) | Comment/Example |
---|---|---|---|---|---|
Default Access Log | mcafee:webgateway:default | 14 | 17 | ~700 Bytes | Default log format with a fixed structure; provides only a minimal subset of fields. Use it only if no MWG modification is possible. [26/Feb/2021:14:40:23 +0100] "" 192.168.2.n 200 "GET https://example.com/test&adk=1473563476 HTTP/2.0" "Web Ads" "Minimal Risk" "image/gif" 286 538 "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0" "" "0" "Google" |
Legacy Log for the Splunk App v.3.0.7 | MWGaccess3 | 26 | 27 | ~650 Bytes | Customized log format with a fixed structure; provides more fields than the default log, including some timings and transferred bytes. Verbose information such as the User-Agent string is shortened. Consider it obsolete. [26/Feb/2021:14:40:23 +0100]status="200/0" srcip="192.168.2.n" user="" profile="-" dstip="-" dhost="example.com" urlp="443" proto="HTTPS/https" mtd="GET" urlc="Web Ads" rep="0" mt="image/gif" mlwr="-" app="Google" bytes="538/539/289/286" ua="FF86.0-10.0" lat="0/0/59/434" rule="Last Rule" url="https://example.com/test&adk=1473563476" |
Custom Log (recommended) | mcafee:webgateway:custom | 50-100 | 50-100 | ~600-1800 Bytes | New custom modular log format (described in detail below); log fields can be added or removed as needed. It provides full CIM coverage and deep insights for analytics and rapid troubleshooting. Despite carrying significantly more information, the log line is not much larger: the new format provides up to 3x higher information density than the default log format. 2021-02-26 14:40:23 +0100 204 allowed 192.168.2.n https GET example.com 443 775/58 88/1 up="/test" ua="FF86-10.0" a="Google" c="wa" dip=142.250.185.nn kex=112/112 cntx sccc=1302/1302 sslp=1.3/1.3 sslicn="GTS CA 1O1,GlobalSign" sslcn="example.com" crtdays=-66 ctmt0 rul="L" rn=13/44 srcp=63298 conrt=0 b=744/239 psrcip=192.168.2.n psrcp=20010 piv=2.0/2.0 r=0 t=0/0/86/87/56/56/3/4/28 |
Log Format | Sourcetype | # of MWG fields | # of CIM fields | Average log line length (HTTPS Scanner enabled) | Comment/Example |
---|---|---|---|---|---|
WGCS API version 5 | mcafee:webgateway:wgcs_v5 | 28 | 28 | ~300-400 Bytes | "user_id","username","source_ip", |
WGCS API version 6 | mcafee:webgateway:wgcs_v6 (not supported yet) | 28 | 28 | ~300-400 Bytes | No new fields are introduced. All fields from versions 1 – 5 are downloaded. Starting with API version 6, an error message is sent with the response to a download request that has timed out. |
WGCS API version 7 | mcafee:webgateway:wgcs_v7 (not supported yet) | 28 | 28 | ~300-450 Bytes | All fields from versions 1 – 6 are downloaded, plus these fields: pop_country_code referer ssl_scanned av_scanned_up av_scanned_down rbi |
WGCS API version 8 | mcafee:webgateway:wgcs_v8 (not supported yet) | 28 | 28 | ~300-500 Bytes | All fields from versions 1 – 7 are downloaded, plus these fields: dlp client_system_name filename pop_egress_ip pop_ingress_ip proxy_port |
Extract the file Splunk_Log_XXXXXX.xml (where XXXXXX is the version) from the MWG folder of the application package.
Import the Splunk_Log_XXXXXX.xml file in MWG into the Default Log Handler: go to Policies > Rule Sets > Log Handler, right-click "Default" and select Add > Rule Set from Library.
In the window that appears, click the "Import from file" button, choose the XML file and click OK.
Click "Auto-Solve Conflicts..." > select "Solve by referring to existing objects" and click OK to import the RuleSet.
The log configuration has a modular structure: you can send just a preconfigured minimal set of fields or select any subset of the available fields. The log ruleset contains several parts (see numbering on the next screenshot):
Here are the most important modifications that you can make in the additional Rulesets (block of RuleSets #3 on the previous screenshot).
Ruleset | Possible modifications |
---|---|
Splunk | Domains not to log - some domains can be excluded from logging completely. |
Set Timestamp | Choose the right timestamp. The ISO format with a time zone is selected by default. Other options are the ToGMT, ISO8601, Unix epoch and ToWebReporter formats. If you change the timestamp format on MWG, you must adjust the TIME_FORMAT setting in local/props.conf on the Splunk indexer. |
Client IP | The Connection.IP property is used by default. Deselect it and select Client.IP if you have downstream proxies or a load balancer between the client and MWG. |
URL Categories | Add internal domains to the "internal Domains" list so that they are not shown as "uncategorized". |
Headers | On MWG older than version 10.x, some rules will be marked in red if they are not compatible. Delete them or upgrade MWG to the newest 10.x version or later. |
TLS | Disable this ruleset if HTTPS Scanner is not enabled. |
- | To get the correct Rule statistics you must create one last ruleset with a rule named "Last Rule" which is applied to all cycles (Request, Response, Embedded). |
RuleSet Library | Opener, Hashes/Body, Malware, Media Type, Uploads - to get some of the required information, additional rules need to be placed in the corresponding Policy Rule Sets. If you skip this step, some tables and graphs will be empty. Watch a YouTube video on the Splunkbase for step by step instructions. |
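The Set Timestamp note above means the timestamp format produced by MWG and the TIME_FORMAT expected by Splunk must agree. Before editing props.conf, a candidate strptime-style format can be checked against a sample log line. The Python sketch below is only an illustration: the helper name `parse_timestamp` and both format strings are assumptions, not values shipped with the app, and Python's directives differ from Splunk's in places (for example, Python uses `%f` for fractional seconds).

```python
from datetime import datetime


def parse_timestamp(line: str, fmt: str, token_count: int) -> datetime:
    """Parse the first `token_count` space-separated tokens of `line` using `fmt`."""
    prefix = " ".join(line.split(" ")[:token_count])
    return datetime.strptime(prefix, fmt)  # raises ValueError if the format does not fit


# Sample prefixes taken from the log examples in this document.
ts = parse_timestamp("2021-02-26 14:40:23 +0100 204 allowed",
                     "%Y-%m-%d %H:%M:%S %z", 3)
ts_ms = parse_timestamp("2021-02-26 14:36:46.449 -0600 200 allowed",
                        "%Y-%m-%d %H:%M:%S.%f %z", 3)

print(ts.isoformat())     # 2021-02-26T14:40:23+01:00
print(ts_ms.isoformat())  # 2021-02-26T14:36:46.449000-06:00
```

A format that does not match the sample raises ValueError, which is a cheap signal that the corresponding TIME_FORMAT would probably not match either.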
Instead of letting Splunk read the local splunk.log, events can be sent to a local Splunk instance via a local network interface or even the loopback interface, without writing events to disk (i.e. the "Write Splunk Log" Rule Set can be disabled).
MWG UI:

    $ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
    $ModLoad imklog # reads kernel messages (the same are read from journald)
    $WorkDirectory /var/lib/rsyslog
    $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
    $IncludeConfig /etc/rsyslog.d/*.conf
    $ActionName messages
    *.info;daemon.!=info;mail.none;authpriv.none;cron.none -/var/log/messages
    authpriv.* /var/log/secure
    mail.* -/var/log/maillog
    cron.* /var/log/cron
    *.emerg :omusrmsg:*
    uucp,news.crit /var/log/spooler
    local7.* /var/log/boot.log
    $template msg_only,"%msg%\n"
    if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then @127.0.0.1:6514;msg_only

Splunk UI:
MWG UI:

    $ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
    $ModLoad imklog # reads kernel messages (the same are read from journald)
    $WorkDirectory /var/lib/rsyslog
    $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
    $IncludeConfig /etc/rsyslog.d/*.conf
    $ActionName messages
    *.info;daemon.!=info;mail.none;authpriv.none;cron.none -/var/log/messages
    authpriv.* /var/log/secure
    mail.* -/var/log/maillog
    cron.* /var/log/cron
    *.emerg :omusrmsg:*
    uucp,news.crit /var/log/spooler
    local7.* /var/log/boot.log
    $template msg_only,"%msg%\n"
    if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then @@server:6514;msg_only

Splunk UI:

    $DefaultNetstreamDriver gtls
    $DefaultNetstreamDriverCAFile /etc/rsyslog.d/certs/example.com.ca.pem
    $DefaultNetstreamDriverCertFile /etc/rsyslog.d/certs/mwg.example.com.pem
    $DefaultNetstreamDriverKeyFile /etc/rsyslog.d/certs/mwg.example.com.key
    #$ActionSendStreamDriverAuthMode x509/name
    $ActionSendStreamDriverAuthMode anon
    #$ActionSendStreamDriverPermittedPeer splunk.example.com
    $ActionSendStreamDriverMode 1

For Splunk configuration and more details, watch "Configure a McAfee Web Gateway (MWG) syslog to send TLS-secured data to Splunk".
rsyslog.conf:

    # exclude both daemon.notice and daemon.info:
    *.info;mail.none;daemon.!=info;daemon.!=notice;authpriv.none;cron.none -/var/log/messages

    $ActionQueueFileName fwdRule1
    $ActionQueueMaxDiskSpace 1g
    $ActionQueueSaveOnShutdown on
    $ActionQueueType LinkedList
    $ActionResumeRetryCount -1
    # use the new expression format instead of "traditional" severity and facility based selectors,
    # because a selector like daemon.info matches all messages of the specified priority and HIGHER,
    # which can lead to duplicated events
    if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'info' then @@syslog1

    $ActionQueueFileName fwdRule2
    $ActionQueueMaxDiskSpace 1g
    $ActionQueueSaveOnShutdown on
    $ActionQueueType LinkedList
    $ActionResumeRetryCount -1
    if $programname == 'mwg' and $syslogfacility-text == 'daemon' and $syslogseverity-text == 'notice' then @@syslog2

inputs.conf:

    [monitor:///opt/mwg/log/user-defined-logs/splunk.log/splunk.log]
    sourcetype = mcafee:webgateway:custom
    # index = proxy

McAfee Web Gateway Cloud Service (WGCS) provides logs with a reduced set of fields, so only a subset of the views will work properly.
There are several ways to pull WGCS logs:
When reading WGCS logs, use [monitor:// and not [batch://, because batch seems to delete logs too early. Use a separate Scheduled Task (schtasks /create /tn "Delete old WGCS" /tr "delete_old_logs.bat" /sc HOURLY ) to delete old logs, for example ForFiles /p D:\WGCS_Logs /d -1 /c "cmd /c del /q @file"
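Where a scripted cleanup is preferred over ForFiles, the same idea (remove files that have not been modified for about a day) can be sketched in Python. This is an illustrative alternative, not part of the app: the function name `delete_old_logs`, the one-day default and the demo file names are assumptions, and the demo runs in a temporary directory rather than in the real log folder.

```python
import os
import tempfile
import time
from pathlib import Path


def delete_old_logs(folder, max_age_days=1.0):
    """Delete regular files in `folder` last modified more than `max_age_days` ago.

    Returns the sorted names of the deleted files.
    """
    cutoff = time.time() - max_age_days * 86400
    deleted = []
    for path in Path(folder).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)


# Demo in a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    old_file = Path(tmp) / "old.csv"
    new_file = Path(tmp) / "new.csv"
    old_file.touch()
    new_file.touch()
    two_days_ago = time.time() - 2 * 86400
    os.utime(old_file, (two_days_ago, two_days_ago))  # backdate the modification time
    deleted = delete_old_logs(tmp)
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(deleted)    # ['old.csv']
print(remaining)  # ['new.csv']
```

In a real setup the function would be pointed at the monitored folder (D:\WGCS_Logs in the example above) and run from a scheduled task, with the age threshold chosen so that Splunk has finished reading a file before it is removed.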
An example of inputs.conf:

    [monitor://D:\WGCS_Logs]
    sourcetype = mcafee:webgateway:wgcs_v5
    index = proxy
    crcSalt = <SOURCE>

McAfee Web Gateway is based on Red Hat/CentOS 7 and inherits some settings that rate-limit syslog. Read https://www.ibm.com/support/pages/how-disable-rsyslog-rate-limiting and https://access.redhat.com/solutions/1417483 to modify or disable rate limiting in /etc/rsyslog.conf (using the MWG UI) and /etc/systemd/journald.conf.
rsyslog.conf (after the "$ModLoad imjournal" line):

    $SystemLogRateLimitInterval 0
    $SystemLogRateLimitBurst 0
    $imjournalRatelimitInterval 0
    $imjournalRatelimitBurst 0

journald.conf:

    RateLimitInterval=0
    RateLimitBurst=0

Instead of disabling rate limiting completely, it is better to set it to values appropriate for your setup.
Use the following configuration for syslog-ng (on the receiving side):
network flags(no-parse)
Correct extraction of the host field is very important. Unfortunately, the default methods of host extraction have some downsides:
To summarize: it is better to set the host value explicitly than to rely on heuristics that can lead to several host values for the same machine. With a UF, set the host in inputs.conf. With Syslog, either use host_segment/host_regex on the syslog receiver or send the MWG host name directly within the event and extract it during ingestion. The second method allows you to disable the syslog header directly on MWG by defining the "msg_only" rsyslog template as described in the Syslog UDP/TCP section:
props.conf:

    [mcafee:webgateway:custom]
    TRANSFORMS-extract_host_from_event = extract_host_from_event

transforms.conf:

    [extract_host_from_event]
    REGEX = \shost=(\S+)
    FORMAT = host::$1
    DEST_KEY = MetaData:Host

The host field must be placed before long fields such as the URL path or URL query, because they can push the host field beyond the first 4096 characters of the event. This limit is set by the LOOKAHEAD property in transforms.conf, which specifies how far into the event Splunk looks when extracting index-time fields.
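The effect of this limit can be illustrated outside Splunk. The Python sketch below is only an approximation of the behaviour described here: it applies the same regular expression as the transforms.conf stanza to the first 4096 characters of an event. The function name `extract_host` and both sample events are made up for illustration, and Splunk's actual index-time processing is more involved.

```python
import re
from typing import Optional

# Same pattern as REGEX in the transforms.conf stanza; 4096 is the
# lookahead limit mentioned in the text.
HOST_RE = re.compile(r"\shost=(\S+)")
LOOKAHEAD = 4096


def extract_host(event: str, lookahead: int = LOOKAHEAD) -> Optional[str]:
    """Search only the first `lookahead` characters of `event` for host=<value>."""
    match = HOST_RE.search(event[:lookahead])
    return match.group(1) if match else None


# Hypothetical events: host placed before vs. after a very long URL path.
early = '2021-02-26 14:40:23 +0100 200 allowed host=mwg1 up="/' + "a" * 5000 + '"'
late = '2021-02-26 14:40:23 +0100 200 allowed up="/' + "a" * 5000 + '" host=mwg1'

print(extract_host(early))  # mwg1
print(extract_host(late))   # None: host lies beyond the lookahead window
```

The second event shows why field order matters: the host value is present in the event but is never seen by the extraction.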
Disabling the syslog header has several benefits:
Check | Expected Result | Conditions/Causes | Comment |
---|---|---|---|
Timestamp and Timezone | Timestamp and timezone are correct, there are no "future" events | | \| eval diff=_indextime - _time |
Index | Index is correct | Use a separate index for proxy events | |
Sourcetype | Sourcetype is correct | | |
Host extraction | Host extraction is correct | Syslog | Don't rely on rDNS: it decreases performance and can fail. Hosts server1, SERVER1, server1.example.com and 10.20.30.40 can be one host, but are different hosts from Splunk's point of view. |
Integrity | All events reach Splunk, no events are lost | Syslog, high log rate | useACK, rsyslog: disk queue |
Truncation | Long log lines aren't truncated | rsyslog: MaxMessageSize, syslog-ng: log_msg_size, syslog via UDP, Splunk: TRUNCATE | test-link |
Logging delay | Low logging delay | | \| eval diff=_indextime - _time |
Log integrity in case of network interruption | Short network interruptions shouldn't lead to loss of events | useACK, rsyslog: disk queue | |
Secure transfer | Log transferred via TLS, Certificate validation, mTLS | | |
Multiline | There are no multiline proxy events | | |
Duplicates | There are no duplicate events | | |
Parsing | All events parsed correctly, action/src/dest fields are always present | | |
Settings location | All settings are placed inside of MWG App or TA | Settings can be placed in a wrong app if GUI is used | Use btool to verify. |
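The timestamp and logging-delay checks in the table both rest on the difference between index time and event time. As a rough illustration of how that difference can be interpreted, the Python sketch below classifies a single event; the function name `classify_delay` and the 60-second threshold are assumptions chosen for the example, not values recommended by the app.

```python
def classify_delay(event_time: float, index_time: float, max_delay: float = 60.0) -> str:
    """Classify an event by diff = index_time - event_time (both in epoch seconds)."""
    diff = index_time - event_time
    if diff < 0:
        # The event claims to be newer than its indexing time:
        # usually a clock or timezone problem.
        return "future"
    if diff > max_delay:
        return "delayed"
    return "ok"


print(classify_delay(1614346823.0, 1614346825.0))  # ok
print(classify_delay(1614346823.0, 1614343223.0))  # future
print(classify_delay(1614346823.0, 1614347000.0))  # delayed
```

An index time exactly one hour before the event time, as in the second call, is the typical signature of a wrong timezone offset.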
Why a new log format? Neither the default nor the previously used MWGaccess3 log format provides enough information for a SIEM to be useful. For example, these formats provide very limited information about downloads and uploads of risky files. Many SIEM correlation rules will not work properly if a transferred file was embedded in a composite object (zip, iso, docx, etc.) or has a different or faked media-type header or extension.
The new log format enables the following use cases, among many others:
The new custom log format (mcafee:webgateway:custom) consists of several parts:
2021-02-26 14:36:46.449 -0600 200 allowed 192.168.2.n https GET safebrowsing.googleapis.com 443 563/4156 38/17 up="/v4/threatListUpdates" ua="FF86-10.0" c="it" dip=142.250.185.n kex=112/112 cntx sccc=1302/1302 sslp=1.3/1.3 sslicn="GTS CA 1O1,GlobalSign" sslcn="upload.video.google.com" crtdays=-52 mbmismatch ctmt0 rul="L" rn=41/104 srcp=62407 conrt=0 b=524/4418 tunnel psrcip=192.168.2.nn psrcp=42550 piv=2.0/2.0 r=0 t=0/0/34/34/18/18/22/11/11
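To make the structure of such a line more tangible, the following Python sketch separates the key=value pairs from the remaining tokens (the positional prefix and bare flags such as cntx or tunnel). It is an illustration only: the names `KV_RE` and `parse_custom_line` are made up, the sample is a shortened copy of the line above, and the app itself extracts fields in Splunk, not with this code.

```python
import re

# key="quoted value" or key=bare_value
KV_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_custom_line(line: str):
    """Return (fields, rest): key=value pairs as a dict, all other tokens as a list."""
    fields = {}
    for key, quoted, bare in KV_RE.findall(line):
        fields[key] = quoted if quoted else bare
    # Whatever is left after removing the pairs: positional prefix and bare flags.
    rest = KV_RE.sub(" ", line).split()
    return fields, rest


sample = ('2021-02-26 14:36:46.449 -0600 200 allowed 192.168.2.n https GET '
          'safebrowsing.googleapis.com 443 563/4156 38/17 up="/v4/threatListUpdates" '
          'ua="FF86-10.0" c="it" kex=112/112 cntx sslicn="GTS CA 1O1,GlobalSign" '
          'rn=41/104 srcp=62407 tunnel r=0')

fields, rest = parse_custom_line(sample)
print(fields["up"])      # /v4/threatListUpdates
print(fields["sslicn"])  # GTS CA 1O1,GlobalSign
print(rest[-2:])         # ['cntx', 'tunnel']
```

Quoted values may contain spaces, which is why the line cannot simply be split on whitespace.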
Instead of logging a URL as-is, MWG splits the URL into usable parts which will be put together on Splunk's end.
By default, the query string is not logged. You can enable it in the Web Data Model ruleset if needed.
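How the logged parts can be put back together is sketched below in Python. This is not the app's implementation (the app rebuilds the URL in Splunk); the function name `build_url` is hypothetical, and omitting default ports is an assumption made for readability.

```python
from typing import Optional

# Assumption: default ports are left out of the rebuilt URL.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def build_url(protocol: str, host: str, port: int, path: str = "/",
              query: Optional[str] = None) -> str:
    """Assemble a URL from separately logged components."""
    url = f"{protocol}://{host}"
    if DEFAULT_PORTS.get(protocol) != port:
        url += f":{port}"
    url += path
    if query:
        # Accept the query string with or without a leading "?".
        url += "?" + query.lstrip("?")
    return url


print(build_url("https", "example.com", 443, "/test"))
# https://example.com/test
print(build_url("http", "example.com", 8080, "/test", "adk=1473563476"))
# http://example.com:8080/test?adk=1473563476
```

Because the query string is not logged by default, most rebuilt URLs will end at the path, as in the first call.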
An excerpt of the 100 most useful fields is provided below. MWG has about 900 properties that can be used for logging.
MWG field | CIM field | Comment | |||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Timestamp | - |
| |||||||||||||||||||||||||||
Connection.IP / Client.IP | src | Client.IP takes the value of X-Forwarded-For header | |||||||||||||||||||||||||||
Authentication.UserName | user | ||||||||||||||||||||||||||||
Message.TemplateName, Block.ID, Response.StatusCode, Protocol.FailureDescription, BytesFromServer, Command.Name, Action.Names | action | The action taken by the proxy: allowed, blocked, error or auth. Various MWG properties are used to calculate correct action field. | |||||||||||||||||||||||||||
URL | url | Don't enable it, Splunk build URL based on uri components | |||||||||||||||||||||||||||
URL.Categories | category | MWG will try to categorize URL retroactively even if URL Filter was skipped in the Policy Rule Sets. Add your internal domains to "internal Domains" list to avoid them be marked as "uncategorized" | |||||||||||||||||||||||||||
Header.Response.Get(Content-Type) MediaType.FromHeader | http_content_type | The content-type of the requested HTTP resource as reported by the web server (can be wrong, faked or missing) | |||||||||||||||||||||||||||
Header.Request.Get(User-Agent) | http_user_agent | A short string (FF68-10.0 for Firefox 68 on Windows 10) | |||||||||||||||||||||||||||
LastSentLastReceivedServer | response_time | FSFRS-LSFRS+LSLRS is used to calculate response_time that includes sending time | |||||||||||||||||||||||||||
Header.Request.Exists(Referer) | http_referrer | The HTTP referrer used in the request. The W3C specification and many implementations misspell this as http_referer. Use a FIELDALIAS to handle both key names. This field is disabled by default. | |||||||||||||||||||||||||||
URL.Domain of Header.Request.Exists(Referer) | http_referer_domain | The domain name contained within the HTTP referrer used in the request. Disabled by default. | |||||||||||||||||||||||||||
Response.StatusCode | status | The HTTP response code indicating the status of the proxy request. MWG doesn't distinguish between status sent by web server and status set by proxy, so this value can be misleading. Use action field to see what the proxy action was. | |||||||||||||||||||||||||||
URL.Protocol | - | http/https/ftp etc. Used to re-build url | |||||||||||||||||||||||||||
Command.Name | http_method | GET/POST/PUT/OPTIONS etc | |||||||||||||||||||||||||||
URL.Host | dest | The host of the requested resource | |||||||||||||||||||||||||||
URL.Port | dest_port | The port of the requested resource | |||||||||||||||||||||||||||
BytesToServer | bytes_out | The number of outbound bytes transferred | |||||||||||||||||||||||||||
BytesFromServer | bytes_in | The number of inbound bytes transferred | |||||||||||||||||||||||||||
TimeInTransaction | duration | The time taken by the proxy event, in milliseconds | |||||||||||||||||||||||||||
URL.Path | uri_path | The path of the resource served by the webserver or proxy | |||||||||||||||||||||||||||
URL.ParametersString | uri_query | Not enabled by default. You can enable it for all requests or selectively | |||||||||||||||||||||||||||
Application.Name | App | The application detected or hosted by the server/site such as WordPress, Splunk, or Facebook | |||||||||||||||||||||||||||
Cache.Status eq TCP_HIT | cached | Indicates whether the event data is cached or not. Not enabled by default. | |||||||||||||||||||||||||||
Header.Get(Cookie) | cookie | The cookie file recorded in the event. Not enabled by default. | |||||||||||||||||||||||||||
URL.Destination.IP | dest_ip | It is important to record the destination IP at the moment of the request. A hostname can be resolved to several IPs (think "moving target" CDN) so a DNS resolution a second later can lead to wrong result. Be aware that MWG can be unable to do DNS resolution by itself and it can be a different IP after all if MWG is behind upstream proxies. | |||||||||||||||||||||||||||
URL.Domain | url_domain | The domain name contained within the URL of the requested HTTP resource. It is extracted from hostname based on Public Suffix List | |||||||||||||||||||||||||||
Header.Request.GetAll | - | Returns a concatenated string of all the original request headers (separated by \r\n) as received from client. | |||||||||||||||||||||||||||
Header.Response.GetAll | - | Returns a concatenated string of all the original response headers (separated by \r\n) as received from server. | |||||||||||||||||||||||||||
Header.Request.Get(Via) | - | Via header in request | |||||||||||||||||||||||||||
Header.Response.Get(Via) | - | Via header in response | |||||||||||||||||||||||||||
Header.Response.Get(Location) | - | Location header in response | |||||||||||||||||||||||||||
Client.KeyExchangeBits | - | Normalized strength (symmetric) of the weakest link during the key exchange. Helps to detect outdated client software | |||||||||||||||||||||||||||
Server.KeyExchangeBits | - | Normalized strength (symmetric) of the weakest link during the key exchange. Helps to detect outdated servers which required special handling | |||||||||||||||||||||||||||
Server.Handshake.CertificateIsRequested | - | True, if the web server requests a client certificate (during the initial SSL handshake) [*] | |||||||||||||||||||||||||||
ClientContext.IsApplied | - | A clue if HTTPS Scanner is enabled for this request | |||||||||||||||||||||||||||
Server.Cipher | - | Description of cipher/algorithms between proxy and server (e.g. ECDHE-RSA-AES256-GCM-SHA384) | |||||||||||||||||||||||||||
Client.Cipher | - | Description of cipher/algorithms between client and proxy (e.g. ECDHE-RSA-AES256-GCM-SHA384) | |||||||||||||||||||||||||||
SSL.Server.Protocol | - | SSL/TLS protocol used between proxy and server (e.g. TLSv1.2 TLSv1.1 TLSv1.0 SSLv3.0 unknown). | |||||||||||||||||||||||||||
SSL.Client.Protocol | - | SSL/TLS protocol used between client and proxy (e.g. TLSv1.2 TLSv1.1 TLSv1.0 SSLv3.0 unknown) | |||||||||||||||||||||||||||
SSL.TransparentCNHandling | - | true for ssl connections where the CN is not known until the server handshake is done | |||||||||||||||||||||||||||
Server.CertificateChain.Issuer.CNs | ssl_issuer_common_name | The issuer common names of the certificate chain (bottom-up including the self-signed root CA, empty without certificate verification) [*] | |||||||||||||||||||||||||||
SSL.Server.Certificate.CN | ssl_subject_common_name | The common name of the server certificate [*] | |||||||||||||||||||||||||||
Server.Certificate.SHA2-256Digest | ssl_hash | The hex-encoded sha2-256 digest of the server certificate [*] | |||||||||||||||||||||||||||
Server.Certificate.AlternativeCNs | - | This list stores all alternative subject names stored in the server certificate's extensions section [*] | |||||||||||||||||||||||||||
Server.Certificate.DaysExpired | ssl_end_time | Stores how many days the server certificate is expired. Negative values mean that it is still valid [*] | |||||||||||||||||||||||||||
DNS.Lookup(URL.Host) | - | List of IP addresses of URL.Host if there are more than one. | |||||||||||||||||||||||||||
DNS.Lookup.Reverse(URL.Destination.IP) | - | List of hostnames for the destination IP. Very often it does not equal the requested hostname | |||||||||||||||||||||||||||
Body.NumberOfChildren | - | Number of embedded objects for archive or document [*] | |||||||||||||||||||||||||||
Body.NestedArchiveLevel | - | The current archive level, used to calculate the max level of the embedded object [*] | |||||||||||||||||||||||||||
IsCompositeObject | - | True, if current file is composite (archive or office document) [*] | |||||||||||||||||||||||||||
Body.IsEncryptedObject | - | True, if current object is encrypted | |||||||||||||||||||||||||||
Antimalware.Proactive.Probability | - | Malware probability value | |||||||||||||||||||||||||||
Antimalware.Infected | used for: file_name file_hash | True, if virus was found, false otherwise | |||||||||||||||||||||||||||
Antimalware.VirusNames | signature | List of names of found viruses | |||||||||||||||||||||||||||
Application.Reputation | - | reputation of the application | |||||||||||||||||||||||||||
Authentication.Method | authentication_method | authentication method (NTLM, Kerberos, etc.) | |||||||||||||||||||||||||||
Authentication.Realm | - | authentication realm (i.e. AD directory name) | |||||||||||||||||||||||||||
Authentication.UserGroups | - | User Groups, can be filtered with "Authentication UserGroups to log" list | |||||||||||||||||||||||||||
Authentication.FailureReason.Message | signature (?) | Human readable authentication failure reason description | |||||||||||||||||||||||||||
Authentication.Failed | action (in Authentication DM) | It is true if credentials were provided but the authentication has failed | |||||||||||||||||||||||||||
Cache.IsCacheable | - | True, if the response is cacheable and web cache is enabled | |||||||||||||||||||||||||||
Cache.Status | - | TCP_HIT for a web cache hit, TCP_MISS_RELOAD for a miss, TCP_MISS_VERIFY if the data in the cache was outdated, TCP_MISS_BYPASS for bypass based on I/O load | |||||||||||||||||||||||||||
Cache.IsFresh | - | True, if the response is validated or not read from web cache | |||||||||||||||||||||||||||
MagicBytesMismatch | - | True, if Mime Type from header doesn't match to detected Mime Type [*] | |||||||||||||||||||||||||||
EnsuredTypes | - | List of Mime Types detected by signatures (with high probability of detection) | |||||||||||||||||||||||||||
NotEnsuredTypes | - | List of Mime Types detected by signatures (with low probability of detection) | |||||||||||||||||||||||||||
IsMediaStream | - | Determine if current transaction is media stream | |||||||||||||||||||||||||||
StreamDetector.Probability | - | Probability value for media stream detection | |||||||||||||||||||||||||||
StreamDetector.MatchedRule | - | Returns name of matched streaming detection rule | |||||||||||||||||||||||||||
Rules.CurrentRule.Name | - | The name of the currently evaluated rule | |||||||||||||||||||||||||||
Rules.EvaluatedRules | - | List of all IDs of rules/rule sets, which have been evaluated | |||||||||||||||||||||||||||
Rules.FiredRules | - | List of all IDs of rules/rule sets, where the condition was true | |||||||||||||||||||||||||||
Proxy.IP | - | Stores the Webgateway IP | |||||||||||||||||||||||||||
Proxy.Port | - | Stores the Webgateway port | |||||||||||||||||||||||||||
Client.ProcessName | - | Stores the process name that initiated the connection, e.g. provided by MCP | |||||||||||||||||||||||||||
Client.SystemInfo | - | Client system information (provided by MCP) |
DNS.Lookup.Reverse(client_ip) | src_ip | Hostname of the client |
Connection.Protocol | - | The protocol that the client uses to communicate with the proxy (HTTP, HTTPS, FTP, IFP, SSL, ICAP, XMPP, TCP or SOCKS) |
Connection.Port | src_port | Port of the client |
Connection.RunTime | - | Connection run time (current time minus start time) in seconds |
BytesFromClient | - | Number of bytes received from the client for this request |
BytesToClient | - | Number of bytes sent to the client for this request |
Tunnel.Enabled | - | True if an HTTP or HTTPS tunnel was enabled (the server response bypassed the response cycle) |
Proxy.Outbound.IP | - | The IP address that Web Gateway uses as the outbound source IP when connecting to the onward server |
Proxy.Outbound.Port | - | The port that Web Gateway uses as the source port when connecting to the onward server |
ProtocolAndVersion | - | Protocol and version of the request/response (HTTP/1.1, HTTP/2.0) |
Error.ID | - | ID of the error |
Error.Message | - | Name of the error |
URL.Reputation | severity (?) | Returns the web reputation value for the current URL. Range is from -127 to 127, where -127 means 'Minimal Risk' and 127 means 'High Risk'. |
URL.Geolocation | - | Returns the geolocation of the current URL. The geolocation is the code of the country in which the web server that hosts the requested resource is located. The country code is given in ISO 3166 notation. Note: The setting "Disable local GTI database" must be enabled in the URL Filter settings; otherwise this property is not filled. |
TimeInRuleEngine | - | Milliseconds spent in the rule engine so far. When used in a log handler, the time consumed by the rule engine from the start to the end of the transaction |
FirstSentFirstReceivedServer, LastSentLastReceivedServer, FirstReceivedFirstSentClient, LastReceivedLastSentClient, LastSentFirstReceivedServer | - | Timers in milliseconds between the two named events, e.g. FirstSentFirstReceivedServer is the time between the first byte sent to the server and the first byte received from the server |
HandleConnectToServer | - | Time to connect to a server in milliseconds |
ResolveHostNameViaDNS | - | Time to resolve a host name via DNS |
TimeInExternals | - | Milliseconds spent so far waiting for external responses, e.g. from the AV scanner, the domain controller for NTLM authentication or URL cloud categorization |
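As a rough illustration of how the timer properties above can be used when looking for the cause of high latency, the following Python sketch picks the largest timer of a single event. It assumes the event is available as a dictionary keyed by the property names from the table and that all four timers are in milliseconds (the table does not state a unit for ResolveHostNameViaDNS); the function name `slowest_timer` is hypothetical and not part of the app.

```python
from typing import Mapping, Optional, Tuple

# Timer properties taken from the table above. Whether an event exposes them
# under exactly these field names is an assumption made for this sketch.
TIMER_FIELDS = (
    "TimeInRuleEngine",
    "HandleConnectToServer",
    "ResolveHostNameViaDNS",
    "TimeInExternals",
)


def slowest_timer(event: Mapping[str, object]) -> Optional[Tuple[str, float]]:
    """Return (field, value) for the largest timer in the event, or None."""
    timings = []
    for field in TIMER_FIELDS:
        raw = event.get(field)
        if raw in (None, "", "-"):
            continue  # timer missing or not logged
        try:
            timings.append((field, float(raw)))
        except (TypeError, ValueError):
            continue  # ignore non-numeric values
    if not timings:
        return None
    return max(timings, key=lambda item: item[1])
```

For an event where DNS resolution dominates, the function returns the DNS timer, which matches the advice to check the DNS timers first.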
The audit log (/opt/mwg/log/audit/audit.log) contains all changes and activity made by administrator(s) using the UI or the REST interface. The audit log can be sent using a UF or a custom syslog configuration. Almost 70 actions are mapped to the Authentication and Change CIM Data Models:
MWG audit action | action | change_type | object_category |
---|---|---|---|
ACTIVATE_LICENSE_FILE | modified | license | |
ADDED_ADMINROLE | added | AAA | role |
ADDED_APPLIANCE | added | appliance | |
ADDED_CONTENT | added | filesystem | config |
ADDED_GROUP_ROLE_MAPPING | added | AAA | role |
ADDED_RULES | added | config | |
ADDED_SYSTEM_FILES | added | filesystem | file |
ADDED_TEMPLATE_DIRECTORIES | added | filesystem | directory |
AUTHENTICATE_WITH_EXTERNAL_SERVER | success | ||
BACKUP_TRIGGERED | created | backup | |
CREATED_NEW_LIST | added | config | |
CREATED_NEW_RULE | added | config | |
CREATED_NEW_RULEGROUP | added | config | |
CREATED_NEW_SETTINGS | added | config | |
CREATED_NEW_USER | added | AAA | user |
CREATED_NEW_USER_DEFINED_PROPERTY | added | config | |
DASHBOARD_DATA_RESET | deleted | ||
DATE_CHANGED | modified | config | |
DELETED_ADMINROLE | deleted | AAA | role |
DELETED_APPLIANCE | deleted | appliance | |
DELETED_CONTENT | deleted | config | |
DELETED_LIST | deleted | config | |
DELETED_LOG_HANDLER | deleted | config | |
DELETED_RULE | deleted | config | |
DELETED_RULE_GROUP | deleted | config | |
DELETED_RULES | deleted | config | |
DELETED_SETTINGS | deleted | config | |
DELETED_TEMPLATE_DIRECTORIES | deleted | directory | |
DELETED_TEMPLATE_FILES | deleted | file | |
DELETED_USER | deleted | AAA | user |
DELETED_USER_DEFINED_PROPERTY | deleted | config | |
EXPORT_PRIVATE_KEY | read | config | |
FILE_DOWNLOAD | read | file | |
FILE_UPLOAD | added | filesystem | file |
FILES_DELETE | deleted | filesystem | file |
FORCED_USER_LOGOUT | logout | ||
JOINED_NTLM | modified | config | |
LEFT_NTLM | modified | config | |
MODIFIED_ADMINROLE | modified | role | |
MODIFIED_APPLIANCE_SETTINGS | modified | config | |
MODIFIED_CLUSTER_CONFIGURATION | modified | config | |
MODIFIED_CONTENT | modified | config | |
MODIFIED_CATALOG | modified | config | |
MODIFIED_GROUP_ROLE_MAPPING | modified | role | |
MODIFIED_LIST | modified | config | |
MODIFIED_NTLM | modified | config | |
MODIFIED_RULE | modified | config | |
MODIFIED_RULE_GROUP | modified | config | |
MODIFIED_SETTINGS | modified | config | |
MODIFIED_SYSTEM_FILES | modified | filesystem | file |
MODIFIED_TEMPLATE_FILES | modified | filesystem | file |
MODIFIED_USER | modified | AAA | user |
MODIFIED_USER_DEFINED_PROPERTY | modified | config | |
MOVED_RULE_GROUPS | modified | config | |
MOVED_RULES | modified | config | |
REORDERED_CONTENT | modified | config | |
RESTORE_FAILED | modified | config | |
RESTORE_STARTED | pending | config | |
RESTORE_SUCCEDED | modified | config | |
SAVING_FAILED | read | config | |
SYSTEM_LIST_UPDATE | modified | config | |
TRIGGER_ACTION | pending | config | |
USER_LOGIN | success | ||
USER_LOGIN_FAILED | failure | ||
USER_LOGOUT | logout | ||
USER_TIMED_OUT | timeout |
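The mapping above is essentially a lookup from the MWG audit action to CIM fields. The Python sketch below illustrates the idea with a handful of rows copied from the table. It is not the app's own implementation, and the names `CimChange`, `AUDIT_TO_CIM` and `to_cim` are hypothetical.

```python
from typing import NamedTuple, Optional


class CimChange(NamedTuple):
    action: str
    change_type: str
    object_category: str


# A small subset of the table above; empty strings mean "not set".
AUDIT_TO_CIM = {
    "ADDED_ADMINROLE": CimChange("added", "AAA", "role"),
    "CREATED_NEW_USER": CimChange("added", "AAA", "user"),
    "DELETED_USER": CimChange("deleted", "AAA", "user"),
    "FILE_UPLOAD": CimChange("added", "filesystem", "file"),
    "USER_LOGIN": CimChange("success", "", ""),
    "USER_LOGIN_FAILED": CimChange("failure", "", ""),
}


def to_cim(audit_action: str) -> Optional[CimChange]:
    """Look up the CIM fields for an audit action; None if it is not mapped."""
    return AUDIT_TO_CIM.get(audit_action)
```

Unmapped actions return `None` so that a caller can decide whether to drop the event or keep it without CIM fields.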
You want to: | Action |
---|---|
complete setup |
|
use a non-default index | Modify the "index_and_sourcetype" macro to include an index (e.g. 'index=proxy AND sourcetype="mcafee:webgateway:custom"') |
implement Common Information Model (CIM) | Install Splunk Common Information Model (CIM) App |
import a new version of the Splunk Logging Ruleset but keep all modifications | Use the mwg_xml2txt script to see the differences between versions. |
build an accelerated DM | Don't include highly variable strings such as uri_path, uri_query or url in an accelerated DM unless you really need them |
improve proxy performance, find causes of high latency | Check errors, web cache (should be disabled!), timers (esp. DNS) |
configure data retention | Configure frozenTimePeriodInSecs. TBD |
implement some GDPR requirements | Check if personally identifiable information (PII) should be removed, encrypted, obfuscated or masked. TBD |
investigate a breach/incident | Create a copy of all relevant events (also from other sources) to prevent them from aging out. TBD |
implement a 4-eyes principle | It can be implemented either on the proxy side or in Splunk. TBD |
send events to another destination besides Splunk | Modify rsyslog.conf or use "Route and filter data". TBD |
customize or create own views and reports | TBD |
add new fields | TBD |
exclude some events from search | Create a macro to exclude some sources, destinations or user-agents and add it to a query |
exclude some events from logging | On MWG: Modify the existing list "Domains not to log" or create your own exclusion rules |
improve search performance |
|
correctly log FTP/FTPoverHTTP connections | Due to the nature of FTP requests, the MWG events don't correctly reflect the connection type. This requires more work on both the MWG and the Splunk side. TBD |
work with IPv6 addresses | TBD |
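For the data retention row above, frozenTimePeriodInSecs is set per index in indexes.conf and is expressed in seconds. The Python sketch below converts a retention period in days into that value and renders a stanza. The index name `proxy` is borrowed from the macro example in the table, the 90-day period is only an example, and the helper name `retention_stanza` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retention_stanza(index_name: str, days: int) -> str:
    """Render an indexes.conf stanza that freezes events older than `days`."""
    if days <= 0:
        raise ValueError("retention period must be a positive number of days")
    seconds = days * SECONDS_PER_DAY
    return f"[{index_name}]\nfrozenTimePeriodInSecs = {seconds}\n"


if __name__ == "__main__":
    # Example only: keep events in the "proxy" index for 90 days.
    print(retention_stanza("proxy", 90))
```

The generated text is meant to be reviewed and merged by hand into the indexes.conf of the indexer; the sketch does not write any files.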
The HTTP protocol can be used by malware to communicate with C&C servers, blending in with normal web traffic generated by benign applications such as browsers. However, most enterprise security solutions don't analyze all parts of the HTTP protocol, and even those that do can log only partial information: either a small fixed subset of headers (such as User-Agent, X-Forwarded-For or Referer) is logged, or the header names to log must be configured explicitly. Neither method makes it possible to log all headers or headers that are not known in advance.
Fortunately, recent MWG/SWG versions close this security gap by making it possible to log all HTTP headers. The rule-based policy logic makes it possible to apply such deep logging to suspicious transactions only, significantly reducing log volume.
Enabling the collection of header information on the MWG side: TBD
Using conditional criteria to apply header collection to suspicious transactions only: TBD
Log example with request headers information: TBD
Working with Header View: TBD
Next steps: configure response headers collection
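Once all request headers are logged, one simple way to use them is to flag header names that ordinary browsers do not usually send. The Python sketch below shows the idea only. The log format for headers is still TBD in this document, so the input is assumed to be a plain list of header names already extracted from an event, and both the allowlist `COMMON_REQUEST_HEADERS` and the function `unknown_headers` are hypothetical.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical allowlist of header names expected from ordinary browsers.
COMMON_REQUEST_HEADERS: FrozenSet[str] = frozenset({
    "host", "user-agent", "accept", "accept-language", "accept-encoding",
    "connection", "referer", "cookie", "content-type", "content-length",
})


def unknown_headers(
    header_names: Iterable[str],
    known: FrozenSet[str] = COMMON_REQUEST_HEADERS,
) -> List[str]:
    """Return header names not in the allowlist (case-insensitive, de-duplicated)."""
    seen = set()
    result = []
    for name in header_names:
        key = name.strip().lower()
        if key and key not in known and key not in seen:
            seen.add(key)
            result.append(name.strip())
    return result
```

In practice the allowlist would have to be built from the headers actually observed in the environment; the one above is deliberately short.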
This App, documentation and MWG logging ruleset are licensed under Creative Commons BY-ND 3.0
The MWG Splunk Logging RuleSet is quite complex, and most customers modify it to suit their own needs. Use this script to see all modifications when importing a new version of the RuleSet.
Usage:
Step 1: convert XML to TXT and compare them
perl mwg_xml2txt.pl old_ruleset.xml > old_ruleset.txt
perl mwg_xml2txt.pl new_ruleset.xml > new_ruleset.txt
vimdiff old_ruleset.txt new_ruleset.txt
vimdiff compares the TXT files and highlights differences in lists and rules using color output. A difference can be as simple as enabled vs. disabled, but it can also be a more complex modification - in that case use Step 2 to do a direct XML comparison.
Step 2: identify differences and optionally extract corresponding XML section for comparison
Export a single rule from the XML ruleset (replace RuleName with the actual name of the rule that you want to extract):
perl -0777 -e '$a=<>; ($rule)=$a=~m/(\QRuleName\E.*?<\/rule>)/ms; print "$rule"' ruleset_old.xml > rule_old.txt
perl -0777 -e '$a=<>; ($rule)=$a=~m/(\QRuleName\E.*?<\/rule>)/ms; print "$rule"' ruleset_new.xml > rule_new.txt
vimdiff rule_old.txt rule_new.txt
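For readers who prefer Python over a Perl one-liner, the extraction in Step 2 can be sketched as follows. It mirrors the regular expression used above (from the literal rule name up to the first closing rule tag) and adds a unified diff of the two extracted snippets. The function names `extract_rule` and `diff_rule` are hypothetical and not part of the app.

```python
import difflib
import re
from typing import List, Optional


def extract_rule(xml_text: str, rule_name: str) -> Optional[str]:
    """Return the text from the rule name up to the first closing rule tag."""
    # re.escape plays the role of \Q...\E; DOTALL lets "." span line breaks.
    match = re.search(re.escape(rule_name) + r".*?</rule>", xml_text, re.DOTALL)
    return match.group(0) if match else None


def diff_rule(old_xml: str, new_xml: str, rule_name: str) -> List[str]:
    """Return a unified diff of one rule between two ruleset versions."""
    old_rule = extract_rule(old_xml, rule_name) or ""
    new_rule = extract_rule(new_xml, rule_name) or ""
    return list(difflib.unified_diff(
        old_rule.splitlines(), new_rule.splitlines(),
        fromfile="rule_old", tofile="rule_new", lineterm="",
    ))
```

An empty list from `diff_rule` means the extracted snippets are identical, or that the rule was not found in either file.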
After Step 1, each rule is listed on its own line in the TXT output. The [true] or [false] flag indicates whether the rule is enabled or disabled. The short 6-character string at the end of each line is the first 6 characters of the MD5 hash of the whole rule block, so even a small modification will be highlighted.
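The fingerprint described above can be sketched in Python as follows. Numeric IDs are replaced with a placeholder before hashing, so two rule blocks that differ only in their IDs get the same fingerprint. The substitution patterns are taken from the Perl script in this document; the function name `rule_fingerprint` is hypothetical.

```python
import hashlib
import re

# Patterns that replace numeric IDs with "XXX", mirroring the substitutions
# in mwg_xml2txt.pl, so that renumbered rules do not show up as changed.
_ID_PATTERNS = [
    (re.compile(r'(id=")\d+"'), r'\1XXX"'),
    (re.compile(r'(propertyId=")\d+"'), r'\1XXX"'),
    (re.compile(r'(id="com\.scur\.type\.\w+\.)\d+"'), r'\1XXX"'),
    (re.compile(r'(id="com\.scur\.type\.complex\.\w+\.)\d+"'), r'\1XXX"'),
    (re.compile(r'(com\.scur\.engine\.\w+\.)\d+'), r'\1XXX'),
]


def rule_fingerprint(rule_block: str) -> str:
    """Return the first 6 hex characters of the MD5 of the normalized block."""
    for pattern, replacement in _ID_PATTERNS:
        rule_block = pattern.sub(replacement, rule_block)
    return hashlib.md5(rule_block.encode("utf-8")).hexdigest()[:6]
```

Six hex characters are enough to make changes visible in a diff; they are not meant to be a collision-resistant identifier.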
#!/usr/bin/perl use strict; use warnings; my $version = "0.3 17.Oct.2022 by PP"; use Digest::MD5 qw(md5_hex); # #
application/vnd.ms-excel.addin.macroEnabled.12 #MS Office 2007 Excel addin (macro-enabled) # # ##
# # # # # # ## # # # usage: # Step 1: convert XML to TXT and compare them # perl mwg_xml2txt.pl old_ruleset.xml > old_ruleset.txt # perl mwg_xml2txt.pl new_ruleset.xml > new_ruleset.txt # vimdiff old_ruleset.txt new_ruleset.txt # # VIMDIFF will compare TXT files and highlight differences in lists and rules using color output. It can be a simple enabled vs disabled, # but can be also a more complex modification - in this case use a Step 2 to do a direct XML comparison. # # Step 2: identify differences and optionally extract corresponding XML section for comparison # export a single rule from xml ruleset: # perl -0777 -e '$a=<>; ($rule)=$a=~m/(\QRuleName\E.*?<\/rule>)/ms; print "$rule"' ruleset_old.xml > rule_old.txt # perl -0777 -e '$a=<>; ($rule)=$a=~m/(\QRuleName\E.*?<\/rule>)/ms; print "$rule"' ruleset_new.xml > rule_new.txt # vimdiff rule_old.xml rule_new.xml my $line=1; my $xml = undef; open (my$fh, '<', $ARGV[0]) or die "cannot open file: $!"; { local $/=undef; $xml = <$fh>; } close $fh; my @lists=$xml=~m/ )/ms){ # map has other structure $list = $1; #print "$list_name\n$list\n\n"; my @entries = $list =~ m/ key="key"[^\n]+value="([^\n]+\n[^\n]+value="[^"]+)"/msg; s/([^"]+)".*\n.*"([^"]*)/$1 - $2/msg for @entries; # remove anything except key-value print "$list_name\n ".(join "\n ",sort @entries)."\n\n"; }elsif($xml =~/(
)/ms){ $list = $1; #print "$list_name\n$list\n\n"; my @entries = $list =~ m/
([^<]+)<\/entry>/msg; print "$list_name\n ".(join "\n ",sort @entries)."\n\n"; }else{ die "cannot find list" }; } while(<>){ #print "$line: $_"; $line++; next if / /; my($ruleid,$string,$offset,$name,$enabled,$rule_block)=(undef,undef,undef,undef,undef,undef); if(/^(\s*) )/ms; $rule_block =~ s/(id=")\d+"/$1XXX"/msg; $rule_block =~ s/(propertyId=")\d+"/$1XXX"/msg; $rule_block =~ s/(id="com\.scur\.type\.\w+\.)\d+"/$1XXX"/msg; $rule_block =~ s/(id="com\.scur\.type\.complex\.\w+\.)\d+"/$1XXX"/msg; $rule_block =~ s/(com\.scur\.engine\.\w+\.)\d+/$1XXX/msg; if(not defined $rule_block){die "Rule block not defined for $string"}; print "$offset $name [$enabled] ".substr((md5_hex($rule_block)),0,6)."\n" } }