MacOS Endpoint Security Framework (ESF)

Reading time: 30 min

Published: 10/2022
Connor Morley

Senior researcher

Introduction

In 2019, Apple announced the replacement of Kernel Extensions with System Extensions in MacOS. This article highlights one of the key results of this change: the Endpoint Security Framework (ESF).

This article will explore how ESF works, the benefits it provides detection and response teams, and how you can set up endpoint filtering to avoid data overload – a common issue associated with ESF use. 

What is the Endpoint Security Framework (ESF)?

ESF is a MacOS component developed by Apple as a kernel-based solution to security needs, capable of supporting proactive detection and response for real-time events. It functions very similarly to Event Tracing for Windows (ETW) and is a significantly more effective security framework than its predecessor, OpenBSM, an open-source implementation of the Basic Security Module (BSM) audit API originally developed by Sun Microsystems.

What does this look like?

Figure 1: How ESF detection works

The diagram in the top right corner of Figure 1 provides an overview of how ESF works: ESF sits in the kernel space (green section) and collects telemetry for the event types the user has chosen for their detection purposes. Messages from the kernel space are then delivered to the subscribing application in user space, where they are run through the user’s detection stack.

The terminal window on the left of the figure shows an example of event output. Each event type carries various data points (detection criteria), including environment variables, user ID, parent process ID (PPID), process ID, and even the CDHash (code directory hash), which helps confirm whether the binary has been signed by a developer.

Why is the ESF important?

Restrictions and changes in MacOS

Following the 2019 announcement, Apple deprecated third-party MacOS Kernel Extensions (KEXTs) in favor of System Extensions. Prior to this, security vendors would install a Kernel Extension to acquire the telemetry they wanted to monitor and pass it to their detection stack. By introducing System Extensions as a replacement for Kernel Extensions, Apple was able to provide a managed interface that it would continue to maintain and update.

In addressing some of the areas where OpenBSM was lacking, Apple ensured the ESF would be noticeably easier to integrate into the user’s detection stack, and effectively solve several of the issues previously associated with OpenBSM detailed in our earlier paper.

How can we use the ESF?

Old way of monitoring - OpenBSM

Figure 2: How OpenBSM monitoring worked

Third-party vendors would install a kernel extension in the kernel space, and the monitoring program relied on that kernel extension to generate telemetry, which was then run through the detection stack. OpenBSM works by hooking different log files: once a log file is populated, the data is transmitted through the hooks and back into the detection system for further auditing.

The new way

Figure 3: New way of monitoring (ESF)

In the new and improved process, the system extension works in the user space. There are three specific frameworks that were developed as a result of Apple deprecating KEXT: 

  • The Network Extension framework

  • The Endpoint Security Framework (ESF)

  • The DriverKit framework

The main drivers behind Apple’s decision to remove access to the kernel space were stability and security. Third-party access to kernel space raises several security concerns, and removing that access almost entirely mitigates the risk. It also improves the stability of the system by preventing kernel panics (the MacOS counterpart of the Windows “blue screen of death”, or BSOD), which commonly occur when third-party kernel code makes incorrect memory allocations.

Note: Although kernel extensions have been deprecated, they can still be used in modern Mac systems – but the caveat is that the security profile of the system would have to be severely degraded. This can be very difficult to achieve and is usually done solely for development purposes. If vendors do not replace KEXTs with System Extensions, they could be jeopardizing the security of their customers’ systems. 

Issues with ESF use and solutions

A few issues were encountered when experimenting with the Endpoint Security Framework:

  • Bottlenecking issue in messaging queue – The data being output was inconsistent. Further analysis showed that this was due to a bottleneck within the kernel messaging queue: the queue was being overloaded and data packets were being silently dropped without alerting the user.

  • Solution: Two possible solutions for the bottlenecking issue are either a multi-client system or event muting. Event muting allows the user to specify which data packets to omit based on certain identifiers when they subscribe to an event type at the kernel level. However, the issue with event muting lies with its lack of flexibility. Although it is effective at reducing the data load, users cannot yet fine-tune event muting to make it a sustainable solution. 

From this perspective, the multi-client system is a far more viable solution for the bottlenecking issue, as it allows a single program to run multiple ESF clients, each subscribed to one or more event types. Consequently, each client has its own messaging queue, which is far less likely to be overloaded.
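The isolation benefit of separate queues can be illustrated with a small toy model. This is a minimal sketch in plain Python, not ESF code: the `Client` class, the queue capacity and the event names are all assumptions made for illustration, and nothing drains the queues, so it only shows how a burst of one event type affects the others.

```python
from collections import deque


class Client:
    """Toy stand-in for one ESF client with its own bounded message queue."""

    def __init__(self, event_types, capacity):
        self.event_types = set(event_types)
        self.capacity = capacity      # hypothetical queue size
        self.queue = deque()
        self.dropped = 0

    def deliver(self, event_type):
        if event_type not in self.event_types:
            return                    # this client is not subscribed
        if len(self.queue) >= self.capacity:
            self.dropped += 1         # models a silently dropped message
        else:
            self.queue.append(event_type)


def run(clients, burst):
    """Deliver a burst of events and report drops per client subscription."""
    for event_type in burst:
        for client in clients:
            client.deliver(event_type)
    return {tuple(sorted(c.event_types)): c.dropped for c in clients}


# A noisy burst of "open" events followed by a few "exec" events.
burst = ["open"] * 50 + ["exec"] * 5

single = run([Client(["open", "exec"], capacity=20)], burst)
multi = run([Client(["open"], capacity=20), Client(["exec"], capacity=20)], burst)

print(single)  # one shared queue: the exec events are lost behind the open flood
print(multi)   # separate queues: only the noisy open client drops messages
```

In this model the single client loses every "exec" event because its queue is already full of "open" events, while the two-client layout confines the loss to the noisy subscription.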

Another mitigation is the “seq_num” (sequence number) value, which Apple introduced into es_message_t between MacOS 10.15 (Catalina) and MacOS 10.15.4, when the SDK (software development kit) was silently updated. From that point, messages published from the kernel space to the user space have sequence numbers attached to them. This allows an analyst to notice any gaps in the sequence numbers and so determine whether data packets are being silently dropped. Although this doesn’t alert users directly, it does allow them to monitor the situation and perform dynamic rebalancing to avoid data overload.
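The gap check described above can be sketched in a few lines. This is an illustrative Python example rather than ESF client code: it assumes the seq_num values have already been extracted from the messages of one client and one event type, and the function names are hypothetical.

```python
def find_gaps(seq_nums):
    """Return (expected, received) pairs wherever the sequence skips ahead."""
    gaps = []
    expected = None
    for seq in seq_nums:
        if expected is not None and seq > expected:
            gaps.append((expected, seq))
        expected = seq + 1
    return gaps


def dropped_count(seq_nums):
    """Total number of messages missing across all gaps."""
    return sum(received - expected for expected, received in find_gaps(seq_nums))


# Assumed input: sequence numbers observed in user space, in arrival order.
observed = [0, 1, 2, 5, 6, 9]
print(find_gaps(observed))
print(dropped_count(observed))
```

A monitoring loop could run this check periodically and, when the dropped count rises, reduce the subscribed event types or move the noisy ones to a separate client.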

  • SYSTEM verbosity – One of the difficulties of using ESF is that it ingests every activity across the system, including general maintenance operations and daemons. This causes a vast amount of data bloat and makes it very difficult to separate routine system activity from the events that are crucial to detection.

  • Solution: Ensure pre- or post-acquisition filters are in place to omit unwanted types of data that the ESF client will otherwise generate (if event muting is not effectively applied). To filter before acquisition, a user must use event muting, as it is the only way to place filtering on the kernel side. However, event muting is either too broad or too specific: a user must mute either SYSTEM level processes in general or individual process instances. Muting all SYSTEM level activity is dangerous, because if a SYSTEM user is compromised on an endpoint, that compromise will go unnoticed and pose a high-level security risk as a result.

  • Real Parent Process ID (RPID) issue – Because of the way Cross Process Communication (XPC) works within the MacOS environment, the Parent Process ID of a running process is normally attributed to one of three processes: “launchd”, “runningboardd” (both are daemons), or XPC Proxy. As this limitation is notoriously associated with MacOS, many methods were employed in attempts to resolve the “launchd” real parent ID (RPID) problem, but the introduction of “runningboardd” means the issue is ongoing. Hence, when utilizing ESF data, the PPID data must be passed through an adequate system to derive the RPID.

  • Solution: There is no sustainable solution for this as regular updates quickly render solutions ineffective. However, current notable solutions include: TrueTree by Jaron Bradley and launchdXPC by Patrick Wardle. These two technologies proved to be very effective for “launchd” and XPC Proxy but became less viable following Apple’s move to the System “runningboardd”. 
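For the SYSTEM verbosity issue above, a post-acquisition filter on the application side might look like the following. This is a minimal sketch under stated assumptions: events are represented as plain dictionaries, the field names (`uid`, `path`, `event`) are hypothetical, and the list of noisy daemon paths is only an example that would need tuning per environment. Rather than discarding all SYSTEM activity, it drops only events from known-noisy root-owned binaries, so other SYSTEM level behavior stays visible.

```python
# Example list of noisy SYSTEM daemons (an assumption; tune per environment).
NOISY_SYSTEM_PATHS = {
    "/usr/libexec/logd",
    "/usr/sbin/cfprefsd",
}


def keep_event(event):
    """Drop an event only when a root-owned process is on the noisy list."""
    is_system = event["uid"] == 0
    return not (is_system and event["path"] in NOISY_SYSTEM_PATHS)


def post_filter(events):
    """Apply the filter after acquisition, on the application side."""
    return [e for e in events if keep_event(e)]


events = [
    {"uid": 0, "path": "/usr/libexec/logd", "event": "write"},
    {"uid": 0, "path": "/bin/bash", "event": "exec"},      # root shell: keep
    {"uid": 501, "path": "/usr/bin/curl", "event": "exec"},
]
kept = post_filter(events)
print([e["path"] for e in kept])
```

Filtering this way costs more than kernel-side muting, because every event is still delivered to user space first, but it avoids the blind spot created by muting SYSTEM activity wholesale.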

The ESFang solution introduced by Connor Morley

Code-named ESFang, this is a solution created by WithSecure’s Senior Researcher and former Threat Hunter, Connor Morley. It was designed to address several issues associated with the MacOS Endpoint Security Framework, and was adapted from an amalgamation of Patrick Wardle’s, Chris Ross’ and Omark-Ikram’s groundwork on the ESF’s initial detection capabilities and ingestion of data, in late 2019. 

ESFang was developed in early 2021 (now available for public consumption on the Countercept GitHub). It allows for dynamic ingestion of 52 different event types, including process events, file events, file metadata events, inter-process communication (IPC) port allocations, etc. The main feature of ESFang is that users can specify which events they want to ingest each time they run it, which helps to avoid the bottlenecking issue discussed previously.

Moreover, the POC can be modified into a multi-threaded system, which allows the user to add the multi-client approach with each client running as a separate ingestion.

ESFang also has JSON (JavaScript Object Notation) output to facilitate seamless transmission of the collected data to additional detection stacks.
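To illustrate why JSON output eases hand-off to other detection stacks, the sketch below serialises one event as a single JSON line. It is not ESFang’s actual schema: the field names and values are hypothetical, and only the event type name follows the ESF naming seen in the figure captions.

```python
import json


def to_json_line(event):
    """Serialise one event as a compact, key-sorted JSON line."""
    return json.dumps(event, sort_keys=True, separators=(",", ":"))


# Hypothetical event record; real ESF events carry many more fields.
event = {
    "event_type": "ES_EVENT_TYPE_NOTIFY_OPEN",
    "pid": 4242,
    "ppid": 1,
    "path": "/tmp/example.txt",
}

line = to_json_line(event)
print(line)

# Any downstream consumer can recover the same structure.
assert json.loads(line) == event
```

One event per line keeps the output easy to stream, append to a log file, or forward to a collector without buffering the whole run.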

Meterpreter use case

Use case outline

A use case for ESF telemetry analysis against the Meterpreter agent was conducted on MacOS 11.2.2 to test several aspects of functionality, including event output, detection capabilities and a comparison against ESF’s predecessor and former telemetry systems: OpenBSM and KEXT. The ESFang POC was used as the main tool for telemetry acquisition. Testing covered only the Meterpreter agent’s capabilities on an endpoint, with a sole focus on the post-exploitation stage from a single host.

Overall findings

Figure 4: Meterpreter telemetry generation via ESF

The graph in figure 4 represents the event types that were ingested during ESF telemetry acquisition. Depending on the activity that is being conducted, users can get vast or small amounts of information, which can be very useful for detection. 

Figure 5: Key event types that were conducted

Several key event types were generated during the installation of the Meterpreter agent and, as shown in Figure 5, memory protect (mprotect) and read directory (readdir) show the highest increase. However, they aren’t deemed valuable data points, as they lack the specificity required to identify malicious activity and help build appropriate detections.

By contrast, data points like open and fcntl (file control) are, when cross-referenced, valuable enough to help the user create a detection profile specifically for the Meterpreter agent.
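Cross-referencing of this kind can be sketched as grouping events by process and flagging processes that produce every event type of interest. The example below is illustrative only: the dictionary fields are hypothetical, and the actual detection profile for Meterpreter is not given in this article, so the required event set is an assumption.

```python
from collections import defaultdict


def cross_reference(events, required=("open", "fcntl")):
    """Return the PIDs that generated every event type in `required`."""
    seen = defaultdict(set)
    for e in events:
        seen[e["pid"]].add(e["event"])
    return sorted(pid for pid, kinds in seen.items() if set(required) <= kinds)


# Hypothetical telemetry: only PID 100 shows both open and fcntl activity.
events = [
    {"pid": 100, "event": "open"},
    {"pid": 100, "event": "fcntl"},
    {"pid": 200, "event": "open"},
    {"pid": 300, "event": "fcntl"},
    {"pid": 300, "event": "mprotect"},
]
print(cross_reference(events))
```

A real profile would add further conditions, such as the files targeted or the ordering of events, to keep the false positive rate manageable.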

Breaking this down

In total, the graph in Figure 5 had 259 data points, but quantity is not an indicator of quality. Too much data can overload the system, which can leave the user blind to other malicious activity. Nevertheless, it is useful to have enough data points, together with sufficient processing power, as this allows the user to perform enough cross-referencing to create high-fidelity detection profiles.

More valuable event types

Figure 6: ES Event Notify Open (file operation)

A notify open event is generated any time a file is opened by a user or program, and typically specifies which process accessed which file and at what time. It is also the most important part of the file monitoring element of ESF, as it provides analysts with large amounts of data for detecting malicious activity.

Here, 30 open events were created by the installation, whereas the webcam stream (which is not operationally secure) generated a staggering 478 events, followed by process listing at 249.

Figure 7: ES event notify write (file operation)

Figure 7 represents the notify write event, which is generated whenever a process writes to an existing file. A write can strongly signify anomalous behavior if it originates from an anomalous or non-editing process.

Edit file had only one event, which is to be expected as in both cases it was writing the same amount of data to a target file. Upload had 226 events, which are all repeated write events and can imply that something is being transmitted in blocks.
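The “repeated writes imply block transfer” observation lends itself to a simple counting heuristic. The following is a minimal sketch, not a production detection: the field names are hypothetical and the threshold of 100 writes is an arbitrary assumption chosen to separate the single edit event from the 226 upload events reported above.

```python
from collections import Counter


def block_transfer_candidates(events, threshold=100):
    """Flag (pid, target) pairs with at least `threshold` write events."""
    counts = Counter(
        (e["pid"], e["target"]) for e in events if e["event"] == "write"
    )
    return {key: n for key, n in counts.items() if n >= threshold}


# Hypothetical telemetry mirroring the counts in the text:
# 226 repeated writes from one process, and a single write from another.
events = [{"pid": 7, "target": "/tmp/upload.bin", "event": "write"}] * 226
events.append({"pid": 8, "target": "/tmp/notes.txt", "event": "write"})

print(block_transfer_candidates(events))
```

Combined with the originating process, for example a non-editing process performing hundreds of writes to one file, such a count could feed into a higher-fidelity detection.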

Figure 8: ES event notify IOKIT open

IOKit (I/O Kit) open events are generated for hardware or driver access on the Mac system. In this case, only two actions produced them (screenshare and webcam stream); both use hardware and are not operationally secure.

Valued event types for detection in ESF

The following event types found in ESF can give a very accurate detection capability for multiple attack frameworks that are currently available:

Valued event types for detection in ESF

Conclusion

The Endpoint Security Framework boasts powerful capabilities for detection and response purposes and is regularly updated by Apple to proactively address the problems previously associated with its operation. Compared to prior solutions, in which vendors developed their own kernel extensions or relied on tools such as OpenBSM, ESF is relatively easy to use: ingesting its data and formatting that data to suit a user’s requirements are both relatively simple. Moreover, its low-level output allows for high-fidelity detections, further boosting ESF’s detection capabilities. Nevertheless, due to some of the issues mentioned in this article, filtering per endpoint, either on the kernel side or in the application itself, is essential because of the amount of data a user will need to process for detection purposes. Event muting can make this difficult to achieve due to its lack of flexibility; setting up a multi-client system is more feasible, as it lets you pick and choose which events you want to ingest whenever you run ESFang.

This article was adapted from a presentation given by Connor Morley. Watch the full talk here.