Troubleshoot VM out-of-memory errors

This page provides information on Dataproc on Compute Engine VM out-of-memory (OOM) errors, and explains steps you can take to troubleshoot and resolve OOM errors.

OOM error effects

When Dataproc on Compute Engine VMs encounter out-of-memory (OOM) errors, the effects include the following conditions:

  • Master and worker VMs freeze for a period of time.

  • Master VM OOM errors cause jobs to fail with "task not acquired" errors.

  • Worker VM OOM errors cause the loss of the node in YARN and HDFS, which delays Dataproc job execution.

YARN memory controls

Apache YARN provides the following types of memory controls:

  • Polling based (legacy)
  • Strict
  • Elastic

By default, Dataproc doesn't set yarn.nodemanager.resource.memory.enabled to enable YARN memory controls, for the following reasons:

  • Strict memory control can terminate containers even when sufficient memory is available if container sizes aren't configured correctly.
  • Elastic memory control requirements can adversely affect job execution.
  • YARN memory controls can fail to prevent OOM errors when processes aggressively consume memory.

Dataproc memory protection

When a Dataproc cluster VM is under memory pressure, Dataproc memory protection terminates processes or containers until the OOM condition is removed.

Dataproc provides memory protection for the following cluster nodes in the following Dataproc on Compute Engine image versions:

Role             1.5              2.0        2.1        2.2
Master VM        1.5.74+          2.0.48+    all        all
Worker VM        Not Available    2.0.76+    2.1.24+    all
Driver Pool VM   Not Available    2.0.76+    2.1.24+    all

Identify and confirm memory protection terminations

You can use the following information to identify and confirm job terminations due to memory pressure.

Process terminations

  • Processes that Dataproc memory protection terminates exit with code 137 or 143.

  • When Dataproc terminates a process due to memory pressure, the following actions or conditions can occur:

    • Dataproc increments the dataproc.googleapis.com/node/problem_count cumulative metric, and sets the reason to ProcessKilledDueToMemoryPressure. See Dataproc resource metric collection.
    • Dataproc writes a google.dataproc.oom-killer log with the message: "A process is killed due to memory pressure: process name". To view these messages, enable Logging, then use the following log filter:
      resource.type="cloud_dataproc_cluster"
      resource.labels.cluster_name="CLUSTER_NAME"
      resource.labels.cluster_uuid="CLUSTER_UUID"
      jsonPayload.message:"A process is killed due to memory pressure:"
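The exit codes listed above follow the standard Unix convention of 128 plus the terminating signal number: 137 corresponds to SIGKILL (9) and 143 to SIGTERM (15). A minimal shell sketch that demonstrates the convention:

```shell
# A process terminated by a signal exits with status 128 + signal number.
bash -c 'kill -KILL $$' || echo "SIGKILL exit code: $?"   # 128 + 9  = 137
bash -c 'kill -TERM $$' || echo "SIGTERM exit code: $?"   # 128 + 15 = 143
```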
      
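As a sketch, you can store the filter in a shell variable and pass it to `gcloud logging read`. The gcloud invocation is illustrative and assumes the Cloud SDK is installed and authenticated; CLUSTER_NAME, CLUSTER_UUID, and PROJECT_ID are placeholders:

```shell
# Cloud Logging filter for Dataproc memory-protection process kills.
# CLUSTER_NAME and CLUSTER_UUID are placeholders for your cluster's values.
FILTER='resource.type="cloud_dataproc_cluster"
resource.labels.cluster_name="CLUSTER_NAME"
resource.labels.cluster_uuid="CLUSTER_UUID"
jsonPayload.message:"A process is killed due to memory pressure:"'
echo "$FILTER"

# With the Cloud SDK installed and authenticated, run (illustrative):
# gcloud logging read "$FILTER" --project=PROJECT_ID --limit=10
```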

Master node or driver node pool job terminations

  • When a Dataproc master node or driver node pool job terminates due to memory pressure, the job fails with error Driver received SIGTERM/SIGKILL signal and exited with INT code. To view these messages, enable Logging, then use the following log filter:

    resource.type="cloud_dataproc_cluster"
    resource.labels.cluster_name="CLUSTER_NAME"
    resource.labels.cluster_uuid="CLUSTER_UUID"
    jsonPayload.message:"Driver received SIGTERM/SIGKILL signal and exited with"
        

    • Check the google.dataproc.oom-killer log or the dataproc.googleapis.com/node/problem_count metric to confirm that Dataproc memory protection terminated the job (see Process terminations).

    Solutions:

    • If the cluster has a driver pool, increase driver-required-memory-mb to match actual job memory usage.
    • If the cluster does not have a driver pool, recreate the cluster, lowering the maximum number of concurrent jobs running on the cluster.
    • Use a master node machine type with increased memory.
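For the driver pool case, driver memory is requested at job submission. The following is a hedged sketch, not a definitive invocation: it assumes the cluster was created with a driver node group, and CLUSTER_NAME, REGION, the class name, and the jar path are all placeholders. Verify the flag name against the gcloud reference for your SDK version before relying on it.

```shell
# Resubmit the job, requesting driver memory that matches observed usage.
# All names below are placeholders; --driver-required-memory-mb applies
# only to clusters with a driver node group (driver pool).
gcloud dataproc jobs submit spark \
  --cluster=CLUSTER_NAME \
  --region=REGION \
  --driver-required-memory-mb=4096 \
  --class=org.example.SparkJob \
  --jars=gs://BUCKET/spark-job.jar
```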

Worker node YARN container terminations

  • Dataproc writes the following message in the YARN resource manager log: container id exited with code EXIT_CODE. To view these messages, enable Logging, then use the following log filter:

    resource.type="cloud_dataproc_cluster"
    resource.labels.cluster_name="CLUSTER_NAME"
    resource.labels.cluster_uuid="CLUSTER_UUID"
    jsonPayload.message:"container" AND "exited with code" AND "which potentially signifies memory pressure on NODE"
    
  • If a container exited with code INT, check the google.dataproc.oom-killer log or the dataproc.googleapis.com/node/problem_count metric to confirm that Dataproc memory protection terminated the job (see Process terminations).

    Solutions:

    • Check that container sizes are configured correctly.
    • Consider lowering yarn.nodemanager.resource.memory-mb. This property controls the amount of memory used for scheduling YARN containers.
    • If job containers consistently fail, check if data skew is causing increased usage of specific containers. If so, repartition the job or increase worker size to accommodate additional memory requirements.
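Cluster-level YARN properties such as yarn.nodemanager.resource.memory-mb are set at cluster creation time with the yarn: property prefix. A sketch, with placeholder names and an illustrative 12 GiB value (choose a value appropriate for your machine type):

```shell
# Recreate the cluster with a lower NodeManager scheduling allotment so
# that container reservations leave headroom for non-YARN processes.
# CLUSTER_NAME and REGION are placeholders.
gcloud dataproc clusters create CLUSTER_NAME \
  --region=REGION \
  --properties='yarn:yarn.nodemanager.resource.memory-mb=12288'
```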

Fine-tune Linux memory protection on the master node (advanced)

Dataproc master nodes use the earlyoom utility to prevent system hangs by freeing memory when available memory is critically low. The default configuration is suitable for many workloads. However, you might need to adjust the configuration if your master node has a large amount of memory and experiences rapid memory consumption.

In scenarios with high memory pressure, the system can enter a state of "thrashing," where it spends most of its time on memory management and becomes unresponsive. This can happen so quickly that earlyoom fails to act under its default settings, or acts too late to prevent the kernel OOM killer from being invoked.

Before you begin

  • This is an advanced tuning option. Before you adjust earlyoom settings, prioritize other solutions, such as using a master VM with more memory, reducing job concurrency, or optimizing job memory usage.

Customize earlyoom settings

The default earlyoom configuration uses a fixed amount of free memory as a trigger. On virtual machines with a large amount of RAM, for example 32 GB or more, this fixed amount might represent a small fraction of the total memory. This makes the system susceptible to sudden spikes in memory usage.

To customize the earlyoom settings, connect to the master node and modify the configuration file.

  1. Connect to your master node using SSH.

  2. Open the configuration file for editing:

    sudo nano /etc/default/earlyoom
    
  3. Adjust the minimum memory threshold. Locate the EARLYOOM_ARGS line. The -M <kbytes> option sets the minimum amount of free memory in KiB that earlyoom tries to maintain. The default value is -M 65536, which is 64 MiB.

    For a master node with substantial memory, increase this value. For example, to set the threshold to 1 GiB (1048576 KiB), modify the line as follows:

    EARLYOOM_ARGS="-r 15 -M 1048576 -s 1"
    

    Notes:

    • -r: The memory report interval, in seconds
    • -s: The minimum free swap, as a percentage of total swap, below which earlyoom takes action
  4. Restart the earlyoom service to apply the changes:

    sudo systemctl restart earlyoom
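The edit in step 3 can also be made non-interactively with sed. The sketch below runs against a local copy of the file, so it is safe to try anywhere; on the master node you would target /etc/default/earlyoom with sudo instead:

```shell
# Create a local sample matching the default configuration.
cat > earlyoom.sample <<'EOF'
EARLYOOM_ARGS="-r 15 -M 65536 -s 1"
EOF

# Raise the free-memory floor from 64 MiB (65536 KiB) to 1 GiB:
# 1 GiB = 1024 * 1024 KiB = 1048576 KiB.
sed -i 's/-M 65536/-M 1048576/' earlyoom.sample
cat earlyoom.sample   # EARLYOOM_ARGS="-r 15 -M 1048576 -s 1"
```

On the master node, the equivalent command is sudo sed -i 's/-M 65536/-M 1048576/' /etc/default/earlyoom, followed by the service restart in step 4.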