AWS EMR Log4j2 workaround for CVE-2021-44228
Last updated: December 20, 2021
The following is a workaround to patch the CVE-2021-44228 remote code execution (RCE) vulnerability in Amazon EMR using bootstrap scripts provided by the AWS team. These remediation steps also apply to CVE-2021-45046.
A short note on CVE-2021-44228
CVE-2021-44228 affects Log4j2 versions 2.0-beta9 through 2.14.1, and the related CVE-2021-45046 also affects 2.15.0. Many projects or their dependencies use Log4j2 for logging or logging configuration. When resolving this vulnerability in Spark or Hadoop, it is important to understand that even if the codebase of your job is not vulnerable (e.g. it uses Log4j2 2.16 or greater), the underlying cluster the job runs on (Spark / Hadoop) may still be using a vulnerable version of the library. It is therefore not safe to assume that setting the property spark.driver.extraJavaOptions will remediate this vulnerability.
Patching Log4j2 Dependencies in EMR
AWS has published a set of patch scripts under the bootstrap-actions/log4j/ prefix of the elasticmapreduce S3 bucket. Note that the elasticmapreduce S3 bucket is owned and managed by AWS. I recommend you pull the scripts directly from this location.
# EMR Version 5.x
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-5.30.2-v1.sh
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-5.31.1-v1.sh
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-5.32.1-v1.sh
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-5.33.1-v1.sh
# EMR Version 6.x
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-6.0.1-v1.sh
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-6.1.1-v1.sh
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-6.2.1-v1.sh
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-6.3.1-v1.sh
s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-6.4.0-v1.sh
It is critical that you use a bootstrap patch script version which matches the version of EMR your cluster is running.
When run, these scripts patch the Log4j JAR files inside your cluster by removing the JndiLookup class from each of them.
You can list or view the scripts with the AWS CLI using the following commands. Be sure to replace the script name with the one that matches the version of EMR you are using:
aws s3 ls s3://elasticmapreduce/bootstrap-actions/log4j/ --no-sign-request
aws s3 cp s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-6.4.0-v1.sh - --no-sign-request
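Since the script has to match the EMR release, it can help to derive the script location from the release label rather than typing it by hand. The following is a minimal sketch, assuming the naming pattern visible in the script names above (patch-log4j-<release label>-v1.sh) holds for your release. The function names patch_script_uri and patch_script_exists are my own, not part of any AWS tooling.

```shell
#!/bin/bash
# Build the S3 URI of the patch script for a given EMR release label.
# Assumption: the object name follows patch-log4j-<release label>-v1.sh.
patch_script_uri() {
  local release_label="$1"   # e.g. emr-6.4.0
  printf 's3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-%s-v1.sh\n' "$release_label"
}

# Succeed only if the AWS-managed bucket actually lists that script.
# Requires the AWS CLI and network access; no credentials are needed.
patch_script_exists() {
  local listing
  listing=$(aws s3 ls "$(patch_script_uri "$1")" --no-sign-request 2>/dev/null)
  [ -n "$listing" ]
}

# Hypothetical usage:
# if patch_script_exists emr-6.4.0; then
#   aws s3 cp "$(patch_script_uri emr-6.4.0)" - --no-sign-request
# fi
```

If patch_script_exists fails for your release label, the pattern assumption does not hold for that release and you should pick the script from the bucket listing manually.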
The following is a collection of the latest EMR bootstrap patch scripts currently available (current as of December 20, 2021):
2021-12-15 21:25:31 - patch-log4j-emr-5.10.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.11.4-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.12.3-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.13.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.14.2-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.15.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.16.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.17.2-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.18.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.19.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.20.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.21.2-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.22.0-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.23.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.24.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.25.0-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.26.0-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.27.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.28.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.29.0-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.30.2-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.31.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.32.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.33.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.34.0-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.7.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.8.3-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-5.9.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-6.0.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-6.1.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-6.2.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-6.3.1-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-6.4.0-v1.sh
2021-12-15 21:25:31 - patch-log4j-emr-6.5.0-v1.sh
Just so you have an idea of what these scripts do, here is the code for the EMR 6.4 bootstrap patch script:
#!/bin/bash
set -ex
EMR_RELEASE=emr-6.4
DELETE_JNDI_PATH=/var/aws/emr/delete_jndi.sh
MANIFEST_PATCH_PATH=/var/aws/emr/manifest_site.patch
HIVE_INIT_PATCH_PATH=/var/aws/emr/hive_log4j.patch
SITE_PP_PATH=/var/aws/emr/bigtop-deploy/puppet/manifests/site.pp
HIVE_INIT_PATH=/var/aws/emr/bigtop-deploy/puppet/modules/hadoop_hive/manifests/init.pp
function check_release_version {
CLUSTER_RELEASE=`cat /mnt/var/lib/instance-controller/extraInstanceData.json | jq -r '.releaseLabel' | cut -d "." -f 1,2`
if [[ "$EMR_RELEASE" != "$CLUSTER_RELEASE" ]]; then
echo "This script is written for $EMR_RELEASE and this cluster is $CLUSTER_RELEASE. Please use the correct bootstrap script for this release."
exit 1
else
echo "Cluster is $CLUSTER_RELEASE, matches script release $EMR_RELEASE. Proceeding with update."
fi
}
function create_delete_jndi_script {
sudo bash -c "cat > $DELETE_JNDI_PATH" <<"EOF"
#/bin/bash
set -e
jars=("/usr/lib/flink/bin/bash-java-utils.jar" "/usr/lib/flink/lib/log4j-core-2.12.1.jar" "/usr/lib/hbase-operator-tools/hbase-hbck2-1.1.0.jar" "/usr/lib/hbase-operator-tools/hbase-tools-1.1.0.jar" "/usr/lib/trino/plugin/elasticsearch/log4j-core-2.13.3.jar" "/usr/lib/hive/lib/log4j-core-2.10.0.jar" "/usr/lib/hudi/cli/lib/log4j-core-2.10.0.jar" "/usr/lib/presto/plugin/presto-druid/log4j-core-2.8.2.jar" "/usr/lib/presto/plugin/presto-elasticsearch/log4j-core-2.9.1.jar" "/usr/lib/presto/plugin/presto-druid/log4j-core-2.8.2.jar" "/usr/lib/presto/plugin/presto-elasticsearch/log4j-core-2.9.1.jar" "/usr/lib/trino/plugin/elasticsearch/log4j-core-2.13.3.jar" "/usr/share/aws/emr/emr-log-analytics-metrics/lib/log4j-core-2.13.3.jar" "/usr/share/aws/emr/emr-metrics-collector/lib/log4j-core-2.11.2.jar")
class="org/apache/logging/log4j/core/lookup/JndiLookup"
jndi="${class}.class"
for index in "${!jars[@]}"; do
jar=${jars[$index]}
if [[ -f "$jar" ]]; then
still_exists=`jar tf $jar | grep -i $class || true`
if [[ ! -z "$still_exists" ]]; then
echo "Removing JndiLookup class from $jar..."
sudo zip -q -d $jar $jndi
echo "Removed JndiLookup class from $jar."
fi
fi
done
remaining_jars=()
for index in "${!jars[@]}"; do
jar=${jars[$index]}
if [[ -f "$jar" ]] && jar tf $jar | grep -i $class ; then
remaining_jars+=$jar
fi
done
if [[ ${remaining_jars[@]} ]]; then
echo "[ERROR] JndiLookup class still exists in: "
printf "%s\n" "${remaining_jars[@]}"
exit 1
fi
exit 0
EOF
sudo chmod +x $DELETE_JNDI_PATH
}
function create_manifest_patch {
sudo bash -c "cat > $MANIFEST_PATCH_PATH" <<"EOF"
--- a/bigtop-deploy/puppet/manifests/site.pp
+++ b/bigtop-deploy/puppet/manifests/site.pp
@@ -107,6 +107,10 @@ node default {
} else {
include node_with_components
}
+
+ class { 'log4j_hotfix':
+ stage => 'pre'
+ }
}
if versioncmp($::puppetversion,'3.6.1') >= 0 {
@@ -115,3 +119,29 @@ if versioncmp($::puppetversion,'3.6.1') >= 0 {
allow_virtual => $allow_virtual_packages,
}
}
+
+class log4j_hotfix {
+ if ("hbase-client" in hiera("bigtop::roles")) {
+ include hbase_operator_tools::library
+ }
+
+ exec { 'delete jndi':
+ path => ['/bin', '/usr/bin', '/usr/sbin',],
+ command => "/bin/bash /var/aws/emr/delete_jndi.sh",
+ logoutput => true
+ }
+
+ exec { 'restart metrics-collector':
+ path => ['/bin', '/usr/bin', '/usr/sbin',],
+ command => "systemctl restart metricscollector",
+ onlyif => "systemctl is-active metricscollector",
+ require => [ Exec['delete jndi'] ]
+ }
+
+ exec { 'restart apppusher':
+ path => ['/bin', '/usr/bin', '/usr/sbin',],
+ command => "systemctl restart apppusher",
+ onlyif => "systemctl is-active apppusher",
+ require => [ Exec['delete jndi'] ]
+ }
+}
EOF
}
function create_hive_log4j_patch {
sudo bash -c "cat > $HIVE_INIT_PATCH_PATH" <<"EOF"
--- a/bigtop-deploy/puppet/modules/hadoop_hive/manifests/init.pp
+++ b/bigtop-deploy/puppet/modules/hadoop_hive/manifests/init.pp
@@ -221,6 +221,12 @@ class hadoop_hive {
require => Package['hive'],
}
+ exec { 'change log4j loglevel to error':
+ path => ['/bin', '/usr/bin', '/usr/sbin',],
+ command => "sed -i 's/^status = INFO/status = ERROR/g' /etc/hive/conf/{beeline,hive}-log4j2.properties && ln -sf /etc/hive/conf/hive-log4j2.properties /etc/hadoop/conf",
+ require => [Bigtop_file::Properties['/etc/hive/conf/hive-log4j2.properties'],Bigtop_file::Properties['/etc/hive/conf/beeline-log4j2.properties']]
+ }
+
bigtop_file::properties { '/etc/hive/conf/hive-exec-log4j2.properties':
source => '/etc/hive/conf.dist/hive-exec-log4j2.properties.default',
overrides => $hive_exec_log4j2_overrides,
}
EOF
}
check_release_version
create_delete_jndi_script
create_manifest_patch
create_hive_log4j_patch
sudo patch -p1 -b $SITE_PP_PATH < $MANIFEST_PATCH_PATH
sudo patch -p1 -b $HIVE_INIT_PATH < $HIVE_INIT_PATCH_PATH
touch /tmp/created_jndi_patch
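As you can see, the script checks a fixed list of Log4j JAR paths on the cluster and removes the JndiLookup class from each JAR that still contains it, thereby rendering them safe from exploitation. It also patches the cluster's Puppet manifests so that this removal runs during provisioning.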
Updating your cluster’s minor version
The provided bootstrap scripts are intended to be applied to the latest minor releases available for EMR. For example, if you are on 6.3.0 you should update to 6.3.1. You can check which version your cluster is on from the Configuration details section located within the Summary tab of your cluster.
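The release label can also be read from the command line instead of the console. This is a rough sketch: release_minor and cluster_release_label are names I made up, and j-EXAMPLE12345 is a placeholder cluster ID. The cut call mirrors the comparison that check_release_version performs in the patch script shown earlier, which only looks at the first two version components.

```shell
#!/bin/bash
# Reduce a release label such as emr-6.3.0 to emr-6.3, the granularity
# that the patch script's check_release_version compares against.
release_minor() {
  printf '%s\n' "$1" | cut -d '.' -f 1,2
}

# Look up the release label of a cluster (requires AWS credentials).
cluster_release_label() {
  aws emr describe-cluster --cluster-id "$1" \
    --query 'Cluster.ReleaseLabel' --output text
}

# Hypothetical usage; j-EXAMPLE12345 is a placeholder:
# label=$(cluster_release_label j-EXAMPLE12345)
# echo "Cluster runs $label; the patch script must target $(release_minor "$label")"
```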
Adding the bootstrap script to your cluster
It is recommended that you copy the script to your own account’s S3 bucket. Additionally, this patch step must be the first bootstrap script that runs in the bootstrapping process of your cluster.
I recommend you follow the actions outlined by AWS to properly configure the bootstrap script in your clusters using the AWS documentation here.
Typically, you can use the following parameter to provide the bootstrap action when creating a cluster using the AWS CLI:
--bootstrap-actions Path="s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-6.4.0-v1.sh"
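To put that parameter in context, here is a hedged sketch of a full cluster launch with the patch listed ahead of any other bootstrap action, assuming EMR runs bootstrap actions in the order they are given. The helper bootstrap_actions is my own, and the cluster name, instance settings and the s3://my-bucket/setup.sh script are placeholders you would replace.

```shell
#!/bin/bash
# Print one --bootstrap-actions value per script URI, in the order given.
# Pass the Log4j patch script first so it is listed before everything else.
bootstrap_actions() {
  local uri
  for uri in "$@"; do
    # Name each action after its script file, minus the .sh suffix.
    printf 'Path=%s,Name=%s\n' "$uri" "$(basename "$uri" .sh)"
  done
}

# Hypothetical usage (all names and sizes below are placeholders):
# mapfile -t actions < <(bootstrap_actions \
#   "s3://elasticmapreduce/bootstrap-actions/log4j/patch-log4j-emr-6.4.0-v1.sh" \
#   "s3://my-bucket/setup.sh")
# aws emr create-cluster --name "patched-cluster" --release-label emr-6.4.0 \
#   --applications Name=Spark --instance-type m5.xlarge --instance-count 3 \
#   --use-default-roles --bootstrap-actions "${actions[@]}"
```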
If you have configured the bootstrap actions correctly, you will see the script listed in the bootstrap actions tab of the AWS Console.
Finally, it is important that you terminate any clusters that are running and re-launch them so that they run this script.
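To find out which clusters need to be relaunched, you can list the active ones from the CLI. The sketch below only prints cluster IDs. The function name active_cluster_ids is my own, and the terminate command is left commented out on purpose, since terminating a cluster is destructive and should be a deliberate step.

```shell
#!/bin/bash
# Print the ID of every active EMR cluster in the current region,
# one per line (requires AWS credentials).
active_cluster_ids() {
  aws emr list-clusters --active --query 'Clusters[].Id' --output text \
    | tr -s '[:space:]' '\n' \
    | sed '/^$/d'
}

# Hypothetical usage; review the list before terminating anything:
# active_cluster_ids
# aws emr terminate-clusters --cluster-ids j-EXAMPLE12345
```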
How to check if Log4J2 was patched correctly
You can log in to your cluster and inspect any of the Log4j JARs, either in place or after copying them down.
- You want to make sure that the class called JndiLookup is missing. If this class is gone, your Log4j JAR has been successfully patched.
You can do this in a single operation using the following command. Be sure to change the Log4j path and version as needed for your cluster:
jar tvf log4j-core-2.11.2.jar | grep JndiLookup.class
It should return no result. If you see the class returned, it means it is still present in your JAR file and your cluster is still vulnerable.
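Checking JARs one at a time gets tedious, so the same check can be looped over a directory tree. This is a minimal sketch under a few assumptions: the jar tool is on the PATH, the helper names listing_has_jndi and find_unpatched_jars are my own, and only files named log4j-core-*.jar are scanned. Shaded copies inside other JARs, such as some of the paths in the AWS script's list, would not be found by this name pattern.

```shell
#!/bin/bash
JNDI_CLASS='org/apache/logging/log4j/core/lookup/JndiLookup.class'

# Succeed if the archive listing on stdin still contains the JndiLookup class.
listing_has_jndi() {
  grep -qF "$JNDI_CLASS"
}

# Print every log4j-core JAR under the given directories that still has the class.
find_unpatched_jars() {
  local jar
  while IFS= read -r -d '' jar; do
    if jar tf "$jar" | listing_has_jndi; then
      printf '%s\n' "$jar"
    fi
  done < <(find "$@" -type f -name 'log4j-core-*.jar' -print0 2>/dev/null)
}

# Hypothetical usage on a cluster node:
# find_unpatched_jars /usr/lib /usr/share/aws
```

An empty result means none of the scanned JARs contain the class. Any path that is printed points at a JAR that still needs patching.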
Official AWS Update
On December 17th, AWS publicly released details regarding these bootstrap scripts and the steps developers need to take. I encourage you to review this page as well.