I'm a Developer and I Hate DLP

DLP (Data Loss Prevention) refers to strategies, tools, and processes designed to ensure that sensitive or critical information is not lost, misused, or accessed by unauthorized users. DLP is crucial for enterprises to protect their data, and I fully support its use. However, as a developer, I find DLP frustrating. Let me explain why.

How DLP works (simplified)

When I (the user) access a resource from my web client or browser (or any other application that communicates over HTTP), the connection goes through the corporate DLP. For example, when I request content from https://example.technicaldomain.xyz, the corporate DLP acts as a proxy: it generates a TLS certificate for that website, signed by the enterprise's internal CA, and presents it to my client. The content and transferred data are then inspected and may be blocked.

Yes, this is a real man-in-the-middle attack, just an enterprise-approved one. Without it, DLP could not inspect HTTP content protected by TLS.

A simplified visual explanation:

[Figure: DLP diagram]
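
To see the interception for yourself, you can look at who issued the certificate your client actually receives. The snippet below is a minimal sketch using only Python's standard library; the function names issuer_fields and fetch_issuer are my own and not part of any DLP product. Behind DLP, the issuer is typically the enterprise internal CA rather than a public one. If that CA is not yet trusted by Python, the handshake fails with a certificate verification error, which is a hint in itself.

```python
import socket
import ssl


def issuer_fields(cert: dict) -> dict:
    """Flatten the nested 'issuer' tuples returned by getpeercert() into a dict."""
    return {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}


def fetch_issuer(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a verified TLS connection and return the issuer of the server certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return issuer_fields(tls.getpeercert())


if __name__ == "__main__":
    # Needs network access; the host is the example domain used in this post.
    print(fetch_issuer("example.technicaldomain.xyz"))
```

Running it once from inside the corporate network and once from outside, then comparing the two issuers, should make the proxy visible.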

What’s wrong with DLP?

Nothing! DLP is used to protect sensitive data, prevent insider threats, reduce the risk of data breaches, give visibility into data usage, and much more. Normally, enterprise administrators issue the enterprise CA certificate and deliver it automatically to the system trust store. Once the CA is installed, every application that uses the system trust store works with DLP without any issues. However, many applications do not use the system trust store and are not ready to work in DLP-enabled environments.

Just a few examples:

  • If an application uses certificate pinning, it will break in that environment, and you need to adjust the DLP rules to bypass inspection for every endpoint that relies on certificate pinning.
  • Node.js applications break by default. Node.js (and therefore npm) does not look at the system trust store by default; it ships its own built-in CA bundle, which cannot be edited in a straightforward way.
  • Java-based applications also ignore the system trust store by default and look for CAs in the Java-specific trust store (cacerts).
  • Many Python applications rely on the certifi package for CA certificates instead of the system trust store.
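
For Python, it helps to see which trust sources a given interpreter can actually reach. The following is a small sketch: trust_sources is a name I made up, and certifi is treated as optional because it may not be installed. It reports the CA file and directory that OpenSSL uses by default, plus the bundle shipped with certifi when available.

```python
import ssl


def trust_sources() -> dict:
    """Collect the CA locations visible to this Python interpreter."""
    paths = ssl.get_default_verify_paths()
    sources = {
        "openssl_cafile": paths.cafile,  # None if the default file does not exist
        "openssl_capath": paths.capath,  # None if the default directory does not exist
    }
    try:
        import certifi  # optional third-party package
        sources["certifi"] = certifi.where()
    except ImportError:
        sources["certifi"] = None
    return sources


if __name__ == "__main__":
    for name, location in trust_sources().items():
        print(f"{name}: {location}")
```

If the certifi entry points somewhere inside site-packages, any library that relies on it will not see a CA that was added only to the system trust store.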

Can we fix it? Yes, we can.

All examples below assume that you containerize your application; however, the same approaches work equally well for local development environments, rollouts directly on the OS, and so on. You only need to adjust the example to your particular case.

Add private CA to your Docker base images

It makes sense to build a special base Docker image with your private CA (and some additional tools for the future) if you work on an application that will be containerized later.

We can create the following:

Dockerfile:

FROM scratch
COPY . /
USER nobody

enterprise-ca.crt, a certificate file that contains all the necessary private CA certificates:

-----BEGIN CERTIFICATE-----
7xFSkyyBNKr79X9DFHOHZF34tHLJRq661V6bY7R+zdRcAitMOeGylZqPHWykYk0x
....
x0kYkyWHPqZlyGeOMtiAcRdz+R7Yb6V166qRJLHt43FZHOHFD9X97rKNByykSFx7
-----END CERTIFICATE-----
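
Before baking the bundle into an image, it may be worth checking how many certificates it really contains and recording their fingerprints. The sketch below is one way to do that with the standard library only; split_bundle and sha256_fingerprints are hypothetical helper names. It only decodes the PEM blocks and hashes the resulting DER bytes; it does not validate the certificates themselves.

```python
import hashlib
import re
import ssl

# Matches one PEM certificate block, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_bundle(bundle_text: str) -> list[str]:
    """Return every PEM certificate block found in the bundle, in order."""
    return PEM_BLOCK.findall(bundle_text)


def sha256_fingerprints(bundle_text: str) -> list[str]:
    """Return the SHA-256 fingerprint (hex) of each certificate in the bundle."""
    fingerprints = []
    for pem in split_bundle(bundle_text):
        der = ssl.PEM_cert_to_DER_cert(pem)  # base64-decodes the PEM body
        fingerprints.append(hashlib.sha256(der).hexdigest())
    return fingerprints


if __name__ == "__main__":
    with open("enterprise-ca.crt", encoding="ascii") as bundle:
        for index, fingerprint in enumerate(sha256_fingerprints(bundle.read())):
            print(index, fingerprint)
```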

jks-add-ca.sh, a script that imports our private CA certificates into the Java trust store (cacerts):

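#!/usr/bin/env bash
# Import every certificate from a PEM bundle into the default Java trust store (cacerts).
# Usage: jks-add-ca.sh /path/to/bundle.crt
set -euo pipefail

# Resolve the bundle path before changing directories, so relative paths keep working.
ca_bundle=$(realpath "$1")
temporary_directory=$(mktemp -d -t i_hate_dlp.XXXXXX)

cleanup() {
    rm -rf "${temporary_directory}"
}

trap cleanup EXIT

cd "${temporary_directory}"

# Split the bundle into one file per certificate (xx00, xx01, ...).
csplit -s -z "${ca_bundle}" '/-----BEGIN CERTIFICATE-----/' '{*}'

# Default password of the JDK cacerts store.
STOREPASS="changeit"

counter=0

for certificate in xx*; do
    alias_name="corporate_ca_${counter}"
    keytool -import -trustcacerts -cacerts \
        -alias "${alias_name}" \
        -file "${certificate}" \
        -storepass "${STOREPASS}" -noprompt
    counter=$((counter + 1))
done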

Python

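Many Python-based applications still use the requests package to talk to HTTP resources. To make requests trust extra CA certificates, you just need to set an environment variable with the path to the CA file, for example export REQUESTS_CA_BUNDLE=/path/to/your/ca/bundle.crt. Note that this file replaces the default certifi bundle rather than extending it, so it must contain every CA your application needs.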

Let’s write a Dockerfile that sets it. We will use the image with the enterprise CA built in the previous step:

ARG CORP_CA_VERSION=1.2.3
ARG PYTHON_VERSION=3.12-alpine
FROM oci.corp.registry/corp-certificates:${CORP_CA_VERSION} AS certificates
FROM python:${PYTHON_VERSION} AS python
COPY --from=certificates /enterprise-ca.crt /usr/local/share/ca-certificates/enterprise-ca.crt
RUN update-ca-certificates
ENV REQUESTS_CA_BUNDLE=/usr/local/share/ca-certificates/enterprise-ca.crt
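
When a container still fails with certificate errors, a quick first check is whether the variable is set and points at a real file. The helper below is a minimal diagnostic sketch, not part of requests; the names effective_ca_bundle and check_ca_bundle are my own. It mirrors the lookup that requests performs when verify is left at its default: REQUESTS_CA_BUNDLE first, then CURL_CA_BUNDLE.

```python
import os
from typing import Mapping, Optional


def effective_ca_bundle(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the CA bundle path requests would take from the environment, if any."""
    return env.get("REQUESTS_CA_BUNDLE") or env.get("CURL_CA_BUNDLE") or None


def check_ca_bundle(env: Mapping[str, str] = os.environ) -> str:
    """Describe the current CA bundle configuration in one line."""
    path = effective_ca_bundle(env)
    if path is None:
        return "no override set; requests falls back to the certifi bundle"
    if not os.path.isfile(path):
        return f"{path} is configured but does not exist"
    return f"using {path}"


if __name__ == "__main__":
    print(check_ca_bundle())
```

Running it inside the built image (for example as the container command) shows right away whether the ENV line took effect.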

But some Python packages use their own methods to interact with HTTP and don’t use the requests package, so REQUESTS_CA_BUNDLE has no effect on them. Examples are fastapi and httpx.
For these cases there is a more general solution that forces Python to look at the system trust store: the truststore package. It is simple to use:

import truststore
truststore.inject_into_ssl()
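
In practice, the injection should run before any other module creates an SSL context, so the very top of the application entry point is a good place for it. The sketch below wraps the call so that the application still starts when truststore is missing; use_system_trust_store is a hypothetical helper name, and the fallback behavior is my assumption about what you would want. truststore itself needs Python 3.10 or newer and is meant for applications, not for libraries.

```python
def use_system_trust_store() -> bool:
    """Switch Python's ssl module to the system trust store if truststore is installed.

    Returns True when the injection happened and False when truststore is unavailable.
    """
    try:
        import truststore  # third-party package, optional in this sketch
    except ImportError:
        return False
    truststore.inject_into_ssl()
    return True


if __name__ == "__main__":
    if use_system_trust_store():
        print("using the system trust store")
    else:
        print("truststore is not installed; keeping Python's default CA handling")
```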

Java

For Java-based applications, we will have almost the same approach. Let’s make a Dockerfile with imported certificates:

ARG CORP_CA_VERSION=1.2.3
ARG JAVA_VERSION=17-alpine3.20
FROM oci.corp.registry/corp-certificates:${CORP_CA_VERSION} AS certificates
FROM amazoncorretto:${JAVA_VERSION} AS java
# Install update-ca-certificates, bash, and csplit (coreutils), which the steps below rely on.
RUN set -eux; \
    apk add --no-cache --allow-untrusted \
        --repository http://dl-cdn.alpinelinux.org/alpine/v3.20/main \
        ca-certificates bash coreutils
# Add the enterprise CA to the system trust store.
COPY --from=certificates /enterprise-ca.crt /usr/local/share/ca-certificates/enterprise-ca.crt
RUN update-ca-certificates
# Import the same CA into the Java trust store.
COPY --from=certificates /jks-add-ca.sh /usr/bin/jks-add-ca.sh
RUN bash /usr/bin/jks-add-ca.sh /usr/local/share/ca-certificates/enterprise-ca.crt

Node.js and extra CA

For all of your Node.js applications, you can easily specify the environment variable NODE_EXTRA_CA_CERTS, like this:

ARG CORP_CA_VERSION=1.2.3
ARG NODE_VERSION=lts-alpine3.20
FROM oci.corp.registry/corp-certificates:${CORP_CA_VERSION} AS certificates
FROM node:${NODE_VERSION} AS node
# Install update-ca-certificates (plus bash and coreutils, as in the Java image).
RUN set -eux; \
    apk add --no-cache --allow-untrusted \
        --repository http://dl-cdn.alpinelinux.org/alpine/v3.20/main \
        ca-certificates bash coreutils
# Add the enterprise CA to the system trust store.
COPY --from=certificates /enterprise-ca.crt /usr/local/share/ca-certificates/enterprise-ca.crt
RUN update-ca-certificates
# Make Node.js trust the enterprise CA in addition to its built-in bundle.
ENV NODE_EXTRA_CA_CERTS=/usr/local/share/ca-certificates/enterprise-ca.crt

Certificate Pinning

If your application uses certificate pinning, you have no choice but to add a DLP exclusion so that traffic to the pinned endpoints bypasses inspection. There are no workarounds here.
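
To see why no CA trick helps, recall what pinning checks. The sketch below is a simplified illustration, not how any particular client implements it: matches_pin is a made-up name, and it pins the SHA-256 hash of the whole certificate, whereas real clients often pin the public key instead. Either way, the certificate that DLP generates differs from the one the application expects, so the comparison fails no matter which CAs are trusted.

```python
import hashlib
import hmac


def matches_pin(der_cert: bytes, pinned_sha256_hex: str) -> bool:
    """Return True if the certificate's SHA-256 fingerprint equals the pinned value."""
    actual = hashlib.sha256(der_cert).hexdigest()
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())


if __name__ == "__main__":
    # Placeholder bytes stand in for real DER-encoded certificates.
    origin_cert = b"certificate served by the real endpoint"
    dlp_cert = b"certificate re-issued by the DLP proxy"
    pin = hashlib.sha256(origin_cert).hexdigest()

    print(matches_pin(origin_cert, pin))  # direct connection: the pin matches
    print(matches_pin(dlp_cert, pin))     # through DLP: the pin does not match
```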

Conclusion

Corporate DLP and private CAs are great, and many enterprises use them with good intentions, but they make the System Development Life Cycle a little more challenging. Fortunately, we now know how to deal with that.