🔥Let’s Do DevOps: EKS K8s & Python Fuzzy Staging with AWS Secrets Manager, K8s Init disk, Secrets…
This blog series focuses on presenting complex DevOps projects as simple and approachable via plain language and lots of pictures. You can do it!
Hey all!
I’ve been deep diving into K8’s CSI drivers for AWS Secrets Manager and ESO, the External Secrets Operator, an open source project that’s similar. Both purport to allow pods to call secrets on demand from external sources on launch. And they do! But they do something that you may not want (and I know I don’t want) — they replicate your external, well protected and encrypted secrets to the k8s secrets store, where they are less protected.
That has some excellent caching benefits, but the big cost is that your secret passwords and certificates, the keys to your infrastructure and data, now live in a second place. That second place needs to be secured, audited, and monitored as rigorously as the primary location where your secrets are stored.
That’s a pretty significant security architecture expansion, and maybe not one you want or can permit. So I decided to do it my own way. I wrote a python program that runs in an init container and does a fuzzy match (partial search matching to select multiple secrets) against AWS Secrets Manager. That init container is one I wrote and control, and it stages secrets for an app container in a shared volume. Those secrets can then be injected as arguments for the launching app container by cat-ing the files in bash. This avoids requiring any change in your app containers, pulls secrets directly and dynamically from an external source, and the secret values never leave your pod.
Let’s talk about the python program I wrote first and see how it works.
Python for Secrets
First, let’s preview how we’ll call this container: python3, then our script, then n args. This shows just a single search string, but you could just as easily send several partial search matches. And note that we’re planning to do fuzzy search matching, so a search for “app1” would match a secret named “app1-himom-asdf” as well as one named “app1”. We want this to pick up future secrets automatically, to ease user input.
command:
  - python3
  - ./start.py
args:
  - 'arn:aws:secretsmanager:us-west-2:1234567890:secret:app1'
Okay, let’s do python. First we declare some imports and start our main() function. We parse all arguments: sys.argv[1:] skips the script name at index 0 and takes the first argument plus any number of arguments after it, storing them as a list.
#!/usr/bin/env python3
# Imports
import boto3
from botocore.exceptions import ClientError
import sys # Needed to take input from user for testing
import os # To create directories
# Search AWS for secrets matching secrets lists
def main():
    # Read list of secrets search strings as input values
    SECRETS_SEARCH_INPUT=sys.argv[1:]
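To make the slice concrete, here is a tiny sketch that uses a hard-coded list as a stand-in for sys.argv. The second ARN is a made-up example for illustration, not part of the deployment in this post.

```python
# Stand-in for sys.argv; index 0 is always the script name.
# The second ARN is a hypothetical example for illustration only.
example_argv = [
    "./start.py",
    "arn:aws:secretsmanager:us-west-2:1234567890:secret:app1",
    "arn:aws:secretsmanager:us-east-1:1234567890:secret:app2",
]

# [1:] drops the script name and keeps every argument after it
search_inputs = example_argv[1:]

for search_string in search_inputs:
    print("Would search for:", search_string)
```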
Then we start iterating over that list, and call our first custom function, parse_arn. Let’s take a deeper dive into that function first.
    # Find all secret based on search string
    for secret_search_string in SECRETS_SEARCH_INPUT:
        print("We are searching for:", secret_search_string, "with wildcard matching")
        # Parse ARN into individual components
        parse_arn(secret_search_string)
ARNs have a well-known structure (that AWS mostly follows), and this approach works for our Secrets lookup use case. We split the ARN string on the : character that separates our values, and assign them to a map (a Python dict). Then we return that map.
def parse_arn(arn):
    # http://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html
    elements = arn.split(':', 6)
    result = {
        'arn': elements[0],
        'partition': elements[1],
        'service': elements[2],
        'region': elements[3],
        'account': elements[4],
        'resource': elements[5],
        'resource_name': elements[6]
    }
    if '/' in result['resource']:
        result['resource_type'], result['resource'] = result['resource'].split('/',1)
    elif ':' in result['resource']:
        result['resource_type'], result['resource'] = result['resource'].split(':',1)
    return result
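Here is a small worked example of the same split, run against the ARN from the preview above. The helper name parse_secret_arn and the length check are my additions for illustration. They are a sketch, not the code in the repo.

```python
def parse_secret_arn(arn):
    """Split a Secrets Manager ARN into named fields (hypothetical helper)."""
    # maxsplit=6 keeps any extra colons inside the final field
    elements = arn.split(":", 6)
    if len(elements) < 7:
        # Fail with a readable message instead of an IndexError
        raise ValueError(
            "Expected 7 colon-separated fields, got {}: {}".format(len(elements), arn)
        )
    keys = ["arn", "partition", "service", "region",
            "account", "resource", "resource_name"]
    return dict(zip(keys, elements))


parsed = parse_secret_arn("arn:aws:secretsmanager:us-west-2:1234567890:secret:app1")
print(parsed["region"])         # us-west-2
print(parsed["resource_name"])  # app1
```

The length check is a design choice: a typo in the deployment args then shows up as a clear error in the init container logs.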
That makes the two parse_arn lookups below work: we send it our ARN and grab the “region” attribute and the “resource_name” (secret name) attribute. Then we call our search_for_secrets function.
        # Find region
        SecretRegion=parse_arn(secret_search_string)["region"]
        #print("Secret's region is:", SecretRegion)
        # Find secret name for searching
        SecretSearchString=parse_arn(secret_search_string)["resource_name"]
        #print("Secret's search string is:", SecretSearchString)
        # Need to find all secrets that match inputs lists
        found_secrets = search_for_secrets(SecretSearchString, SecretRegion)
Let’s print that whole function and go over each part. As inputs we take the search string (the name of the secret to fuzzy match against) and the region the secret lives in. Then we kick off a session and try to call a list of secrets based on our search string. The try is important in case the call fails. An empty result isn’t an error (list_secrets just returns an empty SecretList), but something like a permissions problem raises a ClientError, and we don’t want python to bomb out on that. We want it to handle that circumstance gracefully.
We search for secrets using a Filter with a Key of name and a Value of our secret name. Secrets Manager treats the name filter as a prefix match, so this will return multiple secrets if they exist.
In the except block, we catch the error if the search fails and report it back.
However, if the call succeeds (the else block), we create an array called foundSecrets and stash each secret’s name in it. Then we return the secrets’ names as an array. We don’t actually care about the secret’s value yet. We’re just identifying which secrets exist so we can call them in future.
# Find all secrets that match string
def search_for_secrets(SecretSearchString, SecretSearchRegion):
    secret_search_string = SecretSearchString
    secret_search_region = SecretSearchRegion
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=secret_search_region,
    )
    try:
        list_secrets_response = client.list_secrets(
            MaxResults=100, #100 is max supported
            Filters=[
                {
                    'Key': 'name',
                    'Values': [
                        secret_search_string,
                    ]
                },
            ]
        )
    except ClientError as e:
        print("The requested secret search string returned no results: " + secret_search_string)
        print("The error is:", e)
        # Return an empty list so the caller's loop has nothing to iterate over
        return []
    else:
        # Store all secrets in array:
        foundSecrets = []
        for secret_name in list_secrets_response["SecretList"]:
            #print("Found secrets:", secret_name["Name"])
            foundSecrets.append(secret_name["Name"])
        #print("The list of secrets is: ", foundSecrets)
        return(foundSecrets)
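One thing to watch: a single list_secrets call returns at most one page of results, so a search string matching more than 100 secrets would be cut off. Below is a minimal sketch of how that could be handled with boto3’s paginator for list_secrets. The function name list_all_secret_names is hypothetical, and the FakeClient and FakePaginator classes are stand-ins I made up so the sketch runs without AWS. Their startswith check assumes the name filter behaves as a prefix match.

```python
def list_all_secret_names(client, search_string):
    """Collect every matching secret name across all result pages."""
    paginator = client.get_paginator("list_secrets")
    pages = paginator.paginate(
        Filters=[{"Key": "name", "Values": [search_string]}]
    )
    names = []
    for page in pages:
        for secret in page.get("SecretList", []):
            names.append(secret["Name"])
    return names


# --- Stand-ins so the sketch runs without AWS (hypothetical, not boto3) ---
class FakePaginator:
    def __init__(self, names, page_size):
        self.names = names
        self.page_size = page_size

    def paginate(self, Filters):
        # Assumption: the name filter is a prefix match
        prefix = Filters[0]["Values"][0]
        matches = [n for n in self.names if n.startswith(prefix)]
        for start in range(0, len(matches), self.page_size):
            chunk = matches[start:start + self.page_size]
            yield {"SecretList": [{"Name": n} for n in chunk]}


class FakeClient:
    def __init__(self, names, page_size=2):
        self.names = names
        self.page_size = page_size

    def get_paginator(self, operation_name):
        return FakePaginator(self.names, self.page_size)


fake = FakeClient(["app1", "app1-himom-asdf", "app1-super-secret-json", "app2"])
print(list_all_secret_names(fake, "app1"))
```

With a real client you would pass in the object returned by session.client(service_name='secretsmanager', ...) instead of the fake.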
Back to our main() function. We establish a directory to put our secret files with values. The try around mkdir keeps the script going if the call raises an OSError, for example because the directory already exists or the file-system is read-only. Then we loop over every secret we found and go get the secret value. I’ll go over that next. First let’s assume that call goes through.
If the secret is found, we write a file named after the secret and populate it with the secret value. This works with at least string and json entries, I haven’t tested binary secrets yet.
If there’s any error, we surface it.
        # Create secrets directory
        directory="/tmp/init-secrets"
        try:
            print("Creating location for secrets")
            os.mkdir(directory)
        except OSError as error:
            #print(error)
            pass
        # Find secret values
        for secret_name in found_secrets:
            secretValue=get_secret_value(secret_name, SecretRegion)
            try:
                # Create a file if not exist and write to it
                fileName="/tmp/init-secrets/{}".format(secret_name)
                #print("Attempting to stage secret", secret_name, "at", fileName)
                # "with" closes the file so the value is flushed to disk
                with open(fileName, "w") as file:
                    file.write(secretValue)
                print("Staged", secret_name, "at", fileName)
            except Exception as error:
                print(error)
                print("Error staging secrets, please investigate")
        print(os.listdir("/tmp/init-secrets"))
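If you want to tighten up the staging step, here is a hedged sketch of a helper that writes one secret per file with owner-only permissions and accepts either a string or bytes. The name stage_secret, the "/" replacement, and the 0o600 mode are my assumptions for illustration, not what the repo does.

```python
import os
import tempfile


def stage_secret(directory, secret_name, secret_value):
    """Write one secret to its own file, readable only by the owner."""
    # Secret names may contain "/", which would otherwise be read as a subdirectory
    safe_name = secret_name.replace("/", "_")
    path = os.path.join(directory, safe_name)
    # Binary secrets arrive as bytes; encode strings so one write path handles both
    if isinstance(secret_value, bytes):
        data = secret_value
    else:
        data = secret_value.encode("utf-8")
    # 0o600 = owner read/write only; O_TRUNC overwrites a stale file
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


# Demo against a throwaway directory instead of /tmp/init-secrets
demo_dir = tempfile.mkdtemp()
staged = stage_secret(demo_dir, "prod/app1-demo", '{"app_password": "example"}')
print("Staged at", staged)
```

Two caveats. If the app container runs as a different user than the init container, an owner-only mode would block the read, so you would need a wider mode. And the app’s cat path has to use the same translated file name.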
Success! Except we haven’t gone over how we grab the secret value. This will look really familiar to you if you read the “secrets list” function above. It uses the same boto3 library, and instead fetches a secret and returns the secret string or binary (only 1 can exist at a time). If there’s any error, we catch it and print it.
def get_secret_value(SecretName, Region):
    secret_name = SecretName
    region_name = Region
    session = boto3.session.Session()
    client = session.client(
        service_name='secretsmanager',
        region_name=region_name,
    )
    try:
        get_secret_value_response = client.get_secret_value(
            SecretId=secret_name
        )
    except ClientError as e:
        if e.response['Error']['Code'] == 'ResourceNotFoundException':
            print("The requested secret " + secret_name + " was not found")
        elif e.response['Error']['Code'] == 'InvalidRequestException':
            print("The request was invalid due to:", e)
        elif e.response['Error']['Code'] == 'InvalidParameterException':
            print("The request had invalid params:", e)
        elif e.response['Error']['Code'] == 'DecryptionFailure':
            print("The requested secret can't be decrypted using the provided KMS key:", e)
        elif e.response['Error']['Code'] == 'InternalServiceError':
            print("An error occurred on service side:", e)
    else:
        # Secrets Manager decrypts the secret value using the associated KMS CMK
        # Depending on whether the secret was a string or binary, only one of these fields will be populated
        if 'SecretString' in get_secret_value_response:
            SecretValue = get_secret_value_response['SecretString']
            return(SecretValue)
        else:
            SecretValue = get_secret_value_response['SecretBinary']
            return(SecretValue)
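Note that when a fetch fails, the function above prints the error and returns nothing, and the script carries on. A variation worth considering is to exit non-zero instead: Kubernetes only starts the app containers after the init container succeeds, so a failed fetch then blocks the app from launching without its secrets. Here is a minimal sketch of that idea. The function names and the FakeSecretsClient stand-in are hypothetical, and the broad except is only there because the sketch doesn’t import boto3.

```python
import sys


def fetch_secret_payload(client, secret_name):
    """Return SecretString (str) or SecretBinary (bytes); errors propagate."""
    response = client.get_secret_value(SecretId=secret_name)
    if "SecretString" in response:
        return response["SecretString"]
    return response["SecretBinary"]


def fetch_all_or_exit(client, secret_names):
    """Fetch every secret, exiting non-zero on the first failure."""
    payloads = {}
    for name in secret_names:
        try:
            payloads[name] = fetch_secret_payload(client, name)
        except Exception as error:  # with boto3 this would typically be ClientError
            print("Could not fetch", name, "->", error)
            sys.exit(1)
    return payloads


class FakeSecretsClient:
    """Stand-in for a boto3 client so the sketch runs without AWS (hypothetical)."""

    def __init__(self, store):
        self.store = store

    def get_secret_value(self, SecretId):
        if SecretId not in self.store:
            raise KeyError("secret not found: " + SecretId)
        return self.store[SecretId]


fake_client = FakeSecretsClient({
    "app1-super-secret-string": {"SecretString": "example-value"},
    "app1-binary": {"SecretBinary": b"\x00\x01"},
})
payloads = fetch_all_or_exit(fake_client, ["app1-super-secret-string", "app1-binary"])
print(sorted(payloads))
```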
And that’s our python! Let’s box it up in a container. We put it in Ubuntu 20.04, and make sure to install python3-pip and boto3, both of which are required to run our python script. Then we copy over our python3 file (start.py) and set it as executable.
FROM --platform=linux/amd64 ubuntu:20.04
# Install tooling
RUN apt-get update && apt-get install -y \
    less \
    (removed)
    python3-pip \
    && rm -rf /var/lib/apt/lists/*
# Install boto3
RUN pip3 install boto3
# copy over the start.py script
COPY start.py start.py
# make the script executable
RUN chmod +x start.py
# set the entrypoint to the start.py script
ENTRYPOINT ["python3","./start.py"]
And that’s our init container! Let’s talk about our app container next.
App Container — Unedited
This app container is intentionally very very simple. I don’t want to be modifying our app containers, and I bet you don’t either. Rather, we should have k8s do its job and inject secrets into them as they require. So let’s build an app container without much complexity.
To make sure it’s all working, I have a bash script that receives the arguments and prints them. Again, totally not required, this is more for testing and proof of concept.
#!/bin/bash
# Exit cleanly on Ctrl-C or pod termination (traps must be set before the loop)
trap 'exit 130' INT
trap 'exit 143' TERM
# Map secrets based on input (bash variable names can't contain hyphens)
while getopts a:b: flag
do
    case "${flag}" in
        a) app1_super_secret_json=${OPTARG};;
        b) app1_super_secret_string=${OPTARG};;
    esac
done
# Print secrets
while true
do
    echo "$app1_super_secret_json"
    echo "$app1_super_secret_string"
    # Sleep a bit to avoid loop overload
    sleep 10
done
Then we box it up in a container and set it as executable. Done.
FROM --platform=linux/amd64 ubuntu:20.04
# Install tooling
RUN apt-get update && apt-get install -y \
    less \
    (removed)
    && rm -rf /var/lib/apt/lists/*
# copy over the start.sh script
COPY start.sh start.sh
# make the script executable
RUN chmod +x start.sh
# set the entrypoint to the start.sh script
ENTRYPOINT ["./start.sh"]
Note that the ENTRYPOINT doesn’t list out the arguments we’ll need this container to receive. That’s because we’re going to over-write the entrypoint and arguments with k8s. Which is super cool, right? Let’s talk k8s next.
K8s: Init Containers and Shared Disks
Now our containers are ready, let’s get them organized! In K8s, we have to do a few things first. First, we need a namespace to put it all in:
apiVersion: v1
kind: Namespace
metadata:
  name: init
Then we need a service account. This is a K8s concept for authentication, and AWS has taken it a bit further. Note the annotation at the end. This role-arn annotation is something EKS reads to associate your pod, using this service account, with an IAM role (with the help of an OIDC auth provider).
All configuration, including for the IAM role, and the OIDC provider, is in terraform in the repo at the end of this write-up if you want to read deeper into that. Or check out my previous write-up here (link).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: init-secret-fetcher
  namespace: init
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::1234567890:role/app1_eks_role
Now let’s do the deployment, which is where the real magic happens. Here’s our standard scaffolding — nothing exciting here, just getting our pod definition going.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secrets-testing
  namespace: init
  labels:
    app: secrets-testing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: secrets-testing
  template:
    metadata:
      name: secrets-testing
      labels:
        app: secrets-testing
Next up we start doing cool things. First, we establish an empty volume. This volume lives only within the context of your pod.
I mean, technically, its contents sit on the node’s storage for as long as the pod runs, but let’s ignore that for now.
Then we assign the serviceAccount to the pod with serviceAccountName. I’d love to assign the SA to the init container specifically, but I haven’t figured out how to do that yet (if you have please let me know!).
Then we establish the init container. Note that initContainers: isn’t a name that I made up here, it’s a k8s concept that fully launches, runs, and destroys a container before the app container launches. It is a great tool to delay app containers from launching until things are ready. We call our init container (with the awesome python, remember?), and tell it to go find a secret. Remember, this arg could actually be a dozen or more secrets in multiple regions, I just have 1 secret arg here for brevity. Still, we pull multiple secrets even with this single search string.
Note also that we’re mounting the empty dir in volumeMounts. We’ll mount this on both the init and app container(s) so we can write (init container) and read (app container) the secrets.
    spec:
      volumes:
        - name: init-secrets
          emptyDir: {}
      serviceAccountName: init-secret-fetcher
      initContainers:
        - name: init-secret-fetcher
          image: kymidd/secrets-testing-fetcher01:v0.0.3
          volumeMounts:
            - name: init-secrets
              mountPath: /tmp/init-secrets
          command:
            - python3
            - ./start.py
          args:
            - 'arn:aws:secretsmanager:us-west-2:1234567890:secret:app1'
Next we call our app container. Remember, our goal is to modify our app container as little as possible (and in this case, not at all!). We mount our secrets vol, which will be populated with all our secrets by the time the app container launches.
Then we over-write the entrypoint and args of the app container. If you use this method, make sure to reuse the container’s existing entrypoint and args. Rather than passing secret values as strings or global vars, we use an interpolated bash command in the args.
The syntax here is really finicky because K8s does its own expansions using the $(stuff) syntax, so if yours is odd, copy mine exactly and work from there. Note that we’re cat-ing a secret file on the populated emptyDir disk. cat means to read the contents of a file, which we then pass to our app container’s process.
      containers:
        - name: secrets-testing-app01
          image: kymidd/secrets-testing-app01:v0.0.3
          volumeMounts:
            - name: init-secrets
              mountPath: "/tmp/init-secrets"
          command: ["/bin/bash", "-c"]
          args:
            - "./start.sh -a $(cat /tmp/init-secrets/app1-super-secret-json) -b $(cat /tmp/init-secrets/app1-super-secret-string)"
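To see why the $(cat ...) text survives K8s and reaches bash intact, here is a simplified python model of the documented expansion rules for command and args: $(NAME) is replaced only when NAME is an environment variable defined for the container, an unresolved reference is left unchanged, and $$ becomes a single $. This is my own sketch, not Kubernetes code, and APP_HOME is a hypothetical variable.

```python
import re

# Matches either the "$$" escape or a "$(...)" reference
_REFERENCE = re.compile(r"\$\$|\$\(([^)]*)\)")


def expand_k8s_references(text, env):
    """Simplified model of Kubernetes $(VAR) expansion in command/args."""
    def substitute(match):
        if match.group(0) == "$$":
            return "$"          # "$$" is the escape for a literal "$"
        name = match.group(1)
        if name in env:
            return env[name]    # defined container env var: substitute its value
        return match.group(0)   # unknown reference: left untouched
    return _REFERENCE.sub(substitute, text)


arg = "./start.sh -a $(cat /tmp/init-secrets/app1-super-secret-json)"
# No env var is named "cat /tmp/...", so the text passes through for bash to run
print(expand_k8s_references(arg, {"APP_HOME": "/app"}))
# A defined variable is substituted before the container starts
print(expand_k8s_references("$(APP_HOME)/start.sh", {"APP_HOME": "/app"}))
```

Once bash receives the string, it runs the cat substitutions itself. If a secret value contains spaces (a JSON secret might), wrapping each substitution in escaped double quotes would keep it as one argument.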
In this way we avoid writing any secrets to K8s as “secret” objects that can be easily grabbed, like this.
✗ kubectl get secrets secret-string -o json | jq -r .data.\"string-secret\" | base64 -D
cindy77%
✗ kubectl get secrets secret-json -o json | jq -r .data.\"json-secret\" | base64 -D | jq
{
  "app_password": "cindy1",
  "other_password": "cindy3",
  "smtp_password": "cindy2"
}
That scares me.
Summary
We did SO MUCH COOL STUFF. We created a python program that uses IRSA (pod IAM roles permissions) to search for secrets, then wrapped that logic in a container we use as an init container in k8s to stage secrets on an empty vol. Then we mount that volume on our app container and cat any secrets we need as arguments.
Which is a really long drive to avoid k8s secrets.
A super important note. There are definitely better ways to do this. If you can modify your app containers, just use the AWS CLI to grab secrets. Or use paid software to provide only authorized secrets at run-time.
However, this lab taught me a TON about how k8s works, how insecurely secrets are stored, and gave me an excuse to write python and logic some init containers together.
Here’s the repo where I’m storing all code. Please go run it yourself!
GitHub - KyMidd/K8s-PythonInitContainer-SecretsManager
Deployable docker, k8s, and terraform to deploy a pod to fetch secrets dynamically from AWS Secrets Manager on pod…github.com
Good luck out there!
kyler