🔥Let’s Do DevOps: Terraform Multi-Cloud Null Resource Data Store
This blog series focuses on presenting complex DevOps projects as simple and approachable via plain language and lots of pictures. You can do it!
Hey all!
Our InfoSec team provides us with a steady stream of bad-actor IP addresses and FQDNs, and we do our best to block that traffic everywhere we can.
Terraform, in theory, should make that much easier to do: we update the rules for each resource with the new IPs, and they’re blocked. Awesome.
But what if you have a dozen teams managing their own code? What if you suddenly have tens of thousands of lines of Terraform to manage, and your developers are all managing their own lists of bad IPs?
Well, then you have drift. Some teams type a number in wrong, or miss a particular resource when the weekly or bi-weekly request from InfoSec to block an IP comes in. And the more we drift, the less secure and reliable our environmental policies are. Not a good place to be.
However, I have a harebrained solution! I was able to misuse the locals block in Terraform to store a list of bad IPs, and any team, across any cloud, can call this as a module, and it’ll hand back a single-source-of-truth list of IPs. Now teams don’t have to manage their own lists in many places, and the chance of drift is much lower because we greatly reduce the number of places those IPs need to be updated.
Let’s talk about what the locals block is in Terraform, and how we’ve managed to misuse it to provide this data in a super-easy way for teams to access.
The Locals Block
The locals block is a storage block that permits you to store data within terraform. It can use any function, so you can happily concat, combine, sort, filter, unique, etc. It can be great in modules to standardize naming or prepare any other oft-used data.
locals {
  # Prefix shared by resource names in this module, built from the module's inputs
  name_prepend = "${var.region}${var.account_name}${var.env}"
}
And then you’re able to easily use that for all resources. This can be a huge help in standardizing naming, or tags, or anything else that is potentially computed or constructed based on info sent to the module, and shared across many resources.
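To make that concrete, here is a minimal sketch of a local being reused across resources. The variable defaults, the resource types, and the name suffixes are placeholders I picked for illustration, not anything from a real module.

```hcl
# Hypothetical inputs; the defaults are made up for this sketch
variable "region" {
  type    = string
  default = "use1"
}

variable "account_name" {
  type    = string
  default = "shared"
}

variable "env" {
  type    = string
  default = "prod"
}

locals {
  # Computed once, reused everywhere below
  name_prepend = "${var.region}${var.account_name}${var.env}"
}

# Example resources only; any resource with a name or tags works the same way
resource "aws_s3_bucket" "logs" {
  bucket = "${local.name_prepend}-logs"

  tags = {
    Name = "${local.name_prepend}-logs"
  }
}

resource "aws_sns_topic" "alerts" {
  name = "${local.name_prepend}-alerts"
}
```

If the naming convention changes, only the one line in locals needs to change, and every resource picks it up.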
It can store data in maps or in lists, too. Any data structure supported in Terraform is supported in locals.
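As a rough illustration, here is a locals block holding a map and a couple of lists, with a few built-in functions applied. All of the names and values are invented for this sketch.

```hcl
locals {
  # A map: handy for tags shared by every resource in a module
  common_tags = {
    team = "platform"
    env  = "prod"
  }

  # Two lists that happen to overlap
  office_cidrs = ["10.0.0.0/8", "192.168.0.0/16"]
  vpn_cidrs    = ["192.168.0.0/16", "172.16.0.0/12"]

  # Locals can reference other locals and call functions:
  # concat joins the lists, distinct drops the duplicate, sort orders the strings
  all_cidrs = sort(distinct(concat(local.office_cidrs, local.vpn_cidrs)))

  # merge combines maps; later keys win on conflict
  tags_with_name = merge(local.common_tags, { name = "example" })
}
```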
I have only ever seen locals utilized in the context of a module, to combine data that’ll be used in the module, or otherwise “local” usages. After all, it’s in the name, right?
Global Module That Does… Nothing?
I was spending quite a bit of time writing Terraform modules that did all sorts of cool things, some of which required locals to help me standardize values and use functions in interesting combinations. Terraform is a super cool language, and I’m having a great time.
Somewhere in there, the problem arose that we needed a gold-standard list of bad-actor IPs, and I got to thinking: Can I make a module that builds no resources? So you’d call it as if it would build a resource, and instead, it returns data that can be used in your resources.
I couldn’t think of a reason this ridiculous idea wouldn't work, so I tried it out, and bingo, I was able to create a global module that builds nothing. However, it is shared across many Terraform workspaces, and it does return several pre-formatted output lists of IPs.
The Nothing Module
First, I created a new blocked_ips.tf file, and created a locals block, like this:
# This list is shared by many teams
# Update it using a PR to update block-list on all apps
locals {
  blocked_ips = [
    "1.2.3.0/24",
    "2.3.4.0/24",
    "1.2.3.4/32",
  ]
}
This is cool, but imagine several clouds and lots of different resources. We need to output this list formatted in many different ways. I initially created a README.md with instructions for each team on what to paste, so they could format it how they want.
But that’s shifting a lot of work onto my app teams, where I am already asking them to use my module, and now they also need to learn TF functions for formatting? That’ll either slow or prevent adoption. This needs to be EASY.
So I created several different outputs from here, each formatting the list in a common way that teams might require. If the exact formatting they need is provided as an output, they only need to specify the output from the module and they’re done. Way, way easier.
Here are my common outputs. First, if they just need the list, with the /32s included, they can use output ips_list.
# Output IPs as a list
# ex: ["1.1.1.1/32", "2.2.2.2/32",]
output "ips_list" {
  value = local.blocked_ips
}
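From the consuming team's side, using that output might look something like the sketch below. The module source URL and the resource names are placeholders, and I'm using an AWS WAFv2 IP set purely as an example of a resource that accepts a list of CIDRs. The IP set doesn't block anything on its own; a web ACL rule would still need to reference it.

```hcl
# Call the "nothing" module; the source address is a made-up placeholder
module "blocked_ips" {
  source = "git::https://example.com/your-org/terraform-blocked-ips.git"
}

# Feed the shared list straight into a resource that wants CIDRs
resource "aws_wafv2_ip_set" "bad_actors" {
  name               = "bad-actors"
  scope              = "REGIONAL"
  ip_address_version = "IPV4"
  addresses          = module.blocked_ips.ips_list
}
```

The team never types an IP. When the shared list changes, their next plan picks up the difference.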
Next, what if they need a list, but the resource/provider doesn’t want the /32 on each individual IP? Azure has a surprising number of these. They can use output ips_no_slash_32_list.
# If you need to format the IPs to remove the /32s, but keep the other slashes, like /24, as a list:
# ex: ["1.2.3.0/24", "2.3.4.0/24", "1.2.3.4", "5.6.7.8", ...]
output "ips_no_slash_32_list" {
  value = [for ip in local.blocked_ips : replace(ip, "/32", "")]
}
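Here is a hedged sketch of an Azure consumer for that output. I'm using a network security group rule only because it is a familiar resource that takes a list of source prefixes; whether a given Azure resource actually rejects /32 varies, so check the provider docs for the one you're configuring. The module source, resource group, and NSG names are all placeholders.

```hcl
# Same placeholder module call as any other consumer
module "blocked_ips" {
  source = "git::https://example.com/your-org/terraform-blocked-ips.git"
}

# Deny inbound traffic from every address on the shared list
resource "azurerm_network_security_rule" "deny_bad_actors" {
  name                        = "deny-bad-actors"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Deny"
  protocol                    = "*"
  source_port_range           = "*"
  destination_port_range      = "*"
  source_address_prefixes     = module.blocked_ips.ips_no_slash_32_list
  destination_address_prefix  = "*"
  resource_group_name         = "example-rg"  # hypothetical
  network_security_group_name = "example-nsg" # hypothetical
}
```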
Next, some resources require a comma-separated string of IPs, rather than a list. If teams need that, they can use ips_separated_by_comma_string.
# If you need all the IPs in a row, separated by a comma, as a string:
# ex: "1.2.3.0/24,2.3.4.0/24,1.2.3.4/32,5.6.7.8/32,..."
output "ips_separated_by_comma_string" {
  value = join(",", local.blocked_ips)
}
And finally, if teams require a string of IPs with /32 removed, they can use ips_no_slash_32_join_with_comma_string.
# And if you need to remove all /32, but keep other slashes, and use list as CSVs, as a string:
# ex: "1.2.3.0/24,2.3.4.0/24,1.2.3.4,5.6.7.8,..."
output "ips_no_slash_32_join_with_comma_string" {
  value = join(",", [for ip in local.blocked_ips : replace(ip, "/32", "")])
}
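For the string flavors, one possible consumer is anything that stores a single comma-separated value. As an assumption-laden sketch, here is the list published to an AWS SSM parameter so applications could read it at runtime. The module source and the parameter name are placeholders, and this is just one idea for where a string output fits.

```hcl
# Placeholder module call
module "blocked_ips" {
  source = "git::https://example.com/your-org/terraform-blocked-ips.git"
}

# StringList parameters take a comma-separated string as their value
resource "aws_ssm_parameter" "blocked_ips" {
  name  = "/security/blocked-ips" # hypothetical parameter path
  type  = "StringList"
  value = module.blocked_ips.ips_no_slash_32_join_with_comma_string
}
```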
Summary
I certainly haven’t matched all the patterns my app teams will require, but the beauty of this module is I don’t need to choose. I can create n outputs, with very little limitation or tax. The only real cost here is that each stack that uses this module will compute each output each time, which adds a negligible compute tax.
However, 1–2 seconds of extra compute is worth it to entirely prevent InfoSec blocking drift, is it not?
I hope you’re also able to use this cool trick to create shared data stores that many Terraform stacks are able to use. If you have any other applications of this awesomeness, please post in the comments and let me know!
Good luck out there.
kyler