Practical Steps to Cybersecurity
In the next couple of posts in our monthly blog series, A Practical Cloud Journey, we will talk about specific types of solutions, like application modernization and machine learning. But before we get there, we need to look at one last fundamental area: security.
The security of digital systems—also known as cybersecurity—is a broad and deep discipline unto itself. However, if you are on a journey to shift your work to the cloud, you can’t just say, “It’s complicated,” stick your head in the sand and hope for the best. Fortunately, there are some practical steps you can take to be in a better position to keep your systems secure.
But first, let’s understand the overall digital environment and reasons why cybersecurity is important to you and your work.
‘The Application’ as Context
Most cloud solutions clients come to Woolpert with a specific use case, workload or application in mind. It’s less common for us to deal with an entire lift-and-shift-type operation because our focus is more on helping clients design and build data pipelines, services and applications related to location and analytics.
As such, for us and our clients, “the application” means something relatively well bounded and specific. Some examples of this include hosting Esri ArcGIS Enterprise in the cloud, deploying a combination of software-as-a-service (SaaS) and homegrown products across multiple clouds, or shifting from on-premises, Windows-based ETL workloads to elastic, containerized workloads.
Note: This is a different context than if you are, say, a CIO or CTO who is accountable for delivering on a cloud migration strategy for a series of generic workloads. Think of line-of-business applications like email, CRM or ERP. If you are accountable for the IT strategy for an organization of any size, you will probably have a cybersecurity analyst on staff, and potentially even a chief security officer.
What we need to ask ourselves is, what are the consequences of our focus? Well, for one, we’re not in the business of implementing holistic security strategies that define the security posture of an entire organization. If that is your business and you are consuming any type of cloud, then you should spend time reviewing and deciding which best practices from public cloud providers you can leverage.
However, we are in the position to advise clients of sensible steps toward “operational excellence,” which incorporates cybersecurity policies, practices and controls. So, let’s start with software.
One of the defining characteristics of infrastructure in public clouds and—counterintuitively, perhaps—even in private clouds is that you don’t get to control much about the hardware. Physical security is handled by your cloud vendor, for example.
At Woolpert, we tend to start by defining security controls in what we can control, which is software. What’s great in the cloud-native world is that the hardware is wrapped in a layer of software-defined controls. That means you can quickly and repeatably define and implement policies for the physical compute, the actual network cables and connectors, and the storage disks.
The second thing we do is look at the software definition of the application, which in almost every case means the software supply chain. That is, the code, the systems, the automated and human processes, and the deployment mechanisms by which something like an application or a data pipeline goes from source code to running in production.
Practical Software-Defined Security Controls
How do you surround your application with securely defined infrastructure and platform services? Here are some quick wins that have come out of our product and consulting work.
Outsourced Identity and Access Management (IAM): “Don’t roll your own encryption” is a saying in the world of cybersecurity because if you do it yourself, you’re going to have a bad time! Up until quite recently—let’s say the last five years—it was still quite common to see applications with their own user IAM sub-system. Think login, permissions and so on. These days, we strongly prefer using an IAM platform developed by experts. We see Azure Active Directory, Google Identity, Okta and others, which bring incredibly rich capabilities.
We also still see anti-patterns that can be avoided using sensible policies. For example, a really common issue is people using a shared consumer email account to manage access (firstname.lastname@example.org as a made-up example). Check out our article on why to avoid this. It’s a real problem for customers every day, and it is easy to fix. Instead, use roles and groups to define what is possible, following the principle of least privilege. Then assign people to those groups using organizational identities provided by a well-known, third-party IAM platform.
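To make the roles-and-groups idea concrete, here is a minimal sketch in Python. It is not the API of any particular IAM platform; the group names, role names and email addresses are all made up for illustration. The point is only that permissions hang off groups, and people get into groups through their own organizational identity.

```python
# Hypothetical groups and roles, for illustration only.
# Each group carries the smallest set of roles it needs (least privilege).
GROUP_ROLES = {
    "gis-viewers": {"viewer"},
    "gis-editors": {"viewer", "editor"},
    "platform-admins": {"viewer", "editor", "admin"},
}

# People join groups with their own organizational identity,
# never through a shared mailbox. These addresses are made up.
GROUP_MEMBERS = {
    "gis-viewers": {"ana@corp.example"},
    "gis-editors": {"ben@corp.example"},
    "platform-admins": {"cho@corp.example"},
}


def roles_for(identity):
    """Collect every role granted to an identity through its groups."""
    roles = set()
    for group, members in GROUP_MEMBERS.items():
        if identity in members:
            roles |= GROUP_ROLES.get(group, set())
    return roles


def is_allowed(identity, required_role):
    """Allow an action only if one of the identity's groups grants the role."""
    return required_role in roles_for(identity)
```

With this shape, removing someone's access is a one-line change to a group, and there is no shared password to rotate when a person leaves.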
Blueprints by default, exceptions by review: One thing we realized is that new projects in Google Cloud Platform (GCP) are configured by default to work easily, rather than to be secure. For example, the firewall rules on default networks have fairly permissive ingress rules like “allow connecting to machines via SSH (Linux) or RDP (Windows).” This is not a good door to leave open.
Our lead SRE, Nate Wilhelmi, established a new policy in GCP (enforced by default) that prevents default networks from being created. Next, we came up with a plan to remediate all existing projects. Finally, one of our technical solutions consultants, Marc Miles, wrote up instructions for creating a minimally functional and secure network for a typical use case: a VM needing internet access. That’s a great project blueprint!
Exceptions to the rule become obvious in the reports and warnings provided by your cloud provider’s default security tooling. You can also define the blueprint yourself using infrastructure-as-code (IaC) templates. If IaC is new to you, I wrote an introductory blog post that could help.
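As a rough illustration of how an exception to the blueprint can be spotted automatically, the Python sketch below flags ingress rules that expose SSH or RDP to the whole internet. It is an assumption-laden toy, not a cloud provider API: the IngressRule type and the sample data in the usage note are hypothetical, and a real audit would read the rules from your provider's tooling.

```python
from dataclasses import dataclass

# Ports the article calls out as risky to leave open.
RISKY_PORTS = {22: "SSH", 3389: "RDP"}
# CIDR range meaning "any IPv4 address".
OPEN_RANGE = "0.0.0.0/0"


@dataclass(frozen=True)
class IngressRule:
    """Hypothetical, simplified view of a firewall ingress rule."""
    name: str
    source_ranges: tuple
    ports: tuple  # an empty tuple means "all ports"


def risky_findings(rules):
    """Return a message for each risky port that is open to any address."""
    findings = []
    for rule in rules:
        if OPEN_RANGE not in rule.source_ranges:
            continue
        for port, label in RISKY_PORTS.items():
            if not rule.ports or port in rule.ports:
                findings.append(
                    f"{rule.name}: {label} (port {port}) open to the internet"
                )
    return findings
```

For example, a rule named default-allow-ssh with source range 0.0.0.0/0 and port 22 would be reported, while a rule limited to an office range such as 203.0.113.0/24 would pass.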
Bring access control out of the application entirely: Authorization is commonly something that an application needs to know about and possibly configure and control. But accessing the application in the first place? That is a different story.
A pattern that we’ve had success with comes from Google’s BeyondCorp or “zero trust” model. It involves setting up a proxy around your application that has a very specific goal: ensure that each user coming from each device is allowed to access the app. This “identity-aware proxy” (IAP) has saved us a ton of work building our own app-specific access control, and it is much easier to get right.
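The decision such a proxy makes can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions, not how Google's product is implemented or configured: the Request type, the allow lists and the status messages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request as seen by the proxy, before it reaches the app."""
    user: str       # identity verified by the IAM platform, "" if not signed in
    device_id: str  # identifier of the device making the request
    path: str


# Made-up allow lists; a real proxy would get these from policy, not code.
ALLOWED_USERS = {"ana@corp.example", "ben@corp.example"}
TRUSTED_DEVICES = {"laptop-001", "laptop-002"}


def authorize(request):
    """Check user and device on every request; trust nothing by default."""
    if not request.user:
        return 401, "sign-in required"
    if request.user not in ALLOWED_USERS:
        return 403, "user not permitted"
    if request.device_id not in TRUSTED_DEVICES:
        return 403, "device not trusted"
    return 200, "forward to application"
```

The application behind the proxy never sees a request that failed these checks, which is why it no longer needs its own front-door logic.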
Monitor and alert, with SLOs to reduce issues: Finally, figure out which monitoring and alerting tools you have at your disposal. For example, since most of our product and consulting work is on GCP, we use Security Command Center. This is the tool we look at to see whether there are new vulnerabilities affecting existing apps, or even new problems that we’ve created by making changes (whoops, no HTTPS by default on a new “toy” project! Better fix that).
Having service level objectives (SLOs) helps, too. For example, we have an SLO that we will have zero critical vulnerabilities, and that we have month-over-month reduction or mitigation in “high” vulnerabilities (like public IP addresses on VMs).
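Those two SLOs are simple enough to check in code. The sketch below assumes you can export monthly finding counts by severity into a plain dictionary; the function name and the dictionary keys are hypothetical, and it treats a month that stays at zero “high” findings as meeting the objective.

```python
def slo_report(previous, current):
    """Compare two months of finding counts, keyed by severity name."""
    high_now = current.get("high", 0)
    high_before = previous.get("high", 0)
    return {
        # SLO 1: no critical vulnerabilities at all.
        "zero_critical": current.get("critical", 0) == 0,
        # SLO 2: fewer "high" findings than last month (or none left).
        "high_reduced": high_now < high_before or high_now == 0,
    }
```

Running this once a month against the previous month's numbers gives a yes/no answer for each objective that is easy to put on a dashboard.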
Practical Software Supply Chain
Back to the application itself. A “software supply chain” might be a new concept to you, but it’s a set of relatively well-understood practices at this point. Once again, it’s a broad topic, so here are the aspects of software supply chain that resonate with me, along with an example for each.
Catch issues early: Have processes in place to catch typical security gaps as early as possible. For example, it is easy to accidentally add a file with a password in it to your version control system. But with peer review and project templates for engineers to start from, it’s also easy to mitigate. That’s what we do on the Woolpert Cloud Solutions team, and why we have a policy of “two sets of eyes.” It means that one way or another, at least two people should see everything that goes into production.
Automated issue catching is even better. For example, why hope that you’re not shipping obvious security holes in your web-facing application when you can automatically test for the presence of OWASP Top Ten vulnerabilities? That is something you can easily bake into a build process.
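As a small example of automated issue catching, here is a deliberately naive Python sketch that looks for password-like assignments in text before it is committed. The pattern and the function name are my own assumptions, and it will produce both false positives and misses; purpose-built secret scanners do this far better.

```python
import re

# Naive pattern: a secret-sounding key followed by ":" or "=" and a value.
SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|api[_-]?key|token)\s*[:=]\s*\S+",
    re.IGNORECASE,
)


def scan_text(filename, text):
    """Return (filename, line number) for each line that looks like a secret."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            hits.append((filename, line_number))
    return hits
```

A check like this can run as a pre-commit hook or a build step, so the “second set of eyes” is a machine that never gets tired.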
Understand who you are trusting: Know where your dependencies come from and whether to trust them. And given the enormous reliance most modern software has on an open-source ecosystem beyond your direct control, it’s not always obvious whether your supply chain of dependencies is trustworthy. Heck, in rare cases, it is the very thing you’ve learned to implement to improve cybersecurity that causes the vulnerability! (See the Heartbleed Bug.)
Luckily, there are tools out there to help. For example, the Node.js ecosystem has a way to audit each software package, and a newer option called Open Source Insights can span platform ecosystems to provide a view of the project itself, and not just the software produced by a project.
Reproducible software … including dependencies: It also helps to have a way to reproduce/rebuild the exact software in production. For example, back in 2016 a single open-source developer “broke the Internet” when he deleted some shared code, causing untold numbers of automated software build systems to break. It’s not a panacea, but one mitigation is to “vendor” third-party code, which really just means to make a copy and keep it in your own code base.
At Woolpert, we use continuous integration to ensure that we can build our software quickly and reliably. And if we cannot do that, a system tells us within minutes that we have a problem to address, like a missing or outdated dependency.
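One way to make vendored code trustworthy over time is to record a checksum for each copied file and verify it during the build. The Python sketch below shows the idea under simple assumptions: files are passed in as bytes, the “lockfile” is just a dictionary of expected SHA-256 hashes, and all names are hypothetical.

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_vendored(files, lockfile):
    """Return names of vendored files that are missing or have changed.

    files: mapping of file name to its current content (bytes)
    lockfile: mapping of file name to the SHA-256 hash recorded at vendoring
    """
    problems = []
    for name, expected_hash in lockfile.items():
        data = files.get(name)
        if data is None or sha256_hex(data) != expected_hash:
            problems.append(name)
    return sorted(problems)
```

If the list comes back non-empty, the build can fail fast, which fits the “tell us within minutes” behavior described above.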
Reproducible deployments … including the data: You should be able to deploy a different version with very little effort, even if building the ability to do that takes time at the outset. Why? If you get ransomwared, you need to know that you can re-deploy the ENTIRE application without much drama, and with a clean and safe copy of your data.
At Woolpert, we host an enterprise GIS for customers on GCP. We have runbooks to support rapidly rebuilding and configuring that entire infrastructure and application suite. But runbooks alone are not enough: we also need an achievable recovery point objective (RPO) and recovery time objective (RTO), because a GIS without the data is not much use.
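RPO and RTO are easier to reason about once they are written down as numbers you can test. The Python sketch below is one simple way to express them; the function names and the example figures in the usage note are assumptions for illustration, not our actual objectives.

```python
from datetime import datetime, timedelta


def meets_rpo(last_backup, now, rpo):
    """RPO: the newest good backup must be no older than the allowed data loss."""
    return now - last_backup <= rpo


def meets_rto(restore_duration, rto):
    """RTO: a rehearsed rebuild and restore must finish within the allowed downtime."""
    return restore_duration <= rto
```

For instance, with an RPO of 24 hours, a backup taken 18 hours ago passes, while the same backup fails a 12-hour RPO. Feeding the duration of your last restore rehearsal into meets_rto keeps the RTO honest.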
I’m Not a Lawyer
Cybersecurity is one of those topics that you think twice about before offering advice. It’s like those conversations that start with: “I’m not a lawyer, but …” Yet any technologist with a need to deliver viable solutions in a hybrid cloud world MUST know the basics. And at a minimum, you need to consider the broader cybersecurity context of what you’re doing—even if for you that simply means seeking advice from your organization’s cybersecurity expert.
These are pointers and ideas from our experience, not a complete picture or a cure-all. But on a cloud journey, having a security posture in mind and in practice is not just nice to have; it’s more like the sunblock and hat you wear on a hike to prevent short-term burns and long-term skin damage.
A Practical Cloud Journey 2021 Blog Series
January: A Practical Cloud Journey
February: Designing a Solution
March: Transforming the Team
April: Build or Buy
August: Practical Steps to Cybersecurity
September: Solutions: Machine Learning
October: Solutions: Supply Chain
November: App Modernization
December: Your Journey to the Cloud