Cloud Security: A Journey

Did you know that most cloud security failures result from misconfigurations? As companies expand their cloud operations, automation and governance become indispensable to ensure security and scalability.

For years, AWS has been saying: "Cloud is the new normal." This is indeed well-established, especially among companies that started their operations in the last decade.

After all, using cloud services is the best way to access new technologies while aligning costs with demand and avoiding long-term cost commitments when applications or services are still in their early or exploratory stages.

Yet, we still see many silos and a lack of visibility. How can you accelerate your Formula 1 car to maximize results (speed in delivering new services and continuous innovation) while reducing risks (driver safety, financial feasibility)?

Current Challenges

To manage and optimize, we need to identify and monitor the processes involved. As Peter Drucker said: "If you can't measure it, you can't manage it." In the world of security, the paradigm is the same: we need to identify what we need to protect.

The NIST Cybersecurity Framework (CSF) 2.0 organizes cybersecurity into six main functions: Govern, Identify, Protect, Detect, Respond, and Recover. The 'Govern' function is essential in multi-cloud environments, ensuring the strategic alignment of security policies.

NIST CSF v2

It’s Not That Simple

Initiative Tracking: It is challenging for teams not directly involved in the build to access or even be aware of everything that has been provisioned and where it has been provisioned in the cloud.

Access: Even when aware of active development projects, fragmented responsibilities often prevent security teams from independently accessing environments or enabling the necessary security layers.

Technical Knowledge: Each provider has its own mechanisms to apply best practices. Companies may have infrastructure teams with specialists for different CSPs, but it is much harder for cybersecurity professionals to protect multiple services across multiple clouds. In many cases, they rely on multi-cloud tools that support all providers in use simultaneously.

A Complex Scenario

This becomes even more challenging as more teams and projects come into play, further compounded by a multi-cloud environment.

Problems? Yes, security and costs—but not necessarily in that order.

Security and Costs

To scale quickly and implement best practices while different teams consume cloud services, these practices need to be in place from the outset.

The default configurations of Cloud Service Providers (CSPs) are secure, but they are not always the most robust available. Stricter settings are often left off by default because they could make usage impractical or lead to additional costs.

Solutions: Automation as a Foundation

Infrastructure as Code (IaC):

IaC is a practice that allows IT infrastructure to be created and managed through code, using tools such as Terraform. Imagine configuring servers, networks, and security with scripts, reducing the risk of human error.

For this reason, automation becomes a primary concern. In the cloud world, we use IaC tools (e.g., Terraform, AWS CloudFormation). Infrastructure code must include essential standards, such as encryption, backups, proper network configurations, and IAM settings that adhere to least privilege.

Also addressing cost management:

  • Tags to ensure accurate cost allocation (FinOps).

  • Authorized services with predictable sizing and costs.

  • Observability: Metrics to optimize sizing and autoscaling rules to prevent infrastructure waste.

  • Right Sourcing: Prioritizing services that deliver real benefits in the cloud while avoiding unnecessary expenses.
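The security and cost standards listed above can be expressed as an automated check that runs before anything is provisioned. The following is a minimal Python sketch, not a real provider API: `ResourceSpec`, `check_baseline`, and the tag keys in `REQUIRED_TAGS` are hypothetical names chosen for illustration, and the rules are deliberately simplified.

```python
from dataclasses import dataclass, field

# Hypothetical tag keys required for cost allocation; adjust to your own policy.
REQUIRED_TAGS = {"owner", "cost-center", "environment"}


@dataclass
class ResourceSpec:
    """Simplified stand-in for a resource declared in an IaC module."""
    name: str
    encrypted: bool = False
    backup_enabled: bool = False
    public_access: bool = False
    tags: dict = field(default_factory=dict)


def check_baseline(resource: ResourceSpec) -> list:
    """Return human-readable baseline violations (empty list if compliant)."""
    violations = []
    if not resource.encrypted:
        violations.append("encryption at rest is disabled")
    if not resource.backup_enabled:
        violations.append("backups are disabled")
    if resource.public_access:
        violations.append("resource is publicly accessible")
    # Tags drive cost allocation, so a missing tag is treated as a violation.
    missing = sorted(REQUIRED_TAGS - set(resource.tags))
    if missing:
        violations.append("missing tags: " + ", ".join(missing))
    return violations


if __name__ == "__main__":
    spec = ResourceSpec(name="orders-db", tags={"owner": "team-a"})
    for problem in check_baseline(spec):
        print(f"{spec.name}: {problem}")
```

A check like this could run in the CI/CD pipeline so that a non-compliant definition is rejected before deployment rather than detected afterwards.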

This diagram illustrates how new technologies, governance standards (GRC), and frameworks like the cloud providers' Well-Architected Frameworks converge to create a solid foundation for automation and governance. For example, by adopting a convergent approach like DevSecFinOps, teams integrate security and cost considerations directly into deployment processes (CI/CD):

Automation is the key

Beyond Automation

Automation alone doesn’t solve everything: it can automate solutions but also amplify problems. This is why it’s crucial to establish a process for enabling cloud services by defining how they should be used, providing use case examples, decision matrices, and minimum configurations that account for all security aspects.

This way, those provisioning infrastructure only need to consume the IaC already specified by the architecture, modularized for use by multiple teams.

Subsequently, deviations between what was provisioned and the code used can be detected and corrected using IaC. Platforms that detect misconfigurations add an extra layer of assurance by identifying manual configurations or violations of new standards (published by security vendors). These can also be remediated through Cloud Security Posture Management (CSPM), a component of the Cloud-Native Application Protection Platform (CNAPP) offered by leading security providers.
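The idea of detecting deviations between code and provisioned resources can be illustrated with a small comparison routine. This is a sketch under simplifying assumptions: both the declared and the observed configuration are flat dictionaries, and `detect_drift` and the `MISSING` marker are hypothetical names, not part of any IaC tool or CSPM product.

```python
# Marker used when a setting exists on only one side of the comparison.
MISSING = "<absent>"


def detect_drift(declared: dict, actual: dict) -> dict:
    """Compare declared (IaC) settings with settings observed in the cloud.

    Returns a mapping of setting name -> (declared value, actual value)
    for every key whose values differ or that exists on only one side.
    """
    drift = {}
    for key in sorted(declared.keys() | actual.keys()):
        want = declared.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    declared = {"encrypted": True, "port": 443}
    actual = {"encrypted": False, "port": 443, "public": True}
    for setting, (want, have) in detect_drift(declared, actual).items():
        print(f"{setting}: declared={want!r} actual={have!r}")
```

In this example the manually disabled encryption and the manually added `public` setting are both reported, which is the kind of finding that would then be corrected by re-applying the code.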

Scaling Cloud Services

Reference Architecture

Here is a sequence of steps I believe are essential for ensuring consistency and adherence to best practices during rapid cloud service expansion:

  • Define the services to be used with input from architecture, security, and engineering teams.

  • Establish a roadmap for services of interest to be analyzed in the future, along with prioritization criteria.

  • Document key decision points (e.g., use cases and infrastructure design: networking, backup, monitoring, sizing, pricing).

  • Ensure the security tools in use can manage these services.

  • Cover the applicable compliance frameworks.

  • Define standardized tags and naming conventions.

  • Map transitions (to/from other CSPs or on-premises environments).
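Of the steps above, standardized naming conventions are among the easiest to enforce automatically. The sketch below assumes a hypothetical convention of the form `<environment>-<team>-<service>` in lowercase; the pattern and the function name `parse_resource_name` are illustrative only and would need to match whatever convention your teams actually agree on.

```python
import re

# Hypothetical convention: <environment>-<team>-<service>, lowercase only.
NAME_PATTERN = re.compile(
    r"(?P<environment>dev|test|prod|sandbox)-"
    r"(?P<team>[a-z0-9]+)-"
    r"(?P<service>[a-z0-9]+)"
)


def parse_resource_name(name: str):
    """Return the name's components as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for candidate in ("prod-payments-api", "MyBucket"):
        print(candidate, "->", parse_resource_name(candidate))
```

Because the parsed components include the environment and the owning team, the same convention also gives security tools context for later prioritization.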

Governance

Set up guardrails within CSPs to prevent undesirable configurations from being implemented. Evaluate the use of Cloud Detection and Response mechanisms to ensure that remediation occurs quickly—whether through automation or security tools that support such measures.

Managing the Backlog

Define a strategy for tracking deviations:

Volume-oriented: Map all deviations and then prioritize the backlog.

Framework-oriented: Prioritize policies linked to specific security frameworks (e.g., NIST CSF v2) or high-severity issues.
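The difference between the two strategies can be made concrete with a short sketch. Everything here is an assumption for illustration: the `Deviation` record, the severity labels in `SEVERITY_RANK`, and the two function names do not come from any particular security tool.

```python
from dataclasses import dataclass

# Lower rank means higher priority; the labels are illustrative.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Deviation:
    policy: str
    severity: str
    frameworks: frozenset = frozenset()


def volume_oriented(deviations):
    """Keep every deviation and order the whole backlog by severity."""
    return sorted(deviations, key=lambda d: SEVERITY_RANK[d.severity])


def framework_oriented(deviations, framework):
    """Keep only deviations linked to the framework or rated high/critical."""
    selected = [
        d for d in deviations
        if framework in d.frameworks or SEVERITY_RANK[d.severity] <= 1
    ]
    return sorted(selected, key=lambda d: SEVERITY_RANK[d.severity])


if __name__ == "__main__":
    backlog = [
        Deviation("open-port", "low"),
        Deviation("no-mfa", "critical"),
        Deviation("no-logging", "medium", frozenset({"NIST CSF"})),
    ]
    print([d.policy for d in volume_oriented(backlog)])
    print([d.policy for d in framework_oriented(backlog, "NIST CSF")])
```

The volume-oriented approach yields a complete but longer backlog, while the framework-oriented one trades completeness for a shorter list that teams can act on first.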

Prioritizing Alerts

Understanding the context of environments (development, testing, production, sandbox, etc.) and implementing a tagging strategy is crucial. This provides better insights into the nature of issues, especially when responsibilities are divided across teams.

This matters because, as in business generally, strategies must be communicated clearly and continuously to ensure everyone understands them. This applies to security strategies as well. For example:

Should we improve the overall security indicator?

Should there be a differentiated plan for high-impact environments?

Which environments belong to the company’s value chain?

Who hasn’t been called into a "code red" situation for a major issue that turned out to be in a deactivated or isolated environment? For this reason, it is crucial to understand how the tools calculate their risk scores and to ensure that teams are working on what will generate the most positive impact.

Having a good technical understanding of the environment also aids in the suppression process. Policies for detecting misconfigurations are linear: each rule is evaluated on its own, regardless of where the resource sits. Context-oriented security tools, by contrast, generate alerts with severities much better aligned to prioritization workflows.
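To show how context can change an alert's priority, here is a minimal scoring sketch. The base scores, environment weights, and exposure multiplier are made-up values for illustration and do not reproduce the scoring model of any real tool; `contextual_score` is a hypothetical function name.

```python
# Illustrative values only; real tools use their own scoring models.
BASE_SCORE = {"low": 1, "medium": 4, "high": 7, "critical": 10}
ENV_WEIGHT = {"production": 1.0, "testing": 0.6, "development": 0.4, "sandbox": 0.2}


def contextual_score(severity, environment, internet_exposed, active=True):
    """Adjust a finding's base severity using environment context.

    Deactivated environments score 0. Unknown environments are treated
    like production, which is the conservative choice.
    """
    if not active:
        return 0.0
    score = BASE_SCORE[severity] * ENV_WEIGHT.get(environment, 1.0)
    if internet_exposed:
        score *= 1.5
    return min(score, 10.0)


if __name__ == "__main__":
    print(contextual_score("critical", "sandbox", internet_exposed=False))
    print(contextual_score("high", "production", internet_exposed=False))
```

With these assumed weights, a critical finding in an isolated sandbox ranks below a high finding in production, which mirrors the "code red" scenario described earlier.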

Avoiding Indirect Effects

Overly strict validation and risk acceptance processes can inadvertently encourage unsafe practices. For instance, excessively complex passwords may lead users to write them down in insecure locations. It’s vital to ensure everyone understands the company’s security strategy to foster better execution and adherence. After all, we don’t engage with what we don’t understand.

Enabling services at scale

In Conclusion

Cloud security is a challenging but essential journey. By integrating automation (IaC), governance, and tools like CSPM/CNAPP—especially those focused on context and correlation of security issues—companies can scale services with security, efficiency, and cost control.

Glossary

CNAPP: Cloud-Native Application Protection Platform. A comprehensive approach to securing cloud-native applications by integrating multiple tools and functionalities into a single platform for visibility, security, and governance.

CSPM: Cloud Security Posture Management. A solution that detects misconfigurations in cloud environments.

FinOps: A set of practices for optimizing cloud costs.
