Cloud Custodian Policy Health Checks

An easy way to check whether your Cloud Custodian policies are in good health!

Once you have hundreds of Cloud Custodian policies deployed to your AWS cloud environment, it is important to know that those policies are working as designed and running without errors. This is similar to any internal control that you assess for design and operating effectiveness. The YAML policy is your design: it defines the resources, filters, mode, and actions, and it makes sure that the policy meets your requirement and does what it is supposed to do. Operating effectiveness is about making sure that the policy continues to run without errors over time.

In this story, we will discuss an alternative way of diagnosing and alerting on policies that produce errors. For the purpose of the story, we will assume that the Cloud Custodian logs are ingested into a SIEM solution. We have a separate story about the Cloud Custodian [GZ] output and how to ingest Cloud Custodian logs into Sumo Logic (SIEM). Of the three output files that Cloud Custodian produces, the “custodian-run.log” file is the important one for identifying a policy that is producing errors. This file contains the DEBUG messages, which include the region, Custodian version, filtered items, count, and errors. See the screenshot below for an example of a DEBUG message.

Let's identify all the policies throwing any kind of error. In the query below, enter your source, collector, and _sourceName. We are parsing every record that reports an error.
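The same idea can be sketched outside the SIEM in a few lines of Python. This is a minimal illustration, not the Sumo Logic query itself: the sample records, the policy names, and the log line layout are all hypothetical, and the check simply looks for the word "error" in each line, so adjust it to whatever your own custodian-run.log lines look like.

```python
from collections import Counter

# Hypothetical (policy name, log line) pairs. In practice these would come
# from a SIEM export or from each policy's custodian-run.log file.
SAMPLE_RECORDS = [
    ("s3-encrypt", "custodian.policy - DEBUG - Running policy:s3-encrypt resource:s3"),
    ("s3-encrypt", "custodian.policy - ERROR - Error while executing policy"),
    ("ec2-tag", "custodian.policy - DEBUG - Running policy:ec2-tag resource:ec2"),
    ("iam-keys", "custodian.policy - ERROR - Error while executing policy"),
    ("iam-keys", "custodian.policy - ERROR - Error while executing policy"),
]


def errors_by_policy(records):
    """Count log lines that mention an error, grouped by policy name."""
    counts = Counter()
    for policy, line in records:
        # Case-insensitive match so "ERROR", "Error", and "error" all count.
        if "error" in line.lower():
            counts[policy] += 1
    return counts


if __name__ == "__main__":
    for policy, count in errors_by_policy(SAMPLE_RECORDS).most_common():
        print(policy, count)
```

Policies that never log an error do not appear in the result at all, which keeps the output focused on the ones that need attention.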

A quick analysis may show that many of them are failing because of permission issues (“Access Denied”). If we already know the issue is with an S3 bucket and the reason is obvious, we can exclude it from scope. In the query below, enter your source, collector, and _sourceName. We use “!” (not) so that the query ignores records containing either “Access Denied” (with a space) or “AccessDenied” (one word).
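As a rough Python equivalent of that exclusion, the sketch below keeps only error lines that mention neither spelling. The function name and the sample lines are hypothetical, and the single regular expression stands in for the two "not" conditions described above.

```python
import re

# Matches "Access Denied" (with a space) and "AccessDenied" (one word),
# in any letter case.
ACCESS_DENIED = re.compile(r"access\s?denied", re.IGNORECASE)


def is_non_permission_error(line):
    """Return True for error lines that are not permission failures."""
    has_error = "error" in line.lower()
    return has_error and ACCESS_DENIED.search(line) is None


# Hypothetical log lines used only to exercise the filter.
SAMPLE_LINES = [
    "custodian.policy - DEBUG - Running policy:s3-encrypt resource:s3",
    "custodian.policy - ERROR - An error occurred (AccessDenied) when calling GetBucketPolicy",
    "custodian.policy - ERROR - bucket check failed: Access Denied",
    "custodian.policy - ERROR - Error while executing policy",
]

if __name__ == "__main__":
    for line in SAMPLE_LINES:
        if is_non_permission_error(line):
            print(line)
```

Only the last sample line survives the filter: the first is not an error, and the second and third are permission failures.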

The result of this query shows the policies that are failing for reasons other than permission issues, which gives you a focused check on policy health. You can also schedule the Sumo Logic search to run at regular intervals, so you don’t have to run the query manually. The scheduled search runs at the frequency you define and sends you the results (along with attachments) by email or Slack.

Other Stories

Cloud Custodian [GZ] Output Files

Identify AWS Resources Exposed to the World using Cloud Custodian

A Watchman for your Cloud, that never sleeps, and it’s FREE!

Ingesting Cloud Custodian Logs into Sumo Logic




Over 18 years of experience in a wide variety of technical domains within information security including information assurance, compliance, and risk management.