We have been told thousands of times not to leave our AWS keys in source code.
I have also heard that AWS itself searches repositories and, when it finds keys, tells the owners to remove them.
What do you think: how many keys are still on GitHub?
To get an estimate we can run an experiment ourselves; it takes just a couple of minutes.
With a little SQL and the BigQuery examples I arrived at the following:
(Not by the most extreme stretch of the imagination would I call myself a data scientist.)
It searches for public AWS keys (the access key IDs), which have an obvious signature: they begin with AKIA.
My reasoning is that where a public key can be found, there may also be a secret key.
(As my Grandma said: Where there is smoke there also is fire.)
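The same signature check can be sketched outside BigQuery, for example to scan your own files before committing. This is only a minimal illustration, not the query used for this post. It assumes an access key ID is AKIA followed by 16 uppercase letters or digits; the function name is made up for the example, and the key shown is the placeholder from the AWS documentation, not a real one.

```python
import re

# Assumption: a long-term access key ID is "AKIA" plus 16 uppercase letters
# or digits (20 characters in total). The lookarounds keep us from matching
# in the middle of a longer alphanumeric string.
ACCESS_KEY_ID = re.compile(r"(?<![A-Z0-9])AKIA[A-Z0-9]{16}(?![A-Z0-9])")


def find_access_key_ids(text):
    """Return the unique key-ID-shaped strings in text, in order of first appearance."""
    seen = []
    for match in ACCESS_KEY_ID.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


# AKIAIOSFODNN7EXAMPLE is the documentation placeholder, not a working key.
sample = '''
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"
comment = "AKIA is only a prefix"
'''

print(find_access_key_ids(sample))
```

The duplicate is reported only once, which mirrors counting unique keys rather than raw hits.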
Samples can be run with the BigQuery console without any fuss within the free monthly quota.
The run finishes in seconds and delivers 35 results, most of which look like actual working keys.
Very few of them are obviously edited and thus made non-functional, which is why I dare list them here:
The query was run on a sample data set.
Extrapolating to the full data suggests there are more than 500 unique keys on GitHub.
- The query works only for public keys. Secret keys have no easily distinguishable signature, apart from their length and character set.
- I used the sample data set for my own curiosity. Running across the full 3 TB+ dataset would probably have exceeded my free monthly quota.
- This cannot be counted as a proven vulnerability: no attempt was made to retrieve any secret key, let alone to try one out, and I do not recommend doing either.
- I want to point out how exposed the keys actually are. Do NOT put them into code, not even once!
- In case it happened: What to Do If You Inadvertently Expose an AWS Access Key
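
To illustrate the first point in the list, here is a small sketch of why secret keys are hard to search for. It assumes a secret key is 40 characters from the base64 alphabet; the helper name is invented for this example, and the key is again the placeholder from the AWS documentation. A pattern based only on length and character set also accepts unrelated strings such as a SHA-1 hex digest.

```python
import hashlib
import re

# Assumption: a secret access key is exactly 40 characters drawn from the
# base64 alphabet. There is no fixed prefix comparable to AKIA.
SECRET_KEY_SHAPE = re.compile(r"[A-Za-z0-9/+=]{40}")


def looks_like_secret_key(token):
    """True if token has the length and character set of a secret key."""
    return SECRET_KEY_SHAPE.fullmatch(token) is not None


# Placeholder secret from the AWS documentation, not a working key.
doc_example = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

# A SHA-1 hex digest is also 40 characters from the same alphabet.
sha1_digest = hashlib.sha1(b"hello").hexdigest()

print(looks_like_secret_key(doc_example))             # True
print(looks_like_secret_key(sha1_digest))             # True: a false positive
print(looks_like_secret_key("AKIAIOSFODNN7EXAMPLE"))  # False: only 20 characters
```

Since commit hashes and checksums are everywhere in source code, such a pattern on its own would produce mostly noise.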