Why Rotate ECR Secrets?
ECR uses IAM authentication to control access to your repositories. When you authenticate with ECR, you receive a token that's valid for 12 hours. After this period, the token expires, and your Kubernetes pods may fail to pull images from ECR. By implementing an automatic rotation mechanism, we can avoid these potential disruptions.
How to Solve It?
To address this, set up a scheduled job that refreshes the image pull secret in Kubernetes; a Kubernetes CronJob is the most straightforward way to do it. While there are many tutorials on creating one with kubectl, I'd like to share how to apply this approach using Helm charts.
Setting Up the CronJob templates in Helm charts
Add templates/role.yaml
# Lets kubectl running in-cluster as the default ServiceAccount create, read, update, and delete secrets
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader
  namespace: core
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
Add templates/rolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-secrets
  namespace: core
subjects:
  - kind: ServiceAccount
    name: default
    namespace: core
roleRef:
  kind: Role
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
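Once the Role and RoleBinding are applied, it can be worth confirming that the default ServiceAccount in the core namespace really has the access the CronJob will need. The following is a minimal sketch using kubectl auth can-i with impersonation; the helper name default_sa_can_i is hypothetical, and the check assumes your own kubectl user is allowed to impersonate ServiceAccounts (cluster admins usually are).

```shell
#!/bin/sh
# Hypothetical helper: ask the API server whether the default ServiceAccount
# in the "core" namespace may perform the given verb on secrets.
# kubectl prints "yes" or "no".
default_sa_can_i() {
  verb=$1
  kubectl auth can-i "$verb" secrets \
    --namespace=core \
    --as=system:serviceaccount:core:default
}

# Usage (needs cluster access):
#   default_sa_can_i create
#   default_sa_can_i patch
```

If either call answers "no", the CronJob's kubectl apply step is likely to fail with a permissions error, so it is cheaper to catch this before the first scheduled run.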
Add templates/ecr-cron-job.yaml
{{- if .Values.global.ecrAuth.enabled -}}
apiVersion: batch/v1
kind: CronJob
metadata:
  name: core-ecr-secret-cron
spec:
  schedule: "{{ .Values.global.ecrAuth.cronSchedule }}"
  successfulJobsHistoryLimit: 1
  suspend: false
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: core-ecr-secret-cron-container
              # Choose an image that best fits your needs; it must ship kubectl and the AWS CLI
              image: alpine/k8s:1.30.4
              imagePullPolicy: IfNotPresent
              env:
                # Quote the values so YAML always treats them as strings
                - name: AWS_ACCESS_KEY_ID
                  value: {{ .Values.global.awsAdminKey | quote }}
                - name: AWS_SECRET_ACCESS_KEY
                  value: {{ .Values.global.awsAdminSecret | quote }}
                - name: AWS_DEFAULT_REGION
                  value: {{ .Values.global.awsRegion | quote }}
                - name: ECR_REGISTRY
                  value: {{ .Values.global.ecrAuth.registry | quote }}
              command:
                - /bin/sh
                - -c
                - |-
                  # Render the secret locally with a fresh token, then create or update it in the core namespace
                  kubectl create secret docker-registry "{{ required "secretName is required" .Values.global.ecrAuth.secretName }}" \
                    --save-config \
                    --dry-run=client \
                    --docker-server="{{ .Values.global.ecrAuth.registry }}" \
                    --docker-username=AWS \
                    --docker-password="$(aws ecr get-login-password)" \
                    -o yaml | kubectl --namespace=core apply -f -
          restartPolicy: Never
{{- end -}}
Add the related values in values.yaml (the templates read them from under global)
global:
  awsAdminKey: "..."
  awsAdminSecret: "..."
  awsRegion: "..."
  ecrAuth:
    enabled: true
    secretName: ecr-image-pull-secret
    registry: "..."
    cronSchedule: "0 */10 * * *" # Runs at 00:00, 10:00, and 20:00; the token expires after 12 hours
Finally, run helm upgrade ... and the CronJob should be deployed to your Kubernetes cluster.
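Rather than waiting for the first scheduled run, you can start a one-off Job from the CronJob and then look at the resulting secret. The sketch below assumes the names used in this post (CronJob core-ecr-secret-cron, secrets stored in the core namespace); the function names and the ecr-rotate-manual- job name prefix are hypothetical, and the namespace argument should be whichever namespace the Helm release was installed into.

```shell
#!/bin/sh
# Hypothetical helper: start a one-off Job from the CronJob's template.
# $1 is the namespace the CronJob was deployed into.
run_ecr_rotation_now() {
  namespace=$1
  kubectl create job "ecr-rotate-manual-$(date +%s)" \
    --from=cronjob/core-ecr-secret-cron \
    --namespace="$namespace"
}

# Hypothetical helper: print the type of the rotated secret.
# A docker-registry secret reports kubernetes.io/dockerconfigjson.
ecr_secret_type() {
  secret_name=$1
  kubectl get secret "$secret_name" --namespace=core -o jsonpath='{.type}'
}

# Usage (needs cluster access):
#   run_ecr_rotation_now core
#   ecr_secret_type ecr-image-pull-secret
```

If the secret is missing after the Job finishes, the Job's pod logs are the first place to look, since both the AWS CLI and kubectl write their errors there.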
How It Works
This CronJob runs at most 10 hours apart (at 00:00, 10:00, and 20:00 with the schedule above), which leaves at least a 2-hour buffer before the 12-hour token expires. On each run it performs these steps:
- Uses the AWS CLI to obtain a new ECR token
- Creates or updates the secret with the fresh token
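The buffer follows from simple arithmetic on the cron hour field. As a small illustration (plain shell, no cluster needed; the function names are made up for this example), the sketch below lists the hours matched by a */STEP hour field and the gaps between consecutive runs. Note that the hour field resets at midnight, so the last gap of the day is shorter than the step.

```shell
#!/bin/sh
# Hours of the day matched by a cron hour field of the form "*/STEP".
cron_hours() {
  step=$1
  hour=0
  out=""
  while [ "$hour" -lt 24 ]; do
    out="${out:+$out }$hour"
    hour=$((hour + step))
  done
  echo "$out"
}

# Hours between consecutive runs, including the final gap up to midnight.
cron_gaps() {
  step=$1
  hour=0
  out=""
  while [ "$hour" -lt 24 ]; do
    next=$((hour + step))
    if [ "$next" -gt 24 ]; then
      next=24   # the hour field resets at midnight, so the last gap is cut short
    fi
    out="${out:+$out }$((next - hour))"
    hour=$((hour + step))
  done
  echo "$out"
}

cron_hours 10   # prints: 0 10 20
cron_gaps 10    # prints: 10 10 4
```

Every gap stays below the 12-hour token lifetime, so a secret written by one run is still valid when the next run replaces it, provided that run succeeds.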
Important Considerations
- Reference the ECR secret by name in your pods (e.g., ecr-image-pull-secret) instead of using imagePullSecrets in ServiceAccounts.
- Ensure the AWS credentials used by the CronJob have the necessary permissions to interact with ECR.
- Regularly monitor the CronJob's performance and set up alerts for any failures to ensure smooth operation.
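As a sketch of the first consideration, the helper below prints a minimal Pod manifest that names the rotated secret under spec.imagePullSecrets. The helper name pod_manifest, the Pod name example-app, and the container name app are all hypothetical; in a real chart the same imagePullSecrets entry would go into your Deployment or Pod template.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal Pod manifest that pulls its image
# through the rotated secret.
# $1 = name of the image pull secret, $2 = full image reference in ECR.
pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: example-app   # hypothetical name
  namespace: core     # must be the namespace that holds the secret
spec:
  imagePullSecrets:
    - name: $1
  containers:
    - name: app       # hypothetical name
      image: $2
EOF
}

# Usage (needs cluster access; substitute your own image reference):
#   pod_manifest ecr-image-pull-secret "<registry>/<repository>:<tag>" | kubectl apply -f -
```

Because image pull secrets are looked up in the Pod's own namespace, workloads outside core would need their own copy of the secret.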
Conclusion
Implementing this CronJob ensures your Kubernetes cluster always has fresh credentials for pulling images from ECR. This approach prevents disruptions from expired tokens and offers a solid solution for managing ECR authentication in Kubernetes.
Remember to tailor the schedule and region to your specific needs, and always adhere to security best practices when handling AWS credentials in your cluster.