mirror of https://github.com/fluxcd/flux2.git
You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
1257 lines
53 KiB
Markdown
1257 lines
53 KiB
Markdown
# RFC-0010 Multi-Tenant Workload Identity
|
|
|
|
**Status:** implementable
|
|
|
|
<!--
|
|
Status represents the current state of the RFC.
|
|
Must be one of `provisional`, `implementable`, `implemented`, `deferred`, `rejected`, `withdrawn`, or `replaced`.
|
|
-->
|
|
|
|
**Creation date:** 2025-02-22
|
|
|
|
**Last update:** 2025-04-14
|
|
|
|
## Summary
|
|
|
|
In this RFC we aim to add support for multi-tenant workload identity in Flux,
|
|
i.e. the ability to specify at the object-level which set of cloud provider
|
|
permissions must be used for interacting with the respective cloud provider
|
|
on behalf of the reconciliation of the object. In this process, credentials
|
|
must be obtained automatically, i.e. this feature must not involve the use
|
|
of secrets. This would be useful in a number of Flux APIs that need to
|
|
interact with cloud providers, spanning all the Flux controllers except
|
|
for helm-controller.
|
|
|
|
### Multi-Tenancy Model
|
|
|
|
In the context of this RFC, multi-tenancy refers to the ability of a single
|
|
Flux instance running inside a Kubernetes cluster to manage Flux objects
|
|
belonging to all the tenants in the cluster while still ensuring that each
|
|
tenant has access only to their own resources according to the Least Privilege
|
|
Principle. In this scenario a tenant is often a team inside an organization,
|
|
so the reader can consider the
|
|
[multi-team tenancy model](https://kubernetes.io/docs/concepts/security/multi-tenancy/#multiple-teams).
|
|
Each team has their own namespaces, which are not shared with other teams.
|
|
|
|
## Motivation
|
|
|
|
Flux has strong multi-tenancy features. For example, the `Kustomization` and
|
|
`HelmRelease` APIs support the field `spec.serviceAccountName` for specifying
|
|
the Kubernetes `ServiceAccount` to impersonate when interacting with the
|
|
Kubernetes API on behalf of a tenant, e.g. when applying resources. This
|
|
allows tenants to be constrained under the Kubernetes RBAC permissions
|
|
granted to this `ServiceAccount`, and therefore have access only to the
|
|
specific subset of resources they should be allowed to use.
|
|
|
|
Besides the Kubernetes API, Flux also interacts with cloud providers, e.g.
|
|
container registries, object storage, pub/sub services, etc. In these cases,
|
|
Flux currently supports basically two modes of authentication:
|
|
|
|
- *Secret-based multi-tenant authentication*: Objects have the field
|
|
`spec.secretRef` for specifying the Kubernetes `Secret` containing the
|
|
credentials to use when interacting with the cloud provider. This is
|
|
similar to the `spec.serviceAccountName` field, but for cloud providers.
|
|
The problem with this approach is that secrets are a security risk and
|
|
operational burden, as they must be managed and rotated.
|
|
- *Workload-identity-based single-tenant authentication*: Flux offers
|
|
single-tenant workload identity support by configuring the `ServiceAccount`
|
|
of the Flux controllers to impersonate a cloud identity. This eliminates
|
|
the need for secrets, as the credentials are obtained automatically by
|
|
the cloud provider Go libraries used by the Flux controllers when they
|
|
are running inside the respective cloud environment. The problem with
|
|
this approach is that it is single-tenant, i.e. all objects are reconciled
|
|
using the same cloud identity, the one associated with the respective controller.
|
|
|
|
For delivering the high level of security and multi-tenancy support that
|
|
Flux aims for, it is necessary to extend the workload identity support to
|
|
be multi-tenant. This means that each object must be able to specify which
|
|
cloud identity must be impersonated when interacting with the cloud provider
|
|
on behalf of the reconciliation of the object. This would allow tenants to
|
|
be constrained under the cloud provider permissions granted to this identity,
|
|
and therefore have access only to the specific subset of resources they are
|
|
allowed to manage.
|
|
|
|
### Goals
|
|
|
|
Provide multi-tenant workload identity support in Flux, i.e. the ability to
|
|
specify at the object-level which cloud identity must be impersonated to
|
|
interact with the respective cloud provider on behalf of the reconciliation
|
|
of the object, without the need for secrets.
|
|
|
|
### Non-Goals
|
|
|
|
It's not a goal to provide multi-tenant workload identity *federation* support.
|
|
The (small) difference between workload identity and workload identity federation
|
|
is that the former assumes that the workloads are running inside the cloud
|
|
environment, while the latter assumes that the workloads are running outside
|
|
the cloud environment. All the major cloud providers support both, as the majority
|
|
of the underlying technology is the same, but the configuration is slightly
|
|
different. Because the differences are small we may consider workload identity
|
|
federation support in the future, but it's not a goal for this RFC.
|
|
|
|
## Proposal
|
|
|
|
For supporting multi-tenant workload identity at the object-level for the Flux APIs
|
|
we propose associating the Flux objects with Kubernetes `ServiceAccounts`. The
|
|
controller would need to create a token for the `ServiceAccount` associated with
|
|
the object in the Kubernetes API, and then exchange it for a short-lived access
|
|
token for the cloud provider. This would require the controller `ServiceAccount`
|
|
to have RBAC permission to create tokens for any `ServiceAccounts` in the cluster.
|
|
|
|
### User Stories
|
|
|
|
#### Story 1
|
|
|
|
> As a cluster administrator, I want to allow tenant A to pull OCI artifacts
|
|
> from the Amazon ECR repository belonging to tenant A, but only from this
|
|
> repository. At the same time, I want to allow tenant B to pull OCI artifacts
|
|
> from the Amazon ECR repository belonging to tenant B, but only from this
|
|
> repository.
|
|
|
|
For example, I would like to have the following configuration:
|
|
|
|
```yaml
|
|
apiVersion: source.toolkit.fluxcd.io/v1beta2
|
|
kind: OCIRepository
|
|
metadata:
|
|
name: tenant-a-repo
|
|
namespace: tenant-a
|
|
spec:
|
|
...
|
|
provider: aws
|
|
serviceAccountName: tenant-a-ecr-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-a-ecr-sa
|
|
namespace: tenant-a
|
|
annotations:
|
|
eks.amazonaws.com/role-arn: arn:aws:iam::123456789123:role/tenant-a-ecr
|
|
---
|
|
apiVersion: source.toolkit.fluxcd.io/v1beta2
|
|
kind: OCIRepository
|
|
metadata:
|
|
name: tenant-b-repo
|
|
namespace: tenant-b
|
|
spec:
|
|
...
|
|
provider: aws
|
|
serviceAccountName: tenant-b-ecr-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-b-ecr-sa
|
|
namespace: tenant-b
|
|
annotations:
|
|
eks.amazonaws.com/role-arn: arn:aws:iam::123456789123:role/tenant-b-ecr
|
|
```
|
|
|
|
#### Story 2
|
|
|
|
> As a cluster administrator, I want to allow tenant A to pull and push to the Git
|
|
> repository in Azure DevOps belonging to tenant A, but only this repository. At
|
|
> the same time, I want to allow tenant B to pull and push to the Git repository
|
|
> in Azure DevOps belonging to tenant B, but only this repository.
|
|
|
|
For example, I would like to have the following configuration:
|
|
|
|
```yaml
|
|
apiVersion: source.toolkit.fluxcd.io/v1
|
|
kind: GitRepository
|
|
metadata:
|
|
name: tenant-a-repo
|
|
namespace: tenant-a
|
|
spec:
|
|
...
|
|
provider: azure
|
|
serviceAccountName: tenant-a-azure-devops-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-a-azure-devops-sa
|
|
namespace: tenant-a
|
|
annotations:
|
|
azure.workload.identity/client-id: d6e4fc00-c5b2-4a72-9f84-6a92e3f06b08 # client ID for my tenant A
|
|
azure.workload.identity/tenant-id: 72f988bf-86f1-41af-91ab-2d7cd011db47 # azure tenant for the cluster (optional, defaults to the env var AZURE_TENANT_ID set in the controller)
|
|
---
|
|
apiVersion: image.toolkit.fluxcd.io/v1beta2
|
|
kind: ImageUpdateAutomation
|
|
metadata:
|
|
name: tenant-a-image-update
|
|
namespace: tenant-a
|
|
spec:
|
|
...
|
|
sourceRef:
|
|
kind: GitRepository
|
|
name: tenant-a-repo
|
|
---
|
|
apiVersion: source.toolkit.fluxcd.io/v1
|
|
kind: GitRepository
|
|
metadata:
|
|
name: tenant-b-repo
|
|
namespace: tenant-b
|
|
spec:
|
|
...
|
|
provider: azure
|
|
serviceAccountName: tenant-b-azure-devops-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-b-azure-devops-sa
|
|
namespace: tenant-b
|
|
annotations:
|
|
azure.workload.identity/client-id: 4a7272f9-f186-41af-9f84-6a92e32d7cd0 # client ID for my tenant B
|
|
azure.workload.identity/tenant-id: 72f988bf-86f1-41af-91ab-2d7cd011db47 # azure tenant for the cluster (optional, defaults to the env var AZURE_TENANT_ID set in the controller)
|
|
---
|
|
apiVersion: image.toolkit.fluxcd.io/v1beta2
|
|
kind: ImageUpdateAutomation
|
|
metadata:
|
|
name: tenant-b-image-update
|
|
namespace: tenant-b
|
|
spec:
|
|
...
|
|
sourceRef:
|
|
kind: GitRepository
|
|
name: tenant-b-repo
|
|
```
|
|
|
|
#### Story 3
|
|
|
|
> As a cluster administrator, I want to allow tenant A to pull manifests from
|
|
> the GCS bucket belonging to tenant A, but only from this bucket. At the same
|
|
> time, I want to allow tenant B to pull manifests from the GCS bucket
|
|
> belonging to tenant B, but only from this bucket.
|
|
|
|
For example, I would like to have the following configuration:
|
|
|
|
```yaml
|
|
apiVersion: source.toolkit.fluxcd.io/v1
|
|
kind: Bucket
|
|
metadata:
|
|
name: tenant-a-bucket
|
|
namespace: tenant-a
|
|
spec:
|
|
...
|
|
provider: gcp
|
|
serviceAccountName: tenant-a-gcs-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-a-gcs-sa
|
|
namespace: tenant-a
|
|
annotations:
|
|
iam.gke.io/gcp-service-account: tenant-a-bucket@my-org-project.iam.gserviceaccount.com
|
|
---
|
|
apiVersion: source.toolkit.fluxcd.io/v1
|
|
kind: Bucket
|
|
metadata:
|
|
name: tenant-b-bucket
|
|
namespace: tenant-b
|
|
spec:
|
|
...
|
|
provider: gcp
|
|
serviceAccountName: tenant-b-gcs-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-b-gcs-sa
|
|
namespace: tenant-b
|
|
annotations:
|
|
iam.gke.io/gcp-service-account: tenant-b-bucket@my-org-project.iam.gserviceaccount.com
|
|
```
|
|
|
|
#### Story 4
|
|
|
|
> As a cluster administrator, I want to allow tenant A to decrypt secrets using
|
|
> the AWS KMS key belonging to tenant A, but only this key. At the same time,
|
|
> I want to allow tenant B to decrypt secrets using the AWS KMS key belonging
|
|
> to tenant B, but only this key.
|
|
|
|
For example, I would like to have the following configuration:
|
|
|
|
```yaml
|
|
apiVersion: kustomize.toolkit.fluxcd.io/v1
|
|
kind: Kustomization
|
|
metadata:
|
|
name: tenant-a-aws-kms
|
|
namespace: tenant-a
|
|
spec:
|
|
...
|
|
decryption:
|
|
provider: sops
|
|
serviceAccountName: tenant-a-aws-kms-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-a-aws-kms-sa
|
|
namespace: tenant-a
|
|
annotations:
|
|
eks.amazonaws.com/role-arn: arn:aws:iam::123456789123:role/tenant-a-kms
|
|
---
|
|
apiVersion: kustomize.toolkit.fluxcd.io/v1
|
|
kind: Kustomization
|
|
metadata:
|
|
name: tenant-b-aws-kms
|
|
namespace: tenant-b
|
|
spec:
|
|
...
|
|
decryption:
|
|
provider: sops
|
|
serviceAccountName: tenant-b-aws-kms-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-b-aws-kms-sa
|
|
namespace: tenant-b
|
|
annotations:
|
|
eks.amazonaws.com/role-arn: arn:aws:iam::123456789123:role/tenant-b-kms
|
|
```
|
|
|
|
#### Story 5
|
|
|
|
> As a cluster administrator, I want to allow tenant A to publish notifications
|
|
> to the `tenant-a` topic in Google Cloud Pub/Sub, but only to this topic. At
|
|
> the same time, I want to allow tenant B to publish notifications to the
|
|
> `tenant-b` topic in Google Cloud Pub/Sub, but only to this topic. I want
|
|
> to do so without creating any GCP IAM Service Accounts.
|
|
|
|
For example, I would like to have the following configuration:
|
|
|
|
```yaml
|
|
apiVersion: notification.toolkit.fluxcd.io/v1beta3
|
|
kind: Provider
|
|
metadata:
|
|
name: tenant-a-google-pubsub
|
|
namespace: tenant-a
|
|
spec:
|
|
...
|
|
type: googlepubsub
|
|
serviceAccountName: tenant-a-google-pubsub-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-a-google-pubsub-sa
|
|
namespace: tenant-a
|
|
---
|
|
apiVersion: notification.toolkit.fluxcd.io/v1beta3
|
|
kind: Provider
|
|
metadata:
|
|
name: tenant-b-google-pubsub
|
|
namespace: tenant-b
|
|
spec:
|
|
...
|
|
type: googlepubsub
|
|
serviceAccountName: tenant-b-google-pubsub-sa
|
|
---
|
|
apiVersion: v1
|
|
kind: ServiceAccount
|
|
metadata:
|
|
name: tenant-b-google-pubsub-sa
|
|
namespace: tenant-b
|
|
```
|
|
|
|
### Alternatives
|
|
|
|
#### An alternative for identifying Flux resources in cloud providers
|
|
|
|
Instead of issuing `ServiceAccount` tokens in the Kubernetes API we could
|
|
come up with a username naming scheme for Flux resources and issue tokens
|
|
for these usernames instead, e.g. `flux:<resource type>:<namespace>:<name>`.
|
|
This would make each Flux object have its own identity instead of using
|
|
`ServiceAccounts` for this purpose. This choice would then prevent cases
|
|
of other Flux objects from malicious actors in the same namespace from
|
|
abusing the permissions granted to the `ServiceAccount` of the object.
|
|
This choice, however, would provide a worse user experience, as Flux and
|
|
Kubernetes users are already used to the `ServiceAccount` resource being
|
|
the identity for resources in the cluster, not only in the context of plain
|
|
RBAC but also in the context of workload identity.
|
|
This choice would also require the introduction of new APIs for configuring
|
|
the respective cloud identities in the Flux objects, when such APIs already
|
|
exist as defined by the cloud providers themselves as annotations in the
|
|
`ServiceAccount` resources. We therefore choose to stick with the well-known
|
|
pattern of using `ServiceAccounts` for configuring the identities of the
|
|
Flux resources. Furthermore, as mentioned in the
|
|
[Multi-Tenancy Model](#multi-tenancy-model) section, the tenant trust domains
|
|
are namespaces, so a tenant is expected to control and have access to all
|
|
the resources `ServiceAccounts` in their namespaces are allowed to access.
|
|
|
|
#### Alternatives for modifying controller RBAC to create `ServiceAccount` tokens
|
|
|
|
In this section we discuss alternatives for changing the RBAC of controllers for
|
|
creating `ServiceAccount` tokens cluster-wide, as it has a potential impact on
|
|
the security posture of Flux.
|
|
|
|
1. We grant RBAC permissions to the `ServiceAccounts` of the Flux controllers
|
|
(that would implement multi-tenant workload identity) for creating tokens
|
|
for any other `ServiceAccounts` in the cluster.
|
|
2. We require users to grant "self-impersonation" to the `ServiceAccounts` so they
|
|
can create tokens for themselves. The controller would then impersonate the
|
|
`ServiceAccount` when creating a token for it. This operation would then only
|
|
succeed if the `ServiceAccount` has been correctly granted permission to create
|
|
a token for itself.
|
|
|
|
In both alternatives the controller `ServiceAccount` would require some form
|
|
of cluster-wide impersonation permission. Alternative 2 requires impersonation
|
|
permission to be granted directly to the controller `ServiceAccount`, while
|
|
in alternative 1, impersonation permission would be indirectly granted by the
|
|
process of creating a token for another `ServiceAccount`. By creating a token
|
|
for another `ServiceAccount`, the controller `ServiceAccount` effectively has
|
|
the same permissions as the `ServiceAccount` it is creating the token for, as
|
|
it could simply use the token to impersonate the `ServiceAccount`. Therefore
|
|
it is reasonable to affirm that both alternatives are equivalent in terms of
|
|
security.
|
|
|
|
To break the tie between the two alternatives we introduce the fact that
|
|
alternative 1 eliminates operational burden on users. In fact, native
|
|
workload identity for pods does not require users to grant this
|
|
self-impersonation permission to the `ServiceAccounts` of the pods.
|
|
|
|
We therefore choose alternative 1.
|
|
|
|
## Design Details
|
|
|
|
For detailing the proposal we need to first introduce the technical
|
|
background on how workload identity is implemented by the managed
|
|
Kubernetes services from the cloud providers.
|
|
|
|
### Technical Background
|
|
|
|
Workload identity in Kubernetes is based on
|
|
[OpenID Connect Discovery](https://openid.net/specs/openid-connect-discovery-1_0.html)
|
|
(OIDC).
|
|
The *Kubernetes `ServiceAccount` token issuer*, included as the `iss` JWT claim in the
|
|
issued tokens, and represented by the default URL `https://kubernetes.default.svc.cluster.local`,
|
|
implements the OIDC discovery protocol. Essentially, this means that the Kubernetes API
|
|
will respond requests to the URL
|
|
`https://kubernetes.default.svc.cluster.local/.well-known/openid-configuration`
|
|
with a JSON document similar to the one below:
|
|
|
|
```json
|
|
{
|
|
"issuer": "https://kubernetes.default.svc.cluster.local",
|
|
"jwks_uri": "https://172.18.0.2:6443/openid/v1/jwks",
|
|
"response_types_supported": [
|
|
"id_token"
|
|
],
|
|
"subject_types_supported": [
|
|
"public"
|
|
],
|
|
"id_token_signing_alg_values_supported": [
|
|
"RS256"
|
|
]
|
|
}
|
|
```
|
|
|
|
And to the URL `https://172.18.0.2:6443/openid/v1/jwks`, *discovered* through the field
|
|
`.jwks_uri` in the JSON response above, the Kubernetes API will respond a JSON document
|
|
similar to the following:
|
|
|
|
```json
|
|
{
|
|
"keys": [
|
|
{
|
|
"use": "sig",
|
|
"kty": "RSA",
|
|
"kid": "NWm3YKmazJPVP7tttzkmSxUn0w8LGGp7yS2CanEF-A8",
|
|
"alg": "RS256",
|
|
"n": "lV2tbw9hnz1mseah2kMQNe5sRju4mPLlK0F7np97lLNC49G8yc5TMjyciLF3qsDNFCfWyYmsuGlcRg2BIBBX_jkpIUUjlsktdHhuqO2RnOqyRtNuljlT_b0QJgpgxCqq0DHI31EBc0JALOVd6EjjlhsVvVzZOw_b9KBXVS3D3RENuT0_FWauDq5NYbyYnjlvk-vUXCRMNDQSDNwx6X6bktwsmeDRXtM_bP3DokmnMYc4n0asTEg14L6VKky0ByF88Wi1-y0Pm0BHdobDGt1cIeUDeThk4E79JCHxkT5urAyYHcNwcfU4q-tnD6bTpNkFVsk3cqqK2nF7R_7ac5arSQ",
|
|
"e": "AQAB"
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
This JSON document contains the public keys for verifying the signature of the issued tokens.
|
|
|
|
By querying these two URLs in sequence, cloud providers are able to fetch the information
|
|
required for verifying and trusting the tokens issued by the Kubernetes API. Most specifically,
|
|
for trusting the `sub` JWT claim, which contains the Kubernetes `ServiceAccount` reference
|
|
(name and namespace) for which the token was issued for, i.e. the `ServiceAccount` properly
|
|
said.
|
|
|
|
By allowing permissions to be granted to `ServiceAccounts` in the cloud provider,
|
|
the cloud provider is then able to allow Kubernetes `ServiceAccounts` to access its resources.
|
|
This is usually done by a *Security Token Service* (STS) that exchanges the Kubernetes token
|
|
for a short-lived cloud provider access token, which is then used to access the cloud provider
|
|
resources.
|
|
|
|
It's important to mention that the Kubernetes `ServiceAccount` token issuer URL must be
|
|
trusted by the cloud provider, i.e. users must configure this URL as a trusted identity
|
|
provider.
|
|
|
|
This process forms the basis for workload identity in Kubernetes. As long as the issuer
|
|
URL can be reached by the cloud provider, this process can take place successfully.
|
|
|
|
The reachability of the issuer URL by the cloud provider is where the implementation
|
|
of workload identity starts to differ between cloud providers. For example, in GCP
|
|
one can configure the content of the JWKS document directly in the GCP IAM console,
|
|
which eliminates the need for network calls to the Kubernetes API. In AWS, on the
|
|
other hand, this is not possible, the process has to be followed strictly, i.e. the
|
|
issuer URL must be reachable by the AWS STS service.
|
|
|
|
Furthermore, GKE automatically
|
|
creates the necessary trust relationship between the Kubernetes issuer and the GCP
|
|
STS service (i.e. automatically injects the JWKS document of the GKE cluster in the
|
|
STS database), while in EKS this must be done manually by users (an OIDC provider
|
|
must be created for each EKS cluster).
|
|
|
|
Another difference is that the issuer URL remains the default/private one in GKE,
|
|
while in EKS it is automatically set to a public one. This is done through
|
|
the `--service-account-issuer` flag in the `kube-apiserver` command line arguments
|
|
([docs](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-issuer-discovery)). This is a nice feature, as it allows external
|
|
systems to federate access for workloads running in EKS clusters, e.g. EKS workloads
|
|
can have federated access to GCP resources.
|
|
|
|
Yet another difference between cloud providers that sheds light in our proposal is
|
|
how applications running inside pods from the managed Kubernetes services obtain
|
|
the short-lived cloud provider access tokens. In GCP, the GCP libraries used by
|
|
the applications attempt to retrieve tokens from the *metadata server*, which is
|
|
reachable by all pods running in GKE. This server creates a token for the
|
|
`ServiceAccount` of the calling pod in the Kubernetes API, exchanges it for a
|
|
short-lived GCP access token, and returns it to the application. In AKS, on the
|
|
other hand, pods are mutated to include a
|
|
[*token volume projection*](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#serviceaccount-token-volume-projection). The kubelet mounts and automatically
|
|
rotates a volume with a token file inside the pod. The Azure libraries used by
|
|
the applications then read this file periodically to perform the token exchange
|
|
with the Azure STS service.
|
|
|
|
Another aspect of workload identity that is important for this RFC is how the cloud
|
|
identities are associated with the Kubernetes `ServiceAccounts`. In most cases, an
|
|
identity from the IAM service of the cloud provider (e.g. a GCP IAM Service Account,
|
|
or an AWS IAM Role) is associated with a Kubernetes `ServiceAccount` by the process
|
|
of *impersonation*. Permission to impersonate the cloud identity is granted to the
|
|
`ServiceAccount` through a configuration that points to the fully qualified name of
|
|
the Kubernetes `ServiceAccount`, i.e. the name and namespace of the `ServiceAccount`
|
|
and which cluster it belongs to in the name/address system of the cloud provider.
|
|
|
|
Because the cloud provider needs to support this impersonation permission, some
|
|
cloud providers go further and even remove the impersonation requirement, by
|
|
allowing permissions to be granted directly to `ServiceAccounts` (if it needs to
|
|
support granting the impersonation permission, then it can probably also easily
|
|
support granting any other permissions depending on the implementation). GCP for
|
|
example has implemented this feature [recently](https://cloud.google.com/blog/products/identity-security/make-iam-for-gke-easier-to-use-with-workload-identity-federation), a GCP IAM
|
|
Service Account is no longer required for workload identity, i.e. GCP IAM
|
|
permissions can now be granted directly to Kubernetes `ServiceAccounts`. This is
|
|
a significant improvement in the user experience, as it significantly reduces
|
|
the required configuration steps. AWS implemented a similar feature called *EKS
|
|
Pod Identity*, but it still requires an IAM Role to be associated with the
|
|
`ServiceAccount`. The minor improvement from the user experience perspective is
|
|
that this association is implemented entirely in the AWS EKS/IAM APIs, no
|
|
annotations are required in the Kubernetes `ServiceAccount`. Another improvement
|
|
from this EKS feature compared to *IAM Roles for Service Accounts* is that users
|
|
no longer need to create an *OIDC Provider* for the EKS cluster in the IAM API.
|
|
|
|
In sight of the technical background presented above, our proposal becomes simpler.
|
|
The only solution to support multi-tenant workload identity at the object-level for
|
|
the Flux APIs is to associate the Flux objects with Kubernetes `ServiceAccounts`.
|
|
We propose to build the `ServiceAccount` token creation and exchange logic into
|
|
the Flux controllers through a library in the `github.com/fluxcd/pkg` repository.
|
|
|
|
### API Changes
|
|
|
|
For all the Flux APIs interacting with cloud providers (except `Kustomization`,
|
|
see the paragraph below), we propose introducing the field `spec.serviceAccountName`
|
|
(if not already present) for specifying the Kubernetes `ServiceAccount` on the same
|
|
namespace of the object that must be used for getting access to the respective cloud
|
|
resources. This field would be optional, and when not present the original behavior
|
|
would be observed, i.e. the feature only activates when the field is present and a
|
|
cloud provider among `aws`, `azure` or `gcp` is specified in the `spec.provider`
|
|
field. So if only the `spec.provider` field is present and set to a cloud provider,
|
|
then the controller would use single-tenant workload identity as it would prior to
|
|
the implementation of this RFC, i.e. it would use its own identity for the operation.
|
|
|
|
Note that this RFC does not seek to change the behavior when `spec.provider` is set
|
|
to `generic` (or left empty, when it defaults to `generic`), in which case the field
|
|
`spec.secretRef` can be used for specifying the Kubernetes `Secret` containing the
|
|
credentials (or `spec.serviceAccountName` in the case of the APIs dealing with
|
|
container registries, through the `imagePullSecrets` field of the `ServiceAccount`).
|
|
|
|
The `Kustomization` API uses Key Management Services (KMS) for decrypting
|
|
SOPS-encrypted secrets. We propose adding the dedicated optional field
|
|
`spec.decryption.serviceAccountName` for multi-tenant workload identity
|
|
when intercting with the KMS service. We choose having a dedicated field
|
|
for the `Kustomization` API because the field `spec.serviceAccountName`
|
|
already exists and is used for a major part of the functionality which
|
|
is authenticating with the Kubernetes API when applying resources. If
|
|
we used the same field for both purposes users would be forced to use
|
|
multi-tenancy for both cloud and Kubernetes API interactions. Furthermore,
|
|
the cloud provider in the `Kustomization` API is detected by the SOPS SDK
|
|
itself while decrypting the secrets, so we don't need to introduce a new
|
|
field for this purpose.
|
|
|
|
### Workload Identity Library
|
|
|
|
We propose using the Go package `github.com/fluxcd/pkg/auth`
|
|
for implementing a workload identity library that can be
|
|
used by all the Flux controllers that need to interact
|
|
with cloud providers. This library would be responsible
|
|
for creating the `ServiceAccount` tokens in the Kubernetes
|
|
API and exchanging them for short-lived access tokens
|
|
for the cloud provider. The library would also be responsible
|
|
for caching the tokens when configured by users.
|
|
|
|
The library should support both single-tenant and multi-tenant workload
|
|
identity because single-tenant implementations are already supported in
|
|
GA APIs and hence they must remain available for backwards compatibility.
|
|
Furthermore, it would be easier to support both use cases in a single
|
|
library as opposed to mingling a new library into the currently existing
|
|
ones, so this new library becomes the definitive unified solution for
|
|
workload identity in Flux.
|
|
|
|
The library should automatically detect whether the workload identity
|
|
is single-tenant or multi-tenant by checking if a `ServiceAccount` was
|
|
configured for the operation. If a `ServiceAccount` was configured, then
|
|
the operation is multi-tenant, otherwise it is single-tenant and the
|
|
granted access token must represent the identity associated with the
|
|
controller.
|
|
|
|
The directory structure would look like this:
|
|
|
|
```shell
|
|
.
|
|
└── auth
|
|
├── aws
|
|
│ └── aws.go
|
|
├── azure
|
|
│ └── azure.go
|
|
├── gcp
|
|
│ └── gcp.go
|
|
├── get_token.go
|
|
├── options.go
|
|
├── provider.go
|
|
└── token.go
|
|
```
|
|
|
|
The file `auth/get_token.go` would contain the main algorithm:
|
|
|
|
```go
|
|
package auth
|
|
|
|
// GetToken returns an access token for accessing resources in the given cloud provider.
|
|
func GetToken(ctx context.Context, provider Provider, opts ...Option) (Token, error) {
|
|
// 1. Check if a ServiceAccount is configured and return the controller access token if not (single-tenant WI).
|
|
// 2. Get the provider audience for creating the OIDC token for the ServiceAccount in the Kubernetes API.
|
|
// 3. Get the ServiceAccount using the configured controller-runtime client.
|
|
// 4. Get the provider identity from the ServiceAccount annotations and add it to the options.
|
|
// 5. Build the cache key using the configured options.
|
|
// 6. Get the token from the cache. If present, return it, otherwise continue.
|
|
// 7. Create an OIDC token for the ServiceAccount in the Kubernetes API using the provider audience.
|
|
// 8. Exchange the OIDC token for an access token through the Security Token Service of the provider.
|
|
// 9. If an image repository is configured, exchange the access token for a registry token.
|
|
// 10. Add the final token to the cache and return it.
|
|
}
|
|
```
|
|
|
|
The file `auth/token.go` would contain the token abstractions:
|
|
|
|
```go
|
|
package auth
|
|
|
|
// Token is an interface that represents an access token that can be used to
|
|
// authenticate with a cloud provider. The only common method is for getting the
|
|
// duration of the token, because different providers have different ways of
|
|
// representing the token. For example, Azure and GCP use a single string,
|
|
// while AWS uses three strings: access key ID, secret access key and token.
|
|
// Consumers of this interface should know what type to cast it to.
|
|
type Token interface {
|
|
// GetDuration returns the duration for which the token is valid relative to
|
|
// approximately time.Now(). This is used to determine when the token should
|
|
// be refreshed.
|
|
GetDuration() time.Duration
|
|
}
|
|
|
|
// RegistryCredentials is a particular type implementing the Token interface
|
|
// for credentials that can be used to authenticate with a container registry
|
|
// from a cloud provider. This type is compatible with all the cloud providers
|
|
// and should be returned when the image repository is configured in the options.
|
|
type RegistryCredentials struct {
|
|
Username string
|
|
Password string
|
|
ExpiresAt time.Time
|
|
}
|
|
|
|
func (r *RegistryCredentials) GetDuration() time.Duration {
|
|
return time.Until(r.ExpiresAt)
|
|
}
|
|
```
|
|
|
|
The file `auth/provider.go` would contain the `Provider` interface:
|
|
|
|
```go
|
|
package auth
|
|
|
|
// Provider contains the logic to retrieve an access token for a cloud
|
|
// provider from a ServiceAccount (OIDC/JWT) token.
|
|
type Provider interface {
|
|
// GetName returns the name of the provider.
|
|
GetName() string
|
|
|
|
// NewDefaultToken returns a token that can be used to authenticate with the
|
|
// cloud provider retrieved from the default source, i.e. from the pod's
|
|
// environment, e.g. files mounted in the pod, environment variables,
|
|
// local metadata services, etc. In this case the method would implicitly
|
|
// use the ServiceAccount associated with the controller pod, and not one
|
|
// specified in the options.
|
|
NewDefaultToken(ctx context.Context, opts ...Option) (Token, error)
|
|
|
|
// GetAudience returns the audience the OIDC tokens issued representing
|
|
// ServiceAccounts should have. This is usually a string that represents
|
|
// the cloud provider's STS service, or some entity in the provider for
|
|
// which the OIDC tokens are targeted to.
|
|
GetAudience(ctx context.Context, sa corev1.ServiceAccount) (string, error)
|
|
|
|
// GetIdentity takes a ServiceAccount and returns the identity which the
|
|
// ServiceAccount wants to impersonate, by looking at annotations.
|
|
GetIdentity(sa corev1.ServiceAccount) (string, error)
|
|
|
|
// NewToken takes a ServiceAccount and its OIDC token and returns a token
|
|
// that can be used to authenticate with the cloud provider. The OIDC token is
|
|
// the JWT token that was issued for the ServiceAccount by the Kubernetes API.
|
|
// The implementation should exchange this token for a cloud provider access
|
|
// token through the provider's STS service.
|
|
NewTokenForServiceAccount(ctx context.Context, oidcToken string,
|
|
sa corev1.ServiceAccount, opts ...Option) (Token, error)
|
|
|
|
// GetImageCacheKey extracts the part of the image repository that must be
|
|
// included in cache keys when caching registry credentials for the provider.
|
|
GetImageCacheKey(imageRepository string) string
|
|
|
|
// NewRegistryToken takes an image repository and a Token and returns a token
|
|
// that can be used to authenticate with the container registry of the image.
|
|
NewRegistryToken(ctx context.Context, imageRepository string,
|
|
token Token, opts ...Option) (Token, error)
|
|
}
|
|
```
|
|
|
|
The file `auth/options.go` would contain the following options:
|
|
|
|
```go
|
|
package auth
|
|
|
|
// Options contains options for configuring the behavior of the provider methods.
|
|
// Not all providers/methods support all options.
|
|
type Options struct {
|
|
ServiceAccount *client.ObjectKey
|
|
Client client.Client
|
|
Cache *cache.TokenCache
|
|
InvolvedObject *cache.InvolvedObject
|
|
Scopes []string
|
|
ImageRepository string
|
|
STSEndpoint string
|
|
ProxyURL *url.URL
|
|
}
|
|
|
|
// WithServiceAccount sets the ServiceAccount reference for the token
|
|
// and a controller-runtime client to fetch the ServiceAccount and
|
|
// create an OIDC token for it in the Kubernetes API.
|
|
func WithServiceAccount(saRef client.ObjectKey, client client.Client) Option {
|
|
// ...
|
|
}
|
|
|
|
// WithCache sets the token cache and the involved object for recording events.
|
|
func WithCache(cache cache.TokenCache, involvedObject cache.InvolvedObject) Option {
|
|
// ...
|
|
}
|
|
|
|
// WithScopes sets the scopes for the token.
|
|
func WithScopes(scopes ...string) Option {
|
|
// ...
|
|
}
|
|
|
|
// WithImageRepository sets the image repository the token will be used for.
|
|
// In most cases container registry credentials require an additional
|
|
// token exchange at the end. This option allows the library to implement
|
|
// this exchange and cache the final token.
|
|
func WithImageRepository(imageRepository string) Option {
|
|
// ...
|
|
}
|
|
|
|
// WithSTSEndpoint sets the endpoint for the STS service.
|
|
func WithSTSEndpoint(stsEndpoint string) Option {
|
|
// ...
|
|
}
|
|
|
|
// WithProxyURL sets a *url.URL for an HTTP/S proxy for acquiring the token.
|
|
func WithProxyURL(proxyURL url.URL) Option {
|
|
// ...
|
|
}
|
|
```
|
|
|
|
The `auth/aws/aws.go`, `auth/azure/azure.go` and
|
|
`auth/gcp/gcp.go` files would contain the implementations for
|
|
the respective cloud providers:
|
|
|
|
```go
|
|
package aws
|
|
|
|
import (
|
|
"github.com/aws/aws-sdk-go-v2/aws"
|
|
"github.com/aws/aws-sdk-go-v2/credentials"
|
|
"github.com/aws/aws-sdk-go-v2/service/sts/types"
|
|
)
|
|
|
|
const ProviderName = "aws"
|
|
|
|
type Provider struct{}
|
|
|
|
type Token struct{ types.Credentials }
|
|
|
|
// GetDuration implements auth.Token.
|
|
func (t *Token) GetDuration() time.Duration {
|
|
return time.Until(*t.Expiration)
|
|
}
|
|
|
|
type credentialsProvider struct {
|
|
opts []auth.Option
|
|
}
|
|
|
|
// NewCredentialsProvider creates an aws.CredentialsProvider for the aws provider.
|
|
func NewCredentialsProvider(opts ...auth.Option) aws.CredentialsProvider {
|
|
return &credentialsProvider{opts}
|
|
}
|
|
|
|
// Retrieve implements aws.CredentialsProvider.
|
|
func (c *credentialsProvider) Retrieve(ctx context.Context) (aws.Credentials, error) {
|
|
// Use auth.GetToken() to get the token.
|
|
}
|
|
```
|
|
|
|
```go
|
|
package azure
|
|
|
|
import (
|
|
"github.com/Azure/azure-sdk-for-go/sdk/azcore"
|
|
"github.com/Azure/azure-sdk-for-go/sdk/azcore/policy"
|
|
)
|
|
|
|
const ProviderName = "azure"
|
|
|
|
type Provider struct{}
|
|
|
|
type Token struct{ azcore.AccessToken }
|
|
|
|
// GetDuration implements auth.Token.
|
|
func (t *Token) GetDuration() time.Duration {
|
|
return time.Until(t.ExpiresOn)
|
|
}
|
|
|
|
type tokenCredential struct {
|
|
opts []auth.Option
|
|
}
|
|
|
|
// NewTokenCredential creates an azcore.TokenCredential for the azure provider.
|
|
func NewTokenCredential(opts ...auth.Option) azcore.TokenCredential {
|
|
return &tokenCredential{opts}
|
|
}
|
|
|
|
// GetToken implements azcore.TokenCredential.
|
|
// The options argument is ignored, any options should be
|
|
// specified in the constructor.
|
|
func (t *tokenCredential) GetToken(ctx context.Context, _ policy.TokenRequestOptions) (azcore.AccessToken, error) {
|
|
// Use auth.GetToken() to get the token.
|
|
}
|
|
```
|
|
|
|
```go
|
|
package gcp
|
|
|
|
import (
|
|
"golang.org/x/oauth2"
|
|
)
|
|
|
|
const ProviderName = "gcp"
|
|
|
|
type Provider struct {}
|
|
|
|
type Token struct{ oauth2.Token }
|
|
|
|
// GetDuration implements auth.Token.
|
|
func (t *Token) GetDuration() time.Duration {
|
|
return time.Until(t.Expiry)
|
|
}
|
|
|
|
type tokenSource struct {
|
|
ctx context.Context
|
|
opts []auth.Option
|
|
}
|
|
|
|
// NewTokenSource creates an oauth2.TokenSource for the gcp provider.
|
|
func NewTokenSource(ctx context.Context, opts ...auth.Option) oauth2.TokenSource {
|
|
return &tokenSource{ctx, opts}
|
|
}
|
|
|
|
// Token implements oauth2.TokenSource.
|
|
func (t *tokenSource) Token() (*oauth2.Token, error) {
|
|
// Use auth.GetToken() to get the token.
|
|
}
|
|
|
|
var gkeMetadata struct {
|
|
projectID string
|
|
location string
|
|
name string
|
|
mu sync.Mutex
|
|
loaded bool
|
|
}
|
|
```
|
|
|
|
As detailed above, each cloud provider implementation defines a simple wrapper
|
|
around the cloud provider access token type. This wrapper implements the
|
|
`auth.Token` interface, which is essentially the method `GetDuration()`
|
|
for the cache library to manage the token lifetime. The wrappers also contain
|
|
a helper function to create a token source for the respective cloud provider
|
|
SDKs. These methods have different names and signatures because the cloud provider
|
|
SDKs are different and have different types, but they all implement the same
|
|
concept of a token source.
|
|
|
|
The `aws` provider needs to read the environment variable `AWS_REGION` for
|
|
configuring the STS client. Even though a specific STS endpoint may be
|
|
configured, the AWS SDKs require the region to be set regardless. This
|
|
variable is usually set automatically in EKS pods, and can be manually set
|
|
by users otherwise (e.g. in Fargate pods).
|
|
|
|
An important detail to take into account in the `azure` provider implementation
|
|
is using our custom implementation of `azidentity.NewDefaultAzureCredential()`
|
|
found in kustomize-controller for SOPS decryption. This custom implementation
|
|
avoids shelling out to the Azure CLI, which is something we strive to avoid in
|
|
the Flux codebase. This is important because today we are doing this in a few
|
|
APIs but not others, so it will be a significant improvement to implement this
|
|
in a single place and use it everywhere.
|
|
|
|
The `gcp` provider needs to load the cluster metadata from the `gke-metadata-server`
|
|
in order to create tokens. This must be done lazily when the first token is
|
|
requested, and there's a very important reason for this: if this was done on
|
|
the controller startup, the controller would crash when running outside GKE and
|
|
enter `CrashLoopBackOff` because the `gke-metadata-server` would never be
|
|
available. This is a very important detail that must be taken into account when
|
|
implementing the `gcp` provider. The cluster metadata doesn't change during the
|
|
lifetime of the controller pod, so we use a `sync.Mutex` and `bool` to load it
|
|
only once into a package variable.
|
|
|
|
#### Cache Key
|
|
|
|
The cache key must include the following components:
|
|
|
|
* The cloud provider name.
|
|
* The provider audience used for issuing the Kubernetes `ServiceAccount` token.
|
|
* The optional `ServiceAccount` reference and cloud provider identity.
|
|
The identity is the string representing the identity which the `ServiceAccount`
|
|
is impersonating, e.g. for `gcp` this would be a GCP IAM Service Account email,
|
|
for `aws` this would be an AWS IAM Role ARN, etc. When there is no identity
|
|
configured for impersonation, only the `ServiceAccount` reference is included.
|
|
* The optional scopes added to the token.
|
|
* The cache key extracted from the optional image repository.
|
|
* The optional STS endpoint used for issuing the token.
|
|
* The optional proxy URL when the STS endpoint is present.
|
|
|
|
##### Justification
|
|
|
|
When single-tenant workload identity is being used, the identity associated with
|
|
the controller is the one represented by the token, so there is no identity or
|
|
`ServiceAccount` to identify in the cache key besides the implicit ones associated
|
|
with the controller. In this case, including only the cloud provider name in the
|
|
cache key is enough.
|
|
|
|
The provider audience used for issuing the `ServiceAccount` token is included
|
|
in the cache key because it may depend on the `ServiceAccount` annotations.
|
|
For example, in AWS if an IAM Role ARN is not specified we assume that users
|
|
are attempting to use EKS Pod Identity instead of IAM Roles for Service
|
|
Accounts. Each feature has its own audience string and its own way of issuing
|
|
tokens, so the audience string must be included in the cache key.
|
|
|
|
In multi-tenant workload identity, the reason for including both the `ServiceAccount`
|
|
and the identity in the cache key is to establish the fact that the `ServiceAccount`
|
|
had permission to impersonate the identity at the time when the token was issued.
|
|
This is very important. For the sake of the argument, suppose we include only the
|
|
identity. Then a malicious actor could specify any identity in their `ServiceAccount`
|
|
and get a token cached for that identity even if their `ServiceAccount` did not have
|
|
permission to impersonate that identity. We also need to include the identity in the
|
|
cache key because, otherwise, if including only the `ServiceAccount`, changes to the
|
|
`ServiceAccount` annotations to impersonate a different identity would not cause a
|
|
new token impersonating the new identity to be created since the cache key did not
|
|
change.
|
|
|
|
In most cases container registry credentials require an additional token exchange
|
|
at the end. In order to benefit from caching the final token and freeing the
|
|
library consumers from this responsibility, we allow an image repository to
|
|
be included in the options and implement the exchange. Depending on the cloud
|
|
provider, a part of the image repository string is extracted and used to issue
|
|
the token, e.g. for ECR the region is extracted and used to configure the client,
|
|
and in the case of ACR the registry host is included in the resulting token.
|
|
Those parts of the image repository must be included in the cache key. This is
|
|
accomplished by the `Provider.GetImageCacheKey()` method. In the case of GCP
|
|
container registries the image repository does not influence how the token is
|
|
issued.
|
|
|
|
The scopes are included in the cache key because they delimit the permissions that
|
|
the token has. They don't *grant* the permissions, they just set an upper bound for
|
|
the permissions that the token can have. Providers requiring scopes unfortunately
|
|
benefit less from caching, e.g. a token issued for an Azure identity can't be
|
|
seamlessly used for both Azure DevOps and the Azure Container Registry, because the
|
|
respective scopes are different, so the issued tokens are different.
|
|
|
|
The STS endpoint and proxy URL are included in the cache key because they could
|
|
influence how the token is fetched and ultimately issued. The proxy URL is included
|
|
only when the STS endpoint is present, because all the default STS endpoints are
|
|
HTTPS and belong to cloud providers, so they are all well-known, unique, and the
|
|
proxy is guaranteed not to tamper with the issuance of the token since it only
|
|
sees an opaque TLS session passing through.
|
|
|
|
##### Format
|
|
|
|
The cache key would be the SHA256 hash of the following string (breaking lines
|
|
after commas for readability):
|
|
|
|
Single-tenant/controller-level:
|
|
|
|
```
|
|
provider=<cloud-provider-name>,
|
|
scopes=<comma-separated-scopes>,
|
|
imageRepositoryKey=<'gcp'-for-gcp|registry-region-for-aws|registry-host-for-azure>,
|
|
stsEndpoint=<sts-endpoint>,
|
|
proxyURL=<proxy-url>
|
|
```
|
|
|
|
Multi-tenant/object-level:
|
|
|
|
```
|
|
provider=<cloud-provider-name>,
|
|
providerAudience=<cloud-provider-audience>,
|
|
serviceAccountName=<service-account-name>,
|
|
serviceAccountNamespace=<service-account-namespace>,
|
|
cloudProviderIdentity=<cloud-provider-identity>,
|
|
scopes=<comma-separated-scopes>,
|
|
imageRepositoryKey=<'gcp'-for-gcp|registry-region-for-aws|registry-host-for-azure>,
|
|
stsEndpoint=<sts-endpoint>,
|
|
proxyURL=<proxy-url>
|
|
```
|
|
|
|
##### Security Considerations and Controls
|
|
|
|
As mentioned previously, a `ServiceAccount` must have permission to impersonate the
|
|
identity it is configured to impersonate. Once a token for the impersonated identity
|
|
is issued, that token would be valid for a while even if immediately after issuing it
|
|
the `ServiceAccount` loses permission to impersonate that identity. In our cache key
|
|
design, the token would remain available for the `ServiceAccount` to use until it
|
|
expires. If the impersonation permission was revoked to mitigate an attack, the
|
|
attacker could still get a valid token from the cache for a while after the
|
|
revocation, and hence still exercise the permissions they had prior to the revocation.
|
|
|
|
There are a few mitigations for this scenario:
|
|
|
|
* Users that revoke impersonation permissions for a `ServiceAccount` must also
|
|
change the annotations of the `ServiceAccount` to impersonate a different identity,
|
|
or delete the `ServiceAccount` altogether, or restart the Flux controllers so the
|
|
cache is purged. Any of these actions would effectively prevent the attack, but
|
|
they represent an additional step after revoking the impersonation permission.
|
|
|
|
* In the Flux controllers users can specify the `--token-cache-max-duration` flag,
|
|
which can be used to limit the maximum duration for which a token can be cached.
|
|
By reducing the default maximum duration of one hour to a smaller value, users can
|
|
limit the time window during which a token would be available for a `ServiceAccount`
|
|
to use after losing permission to impersonate the identity.
|
|
|
|
* Disable cache entirely by setting the flag `--token-cache-max-size=0`, or removing
|
|
this flag altogether since the default is already zero i.e. no tokens are cached
|
|
in the Flux controller. This mitigation is in case your security requirements are
|
|
extreme and you want to avoid any risk of such an attack. This mitigation is the
|
|
most effective, but it comes with the cost of many API calls to issue tokens in
|
|
the cloud provider, which could result in a performance bottleneck and/or
|
|
throttling/rate-limiting, as tokens would have to be issued for every
|
|
reconciliation.
|
|
|
|
A similar situation could occur in the single-tenant scenario, when the permission
|
|
to impersonate the configured identity is revoked from the controller `ServiceAccount`.
|
|
In this case, the attacker would have access to the cloud provider resources that
|
|
the controller had access to prior to the revocation of the impersonation permission.
|
|
Most of the mitigations mentioned above apply to this scenario as well, except for
|
|
the one that involves changing the annotations of the `ServiceAccount` to impersonate
|
|
a different identity or deleting the `ServiceAccount` altogether, as the controller
|
|
`ServiceAccount` should not be deleted. The best mitigation in this case is to restart
|
|
the Flux controllers so the cache is purged.
|
|
|
|
**EKS Pod Identity**: In EKS Pod Identity the association between a `ServiceAccount`
|
|
and an IAM Role is not configured on the `ServiceAccount` annotations, nor anywhere
|
|
else inside the Kubernetes cluster. The association is established entirely through
|
|
the EKS/IAM APIs. In this case, all the mitigations mentioned above apply, except
|
|
for the one that involves changing the annotations of the `ServiceAccount`, as there
|
|
are no annotations to change.
|
|
|
|
### Library Integration
|
|
|
|
When reconciling an object, the controller must use the `auth.GetToken()`
|
|
function passing a `controller-runtime` client that has permission to create
|
|
`ServiceAccount` tokens in the Kubernetes API, the desired cloud provider by name,
|
|
and all the remaining options according to the configuration of the controller and
|
|
of the object. The provider names match the ones used for `spec.provider` in the Flux
|
|
APIs, i.e. `aws`, `azure` and `gcp`.
|
|
|
|
Because different cloud providers have different ways of representing their access
|
|
tokens (e.g. Azure and GCP tokens are a single opaque string while AWS has three
|
|
strings: access key ID, secret access key and token), consumers of the
|
|
`auth.Token` interface would need to cast it to `*<provider>.Token`.
|
|
|
|
The following subsections show details of how the integration would look like.
|
|
|
|
#### `GitRepository` and `ImageUpdateAutomation` APIs
|
|
|
|
For these APIs the only provider we have so far that supports workload identity
|
|
is `azure`. In this case we would simply replace `AzureOpts []azure.OptFunc` in
|
|
the `fluxcd/pkg/git.ProviderOptions` struct with `[]fluxcd/pkg/auth.Option`
|
|
and would modify `fluxcd/pkg/git.GetCredentials()` to use `auth.GetToken()`.
|
|
The token interface would be cast to `*azure.Token` and the token string would be
|
|
assigned to `fluxcd/pkg/git.Credentials.BearerToken`. A `GitRepository` object
|
|
configured with the `azure` provider and a `ServiceAccount` would then go through
|
|
this code path.
|
|
|
|
#### `OCIRepository`, `ImageRepository`, `HelmRepository` and `HelmChart` APIs
|
|
|
|
The `HelmRepository` API only supports a cloud provider for OCI repositories, so
|
|
for all these APIs we would only need to support OCI authentication.
|
|
|
|
All these APIs currently use `*fluxcd/pkg/oci/auth/login.Manager` to get the
|
|
container registry credentials. The new library would replace this library
|
|
entirely, as it mostly handles single-tenant workload identity. The new library
|
|
covers both single-tenant and multi-tenant workload identity, so it would be
|
|
a drop-in replacement for the `login.Manager`.
|
|
|
|
In the case of the source-controller APIs, all of them use the function `OIDCAuth()`
|
|
from the internal package `internal/oci`. We would replace the use of `login.Manager`
|
|
with `auth.GetToken()` in this function. The token interface would
|
|
be cast to `*auth.RegistryCredentials` and then fed to `authn.FromConfig()`
|
|
from the package `github.com/google/go-containerregistry/pkg/authn`.
|
|
|
|
In the case of `ImageRepository`, we would replace `login.Manager` with
|
|
`auth.GetToken()` in the `setAuthOptions()` method of the
|
|
`ImageRepositoryReconciler`, cast the token to `*auth.RegistryCredentials`
|
|
and then feed it to `authn.FromConfig()`.
|
|
|
|
The beauty of this particular integration is that here we no longer require
|
|
branching code paths for each cloud provider, we would just need to configure
|
|
the options for the `auth.GetToken()` function and the library would take
|
|
care of the rest.
|
|
|
|
#### `Bucket` API
|
|
|
|
##### Provider `aws`
|
|
|
|
A `Bucket` object configured with the `aws` provider and a `ServiceAccount` would
|
|
cause the internal `minio.MinioClient` of source-controller to be created with the
|
|
following new options:
|
|
|
|
* `minio.WithTokenClient(controller-runtime/pkg/client.Client)`
|
|
* `minio.WithTokenCache(*fluxcd/pkg/cache.TokenCache)`
|
|
|
|
The constructor would then use `auth.GetToken()` to get the
|
|
cloud provider access token. When doing so, the `minio.MinioClient` would
|
|
cast the token interface to `*aws.Token` and feed it to `credentials.NewStatic()`
|
|
from the package `github.com/minio/minio-go/v7/pkg/credentials`.
|
|
|
|
##### Provider `azure`
|
|
|
|
A `Bucket` object configured with the `azure` provider and a `ServiceAccount`
|
|
would cause the internal `azure.BlobClient` of source-controller to be created
|
|
with the following new options:
|
|
|
|
* `azure.WithTokenClient(controller-runtime/pkg/client.Client)`
|
|
* `azure.WithTokenCache(*fluxcd/pkg/cache.TokenCache)`
|
|
* `azure.WithServiceAccount(controller-runtime/pkg/client.ObjectKey)`
|
|
* `azure.WithInvolvedObject(*fluxcd/pkg/cache.InvolvedObject)`
|
|
|
|
The constructor would then use `azure.NewTokenCredential()` to feed this
|
|
token credential to `azblob.NewClient()`.
|
|
|
|
##### Provider `gcp`
|
|
|
|
A `Bucket` object configured with the `gcp` provider and a `ServiceAccount`
|
|
would cause the internal `gcp.GCSClient` of source-controller to be created
|
|
with the following new options:
|
|
|
|
* `gcp.WithTokenClient(controller-runtime/pkg/client.Client)`
|
|
* `gcp.WithTokenCache(*fluxcd/pkg/cache.TokenCache)`
|
|
* `gcp.WithServiceAccount(controller-runtime/pkg/client.ObjectKey)`
|
|
* `gcp.WithInvolvedObject(*fluxcd/pkg/cache.InvolvedObject)`
|
|
|
|
The constructor would then use `gcp.NewTokenSource()` to feed this token
|
|
source to the `option.WithTokenSource()` and pass it to
|
|
`cloud.google.com/go/storage.NewClient()`.
|
|
|
|
#### `Kustomization` API
|
|
|
|
The `Kustomization` API uses Key Management Services (KMS) for decrypting
|
|
SOPS secrets. The internal packages `internal/decryptor` and `internal/sops`
|
|
of kustomize-controller already use interfaces compatible with the new
|
|
library in the case of `aws` and `azure`, i.e. `*awskms.CredentialsProvider`
|
|
and `*azkv.TokenCredential` respectively, so we could easily use the helper
|
|
functions for creating the respective token sources to configure the KMS
|
|
credentials for SOPS. This is thanks to the respective SOPS libraries
|
|
`github.com/getsops/sops/v3/kms` and `github.com/getsops/sops/v3/azkv`.
|
|
For GCP we can introduce the equivalent interface that was recently added
|
|
in [this](https://github.com/getsops/sops/pull/1794/files) pull request.
|
|
This new interface introduced in SOPS upstream can also be used for the
|
|
current JSON credentials method that we use via
|
|
`google.CredentialsFromJSON().TokenSource`. This would allow us to use only
|
|
the respective token source interfaces for all three providers when using
|
|
either workload identity or secrets.
|
|
|
|
#### `Provider` API
|
|
|
|
The constructor of the internal `notifier.Factory` of notification-controller
|
|
would now accept the following new options:
|
|
|
|
* `notifier.WithTokenClient(controller-runtime/pkg/client.Client)`
|
|
* `notifier.WithTokenCache(*fluxcd/pkg/cache.TokenCache)`
|
|
* `notifier.WithServiceAccount(controller-runtime/pkg/client.ObjectKey)`
|
|
* `notifier.WithInvolvedObject(*fluxcd/pkg/cache.InvolvedObject)`
|
|
|
|
The cloud provider types that support workload identity would then use these
|
|
options. See the following subsections for details.
|
|
|
|
##### Type `azuredevops`
|
|
|
|
The `notifier.NewAzureDevOps()` constructor would use the existing and new
|
|
options to call `auth.GetToken()` and use it to get the cloud
|
|
provider access token. When doing so, the `notifier.AzureDevOps` would cast
|
|
the token interface to `*azure.Token` and feed the token string to
|
|
`NewPatConnection()` from the package
|
|
`github.com/microsoft/azure-devops-go-api/azuredevops/v6`.
|
|
|
|
##### Type `azureeventhub`
|
|
|
|
The `notifier.NewAzureEventHub()` constructor would use the existing and new
|
|
options to call `auth.GetToken()` and use it to get the cloud
|
|
provider access token. When doing so, the `notifier.AzureEventHub` would cast
|
|
the token interface to `*azure.Token` and feed the token string to `newJWTHub()`.
|
|
|
|
##### Type `googlepubsub`
|
|
|
|
The `notifier.NewGooglePubSub()` constructor would use the existing and new
|
|
options to call `gcp.NewTokenSource()` and feed this token source to the
|
|
`option.WithTokenSource()` and pass it to `cloud.google.com/go/pubsub.NewClient()`.
|
|
|
|
## Implementation History
|
|
|
|
A realistic estimate for implementing this proposal would be from two to
|
|
three Flux minor releases. This is so we can work on more pressing priorities
|
|
while still making progress towards this milestone. The implementation of
|
|
the core library would be done in the first release, and the integration
|
|
with the Flux APIs would be spread across all these releases. All the three
|
|
cloud providers should be implemented for each API getting this feature in
|
|
any given release. Our first priority should be `Kustomization`, as it is
|
|
where security is most important since it deals with secrets.
|
|
|
|
<!--
|
|
Major milestones in the lifecycle of the RFC such as:
|
|
- The first Flux release where an initial version of the RFC was available.
|
|
- The version of Flux where the RFC graduated to general availability.
|
|
- The version of Flux where the RFC was retired or superseded.
|
|
-->
|