# RFC-0010 Multi-Tenant Workload Identity

**Status:** implementable

**Creation date:** 2025-02-22

**Last update:** 2025-04-14
## Summary
In this RFC we aim to add support for multi-tenant workload identity in Flux, i.e. the ability to specify at the object-level which set of cloud provider permissions must be used for interacting with the respective cloud provider on behalf of the reconciliation of the object. In this process, credentials must be obtained automatically, i.e. this feature must not involve the use of secrets. This would be useful in a number of Flux APIs that need to interact with cloud providers, spanning all the Flux controllers except for helm-controller.
## Multi-Tenancy Model
In the context of this RFC, multi-tenancy refers to the ability of a single Flux instance running inside a Kubernetes cluster to manage Flux objects belonging to all the tenants in the cluster while still ensuring that each tenant has access only to their own resources according to the Least Privilege Principle. In this scenario a tenant is often a team inside an organization, so the reader can consider the multi-team tenancy model. Each team has their own namespaces, which are not shared with other teams.
## Motivation
Flux has strong multi-tenancy features. For example, the `Kustomization` and `HelmRelease` APIs support the field `spec.serviceAccountName` for specifying the Kubernetes `ServiceAccount` to impersonate when interacting with the Kubernetes API on behalf of a tenant, e.g. when applying resources. This allows tenants to be constrained under the Kubernetes RBAC permissions granted to this `ServiceAccount`, and therefore to have access only to the specific subset of resources they should be allowed to use.
Besides the Kubernetes API, Flux also interacts with cloud providers, e.g. container registries, object storage, pub/sub services, etc. For these interactions, Flux currently supports essentially two modes of authentication:

- **Secret-based multi-tenant authentication:** Objects have the field `spec.secretRef` for specifying the Kubernetes `Secret` containing the credentials to use when interacting with the cloud provider. This is similar to the `spec.serviceAccountName` field, but for cloud providers. The problem with this approach is that secrets are a security risk and an operational burden, as they must be managed and rotated.
- **Workload-identity-based single-tenant authentication:** Flux offers single-tenant workload identity support by configuring the `ServiceAccount` of the Flux controllers to impersonate a cloud identity. This eliminates the need for secrets, as the credentials are obtained automatically by the cloud provider Go libraries used by the Flux controllers when running inside the respective cloud environment. The problem with this approach is that it is single-tenant, i.e. all objects are reconciled using the same cloud identity: the one associated with the respective controller.
For delivering the high level of security and multi-tenancy support that Flux aims for, it is necessary to extend the workload identity support to be multi-tenant. This means that each object must be able to specify which cloud identity must be impersonated when interacting with the cloud provider on behalf of the reconciliation of the object. This would allow tenants to be constrained under the cloud provider permissions granted to this identity, and therefore have access only to the specific subset of resources they are allowed to manage.
## Goals
Provide multi-tenant workload identity support in Flux, i.e. the ability to specify at the object-level which cloud identity must be impersonated to interact with the respective cloud provider on behalf of the reconciliation of the object, without the need for secrets.
## Non-Goals
It's not a goal to provide multi-tenant workload identity federation support. The (small) difference between workload identity and workload identity federation is that the former assumes that the workloads are running inside the cloud environment, while the latter assumes that the workloads are running outside the cloud environment. All the major cloud providers support both, as the majority of the underlying technology is the same, but the configuration is slightly different. Because the differences are small we may consider workload identity federation support in the future, but it's not a goal for this RFC.
## Proposal
To support multi-tenant workload identity at the object level for the Flux APIs, we propose associating the Flux objects with Kubernetes `ServiceAccounts`. The controller would need to create a token for the `ServiceAccount` associated with the object in the Kubernetes API, and then exchange it for a short-lived access token for the cloud provider. This would require the controller `ServiceAccount` to have RBAC permission to create tokens for any `ServiceAccount` in the cluster.
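As a minimal sketch of the first half of this flow, the token-creation step maps directly onto the Kubernetes TokenRequest API (the helper name below is illustrative, not part of the proposal):

```go
import (
	"context"
	"fmt"

	authenticationv1 "k8s.io/api/authentication/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// createServiceAccountToken asks the TokenRequest API for a short-lived
// OIDC token bound to the given ServiceAccount and audience. The returned
// JWT is what would then be exchanged with the cloud provider's STS.
func createServiceAccountToken(ctx context.Context, c kubernetes.Interface,
	namespace, name, audience string) (string, error) {
	req := &authenticationv1.TokenRequest{
		Spec: authenticationv1.TokenRequestSpec{
			Audiences: []string{audience}, // e.g. the provider's STS audience
		},
	}
	resp, err := c.CoreV1().ServiceAccounts(namespace).CreateToken(ctx, name, req, metav1.CreateOptions{})
	if err != nil {
		return "", fmt.Errorf("creating token for %s/%s: %w", namespace, name, err)
	}
	return resp.Status.Token, nil
}
```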
### User Stories
#### Story 1
As a cluster administrator, I want to allow tenant A to pull OCI artifacts from the Amazon ECR repository belonging to tenant A, but only from this repository. At the same time, I want to allow tenant B to pull OCI artifacts from the Amazon ECR repository belonging to tenant B, but only from this repository.
For example, I would like to have the following configuration:
```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: tenant-a-repo
  namespace: tenant-a
spec:
  ...
  provider: aws
  serviceAccountName: tenant-a-ecr-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-a-ecr-sa
  namespace: tenant-a
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789123:role/tenant-a-ecr
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: tenant-b-repo
  namespace: tenant-b
spec:
  ...
  provider: aws
  serviceAccountName: tenant-b-ecr-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-b-ecr-sa
  namespace: tenant-b
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789123:role/tenant-b-ecr
```
#### Story 2
As a cluster administrator, I want to allow tenant A to pull and push to the Git repository in Azure DevOps belonging to tenant A, but only this repository. At the same time, I want to allow tenant B to pull and push to the Git repository in Azure DevOps belonging to tenant B, but only this repository.
For example, I would like to have the following configuration:
```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: tenant-a-repo
  namespace: tenant-a
spec:
  ...
  provider: azure
  serviceAccountName: tenant-a-azure-devops-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-a-azure-devops-sa
  namespace: tenant-a
  annotations:
    azure.workload.identity/client-id: d6e4fc00-c5b2-4a72-9f84-6a92e3f06b08 # client ID for my tenant A
    azure.workload.identity/tenant-id: 72f988bf-86f1-41af-91ab-2d7cd011db47 # azure tenant for the cluster (optional, defaults to the env var AZURE_TENANT_ID set in the controller)
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: tenant-a-image-update
  namespace: tenant-a
spec:
  ...
  sourceRef:
    kind: GitRepository
    name: tenant-a-repo
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: tenant-b-repo
  namespace: tenant-b
spec:
  ...
  provider: azure
  serviceAccountName: tenant-b-azure-devops-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-b-azure-devops-sa
  namespace: tenant-b
  annotations:
    azure.workload.identity/client-id: 4a7272f9-f186-41af-9f84-6a92e32d7cd0 # client ID for my tenant B
    azure.workload.identity/tenant-id: 72f988bf-86f1-41af-91ab-2d7cd011db47 # azure tenant for the cluster (optional, defaults to the env var AZURE_TENANT_ID set in the controller)
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: tenant-b-image-update
  namespace: tenant-b
spec:
  ...
  sourceRef:
    kind: GitRepository
    name: tenant-b-repo
```
#### Story 3
As a cluster administrator, I want to allow tenant A to pull manifests from the GCS bucket belonging to tenant A, but only from this bucket. At the same time, I want to allow tenant B to pull manifests from the GCS bucket belonging to tenant B, but only from this bucket.
For example, I would like to have the following configuration:
```yaml
apiVersion: source.toolkit.fluxcd.io/v1
kind: Bucket
metadata:
  name: tenant-a-bucket
  namespace: tenant-a
spec:
  ...
  provider: gcp
  serviceAccountName: tenant-a-gcs-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-a-gcs-sa
  namespace: tenant-a
  annotations:
    iam.gke.io/gcp-service-account: tenant-a-bucket@my-org-project.iam.gserviceaccount.com
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: Bucket
metadata:
  name: tenant-b-bucket
  namespace: tenant-b
spec:
  ...
  provider: gcp
  serviceAccountName: tenant-b-gcs-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-b-gcs-sa
  namespace: tenant-b
  annotations:
    iam.gke.io/gcp-service-account: tenant-b-bucket@my-org-project.iam.gserviceaccount.com
```
#### Story 4
As a cluster administrator, I want to allow tenant A to decrypt secrets using the AWS KMS key belonging to tenant A, but only this key. At the same time, I want to allow tenant B to decrypt secrets using the AWS KMS key belonging to tenant B, but only this key.
For example, I would like to have the following configuration:
```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: tenant-a-aws-kms
  namespace: tenant-a
spec:
  ...
  decryption:
    provider: sops
    serviceAccountName: tenant-a-aws-kms-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-a-aws-kms-sa
  namespace: tenant-a
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789123:role/tenant-a-kms
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: tenant-b-aws-kms
  namespace: tenant-b
spec:
  ...
  decryption:
    provider: sops
    serviceAccountName: tenant-b-aws-kms-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-b-aws-kms-sa
  namespace: tenant-b
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789123:role/tenant-b-kms
```
#### Story 5
As a cluster administrator, I want to allow tenant A to publish notifications to the `tenant-a` topic in Google Cloud Pub/Sub, but only to this topic. At the same time, I want to allow tenant B to publish notifications to the `tenant-b` topic in Google Cloud Pub/Sub, but only to this topic. I want to do so without creating any GCP IAM Service Accounts.
For example, I would like to have the following configuration:
```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: tenant-a-google-pubsub
  namespace: tenant-a
spec:
  ...
  type: googlepubsub
  serviceAccountName: tenant-a-google-pubsub-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-a-google-pubsub-sa
  namespace: tenant-a
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: tenant-b-google-pubsub
  namespace: tenant-b
spec:
  ...
  type: googlepubsub
  serviceAccountName: tenant-b-google-pubsub-sa
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-b-google-pubsub-sa
  namespace: tenant-b
```
### Alternatives
#### An alternative for identifying Flux resources in cloud providers
Instead of issuing `ServiceAccount` tokens in the Kubernetes API, we could come up with a username naming scheme for Flux resources and issue tokens for these usernames instead, e.g. `flux:<resource type>:<namespace>:<name>`. This would give each Flux object its own identity instead of using `ServiceAccounts` for this purpose, and would prevent Flux objects created by malicious actors in the same namespace from abusing the permissions granted to the `ServiceAccount` of the object.

This choice, however, would provide a worse user experience, as Flux and Kubernetes users are already used to the `ServiceAccount` resource being the identity for resources in the cluster, not only in the context of plain RBAC but also in the context of workload identity.

This choice would also require the introduction of new APIs for configuring the respective cloud identities in the Flux objects, when such APIs already exist as defined by the cloud providers themselves as annotations on the `ServiceAccount` resources. We therefore choose to stick with the well-known pattern of using `ServiceAccounts` for configuring the identities of the Flux resources. Furthermore, as mentioned in the Multi-Tenancy Model section, the tenant trust domains are namespaces, so a tenant is expected to control, and have access to, all the resources that `ServiceAccounts` in their namespaces are allowed to access.
#### Alternatives for modifying controller RBAC to create `ServiceAccount` tokens
In this section we discuss alternatives for changing the RBAC of controllers to create `ServiceAccount` tokens cluster-wide, as this has a potential impact on the security posture of Flux:

1. We grant RBAC permissions to the `ServiceAccounts` of the Flux controllers (that would implement multi-tenant workload identity) for creating tokens for any other `ServiceAccount` in the cluster (see the illustrative manifest below).
2. We require users to grant "self-impersonation" to the `ServiceAccounts` so they can create tokens for themselves. The controller would then impersonate the `ServiceAccount` when creating a token for it. This operation would only succeed if the `ServiceAccount` has been correctly granted permission to create a token for itself.
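For reference, alternative 1 amounts to a single cluster-wide RBAC rule per controller, along the lines of this illustrative manifest (the role name is hypothetical):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flux-serviceaccount-token-creator # illustrative name
rules:
  - apiGroups: [""]
    resources: ["serviceaccounts/token"]
    verbs: ["create"]
```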
In both alternatives the controller `ServiceAccount` would require some form of cluster-wide impersonation permission. Alternative 2 requires impersonation permission to be granted directly to the controller `ServiceAccount`, while in alternative 1 impersonation permission would be granted indirectly through the ability to create a token for another `ServiceAccount`. By creating a token for another `ServiceAccount`, the controller `ServiceAccount` effectively has the same permissions as the `ServiceAccount` it is creating the token for, as it could simply use the token to impersonate that `ServiceAccount`. It is therefore reasonable to affirm that both alternatives are equivalent in terms of security.
To break the tie between the two alternatives we observe that alternative 1 eliminates operational burden on users. In fact, native workload identity for pods does not require users to grant this self-impersonation permission to the `ServiceAccounts` of the pods. We therefore choose alternative 1.
## Design Details
For detailing the proposal we need to first introduce the technical background on how workload identity is implemented by the managed Kubernetes services from the cloud providers.
### Technical Background
Workload identity in Kubernetes is based on OpenID Connect Discovery (OIDC). The Kubernetes `ServiceAccount` token issuer, included as the `iss` JWT claim in the issued tokens and represented by the default URL `https://kubernetes.default.svc.cluster.local`, implements the OIDC discovery protocol. Essentially, this means that the Kubernetes API will respond to requests to the URL `https://kubernetes.default.svc.cluster.local/.well-known/openid-configuration` with a JSON document similar to the one below:
```json
{
  "issuer": "https://kubernetes.default.svc.cluster.local",
  "jwks_uri": "https://172.18.0.2:6443/openid/v1/jwks",
  "response_types_supported": [
    "id_token"
  ],
  "subject_types_supported": [
    "public"
  ],
  "id_token_signing_alg_values_supported": [
    "RS256"
  ]
}
```
And to the URL `https://172.18.0.2:6443/openid/v1/jwks`, discovered through the field `.jwks_uri` in the JSON response above, the Kubernetes API will respond with a JSON document similar to the following:
```json
{
  "keys": [
    {
      "use": "sig",
      "kty": "RSA",
      "kid": "NWm3YKmazJPVP7tttzkmSxUn0w8LGGp7yS2CanEF-A8",
      "alg": "RS256",
      "n": "lV2tbw9hnz1mseah2kMQNe5sRju4mPLlK0F7np97lLNC49G8yc5TMjyciLF3qsDNFCfWyYmsuGlcRg2BIBBX_jkpIUUjlsktdHhuqO2RnOqyRtNuljlT_b0QJgpgxCqq0DHI31EBc0JALOVd6EjjlhsVvVzZOw_b9KBXVS3D3RENuT0_FWauDq5NYbyYnjlvk-vUXCRMNDQSDNwx6X6bktwsmeDRXtM_bP3DokmnMYc4n0asTEg14L6VKky0ByF88Wi1-y0Pm0BHdobDGt1cIeUDeThk4E79JCHxkT5urAyYHcNwcfU4q-tnD6bTpNkFVsk3cqqK2nF7R_7ac5arSQ",
      "e": "AQAB"
    }
  ]
}
```
This JSON document contains the public keys for verifying the signature of the issued tokens.
By querying these two URLs in sequence, cloud providers are able to fetch the information required for verifying and trusting the tokens issued by the Kubernetes API. More specifically, for trusting the `sub` JWT claim, which contains the Kubernetes `ServiceAccount` reference (name and namespace) for which the token was issued, in the format `system:serviceaccount:<namespace>:<name>`.
By allowing permissions to be granted to `ServiceAccounts` in the cloud provider, the cloud provider is then able to allow Kubernetes `ServiceAccounts` to access its resources. This is usually done by a Security Token Service (STS) that exchanges the Kubernetes token for a short-lived cloud provider access token, which is then used to access the cloud provider resources.

It's important to mention that the Kubernetes `ServiceAccount` token issuer URL must be trusted by the cloud provider, i.e. users must configure this URL as a trusted identity provider.
This process forms the basis for workload identity in Kubernetes. As long as the issuer URL can be reached by the cloud provider, this process can take place successfully.
The reachability of the issuer URL by the cloud provider is where the implementation of workload identity starts to differ between cloud providers. For example, in GCP one can configure the content of the JWKS document directly in the GCP IAM console, which eliminates the need for network calls to the Kubernetes API. In AWS, on the other hand, this is not possible: the process has to be followed strictly, i.e. the issuer URL must be reachable by the AWS STS service.
Furthermore, GKE automatically creates the necessary trust relationship between the Kubernetes issuer and the GCP STS service (i.e. automatically injects the JWKS document of the GKE cluster in the STS database), while in EKS this must be done manually by users (an OIDC provider must be created for each EKS cluster).
Another difference is that the issuer URL remains the default/private one in GKE, while in EKS it is automatically set to a public one. This is done through the `--service-account-issuer` flag in the `kube-apiserver` command line arguments (docs). This is a nice feature, as it allows external systems to federate access for workloads running in EKS clusters, e.g. EKS workloads can have federated access to GCP resources.
Yet another difference between cloud providers that sheds light on our proposal is how applications running inside pods of the managed Kubernetes services obtain the short-lived cloud provider access tokens. In GCP, the GCP libraries used by the applications attempt to retrieve tokens from the metadata server, which is reachable by all pods running in GKE. This server creates a token for the `ServiceAccount` of the calling pod in the Kubernetes API, exchanges it for a short-lived GCP access token, and returns it to the application. In AKS, on the other hand, pods are mutated to include a token volume projection. The kubelet mounts and automatically rotates a volume with a token file inside the pod. The Azure libraries used by the applications then read this file periodically to perform the token exchange with the Azure STS service.
Another aspect of workload identity that is important for this RFC is how the cloud identities are associated with the Kubernetes `ServiceAccounts`. In most cases, an identity from the IAM service of the cloud provider (e.g. a GCP IAM Service Account, or an AWS IAM Role) is associated with a Kubernetes `ServiceAccount` through impersonation. Permission to impersonate the cloud identity is granted to the `ServiceAccount` through a configuration that points to the fully qualified name of the Kubernetes `ServiceAccount`, i.e. the name and namespace of the `ServiceAccount` and the cluster it belongs to in the name/address system of the cloud provider.

Because the cloud provider needs to support this impersonation permission, some cloud providers go further and remove the impersonation requirement entirely, by allowing permissions to be granted directly to `ServiceAccounts` (if a provider can support granting the impersonation permission, then it can probably also easily support granting any other permissions, depending on the implementation). GCP, for example, has implemented this feature recently: a GCP IAM Service Account is no longer required for workload identity, i.e. GCP IAM permissions can now be granted directly to Kubernetes `ServiceAccounts`. This is a significant improvement in the user experience, as it significantly reduces the required configuration steps. AWS implemented a similar feature called EKS Pod Identity, but it still requires an IAM Role to be associated with the `ServiceAccount`. The minor improvement from the user experience perspective is that this association is implemented entirely in the AWS EKS/IAM APIs; no annotations are required on the Kubernetes `ServiceAccount`. Another improvement of this EKS feature compared to IAM Roles for Service Accounts is that users no longer need to create an OIDC Provider for the EKS cluster in the IAM API.
In light of the technical background presented above, our proposal becomes simpler. The only solution to support multi-tenant workload identity at the object level for the Flux APIs is to associate the Flux objects with Kubernetes `ServiceAccounts`. We propose to build the `ServiceAccount` token creation and exchange logic into the Flux controllers through a library in the `github.com/fluxcd/pkg` repository.
### API Changes
For all the Flux APIs interacting with cloud providers (except `Kustomization`, see the paragraph below), we propose introducing the field `spec.serviceAccountName` (if not already present) for specifying the Kubernetes `ServiceAccount`, in the same namespace as the object, that must be used for getting access to the respective cloud resources. This field would be optional, and when not present the original behavior would be observed, i.e. the feature only activates when the field is present and a cloud provider among `aws`, `azure` or `gcp` is specified in the `spec.provider` field. So if only the `spec.provider` field is present and set to a cloud provider, the controller would use single-tenant workload identity as it would prior to the implementation of this RFC, i.e. it would use its own identity for the operation.

Note that this RFC does not seek to change the behavior when `spec.provider` is set to `generic` (or left empty, in which case it defaults to `generic`). In that case the field `spec.secretRef` can be used for specifying the Kubernetes `Secret` containing the credentials (or `spec.serviceAccountName` in the case of the APIs dealing with container registries, through the `imagePullSecrets` field of the `ServiceAccount`).
The `Kustomization` API uses Key Management Services (KMS) for decrypting SOPS-encrypted secrets. We propose adding the dedicated optional field `spec.decryption.serviceAccountName` for multi-tenant workload identity when interacting with the KMS service. We choose a dedicated field for the `Kustomization` API because the field `spec.serviceAccountName` already exists and is used for a major part of the functionality, which is authenticating with the Kubernetes API when applying resources. If we used the same field for both purposes, users would be forced to use multi-tenancy for both cloud and Kubernetes API interactions. Furthermore, the cloud provider in the `Kustomization` API is detected by the SOPS SDK itself while decrypting the secrets, so we don't need to introduce a new field for this purpose.
### Workload Identity Library
We propose using the Go package `github.com/fluxcd/pkg/auth` for implementing a workload identity library that can be used by all the Flux controllers that need to interact with cloud providers. This library would be responsible for creating the `ServiceAccount` tokens in the Kubernetes API and exchanging them for short-lived access tokens for the cloud provider. The library would also be responsible for caching the tokens when configured by users.
The library should support both single-tenant and multi-tenant workload identity because single-tenant implementations are already supported in GA APIs and hence they must remain available for backwards compatibility. Furthermore, it would be easier to support both use cases in a single library as opposed to mingling a new library into the currently existing ones, so this new library becomes the definitive unified solution for workload identity in Flux.
The library should automatically detect whether the workload identity is single-tenant or multi-tenant by checking if a `ServiceAccount` was configured for the operation. If a `ServiceAccount` was configured, then the operation is multi-tenant; otherwise it is single-tenant and the granted access token must represent the identity associated with the controller.
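For example, a controller call could look like the sketch below, using the provider and option names proposed later in this document (illustrative, not a final API):

```go
// Multi-tenant: a ServiceAccount reference is configured, so the library
// issues a Kubernetes token for it and exchanges it with the provider's STS.
token, err := auth.GetToken(ctx, gcp.Provider{},
	auth.WithServiceAccount(client.ObjectKey{Namespace: ns, Name: saName}, kubeClient),
	auth.WithCache(*tokenCache, involvedObject),
)

// Single-tenant: no ServiceAccount option, so the library falls back to
// the identity associated with the controller (NewDefaultToken).
token, err = auth.GetToken(ctx, gcp.Provider{})
```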
The directory structure would look like this:
```text
.
└── auth
    ├── aws
    │   └── aws.go
    ├── azure
    │   └── azure.go
    ├── gcp
    │   └── gcp.go
    ├── get_token.go
    ├── options.go
    ├── provider.go
    └── token.go
```
The file `auth/get_token.go` would contain the main algorithm:

```go
package auth

// GetToken returns an access token for accessing resources in the given cloud provider.
func GetToken(ctx context.Context, provider Provider, opts ...Option) (Token, error) {
	// 1. Check if a ServiceAccount is configured and return the controller access token if not (single-tenant WI).
	// 2. Get the provider audience for creating the OIDC token for the ServiceAccount in the Kubernetes API.
	// 3. Get the ServiceAccount using the configured controller-runtime client.
	// 4. Get the provider identity from the ServiceAccount annotations and add it to the options.
	// 5. Build the cache key using the configured options.
	// 6. Get the token from the cache. If present, return it, otherwise continue.
	// 7. Create an OIDC token for the ServiceAccount in the Kubernetes API using the provider audience.
	// 8. Exchange the OIDC token for an access token through the Security Token Service of the provider.
	// 9. If an image repository is configured, exchange the access token for a registry token.
	// 10. Add the final token to the cache and return it.
}
```
The file `auth/token.go` would contain the token abstractions:

```go
package auth

// Token is an interface that represents an access token that can be used to
// authenticate with a cloud provider. The only common method is for getting the
// duration of the token, because different providers have different ways of
// representing the token. For example, Azure and GCP use a single string,
// while AWS uses three strings: access key ID, secret access key and token.
// Consumers of this interface should know what type to cast it to.
type Token interface {
	// GetDuration returns the duration for which the token is valid relative to
	// approximately time.Now(). This is used to determine when the token should
	// be refreshed.
	GetDuration() time.Duration
}

// RegistryCredentials is a particular type implementing the Token interface
// for credentials that can be used to authenticate with a container registry
// from a cloud provider. This type is compatible with all the cloud providers
// and should be returned when the image repository is configured in the options.
type RegistryCredentials struct {
	Username  string
	Password  string
	ExpiresAt time.Time
}

func (r *RegistryCredentials) GetDuration() time.Duration {
	return time.Until(r.ExpiresAt)
}
```
The file `auth/provider.go` would contain the `Provider` interface:

```go
package auth

// Provider contains the logic to retrieve an access token for a cloud
// provider from a ServiceAccount (OIDC/JWT) token.
type Provider interface {
	// GetName returns the name of the provider.
	GetName() string

	// NewDefaultToken returns a token that can be used to authenticate with the
	// cloud provider retrieved from the default source, i.e. from the pod's
	// environment, e.g. files mounted in the pod, environment variables,
	// local metadata services, etc. In this case the method would implicitly
	// use the ServiceAccount associated with the controller pod, and not one
	// specified in the options.
	NewDefaultToken(ctx context.Context, opts ...Option) (Token, error)

	// GetAudience returns the audience the OIDC tokens issued representing
	// ServiceAccounts should have. This is usually a string that represents
	// the cloud provider's STS service, or some entity in the provider for
	// which the OIDC tokens are targeted to.
	GetAudience(ctx context.Context, sa corev1.ServiceAccount) (string, error)

	// GetIdentity takes a ServiceAccount and returns the identity which the
	// ServiceAccount wants to impersonate, by looking at annotations.
	GetIdentity(sa corev1.ServiceAccount) (string, error)

	// NewTokenForServiceAccount takes a ServiceAccount and its OIDC token and
	// returns a token that can be used to authenticate with the cloud provider.
	// The OIDC token is the JWT token that was issued for the ServiceAccount by
	// the Kubernetes API. The implementation should exchange this token for a
	// cloud provider access token through the provider's STS service.
	NewTokenForServiceAccount(ctx context.Context, oidcToken string,
		sa corev1.ServiceAccount, opts ...Option) (Token, error)

	// GetImageCacheKey extracts the part of the image repository that must be
	// included in cache keys when caching registry credentials for the provider.
	GetImageCacheKey(imageRepository string) string

	// NewRegistryToken takes an image repository and a Token and returns a token
	// that can be used to authenticate with the container registry of the image.
	NewRegistryToken(ctx context.Context, imageRepository string,
		token Token, opts ...Option) (Token, error)
}
```
The file `auth/options.go` would contain the following options:

```go
package auth

// Options contains options for configuring the behavior of the provider methods.
// Not all providers/methods support all options.
type Options struct {
	ServiceAccount  *client.ObjectKey
	Client          client.Client
	Cache           *cache.TokenCache
	InvolvedObject  *cache.InvolvedObject
	Scopes          []string
	ImageRepository string
	STSEndpoint     string
	ProxyURL        *url.URL
}

// WithServiceAccount sets the ServiceAccount reference for the token
// and a controller-runtime client to fetch the ServiceAccount and
// create an OIDC token for it in the Kubernetes API.
func WithServiceAccount(saRef client.ObjectKey, client client.Client) Option {
	// ...
}

// WithCache sets the token cache and the involved object for recording events.
func WithCache(cache cache.TokenCache, involvedObject cache.InvolvedObject) Option {
	// ...
}

// WithScopes sets the scopes for the token.
func WithScopes(scopes ...string) Option {
	// ...
}

// WithImageRepository sets the image repository the token will be used for.
// In most cases container registry credentials require an additional
// token exchange at the end. This option allows the library to implement
// this exchange and cache the final token.
func WithImageRepository(imageRepository string) Option {
	// ...
}

// WithSTSEndpoint sets the endpoint for the STS service.
func WithSTSEndpoint(stsEndpoint string) Option {
	// ...
}

// WithProxyURL sets a *url.URL for an HTTP/S proxy for acquiring the token.
func WithProxyURL(proxyURL url.URL) Option {
	// ...
}
```
The `auth/aws/aws.go`, `auth/azure/azure.go` and `auth/gcp/gcp.go` files would contain the implementations for the respective cloud providers:

```go
package aws

import (
	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/credentials"
	"github.com/aws/aws-sdk-go-v2/service/sts/types"
)

const ProviderName = "aws"

type Provider struct{}

type Token struct{ types.Credentials }

// GetDuration implements auth.Token.
func (t *Token) GetDuration() time.Duration {
	return time.Until(*t.Expiration)
}

type credentialsProvider struct {
	opts []auth.Option
}

// NewCredentialsProvider creates an aws.CredentialsProvider for the aws provider.
func NewCredentialsProvider(opts ...auth.Option) aws.CredentialsProvider {
	return &credentialsProvider{opts}
}

// Retrieve implements aws.CredentialsProvider.
func (c *credentialsProvider) Retrieve(ctx context.Context) (aws.Credentials, error) {
	// Use auth.GetToken() to get the token.
}
```

```go
package azure

import (
	"github.com/Azure/azure-sdk-for-go/sdk/azcore"
	"github.com/Azure/azure-sdk-for-go/sdk/azcore/policy"
)

const ProviderName = "azure"

type Provider struct{}

type Token struct{ azcore.AccessToken }

// GetDuration implements auth.Token.
func (t *Token) GetDuration() time.Duration {
	return time.Until(t.ExpiresOn)
}

type tokenCredential struct {
	opts []auth.Option
}

// NewTokenCredential creates an azcore.TokenCredential for the azure provider.
func NewTokenCredential(opts ...auth.Option) azcore.TokenCredential {
	return &tokenCredential{opts}
}

// GetToken implements azcore.TokenCredential.
// The options argument is ignored; any options should be
// specified in the constructor.
func (t *tokenCredential) GetToken(ctx context.Context, _ policy.TokenRequestOptions) (azcore.AccessToken, error) {
	// Use auth.GetToken() to get the token.
}
```

```go
package gcp

import (
	"golang.org/x/oauth2"
)

const ProviderName = "gcp"

type Provider struct{}

type Token struct{ oauth2.Token }

// GetDuration implements auth.Token.
func (t *Token) GetDuration() time.Duration {
	return time.Until(t.Expiry)
}

type tokenSource struct {
	ctx  context.Context
	opts []auth.Option
}

// NewTokenSource creates an oauth2.TokenSource for the gcp provider.
func NewTokenSource(ctx context.Context, opts ...auth.Option) oauth2.TokenSource {
	return &tokenSource{ctx, opts}
}

// Token implements oauth2.TokenSource.
func (t *tokenSource) Token() (*oauth2.Token, error) {
	// Use auth.GetToken() to get the token.
}

var gkeMetadata struct {
	projectID string
	location  string
	name      string

	mu     sync.Mutex
	loaded bool
}
```
As detailed above, each cloud provider implementation defines a simple wrapper around the cloud provider access token type. This wrapper implements the `auth.Token` interface, which is essentially the method `GetDuration()` for the cache library to manage the token lifetime. The wrappers also contain a helper function to create a token source for the respective cloud provider SDKs. These helpers have different names and signatures because the cloud provider SDKs are different and have different types, but they all implement the same concept of a token source.
The `aws` provider needs to read the environment variable `AWS_REGION` to configure the STS client. Even though a specific STS endpoint may be configured, the AWS SDKs require the region to be set regardless. This variable is usually set automatically in EKS pods, and can be set manually by users otherwise (e.g. in Fargate pods).
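A minimal sketch of how this could be wired inside the proposed `auth/aws` package (the helper name is illustrative):

```go
import (
	"os"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/fluxcd/pkg/auth"
)

// newAWSConfig shows how the credentials provider above could be plugged
// into an SDK v2 client configuration. AWS_REGION must be set even when a
// custom STS endpoint is configured.
func newAWSConfig(opts ...auth.Option) aws.Config {
	return aws.Config{
		Region:      os.Getenv("AWS_REGION"),
		Credentials: NewCredentialsProvider(opts...),
	}
}
```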
An important detail to take into account in the `azure` provider implementation is to reuse our custom implementation of `azidentity.NewDefaultAzureCredential()` found in kustomize-controller for SOPS decryption. This custom implementation avoids shelling out to the Azure CLI, which is something we strive to avoid in the Flux codebase. This is important because today we do this in a few APIs but not in others, so implementing it in a single place and using it everywhere will be a significant improvement.
The `gcp` provider needs to load the cluster metadata from the `gke-metadata-server` in order to create tokens. This must be done lazily, when the first token is requested, for a very important reason: if it were done on controller startup, the controller would crash when running outside GKE and enter `CrashLoopBackOff`, because the `gke-metadata-server` would never be available. The cluster metadata doesn't change during the lifetime of the controller pod, so we use a `sync.Mutex` and a `bool` to load it only once into a package variable.
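A sketch of this load-once logic against the `gkeMetadata` package variable shown earlier (the helper name is illustrative; the metadata keys are the well-known GKE/GCE ones):

```go
// loadGKEMetadata fetches the cluster metadata on first use and caches it
// for the lifetime of the process. It is deliberately not called on
// startup, so controllers running outside GKE don't crash.
func loadGKEMetadata(ctx context.Context) error {
	gkeMetadata.mu.Lock()
	defer gkeMetadata.mu.Unlock()
	if gkeMetadata.loaded {
		return nil
	}
	for key, dst := range map[string]*string{
		"project/project-id":                   &gkeMetadata.projectID,
		"instance/attributes/cluster-location": &gkeMetadata.location,
		"instance/attributes/cluster-name":     &gkeMetadata.name,
	} {
		v, err := metadata.GetWithContext(ctx, key) // cloud.google.com/go/compute/metadata
		if err != nil {
			return fmt.Errorf("getting GKE metadata %q: %w", key, err)
		}
		*dst = v
	}
	gkeMetadata.loaded = true
	return nil
}
```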
### Cache Key
The cache key must include the following components:

- The cloud provider name.
- The provider audience used for issuing the Kubernetes `ServiceAccount` token.
- The optional `ServiceAccount` reference and cloud provider identity. The identity is the string representing the identity which the `ServiceAccount` is impersonating, e.g. for `gcp` this would be a GCP IAM Service Account email, for `aws` this would be an AWS IAM Role ARN, etc. When there is no identity configured for impersonation, only the `ServiceAccount` reference is included.
- The optional scopes added to the token.
- The cache key extracted from the optional image repository.
- The optional STS endpoint used for issuing the token.
- The optional proxy URL, when the STS endpoint is present.
#### Justification
When single-tenant workload identity is being used, the identity associated with the controller is the one represented by the token, so there is no identity or `ServiceAccount` to identify in the cache key besides the implicit ones associated with the controller. In this case, including only the cloud provider name in the cache key is enough.
The provider audience used for issuing the `ServiceAccount` token is included in the cache key because it may depend on the `ServiceAccount` annotations. For example, in AWS, if an IAM Role ARN is not specified we assume that users are attempting to use EKS Pod Identity instead of IAM Roles for Service Accounts. Each feature has its own audience string and its own way of issuing tokens, so the audience string must be included in the cache key.
In multi-tenant workload identity, the reason for including both the `ServiceAccount` and the identity in the cache key is to establish the fact that the `ServiceAccount` had permission to impersonate the identity at the time the token was issued. This is very important: for the sake of argument, suppose we included only the identity. Then a malicious actor could specify any identity in their `ServiceAccount` and get a token cached for that identity, even if their `ServiceAccount` did not have permission to impersonate it. Conversely, we also need to include the identity in the cache key because, if we included only the `ServiceAccount`, changing the `ServiceAccount` annotations to impersonate a different identity would not cause a new token impersonating the new identity to be created, since the cache key would not change.
In most cases container registry credentials require an additional token exchange at the end. In order to benefit from caching the final token and to free the library consumers from this responsibility, we allow an image repository to be included in the options and implement the exchange. Depending on the cloud provider, a part of the image repository string is extracted and used to issue the token, e.g. for ECR the region is extracted and used to configure the client, and in the case of ACR the registry host is included in the resulting token. Those parts of the image repository must be included in the cache key. This is accomplished by the `Provider.GetImageCacheKey()` method. In the case of GCP container registries the image repository does not influence how the token is issued.
The scopes are included in the cache key because they delimit the permissions that the token has. They don't grant the permissions, they just set an upper bound for the permissions that the token can have. Providers requiring scopes unfortunately benefit less from caching, e.g. a token issued for an Azure identity can't be seamlessly used for both Azure DevOps and the Azure Container Registry, because the respective scopes are different, so the issued tokens are different.
The STS endpoint and proxy URL are included in the cache key because they could influence how the token is fetched and ultimately issued. The proxy URL is included only when the STS endpoint is present, because all the default STS endpoints are HTTPS and belong to cloud providers, so they are all well-known, unique, and the proxy is guaranteed not to tamper with the issuance of the token since it only sees an opaque TLS session passing through.
#### Format
The cache key would be the SHA256 hash of the following string (breaking lines after commas for readability):
Single-tenant/controller-level:

```text
provider=<cloud-provider-name>,
scopes=<comma-separated-scopes>,
imageRepositoryKey=<'gcp'-for-gcp|registry-region-for-aws|registry-host-for-azure>,
stsEndpoint=<sts-endpoint>,
proxyURL=<proxy-url>
```
Multi-tenant/object-level:

```text
provider=<cloud-provider-name>,
providerAudience=<cloud-provider-audience>,
serviceAccountName=<service-account-name>,
serviceAccountNamespace=<service-account-namespace>,
cloudProviderIdentity=<cloud-provider-identity>,
scopes=<comma-separated-scopes>,
imageRepositoryKey=<'gcp'-for-gcp|registry-region-for-aws|registry-host-for-azure>,
stsEndpoint=<sts-endpoint>,
proxyURL=<proxy-url>
```
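The hashing step itself is then trivial; a hypothetical helper to make it concrete:

```go
import (
	"crypto/sha256"
	"encoding/hex"
	"strings"
)

// cacheKey hashes the canonical comma-separated representation of the
// token request parameters into a fixed-size hex string.
func cacheKey(parts ...string) string {
	sum := sha256.Sum256([]byte(strings.Join(parts, ",")))
	return hex.EncodeToString(sum[:])
}
```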
### Security Considerations and Controls
As mentioned previously, a `ServiceAccount` must have permission to impersonate the identity it is configured to impersonate. Once a token for the impersonated identity is issued, that token remains valid for a while even if, immediately after issuing it, the `ServiceAccount` loses permission to impersonate that identity. In our cache key design, the token would remain available for the `ServiceAccount` to use until it expires. If the impersonation permission was revoked to mitigate an attack, the attacker could still get a valid token from the cache for a while after the revocation, and hence still exercise the permissions they had prior to the revocation. There are a few mitigations for this scenario:
- Users that revoke impersonation permissions for a `ServiceAccount` must also change the annotations of the `ServiceAccount` to impersonate a different identity, or delete the `ServiceAccount` altogether, or restart the Flux controllers so the cache is purged. Any of these actions would effectively prevent the attack, but they represent an additional step after revoking the impersonation permission.
- In the Flux controllers users can specify the `--token-cache-max-duration` flag, which can be used to limit the maximum duration for which a token can be cached. By reducing the default maximum duration of one hour to a smaller value, users can limit the time window during which a token would be available for a `ServiceAccount` to use after losing permission to impersonate the identity.
- Disable the cache entirely by setting the flag `--token-cache-max-size=0`, or by removing this flag altogether, since the default is already zero, i.e. no tokens are cached in the Flux controllers. This mitigation is for cases where security requirements are extreme and any risk of such an attack must be avoided. It is the most effective mitigation, but it comes at the cost of many API calls to issue tokens in the cloud provider, which could result in a performance bottleneck and/or throttling/rate-limiting, as tokens would have to be issued for every reconciliation.
A similar situation could occur in the single-tenant scenario, when the permission to impersonate the configured identity is revoked from the controller `ServiceAccount`. In this case, the attacker would have access to the cloud provider resources that the controller had access to prior to the revocation of the impersonation permission. Most of the mitigations mentioned above apply to this scenario as well, except for the one that involves changing the annotations of the `ServiceAccount` to impersonate a different identity or deleting the `ServiceAccount` altogether, as the controller `ServiceAccount` should not be deleted. The best mitigation in this case is to restart the Flux controllers so the cache is purged.
**EKS Pod Identity:** In EKS Pod Identity the association between a `ServiceAccount` and an IAM Role is not configured in the `ServiceAccount` annotations, nor anywhere else inside the Kubernetes cluster. The association is established entirely through the EKS/IAM APIs. In this case, all the mitigations mentioned above apply, except for the one that involves changing the annotations of the `ServiceAccount`, as there are no annotations to change.
### Library Integration
When reconciling an object, the controller must use the `auth.GetToken()` function, passing a controller-runtime client that has permission to create `ServiceAccount` tokens in the Kubernetes API, the desired cloud provider by name, and all the remaining options according to the configuration of the controller and of the object. The provider names match the ones used for `spec.provider` in the Flux APIs, i.e. `aws`, `azure` and `gcp`.

Because different cloud providers have different ways of representing their access tokens (e.g. Azure and GCP tokens are a single opaque string while AWS has three strings: access key ID, secret access key and token), consumers of the `auth.Token` interface would need to cast it to `*<provider>.Token`.

The following subsections show how the integration would look.
#### `GitRepository` and `ImageUpdateAutomation` APIs
For these APIs the only provider we have so far that supports workload identity is `azure`. In this case we would simply replace `AzureOpts []azure.OptFunc` in the `fluxcd/pkg/git.ProviderOptions` struct with `[]fluxcd/pkg/auth.Option` and would modify `fluxcd/pkg/git.GetCredentials()` to use `auth.GetToken()`. The token interface would be cast to `*azure.Token` and the token string would be assigned to `fluxcd/pkg/git.Credentials.BearerToken`. A `GitRepository` object configured with the `azure` provider and a `ServiceAccount` would then go through this code path.
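A sketch of that code path, using the types proposed above (surrounding function elided):

```go
// Inside fluxcd/pkg/git.GetCredentials(), for the azure provider (sketch).
token, err := auth.GetToken(ctx, azure.Provider{}, authOpts...)
if err != nil {
	return nil, err
}
creds := &git.Credentials{
	// azure.Token embeds azcore.AccessToken, whose Token field
	// holds the opaque bearer token string.
	BearerToken: token.(*azure.Token).Token,
}
```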
#### `OCIRepository`, `ImageRepository`, `HelmRepository` and `HelmChart` APIs
The `HelmRepository` API only supports a cloud provider for OCI repositories, so for all these APIs we would only need to support OCI authentication.
All these APIs currently use `*fluxcd/pkg/oci/auth/login.Manager` to get the container registry credentials. The new library would replace this one entirely, as `login.Manager` mostly handles single-tenant workload identity. The new library covers both single-tenant and multi-tenant workload identity, so it would be a drop-in replacement for the `login.Manager`.
In the case of the source-controller APIs, all of them use the function `OIDCAuth()` from the internal package `internal/oci`. We would replace the use of `login.Manager` with `auth.GetToken()` in this function. The token interface would be cast to `*auth.RegistryCredentials` and then fed to `authn.FromConfig()` from the package `github.com/google/go-containerregistry/pkg/authn`.

In the case of `ImageRepository`, we would replace `login.Manager` with `auth.GetToken()` in the `setAuthOptions()` method of the `ImageRepositoryReconciler`, cast the token to `*auth.RegistryCredentials` and then feed it to `authn.FromConfig()`.
The beauty of this particular integration is that we would no longer require branching code paths for each cloud provider: we would just need to configure the options for the `auth.GetToken()` function and the library would take care of the rest.
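For illustration, the shared path could look like this sketch (provider selection and option plumbing elided):

```go
token, err := auth.GetToken(ctx, provider,
	auth.WithImageRepository(imageRepository),
	// ...ServiceAccount and cache options, as configured on the object.
)
if err != nil {
	return nil, err
}
creds := token.(*auth.RegistryCredentials)
return authn.FromConfig(authn.AuthConfig{
	Username: creds.Username,
	Password: creds.Password,
}), nil
```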
#### `Bucket` API
##### Provider `aws`
A `Bucket` object configured with the `aws` provider and a `ServiceAccount` would cause the internal `minio.MinioClient` of source-controller to be created with the following new options:

```text
minio.WithTokenClient(controller-runtime/pkg/client.Client)
minio.WithTokenCache(*fluxcd/pkg/cache.TokenCache)
```
The constructor would then use `auth.GetToken()` to get the cloud provider access token. When doing so, the `minio.MinioClient` would cast the token interface to `*aws.Token` and feed it to `credentials.NewStatic()` from the package `github.com/minio/minio-go/v7/pkg/credentials`.
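A sketch of that cast and wiring (surrounding constructor elided; `endpoint` would come from the `Bucket` spec):

```go
t := token.(*aws.Token) // embeds the STS types.Credentials fields
staticCreds := credentials.NewStatic(
	*t.AccessKeyId, *t.SecretAccessKey, *t.SessionToken,
	credentials.SignatureV4,
)
client, err := minio.New(endpoint, &minio.Options{
	Creds:  staticCreds,
	Secure: true,
})
```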
##### Provider `azure`
A `Bucket` object configured with the `azure` provider and a `ServiceAccount` would cause the internal `azure.BlobClient` of source-controller to be created with the following new options:

```text
azure.WithTokenClient(controller-runtime/pkg/client.Client)
azure.WithTokenCache(*fluxcd/pkg/cache.TokenCache)
azure.WithServiceAccount(controller-runtime/pkg/client.ObjectKey)
azure.WithInvolvedObject(*fluxcd/pkg/cache.InvolvedObject)
```
The constructor would then use `azure.NewTokenCredential()` and feed the resulting token credential to `azblob.NewClient()`.
##### Provider `gcp`
A `Bucket` object configured with the `gcp` provider and a `ServiceAccount` would cause the internal `gcp.GCSClient` of source-controller to be created with the following new options:

```text
gcp.WithTokenClient(controller-runtime/pkg/client.Client)
gcp.WithTokenCache(*fluxcd/pkg/cache.TokenCache)
gcp.WithServiceAccount(controller-runtime/pkg/client.ObjectKey)
gcp.WithInvolvedObject(*fluxcd/pkg/cache.InvolvedObject)
```
The constructor would then use `gcp.NewTokenSource()`, wrap the resulting token source in `option.WithTokenSource()`, and pass it to `cloud.google.com/go/storage.NewClient()`.
#### `Kustomization` API
The `Kustomization` API uses Key Management Services (KMS) for decrypting SOPS secrets. The internal packages `internal/decryptor` and `internal/sops` of kustomize-controller already use interfaces compatible with the new library in the case of `aws` and `azure`, i.e. `*awskms.CredentialsProvider` and `*azkv.TokenCredential` respectively, so we could easily use the helper functions for creating the respective token sources to configure the KMS credentials for SOPS. This is thanks to the respective SOPS libraries `github.com/getsops/sops/v3/kms` and `github.com/getsops/sops/v3/azkv`.
For GCP we can introduce the equivalent interface that was recently added in this pull request. This new interface introduced in SOPS upstream can also be used for the current JSON credentials method that we use via `google.CredentialsFromJSON().TokenSource`. This would allow us to use only the respective token source interfaces for all three providers, when using either workload identity or secrets.
#### `Provider` API
The constructor of the internal `notifier.Factory` of notification-controller would now accept the following new options:

```text
notifier.WithTokenClient(controller-runtime/pkg/client.Client)
notifier.WithTokenCache(*fluxcd/pkg/cache.TokenCache)
notifier.WithServiceAccount(controller-runtime/pkg/client.ObjectKey)
notifier.WithInvolvedObject(*fluxcd/pkg/cache.InvolvedObject)
```
The cloud provider types that support workload identity would then use these options. See the following subsections for details.
##### Type `azuredevops`
The `notifier.NewAzureDevOps()` constructor would use the existing and new options to call `auth.GetToken()` and get the cloud provider access token. When doing so, the `notifier.AzureDevOps` would cast the token interface to `*azure.Token` and feed the token string to `NewPatConnection()` from the package `github.com/microsoft/azure-devops-go-api/azuredevops/v6`.
##### Type `azureeventhub`
The `notifier.NewAzureEventHub()` constructor would use the existing and new options to call `auth.GetToken()` and get the cloud provider access token. When doing so, the `notifier.AzureEventHub` would cast the token interface to `*azure.Token` and feed the token string to `newJWTHub()`.
##### Type `googlepubsub`
The `notifier.NewGooglePubSub()` constructor would use the existing and new options to call `gcp.NewTokenSource()`, wrap the resulting token source in `option.WithTokenSource()`, and pass it to `cloud.google.com/go/pubsub.NewClient()`.
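A sketch of that wiring (`projectID` and `topicName` coming from the `Provider` object's configuration):

```go
ts := gcp.NewTokenSource(ctx, authOpts...)
client, err := pubsub.NewClient(ctx, projectID, option.WithTokenSource(ts))
if err != nil {
	return nil, err
}
topic := client.Topic(topicName) // publishes happen with the tenant's identity
```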
## Implementation History
A realistic estimate for implementing this proposal is two to three Flux minor releases. This allows us to work on more pressing priorities while still making progress towards this milestone. The implementation of the core library would be done in the first release, and the integration with the Flux APIs would be spread across all these releases. All three cloud providers should be implemented for each API getting this feature in any given release. Our first priority should be `Kustomization`, as it is where security matters most, since it deals with secrets.