# E2E Tests
The goal is to verify that Flux integration with cloud providers is actually working now and in the future. Currently, we only have tests for Azure and GCP.
## General requirements

These CLI tools need to be installed for each of the tests to run successfully:

- Docker CLI for registry login.
- SOPS CLI for encrypting files.
- kubectl for applying certain install manifests.
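As a quick sanity check before running any tests, you can confirm that the tools are available on your `PATH` (a minimal sketch; this document doesn't pin exact versions):

```console
# Verify the required CLI tools are installed and reachable.
docker version --format '{{.Client.Version}}'
sops --version
kubectl version --client
```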
## Azure

### Architecture

The azure terraform files create the AKS cluster and related resources to run the tests. It creates:
- An Azure Container Registry
- An Azure Kubernetes Cluster
- Two Azure DevOps repositories
- Azure EventHub for sending notifications
- An Azure Key Vault
### Requirements

- Azure account with an active subscription to be able to create AKS and ACR, and permission to assign roles. Role assignment is required for allowing AKS workloads to access ACR.
- Azure CLI, need to be logged in using `az login` as a User or as a Service Principal.
- An Azure DevOps organization, personal access token and SSH keys for accessing repositories within the organization. The scopes required for the personal access token are:
  - `Project and Team` - read, write and manage access
  - `Code` - full

  Please take a look at the terraform provider documentation for more explanation.
- Azure DevOps only supports RSA keys. Please see the documentation for how to set up SSH key authentication (a key generation example follows this list).
- When using in CI, create a test user and use the test user's PAT and SSH key for all Azure DevOps interactions. To grant the test user access in Azure DevOps:
  - Go to `Organization Settings` on the sidebar of the organization page.
  - Under `General` > `Users`, click on `Add User` and input the user's email, selecting an `Access Level` of `Basic`.
  - Go to `Security` > `Permissions` and click on the `Users` tab.
  - For the invited user, set the following permissions to `Allow`: `General: Create new project`.
  - The user will get an email invitation and will need to create a Microsoft account if they don't have one yet.
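For the RSA key requirement above, a compatible key pair can be generated with `ssh-keygen`; the output path and comment below are illustrative, not required names:

```console
# Azure DevOps only accepts RSA SSH keys; generate a 4096-bit pair.
ssh-keygen -t rsa -b 4096 -C "flux-e2e" -f ~/.ssh/azure_devops_rsa
```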
NOTE: To use a Service Principal (for example in a CI environment), set the `ARM_*` variables in `.env`, source it and authenticate the Azure CLI with:

```console
$ az login --service-principal -u $ARM_CLIENT_ID -p $ARM_CLIENT_SECRET --tenant $ARM_TENANT_ID
```
### Permissions

The following permissions are needed for provisioning the infrastructure and running the tests:
- `Microsoft.Kubernetes/*`
- `Microsoft.Resources/*`
- `Microsoft.Authorization/roleAssignments/{Read,Write,Delete}`
- `Microsoft.ContainerRegistry/*`
- `Microsoft.ContainerService/*`
- `Microsoft.KeyVault/*`
- `Microsoft.EventHub/*`
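The azure-gh-actions module in the next section creates the role for you; if you ever need to define an equivalent custom role manually, a sketch with the Azure CLI could look like the following (the role name and `<subscription-id>` are placeholders):

```console
# role.json describes a custom role with the permissions listed above;
# <subscription-id> is a placeholder for your subscription.
cat > role.json <<'EOF'
{
  "Name": "flux2-e2e",
  "Description": "Permissions for provisioning the flux2 e2e test infrastructure",
  "Actions": [
    "Microsoft.Kubernetes/*",
    "Microsoft.Resources/*",
    "Microsoft.Authorization/roleAssignments/read",
    "Microsoft.Authorization/roleAssignments/write",
    "Microsoft.Authorization/roleAssignments/delete",
    "Microsoft.ContainerRegistry/*",
    "Microsoft.ContainerService/*",
    "Microsoft.KeyVault/*",
    "Microsoft.EventHub/*"
  ],
  "AssignableScopes": ["/subscriptions/<subscription-id>"]
}
EOF
az role definition create --role-definition @role.json
```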
### IAM and CI setup

To create the necessary IAM role with all the permissions and to set up CI secrets and variables using azure-gh-actions, use the terraform configuration below. Please make sure all the requirements of azure-gh-actions are followed before running it.
NOTE: When running the following for a repo under an organization, set the environment variable `GITHUB_ORGANIZATION` if setting the `owner` in the `github` provider doesn't work.
provider "github" {
owner = "fluxcd"
}
resource "tls_private_key" "privatekey" {
algorithm = "RSA"
rsa_bits = 4096
}
module "azure_gh_actions" {
source = "git::https://github.com/fluxcd/test-infra.git//tf-modules/azure/github-actions"
azure_owners = ["owner-id-1", "owner-id-2"]
azure_app_name = "flux2-e2e"
azure_app_description = "flux2 e2e"
azure_app_secret_name = "flux2-e2e"
azure_permissions = [
"Microsoft.Kubernetes/*",
"Microsoft.Resources/*",
"Microsoft.Authorization/roleAssignments/Read",
"Microsoft.Authorization/roleAssignments/Write",
"Microsoft.Authorization/roleAssignments/Delete",
"Microsoft.ContainerRegistry/*",
"Microsoft.ContainerService/*",
"Microsoft.KeyVault/*",
"Microsoft.EventHub/*"
]
azure_location = "eastus"
github_project = "flux2"
github_secret_client_id_name = "AZ_ARM_CLIENT_ID"
github_secret_client_secret_name = "AZ_ARM_CLIENT_SECRET"
github_secret_subscription_id_name = "AZ_ARM_SUBSCRIPTION_ID"
github_secret_tenant_id_name = "AZ_ARM_TENANT_ID"
github_secret_custom = {
"TF_VAR_azuredevops_org" = "<azuredevops-org-name>",
"TF_VAR_azuredevops_pat" = "<azuredevops-pat>",
"AZURE_GITREPO_SSH_CONTENTS" = base64encode(tls_private_key.privatekey.private_key_openssh),
"AZURE_GITREPO_SSH_PUB_CONTENTS" = base64encode(tls_private_key.privatekey.public_key_openssh)
}
}
output "publickey" {
value = tls_private_key.privatekey.public_key_openssh
}
Copy the `publickey` output printed after applying, or run `terraform output` to print it again, and add it in the Azure DevOps SSH public keys under the user account that'll be used by flux in the tests.
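For example (illustrative; run from the directory where the configuration above was applied):

```console
# Print the public key output again after the initial apply.
terraform output publickey
# Or print it without quoting, suitable for copy-pasting or piping.
terraform output -raw publickey
```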
NOTE: The environment variables used above are for the GitHub workflow that runs the tests. Change the variable names as needed.
## GCP

### Architecture

The gcp terraform files create the GKE cluster and related resources to run the tests. It creates:
- A Google Container Registry and Artifact Registry
- A Google Kubernetes Cluster
- Two Google Cloud Source Repositories
- A Google Pub/Sub topic and a subscription to it that would be used in the tests
Note: It doesn't create Google KMS keyrings and crypto keys because these cannot be destroyed. Instead, you have to pass in the crypto key and keyring that would be used to test the SOPS encryption in Flux. Please see `.env.sample` for the terraform variables.
### Requirements

- GCP account with an active project to be able to create GKE and GCR, and permission to assign roles.
- Existing GCP KMS keyring and crypto key (example commands for creating them follow this list).
  - Create a Keyring in the `global` location.
  - Create a Crypto Key with a symmetric algorithm for encryption and decryption, and a software-based protection level.
- gcloud CLI, need to be logged in using `gcloud auth login` as a User (not a Service Account), configure application default credentials with `gcloud auth application-default login` and the docker credential helper with `gcloud auth configure-docker`.

  NOTE: To use a Service Account (for example in a CI environment), set the `GOOGLE_APPLICATION_CREDENTIALS` variable in `.env` with the path to the JSON key file, source it and authenticate the gcloud CLI with:

  ```console
  $ gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS
  ```

  Depending on the Container/Artifact Registry host used in the test, authenticate docker accordingly:

  ```console
  $ gcloud auth print-access-token | docker login -u oauth2accesstoken --password-stdin https://us-central1-docker.pkg.dev
  ```

  In this case, the GCP client in terraform uses the Service Account to authenticate and the gcloud CLI is used only to authenticate with Google Container Registry and Google Artifact Registry.

  NOTE FOR CI USAGE: When saving the JSON key file as a CI secret, compress the file content with

  ```console
  $ cat key.json | jq -r tostring
  ```

  to prevent aggressive masking in the logs. Refer to aggressive replacement in logs for more details.

- Register SSH keys with Google Cloud.
  - Google Cloud supports these three SSH key types: RSA (only for keys with more than 2048 bits), ECDSA and ED25519.
  - The SSH user doesn't have to be a member of the GCP project. The terraform setup will grant the user permissions to the repository. Visit https://source.cloud.google.com, log in or create a GCP account with the SSH user's email address and add SSH keys in the account. Set this email as the value for the environment variable `TF_VAR_gcp_email` in the `.env` file to be used as a terraform variable.

  Note: Google doesn't allow an SSH key to be associated with a service account email address. Therefore, there has to be an actual user that the SSH key is registered to.
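For the KMS requirement above, the keyring and crypto key can be created with the gcloud CLI; the names used here are placeholders:

```console
# Create a keyring in the global location (name is a placeholder).
gcloud kms keyrings create flux-e2e --location=global

# Create a symmetric encrypt/decrypt key in that keyring; the default
# protection level for --purpose=encryption is software.
gcloud kms keys create sops-key \
  --keyring=flux-e2e \
  --location=global \
  --purpose=encryption
```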
### Permissions

The following roles are needed for provisioning the infrastructure and running the tests:
- Compute Instance Admin (v1) - `roles/compute.instanceAdmin.v1`
- Kubernetes Engine Admin - `roles/container.admin`
- Service Account User - `roles/iam.serviceAccountUser`
- Service Account Token Creator - `roles/iam.serviceAccountTokenCreator`
- Artifact Registry Administrator - `roles/artifactregistry.admin`
- Artifact Registry Repository Administrator - `roles/artifactregistry.repoAdmin`
- Cloud KMS Admin - `roles/cloudkms.admin`
- Cloud KMS CryptoKey Encrypter - `roles/cloudkms.cryptoKeyEncrypter`
- Source Repository Administrator - `roles/source.admin`
- Pub/Sub Admin - `roles/pubsub.admin`
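The gcp-gh-actions module in the next section grants these roles for you; for reference, granting one manually looks like this (`<project-id>` and the service account address are placeholders):

```console
# Grant one of the roles above to the test service account.
gcloud projects add-iam-policy-binding <project-id> \
  --member="serviceAccount:flux2-e2e-test@<project-id>.iam.gserviceaccount.com" \
  --role="roles/container.admin"
```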
### IAM and CI setup

To create the necessary IAM role with all the permissions and to set up CI secrets and variables using gcp-gh-actions, use the terraform configuration below. Please make sure all the requirements of gcp-gh-actions are followed before running it.
NOTE: When running the following for a repo under an organization, set the environment variable `GITHUB_ORGANIZATION` if setting the `owner` in the `github` provider doesn't work.
provider "google" {}
provider "github" {
owner = "fluxcd"
}
resource "tls_private_key" "privatekey" {
algorithm = "RSA"
rsa_bits = 4096
}
module "gcp_gh_actions" {
source = "git::https://github.com/fluxcd/test-infra.git//tf-modules/gcp/github-actions"
gcp_service_account_id = "flux2-e2e-test"
gcp_service_account_name = "flux2-e2e-test"
gcp_service_account_description = "For running fluxcd/flux2 e2e tests."
gcp_roles = [
"roles/compute.instanceAdmin.v1",
"roles/container.admin",
"roles/iam.serviceAccountUser",
"roles/iam.serviceAccountTokenCreator",
"roles/artifactregistry.admin",
"roles/artifactregistry.repoAdmin",
"roles/cloudkms.admin",
"roles/cloudkms.cryptoKeyEncrypter",
"roles/source.admin",
"roles/pubsub.admin"
]
github_project = "flux2"
github_secret_credentials_name = "FLUX2_E2E_GOOGLE_CREDENTIALS"
github_secret_custom = {
"TF_VAR_gcp_keyring" = "<keyring-name>",
"TF_VAR_gcp_crypto_key" = "<key-name>",
"TF_VAR_gcp_email" = "<email>",
"GCP_GITREPO_SSH_CONTENTS" = base64encode(tls_private_key.privatekey.private_key_openssh),
"GCP_GITREPO_SSH_PUB_CONTENTS" = base64encode(tls_private_key.privatekey.public_key_openssh)
}
}
output "publickey" {
value = tls_private_key.privatekey.public_key_openssh
}
Copy the `publickey` output printed after applying, or run `terraform output` to print it again, and add it in the Google Source Repository SSH public keys under the user account with the email address referred to in `TF_VAR_gcp_email` above.

NOTE: The environment variables used above are for the GitHub workflow that runs the tests. Change the variable names as needed.
## Tests

Each test run is initiated by running `terraform apply` in the provider's terraform directory, which it does by using the tftestenv package within the `fluxcd/test-infra` repository. It then reads the output of the Terraform run to get information needed for the tests, like the kubernetes client ID, the cloud repository URLs, the key vault ID etc. This means that a lot of the communication with the cloud provider API is offloaded to Terraform instead of requiring it to be implemented in the test.
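The outputs the tests consume can also be inspected manually; for example, after an apply in the provider's terraform directory (illustrative):

```console
# Dump all Terraform outputs as JSON, the same data the tftestenv package reads.
terraform output -json
```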
The following tests are currently implemented:
- Flux can be successfully installed on the cluster using the Flux CLI
- source-controller can clone cloud provider repositories (Azure DevOps, Google Cloud Source Repositories) (https+ssh)
- image-reflector-controller can list tags from the provider's container registry image repositories
- kustomize-controller can decrypt secrets using SOPS and provider key vault
- image-automation-controller can create branches and push to cloud repositories (https+ssh)
- source-controller can pull charts from cloud provider container registry Helm repositories
- notification-controller can forward events to the cloud events service (Event Hubs for Azure and Google Pub/Sub)

The following test is run only for Azure, since it is supported in the notification-controller:

- notification-controller can send commit status to Azure DevOps
### Running tests locally

- Ensure that you have the Flux CLI binary that is to be tested built and ready. You can build it by running `make build` at the root of this repository. The binary is located in the `./bin` directory at the root, and by default this is where the Makefile copies the binary for the tests from. If you have it in a different location, you can set it with the `FLUX_BINARY` variable.
- Copy `.env.sample` to `.env` and add the values for the different variables for the provider that you are running the tests for.
- Run `make test-<provider>`, setting the location of the flux binary with the `FLUX_BINARY` variable:
```console
$ make test-azure
make test PROVIDER_ARG="-provider azure"
# These two versions of podinfo are pushed to the cloud registry and used in tests for ImageUpdateAutomation
mkdir -p build
cp ../../bin/flux build/flux
docker pull ghcr.io/stefanprodan/podinfo:6.0.0
6.0.0: Pulling from stefanprodan/podinfo
Digest: sha256:e7eeab287181791d36c82c904206a845e30557c3a4a66a8143fa1a15655dae97
Status: Image is up to date for ghcr.io/stefanprodan/podinfo:6.0.0
ghcr.io/stefanprodan/podinfo:6.0.0
docker pull ghcr.io/stefanprodan/podinfo:6.0.1
6.0.1: Pulling from stefanprodan/podinfo
Digest: sha256:1169f220a670cf640e45e1a7ac42dc381a441e9d4b7396432cadb75beb5b5d68
Status: Image is up to date for ghcr.io/stefanprodan/podinfo:6.0.1
ghcr.io/stefanprodan/podinfo:6.0.1
go test -timeout 60m -v ./ -existing -provider azure --tags=integration
2023/03/24 02:32:25 Setting up azure e2e test infrastructure
2023/03/24 02:32:25 Terraform binary: /usr/local/bin/terraform
2023/03/24 02:32:25 Init Terraform
....[some output has been cut out]
2023/03/24 02:39:33 helm repository condition not ready
--- PASS: TestACRHelmRelease (15.31s)
=== RUN   TestKeyVaultSops
--- PASS: TestKeyVaultSops (15.98s)
PASS
2023/03/24 02:40:12 Destroying environment...
ok      github.com/fluxcd/flux2/tests/integration      947.341s
```
In the above, the test created a build directory `build/` and the flux CLI binary was copied to `build/flux`. It is used to bootstrap Flux on the cluster. You can configure the location of the Flux CLI binary by setting the `FLUX_BINARY` variable. We also pull two versions of the `ghcr.io/stefanprodan/podinfo` image. These images are pushed to the cloud provider's container registry and used to test `ImageRepository` and `ImageUpdateAutomation`. The terraform resources get created and the tests are run.
If not configured explicitly to retain the infrastructure, the test infrastructure is deleted at the end of the test. If a failure prevents the resources from being deleted, the `make destroy-*` commands can be run for the respective provider. This will run `terraform destroy` in the respective provider's terraform configuration directory, and can be used to quickly destroy the infrastructure without going through the provision-test-destroy steps.
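For example, for the Azure provider:

```console
# Tear down retained or leaked Azure test infrastructure.
make destroy-azure
```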
### Debugging the tests

For debugging environment provisioning, enable verbose output with the `-verbose` test flag:

```console
make test-azure GO_TEST_ARGS="-verbose"
```

The test environment is destroyed at the end by default. Run the tests with the `-retain` flag to retain the created test infrastructure:

```console
make test-azure GO_TEST_ARGS="-retain"
```

The tests require the infrastructure state to be clean. For re-running the tests with a retained infrastructure, set the `-existing` flag:

```console
make test-azure GO_TEST_ARGS="-retain -existing"
```

To delete an existing infrastructure created with the `-retain` flag:

```console
make test-azure GO_TEST_ARGS="-existing"
```

To debug issues on the cluster created by the test (provided you passed in the `-retain` flag):

```console
export KUBECONFIG=./build/kubeconfig
kubectl get pods
```