# E2E Tests

The goal is to verify that the Flux integrations with cloud providers are actually working now and in the future. Currently, we only have tests for Azure and GCP.

## General requirements

These CLI tools need to be installed for each of the tests to run successfully:

- Docker CLI for registry login.
- [SOPS CLI](https://github.com/mozilla/sops) for encrypting files.
- kubectl for applying certain install manifests.

## Azure

### Architecture

The [azure](./terraform/azure) Terraform configuration creates the AKS cluster and related resources to run the tests. It creates:

- An Azure Container Registry
- An Azure Kubernetes Service cluster
- Two Azure DevOps repositories
- An Azure Event Hub for sending notifications
- An Azure Key Vault

### Requirements

- Azure account with an active subscription to be able to create AKS and ACR, and permission to assign roles. Role assignment is required for allowing AKS workloads to access ACR.
- Azure CLI, logged in using `az login` as a User or as a Service Principal.
- An Azure DevOps organization, a personal access token and SSH keys for accessing repositories within the organization. The scopes required for the personal access token are:
  - `Project and Team` - read, write and manage access
  - `Code` - Full
  - Please take a look at the [terraform provider](https://registry.terraform.io/providers/microsoft/azuredevops/latest/docs/guides/authenticating_using_the_personal_access_token#create-a-personal-access-token) for more explanation.
  - Azure DevOps only supports RSA keys. Please see the [documentation](https://learn.microsoft.com/en-us/azure/devops/repos/git/use-ssh-keys-to-authenticate?view=azure-devops#set-up-ssh-key-authentication) for how to set up SSH key authentication.
- When using in CI, create a test user and use the test user's PAT and SSH key for all Azure DevOps interactions. To grant the test user access in Azure DevOps:
  - Go to `Organization Settings` on the sidebar of the organization page.
  - Under `General` > `Users`, click on `Add User`, input the user's email and select the `Access Level` of `Basic`.
  - Go to `Security` > `Permissions` and click on the `Users` tab.
  - For the invited user, set the following permission to `Allow`:
    - `General: Create new project`
  - The user will get an email invitation and will need to create a Microsoft account if they don't have one yet.

**NOTE:** To use a Service Principal (for example in a CI environment), set the `ARM_*` variables in `.env`, source it and authenticate the Azure CLI with:

```console
$ az login --service-principal -u $ARM_CLIENT_ID -p $ARM_CLIENT_SECRET --tenant $ARM_TENANT_ID
```

### Permissions

The following permissions are needed for provisioning the infrastructure and running the tests:

- `Microsoft.Kubernetes/*`
- `Microsoft.Resources/*`
- `Microsoft.Authorization/roleAssignments/{Read,Write,Delete}`
- `Microsoft.ContainerRegistry/*`
- `Microsoft.ContainerService/*`
- `Microsoft.KeyVault/*`
- `Microsoft.EventHub/*`
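To sanity-check which identity the Azure CLI will provision with, and which roles it has been assigned, you can query the CLI directly before running anything. A minimal sketch, assuming you sourced `.env` and logged in as a Service Principal as shown above:

```console
# Show the active subscription, tenant and identity
$ az account show --query "{subscription: id, tenant: tenantId, user: user.name}"
# List the roles assigned to the Service Principal
$ az role assignment list --assignee "$ARM_CLIENT_ID" --output table
```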
### IAM and CI setup

To create the necessary IAM role with all the permissions and set up CI secrets and variables using [azure-gh-actions](https://github.com/fluxcd/test-infra/tree/main/tf-modules/azure/github-actions), use the terraform configuration below. Please make sure all the requirements of azure-gh-actions are followed before running it.

```hcl
provider "github" {
  owner = "fluxcd"
}

resource "tls_private_key" "privatekey" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

module "azure_gh_actions" {
  source = "git::https://github.com/fluxcd/test-infra.git//tf-modules/azure/github-actions"

  azure_owners          = ["owner-id-1", "owner-id-2"]
  azure_app_name        = "flux2-e2e"
  azure_app_description = "flux2 e2e"
  azure_app_secret_name = "flux2-e2e"
  azure_permissions = [
    "Microsoft.Kubernetes/*",
    "Microsoft.Resources/*",
    "Microsoft.Authorization/roleAssignments/Read",
    "Microsoft.Authorization/roleAssignments/Write",
    "Microsoft.Authorization/roleAssignments/Delete",
    "Microsoft.ContainerRegistry/*",
    "Microsoft.ContainerService/*",
    "Microsoft.KeyVault/*",
    "Microsoft.EventHub/*"
  ]
  azure_location = "eastus"

  github_project = "flux2"

  github_secret_client_id_name       = "AZ_ARM_CLIENT_ID"
  github_secret_client_secret_name   = "AZ_ARM_CLIENT_SECRET"
  github_secret_subscription_id_name = "AZ_ARM_SUBSCRIPTION_ID"
  github_secret_tenant_id_name       = "AZ_ARM_TENANT_ID"

  github_secret_custom = {
    "TF_VAR_azuredevops_org"         = "",
    "TF_VAR_azuredevops_pat"         = "",
    "AZURE_GITREPO_SSH_CONTENTS"     = base64encode(tls_private_key.privatekey.private_key_openssh),
    "AZURE_GITREPO_SSH_PUB_CONTENTS" = base64encode(tls_private_key.privatekey.public_key_openssh)
  }
}

output "publickey" {
  value = tls_private_key.privatekey.public_key_openssh
}
```

Copy the `publickey` output printed after applying, or run `terraform output` to print it again, and add it to the Azure DevOps SSH public keys under the user account that'll be used by Flux in the tests.

**NOTE:** The environment variables used above are for the GitHub workflow that runs the tests. Change the variable names as needed.

## GCP

### Architecture

The [gcp](./terraform/gcp) Terraform files create the GKE cluster and related resources to run the tests. It creates:

- A Google Container Registry and Artifact Registry
- A Google Kubernetes Engine cluster
- Two Google Cloud Source Repositories
- A Google Pub/Sub Topic and a subscription to the service that is used in the tests

**Note:** It doesn't create Google KMS keyrings and crypto keys because these cannot be destroyed. Instead, you have to pass in the crypto key and keyring that will be used to test the SOPS encryption in Flux. Please see `.env.sample` for the terraform variables.

### Requirements

- GCP account with an active project to be able to create GKE and GCR, and permission to assign roles.
- Existing GCP KMS keyring and crypto key.
  - [Create a Keyring](https://cloud.google.com/kms/docs/create-key-ring) in the `global` location.
  - [Create a Crypto Key](https://cloud.google.com/kms/docs/create-key) with a symmetric algorithm for encryption and decryption, and a software based protection level.
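  For example, a keyring and crypto key satisfying these requirements could be created with gcloud like this (`test-keyring` and `test-key` are placeholder names):

  ```console
  $ gcloud kms keyrings create test-keyring --location global
  $ gcloud kms keys create test-key \
      --keyring test-keyring \
      --location global \
      --purpose encryption \
      --protection-level software
  ```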
- gcloud CLI, logged in using `gcloud auth login` as a User (not a Service Account), with application default credentials configured with `gcloud auth application-default login` and the docker credential helper configured with `gcloud auth configure-docker`.

**NOTE:** To use a Service Account (for example in a CI environment), set the `GOOGLE_APPLICATION_CREDENTIALS` variable in `.env` with the path to the JSON key file, source it and authenticate the gcloud CLI with:

```console
$ gcloud auth activate-service-account --key-file=$GOOGLE_APPLICATION_CREDENTIALS
```

Depending on the Container/Artifact Registry host used in the test, authenticate docker accordingly:

```console
$ gcloud auth print-access-token | docker login -u oauth2accesstoken --password-stdin https://us-central1-docker.pkg.dev
```

In this case, the GCP client in terraform uses the Service Account to authenticate and the gcloud CLI is used only to authenticate with Google Container Registry and Google Artifact Registry.

**NOTE FOR CI USAGE:** When saving the JSON key file as a CI secret, compress the file content with

```console
$ cat key.json | jq -r tostring
```

to prevent aggressive masking in the logs. Refer to [aggressive replacement in logs](https://github.com/google-github-actions/auth/blob/v1.1.0/docs/TROUBLESHOOTING.md#aggressive--replacement-in-logs) for more details.

- Register [SSH Keys with Google Cloud](https://cloud.google.com/source-repositories/docs/authentication#ssh).
  - Google Cloud supports these three SSH key types: RSA (only for keys with more than 2048 bits), ECDSA and ED25519.
  - The SSH user doesn't have to be a member of the GCP project. The terraform setup will grant the user permissions to the repository.
  - Visit https://source.cloud.google.com, log in or create a GCP account with the SSH user's email address and add the SSH keys to the account. Set this email as the value of the environment variable `TF_VAR_gcp_email` in the `.env` file to be used as a terraform variable.
  - **Note:** Google doesn't allow an SSH key to be associated with a service account email address. Therefore, there has to be an actual user that the SSH key is registered to.

### Permissions

The following roles are needed for provisioning the infrastructure and running the tests:

- Compute Instance Admin (v1) - `roles/compute.instanceAdmin.v1`
- Kubernetes Engine Admin - `roles/container.admin`
- Service Account User - `roles/iam.serviceAccountUser`
- Service Account Token Creator - `roles/iam.serviceAccountTokenCreator`
- Artifact Registry Administrator - `roles/artifactregistry.admin`
- Artifact Registry Repository Administrator - `roles/artifactregistry.repoAdmin`
- Cloud KMS Admin - `roles/cloudkms.admin`
- Cloud KMS CryptoKey Encrypter - `roles/cloudkms.cryptoKeyEncrypter`
- Source Repository Administrator - `roles/source.admin`
- Pub/Sub Admin - `roles/pubsub.admin`
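If you need to grant any of these roles manually to the account running the tests, each one can be bound with `gcloud projects add-iam-policy-binding`. A minimal sketch, where `my-project` and the member email are placeholders:

```console
# Grant one of the required roles to a user account
$ gcloud projects add-iam-policy-binding my-project \
    --member="user:tester@example.com" \
    --role="roles/container.admin"
```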
### IAM and CI setup

To create the necessary IAM role with all the permissions and set up CI secrets and variables using [gcp-gh-actions](https://github.com/fluxcd/test-infra/tree/main/tf-modules/gcp/github-actions), use the terraform configuration below. Please make sure all the requirements of gcp-gh-actions are followed before running it.

```hcl
provider "google" {}

provider "github" {
  owner = "fluxcd"
}

resource "tls_private_key" "privatekey" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

module "gcp_gh_actions" {
  source = "git::https://github.com/fluxcd/test-infra.git//tf-modules/gcp/github-actions"

  gcp_service_account_id   = "flux2-e2e-test"
  gcp_service_account_name = "flux2-e2e-test"
  gcp_roles = [
    "roles/compute.instanceAdmin.v1",
    "roles/container.admin",
    "roles/iam.serviceAccountUser",
    "roles/iam.serviceAccountTokenCreator",
    "roles/artifactregistry.admin",
    "roles/artifactregistry.repoAdmin",
    "roles/cloudkms.admin",
    "roles/cloudkms.cryptoKeyEncrypter",
    "roles/source.admin",
    "roles/pubsub.admin"
  ]

  github_project = "flux2"

  github_secret_credentials_name = "FLUX2_E2E_GOOGLE_CREDENTIALS"

  github_secret_custom = {
    "TF_VAR_gcp_keyring"           = "",
    "TF_VAR_gcp_crypto_key"        = "",
    "TF_VAR_gcp_email"             = "",
    "GCP_GITREPO_SSH_CONTENTS"     = base64encode(tls_private_key.privatekey.private_key_openssh),
    "GCP_GITREPO_SSH_PUB_CONTENTS" = base64encode(tls_private_key.privatekey.public_key_openssh)
  }
}

output "publickey" {
  value = tls_private_key.privatekey.public_key_openssh
}
```

Copy the `publickey` output printed after applying, or run `terraform output` to print it again, and add it to the Google Source Repository SSH public keys under the user account with the email address referred to in `TF_VAR_gcp_email` above.
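If you only want the key itself, without the quoting that `terraform output` adds around string values, the output can be printed in raw form, for example:

```console
$ terraform output -raw publickey
```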
**NOTE:** The environment variables used above are for the GitHub workflow that runs the tests. Change the variable names as needed.

## Tests

Each test run is initiated by running `terraform apply` in the provider's terraform directory (e.g. `terraform/azure`); it does this by using the [tftestenv package](https://github.com/fluxcd/test-infra/blob/main/tftestenv/testenv.go) from the `fluxcd/test-infra` repository. It then reads the Terraform output to get information needed for the tests, like the Kubernetes client ID, the cloud repository URLs, the Key Vault ID, etc. This means that a lot of the communication with the cloud provider API is offloaded to Terraform instead of requiring it to be implemented in the test.

The following tests are currently implemented:

- Flux can be successfully installed on the cluster using the Flux CLI
- source-controller can clone cloud provider repositories (Azure DevOps, Google Cloud Source Repositories) (https+ssh)
- image-reflector-controller can list tags from provider container registry image repositories
- kustomize-controller can decrypt secrets using SOPS and provider key vault
- image-automation-controller can create branches and push to cloud repositories (https+ssh)
- source-controller can pull charts from cloud provider container registry Helm repositories
- notification-controller can forward events to the cloud events service (Event Hub for Azure and Google Pub/Sub)

The following test is run only for Azure, since it is only supported in the notification-controller:

- notification-controller can send commit status to Azure DevOps

### Running tests locally

1. Ensure that you have the Flux CLI binary that is to be tested built and ready. You can build it by running `make build` at the root of this repository. The binary is located in the `./bin` directory at the root, and by default this is where the Makefile copies the binary for the tests from. If you have it in a different location, you can set it with the `FLUX_BINARY` variable.
2. Copy `.env.sample` to `.env` and add the values for the different variables for the provider that you are running the tests for.
3. Run `make test-<provider>`, setting the location of the flux binary with the `FLUX_BINARY` variable:

   ```console
   $ make test-azure
   make test PROVIDER_ARG="-provider azure"
   # These two versions of podinfo are pushed to the cloud registry and used in tests for ImageUpdateAutomation
   mkdir -p build
   cp ../../bin/flux build/flux
   docker pull ghcr.io/stefanprodan/podinfo:6.0.0
   6.0.0: Pulling from stefanprodan/podinfo
   Digest: sha256:e7eeab287181791d36c82c904206a845e30557c3a4a66a8143fa1a15655dae97
   Status: Image is up to date for ghcr.io/stefanprodan/podinfo:6.0.0
   ghcr.io/stefanprodan/podinfo:6.0.0
   docker pull ghcr.io/stefanprodan/podinfo:6.0.1
   6.0.1: Pulling from stefanprodan/podinfo
   Digest: sha256:1169f220a670cf640e45e1a7ac42dc381a441e9d4b7396432cadb75beb5b5d68
   Status: Image is up to date for ghcr.io/stefanprodan/podinfo:6.0.1
   ghcr.io/stefanprodan/podinfo:6.0.1
   go test -timeout 60m -v ./ -existing -provider azure --tags=integration
   2023/03/24 02:32:25 Setting up azure e2e test infrastructure
   2023/03/24 02:32:25 Terraform binary: /usr/local/bin/terraform
   2023/03/24 02:32:25 Init Terraform
   ...[some output has been cut out]
   2023/03/24 02:39:33 helm repository condition not ready
   --- PASS: TestACRHelmRelease (15.31s)
   === RUN   TestKeyVaultSops
   --- PASS: TestKeyVaultSops (15.98s)
   PASS
   2023/03/24 02:40:12 Destroying environment...
   ok      github.com/fluxcd/flux2/tests/integration       947.341s
   ```

In the above, the test created a build directory `build/` and the flux CLI binary was copied to `build/flux`. It is used to bootstrap Flux on the cluster. You can configure the location of the Flux CLI binary by setting the `FLUX_BINARY` variable. We also pull two versions of the `ghcr.io/stefanprodan/podinfo` image. These images are pushed to the cloud provider's container registry and used to test `ImageRepository` and `ImageUpdateAutomation`. The terraform resources get created and the tests are run.

**IMPORTANT:** In case the terraform infrastructure results in a bad state, maybe due to a crash during the apply, the whole infrastructure can be destroyed by running `terraform destroy` in the provider's terraform directory.

### Debugging the tests

For debugging environment provisioning, enable verbose output with the `-verbose` test flag:

```sh
make test-azure GO_TEST_ARGS="-verbose"
```

The test environment is destroyed at the end by default. Run the tests with the `-retain` flag to retain the created test infrastructure:

```sh
make test-azure GO_TEST_ARGS="-retain"
```

The tests require the infrastructure state to be clean. For re-running the tests with a retained infrastructure, set the `-existing` flag:

```sh
make test-azure GO_TEST_ARGS="-retain -existing"
```

To delete an existing infrastructure created with the `-retain` flag:

```sh
make test-azure GO_TEST_ARGS="-existing"
```

To debug issues on the cluster created by the test (provided you passed in the `-retain` flag):

```sh
export KUBECONFIG=./build/kubeconfig
kubectl get pods
```
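Since the retained cluster has Flux bootstrapped on it by the test run, the usual Flux CLI inspection commands also work against it. A couple of illustrative examples, assuming the default `flux-system` namespace:

```sh
export KUBECONFIG=./build/kubeconfig
# List all Flux resources and their readiness across namespaces
flux get all --all-namespaces
# Inspect the logs of an individual controller
kubectl -n flux-system logs deploy/source-controller
```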