mirror of synced 2026-03-01 11:16:56 +00:00

13 Commits

Author SHA1 Message Date
Paulo Gomes
35785b8a6f rfc: Add story 2 and alternatives
Signed-off-by: Paulo Gomes <paulo.gomes@weave.works>
2022-10-04 16:06:09 +01:00
Stefan Prodan
650bea497f Add proposal for adding a gating mechanism to Flux
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-09-29 17:57:30 +03:00
Stefan Prodan
b8fd46d0df Merge pull request #3098 from Santosh1176/monitoring
[Grafana] Use `container_memory_working_set_bytes` to report memory consumption
2022-09-29 11:16:10 +03:00
Santosh Kaluskar
6a1ba3c545 monitoring: use container_memory_working_set_bytes
Signed-off-by: Santosh Kaluskar <dtshbl@gmail.com>
2022-09-29 07:49:13 +00:00
Stefan Prodan
33a874800b Merge pull request #3154 from fluxcd/rfc-0003-cosign
[RFC-0003] Add Cosign keyless specification
2022-09-29 09:42:20 +03:00
Stefan Prodan
f417352370 Add Cosign keyless specification
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-09-29 09:20:46 +03:00
Stefan Prodan
72d90b5692 Merge pull request #3153 from fluxcd/build-go1.19
Build with Go 1.19
2022-09-29 00:21:18 +03:00
Stefan Prodan
d7dadb4425 e2e: Update bootstrap test to Kubernetes 1.25.2
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-09-28 23:54:08 +03:00
Stefan Prodan
348408e16e Build with Go 1.19
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-09-28 22:05:48 +03:00
Stefan Prodan
04de52044a Merge pull request #3117 from carlosonunez-vmw/main
Maintain original scheme when using --token-auth
2022-09-28 10:51:06 +03:00
Carlos Nunez
45a00a0170 Maintain original scheme when using --token-auth
If you're using an HTTP-based Git server with Flux, you need to provide `--token-auth` to avoid triggering an SSH host key check (see [here](https://github.com/fluxcd/flux2/issues/2825#issuecomment-1151355914)). Unfortunately, doing this forces the URL in the `GitRepository` resource created during bootstrapping to always use `https`. This will cause Kustomization reconcile errors for servers that do not have HTTPS enabled or do not have the appropriate certs installed or available.

This pull request fixes this by keeping the repository URL scheme intact when using `--token-auth`.

Signed-off-by: Carlos Nunez <75340335+carlosonunez-vmw@users.noreply.github.com>
2022-09-27 22:14:29 -05:00
Stefan Prodan
1ac380a7f9 Merge pull request #3145 from fluxcd/component-label
Add component label for controllers and their CRDs
2022-09-26 14:45:26 +03:00
Stefan Prodan
2971d34a13 Add component label for controllers and their CRDs
Label each controller deployment, service, service account and CRDs with `app.kubernetes.io/component: <controller-name>`.

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
2022-09-26 14:08:32 +03:00
27 changed files with 418 additions and 34 deletions

View File

@@ -23,12 +23,12 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v3
with:
-go-version: 1.18.x
+go-version: 1.19.x
- name: Setup Kubernetes
uses: engineerd/setup-kind@v0.5.0
with:
-version: v0.11.1
-image: kindest/node:v1.21.1@sha256:69860bda5563ac81e3c0057d654b5253219618a22ec3a346306239bba8cfa1a6
+version: v0.16.0
+image: kindest/node:v1.25.2@sha256:9be91e9e9cdf116809841fc77ebdb8845443c4c72fe5218f3ae9eb57fdb4bace
- name: Setup Kustomize
uses: fluxcd/pkg//actions/kustomize@main
- name: Build

View File

@@ -16,7 +16,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v3
with:
-go-version: 1.18.x
+go-version: 1.19.x
- name: Prepare
id: prep
run: |

View File

@@ -23,7 +23,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v2
with:
-go-version: 1.18.x
+go-version: 1.19.x
- name: Install libgit2
run: |
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 648ACFD622F3D138

View File

@@ -27,7 +27,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v3
with:
-go-version: 1.18.x
+go-version: 1.19.x
- name: Setup Kubernetes
uses: engineerd/setup-kind@v0.5.0
with:

View File

@@ -20,7 +20,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v3
with:
-go-version: 1.18.x
+go-version: 1.19.x
- name: Setup QEMU
uses: docker/setup-qemu-action@v2
- name: Setup Docker Buildx

View File

@@ -57,7 +57,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v2
with:
-go-version: 1.18
+go-version: 1.19.x
- name: Initialize CodeQL
uses: github/codeql-action/init@v2
with:

View File

@@ -16,7 +16,7 @@ jobs:
- name: Setup Go
uses: actions/setup-go@v3
with:
-go-version: 1.18.x
+go-version: 1.19.x
- name: Update component versions
id: update
run: |

View File

@@ -67,7 +67,7 @@ for source changes.
Prerequisites:
-* go >= 1.17
+* go >= 1.19
* kubectl >= 1.20
* kustomize >= 4.4
* coreutils (on Mac OS)

View File

@@ -17,8 +17,8 @@ rwildcard=$(foreach d,$(wildcard $(addsuffix *,$(1))),$(call rwildcard,$(d)/,$(2
all: test build
tidy:
-go mod tidy -compat=1.18
-cd tests/azure && go mod tidy -compat=1.18
+go mod tidy -compat=1.19
+cd tests/azure && go mod tidy -compat=1.19
fmt:
go fmt ./...

View File

@@ -192,7 +192,9 @@ func bootstrapGitCmdRun(cmd *cobra.Command, args []string) error {
// Configure repository URL to match auth config for sync.
repositoryURL.User = nil
-repositoryURL.Scheme = "https"
+if !gitArgs.insecureHttpAllowed {
+repositoryURL.Scheme = "https"
+}
} else {
secretOpts.PrivateKeyAlgorithm = sourcesecret.PrivateKeyAlgorithm(bootstrapArgs.keyAlgorithm)
secretOpts.Password = gitArgs.password
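
The behaviour introduced by this hunk can be sketched in isolation. The helper below is a minimal, hypothetical illustration and not the code in `bootstrapGitCmdRun`: the name `syncURL` and its parameter are assumptions, and only the `net/url` standard library is used. It drops the user info and forces `https` unless insecure HTTP has been explicitly allowed.

```go
package main

import (
	"fmt"
	"net/url"
)

// syncURL is a hypothetical helper that mirrors the logic of the hunk above:
// strip credentials from the repository URL and switch the scheme to https,
// unless insecure HTTP is allowed, in which case the original scheme is kept.
func syncURL(raw string, insecureHTTPAllowed bool) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("invalid repository URL %q: %w", raw, err)
	}
	u.User = nil
	if !insecureHTTPAllowed {
		u.Scheme = "https"
	}
	return u.String(), nil
}

func main() {
	for _, allowed := range []bool{false, true} {
		out, err := syncURL("http://git.example.com/org/repo.git", allowed)
		if err != nil {
			panic(err)
		}
		fmt.Printf("insecure HTTP allowed=%v -> %s\n", allowed, out)
	}
}
```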

View File

@@ -79,12 +79,12 @@ type upsertable interface {
// want to update. The mutate function is nullary -- you mutate a
// value in the closure, e.g., by doing this:
//
// var existing Value
// existing.Name = name
// existing.Namespace = ns
// upsert(ctx, client, valueAdapter{&value}, func() error {
// value.Spec = onePreparedEarlier
// })
// var existing Value
// existing.Name = name
// existing.Namespace = ns
// upsert(ctx, client, valueAdapter{&value}, func() error {
// value.Spec = onePreparedEarlier
// })
func (names apiType) upsert(ctx context.Context, kubeClient client.Client, object upsertable, mutate func() error) (types.NamespacedName, error) {
nsname := types.NamespacedName{
Namespace: object.GetNamespace(),

View File

@@ -214,7 +214,6 @@ func getRowsToPrint(getAll bool, list summarisable) ([][]string, error) {
return rows, nil
}
-//
// watch starts a client-side watch of one or more resources.
func (get *getCommand) watch(ctx context.Context, kubeClient client.WithWatch, cmd *cobra.Command, args []string, listOpts []client.ListOption) error {
w, err := kubeClient.Watch(ctx, get.list.asClientList(), listOpts...)

View File

@@ -4,6 +4,8 @@ resources:
- https://github.com/fluxcd/helm-controller/releases/download/v0.24.0/helm-controller.crds.yaml
- https://github.com/fluxcd/helm-controller/releases/download/v0.24.0/helm-controller.deployment.yaml
- account.yaml
transformers:
- labels.yaml
patchesJson6902:
- target:
group: apps

View File

@@ -0,0 +1,9 @@
apiVersion: builtin
kind: LabelTransformer
metadata:
name: labels
labels:
app.kubernetes.io/component: helm-controller
fieldSpecs:
- path: metadata/labels
create: true

View File

@@ -4,6 +4,8 @@ resources:
- https://github.com/fluxcd/image-automation-controller/releases/download/v0.25.0/image-automation-controller.crds.yaml
- https://github.com/fluxcd/image-automation-controller/releases/download/v0.25.0/image-automation-controller.deployment.yaml
- account.yaml
transformers:
- labels.yaml
patchesJson6902:
- target:
group: apps

View File

@@ -0,0 +1,9 @@
apiVersion: builtin
kind: LabelTransformer
metadata:
name: labels
labels:
app.kubernetes.io/component: image-automation-controller
fieldSpecs:
- path: metadata/labels
create: true

View File

@@ -4,6 +4,8 @@ resources:
- https://github.com/fluxcd/image-reflector-controller/releases/download/v0.21.0/image-reflector-controller.crds.yaml
- https://github.com/fluxcd/image-reflector-controller/releases/download/v0.21.0/image-reflector-controller.deployment.yaml
- account.yaml
transformers:
- labels.yaml
patchesJson6902:
- target:
group: apps

View File

@@ -0,0 +1,9 @@
apiVersion: builtin
kind: LabelTransformer
metadata:
name: labels
labels:
app.kubernetes.io/component: image-reflector-controller
fieldSpecs:
- path: metadata/labels
create: true

View File

@@ -4,6 +4,8 @@ resources:
- https://github.com/fluxcd/kustomize-controller/releases/download/v0.28.0/kustomize-controller.crds.yaml
- https://github.com/fluxcd/kustomize-controller/releases/download/v0.28.0/kustomize-controller.deployment.yaml
- account.yaml
transformers:
- labels.yaml
patchesJson6902:
- target:
group: apps
@@ -11,4 +13,3 @@ patchesJson6902:
kind: Deployment
name: kustomize-controller
path: patch.yaml

View File

@@ -0,0 +1,9 @@
apiVersion: builtin
kind: LabelTransformer
metadata:
name: labels
labels:
app.kubernetes.io/component: kustomize-controller
fieldSpecs:
- path: metadata/labels
create: true

View File

@@ -4,6 +4,8 @@ resources:
- https://github.com/fluxcd/notification-controller/releases/download/v0.26.0/notification-controller.crds.yaml
- https://github.com/fluxcd/notification-controller/releases/download/v0.26.0/notification-controller.deployment.yaml
- account.yaml
transformers:
- labels.yaml
patchesJson6902:
- target:
group: apps

View File

@@ -0,0 +1,9 @@
apiVersion: builtin
kind: LabelTransformer
metadata:
name: labels
labels:
app.kubernetes.io/component: notification-controller
fieldSpecs:
- path: metadata/labels
create: true

View File

@@ -4,6 +4,8 @@ resources:
- https://github.com/fluxcd/source-controller/releases/download/v0.29.0/source-controller.crds.yaml
- https://github.com/fluxcd/source-controller/releases/download/v0.29.0/source-controller.deployment.yaml
- account.yaml
transformers:
- labels.yaml
patchesJson6902:
- target:
group: apps
@@ -11,4 +13,3 @@ patchesJson6902:
kind: Deployment
name: source-controller
path: patch.yaml

View File

@@ -0,0 +1,9 @@
apiVersion: builtin
kind: LabelTransformer
metadata:
name: labels
labels:
app.kubernetes.io/component: source-controller
fieldSpecs:
- path: metadata/labels
create: true

View File

@@ -548,7 +548,7 @@
"steppedLine": false,
"targets": [
{
-"expr": "rate(go_memstats_alloc_bytes_total{namespace=\"$namespace\",pod=~\".*-controller-.*\"}[1m])",
+"expr": "sum(container_memory_working_set_bytes{namespace=\"$namespace\",container!=\"POD\",container!=\"\",pod=~\".*-controller-.*\"}) by (pod)",
"hide": false,
"interval": "",
"legendFormat": "{{pod}}",

View File

@@ -4,7 +4,7 @@
**Creation date:** 2022-03-31
-**Last update:** 2022-08-22
+**Last update:** 2022-09-28
## Summary
@@ -124,16 +124,6 @@ spec:
semver: "6.0.x"
```
To verify the authenticity of an artifact, the Sigstore cosign public key can be supplied with:
```yaml
spec:
verify:
provider: cosign
secretRef:
name: cosign-key
```
### Layer selection
By default, Flux assumes that the first layer of the OCI artifact contains the Kubernetes configuration.
@@ -224,6 +214,34 @@ controller will use a specific cloud SDK for authentication purposes. If both `s
a non-generic provider are present in the definition, the controller will use the static credentials
from the referenced secret.
### Verify artifacts
To verify the authenticity of the OCI artifacts, Flux will use the Sigstore Go SDK and implement verification
for artifacts which were either signed with keys generated by Cosign or signed using the Cosign
[keyless method](https://github.com/sigstore/cosign/blob/main/KEYLESS.md).
To enable signature verification, the Cosign public key can be supplied with:
```yaml
spec:
verify:
provider: cosign
secretRef:
name: cosign-key
```
For verifying public artifacts which are signed using the keyless method,
the `spec.verify.secretRef` field must be omitted:
```yaml
spec:
verify:
provider: cosign
```
When using the keyless method, Flux will verify the signatures in the Rekor
transparency log instance hosted at [rekor.sigstore.dev](https://rekor.sigstore.dev/).
### Reconcile artifacts
The `OCIRepository` can be used as a drop-in replacement for `GitRepository` and `Bucket` sources.

301
rfcs/XXXX-gating/README.md Normal file
View File

@@ -0,0 +1,301 @@
# RFC-XXXX Gating Flux reconciliation
**Status:** provisional
**Creation date:** 2022-09-28
**Last update:** 2022-10-04
## Summary
Flux should offer a mechanism for cluster admins and other teams involved in the release process
to manually approve the rollout of changes onto clusters. In addition, Flux should offer
a way to define maintenance time windows and other time-based gates, to allow better control
of application and infrastructure changes to critical systems.
## Motivation
Flux watches sources (e.g. GitRepositories, OCIRepositories, HelmRepositories, S3-compatible Buckets) and
automatically reconciles the changes onto clusters as described with Flux Kustomizations and HelmReleases.
The teams involved in the delivery process (e.g. dev, qa, sre) can decide when changes are delivered
to production by reviewing and approving the proposed changes in a collaborative manner with pull requests.
Once a pull request is merged onto a branch that defines the desired state of the production system,
Flux kicks off the reconciliation process.
There are situations when users want to have a gating mechanism after the desired state changes are merged in Git:
- Manual approval of container image updates (e.g. https://github.com/fluxcd/flux2/discussions/870)
- Manual approval of infrastructure upgrades (e.g. https://github.com/fluxcd/flux2/issues/959)
- Maintenance window (e.g. https://github.com/fluxcd/flux2/discussions/1004)
- Planned releases
- No Deploy Friday
### Goals
- Offer a dedicated API for defining time-based gates in a declarative manner.
- Introduce a `gating-controller` in the Flux suite that manages the `Gate` objects.
- Extend the current Flux APIs and controllers to support gating.
### Non-Goals
<!--
What is out of scope for this RFC? Listing non-goals helps to focus discussion
and make progress.
-->
## Proposal
In order to support manual gating, Flux could be extended with a dedicated API and controller
that would allow users to define `Gate` objects and perform operations like `open` and `close`.
A `Gate` object could be referenced in sources (Buckets, Git, Helm, OCI Repositories)
and syncs (Kustomizations, HelmReleases, ImageUpdateAutomation)
to block the reconciliation until the gate is opened.
A `Gate` can be opened or closed by annotating the object with a timestamp or by
calling a specific webhook receiver exposed by notification-controller.
A `Gate` can be configured to automatically close or open based on a time window defined in the `Gate` spec.
The `Gate` API would replace Flagger's current
[manual gating mechanism](https://docs.flagger.app/usage/webhooks#manual-gating).
### User Stories
#### Story 1
> As a member of the SRE team, I want to allow deployments to happen only
> in a particular time frame of my own choosing.
Define a gate that automatically closes 1h after it has been opened:
```yaml
apiVersion: gating.toolkit.fluxcd.io/v1alpha1
kind: Gate
metadata:
name: sre-approval
namespace: flux-system
spec:
interval: 30s
default: closed
window: 1h
```
When the gate is created in-cluster, the `gating-controller` uses `spec.default` to set the `Opened` condition:
```yaml
apiVersion: gating.toolkit.fluxcd.io/v1alpha1
kind: Gate
metadata:
name: sre-approval
namespace: flux-system
status:
conditions:
- lastTransitionTime: "2021-03-26T10:09:26Z"
message: "Gate closed by default"
reason: ReconciliationSucceeded
status: "False"
type: Opened
```
While the gate is closed, all the objects that reference it will wait for an approval:
```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
name: my-app
namespace: flux-system
spec:
gates:
- name: sre-approval
- name: qa-approval
status:
conditions:
- lastTransitionTime: "2021-03-26T10:09:26Z"
message: "Reconciliation is waiting approval, gate 'flux-system/sre-approval' is closed."
reason: GateClosed
status: "False"
type: Approved
```
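
How a consumer such as a `Kustomization` might evaluate its `spec.gates` list can be sketched as follows. This is a minimal illustration under stated assumptions: the `gateRef` type and `approval` function are hypothetical, and the sketch assumes that every referenced gate must be open before reconciliation is approved, which the RFC implies but does not spell out.

```go
package main

import "fmt"

// gateRef is a hypothetical, simplified view of a referenced Gate:
// its identity plus the current value of its Opened condition.
type gateRef struct {
	Namespace string
	Name      string
	Opened    bool
}

// approval reports whether reconciliation may proceed. It assumes AND
// semantics: the first closed gate blocks reconciliation and its name is
// surfaced in the message, using the wording from the status example above.
func approval(gates []gateRef) (approved bool, message string) {
	for _, g := range gates {
		if !g.Opened {
			return false, fmt.Sprintf(
				"Reconciliation is waiting approval, gate '%s/%s' is closed.",
				g.Namespace, g.Name)
		}
	}
	return true, ""
}

func main() {
	gates := []gateRef{
		{Namespace: "flux-system", Name: "sre-approval", Opened: false},
		{Namespace: "flux-system", Name: "qa-approval", Opened: true},
	}
	ok, msg := approval(gates)
	fmt.Println(ok, msg)
}
```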
The SRE team can open the gate either by annotating the gate or by calling the notification-controller webhook:
```sh
kubectl -n flux-system annotate --overwrite gate/sre-approval \
open.gate.fluxcd.io/requestedAt="$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
```
The `gating-controller` extracts the ISO8601 date from the `open.gate` annotation value,
sets the `requestedAt` & `resetToDefaultAt`, and opens the gate for the specified window:
```yaml
apiVersion: gating.toolkit.fluxcd.io/v1alpha1
kind: Gate
metadata:
name: sre-approval
namespace: flux-system
status:
requestedAt: "2021-03-26T10:00:00Z"
resetToDefaultAt: "2021-03-26T11:00:00Z"
conditions:
- lastTransitionTime: "2021-03-26T10:00:00Z"
message: "Gate scheduled for closing at 2021-03-26T11:00:00Z"
reason: ReconciliationSucceeded
status: "True"
type: Opened
```
While the gate is opened, all the objects that reference it are approved to reconcile at their configured interval.
The SRE can decide to close the gate ahead of its schedule with:
```sh
kubectl -n flux-system annotate --overwrite gate/sre-approval \
close.gate.fluxcd.io/requestedAt="$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
```
The `gating-controller` extracts the ISO8601 date from the `close.gate` annotation value,
compares it with the `open.gate` annotation value and the `requestedAt` date, and closes the gate:
```yaml
apiVersion: gating.toolkit.fluxcd.io/v1alpha1
kind: Gate
metadata:
name: sre-approval
namespace: flux-system
status:
requestedAt: "2021-03-26T10:10:00Z"
resetToDefaultAt: "2021-03-26T10:10:00Z"
conditions:
- lastTransitionTime: "2021-03-26T10:10:00Z"
message: "Gate close requested"
reason: ReconciliationSucceeded
status: "False"
type: Opened
```
The objects that reference this gate will finish their ongoing reconciliation (if any), then pause.
> As a member of the SRE team, I want to block deployments in a particular time window.
To enforce a maintenance window of 24 hours, you can define a `Gate` that's opened by default:
```yaml
apiVersion: gating.toolkit.fluxcd.io/v1alpha1
kind: Gate
metadata:
name: maintenance
namespace: flux-system
spec:
interval: 30s
default: opened
window: 24h
```
To start the maintenance window you can annotate the gate with:
```sh
kubectl -n flux-system annotate --overwrite gate/maintenance \
close.gate.fluxcd.io/requestedAt="$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
```
The `gating-controller` extracts the ISO8601 date from the `close.gate`
annotation value and closes the gate for the specified window:
```yaml
apiVersion: gating.toolkit.fluxcd.io/v1alpha1
kind: Gate
metadata:
name: maintenance
namespace: flux-system
status:
requestedAt: "2021-03-26T10:00:00Z"
resetToDefaultAt: "2021-03-27T10:00:00Z"
conditions:
- lastTransitionTime: "2021-03-26T10:00:00Z"
message: "Gate scheduled for opening at 2021-03-27T10:00:00Z"
reason: ReconciliationSucceeded
status: "False"
type: Opened
```
You could also schedule "No Deploy Fridays" with a CronJob that closes the `maintenance` gate at `0 0 * * FRI`.
#### Story 2
> As a member of the SRE team, I want existing deployments to still be
> reconciled during a change freeze.
Gates can be used to block Flux sources from being refreshed, so that Flux
continues to reconcile the existing approved desired state, whilst new changes
are held at a Flux source gate.
Example:
```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: GitRepository
metadata:
name: flux-system
namespace: flux-system
spec:
gates:
- name: change-freeze # gate that enforces a change freeze time window
status:
conditions:
- lastTransitionTime: "2022-05-26T01:12:22Z"
message: "Reconciliation is blocked as gate 'flux-system/change-freeze' is closed."
reason: GateClosed
status: "True"
type: Blocked
```
This would ensure that Gate changes would not impact the eventual consistency of
mid-flight reconciliations that were already deployed in the cluster. Flux would also
continue to re-create Flux managed objects that were manually deleted from the cluster.
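
The effect of gating a source rather than a sync can be illustrated with a small sketch. The `source` type and `refresh` function are hypothetical simplifications, not part of any Flux API: while the gate is closed the source keeps exposing its last approved revision, so downstream objects keep reconciling that revision and correcting drift.

```go
package main

import "fmt"

// source is a hypothetical, simplified stand-in for a Flux source object.
type source struct {
	// ArtifactRevision is the revision currently exposed to consumers.
	ArtifactRevision string
}

// refresh adopts the upstream revision only when the gate is open.
// It returns the resulting source and whether the update was blocked.
func refresh(s source, upstreamRevision string, gateOpen bool) (source, bool) {
	if !gateOpen {
		return s, true // keep serving the last approved revision
	}
	s.ArtifactRevision = upstreamRevision
	return s, false
}

func main() {
	s := source{ArtifactRevision: "main/1a2b3c"}

	s, blocked := refresh(s, "main/4d5e6f", false)
	fmt.Println(s.ArtifactRevision, "blocked:", blocked)

	s, blocked = refresh(s, "main/4d5e6f", true)
	fmt.Println(s.ArtifactRevision, "blocked:", blocked)
}
```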
### Alternatives
#### Users to implement gating outside of Flux
##### Before Flux source
Users could implement their own gating mechanisms as part of their development processes
ensuring that their custom rules are applied before the changes reach their Flux sources
(i.e. the target Git repository). For example, if deployments are not allowed on Fridays,
no PRs would be merged on those days.
The disadvantage is that some source types may not provide easy ways for users to enforce
such rules. When using different source types (e.g. Git, OCI, Helm), multiple implementations
may be required.
##### CronJobs and Flux Suspend
Users can implement a gating mechanism within Kubernetes by leveraging CronJobs and using
the built-in suspend feature in Flux that allows for a Flux object to stop being reconciled
until it is resumed. This alternative does not scale well when considering hundreds of Flux
objects.
## Design Details
<!--
This section should contain enough information that the specifics of your
change are understandable. This may include API specs and code snippets.
The design details should address at least the following questions:
- How can this feature be enabled / disabled?
- Does enabling the feature change any default behavior?
- Can the feature be disabled once it has been enabled?
- How can an operator determine if the feature is in use?
- Are there any drawbacks when enabling this feature?
-->
## Implementation History
<!--
Major milestones in the lifecycle of the RFC such as:
- The first Flux release where an initial version of the RFC was available.
- The version of Flux where the RFC graduated to general availability.
- The version of Flux where the RFC was retired or superseded.
-->