go-git-providers

Abstract

This proposal aims to create a library with the import path github.com/fluxcd/go-git-providers (import name: gitprovider), which provides an abstraction layer for talking to Git providers like GitHub, GitLab and Bitbucket.

This would become a new repository, specifically targeted at being a general-purpose Git provider client for multiple providers and domains.

Goals

  • Support multiple Git provider backends (e.g. GitHub, GitLab, Bitbucket, etc.) using the same interface
  • Support talking to multiple domains at once, including custom domains (e.g. talking to "gitlab.com" and "version.aalto.fi" from the same client)
  • Support no authentication (for public repos), basic auth, and OAuth2 as authentication methods
  • Manipulating the following resources:
    • Organizations: GET, LIST (both all accessible top-level orgs and sub-orgs)
    • For a given Organization:
      • Teams: GET and LIST
    • Repositories: GET, LIST and POST
      • Team Access: LIST, POST and DELETE
      • Credentials: LIST, POST and DELETE
  • Support sub-organizations (or "sub-groups" in GitLab) if possible
  • Support reconciling an object for idempotent operations
  • Pagination is automatically handled for LIST requests
  • Transparently manage teams (collections of users; sub-groups in GitLab) with varying access to repos
  • Follow library best practices in order to be easy to vendor (e.g. use major vX versioning & go.mod)

Non-goals

  • Support for features not mentioned above

Design decisions

  • A context.Context should be passed to every request as the first argument
  • There should be two interfaces per resource, if applicable:
    • one collection-specific interface, with a plural name (e.g. OrganizationsClient), that has methods like Get() and List()
    • one instance-specific interface, with a singular name (e.g. OrganizationClient), that operates on that instance, allowing access to child resources such as Teams()
  • Every Create() signature shall have a {Resource}CreateOptions struct as the last argument.
    • Delete() and similar methods may use the same pattern if needed
  • All *Options structs shall be passed by value (i.e. non-nillable) and contain only nillable, optional fields
  • All optional fields in the type structs shall be nillable
  • It should be possible to create a fake API client for testing, implementing the same interfaces
  • All type structs shall have a Validate() method, and optionally a Default() one
  • All type structs shall expose their internal representation (from the underlying library) through the InternalGetter interface with a method GetInternal() interface{}
  • Typed errors shall be returned, wrapped using the error-wrapping features introduced in Go 1.13 (%w, errors.Is and errors.As)
  • Go-style enums are used when there are only a few supported values for a field
  • Every field is documented using a Godoc comment, including +required or +optional to clearly signify its importance
  • Support serializing the types to JSON (if needed for e.g. debugging) by adding tags

Implementation

Provider package

The provider package, e.g. at github.com/fluxcd/go-git-providers/github, will have constructor methods so a client can be created, e.g. as follows:

// Create a client for github.com without any authentication
c := github.NewClient()

// Create a client for an enterprise GitHub account, without any authentication
c = github.NewClient(github.WithBaseURL("enterprise.github.com"))

// Create a client for github.com using a personal oauth2 token
c = github.NewClient(github.WithOAuth2("<token-here>"))
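The variadic With... arguments above suggest the functional options pattern. A minimal sketch of how such a constructor could collect its settings follows; the clientOptions struct, the ClientOption type and the buildOptions helper are assumptions made for illustration and are not defined by this proposal.

```go
package main

import "fmt"

// clientOptions is a hypothetical holder for the settings a constructor collects.
type clientOptions struct {
	domain      string
	oauth2Token string
}

// ClientOption mutates clientOptions; the type name is an assumption.
type ClientOption func(*clientOptions)

// WithBaseURL overrides the default domain of the client.
func WithBaseURL(domain string) ClientOption {
	return func(o *clientOptions) { o.domain = domain }
}

// WithOAuth2 configures a personal OAuth2 token.
func WithOAuth2(token string) ClientOption {
	return func(o *clientOptions) { o.oauth2Token = token }
}

// buildOptions starts from the defaults and applies each option in order.
func buildOptions(opts ...ClientOption) clientOptions {
	o := clientOptions{domain: "github.com"}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Println(buildOptions().domain)
	fmt.Println(buildOptions(WithBaseURL("enterprise.github.com")).domain)
}
```

With this shape, new settings can be added later without changing the NewClient signature.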

Client

The definition of a Client is as follows:

// Client is an interface that allows talking to a Git provider
type Client interface {
	// The Client allows accessing all known resources
	ResourceClient

	// SupportedDomain returns the supported domain
	// This field is set at client creation time, and can't be changed
	SupportedDomain() string

	// ProviderID returns the provider ID (e.g. "github", "gitlab") for this client
	// This field is set at client creation time, and can't be changed
	ProviderID() ProviderID

	// Raw returns the Go client used under the hood for accessing the Git provider
	Raw() interface{}
}

As one can see, the Client is scoped to a single backing domain. ProviderID is a typed string, and every implementation package defines its own constant, e.g. const ProviderName = gitprovider.ProviderID("github").

The ResourceClient actually allows talking to resources of the API, both for single objects, and collections:

// ResourceClient allows access to resource-specific clients
type ResourceClient interface {
	// Organization gets the OrganizationClient for the specific top-level organization
	// ErrNotTopLevelOrganization will be returned if the organization is not top-level when using
	Organization(o OrganizationRef) OrganizationClient

	// Organizations returns the OrganizationsClient handling sets of organizations
	Organizations() OrganizationsClient

	// Repository gets the RepositoryClient for the specified RepositoryRef
	Repository(r RepositoryRef) RepositoryClient

	// Repositories returns the RepositoriesClient handling sets of repositories
	Repositories() RepositoriesClient
}

In order to reference organizations and repositories, there are the OrganizationRef and RepositoryRef interfaces:

// OrganizationRef references an organization in a Git provider
type OrganizationRef interface {
	// String returns the HTTPS URL
	fmt.Stringer

	// GetDomain returns the URL-domain for the Git provider backend, e.g. gitlab.com or version.aalto.fi
	GetDomain() string
	// GetOrganization returns the top-level organization, i.e. "weaveworks" or "kubernetes-sigs"
	GetOrganization() string
	// GetSubOrganizations returns the names of sub-organizations (or sub-groups),
	// e.g. ["engineering", "frontend"] would be returned for gitlab.com/weaveworks/engineering/frontend
	GetSubOrganizations() []string
}

// RepositoryRef references a repository hosted by a Git provider
type RepositoryRef interface {
	// RepositoryRef requires an OrganizationRef to fully-qualify a repo reference
	OrganizationRef

	// GetRepository returns the name of the repository
	GetRepository() string
}

Along with these, there are OrganizationInfo and RepositoryInfo, which implement the above-mentioned interfaces in a straightforward way.

If you want to create an OrganizationRef or RepositoryRef, you can either use NewOrganizationInfo() or NewRepositoryInfo(), filling in all parts of the reference, or use the ParseRepositoryURL(r string) (RepositoryRef, error) or ParseOrganizationURL(o string) (OrganizationRef, error) methods.
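As a rough illustration of how a URL maps onto the reference getters, the following sketch parses an organization URL into its domain, top-level organization and sub-organizations. It is simplified: it returns the concrete OrganizationInfo rather than the OrganizationRef interface, and the field names and the ErrURLInvalid sentinel are assumptions, not part of the proposal.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// OrganizationInfo is a plain struct providing the OrganizationRef getters.
// The field names are assumptions.
type OrganizationInfo struct {
	Domain           string
	Organization     string
	SubOrganizations []string
}

func (o OrganizationInfo) GetDomain() string             { return o.Domain }
func (o OrganizationInfo) GetOrganization() string       { return o.Organization }
func (o OrganizationInfo) GetSubOrganizations() []string { return o.SubOrganizations }

// String returns the HTTPS URL of the organization.
func (o OrganizationInfo) String() string {
	parts := append([]string{o.Domain, o.Organization}, o.SubOrganizations...)
	return "https://" + strings.Join(parts, "/")
}

// ErrURLInvalid is a hypothetical sentinel error for malformed URLs.
var ErrURLInvalid = errors.New("invalid organization URL")

// ParseOrganizationURL splits an HTTPS URL into domain, organization and sub-organizations.
func ParseOrganizationURL(raw string) (OrganizationInfo, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return OrganizationInfo{}, fmt.Errorf("%w: %v", ErrURLInvalid, err)
	}
	if u.Scheme != "https" || u.Host == "" {
		return OrganizationInfo{}, fmt.Errorf("%w: %q is not an HTTPS URL", ErrURLInvalid, raw)
	}
	segments := strings.Split(strings.Trim(u.Path, "/"), "/")
	for _, s := range segments {
		if s == "" {
			return OrganizationInfo{}, fmt.Errorf("%w: %q has an empty path segment", ErrURLInvalid, raw)
		}
	}
	return OrganizationInfo{
		Domain:           u.Host,
		Organization:     segments[0],
		SubOrganizations: segments[1:],
	}, nil
}

func main() {
	ref, err := ParseOrganizationURL("https://gitlab.com/weaveworks/engineering/frontend")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ref.GetOrganization(), ref.GetSubOrganizations())
}
```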

As mentioned above, only one target domain is supported by the Client. This means e.g. that if the Client is configured for GitHub, and you feed it a GitLab URL to parse, ErrDomainUnsupported will be returned.
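The domain check could look roughly like the sketch below, which also shows the wrapped, typed error style from the design decisions. ErrDomainUnsupported is named in this proposal, but defining it as a sentinel value and the validateDomain helper are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDomainUnsupported is named in the proposal; declaring it as a sentinel is an assumption.
var ErrDomainUnsupported = errors.New("domain unsupported")

// validateDomain is a hypothetical helper that a Client could call before
// accepting a parsed reference. It wraps the sentinel so callers can still
// match it with errors.Is while getting a descriptive message.
func validateDomain(supported, got string) error {
	if got != supported {
		return fmt.Errorf("%w: client serves %q, got %q", ErrDomainUnsupported, supported, got)
	}
	return nil
}

func main() {
	err := validateDomain("github.com", "gitlab.com")
	fmt.Println(errors.Is(err, ErrDomainUnsupported))
}
```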

This brings us to a higher-level client abstraction, MultiClient.

MultiClient

In order to automatically support multiple domains and providers using the same interface, MultiClient is introduced.

The user would use the MultiClient as follows:

// Create a client to github.com without authentication
gh := github.NewClient()

// Create a client to gitlab.com, authenticating with basic auth
gl := gitlab.NewClient(gitlab.WithBasicAuth("<username>", "<password>"))

// Create a client to the GitLab instance at version.aalto.fi, with a given OAuth2 token
aalto := gitlab.NewClient(gitlab.WithBaseURL("version.aalto.fi"), gitlab.WithOAuth2Token("<your-token>"))

// Create a MultiClient which supports talking to any of these backends
client := gitprovider.NewMultiClient(gh, gl, aalto)

The interface definition of MultiClient is similar to that of Client: both embed ResourceClient, but MultiClient also gives access to the underlying domain-specific Clients:

// MultiClient allows talking to multiple Git providers at once
type MultiClient interface {
	// The MultiClient allows accessing all known resources, automatically choosing the right underlying
	// Client based on the resource's domain
	ResourceClient

	// SupportedDomains returns a list of known domains
	SupportedDomains() []string

	// ClientForDomain returns the Client used for a specific domain
	ClientForDomain(domain string) (Client, bool)
}
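One plausible way to implement the domain lookup is a map keyed by each client's supported domain. The sketch below reduces Client to a single method and uses a hypothetical staticClient stand-in; it only illustrates the dispatch idea, not the full ResourceClient behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// Client is reduced to the one method this sketch needs.
type Client interface {
	SupportedDomain() string
}

// staticClient is a hypothetical stand-in for a real provider client.
type staticClient struct{ domain string }

func (c staticClient) SupportedDomain() string { return c.domain }

// multiClient indexes the underlying clients by their domain.
type multiClient struct {
	clients map[string]Client
}

// NewMultiClient registers every client under its supported domain.
// If two clients share a domain, the later one wins in this sketch.
func NewMultiClient(clients ...Client) *multiClient {
	m := &multiClient{clients: make(map[string]Client, len(clients))}
	for _, c := range clients {
		m.clients[c.SupportedDomain()] = c
	}
	return m
}

// ClientForDomain returns the Client used for a specific domain.
func (m *multiClient) ClientForDomain(domain string) (Client, bool) {
	c, ok := m.clients[domain]
	return c, ok
}

// SupportedDomains returns the known domains in sorted order.
func (m *multiClient) SupportedDomains() []string {
	domains := make([]string, 0, len(m.clients))
	for d := range m.clients {
		domains = append(domains, d)
	}
	sort.Strings(domains)
	return domains
}

func main() {
	m := NewMultiClient(staticClient{"github.com"}, staticClient{"version.aalto.fi"})
	fmt.Println(m.SupportedDomains())
}
```

A resource method on the MultiClient would then read the domain from the given reference, look up the matching Client, and delegate to it.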

OrganizationsClient

The OrganizationsClient provides access to a set of organizations, as follows:

// OrganizationsClient operates on organizations the user has access to
type OrganizationsClient interface {
	// Get a specific organization the user has access to
	// This might also refer to a sub-organization
	// ErrNotFound is returned if the resource does not exist
	Get(ctx context.Context, o OrganizationRef) (*Organization, error)

	// List all top-level organizations the specific user has access to
	// List should return all available organizations, using multiple paginated requests if needed
	List(ctx context.Context) ([]Organization, error)

	// Children returns the immediate child-organizations for the specific OrganizationRef o.
	// The OrganizationRef may point to any sub-organization that exists
	// This is not supported in GitHub
	// Children should return all available organizations, using multiple paginated requests if needed
	Children(ctx context.Context, o OrganizationRef) ([]Organization, error)

	// Possibly add Create/Update/Delete methods later
}
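The "multiple paginated requests" behaviour described for List and Children can be sketched as a loop that keeps fetching until the backend reports no further page. The page struct, fetchFunc type and fakeFetch backend are hypothetical, and the context.Context argument is omitted for brevity.

```go
package main

import "fmt"

// page is a hypothetical paginated response from a provider API.
type page struct {
	items    []string
	nextPage int // 0 means there are no further pages
}

// fetchFunc fetches a single page by number.
type fetchFunc func(pageNum int) (page, error)

// listAll requests pages one after another and concatenates their items.
func listAll(fetch fetchFunc) ([]string, error) {
	var all []string
	for pageNum := 1; pageNum != 0; {
		p, err := fetch(pageNum)
		if err != nil {
			return nil, fmt.Errorf("fetching page %d: %w", pageNum, err)
		}
		all = append(all, p.items...)
		pageNum = p.nextPage
	}
	return all, nil
}

// fakeFetch serves three organization names across two pages.
func fakeFetch(pageNum int) (page, error) {
	switch pageNum {
	case 1:
		return page{items: []string{"weaveworks", "fluxcd"}, nextPage: 2}, nil
	case 2:
		return page{items: []string{"kubernetes-sigs"}}, nil
	}
	return page{}, fmt.Errorf("page %d does not exist", pageNum)
}

func main() {
	orgs, err := listAll(fakeFetch)
	fmt.Println(orgs, err)
}
```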

The Organization struct is fairly straightforward for now:

// Organization represents a (top-level or sub-) organization
type Organization struct {
	// OrganizationInfo provides the required fields
	// (Domain, Organization and SubOrganizations) required for being an OrganizationRef
	OrganizationInfo `json:",inline"`
	// InternalHolder implements the InternalGetter interface
	// +optional
	InternalHolder `json:",inline"`

	// Name is the human-friendly name of this organization, e.g. "Weaveworks" or "Kubernetes SIGs"
	// +required
	Name string `json:"name"`

	// Description is a description of the organization
	// No default value at POST-time
	// +optional
	Description *string `json:"description"`
}

The OrganizationInfo struct is a straightforward struct just implementing the OrganizationRef interface with basic fields & getters. InternalHolder is implementing the InternalGetter interface as follows, and is embedded into all main structs:

// InternalGetter allows access to the underlying object
type InternalGetter interface {
	// GetInternal returns the underlying struct that's used
	GetInternal() interface{}
}

// InternalHolder can be embedded into other structs to implement the InternalGetter interface
type InternalHolder struct {
	// Internal contains the underlying object.
	// +optional
	Internal interface{} `json:"-"`
}
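A small sketch of how embedding InternalHolder could give a type the GetInternal() method, and how a caller might recover the provider-specific object with a type assertion. The value receiver, the trimmed Organization struct and the rawOrg type (standing in for whatever the underlying library returns) are assumptions.

```go
package main

import "fmt"

// InternalHolder can be embedded into other structs to implement InternalGetter.
type InternalHolder struct {
	Internal interface{} `json:"-"`
}

// GetInternal returns the underlying object.
func (ih InternalHolder) GetInternal() interface{} { return ih.Internal }

// rawOrg is a hypothetical stand-in for a struct from the underlying provider library.
type rawOrg struct{ Login string }

// Organization is trimmed to what this sketch needs.
type Organization struct {
	InternalHolder
	Name string
}

func main() {
	org := Organization{
		InternalHolder: InternalHolder{Internal: &rawOrg{Login: "weaveworks"}},
		Name:           "Weaveworks",
	}
	// The method is promoted from the embedded InternalHolder.
	if raw, ok := org.GetInternal().(*rawOrg); ok {
		fmt.Println(raw.Login)
	}
}
```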

OrganizationClient

OrganizationClient allows access to a specific organization's underlying resources as follows:

// OrganizationClient operates on a given/specific organization
type OrganizationClient interface {
	// Teams gives access to the TeamsClient for this specific organization
	Teams() OrganizationTeamsClient
}

Organization Teams

Teams belonging to a certain organization can at this moment be fetched on an individual basis, or listed.

// OrganizationTeamsClient handles teams organization-wide
type OrganizationTeamsClient interface {
	// Get a team within the specific organization
	// teamName may include slashes, to point to e.g. "sub-teams" i.e. subgroups in GitLab
	// teamName must not be an empty string
	// ErrNotFound is returned if the resource does not exist
	Get(ctx context.Context, teamName string) (*Team, error)

	// List all teams (recursively, in terms of subgroups) within the specific organization
	// List should return all available teams, using multiple paginated requests if needed
	List(ctx context.Context) ([]Team, error)

	// Possibly add Create/Update/Delete methods later
}

The Team struct is defined as follows:

// Team is a representation for a team of users inside of an organization
type Team struct {
	// Team embeds OrganizationInfo which makes it automatically comply with OrganizationRef
	OrganizationInfo `json:",inline"`
	// Team embeds InternalHolder for accessing the underlying object
	// +optional
	InternalHolder `json:",inline"`

	// Name describes the name of the team. The team name may contain slashes
	// +required
	Name string `json:"name"`

	// Members points to a set of user names (logins) of the members of this team
	// +required
	Members []string `json:"members"`
}

In GitLab, teams could be modelled as users in a sub-group. Those users can later be added as a single unit to access a given repository.

RepositoriesClient

RepositoriesClient provides access to a set of repositories for the user.

// RepositoriesClient operates on repositories the user has access to
type RepositoriesClient interface {
	// Get returns the repository at the given path
	// ErrNotFound is returned if the resource does not exist
	Get(ctx context.Context, r RepositoryRef) (*Repository, error)

	// List all repositories in the given organization
	// List should return all available repositories, using multiple paginated requests if needed
	List(ctx context.Context, o OrganizationRef) ([]Repository, error)

	// Create creates a repository at the given organization path, with the given URL-encoded name and options
	// ErrAlreadyExists will be returned if the resource already exists
	Create(ctx context.Context, r *Repository, opts RepositoryCreateOptions) (*Repository, error)

	// Reconcile makes sure r is the actual state in the backing Git provider. If r doesn't exist
	// under the hood, it is created. If r is already the actual state, this is a no-op. If r isn't
	// the actual state, the resource will either be updated or deleted/recreated.
	Reconcile(ctx context.Context, r *Repository) error
}
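The create / no-op / update semantics of Reconcile can be illustrated with an in-memory fake, which is also the kind of test double the design decisions call for. The sketch below is an assumption-laden simplification: Repository is trimmed to two fields, the context argument is dropped, and fakeRepositories with its writes counter is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Repository is trimmed to two comparable fields for this sketch.
type Repository struct {
	Name        string
	Description string
}

// fakeRepositories is a hypothetical in-memory backend keyed by repository name.
type fakeRepositories struct {
	store  map[string]Repository
	writes int // counts how often the backing store was modified
}

// Reconcile creates r if it is missing, does nothing if the stored state
// already equals r, and overwrites the stored state otherwise.
func (f *fakeRepositories) Reconcile(r Repository) error {
	if r.Name == "" {
		return errors.New("repository name must not be empty")
	}
	actual, exists := f.store[r.Name]
	if exists && actual == r {
		return nil // already the actual state: no-op
	}
	f.store[r.Name] = r
	f.writes++
	return nil
}

func main() {
	f := &fakeRepositories{store: map[string]Repository{}}
	repo := Repository{Name: "flux2", Description: "GitOps toolkit"}
	_ = f.Reconcile(repo)
	_ = f.Reconcile(repo) // second call changes nothing
	fmt.Println(f.writes)
}
```

Because repeated calls with the same desired state leave the backend untouched, callers can invoke Reconcile unconditionally, e.g. from a control loop.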

RepositoryCreateOptions has options like AutoInit *bool, LicenseTemplate *string and so forth to allow a one-time initialization step.
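To make the "nillable optional fields plus Default() and Validate()" convention concrete, here is a minimal sketch of such an options struct. Only the AutoInit and LicenseTemplate field names come from this proposal; the default of false and the validation rule are assumptions chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// RepositoryCreateOptions holds only optional, nillable fields.
type RepositoryCreateOptions struct {
	// AutoInit requests an initial commit at creation time
	// +optional
	AutoInit *bool
	// LicenseTemplate names a license template to apply at creation time
	// +optional
	LicenseTemplate *string
}

// Default fills in unset fields. Defaulting AutoInit to false is an assumption.
func (o *RepositoryCreateOptions) Default() {
	if o.AutoInit == nil {
		v := false
		o.AutoInit = &v
	}
}

// Validate rejects a LicenseTemplate that is set but empty (a hypothetical rule).
func (o *RepositoryCreateOptions) Validate() error {
	if o.LicenseTemplate != nil && *o.LicenseTemplate == "" {
		return errors.New("licenseTemplate must not be empty when set")
	}
	return nil
}

func main() {
	opts := RepositoryCreateOptions{}
	opts.Default()
	fmt.Println(*opts.AutoInit, opts.Validate())
}
```

Using pointers lets the library distinguish "not set by the caller" from an explicit false or empty value.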

The Repository struct is defined as follows:

// Repository represents a Git repository provided by a Git provider
type Repository struct {
	// RepositoryInfo provides the required fields
	// (Domain, Organization, SubOrganizations and RepositoryName)
	// required for being a RepositoryRef
	RepositoryInfo `json:",inline"`
	// InternalHolder implements the InternalGetter interface
	// +optional
	InternalHolder `json:",inline"`

	// Description is a description of the repository
	// No default value at POST-time
	// +optional
	Description *string `json:"description"`

	// Visibility is the desired visibility for the repository
	// Default value at POST-time: RepoVisibilityPrivate
	// +optional
	Visibility *RepoVisibility `json:"visibility"`
}

// GetCloneURL gets the clone URL for the specified transport type
func (r *Repository) GetCloneURL(transport TransportType) string {
	return GetCloneURL(r, transport)
}

As can be seen, there is also a GetCloneURL function for the repository, which resolves the URL from which to clone the repo for a given transport method (ssh and https are the supported TransportTypes).
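A rough sketch of how a clone URL might be derived per transport type. The constant names, the helper signature and the exact URL shapes (the common https://host/path.git and git@host:path.git conventions) are assumptions; individual providers may use other formats.

```go
package main

import "fmt"

// TransportType is a Go-style enum for the supported transports.
type TransportType string

const (
	// The constant names are assumptions; the proposal only names ssh and https.
	TransportTypeHTTPS TransportType = "https"
	TransportTypeSSH   TransportType = "ssh"
)

// getCloneURL is a hypothetical helper building a clone URL from a domain and
// a repository path such as "fluxcd/flux2". It returns an empty string for
// unknown transport types.
func getCloneURL(domain, path string, transport TransportType) string {
	switch transport {
	case TransportTypeHTTPS:
		return fmt.Sprintf("https://%s/%s.git", domain, path)
	case TransportTypeSSH:
		return fmt.Sprintf("git@%s:%s.git", domain, path)
	}
	return ""
}

func main() {
	fmt.Println(getCloneURL("github.com", "fluxcd/flux2", TransportTypeHTTPS))
	fmt.Println(getCloneURL("github.com", "fluxcd/flux2", TransportTypeSSH))
}
```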

RepositoryClient

RepositoryClient allows access to a given repository's underlying resources, as follows:

// RepositoryClient operates on a given/specific repository
type RepositoryClient interface {
	// TeamAccess gives access to what teams have access to this specific repository
	TeamAccess() RepositoryTeamAccessClient

	// Credentials gives access to manipulating credentials for accessing this specific repository
	Credentials() RepositoryCredentialsClient
}

Repository Teams

RepositoryTeamAccessClient allows adding teams to, and removing them from, the list of teams authorized to access a repository.

// RepositoryTeamAccessClient operates on the teams list for a specific repository
type RepositoryTeamAccessClient interface {
	// Create adds a given team to the repo's team access control list
	// ErrAlreadyExists will be returned if the resource already exists
	// The embedded RepositoryInfo of ta does not need to be populated, but if it is,
	// it must be equal to the RepositoryRef given to the RepositoryClient.
	Create(ctx context.Context, ta *TeamAccess, opts RepositoryAddTeamOptions) error

	// List returns the team access control list for this repo
	List(ctx context.Context) ([]TeamAccess, error)

	// Reconcile makes sure ta is the actual state in the backing Git provider. If ta doesn't exist
	// under the hood, it is created. If ta is already the actual state, this is a no-op. If ta isn't
	// the actual state, the resource will either be updated or deleted/recreated.
	// The embedded RepositoryInfo of ta does not need to be populated, but if it is,
	// it must be equal to the RepositoryRef given to the RepositoryClient.
	Reconcile(ctx context.Context, ta *TeamAccess) error

	// Delete removes the given team from the repo's team access control list
	// ErrNotFound is returned if the resource does not exist
	Delete(ctx context.Context, teamName string) error
}

The TeamAccess struct looks as follows:

// TeamAccess describes a binding between a repository and a team
type TeamAccess struct {
	// TeamAccess embeds RepositoryInfo which makes it automatically comply with RepositoryRef
	// +optional
	RepositoryInfo `json:",inline"`
	// TeamAccess embeds InternalHolder for accessing the underlying object
	// +optional
	InternalHolder `json:",inline"`

	// Name describes the name of the team. The team name may contain slashes
	// +required
	Name string `json:"name"`

	// Permission describes the permission level for which the team is allowed to operate
	// Default: read
	// Available options: See the TeamRepositoryPermission enum
	// +optional
	Permission *TeamRepositoryPermission `json:"permission"`
}
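The TeamRepositoryPermission enum itself is not spelled out in this document. As a sketch of the "Go-style enum" convention, it might look like the following; only the read default is stated above, while the write and admin values, the constant names and both helpers are hypothetical.

```go
package main

import "fmt"

// TeamRepositoryPermission is a typed string acting as a Go-style enum.
type TeamRepositoryPermission string

const (
	// TeamRepositoryPermissionRead matches the documented default ("read").
	TeamRepositoryPermissionRead TeamRepositoryPermission = "read"
	// The following values are hypothetical examples.
	TeamRepositoryPermissionWrite TeamRepositoryPermission = "write"
	TeamRepositoryPermissionAdmin TeamRepositoryPermission = "admin"
)

// validatePermission rejects values outside the enum.
func validatePermission(p TeamRepositoryPermission) error {
	switch p {
	case TeamRepositoryPermissionRead, TeamRepositoryPermissionWrite, TeamRepositoryPermissionAdmin:
		return nil
	}
	return fmt.Errorf("unknown team repository permission %q", p)
}

// permissionOrDefault resolves a nillable field to its effective value.
func permissionOrDefault(p *TeamRepositoryPermission) TeamRepositoryPermission {
	if p == nil {
		return TeamRepositoryPermissionRead
	}
	return *p
}

func main() {
	fmt.Println(permissionOrDefault(nil))
	fmt.Println(validatePermission("owner"))
}
```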

Repository Credentials

RepositoryCredentialsClient allows adding & removing credentials (e.g. deploy keys) that grant access to a specific repository.

// RepositoryCredentialsClient operates on the access credential list for a specific repository
type RepositoryCredentialsClient interface {
	// Create a credential with the given human-readable name, the given bytes and optional options
	// ErrAlreadyExists will be returned if the resource already exists
	Create(ctx context.Context, c RepositoryCredential, opts CredentialCreateOptions) error

	// List returns all credentials of the given credential type
	List(ctx context.Context, t RepositoryCredentialType) ([]RepositoryCredential, error)

	// Reconcile makes sure c is the actual state in the backing Git provider. If c doesn't exist
	// under the hood, it is created. If c is already the actual state, this is a no-op. If c isn't
	// the actual state, the resource will either be updated or deleted/recreated.
	Reconcile(ctx context.Context, c RepositoryCredential) error

	// Delete removes a credential from the repo. name corresponds to GetName() of the credential
	// ErrNotFound is returned if the resource does not exist
	Delete(ctx context.Context, t RepositoryCredentialType, name string) error
}

In order to support multiple different types of credentials, RepositoryCredential is an interface:

// RepositoryCredential is a credential that allows access (either read-only or read-write) to the repo
type RepositoryCredential interface {
	// GetType returns the type of the credential
	GetType() RepositoryCredentialType

	// GetName returns a name (or title/description) of the credential
	GetName() string

	// GetData returns the key that will be authorized to access the repo, this can e.g. be an SSH public key
	GetData() []byte

	// IsReadOnly returns true if this credential grants read-only access, i.e. it is not authorized to write to the repository
	IsReadOnly() bool
}

The default implementation of RepositoryCredential is DeployKey:

// DeployKey represents a short-lived credential (e.g. an SSH public key) used for accessing a repository
type DeployKey struct {
	// DeployKey embeds InternalHolder for accessing the underlying object
	// +optional
	InternalHolder `json:",inline"`

	// Title is the human-friendly interpretation of what the key is for (and does)
	// +required
	Title string `json:"title"`

	// Key specifies the public part of the deploy (e.g. SSH) key
	// +required
	Key []byte `json:"key"`

	// ReadOnly specifies whether this DeployKey is restricted to read-only access, i.e. cannot write to the repository
	// Default value at POST-time: true
	// +optional
	ReadOnly *bool `json:"readOnly"`
}
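For illustration, DeployKey could satisfy the RepositoryCredential interface roughly as sketched below, including the documented default of ReadOnly being true when the field is unset. The RepositoryCredentialTypeDeployKey constant, its value, and the value receivers are assumptions; InternalHolder is left out to keep the sketch short.

```go
package main

import "fmt"

// RepositoryCredentialType is a typed string; the constant below is hypothetical.
type RepositoryCredentialType string

const RepositoryCredentialTypeDeployKey RepositoryCredentialType = "deploykey"

// RepositoryCredential mirrors the interface from the proposal.
type RepositoryCredential interface {
	GetType() RepositoryCredentialType
	GetName() string
	GetData() []byte
	IsReadOnly() bool
}

// DeployKey is trimmed to the fields relevant for the interface.
type DeployKey struct {
	Title    string
	Key      []byte
	ReadOnly *bool
}

func (dk DeployKey) GetType() RepositoryCredentialType { return RepositoryCredentialTypeDeployKey }
func (dk DeployKey) GetName() string                   { return dk.Title }
func (dk DeployKey) GetData() []byte                   { return dk.Key }

// IsReadOnly applies the documented default (true) when ReadOnly is unset.
func (dk DeployKey) IsReadOnly() bool {
	if dk.ReadOnly == nil {
		return true
	}
	return *dk.ReadOnly
}

// Compile-time check that DeployKey implements RepositoryCredential.
var _ RepositoryCredential = DeployKey{}

func main() {
	var c RepositoryCredential = DeployKey{Title: "flux", Key: []byte("ssh-ed25519 AAAA...")}
	fmt.Println(c.GetName(), c.IsReadOnly())
}
```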