This self-assessment sheet is a tool for teams implementing continuous integration and delivery (CID). It assists them in evaluating their practices, preparing a CID feature roadmap, and prioritising the tasks on that roadmap.

Metrics

Do we track the 4 key metrics?

From https://www.thoughtworks.com/radar/techniques/four-key-metrics

The thorough State of DevOps reports have focused on data-driven and statistical analysis of high-performing organizations. The result of this multiyear research, published in Accelerate, demonstrates a direct link between organizational performance and software delivery performance. The researchers have determined that only four key metrics differentiate between low, medium and high performers: lead time, deployment frequency, mean time to restore (MTTR) and change fail percentage. Indeed, we’ve found that these four key metrics are a simple and yet powerful tool to help leaders and teams focus on measuring and improving what matters. A good place to start is to instrument the build pipelines so you can capture the four key metrics and make the software delivery value stream visible. GoCD pipelines, for example, provide the ability to measure these four key metrics as a first-class citizen of the GoCD analytics.

Links:

  • https://devops-research.com/research.html
  • https://itrevolution.com/book/accelerate/
  • https://www.gocd.org/

Note that adopting reasonably short sprints and implementing continuous delivery and infrastructure as code tends to optimise these key metrics as a side effect. It then removes any strong incentive to actually monitor them.
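As an illustration, the four key metrics can be computed from a log of deployment records. The data model below is a hypothetical sketch, not tied to any specific CID tool; real pipelines would export these records from their build and incident systems.

```python
from datetime import datetime, timedelta

# Hypothetical deployment records: commit time, deploy time, whether the
# change caused a failure in production, and (if so) when service was restored.
deployments = [
    {"committed": datetime(2024, 1, 1, 9), "deployed": datetime(2024, 1, 1, 11),
     "failed": False, "restored": None},
    {"committed": datetime(2024, 1, 2, 10), "deployed": datetime(2024, 1, 2, 14),
     "failed": True, "restored": datetime(2024, 1, 2, 15)},
    {"committed": datetime(2024, 1, 3, 8), "deployed": datetime(2024, 1, 3, 9),
     "failed": False, "restored": None},
    {"committed": datetime(2024, 1, 3, 13), "deployed": datetime(2024, 1, 3, 16),
     "failed": False, "restored": None},
]

def four_key_metrics(deployments, period_days):
    lead_times = [d["deployed"] - d["committed"] for d in deployments]
    failures = [d for d in deployments if d["failed"]]
    restore_times = [d["restored"] - d["deployed"] for d in failures]
    return {
        # Mean time from commit to deployment.
        "lead_time": sum(lead_times, timedelta()) / len(deployments),
        # Deployments per day over the observation period.
        "deployment_frequency": len(deployments) / period_days,
        # Mean time to restore service after a failed change.
        "mttr": sum(restore_times, timedelta()) / len(failures) if failures else None,
        # Percentage of deployments that caused a failure.
        "change_fail_percentage": 100 * len(failures) / len(deployments),
    }

metrics = four_key_metrics(deployments, period_days=3)
```

With the sample records above, the lead time averages two and a half hours and one deployment out of four failed, giving a 25% change fail percentage.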

Questioning Quality

Can we add new repositories to the CID pipeline?

Desired state: It is easy and reliable to add new repositories to the CID pipeline.

Rationale: For teams using many repositories, which is typical of microservices architectures, creating a new repository is a common operation.

Can we upgrade the CID software?

Desired state: The CID software is up to date and running free of known bugs and security vulnerabilities.

Rationale: Bugs in the CID software hinder the team's progress.

Does the CID system avoid accessing STAGING and LIVE secrets?

Desired state: The CID system does not access STAGING and LIVE secrets.

Rationale: When the CID system accesses STAGING and LIVE secrets, it opens the door to privilege escalation by attackers gaining illegitimate access to the CID system. When the CID system is not publicly available and a solid access policy is implemented, the privilege escalation risk is mitigated.

Does the CID system collect garbage?

Desired state: The CID system can sustain long term operation without exhausting storage.

Rationale: When the CID system exhausts storage, it is no longer available for normal operation and hinders the team's progress. The CID pipeline produces software artefacts or analysis artefacts on each run. Older artefacts need to be garbage collected to avoid exhausting storage resources on the CID system.

Do we scan consumed software artefacts for security issues in the security pipeline?

Desired state: The software we deploy on live systems has the latest available security patches and contains no trojan horses.

Rationale: Deploying vulnerable software increases the attack surface available to attackers.

Do we perform automated code analysis in the security pipeline?

Desired state: We perform automated code analysis in the security pipeline according to the team's goals.

Rationale: Style checking, linting, code coverage analysis and other code metrics or reports belong here. Depending on the team's goals, automated code analysis results may or may not block the deployment.

Do CID pipeline resources match their payload?

Desired state: CID pipeline resources can process CID jobs in a timely manner.

Rationale: If pending jobs are accumulating in the CID pipeline backlog, the length of the development life-cycle and the lead-time increase accordingly.
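One hypothetical way to make backlog pressure visible is to track each job's queue wait, the time between submission and the start of execution, and alert when the average exceeds a threshold. Both the records and the threshold below are illustrative assumptions to be tuned per team.

```python
from statistics import mean

# Hypothetical job records: seconds spent waiting in the queue before a
# CID agent picked the job up.
queue_waits = [12, 45, 30, 600, 15, 900, 20]

def backlog_alert(queue_waits, threshold_seconds=300):
    """Flag the pipeline as under-provisioned when the average queue wait
    exceeds the threshold (an assumed value; tune it to your lead-time goals)."""
    return mean(queue_waits) > threshold_seconds
```

Wiring such a check into the metrics dashboard turns "the pipeline feels slow" into an actionable capacity signal.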

Can we gain shell access to the CID agents on an ad-hoc basis?

Desired state: Developers can gain shell access to the CID agents running tests.

Rationale: On the rare occasions where tests are reproducibly successful on a developer workstation and reproducibly failing in the test pipeline, there is little left to do but to investigate the run on the CID agents themselves.

Can we automatically deploy the CID pipeline?

Desired state: Developers have an automated system that can spin up a new CID pipeline on fresh computing resources.

Rationale: In the event of a total disaster (destruction, compromise), a new CID pipeline needs to be set up as quickly as possible, because the CID pipeline is responsible for deploying software to production.

Do we have stable tests?

Desired state: Test status is stable, i.e. running a test multiple times will always yield the same result.

Rationale: If tests are unstable or flaky, they are not a source of truth and are less useful to developers, if of any use at all. In some cases, it is easier to allow a specific category of tests to be flaky (e.g. journey tests) than to solve a hard problem to ensure stability (e.g. synchronisation of processes).
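A common source of instability is unseeded randomness in test data. A minimal sketch of the remedy, using an explicit seed so the same inputs are generated on every run (the generator and payload shape are hypothetical):

```python
import random

def sample_payload(seed=None):
    """Generate a pseudo-random test payload.

    Passing an explicit seed makes the payload deterministic; leaving the
    generator unseeded is a classic source of flaky tests.
    """
    rng = random.Random(seed)
    return [rng.randint(0, 100) for _ in range(5)]

# The same seed always yields the same payload, so a failure can be replayed.
assert sample_payload(seed=42) == sample_payload(seed=42)
```

Logging the seed alongside a failing run turns a one-off flake into a reproducible bug report.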

Do we implement continuous deployment?

Desired state: We implement continuous deployment: developers' work is continuously tested and published to a sandbox environment and to production.

Rationale: Continuous deployment stimulates a good discipline of successful deployments and reduces software inventory. It is usually the desired deployment discipline for SaaS organisations, but those operating in a different segment have different needs. For instance, teams developing products requiring specific hardware, or products requiring manual intervention to carry out a complete test batch, need to adjust the usual continuous deployment discipline so that they can still enjoy most of its benefits in their particular use case.

Do we implement fuzz tests?

Desired state: We implement fuzz tests.

Rationale: Using a fuzzer in tests improves test coverage. Some care must be taken to ensure that tests remain reproducible.
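A sketch of a reproducible fuzz loop: random byte strings are fed to the function under test, and the seed is reported on failure so the exact run can be replayed. The `parse` argument and the exception policy are assumptions for illustration; dedicated fuzzing tools offer far more sophisticated input generation.

```python
import random

def fuzz_parse(parse, runs=100, seed=0):
    """Feed pseudo-random byte strings to `parse` (the hypothetical
    function under test). ValueError counts as a clean rejection of
    invalid input; any other exception is a crash, reported with the
    seed so the failing run can be reproduced exactly."""
    rng = random.Random(seed)
    for i in range(runs):
        data = bytes(rng.randint(0, 255) for _ in range(rng.randint(0, 64)))
        try:
            parse(data)
        except ValueError:
            pass  # expected rejection of invalid input
        except Exception as exc:
            raise AssertionError(f"crash on run {i} with seed {seed}: {exc!r}")
    return runs
```

Fixing the seed per pipeline run (and logging it) keeps the fuzz test deterministic while still exploring new inputs when the seed is rotated.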

Do we implement mutation tests?

Desired state: We implement mutation tests.

Rationale: Using a code mutation tool evaluates the quality of the test suite: a good suite fails, i.e. "kills the mutant", when the code is deliberately altered, revealing weak or missing assertions.