What is Continuous Integration?
A development practice where developers integrate code into a shared repository frequently. The integration frequency can range from a couple of changes a day or a week to a couple of changes an hour at larger scales.
Each piece of code (change/patch) is verified to make sure the change is safe to merge. Today, it’s a common practice to test the change using an automated build that makes sure the code can be integrated. This can be one build which runs several tests at different levels (unit, functional, etc.) or several separate builds, all or some of which have to pass in order for the change to be merged into the repository.
What is Continuous Deployment?
A development strategy where software is released to production automatically: any code commit must pass through an automated testing phase, and only when this is successful is the release considered production worthy. This eliminates any human interaction and should be implemented only after production-ready pipelines have been set up with real-time monitoring and reporting of deployed assets. If any issues are detected in production, it should be easy to roll back to the previous working state.
Can you describe an example of a CI (and/or CD) process starting the moment a developer submitted a change/PR to a repository?
There are many answers to such a question, as CI processes vary depending on the technologies used and the type of project to which the change was submitted. Such processes can include one or more of the following stages:
An example of one possible answer:
A developer submits a pull request to a project. The PR (pull request) triggers two jobs (or one combined job): one job for running lint tests on the change, and a second job for building a package which includes the submitted change and running multiple API/scenario tests using that package. Once all the tests pass and the change is approved by a maintainer/core, it’s merged/pushed to the repository. If some of the tests fail, the change is not allowed to be merged/pushed to the repository.
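This first example can be sketched as a pull-request-triggered workflow. A minimal sketch in GitHub Actions syntax, assuming hypothetical `make lint`, `make build` and `make test` targets that stand in for the project’s real commands:

```yaml
name: pr-checks
on: pull_request

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make lint          # lint the submitted change

  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build         # build a package that includes the change
      - run: make test          # run API/scenario tests using that package
```

Branch protection rules would then require both jobs to pass before a maintainer can merge the PR.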
A completely different answer, or CI process, can describe how a developer pushes code to a repository: a workflow is then triggered to build a container image and push it to the registry. Once the image is in the registry, the new changes are applied to the k8s cluster.
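This second process can be sketched similarly. A hedged sketch, again in GitHub Actions syntax; the registry URL, image name and deployment name are placeholders, and authentication to the registry and cluster is omitted:

```yaml
name: build-and-deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build and push a container image tagged with the commit SHA
      - run: docker build -t registry.example.com/myapp:${{ github.sha }} .
      - run: docker push registry.example.com/myapp:${{ github.sha }}
      # Roll the new image out to the cluster
      - run: kubectl set image deployment/myapp myapp=registry.example.com/myapp:${{ github.sha }}
```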
What is Continuous Delivery?
A development strategy used to frequently deliver code to QA and Ops for testing. This entails having a staging area with production-like features, where changes can only be accepted for production after a manual review. Because of this human involvement there is usually a time lag between release and review, making it slower and more error prone compared to continuous deployment.
What CI/CD best practices are you familiar with? Or what do you consider as CI/CD best practice?
- Automated process of building, testing and deploying software
- Commit and test often
- Testing/Staging environment should be a clone of production environment
You are given a pipeline and a pool with 3 workers: a virtual machine, a bare-metal machine and a container. How will you decide which one of them to run the pipeline on?
Where do you store CI/CD pipelines? Why?
There are multiple approaches as to where to store the CI/CD pipeline definitions:
- App Repository – store them in the same repository as the application they are building or testing (perhaps the most popular approach)
- Central Repository – store all of the organization’s/project’s CI/CD pipelines in one separate repository (perhaps the best approach when multiple teams test the same set of projects and would otherwise end up with many pipelines)
- CI repo for every app repo – you separate CI-related code from app code, but you don’t put everything in one place (perhaps the worst option due to the maintenance overhead)
Would you prefer a “configuration->deployment” model or “deployment->configuration”? Why?
Both have advantages and disadvantages. With the “configuration->deployment” model, for example, where you build one image to be used by multiple deployments, there is less chance of deployments differing from one another, so it has the clear advantage of a consistent environment.
Explain mutable vs. immutable infrastructure
In the mutable infrastructure paradigm, changes are applied on top of the existing infrastructure, and over time the infrastructure builds up a history of changes. Ansible, Puppet and Chef are examples of tools which follow the mutable infrastructure paradigm.
In the immutable infrastructure paradigm, every change is actually new infrastructure. So a change to a server will result in a new server instead of updating the existing one. Terraform is an example of a technology which follows the immutable infrastructure paradigm.
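A toy sketch of the difference in Python, with a server represented as a plain dict (an illustration of the two paradigms, not real provisioning):

```python
def mutable_update(server, change):
    """Mutable: apply the change on top of the existing server, in place."""
    server.update(change)
    server["history"].append(change)  # the server accumulates a history of changes
    return server

def immutable_update(server, change):
    """Immutable: never touch the old server; build a replacement instead."""
    new_server = {**server, **change, "history": [change]}
    return new_server  # the old server can be retired once the new one is live

old = {"os": "ubuntu-20.04", "history": []}
patched = mutable_update(old, {"os": "ubuntu-22.04"})
print(patched is old)   # True: same server, changed in place

base = {"os": "ubuntu-20.04", "history": []}
replacement = immutable_update(base, {"os": "ubuntu-22.04"})
print(replacement is base)  # False: a brand new server
print(base["os"])           # ubuntu-20.04, the original is untouched
```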
What ways are there to distribute software? What are the advantages and disadvantages of each method?
- Source – maintain a build script within the version control system so that users can build your app after cloning the repository. Advantage: users can quickly check out different versions of the application. Disadvantage: requires build tools to be installed on the user’s machine.
- Archive – collect all your app files into one archive (e.g. tar) and deliver it to the user. Advantage: the user gets everything they need in one file. Disadvantage: requires repeating the same procedure when updating; not good if there are a lot of dependencies.
- Package – depending on the OS, you can use your OS package format (e.g. in RHEL/Fedora it’s RPM) to deliver your software with a way to install, uninstall and update it using the standard package manager commands. Advantage: the package manager takes care of installation, uninstallation, updating and dependency management. Disadvantage: requires managing a package repository.
- Images – either VM or container images where your package is included with everything it needs in order to run successfully. Advantage: everything is preinstalled and it has a high degree of environment isolation. Disadvantage: requires knowledge of building and optimizing images.
Are you familiar with “The Cathedral and the Bazaar models”? Explain each of the models
Cathedral – source code is released when the software is released
Bazaar – source code is always available publicly (e.g. Linux Kernel)
What is caching? How does it work? Why is it important?
Caching provides fast access to frequently used resources which are computationally expensive or IO intensive to obtain and do not change often. There can be several layers of cache, ranging from CPU caches to distributed cache systems. Common ones are in-memory caching and distributed caching.
Caches are typically data structures that contain some data, such as a hashtable or dictionary. However, any data structure can provide caching capabilities, like a set, sorted set or sorted dictionary. While caching is used in many applications, it can create subtle bugs if not implemented or used correctly. For example, cache invalidation, expiration or updating is usually quite challenging.
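To make the expiration challenge concrete, here is a minimal in-memory cache sketch in Python with a per-entry TTL (time to live); real systems like Redis or Memcached handle this, plus eviction and distribution, for you:

```python
import time

class TTLCache:
    """A minimal in-memory cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, stored_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: invalidate on read
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.monotonic())

cache = TTLCache(ttl_seconds=60)
cache.put("user:42", {"name": "alice"})
print(cache.get("user:42"))  # the cached dict, until the entry expires
```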
Explain stateless vs. stateful
Stateless applications don’t store any data on the host, which makes them ideal for horizontal scaling and microservices. Stateful applications depend on storage to save state and data; databases are typically stateful applications.
IaC (infrastructure as code) is a declarative approach to defining the infrastructure or architecture of a system. Some implementations are ARM templates for Azure, and Terraform, which can work across multiple cloud providers.
How do you manage build artifacts?
What deployment strategies are you familiar with or have used?
There are several deployment strategies:
- Blue green deployment
- Canary releases
- Recreate strategy
You joined a team where everyone is developing one project, and the practice is to run tests locally on their workstations and push to the repository if the tests pass. What is the problem with the process as it is now, and how would you improve it?
What is configuration drift? What problems does it cause?
Configuration drift happens when, in an environment of servers with the exact same configuration and software, certain servers receive updates or configuration changes that the other servers don’t get, and over time these servers become slightly different from all the others.
How do you deal with configuration drift?
Configuration drift can be avoided with a desired state configuration (DSC) implementation. Desired state configuration can be a declarative file that defines how a system should be. There are tools to enforce the desired state, such as Terraform or Azure DSC, and there are incremental and complete enforcement strategies.
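A toy sketch of the idea behind desired state configuration: compare each server’s actual state to a declarative desired state and report the drift. Real tools do this against live infrastructure; here both states are plain dicts:

```python
desired_state = {"nginx_version": "1.24", "firewall": "enabled"}

def detect_drift(desired, actual):
    """Return the keys where a server's actual state differs from the desired state."""
    return {
        key: {"desired": value, "actual": actual.get(key)}
        for key, value in desired.items()
        if actual.get(key) != value
    }

servers = {
    "web-1": {"nginx_version": "1.24", "firewall": "enabled"},
    "web-2": {"nginx_version": "1.22", "firewall": "enabled"},  # drifted
}

for name, actual in servers.items():
    drift = detect_drift(desired_state, actual)
    if drift:
        print(f"{name} drifted: {drift}")
```

An enforcement tool would go one step further and apply the desired values instead of only reporting them.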
Explain Declarative and Procedural styles. The technologies you are familiar with (or using) are using procedural or declarative style?
Declarative – you write code that specifies the desired end state
Procedural – you describe the steps to get to the desired end state
Declarative tools – Terraform, Puppet, CloudFormation
Procedural tools – Ansible, Chef
To better emphasize the difference, consider creating two virtual instances/servers. In the declarative style, you specify two servers and the tool figures out how to reach that state. In the procedural style, you need to specify the steps to reach the end state of two instances/servers: for example, create a loop and in each iteration of the loop create one instance (running the loop twice, of course).
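The two-server example can be sketched in Python, with a list standing in for the real infrastructure (a toy illustration, not a real tool):

```python
servers = []  # stands in for the real infrastructure

def create_server(name):
    servers.append(name)

# Procedural: you spell out each step yourself.
def procedural_two_servers():
    for i in range(2):
        create_server(f"server-{i}")

# Declarative: you state the desired count; the "tool" reconciles toward it.
def declarative_reconcile(desired_count):
    while len(servers) < desired_count:
        create_server(f"server-{len(servers)}")
    while len(servers) > desired_count:
        servers.pop()

procedural_two_servers()
print(servers)            # ['server-0', 'server-1']
declarative_reconcile(2)  # already at the desired state: nothing to do
print(servers)            # still ['server-0', 'server-1']
```

Note that rerunning the declarative reconcile is safe (idempotent), while rerunning the procedural function would blindly create two more servers.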
What is GitOps?
GitLab: “GitOps is an operational framework that takes DevOps best practices used for application development such as version control, collaboration, compliance, and CI/CD tooling, and applies them to infrastructure automation”.
Explain test-driven development (TDD)
Explain agile software development
What do you think about the following sentence?: “implementing or practicing DevOps leads to more secure software”
Do you know what a “post-mortem meeting” is? What is your opinion on that?
How do you perform capacity planning for your CI/CD resources? (e.g. servers, storage, etc.)
How would you structure/implement CD for an application which depends on several other applications?
How do you measure your CI/CD quality? Are there any metrics or KPIs you are using for measuring the quality?
Do you have experience with testing cross-project changes? (aka cross-dependency)
Have you contributed to an open source project? Tell me about this experience
What is Distributed Tracing?
What is Reliability? How does it fit DevOps?
Reliability, when used in a DevOps context, is the ability of a system to recover from infrastructure failure or disruption. Part of it is also being able to scale based on your organization’s or team’s demands.
What does “Availability” mean? What means are there to track the availability of a service?
Describe the workflow of setting up some type of web server (Apache, IIS, Tomcat, …)
How does a web server work?
Explain “Open Source”
Describe the architecture of a service/app/project/… you designed and/or implemented
What types of tests are you familiar with?
Styling, unit, functional, API, integration, smoke, scenario, …
You should be able to explain those that you mention.
You need to periodically install a package (unless it already exists) on different operating systems (Ubuntu, RHEL, …). How would you do it?
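One common approach is a configuration management tool such as Ansible: its generic `ansible.builtin.package` module delegates to the right package manager per OS (apt on Ubuntu, dnf/yum on RHEL) and is idempotent, so rerunning it is a no-op when the package already exists. A minimal sketch; the host group and package name are placeholders:

```yaml
- hosts: all
  become: true
  tasks:
    - name: Ensure the package is installed (package manager chosen per OS)
      ansible.builtin.package:
        name: htop
        state: present
```

The playbook can then be run periodically via cron, a CI schedule, or AWX.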
Which Continuous Integration solution do you use or prefer, and why?
What is “infrastructure as code”? What implementations of IaC are you familiar with?