As a developer, it's easy to get caught up in the everyday work of development without stepping back to look at the big picture of why and how the business works. In reality, everything I do as a developer impacts the business: the quality of my code, the time I spend, and the visible work I can show. Continuous Integration (CI) gives me, as a developer, tools that serve those needs. To understand how CI produces quality code, improves efficiency, and demonstrates your prowess as a development company, you first need to understand what CI actually is, along with a brief overview of how developers use these tools to make day-to-day business run smoothly.
There are several common misconceptions about what Continuous Integration actually is. When you hear about it on your favorite blog or in passing, you hear about Automated Testing, Test Driven Development, or Automation. These can all be parts of your CI process, and each may benefit your business, but none of them is CI itself.
Continuous Integration is exactly what the name says: the practice of continuously integrating the code your developers create, following a set of best practices guided by ten principles. These principles are:
- Revision Control
- Build Automation
- Automated Deployment
- Self-testing Builds
- Testing in a Clone of Production
- Frequent Commits
- Code Consolidation
- Fast Builds
- Build Availability
- Test Result Availability
Each of these principles complements the others, and some I would personally consider more important than others when you are looking to get your money's worth. Which principles you apply to a particular project should also depend on the project's size. With that in mind, I've ordered them roughly from most to least beneficial in the grand scheme of things. Note, however, that this is not a definitive ranking; you may well benefit more from something at the bottom of the list on one project and less on the next.
Revision Control has been the basic standard for development firms for the last couple of decades; it is about as close to a given as you can get. Revision control (also known as version control or source control) manages the changes made to the files that make up the code for your website. With version control, I can switch to the changes I made on December 19, 2012 at 2:00 pm with a single command, provided I committed a change at that time, and then switch back just as easily. Without revision control, the usual alternative is countless folders holding various versions of the code from different dates. One of the biggest things you lose with that strategy is the ability to have multiple people work on an item and then later merge that work together.
In short, Revision Control not only makes your code base manageable when making and moving between changes, it also makes it easy for your developers to integrate their code with one another. This is the root of Continuous Integration, so I believe it to be the most important and best investment (though most version control systems are free) that you could make on behalf of your developers and your business.
Revision Control allows you to do things such as Build Automation easily as well.
Build Automation refers to the ability to automatically trigger and/or build a clean version of your product from raw components via a single command or action. Some firms take this to the extreme and allow their systems to be built solely from the code in version control. Build Automation can take a 30-minute task and turn it into several seconds. Capturing the steps your developers perform to get a clean or new system up and running (whether a production, staging, or development build) will save you a lot of time down the road.
This isn't to say there is no risk. There are times when the build process must change, and the way your product is built will need to change with it. Identifying your product's build process also takes time up front, front-loading costs you would usually expect to be spread out. But in the end, the efficiency, and the ability to get new employees up and running quickly with an environment, is well worth it, particularly in bigger shops.
Build Automation allows you to automate the deployment of your product as well.
Automated Deployment is the process of pushing a product to various environments on a trigger and in a consistent manner. On top of the speed you gained from build automation, you know exactly what to expect every time you deploy to an environment, with much faster results.
What this means for you is that your client is offline for less time, you are mucking around in their system less, and you are touching less of their system. With an automated system, human error is less likely to be the cause of a blunder, and if it is, you most likely caught it before deploying to a live customer, saving you time and the big headache your client would surely have given you later.
You can even automate the contingency plan, rolling the site back to a previous working state as if nothing ever happened. This is extremely valuable. Additionally, the deployment process itself can be version controlled and improved, letting your developers learn from each other's mistakes. It can even be self-tested.
Testing can happen in many different ways, using methods you have probably heard thrown around as buzzwords. Some of these involve automated testing, including unit testing and interface testing. Self-testing Builds are the next step: once you have tests that can be automated, running them whenever a build happens is usually not much more work. This gives your team more awareness of what is going on with your product. Obviously, the more you test (and the more good tests you have), the greater visibility you have into the state of your product before release. This could be the difference between being a reactive company and a proactive one. Testing, however, is only as good as the environment you test in.
Minimizing the variables in your testing environment is simply good planning. Every developer has a story about being up until ungodly hours because a perfect build somehow didn't work in production the way it did in staging or development. Keeping your testing environment as close to production as possible minimizes this scenario, saving the hours spent discovering that some program is one version older than expected, or some similar mismatch. There are several configuration management options for building server machines that let you manipulate the configuration of several machines with a single versioned file, so anyone can see exactly what changes were made, when, and by whom, increasing accountability and efficiency. Testing frequently with a minimum of variances, though, only gives you frequent results if your developers frequently commit their changes.
Committing is the act of creating a new version in a version control system. It marks a point in the history of the code base that you can switch back to and use the changes made there. Often this is not just when a new release of the product is made; in fact, each release of your product may be made of thousands of commits, thousands of points in time you could return to. This is a good thing. The more modular the commits, the easier it is for developers to merge, move, add, and remove changes alongside other developers' work. It also gives them a point of comfort: if something goes horribly wrong, they can pick up where they were at their last commit. Making this a frequent occurrence ensures that at the end of the day, everyone's code can be updated with the latest changes, making code consolidation easy.
It is often best if, at the end of each day, all development features are brought back together and merged. Not everyone follows this methodology; some prefer to merge only fully complete features. Either way, Code Consolidation does a few things for you.
First, developers spend less time merging their changes with one another. Although merging is usually automatic, one developer's changes may conflict with another's when both end up changing the exact same line of code. Catching these conflicts early, while the changes are still fresh in their minds, increases efficiency.
Second, if you have Automated Builds of your product on a test server and those builds are Self-testing Builds, you have a view of how well your product is doing at the end of each day with all the latest changes. The important part, of course, is that as code changes, tests need to change, and this will generally identify that problem as well.
Lastly, it gives your other developers visibility into the code another developer is writing. This provides a check-and-balance routine, particularly if you have set up a code review process for every merge into the main development branch. This will catch problems earlier and most likely improve the quality of your code, ultimately leading to a better product.
As with everything, time is money. You don't want to pay your developers to sit around watching YouTube every time they have to build a new environment (which, some days, happens often). Additionally, depending on the trigger set up for an automated test build, a slow build can take so long that the trigger fires again before the environment is completely built. That is bad if it happens on a daily basis. There are obviously times when this will happen anyway, such as when your trigger is "on every merge to the development branch" and a major deadline is near. With fast builds, developers get on their feet faster, you get your test results faster, and people can see your product faster.
It is important to have a current development and/or staging build available at all times. Regardless of the automated testing you do, there are some things a computer wasn't taught to see, or that make very little sense to automate. In these cases manual testing is required.
Additionally, if you provide a public (or even pseudo-public) version of your build to your clients, they may be more likely to buy in to what you are building. Everyone loves a sneak peek at the great new features that are coming along. Availability is key.
So who knows which tests passed and which failed? If it's just the developers, you may want to reconsider. Visibility into the results is a great way to maintain a checks-and-balances system that keeps your developers honest. There are many products that show test results in a nice graph of passes and failures, and they usually let you drill down into the results in as much detail as you want. A reconciliation of each test failure can be performed daily so that everyone knows exactly why it failed and when it will be fixed.
Like Build Availability, allowing your clients to see test results can be reassuring. Note that you should most likely only show them results for the builds that you show them. For instance, if you have a staging environment that is only built at every stable development cycle, show them the tests related to that build. This gives your clients confidence in your ability and your devotion to quality, unless, of course, you let them see red test results all day long without a proper reconciliation.
Below are a few suggestions for tools that developers can use to help you accomplish the principles above. It's very possible you are already using some of these, or similar programs. I will list a couple of alternatives for each to give you an idea of what is out there; these, however, are what my team uses to accomplish the task.
Git is probably the most popular version control system among Linux-based developers today. It has several features that give it a leg up on some of its competitors. It is known as a distributed version control system, which means that even when you aren't connected to the network, you can still make commits as needed. Pushing and pulling changes between Git repositories creates an intricate web of repositories, but once the slightly steeper learning curve of the workflow is climbed, it makes for a very powerful and efficient version control system.
A popular workflow used with Git is known as git-flow: the process of moving changes between several branches of development as a product moves from one version to another. It is best described with a picture:
[Diagram of the git-flow branching model, courtesy of Joefleming.net]
In this case, master is your pristine, releasable copy of your product. Using this flow along with a consistent code review greatly increases your code's quality.
Other version control systems are:
- Subversion (SVN)
- Mercurial
- CVS
Vagrant is a way for developers to take an environment and distribute it among other developers; yes, it allows an entire virtual machine to be distributed. This includes the setup, from software to resources (memory, hard drive space, video memory, etc.). It comes at the cost of having your developers set up a buildable machine, but after that cost is sunk, the hours developers spend setting up local environments (which they can utterly destroy and then rebuild) are greatly reduced. Building an entire machine after the initial configuration stage (which is version controllable) is literally one command, which makes it well suited to sites shared among several team members.
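The machine definition lives in a single versioned file called a Vagrantfile. The fragment below is a hypothetical sketch; the box name, IP address, and memory size are examples, not recommendations.

```ruby
# Hypothetical Vagrantfile -- one shared definition of the team's dev machine.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"                 # base image every developer starts from
  config.vm.network "private_network", ip: "192.168.33.10"
  config.vm.provider "virtualbox" do |vb|
    vb.memory = 1024                                # resources are part of the definition
  end
  config.vm.provision "shell", inline: "apt-get update"
end
```

With this file in version control, `vagrant up` is the one command that builds (or rebuilds) the whole machine.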
Chef works with Vagrant to configure a machine. It is a series of scripts that handles the software and configuration layer of building and maintaining a server. For instance, if you need version 5.4 of PHP, you can state that in a single line of a configuration file, and every server set up to receive that instruction will download and install PHP 5.4 if it is able. Having this version-controlled, distributed system increases accountability and efficiency, and adds a layer of automation that frees your system administrator to go figure out why the mail server isn't working again.
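The "single line" claim looks like this in practice. The recipe fragment below is illustrative; package and service names vary by platform, and a real cookbook would pin these choices per environment.

```ruby
# Hypothetical Chef recipe fragment -- declaring what the server should have,
# not the steps to install it.
package "php" do
  version "5.4"          # the single line that pins the PHP version
  action :install
end

service "apache2" do
  action [:enable, :start]   # keep the web server running after configuration
end
```

Every server that converges against this recipe ends up in the same state, and the file's history shows exactly who changed what, and when.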
An alternative to this is Puppet.
PHPUnit (http://phpunit.de/manual/current/en/index.html) is a unit testing framework, though it is technically used for more than just unit testing. It can also drive automated interface testing through tools such as Selenium and PhantomJS, and behavior-driven development platforms such as Behat. Note that there are non-PHP-specific alternatives to all of the technologies mentioned here, including PHPUnit; however, performing continuous integration in a PHP/Drupal shop almost requires these.
Other testing frameworks include:
- SimpleTest
- JUnit
- Ruby's Test::Unit library
The meat and potatoes that makes Continuous Integration possible is Jenkins. Jenkins is server software that lets you perform tasks on hooks and on a per-job basis. This means that at the click of a button, on a code commit, or at a specific time, you can have an entire machine set up with the right settings, your product's code pushed to that machine and tested, and the results sitting available for you. The product is rightfully labeled "An extendable open source continuous integration server." You could accomplish all of this with a variety of scripts, but using a market-standard tool like Jenkins makes life easier when things go wrong. The variety of plugins alone should be enough to sway even the newest of users.
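Newer Jenkins installs can describe a job as code using its declarative pipeline syntax. The sketch below is hypothetical: the script names are placeholders, and the polling schedule is just an example of a commit-driven trigger.

```groovy
// Hypothetical Jenkins pipeline -- one job that builds, tests, and publishes
// results whenever new commits appear.
pipeline {
  agent any
  triggers { pollSCM('H/5 * * * *') }     // check version control every few minutes
  stages {
    stage('Build') { steps { sh './build.sh' } }       // the automated build step
    stage('Test')  { steps { sh './run-tests.sh' } }   // the self-testing step
  }
  post {
    always { junit 'results/*.xml' }      // make test results available to everyone
  }
}
```

Each stage maps directly onto a principle from the list above: automated builds, self-testing builds, and test result availability.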
Other alternatives are:
- Travis CI
- TeamCity
- CruiseControl
When you are getting your business started on Continuous Integration, remember you are doing 3 things. You are:
- Improving the quality of your code, and thus most likely the quality of your product.
- Improving the efficiency of your developers, getting more for your money and narrowing the cost of those large projects.
- Showing your prowess in your market, letting everyone know that you follow the practices that make for quality work!
Remember, it is OK not to implement the whole process; the needs of the business or project may not call for some of it, or resources may be low. Leaving too much out, though, will make the process lose some of its meaning, and that will reflect on the way you do business.