Leveraging static code analysis in a Ruby CI pipeline
Setting up a GitHub workflow CI pipeline powered by Rubocop.
- By Dhruv
Continuous integration, or CI, refers to the culture and the technologies that enable continuously merging features and bug fixes into the main branch of the codebase. Code changes are incorporated immediately after testing, rather than being bunched with other updates in a waterfall release process.
Similarly, continuous delivery, or CD, refers to automatically deploying the changed code to the target environment, such as pre-production branches to staging and master to production. CD picks up where CI left off, and as such, they often go hand in hand.
Static code analysis typically falls under the CI aspect of a CI/CD pipeline. Taking the example of a small Ruby project, we'll be setting up a CI workflow to analyze code quality using static analysis in the following areas:
- Consistency, with the widely adopted Ruby style guide.
- Layout, such as unnecessary spacing or misaligned indentation.
- Linting, such as inadequate permissions or redundant operations.
- Security analysis, such as the use of unsafe methods.
Prerequisites
Creating a sandbox
Let's make a fresh new directory for our adventure today. Initialize a Git repository in the folder and check it into GitHub as we'll be using GitHub Workflows as our CI tool (more on this later).
$ mkdir proj
$ cd proj/
$ git init
$ touch .gitignore
Setting up Ruby and Bundler
You probably already have Ruby installed on your computer. But I find it best not to use pre-installed Ruby for a couple of reasons. Reason the first, it's generally much older than the latest stable version, and you don't want to miss out on Ruby's newest features, do you? Reason the second, it's relatively easy to break your system by installing, removing, or updating a critical package.
Don't fret; there's a solution — RVM. I won't get into details of RVM here, but using it, you can install and manage several versions of Ruby on a system, keeping the system Ruby pristine.
$ rvm install 3.0.0
$ rvm use 3.0.0
Next, we set up Bundler, a fantastic package manager for Ruby. We need it to keep track of our project's dependencies. It's extremely straightforward to install.
$ gem install bundler
$ bundle init
To ensure that our project gems remain localized to our project, we can set up Bundler to install gems at a given path. Create a directory .bundle/ and a config file within the directory with the following content.
---
BUNDLE_PATH: '.gems'
With this configuration, Bundler will install all gems inside a .gems/ folder inside the current project folder proj/. Add both directories, .bundle/ and .gems/, to your .gitignore file so that they are not checked into VCS.
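If you would rather script these two steps, here is a minimal Ruby sketch of the same setup. The helper name configure_local_gems and its project_root parameter are made up for this example; they are not part of Bundler.

```ruby
require 'fileutils'
require 'yaml'

# Hypothetical helper: writes .bundle/config and makes sure the two
# Bundler directories are listed in .gitignore.
def configure_local_gems(project_root = Dir.pwd)
  bundle_dir = File.join(project_root, '.bundle')
  FileUtils.mkdir_p(bundle_dir)

  # Same setting as the hand-written config file above.
  File.write(File.join(bundle_dir, 'config'), { 'BUNDLE_PATH' => '.gems' }.to_yaml)

  # Append only the entries that are not in .gitignore yet,
  # so running the helper twice does not duplicate lines.
  gitignore = File.join(project_root, '.gitignore')
  existing = File.exist?(gitignore) ? File.readlines(gitignore, chomp: true) : []
  missing = ['.bundle/', '.gems/'] - existing
  File.open(gitignore, 'a') { |f| missing.each { |entry| f.puts(entry) } } unless missing.empty?
end
```

Calling configure_local_gems from inside proj/ should leave you with the same files as the manual steps.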
Getting familiar with Rubocop
For analyzing the code quality in all the areas we mentioned above, we will be using Rubocop, one of the finest linters available for Ruby. Rubocop comes with an extensive collection of rules, called 'cops', organized in groups, called 'departments', based on their functionality.
To install, add the line marked with + to your Gemfile and run bundle install.
# frozen_string_literal: true
source 'https://rubygems.org'
git_source(:github) { |repo_name| "https://github.com/#{repo_name}" }
+ gem 'rubocop', '~> 1.9', require: false
To list all the offenses in any given file or directory, just pass the names as arguments to rubocop.
$ rubocop <file/dir_name>
Rubocop is also capable of autocorrecting most of the errors it reports, which is incredibly helpful. To enable autocorrection, pass the -a flag. Passing -A uses a more aggressive autocorrect mode that also applies unsafe corrections, which is not advisable unless you are sure of what you are doing.
$ rubocop -a <file/dir_name> # safe autocorrect, recommended
$ rubocop -A <file/dir_name> # unsafe autocorrect, not recommended
You will need to prepend bundle exec to these commands if Rubocop is not globally installed.
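To keep the flag combinations straight, here is a small sketch that assembles those command lines in Ruby. The names rubocop_command, AUTOCORRECT_FLAGS, and the autocorrect: and bundled: keywords are invented for this example; only the rubocop, -a, -A, and bundle exec parts come from the commands above.

```ruby
# Maps a (hypothetical) mode name to the Rubocop flag described above.
AUTOCORRECT_FLAGS = { nil => [], safe: ['-a'], unsafe: ['-A'] }.freeze

# Builds the command as an argument array, ready to hand to `system`.
def rubocop_command(paths, autocorrect: nil, bundled: true)
  flags = AUTOCORRECT_FLAGS.fetch(autocorrect) do
    raise ArgumentError, "unknown autocorrect mode: #{autocorrect.inspect}"
  end
  prefix = bundled ? %w[bundle exec] : []
  prefix + ['rubocop'] + flags + Array(paths)
end

# Example (commented out so the sketch has no side effects):
# `system` returns true only when the command exits with status 0.
# clean = system(*rubocop_command('cat.rb', autocorrect: :safe))
```

Passing the command as an array rather than a single string avoids going through a shell, so file names with spaces need no quoting.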
The code
Now we get to the fun part, scripting in Ruby. Take this script, for example. It takes a file name as an argument and prints said file's content to STDOUT, very similar to the cat command (hence the name).
# cat.rb

filename = ARGV[0]
file = open(filename)
list = file.read
file.close

list.each_line.with_index{|line|
  puts line
}
It is formatted poorly, violates many rules from the style guide, and even has a gaping security flaw. We'll fix all of that soon, but first, let's do a preliminary scan with Rubocop and observe the output.
$ bundle exec rubocop cat.rb
Inspecting 1 file
W
Offenses:
cat.rb:1:1: C: [Correctable] Style/FrozenStringLiteralComment: Missing frozen string literal comment.
# cat.rb
^
cat.rb:4:8: C: Security/Open: The use of Kernel#open is a serious security risk.
file = open(filename)
^^^^
cat.rb:8:16: W: [Correctable] Lint/RedundantWithIndex: Remove redundant with_index.
list.each_line.with_index{|line|
^^^^^^^^^^
cat.rb:8:26: C: [Correctable] Layout/SpaceBeforeBlockBraces: Space missing to the left of {.
list.each_line.with_index{|line|
^
cat.rb:8:26: C: [Correctable] Layout/SpaceInsideBlockBraces: Space between { and | missing.
list.each_line.with_index{|line|
^^
cat.rb:8:26: C: [Correctable] Style/BlockDelimiters: Avoid using {...} for multi-line blocks.
list.each_line.with_index{|line|
^
cat.rb:10:2: C: [Correctable] Layout/TrailingEmptyLines: Final newline missing.
}
1 file inspected, 7 offenses detected, 6 offenses auto-correctable
Rubocop found 7 offenses, 6 of which it can correct automatically; these are labeled [Correctable] in the output above.
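Each offense name has the form Department/Cop, which makes it easy to see how the findings spread across the areas we listed at the start. As a quick illustration, this sketch tallies the seven cop names from the scan by department. The constant and method names are made up for the example; the cop names are copied from the output above.

```ruby
# Cop names copied from the scan output above.
COP_NAMES = %w[
  Style/FrozenStringLiteralComment
  Security/Open
  Lint/RedundantWithIndex
  Layout/SpaceBeforeBlockBraces
  Layout/SpaceInsideBlockBraces
  Style/BlockDelimiters
  Layout/TrailingEmptyLines
].freeze

# The department is the part of the cop name before the last slash.
def offenses_by_department(cop_names)
  cop_names.map { |name| name.rpartition('/').first }.tally
end

offenses_by_department(COP_NAMES).each do |department, count|
  puts "#{department}: #{count}"
end
```

For this scan, that comes to three Layout offenses, two Style offenses, and one each for Lint and Security.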
Pipeline
Picking the infrastructure
Choices abound when it comes to picking a CI/CD infrastructure provider. From Travis CI, a darling of open-source developers, to Jenkins, the tool of choice for enterprise teams who'd rather self-host their customized solution, dev-ops engineers are spoilt for choice.
But the simplest of these in my experience has been GitHub workflows, a GitHub-native solution allowing you to set up entire chains of jobs, described as YAML files, that can be initiated based on specific triggers. We can use them throughout the CI/CD pipeline, from running checks on PRs before merge to deploying the code after. There are hundreds of pre-built actions (many officially maintained) that take the effort out of setting up end-to-end pipelines.
Naturally, we'll be using GitHub workflows as our CI pipeline infrastructure. The end goal is to have linting as a check on our PRs and commits. Only PRs that pass the checks would be mergeable.
Adding lint workflow
Let's see what the workflow file would look like in our case. Create a new directory .github/, create another directory within this one named workflows/, and in this directory, create a file named lint.yml.
# .github/workflows/lint.yml
name: Lint
on:
  push:
    branches:
      - master
jobs:
  lint:
    name: Lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: '3.0' # quoted, otherwise YAML reads 3.0 as the number 3
          bundler-cache: true # runs `bundle install`
      - run: bundle exec rubocop .
The lint workflow consists of a single job. The job is fired on every push event to the master branch and performs three steps:
- actions/checkout: checks out the code repository
- ruby/setup-ruby:
  - sets up Ruby version 3.0 and the latest compatible version of Bundler
  - uses Bundler to install all packages in the Gemfile
- run: runs Rubocop on the entire current working directory
Once the job completes, we get our outcome. In our case, it's a big red cross. The workflow execution fails because Rubocop found issues in the code. The logs show the same message we saw earlier, listing the offenses identified by Rubocop.
😢 We'll get there, eventually.
Fixing problems
We want our check to pass. Nobody likes failing checks. Coming back to our local setup, let's use Rubocop's autofix feature to quickly resolve these issues. First, we should let Rubocop take care of the automatic stuff using the -a flag.
$ bundle exec rubocop -a cat.rb
Inspecting 1 file
W
Offenses:
cat.rb:1:1: C: [Correctable] Style/FrozenStringLiteralComment: Missing frozen string literal comment.
# cat.rb
^
cat.rb:4:8: C: Security/Open: The use of Kernel#open is a serious security risk.
file = open(filename)
^^^^
cat.rb:8:15: C: [Corrected] Layout/ExtraSpacing: Unnecessary spacing detected.
list.each_line do |line|
^
cat.rb:8:16: W: [Corrected] Lint/RedundantWithIndex: Remove redundant with_index.
list.each_line.with_index{|line|
^^^^^^^^^^
cat.rb:8:26: C: [Corrected] Layout/SpaceBeforeBlockBraces: Space missing to the left of {.
list.each_line.with_index{|line|
^
cat.rb:8:26: C: [Corrected] Layout/SpaceInsideBlockBraces: Space between { and | missing.
list.each_line.with_index{|line|
^^
cat.rb:8:26: C: [Corrected] Style/BlockDelimiters: Avoid using {...} for multi-line blocks.
list.each_line.with_index{|line|
^
cat.rb:10:2: C: [Corrected] Layout/TrailingEmptyLines: Final newline missing.
}
1 file inspected, 8 offenses detected, 6 offenses corrected, 1 more offense can be corrected with `rubocop -A`
Rubocop solved most of the issues reported, including one issue introduced during the autofix process itself! With 5 of the 7 original problems already fixed, we've managed to shave off ~70% of our work with zero effort.
What's left is one unsafe autofix and one security vulnerability that Rubocop cannot fix automatically. We can take care of those:
- The frozen string literal comment is missing. That's a reasonable thing to add to the file, so we'll let Rubocop add it using the stronger autofix flag -A.
$ bundle exec rubocop -A cat.rb
Inspecting 1 file
C
Offenses:
cat.rb:1:1: C: [Corrected] Style/FrozenStringLiteralComment: Missing frozen string literal comment.
# cat.rb
^
cat.rb:2:1: C: [Corrected] Layout/EmptyLineAfterMagicComment: Add an empty line after magic comments.
# cat.rb
^
cat.rb:6:8: C: Security/Open: The use of Kernel#open is a serious security risk.
file = open(filename)
^^^^
1 file inspected, 3 offenses detected, 2 offenses corrected
- Kernel::open is a prominent security risk, especially when passing tainted input to the function. We've talked about this (and other security pitfalls) before. Replacing it with File.open should do the trick.
- file = open(filename)
+ file = File.open(filename)
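To see why the swap matters: in the Ruby 3.0 we installed, Kernel#open treats a string that starts with a pipe character as a command to run, whereas File.open only ever treats its argument as a path. The sketch below exercises just the safe side. The helper name safe_read and the sample input are made up for this example.

```ruby
# Hypothetical helper: reads a file's content via File.open.
# The block form closes the file automatically.
def safe_read(filename)
  File.open(filename, &:read)
end

# Input an attacker might pass on the command line (made-up example).
# File.open looks for a file with this literal name instead of running
# anything, so it fails with "No such file or directory".
begin
  safe_read('| echo pwned')
rescue Errno::ENOENT => e
  puts "Treated as a plain path: #{e.message}"
end
```

The worst a hostile file name can do here is raise an error, which is exactly the behavior we want from a cat clone.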
Commit and push. We're green now! At this point, you should pat yourself on the back for a job well done.
🥳 Yay!
Staying clean
Now that we're at peak code quality, we need to make sure it stays that way, which means no PR should degrade the codebase. To run the check on every incoming PR, add the pull_request event to our lint workflow.
  on:
    push:
      branches:
        - master
+   pull_request:
+     branches:
+       - master
Now, to test that our check is working as expected, we need to make a PR with some code that Rubocop would flag. Let's refactor the lines in our script that read the file so that they use a block.
  # cat.rb
  # frozen_string_literal: true
  filename = ARGV[0]
- file = File.open(filename)
- list = file.read
- file.close
+ File.open(filename) do |file|
+   list = file.read
+ end
  list.each_line do |line|
    puts line
  end
Check out a new branch from master. Commit and push to this branch and open a PR. You'll see that the checks fail, and thus, provided branch protection requires this check to pass, the PR cannot be merged unless overridden by an administrator.
🚧 Our check is working just fine!
🧠 Brain-teaser: Can you identify why the updated code, using a block, is being flagged by Rubocop?
🤷‍♂️ Hint: If you want a hint, here's the Rubocop output for the PR:
$ bundle exec rubocop cat.rb
Inspecting 1 file
W
Offenses:
cat.rb:7:3: W: Lint/UselessAssignment: Useless assignment to variable - list.
list = file.read
^^^^
1 file inspected, 1 offense detected
🧑‍💻 Answer: Defining list inside the block means that it is not accessible outside the block. This makes the assignment useless, and the later use of list outside the block fails with a NameError.
💡 Lesson: Though indirectly, code analysis can sometimes also help identify potential bugs!
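One way to keep the block and still have list available afterwards is to assign the block's return value instead of assigning inside the block. This is only a sketch of that idea, wrapped in a method so it can be tried in isolation. The method name cat and its out: keyword are made up for the example.

```ruby
# Hypothetical rewrite of the script as a method.
# File.open returns whatever the block returns, so the assignment to
# `list` happens outside the block and the variable stays in scope.
def cat(filename, out: $stdout)
  list = File.open(filename, &:read)
  list.each_line { |line| out.puts line }
end

# Usage, mirroring the original script:
# cat(ARGV[0])
```

For a script this small, File.read(filename) would do the same job in a single call; the block form pays off once you need to do more with the open file.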
DeepSource
While we invested considerable time and effort, we now have checks and actions set up to monitor code quality in our repo. But what if we didn't have time to spare or didn't want to spend the effort? We're busy developers, after all!
Consider using DeepSource. It continuously scans the code on every commit and on every pull request through various static code analyzers (including linters and security analyzers), and can automatically fix some of the issues they find. DeepSource also has its own custom-built analyzers for most languages, which are constantly improved and kept up-to-date.
It's incredibly easy to set up! You need only add a .deepsource.toml file in your repository root, and DeepSource will pick it up. It takes much less effort, and the end result is way more polished than setting up several workflows in GitHub.
version = 1
[[analyzers]]
name = "ruby"
enabled = true
[[transformers]]
name = "rubocop"
enabled = true
Automate the tedium away
CI/CD pipelines are essential to the agile development workflow. The ability to add features, squash bugs, and get the changes into production quickly can make a very significant difference. Startups live or die based on how often they iterate.
Integrating static analysis into the CI pipeline ensures that only the cleanest and most compliant code makes its way into production. For something that takes very little time to set up, consumes a minuscule amount of resources, and does not significantly affect test/build timings, static analysis can add a lot of confidence to your build process.
Confident iterations await. Till next time!