Percent notation in Ruby

How Ruby's percent notation helps you write more succinct code.

  • By Faraaz
  • Ruby
  • Engineering
Last updated on Dec 3, 2022

Ruby is a language of careful balance. Its creator, Yukihiro “Matz” Matsumoto, blended parts of his favorite languages (Perl, Smalltalk, Eiffel, Ada, and Lisp) to form a new object-oriented language and balanced it with functional and imperative programming. One of the main goals of Ruby is developer happiness, which is fulfilled by having a syntax that is easy to understand, an ecosystem that makes it easy to manage dependencies, and the freedom to do the same thing in different ways.

Percent literals are one such way of dealing with strings in Ruby. Sure, you can create strings directly using quotes and interpolation, but it's often more convenient to use percent literals.

Why percent literals?

This is better for your health - Yukihiro "Matz" Matsumoto

What are they?

Enough hyperbole; what are percent literals? They are a notation in Ruby that's inspired by Perl. They provide a simple syntax to generate interpolated strings, string arrays, symbol arrays, shell commands, and regular expressions.

Percent literals start with a % and are followed by a pair of delimiting characters that enclose the content. Any non-alphanumeric character can serve as the delimiter, but brackets are a good choice because the opening and closing characters are easy to tell apart. Here's an example:

%[ship good code]
%(ship good code)
%?ship good code?

They all evaluate to the string "ship good code".
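A quick way to convince yourself is to compare the literals directly. The snippet below is only a sketch; the constant name VARIANTS is made up for this illustration.

```ruby
# Three spellings of the same string, each with a different delimiter.
VARIANTS = [
  %[ship good code],
  %(ship good code),
  %?ship good code?
].freeze

# Collapsing duplicates leaves a single value if the contents are identical.
puts VARIANTS.uniq.inspect
puts VARIANTS.all? { |s| s == "ship good code" }
```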

It might be convenient, but why is it helpful? The most obvious example is when escaping characters within a string literal. For example, the following would fail with a syntax error since Ruby considers the string to be terminated in the middle.

puts "Its "LeviOsa", not "LeviosAR"!"

To get this working as intended, you'll need to escape the quotes like so:

puts "Its \"LeviOsa\", not \"LeviosAR\"!"

But escaping every quote is tedious, so you can use percent literals instead:

puts %{Its "LeviOsa", not "LeviosAR"!}

and it will produce the same result!

Decorators

In the spirit of giving developers the freedom to do things their way, Ruby lets you decorate percent literals with a mnemonic letter placed right after the %. For the string and array forms, the uppercase letter (%Q, %W, %I) allows interpolation, while the lowercase one (%q, %w, %i) does not. Ruby provides a healthy number of decorators to use with your percent literals; let's explore them one by one:
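To make the lowercase versus uppercase distinction concrete, here is a small sketch. The method name greetings and its name parameter are hypothetical and exist only for this example.

```ruby
# Returns the same text written with a lowercase and an uppercase decorator.
def greetings(name)
  {
    literal: %q(Hello, #{name}),      # lowercase: #{...} is kept as-is
    interpolated: %Q(Hello, #{name})  # uppercase: #{...} is evaluated
  }
end

result = greetings("Matz")
puts result[:literal]
puts result[:interpolated]
```

The first line printed still contains the raw #{name} text, while the second contains the value that was passed in.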

%w

Creates an array of strings. Typing out the quotes and commas for every element is tedious, and the %w form is easier to scan visually. The community style guide recommends %w over the literal array syntax when you need an array of words (non-empty strings without spaces or special characters in them). Apply this rule only to arrays with two or more elements.

# bad
STATES = ['draft', 'open', 'closed']

# good
STATES = %w[draft open closed]
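As a rough check that both spellings build the same array, and to show what happens when an element does need a space, consider the sketch below. The constant names WORDS and CITIES are invented for this example.

```ruby
# %w splits on whitespace, so each word becomes one element.
WORDS = %w[draft open closed].freeze
puts WORDS == ['draft', 'open', 'closed']

# A backslash keeps a space inside a single element. This is one reason the
# guideline limits %w to strings without spaces: the literal array reads better.
CITIES = %w[New\ York Boston].freeze
puts CITIES.inspect
```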

%i

It is similar to %w but produces an array of symbols instead. Prefer %i to the literal array syntax when you need an array of symbols (and you don't need to maintain Ruby 1.9 compatibility). Apply this rule only to arrays with two or more elements.

# bad
STATES = [:draft, :open, :closed]

# good
STATES = %i[draft open closed]
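A typical place where a symbol array shows up is a membership check. The following is a minimal sketch; VALID_STATES and valid_state? are hypothetical names, not part of any library.

```ruby
# The %i literal builds the same array as [:draft, :open, :closed].
VALID_STATES = %i[draft open closed].freeze

# Accepts either a String or a Symbol and checks it against the list.
def valid_state?(state)
  VALID_STATES.include?(state.to_sym)
end

puts valid_state?("open")
puts valid_state?(:archived)
```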

%q

Creates a single, non-interpolated string that may contain both single and double quotes without escaping. This differs from %w and %i because it produces a single string instead of an array. Why do you need this? Why can't you just create a string literal normally and escape the quotes in it manually? Well, you can, but it's more convenient this way. Which of the following feels more convenient to write?

"<p class='quote'>\"What did you say?\"</p>"

or

%q(<p class='quote'>"What did you say?"</p>)

Avoid %q unless you have a string with both ' and " in it. Regular string literals are more readable and should be preferred unless many characters would have to be escaped in them.

# bad
name = %q(Bruce Wayne)
time = %q(8 o'clock)
question = %q("What did you say?")

# good
name = 'Bruce Wayne'
time = "8 o'clock"
question = '"What did you say?"'
quote = %q(<p class='quote'>"What did you say?"</p>)

%Q shorthand

Use %() (a shorthand for %Q) for single-line strings that require both interpolation and embedded double-quotes. For multi-line strings, prefer heredocs.

# bad (no interpolation needed)
%(<div class="text">Some text</div>)
# should be '<div class="text">Some text</div>'

# bad (no double-quotes)
%(This is #{quality} style)
# should be "This is #{quality} style"

# bad (multiple lines)
%(<div>\n<span class="big">#{exclamation}</span>\n</div>)
# should be a heredoc.

# good (requires interpolation, has quotes, single line)
%(<tr><td class="name">#{name}</td>)

%r

It is a shorthand for regular expression literals. Unlike the /.../ syntax, it lets you have a / within your regular expression without having to escape it. For example:

%r(/usr/local/bin)
# => /\/usr\/local\/bin/
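To see that the two spellings behave the same, you could match a couple of paths against both. This is only a sketch; the constant names and sample paths are made up.

```ruby
BIN_PATH = %r(/usr/local/bin)            # no escaping needed
ESCAPED_BIN_PATH = /\/usr\/local\/bin/   # every / escaped by hand

# Both patterns should agree on every input.
["/usr/local/bin/ruby", "/opt/bin/ruby"].each do |path|
  puts "#{path}: #{BIN_PATH.match?(path)} / #{ESCAPED_BIN_PATH.match?(path)}"
end
```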

As recommended by the Style Guide, use %r only for regular expressions matching at least one / character:

# bad
%r{\s+}

# good
%r{^/(.*)$}
%r{^/blog/2011/(.*)$}

%x

This is another way to execute a shell command from within Ruby. Avoid %x unless the command itself contains backquotes (which is rather unlikely).

# bad
date = %x(date)

# good
date = `date`
echo = %x(echo `date`)
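Both forms return the command's standard output as a string. The sketch below assumes a POSIX-like shell where echo is available; shell_greeting is a hypothetical helper name.

```ruby
# Runs a harmless command and strips the trailing newline from its output.
# Assumption: an `echo` command exists on the host system.
def shell_greeting
  %x(echo hello).chomp
end

puts shell_greeting
puts `echo hello`.chomp == shell_greeting
```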

%s

Avoid the use of %s. It seems that the community has decided :"some string" is the preferred way to create a symbol with spaces.
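For comparison, both spellings produce the same symbol. The constant names in this sketch are invented for illustration.

```ruby
WITH_PERCENT = %s(some string)   # discouraged
WITH_QUOTES = :"some string"     # preferred

puts WITH_PERCENT == WITH_QUOTES
puts WITH_QUOTES.class
```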

Percent Literal Braces

You might have noticed that we're using different types of brackets for different use cases. Use the braces that are most appropriate for their respective intended use:

  • () for string literals (%q, %Q).
  • [] for array literals (%w, %i, %W, %I), since it aligns with the standard array literal syntax.
  • {} for regexp literals (%r), since parentheses often appear inside regular expressions. A less common character such as { is therefore usually the best delimiter for %r literals.
  • () for all other literals (e.g. %s, %x).

# bad
%q{"Test's king!", John said.}

# good
%q("Test's king!", John said.)

# bad
%w(one two three)
%i(one two three)

# good
%w[one two three]
%i[one two three]

# bad
%r((\w+)-(\d+))
%r{\w{1,2}\d{2,5}}

# good
%r{(\w+)-(\d+)}
%r|\w{1,2}\d{2,5}|

Caveats

Since Ruby provides many ways to do the same thing, the community had to develop a style that the majority generally agrees upon. However, you and your team may decide that a certain style just doesn't fit your codebase, which is okay; that's how the community moves forward. In such cases, use the style you prefer, as long as it isn't an anti-pattern and your team has agreed on it.

Delimiter conflict

You can technically use any non-alphanumeric character as a delimiter for percent literals. This can come back to bite you when the string needs to contain the character that is already being used as the delimiter, which will break your code. For example:

x = %Q| do this | or this | # this will break
x = %Q| do this \| or this | # Escape the character instead
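Besides escaping, two other ways around the conflict are to pick a delimiter that doesn't occur in the text, or to rely on the fact that balanced bracket pairs may nest inside a bracket-delimited literal. A minimal sketch, with made-up constant names:

```ruby
PIPE_ESCAPED = %Q| do this \| or this |   # escape the delimiter
BRACES = %Q{ do this | or this }          # choose a different delimiter
NESTED = %Q( do this (or this) )          # balanced parentheses can nest

puts PIPE_ESCAPED == BRACES
puts NESTED
```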

Space is also a delimiter

Consider the following Ruby code:

[10, 7, 4].map { |i| % i }
# => ["i", "i", "i"]

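Notice how it produces an array of ["i", "i", "i"], which might be different from the intended output. Ruby treats a % followed by a space as a percent literal whose delimiter is the space itself, so the block body is read as the string "i" rather than as a modulo operation. It's possible the person who wrote this code expected to get the result of applying the modulo operator across the array's elements, like so: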

10 % 7 % 4
=> 3 % 4
=> 3

which can be done instead with Enumerable#inject on the array:

[10, 7, 4].inject(&:%)
# => 3
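The fold can also be wrapped in a small helper, and a per-element modulo works as expected once the operator has an explicit left-hand operand. This is a sketch only; chained_modulo is a hypothetical name.

```ruby
# Applies % from left to right: ((a % b) % c) ...
def chained_modulo(numbers)
  numbers.inject(:%)
end

values = [10, 7, 4]
puts chained_modulo(values)                 # same as (10 % 7) % 4
puts values.map { |i| i % 4 }.inspect       # modulo applied to each element
```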

This is why it's important to remember these issues when using Ruby's percent literals.

Conclusion

Ruby focuses a lot on programmer happiness and makes the syntax feel natural since it provides a myriad of features for your convenience while writing code. These features are well thought out and make writing and understanding code easier. However, that freedom can bite you if you're not careful using these features.
