Ruby & (Ampersand) Parameter Demystified

Recently I was asked a question about ‘& parameters’ when you define and/or call methods which take a block e.g.:

def blah(&block)
  yadda(block)
end

def yadda(block)
  foo(&block)
end

def foo(&block)
  block.call
end

blah do
  puts "hello"
end

As you pass this parameter around, sometimes the ampersand appears in front of it, but other times it doesn’t, seemingly with no rhyme or reason. As we dig into crazy metaprogramming or write various libraries, it is easy to forget how confusing Ruby can be when you’re starting out. So, let’s dig into this a little more deeply and shed some light on what’s going on.

The Implicit Block

Methods in Ruby can take arguments in all sorts of interesting ways. One case that’s especially interesting is when a Ruby method takes a block.

In fact, all Ruby methods can implicitly take a block, without needing to specify this in the parameter list or having to use the block within the method body e.g.:

def hello
end

hello do
  puts "hello"
end

This will execute without any trouble but nothing will be printed out as we’re not executing the block that we’re passing in. We can – of course – easily execute the block by yielding to it:

def hello
  yield if block_given?
end

hello do
  puts "hello"
end

This time we get some output:

hello

We yielded to the block inside the method, but the fact that the method takes a block is still implicit.

It gets even more interesting, since Ruby allows us to pass any object to a method and have the method attempt to use this object as its block. If we put an ampersand in front of the last argument in a method call, Ruby will try to treat this argument as the method’s block. If the argument is already a Proc object, Ruby will simply associate it with the method as its block.

def hello
  yield if block_given?
end

blah = -> {puts "lambda"}

hello(&blah)

This prints:

lambda

If the argument is not a Proc, Ruby will try to convert it into one (by calling to_proc on it) before associating it with the method as its block.

def hello
  yield if block_given?
end

class FooBar
  def to_proc
    -> {puts 'converted lambda'}
  end
end

hello(&FooBar.new)

This prints:

converted lambda

All of this seems pretty clear, but what if I want to take a block that was associated with a method and pass it to another method? We need a way to refer to our block.

The Explicit Block

When we write our method definition, we can explicitly state that we expect this method to possibly take a block. Confusingly, Ruby uses the ampersand for this as well:

def hello(&block)
  yield if block_given?
end

hello do
  puts "hello"
end

Defining our method this way gives us a name by which we can refer to our block within the method body. And since our block is a Proc object, instead of yielding to it, we can call it:

def hello(&block)
  block.call if block_given?
end

hello do
  puts "hello"
end

I prefer block.call to yield because it makes things clearer. Of course, when we define our method we don’t have to use the name ‘block’; we can do:

def hello(&foo)
  foo.call if block_given?
end

hello do
  puts "hello"
end

Having said that, ‘block’ is a good convention.

So, in the context of methods and blocks, there are two ways we use the ampersand:

  • in the context of a method definition, putting an ampersand in front of the last parameter indicates that a method may take a block and gives us a name to refer to this block within the method body
  • in the context of a method call, putting an ampersand in front of the last argument tells Ruby to convert this argument to a Proc if necessary and then use the object as the method’s block
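
Both uses can be seen side by side in a minimal sketch. The method names describe_block and forward are made up for illustration; the only behaviour relied on is that a captured block is a Proc (or nil when no block was given), and that passing &nil passes no block at all.

```ruby
# Hypothetical helper: the & in the definition captures the block as a Proc,
# or leaves the parameter nil when the caller gave no block.
def describe_block(&block)
  block.nil? ? "no block" : block.class.name
end

# Hypothetical helper: the & in the call turns the Proc back into a block.
def forward(&block)
  describe_block(&block)
end

with_block    = describe_block { :ignored }
without_block = describe_block
forwarded     = forward { :ignored }
forwarded_nil = forward # &nil means "no block" for describe_block

puts with_block, without_block, forwarded, forwarded_nil
```

The last call is the reason the earlier examples guard with block_given?: calling block.call on nil would raise a NoMethodError.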

Passing Two Blocks To A Method

It is instructive to see what happens when you try to pass both a regular block and a block argument to a method:

def hello(&block)
  block.call if block_given?
end

blah = -> {puts "lambda"}

hello(&blah) do
  puts "hello"
end

You get the following error message:

code.rb:56: both block arg and actual block given

It is not even an exception – it’s a syntax error!

Using Another Method As A Block

It’s also interesting to note that since you can easily get a reference to a method in Ruby, and the Method object implements to_proc, you can easily give one method as a block to another e.g.:

def hello(&block)
  block.call if block_given?
end

def world
  puts "world"
end

method_reference = method(:world)

hello(&method_reference)

This prints:

world

Passing The Block Around

We now know enough to easily understand our first example:

def blah(&block)
  yadda(block)
end

def yadda(block)
  foo(&block)
end

def foo(&block)
  block.call
end

blah do
  puts "hello"
end

  • we define blah to expect a block; inside blah we can refer to this block as block
  • we define yadda to expect one parameter; this parameter is referred to as block inside yadda, but it is not a block in the sense that we could not yield to it inside yadda
  • foo also expects a block, and we can refer to this block as block inside foo
  • when we call yadda from within blah we pass it our block variable without the ampersand, since yadda does not expect a block parameter but simply a regular method argument; in our case this regular method argument just happens to be a Proc object
  • when we call foo from inside yadda we pass it our block variable, but this time with an ampersand, since foo actually expects a block and we want to give it a block rather than just a regular argument

It should now be much more obvious why the ampersand is used in some cases, but not in others.

The Symbol To Proc Trick

We should now also have a lot less trouble understanding the ‘symbol to proc’ trick. You’ve no doubt seen code like this:

p ["a", "b"].map(&:upcase)

We know that this is equivalent to:

p ["a", "b"].map{|string| string.upcase}

But now we can also make an educated guess as to why they are equivalent. We have a Symbol object (’:upcase’), we put an ampersand in front of it and pass it to the map method. The map method takes a block, and by using the ampersand we’ve told Ruby that we want to convert our Symbol object to a Proc and associate it with the map method as its block. It turns out that Symbol implements to_proc in a special way, so that the resulting block becomes functionally equivalent to our second example above. Of course these days Ruby implements Symbol#to_proc using C, so it’s not quite as nice looking as the examples you’ll find around the web, but you get the general idea.
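
To make the idea concrete, here is a rough plain-Ruby sketch of what such a conversion could look like. The helper name symbol_to_proc is hypothetical and this is not Ruby’s actual implementation; it only mimics the behaviour described above by sending the symbol as a method name to whatever object the block receives.

```ruby
# Hypothetical stand-in for Symbol#to_proc, written in plain Ruby.
def symbol_to_proc(symbol)
  # The first block argument becomes the receiver, any others become arguments.
  ->(receiver, *args) { receiver.public_send(symbol, *args) }
end

upcase  = symbol_to_proc(:upcase)
shouted = ["a", "b"].map(&upcase)   # our hand-rolled version
builtin = ["a", "b"].map(&:upcase)  # the real Symbol#to_proc

p shouted
p shouted == builtin
```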

Anyway, hopefully this makes blocks and ampersands a bit more friendly. It’s definitely worth occasionally revisiting the basics, and I’ll try to do it more often in the future.

Image by Martin Whitmore

Ruby – Why U No Have Nested Exceptions?

One of the things we almost always do these days when we write our libraries and apps, is use other libraries. Inevitably something will go wrong with those libraries and exceptions will be produced. Sometimes these are expected (e.g. an HTTP client that produces an exception when you encounter a 500 response or a connection timeout), sometimes they are unexpected. Either way you don’t want to allow the exceptions from these external libraries to bubble up through your code and potentially crash your application or cause other weirdness. Especially considering that many of these exceptions will be custom types from the libraries you’re using. No-one wants strange exceptions percolating through their code.

What you want to do, is ensure that all interactions with these external libraries are wrapped in a begin..rescue..end. You catch all external errors and can now decide how to handle them. You can throw your hands up in the air and just re-raise the same error:

begin
  SomeExternalLibrary.do_stuff
rescue => e
  raise
end

This doesn’t really win us anything. Better yet, you could raise one of your own custom error types:

begin
  SomeExternalLibrary.do_stuff
rescue => e
  raise MyNamespace::MyError.new
end

This way you know that once you’re past your interfaces with the external libraries you can only encounter exception types that you know about.

The Need For Nested Exceptions

The problem is that by raising a custom error, we lose all the information that was contained in the original error that we rescued. This information would have potentially been of great value to help us diagnose/debug the problem (that caused the error in the first place), but it is lost with no way to get it back. In this regard it would have been better to re-raise the original error. What we want is to have the best of both worlds, raise a custom exception type, but retain the information from the original exception.

When writing escort one of the things I wanted was informative errors and stack traces. I wanted to raise errors and add information (by rescuing and re-raising) as they percolated through the code, to be handled in one place. What I needed was the ability to nest exceptions within other exceptions.

Out of the box, Ruby (at least before version 2.1) doesn’t allow us to nest exceptions. However, I remembered Avdi Grimm mentioning the nestegg gem in his excellent Exceptional Ruby book, so I decided to give it a try.
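
For comparison, Ruby 2.1 and later record the exception being handled as Exception#cause whenever a new one is raised inside a rescue. The sketch below only shows that built-in behaviour; WrappedError and parse_number are hypothetical names, and nothing here is taken from nestegg or nesty.

```ruby
# Hypothetical custom error type for our own namespace.
class WrappedError < StandardError; end

# Hypothetical method that hides a library-level error behind our own type.
def parse_number(text)
  Integer(text)
rescue ArgumentError
  # Raising inside rescue makes Ruby (2.1+) attach the ArgumentError as #cause.
  raise WrappedError, "could not parse input"
end

begin
  parse_number("not a number")
rescue WrappedError => e
  caught = e
end

puts caught.message
puts caught.cause.class
```

The original error is still reachable through cause, although the default stack trace formatting differs from what the gems discussed here produce.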

The Problems With Nestegg

Unfortunately nestegg is a bit old and a little buggy:

  • It would sometimes lose the error messages
  • Nesting more than one level deep would cause repetition in the stacktrace

I also didn’t like how it made the stack trace look non-standard when including the information from the nested errors. If we take some code similar to the following:

require 'nestegg'

class MyError < StandardError
  include Nestegg::NestingException
end

begin
  1/0
rescue => e
  begin
    raise MyError.new("Number errors will be caught", e)
  rescue => e
    begin
      raise MyError.new("Don't need to let MyError bubble up")
    rescue => e
      raise MyError.new("Last one for sure!")
    end
  end
end

It would produce a stack trace like this:

examples/test1.rb:26:in `rescue in rescue in rescue in <main>': MyError (MyError)
	from examples/test1.rb:23:in `rescue in rescue in <main>'
	from examples/test1.rb:20:in `rescue in <main>'
	from examples/test1.rb:17:in `<main>'
	from cause: MyError: MyError
	from examples/test1.rb:24:in `rescue in rescue in <main>'
	from examples/test1.rb:20:in `rescue in <main>'
	from examples/test1.rb:17:in `<main>'
	from cause: MyError: MyError
	from examples/test1.rb:21:in `rescue in <main>'
	from examples/test1.rb:17:in `<main>'
	from cause: ZeroDivisionError: divided by 0
	from examples/test1.rb:18:in `/'
	from examples/test1.rb:18:in `<main>'

After looking around I found loganb-nestegg. This fixed some of the bugs, but still had the non-standard stack trace and the repetition issue.

When you’re forced to look for the 3rd library to solve a problem, it’s time to write your own.

This is exactly what I did for escort. This functionality eventually got extracted into a gem which is how we got nesty. Its stack traces look a lot like regular ones, it doesn’t lose messages and you can nest exceptions as deep as you like without ugly repetition in the stack trace. If we take the same code as above, but redefine the error to use nesty:

class MyError < StandardError
  include Nesty::NestedError
end

Our stack trace will now be:

examples/complex.rb:20:in `rescue in rescue in rescue in <main>': Last one for sure! (MyError)
	from examples/complex.rb:17:in `rescue in rescue in <main>'
	from examples/complex.rb:18:in `rescue in rescue in <main>': Don't need to let MyError bubble up
	from examples/complex.rb:14:in `rescue in <main>'
	from examples/complex.rb:15:in `rescue in <main>': Number errors will be caught
	from examples/complex.rb:11:in `<main>'
	from examples/complex.rb:12:in `/': divided by 0
	from examples/complex.rb:12:in `<main>'

Definitely nicer. We simply add the messages for every nested error to the stack trace in the appropriate place (rather than giving them their own line).

How Nested Exceptions Work

The code for nesty is tiny, but there are a couple of interesting bits in it worth looking at.

One of the special variables in Ruby is $!, which contains the exception currently being handled (it is nil when no exception is active). This way when we raise a nesty error type, we don’t have to supply the nested error as a parameter, it will just be looked up in $!.
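
A minimal sketch of that lookup, assuming a made-up NestingError class rather than nesty’s actual code: the nested error defaults to whatever $! holds at the moment the new error is created.

```ruby
# Hypothetical error class, not nesty's implementation.
class NestingError < StandardError
  attr_reader :nested

  # The default is evaluated when the error is built, so inside a rescue
  # clause $! is the exception currently being handled.
  def initialize(message = nil, nested = $!)
    super(message)
    @nested = nested
  end
end

begin
  begin
    1 / 0
  rescue ZeroDivisionError
    raise NestingError, "wrapped" # no need to pass the original error
  end
rescue NestingError => e
  captured = e
end

puts captured.nested.class
puts captured.nested.message
```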

Ruby always allows you to set a custom backtrace on any error. So, if you rescue an error you can always replace its stack trace with whatever you want e.g.:

begin
  1/0
rescue => e
  # Exception has no message= setter, so only the backtrace is replaced here
  e.set_backtrace(['hello', 'world'])
  raise e
end

This produces:

hello: divided by 0 (ZeroDivisionError)
	from world

We take advantage of this and override the set_backtrace method to take into account the stack trace of the nested error.

def set_backtrace(backtrace)
  @raw_backtrace = backtrace
  if nested
    backtrace = backtrace - nested_raw_backtrace
    backtrace += ["#{nested.backtrace.first}: #{nested.message}"]
    backtrace += nested.backtrace[1..-1] || []
  end
  super(backtrace)
end

To produce the augmented stack trace we note that the stack trace of the nested error should always be mostly a subset of the stack trace of the enclosing error. So, we whittle down the enclosing stack trace by taking the difference between it and the nested stack trace (I think set operations are really undervalued in Ruby, maybe a good subject for a future post). We then augment the nested stack trace with the error message and append it to what was left over from the enclosing stack trace.
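
The same steps can be tried on plain arrays of strings. This is only an illustration of the array difference and concatenation: merge_backtraces and the file names are invented for the example, and real backtraces come from Exception#backtrace.

```ruby
# Hypothetical helper mirroring the three steps described above.
def merge_backtraces(enclosing, nested, nested_message)
  remaining = enclosing - nested # drop frames both traces share
  return remaining if nested.empty?

  # Tag the first nested frame with the nested message, then add the rest.
  remaining + ["#{nested.first}: #{nested_message}"] + nested[1..-1]
end

enclosing = ["app.rb:8:in `rescue in run'", "app.rb:5:in `run'", "app.rb:12:in `<main>'"]
nested    = ["app.rb:6:in `/'", "app.rb:6:in `run'", "app.rb:12:in `<main>'"]

merged = merge_backtraces(enclosing, nested, "divided by 0")
puts merged
```

The shared `<main>` frame appears only once in the merged result, which is what keeps deeply nested errors from repeating the same frames over and over.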

Anyway, if you don’t want exceptions from other libraries invading your app, but still want the ability to diagnose the cause of the exceptions easily – nested exceptions might be the way to go. And if you do decide that nested exceptions are a good fit, nesty is there for you.

Image by Samuel M. Livingston

The Best Way To Pretty Print JSON On The Command-Line

Developers tend to work with APIs a lot and these days most of these APIs are JSON. These JSON strings aren’t exactly easy to read unless they are formatted well. There are many services online that can pretty print JSON for you, but that’s annoying. I love the command-line and whenever I am playing with new APIs or writing my own I mostly use CURL which means I need a good way to pretty print my JSON on the command-line. It should be simple, quick, easy to remember, easy to install – we’re not trying to solve complex algorithms, just format some JSON.

The ‘Good’ Old Faithful

One way that is always available is the Python JSON tool. So you can always do this:

echo '{"b":2, "a":1}' | python -mjson.tool

Which will give you:

{
    "a": 1,
    "b": 2
}

This is alright and, as I said, it is always available. However note that it has sorted our keys which is a major disadvantage. It is also a bit of a handful to write when you just want to pretty print some JSON. I only ever use this when I am on an unfamiliar computer and there is nothing better.

YAJL Tools

If you’re not using YAJL you should be. It is a small JSON library written in C. The parser is event driven and super fast. In the Ruby world we have a nice set of bindings and there are bindings for other languages as well. It is my go-to JSON library.

YAJL also comes with a couple of tools, json_reformat and json_verify. These are pretty self-explanatory and you can get your hands on them like this:

brew install yajl

or

sudo apt-get install yajl-tools

After that all you have to do is:

echo '{"b":2, "a":1}' | json_reformat

Which will give you:

{
    "b": 2,
    "a": 1
}

This is pretty nice. My one tiny niggle is that json_reformat is still a bit of a mouthful, but if you just want basic JSON formatting, it’s a good solution.

Ppjson

Of course being developers, we don’t have to put up with even minor niggles since we can build ourselves exactly the tools we need (and a little bit extra to boot). I was writing a command-line framework and I just happened to need a command-line tool, which is how I ended up with ppjson. I use Ruby quite a bit, so it is a Ruby tool and even if I do say so myself, it’s pretty nice.

You just need to:

gem install ppjson

This will let you do:

echo '{"b":2, "a":1}' | ppjson

Which will give you:

{
  "b": 2,
  "a": 1
}

It uses YAJL through multi_json under the hood, so you still get the speed of YAJL, but it can also do a few extra things for you.

You can pass or pipe it some JSON and it will pretty print it to standard output for you. But you can also pass it a file name with some JSON in it and it will pretty print the contents of the file for you:

ppjson -f abc123.json

You can also store the pretty printed contents back into the file:

ppjson -fi abc123.json

Sometimes you have saved some pretty printed JSON in a file, but now you want to use it as a body of a CURL POST request, for example. Well ppjson can uglify your JSON for you as well:

ppjson -fu abc123.json

This will output a minified JSON string to standard output. And of course you can also update the original file with the uglified JSON as well:

ppjson -fui abc123.json

It does basic JSON pretty printing with an easy to remember executable name, but it also has a few nice convenience features to make your life that little bit easier.
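
If you are curious what pretty printing and uglifying boil down to, here is a small sketch using only Ruby’s standard json library. It is not how ppjson is implemented (that goes through multi_json and YAJL); prettify and uglify are names chosen just for this example.

```ruby
require "json"

# Hypothetical helper: parse, then re-emit with indentation and newlines.
def prettify(json_text)
  JSON.pretty_generate(JSON.parse(json_text))
end

# Hypothetical helper: parse, then re-emit without any extra whitespace.
def uglify(json_text)
  JSON.generate(JSON.parse(json_text))
end

pretty = prettify('{"b":2, "a":1}')
ugly   = uglify(pretty)

puts pretty
puts ugly
```

Because Ruby hashes keep insertion order, the keys come out in the same order they went in.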

The best part is that using Escort (https://github.com/skorks/escort), it was a breeze to write. I’ll talk about some of the other interesting projects that ‘fell out’ of Escort some other time.

Anyway, now you no longer need to remember that IDE or editor shortcut to format your JSON or look for an online tool to do the same, the command-line has you more than covered.

Image by NS Newsflash