Using Ruby Blocks And Rolling Your Own Iterators

I recently covered Ruby block basics in my post, More Advanced Ruby Method Arguments – Hashes And Block Basics. I mentioned that blocks are not really method arguments and also covered the two different types of block syntax. Towards the end of that post I promised to examine Ruby blocks more deeply and I am going to try and do that here.

In my opinion there are several interesting things about blocks:

Is there any other difference between the two different types of block syntax besides the fact that one is predominantly used for single line blocks and the other for multi-line?
In what context are blocks normally used and how does it look from the perspective of the methods that take blocks as ‘parameters’?
Is passing parameters to blocks similar to passing parameters to a method (i.e. is the syntax just as rich, can you have default values etc.) and what scope do those parameters have?
Can a block be a first class parameter and can you pass it around easily as opposed to just using it as a quasi-parameter like we normally do?

Examining each one of those in turn can give one a reasonably solid understanding of blocks, so that’s what we’re gonna do.

The Two Different Block Notations

As we know there are two different types of block syntax, the curly brace syntax e.g.:

```ruby
[1,2,3,4,5].each {|i| print "#{i} "}
```

and the do..end syntax e.g.:

```ruby
[1,2,3,4,5].each do |i|
  print "#{i} "
end
```

Aside from looking different and conventionally being used for single-line or multi-line blocks, there is one major difference between the two types of syntax: precedence. As we know, every expression in Ruby returns a value, and a call to an iterator is no different. So if we want to use the value that is returned by an iterator that takes a block (e.g. print it out) we can normally do so. When we use curly brace notation this is not a problem e.g.:

```ruby
print [1,2,3,4,5].each {|i| i}
```

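this will print out the array that each returns (12345 on Ruby 1.8, where the elements are simply joined together, or [1, 2, 3, 4, 5] on Ruby 1.9 and later), as we would expect. But if we try to do the same with the do..end syntax e.g.: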

```ruby
print [1,2,3,4,5].each do |i|
  i
end
```

we get the following error:

`each': no block given (LocalJumpError)

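this happens because a do..end block binds less tightly than a curly brace block: it attaches to the leftmost method call in the expression, which here is print, rather than to each. (On Ruby 1.9 and later, each called without a block returns an Enumerator rather than raising, so print outputs that enumerator and silently ignores the block; the underlying cause is the same.) So what we actually have happening is the following: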

```ruby
print([1,2,3,4,5].each) do |i|
  i
end
```

to get this to work correctly we need to wrap the whole thing in parentheses e.g.:

```ruby
print([1,2,3,4,5].each do |i| i end)
```

It is not a major difference but something to be aware of.
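To make it easier to see which method actually ends up with the block, here is a small sketch. The helper methods `inner` and `outer` are hypothetical names made up for this illustration; each one simply reports whether it was handed a block.

```ruby
# Hypothetical helpers (not from the examples above): each reports whether it got a block.
def inner
  block_given? ? "inner got the block" : "inner got no block"
end

def outer(value)
  block_given? ? "outer got the block (#{value})" : "outer got no block (#{value})"
end

# Braces bind to the nearest method call, so inner receives the block.
with_braces = outer inner { :ignored }

# do..end binds to the leftmost call in the expression, so outer receives it.
with_do_end = outer inner do
  :ignored
end

puts with_braces
puts with_do_end
```

The same block body ends up attached to a different method purely because of the notation used.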

Learn About Blocks By Implementing Our Own Iterator

The best way I have found to learn how blocks and iterators work together is to implement some of your own, starting with a really simple case. Let's implement an infinite loop iterator of our own:

```ruby
def infinite_loop
  while true
    yield
  end
end
```

We can call it in the following way, notice that we pass a simple block:

```ruby
infinite_loop {puts 'Looping Infinitely'}
```

this will print out ‘Looping Infinitely’ forever since the loop inside our iterator method has no exit condition and just keeps yielding to the block. So far so easy, so let's move on to something more complex. Let’s say we need to add another iterator to the Array class that can work with the array elements in reverse (I am aware that Array has a method to reverse itself). We could do this by opening up the Array class and putting our new iterator in:

```ruby
class Array
  def reverse_iterate
    current_index = self.size - 1
    while current_index >= 0
      yield self[current_index]
      current_index -= 1
    end
  end
end
```

We now have an exit condition for the loop in our iterator which means we won’t yield to the block more times than there are values in the array. We can then call our new iterator method in the following fashion:

```ruby
[2,4,6,8].reverse_iterate { |i| print "#{i} "}
```

This is all pretty simple, but there are a couple of things to note. The first is the call to yield inside our new method: this is what hands control over to the block. Any Ruby method can be given a block, but it is yield that actually runs it. The second thing is the value we pass to yield (yield is a keyword rather than a method, but it takes arguments in much the same way). Each value passed to yield becomes a block parameter, so by passing one value we let the block that the iterator takes have one parameter. If we passed two values to yield then our block could take two parameters as well. So, the above prints out the array in reverse because we yield each of the array values to the block (last to first) and the block just outputs it:

8 6 4 2
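It is worth seeing what happens when the caller leaves the block out. The sketch below repeats the iterator so that it is self-contained, then calls it without a block and rescues the resulting error; the `outcome` variable is just a name chosen for this illustration.

```ruby
class Array
  # Same iterator as above, repeated so the sketch runs on its own.
  def reverse_iterate
    current_index = size - 1
    while current_index >= 0
      yield self[current_index]
      current_index -= 1
    end
  end
end

begin
  [2, 4, 6, 8].reverse_iterate          # no block supplied
  outcome = "no error"
rescue LocalJumpError => e
  # yield has nothing to hand control to, so Ruby raises
  outcome = "LocalJumpError: #{e.message}"
end

puts outcome
```

So a method that yields unconditionally effectively requires a block.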

But what if we want our new method to not only take a block but also have some sort of default behavior when no block is provided? This is fairly simple to do, we just need to use the Kernel#block_given? method:

```ruby
class Array
  def reverse_iterate
    if block_given?
      current_index = self.size - 1
      while current_index >= 0
        yield self[current_index]
        current_index -= 1
      end
    else
      print self.reverse
    end
  end
end
```

we can now call our iterator with no block at all:

```ruby
[2,4,6,8].reverse_iterate
```

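Our default behavior just prints out all the array values in reverse. On Ruby 1.8 they come out concatenated together (Ruby 1.9 and later print the array in its inspect form, [8, 6, 4, 2], instead):

8642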

The last thing to note is the fact that yield can actually return a value from the block back to the iterator. The value yield returns is the result of the last expression executed in the block. We can take advantage of this to, for example, collect the values that the block returns on every iteration and return a new array with all the new values as the result of the iterator method call, e.g.:

```ruby
class Array
  def reverse_iterate
    if block_given?
      new_array = []
      current_index = self.size - 1
      while current_index >= 0
        new_array << yield(self[current_index])
        current_index -= 1
      end
    else
      print self.reverse
    end
    new_array
  end
end
```

If we then call our iterator method with a block that squares all the values it receives:

```ruby
puts [2,4,6,8].reverse_iterate { |i| i*i}
```

the result returned from the method call will be an array with the values of our original array but reversed and squared. Since puts prints each element of an array on its own line, the output is:

64
36
16
4

This can be pretty handy and is incidentally similar to how the more complex Ruby iterator methods (such as map) work.
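To make the comparison with map concrete, here is a minimal sketch of a map-like method built on nothing but each and yield. The name `simple_map` is made up for this example, and this is a simplified stand-in rather than how Ruby itself implements map.

```ruby
module Enumerable
  # Hypothetical method: collects whatever the block returns for each element.
  def simple_map
    results = []
    # yield inside this inner block still refers to the block given to simple_map
    each { |item| results << yield(item) }
    results
  end
end

squares = [2, 4, 6, 8].simple_map { |i| i * i }
p squares
```

Because it only relies on each, the sketch works for anything that includes Enumerable, not just arrays.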

Block Parameters

Using block parameters is very similar to using method parameters. If you’re using Ruby 1.9 you get the full set of features, such as default values and catch-all (splat) parameters, which you can mix and match pretty much any way you like. If you’re on Ruby 1.8 you are more limited: catch-all parameters work, but as far as I am aware default values for block parameters only arrived in 1.9. One difference from methods is worth knowing: a block does not complain when it is yielded more or fewer values than it has parameters; missing ones are set to nil and extras are dropped. So to give an example, you could create an iterator that yields 4 parameters to the block in the following way:

```ruby
class Array
  def reverse_iterate
    current_index = self.size - 1
    while current_index >= 0
      yield self[current_index], 'Value', current_index, 'Index'
      current_index -= 1
    end
  end
end
```

You can then call this iterator with a block but rather than using 4 parameters, you can use two with the second being a catch-all (remember the * notation):

```ruby
[2,4,6,8].reverse_iterate do |value, *others|
  puts "#{others[0]} = #{value}, #{others[2]} = #{others[1]}"
end
```

You use the values from the catch-all parameter just like you would an array, so our iterator call would output:

Value = 8, Index = 3
Value = 6, Index = 2
Value = 4, Index = 1
Value = 2, Index = 0
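Here is a short sketch of a couple of other parameter forms, assuming Ruby 1.9 or later for the default value. The method `yield_one_then_two` is a hypothetical iterator written only for this illustration.

```ruby
# Hypothetical iterator: yields one value, then two values.
def yield_one_then_two
  yield 1
  yield 2, 'two'
end

# A default value fills in when yield passes fewer values (Ruby 1.9+).
seen = []
yield_one_then_two do |number, label = 'default', *rest|
  seen << [number, label, rest]
end

# Without a default, a missing value simply becomes nil; no error is raised.
lenient = []
yield_one_then_two { |number, label| lenient << [number, label] }

p seen
p lenient
```

Unlike a method call, neither block raises an ArgumentError when the number of yielded values does not match the number of parameters.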

Fairly straightforward. What is more interesting is the scope that variables have once we are inside the block. In the simplest case we may call an iterator with a block from another method, which may already have some variables defined:

```ruby
def some_crazy_method
  random_variable = 5
  [1,2,3].each do |i|
    puts "Array value=#{i}, Random variable=#{random_variable}"
  end
end
```

It is curious that in this case we actually have access to these variables from within the block. Calling the above method prints out the following:

Array value=1, Random variable=5
Array value=2, Random variable=5
Array value=3, Random variable=5

As you can see we are able to print out random_variable from within the block and it is set to the same value as it was outside the block. But what happens when we have a variable outside the block and use a variable with the same name as a block parameter e.g.:

```ruby
def some_crazy_method
  i = 5
  puts "Before block i=#{i}"

  [1,2,3].each do |i|
    puts "In block i=#{i}"
  end

  puts "After block i=#{i}"
end
```

The behavior here depends on whether you’re using Ruby 1.8 or 1.9. With Ruby 1.8 it is still the same variable as the one defined outside the scope of the block, so if we assign a new value to that variable within the block this value will be available outside of the block. So calling the above method in Ruby 1.8 would produce:

Before block i=5
In block i=1
In block i=2
In block i=3
After block i=3

Notice that i now retains the last value it had within the block. In Ruby 1.9 however, the variable within the block is not the same as the one outside of the block so the output in 1.9 would be:

Before block i=5
In block i=1
In block i=2
In block i=3
After block i=5

As you can see the variable retains its old value after the block completes. Ruby 1.9 also provides another feature to do with variable scope: block-local variables. If you want a variable inside the block that is not one of the values yielded to it, and you don’t want to accidentally reuse a variable of the same name that was initialized before the block (and is therefore in scope within it), you can do the following:

```ruby
def some_crazy_method
  some_variable = 5

  puts "Before block some_variable=#{some_variable}"

  [1,2,3].each do |i; some_variable|
    puts "In block i=#{i}"
    some_variable = i
  end

  puts "After block some_variable=#{some_variable}"
end
```

By declaring a variable after the semicolon in the block's parameter list, we essentially say that we want to have a variable with that name local to the block and unrelated to a variable with the same name outside the block e.g.:

Before block some_variable=5
In block i=1
In block i=2
In block i=3
After block some_variable=5

This seems to be of limited utility and is not available in Ruby 1.8 (it will produce a syntax error).
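The flip side of these scoping rules is that a block can also assign to an outer variable, as long as that variable is not shadowed by a block parameter or a block-local declaration. A minimal sketch, using a hypothetical method name `sum_with_block`:

```ruby
# Hypothetical method: accumulates into a local variable from inside a block.
def sum_with_block
  total = 0
  [1, 2, 3].each { |i| total += i }   # the block updates the method's local
  total
end

# Declaring total as block-local cuts that connection.
def sum_with_block_local
  total = 0
  [1, 2, 3].each { |i; total| total = i }   # this total exists only inside the block
  total
end

puts sum_with_block
puts sum_with_block_local
```

This is the behavior that makes accumulator-style code with each work, and it is also why accidental name clashes inside blocks can be surprising.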

Block As Closures

That’s right, you can use a block as a closure and pass it around and call it whenever you like. It is a reasonably advanced feature, but essentially you can do the following:

```ruby
def method_with_block_as_closure(&block)
  another_method block
end

def another_method(variable)
  x = 25
  variable.call x
end

method_with_block_as_closure { |i| print "I am happy block #{i}" }
```

By prefixing the last method parameter with the & symbol you tell Ruby to convert the block the method receives into a Proc object (a closure) and assign it to that parameter. We can then pass this parameter around to other methods just like we would any other. When we actually want to call the block we simply need to use the call method on the variable that currently contains the block, passing in any parameters that the block expects. In our case the result of the above code would be to execute the print method that was called inside the block e.g.:

I am happy block 25
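As a rough sketch of where this leads, the & symbol also works in the other direction: at a call site it turns a Proc back into a block for a method that uses plain yield. The method names `capture_block` and `yield_twice` are hypothetical and exist only for this example.

```ruby
# Hypothetical method names, used only for this sketch.
def capture_block(&block)
  block                      # the block arrives as a Proc object
end

def yield_twice
  [yield(1), yield(2)]       # an ordinary yield-based method
end

doubler = capture_block { |i| i * 2 }
block_class = doubler.class             # the captured block is a Proc
called      = doubler.call(21)          # invoke it explicitly
as_block    = yield_twice(&doubler)     # & turns the Proc back into a block

# The Proc is a closure: it keeps access to variables from where it was written.
counter = 0
increment = capture_block { counter += 1 }
increment.call
increment.call

p block_class, called, as_block, counter
```

The counter example is what the word closure refers to: the block carries its surrounding variables with it wherever it is passed.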

This is a very basic, surface view of how to use blocks as first class parameters (closures). At this stage, this is really all we need to know, but I do plan to dig more deeply into how and why this works in a later post (i.e. Procs etc.). If you’re interested in that, then make sure you grab my feed so you don’t miss it. That’s all I wanted to cover regarding blocks, hope you found it interesting.

Image by Holger Zscheyge
