Whenever I learn a new language, I always want to know how the language does its work at more than just a superficial level. I admit that if you just want to be a casual user of a language (or a library, for that matter), this is not really necessary, but if you want to take a step towards mastering a language, this is required knowledge. For my latest dive into Ruby, I decided to look at the Ruby installation and how we load the various libraries that we use.

The Ruby Installation

If you want to know where things live in your Ruby installation, there is an easy way to find out. Ruby stores a lot of information about itself in a big hash (just like environment variables):

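Update: As per the comments below, Config is not built into the interpreter. It is defined in the rbconfig library, which RubyGems happens to load, so you will likely need to require 'rbconfig' (or 'rubygems') before you can use it. Newer versions of Ruby call the same object RbConfig.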

Config::CONFIG

If you get a list of keys from this hash, you can find out about all sorts of interesting info that Ruby knows about itself, but the most interesting bits are probably the following:

Config::CONFIG['bindir']
Config::CONFIG['rubylibdir']
Config::CONFIG['sitedir']
Config::CONFIG['vendordir']
Config::CONFIG['archdir']

Feel free to explore the others, though; there are interesting things to find there.

  • bindir – this is where the Ruby installation keeps its executables, such as the interpreter (ruby), irb, ri, rdoc etc. (in my case C:/ruby1.8/bin)
  • rubylibdir – this is where most of the Ruby standard libraries live (in my case C:/ruby1.8/lib/ruby/1.8). I say most because, while most Ruby libraries are written in Ruby as you would expect, some are written in C and are therefore specific to the installation of Ruby (i.e. what platform you’re on). These don’t live here but instead reside in ‘archdir’.
  • archdir – this is the brother of ‘rubylibdir’ for all native libraries that are written in C (in my case C:/ruby1.8/lib/ruby/1.8/i386-mswin32). You’ll mostly find a bunch of C header files (.h) here, as well as .so files (in my case) or potentially some other format (.dll maybe?).
  • sitedir/vendordir – these are actually two separate directories, but ‘vendordir’ is only applicable to Ruby 1.9. Essentially, any extensions you develop yourself (or find lying around I guess :)) that are not gems will find a home here. These directories are similar in structure to ‘rubylibdir’: pure Ruby code goes directly here, but there is also an ‘archdir’ equivalent for your architecture-specific stuff. My ‘sitedir’ was C:/ruby1.8/lib/ruby/site_ruby.
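If you want to print all of these in one go, something like the following should do it. This is only a sketch, and it assumes a newer Ruby where the hash is reached as RbConfig::CONFIG after requiring 'rbconfig' (on an old 1.8 install the Config spelling used above is the one to try). The local variable names are mine.

```ruby
require 'rbconfig'

# The keys discussed above; a key that this installation does not define
# simply comes back as nil.
keys = %w[bindir rubylibdir archdir sitedir vendordir]

# Build a small hash of key => directory for this installation.
install_dirs = keys.each_with_object({}) do |key, dirs|
  dirs[key] = RbConfig::CONFIG[key]
end

install_dirs.each do |key, dir|
  puts "#{key.ljust(12)} #{dir.inspect}"
end
```

The paths you get back will of course depend on your platform and on how your Ruby was installed.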

But what about gems? Well, that brings us to how we use the various libraries in our Ruby programs.

Require vs Load

Normally when we want to use a feature in our Ruby program (be it a gem or a standard library), we include it in our file using the require method (we are all aware that load exists, but require is what all the big boys use). So what is the difference?

Load

To load a file we use the load method:

load 'blah.rb'

Note that we must supply the extension when we use it. When Ruby encounters a load, it will read in the contents of the file you’re trying to load. It will do this every time: no matter how many times you load the same feature, Ruby will read the file in again whenever it encounters a load. You’re not limited to just supplying a name and extension, though. You can navigate directories, e.g.:

load '../../blah.rb'

or even give an absolute path:

load '/a/b/c/blah.rb'
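To see the "every time" behaviour for yourself, here is a minimal sketch. The file name and the $load_count global are made up for the demonstration, and the file is written to a temporary directory so nothing is left behind.

```ruby
require 'tmpdir'

$load_count = 0

Dir.mktmpdir do |dir|
  # A throwaway file whose only job is to bump the counter each time it runs.
  path = File.join(dir, 'blah.rb')
  File.write(path, "$load_count += 1\n")

  load path   # absolute path, extension included
  load path   # the file is read and executed a second time
end

puts $load_count
```

The counter goes up once per load, which is exactly why load is handy when you want to pick up changes to a file without restarting.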

But how will Ruby find these files?

How Ruby Finds Stuff

By default Ruby has a list of directories that it searches to find features you want to load – the load path. This is stored in a special variable:

$:

By default the load path will include the ‘rubylibdir’, ‘archdir’, ‘sitedir’, ‘vendordir’ etc. In Ruby 1.8 the last thing on the load path is always the current working directory (i.e. . – ‘dot’), which is the directory from which you launched your application (unless you jump directories as your program is executing); Ruby 1.9.2 and later leave it out. The load path will not by default include anything to do with Ruby gems. So how does Ruby find anything to do with gems? Let’s talk about require before dealing with gems.
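Here is a small sketch of the load path in action. It relies on $LOAD_PATH, the longer alias for $:, and on a made-up file called load_path_demo.rb that is created in a temporary directory just for this example.

```ruby
require 'tmpdir'

# $: and $LOAD_PATH are two names for the same array.
same_array = $:.equal?($LOAD_PATH)
puts same_array

$found_via_load_path = false

Dir.mktmpdir do |dir|
  # Hypothetical file, created only for this sketch.
  File.write(File.join(dir, 'load_path_demo.rb'), "$found_via_load_path = true\n")

  $LOAD_PATH.unshift(dir)        # put our directory at the front of the search list
  begin
    load 'load_path_demo.rb'     # bare name, so Ruby searches the load path for it
  ensure
    $LOAD_PATH.delete(dir)       # leave the load path as we found it
  end
end

puts $found_via_load_path
```

Adding a directory to the front of the array is the same trick RubyGems uses, as we will see further down.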

Require

No matter how many times you require the same feature in your program, only the first time is significant. Ruby will not read the file a second time. This is the first fundamental difference from how load works. The other obvious difference is that you don’t need to supply an extension when you require a feature:

require 'blah'

This allows us to treat features written in Ruby in exactly the same way as extensions written in C; Ruby is smart enough to work out which is which. These are basically the two major differences between require and load. As I mentioned, require is what most people will use day-to-day, but as long as we are aware of the differences, load can be useful too. And by the way, just like load, you can supply a relative or absolute path to require (as long as you leave the extension off).
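The "only the first time counts" rule is easy to check. This is a sketch along the same lines as before; the blah_feature file and the $require_count global are invented for the example.

```ruby
require 'tmpdir'

$require_count = 0
first = second = nil

Dir.mktmpdir do |dir|
  # A throwaway feature that bumps the counter whenever it is actually read.
  File.write(File.join(dir, 'blah_feature.rb'), "$require_count += 1\n")
  feature = File.join(dir, 'blah_feature')   # absolute path, extension left off

  first  = require feature   # true: the file was found and executed
  second = require feature   # false: already loaded, so nothing is re-read
end

puts first, second, $require_count
```

Ruby keeps track of what has already been required in the $LOADED_FEATURES array (also known as $"), which is how it knows to skip the second call.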


How Ruby Finds Gem Stuff

As I mentioned, the load path does not contain anything to do with gems by default. This is because you can have multiple versions of the same gem installed on your system (and can use any of the versions you have installed), so the load path is modified dynamically as the program executes, whenever you want to include a feature provided by a gem. The gem’s directories are prepended to the load path rather than appended.

By default, when you install gems they go into the same place where Ruby is installed, one level up from ‘rubylibdir’ (in my case C:\ruby1.8\lib\ruby\gems). This is where Ruby will look for gems when it needs to dynamically modify the load path. Ruby will not by default know that it needs to look for gems, so you will need to tell it to do so. There are several ways of doing this:

  • use require 'rubygems' in your code, before you include any features from gems. This is fine for your own code, but programs you download that do not do this will not work.
  • use the -rubygems command-line option when you launch your Ruby program.
  • use the RUBYOPT environment variable (e.g. RUBYOPT=rubygems). This is probably the easiest and most painless way.

If you have this set up then when you require something from a gem, Ruby will know to search the place where your gems are and will modify the load path dynamically to include the location of the gem you want (if there are multiple versions of the same gem, by default Ruby will use the latest version). For example:

>> puts $:
C:/ruby1.8/lib/ruby/site_ruby/1.8
C:/ruby1.8/lib/ruby/site_ruby/1.8/i386-msvcrt
C:/ruby1.8/lib/ruby/site_ruby
C:/ruby1.8/lib/ruby/1.8
C:/ruby1.8/lib/ruby/1.8/i386-mswin32
.

>> require 'hpricot'

>> puts $:
C:/ruby1.8/lib/ruby/gems/1.8/gems/hpricot-0.8.1-x86-mswin32/bin
C:/ruby1.8/lib/ruby/gems/1.8/gems/hpricot-0.8.1-x86-mswin32/lib
C:/ruby1.8/lib/ruby/site_ruby/1.8
C:/ruby1.8/lib/ruby/site_ruby/1.8/i386-msvcrt
C:/ruby1.8/lib/ruby/site_ruby
C:/ruby1.8/lib/ruby/1.8
C:/ruby1.8/lib/ruby/1.8/i386-mswin32
.
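If you would rather see only the entries that were added, you can diff the load path before and after the require. This is a rough sketch: load_path_additions is a helper I made up for the purpose, and hpricot is just the gem from the session above, so the rescue is there in case it is not installed on your machine.

```ruby
require 'rubygems'   # harmless if RubyGems is already loaded

# Hypothetical helper: runs the block and returns whatever entries
# appeared in the load path while it ran.
def load_path_additions
  before = $LOAD_PATH.dup
  yield
  $LOAD_PATH - before
end

added = load_path_additions do
  begin
    require 'hpricot'
  rescue LoadError
    warn 'hpricot is not installed, so nothing was added'
  end
end

puts added
```

Depending on the gem, you may see more than its own directories here, since requiring it can pull in other gems that it depends on.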

One final caveat to be aware of: if you can’t install gems into the default location, you will need to tell Ruby where the gems will go (and where to find them subsequently). You do this using the GEM_HOME environment variable. As long as you have it set, everything will still function as expected.
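To check what RubyGems has actually picked up, you can ask it directly. A minimal sketch, assuming RubyGems is available; Gem.dir and Gem.path are part of the RubyGems API, while the variable names are mine.

```ruby
require 'rubygems'   # a no-op on Rubies that already load RubyGems at startup

gem_home_setting = ENV['GEM_HOME']   # nil when the variable is not set
install_dir      = Gem.dir           # where newly installed gems will go
search_dirs      = Gem.path          # every directory searched for installed gems

puts "GEM_HOME: #{gem_home_setting.inspect}"
puts "Gem.dir:  #{install_dir}"
puts "Gem.path: #{search_dirs.inspect}"
```

If you set GEM_HOME before launching Ruby, Gem.dir should reflect it.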
