coderrr

March 31, 2008

Solving the method collision problem of monkey-patching

Filed under: c, ruby. Posted by coderrr @ 2:00 pm

Update: I’ve put the code and specs up on GitHub and named it monkey_shield.

Everyone and their cat is blogging about how monkeypatching is the devil these days. One of the big problems people point to is two different libraries (re)defining the same method on the same class. I have no interest in getting into any debates about this, but I did want to see if there was a way to solve it other than just “never monkeypatch!”.

Here’s what I’ve come up with:

Protect.wrap_with_context :lib1 do
  class Object
    def to_xml
      "<lib1/>"
    end
  end

  class Lib1
    def self.xml_for(o)
      o.to_xml
    end
  end
end

Protect.wrap_with_context :lib2 do
  class Object
    def to_xml
      "<lib2/>"
    end
  end

  class Lib2
    def self.xml_for(o)
      o.to_xml
    end
  end
end

# or 

Protect.wrap_with_context(:lib1) { require 'lib1' }
Protect.wrap_with_context(:lib2) { require 'lib2' }

# now you can...

Protect.context_switch_for Object, :to_xml

o = Object.new
Lib1.xml_for o  # => "<lib1/>"
Lib2.xml_for o  # => "<lib2/>"

o.to_xml # => raises Protect::NoContextError

Protect.in_context(:lib2) { o.to_xml } # => "<lib2/>"

Protect.set_default_context_for Object, :to_xml, :lib1

o.to_xml # => "<lib1/>"

It allows you to wrap any code in a context (including a require statement). Under the hood, this temporarily hooks method_added (and a few others), then aliases each newly added method and replaces it with a proxy method. The proxy pushes the context you’ve specified onto the “context stack”, calls the original method, then pops the context. This way, we always know which context we are in and can context switch to get around method collisions.
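To make the hook, alias, and proxy steps concrete, here is a stripped-down sketch of the same idea in plain Ruby. It is not the Protect code: MiniContext and Greeter are made-up names, it only handles instance methods on a single class, and it wraps with define_method rather than aliasing.

```ruby
# Hypothetical names throughout (MiniContext, Greeter); a sketch, not Protect.
module MiniContext
  def self.stack
    Thread.current[:mini_context_stack] ||= []
  end

  def self.current
    stack.last
  end

  # Run the block with +context+ on top of the stack, popping it afterwards.
  def self.in_context(context)
    stack.push(context)
    yield
  ensure
    stack.pop
  end

  # Install a temporary method_added hook on +klass+ while the block runs.
  def self.wrap(klass, context)
    wrapping = false
    hook = lambda do |name|
      next if wrapping # ignore the proxy we are defining ourselves
      wrapping = true
      begin
        original = klass.instance_method(name)
        # The proxy: push the context, call the original, pop the context.
        klass.send(:define_method, name) do |*args, &blk|
          MiniContext.in_context(context) { original.bind(self).call(*args, &blk) }
        end
      ensure
        wrapping = false
      end
    end
    klass.define_singleton_method(:method_added) { |name| hook.call(name) }
    yield
  ensure
    klass.singleton_class.send(:remove_method, :method_added)
  end
end

class Greeter; end

MiniContext.wrap(Greeter, :lib1) do
  class Greeter
    def context_seen
      MiniContext.current
    end
  end
end

g = Greeter.new
puts g.context_seen.inspect      # the method body runs inside :lib1
puts MiniContext.current.inspect # nothing is left on the stack afterwards
```

The `wrapping` flag plays the same role as ignore_method_added in the real implementation: without it, defining the proxy would re-trigger the hook.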

There are a few gotchas:
Methods defined in modules that call super cannot be aliased because of a bug in Ruby 1.8. They must be added to an exceptions list like so:

Protect.wrap_with_context :active_support, ['ActiveSupport::CoreExtensions::LoadErrorExtensions::LoadErrorClassMethods#new'] do
  require 'activesupport'
end

There are a few ways around this, but I haven’t decided which is best or if they are even worth it. A single method missing context usually won’t break any context switching. Also this bug should supposedly be fixed in the next version of 1.8.

The context stack is stored in a thread-local variable, meaning a newly spawned thread starts with no context. I could solve this by having Thread.new copy the last context from the thread which created it onto its own context stack. I’ll probably do this in the next version.
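One way the thread fix could look, sketched as a separate helper rather than a patch to Thread.new. ContextThreads and its spawn method are hypothetical names; only the :method_context thread-local key is taken from the implementation below.

```ruby
# Hypothetical helper, not part of Protect: start a thread that inherits
# the spawning thread's current context.
module ContextThreads
  KEY = :method_context # same thread-local key the implementation uses

  def self.spawn(*args)
    # Read the parent's last context before the new thread starts.
    inherited = (Thread.current[KEY] ||= []).last
    Thread.new(*args) do |*thread_args|
      # Seed the child's own stack with the inherited context, if any.
      Thread.current[KEY] = inherited.nil? ? [] : [inherited]
      yield(*thread_args)
    end
  end
end

Thread.current[ContextThreads::KEY] = [:lib1]

plain     = Thread.new { (Thread.current[:method_context] ||= []).last }.value
inherited = ContextThreads.spawn { Thread.current[:method_context].last }.value

puts plain.inspect     # a bare Thread.new starts with an empty stack
puts inherited.inspect # the helper copies the parent's last context
```

Copying only the last context (rather than the whole stack) matches what the paragraph above proposes; the child gets its own array, so pushes and pops in one thread never touch the other.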

Method names with irregular characters, for example “filter[abc]” (Hpricot, I’m looking at you), cannot be defined with “def method_name”, only with define_method or alias_method. So for now you have to add these methods to the exceptions list. The exceptions list also accepts regular expressions, which are matched against the method name only. I’ll probably have this fixed in the next version:

Protect.wrap_with_context(:hpricot, [/^filter\[/]) { require 'hpricot' }
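For anyone who hasn't run into such method names before, this small sketch shows how they get defined and called. Selector and the aliased name are made up for illustration; this is not Hpricot's code, and it is only a hint at how a wrapper could avoid “def” for these names.

```ruby
class Selector
  # `def filter[abc]` would be a syntax error, but define_method accepts any name.
  define_method("filter[abc]") { "filtered by abc" }
end

s = Selector.new
puts s.respond_to?("filter[abc]")
puts s.send("filter[abc]")

# alias_method copes with such names too, which is what a wrapper would need.
Selector.send(:alias_method, "__orig__filter[abc]", "filter[abc]")
puts s.send("__orig__filter[abc]")
```

Since both define_method and alias_method accept these names, a proxy built with define_method instead of an evaluated “def” string would not need the exceptions list for them.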

Lastly, I think the library should detect which methods it needs to context switch and do this step itself, rather than having the user do it explicitly. I’m still not sure about this, though.
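A rough sketch of what that detection could look like: record which contexts define each class/method pair, then treat any pair defined by more than one context as a candidate for context switching. CollisionRegistry is a hypothetical class, not something in the current code.

```ruby
# Hypothetical registry, not part of Protect.
class CollisionRegistry
  def initialize
    @definitions = Hash.new { |hash, key| hash[key] = [] }
  end

  # Intended to be called from the method_added hook with the active context.
  def record(klass, method_name, context)
    contexts = @definitions[[klass, method_name.to_sym]]
    contexts << context unless contexts.include?(context)
  end

  # [klass, method_name] pairs that at least two contexts have defined.
  def collisions
    @definitions.select { |_key, contexts| contexts.size > 1 }.keys
  end
end

registry = CollisionRegistry.new
registry.record(Object, :to_xml, :lib1)
registry.record(Object, :to_xml, :lib2)
registry.record(String, :to_slug, :lib1)

registry.collisions.each do |klass, method_name|
  puts "would context switch #{klass}##{method_name}"
end
```

The library could then call context_switch_for on each colliding pair once all the wrapped requires have finished.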

All comments, suggestions, and criticism welcome!

Will this solve all of the problems caused by monkeypatching? No. Will this solve some of them? Maybe. Does this make performance suck? Most likely. Would you use this in production? Hell no. Are there things this will break? Very likely. Is there a better way to do this? Probably. Does this work in 1.9? Haven’t tested it but my intuition says no. Should this be built into the language? Don’t ask me. Can I sue you if this breaks stuff? No.

Here’s the implementation:

require 'inline'

class Module
  # the singleton method which module_function creates should point to the original 
  # method, not the context wrapped one
  def __PROTECT__module_function__with_args(*methods)
    methods.each do |method|
      if unique_method_name = Protect::UNIQUE_METHOD_NAMES["#{self.name}##{method}"]
        __module_function__(unique_method_name)
        (class << self; self; end).class_eval { alias_method method, unique_method_name }
      else
        __module_function__(method)
      end
    end
  end  

  alias_method :__module_function__, :module_function  # store original module_function
  # this has to be a C function so that it can modify the module's scope
  inline { |builder| builder.c_raw %q{
    static VALUE __PROTECT__module_function(int argc, VALUE *argv, VALUE self) {
      if (argc == 0)
        return rb_funcall(self, rb_intern("__module_function__"), 0);
      else
        return rb_funcall2(self, rb_intern("__PROTECT__module_function__with_args"), argc, argv);
    }
  } }
end

class Protect
  UNIQUE_METHOD_NAMES = {}

  class NoContextError < StandardError; end
  class MethodDefinedInModuleCallsSuper < StandardError; end

  class << self
    def wrap_with_context(context, exceptions = [], &blk) 
      Module.class_eval do
        define_method :__PROTECT__method_added do |klass, method_name|
          return  unless Protect.hook_method_added?
          return  if exceptions.include? method_name or exceptions.include? "#{klass.name}##{method_name}" or
                     exceptions.select {|ex| ex.is_a? Regexp }.any? {|re| re =~ method_name.to_s }

          Protect.ignore_method_added { Protect.wrap_method_with_context(klass, method_name, context) }
        end
      end

      Protect.alias_method_added_hooks

      Protect.hook_module_function do
        Protect.hook_method_added do
          yield
        end
      end
    end

    def wrap_method_with_context(klass, method_name, context)
      klass.class_eval do
        return  if ! method_defined? method_name and ! private_method_defined? method_name # something else removed it already, wth!

        method_name_with_context = Protect.prefix_with_context(method_name, context)
        unique_method_name = Protect.unique_method_name(method_name)

        alias_method method_name_with_context, method_name
        alias_method unique_method_name, method_name
        private method_name_with_context, unique_method_name

        UNIQUE_METHOD_NAMES["#{self.name}##{method_name}"] = unique_method_name

        class_eval <<-EOF
          def #{method_name} *args, &blk
            Protect.in_context #{context.inspect} do
              __send__(#{unique_method_name.inspect}, *args, &blk)
            end
          rescue NoMethodError
            if $!.message =~ /super: no superclass method `(.+?)'/
              raise Protect::MethodDefinedInModuleCallsSuper, "Please add #{self.name}##{method_name} to the exceptions list!"
            end

            raise
          end
        EOF
      end
    rescue
      puts "failed to wrap #{klass.name}##{method_name}: #{$!}"
    end

    def context_switch_for klass, method_name
      klass.class_eval <<-EOF
        def #{method_name} *args, &blk
          raise Protect::NoContextError  if ! Protect.current_context
          __send__(Protect.prefix_with_context(#{method_name.inspect}, Protect.current_context), *args, &blk)
        end
      EOF
    end

    def set_default_context_for klass, method_name, context
      context_switched_name = "__context_switch__#{Protect.unique}__#{method_name}"
      klass.class_eval do
        alias_method context_switched_name, method_name
        class_eval <<-EOF
          def #{method_name} *args, &blk
            if ! Protect.current_context
              need_pop = true
              Protect.push_context #{context.inspect}
            end

            __send__(#{context_switched_name.inspect}, *args, &blk)
          ensure
            Protect.pop_context  if need_pop
          end
        EOF
      end
    end

    def alias_method_added_hooks
      return  if @method_added_hooks_aliased

      Module.class_eval do
        if method_defined? :method_added
          old_method_added = Protect.unique_method_name(:method_added)
          alias_method old_method_added, :method_added
        end

        if method_defined? :singleton_method_added
          old_singleton_method_added = Protect.unique_method_name(:singleton_method_added)
          alias_method old_singleton_method_added, :singleton_method_added
        end

        class_eval <<-EOF
          def __PROTECT__method_added__proxy method_name
            __PROTECT__method_added(self, method_name)
            #{old_method_added && "#{old_method_added} method_name"}
          end

          def __PROTECT__singleton_method_added__proxy method_name
            __PROTECT__method_added((class<<self;self;end), method_name)
            #{old_singleton_method_added && "#{old_singleton_method_added} method_name"}
          end
        EOF

        alias_method :method_added, :__PROTECT__method_added__proxy
        alias_method :singleton_method_added, :__PROTECT__singleton_method_added__proxy
      end

      @method_added_hooks_aliased = true
    end

    def hook_module_function(&blk)
      old_module_function = Protect.unique_method_name(:module_function)
      Module.class_eval do
        alias_method old_module_function, :module_function
        alias_method :module_function, :__PROTECT__module_function
      end

      yield
    ensure
      Module.class_eval { alias_method :module_function, old_module_function }
    end

    def in_context(context, &blk)
      push_context context
      yield
    ensure
      pop_context
    end

    def context_stack
      (Thread.current[:method_context] ||= []).dup
    end

    def current_context
      (Thread.current[:method_context] ||= []).last
    end

    def push_context(context)
      (Thread.current[:method_context] ||= []).push context
    end

    def pop_context
      Thread.current[:method_context].pop
    end

    def unique
      $unique_counter ||= 0
      $unique_counter += 1
    end

    def prefix_with_context(method_name, context)
      "__PROTECT__context__#{context}__#{clean_method_name(method_name)}" 
    end

    def unique_method_name(method_name)
      "__PROTECT__unique_method__#{unique}__#{clean_method_name(method_name)}"
    end

    def hook_method_added(hook = true, &blk)
      orig, @hook_method_added = @hook_method_added, hook
      yield
    ensure
      @hook_method_added = orig
    end

    def ignore_method_added(&blk)
      hook_method_added false, &blk
    end

    def hook_method_added?
      @hook_method_added
    end
    
    private

    def clean_method_name(method_name)
      method_name.to_s.gsub(/\[\]/,'__BRACKETS__').to_sym
    end
  end
end

I’ll turn this into a gem and include the tests once I come up with a better name for it.

9 Comments »

  1. Nice :-)

    Comment by Avdi — March 31, 2008 @ 2:48 pm

  2. Could you define “monkey patching”, please?

    Comment by Tom — March 31, 2008 @ 4:25 pm

  3. http://www.google.com/search?q=ruby+monkey+patching

    Comment by coderrr — March 31, 2008 @ 4:35 pm

  4. [...] Solving the method collision problem of monkey-patching [...]

    Pingback by This Week in Ruby (April 7, 2008) | Zen and the Art of Programming — April 7, 2008 @ 10:07 am

  5. Nice code, but if all you want to do is to make sure that the #to_xml method in context :lib1 doesn’t conflict with the #to_xml method in context :lib2, then why don’t you just name them #lib1_to_xml and #lib2_to_xml or so? You don’t have to maintain complex programming tricks and get the full speed.

    Comment by Pit — April 7, 2008 @ 1:28 pm

  6. Steve, one more remark: I don’t know why you used a C method. With a pure Ruby implementation I get the desired results.

    Comment by Pit — April 7, 2008 @ 2:09 pm

  7. Hey Pit,

    Thanks for the comments. Let me address them.

    The issue is not with code you’ve written yourself, but when using libraries other people have written. Sometimes two of these libraries (by different authors) will define the same method on the same class. They might then use this method all over the place in their library, expecting that the method acts as they’ve defined it. This is why we can’t just change the name to lib1_to_xml. Of course if you really wanted to, you could go through and modify all their code to do this. But then you’d have to do it again on the next release.

    Here’s the reason for the C method. When module_function is called with no args, it changes the behavior of subsequently defined methods in the scope from which it was called (just as private, public, or protected do). Usually this scope will be some module body, e.g.:

    module X
      module_function
      def some_method
        # this method will be a module function
      end
    end

    The problem is that when you define a new method which calls the original module_function, the scope is now the new method, and no longer the module from which you called the new method, e.g.:

    class Module
      alias_method :orig_module_function, :module_function
      def module_function
        orig_module_function
        # the scope inside this method is changed, and not the scope of the calling module
      end
    end

    module X
      module_function
      def some_method
        # this method will not be a module function as you would expect
      end
    end

    Writing the method in C gets around this and was necessary to make my code work with libraries which call the module_function method with no arguments.

    Comment by coderrr — April 7, 2008 @ 7:41 pm

  8. Steve, thanks for the replies to my questions. Now both things make sense to me.

    Comment by Pit — April 8, 2008 @ 7:17 pm

  9. [...] well what do you know. This seems to do what I described, pretty much. But it looks a bit [...]

    Pingback by Scoped mixins | self.collect(&:code) — January 22, 2009 @ 12:41 am

