Best thing since sliced bread
Jamis Buck has shed a little light on figuring out WTF that Ruby process eating all your processor is actually doing.
Alright, maybe not quite the same as sliced bread, but very nice nonetheless.
I can’t tell you how many times I could have used this. Now I just need to wait for the need to pop up again.
[UPDATE] Apparently it gets better than this, much better.
Using Hpricot to Scrub HTML
[UPDATE 2007-01-10]
I’ve updated the scrubber, see Hpricot Scrub for more.
[/UPDATE]
I went looking for a Ruby replacement for Perl’s HTML::Scrubber for a gig and came up blank. Can it really be possible that nobody is doing anything more than blindly stripping tags?
I had seen Hpricot and thought I needed to find a reason to use it. Well, here it is: I monkey-patched a couple of methods into Hpricot and off I went.
Here are the Hpricot bits.
module Hpricot
  class Elements
    def strip
      each { |x| x.strip }
    end

    def strip_attributes(safe=[], patterns={})
      each { |x| x.strip_attributes(safe, patterns) }
    end
  end

  class Elem
    def strip
      parent.replace_child self, Hpricot.make(inner_html) unless parent.nil?
    end

    def strip_attributes(safe=[], patterns={})
      attributes.each { |atr|
        pat = patterns[atr[0].to_sym] || ''
        remove_attribute(atr[0]) unless safe.include?(atr[0]) &&
          atr[1].match(pat)
      } unless attributes.nil?
    end
  end
end
Just that bit gets me to the point where I can do things like this:
require 'open-uri'   # lets open() fetch URLs; on Ruby 2.5+ call URI.open instead
require 'hpricot'

doc = Hpricot(open('http://slashdot.org/').read)
# remove all anchors leaving behind the text inside.
(doc/:a).strip
# strip all attributes except for src from all images
(doc/:img).strip_attributes(['src'])
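The examples above only pass the safe list, so the role of the patterns hash is easy to miss. Here is a minimal pure-Ruby sketch of the same keep-or-drop rule, with no Hpricot involved. The helper name filter_attributes and the sample attribute values are made up for illustration; only the rule itself (the attribute must be in the safe list and its value must match the pattern registered under the attribute name as a symbol, with '' as the match-anything default) is taken from the monkey patch.

```ruby
# Pure-Ruby sketch of the attribute filter; no Hpricot required.
# filter_attributes is a hypothetical helper, not part of Hpricot.
def filter_attributes(attributes, safe = [], patterns = {})
  attributes.select do |name, value|
    pattern = patterns[name.to_sym] || '' # '' matches any value
    safe.include?(name) && value.match(pattern)
  end
end

# Made-up attributes, as they might appear on an <img> tag.
attrs = {
  'src'     => 'http://example.com/logo.png',
  'onclick' => 'steal()',
  'style'   => 'color: red'
}

# Only the safe list: src survives, everything else goes.
p filter_attributes(attrs, ['src'])

# Safe list plus a pattern: src is now dropped too, because its
# value does not start with https:.
p filter_attributes(attrs, ['src'], { src: /\Ahttps:/ })
```

The first call keeps only src; the second returns an empty hash, since being on the safe list is not enough once a pattern is registered for that attribute.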
Then I made a scrubber that passes the array and hash in to those methods to handle the dirty work. It looks like this, though I’m also using Tidy so mine is a little different.
class HtmlScrubber
  @@config = YAML.load_file(
    "#{RAILS_ROOT}/config/html_scrubber.yml") unless defined?(@@config)

  def self.scrub(markup)
    doc = Hpricot(markup || '', :xhtml_strict => true)
    raise 'No markup specified' if doc.nil?
    @@config[:nuke_tags].each { |tag| (doc/tag).remove }
    @@config[:allow_tags].each { |tag|
      (doc/tag).strip_attributes(@@config[:allow_attributes],
                                 @@config[:attribute_patterns]) }
    doc.traverse_all_element { |e|
      e.strip unless @@config[:allow_tags].include?(e.name)
    }
    doc.inner_html
  end
end
Here is a zip of the code and a sample config: html_scrubber.zip
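If you don’t have the zip handy, the scrubber code at least shows what shape the config has to take: four symbol keys, three of them arrays of strings and one a hash keyed by attribute name. A rough sketch follows, building that structure as a Ruby hash and dumping it to YAML. Every tag, attribute, and pattern value here is an assumption for illustration, not the contents of the author’s sample config; only the four key names come from the code above.

```ruby
require 'yaml'

# Hypothetical values; only the four key names come from the scrubber code.
config = {
  # Removed outright, contents and all.
  nuke_tags: %w[script style iframe],
  # Kept, but with their attributes filtered.
  allow_tags: %w[p a img em strong],
  # Attributes that may survive on allowed tags.
  allow_attributes: %w[href src alt],
  # Optional per-attribute value patterns, keyed by attribute name as a
  # symbol. Strings work because String#match converts them to a Regexp.
  attribute_patterns: { href: '^https?://', src: '^https?://' }
}

# Print the YAML that could be saved as config/html_scrubber.yml.
puts config.to_yaml
```

Any tag in neither list is unwrapped by the final traverse step, so its inner content stays and the tag itself goes.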
Profiling Rails end-to-end
I wanted to do some profiling of a Rails app, so I did a little digging and found ruby-prof, with new and improved call graphs. Plus it’s very fast. The install couldn’t be easier:
sudo gem install ruby-prof
Then I wanted to see if I could get this to run in before and after filters. I haven’t had any luck, though I haven’t tried all that hard. Since I wanted to be able to do this relatively easily, I threw together a mini module to handle the report generation piece for me. Now I can profile a controller action by adding this to my application controller:
require 'ruby_profiler'

class ApplicationController < ActionController::Base
  include RubyProfiler
end
Then in the controller I just need to
def some_action
  result = RubyProf.profile {
    ...
  }
  write_profile(result, 5, RubyProfiler::GRAPH_HTML)
end
source: ruby_profiler.rb
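For a sense of what a module like this might contain, here is a rough sketch. It is a guess at the general shape and not the contents of ruby_profiler.rb. The write_profile name, its first three arguments, and GRAPH_HTML come from the usage above; the other constants, the dir parameter, the file naming, and the options-hash call to print are all assumptions. The printer classes are looked up by name at call time, so the file loads even before ruby-prof itself has been required.

```ruby
require 'fileutils'

# Hypothetical stand-in for ruby_profiler.rb; the real file may differ.
module RubyProfiler
  # Names of printer classes under the RubyProf namespace.
  GRAPH_HTML = :GraphHtmlPrinter
  GRAPH_TEXT = :GraphPrinter
  FLAT       = :FlatPrinter

  # Writes a report for +result+ and returns the path of the file written.
  # +dir+ is an assumed extra parameter; the post only shows three arguments.
  def write_profile(result, min_percent = 0, printer = GRAPH_HTML, dir = 'tmp/profile')
    FileUtils.mkdir_p(dir)
    ext  = printer == GRAPH_HTML ? 'html' : 'txt'
    path = File.join(dir, "profile-#{Time.now.strftime('%Y%m%d-%H%M%S')}.#{ext}")
    File.open(path, 'w') do |file|
      # Assumes a ruby-prof release whose printers accept an options hash;
      # older releases may expect a different signature.
      RubyProf.const_get(printer).new(result).print(file, min_percent: min_percent)
    end
    path
  end
end
```

Returning the path makes it easy to log where the report ended up, and the min_percent argument (5 in the action above) is passed straight through to the printer.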