March 15, 2007 14:30

So I stumbled onto CruiseControl.rb a couple of days ago (yeah, I’ve been under a rock) and it’s very nice. I’ve already switched over from the continuous_builder plugin. Being able to browse the output of past builds is nice, and I’ve also set it up to drop the rcov reports into the artifacts so they’re easily viewable as well.

I also wrote a plugin to monitor your rcov coverage; if it falls outside of a specified range, you get an email. Yay, good ole public shaming.

More info on RcovNotifier
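The range check itself is simple enough to sketch in a few lines. This is not RcovNotifier's actual code or API, just a rough illustration; the `coverage_alert` helper and its message format are made up for the example.

```ruby
# Hypothetical helper (not part of RcovNotifier): returns an alert
# message when the coverage percentage falls outside the allowed
# range, or nil when everything is fine.
def coverage_alert(percent, range)
  return nil if range.include?(percent)
  "Coverage #{percent}% is outside the allowed range " \
    "#{range.first}-#{range.last}%"
end

message = coverage_alert(72.5, 80..100)
puts message if message
```

Whatever does the mailing would only need to fire when the helper returns a message.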

March 04, 2007 15:58

Most folks won’t much care about these, but I’ve taken some code I found myself using in every project I worked on and turned it into plugins to make life a little easier.

The current plugins are GemTools (previously mentioned on the blog), Core Extensions and Config Reader. More info here.

March 04, 2007 15:58

I turned HpricotScrub into a Gem so that it’s easier to use across projects.

You can find the RubyForge project here and the Trac is here

sudo gem install hpricot_scrub
February 10, 2007 05:57

So I stumbled onto Devalot, and I have to say I like what I see. This thing has the potential to kick some butt. I really like the blog aggregation to the front page. If they go in the right direction with the source integration, this thing will replace Trac and WordPress on my stuff, no problem.

I just wish I had more “free-time” to help out with it, new gigs tend to eat a lot of time.

January 21, 2007 03:23

[UPDATE 2007-02-07] Changed scrub to return self [/UPDATE]

Using Hpricot to Scrub HTML – The remix

So I wanted to bring the HTML Scrubber into my Hpricot tweaks to tidy it up a bit and this is what I ended up with.

Now you can use the following to remove all tags from an HTML snippet

doc = Hpricot(open('http://slashdot.org/').read)
doc.scrub

Strip all anchors, leaving the text inside intact
(doc/:a).strip
Scrub the snippet based on a config hash
doc.scrub(hash)

hpricot_scrub.rb
require 'hpricot'

module Hpricot
  class Elements
    def strip
      each { |x| x.strip }
    end

    def strip_attributes(safe=[])
      each { |x| x.strip_attributes(safe) }
    end
  end

  class Elem
    def remove
      parent.children.delete(self)
    end

    def strip
      children.each { |x| x.strip unless x.class == Hpricot::Text }
      if strip_removes?
        remove
      else
        parent.replace_child self, Hpricot.make(inner_html) unless parent.nil?
      end
    end

    def strip_attributes(safe=[])
      attributes.each { |atr|
        remove_attribute(atr[0]) unless safe.include?(atr[0])
      } unless attributes.nil?
    end

    def strip_removes?
      # I'm sure there are others that should be ripped instead of stripped
      attributes && attributes['type'] =~ /script|css/
    end
  end

  class Doc
    def scrub(config={})
      config = {
        :nuke_tags => [],
        :allow_tags => [],
        :allow_attributes => []
      }.merge(config)
      config[:nuke_tags].each { |tag| (self/tag).remove }
      config[:allow_tags].each { |tag|
        (self/tag).strip_attributes(config[:allow_attributes])
      }
      children.reverse.each { |e|
        e.strip unless e.class == Hpricot::Text ||
          config[:allow_tags].include?(e.name)
      }
      self
    end
  end
end
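If the difference between stripping, nuking and allowing a tag isn't obvious, here is a toy version that runs without Hpricot. It is only a sketch of the idea: the `Node` struct and the `scrub_children` and `render` helpers are invented for the example, and attributes are ignored entirely.

```ruby
# Toy node model (not Hpricot): text nodes are plain Strings,
# elements are Nodes with a tag name and a list of children.
Node = Struct.new(:name, :children) do
  def to_html
    inner = children.map { |c| c.is_a?(Node) ? c.to_html : c }.join
    "<#{name}>#{inner}</#{name}>"
  end
end

# nuke  = drop the tag and everything inside it
# strip = drop the tag but keep what was inside it
# allow = keep the tag (and scrub its children)
def scrub_children(children, allow_tags, nuke_tags)
  children.flat_map do |child|
    next [child] unless child.is_a?(Node)       # text passes through
    next [] if nuke_tags.include?(child.name)   # nuked: nothing survives
    kept = scrub_children(child.children, allow_tags, nuke_tags)
    allow_tags.include?(child.name) ? [Node.new(child.name, kept)] : kept
  end
end

def render(children)
  children.map { |c| c.is_a?(Node) ? c.to_html : c }.join
end

tree = [
  Node.new('p', ['Hello ',
                 Node.new('a', ['world']),
                 Node.new('script', ['alert(1)'])])
]
clean = scrub_children(tree, ['p'], ['script'])
puts render(clean)  # prints <p>Hello world</p>
```

The anchor is stripped (its text survives), the script is nuked (nothing survives), and the paragraph is allowed through.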


Sample config in YAML
---
  :allow_tags: # let these tags stay, but will strip attributes
    - 'b'
    - 'blockquote'
    - 'br'
    - 'div'
    - 'h1'
    - 'h2'
    - 'h3'
    - 'h4'
    - 'h5'
    - 'h6'
    - 'hr'
    - 'i'
    - 'em'
    - 'img'
    - 'li'
    - 'ol'
    - 'p'
    - 'pre'
    - 'small'
    - 'span'
    - 'strike'
    - 'strong'
    - 'sub'
    - 'sup'
    - 'table'
    - 'tbody'
    - 'td'
    - 'tfoot'
    - 'thead'
    - 'tr'
    - 'u'
    - 'ul'

  :nuke_tags: # completely removes everything between open and close tag
    - 'form'
    - 'script'
  :allow_attributes: # let these attributes stay, strip all others
    - 'src'
    - 'font'
    - 'alt'
    - 'style'
    - 'align'
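Getting a config like this into the hash that scrub expects could look roughly like the following. It's a sketch under a couple of assumptions: the YAML is inlined and trimmed down for the example, the `DEFAULTS` constant is my name for the defaults that scrub merges in, and `safe_load` with `permitted_classes` needs a reasonably recent Ruby (2.6 or later).

```ruby
require 'yaml'

# Same defaults that scrub merges the config into.
DEFAULTS = { :nuke_tags => [], :allow_tags => [], :allow_attributes => [] }

# Trimmed-down sample config, inlined here instead of read from a file.
yaml = <<-EOS
---
  :allow_tags:
    - 'b'
    - 'p'
  :nuke_tags:
    - 'form'
    - 'script'
EOS

# Keys such as :allow_tags load as Ruby symbols, so Symbol has to be
# permitted explicitly when using safe_load.
loaded = YAML.safe_load(yaml, permitted_classes: [Symbol])
config = DEFAULTS.merge(loaded)

puts config[:nuke_tags].inspect
# With Hpricot loaded, this hash would then go to doc.scrub(config).
```

Any key missing from the file (here `:allow_attributes`) falls back to an empty list.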


The source comes with sample data and a test; run the test with
ruby test

January 16, 2007 20:55

[UPDATE 2007-02-07] I realized I left some extra junk in the version of Util in the zip, it’s been updated [/UPDATE]

I have a rake task and a Util class that I use to make setting up required gems painless and to be sure that I’m always running the versions I think I am.

Install or update required gems

rake gems:install

Make sure they are loaded with the right versions during startup, by adding the following to environment.rb

Util.load_gems

This uses a config file that looks like


:source: http://local_mirror.example.com # this is optional
:gems:
  - :name: mongrel
    :version: "1.0"
    # this gem has a specific source URL
    :source: 'http://mongrel.rubyforge.org/releases'

  - :name: hpricot
    :version: '0.4'
    # this tells us to load not just install
    :load: true 

  - :name: postgres
    :version: '0.7.1'
    :load: true
    # any extra config that needs to be passed to gem install
    :config: '--with-pgsql-include-dir=/usr/local/pgsql/include
              --with-pgsql-lib-dir=/usr/local/pgsql/lib' 

Here’s the Util class


require 'yaml'

class Util
  def self.load_gems
    config = YAML.load_file(
      File.join(RAILS_ROOT, 'config', 'gems.yml'))
    gems = config[:gems].reject {|gem| ! gem[:load] }
    gems.each do |gem|
      require_gem gem[:name], gem[:version]
      require gem[:name]
    end
  end
end
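To see which entries the Util class would actually pick up, here is a small standalone sketch that only does the filtering and skips the loading. The inlined YAML is a cut-down copy of the config above, and `safe_load` with `permitted_classes` assumes Ruby 2.6 or later.

```ruby
require 'yaml'

# Cut-down copy of config/gems.yml, inlined for the example.
gems_yaml = <<-EOS
:gems:
  - :name: mongrel
    :version: "1.0"
  - :name: hpricot
    :version: '0.4'
    :load: true
EOS

config = YAML.safe_load(gems_yaml, permitted_classes: [Symbol])

# Only entries flagged with :load are activated at startup;
# the rest are installed by the rake task but never required here.
to_load = config[:gems].select { |g| g[:load] }
to_load.each { |g| puts "would load #{g[:name]} #{g[:version]}" }

# Note: on current RubyGems require_gem no longer exists;
# `gem name, version` followed by `require name` is the replacement.
```

With this config only hpricot gets loaded; mongrel is install-only.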

Here’s the rake task


require 'yaml'

namespace :gems do
  require 'rubygems'

  task :install do
    # defaults to --no-rdoc, set DOCS=(anything) to build docs
    docs = (ENV['DOCS'].nil? ? '--no-rdoc' : '')
    #grab the list of gems/version to check
    config = YAML.load_file(File.join('config', 'gems.yml'))
    gems = config[:gems]

    gems.each do |gem|
      # load the gem spec
      gem_spec = YAML.load(`gem spec #{gem[:name]} 2> /dev/null`)
      gem_loaded = false
      begin
        gem_loaded = require_gem gem[:name], gem[:version]
      rescue Exception
      end

      # if forced
      # or there is no gem_spec
      # or the spec version doesn't match the required version
      # or require_gem returns false
      # (return false also happens if the gem has already been loaded)
      if ! ENV['FORCE'].nil? ||
         ! gem_spec ||
         (gem_spec.version.version != gem[:version] && ! gem_loaded)
        gem_config = gem[:config] ? " -- #{gem[:config]}" : ''
        source = gem[:source] || config[:source] || nil
        source = "--source #{source}" if source
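        # keep the command on one logical line; a newline inside the
        # string would make the shell treat it as two commands
        ret = system "gem install #{gem[:name]} " \
            "-v #{gem[:version]} -y #{source} #{docs} #{gem_config}"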
        # something bad happened, pass on the message
        p $? unless ret
      else
        puts "#{gem[:name]} #{gem[:version]} already installed"
      end
    end
  end
end
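The command the task shells out to can also be built in a little helper, which makes it easy to check without installing anything. This is a sketch, not part of the zip: `gem_install_command` is a name I made up, and it keeps the `-y` flag only because the task uses it.

```ruby
# Hypothetical helper: builds the same "gem install" command line
# the rake task runs, in the same argument order.
def gem_install_command(gem, default_source = nil, docs = '--no-rdoc')
  source = gem[:source] || default_source   # per-gem source wins
  parts = ['gem', 'install', gem[:name], '-v', gem[:version], '-y']
  parts += ['--source', source] if source
  parts << docs unless docs.to_s.empty?
  parts += ['--', gem[:config]] if gem[:config]
  parts.join(' ')
end

puts gem_install_command(
  { :name => 'hpricot', :version => '0.4' },
  'http://local_mirror.example.com'
)
```

Joining the parts with single spaces keeps stray newlines out of the command.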

zipped source

January 10, 2007 04:43

Just a quick announcement, FCKeditor on Rails will run in Rails 1.2 as a plugin (with a little help), more info on the blog or in trac.

September 22, 2006 23:34

Jamis Buck has shed a little light on figuring out WTF that Ruby process eating all your processor is actually doing.

Alright, maybe not quite the same as sliced bread, but very nice nonetheless.

I can’t tell you how many times I could have used this, now I just need to wait for the need to pop up again.

[UPDATE] Apparently it gets better than this, much better

September 10, 2006 02:49

[UPDATE 2007-01-10]
I’ve updated the scrubber, see Hpricot Scrub for more.
[/UPDATE]

I went looking for a Ruby replacement for Perl’s HTML::Scrubber for a gig and came up blank. Can it really be possible that nobody is doing anything more than blindly stripping tags?

I had seen Hpricot and thought I needed to find a reason to use it; well, here it is. I monkey patched a couple of methods into Hpricot and off I went.

Here’s the Hpricot bits.


module Hpricot
  class Elements
    def strip
      each { |x| x.strip }
    end
    
    def strip_attributes(safe=[], patterns={})
      each { |x| x.strip_attributes(safe, patterns) }
    end
  end

  class Elem
    def strip
      parent.replace_child self, Hpricot.make(inner_html) unless 
        parent.nil?
    end

    def strip_attributes(safe=[], patterns={})
      attributes.each { |atr|
          pat = patterns[atr[0].to_sym] || ''
          remove_attribute(atr[0]) unless safe.include?(atr[0]) &&
            atr[1].match(pat)
      } unless attributes.nil?
    end
  end
end

Just that bit gets me to the point where I can do things like this


doc = Hpricot(open('http://slashdot.org/').read)

# remove all anchors leaving behind the text inside.
(doc/:a).strip 

# strip all attributes except for src from all images
(doc/:img).strip_attributes(['src']) 
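The attribute rule is the part that's easy to misread: an attribute survives only if it is in the safe list and its value matches the pattern registered for it (no pattern means anything matches). Here is the same rule on a plain hash, without Hpricot; `filter_attributes` and the sample values are made up for the example.

```ruby
# Hypothetical stand-in for Elem#strip_attributes: keeps an attribute
# only when its name is in the safe list AND its value matches the
# pattern for that name. A missing pattern falls back to '', which
# matches every value.
def filter_attributes(attributes, safe = [], patterns = {})
  attributes.select do |name, value|
    pattern = patterns[name.to_sym] || ''
    safe.include?(name) && value.match(pattern)
  end
end

attrs = {
  'src'     => 'http://example.com/a.png',
  'onclick' => 'evil()',
  'style'   => 'color: red'
}
kept = filter_attributes(attrs, ['src', 'style'], { :style => /\Acolor:/ })
puts kept.keys.inspect
```

Here `onclick` goes because it isn't in the safe list, and `style` stays only because its value matches the pattern.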

Then I made a scrubber that passes the array and hash in to those methods to handle the dirty work. It looks like this, though I’m also using Tidy so mine is a little different.


class HtmlScrubber
  @@config = YAML.load_file(
    "#{RAILS_ROOT}/config/html_scrubber.yml") unless 
      defined?(@@config)

  def self.scrub(markup)
    doc = Hpricot(markup || '', :xhtml_strict => true)
    raise 'No markup specified' if doc.nil?
    @@config[:nuke_tags].each { |tag| (doc/tag).remove }
    @@config[:allow_tags].each { |tag|
      (doc/tag).strip_attributes(@@config[:allow_attributes], 
        @@config[:attribute_patterns]) }
    doc.traverse_all_element {|e|
      e.strip unless @@config[:allow_tags].include?(e.name)
    }
    doc.inner_html
  end
end

Here is a zip of the code and a sample config: html_scrubber.zip

September 09, 2006 14:18

I wanted to do some profiling of a Rails app, so I did a little digging and found ruby-prof with new and improved call graphs. Plus it’s very fast. The install couldn’t be easier:

sudo gem install ruby-prof

Then I wanted to see if I could get this to run in before and after filters. I haven’t had any luck, though I haven’t tried all that hard. Since I wanted to be able to do this relatively easily, I threw together a mini module to handle the report generation piece for me. So now I can profile a controller action by adding this to my application controller

require 'ruby_profiler'

class ApplicationController < ActionController::Base
  include RubyProfiler
end

Then in the controller I just need to


def some_action
  result = RubyProf.profile {
    ...
  }
  write_profile(result, 5, RubyProfiler::GRAPH_HTML)
end

source: ruby_profiler.rb