Kotoba_app’s First Words

Some of you may have noticed that kotoba_app is now speaking! While the first fits and starts required my heavy hand in the background, it now does it all by itself. Papa is very proud.

What is all this about? Getting a periodic, background process running on a web site. This is nothing new. Every site requires maintenance scripts to handle various things. Some processes are simply too expensive to run all the time; others are too computationally expensive to do any way other than asynchronously (i.e. not synchronized with the user’s request). Still others just lend themselves to being done periodically (e.g. updating daily statistics).

My goal was to select a word randomly, generate a message, and then tweet it (or so say Tweeters in the know, ya know) to kotoba_app on a periodic basis.
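That periodic job boils down to a few lines. Here is a minimal sketch of the idea; the word list and the compose_tweet helper are hypothetical stand-ins, not Kotoba’s actual models:

```ruby
# Pick a random word and build a tweet for it.  WORDS and compose_tweet
# are illustrative placeholders for the real models and helpers.
WORDS = [
  { :word => 'serendipity', :meaning => 'a happy accident' },
  { :word => 'ubiquity',    :meaning => 'presence everywhere' }
]

# Twitter limits tweets to 140 characters, so truncate defensively.
def compose_tweet(entry)
  tweet = "Word of the moment: #{entry[:word]} - #{entry[:meaning]}"
  tweet[0, 140]
end

entry = WORDS[rand(WORDS.size)]
puts compose_tweet(entry)
```

The last step, actually posting to Twitter, is then just a call into whatever Twitter client library you use.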

My first approach was to use BackgrounDRb, a full-blown solution that allows workers to be invoked periodically by a scheduler, or ad hoc for asynchronous processing. In a word: butter. However, hindsight being 20/20, I deployed this solution to a shared hosting service. In short order I was asked to take it down as it was against the terms of use; sadly, but understandably, so.

While upgrading my hosting to a dedicated server is very tempting, it is premature at this point. So what are my other options? Cron! No, not Tron. But maybe better; hard as that is to believe. Given the ubiquity of cron I do not want to duplicate extant sources; but if you want the quick version of this approach, skip over to Kip at Ameravant, who has a very succinct post on how to get cron working for you.
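For the record, the cron side is a single crontab line. The paths and the method name below are hypothetical placeholders for your own deployment, not Kotoba’s actual layout:

```shell
# Edit the crontab with: crontab -e
# Tweet a word every morning at 09:00 via Rails' script/runner.
# Adjust the application path, environment, and invoked method to taste.
0 9 * * * cd /path/to/kotoba && ruby script/runner -e production "Word.tweet_random" >> log/cron.log 2>&1
```

Redirecting stdout and stderr into a log file keeps cron from mailing you the output and gives you something to inspect when the job misbehaves.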

While I did not get to use BackgrounDRb in production (yet), one unexpected consequence was an opportunity to refactor the extant twittering code. In the end, the bits and bobs that call Twitter, generate tweets, and select a random word have all been sufficiently separated; as they should be.

And even better? Writing word of the moment, including UI design and complete implementation, took less than 60 minutes. Then again, we are using RoR, the ultimate in rapid web development frameworks. But still, not too shabby.

A Moment’s Word

In order to introduce ourselves to new words, sometimes it is best to find them serendipitously. While serendipity, by definition, is not something we can will to happen, there are some good substitutes. Kotoba now offers a moment’s word. Just refresh and continue your journey, or come back whenever you want to learn, or rediscover, a word.

Additionally, you can follow kotoba_app on Twitter; our friendly bot will bring you new words on a daily basis.

Managing Externals with Piston

One thing that I like to do is ensure I am using the best practices I can even (especially) in my personal projects. Kotoba is no different.

One thing that has bothered me with Rails is the myriad ways to get external components into a project. Gems are, of course, a great way to get system-wide control of libraries, but then you need to manage them a bit more explicitly to ensure that your system configuration does not change out from under your feet when a newer version gets installed later. Alternatively, you can just import code directly from a source repository. Quick. Dirty. Bad. Bad. Bad. Did I mention it is bad? Yes. Bad.

While some of the projects I am using live in Git repositories (more on this later), a few use subversion, much as I do. While svn has the concept of externals, they can be a whole lot of mess when you have multiple externals, each of which gets interrogated every time you do an update. Ultimately, svn externals require more work than they are often worth. Why do something yourself when you can be lazy and let a tool do it for you?

Solution? Piston. [1, 2, 3]. Both Robby and Chris make good points about how difficult svn-externals can become. Piston just makes it simple.

To import a project from svn:

piston import svn-url vendor/plugins/project_name

To lock to a specific version of a project:

piston lock vendor/plugins/project_name

To unlock and then update:

piston unlock vendor/plugins/project_name
piston update vendor/plugins/project_name

To see the status of your Piston-managed projects:

piston status

And what about Git and Piston? Support arrived with Piston 1.9.1. Unfortunately, as of tonight I have not been able to get it to work. But I have not given up.

SVN and Mojibake (文字化け)

I recently ran into an issue after installing subversion 1.5.6 via MacPorts: I ended up with incorrectly encoded characters (文字化け, as it were) when svn tried to display Japanese text while the terminal’s LANG was configured incorrectly. Checking the version of subversion showed a snippet like the below:

Macintosh:~ wwv$ svn --version
svn, ?\227?\131?\144?\227?\131?\188?\227?\130?\184?\227?\131?\167?\227?\131?\179 1.5.6 (r36142)
   ?\227?\130?\179?\227?\131?\179?\227?\131?\145?\227?\130?\164?\227?\131?\171?\230?\151?\165?\230?\153?\130: Mar  8 2009, 20:28:48

Copyright (C) 2000-2008 CollabNet.

The steps below rectify this 文字化け.

First and foremost, remember to update your shell’s PATH to point to wherever port installs its binaries; this is typically /opt/local/bin. This ensures you are not using the default svn that comes installed with Mac OS X.

If you are using Apple Terminal’s default shell, edit ~/.bashrc:

export PATH=/opt/local/bin:${PATH}

Additionally, in the same file also export LANG to ja_JP.UTF-8, or:

export LANG=ja_JP.UTF-8

Finally, if you are using Terminal on Mac OS X 10.5, then you will also want to disable automatically setting LANG on startup from Preferences > Details.

Turn Off LANG on Startup

When you create a new terminal session and again check the version of subversion, you should see:

Macintosh:~ wwv$ svn --version
svn, バージョン 1.5.6 (r36142)
   コンパイル日時: Mar  8 2009, 20:28:48

Copyright (C) 2000-2008 CollabNet.
Subversion はオープンソースソフトウェアです。
http://subversion.tigris.org/ を参照してください。
この製品には、CollabNet (http://www.Collab.Net/) によって開発されたソフトウェア
が含まれています。

以下のリポジトリアクセス (RA) モジュールが利用できます:

* ra_neon : Neon を利用して WebDAV (DeltaV) プロトコルでリポジトリにアクセスするモジュール。
  - 'http' スキームを操作します
  - 'https' スキームを操作します
* ra_svn : svn ネットワークプロトコルを使ってリポジトリにアクセスするモジュール。
  - Cyrus SASL 認証を併用
  - 'svn' スキームを操作します
* ra_local : ローカルディスク上のリポジトリにアクセスするモジュール。
  - 'file' スキームを操作します
* ra_serf : serf を利用して WebDAV (DeltaV) プロトコルでリポジトリにアクセスするモジュール。
  - 'http' スキームを操作します
  - 'https' スキームを操作します

Caching RSS Feeds

We recently incorporated RSS feeds into Kotoba. While implementing my solution, I noted that there are tutorials [1, 2] out there providing quick-and-dirty ways to get RSS feeds into a Rails application. While the tutorials are a good start, they neglect the fact that these implementations will drive a lot of (unnecessary) traffic between your site and the site hosting the RSS feed.

What can you do? What else! The time honored solution for all things that ail software: cache it.

First, let us create our RSS controller:

require 'kotoba_rss/feed'

class RssController < ApplicationController

  def parse_feed
    feed_url                = params['rss_url']                      || 'http://word.wardosworld.com/?feed=rss2&cat=6'
    maximum_number_of_items = (params['maximum_number_of_items'] || 5).to_i # default before to_i; nil.to_i is 0
    title                   = params['title']                        || t('rss.rss_feed')
    details                 = params['details']                      || t('rss.read_more')
    details_url             = params['details_url']                  || 'http://word.wardosworld.com/?cat=6'
    feed(feed_url, maximum_number_of_items, title, details, details_url)
  end

  def feed(feed_url, maximum_number_of_items, title, details, details_url)
    kotoba_rss_feed = Kotoba_Rss::Feed.new
    result = kotoba_rss_feed.get(feed_url)
    return render( 
      :partial => '/rss/feed', 
      :layout => 'popup',
      :locals => { 
        :result => result, 
        :maximum_number_of_items => maximum_number_of_items, 
        :title => title,
        :details => details,
        :details_url => details_url
      }
    )
  end
end
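One Ruby subtlety in parse_feed is worth calling out: nil.to_i returns 0, and 0 is truthy in Ruby, so an expression of the form params['n'].to_i || 5 would never fall back to its default. The || must happen before the to_i:

```ruby
# nil.to_i is 0, and 0 is truthy in Ruby, so a default placed after
# to_i is never used.  Apply the default first, then convert.
value = nil

wrong = value.to_i || 5    # 0 -- the default never applies
right = (value || 5).to_i  # 5 -- default applied before conversion

puts wrong  # 0
puts right  # 5
```

This is a classic gotcha when defaulting numeric query parameters in Rails controllers.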

The next step is to create a class that will fetch RSS feeds for the rest of our application, caching as appropriate; lib/kotoba_rss/feed.rb:

require 'rss'
require 'rss/2.0'
require 'open-uri'
require 'kotoba_rss/cache'

class Kotoba_Rss::Feed
  
  def initialize
    @cache = Kotoba_Rss::Cache.instance
  end
  
  def get(url)
    result = nil
    if @cache.has_expired?(url)
      open(url) do |http|
        response = http.read
        result   = RSS::Parser.parse(response, false)
      end
      @cache.add(url,result)
    else
      result = @cache.get(url)
    end
    return result
  end
  
end
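If you want to see RSS::Parser in action without hitting the network, you can feed it a string directly; this is also a handy way to unit test anything built on top of a feed class like the one above:

```ruby
require 'rss'

# A minimal RSS 2.0 document, parsed from a string instead of a URL.
xml = <<-XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Kotoba Words</title>
    <link>http://example.com/</link>
    <description>Word of the moment</description>
    <item>
      <title>serendipity</title>
      <link>http://example.com/serendipity</link>
    </item>
  </channel>
</rss>
XML

# The second argument disables validation, matching the call in Feed#get.
feed = RSS::Parser.parse(xml, false)
puts feed.channel.title      # Kotoba Words
puts feed.items.first.title  # serendipity
```

The parsed object exposes the channel and its items as plain Ruby accessors, which is exactly what the view partial iterates over.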

Finally, we want to be able to cache our RSS feed results between calls, lib/kotoba_rss/cache.rb:

# Keep a cache of fetched RSS feeds.  This is to ensure
# we do not hit an RSS feed too often; an issue when we
# have many page fetches.  
#
# This is also a very useful thing when we are fetching
# from sites that might be throttling requests from specific
# users and, or domains (e.g. twitter.com).
class Kotoba_Rss::Cache
  include Singleton
  
  ENTRY_EXPIRES_IN_SECONDS = 300 # 5 minutes
  
  def initialize
    @cache = Hash.new
  end
  
  def add(url, result)
    entity = Kotoba_Rss::Cache::Entry.new(Time.now,result)
    @cache[url] = entity
  end
  
  def get(url)
    @cache[url].feed_result
  end
    
  def has_expired?(url)
    # a URL we have never fetched counts as expired
    return true unless @cache.has_key?(url)
    elapsed = Time.now - @cache[url].feed_time
    elapsed > ENTRY_EXPIRES_IN_SECONDS
  end

end
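The include Singleton line is doing real work here: it hides the constructor and guarantees that every caller shares one instance, and therefore one cache Hash. A standalone illustration of the pattern, using a hypothetical Counter class:

```ruby
require 'singleton'

# Mixing in Singleton privatizes ::new and exposes a single shared
# instance via .instance -- every caller sees the same object and state.
class Counter
  include Singleton

  attr_reader :count

  def initialize
    @count = 0
  end

  def increment
    @count += 1
  end
end

a = Counter.instance
b = Counter.instance
a.increment

puts a.equal?(b)  # true: same object
puts b.count      # 1: state is shared
```

This shared state is precisely what lets Kotoba_Rss::Feed instances created on different requests all consult the same cache.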

In the same file, lib/kotoba_rss/cache.rb, I also include:

class Kotoba_Rss::Cache::Entry
  attr_reader :feed_time, :feed_result
  
  def initialize(time, result)
    @feed_time   = time
    @feed_result = result
  end
  
end

I consider Kotoba_Rss::Cache::Entry an inner class that is only really useful to Kotoba_Rss::Cache, and thus a great candidate for being included in its parent’s file; however, if you prefer, you can certainly put it in its own file. In truth, Kotoba_Rss::Cache could also be included in lib/kotoba_rss/feed.rb, which makes for an even better design (note to self: refactor later).

To use this, we have the following in our ApplicationHelper:

  def link_to_blog_popup
    link_to(
         image_gif('blog',t('meta.blog')), 
         { 
           :action => 'parse_feed', 
           :controller => 'rss', 
           :rss_url => 'http://word.wardosworld.com/?feed=rss2&cat=6', 
           :maximum_number_of_items => 10, 
           :title => t('meta.blog'),
           :details => t('meta.blog_read_more'),
           :details_url => 'http://word.wardosworld.com/?cat=6'
         } , 
         :class => 'popup',
         :onclick => "return hs.htmlExpand(this, { objectType: 'ajax', contentId: 'popup', wrapperClassName: 'highslide-no-border', dimmingOpacity: 0.75, align: 'center'} )"
     ) 
  end

Note that we are using the I18n t() method for localization.

In summary, we have a good separation of responsibility between all our moving parts. We are using a controller the way it is intended, with our presentation properly off-loaded to views. In our library we have a means of making a service that invisibly caches its results.

In my experience, caches should almost always be hidden from the caller; otherwise, you begin to degrade cohesion and venture into the land of tight coupling. Additionally, multiple points of entry make the cache design much harder.

That is it. Whenever we have held a result in our cache longer than ENTRY_EXPIRES_IN_SECONDS, our RSS feed will fetch a new result and store it in the cache. This ensures that as our site grows (and it will!) we will not be hitting external services with every page fetch; something that would yield little value to our users (slower fetches) and make us bad clients of other sites’ feeds.