We recently incorporated RSS feeds into Kotoba. While implementing my solution, I noted that there are some tutorials [1, 2] out there that provide quick-and-dirty solutions for getting RSS feeds into a Rails application. While the tutorials are a good start, they neglect the fact that these implementations fetch the feed on every request, which drives a lot of (unnecessary) traffic between your site and the site with the RSS feed.
What can you do? What else! The time-honored solution for all things that ail software: cache it.
First, let us create our RSS controller:
require 'kotoba_rss/feed'

class RssController < ApplicationController
  # Action: reads the feed settings from the request, falling back to defaults.
  def parse_feed
    feed_url = params['rss_url'] || 'http://word.wardosworld.com/?feed=rss2&cat=6'
    # Apply the default before converting: nil.to_i is 0, and 0 is truthy in Ruby.
    maximum_number_of_items = (params['maximum_number_of_items'] || 5).to_i
    title = params['title'] || t('rss.rss_feed')
    details = params['details'] || t('rss.read_more')
    details_url = params['details_url'] || 'http://word.wardosworld.com/?cat=6'
    feed(feed_url, maximum_number_of_items, title, details, details_url)
  end

  private

  # Helper, not an action: gets the (possibly cached) feed and renders it.
  def feed(feed_url, maximum_number_of_items, title, details, details_url)
    kotoba_rss_feed = Kotoba_Rss::Feed.new
    result = kotoba_rss_feed.get(feed_url)
    render(
      :partial => '/rss/feed',
      :layout => 'popup',
      :locals => {
        :result => result,
        :maximum_number_of_items => maximum_number_of_items,
        :title => title,
        :details => details,
        :details_url => details_url
      }
    )
  end
end
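A note on defaulting the item count: in Ruby, `nil.to_i` is `0`, and `0` is truthy, so an expression of the form `value.to_i || 5` never reaches its fallback. The sketch below uses a plain hash as a stand-in for `params` (an assumption made purely for illustration) to show the difference when the fallback is applied before the conversion.

```ruby
# A plain Hash standing in for Rails' params (illustrative assumption).
params = {}

# nil.to_i is 0, and 0 is truthy in Ruby, so the fallback is never used.
converted_first = params['maximum_number_of_items'].to_i || 5

# Applying the fallback before converting gives the intended default.
defaulted_first = (params['maximum_number_of_items'] || 5).to_i

# A supplied string value is converted as expected.
supplied_params = { 'maximum_number_of_items' => '10' }
supplied = (supplied_params['maximum_number_of_items'] || 5).to_i

puts converted_first  # 0
puts defaulted_first  # 5
puts supplied         # 10
```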
The next step is to create a class that fetches RSS feeds for the rest of our application, caching as appropriate, in lib/kotoba_rss/feed.rb:
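require 'rss'
require 'rss/2.0'
require 'open-uri'

class Kotoba_Rss::Feed
  def initialize
    @cache = Kotoba_Rss::Cache.instance
  end

  # Returns the parsed feed for url, fetching it only when the
  # cached copy is missing or has expired.
  def get(url)
    result = nil
    if @cache.has_expired?(url)
      # open-uri lets open() read a URL; on Ruby 3.0+ call URI.open(url) instead.
      open(url) do |http|
        response = http.read
        # The second argument (false) turns off feed validation.
        result = RSS::Parser.parse(response, false)
      end
      @cache.add(url, result)
    else
      result = @cache.get(url)
    end
    result
  end
end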
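For readers who have not used Ruby's RSS library, here is a minimal sketch of what the parsed result looks like. It assumes the `rss` library is available (it ships as a bundled gem with recent Rubies) and parses a made-up, hand-written feed rather than a real URL. The view partial is not shown in this post, so the way the items are read here is only an illustration, not the actual view code.

```ruby
require 'rss'

# A tiny hand-written RSS 2.0 document (made-up data for illustration).
xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example blog</title>
      <link>http://example.com/</link>
      <description>An example feed</description>
      <item>
        <title>First post</title>
        <link>http://example.com/1</link>
        <description>Hello</description>
      </item>
      <item>
        <title>Second post</title>
        <link>http://example.com/2</link>
        <description>World</description>
      </item>
    </channel>
  </rss>
XML

# The second argument turns validation off, as in Feed#get.
feed = RSS::Parser.parse(xml, false)

channel_title = feed.channel.title

# Limit the items the way a view might with maximum_number_of_items.
maximum_number_of_items = 1
titles = feed.items.first(maximum_number_of_items).map { |item| item.title }

puts channel_title
puts titles.inspect
```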
Finally, we want to cache our RSS feed results between calls, in lib/kotoba_rss/cache.rb:
require 'singleton'

# Keep a cache of fetched RSS feeds. This is to ensure
# we do not hit an RSS feed too often; an issue when we
# have many page fetches.
#
# This is also a very useful thing when we are fetching
# from sites that might be throttling requests from specific
# users and/or domains (e.g. twitter.com).
class Kotoba_Rss::Cache
  include Singleton

  ENTRY_EXPIRES_IN_SECONDS = 300 # 5 minutes

  def initialize
    @cache = Hash.new
  end

  # Store a result for url, stamped with the current time.
  def add(url, result)
    entity = Kotoba_Rss::Cache::Entry.new(Time.now, result)
    @cache[url] = entity
  end

  def get(url)
    @cache[url].feed_result
  end

  def has_expired?(url)
    # If we do not have the URL yet, consider it expired.
    return true unless @cache.has_key?(url)

    # Subtracting two Time objects gives the elapsed seconds.
    elapsed = Time.now - @cache[url].feed_time
    elapsed > ENTRY_EXPIRES_IN_SECONDS
  end
end
In the same file, lib/kotoba_rss/cache.rb, I also include:
class Kotoba_Rss::Cache::Entry
  attr_reader :feed_time, :feed_result

  def initialize(time, result)
    @feed_time = time
    @feed_result = result
  end
end
I consider Kotoba_Rss::Cache::Entry an inner class that is only really useful to Kotoba_Rss::Cache, and thus a great candidate for inclusion in its parent’s file; however, if you prefer, you can certainly put it in its own file. In truth, Kotoba_Rss::Cache could also be included in lib/kotoba_rss/feed.rb, which would make an even better design (note to self: refactor afterwards).
To use it, we have the following in our ApplicationHelper:
def link_to_blog_popup
  link_to(
    image_gif('blog', t('meta.blog')),
    {
      :action => 'parse_feed',
      :controller => 'rss',
      :rss_url => 'http://word.wardosworld.com/?feed=rss2&cat=6',
      :maximum_number_of_items => 10,
      :title => t('meta.blog'),
      :details => t('meta.blog_read_more'),
      :details_url => 'http://word.wardosworld.com/?cat=6'
    },
    :class => 'popup',
    :onclick => "return hs.htmlExpand(this, { objectType: 'ajax', contentId: 'popup', wrapperClassName: 'highslide-no-border', dimmingOpacity: 0.75, align: 'center'} )"
  )
end
Note that we are using the I18n t() method for localization.
In summary, we have a good separation of responsibility between all our moving parts. We are using a controller the way it is intended, with our presentation properly off-loaded to views. In our library we have a means of making a service that invisibly caches its results.
In my experience, caches should almost always be hidden from the caller; otherwise, you begin to degrade cohesion and venture into the land of tight coupling. Additionally, exposing the cache makes its design much harder, because it then has multiple points of entry.
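To make the "hidden cache" point concrete, here is a minimal, self-contained sketch. The class name `CachedFeed`, the injected `fetcher` lambda, and the `fetch_count` reader are all hypothetical names invented for this illustration; they are not part of Kotoba. Expiry is deliberately left out so that the focus stays on the interface: the caller only ever calls `get`.

```ruby
# Hypothetical service: callers only ever see #get; the store is internal.
class CachedFeed
  # Exposed only so this sketch can show how often the source was hit.
  attr_reader :fetch_count

  # fetcher is any callable that takes a URL and returns a result
  # (an illustrative stand-in for the network fetch and RSS parse).
  def initialize(fetcher)
    @fetcher = fetcher
    @store = {}
    @fetch_count = 0
  end

  def get(url)
    # Hash#fetch runs the block only when the key is missing.
    @store.fetch(url) do
      @fetch_count += 1
      @store[url] = @fetcher.call(url)
    end
  end
end

feed = CachedFeed.new(->(url) { "parsed feed for #{url}" })

# The caller makes the same call twice and never mentions a cache.
first  = feed.get('http://example.com/feed')
second = feed.get('http://example.com/feed')

puts first
puts feed.fetch_count
```

Because the caller's code is identical whether or not a cached copy exists, the caching policy can later change without touching any call site.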
That is it. Whenever a result has been held in our cache longer than ENTRY_EXPIRES_IN_SECONDS, our RSS feed class fetches a new result and stores it in the cache. This ensures that as our site grows (and it will!) we will not be hitting external services with every page fetch, something that would yield little value to our users (slower fetches) and make us bad clients of other sites’ feeds.
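As a worked example of the expiry rule, the sketch below isolates the time comparison with fixed timestamps instead of `Time.now`. The helper name `expired?` and its `ttl_seconds` parameter are assumptions made for illustration; the default of 300 mirrors ENTRY_EXPIRES_IN_SECONDS.

```ruby
# Subtracting two Time objects yields the elapsed seconds as a Float.
def expired?(fetched_at, now, ttl_seconds = 300)
  (now - fetched_at) > ttl_seconds
end

fetched_at = Time.at(0) # arbitrary fixed timestamp for illustration

puts expired?(fetched_at, fetched_at + 299) # false: still fresh
puts expired?(fetched_at, fetched_at + 300) # false: exactly at the limit
puts expired?(fetched_at, fetched_at + 301) # true: refetch
```

Note that the comparison is strict, so an entry that is exactly 300 seconds old is still served from the cache.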