July 05, 2012

Posted by John

Older: Misleading Title About Queueing

Newer: Booleans are Baaaaaaaaaad

Four Guidelines That I Feel Have Improved My Code

I have been thinking a lot about isolation, dependencies and clean code of late. I know there is a lot of disagreement with people vehemently standing in both camps.

I certainly will not say either side is right or wrong, but what follows is what I feel has improved my code. I post it here to formalize some recent thoughts and, if I am lucky, get some good feedback.

Before I rush into the gory details, I feel I should mention that I went down this path, not as an architecture astronout, but out of genuine pain in what I was working on.

My models were growing large. My tests were getting slow. Things did not feel “right”.

I started watching Gary Bernhardt’s Destroy All Software screencasts. He is a big proponent of testing in isolation. Definitely go get a subscription and take a day to get caught up.

On top of DAS, I started reading everything I could on the subject of growing software, clean code and refactoring. When I say reading, I really should say devouring.

I was literally prowling about like a lion, looking for the next book I could devour. Several times my wife asked me to get off my hands and knees and to kindly stop roaring about SRP.

Over the past few months as I have tried to write better code, I have definitely learned a lot. Learning without reflection and writing is not true learning for me.

Reflecting on why something feels better and then writing about it formalizes it in my head and has the added benefit of being available for anyone else who is struggling with the same.

Here are a few guidelines that have jumped out at me over the past few days as I reflected on what I have been practicing the past few months.

Guideline #1. One responsibility to rule them all

Single responsibility principle (SRP) is really hard. I think a lot of us are frustrated and feeling the pain of our chubby <insert your favorite ORM> classes. Something does not feel right. Working on them is hard.

The problem is context. You have to load a lot of context in your brain when you crack open that INFAMOUS user model. That context takes up the space where we would normally create and come up with new solutions.

Create More Classes

So what are we to do? Create more classes. Your models do not need to inherit from ActiveRecord::Base, or include MongoMapper::Document, or whatever.

A model is something that has business logic. Start breaking up your huge models that have persistence bolted on into plain old Ruby classes.

I am not going to lie to you. If you have not been doing this, it will not be easy. Everything will seem like it should just be tucked as another method in a model that also happens to persist data in a store.

Naming is Hard

Another pain point will be naming. Naming is fracking hard. You are welcome for the BSG reference there. I would like to take that statement a step further though.

Naming is hard because our classes and methods are doing too much. The fewer responsibilities your class has, the easier it will be to name, especially after a few months of practice.

An Example

Enough talk, lets see some code. In our track processors, which pop tracks off a queue and store reports in a database, we query for the gauge being tracked before storing reports. The purpose of this query is to ensure that the gauge is in good standing and that we should, in fact, store reports in the database for it.

A lot of people throw the tracking code on their site and never remove it or sign up for a paying account. We do this find to make sure those people noop, instead of creating tons of data that no one is paying for.

This query happens for each track and it is pulling information that rarely if ever changes. It seemed like a prime spot for a wee bit of caching.

First, I created a tiny service around the memcached client I decided to use. This only took an hour and it means that my application now has an interface for caching (get, set, delete, and fetch). I’ll talk more about this in guideline #3.

Once I had defined the interface Gauges would use for caching, I began to integrate it. After much battling and rewriting of the caching code, each piece felt like it was doing too much and things were getting messy.

I stepped back and thought through my plans. I wanted to cache only the attributes, so I threw everything away and started with that. First, I wanted to be able to read attributes from the data store.

class GaugeAttributeService
  def get(id)
    criteria = {:_id => Plucky.to_object_id(id)}
    if (attrs = gauge_collection.find_one(criteria))
      attrs.delete('_id')
      attrs
    end
  end
end

Given an id, this class returns a hash of attributes. That is pretty much one responsibility. Sweet action. Let’s move on.

Second, I knew that I wanted to add read-through caching for this. Typically read-through caching uses some sort of fetch pattern. Fetch is basically a shortcut for look first in the cache and if it is not there, compute the block, store the computed result in the cache and return the computed result.

If I would have added caching in the GaugeAttributeService class, I would have violated SRP. Describing the class would have been "checks the cache and if not there it fetches from database". Note the use of "and".

As Growing Object Oriented Software states:

Our heuristic is that we should be able to describe what an object does without using any conjunctions (“and,” “or”).

Instead, I created a new class to wrap (or decorate) my original service.

class GaugeAttributeServiceWithCaching
  def initialize(attribute_service = GaugeAttributeService.new)
    @attribute_service = attribute_service
  end

  def get(id)
    cache_service.fetch(cache_key(id)) {
      @attribute_service.get(id)
    }
  end
end

I left a few bits out of this class so we can focus on the important part, which is that all we do with this class is wrap the original one with a cache fetch.

As you can see, naming is pretty easy for this class. It is a gauge attribute service with caching and is named as such. It initializes with an object that must respond to get. Note also that it defaults to an instance of GaugeAttributeService.

Unit testing this class is easy as well. We can isolate the dependencies (attribute_service and cache_service) in the unit test and make sure that they do what we expect (fetch and get).

Note: There definitely could a point made that "with" is the same as "and" and therefore means that we are breaking SRP. Naming is hard, really hard. Rather than get mired forever in naming, I rolled with this convention and, at this point, it does not bother me. I am definitely open to suggestions. Another name I played with was CachedGaugeAttributeService.

Below is an example setup with new dependencies inject in the test that help us verify this classes behavior in isolation.

attributes = {'title' => 'GitHub'}

attribute_service = Class.new do
  def get(id)
    attributes
  end
end.new

cache_service = Class.new do
  def fetch(key)
    get(key) || yield
  end

  def get(key)
  end
end.new

service = GaugeAttributeServiceWithCaching.new(attribute_service)
service.cache_service = cache_service

Above I used dynamic classes. Instead of dynamic classes, one could use stubbing or whatever. I’ll talk more about cache_service= later.

Decorating in this manner means we can easily find without caching by using GaugeAttributeService or with caching by using GaugeAttributeServiceWithCaching.

The important thing to note is that we added new functionality to our application by extending existing parts instead of changing them. I read recently, but cannot find the quote, that if you can add a new feature purely by extending existing classes and creating new classes, you are winning.

Guideline #2. Use accessors for collaborators

In the example above, you probably noticed that when testing GaugeAttributeServiceWithCaching, I changed the cache service used by assigning a new one. What I often see is others using some top level config, or even worse they actually use a $ global.

# bad
Gauges.cache = Memcached.new
class GaugeAttributeServiceWithCaching
  def get(id)
    Gauges.cache.fetch(cache_key(id)) { … }
  end
end

# worse
$cache = Memcached.new
class GaugeAttributeServiceWithCaching
  def get(id)
    $cache.fetch(cache_key(id)) { … }
  end
end

What sucks about this is you are coupling this class to a global and coupling leads to pain. Instead, what I have started doing is using accessors to setup collaborators. Here is the example from above, but now with the cache service accessors included.


class GaugeAttributeServiceWithCaching
  attr_writer :cache_service

  def cache_service
    @cache_service ||= CacheService.new
  end
end

By doing this, we get a sane, memoized default for our cache service (CacheService.new) and the ability to change that default (cache_service=), either in our application or when unit testing.

Finding ourselves doing this quite often, we created a library, aptly named Morphine. Right now it does little more than what I just showed (memoized default and writer method to change).

As I have started to use this gem, I am getting more ideas for things that would be helpful. Here is the same code as above, but using Morphine. What I like about it, over a memoized method and an attr_writer is that it feels a little more declarative and creates a standard way of declaring collaborators for classes.


class GaugeAttributeServiceWithCaching
  include Morphine

  register :cache_service do
    CacheService.new
  end
end

Note also that I am not passing these dependencies in through initialize. At first I started with that and it looked something like this:

class GaugeAttributeServiceWithCaching
  def initialize(attribute_service = GaugeAttributeService.new, 
                 cache_service = CacheService.new)
    @attribute_service = attribute_service
    @cache_service = cache_service
  end
end

Personally, over time I found this method tedious. My general guideline is pass a dependency through initialize when you are going to decorate it, otherwise use accessors. Let’s look at the attribute service with caching again.

class GaugeAttributeServiceWithCaching
  include Morphine

  register :cache_service do
    CacheService.new
  end

  def initialize(attribute_service = GaugeAttributeService.new)
    @attribute_service = attribute_service
  end
end

Since this class is decorating an attribute service with caching, I pass in the service we want to decorate through initialize. I do not, however, pass in the cache service through initialize. Instead, the cache service uses Morphine (or accessors).

First, I think this makes the intent more obvious. The intent of this class is to wrap another object, so that object should be provided to initialize. Defaulting the service to wrap is merely a convenience.

Second, the cache service is a dependency, but not one that is being wrapped. It purely needs a sane default and a way to be replaced, therefore it uses Morphine (or accessors).

I cannot say this is a hard and fast rule that everyone should follow and that you are wrong if you do not. I can say that through trial and error, following this guideline has led to the least amount of friction while maintaining flexibility and isolation.

Guideline #3. Create real interfaces

As I mentioned above, the first thing I started with when working on the caching code was an interface for caching for the application, rather than just using a client directly. Occasionally what I see people do is create an interface, but wholesale pass arguments through to a client like so:

# bad idea
class CacheService
  def initialize(driver)
    @driver = driver
  end

  def get(*args)
    @driver.get(*args)
  end

  def set(*args)
    @driver.set(*args)
  end

  def delete(*args)
    @driver.delete(*args)
  end
end

In my opinion, this is abstracting at the wrong level. All you are doing is adding a layer of indirection on top of a driver. It makes it harder to follow and any exceptions that the driver raises will be raised in your application. Also, any parameters that the driver works with, your interface will work with. There is no point in doing this.

Instead, create a real interface. Define the methods and parameters you want your application to be able to use and make that work with whatever driver you end up choosing or changing to down the road.

Handling Exceptions

First, I created the exceptions that would be raised if anything goes wrong.

class CacheService

  class Error < StandardError
    attr_reader :original

    def initialize(original = $!)
      if original.nil?
        super
      else
        super(original.message)
      end
      @original = original
    end
  end

  class NotFound < Error; end
  class NotStored < Error; end

end

CacheService::Error is the base that all other errors inherit from. It wraps whatever the original error was, instead of discarding it, and defaults to the last exception that was raised $!. I will show how these are used in a bit.

Portability and serialization

I knew that I wanted the cache to be portable, so instead of just defaulting to Marshal’ing, I used only raw operations and ensured that I wrapped all raw operations with serialize and deserialize, where appropriate.

In order to allow this cache service class to work with multiple serialization methods, I registered a serializer dependency, instead of just using MultiJson’s dump and load directly. I then wrapped convenience methods (serialize and deserialize) that handle a few oddities induced by the driver I am wrapping.

class CacheService
  include Morphine

  register :serializer do
    Serializers::Json.new
  end

  private

  def serialize(value)
    serializer.serialize(value)
  end

  def deserialize(value)
    if value.is_a?(Hash) # get with multiple keys
      value.each { |k, v| value[k] = deserialize(v) }
      value
    else
      serializer.deserialize(value)
    end
  end
end

Handling exceptions (continued)

I then created a few private methods that hit the driver and wrap exceptions. These private methods are what the public methods use to ensure that exceptions are properly handled and such.

class CacheService
  private

  def driver_read(keys)
    deserialize(@driver.get(keys, false))
  rescue Memcached::NotFound
    raise NotFound
  rescue Memcached::Error
    raise Error
  end

  def driver_write(method, key, value)
    @driver.send method, key, serialize(value), DefaultTTL.call, false
  rescue Memcached::NotStored
    raise NotStored
  rescue Memcached::Error
    raise Error
  end

  def driver_delete(key)
    @driver.delete(key)
  rescue Memcached::NotFound
    raise NotFound
  end
end

At this point, no driver specific exceptions should ever bubble outside of the cache service. When using the cache service in the application, I need only worry about handling the cache service exceptions and not the specific driver exceptions.

If I change to a different driver, only this class changes. The rest of my application stays the same. Big win. How many times have you upgraded a gem and then had to update pieces all over your application because they willy-nilly changed their interface.

The public interface

All that is left is to define the public methods and parameters that can be used in the application.

class CacheService
  def get(keys)
    driver_read(keys)
  rescue NotFound
    nil
  end

  def set(key, value)
    driver_write :set, key, value
  end

  def delete(key)
    driver_delete key
  rescue NotFound
    nil
  end
end

At this point, the application has a defined interface that it can work with for caching and for the most part does not need to worry about exceptions as they are wrapped and, in some cases, even handled (ie: nil for NotFound).

Creating real interfaces ensures that expectations are set and upgrades are easy. Defined interfaces give other developers on the project confidence that if they follow the rules, things will work as expected.

Guideline #4. Test the whole way through

Whatever you want to call them, you need tests that prove all your components are wired together and working as expected, in the same manor as they will be used in production.

The reason a lot of developers have felt pain with pure unit testing and isolation is because they forget to add that secondary layer of tests on top that ensure that the way things are wired together works too.

Unit tests are there to drive our design. Acceptance tests are there to make sure that things are actually working the whole way through. Each of these are essential and not to be skipped over.

If you are having problems testing, it may be your design. If you are getting burned by isolation, you are probably missing higher level tests. You should be able to kill your unit tests and still have reasonable confidence that your system is working.

Nowadays, I often start with a high level test and then work my way in unit testing the pieces as I make them. I’ve found this keeps me focused on the value I am adding and ensures that my coverage is good.

Conclusion

While it has definitely taken a lot of trial and error, I am starting to find the right balance between flexibility, isolation and overkill.

  1. Stick to single responsibilities.
  2. Inject decorated dependencies through initialization and use accessors for other dependencies.
  3. Create real interfaces.
  4. Test in isolation and the whole way through.

Follow these guidelines and I believe you will start to feel better about the code you are writing, as I have over the past few months.

I would love to hear what others of you are doing and see examples. Comment below with gists, github urls, and other thoughts. Thanks!

16 Comments

  1. FWIW, Ruby already has a “convention” for serialize/deserialize and it’s…wait for it…`load`/`dump`. Marshal, YAML, and JSON already implement it.

    The interface you build up here is generally useful. I’ve worked on at least two apps that have or needed this general idea wrapped around all key/value operations. Thanks for writing up how to build a good cache interface up in layers!

  2. if you can add a new feature purely by extending existing classes and creating new classes, you are winning

    This is the Open/closed principle – you maybe read about it from Bob Martin’s ASDPPP book?

  3. @Adam: Good point.

    @Jamie: Yep, thanks for linking. Didn’t read it in that book, but probably was linked to from another book.

  4. I like these thoughts in general. One thing on the real interface that you build though: it still relies on the driver’s return values for set and delete. It would probably behoove us to provide our own consistent return values, lest the driver change them and wreak havoc throughout our application.

  5. Do you tink it would help a little using namespaces? With the amount of new classes that come with this type of development, I think some kind of organization might be helpful.

    How would you organize it then? GaugeAttributeService::Base and GaugeAttributeService::Cache perhaps?

    I’m working on a fairly big project nowadays and I’m facing these kind of decisions.

  6. So, I’ve been doing this stuff for a while, and have a few thoughts/comments to add here:

    Naming stuff:

    GaugeAttributeService bothers me a bit, because of the Service suffix. Goos highly recommends against naming stuff with patterns, and I tend to agree. Service tells me nothing about why I’d use this thing, so I’d be tempted to go with something like FindGaugeAttributes, and then call the method first_with_id. This might seem super nitpicky, but it means the rest of your code isn’t thinking about what a “service” object is or does, only what it needs to do.

    GaugeAttributeServiceWithCaching could actually be extremely generic (if you always constructor inject all dependencies, more on that in a moment), so you could have something like this:

    
    class GaugeAttributeCache
      def initialize(finder, cacher)
        @finder = finder
        @cacher = cacher
      end
    
      def get(id)
        @cacher.fetch(cache_key(id)) {
          @finder.get(id)
        }
      end
    end
    

    On the idea of using accessors for collaborators, I’m not a huge fan of this. I prefer my objects (that do computation) to be mostly immutable, unless some requirement means I might need to say, swap out which caching service I’m using in code, at runtime. Instead I pass everything through the constructor of every object, until you get to a `main` method (often a rails controller, or a queue object of some kind)

    This leads to a thing some people find weird, but is also wonderful, which is that you will have a couple of top-level methods that just call new on a bunch of objects, passing them into each other, and then tells one of them to start processing. These main methods can lead to huge flexibility in how the system works, by wiring objects together differently, you can change behaviour without writing much code at all.

    Whilst this seems tedious to start off with, defaulting constructors has burnt me a bunch in the pass, so I’m mostly sworn off it now. If I have objects that are too hard to create, that’s a good design smell about that object doing too much. Ocasionally I break up the toplevel methods into a “dsl” layer that literally just constructs object graphs in particular configurations (note I don’t like the phrase dsl here, it’s just some ruby methods with readable names).

    I love the wrapping of third party code in the driver_read etc methods, though I’d be tempted to move them into another object.

  7. Your “class extension” recall maybe have been from reading anything that referenced the SOLID principles espoused by “Uncle” Bob Martin.

  8. Regarding ‘Use accessors for collaborators’ I would ask if you can write code like this:

    service = GaugeAttributeServiceWithCaching.new

    and the service instance is “ready to go,” then you are good with the API design.

    But if the code had to look like this, with a “setup” stage:

    # bad idea
    service = GaugeAttributeServiceWithCaching
    service.register_cache(Cache.new)

    I would say that is a case where you would want to demand a class be created in a useful state (via an initializer). Otherwise, it makes it very challenging to use “out of the box” without reading the API docs :-)

    Of course, there are always exceptions…

  9. (Odd formatting snafu above)

    Regarding naming… absolutely agree. It is worth arguing about. The more critical the class is to the system design, the more I might go to the mat and wrestle a good name to the ground. The decision on a name can be a fleeting event, but it will have everlasting impact. Think: Write Once, Read Many.

    For a C++ app for manufacturing (’95-98), I employed a very layered architecture. Portable business objects were sent to the thin client (no UI talking to DB crap allowed!). The paradigm of pick lists was commonplace… show me a list of parts. The list UI component merely needed a set of IDs and Names. So each domain class could basically implement the interface and they too could be tossed into a drop-down list. My clever name for this little device never grew past “IdString” — we kind of joked about it, because a better name never surfaced.

    When I see something like “GaugeAttributeServiceWithCaching” in isolation, it is not as easy to unilaterally discuss a better name.

    However, were this inside a basic system called “Gauge” and I saw a bunch of other classes prefixed with “Gauge” — I would throw a red flag.

    BTW: My rules are strict for domain-y things, and less strict for utility classes, and lesser things.

    I mostly dislike prefixes, postfixes, redundant things of any sort in a name — class or attribute. If I have to mentally strip off a prefix to get at the gist of what I am reading, then it should not be there in the first place.

    For example (non-Ruby community, mostly):

    • column names prefixed by the table name (Table User; user_first, user_last)
    • public class attributes prefixed by an underscore (Clas User; string _first, string _last)

    I also raise an eyebrow and look more closely at any (domain) class ending in “er.” Yup. Try it. Look for some. They are usually “doing” things.

    You can go too far in making “God” classes that have no properties of their own, are stuffed full of collaborators via the initializer, and wouldn’t know how to delegate if it hit them over the head. It’s the kind of “Manager” class (there’s that ‘er’) that — instead of asking (delegating) for it’s gauges to compute some stat, it gets all the data from the gauges and does the work for the gauges. Don’t be that guy!

    Conversely, look for the boring data class. No methods, just accessor stuff. While it might be just fine, and there truly is no business logic for this class, I would look around just to be sure there are no over achiever “er” classes lurking in the dark alleys — ready to pimp the business logic for the data class.

    Thanks for sharing, John!

  10. @Tom Crayford: I left out some functionality for clarity. The gauge attribute service handles CRUD for gauges. It is the persistence layer for the gauge class. Not a fan of FindGaugeAttributes as it feels verby.

  11. I’m with you on most of the important points here, but you hurt my feelings in two spots:

    
    if (attrs = gauge_collection.find_one(criteria))
    

    Assignment in that spot is, to me, unacceptable. It would cause me to do a double-take every time I read the code. I’d prefer:

    
    class GaugeAttributeService
      def get(id)
        gauge_collection.find_one(criteria_for(id)).tap do |a|
          a.delete('_id') if a
        end
      end
    
    private
      def criteria_for(id)
        {:_id =&gt; Plucky.to_object_id(id)}
      end
    end
    

    The second was this:

    
    def initialize(original = $!)
    

    I would never use a $ variable in Ruby code, but that’s my latent Perl-racism.

  12. Oren Dobzinski Oren Dobzinski

    Jul 12, 2012

    The quote you didn’t find was:

    “When you can extend a system solely by adding new objects without
    modifying any existing objects, then you have a system that is
    flexible and cheap to maintain”

    - Kent Beck in Smalltalk Best Practices Patterns.

    Was quoted in Avdi Grimm’s “Objects on Rails” recently.

  13. I read “clean code” and “the clean coder” books, both are really good. I wrote a blog post about it. Some of his advice is hard to follow but is a good guideline.

  14. Joshua Scholar Joshua Scholar

    Sep 09, 2012

    I made a slightly enhanced version of your 2009 ClassLevelInheritableAttributes module, and I’m wondering if it’s ok with you if I publish it as a gem.

    All I did was:
    1) make the names a bit shorter
    2) make the attr_… into a Kernal method that starts with
    include so that you don’t have to include it yourself (note, Ruby checks for multiple includes so that’s no problem)
    3) had it generate object level access functions so that, like C++ you can access class variables directly from an instance without prefacing it with .class
    4) made an analogous attr_ for regular class variables that just generates that access functions
    5) made an attr_read option for read only class variables.

    I like simple, easy to document libraries like this. I’ve seen other libraries that have similar abilities, but since they have more features and nothing is documented, I can’t bring myself to use them.

  15. Joshua Scholar Joshua Scholar

    Sep 09, 2012

    Actually after playing some more this has gotten more interesting.

    It turns out that “include” does run more than once at least in some cases.

    With some work I’ve come up with the right solution for hooking inherited, which includes making an around alias and chaining calls to it in the case where more than one extension wants to use the hook.

    Frankly that should be built into the language if we’re gonna have hooks at all. As it is you can’t put aliases into modules so you have inject code with class_eval. And it’s an error to do an around alias if you’re the first one at the hook so you have to test whether you’re first before you inject code. And it will include more than once so you have to test for that too.

    What a mess. But it’s ruby so even a bad solution is short and works.

  16. Does your blog have a contact page? I’m having trouble locating it but, I’d like
    to shoot you an e-mail. I’ve got some creative ideas for your blog you might be interested in hearing. Either way, great website and I look forward to seeing it improve over time.

Sorry, comments are closed for this article to ease the burden of pruning spam.

About

Authored by John Nunemaker (Noo-neh-maker), a web developer and programmer who has fallen deeply in love with Ruby. More about John.

Syndication

Feed IconRailsTips Articles - An assortment of howto's and thoughts on Ruby and Rails.

Feed IconRails Quick Tips - Ruby and Rails related links.