August 2006

Hibernate Second Level Caching

Hibernate does not insert into second level cache on database inserts.

General
Java
Software Development

PST’s

Tonight I finished The Idea Book, by the guys at interesting.org. The book is not readily available in the US, but the great folks over at inBubbleWrap.com and 800cr kindly put together a fundable project to get an order of the book shipped to the US from Sweden, and I got in on the deal. By the way, if you are interested in the book, a sample (around 35 pages, if I remember correctly) is available here.

150 pages about ideas.
150 pages for your own ideas.
An idea book.

The first thing you notice when opening the book is that the content is interleaved with a number of plain white pages. These are for your ideas, so that you can jot as you read. I was tempted, but instead chose to keep a sheet of paper handy for note taking, in case people want to borrow the book.

There is so much great stuff in the book that I could go on and on. Instead I will leave you with one tidbit of advice: Don’t be a PST (Person who has Stopped Thinking). These are the people who continue to do things the same way, all the time, because that is the way it has always been.

Always be willing to question the norm, and look for another way of doing things. I know that at times (more often than I would like to admit) I have been a PST. It is an insidious trap, because while you are stuck in PST mode, chances are good that you do not even know you are stuck! Be willing to accept new ideas. Question everything. Think Different. (your pithy two word statement goes here)

Books
General
Self
Software Development
Thought

A Ruby DSL for splitting files

Tonight Blaine sent me his version of my Simple Ruby Log splitter. It takes the idea a giant step further and turns it into a reusable library for splitting files. He used an object-oriented approach and provided extensibility by allowing the caller to pass in a block that determines when to split the file. Hopefully he will post the code either on my site or his. ;) Good stuff.
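For anyone curious what a block-driven splitter might look like, here is a minimal sketch. To be clear, this is not Blaine's code: the class name `FileSplitter`, the `namer` argument, and the two block arguments are all made up for illustration. It only shows the general shape of "the caller's block decides when to split".

```ruby
# Hypothetical sketch: FileSplitter, namer, and the block arguments are
# illustrative names, not taken from Blaine's library.
class FileSplitter
  # namer: a callable that maps a 1-based file index to an output path
  def initialize(namer)
    @namer = namer
  end

  # Copies every line of input into a series of output files.
  # The block receives the line and the number of lines already written to
  # the current file; returning true starts a new file before the line is
  # written. Returns the list of paths that were written.
  def split(input)
    paths = []
    out = nil
    written = 0
    input.each_line do |line|
      if out.nil? || yield(line, written)
        out.close if out
        paths << @namer.call(paths.size + 1)
        out = File.open(paths.last, "w")
        written = 0
      end
      out.write(line)
      written += 1
    end
    paths
  ensure
    # make sure the last file is closed, even if the block raised
    out.close if out && !out.closed?
  end
end

# Example use (assumed file naming): start a new file every 1000 lines.
#   splitter = FileSplitter.new(->(i) { "#{i}_split.log" })
#   splitter.split($stdin) { |_line, written| written >= 1000 }
```

Because the split decision lives in the block, the same class could split on a line count, on a date change in the log line, or on anything else the caller can compute from the line.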

Anyway, he brought that problem back into my head, and I immediately thought about making a DSL for splitting files. The code I came up with is below.
Note: I am hoping to share ideas here. If anyone has a better way to do this please please please either post your thoughts/code in the comments or blog about it and send me the link. I am still very much learning Ruby, and would love to see what other people are doing.

  1. #
  2. # read from $stdin, splitting it into multiple files, using a DSL of sorts
  3. #
  4. def nextFile(filename, count)
  5.         File.open("#{count}_#{filename}", "w")
  6. end
  7.  
  8. def split(instream, filename, args)
  9.         currentFile = nil
  10.         lines = count = 0
  11.         lineBreak = args[:lines].to_i
  12.         instream.each do |line|
  13.                 currentFile = nextFile(filename, count+=1) if (lines%lineBreak) == 0
  14.                 lines += 1
  15.                 currentFile.write line
  16.         end
  17. end
  18.  
  19. def parse(command)
  20.         command.gsub(/into|when/, ",").gsub(/=/,"=>")  # overly simplistic, no error handling, etc
  21. end
  22.  
  23. def run(command)
  24.         eval parse(command)
  25. end
  26.  
  27. run "split $stdin into 'split.log' when :lines = 1000"

Of course, one quick look will show that this example is simplistic, and that there is much work that could be done to make it more robust and/or full featured. That is not the point. The point is line 27, where your business-tier developers only need to type near-English syntax to get their jobs done. The rest of it is not too hard in this case… though it can get much, much harder in other cases. However, that can be handled by relatively few developers, and it greatly simplifies the work for the rest.
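To make the `parse` step a little more concrete, the sketch below prints what the command string turns into before it is handed to `eval`. It also shows one way the plain substitution can go wrong. The `parse_words` variant and the `winton.log` file name are my own illustrative additions, not part of the code above.

```ruby
# Same substitutions as the parse method in the listing above.
def parse(command)
  command.gsub(/into|when/, ",").gsub(/=/, "=>")
end

# Hypothetical variant: only replace "into" and "when" as whole words.
def parse_words(command)
  command.gsub(/\b(?:into|when)\b/, ",").gsub(/=/, "=>")
end

puts parse("split $stdin into 'split.log' when :lines = 1000")
# prints: split $stdin , 'split.log' , :lines => 1000

# A file name that happens to contain "into" gets mangled by the plain version...
puts parse("split $stdin into 'winton.log' when :lines = 10")
# prints: split $stdin , 'w,n.log' , :lines => 10

# ...while the whole-word version leaves it alone.
puts parse_words("split $stdin into 'winton.log' when :lines = 10")
# prints: split $stdin , 'winton.log' , :lines => 10
```

Even the whole-word version would still trip over a file literally named `into.log`, so a DSL that needs to be robust probably wants a real tokenizer rather than string substitution.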

Anyone care to take this example further?

p.s.> I know there is some work being done on DSL books for Ruby. I cannot wait until I get my hands on one. If anyone out there needs a reviewer, drop me a line. ;)

General
Ruby
Thought

Simple Ruby log splitter

The other day I needed to split up a large log file into smaller, more manageable bits. Long ago I wrote a Java-based file splitter, and long before that I wrote one in C++ (woohoo iostreams!). Unfortunately, I did not have either of these programs with me.

So, I wrote it in Ruby:

#
# read from $stdin, splitting it into multiple files, based off of MAX_LINES
#

FILE_SUFFIX = ARGV[0] || "split.log"
MAX_LINES = ARGV[1] ? ARGV[1].to_i : 500000
lines = 0
fileCount = 0
currentFile = nil

$stdin.each do |line|
        # open the next output file lazily, only when there is a line to write
        if currentFile.nil?
          fileName = "#{fileCount}_#{FILE_SUFFIX}"
          puts "Writing file " + fileName
          currentFile = File.open(fileName, "w")
          fileCount += 1
          lines = 0
        end

        currentFile.write line
        lines += 1

        # close the file once it holds MAX_LINES lines
        if lines >= MAX_LINES
          currentFile.close
          currentFile = nil
        end
end

# close the last, partially filled file
currentFile.close if currentFile

Usage:

ruby split.rb < file

ruby split.rb mylogfilename < file

ruby split.rb mylogfilename maxlines < file

This is by far the smallest (and simplest) I have ever gotten the file splitter code, without using extra jars in Java. There are a number of reasons for this, not least of which is the elegant use of closures in Ruby to pass each line, instead of having to do all of the reading directly in my code.

But really, I’m putting this out here to see what ideas or approaches other people might have on how I could have done this better, more efficiently, etc. So, please post your thoughts and comments. I would love to see a more elegant approach.
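To get the ball rolling, here is one possible alternative, sketched with `each_slice` from Ruby's Enumerable. The helper name `split_lines` is made up for this example. The trade-off is that each chunk of up to `max_lines` lines is held in memory before it is written, which may matter for very large chunk sizes.

```ruby
# Hypothetical helper: split_lines is an illustrative name, not part of
# the script above. Writes chunks of up to max_lines lines to files named
# "<index>_<suffix>" and returns the list of file names.
def split_lines(input, suffix, max_lines)
  names = []
  input.each_line.each_slice(max_lines).with_index do |chunk, index|
    name = "#{index}_#{suffix}"
    # the block form of File.open closes the file for us
    File.open(name, "w") { |file| chunk.each { |line| file.write(line) } }
    names << name
  end
  names
end

# Example use, mirroring the command line arguments of the script above:
#   suffix    = ARGV[0] || "split.log"
#   max_lines = ARGV[1] ? ARGV[1].to_i : 500000
#   split_lines($stdin, suffix, max_lines).each { |name| puts "Writing file " + name }
```

The bookkeeping variables (current file, line count, file count) disappear because `each_slice` does the grouping and `with_index` does the numbering.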

General
Ruby
Software Development

Perspectives on Performance

This evening, my wife asked me to take care of a load of laundry. Specifically, she asked me to take out all of the non-wrinkled clothes and fold them, while the rest of the load de-wrinkled in our wonderfully advanced, de-wrinkling dryer. I looked at her like she was crazy, and told her what I would do instead was just de-wrinkle all of it and then fold the load later.

My thinking was obvious: delay the unpleasantness of handling the laundry until hopefully I would not have to do it, or at least until after dinner.

Her thinking was more practical: do what can be done now, while the rest is being dewrinkled, and then I would have less to worry about when she leaves me with the kids after dinner.

In the end, I chose the only sane option of the two: I did it her way. The reason (besides the obvious: I was not prepared to die over laundry), is that she also imparted a technique that made sense: take all of the laundry out of the dryer, spread it out, and then put all the things that need to go back in the dryer back in the dryer. Simple. My initial thought when I looked at her like she was crazy was that she wanted me to dig through the clothes while still in the dryer and get out all the things that were not wrinkly. Obviously a more painful approach and thus why I optimized it out of the loop entirely.

Pondering the results

While doing as I was told, I pondered whether my initial approach (just letting it all de-wrinkle) was the more performant one. After all, I was now moving the laundry from the dryer to the sorting area and partially back to the dryer, instead of pressing two buttons, getting back to playing with my kids for a while, and then dealing with it all at once.

Well, as usual with performance, it depends. See, if I did it her way I would spend a little more time on the laundry (processing time) but I would be done sooner (latency). My solution only saved time in the processing sense: I would not move the laundry around as much, and it was ultimately the simplest approach. Which is the right approach (ignoring for the moment the whole death thing)?

I ran into much the same question at work today. Do we worry about performance first, or do we strive for a simpler processing model and then worry about performance? If you are like most techies I know, you say performance. Why? Because it is the harder problem, at least on the surface. Simplicity is subjective, and performance is not, when measured in a scientific manner. In practice, performance is usually just as subjective. I hear people say all the time that to write fast Java code you need to use StringBuffer. And that’s it. To me, simplicity should always be the primary goal against the given set of requirements. If performance is a requirement, then it gets added to the equation. If it isn’t, then we only worry about it when it becomes apparent that it is a problem.

performance = f(simplicity, nature of the processing)

I believe strongly that performance is a function of the simplicity of the system and the nature of the processing. The simpler the system, the more performant it usually is. I say usually because in this case I am talking about the Einsteinian view of simplicity: “Make it as simple as possible, but no simpler”. One can oversimplify the system and cause it to do too much of everything as a result. It is a fine line to walk.

It’s not personal, it’s just business

Really, when it comes down to it, if the business requires a certain level of performance, then that is what matters. In my case tonight my wife had certain expectations (business reasons) that needed to be met if I was to be saved. At work today it was not quite as clear cut. Yes, performance matters, but more than a simple, easily maintainable design? That is not an easy choice to make. It is even more difficult when you add multiple people, each with their own perspective(s), to the mix. My wife and I had different perspectives, which gave us different ways to solve the problem. She had an approach that was simpler than my painful approach and, while slightly more costly than my initial approach in processing time, was effective and got her what she wanted. Sometimes, even when the simplest is not “too simple”, there is a compromise that satisfies the perspectives better.

General
Performance
Self
