Tonight Blaine sent me his version of my Simple Ruby Log splitter. He took it a giant step further and made a reusable library for splitting files. He used an object-oriented approach and made it extensible by letting the caller pass in a block that determines when to split the file. Hopefully he will post the code either on my site or his.
Good stuff.
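I do not have Blaine's code to show, so here is only a minimal sketch of the general idea: a small class where the caller's block decides when a new file starts. The class name FileSplitter, its method names, and the numbered output file names are my own assumptions for illustration, not his design.

```ruby
# Hypothetical sketch, not Blaine's actual code.
class FileSplitter
  def initialize(basename)
    @basename = basename
  end

  # Reads instream line by line. The block receives (line, index) and
  # returns true when a new output file should start at that line.
  # The first line always opens the first file without asking the block.
  # Returns the list of file names written.
  def split(instream)
    names = []
    out = nil
    instream.each_line.with_index do |line, index|
      if out.nil? || yield(line, index)
        out.close if out
        names << "#{names.size + 1}_#{@basename}"
        out = File.open(names.last, "w")
      end
      out.write(line)
    end
    names
  ensure
    out.close if out
  end
end

# Possible use, starting a new file every 1000 lines:
#   FileSplitter.new("split.log").split($stdin) { |_line, index| index % 1000 == 0 }
```

Because the split rule lives in the block, the same class could split on a date change or a marker line instead of a line count.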
Anyway, he brought that problem back into my head, and I immediately thought about making a DSL for splitting files. The code I came up with is below.
Note: I am hoping to share ideas here. If anyone has a better way to do this please please please either post your thoughts/code in the comments or blog about it and send me the link. I am still very much learning Ruby, and would love to see what other people are doing.
#
# read from $stdin, splitting it into multiple files, using a DSL of sorts
#
def nextFile(filename, count)
  File.open("#{count}_#{filename}", "w")
end

def split(instream, filename, args)
  currentFile = nil
  lines = count = 0
  lineBreak = args[:lines].to_i
  instream.each do |line|
    currentFile = nextFile(filename, count+=1) if (lines%lineBreak) == 0
    lines += 1
    currentFile.write line
  end
end

def parse(command)
  command.gsub(/into|when/, ",").gsub(/=/,"=>") # overly simplistic, no error handling, etc
end

def run(command)
  eval parse(command)
end

run "split $stdin into 'split.log' when :lines = 1000"
Of course, one quick look will show that this example is simplistic, and that a lot of work could be done to make it more robust and/or full featured. That is not the point. The point is line 27, the final run call, where your business-tier developers only need to type near-English syntax to get their jobs done. The rest is not too hard in this case, though it can get much, much harder in others. However, that part can be handled by a relatively small number of developers, and it greatly simplifies the work for everyone else.
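To make the trick concrete, here is what the parse step hands to eval. The parse method is repeated from the listing so the snippet runs on its own; the second command, with its 'when.log' file name, is a made-up input to show one way the substitution can go wrong.

```ruby
# Same substitution as parse in the post, repeated so this runs standalone.
def parse(command)
  command.gsub(/into|when/, ",").gsub(/=/, "=>")
end

# The near-English command becomes an ordinary Ruby method call.
puts parse("split $stdin into 'split.log' when :lines = 1000")
# prints: split $stdin , 'split.log' , :lines => 1000

# Hypothetical input: a file name containing a DSL keyword gets mangled too.
puts parse("split $stdin into 'when.log' when :lines = 10")
# prints: split $stdin , ',.log' , :lines => 10
```

The second result is why the comment in parse calls it overly simplistic: the substitution cannot tell a keyword from the same letters inside a quoted file name.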
Anyone care to take this example further?
p.s.> I know there is some work being done on DSL books for Ruby. I cannot wait until I get my hands on one. If anyone out there needs a reviewer, drop me a line.
Roger Henson | 05-Sep-06 at 2:37 pm | Permalink
Matt,
I decided to take your challenge after looking at your code and Blaine’s. Here is my solution to the file splitter problem:
# splits.rb -- splits one input file into multiple output files
#
# Usage: ruby splits.rb infile count
#
# where infile is the file you want to split and
# count is the maximum number of lines you want in each file
#
# Author: Roger Henson 09/05/2006
#
if ARGV.size < 2                                # if user did not supply filename and count
  File.open($0.to_s, "r") do |sourcefile|       # open this source code
    while sourceline = sourcefile.gets          # read source and
      puts sourceline if sourcefile.lineno < 8  # write usage comments
    end
  end
  exit                                          # exit program
end
# initialize local variables
count = ARGV[0].to_i
infile = ARGV[1]
num = 0
# open and process the user's input file using |file| block
File.open(infile, "r") do |file|
  while inline = file.gets                      # get next while there are input lines
    if ((file.lineno % count) == 1)||(file.lineno == 1)  # make a new file?
      num += 1                                  # yes, add 1 to filenumber count
      outfile = infile + num.to_s               # append filenumber to filename
      outline = File.open(outfile, "w+")        # open a new output file
    end
    outline.puts(inline)                        # put one input line to output file
  end
end
—————————————————————————————-
To test it, I created an Excel macro to generate a csv file containing
the numbers 1 through 1100 and their corresponding value in words. For
example:
1,One
2,Two
3,Three
4,Four
5,Five
.. etc
This made it easy to see where the file breaks were during testing. The
program starts with a piece of code I converted to Ruby from one of my
REXX programs. It displays the first seven lines (comments) of the
source file when the user does not supply all arguments. By putting
program usage in comments at the top, I can provide a simple help
message and document the code at the same time.
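A small variation on the same help-from-comments idea, as a sketch only: instead of counting to seven, it stops at the first line that is not a comment. The method name usage_text is hypothetical, and unlike the program above it would also print the author lines, since they are part of the same comment block.

```ruby
# Hypothetical helper: returns the leading comment block of a script,
# stopping at the first line that does not start with "#".
def usage_text(path)
  File.foreach(path).take_while { |line| line.start_with?("#") }.join
end

# Possible use at the top of a script:
#   if ARGV.size < 2
#     puts usage_text($0)
#     exit
#   end
```

The trade-off is that nothing needs updating when the header grows or shrinks, but anything you do not want in the help message has to sit below the first non-comment line.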
Brooks | 07-May-08 at 11:10 pm | Permalink
The “initialize local variables” section in Roger’s code appears to read the ARGV entries in the wrong order. To get the code to work, I had to switch them so that infile = ARGV[0] and count = ARGV[1].to_i.
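For anyone trying this later, here is a minimal sketch of the splitter as a method that takes the arguments in the order the usage comment gives (infile, then count). The name split_file is my own, and this is a rewrite for illustration rather than Roger's program. It tests a zero-based line index instead of lineno, which also makes a count of 1 work.

```ruby
# Hypothetical rewrite of the splitter as a method, with the arguments
# in the order the usage comment gives: infile first, then count.
# Returns the number of output files written.
def split_file(infile, count)
  raise ArgumentError, "count must be positive" unless count > 0
  out = nil
  num = 0
  File.foreach(infile).with_index do |line, index|
    if index % count == 0              # first line of each chunk
      out.close if out
      num += 1
      out = File.open("#{infile}#{num}", "w")
    end
    out.puts(line)
  end
  num
ensure
  out.close if out
end

# Command-line use, matching "ruby splits.rb infile count":
#   split_file(ARGV[0], ARGV[1].to_i)
```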