- Functional programmers say no to destructive updates and side effects because they detract from the mathematical transparency of code.
- Open source proponents say no to commercial software, patents and anything that infringes on the right of the individual to own and change the logic of the applications he/she uses.
- Accessible web aficionados say no to Flash, Java, JavaScript and anything that pollutes the document accessibility the web was designed for.
Tuesday, March 3, 2009
Social contracts, commitments and the power of No
Thursday, February 26, 2009
Literate programming in Clojure
I was fascinated to read about the concept of 'Literate programming', or the concept of writing code so very well commented that it stands as its own manual. Since I've been doing a lot of blogging about bits and pieces of Clojure code these last few days, I decided to write a utility that can convert a well-documented Clojure file into its own blog entry.
This is the result of running that utility on itself. Call it self-blogging code.
Since I'm blogging in HTML, the code will receive an input Clojure file and transform it into an output HTML file with syntax-coloring of the code bits. I can then use the output file to retrieve the HTML code for entering into my blog.
Step 1. I want to be able to use this utility from the command-line (btw, check out JLine for adding bash-like command-line utility on unixy systems). So in true command-line fashion, I define a method for printing out usage about the utility:
(defn usage[]
(println "usage: html-transform <input clojure file> <output-html-file>"))
Step 2. I check at runtime that I get both input parameters (input Clojure file and output HTML file); otherwise I print out the usage and quit the utility:
(if (< (count *command-line-args*) 2)
(do
(usage)
(System/exit -1)))
Step 3. I define input and output file according to the command-line arguments:
(def input-file (nth *command-line-args* 0))
(def output-file (nth *command-line-args* 1))
(def input (java.io.InputStreamReader. (java.io.FileInputStream. input-file)))
Step 4. Now, I'm going to define certain 'special' words that will be rendered differently from other text. 'Reserved' words are Clojure forms that are common enough to warrant their own special color. 'Definition' words are the forms that introduce user-defined values, functions and macros; the name that follows one of them will be rendered in bold to highlight it. Finally, 'punctuation' words are opening and closing parentheses, brackets and curly braces.
(def reserved #{"def" "defn" "defmacro" "let" "letrec" "if" "cond" "when" "do" "recur" "loop"})
(def definitions #{"def" "defn" "defmacro"})
(def punc #{"(" ")" "[" "]" "`" "'" "&" "{" "}"})
Incidentally, a string-transformation function to transform special html characters found in the code so as to render them html-ready:
(defn htmlize
  "replace certain characters in a string with their html equivalents to render it html-ready"
  [st]
  ;; "&" is replaced first so that the entities added afterwards are not escaped again
  (.replaceAll (.replaceAll (.replaceAll st "&" "&amp;") ">" "&gt;") "<" "&lt;"))
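With chained replacements the ampersand has to be handled before the other characters. A single-pass alternative can be sketched with clojure.string/escape, which sidesteps the ordering question. This is only a sketch: escape-html is a hypothetical name and is not part of the utility.

```clojure
(require '[clojure.string :as str])

(defn escape-html
  "Hypothetical single-pass alternative to htmlize: each character is
  looked up once in the map, so replacement order does not matter."
  [st]
  (str/escape st {\& "&amp;" \< "&lt;" \> "&gt;"}))

(escape-html "a < b && c > d")
;; => "a &lt; b &amp;&amp; c &gt; d"
```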
Another string-transformation function to extract text from user comments and render them as html paragraphs:
(defn text
"convert a string into another string representing a list of html paragraphs"
[st]
(let [paraphed
(reverse
(loop [lines (.split st "\n") buffer "" result nil]
(let [line (first lines)]
(if line
(let [line (.trim line)]
(if (= line "")
(if (= buffer "")
(recur (rest lines) "" result)
(recur (rest lines) "" (cons buffer result)))
(recur (rest lines) (.trim (str buffer " " line)) result)))
(if (= buffer "")
result
(cons buffer result))))))]
(reduce (fn[a b](str a b))
(map (fn[a](str "<p>" a "</p>")) paraphed))))
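The same paragraph grouping can be sketched with the clojure.string functions, assuming blank lines separate paragraphs and wrapped lines are joined with a single space. The name paragraphs is hypothetical; this is an alternative sketch, not the code the utility uses.

```clojure
(require '[clojure.string :as str])

(defn paragraphs
  "Hypothetical helper: split st into paragraphs on blank lines and wrap
  each paragraph in <p> tags, joining its lines with a single space."
  [st]
  (->> (str/split-lines st)
       (map str/trim)
       (partition-by str/blank?)            ; runs of text lines / runs of blank lines
       (remove #(str/blank? (first %)))     ; drop the blank runs
       (map #(str "<p>" (str/join " " %) "</p>"))
       (apply str)))

(paragraphs "first line\nsecond line\n\nnext paragraph")
;; => "<p>first line second line</p><p>next paragraph</p>"
```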
Step 5. Define a few nifty-looking colors for rendering output Clojure code:
(def string-color "#889")
(def keyword-color "#458")
(def splice-color "#485")
(def reserved-word-color "#c50")
(def punctuation-color "#666")
(def definition-color "#d80")
(def symbol-color "#059")
Step 6. A function to color strings, keywords, symbols, punctuation, splices, definitions. This function also takes the previous colored word so as to render definition names in bold (need to detect that the previously colored word was 'def', 'defn' or 'defmacro'):
(defn color[st previous]
  (cond
    (= "" (.trim st))
    ""
    (.startsWith st "\"")
    (str "<font style=\"color: " string-color "\">" (htmlize st) "</font>")
    (.startsWith st ":")
    (str "<font style=\"color: " keyword-color "\">" (htmlize st) "</font>")
    (.startsWith st "~")
    (str "<font style=\"color: " splice-color "\">" (htmlize st) "</font>")
    (contains? reserved (. st trim))
    (str "<font style=\"color: " reserved-word-color "\">" (htmlize st) "</font>")
    (contains? punc (. st trim))
    (str "<font style=\"color: " punctuation-color "\">" (htmlize st) "</font>")
    :default
    (if
      (contains? definitions previous)
      (str "<font style=\"font-weight: bold;color: " definition-color "\">" (htmlize st) "</font>")
      (str "<font style=\"color: " symbol-color "\">" (htmlize st) "</font>"))))
Finally, I define input and output streams, parse the input, transform it to HTML and send it to the output. This piece of code is a state-aware recursive loop over the Clojure source. I define three states: symbol, string and escape. While I'm in :symbol mode, I use (, ), [, ], { and } as word separators. As soon as a double-quote " is encountered, I switch to string mode and collect all characters into a string. If, while in string mode, I encounter a backslash \, I switch to escape mode for the single following character, which is added to the string whether it's a double quote or not. From escape mode I can only go back to string mode, where encountering another double-quote " switches back down to symbol mode and sends the string to the rendering engine.
The loop keeps count of opening and closing punctuation so as to know the nesting level of any word. This lets me process zero-level strings differently: instead of rendering them surrounded by double-quotes, I render them as HTML paragraphs. This is the literate programming that I was mentioning earlier: the code comments become the text of the blog entry.
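The three states can be looked at in isolation with a stripped-down tokenizer. This is only a sketch under simplifying assumptions: it splits on whitespace alone, ignores punctuation and nesting level, and the names whitespace, flush-word and tokenize are hypothetical.

```clojure
(def whitespace #{\space \newline \tab})

(defn flush-word
  "Append word to tokens as a [:symbol word] pair unless it is empty."
  [tokens word]
  (if (= word "") tokens (conj tokens [:symbol word])))

(defn tokenize
  "Hypothetical sketch: walk the characters of source, switching between
  the :symbol, :string and :escaping states."
  [source]
  (loop [cs (seq source) state :symbol word "" tokens []]
    (if-let [c (first cs)]
      (case state
        ;; the escaped character is added as-is, then back to :string
        :escaping (recur (rest cs) :string (str word c) tokens)
        :string (cond
                  (= c \\) (recur (rest cs) :escaping word tokens)
                  (= c \") (recur (rest cs) :symbol "" (conj tokens [:string word]))
                  :else (recur (rest cs) :string (str word c) tokens))
        :symbol (cond
                  (= c \") (recur (rest cs) :string "" (flush-word tokens word))
                  (whitespace c) (recur (rest cs) :symbol "" (flush-word tokens word))
                  :else (recur (rest cs) :symbol (str word c) tokens)))
      ;; end of input: an unterminated string is not treated specially here
      (flush-word tokens word))))

(tokenize "def a \"b \\\" c\" d")
;; => [[:symbol "def"] [:symbol "a"] [:string "b \" c"] [:symbol "d"]]
```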
(let [out (java.io.PrintStream. (java.io.FileOutputStream. output-file))]
  (. out println (str
    "<html> <head> <style> body { font-family: sans-serif; background-color: #fff; font-size: 10pt; text-align: center; } #main { width: 400pt; margin-left: auto; margin-right: auto; text-align: left; padding: 10pt; } </style> </head> <body> <div id=\"main\">"
    (loop [buffer "" state :symbol word "" previous-word "" level 0]
      ;; .read returns -1 at end of input; test the int, because (char -1) is out of range in current Clojure
      (let [next-int (.read input) next-char (if (neg? next-int) "" (str (char next-int)))]
        (if (neg? next-int)
          buffer
          (cond (= :escaping state)
            (recur buffer :string (str word next-char) "" level)
            (= :string state)
            (cond
              (= next-char "\\")
              (recur buffer :escaping word "" level)
              (= next-char "\"")
              (if (= 0 level)
                (recur (text (str buffer word)) :symbol "" "" level)
                (recur (str buffer (color (str "\"" word "\"") "")) :symbol "" "" level))
              :default
              (recur buffer state (str word next-char) "" level))
            (= :symbol state)
            (cond
              (= next-char "\"")
              (recur (str buffer (color word previous-word)) :string "" word level)
              (= next-char "\n")
              (recur (str buffer (color word previous-word) "<br/>") state "" word level)
              (= next-char " ")
              (recur (str buffer (color word previous-word) " ") state "" word level)
              (or (= next-char "[") (= next-char "{") (= next-char "("))
              (recur (str buffer (color word previous-word)(color next-char "")) state "" word (+ level 1))
              (or (= next-char "]") (= next-char ")") (= next-char "}") )
              (recur (str buffer (color word previous-word)(color next-char "")) state "" word (- level 1))
              (= next-char "\t")
              (recur (str buffer " " (color word previous-word)) state "" word level)
              :default
              (recur buffer state (str word next-char) previous-word level))))))
    " </div> </body> </html>")))
And that's it. Using this utility can speed up blogging with code and enables automatic syntax-coloring of the code. The utility is hosted Here. Feel free to use it or abuse it as fits your own blogging, or non-blogging, needs. Cheerio!
Saturday, February 21, 2009
Xuglu: a concatenative language processor implemented in just a few lines of Clojure
1 2 add
...which translates to "push the number 1 onto the stack, then the number 2, then pop the last two numbers off the stack, add them and push the result back onto the stack". Concatenative languages are fun to use because they make for extremely concise code and give the programmer the sensation of working close to the metal. Better-known languages of this sort include PostScript, Forth, Joy and more recently Factor. My goal here is to use the power of Clojure to write a processor for such a language in just a few lines of code. What I want is to be able to write:
(exec 1 2 add)
..and yield the result of the stack evaluation as a stack itself. So what I need is a list processor that takes in elements and pushes them onto the stack unless they're a function, in which case I let them act upon the stack. Oh, and since we want to be able to push functions onto the stack to do functional evaluations such as maps, folds, filters and sorts, I keep a provision for pushing quoted functions onto the stack. Finally, I want to be able to group my instruction lists as new words that can be easily spliced into a running list. Without further ado, here's my language processor:
(defn execute
"function for executing a xuglu stack"
[stack]
(reduce
(fn[current-stack operator]
(if (instance? clojure.lang.IFn operator)
(if (symbol? operator)
(cons (resolve operator) current-stack)
(operator current-stack))
(cons operator current-stack))) nil stack))
Simple, isn't it? Here's a function for calling the processor with a list of arguments:
(defn exec
  "function for constructing and executing a xuglu stack"
  [& args]
  (execute args))
I should mention that by now, I already have working code. In fact, if I evaluate:
(exec 1 2 3 4 5 list reverse)
...I get the expected:
((5 4 3 2 1))
Now I add the code that will enable me to evaluate a sublist of instructions and splice the result back into the current stack:
(defn push-back
  "push a result stack onto a super-stack, one element at a time"
  [sub-stack]
  (fn[stack] (concat stack sub-stack)))
(defn splice
  "function for splicing a word into the xuglu stack"
  [stack]
  (push-back (execute stack)))
(defn sub-exec
  "execute a sub-stack"
  [& args]
  (splice args))
Now, I can do sub-evaluations:
(exec 1 2 3 (sub-exec 4 5 6) list reverse)
...which gives:
((3 2 1 6 5 4))
or splices:
(def a (list 1 2 3))
(exec 1 2 3 (splice a))
...which yields:
(3 2 1 3 2 1)
The definition of a as a list in this example is actually the definition of a word that will be replaced at runtime by its expansion. Why not facilitate this word creation with a simplified syntax? Here's a macro that does just that (this will be the only macro, promised!):
(defmacro word
  "macro for defining xuglu words"
  [name & args]
  `(def ~name (list ~@args)))
...so now I can write:
(word a 1 2 3 4 5)
..that will define word a as a list of instructions pushing the numbers 1 to 5 onto the stack. My language runtime is done now. I need operators that will manipulate the stack. Unfortunately, all such operators will have to be functions that take the stack as argument. This is where it gets a little complicated, because if I have a function f(x,y) that works on 2 arguments, I have to write a new function that will take the stack, extract 2 arguments, do the evaluation and push the result back onto the stack. Here's a function that will do just that, through a recursive call to partial evaluations, as many times as needed to pull arguments off the stack:
(defn channel-args
  "function that transforms a function of numargs arguments into a function that
  gets numargs arguments off an input stack"
  [fu numargs]
  (fn[stack]
    (loop
      [computed fu
       counter numargs
       mstack stack]
      (if (> counter 0)
        (recur
          (partial computed (first mstack))
          (- counter 1)
          (rest mstack))
        ;; all arguments consumed: call the function and push its result onto the remaining stack
        (cons (computed) mstack)))))
Given this function, I can redefine a whole slew of mathematical and functional operators to use the stack instead of a set number of arguments:
(def x+ (channel-args + 2))
(def x- (channel-args - 2))
(def x* (channel-args * 2))
(def xdiv (channel-args / 2))
(def xreduce (channel-args reduce 2))
(def xfilter (channel-args filter 2))
Now, I can do basic math, or actually do a functional reduction on the stack:
(exec 2 1 x+)
(exec 5 6 x*)
(exec '(1 2 3 4 5 6 7) (quote +) xreduce)
All done in a single hack-session. A tribute to how much one can accomplish with Clojure.
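To watch a stack evolve one instruction at a time, the evaluation can be sketched in a self-contained form with reductions. The names lift-to-stack, step and add are hypothetical and not part of Xuglu; lift-to-stack pops its arguments with take, drop and apply rather than with partial application, and the sketch leaves out quoted symbols and splicing.

```clojure
(defn lift-to-stack
  "Hypothetical helper: turn f, a function of n arguments, into a
  function from stack to stack. The top of the stack is the first
  element, so it becomes the first argument of f."
  [f n]
  (fn [stack]
    (cons (apply f (take n stack))
          (drop n stack))))

(defn step
  "Apply op to the stack if it is a function, otherwise push it."
  [stack op]
  (if (fn? op)
    (op stack)
    (cons op stack)))

(def add (lift-to-stack + 2))

;; every intermediate stack for the program: 1 2 add
(reductions step nil [1 2 add])
;; => (nil (1) (2 1) (3))
```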
Clojure explorations: The "analyser" macro? "Cumulator" macro?
In my continuing exploration of the awesome power of macros, I've come upon a pattern that I see as recurrent in my programming. I tend to write a lot of code to parse my specialized formats that store my website information, and the pattern that I see in many of these parser functions is one that iterates over a list of strings and updates various variables depending on the value of each object.
A simple example would be the following: iterate over a list of strings and count characters, words and lines.
Counting characters in a string is trivial in Clojure/Java:
(. line length)
Counting lines is a simple question of incrementing a counter for each line in the file containing the strings that must be weighed.
Counting words needs its own function, which I give here without explanation (it's useful for what comes after):
(defn count-words[line]
(if (> (. line length) 0)
(reduce (fn[num achar]
(if (= achar \space)
(+ num 1)
num)) 1 line) 0))
Now, if I want to count characters, words and lines from a file test.txt, then I might write the following loop:
(defn lines-of-file[file-name]
(line-seq
(java.io.BufferedReader.
(java.io.InputStreamReader.
(java.io.FileInputStream. file-name)))))
(loop [cc 0 wc 0 lc 0 lines (lines-of-file "test.txt")]
(let [line (first lines)]
(if line
(recur
(+ cc (. line length) 1)
(+ wc (count-words (. line trim)))
(+ lc 1)
(rest lines))
[cc wc lc])))
Now to look for a pattern in the loop form. Given a list, you could insert any list in lieu of the (lines-of-file) initialisation. The lines variable can be anything, and the 'line' variable that points to the head of the remainder of the lines to be weighed need only be a declared variable. I can then rewrite the loop as a macro with a variable parameter list and a body section that updates those parameters for each loop iteration. Here's what it looks like:
(defmacro iter-cumulate[[holder inits element a-list] & body]
`(loop [~holder ~inits a-list# ~a-list]
(let [~element (first a-list#)]
(if ~element
(let [res# (do ~@body)]
(recur res# (rest a-list#)))
~holder))))
Notice how I declare it so as to capture the parameters into a single variable: I'm going to use a vector of variables to turn that one variable into a generalized holder. Here's how I use my macro:
(iter-cumulate [[cc wc lc] [0 0 0] line (lines-of-file "test.txt")]
[ (+ cc (. line length) 1)
(+ wc (count-words (. line trim)))
(+ lc 1) ])
The result of this call gives a vector of numbers. If I run it on a file containing the following text:
two lines of text.
...I get the expected result:
[20 4 3]
But is my macro reusable? Here's one way of reusing it: let's say I want to parse a text file and gather text into paragraphs. I would do that by defining a list of paragraphs and a running text buffer that receives each line and updates the paragraph list only if it receives an empty line. Here's how I would do it:
(iter-cumulate [[paragraphs buffer] [nil ""] line
(concat (lines-of-file "test.txt") '(""))]
(if (= "" (. line trim))
(if (> (. buffer length) 0)
[(concat paragraphs (list buffer)) ""]
[paragraphs ""])
[paragraphs (. (str buffer " " line) trim)]))
Notice how I surreptitiously added a blank line to the end of the list of lines from the file. That's to simulate a blank line at the end of the file to trigger the adding of what's left in the buffer to the list of paragraphs.
Running the loop on a test.txt containing the following text:
This is my first paragraph. This is my second paragraph.
...yields the expected result:
[("This is my first paragraph." "This is my second paragraph.") ""]
And that's it. I expect this macro has been created a million times over the last 50 years, but since I don't know a name for it, I can't look it up. Sapir-Whorf syndrome. *sigh*
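The accumulate-over-a-list pattern is close to what functional languages call a fold, which Clojure provides as reduce. A minimal sketch of the character, word and line count written that way follows; tally is a hypothetical name, and count-words is restated here only to keep the example self-contained. One difference to keep in mind: iter-cumulate stops at the first nil element, while reduce consumes the whole sequence.

```clojure
(require '[clojure.string :as str])

(defn count-words
  "Count space-separated words in line; an empty line has none."
  [line]
  (if (empty? line)
    0
    (inc (count (filter #(= \space %) line)))))

(defn tally
  "Hypothetical helper: fold a sequence of lines into [chars words lines].
  The destructured first argument plays the role of the holder vector."
  [lines]
  (reduce (fn [[cc wc lc] line]
            [(+ cc (count line) 1)               ; +1 for the newline
             (+ wc (count-words (str/trim line)))
             (inc lc)])
          [0 0 0]
          lines))

(tally ["hello world" "" "foo"])
;; => [17 3 3]
```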
Thursday, February 19, 2009
My first Clojure macro : inspired by Ruby's "each"
I've always been impressed with the stunning alacrity with which you can scan over all lines of a file in Ruby and do something with each line. If you want to output each line to the console in Ruby, for instance, you could write:
IO.readlines("test.txt").each do |line|
puts line
end
Now, learning Clojure, I'd like to write some code that allows me to do the same, in an equally intuitive manner. First, a function to get all lines of a file into a list. From there, it should be easy to loop over the list. Here's the lister for lines of a file:
(defn lines-of-file[file-name]
(line-seq
(java.io.BufferedReader.
(java.io.InputStreamReader.
(java.io.FileInputStream. file-name)))))
Now, if I want to loop over the file and print out each line, I would write:
(doseq [line (lines-of-file "test.txt")]
(println line))
The code is concise and legible, no complaints there. However, it doesn't read as well as the Ruby version. If you read it out loud, substituting real words into the forms, you get:
Do for each line in the lines of file test.txt : print out the line
That's not an intuitive form for me. So I write the following macro that inverts the order of certain elements and drops the brackets for declaring the loop variable:
(defmacro over[coll var-name & body] `(doseq [~var-name ~coll] ~@body))
Now, to loop over all lines of the file, I write:
(over (lines-of-file "test.txt") line (println line))
Which translates to:
Over the lines of file test.txt, for each line, print out the line
This may not seem like any improvement, but to me it's a lot more legible. Also: it's reusable! Let's say I want to loop over all key-value pairs in a map, I can write:
(over
{1 "one" 2 "two" 3 "three"} [a b]
(println a "is" b))
Which gives the expected output:
1 is one
2 is two
3 is three
A note about the macro: the & sign in the parameter list means that all arguments passed after it are collected into a single sequence called 'body'. Inside the backquoted template, a ~ in front of a name means: unquote, that is, insert the value of this variable here. To insert the body, I use ~@, which inserts not the 'body' sequence itself, but rather each of the elements it contains. This is a splice.
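One way to check what the template produces is macroexpand-1, which returns the expanded form without evaluating it. In the sketch below the macro is restated so the example stands alone, and coll and line are just placeholder symbols.

```clojure
(defmacro over [coll var-name & body]
  `(doseq [~var-name ~coll] ~@body))

;; ~var-name and ~coll are replaced by the forms passed in, and ~@body
;; splices each body form into the doseq call
(macroexpand-1 '(over coll line (println line)))
;; => (clojure.core/doseq [line coll] (println line))
```

The expansion also shows that the backquote qualifies doseq with its namespace, which is why the macro keeps working even where another doseq is in scope.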
Macros are reputed to be hard to debug and best avoided whenever possible. But hey, who can resist writing macros? Once defined, they're baked right into the runtime alongside most of the other forms composing the language, and they hoist the programmer's abstraction patterns to the same level as all other language forms. Now that's empowering.
