(archive 'newLISPer)

October 25, 2008

Project Nestor: part 1

Filed under: newLISP — newlisper @ 09:42

Most of my newLISP projects are unfinished and nameless, but my current project, still an early prototype, and still far from being finished, has already managed to acquire a name. It’s called Nestor. I needed to call it something, and since it deals with nested lists quite a lot, this was the first name that sprang to mind.

For those of you with a classical education, and/or fans of the film Troy, Nestor was the old soldier in Homer’s Iliad who advised King Agamemnon of Mycenae. To be honest, though, I was thinking more of Captain Haddock’s butler, Nestor, who runs Marlinspike Hall. After some desperate reverse acronymization, I managed to make NESTOR stand for “newLISP Source Translator Or Reducer.

The general aim is to convert newLISP source code into s-expressions, to extend the classic Lisp principle of allowing a rich interplay between code and data. Although newLISP has functions for converting text into code (read-expr), and for evaluating code passed as text data (eval and eval-string), information is lost in the process. And I’m hoping to preserve the two aspects of source code that are usually discarded: the comments and the white space (ie the formatting). Although these are solely for the benefit of the human reader and writer of code, I think they’re an important part of the process, and should be treated as such rather than always discarded.

My first attempt at a recursive descent parser for an LL(0) grammar worked OK. In plain English, that simply meant that the read function scanned the source input from left to right, and called itself recursively to look for the next token. On large source files, though (over 3000 lines), it did require a stack larger than the newLISP default of 2800 elements (so it would have to be run with a larger stack setting), and it wasn’t very quick either. I found it hard to control the nesting of the output as the various lists within lists were encountered in the source. And recursion, elegant though it undoubtedly is, has always made me scratch my head a little.

For the next version I went for a more iterative approach, and used a plain list as a stack, controlling the nesting by maintaining a stack pointer in a list:

(define (Stack:Stack i)
  (push i stack sp)
  (setf (sp -1) (+ 1 (sp -1))))

(define (Stack:inc-nest)
  (push 0 sp -1)
  (push '() stack sp)
  (inc 'nest-level))

(define (Stack:dec-nest)
  (pop sp -1)
  (setf (sp -1) (+ 1 (last sp)))
  (dec 'nest-level))

In the Stack default function, which is used to add items to the stack, push is putting an item on the stack at the location determined by the stack pointer sp, which is just a list of index numbers. The nesting can then be easily controlled by adding or removing indices to sp.

The main scanner works its way through the source, and calls the relevant function. For example, an opening parenthesis is easy:

(define (open-paren-token)
  (Stack (list "open-paren" "(")))

but some of the others are more complicated, such as the number scanner, which is recursive:

(define (read-number-scanner list-so-far)
  (let ((next-char (peek-char)))
    ;; if next-char is a digit then call self recursively
    (if (and (char-numeric? next-char) next-char)
          (read-number-scanner (cons (get-next-char) list-so-far))
          (reverse list-so-far))))

The top-level function, Nestor:read, converts the following newLISP source:

; get stdin POST method parameters if present
(set 'inline (read-line))
(if inline
  (set 'params (get-vars inline)))

to this list:

("comment" "; get stdin POST method parameters if present")
("comment" ";")
  ("open-paren" "(")
  ("symbol" "set")
  ("whitespace" "IA==")
  ("quote" "'")
  ("symbol" "inline")
  ("whitespace" "IA==")
    ("open-paren" "(")
    ("symbol" "read-line")
    ("close-paren" ")")
  ("close-paren" ")")
("whitespace" "Cg==")
  ("open-paren" "(")
  ("symbol" "if")
  ("whitespace" "IA==")
  ("symbol" "inline")
  ("whitespace" "IAoJ")
    ("open-paren" "(")
    ("symbol" "set")
    ("whitespace" "IA==")
    ("quote" "'")
    ("symbol" "params")
    ("whitespace" "IA==")
      ("open-paren" "(")
      ("symbol" "get-vars")
      ("whitespace" "IA==")
      ("symbol" "inline")
      ("close-paren" ")")
    ("close-paren" ")")
  ("close-paren" ")")

It’s wordy, isn’t it?! Perhaps I’ll switch to abbreviations or code numbers one day, but at the moment I find the longer names really useful. And the format isn’t going to be used for permanent data storage, just for temporary processing. The whitespace is stored in base64 encoding – again, this may not be the best choice, but for now it works OK.

Now that I’ve got this newLISP expression, or nl-expression, perhaps the most obvious and important function is the logical inverse of Nestor:read, Nestor:nlx-to-text, which converts the nl-expression back to plain source form. Fortunately it’s easy, and a natural application for a recursive function:

(define (nlx-to-text nlxp)
  (dolist (i nlxp)
  (if (atom? (first i))
         ((= (first i) "symbol")
              (write-buffer buff (string (last i))))
         ((= (first i) "open-paren")
              (write-buffer buff (string  {(})))
         ((= (first i) "close-paren")
              (write-buffer buff (string  {)})))
         ((= (first i) "whitespace")
              (dostring (s (base64-dec (last i)))
                (write-buffer buff (string (char s)))))
      ; and so on with other element types
     ; not an atom, recurse
     (nlx-to-text i))))

When this completes, the buffer buff should contain the de-nested source code.

When I tested this approach against the official newLISP installation, I was pleasantly surprised that all the modules converted to and from nl-expressions without error. In other words, given some newLISP in source-text:

(= source-text (Nestor:nlx-to-text (Nestor:read source-text)))

usually returns true.

However, there are problems. For one thing, I’m embarassed that the presence of one or more Unicode characters will kill the translator stone dead. I’ve no idea what the cause is yet, but I hope it’s fixable.

There are other things on my to-do list that will be done soon: nested braces in strings, for example. But there are some aspects of the language that I probably won’t be able to handle at all. The newLISP interpreter is a strange and powerful beast, and my own scripting is never going to be able to match it:

1.2.3e4e5                          ; three tokens: 1.2 .3e4 e5 ?
(println"hi there")                ; no space?
string}                            ; weird symbol or string typo?
(set '[ you're (not) joking ] 0)   ; wacky symbol names in brackets?

That first one was aptly described as a monster! Does anyone use these newLISP constructions in their code?

The first application for Nestor is to re-introduce the syntax colouring of newLISP source code that appeared about a year ago on this site and disappeared during the revamp in March this year. If you’re seeing newLISP code in colour then it’s working. More on that another day.


October 5, 2008

Going API

Filed under: newLISP — newlisper @ 23:02

Not functional in the archive version.

I’ve been playing with the Google chart API, and it’s a great way of inserting charts into your web pages. Here, for example, is a pie chart showing the relative sizes of the different directories in this web site:

And this is the code in the post that generates the chart.

— (Route.Blog:pie-chart “600×200” “t:10.79,1.07,42.80,2.15,19.42,3.59,18.70,1.43” “p3” “views|resources|jack-the-glypher|includes|images|dragonfly-framework|downloads|databases”) —

This chart is generated by Google automatically whenever the page is loaded. Originally it was a live chart, produced by examining the current file system every time the page was loaded, but this made the page take too long to generate… :)

Charts can be generated using this newLISP function:

(define (pie-chart
  (chs "400x200")
  (chd "t:60,40")
  (cht "p3")
  (chl "Hello|World")
  (chalt "Sample Google Chart"))
    {<img src="http://chart.apis.google.com/chart?chs=%s&chd=%s&cht=%s&chl=%s" alt="%s">}
    chs chd cht chl chalt))

which is called like this:

- --(Route.Blog:pie-chart  "600x200" "t:10.79,1.07,42.80,2.15,19.42,3.59,18.70,1.43" "p3" "views|resources|jack-the-glypher|includes|images|dragonfly-framework|downloads|databases") - --

although there shouldn’t be spaces between the three “-” signs (see lispinwebpages” for details).

This following little bit of code is useful for converting numbers into percentages, because the pie chart expects percentage values by default:

(define (make-percentages)
  (let ((total 0))
     (set 'total (apply add (args)))
     (map (fn (f) (mul 100 (div f total))) (args))))

(make-percentages 300 800)
;->(27.27272727 72.72727273)

Create a free website or blog at WordPress.com.