(archive 'newLISPer)

November 18, 2007

Colour me orange

Filed under: newLISP — newlisper @ 12:53

You might be surprised to learn that – despite the bright orange tint everywhere on this site – I’m a bit of a monochromatic minimalist at heart, preferring dark text on light backgrounds and avoiding bright colours on web pages. However, I’ve decided that it’s time that newLISP code on this site should be displayed in colour rather than in dull grey. (I was partly inspired by Cyril’s excellent work on the vim syntax module, which you can read about on the newLISP forum, and by the colour schemes available in the newLISP editor, named after composers.)

Up to now, I’ve been using Lutz’ syntax.cgi program for colouring the code in the downloads section. I thought it would be cool to adapt this to use CSS styling, rather than the old-school tags. And then I also noticed that this syntax program didn’t always process every file perfectly – there are a few known problems mentioned in the file itself. So I thought I’d have a go at a new version of the program.

I’ve set it up so that there are four different CSS styles you can use for newLISP source code. These are selected by one of four classes:

  • c for comments
  • k for keywords
  • s for strings
  • p for parentheses

So the HTML code for marking up a function definition looks like this:

(encode-backslash-escapes t

which is a lot of text for a simple task (it was even longer until I abbreviated the class names). I’ve installed the syntax processor to paint the files on the downloads page, and it’s working quite well so far. I admit that painting code this way isn’t very quick. But I bet Michelangelo often said that about painting the Sistine Chapel.
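For anyone curious what that kind of markup amounts to, here's a minimal sketch in Python (not the newLISP that runs on this site) which wraps already-classified pieces of code in tags carrying the four class names. The use of `<span>` elements and the `paint` helper are my assumptions for illustration; syntax.lsp may well build its output differently.

```python
import html

# The four class names from the list above.
CLASSES = {"c": "comment", "k": "keyword", "s": "string", "p": "parenthesis"}

def paint(pieces):
    """Join (class_or_None, text) pairs into HTML, wrapping classified text in spans."""
    out = []
    for cls, text in pieces:
        safe = html.escape(text, quote=False)   # escape &, < and > only
        if cls is None:
            out.append(safe)                    # plain code and white space pass through
        elif cls in CLASSES:
            out.append(f'<span class="{cls}">{safe}</span>')
        else:
            raise ValueError(f"unknown class: {cls}")
    return "".join(out)

pieces = [("p", "("), ("k", "println"), (None, " "), ("s", '"a<b"'), ("p", ")")]
print(paint(pieces))
```

Even a five-token expression turns into well over a hundred characters of HTML, which is why short class names are worth having.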

A more interesting problem, though, is how to integrate the code painting with Markdown, which I use for writing the text and which is also running as a comment processor. The problem is that Markdown doesn’t provide any syntax or convention for specifying which language a piece of code is written in. The only convention available is to indent text in the source file by at least four spaces. Such text is then formatted, with its white space preserved, inside <pre> and <code> tags. You use this convention for all code listings – HTML, CSS, and any other language – and for anything else that relies on formatting defined by white space.

Obviously we don’t want to run a newLISP formatting script on a block of HTML code. But there doesn’t seem to be a convention for specifying a language, so there are two choices: detect the language using some form of scanning (or guessing), or stick an indicator at the start of the code-block to specify which language to assume.

I’ve taken the easy way out. The convention I’ve used in the latest newLISP version of Markdown is simple: if you want to display a code block but you don’t want it processed by the newLISP code painter, put an exclamation mark (!) at the start, on a line of its own. Like this:

!
(def-inline-matcher 'link
    (:greedy-repetition 0 nil :whitespace-char-class)
    (:register (:greedy-repetition 0 nil (:inverted-char-class #\))))

So that bit of Common Lisp won’t be painted in colour. (To show the exclamation mark, I used a second one, but followed it with a space so that it didn’t get processed.) But this bit of newLISP will:

(define (process-source source-code-segments)
  (let ((result {}))
    ; work through segment list
    (dolist (pair source-code-segments)
      (set 'start (last (first pair)))
      (set 'end   (last (last pair)))
      ; put any white space back in
      (while (< cursor start) (print (cursor 1 Txt)) (inc 'cursor))
      (set 'type  (first (first pair)))
      (set 'source-string (slice Txt start (- (+ end 1) start)))
      ; choose the treatment according to the segment type
      (cond
        ((= type 0)
          (push (highlight-keywords source-string) result -1))
        ((= type 4)
          (push (string {} (escape-html source-string) {}) result -1))
        (true
          (push (string {} (escape-html source-string) {}) result -1)))
      (set 'cursor (+ end 1)))
    result))

It’s a user-friendly solution. Most of the code examples here are in newLISP, and presumably that’s true of most of the comments as well, so it’s easier to say when you don’t want painted code, not when you do.
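The opt-out test itself is tiny. Here's a rough Python sketch of the idea, assuming the code block arrives as a single string; the function name `split_marker` is made up for illustration and isn't taken from the newLISP version of Markdown.

```python
def split_marker(block):
    """Return (paint, body) for a code block.

    A first line consisting of exactly "!" switches painting off and is
    dropped from the body. "! " (with a trailing space) is not a marker,
    so it can be used to display a literal exclamation mark.
    """
    first, _, rest = block.partition("\n")
    if first == "!":
        return False, rest
    return True, block

print(split_marker("!\n(def-inline-matcher 'link"))
print(split_marker("(define (double x) (* x 2))"))
```

Anything that doesn't start with a lone exclamation mark is treated as newLISP and painted, which matches the default described above.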

The main syntax painting code is in syntax.lsp (the syntax.cgi file loads this and builds an HTML page). I’ve relied heavily on code by newLISP guru Fanda and newLISP creator Lutz. The heart of the script is Fanda’s routine that scans newLISP source and records the character positions where the mode (code, string, or comment) changes. Then this list is used to rebuild a copy of the source in which the different sections are enclosed in tags, and the white space gaps are copied over from the original.
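To make the scan-then-rebuild idea concrete, here's a much simplified sketch in Python. It is not Fanda's routine: it only knows about `;` comments, double-quoted strings and brace strings, it ignores `#` comments and [text] blocks, and the names `scan` and `rebuild` and the `<span>` output are my own assumptions.

```python
import html

CODE, STRING, COMMENT = "code", "string", "comment"

def scan(src):
    """Return (mode, start, end) segments covering src; end is exclusive."""
    segments = []
    i, n = 0, len(src)
    code_start = 0
    while i < n:
        ch = src[i]
        if ch == ";":                        # comment runs to end of line
            j = src.find("\n", i)
            j = n if j == -1 else j
            mode = COMMENT
        elif ch == '"':                      # quoted string with backslash escapes
            j = i + 1
            while j < n and src[j] != '"':
                j += 2 if src[j] == "\\" else 1
            j = min(j + 1, n)                # step past the closing quote
            mode = STRING
        elif ch == "{":                      # brace string; braces may nest
            depth, j = 1, i + 1
            while j < n and depth:
                depth += {"{": 1, "}": -1}.get(src[j], 0)
                j += 1
            mode = STRING
        else:
            i += 1                           # ordinary code character
            continue
        if code_start < i:                   # flush the code seen so far
            segments.append((CODE, code_start, i))
        segments.append((mode, i, j))
        i = code_start = j
    if code_start < n:
        segments.append((CODE, code_start, n))
    return segments

def rebuild(src):
    """Copy src, wrapping string and comment segments in classed spans."""
    out = []
    for mode, start, end in scan(src):
        text = html.escape(src[start:end], quote=False)
        if mode == CODE:
            out.append(text)                 # code and white space copied as-is
        else:
            cls = "s" if mode == STRING else "c"
            out.append(f'<span class="{cls}">{text}</span>')
    return "".join(out)

print(rebuild('(println "a" {b}) ; done'))
```

Because the segments cover the whole source, white space is carried over with the code segments here rather than being copied separately, which is a simplification of what the real script does.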

This still requires some testing, and some additions (I haven’t included the single-character operators yet, because I’m not sure what form of escaping some of them will need). Please let me know of any problems or improvements. And if you can work out how to choose and apply your own colour schemes as well, please share.

Comment from newlisper

During testing, I’ve noticed a few problems with this syntax-painting module. One to watch is that consecutive strings that are not separated by a space are not processed correctly. For example:

(println {like}{this})

won’t be handled correctly. It’s like the scanner doesn’t get the time to switch from string to code then back to string. The solution at the moment is to not write strings like that!

Another – more cosmetic – problem is that some of the reserved words with question marks aren’t picked up. For example:

(number? list list?)

should all be matched. The regex that matches them needs more work.
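One plausible cause, and it's only a guess since the regex isn't shown here: a word-boundary anchor such as `\b` never matches straight after a `?` that is followed by a space or parenthesis, because none of those are word characters. This Python sketch shows the difference, using lookarounds on Lisp delimiters instead; the keyword list and both patterns are illustrative assumptions, not the ones in syntax.lsp.

```python
import re

KEYWORDS = ["number?", "list?", "list", "define"]    # tiny illustrative subset

# Longest first, so "list?" is tried before "list".
alternation = "|".join(
    re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True)
)

# Naive version: relies on \b at both ends of the keyword.
naive = re.compile(rf"\b(?:{alternation})\b")

# A symbol is bounded by white space, parentheses, quotes or braces, not by \b.
DELIMS = r'\s()"{}'
better = re.compile(rf"(?<![^{DELIMS}])(?:{alternation})(?![^{DELIMS}])")

src = "(number? list list?)"
print(naive.findall(src))     # ['list', 'list']
print(better.findall(src))    # ['number?', 'list', 'list?']
```

The naive pattern misses number? altogether and truncates list? to list, which looks a lot like the symptom described.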

Comment from Fanda

Hello newlisper!

I am happy that you are using my code processing script :-)

The error you are talking about has been fixed. I removed ‘group’ function and rewrote ‘group-types’. Download it from:
http://www.intricatevisions.com/source/newlisp/code.lsp


PS: I don’t know about speed up yet ;-)

Comment from cormullion

Thanks Fanda! I’m now trying to put your new work into my new work! I’m using your group function as the main tokenizer function for my formatter…

Comment from Fanda

Group function is obsolete, because its functionality was added to explode:

(explode '(1 2 3 4 5))
((1) (2) (3) (4) (5))
(explode '(1 2 3 4 5) 2)
((1 2) (3 4) (5))
(explode '(1 2 3 4 5) 2 true)
((1 2) (3 4))

‘true’ flag works opposite than in ‘group’ (if I had to guess).


PS: You can fix my English grammar mistakes since you are reviewing comments :-)))) [joking]

