(archive 'newLISPer)

March 26, 2007

Interview with Lutz

Filed under: newLISP — newlisper @ 23:52

Lutz Mueller, the creator of newLISP, is a familiar presence on the newLISP user forums, where he’s usually to be found helping everyone with their code problems, or announcing a never-ending stream of new releases and additions to the language.

I thought that it would be great to hear more from him, so I’m pleased to be able to present you with this short interview – an exclusive, as they say. Unfortunately, I wasn’t able to hop over to the USA to interview him personally, so we communicated electronically.

When did you first encounter Lisp? What did you like about it?

I heard about Lisp when at college, but that was before the personal computer had arrived. It was not until the beginning of the 1990s that I had a closer look. What appealed to me was the simplicity and elegant syntax of the language.

What made you decide to write your own version of Lisp?

A colleague at work used Lisp on his Macintosh computer; he did all his electronics engineering calculations and modeling with it. Seeing somebody doing real work with Lisp caught my interest. I had done several programming languages before and had enough code fragments in my toolbox to put a Lisp interpreter together quickly on SunBSD and show it to my colleague. The rest is history.

Were you surprised by the negative reaction to your work from some of the more conservative elements of the Lisp community – you might expect that they would welcome an addition to the extended family of Lisp-like languages?

People don’t like big change. All of today’s mainstream scripting languages had fierce opposition in the beginning. The way programs are written has changed dramatically over the last 20 years. I wanted a Lisp with its essential core characteristics of lists for both programs and data, and lambda functions, but with a script-language-like feeling and a modern built-in API. Things like regular expressions or networking functions should not require external modules or libraries, but should be built into the main executable.

Have you ever considered changing the name?

The name describes well what it is: a new approach to Lisp for a new age of programming and a new way to interact with computers. A scripting language is like a scratchpad; it helps you to develop thoughts. Such a language should not be overloaded with concepts developed about computer programming in CS (Computer Science) but with little value in practice. My biggest heroes in computer programming are people like John Kemeny and Thomas Kurtz (BASIC), Larry Wall (Perl) and Guido van Rossum (Python). They all showed that programming is not an elitist endeavor indulging in pure CS theories, but a way for solving practical problems on a computer by practical people. In this way, newLISP is a new way to look at Lisp, a way which has attracted people from domains normally not associated with programming, like artists and writers.

So the ‘new’ in ‘newLISP’ refers just as much to the type of people that use it – i.e., not CS people necessarily, and possibly non-programmers as well – and the areas they use it in, rather than a new version of the Lisp language?

Yes, a new scripting-like approach to Lisp attracts new and different kinds of people to try it, for new and different types of application. The same has happened to other scripting languages. In the beginning of the web, people wrote CGI programs in C, but pretty quickly realized that Perl would be a better choice for this new type of application, and graphic designers started to program.

Why did Kozoru adopt newLISP?

All parties found in newLISP what they were looking for, and found their way into the language quickly. Nobody on the team had experience with using newLISP. Only one person had used Common Lisp before. Some of the programmers came from companies related to AI (Artificial Intelligence), others came from companies working in networking and security. They felt that Lisp was more appropriate for the class of problems they were trying to solve (natural language processing). They liked newLISP because of its small size, scripting character, and the ease of programming network-related tasks in a distributed application.

They built a distributed system, running on several dozen nodes, analyzing and ranking sentences of internet content as matches for questions. The guts of the system were built in less than a month. newLISP was used for almost all parts in the system, the core logic of parsing and processing language content, as well as administrative tasks like monitoring and logging nodes and updating them with new code.

I believe this was the first time newLISP was used in a programming team and by programmers not familiar with the tool previously. The project prompted many of the changes and additions to newLISP during 2005 and 2006.

There are many more examples of individuals solving individual problems using their computers. Like a writer customizing his editor with newLISP scripts. A musician controlling MIDI equipment or a composer scripting his web site.

Was the Kozoru project a success, technically?

The project was successful in porting an existing application – written in a well-established scripting language – to newLISP in a short time, with many times faster performance and with fewer lines of code. Over the course of the project, the direction changed many times, going from the web to chat and to the mobile phone, and finally to the BYOMS approach of specialized, user-customized search buttons, while at the same time new core algorithms were being plugged into the system.

There’s no shortage of scripting languages these days. Perl, Ruby, and Python are often pre-installed, so how can newLISP tempt new users, with such riches already available?

Shorter, denser programs, less computing resources, more expressive power via functional abstraction when designing algorithms and a relevant API built into the language. Quick installation, all you really need is the ~200KB executable. Excellent introductory and reference documentation. Much quicker and easier to learn than other Lisps.

You’re very keen to keep the size of newLISP down – 240KB for the latest binary. My first computer had 16KB RAM, but computers these days aren’t short of disk and memory. Why the continued emphasis on code size?

Computers are evolving to multi-core architectures. Programs are evolving from big monolithic programs to distributed agent-based applications. You may want to run dozens or hundreds of instances of newLISP on the same computer. Being diligent and saving resources (of any kind) will be one of the major human challenges in this century. Less code, less errors, less memory, less CPU cycles, less energy, less resources.

What do you personally use newLISP for?

Everything on newlisp.org is programmed in the newLISP wiki, a content management system, and most of the administrative tasks, cron jobs, log analysis etc. are done with newLISP too. I frequently open a newLISP shell to develop ideas and try out something quickly.

For people to invest time and energy into learning a new language, they have to feel that their investment is safe. What does the future hold for the development of newLISP?

I’ve been working full time on newLISP since the end of 2002 and plan on doing so for many years to come. In the last few years, newLISP downloads have doubled every year. Since the 9.0 release in October 2006, the downloads curve has accelerated more and I believe the user base will have at least tripled by mid year compared to last year.

newLISP seems pretty complete as a language, particularly with large amounts of new functionality in recent releases. What’s your focus for the language in the near future – more libraries, or more core functionality?

Most core functionality is completed. Some minor holes still have to be filled in the API, but most of it is done. This year only one more major release will be done, probably in August. There are several applications I want to write using newLISP. I hope others will come forward to write libraries in specialized areas. It is time to show that newLISP can solve interesting problems. newLISP is a moving target – it is never finished, always changing, just like the fields of computers and software generally.

Who’s Don Lucio?

I went by this name when living in El Salvador during part of the 70s and 80s. It was an important time in my life.

What do you get up to when you’re not at your computer?

I either eat, go to sleep or meet people ;-)

Many thanks to Lutz for taking the time to talk to me!

March 24, 2007

and counting

Filed under: newLISP — newlisper @ 19:18

Here’s an unusual sight on this blog: a Perl script. This one counts the number of times the word “and” appears in Sir Arthur Conan Doyle’s great Sherlock Holmes novel, “The Hound of the Baskervilles”, which I’ve downloaded in plain text form from Project Gutenberg.

#!/usr/bin/perl -w
@ARGV = ("/Users/me/hound-of-baskervilles.txt");
$count = 0;
while (<>) {
    foreach my $wd (m/\band/ig) {
      ++$count;
      }
    }
print $count,"\n";

This works OK, and it’s quick too, taking about 75 milliseconds on my machine to produce the answer 1361.

(Perl’s pretty handy stuff, really, and I’ve written some in my time, although I confess I’ve never really got on well with it. It looks like there’s been an explosion in a punctuation mark factory. But Larry Wall, Perl’s creator, is a cool guy, and funny, too:

Lisp has all the visual appeal of oatmeal with fingernail clippings mixed in.

Python’s syntax succeeds in combining the mistakes of Lisp and Fortran. I do not construe that as progress.

If I don’t document something, it’s usually either for a good reason, or a bad reason. In this case it’s a good reason.

No, I’m not going to explain it. If you can’t figure it out, you didn’t want to know anyway…

and there’s plenty more!)

Anyway, what about a “fingernail clipping in oatmeal” version?

  (set 'path "/Users/me/hound-of-baskervilles.txt")
  (set 'file (open path "read"))
  (set 'count_ 0)
  (while (read-line file)
    (set 'line (parse (current-line) "\\W" 1))
    (dolist (i line)
      (if (= i "and")
          (inc 'count_))))
  (println count_)

Hmm, it works, but it doesn’t look good. And it’s slow, too – about 1.2 seconds. Each line of the file is parsed, and split at non-word characters, and then the resulting list of words is tested, one by one. Plainly it’s not a good approach.

Perhaps it would be better to make use of newLISP’s count function.

; file and path are the same as before...
(set 'count_ 0)
(while (read-line file)
   (set 'line (parse (current-line) {[[:^alpha:]]} 1))
   (inc 'count_ (first (count '("and") line))))
(println count_)

Looks a bit better, and a bit more Lisp-y, but still about 1.3 seconds. I’ve used a POSIX-style character class as well, just for a change.

Is it the parse that’s taking some time? Let’s try not parsing strings into lists:

; file and path are the same as before...
(set 'count_ 0)
(while (read-line file)
  (replace "\\Wand\\W" (current-line) (inc 'count_) 1))
(println count_)

This uses replace with an active replacement expression to increase the counter each time the word is found. It looks more compact, but it’s still too slow, at just over a second.

Or perhaps it’s not parsing that’s the problem, but the way we read the file line by line? Perhaps Perl is faster at that than we are. So let’s not do that; we’ll parse the whole file in one go:

; file and path are the same as before...
(println
  (count '("and") (parse (read-file path) {\s+} 1)))

Ah. This is 0.6 seconds now, and it looks quite good too. For a change, I used curly brackets to tell parse to split the text up at white space. Doing files in one go like this might give us problems with extremely large documents. Having said that, I tried it without problems on a 4 million word (20 MBytes) document, so it’s OK for smaller stuff such as “War and Peace” or the Bible, which both weigh in at less than 1 million words each.

I still think that parse is doing more work than necessary – after all, we don’t need the resulting list for further processing; we just want the number of occurrences of the string. So what about a parse-less whole-file version:

; path is the same as before...
(set 'count_ 0)
(replace "\\Wand\\W" (read-file path) (inc 'count_) 1)
(println count_)

Now we’re down to 100 milliseconds, within sight of our Perl target. Let’s combine the successful elements from our previous attempts to get an idiomatic and fast solution:

; path is the same as before...
(println (length (find-all "(?i)\\band" (read-file path))))

Now it’s less than 50 milliseconds, and an easy to read one liner, too. The (?i) is another way to specify case-insensitivity; with find-all the pattern of arguments is different, so it’s useful to put the option there rather than at the end. And there are fewer fingernail clippings than in the Perl version (if you include curly braces)! ;-)

I’m sure the Perl script can be made faster (and punctuation-free) too – but that’s a job for a Perl-lover… Besides, this isn’t really a post about performance, but about my ongoing search for idiomatic, Lisp-y solutions, which are likely to be shorter, faster, more reliable – and perhaps easier to write too, eventually.

By the way, to time newLISP expressions, use time – see Time Flies for another post about timing expressions.
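
As a rough sketch of that kind of timing, the snippet below wraps the find-all one-liner in time, which reports the elapsed time in milliseconds. The sample text and the symbol names (text, n, elapsed) are made up for illustration so that the snippet runs without the novel; the figure you get will depend on your machine.

```newlisp
; hypothetical sample text, repeated to give the regex some work to do
(set 'text (dup "And so the hound and the man ran over the sand. " 10000))

; time evaluates the expression and returns the elapsed milliseconds;
; the match count is kept in n so that it can be printed afterwards
(set 'elapsed (time (set 'n (length (find-all "(?i)\\band" text)))))

(println n " matches in " elapsed " ms")
```

To time the real thing, replace text with (read-file path).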

And I ought to say that the best newLISP solution was provided by Mr newLISP himself. Thanks, Lutz!

March 16, 2007

Idle chatter

Filed under: newLISP — newlisper @ 16:03

I once knew a bloke who had the disconcerting habit of interrupting a conversation without warning and staring into space for 30 seconds or so. It looked as if he was searching for inspiration, but he could well have been downloading information from the mother ship. Talking to him wasn’t always easy, or a pleasure.

I was reminded of him recently when I started using Internet Relay Chat (irc) to access the #newlisp channel. Chatting on irc is odd. You know that there’s someone at the ‘other end’, but the ‘connection’ feels peculiar. Questions and answers sometimes hang in the air for minutes – as if slowly floating through the thick atmosphere of a gravitationally-dense planet. You can’t tell whether the people you’re chatting with are still at their keyboards working on a reply, or have wandered off to do something more interesting.

You meet all kinds of people on irc channels. One chap was hanging around on the #newlisp channel with the sole aim of trying to persuade people not to use newLISP. That’s a bit weird. Imagine finding someone loitering in a shop hoping to prevent you buying something you wanted. Of course, the antics of these ‘Lisp supremacists’ are familiar enough now for us not to take any notice of them – they’re usually quite harmless, and they go away eventually. In this respect, I found a good post on Slashdot, from a character called Melquiades, that summarizes the view that many people have of these oddballs (who are probably not very representative of most Lisp users, by the way):

Yes, you too can become a fanatical Lisp user! Just trawl for any online discussion of any programming language that is not Lisp, then post using the following handy form:

Derogatory or condescending salutation. Quasi-religious statement of love for Lisp.

Laundry list of several nifty Lisp features. (It doesn’t really matter which ones.)

Implication or outright statement that every feature in programming language in question has already been implemented in Lisp. Subsequent dismissal of language in question.

Remember, in writing your post, it is essential that you adhere to the following guidelines:

  • Never show any respect for a non-Lisp language.
  • Never admit the usefulness of new experiments, or of personal exploration.
  • Never contribute concrete, constructive suggestions to the designers or users of any other language.
  • Never, never think outside the Lisp box.

But back to irc and newLISP.

With some help from newLISP gurus (Norman, you know who you are!), I’ve been experimenting with some newLISP code to access irc in various ways. Here’s one attempt at an extremely simplistic irc client. Note in passing the lack of error-checking, a characteristic feature of these posts :-)

First we’ll open access to TCP port 6667 and connect to the irc server that hosts the #newlisp channel:

#!/usr/bin/newlisp
(set 'server (net-connect "irc.freenode.net" 6667))

Now we’ll log in to irc and join the #newlisp channel. Here I’ve used the username and nickname ‘newlithper’ – choose your own (and make sure it’s unique!):

(net-send server "USER newlithper 0 * :XXXXXXX\r\n")
(net-send server "NICK newlithper \r\n")
(net-send server "JOIN #newlisp\r\n")

Once we’re connected, we’ll get loads of text back from the server, consisting mainly of notices and also a biography of a leading figure in contemporary culture – cool. We’ll get this from the server via a buffer and print it, looking for a line containing 366, which apparently marks the end of the beginning, if you see what I mean:

(until (find "freenode.net 366" buffer)
  (net-receive server 'buffer 8192 "\n")
  (print buffer))

I don’t know why the maximum byte value is 8192, but the number looks plausible.

Now we’re connected and ready to ‘chat’. This is where I’ve been trying out a technique that allows me to do two sets of jobs at once. The first set consists of checking for new messages, responding to a periodic ‘keep alive’ exchange (called, reasonably enough, ‘ping-pong’), and handling a few errors. The second set consists of sending messages and commands.

So there are two while loops. They can share a connected flag; each loop can read and set this flag:

(set 'connected (share))
(share connected true)

This syntax is basically the equivalent of (set 'connected true), used for symbols that are to be shared between threads.
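
The reason for going through share is that fork starts a child process with its own copy of ordinary symbols, so a plain set in one process is not seen by the other. Here is a minimal sketch of the idea, for Unix-like systems only; the names flag and pid are illustrative, not part of the client.

```newlisp
; allocate a piece of shared memory and store true in it
(set 'flag (share))
(share flag true)

; the child process clears the flag and then finishes
(set 'pid (fork (share flag nil)))

; wait for the child to end, then read the flag back in the parent
(wait-pid pid)
(println "flag after child ran: " (share flag))
```

If the shared memory is working as described, the parent should see the value written by the child rather than the true it stored itself.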

The first set of jobs is done in a separate thread:

(fork
  (while (share connected)
    (cond
      ((net-select server "read" 1000) ; read the latest
          (net-receive server 'buffer 8192 "\n")
          ; output in green, then back to white
          (print "\n\027[0;32m" buffer "\027[0;0m"))
      ((regex {^PING :(.*)\r\n} buffer) ; play ping-pong
          (net-send server (append "PONG :" (string $1 ) "\r\n"))
          (sleep 5000))
      ((net-error) ; error
          (println "\n\027[0;32m" "UH-OH: " (net-error) "\027[0;0m")
          (share connected nil)))
   (sleep 1000)))

This runs happily in its own thread, regularly checking the shared ‘connected’ flag to see if it should continue. I’ve made output appear in green in my terminal.
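
The ping-pong branch can be tried on its own, without a server. This is only a sketch: the buffer contents and the host name irc.example.net are invented for the purpose, and the reply is collected in a symbol called reply instead of being sent with net-send.

```newlisp
; a made-up keep-alive line, as a server might send it
(set 'buffer "PING :irc.example.net\r\n")

; the same pattern as in the loop: capture whatever follows "PING :"
(if (regex {^PING :(.*)\r\n} buffer)
    (set 'reply (append "PONG :" $1 "\r\n"))
    (set 'reply nil))

(print reply)
```

The captured text in $1 is echoed back after PONG, which is all the server expects from the exchange.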

The other while loop looks at the first character of any user input submitted via read-line. If it’s a “/”, it’s assumed to be an irc command. It’s definitely not a good idea to use commands that generate lots of output – since the other thread will work through the responses quite slowly (one line a second). So don’t try a “/list” command! Useful commands might be:

/names #newlisp
/whois cormullion
/time
/help

Here’s the loop:

(while (share connected)
   (sleep 1000)
   (set 'message (read-line))    ; get user input
   (cond
     ((starts-with message "/")  ; a command?
          (net-send server (append (rest message) "\r\n"))
          (if
            (net-select server "read" 1000)
            (begin
                (net-receive server 'buffer 8192 "\n")
                (print "\n\027[0;32m" buffer "\027[0;0m"))))
     ((starts-with message "quit") ; quit
            (share connected nil))
     (true  ; just send input as plain message
          (net-send server (append "PRIVMSG #newlisp :" message "\r\n")))))

With any luck, this code should allow us to send messages and give commands, while the other thread checks the channel for new messages. We quit by typing the command ‘quit’ at the start of a message. This sets the flag and thus terminates both while loops.

Finally, we should tidy up:

(println "finished; closing server")
(close server)
(exit)

So far this code seems to work fairly well – but I haven’t been testing it that much, mainly because I much prefer using something like Colloquy (MacOS X), which is an extremely well-designed and useful app. Perhaps I should be using semaphores to control access to shared memory, but I haven’t yet managed to understand the description of semaphore in the newLISP manual. :-)

But whatever irc client you use – if you have any interest in newLISP, please call in at the #newlisp channel occasionally. You don’t have to say anything if you don’t want to – just watch those speech balloons hang in the air…!

March 4, 2007

Out of the crypt

Filed under: newLISP — newlisper @ 23:39

I enjoyed reading Ax0n’s article on encryption over at HIR Information Report, in which he demonstrates how to use newLISP’s encrypt command.

I don’t know much about encryption and security, so I usually rely on the built-in tools that are provided by the operating system – on the Mac that means the Keychain and encrypted disk images, which have always worked well for me – but then, I’m not a spy or a terrorist, and I’ve never had my computer stolen or confiscated by the Government.

I often think, though, that it would be useful to have some simple encryption tools for use in more casual situations. I’m not fanatical about keeping secrets, but there’s definitely room for some tools that would stop the casual snooper or occasional visitor being able to get access to some of my more personal stuff, such as passwords and payment information. I doubt whether anything too simple and easy would deter a skilful and determined hacker or a Government agency. But you could always lose your computer, or have it stolen, and I suspect that a typical thief wouldn’t necessarily be a skilled code-breaker. So how about using newLISP to do a bit of on-the-fly encoding, to keep out the less-skilled nosy parker?

On the bottom of this web page you can see the text:

If you want to contact me, run the following through newLISP to see my email address:

(encrypt "TH^VDZJmD[N07YBD" ":-)")

This is basically all there is to using the encrypt command – you supply the string to be encrypted or decrypted, and a key. Since I’ve provided the key here, you can decrypt this easily by evaluating this expression in newLISP.

This function uses the XOR method, which is described in “Cryptography for Dummies” (I got it from the library this week) as a toy rather than a useful encryption tool. However, I can’t see that it’s particularly easy to crack – if you’ve just picked up someone’s laptop, you’re not going to be able to read a message encoded like this, at least, not immediately.
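
Because XOR is its own inverse, the same call both encrypts and decrypts. A small sketch of the round trip, using the key from the example above and a made-up message (the symbol names are illustrative):

```newlisp
(set 'key ":-)")                    ; the key used in the example above
(set 'plain "meet me at noon")      ; hypothetical message

; the first pass scrambles the text, the second pass with the same key restores it
(set 'secret (encrypt plain key))
(set 'restored (encrypt secret key))

(println (= plain restored))
```

This is also why there is no separate decrypt function: whoever has the key just runs encrypt again.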

I thought it would be cool to write MacOS system services to encrypt and decrypt selected text (I’ve been writing about them recently). Here’s the encryption service:

#!/usr/bin/newlisp
(set 'raw-crypt-key (exec {osascript -e "tell application \"Finder\"
  activate
  set p to (display dialog \"Enter your key\" default answer \"\" giving up after 10 with hidden answer)
  if button returned of p is \"OK\" then
    text returned of p
  end if
end tell"})
)
(unless (string? (set 'crypt-key (first raw-crypt-key)))
    (exit))
(while (read-line)
  (println (base64-enc (encrypt (current-line) crypt-key))))
(exit)

The bulk of this script is just some AppleScript to get the key. The ‘with hidden answer’ option for display dialog does that “bullets instead of characters” thing, to give the illusion of secrecy.

The decryption service is almost identical, of course, although we want to apply base64 decoding first, before running encrypt:

(println (encrypt (base64-dec (current-line)) crypt-key))
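
The base64 step is there because the raw XOR output can contain control characters (a space XORed with a hyphen gives a carriage return, for instance), which don't survive being pasted back into a document. Here is a sketch of one line making the full round trip; the key and the sample line are invented for illustration.

```newlisp
(set 'crypt-key ":-)")                     ; hypothetical key
(set 'line "a secret line of text")        ; hypothetical selection

; encrypting service: XOR first, then make the result printable
(set 'encoded (base64-enc (encrypt line crypt-key)))
(println encoded)

; decrypting service: undo the base64 first, then XOR with the same key
(set 'decoded (encrypt (base64-dec encoded) crypt-key))
(println decoded)
```

The order matters: decoding has to reverse the steps of encoding, last one first.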

These services can be easily used in any application, such as a diary or notebook: just select some text, type the key when prompted (and remember it!) and just the selection is encrypted or decrypted. Let’s try it on the next paragraph:

cw1NX05AXkhNGllGGlhaXw0DWExaXxsdF0hHWQcJW14JTUhFVg1LX05IT15MGgdMVE5bQ11dEA1KW0MJXUhHX19ITkgJSVlbW0NOXw0=
WUVISExKTkhbSQEJW0NNGllBX15MGkBAXUVdGlhZSUhdGllBXw1ISl1FU05ITkRGVA1QVVgOSEgJTUJbUURHXQ1AVAM=

Yes – that looks sufficiently cryptic to scare off the less determined snoopers, and the key is easy to remember :-) Of course, by using short keys that you can remember (and enter) easily, you’re compromising security. And doing each line separately is probably a big weakness, too. But then, if you’re paranoid about keeping secrets, you’ll be using some proper heavy-duty encryption tools instead. I figure that this level of encryption is enough to baffle the typical unwelcome visitor to your secret files!

March 3, 2007

The end of the line for this service

Filed under: newLISP — newlisper @ 18:14

I’ve been having a few problems with the services that I’ve been writing for use with ThisService.

The problem is basically that, on the Mac, applications use either carriage returns or line feeds to terminate the lines of a text file. In the days of the Classic Mac OS (which was basically the last decade of the previous century), each text line ended with a Carriage Return (CR) – hex 0D, “\r”. In the current MacOS X, like other Unix flavours, lines end with a Line Feed (LF) – hex 0A, “\n”.

Perhaps uniquely, BBEdit and TextWrangler use CRs for ending lines while your document is being edited, and convert to and from your preferred line ending (probably LF, but could also be CRLF or CR) when you open or save a document. This isn’t normally a problem, but it becomes one when you write scripts intended to be used as services. The service reads the line endings from the application rather than from the file on disk, and therefore reads CRs from BBEdit rather than LFs from the file. So the basic number-lines service to number lines:

(set 'counter 0)
(while (read-line)
  (println (inc 'counter) {: } (current-line)))
(exit)

will, when run in BBEdit, generate only a single number (1), and remove all the line endings from the file as well, thus turning this:

one two three
four five six
seven eight nine

into this:

1:   one two three  four five six  seven eight nine

Here, read-line isn’t looking for “\r”, and so reads the entire selection at once.

This service works fine in most applications because the line endings are LFs. (The reason it doesn’t work in Mail is because Apple haven’t fixed a long-standing bug in its support for services, and the reason it doesn’t work in Microsoft Word escapes me.)

It shouldn’t be too hard to write a service that works with both CR- and LF- delimited text, though. Here’s what I’ve got so far, with some help from Lutz (of course):

(set 'counter 0 'contents "")
(while (read-buffer 0 'buff 1000000 "\r")
  (set 'contents (append contents buff)))
; buffer might have a bit left over...
(unless (nil? buff)
  (set 'contents (append contents buff)))
(replace "\r" contents "\n")
; number lines
(dolist (l (parse contents "\n"))
  (println (inc 'counter) { } l))
(exit)

This reads in characters either in chunks delimited by CR, or a million characters at a time, and stores the lines in a big string, which can then be changed to use LFs. This, so far, seems to work with all the applications I’ve tried it with, including BBEdit. I’ll be testing it some more in due course.
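
The conversion step can be tried separately on a string, without any service or standard input. This is a sketch with made-up sample text; converting CRLF pairs before lone CRs is an addition of mine (so that Windows-style endings don't produce blank lines) and is not in the script above.

```newlisp
; sample text with Classic Mac OS line endings (CR only)
(set 'contents "one two three\rfour five six\rseven eight nine")

; normalize CRLF pairs first, then any remaining lone CRs
(replace "\r\n" contents "\n")
(replace "\r" contents "\n")

; split at LF and number the lines; $idx counts from 0
(set 'lines (parse contents "\n"))
(dolist (l lines)
  (println (+ $idx 1) ": " l))
```

With the sample text this should give three numbered lines instead of the single long line shown earlier.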

I should say here that BBEdit has its own Unix tools menus, so strictly speaking you could ignore this problem altogether and keep a separate version of each script you write, one for use in BBEdit and a global one for use with all LF-using applications. But, to be honest, that seems an untidy solution.
