I wouldn't say it was entirely unintentional :-). I was definitely aware that "mal" could mean "evil" when I named it. It was a bit more apropos when mal only had a single implementation, in the GNU Make macro language.
Mal uses a regex for lexing/tokenizing. I didn't want people to get hung up on the lexing step (my university compilers class spent 1/3rd of the semester just on lexing). It's certainly a worthwhile area to study but not the focus of mal/make-a-lisp.
It's a long regex, but it's just whitespace followed by an alternation with 5 different types of token: splice-unquote, special characters, strings, comments, symbols. The string tokenizing branch is a bit complicated because it has to allow internal escaping of quotes. Early iterations of the guide didn't explain the regex in detail, but the section now describes each of the regex components.
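To make the five branches concrete, here is a minimal Python sketch of a regex tokenizer with that shape. The pattern is written from the description above rather than copied from the guide, so treat the exact character classes, and the names `TOKEN_RE` and `tokenize`, as assumptions for illustration.

```python
import re

# Leading whitespace (commas are assumed to count as whitespace),
# followed by one capture group holding an alternation of five token types.
TOKEN_RE = re.compile(r"""
    [\s,]*                      # skip whitespace and commas
    (
        ~@                      # 1. splice-unquote
      | [\[\]{}()'`~^@]         # 2. single special characters
      | "(?:\\.|[^\\"])*"?      # 3. strings, allowing escaped quotes inside
      | ;.*                     # 4. comments, up to the end of the line
      | [^\s\[\]{}()'"`,;]+     # 5. symbols, numbers, and everything else
    )
""", re.VERBOSE)


def tokenize(source):
    """Return the list of tokens in `source`, dropping comment tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        token = match.group(1)
        if not token.startswith(";"):
            tokens.append(token)
    return tokens


print(tokenize("(+ 1, 2)"))
print(tokenize('(def! s "a \\"b\\"") ; note'))
```

The string branch is the only subtle one: `\\.` consumes a backslash together with whatever follows it, so an escaped quote never ends the token, and the trailing `"?` lets an unterminated string still come out as a token that a later stage can report as an error.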
- There are several reasons for the weird way that the output and expected messages are printed. One of the main reasons is that in order to enable more flexible test case results, the expected output is a regex rather than a plain string, while the output is just a plain string. There is probably a way this could be made clearer. I'll add it to my TODO list to look at.
- Do you have some specific examples? The test cases are marked as either deferrable or optional. You shouldn't ever have to implement optional (and if so that's a bug in the tests or the guide). I think the deferrable items are marked pretty clearly in the guide where they become mandatory in later steps. If it's not clear, then that's a bug.
- This is one of the tensions that exists with trying to make the guide incremental; later features may require re-work of earlier functionality. I do try to minimize that as much as possible, although I've found it can really vary depending on the nature of the target language. Note that the primary goal of mal/make-a-lisp is pedagogical (as opposed to, say, "the easiest way to make your own Lisp"). So sometimes the need to go back and re-work something is in line with that goal.
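On the first point above (expected output being a regex while the actual output is a plain string), the comparison can be sketched roughly as follows. This is not the actual mal test driver; the function names and the message layout are hypothetical and only illustrate why the two values are displayed differently.

```python
import re


def output_matches(actual, expected_pattern):
    """Return True when the whole of `actual` matches the regex `expected_pattern`."""
    return re.fullmatch(expected_pattern, actual) is not None


def describe_mismatch(actual, expected_pattern):
    """Build a failure message that labels which value is a regex and which is a string."""
    return (
        f"Expected (regex) : {expected_pattern!r}\n"
        f"Got (string)     : {actual!r}"
    )


# A regex lets one test case accept output that varies between implementations.
print(output_matches("#<function 0x1f>", r"#<function.*>"))
print(describe_mismatch("33", r"3"))
```

Labelling the expected value as a pattern in the failure message is one way to make it clearer why it can look different from the literal output.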
If you have any concrete guide text or test driver improvements (especially that further the pedagogic goals of mal), I'm always happy to review pull requests! :-)
Thanks for your answer. I probably meant deferrables rather than optionals that were de facto required. Sorry, I don't have a concrete example handy; it's been a while.
The steps and main files are the same for every implementation (that's part of the requirements for merging into the main tree). Some implementations have additional files like readline, utility routines, etc. But for the most part the general structure and file divisions are very similar.
Every implementation has a stats target that gives byte counts, LOCs, and comments that is specific to that implementation. I.e. from the top-level you can run the following to get stats for every language:
It started with the question "Could you implement a Lisp using just GNU Make macros?" (Hint: Yes.) Bit of trivia: Mal originally stood for MAke-Lisp. Conveniently, the acronym didn't need to be changed for "Make-A-Lisp".
It then grew into a personal tool for learning new languages. As I implemented more languages, I refactored the structure to make it more portable and incremental. At some point I realized that it might be an interesting learning tool for others, so I wrote an early version of the guide and had a friend work through it and give feedback. He liked it enough that he immediately did a second implementation.
At some point the project got some Twitter/HN attention and other people began contributing implementations and feedback for the guide.
Mal also serves as a programming language "Rosetta Stone". A bit like rosettacode.org but using a full working project.
Implementation size is hard to compare accurately across languages (bytes? lines of code? excluding comments? excluding leading spaces?). Also, the mal implementations were created by many different people, so individual style plays a large role in concision/verbosity.
That said, the following implementations are "smallish" (in both lines of code and bytes): mal itself (self-hosted), Clojure, Factor, Perl 6, Ruby. The following are "largish": Awk, PL/pgSQL, C, PL/SQL, ChucK, Swift 2, Ada.
OCaml is in the "smallest" third of the implementations.
BYOL is targeted at implementing Lisp in C and focuses on learning C. There is a fair bit more hand-holding and example code in C for most of the steps than in the mal guide. BYOL is more polished and written in the form of a short book whereas the mal guide intentionally tries to be more concise and informal in style. I would read through the first couple of sections of both and decide based on your own goals and preferences.
I look forward to your pull request! One challenge you'll run into is that (as I recall) the Boehm GC linkage no longer works with newer versions of Boehm.