Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Does anyone recommend any resources for learning how to write handwritten parsers nowadays? I’ve written a couple simple ones for various tasks before, but I’d really love to learn more about it.


Learn what lexing is. That's step one. Once you have a token stream, look into how you'd transform them into structures.

PUB FN ID(main) OPAREN CPAREN COLON ID(i32) OBRACE NL RETURN NUMBER(0) CBRACE

Look familiar? How might you parse that into something like

{ kind: "function", name: "main", args: [], public: true, return_type: { kind: "integer_type", width: 32 }, body: [ { kind: "return", expression: { kind: "number_literal", value: 0 } } ] }

A recursive descent parser is just a parser that starts at a root node type and recursively traverses a token stream and calls handler functions based on what's next. Those functions might also call other handlers (e.g. for nested expression parsing, etc.) which is why it's called recursive.


Actually don’t. Knowing how to write parsers without a lexer (scanner less parsers) is the true superpower.


Fully disagree. Care to expand?


Scanner less parsers are strictly more powerful than parsers using fixed lexers. You can literally change the language you are parsing based on what you have parsed so far. I have code running in production doing exactly that.


Generally such languages are so abstract that they lose usefulness in practice. Because something can be done, doesn't mean it should. Writing a simple recursive descent parser with a Lexer's token stream is much easier than doing bytestream ingestion on the fly, especially for a beginner.


You can even parse a definition of a parser and then continue parsing the rest of the code using that parser. Not something I have needed in production but it shows the power of non-lexing parsers.


Downvoting without giving a reason why is not useful.


This book is great: https://interpreterbook.com/

It is sort of a walk-through of a small programming langauge written in go, using a recursive descent parser.

Crafting Interpreters is good as well, though I only read part of it, because it wasn't done at the time I read it. https://craftinginterpreters.com/

I've also interviewed both of them, if you like an audio intro: https://corecursive.com/032-bob-nystrom-on-building-an-inter...


Just read the code for an existing one like:

https://github.com/dlang/dmd/blob/master/src/dmd/cparse.d

which is a C parser. It's not hard to follow.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: