Goal: Introduce and demonstrate Eclector, mainly from a user's perspective.
The Common Lisp reader is not a single object or concept but a rather loose collection of things described in chapter 2 of the specification.
Conceptually | Technically |
---|---|
Characters, syntax types | set-syntax-from-char |
Readtable, variables | *readtable*, *read-eval*, *read-base*, … |
Reader algorithm | read, read-preserving-whitespace, read-from-string |
Token interpretation | ? (parse-integer belongs to numbers) |
Standard macros | get-macro-character, get-dispatch-macro-character, … |
Control and customization of some aspects through readtables and variables.
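As a small illustration of this kind of control, here is a sketch that uses only standard Common Lisp: one function rebinds *read-base*, the other installs a macro character in a private copy of the standard readtable. The helper names read-hex and read-with-bang and the choice of ! as macro character are made up for this example.

```lisp
;; Reader variable: integer tokens are interpreted in base 16.
(defun read-hex (string)
  (let ((*read-base* 16))
    (read-from-string string)))

;; Readtable: in a copy of the standard readtable, make !FORM read as
;; (NOT FORM). The global readtable is left untouched.
(defun read-with-bang (string)
  (let ((*readtable* (copy-readtable nil)))
    (set-macro-character
     #\! (lambda (stream char)
           (declare (ignore char))
           (list 'not (read stream t nil t))))
    (read-from-string string)))

(read-hex "ff")          ; => 255
(read-with-bang "!x")    ; => (NOT X)
```

Everything beyond such readtable and variable tweaks (token interpretation, for instance) has no standard hook.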
So, the name …
The Eclector was the main ship used by Yondu's clan of Ravagers serving as a port to the M-ships. https://marvelcinematicuniverse.fandom.com/wiki/Eclector
I don't know how people could miss such an obvious reference. I mean, with beach talking about comics all the time and everything.
But seriously: Eclector, Eclector.
Create an implementation of the Common Lisp reader algorithm and surrounding machinery that
Non-goal: support for loading an entire file (more on that later)
Implementing a specification is great! … but our favorite one doesn't cover everything. For example, which of the following are valid?
#S (foo)          ; A proper list of a structure type name and initargs must follow #S.
#C (1 2)
#C#|foo|#(1 2)
#C#+true(1 2)
#C#.(list 1 2)
`#',foo           ; Unquote is illegal in the function reader macro.
code directory | 4,124 lines |
test directory | 2,930 lines |
cost to make | $ 433,634 |
texinfo-based with homegrown syntax highlighting.
Side note: There is clear demand for "Eclector but at the s-expression syntax level".
At the character level, there are many possible ways of violating the specified Common Lisp syntax:
(1 2 3            ; While reading list, expected the character ) when input ended.
::foo             ; A symbol token must not start with two package markers as in ::name.
#\Hyper           ; Unrecognized character name: "Hyper"
#10R89b76         ; The character b is not a digit in base 10.
`(:foo ,)         ; An object must follow a unquote.
To produce good error messages for as many of those as possible, Eclector can specifically detect around 99 kinds of syntax errors for which it has corresponding condition types. Such as:
For some applications, it is important to continue processing source code after encountering errors.
(defun foo (x y)
  (+ x #b0020101 (code-char #\Retrn) '(,(frob::bar y)) #1#)
;; #b002           The character 2 is not a digit in base 2.
;; #\Retrn         Unrecognized character name: "Retrn"
;; tab before '    Avoid tab.
;; ,               Unquote not inside backquote.
;; frob::bar       Do not use unexported symbols.
;; #1#             Reference to undefined label #1#.
;; end of input    While reading list, expected the character ) when input ended.
The eclector.reader:recover restart (and convenience function of the same name) can be used to recover and continue reading after most syntax errors.
The simplest way of making a recovering reader is via the eclector.reader:recover convenience function:
(handler-bind ((error #'eclector.reader:recover))
  (print (eclector.reader:read-from-string "`(::foo ,")))
(ECLECTOR.READER:QUASIQUOTE (:FOO (ECLECTOR.READER:UNQUOTE NIL)))
For each error, this prints the error message, prints the restart description and invokes eclector.reader:recover:
(handler-bind ((error (lambda (condition)
                        (let ((restart (find-restart 'eclector.reader:recover)))
                          (format t "Recovering from error:~%~2@T~A~%using~%~2@T~A~2%"
                                  condition restart))
                        (eclector.reader:recover))))
  (print (eclector.reader:read-from-string "`(::foo ,")))
Recovering from error:
  A symbol token must not start with two package markers as in ::name.
using
  Treat the character as if it had been escaped.

Recovering from error:
  While reading unquote, expected an object when input ended.
using
  Use NIL in place of the missing object.

Recovering from error:
  While reading list, expected the character ) when input ended.
using
  Return a list of the already read elements.

(ECLECTOR.READER:QUASIQUOTE (:FOO (ECLECTOR.READER:UNQUOTE NIL)))
Condition and restart reports use the Acclimation library:
(let* ((language (make-instance 'acclimation:german))
       (acclimation:*locale* (make-instance 'acclimation:locale :language language)))
  (handler-bind ((error (lambda (condition)
                          (let ((restart (find-restart 'eclector.reader:recover)))
                            (format t "Behandle Fehler~%~2@T~A~%durch~%~2@T~A~2%"
                                    condition restart))
                          (eclector.reader:recover))))
    (eclector.reader:read-from-string "`(::foo ,")))
Behandle Fehler
  Ein Symbol darf nicht mit zwei Paketmarkierungen beginnen wie bei ::name.
durch
  Behandle die Zeichen als maskiert.

Behandle Fehler
  Beim Lesen eines Antizitats wurde ein Objekt erwartet als die Eingabe endete.
durch
  Verwende NIL anstelle des fehlenden Objekts.

Behandle Fehler
  Beim Lesen einer Liste wurde das Zeichen ) erwartet als die Eingabe endete.
durch
  Erstelle eine Liste bestehend aus den bereits gelesenen Elementen.
Background image: Public Domain, https://en.wikipedia.org/w/index.php?curid=33285421
Architecture Idea 1
Express all operations performed by the reader as a set of protocols in which each generic function accepts a client parameter.
Examples
(defgeneric eclector.reader:interpret-symbol-token (client input-stream
                                                    token
                                                    position-package-marker-1
                                                    position-package-marker-2))

(defgeneric eclector.reader:evaluate-feature-expression (client feature-expression))
Threats and extension points for corresponding mitigation:
eclector.reader:evaluate-expression
eclector.reader:make-structure-instance
eclector.reader:interpret-symbol
Writing a simple sandboxed reader using the mentioned methods:
eclector.reader:evaluate-expression
eclector.reader:make-structure-instance
eclector.reader:interpret-symbol
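A minimal sketch of such a sandboxed reader, specializing the three generic functions listed above on a client class. The class name sandbox-client and the helper sandboxed-read-from-string are hypothetical; the lambda lists and the use of eclector.base:*client* to select the client reflect my understanding of Eclector's protocol and should be checked against the version in use.

```lisp
;; Assumes Eclector is already loaded, e.g. via (ql:quickload "eclector").
(defclass sandbox-client () ())

;; #.EXPRESSION: refuse read-time evaluation.
(defmethod eclector.reader:evaluate-expression ((client sandbox-client) expression)
  (error "Read-time evaluation of ~S is not allowed." expression))

;; #S(NAME ...): refuse to instantiate structures.
(defmethod eclector.reader:make-structure-instance
    ((client sandbox-client) name initargs)
  (declare (ignore initargs))
  (error "Creating an instance of structure ~S is not allowed." name))

;; Symbols: accept only unqualified symbols that are already accessible
;; in *PACKAGE*. Nothing is ever interned.
(defmethod eclector.reader:interpret-symbol
    ((client sandbox-client) input-stream package-indicator symbol-name internp)
  (declare (ignore input-stream internp))
  (unless (eq package-indicator :current)
    (error "Only unqualified symbols are allowed, got ~A." symbol-name))
  (multiple-value-bind (symbol status) (find-symbol symbol-name *package*)
    (if status
        symbol
        (error "Symbol ~A does not exist in package ~A."
               symbol-name (package-name *package*)))))

;; Hypothetical entry point: read with the sandbox client in effect.
(defun sandboxed-read-from-string (string)
  (let ((eclector.base:*client* (make-instance 'sandbox-client)))
    (eclector.reader:read-from-string string)))
```

With this in place, (sandboxed-read-from-string "(list 1 2)") should read normally as long as LIST is accessible in the current package, while inputs such as "#.(list 1 2)" or "cl-user::foo" signal an error. This particular policy also rejects keywords, which a real sandbox would probably want to allow.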
Architecture Idea 2
Notify the client when non-objects are encountered in the input and provide a read-style function that does not skip over them.
Examples
(defgeneric eclector.reader:note-skipped-input (client input-stream reason))

(defgeneric eclector.reader:call-as-top-level-read (client thunk input-stream
                                                    eof-error-p eof-value
                                                    preserve-whitespace-p))

(defgeneric eclector.reader:read-maybe-nothing (client input-stream eof-error-p eof-value))
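A minimal sketch of a client that uses the first of these functions to report skipped input such as comments. The class name noting-client and the helper read-noting-skipped-input are hypothetical, and binding eclector.base:*client* to select the client is an assumption about the Eclector version in use.

```lisp
;; Assumes Eclector is already loaded.
(defclass noting-client () ())

;; Called by the reader whenever it skips over a non-object; REASON
;; describes what was skipped (for example a block comment).
(defmethod eclector.reader:note-skipped-input
    ((client noting-client) input-stream reason)
  (declare (ignore input-stream))
  (format t "skipped: ~S~%" reason))

;; Hypothetical helper: read one object and report skipped input on the way.
(defun read-noting-skipped-input (string)
  (let ((eclector.base:*client* (make-instance 'noting-client)))
    (eclector.reader:read-from-string string)))

(read-noting-skipped-input "(list 1 #|foo|# \"bar\")")
```

The returned object is the same as with an ordinary read; the only difference is that the client gets to see the comment.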
Before
(list 1 #|foo|# "bar"
Call
(highlight-code "<code class=\"src src-lisp\">(list 1 #|foo|# \"bar\"</code>")
After
(list 1 #|foo|# "bar"
¶While reading list, expected the character ) when input ended.
Used in this presentation and will be used in the Eclector manual.
Architecture Idea 3
As a second way of using Eclector, weave the construction of source locations and parse results (for objects and skipped input) into the normal reader execution.
Examples
(defgeneric eclector.parse-result:source-position (client stream))

(defgeneric eclector.parse-result:make-expression-result (client result children source))

(defgeneric eclector.parse-result:make-skipped-input-result (client stream reason source))
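A minimal sketch of a client for this protocol, representing parse results as property lists. The class name location-client and the helper parse-from-string are hypothetical; the superclass eclector.parse-result:parse-result-client and the client-first calling convention of eclector.parse-result:read are taken from my reading of the Eclector documentation and should be double-checked.

```lisp
;; Assumes Eclector is already loaded.
(defclass location-client (eclector.parse-result:parse-result-client) ())

;; One parse result per object: the object itself, its source location
;; and the parse results of its sub-expressions.
(defmethod eclector.parse-result:make-expression-result
    ((client location-client) result children source)
  (list :result result :source source :children children))

;; One parse result per skipped piece of input (comments and the like).
(defmethod eclector.parse-result:make-skipped-input-result
    ((client location-client) stream reason source)
  (declare (ignore stream))
  (list :skipped reason :source source))

;; Hypothetical helper: parse the first expression in STRING.
(defun parse-from-string (string)
  (with-input-from-string (stream string)
    (eclector.parse-result:read (make-instance 'location-client) stream)))

(parse-from-string "(1 #|comment|# \"string\")")
```

The value is a nested property list whose :result is the object an ordinary read would have returned; since source-position is not specialized here, the :source entries come from whatever the default method derives from the stream.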
Test
(time (loop :repeat 10000
            :do (read-from-string "(1 (2 3) #+sbcl #1=\"foo\" `(,#1#))")))
SBCL 2.0.6.debian:
Evaluation took:
  0.020 seconds of real time
  0.020502 seconds of total run time (0.020502 user, 0.000000 system)
  105.00% CPU
  62,665,929 processor cycles
  3,995,504 bytes consed
Eclector master:
Evaluation took:
  0.150 seconds of real time
  0.149348 seconds of total run time (0.149348 user, 0.000000 system)
  [ Run times consist of 0.011 seconds GC time, and 0.139 seconds non-GC time. ]
  99.33% CPU
  52 lambdas converted
  449,065,278 processor cycles
  86,892,784 bytes consed
eclector.reader:fixup and friends are a big chunk (probably of the consing as well, because of hash-table shenanigans).
Current float construction is naive:
(let ((magnitude (* (+ (funcall decimal-mantissa)
                       (/ (funcall fraction-numerator)
                          fraction-denominator))
                    (if exponentp
                        (expt 10 (* exponent-sign (funcall exponent)))
                        1))))
  (return-from interpret-token
    (* sign (coerce magnitude type))))
But better let client decide:
(defgeneric make-float
    (client type
     sign decimal-mantissa fraction-numerator fraction-denominator
     exponent-sign exponent))
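To make the idea concrete, here is a self-contained sketch of how such a protocol function could behave. It is not Eclector code: the generic function is only a proposal at this point, the sketch assumes that all numeric arguments are plain integers rather than thunks, and the class double-float-client is hypothetical.

```lisp
(defgeneric make-float
    (client type
     sign decimal-mantissa fraction-numerator fraction-denominator
     exponent-sign exponent))

;; Default behavior: compute the exact rational value first and convert
;; to the requested float type once at the end.
(defmethod make-float (client type
                       sign decimal-mantissa fraction-numerator fraction-denominator
                       exponent-sign exponent)
  (declare (ignore client))
  (let ((magnitude (* (+ decimal-mantissa
                         (/ fraction-numerator fraction-denominator))
                      (expt 10 (* exponent-sign exponent)))))
    (* sign (coerce magnitude type))))

;; Hypothetical client that reads every float as a DOUBLE-FLOAT.
(defclass double-float-client () ())

(defmethod make-float ((client double-float-client) type
                       sign decimal-mantissa fraction-numerator fraction-denominator
                       exponent-sign exponent)
  (declare (ignore type))
  (call-next-method client 'double-float
                    sign decimal-mantissa fraction-numerator fraction-denominator
                    exponent-sign exponent))

;; 12.5e2: mantissa 12, fraction 5/10, exponent +2
(make-float nil 'single-float 1 12 5 10 1 2) ; => 1250.0
```

A client could equally well plug in a faster or more carefully rounded conversion here, which is the point of handing the decision to the client.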
Existing and novel extensions:
::()
Eclector Resources:
#sicl on freenode