This manual is for Eclector version 0.11.0.
Eclector is a portable, implementation-independent version of the
Common Lisp function read
, a corresponding readtable and a
quasiquotation facility. As opposed to existing implementation-specific
versions of read
, Eclector uses generic functions to allow clients
to customize the exact behavior, such as the interpretation of tokens.
Another unusual feature of Eclector is its ability to, at the discretion of the client, recover from many syntax errors, continue reading and return a result that somewhat resembles what would have been returned in case the syntax had been valid.
Furthermore, Eclector can be used as a source tracking reader, which is accomplished through a mode of operation that produces parse results which wrap the Common Lisp expressions in objects that can also contain information about the positions in the source code of those expressions. One example of such parse results are concrete syntax trees 1.
The package for basic features such as customizable source location
construction is named eclector.base
. Although this package does not
shadow any symbol in the common-lisp
package, we still recommend the
use of explicit package prefixes to refer to symbols in this package.
The package for ordinary reader features is named
eclector.reader
. To use features of this package, we
recommend the use of explicit package prefixes, simply because this
package shadows and exports names that are also exported from the
common-lisp
package. Importing this package will likely
cause conflicts with the common-lisp
package otherwise.
The package for readtable-related features is named
eclector.readtable
. To use features of this package, we
recommend the use of explicit package prefixes, simply because this
package shadows and exports names that are also exported from the
common-lisp
package. Importing this package will likely
cause conflicts with the common-lisp
package otherwise.
The package for features related to the creation of client-defined parse
results is named eclector.parse-result
. To use features of this
package, we recommend the use of explicit package prefixes, simply
because this package shadows and exports names that are also exported
from the common-lisp
package. Importing this package will likely
cause conflicts with the common-lisp
package otherwise.
The package for features related to the creation of concrete syntax
trees is named eclector.concrete-syntax-tree
. To use features of
this package, we recommend the use of explicit package prefixes, simply
because this package shadows and exports names that are also exported
from the common-lisp
package. Importing this package will likely
cause conflicts with the common-lisp
package otherwise.
In this section, symbols written without package marker
are in the eclector.base
package
(see Package for basic features).
This package provides the mechanism that enables clients to customize the behavior of the reader. Furthermore this package provides a protocol for customizing a particular aspect of the behavior, namely the construction of source positions and source ranges. Eclector uses source positions and source ranges in signaled conditions and parse results (see Parse result construction features).
This condition type is the supertype of all conditions which are
signaled by Eclector functions. An instance of this condition type
stores an approximate position in an input stream and an offset from
that position. The condition is associated with the stream content at
the designated position and offset. The position uses a representation
which is controlled by the respective client by adding a method on the
source-position
generic function. The offset indicates a distance
in characters which must be added to the approximate position to produce
the exact position.
This generic function can be called by clients in order to obtain the
approximate position in the input stream to which condition
pertains. The type and interpretation of the returned object depend on
the client, namely the presence of client-specific methods on the
source-position
generic function. The information returned by the
functions position-offset
and range-length
can be used to refine
the approximate position and compute a range in the input stream
respectively.
Applicable methods exist for all conditions of type
stream-position-condition
.
This generic function is called in order to compute the exact position
(or start of a range) in the input stream to which condition
pertains by refining the approximate position obtained by calling
stream-position
. The returned value is an integer (possibly
negative) which indicates the offset in characters from the approximate
position to the exact position. Since the representation of the
approximate position is chosen by the client, applying the offset to
that position in a suitable way is also the responsibility of the
client. Assuming the object returned by (stream-position
condition)
is suitable for arithmetic, the exact position is
stream-position + position-offset.
Applicable methods exist for all conditions of type
stream-position-condition
.
This generic function is called in order to compute the length of the
range in the input stream to which condition pertains. The
returned value is a non-negative integer which indicates the length of
the range in characters. Therefore, assuming the object returned by
(stream-position condition)
is suitable for arithmetic, the
range covers input the positions [start, start + range-length]
where start = stream-position + position-offset.
Applicable methods exist for all conditions of type
stream-position-condition
.
This variable is used by several generic functions which are called by
eclector.reader:read
. The default value of the variable is nil
.
Clients that want to override or extend the default behavior of some
generic function of Eclector should bind this variable to some
standard object and provide a method on that generic function,
specialized to the class of that standard object.
This generic function is called in order to determine the current
position in stream. Eclector does not inspect or manipulate the
objects returned by this generic function beyond storing them in
signaled conditions and passing them as arguments to the
make-source-range
generic function. A client is therefore free to
define methods on this generic function that return arbitrary objects.
The default method on this generic function calls cl:file-position
.
This generic function is called in order to turn the source positions
start and end into a range representation suitable
for client. The returned representation designates the range
of input characters from and including the character at position
start to but not including the character at position
end. The default method returns
(cons start end)
.
In this section, symbols written without package marker are in the
eclector.reader
package
(see Package for ordinary reader features)
The features provided in this package fall into two categories:
read
, read-preserving-whitespace
,
read-from-string
and read-delimited-list
which, together with
standard special variables, replicate the interface of the standard
Common Lisp reader (except functions related to readtables which Eclector
provides separately, see Readtable features). These functions are
discussed in the section Common Lisp reader compatible interface.
eclector.base:*client*
special variable and a collection of protocols which allow customizing
the behavior of the reader by defining methods specialized to a
particular client on the generic functions of the protocols.
Figure 2.1: Functions and typical function call sequences. Solid arrows represent calls, dashed arrows represent returns from function calls. Labels above arrows represent arguments and return values.
Figure 2.1 illustrates the categorization
into the Common Lisp reader compatible interface and the extensible
behavior protocol as well as typical function call patterns that arise
when the functions read
, read-preserving-whitespace
,
read-from-string
and read-delimited-list
are called by client
code.
The following functions are like their standard Common Lisp
counterparts with the two differences that their names are symbols in
the eclector.reader
package and that their behavior can deviate from
that of the standard reader depending on the value of the
eclector.base:*client*
variable.
&optional
(input-stream *standard-input*
) (eof-error-p t
) (eof-value nil
) (recursive-p nil
) ¶This function is the main entry point for the ordinary reader. It is entirely compatible with the standard Common Lisp function with the same name.
&optional
(input-stream *standard-input*
) (eof-error-p t
) (eof-value nil
) (recursive-p nil
) ¶This function is entirely compatible with the standard Common Lisp function with the same name.
&optional
(eof-error-p t
) (eof-value nil
) &key
(start 0
) (end nil
) (preserve-whitespace nil
) ¶This function is entirely compatible with the standard Common Lisp function with the same name.
&optional
(input-stream *standard-input*
) (recursive-p nil
) ¶This function is entirely compatible with the standard Common Lisp function with the same name.
By defining methods on the generic functions of this protocol, clients can customize the high-level behavior of the reader.
Figure 2.2: Functions and typical function call sequences terminating on the right hand side of the diagram at customizable generic functions which implement aspects of the reader algorithm and standard reader macros. Solid arrows represent calls, dashed arrows represent returns from function calls. Labels above arrows represent arguments and return values.
Figure 2.2 illustrates how the customizable generic functions described in this section are called through the client interface and the implementation of the reader algorithm.
This generic function is called by read
if read
is called with a
false value for the recursive-p parameter. It calls thunk
with the necessary context for a global read
call. thunk
should read and return an object without consuming any whitespace
following the object. If preserve-whitespace-p is false, this
function reads up to one character of whitespace after thunk
returns. By default, this function returns the object or
eof-value returned by thunk as its sole value.
Note: This generic function may return more values in addition to the one described above. Clients may use this feature to communicate additional information between methods (see Parse result construction features). Client defined methods on this generic function should accept such additional values when calling thunk, a next method or
read-common
and themselves return the additional values.
The default method on this generic function performs two tasks:
#N=
) and references
(#N#
) work.
This generic function is called by read
, passing it the value of the
variable eclector.base:*client*
and the corresponding parameters.
By default, this generic function returns the objects as its sole value.
Note: This generic function may return more values in addition to the one described above. Clients may use this feature to communicate additional information between methods (see Parse result construction features). Client defined methods on this generic function should accept such additional values when calling a next method or
read-maybe-nothing
and themselves return the additional values.
Client code can add methods on this function, specializing them to the
client class of its choice. The actions that read
needs to take for
different values of the parameter recursive-p have already been
taken before read
calls this generic function.
This generic function can be called directly by the client or by the
generic function read-common
to read an object or consume input
without returning an object. If called directly by the client, the call
has to be in the dynamic scope of a call-as-top-level-read
call.
The function read-maybe-nothing
either
:eof
nil
and :whitespace
*read-suppress*
is true, the function
returns nil and :suppress
. Otherwise it returns the object and
:object
.
nil
and
:skip
.
Note: This generic function may return more values in addition to the ones described above. Clients may use this feature to communicate additional information between methods (see Parse result construction features). Client defined methods on this generic function should accept such additional values when calling a next method and themselves return the additional values.
This generic function is called whenever the reader skips some input
such as a comment or a form that must be skipped because of a reader
conditional. It is called with the value of the variable
eclector.base:*client*
, the input stream from which the input is
being read and an object indicating the reason for skipping the input.
The default method on this generic function does nothing. Client code
can supply a method that specializes to the client class of its choice.
When this function is called, the stream is positioned immediately
after the skipped input. Client code that wants to know both
the beginning and the end of the skipped input must remember the
stream position before the call to read
was made as well as
the stream position when the call to this function is made.
This variable is used by the reader to determine why a range of input
characters has been skipped. To this end, internal functions of the
reader as well as reader macros can set this variable to a suitable
value before skipping over some input. Then, after the input has been
skipped, the generic function note-skipped-input
is called
with the value of the variable as its reason argument.
As an example, the method on note-skipped-input
specialized to
eclector.parse-result:parse-result-client
relays the reason and
position information to the client by calling the
eclector.parse-result:make-skipped-input-result
generic function
(see Parse result construction features).
This generic function is called by read-common
when it has
been detected that a token should be read. This function is
responsible for accumulating the characters of the token and then
calling interpret-token
(see below) in order to create and
return a token.
This generic function is called by read-token
in order to
create a token from accumulated token characters. The parameter
token is a string containing the characters that make up the
token. The parameter escape-ranges indicates ranges of
characters read from input-stream and preceded by a character
with single-escape syntax or delimited by characters with
multiple-escape syntax. Values of escape-ranges are lists of
elements of the form (start\ .\ end)
where
start
is the index of the first escaped character
and end
is the index following the last
escaped character. Note that start
and
var
can be identical indicating no escaped
characters. This can happen in cases like a||b
. The
information conveyed by the escape-ranges parameter is used
to convert the characters in token according to the
readtable case of the current readtable before a token is
constructed.
This generic function is called by the default method on
interpret-token
when the syntax of the token corresponds to that of
a symbol. This function checks the syntactic validity of the symbol
token and signals an error in case of a syntax error. If there are no
syntax errors (or error recovery has been performed, see Recovering from errors), this function returns three values:
The parameter input-stream is the input stream from which the
characters were read. The parameter token is a string that
contains all the characters of the token. The parameter
escape-ranges indicates ranges within token that were
preceded by a character with single-escape syntax or delimited by
characters with multiple-escape syntax. The parameter
position-package-marker-1 contains the index into token of
the first package marker, or nil
if the token contains no package
markers. The parameter position-package-marker-2 contains the
index into token of the second package marker, or nil
if the
token contains no package markers or only a single package marker.
The default method on this generic function checks the positions of the package markers taking into account escape ranges. The method signals errors and allows error recovery as described above.
This generic function is called by the default method on
interpret-token
when the syntax of the token corresponds to
that of a valid symbol. The parameter input-stream is the
input stream from which the characters were read. The parameter
token is a string that contains all the characters of the
token. The parameter position-package-marker-1 contains the
index into token of the first package marker, or nil
if the token contains no package markers. The parameter
position-package-marker-2 contains the index into
token of the second package marker, or nil
if the
token contains no package markers or only a single package marker.
The default method on this generic function calls
interpret-symbol
(see below) with a symbol name string and a
package indicator.
This generic function is called by the default method on
interpret-symbol-token
as well as the default
#:
reader macro function to resolve a symbol name string and
a package indicator to a representation of the designated symbol. The
parameter input-stream is the input stream from which
package-indicator and symbol-name were read. The
parameter package-indicator is a either
:current
designating the current package
:keyword
designating the keyword package
nil
to indicate that an uninterned symbol should be created
The symbol-name is the name of the desired symbol.
The default method uses cl:find-package
(or
cl:*package*
when package-indicator is
:current
) to resolve package-indicator followed by
cl:find-symbol
or cl:intern
, depending on
internp, to resolve symbol-name.
A second method which is specialized on package-indicator
being nil
uses cl:make-symbol
to create uninterned
symbols.
This generic function is called when the reader has determined that some character is associated with a reader macro. The parameter char has to be used in conjunction with the readtable parameter to obtain the macro function that is associated with the macro character. The parameter input-stream is the input stream from which the reader macro function will read additional input to accomplish its task.
The default method on this generic function simply obtains the reader macro function for char from readtable and calls it, passing input-stream and char as arguments. The default method therefore does the same thing that the standard Common Lisp reader does.
This generic function is called by the default #\
reader macro
function to find a character. designator is either
string
that is the name of the character to be found with single
and multiple escapes removed, but with the case of all characters as it
was in the input.
The function has to either return the character designated by
designator or nil
if no such character exists.
If designator is a string
, it is the responsibility of the
client to disregard the case of characters in designator, for
example by producing an uppercase string from designator before
looking up the designated character.
A default method on this generic function that is not specialized to any
particular client but is specialized to designator being a
string
recognizes the mandatory character names listing in HyperSpec
Section 13.1.7 Character Names. Another default method on this generic
function that is not specialized to any particular client but is
specialized to designator being a character
just returns
designator.
This generic function is called by the default #S
reader
macro function to construct structure instances. name is a
symbol naming the structure type of which an instance should be
constructed. initargs is a list the elements of which
alternate between string designators naming structure slots and values
for those slots.
It is the responsibility of the client to coerce the string
designators to symbols as if by
(intern (string slot-name) (find-package 'keyword))
as described in the Common Lisp specification.
There is no default method on this generic function since there is no portable way to construct structure instances given only the name of the structure type.
Warning: This generic function is deprecated as of Eclector 0.9 and will be removed in a future version. Please use the generic function
eclector.reader:call-with-state-value
with the aspect designator'cl:*package*
instead (see Reader state protocol for more information on the reader state protocol).
This generic function is called by the reader when input has to be
read with a particular current package. This is currently only the
case in the #+
and #-
reader macro functions which
read feature expressions in the keyword package. thunk is a
function that should be called without arguments.
package-designator designates the package that should be the
current package around the call to thunk.
The default method on this generic function simply binds
cl:*package*
to the result of
(cl:find-package package-designator)
around calling thunk.
This generic function is called by the default #.
reader
macro function to perform read-time evaluation. expression
is the expression that should be evaluated as it was returned by a
recursive read
call and potentially influenced by
client. The function has to either return the result of
evaluating expression or signal an error.
The default method on this generic function simply returns the result
of (cl:eval expression)
.
This generic function is called by the default #+
and
#-
reader macro functions to check the well-formedness of
feature-expression which has been read from the input stream
before evaluating it. For compound expressions, only the outermost
expression is checked regarding the atom in operator position and its
shape – child expressions are not checked. The function returns an
unspecified value if feature-expression is well-formed and
signals an error otherwise.
The default method on this generic function accepts standard
Common Lisp feature expression, i.e. expressions recursively
composed of symbols, :not
-expressions,
:and
-expressions and :or
-expressions.
This generic function is called by the default #+
and
#-
reader macro functions to evaluate
feature-expression which has been read from the input stream.
The function returns either true or false if
feature-expression is well-formed and signals an error
otherwise.
For compound feature expressions, the well-formedness of child
expressions is not checked immediately but lazily, just before the child
expression in question is evaluated in a subsequent
evaluate-feature-expression
call. This allows expressions like
#+(and my-cl-implementation (special-feature a b)) form
to
be read without error when the :my-cl-implementation
feature is
absent.
The default method on this generic function first calls
check-feature-expression
to check the well-formedness of
feature-expression. It then evaluates
feature-expression according to standard Common Lisp
semantics for feature expressions.
The reader state protocol consists of generic functions which the reader and the client call to query and modify the values of reader state aspects. Each aspect is named by a symbol and holds a current value and has a stack of shadowed values like a special variable. Most aspects roughly correspond to a particular reader control variable defined in the Common Lisp specification. In addition to those, Eclector uses aspects for representing the validity of the consing dot as well as the quasiquotation depth and validity in a given context. In total, Eclector defines the following aspects:
cl:*readtable*
Like the cl:*readtable*
special variable, this aspect controls the
readtable object in which the reader looks up the syntax types of
characters, the case conversion mode as well as reader macros. By
default, values of this aspect must satisfy the
eclector.readtable:readtablep
predicate.
cl:*package*
Like the cl:*package*
special variable, this aspects controls the
package which the reader uses when it looks up or interns symbols in the
current package. By default, values of this aspect must be package
designators.
cl:*read-suppress*
Like the cl:*read-suppress*
special variable, this aspect controls
whether the reader skips over expressions without detailed parsing.
cl:*read-eval*
Like the cl:*read-eval*
special variable, this aspect controls
whether the reader evaluates expressions in #.
constructs.
cl:*features*
Like the cl:*features*
special variable, this aspect controls the
evaluation of features in feature expressions in #+
and #-
constructs. By default, values of this aspect must be proper lists of
symbols.
cl:*read-base*
Like the cl:*read-base*
special variable, this aspect controls the
interpretation of tokens by the reader as being integers or ratios. By
default, values of this aspect must be of type (integer 1 36)
.
cl:*read-default-float-format*
Like the cl:*read-default-float-format*
special variable, this
aspect controls the floating-point format that the reader uses for
floating-point numbers without exponent marker or the default exponent
marker.
eclector.reader::*quasiquotation-state*
Warning: Clients should not query, bind or set the value of this aspect at this time.
This aspect controls whether backquote and unquote are allowed in the current context.
eclector.reader::*quasiquotation-depth*
Warning: Clients should not query, bind or set the value of this aspect at this time.
This aspect tracks the backquote nesting depth in the current context.
eclector.reader::*consing-dot-allowed-p*
Warning: Clients should not query, bind or set the value of this aspect at this time.
This aspect controls whether the consing dot is allowed in the current context.
Errors of this type are signaled when an attempt is made to establish an object as the value for a reader state aspect and the supplied object is not of the type required by the aspect.
Since this condition type is a subtype of cl:type-error
, the
offending value and the expected type can be retrieved via the readers
cl:type-error-datum
and cl:type-error-expected-type
respectively.
The aspect for which the value was supplied can be retrieved via the
reader eclector.reader:aspect
.
This generic function is called by the reader to determine whether value is a valid value for the reader state aspect designated by aspect. The generic function returns true if, according to client, value is a valid value for the reader state aspect designated by aspect. aspect must designate a reader state aspect that is recognized by client. At least the aspects listed in the minimal reader state aspects table must be recognized by any client.
With the exceptions of cl:*readtable*
and cl:*package*
, the
default methods on this generic function recognize state aspects and
implement type restrictions informed by the Common Lisp specification:
Aspect | Type |
---|---|
cl:*readtable* | (satisfies eclector.readtable:readtablep) |
cl:*package* | (or cl:package cl:symbol cl:string cl:character) (package designator) |
cl:*read-suppress* | t (generalized Boolean) |
cl:*read-eval* | t (generalized Boolean) |
cl:*features* | list (proper list) |
cl:*read-base* | (integer 2 36) (radix) |
cl:*read-default-float-format* | (member short-float single-float double-float long-float) |
Return the current value of the reader state aspect designated by aspect.
aspect must designate a reader state aspect that is recognized by client. At least the aspects listed in the minimal reader state aspects table must be recognized by any client.
The cl:*package*
aspect mandates further explanation: When the
client uses only the default methods of the reader state protocol, the
return value of this generic function for the cl:*package*
aspect is
of type cl:package
which is a strict subtype of the type of valid
values for this aspect. In other words, the defaults coerce package
designators to package objects.
Set the current value of the reader state aspect designated by aspect to new-value.
aspect must designate a reader state aspect that is recognized by client. At least the aspects listed in the minimal reader state aspects table must be recognized by any client.
new-value is the desired new value for the designated aspect.
new-value has to be a valid value for aspect in the sense
that (eclector.reader:valid-state-value-p client aspect
value)
must return true.
The cl:*package*
aspect mandates further explanation: When the
client uses only the default methods of the reader state protocol, the
method on this generic function which handles the cl:*package*
aspect coerces new-value from designators to package objects so
that a subsequent eclector.reader:state-value
call returns the
designated package object.
Call thunk with the reader state aspect designated by aspect bound to value.
aspect must designate a reader state aspect that is recognized by client. At least the aspects listed in the minimal reader state aspects table must be recognized by any client.
The following properties must hold:
(eclector.reader:valid-state-value-p client aspect
value)
must return true.
eclector.reader:call-with-state-value
, the expression
(eclector.reader:state-value client aspect)
must
evaluate to value.
When Eclector calls this generic function with cl:*package*
as the
value of aspect, the value is a always a string designator
and never a package object. The default method on this generic function
coerces such string designators to package objects so that a subsequent
eclector.reader:state-value
call returns the designated package
object.
Backquote and unquote syntax is forbidden in some contexts such as
multi-dimensional array literals (#A
) and structure literals
(#S
). Eclector tracks and controls whether backquote, unquote or
both should be allowed in a given context using the aspects
eclector.reader::*quasiquotation-state*
and
eclector.reader::*quasiquotation-depth*
mentioned above. Since
custom reader macros may also have to control this state, Eclector
provides the following convenience macro:
&body
body ¶Warning: This macro is experimental and its name is not exported for now.
Control whether backquote syntax, unquote syntax or both are allowed in
read
functions called during the execution of body.
context is a symbol identifying the current context which is used
for error reporting. A typical value is the name of the reader macro
function in which this macro is used. quasiquote-forbidden-p
controls whether backquote syntax should be forbidden. The value
:keep
causes the binding to remain unchanged.
unquote-forbidden-p controls whether unquote syntax should be
forbidden. The value :keep
causes the binding to remain unchanged.
&optional
(quasiquote-forbidden-p t
) (unquote-forbidden-p t
) &body
body ¶Warning: This macro is deprecated as of Eclector 0.11 and will be removed in a future version. This macro is replaced by the macro
eclector.reader::with-quasiquotation-state
but that macro is experimental and its name is not exported for now.
Disallow backquote syntax, unquote syntax or both in read
functions called during the execution of body.
context is a symbol identifying the current context which is
used for error reporting. A typical value is the name of the reader
macro function in which this macro is used.
quasiquote-forbidden-p controls whether backquote syntax
should be forbidden. The value :keep
causes the binding to
remain unchanged. unquote-forbidden-p controls whether
unquote syntax should be forbidden. The value :keep
causes
the binding to remain unchanged.
Eclector includes implementations of the #=
and ##
reader
macros and they are present in the default readtable. One way to
customize the behavior of the reader around the #=
and ##
syntax
is replacing the reader macro functions with custom ones but with this
approach the client code has to reimplement a lot of functionality. As
a finer grained and more composable mechanism for customization,
Eclector provides a protocol for implementing and customizing the
behavior of the #=
and ##
reader macros, with or without
modifying the readtable. The remainder of this section describes that
protocol.
To start with a bit of terminology, we call the object created by
reading #N=expression
a labeled object. We call
N the label of the labeled object and the result of reading
expression
the object of the labeled object. We say
that #N=expression
defines the labeled object and
#N#
references the labeled object. We call the
reference circular if #N#
occurs within
expression. Labeled objects are internal to the reader and only
exist during eclector.reader:read
calls: before such a call returns
an object, each labeled object within the returned object is replaced by
its respective final object. Callers of eclector.reader:read
and
related functions will therefore only ever see the object, never the
labeled object2.
On a technical level, a labeled object is represented as a data type with a current state and a single (possibly unbound) slot containing the object. The following diagrams depicts the possible states of a labeled object together with input patterns and corresponding transitions:
Figure 2.3: Possible states of a labeled object and input patterns which correspond to state transitions.
Put differently, a labeled object can be in the following states:
State | Object slot |
---|---|
undefined | – |
defined | unbound |
final | the object |
referenced (not strictly needed) | the object |
circular | unbound |
final (circular) | the object |
referenced (circular) (not strictly needed) | the object |
The distinction between the states final
and referenced
on the
one hand and final (circular)
, and referenced (circular)
on the
other hand is not required for implementing labeled objects. Those two
pairs of states are therefore collapsed to just final
and final
(circular)
in the remainder of this section. The following figure and
paragraphs describe generic functions and methods which implement the
creation, registration, lookup and manipulation of labeled objects
according to the reduced set of states:
Figure 2.4: Reduced set of states of a labeled object and protocol functions with corresponding state transitions.
In addition to the generic functions referenced in the above diagram,
the generic functions fixup-graph-p
, fixup-graph
and fixup
are part of the protocol. Those functions are used to replace labeled
objects with their respective final objects within an object that is
about to be returned to the caller of eclector.reader:read
3. To
this end, the #=
reader macro function must inspect and update the
state of the labeled object it is processing after reading
expression
by calling finalize-labeled-object
.
finalize-labeled-object
decides whether fixup-graph
(see below)
must be called: If after reading expression
the labeled object
is in state :circular
, expression
must have contained
circular references and the result of reading it contains labeled
objects that have to be replaced with their respective final objects.
fixup-graph
and fixup
perform this replacement. This
replacement is performed by recursively traversing objects which are
reachable from the final object of the labeled objects, for example by
visiting the slots of standard objects, and replacing labeled objects
with their respective final object.
In certain cases, the computational complexity of this traversal and
replacement can be rather high, depending on when and how exactly the
traversal is performed: consider an expression of the form
#1=(1 #1# #2=(2 #2# …))
. The nested labeled objects in this
expression are all circular and thus require fixing up. The read
call for the innermost labeled object, say #100=…
, returns
first and the fixup processing for the labeled object could be performed
immediately. The problem is that each of the labeled objects would
be processed in the same manner which would lead to a computation
complexity of O(N M) where N is the number of labels and
M is the number of nodes in the object graph rooted at the object
which is returned by the outermost read
call. One way to avoid this
problem would be to perform fixup processing only for the outermost
read
call. The problem with that approach is that only a small
sub-graph of the whole object graph may be circular in which case most
of the work for traversing the whole graph would be wasted. To address
both problems, Eclector allows clients to track the nesting of labeled
objects and fix up sub-graphs which contain multiple nested objects in
one go
(see fixup-graph-p).
This generic function is called by the default method on
call-as-top-level-read
in order to establish a context for tracking
#=
label definitions and ##
label references around a call to
thunk.
The default method on this generic function establishes a context in
which the default #=
and ##
reader macro functions can make the
appropriate calls to note-labeled-object
, forget-labeled-object
,
find-labeled-object
.
This generic function is called by the default #=
reader macro
function to note the definition of a labeled object with label
label while reading from input-stream. The function
creates, registers and returns a representation of the labeled object.
The returned object is registered in the sense that a subsequent call to
find-labeled-object
with arguments client and label
returns the same object unless forget-labeled-object
has been called
to unregister the object.
parent is either nil
or a (previously created) surrounding
labeled object. The parent labeled object is provided to allow the
client to potentially defer fixup processing for the new labeled object
if the processing for the surrounding labeled object subsumes the
processing for the new labeled object.
Note that, when reading an expression of the form
#N=object
, this function is called after reading
#N=
from input-stream but before reading
object
. Consequently, the created and returned labeled object
is defined but does not have an object associated with it.
The default method on this generic function calls
make-labeled-object
with client, input-stream and
label to create an object of an unspecified type. The method
registers and returns the created object. Client code should manipulate
the object only via the generic functions described in this section and
in particular not rely on the object being of a particular type (since
methods on make-labeled-object
specialized to certain client classes
could return unexpected objects). The default method requires the
context established by the default method on
call-with-label-tracking
.
This generic function is called by the default #=
reader macro
function when Eclector reads an invalid labeled object of the form
#N=#N#
and the caller chooses to recover from the
resulting error (see Recovering from errors). In that situation,
the remainder of the input is processed as if there had been no labeled
object with label N. This function makes the labeled object
undefined so that a subsequent find-labeled-object
call for
label will return nil
.
The default method on this generic function requires the context
established by the default method on call-with-label-tracking
.
This generic function is called by the default ##
reader macro
function to look up the previously registered representation of a
labeled object for label. The function returns nil
if no such
object has been registered for label and the registered object
otherwise.
The default method on this generic function requires the context
established by the default method on call-with-label-tracking
.
This generic function is called by note-labeled-object
to create and
return a representation of a labeled object with label label.
parent is either nil
or a previously created, surrounding
labeled object which allows the client to potentially defer fixup
processing for the new labeled object if the processing for the
surrounding labeled object subsumes the processing.
The default method on this generic function creates and returns an
object of an unspecified type. Client code should manipulate the object
only via the generic functions labeled-object-state
,
finalize-labeled-object
and reference-labeled-object
and in
particular not rely on the object being of a particular type (since
methods on this generic function specialized to certain client classes
could return unexpected objects).
This generic function is called by the default #=
reader macro
function to determine the state of object. This function returns
nil
if object is not a labeled object
:defined
, :circular
, :final
, :final/circular
and the
final object stored in object if the first value is either
:final
or :final/circular
or nil
otherwise.
The following table lists all possible return value shapes:
object is a labeled object | First value | Second value |
---|---|---|
no | nil | |
yes | :defined | nil |
yes | :circular | nil |
yes | :final | final-object |
yes | :final/circular | final-object |
The default method on this generic function is applicable to labeled
object representations returned by the default methods on
note-labeled-object
and make-labeled-object
.
This generic function is called by the default #=
reader macro
function after reading a complete labeled object in order to store
object in labeled-object and change the state of
labeled-object to either :final
or :final/circular
. The
function returns two values: the finalized labeled-object and the
new state of labeled-object.
The default method on this generic function is applicable to labeled
object representations returned by the default methods on
note-labeled-object
and make-labeled-object
.
This generic function is called by the default ##
reader macro
function to process a reference to labeled-object while reading
from input-stream. labeled-object must be a representation
of a labeled object and has, in the context of the ##
reader macro
function, likely been obtained by calling find-labeled-object
.
Depending on the state of labeled-object, this function returns
either labeled-object itself or an object that can be returned to
the caller as-is. In case labeled-object is returned, it will be
replaced by its associated object later, when fixup-graph
is called.
The default method on this generic function is applicable to labeled
object representations returned by the default methods on
note-labeled-object
and make-labeled-object
.
As briefly mentioned above, the generic functions fixup-graph
and
fixup
traverse and inspect objects in the object graph reachable
from an object that is about to be returned to the caller of
eclector.reader:read
. In order to distinguish ordinary objects from
labeled objects that act as placeholders in the object graph and must be
replaced with their respective final objects, fixup
methods call
labeled-object-state
on all encountered objects.
labeled-object-state
returns nil
for all objects that are not
labeled objects and :final
for labeled objects which must be
replaced with their final object.
This generic function is potentially called by a method on
finalize-labeled-object
to determine whether the object graph
reachable from the object of root-labeled-object should be fixed
up by calling fixup-graph
with client and
labeled-object.
Multiple default methods on this generic function jointly implement the following behavior:
:final/circular
, root-labeled-object should be fixed
up immediately.
&key
object-key ¶This generic function is potentially called after the reader has
constructed an object graph which is reachable from the object of
root-labeled-object and noticed circular references within this
graph to fix up circular references before the object of
root-labeled-object is returned to the caller (of read
or
related functions).
object-key is a function that accepts a labeled object and returns the object of the labeled object.
The default method on this generic function creates a hash table for
tracking already processed objects and calls fixup
with
client, the object of root-labeled-object and the hash table
to recursively process objects in the object graph which is reachable
from the object of root-labeled-object.
This generic function is potentially called to apply circularity-related
changes to the object constructed by the reader before it is returned to
the caller. object is the object that should be modified.
seen-objects is a eq
-hash table used to track already
processed objects (see below). A method specialized to a class,
instances of which consists of parts, should modify object by
scanning its parts for labeled object markers, replacing found labeled
object markers with the respective final object and recursively calling
fixup
for all parts.
To recognize labeled objects which have to be replaced, methods should
call labeled-object-state
on each part of object and interpret
the returned values as follows: if nil
is returned, the part should
not be replaced but recursively processed. If :final
is returned as
the first value, the part should be replaced with the final object that
is returned as the second value. Parts are replaced by mutating
object.
fixup
is called for side effects – its return value is ignored.
Default methods specializing the object parameter to cons
,
array
, standard-object
and hash-table
process instances of
those classes in the obvious way.
An unspecialized :around
method queries and updates
seen-objects to ensure that each object is processed exactly once.
The following generic functions allow clients to construct
representations of quoted and quasiquoted forms as well as function
special forms.
This generic function is called by the default '
-reader macro
function to construct a quotation form in which material is
the quoted material.
The default method on this generic function returns a result
equivalent to (list 'common-lisp:quote material)
.
This generic function is called by the default `
-reader macro
function to construct a quasiquotation form in which form is
the quasiquoted material.
The default method on this generic function returns a result
equivalent to (list 'eclector.reader:quasiquote form)
.
This generic function is called by the default ,
-reader macro
function to construct an unquote form in which form is the
unquoted material.
The default method on this generic function returns a result
equivalent to (list 'eclector.reader:unquote form)
.
This generic function is called by the default ,@
-reader macro
function to construct a splicing unquote form in which form is the
unquoted material.
The default method on this generic function returns a result
equivalent to
(list 'eclector.reader:unquote-splicing form)
.
This generic function is called by the default #'
-reader macro
function to construct a form that applies the function
special
operator to the name expression.
The default method on this generic function returns a result equivalent
to (list 'common-lisp:function form)
.
The standard syntax types and macro character associations used by the ordinary reader can be set up for any readtable object implementing the readtable protocol (see Readtable features). The following functions are provided for this purpose:
This function sets the standard syntax types in readtable (See HyperSpec section 2.1.4.)
This function sets the standard macro characters in readtable (See HyperSpec section 2.4.)
This function sets the standard dispatch macro characters, that is sharpsign and its sub-characters, in readtable (See HyperSpec section 2.4.8.)
This function sets the standard syntax types and macro characters in readtable by calling the above three functions.
In this section, symbols written without package marker are in the
eclector.readtable
package (see Package for readtable features).
This package exports two kinds of symbols:
common-lisp
package. The functions bound to these symbols are
generic versions of the corresponding standard Common Lisp functions.
Clients can define custom readtables by defining methods on these
generic functions.
This function is the generic version of the standard Common Lisp
function cl:readtablep
. The function returns true if
object can be used as a readtable in Eclector via the
protocol functions in the ecelctor.readtable
package. The
default method returns nil
.
TODO
In this section, symbols written without package marker are in the
eclector.parse-result
package (see Package for parse result construction features).
This package provides clients with a reader that behaves similarly to
cl:read
but returns custom parse result objects controlled by
the client. Some parse results correspond to things like symbols,
numbers and lists that cl:read
would return, while others, if
the client chooses, represent comments and other kinds of input that
cl:read
would discard. Furthermore, clients can associate
source location information with parse results.
Clients using this package pass a “client” object for which methods on
the generic functions described below are applicable to read
,
read-preserving-whitespace
or read-from-string
. Suitable client
classes can be defined by using parse-result-client
as a superclass
and at least defining a method on the generic function
make-expression-result
.
When a client constructs parse results, some of the generic functions for customizing the behavior of the reader (see Reader behavior protocol) return additional values:
Generic function | Situation | Ordinary values | Extended values |
---|---|---|---|
eclector.reader:call-as-top-level-read | object | object | object, parse result, orphan results |
eclector.reader:read-common | object | object | object, parse result |
eclector.reader:read-maybe-nothing | object | object, kind | object, kind, parse result |
eclector.reader:call-as-top-level-read | end of input | eof-value | eof-value, orphan results |
eclector.reader:read-common | end of input | eof-value | eof-value |
eclector.reader:read-maybe-nothing | end of input | eof-value, :eof | eof-value, :eof |
Note how eclector.reader:call-as-top-level-read
and
eclector.reader:read-common
return fewer values for the “end of
input” situation. This difference in return value count allows the
caller to recognize the “end of input” situation even if
eof-value is an object that could be read such as nil
. Using
such an eof-value makes sense for clients which construct parse
results since top-level eclector.parse-result:read
calls return
these parse results so that there is no risk of confusing the chosen
eof-value, even if something like nil
, with having read a
similar object.
Figure 2.5: Functions and typical function call sequences. Solid arrows represent calls, dashed arrows represent returns from function calls. Labels above arrows represent arguments and return values. Differences from the non-parse result version are highlighted with bold text.
Figure 2.5 shows typical function call
patterns, including ordinary and additional return values, that arise
when the functions read
, read-preserving-whitespace
,
read-from-string
and read-delimited-list
are called by client
code.
&optional
(input-stream *standard-input*
) (eof-error-p t
) (eof-value nil
) ¶This function is the main entry point for this variant of the reader.
It is in many ways similar to the standard Common Lisp function
cl:read
. The differences are:
make-skipped-input-result
and arise when skipping input at the
toplevel such as comments which are not lexically contained in lists:
#|orphan|# (#|not orphan|#)
.
eclector.reader:read
behave suitably.
&optional
(input-stream *standard-input*
) (eof-error-p t
) (eof-value nil
) ¶This function is similar to the standard Common Lisp function
cl:read-preserving-whitespace
. The differences are the same
as described above for read
compared to cl:read
.
&optional
(eof-error-p t
) (eof-value nil
) &key
(start 0
) (end nil
) (preserve-whitespace nil
) ¶This function is similar to the standard Common Lisp function
cl:read-from-string
. The differences are:
This class should generally be used as a superclass for client classes using this package.
This generic function is called in order to construct a parse result
object. The value of the result parameter is the raw object read.
The value of the children parameter is a list of already
constructed parse result objects representing objects read by recursive
read
calls. The value of the source parameter is a source
range, as returned by eclector.base:make-source-range
and
eclector.base:source-position
delimiting the range of characters
from which result has been read.
This generic function does not have a default method since the purpose of the package is the construction of custom parse results. Thus, a client must define a method on this generic function.
This generic function is called after the reader skipped over a range
of characters in stream. It returns either nil
if the
skipped input should not be represented or a client-specific
representation of the skipped input. The value of the children
parameter is a list of already constructed parse result objects which
represent object read by recursive read
calls (Such as the feature
expression and the ignored expression in #+(and (or) some-feature)
skipped-expression
). The value of the source parameter
designates the skipped range using a source range representation
obtained via make-source-range
and source-position
.
Reasons for skipping input include comments, the #+
and
#-
reader macros and *read-suppress*
. The
aforementioned reasons are reflected by the value of the
reason parameter as follows:
Input | Value of the reason parameter |
---|---|
Comment starting with ; | (:line-comment . 1) |
Comment starting with ;; | (:line-comment . 2) |
Comment starting with n ; | (:line-comment . n) |
Comment delimited by #| |# | :block-comment |
#+false-expression | (:sharpsign-plus . false-expression) |
#-true-expression | (:sharpsign-minus . true-expression) |
*read-suppress* is true | *read-suppress* |
A reader macro returns no values | :reader-macro |
The default method returns nil
, that is the skipped input is
not represented as a parse result.
In this section, symbols written without package marker are in the
eclector.concrete-syntax-tree
package (see Package for CST features).
&optional
(input-stream *standard-input*
) (eof-error-p t
) (eof-value nil
) ¶This function is the main entry point for the CST reader. It is
mostly compatible with the standard Common Lisp function
cl:read
. The differences are:
concrete-syntax-tree:cst
.
eclector.reader:read
behave suitably.
&optional
(input-stream *standard-input*
) (eof-error-p t
) (eof-value nil
) ¶This function is similar to the standard Common Lisp function
cl:read-preserving-whitespace
. The differences are the same
as described above for read
compared to cl:read
.
&optional
(eof-error-p t
) (eof-value nil
) &key
(start 0
) (end nil
) (preserve-whitespace nil
) ¶This function is similar to the standard Common Lisp function
cl:read-from-string
. The differences are the same as
described above for read
compared to cl:read
.
Eclector offers extensive support for recovering from many syntax
errors, continuing to read from the input stream and return a result
that somewhat resembles what would have been returned in case the
syntax had been valid. To this end, a restart named
eclector.reader:recover
is established when recoverable
errors are signaled. Like the standard Common Lisp restart
cl:continue
, this restart can be invoked by a function of the
same name:
&optional
condition ¶This function recovers from an error by invoking the most recently
established applicable restart named eclector.reader:recover
. If no
such restart is currently established, it returns nil
. If
condition is non-nil
, only restarts that are either explicitly
associated with condition, or not associated with any condition
are considered.
When a read
call during which error recovery has been
performed returns, Eclector tries to return an object that is
similar in terms of type, numeric value, sequence length, etc. to what
would have been returned in case the input had been well-formed. For
example, recovering after encountering the invalid digit in
#b11311
returns either the number #b11011
or the
number #b11111
.
A syntax error and a corresponding recovery strategy are characterized
by the type of the signaled condition and the report of the
established eclector.reader:recover
restart respectively.
Attempting to list and describe all examples of both would provide
little insight. Instead, this section describes different classes of
errors and corresponding recovery strategies in broad terms:
#=1
⟶ 1
,
#5P"."
⟶ #P"."
"foo
⟶ "foo"
, (1 2
⟶ (1 2)
,
#(1 2
⟶ #(1 2)
, #C(1 2
⟶ #C(1 2)
#12rc
⟶ 1
,
1/0
⟶ 1
, #C(1 :foo)
⟶ #C(1 1)
#\foo
⟶ #\?
(,1)
⟶ (1)
, #S(foo :bar 1 2 3)
⟶ #S(foo
:bar 1)
#C(1 2
3)
⟶ #C(1 2)
, #2(1 2 3)
⟶ #2(1 2)
#S(5)
⟶ nil
, (#1=)
⟶ (nil)
Note that attempting to recover from syntax errors may lead to apparent
success in the sense that the read
call returns an object, but this
object may not be what the caller wanted. For example, recovering from
the missing closing "
in the following example
(defun foo (x y) "My documentation string (+ x y))
results in (DEFUN FOO (X Y) "My documentation string<newline> (+ x y))")
,
not (DEFUN FOO (X Y) "My documentation string" (+ x y))
.
This chapter describes potential side effects of calling
eclector.reader:read
,
eclector.reader:read-preserving-whitespace
or
eclector.reader:read-from-string
for different kinds of
clients.
The following destructive modifications are considered uninteresting and ignored in the remainder of this section:
Furthermore, the remainder of this section is written under the following assumptions:
eclector.reader:read
does not cause
additional side effects on its own.
eclector.reader:*client*
is bound to an object for
which there are no custom applicable methods on generic functions
belonging to protocols provided by Eclector that introduce additional
side effects.
eclector.readtable:*readtable*
is bound to an object for
which
If any of the above assumptions does not hold, “all bets are off” in the sense that arbitrary side effects other than the ones described below are possible. For notes regarding non-default clients, See Potential side effects for non-default clients.
The default method on the generic function
eclector.reader:interpret-symbol
may create and intern
symbols, thereby modifying the package system.
The default method on the generic function
eclector.reader:evaluate-expression
uses cl:eval
to
evaluate arbitrary expressions, potentially causing side effects.
With the default readtable, the generic function is only called by the
macro function of the #.
reader macro.
The default method on the generic function
eclector.reader:call-reader-macro
can cause side effects by
calling macro functions that cause side effects. The following
standard reader macros potentially cause side-effects:
#.
as described in Read-time evaluation (default client).
In addition to the potential side effects described in Symbols and packages (default client), strings passed as the third argument of
eclector.reader:interpret-token
are potentially destructively
modified during conversion to the current readtable case.
The same considerations as in Read-time evaluation (default client) apply.
Clients defining methods on
eclector.reader:make-structure-instance
which implement the
standard behavior of calling the default constructor (if any) of the
named structure should consider side effects caused by slot initforms
of the structure. The following example illustrates this problem:
(defvar *counter* 0) (defstruct foo (bar (incf *counter*))) #S(foo) *counter* ⇒ 1 #S(foo) *counter* ⇒ 2
The fixup
generic function potentially modifies its second
argument destructively. Clients that define methods on
eclector.reader:make-structure-instance
should be aware of
this potential modification in cases like #1=#S(foo :bar
#1#)
. Similar considerations apply for other ways of
constructing compound objects such as #1=(t . #1#)
.
The following standard reader macros could cause or be affected by side effects when combined with a non-standard client:
#.
as described in Read-time evaluation (default client).
#S
as described in Structure instance creation.
(
, #(
and #S
as described in Circular structure.
,.
(i.e. destructively splicing) variant of the ,
reader
macro does not currently destructively modify the surrounding object,
but clients should not rely on this fact. This consideration applies to
clients that install non-standard macro functions for the (
and
#(
reader macros.
This chapter describes Eclector’s interpretation of passages in the Common Lisp specification that do not describe the behavior of a conforming reader completely unambiguously.
At first glance, Sharpsign C and Sharpsign S seem to follow the same syntactic structure: the dispatch macro character followed by the sub-character followed by a list of a specific structure. However, the actual descriptions of the respective syntax is different. For Sharpsign C, the specification states:
#C
reads a following object, which must be a list of length two whose elements are both reals.
For Sharpsign S, on the other hand, the specification describes the syntax as:
#s(name slot1 value1 slot2 value2 ...)
denotes a structure.
Note how the description for Sharpsign C relies on a recursive read
invocation while the description for Sharpsign S gives a character-level
pattern with meta-syntactic variables. It is possible that this is an
oversight and the syntax was intended to be uniform between the two
reader macros. Whatever the case may be, in order to handle existing
code without inconveniencing clients, Eclector implements both
Sharpsign C and Sharpsign S with a recursive read
invocation which
corresponds to permissive behavior.
More concretely, Eclector behaves as summarized in the following table:
Input | Behavior |
---|---|
#C(1 2) | Read as #C(1 2) |
#C (1 2) | Read as #C(1 2) |
#C#||#(1 2) | Read as #C(1 2) |
#C#.(list 1 (+ 2 3)) | Read as #C(1 5) |
#C[1 2] for left-parenthesis syntax on [ | Read as #C(1 2) |
#S(foo) | Read as #S(foo) |
#S (foo) | Read as #S(foo) |
#S#||#(foo) | Read as #S(foo) |
#S#.(list 'foo) | Read as #S(foo) |
#S[foo] for left-parenthesis syntax on [ | Read as #S(foo) |
Eclector provides a strict version of the Sharpsign C macro function
under the name eclector.reader:strict-sharpsign-c
which behaves as
follows:
Input | Behavior |
---|---|
#C(1 2) | Read as #C(1 2) |
#C (1 2) | Rejected |
#C#||#(1 2) | Rejected |
#C#.(list 1 (+ 2 3)) | Rejected |
#C[1 2] for left-parenthesis syntax on [ | Read as #C(1 2) |
Eclector provides a strict version of the Sharpsign S macro function
under the name eclector.reader:strict-sharpsign-s
which behaves as
follows:
Input | Behavior |
---|---|
#S(foo) | Read as #S(foo) |
#S (foo) | Rejected |
#S#||#(foo) | Rejected |
#S#.(list 'foo) | Rejected |
#S[foo] for left-parenthesis syntax on [ | Rejected |
The Common Lisp specification is very specific about the contexts in
which the quasiquotation mechanism can be used. Explicit descriptions
of the behavior of the quasiquotation mechanism are given for
expressions which are lists or vectors and it is implied that
unquote is not allowed in other expressions. From this description, it
is clear that `#S(foo :bar ,x)
is not valid syntax, for example.
However, whether `#',foo
is valid syntax depends on whether
#'thing
is considered to be a list. Since `#',foo
is a relatively common idiom, Eclector accepts it by default.
Eclector provides a strict version of the Sharpsign Single Quote macro
function under the name
eclector.reader:strict-sharpsign-single-quote
which does not accept
unquote in the function name.
The Common Lisp specification describes the behavior of the ##
reader macro as follows:
#n#
, where n is a required unsigned decimal integer, provides a reference to some object labeled by#n=
; that is,#n#
represents a pointer to the same (eq
) object labeled by#n=
.
The vague phrasing “represents a pointer to the same (eq
) object”
is probably chosen to cover the situation in which the object in
question is not yet defined when the reader encounters the #n#
reference as is the case with input of the form
#n=(…#n#…)
. The fact that the object is not yet defined
when the reference is encountered is not a problem in general except for
one situation: assume #_
is a custom reader macro in the current
readtable which calls read
. In this situation, reading an
expression of the form #n=(…#_#n#…)
causes the reader
macro function for #_
to be called which calls read
to read the
following object which encounters the reference. This chain of calls
leads to a potential problem: the read
call made by the reader macro
function has to return some object but it cannot return the object
labeled n since that object has not been read yet. The reader
macro function must therefore receive some sort of
implementation-dependent 4 object which stands in
for the object labeled n and gets replaced at some later time
after the object labeled n has been read. Since the stand-in
object is implementation-dependent, the reader macro function must not
make any assumptions regarding the type of the object or operate on it
in any way other than returning the object or using the object as a part
of a compound object.
The following example violates this principle since the reader macro
function in custom-macro-readtable
calls cl:second
on the object
returned by eclector.reader:read
:
(defun custom-macro-readtable () (let ((readtable (eclector.readtable:copy-readtable eclector.reader:*readtable*))) (eclector.readtable:set-dispatch-macro-character readtable #\# #\_ (lambda (stream char sub-char) (declare (ignore char sub-char)) (second (eclector.reader:read stream t nil t)))) readtable)) (let ((eclector.reader:*readtable* (custom-macro-readtable))) (eclector.reader:read-from-string "#1=(:a #_#1#)")) ⇒ undefined
To handle the problem described above, Eclector imposes the following
restriction on custom reader macro functions which call read
:
A reader macro function which reads an object by calling
read
must account for the object being of an implementation-dependent type and must not operate on the object in any way other than returning the object or using the object as a part of a compound object.
Jump to: | A B C E F L P Q R S U |
---|
Jump to: | A B C E F L P Q R S U |
---|
Jump to: | (
*
C E F I L M N P R S V W |
---|
Jump to: | (
*
C E F I L M N P R S V W |
---|
A children
parameter has been added to the lambda list of the generic
function eclector.parse-result:make-skipped-input-result
so that results
which represent skipped material can have children. For example, before this
change, a eclector.parse-result:read
call which encountered the expression
#+no-such-feature foo bar
potentially constructed parse results for all
(recursive) read
calls, that is for the whole expression, for
no-such-feature
, for foo
and for bar
, but the parse results for
no-such-feature
and foo
could not be attached to a parent parse result
and were thus lost. In other words the shape of the parse result tree was
skipped input result #+no-such-feature foo expression result bar
With this change, the parse results in question can be attached to the parse
result which represents the whole #+no-such-feature foo
expression so that
the entire parse result tree has the following shape
skipped input result #+no-such-feature foo skipped input result no-such-feature skipped input result foo expression result bar
Since this is a major incompatible change, we offer the following workaround for clients that must support Eclector versions with and without this change:
(eval-when (:compile-toplevel :load-toplevel :execute) (let* ((generic-function #'eclector.parse-result:make-skipped-input-result) (lambda-list (c2mop:generic-function-lambda-list generic-function))) (when (= (length lambda-list) 5) (pushnew 'skipped-input-children *features*)))) (defmethod eclector.parse-result:make-skipped-input-result ((client client) (stream t) (reason t) #+PACKAGE-THIS-CODE-IS-READ-IN::skipped-input-children (children t) (source t)) ... #+PACKAGE-THIS-CODE-IS-READ-IN::skipped-input-children (use children) ...)
The above code pushes a symbol that is interned in a package under the control
of the respective client (as opposed to the KEYWORD
package) onto
*features*
before the second form is read and uses that feature to select
either the version with or the version without the children
parameter of
the method definition. See Maintaining Portable Lisp Programs by Christophe
Rhodes for a detailed discussion of this technique.
eclector.reader:state-value-type-error
can be used
to indicate that a value of an unsuitable type has been provided for a reader
state aspect.
(setf eclector.readtable:state-value)
which allows clients
to set reader state aspects in addition to establishing dynamically scoped
bindings.
eclector.reader:unquote
and eclector.reader:unquote-splicing
now signal sensible errors when used outside of the lexical scope of a
eclector.reader:quasiquote
macro call. Note that the name of the associated
condition type is not exported for now since quasiquotation will be implemented
in a separate module in the future.
Such invalid uses can happen when the above macros are called directly or when
the ,
, ,@
and ,.
reader macros are used in a way that constructs
the unquoted expression in one context and then "injects" it into some other
context, for example via an object reference #N#
or read-time evaluation
#.(...)
. Full example:
(progn (print `(a #1=,(+ 1 2) c)) (print #1#))
Another minor aspect of this change is that the condition types
eclector.reader:unquote-splicing-in-dotted-list
and
eclector.reader:unquote-splicing-at-top
are no longer subtypes of
common-lisp:stream-error
. The previous relation did not make sense since
errors of those types are signaled during macro expansion.
The (internal) macro eclector.reader::with-forbidden-quasiquotation
is
deprecated as of this release. Clients which really need a replacement
immediately can use the new (internal) macro
eclector.reader::with-quasiquotation-state
.
An example of a situation that was previously handled incorrectly is the following
(defun bypassing-left-parenthesis (stream char) (declare (ignore char)) (loop for peek = (eclector.reader:peek-char t stream t nil t) when (eq peek #\)) do (eclector.reader:read-char stream t nil t) (loop-finish) collect (let ((function (eclector.readtable:get-macro-character eclector.reader:*readtable* peek))) (cond (function (eclector.reader:read-char stream t nil t) (funcall function stream peek)) (t (eclector.reader:read stream t nil t)))))) (let ((eclector.reader:*readtable* (eclector.readtable:copy-readtable eclector.reader:*readtable*))) (eclector.readtable:set-macro-character eclector.reader:*readtable* #\( #'bypassing-left-parenthesis) (describe (eclector.parse-result:read-from-string (make-instance 'eclector.parse-result.test::simple-result-client) "(print (quote #1=(member :floor :ceiling)))"))) ;; [...] ;; Slots with :INSTANCE allocation: ;; %RAW = (PRINT '(MEMBER :FLOOR :CEILING)) ;; %SOURCE = (0 . 43) ;; [...] ;; The %RAW slot used to contain (MEMBER :FLOOR :CEILING) instead of ;; (PRINT '(MEMBER :FLOOR :CEILING)).
+
in the sign part.
For example, Eclector now accepts #x+10
as a spelling of 16
.
As an example, the following signaled an error before this change:
(let ((eclector.reader:*readtable* (eclector.readtable:copy-readtable eclector.reader:*readtable*))) (eclector.readtable:set-macro-character eclector.reader:*readtable* #\1 (lambda (stream char) (declare (ignore stream char)) 1) t) ; non-terminating (eclector.reader:read-from-string "#x01"))
eclector.parse-result:source-position
and
eclector.parse-result:make-source-range
have been removed. Clients should
use eclector.base:source-position
and eclector.base:make-source-range
respectively instead.
eclector.base:range-length
can be applied to conditions of
type eclector.base:stream-position-condition
(which includes almost all
conditions related to syntax errors) to determine the length of the
sub-sequence of the input to which the condition in question pertains.
The part of the labeled objects protocol that allows clients to construct parse
results which represent labeled objects has been changed in an incompatible
way. The change allows parse results which represent labeled objects to have
child parse results but requires that clients construct parse results which
represent labeled objects differently: instead of eql-specializing the
result
parameters of methods on
eclector.parse-result:make-expression-result
to
eclector.parse-result:**definition**
and
eclector.parse-result:**reference**
and receiving the labeled object in the
children
parameters, the result
parameters now have to be specialized
to the classes eclector.parse-result:definition
and
eclector.parse-result:reference
respectively. The object passed as the
result
argument now contains the labeled object so that the children
parameter can receive child parse results.
This change is considered minor since the old mechanism described above was not documented. For now, the new mechanism also remains undocumented so that the design can be validated through experimentation before it is finalized.
syntax-extensions
module contains a collection of syntax extensions
which are implemented as either mixin classes for clients or reader macro
functions.
my-package::(a b)
is read as
(my-package::a my-package::b)
with this extension.
eclector.syntax-extensions.s-expression-comment:s-expression-comment
allows
commenting out s-expressions in a fashion similar to SRFI 62 for scheme. One
difference is that a numeric infix argument can be used to comment out a number
of s-expressions different from 1:
(frob r1 r2 :k3 4 #4; :k5 6 :k6 7)
concrete-syntax-tree
module now produces a better tree structure for
certain inputs like (0 . 0)
. Before this change the produced CST had the
same concrete-syntax-tree:atom-cst
object as the
concrete-syntax-tree:first
and concrete-syntax-tree:rest
of the outer
concrete-syntax-tree:cons-cst
node. After this change the
concrete-syntax-tree:first
child is the concrete-syntax-tree:atom-cst
which corresponds to the first 0
in the input and the
concrete-syntax-tree:rest
child is the concrete-syntax-tree:atom-cst
which corresponds to the second 0
in the input. In contrast to the previous
example, an input like (#1=0 . #1#)
continues to result in a single
concrete-syntax-tree:atom-cst
in both the concrete-syntax-tree:first
and concrete-syntax-tree:rest
slots of the outer
concrete-syntax-tree:cons-cst
object.
eclector.concrete-syntax-tree:cst-read
has been
removed. Clients should use eclector.concrete-syntax-tree:read
instead.
eclector.reader:find-character
receives characters names with unmodified
case and is also called in the #\<single character>
case so that clients
have more control over character lookup.
eclector.base:position-offset
allows interested
clients to refine the source positions of errors obtained by calling
eclector.base:stream-position
.
eclector.reader:fixup
method for hash tables have been
fixed: keys were not checked for circular structure and circular structures in
values were not fixed up in some cases.
#=
and ##
reader macros
respectively (See Labeled objects and references).
Before this change, cases like
#1=(1 #1# #2=(2 #2# ... #100=(100 #100#)))
or
#1=(1 #2=(2 ... #2#) ... #1#)
led to unnecessary and/or repeated traversals during fixup processing.
Before this change, something like
(eclector.concrete-syntax-tree:read-from-string "#1=(#1#)")
produced a CST object, say cst
, which failed to satisfy
(eq (cst:first cst) cst) (eq (cst:raw (first cst)) (cst:raw cst))
The properties now hold.
eclector.concrete-syntax-tree:definition-csts-mixin
and
eclector.concrete-syntax-tree:reference-csts-mixin
to represent labeled
object definitions and references as instances of
eclector.concrete-syntax-tree:definition-cst
and
eclector.concrete-syntax-tree:reference-cst
respectively.
eclector.reader::sharpsign-colon
is now always present.
nil
when nil
is used as the eof-value
(nil
makes sense as an eof-value
in that case since nil
is
generally not a possible parse result).
eclector-concrete-syntax-tree
system now works with and requires
version 0.2 of the concrete-syntax-tree
system.
Clients can use this protocol to control the reader state in other ways than binding the Common Lisp variables, for example by storing the values of reader state aspects in context objects.
Furthermore, implementations which use Eclector as the Common Lisp reader can
use this protocol to tie the cl:*readtable*
aspect to the
cl:*readtable*
variable instead of the eclector.reader:*readtable*
variable.
The new protocol subsumes the purpose of the generic function
eclector.reader:call-with-current-package
which is deprecated as of this
Eclector version.
eclector.reader::sharpsign-s
reader macro function which requires the input
following #S
to be read as a list but not necessarily be literally written
as (TYPE INITARG₁ VALUE₁ …)
.
A detailed discussion of the topic has been added to the manual (See Interpretation of Sharpsign C and Sharpsign S).
eclector.reader:read-token
method and the functions
eclector.reader::sharpsign-colon
and
eclector.reader::sharpsign-backslash
are now more efficient as well as less
redundant in terms of repeated code.
:eclector-define-cl-variables
now controls whether the file
code/reader/variables.lisp
is loaded and thus whether the variables
eclector.reader:*package*
, eclector.reader:*read-eval*
, etc. are
defined.
eclector.reader:check-symbol-token
has been fixed.
||
are no longer interpreted as potential numbers.
eclector.reader:interpret-symbol
method now signals specific
conditions and offers restarts for recovering from situations related to
non-existent packages and symbols as well as non-exported symbols.
The default error recovery strategy for invalid symbols now constructs an
uninterned symbol of the given name instead of using nil
.
eclector.reader::left-parenthesis
.
At the same time, it is now possible to recover from encountering the "consing dot" in invalid positions.
eclector.reader:interpret-token
method has been optimized
substantially.
eclector.reader:*client*
variable and the source location protocol
(that is the generic functions eclector.parse-result:source-position
and
eclector.parse-result:make-source-range
) have been moved to a new base
module and package eclector.base
which the reader
module and the
eclector.reader
package can use. This structure allows code in the
reader
module to work with source locations.
The name eclector.base:*client*
remains exported as
eclector.reader:*client*
.
The old names eclector.parse-result:source-position
and
eclector.parse-result:make-source-range
still exist but are now deprecated
and will be removed in a future release.
reader
module now include source
positions which are obtained by calling eclector.base:source-position
.
nil
parse results are no longer generated by
eclector.parse-result:make-skipped-input-result
calls when
cl:*read-suppress*
is true.
eclector.reader:read-maybe-nothing
and
eclector.reader:call-as-top-level-read
give clients additional entry points
to the reader as well as customization possibilities. With these functions, the
chain of functions calls for a read
call looks like this:
eclector.reader:read eclector.reader:call-as-top-level-read eclector.reader:read-common eclector.reader:read-maybe-nothing ... eclector.reader:read-char eclector.reader:peek-char
Diagrams which illustrate the relations between the new and existing functions have been added to the manual (Figure 2.1, Figure 2.2, Figure 2.5).
eclector.reader::read-rational
now better respects the value
of *read-suppress*
.
eclector.readtable:set-syntax-from-char
, fix
(setf eclector.readtable:syntax-from-char)
to also copy the macro character
information.
eclector.reader:wrap-in-function
.
eclector.reader::*list-reader*
around recursive read in
eclector.reader::sharpsign-dot
.
eclector.reader::sharpsign-c
.
The strict version is still available as eclector.reader:strict-sharpsign-c
and can be installed into a custom readtable.
A detailed discussion of the topic has been added to the manual (See Interpretation of Sharpsign C and Sharpsign S).
..
and ...
.
eclector.reader::sharpsign-single-quote
. The strict version is still
available as eclector.reader:strict-sharpsign-single-quote
and can be
installed into a custom readtable.
A detailed discussion of the topic has been added to the manual (See Interpretation of Backquote and Sharpsign Single Quote).
eclector.reader:check-symbol-token
.
PACKAGE::||
is now correctly read as a symbol.
:
.
eclector.reader:call-with-current-package
has been
added.
eclector.parse-result:read-preserving-whitespace
and
eclector.parse-result:read-from-string
have been added.
eclector.concrete-syntax-tree:read-preserving-whitespace
and
eclector.concrete-syntax-tree:read-from-string
have been added.
eclector.concrete-syntax-tree:cst-read
has been renamed to
eclector.concrete-syntax-tree:read
.
eclector.concrete-syntax-tree:cst-read
still exists but is deprecated and
will be removed in a future version.
eclector.reader::with-forbidden-quasiquotation
is used by Eclector (and can
be used in custom reader macros) to control this behavior.
eclector.readtable:readtablep
for the simple readtable
implementation has been added.
eclector.base:end-of-file
is now a subtype of
cl:stream-error
but not of cl:reader-error
.
eof-error
parameter when the end of input is encountered a after single
escape or within a multiple escape. The new error conditions
eclector.reader:unterminated-single-escape
and
eclector.reader:unterminated-multiple-escape
are signaled in such
situations.
#
now conforms to the specification.
cl:*read-base*
is now used correctly when distinguishing
numbers and symbols.
eclector.reader:zero-denominator
is signaled.
eclector.reader:read-delimited-list
has been added.
eclector.reader::sharpsign-s
now accepts string
designators as slot names.
eclector.reader::sharpsign-equals
and
eclector.reader::sharpsign-sharpsign
respect the value of
cl:*read-suppress*
.
eclector.reader:fixup
now works
correctly for standard-object
instances with unbound slots.
eclector.reader::left-parenthesis
now always
reads until #\)
, not some "opposite" character.
eclector.reader:*skip-reason*
is now set correctly when a line comment at
the end of input is read.
eclector.reader:recover
is now established which, when invoked
performs some action which allows the remainder of the expression to be read.
The convenience function eclector.reader:recover
can be used to invoke the
restart.
eclector.reader::sharpsign-plus-minus
now sets
eclector.reader:*skip-reason*
so that parse results can be created with an
accurate "reason" value.
eclector.reader:read-char
and
eclector.reader:peek-char
have been fixed.
eclector.reader::read-rational
now respects
cl:*read-suppress*
and handles inputs of the form 1 2
correctly.
eclector.reader::sharpsign-r
now handles
cl:*read-suppress*
better.
eclector.reader:interpret-token
now distinguishes positive and negative float zeros and uses radix 10 instead
of the value of cl:*read-base*
for float digits.
.||
is now interpreted as a symbol instead of the "consing dot".
#A
expressions now signal
appropriate errors.
eclector.parse-result:read
is now accurate.
eclector.reader:peek-char
has been added. The new function is
like cl:peek-char
but signals Eclector conditions and uses the Eclector
readtable.
eclector.reader:read-from-string
has been added. The new
function is like cl:read-from-string
but uses Eclector’s reader
implementation.
eclector.reader::sharpsign-s
and the generic
function eclector.reader:make-structure-instance
have been added. Eclector
does not define any methods on the latter generic function since there is no
portable way of creating a structure instance when only the symbol naming the
structure is known.
eclector.reader:interpret-symbol
is now called when
the reader creates uninterned symbols.
eclector.reader:fixup
now accepts a client object as
the the argument.
eclector.reader:wrap-in-quasiquote
,
eclector.reader:wrap-in-unquote
and
eclector.reader:wrap-in-unquote-splicing
, the client parameter is now the
first parameter.
eclector.reader:wrap-in-quote
has been added.
concrete-syntax-tree
module has been generalized into a
parse-result
module which provides a protocol for constructing arbitrary
custom parse results. The concrete-syntax-tree
module is now based on this
new module but can be used as before by clients.
eof-error-p
parameter of the
eclector.reader:read-char
function is now true.
See: https://github.com/s-expressionists/Concrete-Syntax-Tree
Reader macro functions which call
eclector.reader:read
may receive labeled objects under certain
circumstances (see Circular objects and custom reader macros).
This fixup processing has to be delayed under certain circumstances (see Circular objects and custom reader macros).
We use “implementation-dependent” in the sense defined in the Common Lisp specification except that Eclector is the implementation in question.