This manual is for Eclector version 0.11.0.
Eclector is a portable, implementation-independent version of the Common Lisp function read, a corresponding readtable and a quasiquotation facility. As opposed to existing implementation-specific versions of read, Eclector uses generic functions to allow clients to customize the exact behavior, such as the interpretation of tokens.
Another unusual feature of Eclector is its ability to, at the discretion of the client, recover from many syntax errors, continue reading and return a result that somewhat resembles what would have been returned in case the syntax had been valid.
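As a rough illustration of such client-controlled recovery, the following sketch reads an unterminated list and asks Eclector to recover instead of aborting. Two things are assumptions rather than facts stated in this section: that Quicklisp is available to load the "eclector" system, and that the recovery restart is named eclector.reader:recover. The helper read-leniently is hypothetical.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

(defun read-leniently (string)
  "Read one object from STRING, recovering from syntax errors when possible."
  (handler-bind
      ((error (lambda (condition)
                ;; Assumed restart name. If no such restart is active, the
                ;; handler declines and the error propagates as usual.
                (let ((restart (find-restart 'eclector.reader:recover
                                             condition)))
                  (when restart
                    (invoke-restart restart))))))
    (eclector.reader:read-from-string string)))

;; The closing parenthesis is missing. With recovery, the elements read
;; so far are expected to come back as a list.
(read-leniently "(1 2")
```

Because the handler only invokes the restart when it is actually available, errors that Eclector cannot recover from still reach the caller.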
Furthermore, Eclector can be used as a source tracking reader, which is accomplished through a mode of operation that produces parse results which wrap the Common Lisp expressions in objects that can also contain information about the positions in the source code of those expressions. Concrete syntax trees are one example of such parse results.
The package for basic features such as customizable source location construction is named eclector.base. Although this package does not shadow any symbol in the common-lisp package, we still recommend the use of explicit package prefixes to refer to symbols in this package.
The package for ordinary reader features is named eclector.reader. To use features of this package, we recommend the use of explicit package prefixes, simply because this package shadows and exports names that are also exported from the common-lisp package. Importing this package will likely cause conflicts with the common-lisp package otherwise.
The package for readtable-related features is named eclector.readtable. To use features of this package, we recommend the use of explicit package prefixes, simply because this package shadows and exports names that are also exported from the common-lisp package. Importing this package will likely cause conflicts with the common-lisp package otherwise.
The package for features related to the creation of client-defined parse results is named eclector.parse-result. To use features of this package, we recommend the use of explicit package prefixes, simply because this package shadows and exports names that are also exported from the common-lisp package. Importing this package will likely cause conflicts with the common-lisp package otherwise.
The package for features related to the creation of concrete syntax trees is named eclector.concrete-syntax-tree. To use features of this package, we recommend the use of explicit package prefixes, simply because this package shadows and exports names that are also exported from the common-lisp package. Importing this package will likely cause conflicts with the common-lisp package otherwise.
In this section, symbols written without package marker are in the eclector.base package (see Package for basic features).
This package provides the mechanism that enables clients to customize the behavior of the reader. Furthermore, this package provides a protocol for customizing a particular aspect of the behavior, namely the construction of source positions and source ranges. Eclector uses source positions and source ranges in signaled conditions and parse results (see Parse result construction features).
This condition type is the supertype of all conditions which are signaled by Eclector functions. An instance of this condition type stores an approximate position in an input stream and an offset from that position. The condition is associated with the stream content at the designated position and offset. The position uses a representation which is controlled by the respective client by adding a method on the source-position generic function. The offset indicates a distance in characters which must be added to the approximate position to produce the exact position.
This generic function can be called by clients in order to obtain the approximate position in the input stream to which condition pertains. The type and interpretation of the returned object depend on the client, namely the presence of client-specific methods on the source-position generic function. The information returned by the functions position-offset and range-length can be used to refine the approximate position and compute a range in the input stream respectively.
Applicable methods exist for all conditions of type stream-position-condition.
This generic function is called in order to compute the exact position (or start of a range) in the input stream to which condition pertains by refining the approximate position obtained by calling stream-position. The returned value is an integer (possibly negative) which indicates the offset in characters from the approximate position to the exact position. Since the representation of the approximate position is chosen by the client, applying the offset to that position in a suitable way is also the responsibility of the client. Assuming the object returned by (stream-position condition) is suitable for arithmetic, the exact position is stream-position + position-offset.
Applicable methods exist for all conditions of type stream-position-condition.
This generic function is called in order to compute the length of the range in the input stream to which condition pertains. The returned value is a non-negative integer which indicates the length of the range in characters. Therefore, assuming the object returned by (stream-position condition) is suitable for arithmetic, the range covers the input positions [start, start + range-length] where start = stream-position + position-offset.
Applicable methods exist for all conditions of type stream-position-condition.
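The arithmetic described for stream-position, position-offset and range-length can be sketched as follows. The sketch assumes that Quicklisp is available, that the default client is in use, and that the default source-position method (cl:file-position) yields integers for string streams, which holds in common implementations but is not guaranteed by the standard. The helper names condition-range and first-error-range are hypothetical.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

(defun condition-range (condition)
  "Return (START . END) for CONDITION, assuming integer stream positions."
  (let ((start (+ (eclector.base:stream-position condition)
                  (eclector.base:position-offset condition))))
    (cons start (+ start (eclector.base:range-length condition)))))

(defun first-error-range (string)
  "Read from STRING. Return the range of the first signaled condition or NIL."
  (handler-case (progn (eclector.reader:read-from-string string) nil)
    (eclector.base:stream-position-condition (condition)
      (condition-range condition))))

;; The list is not terminated, so a condition with position information
;; is signaled and converted into a range.
(first-error-range "(1 2")
```

A client with a non-integer position representation would replace the two additions with whatever operation its representation supports.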
This variable is used by several generic functions which are called by eclector.reader:read. The default value of the variable is nil. Clients that want to override or extend the default behavior of some generic function of Eclector should bind this variable to some standard object and provide a method on that generic function, specialized to the class of that standard object.
This generic function is called in order to determine the current position in stream. Eclector does not inspect or manipulate the objects returned by this generic function beyond storing them in signaled conditions and passing them as arguments to the make-source-range generic function. A client is therefore free to define methods on this generic function that return arbitrary objects.
The default method on this generic function calls cl:file-position.
This generic function is called in order to turn the source positions start and end into a range representation suitable for client. The returned representation designates the range of input characters from and including the character at position start to but not including the character at position end. The default method returns (cons start end).
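A minimal sketch of a client-specific range representation follows. The class plist-range-client is hypothetical and exists only for this illustration; the sketch also assumes that Quicklisp is available to load Eclector.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

;; Hypothetical client class used only for this illustration.
(defclass plist-range-client () ())

(defmethod eclector.base:make-source-range
    ((client plist-range-client) start end)
  ;; END is exclusive, matching the half-open range described above.
  (list :start start :end end))

;; Default method: a cons of START and END.
(eclector.base:make-source-range nil 3 7)

;; Client-specific method: a property list.
(eclector.base:make-source-range (make-instance 'plist-range-client) 3 7)
```

Since Eclector does not inspect source positions, start and end could just as well be client-defined objects such as line and column pairs.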
In this section, symbols written without package marker are in the eclector.reader package (see Package for ordinary reader features).
The features provided in this package fall into two categories: a Common Lisp reader compatible interface and an extensible behavior protocol.
Figure 2.1 illustrates the categorization into the Common Lisp reader compatible interface and the extensible behavior protocol as well as typical function call patterns that arise when the functions read, read-preserving-whitespace, read-from-string and read-delimited-list are called by client code.
The following functions are like their standard Common Lisp counterparts with the two differences that their names are symbols in the eclector.reader package and that their behavior can deviate from that of the standard reader depending on the value of the eclector.base:*client* variable.
This function is the main entry point for the ordinary reader. It is entirely compatible with the standard Common Lisp function with the same name.
This function is entirely compatible with the standard Common Lisp function with the same name.
This function is entirely compatible with the standard Common Lisp function with the same name.
This function is entirely compatible with the standard Common Lisp function with the same name.
By defining methods on the generic functions of this protocol, clients can customize the high-level behavior of the reader.
Figure 2.2 illustrates how the customizable generic functions described in this section are called through the client interface and the implementation of the reader algorithm.
This generic function is called by read if read is called with a false value for the recursive-p parameter. It calls thunk with the necessary context for a global read call. thunk should read and return an object without consuming any whitespace following the object. If preserve-whitespace-p is false, this function reads up to one character of whitespace after thunk returns. By default, this function returns the object or eof-value returned by thunk as its sole value.
Note: This generic function may return more values in addition to the one described above. Clients may use this feature to communicate additional information between methods (see Parse result construction features). Client defined methods on this generic function should accept such additional values when calling thunk, a next method or read-common and themselves return the additional values.
The default method on this generic function performs two tasks:
This generic function is called by read, passing it the value of the variable eclector.base:*client* and the corresponding parameters. By default, this generic function returns the object that was read as its sole value.
Note: This generic function may return more values in addition to the one described above. Clients may use this feature to communicate additional information between methods (see Parse result construction features). Client defined methods on this generic function should accept such additional values when calling a next method or read-maybe-nothing and themselves return the additional values.
Client code can add methods on this function, specializing them to the client class of its choice. The actions that read needs to take for different values of the parameter recursive-p have already been taken before read calls this generic function.
This generic function can be called directly by the client or by the generic function read-common to read an object or consume input without returning an object. If called directly by the client, the call has to be in the dynamic scope of a call-as-top-level-read call. The function read-maybe-nothing either
Note: This generic function may return more values in addition to the ones described above. Clients may use this feature to communicate additional information between methods (see Parse result construction features). Client defined methods on this generic function should accept such additional values when calling a next method and themselves return the additional values.
This generic function is called whenever the reader skips some input such as a comment or a form that must be skipped because of a reader conditional. It is called with the value of the variable eclector.base:*client*, the input stream from which the input is being read and an object indicating the reason for skipping the input. The default method on this generic function does nothing. Client code can supply a method that specializes to the client class of its choice.
When this function is called, the stream is positioned immediately after the skipped input. Client code that wants to know both the beginning and the end of the skipped input must remember the stream position before the call to read was made as well as the stream position when the call to this function is made.
This variable is used by the reader to determine why a range of input characters has been skipped. To this end, internal functions of the reader as well as reader macros can set this variable to a suitable value before skipping over some input. Then, after the input has been skipped, the generic function note-skipped-input is called with the value of the variable as its reason argument.
As an example, the method on note-skipped-input specialized to eclector.parse-result:parse-result-client relays the reason and position information to the client by calling the eclector.parse-result:make-skipped-input-result generic function (see Parse result construction features).
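A client that merely records skipped input could look roughly like the following sketch. The class skip-recording-client and the helper skipped-input are hypothetical; the argument order of note-skipped-input follows the description above (client, input stream, reason). Quicklisp is assumed to be available.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

;; Hypothetical client that records every skipped piece of input.
(defclass skip-recording-client ()
  ((records :initform '() :accessor records)))

(defmethod eclector.reader:note-skipped-input
    ((client skip-recording-client) input-stream reason)
  ;; The stream is positioned immediately after the skipped input here,
  ;; so only the end position is known at this point.
  (push (list reason (eclector.base:source-position client input-stream))
        (records client)))

(defun skipped-input (string)
  "Read one object from STRING and return the recorded (REASON END) entries."
  (let* ((client (make-instance 'skip-recording-client))
         (eclector.base:*client* client))
    (eclector.reader:read-from-string string)
    (reverse (records client))))

;; The line comment in front of FOO is skipped and therefore recorded.
(skipped-input (format nil "; a comment~%foo"))
```

The exact reason objects are not spelled out here, so the sketch stores them without interpreting them.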
This generic function is called by read-common when it has been detected that a token should be read. This function is responsible for accumulating the characters of the token and then calling interpret-token (see below) in order to create and return a token.
This generic function is called by read-token in order to create a token from accumulated token characters. The parameter token is a string containing the characters that make up the token. The parameter escape-ranges indicates ranges of characters read from input-stream and preceded by a character with single-escape syntax or delimited by characters with multiple-escape syntax. Values of escape-ranges are lists of elements of the form (start . end) where start is the index of the first escaped character and end is the index following the last escaped character. Note that start and end can be identical, indicating no escaped characters. This can happen in cases like a||b. The information conveyed by the escape-ranges parameter is used to convert the characters in token according to the readtable case of the current readtable before a token is constructed.
This generic function is called by the default method on interpret-token when the syntax of the token corresponds to that of a symbol. This function checks the syntactic validity of the symbol token and signals an error in case of a syntax error. If there are no syntax errors (or error recovery has been performed, see Recovering from errors), this function returns three values:
The parameter input-stream is the input stream from which the characters were read. The parameter token is a string that contains all the characters of the token. The parameter escape-ranges indicates ranges within token that were preceded by a character with single-escape syntax or delimited by characters with multiple-escape syntax. The parameter position-package-marker-1 contains the index into token of the first package marker, or nil if the token contains no package markers. The parameter position-package-marker-2 contains the index into token of the second package marker, or nil if the token contains no package markers or only a single package marker.
The default method on this generic function checks the positions of the package markers taking into account escape ranges. The method signals errors and allows error recovery as described above.
This generic function is called by the default method on interpret-token when the syntax of the token corresponds to that of a valid symbol. The parameter input-stream is the input stream from which the characters were read. The parameter token is a string that contains all the characters of the token. The parameter position-package-marker-1 contains the index into token of the first package marker, or nil if the token contains no package markers. The parameter position-package-marker-2 contains the index into token of the second package marker, or nil if the token contains no package markers or only a single package marker.
The default method on this generic function calls interpret-symbol (see below) with a symbol name string and a package indicator.
This generic function is called by the default method on interpret-symbol-token as well as the default #: reader macro function to resolve a symbol name string and a package indicator to a representation of the designated symbol. The parameter input-stream is the input stream from which package-indicator and symbol-name were read. The parameter package-indicator is either
The symbol-name is the name of the desired symbol.
The default method uses cl:find-package (or cl:*package* when package-indicator is :current) to resolve package-indicator followed by cl:find-symbol or cl:intern, depending on internp, to resolve symbol-name.
A second method which is specialized on package-indicator being nil uses cl:make-symbol to create uninterned symbols.
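To make the role of package-indicator and symbol-name more concrete, the following sketch defines a client that does not look up or intern anything and instead returns both arguments as a list. The class describing-client is hypothetical, and the lambda list (client input-stream package-indicator symbol-name internp) is assumed from the parameter descriptions above. Quicklisp is assumed to be available.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

;; Hypothetical client that describes symbols instead of interning them.
(defclass describing-client () ())

(defmethod eclector.reader:interpret-symbol
    ((client describing-client) input-stream
     package-indicator symbol-name internp)
  (declare (ignore input-stream internp))
  ;; Return the raw arguments so that they can be inspected.
  (list package-indicator symbol-name))

(defun describe-symbols (string)
  "Read STRING with the describing client and return the result."
  (let ((eclector.base:*client* (make-instance 'describing-client)))
    (eclector.reader:read-from-string string)))

;; Each symbol in the list is replaced by its (PACKAGE-INDICATOR NAME) pair.
(describe-symbols "(foo cl:car)")
```

A real client would typically return a symbol or a symbol-like object; the list is used here only to make the arguments visible.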
This generic function is called when the reader has determined that some character is associated with a reader macro. The parameter char has to be used in conjunction with the readtable parameter to obtain the macro function that is associated with the macro character. The parameter input-stream is the input stream from which the reader macro function will read additional input to accomplish its task.
The default method on this generic function simply obtains the reader macro function for char from readtable and calls it, passing input-stream and char as arguments. The default method therefore does the same thing that the standard Common Lisp reader does.
This generic function is called by the default #\ reader macro function to find a character. designator is either
The function has to either return the character designated by designator or nil if no such character exists.
If designator is a string, it is the responsibility of the client to disregard the case of characters in designator, for example by producing an uppercase string from designator before looking up the designated character.
A default method on this generic function that is not specialized to any particular client but is specialized to designator being a string recognizes the mandatory character names listed in HyperSpec Section 13.1.7 Character Names. Another default method on this generic function that is not specialized to any particular client but is specialized to designator being a character just returns designator.
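A client can add its own character names on top of the defaults, as in the following sketch. The class char-name-client and the character name "Bang" are made up for this illustration, and the argument order (client designator) is an assumption. Quicklisp is assumed to be available.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

;; Hypothetical client that knows one additional character name.
(defclass char-name-client () ())

(defmethod eclector.reader:find-character
    ((client char-name-client) (designator string))
  ;; STRING-EQUAL disregards case, as required for string designators.
  (if (string-equal designator "Bang")
      #\!
      ;; Fall back to the default method for standard character names.
      (call-next-method)))

(defun read-character-syntax (string)
  "Read STRING with the hypothetical CHAR-NAME-CLIENT."
  (let ((eclector.base:*client* (make-instance 'char-name-client)))
    (eclector.reader:read-from-string string)))

(list (read-character-syntax "#\\Bang")
      (read-character-syntax "#\\Space"))
```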
This generic function is called by the default #S reader macro function to construct structure instances. name is a symbol naming the structure type of which an instance should be constructed. initargs is a list the elements of which alternate between string designators naming structure slots and values for those slots.
It is the responsibility of the client to coerce the string designators to symbols as if by (intern (string slot-name) (find-package 'keyword)) as described in the Common Lisp specification.
There is no default method on this generic function since there is no portable way to construct structure instances given only the name of the structure type.
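One non-portable convention a client might rely on is sketched below. The sketch assumes that every structure type of interest was defined by defstruct with the default constructor name MAKE-<name> and keyword arguments, which is an assumption about the client's code base and not something Eclector guarantees. The class structure-client and the structure point are hypothetical, and Quicklisp is assumed to be available.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

(defstruct point x y)

;; Hypothetical client for reading #S syntax.
(defclass structure-client () ())

(defmethod eclector.reader:make-structure-instance
    ((client structure-client) name initargs)
  ;; Assumption: the default DEFSTRUCT constructor MAKE-<name> exists.
  (let ((constructor (find-symbol
                      (concatenate 'string "MAKE-" (symbol-name name))
                      (symbol-package name))))
    (apply constructor
           ;; Coerce slot name designators to keywords as described above.
           (loop for (slot-name value) on initargs by #'cddr
                 collect (intern (string slot-name) (find-package 'keyword))
                 collect value))))

(defun read-structure (string)
  "Read STRING with the hypothetical STRUCTURE-CLIENT."
  (let ((eclector.base:*client* (make-instance 'structure-client)))
    (eclector.reader:read-from-string string)))

(read-structure "#S(point :x 1 :y 2)")
```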
Warning: This generic function is deprecated as of Eclector 0.9 and will be removed in a future version. Please use the generic function eclector.reader:call-with-state-value with the aspect designator 'cl:*package* instead (see Reader state protocol for more information on the reader state protocol).
This generic function is called by the reader when input has to be read with a particular current package. This is currently only the case in the #+ and #- reader macro functions which read feature expressions in the keyword package. thunk is a function that should be called without arguments. package-designator designates the package that should be the current package around the call to thunk.
The default method on this generic function simply binds cl:*package* to the result of (cl:find-package package-designator) around calling thunk.
This generic function is called by the default #. reader macro function to perform read-time evaluation. expression is the expression that should be evaluated as it was returned by a recursive read call and potentially influenced by client. The function has to either return the result of evaluating expression or signal an error.
The default method on this generic function simply returns the result of (cl:eval expression).
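A client that wants to keep #. expressions unevaluated, for example when reading untrusted input, could override this generic function roughly as follows. The class no-eval-client and the :unevaluated marker are hypothetical, and Quicklisp is assumed to be available.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

;; Hypothetical client that never evaluates at read time.
(defclass no-eval-client () ())

(defmethod eclector.reader:evaluate-expression
    ((client no-eval-client) expression)
  ;; Return a marker together with the expression instead of calling EVAL.
  (list :unevaluated expression))

(defun read-without-eval (string)
  "Read STRING with the hypothetical NO-EVAL-CLIENT."
  (let ((eclector.base:*client* (make-instance 'no-eval-client)))
    (eclector.reader:read-from-string string)))

;; Default client: the #. expression is evaluated.
(eclector.reader:read-from-string "(1 #.(+ 1 2))")

;; Hypothetical client: the expression is returned in wrapped form.
(read-without-eval "(1 #.(+ 1 2))")
```

Signaling an error in the method instead of returning a marker would be an equally valid design if read-time evaluation should be rejected outright.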
This generic function is called by the default #+ and #- reader macro functions to check the well-formedness of feature-expression which has been read from the input stream before evaluating it. For compound expressions, only the outermost expression is checked regarding the atom in operator position and its shape – child expressions are not checked. The function returns an unspecified value if feature-expression is well-formed and signals an error otherwise.
The default method on this generic function accepts standard Common Lisp feature expressions, i.e. expressions recursively composed of symbols, :not-expressions, :and-expressions and :or-expressions.
This generic function is called by the default #+ and #- reader macro functions to evaluate feature-expression which has been read from the input stream. The function returns either true or false if feature-expression is well-formed and signals an error otherwise.
For compound feature expressions, the well-formedness of child expressions is not checked immediately but lazily, just before the child expression in question is evaluated in a subsequent evaluate-feature-expression call. This allows expressions like #+(and my-cl-implementation (special-feature a b)) form to be read without error when the :my-cl-implementation feature is absent.
The default method on this generic function first calls check-feature-expression to check the well-formedness of feature-expression. It then evaluates feature-expression according to standard Common Lisp semantics for feature expressions.
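The lazy checking of child expressions can be observed with the default client, as in the sketch below. It assumes that the default client takes the feature list from the current binding of cl:*features*. The helper read-with-features is hypothetical, and Quicklisp is assumed to be available.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

(defun read-with-features (features string)
  "Read one object from STRING with cl:*features* bound to FEATURES."
  (let ((*features* features))
    (eclector.reader:read-from-string string)))

(defparameter *input*
  "#+(and my-cl-implementation (special-feature a b)) 1 2")

;; Feature absent: the malformed child expression is never evaluated, so
;; the guarded form 1 is skipped and 2 is read.
(read-with-features '() *input*)

;; Feature present: the child expression is now checked and rejected.
(handler-case (read-with-features '(:my-cl-implementation) *input*)
  (error (condition)
    (format nil "Rejected: ~A" (type-of condition))))
```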
The reader state protocol consists of generic functions which the reader and the client call to query and modify the values of reader state aspects. Each aspect is named by a symbol and holds a current value and has a stack of shadowed values like a special variable. Most aspects roughly correspond to a particular reader control variable defined in the Common Lisp specification. In addition to those, Eclector uses aspects for representing the validity of the consing dot as well as the quasiquotation depth and validity in a given context. In total, Eclector defines the following aspects:
cl:*readtable*
Like the cl:*readtable* special variable, this aspect controls the readtable object in which the reader looks up the syntax types of characters, the case conversion mode as well as reader macros. By default, values of this aspect must satisfy the eclector.readtable:readtablep predicate.
cl:*package*
Like the cl:*package* special variable, this aspect controls the package which the reader uses when it looks up or interns symbols in the current package. By default, values of this aspect must be package designators.
cl:*read-suppress*
Like the cl:*read-suppress* special variable, this aspect controls whether the reader skips over expressions without detailed parsing.
cl:*read-eval*
Like the cl:*read-eval* special variable, this aspect controls whether the reader evaluates expressions in #. constructs.
cl:*features*
Like the cl:*features* special variable, this aspect controls the evaluation of features in feature expressions in #+ and #- constructs. By default, values of this aspect must be proper lists of symbols.
cl:*read-base*
Like the cl:*read-base* special variable, this aspect controls the interpretation of tokens by the reader as being integers or ratios. By default, values of this aspect must be of type (integer 2 36).
cl:*read-default-float-format*
Like the cl:*read-default-float-format* special variable, this aspect controls the floating-point format that the reader uses for floating-point numbers without exponent marker or the default exponent marker.
eclector.reader::*quasiquotation-state*
Warning: Clients should not query, bind or set the value of this aspect at this time.
This aspect controls whether backquote and unquote are allowed in the current context.
eclector.reader::*quasiquotation-depth*
Warning: Clients should not query, bind or set the value of this aspect at this time.
This aspect tracks the backquote nesting depth in the current context.
eclector.reader::*consing-dot-allowed-p*
Warning: Clients should not query, bind or set the value of this aspect at this time.
This aspect controls whether the consing dot is allowed in the current context.
Errors of this type are signaled when an attempt is made to establish an object as the value for a reader state aspect and the supplied object is not of the type required by the aspect.
Since this condition type is a subtype of cl:type-error, the offending value and the expected type can be retrieved via the readers cl:type-error-datum and cl:type-error-expected-type respectively. The aspect for which the value was supplied can be retrieved via the reader eclector.reader:aspect.
This generic function is called by the reader to determine whether value is a valid value for the reader state aspect designated by aspect. The generic function returns true if, according to client, value is a valid value for the reader state aspect designated by aspect. aspect must designate a reader state aspect that is recognized by client. At least the aspects listed in the minimal reader state aspects table must be recognized by any client.
With the exceptions of cl:*readtable* and cl:*package*, the default methods on this generic function recognize state aspects and implement type restrictions informed by the Common Lisp specification:
Aspect | Type |
---|---|
cl:*readtable* | (satisfies eclector.readtable:readtablep) |
cl:*package* | (or cl:package cl:symbol cl:string cl:character) (package designator) |
cl:*read-suppress* | t (generalized Boolean) |
cl:*read-eval* | t (generalized Boolean) |
cl:*features* | list (proper list) |
cl:*read-base* | (integer 2 36) (radix) |
cl:*read-default-float-format* | (member short-float single-float double-float long-float) |
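The type restrictions in the table can be probed directly, as in the following sketch. It passes nil as the client so that only the default methods apply, and it assumes that Quicklisp is available to load Eclector.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

(defun valid-p (aspect value)
  "Return true if VALUE is valid for ASPECT according to the default methods."
  ;; NIL as the client selects the methods that are not client-specific.
  (eclector.reader:valid-state-value-p nil aspect value))

(list (valid-p 'cl:*read-base* 16)                           ; a radix
      (valid-p 'cl:*read-base* 37)                           ; too large
      (valid-p 'cl:*read-default-float-format* 'double-float)
      (valid-p 'cl:*features* '(:a :b)))                     ; a proper list
```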
Return the current value of the reader state aspect designated by aspect.
aspect must designate a reader state aspect that is recognized by client. At least the aspects listed in the minimal reader state aspects table must be recognized by any client.
The cl:*package* aspect mandates further explanation: When the client uses only the default methods of the reader state protocol, the return value of this generic function for the cl:*package* aspect is of type cl:package which is a strict subtype of the type of valid values for this aspect. In other words, the defaults coerce package designators to package objects.
Set the current value of the reader state aspect designated by aspect to new-value.
aspect must designate a reader state aspect that is recognized by client. At least the aspects listed in the minimal reader state aspects table must be recognized by any client.
new-value is the desired new value for the designated aspect.
new-value has to be a valid value for aspect in the sense that (eclector.reader:valid-state-value-p client aspect new-value) must return true.
The cl:*package* aspect mandates further explanation: When the client uses only the default methods of the reader state protocol, the method on this generic function which handles the cl:*package* aspect coerces new-value from designators to package objects so that a subsequent eclector.reader:state-value call returns the designated package object.
Call thunk with the reader state aspect designated by aspect bound to value.
aspect must designate a reader state aspect that is recognized by client. At least the aspects listed in the minimal reader state aspects table must be recognized by any client.
The following properties must hold:
(eclector.reader:valid-state-value-p client aspect value) must return true.
(eclector.reader:state-value client aspect) must evaluate to value.
When Eclector calls this generic function with cl:*package* as the value of aspect, the value is always a string designator and never a package object. The default method on this generic function coerces such string designators to package objects so that a subsequent eclector.reader:state-value call returns the designated package object.
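The following sketch binds the cl:*read-base* aspect around a read call. The argument order (client thunk aspect value) for call-with-state-value is an assumption, since only the parameter names appear in the description above. The helpers read-in-base and base-inside-thunk are hypothetical, and Quicklisp is assumed to be available.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

(defun read-in-base (base string)
  "Read one object from STRING with the cl:*read-base* aspect bound to BASE."
  ;; Assumed argument order: client, thunk, aspect, value.
  (eclector.reader:call-with-state-value
   eclector.base:*client*
   (lambda () (eclector.reader:read-from-string string))
   'cl:*read-base* base))

(defun base-inside-thunk (base)
  "Return the cl:*read-base* aspect as seen while it is bound to BASE."
  (eclector.reader:call-with-state-value
   eclector.base:*client*
   (lambda ()
     (eclector.reader:state-value eclector.base:*client* 'cl:*read-base*))
   'cl:*read-base* base))

;; The token "ff" is interpreted as a hexadecimal integer.
(read-in-base 16 "ff")

;; Within the thunk, the aspect evaluates to the bound value.
(base-inside-thunk 2)
```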
Backquote and unquote syntax is forbidden in some contexts such as multi-dimensional array literals (#A) and structure literals (#S). Eclector tracks and controls whether backquote, unquote or both should be allowed in a given context using the aspects eclector.reader::*quasiquotation-state* and eclector.reader::*quasiquotation-depth* mentioned above. Since custom reader macros may also have to control this state, Eclector provides the following convenience macro:
Warning: This macro is experimental and its name is not exported for now.
Control whether backquote syntax, unquote syntax or both are allowed in read functions called during the execution of body. context is a symbol identifying the current context which is used for error reporting. A typical value is the name of the reader macro function in which this macro is used. quasiquote-forbidden-p controls whether backquote syntax should be forbidden. The value :keep causes the binding to remain unchanged. unquote-forbidden-p controls whether unquote syntax should be forbidden. The value :keep causes the binding to remain unchanged.
Warning: This macro is deprecated as of Eclector 0.11 and will be removed in a future version. This macro is replaced by the macro eclector.reader::with-quasiquotation-state but that macro is experimental and its name is not exported for now.
Disallow backquote syntax, unquote syntax or both in read functions called during the execution of body. context is a symbol identifying the current context which is used for error reporting. A typical value is the name of the reader macro function in which this macro is used. quasiquote-forbidden-p controls whether backquote syntax should be forbidden. The value :keep causes the binding to remain unchanged. unquote-forbidden-p controls whether unquote syntax should be forbidden. The value :keep causes the binding to remain unchanged.
Eclector includes implementations of the #= and ## reader macros and they are present in the default readtable. One way to customize the behavior of the reader around the #= and ## syntax is replacing the reader macro functions with custom ones but with this approach the client code has to reimplement a lot of functionality. As a finer grained and more composable mechanism for customization, Eclector provides a protocol for implementing and customizing the behavior of the #= and ## reader macros, with or without modifying the readtable. The remainder of this section describes that protocol.
To start with a bit of terminology, we call the object created by reading #N=expression a labeled object. We call N the label of the labeled object and the result of reading expression the object of the labeled object. We say that #N=expression defines the labeled object and #N# references the labeled object. We call the reference circular if #N# occurs within expression. Labeled objects are internal to the reader and only exist during eclector.reader:read calls: before such a call returns an object, each labeled object within the returned object is replaced by its respective final object. Callers of eclector.reader:read and related functions will therefore only ever see the object, never the labeled object.
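From the point of view of a caller, a circular reference simply yields a circular object, as the following sketch shows. It assumes that Quicklisp is available to load Eclector; the helper read-circular-list is hypothetical.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(ql:quickload "eclector" :silent t)

(defun read-circular-list ()
  "Read a list whose cdr refers back to the list itself."
  (eclector.reader:read-from-string "#1=(a . #1#)"))

;; Inspect the structure instead of printing it: printing a circular list
;; does not terminate unless *PRINT-CIRCLE* is true.
(let ((object (read-circular-list)))
  (list (consp object)
        (eq object (cdr object))))
```

The second element of the result is true because the reference #1# has been replaced by the final object, the cons cell itself, before the read call returned.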
On a technical level, a labeled object is represented as a data type with a current state and a single (possibly unbound) slot containing the object. The following diagram depicts the possible states of a labeled object together with input patterns and corresponding transitions:
Put differently, a labeled object can be in the following states:
State | Object slot |
---|---|
undefined | – |
defined | unbound |
final | the object |
referenced (not strictly needed) | the object |
circular | unbound |
final (circular) | the object |
referenced (circular) (not strictly needed) | the object |
The distinction between the states final and referenced on the one hand and final (circular) and referenced (circular) on the other hand is not required for implementing labeled objects. Those two pairs of states are therefore collapsed to just final and final (circular) in the remainder of this section. The following figure and paragraphs describe generic functions and methods which implement the creation, registration, lookup and manipulation of labeled objects according to the reduced set of states:
In addition to the generic functions referenced in the above diagram, the generic functions fixup-graph-p, fixup-graph and fixup are part of the protocol. Those functions are used to replace labeled objects with their respective final objects within an object that is about to be returned to the caller of eclector.reader:read. To this end, the #= reader macro function must inspect and update the state of the labeled object it is processing after reading expression by calling finalize-labeled-object. finalize-labeled-object decides whether fixup-graph (see below) must be called: If after reading expression the labeled object is in state :circular, expression must have contained circular references and the result of reading it contains labeled objects that have to be replaced with their respective final objects. fixup-graph and fixup perform this replacement. This replacement is performed by recursively traversing objects which are reachable from the final object of the labeled objects, for example by visiting the slots of standard objects, and replacing labeled objects with their respective final objects.
In certain cases, the computational complexity of this traversal and replacement can be rather high, depending on when and how exactly the traversal is performed: consider an expression of the form #1=(1 #1# #2=(2 #2# …)). The nested labeled objects in this expression are all circular and thus require fixing up. The read call for the innermost labeled object, say #100=…, returns first and the fixup processing for the labeled object could be performed immediately. The problem is that each of the labeled objects would be processed in the same manner which would lead to a computational complexity of O(N M) where N is the number of labels and M is the number of nodes in the object graph rooted at the object which is returned by the outermost read call. One way to avoid this problem would be to perform fixup processing only for the outermost read call. The problem with that approach is that only a small sub-graph of the whole object graph may be circular in which case most of the work for traversing the whole graph would be wasted. To address both problems, Eclector allows clients to track the nesting of labeled objects and fix up sub-graphs which contain multiple nested objects in one go (see fixup-graph-p).
The generic function call-with-label-tracking is called by the default method on call-as-top-level-read in order to establish a context for tracking #= label definitions and ## label references around a call to thunk.
The default method on this generic function establishes a context in which the default #= and ## reader macro functions can make the appropriate calls to note-labeled-object, forget-labeled-object, find-labeled-object.
The generic function note-labeled-object is called by the default #= reader macro function to note the definition of a labeled object with label label while reading from input-stream. The function creates, registers and returns a representation of the labeled object. The returned object is registered in the sense that a subsequent call to find-labeled-object with arguments client and label returns the same object unless forget-labeled-object has been called to unregister the object.
parent is either nil or a (previously created) surrounding labeled object. The parent labeled object is provided to allow the client to potentially defer fixup processing for the new labeled object if the processing for the surrounding labeled object subsumes the processing for the new labeled object.
Note that, when reading an expression of the form #N=object, this function is called after reading #N= from input-stream but before reading object. Consequently, the created and returned labeled object is defined but does not have an object associated with it.
The default method on this generic function calls make-labeled-object with client, input-stream and label to create an object of an unspecified type. The method registers and returns the created object. Client code should manipulate the object only via the generic functions described in this section and in particular not rely on the object being of a particular type (since methods on make-labeled-object specialized to certain client classes could return unexpected objects). The default method requires the context established by the default method on call-with-label-tracking.
The generic function forget-labeled-object is called by the default #= reader macro function when Eclector reads an invalid labeled object of the form #N=#N# and the caller chooses to recover from the resulting error (see Recovering from errors). In that situation, the remainder of the input is processed as if there had been no labeled object with label N. This function makes the labeled object undefined so that a subsequent find-labeled-object call for label will return nil.
The default method on this generic function requires the context established by the default method on call-with-label-tracking.
The generic function find-labeled-object is called by the default ## reader macro function to look up the previously registered representation of a labeled object for label. The function returns nil if no such object has been registered for label and the registered object otherwise.
The default method on this generic function requires the context established by the default method on call-with-label-tracking.
The generic function make-labeled-object is called by note-labeled-object to create and return a representation of a labeled object with label label. parent is either nil or a previously created, surrounding labeled object which allows the client to potentially defer fixup processing for the new labeled object if the processing for the surrounding labeled object subsumes the processing.
The default method on this generic function creates and returns an object of an unspecified type. Client code should manipulate the object only via the generic functions labeled-object-state, finalize-labeled-object and reference-labeled-object and in particular not rely on the object being of a particular type (since methods on this generic function specialized to certain client classes could return unexpected objects).
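The generic function labeled-object-state is called by the default #= reader macro function to determine the state of object. This function returns nil if object is not a labeled object. Otherwise, it returns the state of the labeled object as the first value and, once the labeled object has been finalized, its final object as the second value.
The following table lists all possible return value shapes: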
object is a labeled object | First value | Second value |
---|---|---|
no | nil | |
yes | :defined | nil |
yes | :circular | nil |
yes | :final | final-object |
yes | :final/circular | final-object |
The default method on this generic function is applicable to labeled object representations returned by the default methods on note-labeled-object and make-labeled-object.
The generic function finalize-labeled-object is called by the default #= reader macro function after reading a complete labeled object in order to store object in labeled-object and change the state of labeled-object to either :final or :final/circular. The function returns two values: the finalized labeled-object and the new state of labeled-object.
The default method on this generic function is applicable to labeled object representations returned by the default methods on note-labeled-object and make-labeled-object.
The generic function reference-labeled-object is called by the default ## reader macro function to process a reference to labeled-object while reading from input-stream. labeled-object must be a representation of a labeled object and has, in the context of the ## reader macro function, likely been obtained by calling find-labeled-object. Depending on the state of labeled-object, this function returns either labeled-object itself or an object that can be returned to the caller as-is. In case labeled-object is returned, it will be replaced by its associated object later, when fixup-graph is called.
The default method on this generic function is applicable to labeled object representations returned by the default methods on note-labeled-object and make-labeled-object.
As briefly mentioned above, the generic functions fixup-graph and fixup traverse and inspect objects in the object graph reachable from an object that is about to be returned to the caller of eclector.reader:read. In order to distinguish ordinary objects from labeled objects that act as placeholders in the object graph and must be replaced with their respective final objects, fixup methods call labeled-object-state on all encountered objects. labeled-object-state returns nil for all objects that are not labeled objects and :final for labeled objects which must be replaced with their final object.
The generic function fixup-graph-p is potentially called by a method on finalize-labeled-object to determine whether the object graph reachable from the object of root-labeled-object should be fixed up by calling fixup-graph with client and root-labeled-object.
Multiple default methods on this generic function jointly implement the following behavior:
The generic function fixup-graph is potentially called after the reader has constructed an object graph which is reachable from the object of root-labeled-object and noticed circular references within this graph. Its purpose is to fix up those circular references before the object of root-labeled-object is returned to the caller (of read or related functions).
object-key is a function that accepts a labeled object and returns the object of the labeled object.
The default method on this generic function creates a hash table for tracking already processed objects and calls fixup with client, the object of root-labeled-object and the hash table to recursively process objects in the object graph which is reachable from the object of root-labeled-object.
The generic function fixup is potentially called to apply circularity-related changes to the object constructed by the reader before it is returned to the caller. object is the object that should be modified. seen-objects is an eq hash table used to track already processed objects (see below). A method specialized to a class, instances of which consist of parts, should modify object by scanning its parts for labeled object markers, replacing found labeled object markers with the respective final object and recursively calling fixup for all parts.
To recognize labeled objects which have to be replaced, methods should call labeled-object-state on each part of object and interpret the returned values as follows: if nil is returned, the part should not be replaced but recursively processed. If :final is returned as the first value, the part should be replaced with the final object that is returned as the second value. Parts are replaced by mutating object.
fixup is called for side effects – its return value is ignored.
Default methods specializing the object parameter to cons, array, standard-object and hash-table process instances of those classes in the obvious way.
An unspecialized :around method queries and updates seen-objects to ensure that each object is processed exactly once.
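The replacement rule described above can be sketched for a class that the default methods do not cover. The following is a minimal sketch, assuming that Eclector is loaded and that fixup has the lambda list (client object seen-objects). The names `my-client` and `box` are hypothetical, and treating :final/circular like :final is an assumption based on the state table earlier in this section.

```lisp
;; Assumes Eclector is loaded. MY-CLIENT and BOX are hypothetical names
;; used only for this sketch.
(defclass my-client () ())

(defstruct box value)

(defmethod eclector.reader:fixup ((client my-client) (object box) seen-objects)
  (let ((part (box-value object)))
    (multiple-value-bind (state final-object)
        (eclector.reader:labeled-object-state client part)
      (case state
        ;; Not a labeled object: keep the part, but process it recursively.
        ((nil)
         (eclector.reader:fixup client part seen-objects))
        ;; A labeled object marker: replace it by mutating OBJECT.
        ;; Handling :FINAL/CIRCULAR here is an assumption of this sketch.
        ((:final :final/circular)
         (setf (box-value object) final-object))))))
```

Such a method only matters if the client can construct `box` instances while reading, for example through a client-defined method on eclector.reader:make-structure-instance.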
The following generic functions allow clients to construct representations of quoted and quasiquoted forms as well as function special forms.
This generic function is called by the default '-reader macro function to construct a quotation form in which material is the quoted material.
The default method on this generic function returns a result equivalent to (list 'common-lisp:quote material).
This generic function is called by the default `-reader macro function to construct a quasiquotation form in which form is the quasiquoted material.
The default method on this generic function returns a result equivalent to (list 'eclector.reader:quasiquote form).
This generic function is called by the default ,-reader macro function to construct an unquote form in which form is the unquoted material.
The default method on this generic function returns a result equivalent to (list 'eclector.reader:unquote form).
This generic function is called by the default ,@-reader macro function to construct a splicing unquote form in which form is the unquoted material.
The default method on this generic function returns a result equivalent to (list 'eclector.reader:unquote-splicing form).
This generic function is called by the default #'-reader macro function to construct a form that applies the function special operator to the name expression.
The default method on this generic function returns a result equivalent to (list 'common-lisp:function form).
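To illustrate how a client might customize one of these constructions, here is a minimal sketch. It assumes that Eclector is loaded and that the generic function for the ' reader macro is eclector.reader:wrap-in-quote with the lambda list (client material). The class `quote-marking-client` and the symbol `user-quote` are hypothetical names.

```lisp
;; Assumes Eclector is loaded. QUOTE-MARKING-CLIENT and USER-QUOTE are
;; hypothetical names used only for this sketch.
(defclass quote-marking-client () ())

;; Assumed generic function name and lambda list: WRAP-IN-QUOTE (client material).
(defmethod eclector.reader:wrap-in-quote ((client quote-marking-client) material)
  (list 'user-quote material))

(defparameter *quoted*
  (let ((eclector.reader:*client* (make-instance 'quote-marking-client)))
    (eclector.reader:read-from-string "'a")))
```

With this method in place, `*quoted*` should be a list whose first element is `user-quote` instead of `common-lisp:quote`, while other clients keep the default behavior.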
The standard syntax types and macro character associations used by the ordinary reader can be set up for any readtable object implementing the readtable protocol (see Readtable features). The following functions are provided for this purpose:
This function sets the standard syntax types in readtable (see HyperSpec section 2.1.4).
This function sets the standard macro characters in readtable (see HyperSpec section 2.4).
This function sets the standard dispatch macro characters, that is sharpsign and its sub-characters, in readtable (see HyperSpec section 2.4.8).
This function sets the standard syntax types and macro characters in readtable by calling the above three functions.
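A fresh readtable could be populated along the following lines. This is only a sketch: it assumes that Eclector is loaded, that the last function described above is named eclector.reader:set-standard-syntax-and-macros, and that a simple readtable implementation is available as the class eclector.readtable.simple:readtable. The variable names are hypothetical.

```lisp
;; Assumes Eclector is loaded. The function name
;; SET-STANDARD-SYNTAX-AND-MACROS and the class
;; ECLECTOR.READTABLE.SIMPLE:READTABLE are assumptions of this sketch.
(defparameter *my-readtable*
  (let ((readtable (make-instance 'eclector.readtable.simple:readtable)))
    (eclector.reader:set-standard-syntax-and-macros readtable)
    readtable))

;; Read with the freshly initialized readtable.
(defparameter *form*
  (let ((eclector.reader:*readtable* *my-readtable*))
    (eclector.reader:read-from-string "(a \"b\" #(1 2))")))
```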
In this section, symbols written without package marker are in the eclector.readtable package (see Package for readtable features).
This package exports two kinds of symbols:
This function is the generic version of the standard Common Lisp function cl:readtablep. The function returns true if object can be used as a readtable in Eclector via the protocol functions in the eclector.readtable package. The default method returns nil.
TODO
In this section, symbols written without package marker are in the eclector.parse-result package (see Package for parse result construction features).
This package provides clients with a reader that behaves similarly to cl:read but returns custom parse result objects controlled by the client. Some parse results correspond to things like symbols, numbers and lists that cl:read would return, while others, if the client chooses, represent comments and other kinds of input that cl:read would discard. Furthermore, clients can associate source location information with parse results.
Clients using this package pass a “client” object for which methods on the generic functions described below are applicable to read, read-preserving-whitespace or read-from-string. Suitable client classes can be defined by using parse-result-client as a superclass and at least defining a method on the generic function make-expression-result.
When a client constructs parse results, some of the generic functions for customizing the behavior of the reader (see Reader behavior protocol) return additional values:
Generic function | Situation | Ordinary values | Extended values |
---|---|---|---|
eclector.reader:call-as-top-level-read | object | object | object, parse result, orphan results |
eclector.reader:read-common | object | object | object, parse result |
eclector.reader:read-maybe-nothing | object | object, kind | object, kind, parse result |
eclector.reader:call-as-top-level-read | end of input | eof-value | eof-value, orphan results |
eclector.reader:read-common | end of input | eof-value | eof-value |
eclector.reader:read-maybe-nothing | end of input | eof-value, :eof | eof-value, :eof |
Note how eclector.reader:call-as-top-level-read and eclector.reader:read-common return fewer values for the “end of input” situation. This difference in return value count allows the caller to recognize the “end of input” situation even if eof-value is an object that could be read such as nil. Using such an eof-value makes sense for clients which construct parse results since top-level eclector.parse-result:read calls return these parse results so that there is no risk of confusing the chosen eof-value, even if something like nil, with having read a similar object.
Figure 2.5 shows typical function call patterns, including ordinary and additional return values, that arise when the functions read, read-preserving-whitespace, read-from-string and read-delimited-list are called by client code.
This function is the main entry point for this variant of the reader. It is in many ways similar to the standard Common Lisp function cl:read. The differences are:
This function is similar to the standard Common Lisp function cl:read-preserving-whitespace. The differences are the same as described above for read compared to cl:read.
This function is similar to the standard Common Lisp function cl:read-from-string. The differences are:
This class should generally be used as a superclass for client classes using this package.
This generic function is called in order to construct a parse result object. The value of the result parameter is the raw object read. The value of the children parameter is a list of already constructed parse result objects representing objects read by recursive read calls. The value of the source parameter is a source range, as returned by eclector.base:make-source-range and eclector.base:source-position delimiting the range of characters from which result has been read.
This generic function does not have a default method since the purpose of the package is the construction of custom parse results. Thus, a client must define a method on this generic function.
This generic function is called after the reader skipped over a range of characters in stream. It returns either nil if the skipped input should not be represented or a client-specific representation of the skipped input. The value of the children parameter is a list of already constructed parse result objects which represent objects read by recursive read calls (such as the feature expression and the ignored expression in #+(and (or) some-feature) skipped-expression). The value of the source parameter designates the skipped range using a source range representation obtained via make-source-range and source-position.
Reasons for skipping input include comments, the #+ and #- reader macros and *read-suppress*. The aforementioned reasons are reflected by the value of the reason parameter as follows:
Input | Value of the reason parameter |
---|---|
Comment starting with ; | (:line-comment . 1) |
Comment starting with ;; | (:line-comment . 2) |
Comment starting with n ; | (:line-comment . n) |
Comment delimited by #| |# | :block-comment |
#+false-expression | (:sharpsign-plus . false-expression) |
#-true-expression | (:sharpsign-minus . true-expression) |
*read-suppress* is true | *read-suppress* |
A reader macro returns no values | :reader-macro |
The default method returns nil, that is the skipped input is not represented as a parse result.
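The two generic functions above can be combined into a small client. The following sketch assumes that Eclector is loaded and uses the lambda lists described in this section; the class name `my-client`, the variable `*parse-result*` and the plist layout of the parse results are hypothetical choices.

```lisp
;; Assumes Eclector is loaded. MY-CLIENT and the plist layout are
;; hypothetical choices for this sketch.
(defclass my-client (eclector.parse-result:parse-result-client) ())

;; Represent every expression as a plist with its source and children.
(defmethod eclector.parse-result:make-expression-result
    ((client my-client) (result t) (children t) (source t))
  (list :result result :source source :children children))

;; Represent skipped input (comments etc.) instead of discarding it.
(defmethod eclector.parse-result:make-skipped-input-result
    ((client my-client) (stream t) (reason t) (children t) (source t))
  (list :reason reason :source source))

(defparameter *parse-result*
  (with-input-from-string (stream "(1 #|comment|# \"string\")")
    (eclector.parse-result:read (make-instance 'my-client) stream)))
```

The :children entry of `*parse-result*` should then contain expression results for 1 and "string" as well as a skipped input result whose reason is :block-comment.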
In this section, symbols written without package marker are in the eclector.concrete-syntax-tree package (see Package for CST features).
This function is the main entry point for the CST reader. It is mostly compatible with the standard Common Lisp function cl:read. The differences are:
This function is similar to the standard Common Lisp function cl:read-preserving-whitespace. The differences are the same as described above for read compared to cl:read.
This function is similar to the standard Common Lisp function cl:read-from-string. The differences are the same as described above for read compared to cl:read.
Eclector offers extensive support for recovering from many syntax errors, continuing to read from the input stream and return a result that somewhat resembles what would have been returned in case the syntax had been valid. To this end, a restart named eclector.reader:recover is established when recoverable errors are signaled. Like the standard Common Lisp restart cl:continue, this restart can be invoked by a function of the same name:
This function recovers from an error by invoking the most recently established applicable restart named eclector.reader:recover. If no such restart is currently established, it returns nil. If condition is non-nil, only restarts that are either explicitly associated with condition, or not associated with any condition are considered.
When a read call during which error recovery has been performed returns, Eclector tries to return an object that is similar in terms of type, numeric value, sequence length, etc. to what would have been returned in case the input had been well-formed. For example, recovering after encountering the invalid digit in #b11311 returns either the number #b11011 or the number #b11111.
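A typical way to use the restart is to install a handler that invokes it for every error. The following sketch assumes that Eclector is loaded; the variable names are hypothetical, and binding the handler to the broad type `error` is a simplification.

```lisp
;; Assumes Eclector is loaded. The variable names are hypothetical.
;; RECOVER receives the signaled condition and invokes the
;; ECLECTOR.READER:RECOVER restart if one is established.
(defparameter *recovered-list*
  (handler-bind ((error #'eclector.reader:recover))
    ;; The closing parenthesis is missing.
    (eclector.reader:read-from-string "(1 2")))

(defparameter *recovered-number*
  (handler-bind ((error #'eclector.reader:recover))
    ;; 3 is not a valid binary digit.
    (eclector.reader:read-from-string "#b11311")))
```

`*recovered-list*` is expected to resemble the list that would have been read from well-formed input, and `*recovered-number*` should be one of the two numbers mentioned above.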
A syntax error and a corresponding recovery strategy are characterized by the type of the signaled condition and the report of the established eclector.reader:recover restart respectively. Attempting to list and describe all examples of both would provide little insight. Instead, this section describes different classes of errors and corresponding recovery strategies in broad terms:
Note that attempting to recover from syntax errors may lead to apparent success in the sense that the read call returns an object, but this object may not be what the caller wanted. For example, recovering from the missing closing " in the following example
(defun foo (x y)
  "My documentation string
  (+ x y))
results in (DEFUN FOO (X Y) "My documentation string<newline> (+ x y))"), not (DEFUN FOO (X Y) "My documentation string" (+ x y)).
This chapter describes potential side effects of calling eclector.reader:read, eclector.reader:read-preserving-whitespace or eclector.reader:read-from-string for different kinds of clients.
The following destructive modifications are considered uninteresting and ignored in the remainder of this section:
Furthermore, the remainder of this section is written under the following assumptions:
If any of the above assumptions does not hold, “all bets are off” in the sense that arbitrary side effects other than the ones described below are possible. For notes regarding non-default clients, see Potential side effects for non-default clients.
The default method on the generic function eclector.reader:interpret-symbol may create and intern symbols, thereby modifying the package system.
The default method on the generic function eclector.reader:evaluate-expression uses cl:eval to evaluate arbitrary expressions, potentially causing side effects. With the default readtable, the generic function is only called by the macro function of the #. reader macro.
The default method on the generic function eclector.reader:call-reader-macro can cause side effects by calling macro functions that cause side effects. The following standard reader macros potentially cause side effects:
In addition to the potential side effects described in Symbols and packages (default client), strings passed as the third argument of eclector.reader:interpret-token are potentially destructively modified during conversion to the current readtable case.
The same considerations as in Read-time evaluation (default client) apply.
Clients defining methods on eclector.reader:make-structure-instance which implement the standard behavior of calling the default constructor (if any) of the named structure should consider side effects caused by slot initforms of the structure. The following example illustrates this problem:
(defvar *counter* 0)
(defstruct foo (bar (incf *counter*)))
#S(foo)
*counter* ⇒ 1
#S(foo)
*counter* ⇒ 2
The fixup generic function potentially modifies its second argument destructively. Clients that define methods on eclector.reader:make-structure-instance should be aware of this potential modification in cases like #1=#S(foo :bar #1#). Similar considerations apply for other ways of constructing compound objects such as #1=(t . #1#).
The following standard reader macros could cause or be affected by side effects when combined with a non-standard client:
This chapter describes Eclector’s interpretation of passages in the Common Lisp specification that do not describe the behavior of a conforming reader completely unambiguously.
At first glance, Sharpsign C and Sharpsign S seem to follow the same syntactic structure: the dispatch macro character followed by the sub-character followed by a list of a specific structure. However, the actual descriptions of the respective syntax are different. For Sharpsign C, the specification states:
#C reads a following object, which must be a list of length two whose elements are both reals.
For Sharpsign S, on the other hand, the specification describes the syntax as:
#s(name slot1 value1 slot2 value2 ...) denotes a structure.
Note how the description for Sharpsign C relies on a recursive read invocation while the description for Sharpsign S gives a character-level pattern with meta-syntactic variables. It is possible that this is an oversight and the syntax was intended to be uniform between the two reader macros. Whatever the case may be, in order to handle existing code without inconveniencing clients, Eclector implements both Sharpsign C and Sharpsign S with a recursive read invocation which corresponds to permissive behavior.
More concretely, Eclector behaves as summarized in the following table:
Input | Behavior |
---|---|
#C(1 2) | Read as #C(1 2) |
#C (1 2) | Read as #C(1 2) |
#C#||#(1 2) | Read as #C(1 2) |
#C#.(list 1 (+ 2 3)) | Read as #C(1 5) |
#C[1 2] for left-parenthesis syntax on [ | Read as #C(1 2) |
#S(foo) | Read as #S(foo) |
#S (foo) | Read as #S(foo) |
#S#||#(foo) | Read as #S(foo) |
#S#.(list 'foo) | Read as #S(foo) |
#S[foo] for left-parenthesis syntax on [ | Read as #S(foo) |
Eclector provides a strict version of the Sharpsign C macro function under the name eclector.reader:strict-sharpsign-c which behaves as follows:
Input | Behavior |
---|---|
#C(1 2) | Read as #C(1 2) |
#C (1 2) | Rejected |
#C#||#(1 2) | Rejected |
#C#.(list 1 (+ 2 3)) | Rejected |
#C[1 2] for left-parenthesis syntax on [ | Read as #C(1 2) |
Eclector provides a strict version of the Sharpsign S macro function under the name eclector.reader:strict-sharpsign-s which behaves as follows:
Input | Behavior |
---|---|
#S(foo) | Read as #S(foo) |
#S (foo) | Rejected |
#S#||#(foo) | Rejected |
#S#.(list 'foo) | Rejected |
#S[foo] for left-parenthesis syntax on [ | Rejected |
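A client that prefers the strict behavior could install the strict macro function in a copy of the readtable. The following sketch assumes that Eclector is loaded; the names `*strict-readtable*` and `read-strictly` are hypothetical. The strict Sharpsign S function can presumably be installed in the same way under the sub-character S.

```lisp
;; Assumes Eclector is loaded. *STRICT-READTABLE* and READ-STRICTLY are
;; hypothetical names used only for this sketch.
(defparameter *strict-readtable*
  (let ((readtable (eclector.readtable:copy-readtable
                    eclector.reader:*readtable*)))
    (eclector.readtable:set-dispatch-macro-character
     readtable #\# #\C #'eclector.reader:strict-sharpsign-c)
    readtable))

(defun read-strictly (string)
  "Read STRING with the strict readtable; return :REJECTED on errors."
  (let ((eclector.reader:*readtable* *strict-readtable*))
    (handler-case (eclector.reader:read-from-string string)
      (error () :rejected))))

(read-strictly "#C(1 2)")   ; accepted according to the table above
(read-strictly "#C (1 2)")  ; rejected according to the table above
```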
The Common Lisp specification is very specific about the contexts in which the quasiquotation mechanism can be used. Explicit descriptions of the behavior of the quasiquotation mechanism are given for expressions which are lists or vectors and it is implied that unquote is not allowed in other expressions. From this description, it is clear that `#S(foo :bar ,x) is not valid syntax, for example. However, whether `#',foo is valid syntax depends on whether #'thing is considered to be a list. Since `#',foo is a relatively common idiom, Eclector accepts it by default.
Eclector provides a strict version of the Sharpsign Single Quote macro function under the name eclector.reader:strict-sharpsign-single-quote which does not accept unquote in the function name.
The Common Lisp specification describes the behavior of the ## reader macro as follows:
#n#, where n is a required unsigned decimal integer, provides a reference to some object labeled by #n=; that is, #n# represents a pointer to the same (eq) object labeled by #n=.
The vague phrasing “represents a pointer to the same (eq) object” is probably chosen to cover the situation in which the object in question is not yet defined when the reader encounters the #n# reference as is the case with input of the form #n=(…#n#…). The fact that the object is not yet defined when the reference is encountered is not a problem in general except for one situation: assume #_ is a custom reader macro in the current readtable which calls read. In this situation, reading an expression of the form #n=(…#_#n#…) causes the reader macro function for #_ to be called which calls read to read the following object which encounters the reference. This chain of calls leads to a potential problem: the read call made by the reader macro function has to return some object but it cannot return the object labeled n since that object has not been read yet. The reader macro function must therefore receive some sort of implementation-dependent object which stands in for the object labeled n and gets replaced at some later time after the object labeled n has been read. Since the stand-in object is implementation-dependent, the reader macro function must not make any assumptions regarding the type of the object or operate on it in any way other than returning the object or using the object as a part of a compound object.
The following example violates this principle since the reader macro function in custom-macro-readtable calls cl:second on the object returned by eclector.reader:read:
(defun custom-macro-readtable ()
  (let ((readtable (eclector.readtable:copy-readtable
                    eclector.reader:*readtable*)))
    (eclector.readtable:set-dispatch-macro-character
     readtable #\# #\_
     (lambda (stream char sub-char)
       (declare (ignore char sub-char))
       (second (eclector.reader:read stream t nil t))))
    readtable))

(let ((eclector.reader:*readtable* (custom-macro-readtable)))
  (eclector.reader:read-from-string "#1=(:a #_#1#)"))
⇒ undefined
To handle the problem described above, Eclector imposes the following restriction on custom reader macro functions which call read:
A reader macro function which reads an object by calling read must account for the object being of an implementation-dependent type and must not operate on the object in any way other than returning the object or using the object as a part of a compound object.
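For contrast, a reader macro function that respects this restriction only uses the object it has read as a part of a compound object. The following sketch assumes that Eclector is loaded; the names `wrapping-macro-readtable` and `*wrapped*` and the keyword `:wrapped` are hypothetical.

```lisp
;; Assumes Eclector is loaded. WRAPPING-MACRO-READTABLE, *WRAPPED* and
;; :WRAPPED are hypothetical names used only for this sketch.
(defun wrapping-macro-readtable ()
  (let ((readtable (eclector.readtable:copy-readtable
                    eclector.reader:*readtable*)))
    (eclector.readtable:set-dispatch-macro-character
     readtable #\# #\_
     (lambda (stream sub-char parameter)
       (declare (ignore sub-char parameter))
       ;; The object is not inspected. It only becomes a part of a
       ;; compound object, here a fresh list.
       (list :wrapped (eclector.reader:read stream t nil t))))
    readtable))

(defparameter *wrapped*
  (let ((eclector.reader:*readtable* (wrapping-macro-readtable)))
    (eclector.reader:read-from-string "#1=(:a #_#1#)")))
```

Because the stand-in object ends up inside a list, it can be replaced during fixup processing, so the second element of `*wrapped*` should be a list whose second element is `*wrapped*` itself.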
A children parameter has been added to the lambda list of the generic function eclector.parse-result:make-skipped-input-result so that results which represent skipped material can have children. For example, before this change, an eclector.parse-result:read call which encountered the expression #+no-such-feature foo bar potentially constructed parse results for all (recursive) read calls, that is for the whole expression, for no-such-feature, for foo and for bar, but the parse results for no-such-feature and foo could not be attached to a parent parse result and were thus lost. In other words the shape of the parse result tree was
skipped input result #+no-such-feature foo
expression result    bar
With this change, the parse results in question can be attached to the parse result which represents the whole #+no-such-feature foo expression so that the entire parse result tree has the following shape
skipped input result #+no-such-feature foo
  skipped input result no-such-feature
  skipped input result foo
expression result    bar
Since this is a major incompatible change, we offer the following workaround for clients that must support Eclector versions with and without this change:
(eval-when (:compile-toplevel :load-toplevel :execute)
  (let* ((generic-function #'eclector.parse-result:make-skipped-input-result)
         (lambda-list (c2mop:generic-function-lambda-list generic-function)))
    (when (= (length lambda-list) 5)
      (pushnew 'skipped-input-children *features*))))

(defmethod eclector.parse-result:make-skipped-input-result
    ((client client)
     (stream t)
     (reason t)
     #+PACKAGE-THIS-CODE-IS-READ-IN::skipped-input-children (children t)
     (source t))
  ...
  #+PACKAGE-THIS-CODE-IS-READ-IN::skipped-input-children (use children)
  ...)
The above code pushes a symbol that is interned in a package under the control of the respective client (as opposed to the KEYWORD package) onto *features* before the second form is read and uses that feature to select either the version with or the version without the children parameter of the method definition. See Maintaining Portable Lisp Programs by Christophe Rhodes for a detailed discussion of this technique.
Such invalid uses can happen when the above macros are called directly or when the ,, ,@ and ,. reader macros are used in a way that constructs the unquoted expression in one context and then "injects" it into some other context, for example via an object reference #N# or read-time evaluation #.(...). Full example:
(progn
  (print `(a #1=,(+ 1 2) c))
  (print #1#))
Another minor aspect of this change is that the condition types eclector.reader:unquote-splicing-in-dotted-list and eclector.reader:unquote-splicing-at-top are no longer subtypes of common-lisp:stream-error. The previous relation did not make sense since errors of those types are signaled during macro expansion.
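One practical consequence is that handlers written for common-lisp:stream-error no longer see these conditions. The following is a minimal sketch of that effect, assuming that Quicklisp is available and provides the eclector system. The function name classify-expansion and the result keywords are hypothetical and serve only as illustration.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload "eclector" :silent t))

(defun classify-expansion (form)
  "Macroexpand FORM once and report which kind of condition, if any, was signaled."
  (handler-case (progn (macroexpand-1 form) :expanded)
    ;; Clauses are tried in order, so this clause is selected only when
    ;; the signaled condition is (still) a subtype of STREAM-ERROR.
    (stream-error () :stream-error)
    (eclector.reader:unquote-splicing-at-top () :splicing-at-top)
    (error () :other-error)))

;; Call the quasiquotation macros directly instead of going through the
;; ` and ,@ reader macros.
(classify-expansion
 '(eclector.reader:quasiquote (eclector.reader:unquote-splicing x)))

;; Query the type relation described above directly.
(subtypep 'eclector.reader:unquote-splicing-at-top 'stream-error)
```

Client code that has to work across Eclector versions can list the specific condition types in addition to stream-error, as the sketch does, instead of relying on the supertype.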
The (internal) macro eclector.reader::with-forbidden-quasiquotation is deprecated as of this release. Clients which really need a replacement immediately can use the new (internal) macro eclector.reader::with-quasiquotation-state.
The part of the labeled objects protocol that allows clients to construct parse results which represent labeled objects has been changed in an incompatible way. The change allows parse results which represent labeled objects to have child parse results but requires that clients construct parse results which represent labeled objects differently: instead of eql-specializing the result parameters of methods on eclector.parse-result:make-expression-result to eclector.parse-result:**definition** and eclector.parse-result:**reference** and receiving the labeled object in the children parameters, the result parameters now have to be specialized to the classes eclector.parse-result:definition and eclector.parse-result:reference respectively. The object passed as the result argument now contains the labeled object so that the children parameter can receive child parse results.
This change is considered minor since the old mechanism described above was not documented. For now, the new mechanism also remains undocumented so that the design can be validated through experimentation before it is finalized.
my-package::(a b)
is read as
(my-package::a my-package::b)
with this extension.
(frob r1 r2 :k3 4 #4; :k5 6 :k6 7)
Before this change, cases like
#1=(1 #1# #2=(2 #2# ... #100=(100 #100#)))
or
#1=(1 #2=(2 ... #2#) ... #1#)
led to unnecessary and/or repeated traversals during fixup processing.
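For readers unfamiliar with fixup processing, the following minimal sketch shows the observable result for a small instance of the first pattern: after reading, each #N# reference has been replaced by the object that was labeled with #N=. The sketch assumes that Quicklisp is available and provides the eclector system. It does not measure the traversal cost mentioned above.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload "eclector" :silent t))

(defparameter *nested*
  (eclector.reader:read-from-string "#1=(1 #1# #2=(2 #2#))"))

;; The outer list contains itself as its second element.
(eq (second *nested*) *nested*)

;; The inner list contains itself as its second element.
(let ((inner (third *nested*)))
  (eq (second inner) inner))

;; Circular structure must be printed with *PRINT-CIRCLE* bound to true,
;; otherwise the printer may not terminate.
(let ((*print-circle* t))
  (prin1-to-string *nested*))
```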
Before this change, something like
(eclector.concrete-syntax-tree:read-from-string "#1=(#1#)")
produced a CST object, say cst, which failed to satisfy
(eq (cst:first cst) cst)
(eq (cst:raw (cst:first cst)) (cst:raw cst))
The properties now hold.
Clients can use this protocol to control the reader state in other ways than binding the Common Lisp variables, for example by storing the values of reader state aspects in context objects.
Furthermore, implementations which use Eclector as the Common Lisp reader can use this protocol to tie the cl:*readtable* aspect to the cl:*readtable* variable instead of the eclector.reader:*readtable* variable.
The new protocol subsumes the purpose of the generic function eclector.reader:call-with-current-package which is deprecated as of this Eclector version.
A detailed discussion of the topic has been added to the manual (See Interpretation of Sharpsign C and Sharpsign S).
The default error recovery strategy for invalid symbols now constructs an uninterned symbol of the given name instead of using nil.
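As a minimal sketch of what this recovery looks like from the client side, the following helper invokes the eclector.reader:recover restart whenever one is available. It assumes that Quicklisp is available and provides the eclector system. The helper name read-leniently and the package name in the input string are hypothetical.

```lisp
;; Assumption: Quicklisp is installed and can load the "eclector" system.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload "eclector" :silent t))

(defun read-leniently (string)
  "Read one expression from STRING, recovering from syntax errors when possible."
  (handler-bind
      ((error (lambda (condition)
                (let ((restart (find-restart 'eclector.reader:recover
                                             condition)))
                  ;; Returning normally declines to handle the condition,
                  ;; so errors without a RECOVER restart still propagate.
                  (when restart
                    (invoke-restart restart))))))
    (eclector.reader:read-from-string string)))

;; The middle token names a package that is assumed not to exist, which
;; makes it an invalid symbol from the point of view of the reader.
(read-leniently "(a no-such-package-for-this-example::b c)")
```

A production client would usually record the condition before invoking the restart so that the syntax error can still be reported to the user.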
At the same time, it is now possible to recover from encountering the "consing dot" in invalid positions.
The name eclector.base:*client* remains exported as eclector.reader:*client*.
The old names eclector.parse-result:source-position and eclector.parse-result:make-source-range still exist but are now deprecated and will be removed in a future release.
eclector.reader:read
  eclector.reader:call-as-top-level-read
    eclector.reader:read-common
      eclector.reader:read-maybe-nothing
        ...
          eclector.reader:read-char
          eclector.reader:peek-char
Diagrams which illustrate the relations between the new and existing functions have been added to the manual (Figure 2.1, Figure 2.2, Figure 2.5).
A detailed discussion of the topic has been added to the manual (See Interpretation of Sharpsign C and Sharpsign S).
A detailed discussion of the topic has been added to the manual (See Interpretation of Backquote and Sharpsign Single Quote).
See: https://github.com/s-expressionists/Concrete-Syntax-Tree
Reader macro functions which call eclector.reader:read may receive labeled objects under certain circumstances (see Circular objects and custom reader macros).
This fixup processing has to be delayed under certain circumstances (see Circular objects and custom reader macros).
We use “implementation-dependent” in the sense defined in the Common Lisp specification except that Eclector is the implementation in question.