I would like to coax the reader to return a string "label". Right now, I have defined a macro character #\< to collect characters from the input stream until the matching > is found. But I wonder if I can coax the reader to interpret <label> as a "label" string.
I was looking for something like this (which does not work):

CL-USER> (set-syntax-from-char #\< #\")
T
CL-USER> (set-syntax-from-char #\> #\")
T
CL-USER> (read-from-string "<abc>")

or (read-from-string <abc>),
but both result in an error:
READ: input stream #1=#<INPUT STRING-INPUT-STREAM> ends within a string. [Condition of type SYSTEM::SIMPLE-END-OF-FILE]
To explain what I am after: I want to parse GNUPLOT command definitions and create a CL interface to them automatically. See for example the `set label' command: http://www.gnuplot.info/docs/node193.html
My code can read the definition (I have re-defined the meaning of `|' and `,'), but I would like to streamline it a bit.
On Sun, 8 Nov 2009 11:55:31 -0800 (PST), Mirko <mirko.vuko...@gmail.com> said:
> Hello,
> I am parsing a file that has snippets such as
> <label>
> I would like to coax the reader to return a string "label". Right
> now, I have defined a macro character #\< to collect characters from
> the input stream until the matching > is found. But I wonder if I can
> coax the reader to interpret <label> as a "label" string.
The latter is exactly achieved by doing the former (i.e. by suitably defining #\< as a macro character), so in what way would you like to do better?
Note that the only way the (standard Common Lisp) reader algorithm can return a string is through the use of macro characters (only numbers and symbols are produced from extended tokens).
If each such tag were never adjacent to constituent characters,
would have it read as a symbol (without upcasing), but I am not suggesting that, as the condition is not likely to be satisfied, even if such kludginess was acceptable.
---Vassil.
-- "Even when the muse is posting on Usenet, Alexander Sergeevich?"
Mirko <mirko.vuko...@gmail.com> writes:
> On Nov 8, 3:01 pm, p...@informatimago.com (Pascal J. Bourguignon)
> wrote:
>> Mirko <mirko.vuko...@gmail.com> writes:
>> > Hello,
>> > I am parsing a file that has snippets such as
>> > <label>
>> > I would like to coax the reader to return a string "label". Right
>> > now, I have defined a macro character #\< to collect characters from
>> > the input stream until the matching > is found. But I wonder if I can
>> > coax the reader to interpret <label> as a "label" string.
> I was looking for something like this (and which does not work):
> CL-USER> (set-syntax-from-char #\< #\")
> T
> CL-USER> (set-syntax-from-char #\> #\")
> T
> CL-USER> (read-from-string "<abc>")
> or
> (read-from-string <abc>),
To be able to use <abc> in Lisp sources, just run:

(set-macro-character #\< (function bracket-string-reader) *readtable*)

on the REPL readtable. Above I showed a benign example installing the reader macro on a temporary local readtable, but of course you can modify *readtable* directly.
> but both result in an error:
> READ: input stream #1=#<INPUT STRING-INPUT-STREAM> ends within a
> string.
> [Condition of type SYSTEM::SIMPLE-END-OF-FILE]
This error comes from the fact that the predefined reader macro for #\" expects only #\" to end the string. There's no READ-DELIMITED-STRING like READ-DELIMITED-LIST taking the terminating character as argument (but you could write it, of course). In any case, once you have implemented READ-DELIMITED-STRING, you will still have to hook it to the reader, and this is what SET-MACRO-CHARACTER is for. This is the way to do it; there's no other.
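As an illustration of that point, here is a minimal sketch of what such a READ-DELIMITED-STRING could look like, and how it might be hooked to the reader. READ-DELIMITED-STRING is not a standard function, and the definition of BRACKET-STRING-READER below is only a guess at a reader macro function of that name, not the original one; READ-WITH-BRACKET-STRINGS is a hypothetical helper that installs it on a copy of the current readtable, so the global readtable is left alone.

```lisp
;; Hypothetical helper: READ-DELIMITED-STRING is NOT part of the standard.
(defun read-delimited-string (terminator stream)
  "Collect characters from STREAM up to, but not including, TERMINATOR."
  (with-output-to-string (out)
    (loop for ch = (read-char stream t nil t) ; signals END-OF-FILE if unterminated
          until (char= ch terminator)
          do (write-char ch out))))

;; Assumed definition of a reader macro function for #\<.
(defun bracket-string-reader (stream char)
  (declare (ignore char))
  (read-delimited-string #\> stream))

;; Hypothetical helper: read STRING with #\< installed on a local readtable.
(defun read-with-bracket-strings (string)
  (let ((*readtable* (copy-readtable)))
    (set-macro-character #\< (function bracket-string-reader))
    (read-from-string string)))
```

With these definitions, (read-with-bracket-strings "<label name>") should return the string "label name" as its first value. This sketch has no escape mechanism and does not nest.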
If you prefer another API, you can always implement it yourself. Since you seem to be wanting to use the name SET-SYNTAX-FROM-CHAR which is already defined for your AI system that will infer the code to implement to read a string delimited by brackets, you will have to SHADOW it before redefining it as your AI entry point.
On Sun, 08 Nov 2009 22:07:49 +0100, p...@informatimago.com (Pascal J. Bourguignon) said:
> Mirko <mirko.vuko...@gmail.com> writes:
>> ...
>> I was looking for something like this (and which does not work):
>> CL-USER> (set-syntax-from-char #\< #\")
>> T
>> CL-USER> (set-syntax-from-char #\> #\")
>> T
>> ...
> This errors comes from the fact that the predefined reader macro for
> #\" expects only #\" to end the string.
It actually expects the same delimiter at the end as in the beginning (maybe not according to CLHS, but to CLtL), so after the above, <foo< would be read as "foo" and so would >foo> (which is still not what is desired, of course).
---Vassil.
Vassil Nikolov <vniko...@pobox.com> writes:
> On Sun, 08 Nov 2009 22:07:49 +0100, p...@informatimago.com (Pascal J. Bourguignon) said:
>> Mirko <mirko.vuko...@gmail.com> writes:
>>> ...
>>> I was looking for something like this (and which does not work):
>>> CL-USER> (set-syntax-from-char #\< #\")
>>> T
>>> CL-USER> (set-syntax-from-char #\> #\")
>>> T
>>> ...
>> This errors comes from the fact that the predefined reader macro for
>> #\" expects only #\" to end the string.
> It actually expects the same delimiter at the end as in the
> beginning (maybe not according to CLHS, but to CLtL), so after the
> above, <foo< would be read as "foo" and so would >foo> (which is
> still not what is desired, of course).
CLHS 2.4.5 says:
The double-quote is used to begin and end a string. When a double-quote is encountered, characters are read from the input stream and accumulated until another double-quote is encountered. If a single escape character is seen, the single escape character is discarded, the next character is accumulated, and accumulation continues. The accumulated characters up to but not including the matching double-quote are made into a simple string and returned. It is implementation-dependent which attributes of the accumulated characters are removed in this process.
Notice how it says that "double-quote ends a string", not that "the same character that opened the string ends the string".
Therefore if you have an implementation that uses the reader macro character argument as terminator, good for you, but you should certainly not count on it in conformant programs. I'd rather expect that for performance reasons the parameter be ignored and that an absolute #\" be expected to end the string.
> to read the above as the symbol LABEL (I see no reason why one would
> insist on reading strings here).
The reason I have to read it as a string is as follows: <foo> denotes entry of an explicit value. In my interface, I would have

(defun set-bar (foo)
  (gnuplot-set-command "bar" foo)) ;; pseudocode

However, in some cases, GNUPLOT has a compound name: <label name>, which I will have to translate into

(defun set-bar (label-name) )

I thought the simplest way is to read a string "label name", substitute #\- for #\SPACE, and intern.
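That substitute-and-intern step could be sketched as follows. LABEL->SYMBOL is a hypothetical name, and the upcasing assumes the usual :UPCASE readtable case, so that the result is the same symbol the reader would produce for label-name.

```lisp
;; Hypothetical helper: turn a tag body such as "label name"
;; into the symbol LABEL-NAME, interned in PACKAGE.
(defun label->symbol (label &optional (package *package*))
  (intern (string-upcase (substitute #\- #\Space label)) package))
```

For example, (label->symbol "label name") should return the symbol LABEL-NAME in the current package.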
Some GNUPLOT commands are specified as "<foo>". Here the quotes will have to be inserted into the command sent to gnuplot. My plan was to ignore quotes (make them equivalent to blanks), but set a flag, so that all items read inside them have a plist '(type string)
Then the dispatching code would read the plist, and know to enclose the value into quotes before sending it to gnuplot.
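A minimal sketch of that dispatch step, assuming each item carries a property list like the (type string) one described above. RENDER-GNUPLOT-ARGUMENT is a hypothetical name, not part of any existing interface.

```lisp
;; Hypothetical helper: render VALUE as text for a gnuplot command.
;; PROPERTIES is the plist attached to the item when it was read.
(defun render-gnuplot-argument (value properties)
  (if (eq (getf properties 'type) 'string)
      (format nil "\"~A\"" value)   ; enclose in double quotes
      (format nil "~A" value)))     ; send as-is
```

With this, (render-gnuplot-argument "Peak" '(type string)) yields the five characters Peak wrapped in double quotes, while an item without the flag is passed through unquoted.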
> > to read the above as the symbol LABEL (I see no reason why one would
> > insist on reading strings here).
> The reason I have to read as string is as follows:
> <foo> denotes entry of an explicit value. In my interface, I would
> have
> (defun set-bar (foo)
> (gnuplot-set-command "bar" foo) ;; pseudocode
> However, in some cases, GNUPLOT has a compound name: <label name>,
> which I will have to translate into
> I thought the simplest way is to read a string "label name" substitute
> #\- for #\SPACE and intern.
> Some GNUPLOT commands are specified as "<foo>". Here the quotes will
> have to be inserted into the command sent to gnuplot. My plan was to
> ignore quotes (make them equivalent to blanks), but set a flag, so
> that all items read inside them have a plist '(type string)
> Then the dispatching code would read the plist, and know to enclose
> the value into quotes before sending it to gnuplot.
> Anyway, thank you both for your comments.
> Mirko
Previous post was sent before being finished. Fixed above.
Mirko <mirko.vuko...@gmail.com> writes:
>> The reason I have to read as string is as follows:
>> <foo> denotes entry of an explicit value. In my interface, I would
>> have
>> (defun set-bar (foo)
>> (gnuplot-set-command "bar" foo) ;; pseudocode
>> However, in some cases, GNUPLOT has a compound name: <label name>,
>> which I will have to translate into
On Sun, 08 Nov 2009 22:36:05 +0100, p...@informatimago.com (Pascal J. Bourguignon) said:
> Vassil Nikolov <vniko...@pobox.com> writes:
>> On Sun, 08 Nov 2009 22:07:49 +0100, p...@informatimago.com (Pascal J. Bourguignon) said:
>> ...
>>> This errors comes from the fact that the predefined reader macro for
>>> #\" expects only #\" to end the string.
                ^^^^
>> It actually expects the same delimiter at the end as in the
>> beginning (maybe not according to CLHS, but to CLtL), so after the
>> above, <foo< would be read as "foo" and so would >foo> (which is
>> still not what is desired, of course).
> CLHS 2.4.5 says:
> The double-quote is used to begin and end a string. When a
> double-quote is encountered, characters are read from the input
> stream and accumulated until another double-quote is
> encountered.
> ...
> Notice how it says that "double-quote ends a string", not that "the
> same character that opened the strings ends the string".
I made no claims about CLHS; but even ignoring that, CLHS still does not say that #\"'s macro character function expects _only_ a #\" as the string terminator.
In other words, after (SET-SYNTAX-FROM-CHAR #\< #\"), the theory that explains (READ-FROM-STRING "<foo>")'s error with the absence of a double quote fails to explain why (READ-FROM-STRING "<foo<") returns "foo" without an error.
---Vassil.
Vassil Nikolov <vniko...@pobox.com> writes:
> On Sun, 08 Nov 2009 22:36:05 +0100, p...@informatimago.com (Pascal J. Bourguignon) said:
>> Vassil Nikolov <vniko...@pobox.com> writes:
>>> On Sun, 08 Nov 2009 22:07:49 +0100, p...@informatimago.com (Pascal J. Bourguignon) said:
>>> ...
>>>> This errors comes from the fact that the predefined reader macro for
>>>> #\" expects only #\" to end the string.
>                ^^^^
>>> It actually expects the same delimiter at the end as in the
>>> beginning (maybe not according to CLHS, but to CLtL), so after the
>>> above, <foo< would be read as "foo" and so would >foo> (which is
>>> still not what is desired, of course).
>> CLHS 2.4.5 says:
>> The double-quote is used to begin and end a string. When a
>> double-quote is encountered, characters are read from the input
>> stream and accumulated until another double-quote is
>> encountered.
>> ...
>> Notice how it says that "double-quote ends a string", not that "the
>> same character that opened the strings ends the string".
> I made no claims about CLHS; but even ignoring that, CLHS still does
> not say that #\"'s macro character function expects _only_ a #\" as
> the string terminator.
> In other words, after (SET-SYNTAX-FROM-CHAR #\< #\"), the theory
> that explains (READ-FROM-STRING "<foo>")'s error with the absence of
> a double quote fails to explain why (READ-FROM-STRING "<foo<")
> returns "foo" without an error.
A macro definition from a character such as " can be copied to another character; the standard definition for " looks for another character that is the same as the character that invoked it. The definition of ( can not be meaningfully copied to {, on the other hand. The result is that lists are of the form {a b c), not {a b c}, because the definition always looks for a closing parenthesis, not a closing brace.
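The behaviour described in that passage can be tried out with a small sketch like the one below. READ-WITH-COPIED-QUOTE-SYNTAX is a hypothetical helper; it works on a fresh copy of the standard readtable, so the global one is untouched. Whether the copied definition really looks for the invoking character is exactly the point under dispute in this thread, so treat the expected results as what the quoted passage predicts, not as something every implementation is guaranteed to do.

```lisp
;; Hypothetical helper: read STRING in a readtable where #\< has been
;; given the syntax of #\" via SET-SYNTAX-FROM-CHAR.
(defun read-with-copied-quote-syntax (string)
  (let ((*readtable* (copy-readtable nil)))   ; fresh standard readtable
    (set-syntax-from-char #\< #\")
    (read-from-string string)))
```

If the passage holds, (read-with-copied-quote-syntax "<foo<") returns "foo", while (read-with-copied-quote-syntax "<foo>") runs off the end of the input and signals an error, which matches the error reported at the start of the thread.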
> > On Sun, 08 Nov 2009 22:36:05 +0100, p...@informatimago.com (Pascal J.
> > Bourguignon) said:
> >> Vassil Nikolov <vniko...@pobox.com> writes:
> >>> On Sun, 08 Nov 2009 22:07:49 +0100, p...@informatimago.com (Pascal J.
> >>> Bourguignon) said:
> >>> ...
> >>>> This errors comes from the fact that the predefined reader macro for
> >>>> #\" expects only #\" to end the string.
> >                ^^^^
> >>> It actually expects the same delimiter at the end as in the
> >>> beginning (maybe not according to CLHS, but to CLtL), so after the
> >>> above, <foo< would be read as "foo" and so would >foo> (which is
> >>> still not what is desired, of course).
> >> CLHS 2.4.5 says:
> >> The double-quote is used to begin and end a string. When a
> >> double-quote is encountered, characters are read from the input
> >> stream and accumulated until another double-quote is
> >> encountered.
> >> ...
> >> Notice how it says that "double-quote ends a string", not that "the
> >> same character that opened the strings ends the string".
> > I made no claims about CLHS; but even ignoring that, CLHS still does
> > not say that #\"'s macro character function expects _only_ a #\" as
> > the string terminator.
> > In other words, after (SET-SYNTAX-FROM-CHAR #\< #\"), the theory
> > that explains (READ-FROM-STRING "<foo>")'s error with the absence of
> > a double quote fails to explain why (READ-FROM-STRING "<foo<")
> > returns "foo" without an error.
> A macro definition from a character such as " can be copied to
> another character; the standard definition for " looks for another
> character that is the same as the character that invoked it. The
> definition of ( can not be meaningfully copied to {, on the other
> hand. The result is that lists are of the form {a b c), not {a b
> c}, because the definition always looks for a closing parenthesis,
> not a closing brace.
I use this:
(defun make-string-reader (c1 c2)
  (set-macro-character
   c1
   (lambda (stream c)
     (declare (ignore c))
     (with-output-to-string (s)
       (loop for c = (read-char stream)
             with cnt = 1
             if (eql c c1) do (incf cnt)
             else if (eql c c2) do (decf cnt)
             until (and (eql c c2) (eql cnt 0))
             do (princ c s))
       s))
   t))
This lets you read strings using any pair of characters, e.g.:
(make-string-reader #\< #\>)
and it also allows strings delimited with asymmetric characters to be nested.
Personally, since I now use a unicode-aware Lisp (CCL) I use this:
(make-string-reader #\« #\»)
or, depending on your tastes:
(make-string-reader #\“ #\”)
(Just in case the character encoding gets screwed up, those are supposed to be European-style <<quotes>> or ``quotes,,)
e.g.:
? (print «With asymmetric delimiters, you can «nest» "strings" without backslashes.»)

"With asymmetric delimiters, you can «nest» \"strings\" without backslashes."
"With asymmetric delimiters, you can «nest» \"strings\" without backslashes."
?
On Sun, 08 Nov 2009 15:58:38 -0800, Ron Garret <rNOSPA...@flownet.com> said:
> ...
> (defun make-string-reader (c1 c2)
>   (set-macro-character
>    c1
>    (lambda (stream c)
>      (declare (ignore c))
>      (with-output-to-string (s)
>        (loop for c = (read-char stream)
>              with cnt = 1
>              if (eql c c1) do (incf cnt)
>              else if (eql c c2) do (decf cnt)
>              until (and (eql c c2) (eql cnt 0))
>              do (princ c s))
>        s))
>    t))
No escape?
---Vassil.
> On Sun, 08 Nov 2009 15:58:38 -0800, Ron Garret <rNOSPA...@flownet.com> said:
> > ...
> > (defun make-string-reader (c1 c2)
> >   (set-macro-character
> >    c1
> >    (lambda (stream c)
> >      (declare (ignore c))
> >      (with-output-to-string (s)
> >        (loop for c = (read-char stream)
> >              with cnt = 1
> >              if (eql c c1) do (incf cnt)
> >              else if (eql c c2) do (decf cnt)
> >              until (and (eql c c2) (eql cnt 0))
> >              do (princ c s))
> >        s))
> >    t))
> No escape?
I presume you mean: no backslash-style escape.
That's right. The whole point of using balanced delimiters for strings is to (mostly) eliminate the need for escapes. The only time you'd need one is if you wanted to do something like this:
(print «European-style strings are terminated with a \» character.»)
That sort of thing is pretty rare. If you ever do need to do it, there are a number of easy workarounds:
1.
(format t «European-style strings are terminated with a ~A character.» #\»)
2.
(make-string-reader #\“ #\”)
(print “European style strings are terminated with a » character.”)
3. Fix the code so that it supports backslash escapes. This would be pretty easy to do, but since the whole point for me is to discourage backslash escapes (because I think they are a Really Bad Idea (tm)) this is left as an exercise.
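For readers who do want option 3, one possible sketch follows. MAKE-ESCAPING-STRING-READER is a hypothetical variant of the function above, not the author's code; it assumes a single escape character (backslash by default) that is dropped while the character after it is kept verbatim, and it assumes the two delimiters are distinct. Like the original, it modifies the current readtable.

```lisp
;; Hypothetical variant with backslash-style escapes.
;; OPEN and CLOSE are assumed to be different characters.
(defun make-escaping-string-reader (open close &optional (escape #\\))
  (set-macro-character
   open
   (lambda (stream char)
     (declare (ignore char))
     (with-output-to-string (out)
       (loop with depth = 1
             for ch = (read-char stream t nil t)
             do (cond ((char= ch escape)
                       ;; Drop the escape, keep the next character as-is.
                       (write-char (read-char stream t nil t) out))
                      ((char= ch open)
                       (incf depth)
                       (write-char ch out))
                      ((char= ch close)
                       (decf depth)
                       ;; The outermost closing delimiter ends the string.
                       (when (zerop depth) (loop-finish))
                       (write-char ch out))
                      (t (write-char ch out))))))
   t))
```

After (make-escaping-string-reader #\< #\>), reading the text <a \> b> should give the string "a > b", and nested delimiters such as <a <b> c> are still kept intact.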
Vassil Nikolov <vniko...@pobox.com> writes:
> On Sun, 08 Nov 2009 15:58:38 -0800, Ron Garret <rNOSPA...@flownet.com> said:
>> ...
>> (defun make-string-reader (c1 c2)
>>   (set-macro-character
>>    c1
>>    (lambda (stream c)
>>      (declare (ignore c))
>>      (with-output-to-string (s)
>>        (loop for c = (read-char stream)
>>              with cnt = 1
>>              if (eql c c1) do (incf cnt)
>>              else if (eql c c2) do (decf cnt)
>>              until (and (eql c c2) (eql cnt 0))
>>              do (princ c s))
>>        s))
>>    t))
> No escape?
As Common Lisp did not standardize on a function GET-CHARACTER-SYNTAX, that endeavor is actually either half-hearted, inconceivably kludgy, or unportable. It's a nice exercise trying to write SINGLE-ESCAPE-CHAR-P and MULTIPLE-ESCAPE-CHAR-P in a portable way, though. :-)