A few comments on the above method
Thanks for the code. Here are a few remarks that I hope will be useful for others replacing Camlp4's lexer with their own.
I find it easier to use Camlp4's own Loc module rather than redefining it entirely:
module Loc = Camlp4.Struct.Loc
module Lexer = My_lexer.Make(Loc)
I found the definition of Token tricky, because the "token" type must be visible to Token, Lexer, and the parser. I got rid of the Token functor and defined Token inside the lexer, which looks as follows:
module Make (Loc : Camlp4.Sig.Loc) = struct
  module Loc = Loc
  type token = KEYWORD | INT | ...
  module Token = struct
    module Filter = struct
      ...
      let keyword_conversion tok is_kwd =
        match tok with
          SYMBOL s | IDENT s when is_kwd s -> KEYWORD s
        | _ -> tok
      ...
    end
  end
Note the declaration of keyword_conversion. This function is to be called by "filter":
let filter x =
  let f tok loc =
    let tok' = keyword_conversion tok x.is_kwd in
    (tok', loc)
  in
  ...
This allows Camlp4 to translate symbols or identifiers into keywords, so you can write:
[ "if"; expr; "then"; expr; "else"; expr ]
instead of:
[ `KWD "if"; expr; `KWD "then"; expr; `KWD "else"; expr ]
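To see the keyword conversion in isolation, here is a minimal sketch that runs without Camlp4. It makes several assumptions that are not in the code above: the constructor payloads (string for KEYWORD, SYMBOL and IDENT, int for INT), the hypothetical filter_state record standing in for whatever carries is_kwd, and a plain list of (token, location) pairs standing in for the token stream.

```ocaml
(* Self-contained sketch, no Camlp4 dependency. The constructor payloads
   and the [filter_state] record are assumptions made for illustration. *)
type token =
  | KEYWORD of string
  | SYMBOL of string
  | IDENT of string
  | INT of int

(* Hypothetical stand-in for the filter state that carries [is_kwd]. *)
type filter_state = { is_kwd : string -> bool }

(* Promote a symbol or identifier to a keyword when [is_kwd] accepts it;
   every other token is returned unchanged. *)
let keyword_conversion tok is_kwd =
  match tok with
  | SYMBOL s | IDENT s when is_kwd s -> KEYWORD s
  | _ -> tok

(* Apply the conversion to each (token, location) pair. A list stands in
   for the token stream, and the location type is left polymorphic. *)
let filter x tokens =
  let f (tok, loc) = (keyword_conversion tok x.is_kwd, loc) in
  List.map f tokens

let () =
  let x = { is_kwd = (fun s -> List.mem s [ "if"; "then"; "else" ]) } in
  let input =
    [ (IDENT "if", 0); (IDENT "x", 1); (SYMBOL "then", 2); (INT 1, 3) ]
  in
  let output = filter x input in
  assert (
    output
    = [ (KEYWORD "if", 0); (IDENT "x", 1); (KEYWORD "then", 2); (INT 1, 3) ])
```

Only the strings accepted by is_kwd are promoted, so an ordinary identifier such as "x" stays an IDENT, and the locations pass through untouched.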