c++ - How to use a slash in Spirit Lex patterns?
The code below compiles fine with

    clang++ -std=c++11 test.cpp -o test

but when I run it, an exception is thrown:

    terminate called after throwing an instance of 'boost::lexer::runtime_error'
      what():  lookahead ('/') not supported yet.

The problem is the slash (/) in the input and/or the regex (the two "foo/bar" strings in the code). I can't find out how to escape it correctly. Any hints?
    #include <cstring>
    #include <iostream>
    #include <string>
    #include <boost/spirit/include/lex.hpp>
    #include <boost/spirit/include/lex_lexertl.hpp>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>

    namespace lex = boost::spirit::lex;
    namespace qi = boost::spirit::qi;
    namespace phoenix = boost::phoenix;

    // token pattern; the unescaped '/' here triggers the exception
    std::string regex("foo/bar");

    template <typename type>
    struct lexer : boost::spirit::lex::lexer<type>
    {
        lexer() : foobar_(regex)
        {
            this->self.add(foobar_);
        }

        boost::spirit::lex::token_def<std::string> foobar_;
    };

    template <typename iterator, typename def>
    struct grammar : qi::grammar <iterator, qi::in_state_skipper<def> >
    {
        template <typename lexer> grammar(const lexer & _lexer);

        typedef qi::in_state_skipper<def> skipper;
        qi::rule<iterator, skipper> rule_;
    };

    template <typename iterator, typename def>
    template <typename lexer>
    grammar<iterator, def>::grammar(const lexer & _lexer) : grammar::base_type(rule_)
    {
        rule_ = _lexer.foobar_;
    }

    int main()
    {
        // input
        char const * first("foo/bar");
        char const * last(first + strlen(first));

        // lexer
        typedef lex::lexertl::token<const char *> token;
        typedef lex::lexertl::lexer<token> type;
        lexer<type> l;

        // grammar
        typedef lexer<type>::iterator_type iterator;
        typedef lexer<type>::lexer_def def;
        grammar<iterator, def> g(l);

        // parse
        bool ok = lex::tokenize_and_phrase_parse
        (
            first
          , last
          , l
          , g
          , qi::in_state("ws")[l.self]
        );

        // check
        if (!ok || first != last)
        {
            std::cout << "failed parsing input file" << std::endl;
            return 1;
        }
        return 0;
    }
As sehe points out, / is intended to be used as a lookahead operator, likely taking after the syntax of flex. It's unfortunate that Spirit doesn't use a more conventional lookahead syntax (not that I think the other syntax is more elegant; it just gets confusing with all the subtle variations in regex syntax).
If you look at re_tokeniser.hpp:

    // Not an escape sequence and not inside a string, so
    // check for meta characters.
    switch (ch_)
    {
    ...
    case '/':
        throw runtime_error("lookahead ('/') not supported yet.");
        break;
    ...
    }
The tokeniser thinks you're neither in an escape sequence nor inside a string, so it checks for meta characters. / is considered a meta character for lookahead (even though the feature isn't implemented), so it must be escaped, despite the Boost docs not mentioning this at all.
Try escaping the / in the regex (not in the input) with a backslash, i.e. "\\/" in an ordinary string literal, or \/ if you are using a raw string literal. Alternatively, others have suggested using [/].
I'd consider it a bug in the Spirit Lex documentation that it fails to point out that / must be escaped.
Edit: kudos to sehe and cv_and_he, who helped correct some of my earlier thinking. If they post an answer here, be sure to give them a +1.