How to search for a non-ASCII character in a C++ string?
string s="x1→(y1⊕y2)∧z3";
for(auto i=s.begin(); i!=s.end();i++){
    if(*i=='→'){
       ...
    }
}

The char comparison is wrong. What's the correct way to do it? I am using VS2013.
First, you need a basic understanding of how programs handle Unicode. If you don't have that yet, you should read up on it; I quite like this post on Joel on Software.
You have 2 problems here:
Problem #1: Getting the string into your program
Your first problem is getting the actual string into your string s. Depending on the encoding of your source code file, MSVC may corrupt any non-ASCII characters in that string.
- Either save your C++ file as UTF-16 (which Windows confusingly calls Unicode), and use wchar_t and wstring (effectively encoding the expression as UTF-16). Saving as UTF-8 with a BOM will also work. With any other encoding, your L"..." character literals will contain the wrong characters.
  Note that other platforms may define wchar_t as 4 bytes instead of 2, so the handling of characters above U+FFFF will be non-portable.
- In all other cases, you can't just write those characters in your source file. The most portable way is to encode your string literals as UTF-8, using \x escape codes for all non-ASCII characters. Like this: "x1\xe2\x86\x92(a\xe2\x8a\x95" "b)" rather than "x1→(a⊕b)". The literal is split before b because a \x escape consumes every hex digit that follows it, and b is a hex digit.
  And yes, that's as unreadable and cumbersome as it gets. The root problem is that MSVC doesn't really support using UTF-8. You can go through this question for an overview: How to create a UTF-8 string literal in Visual C++ 2008.
  But also consider how often those strings will actually show up in your source code.
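To make the second option concrete, here is a minimal sketch of the escaped literal. The function name make_expression is made up for illustration; only the byte values come from the UTF-8 encodings of → and ⊕.

```cpp
#include <string>

// Hypothetical helper: returns "x1→(a⊕b)" as UTF-8, written with
// ASCII-only source characters so the file encoding no longer matters.
std::string make_expression() {
    // Split before "b)" so that 'b' is not read as part of the \x95 escape.
    return "x1\xe2\x86\x92(a\xe2\x8a\x95" "b)";
}
```

The resulting string holds 12 chars for 8 visible characters, because each of the two symbols takes 3 bytes.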
Problem #2: Finding the character
(If you're using UTF-16, you can simply find the L'→' character, since that character is representable as one wchar_t. For characters above U+FFFF you'll have to use the wide version of the workaround below.)
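For the wide-string route, a sketch could look like the following. The name find_arrow is hypothetical, and writing the character as \u2192 instead of typing → directly is my own choice here, so that the example does not depend on the source file encoding.

```cpp
#include <string>

// Hypothetical helper: index of the first U+2192 (→) in a wide string,
// or std::wstring::npos if it does not occur.
std::wstring::size_type find_arrow(const std::wstring& expr) {
    // U+2192 is below U+FFFF, so it fits in a single wchar_t even when
    // wchar_t is only 2 bytes wide (as on Windows).
    return expr.find(L'\u2192');
}
```

Called on L"x1\u2192(y1\u2295y2)\u2227z3", this should return 2, and here the index counts wchar_t units rather than bytes.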
It's impossible to define a single char representing the arrow character. You can, however, do it with a string: "\xe2\x86\x92". (That's a string with 3 chars for the arrow, plus the \0 terminator.)
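If you wonder where those 3 bytes come from, the sketch below encodes a code point as UTF-8 by hand. encode_utf8 is a hypothetical helper, not a standard library function, and it skips validation (surrogates, values above U+10FFFF), so treat it as an illustration only.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// No validation is performed; this is a sketch, not production code.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For U+2192 this takes the 3-byte branch and yields E2 86 92, which is why no single char can hold the arrow.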
You can now search for this string in your expression:

    s.find("\xe2\x86\x92");

The UTF-8 encoding scheme guarantees that this always finds the correct character, but keep in mind that the result is an offset in bytes, not in characters.
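Here is a rough sketch of the search, together with one way to turn the byte offset into a character index if you need it. Both function names are made up for this example, and the conversion assumes the string is valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Byte offset of the first UTF-8 encoded arrow (U+2192) in s,
// or std::string::npos if there is none.
std::string::size_type find_arrow_bytes(const std::string& s) {
    return s.find("\xe2\x86\x92");
}

// Hypothetical helper: number of code points before byte_offset.
// Counts every byte that is not a continuation byte (10xxxxxx).
std::size_t code_point_index(const std::string& s,
                             std::string::size_type byte_offset) {
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < byte_offset && i < s.size(); ++i) {
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

With the expression from the question stored as UTF-8, the arrow is found at byte offset 2, which is also character index 2. The ⊕ sits at byte offset 8 but character index 6, since the arrow before it takes 3 bytes.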