Remove escaped Unicode string in Java with regex
I have the string below:
"them coming \nlove \ud83d\ude00"
I want to remove the characters "\ud83d\ude00", so that the result is:
"them coming \nlove "
How can I achieve this in Java? I have tried the code below, but it does not work:
payload.toString().replaceAll("\\\\u\\b{4}.", "")
Thanks :)
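I think \\\\u\\b{4}. does not work because the regex sees \ud83d as the symbol it encodes, not as a literal string: the Java compiler has already converted the escape into a character before the regex runs, so there is no backslash or "u" left to match. To match this kind of unwanted (for whatever reason) Unicode character, it is better to exclude the characters you accept (do not want to replace), for example ASCII characters, and match everything else (what you want to replace). Try with: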
[^\x00-\x7f]+
The range \x00-\x7f covers the Unicode Basic Latin block, i.e. the ASCII characters.
String str = "them coming \nlove \ud83d\ude00";
System.out.println(str.replaceAll("[^\\x00-\\x7F]+", ""));
will result in:
them coming
love
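To check whether a string holds the decoded character or the literal backslash-u text, a small sketch like the one below may help. The class and method names (EscapeDemo, stripLiteralEscapes, stripNonAscii) are made up for illustration, and the pattern \\\\u[0-9a-fA-F]{4} is only a guess at what the question's regex was meant to do; it is not taken from the original code.

```java
// Illustrative sketch; all names here are hypothetical.
class EscapeDemo {
    // The compiler decodes the escapes: the emoji is stored as two UTF-16 code units.
    static final String COMPILED = "them coming \nlove \ud83d\ude00";
    // The backslashes are escaped, so the string holds twelve literal characters instead.
    static final String LITERAL = "them coming \nlove \\ud83d\\ude00";

    // Removes literal escape text: a backslash, the letter u, then four hex digits.
    // Assumption: this is what the question's pattern was aiming for.
    static String stripLiteralEscapes(String s) {
        return s.replaceAll("\\\\u[0-9a-fA-F]{4}", "");
    }

    // Removes every run of characters outside the ASCII range.
    static String stripNonAscii(String s) {
        return s.replaceAll("[^\\x00-\\x7F]+", "");
    }

    public static void main(String[] args) {
        System.out.println(COMPILED.length()); // 20
        System.out.println(LITERAL.length());  // 30

        // No backslash is present in the decoded string, so nothing is removed.
        System.out.println(stripLiteralEscapes(COMPILED).equals(COMPILED)); // true

        // Each helper only works on the representation it was written for.
        System.out.println(stripLiteralEscapes(LITERAL).equals("them coming \nlove ")); // true
        System.out.println(stripNonAscii(COMPILED).equals("them coming \nlove "));      // true
    }
}
```

If the payload really contains literal escape text, the first helper is the one that applies; for an already decoded string, only the character-class approach matches anything.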
However, you will have a problem if you use national characters or other non-ASCII symbols (ś, ą, ♉, ☹, etc.), since they will be removed as well.
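One possible way around this, offered only as a sketch and not as part of the original answer, is to remove just the supplementary code points (U+10000 and above, which Java stores as surrogate pairs such as \ud83d\ude00) and leave the rest of the text alone. The class and method names below are hypothetical. Note that symbols in the Basic Multilingual Plane, such as ♉ and ☹, would be kept by this pattern, so it fits only if those are acceptable.

```java
// Illustrative sketch; the class and method names are hypothetical.
class SupplementaryStripper {
    // Java regex works on code points, so this class matches any character
    // from U+10000 to U+10FFFF, i.e. everything stored as a surrogate pair.
    static String stripSupplementary(String s) {
        return s.replaceAll("[\\x{10000}-\\x{10FFFF}]+", "");
    }

    public static void main(String[] args) {
        // Two Polish letters (written as escapes to avoid source-encoding issues),
        // some ASCII text, and one emoji stored as a surrogate pair.
        String input = "\u015b\u0105 love \ud83d\ude00";

        // The national characters survive; only the emoji is removed.
        System.out.println(stripSupplementary(input).equals("\u015b\u0105 love ")); // true
    }
}
```

The \x{h...h} syntax needs Java 7 or later.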