Remove escaped Unicode string in Java with regex
I have the string below:
"them coming \nlove \ud83d\ude00"
I want to remove the characters "\ud83d\ude00" so that I get:
"them coming \nlove "
How can I achieve this in Java? I have tried the code below, but it does not work:
payload.toString().replaceAll("\\\\u\\b{4}.", "")
Thanks :)
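I think \\\\u\\b{4}. does not work because, in a Java string literal, \ud83d is already converted to a single character before the regex sees it, so there is no literal backslash-u text left to match. To match this kind of unwanted (for whatever reason) Unicode character, it is better to exclude the characters you accept (do not want to replace), for example ASCII characters, and match everything else (what you want to replace). Try with:
[^\x00-\x7F]+
The range \x00-\x7F covers the Unicode Basic Latin block.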
String str = "them coming \nlove \ud83d\ude00";
System.out.println(str.replaceAll("[^\\x00-\\x7F]+", ""));
will result in:
them coming
love
However, you will have a problem if you use national characters or other non-ASCII symbols (ś, ą, ♉, ☹, etc.), since they will be removed as well.
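If national characters have to survive, one possible alternative (a sketch, not part of the answer above) is to remove only code points outside the Basic Multilingual Plane, which is where most emoji such as \ud83d\ude00 live. The class and method names below are made up for illustration. The second method covers a different situation, assumed here only as a possibility: the payload contains the escape as literal text (a backslash, the letter u, and four hex digits), in which case a regex on the escape text itself does work.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class UnicodeCleanup {
    // Any code point above the Basic Multilingual Plane. Java regex matches
    // by code point, so a surrogate pair is treated as one character here.
    private static final Pattern SUPPLEMENTARY =
            Pattern.compile("[\\x{10000}-\\x{10FFFF}]");

    // The literal text: one backslash, the letter 'u', four hex digits.
    private static final Pattern LITERAL_ESCAPE =
            Pattern.compile("\\\\u[0-9a-fA-F]{4}");

    // Removes emoji and other supplementary characters, keeps BMP characters.
    static String stripSupplementary(String s) {
        return SUPPLEMENTARY.matcher(s).replaceAll("");
    }

    // Removes escape sequences that are present as plain text in the input.
    static String stripLiteralEscapes(String s) {
        return LITERAL_ESCAPE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // Real characters: an emoji followed by s-acute, a-ogonek, frowning face.
        String real = "them coming \nlove \ud83d\ude00 \u015b \u0105 \u2639";
        System.out.println(stripSupplementary(real));

        // Literal escape text, as it might arrive in an undecoded payload.
        String raw = "them coming \\nlove \\ud83d\\ude00";
        System.out.println(stripLiteralEscapes(raw));
    }
}
```

Note that the first method keeps BMP symbols such as ♉ and ☹, so if those should go as well, the character class would need to be extended.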