Remove escaped Unicode string in Java with regex


I have the string below:

"them coming \nlove \ud83d\ude00" 

I want to remove the characters "\ud83d\ude00", so that the result is:

"them coming \nlove " 

How can I achieve this in Java? I have tried the code below, but it does not work:

payload.toString().replaceAll("\\\\u\\b{4}.", "")

Thanks :)

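I think \\\\u\\b{4}. does not work because, once the Java source is compiled, \ud83d\ude00 in the string is an actual symbol (the emoji U+1F600), not the literal text \ud83d, so there is no backslash for the regex to match. (Also, \b in a regex is a word boundary, not a hex digit.) To match these unwanted (for whatever reason) Unicode characters, it is better to exclude the characters you accept (the ones you don't want to replace), for example ASCII characters, and match everything else (what you want to replace). Try with: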

[^\x00-\x7f]+ 

The range \x00-\x7f covers the Unicode Basic Latin block (that is, ASCII).
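The difference between real characters and literal escape text can be checked directly. The sketch below is only an illustration: the class EscapeCheck and the helper stripLiteralEscapes are hypothetical names, and the second string assumes a payload that really contains a backslash followed by "u" and four hex digits (for example, text that was never unescaped).

```java
class EscapeCheck {
    // Hypothetical helper: removes literal six-character escape text,
    // i.e. a real backslash, the letter u, and four hex digits.
    static String stripLiteralEscapes(String text) {
        return text.replaceAll("\\\\u[0-9a-fA-F]{4}", "");
    }

    public static void main(String[] args) {
        // The compiler turns these escapes into one emoji (a surrogate pair).
        String decoded = "love \ud83d\ude00";
        // Here the backslashes are escaped, so the text stays literal.
        String literal = "love \\ud83d\\ude00";

        System.out.println(decoded.length()); // 7: five chars plus two surrogates
        System.out.println(literal.length()); // 17: five chars plus two six-char escapes

        // No backslash in the decoded string, so nothing is removed.
        System.out.println(stripLiteralEscapes(decoded).equals(decoded)); // true
        // The literal escape text is removed.
        System.out.println(stripLiteralEscapes(literal).equals("love ")); // true
    }
}
```

A backslash-based pattern is only useful in the second case. For the question's string, which holds real characters, the character class applies: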

String str = "them coming \nlove \ud83d\ude00";
System.out.println(str.replaceAll("[^\\x00-\\x7f]+", ""));

This results in:

them coming
love

However, you will have a problem if you use national characters or other non-ASCII symbols (ś, ą, ♉, ☹, etc.), because they are removed as well.
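If national characters must survive, one possible alternative (not part of the original answer) is to remove only supplementary code points, U+10000 and above, which is where \ud83d\ude00 (U+1F600) lives. This is a minimal sketch under that assumption; EmojiStripper and stripSupplementary are hypothetical names. The non-ASCII test characters are written as escapes so the source file stays ASCII.

```java
class EmojiStripper {
    // Hypothetical helper: removes only code points above the Basic
    // Multilingual Plane (U+10000 to U+10FFFF). Java regex matches by
    // code point, so a surrogate pair is treated as one character.
    static String stripSupplementary(String text) {
        return text.replaceAll("[\\x{10000}-\\x{10FFFF}]+", "");
    }

    public static void main(String[] args) {
        // s-acute (U+015B), a-ogonek (U+0105), frowning face (U+2639),
        // followed by the emoji from the question (U+1F600).
        String str = "\u015b \u0105 \u2639 love \ud83d\ude00";
        String cleaned = stripSupplementary(str);

        // Only the emoji is removed; the other three symbols are kept.
        System.out.println(cleaned.equals("\u015b \u0105 \u2639 love ")); // true
    }
}
```

Note the trade-off: symbols inside the Basic Multilingual Plane, such as ♉ and ☹, are kept by this pattern, so it is not a complete emoji filter.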

