C# - Encode XDocument from win-1251 to UTF-8
I am trying to convert an XDocument from windows-1251 to UTF-8. In the raw view, Russian characters are displayed incorrectly.
    var encoding = new UTF8Encoding(false, false);
    XmlTextWriter xmlTextWriter = new XmlTextWriter("f:\\file", Encoding.GetEncoding("windows-1251"));
    document.Save(xmlTextWriter);
    xmlTextWriter.Close();
    xmlTextWriter = null;
    string text = File.ReadAllText("f:\\file", Encoding.Default);
    XDocument documentCode = XDocument.Parse(text);
    xmlTextWriter = new XmlTextWriter(_stream, encoding);
    documentCode.Save(xmlTextWriter);
    xmlTextWriter.Flush();
    _stream.Position = 0;
    headers.ContentType = new MediaTypeHeaderValue("application/xml");
This is the raw view in SoapUI:
<?xml version="1.0" encoding="utf-8"?><statobservationlist><statobservation><objectid>0b575ec1-7dea-41c4-a1f0-287190715ed2</objectid><name>Тестовое статнаблюдение</name><code>gppcode42</code></statobservation><statobservation><objectid>3a871ea1-06ee-4991-a263-d643b424bdd4</objectid><name>МиСП</name><code /></statobservation></statobservationlist>
I think I've got it now. The text in the XDocument has, for whatever reason, been decoded incorrectly using windows-1251.
Ideally, you need to go to the source of this and ensure the text is decoded correctly (with UTF-8). Converting it back may not be an entirely loss-free process, as there are byte values in UTF-8 data that don't have a representation in windows-1251 (a quick glance at the code page shows nothing at 0x98, for example).
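As a rough illustration of fixing this at the source, the sketch below avoids decoding the XML as text with a guessed encoding: it lets the XML parser read the encoding from the BOM or declaration, and writes the result straight to a stream as UTF-8. The names `XmlEncodingSketch`, `LoadRespectingDeclaration` and `SaveAsUtf8` are made up for this example and are not from the question. Whether this matches the place where the wrong decoding actually happens in your pipeline is an assumption.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class XmlEncodingSketch
{
    // Hypothetical helper: load XML from raw bytes so the parser picks the
    // encoding from the BOM or the <?xml ... encoding="..."?> declaration,
    // instead of decoding the bytes to a string with an assumed code page.
    public static XDocument LoadRespectingDeclaration(Stream input)
    {
        return XDocument.Load(input);
    }

    // Hypothetical helper: serialize the document as UTF-8 without a BOM.
    public static MemoryStream SaveAsUtf8(XDocument document)
    {
        var stream = new MemoryStream();
        var settings = new XmlWriterSettings { Encoding = new UTF8Encoding(false) };

        // Disposing the writer flushes it; the underlying stream stays open
        // because XmlWriterSettings.CloseOutput is false by default.
        using (var writer = XmlWriter.Create(stream, settings))
        {
            document.Save(writer);
        }

        stream.Position = 0; // rewind so the caller can read from the start
        return stream;
    }
}
```

With this approach the text never passes through a string decoded with the wrong code page, so there is nothing to repair afterwards.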
However, to convert after the fact, the simplest way is to get the text back as bytes using the encoding it was decoded with, and then decode those bytes using the correct encoding:
    var windows1251 = Encoding.GetEncoding("windows-1251");
    var utf8 = Encoding.UTF8;
    var originalBytes = windows1251.GetBytes(document.ToString());
    var correctXmlString = utf8.GetString(originalBytes);
    var correctDocument = XDocument.Parse(correctXmlString);
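For completeness, here is a small self-contained sketch of the same round trip that also reports whether anything was lost. `MojibakeRepair.Repair` is a hypothetical helper name, and the `RegisterProvider` call assumes .NET Core 3.0 or later (or .NET 5+), where windows-1251 is not available until the code-pages provider is registered.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper: undo "UTF-8 bytes decoded as windows-1251".
    // 'lossless' is true only if re-garbling the repaired text gives back
    // exactly the input, i.e. no byte was dropped or replaced on the way.
    public static string Repair(string garbled, out bool lossless)
    {
        // Assumption: running on .NET Core / .NET 5+, where code-page
        // encodings must be registered before GetEncoding can find them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding windows1251 = Encoding.GetEncoding("windows-1251");
        Encoding utf8 = Encoding.UTF8;

        // Recover the original bytes, then decode them with the right encoding.
        byte[] originalBytes = windows1251.GetBytes(garbled);
        string repaired = utf8.GetString(originalBytes);

        // Round-trip check: garble the repaired text again and compare.
        string regarbled = windows1251.GetString(utf8.GetBytes(repaired));
        lossless = string.Equals(regarbled, garbled, StringComparison.Ordinal);

        return repaired;
    }
}
```

If `lossless` comes back `false`, either the string was not UTF-8 text misread as windows-1251 in the first place, or some bytes did not survive the trip.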