I have many long documents that need to be parsed. The document format looks like XML, but it is not XML.
Here's an example:
<doc>
  <text>it's content p&g</text>
</doc>
<doc>
  <text>it's another</text>
</doc>

Note that there are multiple root tags (<doc>), and the entity & should be &amp; in XML.
Thus, the above file is not standard XML.
Can I use XmlDocument to parse the file, or should I write my own parser?
What you are saying is incorrect: "not standard XML". The document is not XML. Period.
You cannot use XmlDocument or any other XML parser to parse the complete document as it stands.
You need to ensure you have well-formed XML before you try to parse it with an XML parser.
So, in your case, either wrap the document in a root element or break it out into several documents. In either case, you need to ensure that special characters are encoded correctly (quotes, ampersands, etc.).
The answer by oakio gets part of the way there by treating the document as an XML fragment, but it still doesn't handle invalid content such as unescaped ampersands.
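The wrap-and-escape approach described above can be sketched roughly as follows. This is only an illustration in Python using the standard library's xml.etree.ElementTree, not XmlDocument itself; the same two steps apply in .NET. The helper name parse_multi_root, the wrapper element name "root", and the BARE_AMP pattern are assumptions made for this example. It assumes the only problems are multiple root elements and bare ampersands; a stray < in the text or an embedded XML declaration would still make parsing fail.

```python
import re
import xml.etree.ElementTree as ET

# Matches an ampersand that does NOT start one of XML's five predefined
# entities or a numeric character reference (hypothetical helper pattern).
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")


def parse_multi_root(raw: str) -> ET.Element:
    """Escape bare ampersands, wrap everything in one root, then parse."""
    escaped = BARE_AMP.sub("&amp;", raw)
    # "root" is an arbitrary wrapper name chosen for this sketch.
    return ET.fromstring(f"<root>{escaped}</root>")


sample = (
    "<doc> <text>it's content p&g</text> </doc> "
    "<doc> <text>it's another</text> </doc>"
)

root = parse_multi_root(sample)
docs = root.findall("doc")
texts = [t.text for t in root.iter("text")]
```

After parsing, each original <doc> is a child of the wrapper element, so the documents can be processed one by one. Ampersands that were already escaped correctly are left alone, so running the helper on well-formed input should not double-escape it.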