Using XML functions in oracle and troubleshooting errors – for the new oracle dba





this is my consolidated notes that i gathered when i had issues with xml functions in oracle 10g.so it also includes data from other sites as well as metalink which i found useful.The intention is to give credit to the original authors of some of these articles.

a new oracle dba can find here all detailed information on how to use xml functions in the oracle database and how to troubleshoot problems with different character sets.

http://it.toolbox.com/blogs/oracle-guide/xml-in-the-database-a-brief-overview-4410

 

 

 

 

SELECT length(XMLELEMENT(“Description”,REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(main.shortname,CHR(31),”),CHR(30),”),CHR(29),”),CHR(28),”),CHR(27),”),CHR(26),”),CHR(25),”),CHR(24),”),CHR(23),”),CHR(22),”),CHR(21),”),CHR(20),”),CHR(19),”),CHR(18),”),CHR(16),”),CHR(15),”),CHR(14),”),CHR(12),”),CHR(11),”),CHR(8), ”),CHR(7),”),CHR(6),”),CHR(5),”),CHR(4),”),CHR(3),”),CHR(2),”),CHR(1),”),CHR(0),”)   )) into j

from

YOURTABLEA main WHERE main.rowid In (‘AABEsGAAGAAEDLhAAH’);

 

 

 

 

Subject: SQL Using XML Functions Fails ORA-31011, ORA-19202, LPX-217
  Doc ID: 242166.1 Type: PROBLEM
  Modified Date: 09-APR-2009 Status: PUBLISHED
 
Checked for relevance on 28-Mar-2007
 
The information in this article applies to:
 
DBAs and developers attempting to manipulate data using XML functions.
 
 
Symptom(s)
~~~~~~~~~~
 
The following query gives error:
 
SELECT XMLFOREST(sys_xmlagg(XMLELEMENT("dummy",chr(1))) as "test") from dual;
 
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00217: invalid character 1 (\u0001)
 
The following query works:
 
SELECT XMLFOREST(sys_xmlagg(XMLELEMENT("dummy",chr(50))) as "test") from dual;
 
XMLFOREST(SYS_XMLAGG(XMLELEMEN
 
<test><ROWSET>
  <dummy>2</dummy>
</ROWSET>
</test>
 
Change(s)
~~~~~~~~~~
 
The number in the chr() function is a code for a control character.
 
Cause
~~~~~~~
 
XML doees not support all valid characters in charactersets.  There are limitations as to what is supported by XML.
 
Valid characters supported by XML are listed on the W3.ORG website and are currently listed as:
 
http://www.w3.org/TR/2000/REC-xml-20001006#charsets
 
Character Range
[2]    Char    ::=    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] 
 
Therefore, only characters in the ranges above are supported by XML Parsers.
 
For more detail on characters supported by the XML DB:
 
Note 248344.1 Valid Characters Supported by XDB
 
Fix
~~~~
 
Do not use characters that are not supported by XML standards set forth by W3.ORG.
 
References
~~~~~~~~~~~
 
Note 248344.1 Valid Characters Supported by XDB
 
http://www.w3.org
http://www.w3.org/TR/2000/REC-xml-20001006#charsets
 
[Insert related documents.  This can be manuals, white papers, SURe Notes,
Bugs, or Patches.  Be sure to include the title and valid hyperlink syntax.]
 
Note 248344.1 Valid Characters Supported by XDB
 
 

 

 

ASCII Table

Character Instruction Binary Hex Decimal HTML
Numeric
Entity
VBScript
Null Zero Null 00000000 00 00   Chr(00)
Smiley Face Start Heading 00000001 01 01   Chr(01)
Smiley(black) Start Text 00000010 02 02   Chr(02)
Heart End Text 00000011 03 03   Chr(03)
Diamond End Transmission 00000100 04 04   Chr(04)
Club Enquire 00000101 05 05   Chr(05)
Spade Acknowledge 00000110 06 06   Chr(06)
Circle(black) bell 00000111 07 07   Chr(07)
Rectangle(black) backspace 00001000 08 08   Chr(08)
Circle(white) tab 00001001 09 09 &#009; Chr(09)
Square(black) line feed 00001010 0A 10 &#010; Chr(10)
Mars/male vertical tab 00001011 0B 11   Chr(11)
Venus/female form feed 00001100 0C 12   Chr(12)
Music note RETURN 00001101 0D 13 &#013; Chr(13)
Two music notes shift-out 00001110 0E 14   Chr(14)
Sun shift-in 00001111 0F 15   Chr(15)
Right arrow data link escape 00010000 10 16   Chr(16)
Left arrow device control 1 00010001 11 17   Chr(17)
Up and down arrow device control 2 00010010 12 18   Chr(18)
!! device control 3 00010011 13 19   Chr(19)
paragraph symbol device control 4 00010100 14 20   Chr(20)
Site mark negative acknowlege 00010101 15 21   Chr(21)
synch idle 00010110 16 22   Chr(22)
Double arrow(U&D w/line) end trans block 00010111 17 23   Chr(23)
Up arrow cancel 00011000 18 24   Chr(24)
Down arrow end of medium 00011001 19 25   Chr(25)
arrow right substitute 00011010 1A 26   Chr(26)
arrow left escape 00011011 1B 27   Chr(27)
90 degree angle file separator 00011100 1C 28   Chr(28)
double arrow group separator 00011101 1D 29   Chr(29)
up triangle record separator 00011110 1E 30   Chr(30)
down triangle unit separator 00011111 1F 31   Chr(31)
Space Space 00100000 20 32   Chr(32)
!   00100001 21 33 &#033; Chr(33)
  00100010 22 34 &#034; Chr(34)
#   00100011 23 35 &#035; Chr(35)
$   00100100 24 36 &#036; Chr(36)
%   00100101 25 37 &#037; Chr(37)
&   00100110 26 38 &#038; Chr(38)
  00100111 27 39 &#039; Chr(39)
(   00101000 28 40 &#040; Chr(40)
)   00101001 29 41 &#041; Chr(41)
*   00101010 2A 42 &#042; Chr(42)
+   00101011 2B 43 &#043; Chr(43)
,   00101100 2C 44 &#044; Chr(44)
  00101101 2D 45 &#045; Chr(45)
.   00101110 2E 46 &#046; Chr(46)
/   00101111 2F 47 &#047; Chr(47)
0   00110000 30 48 &#048; Chr(48)
1   00110001 31 49 &#049; Chr(49)
2   00110010 32 50 &#050; Chr(50)
3   00110011 33 51 &#051; Chr(51)
4   00110100 34 52 &#052; Chr(52)
5   00110101 35 53 &#053; Chr(53)
6   00110110 36 54 &#054; Chr(54)
7   00110111 37 55 &#055; Chr(55)
8   00111000 38 56 &#056; Chr(56)
9   00111001 39 57 &#057; Chr(57)
:   00111010 3A 58 &#058; Chr(58)
;   00111011 3B 59 &#059; Chr(59)
<    00111100 3C 60 &#060; Chr(60)
=   00111101 3D 61 &#061; Chr(61)
>    00111110 3E 62 &#062; Chr(62)
?   00111111 3F 63 &#063; Chr(63)
@   01000000 40 64 &#064; Chr(64)
A   01000001 41 65 &#065; Chr(65)
B   01000010 42 66 &#066; Chr(66)
C   01000011 43 67 &#067; Chr(67)
D   01000100 44 68 &#068; Chr(68)
E   01000101 45 69 &#069; Chr(69)
F   01000110 46 70 &#070; Chr(70)
G   01000111 47 71 &#071; Chr(71)
H   01001000 48 72 &#072; Chr(72)
I   01001001 49 73 &#073; Chr(73)
J   01001010 4A 74 &#074; Chr(74)
K   01001011 4B 75 &#075; Chr(75)
L   01001100 4C 76 &#076; Chr(76)
M   01001101 4D 77 &#077; Chr(77)
N   01001110 4E 78 &#078; Chr(78)
O   01001111 4F 79 &#079; Chr(79)
P   01010000 50 80 &#080; Chr(80)
Q   01010001 51 81 &#081; Chr(81)
R   01010010 52 82 &#082; Chr(82)
S   01010011 53 83 &#083; Chr(83)
T   01010100 54 84 &#084; Chr(84)
U   01010101 55 85 &#085; Chr(85)
V   01010110 56 86 &#086; Chr(86)
W   01010111 57 87 &#087; Chr(87)
X   01011000 58 88 &#088; Chr(88)
Y   01011001 59 89 &#089; Chr(89)
Z   01011010 5A 90 &#090; Chr(90)
[   01011011 5B 91 &#091; Chr(91)
\   01011100 5C 92 &#092; Chr(92)
]   01011101 5D 93 &#093; Chr(93)
^   01011110 5E 94 &#094; Chr(94)
_   01011111 5F 95 &#095; Chr(95)
$   01100000 60 96 &#096; Chr(96)
a   01100001 61 97 &#097; Chr(97)
b   01100010 62 98 &#098; Chr(98)
c   01100011 63 99 &#099; Chr(99)
d   01100100 64 100 &#100; Chr(100)
e   01100101 65 101 &#101; Chr(101)
f   01100110 66 102 &#102; Chr(102)
g   01100111 67 103 &#103; Chr(103)
h   01101000 68 104 &#104; Chr(104)
i   01101001 69 105 &#105; Chr(105)
j   01101010 6A 106 &#106; Chr(106)
k   01101011 6B 107 &#107; Chr(107)
l   01101100 6C 108 &#108; Chr(108)
m   01101101 6D 109 &#109; Chr(109)
n   01101110 6E 110 &#110; Chr(110)
o   01101111 6F 111 &#111; Chr(111)
p   01110000 70 112 &#112; Chr(112)
q   01110001 71 113 &#113; Chr(113)
r   01110010 72 114 &#114; Chr(114)
s   01110011 73 115 &#115; Chr(115)
t   01110100 74 116 &#116; Chr(116)
u   01110101 75 117 &#117; Chr(117)
v   01110110 76 118 &#118; Chr(118)
w   01110111 77 119 &#119; Chr(119)
x   01111000 78 120 &#120; Chr(120)
y   01111001 79 121 &#121; Chr(121)
z   01111010 7A 122 &#122; Chr(122)
{   01111011 7B 123 &#123; Chr(123)
|   01111100 7C 124 &#124; Chr(124)
}   01111101 7D 125 &#125; Chr(125)
~   01111110 7E 126 &#126; Chr(126)
  DELETE 01111111 7F 127   Chr(127)

 

 

Subject: DBMS_XMLQUERY Gives XML Document Which Contains Invalid Character
  Doc ID: 397671.1 Type: HOWTO
  Modified Date: 03-NOV-2006 Status: MODERATED

In this Document
  Goal
  Solution
  References


This document is being delivered to you via Oracle Support’s Rapid Visibility (RaV) process, and therefore has not been subject to an independent technical review.

Applies to:

Oracle Server – Enterprise Edition – Version: 10.2.0.2.0
Information in this document applies to any platform.

Goal

Running the following, where chr(25) is an invalid character per W3 standard, gives back an XML
document:

select DBMS_XMLQUERY.getxml(‘select ”Illegal Character: ” || CHR(25) as TEXTCOL from dual’) from
dual;

<?xml version = ‘1.0’?>
<ROWSET>
<ROW num=”1″>
<TEXTCOL>Illegal Character: </TEXTCOL>
</ROW>
</ROWSET>

If a valid character is used in the query it is replaced with a character entity as expected:

select DBMS_XMLQUERY.getxml(‘select ”Illegal Character: ” || CHR(38) as TEXTCOL from dual’) from
dual;

<?xml version = ‘1.0’?>
<ROWSET>
<ROW num=”1″>
<TEXTCOL>Illegal Character: &</TEXTCOL>
</ROW>
</ROWSET>

Solution

The XML output which includes the invalid character is expected behavior. If XML parsing of the resulting XML is needed (check for invalid characters) use DBMS_XMLGEN, which has a setCheckInvalidChars function for turning on/off the invalid character test.

References

Note 248344.1 – Valid Characters Supported by XDB


Help us improve our service. Please email us your comments for this document. .

 

 

Subject: XMLELEMENT Function Returns Strange Tags Used With DBMS_XMLGEN
  Doc ID: 565787.1 Type: PROBLEM
  Modified Date: 09-FEB-2009 Status: PUBLISHED

In this Document
  Symptoms
  Cause
  Solution
  References


Applies to:

Oracle XML Developers Kit – Version: 9.2.0.8 to 11.1.0.6
This problem can occur on any platform.

Symptoms

The XMLELEMENT function returns strange tags used inside DBMS_XMLGEN. The following code is showing that behavior:

declare
Result CLOB;
qryctx dbms_xmlgen.ctxhandle;
begin
qryCtx:= dbms_xmlgen.newContext(‘select xmlelement(“keys”, ‘||
‘ xmlelement(“id”,empno), ‘ ||
‘ xmlelement(“name”,ename))’ ||
‘,xmlelement(“employee”, deptno)’||
‘ FROM EMP’ ||
‘ where empno=7369’ );
result := dbms_xmlgen.getXml(qryCtx);
dbms_xmlgen.closeContext(qryCtx);
dbms_output.put_line(result);
end;
/

OUTPUT:

<?xml version=”1.0″?>
<ROWSET>
<ROW>

<XMLELEMENT_x0028__x0022_KEYS_x0022__x002C_XMLELEMENT_x0028__x0022_ID_x0022__x00
2C_EMPNO_x0029__x002C_XMLELEMENT_x0028__x0022_NAME_x0022__x002C_ENAME_x0029__x00
29_>
<keys><id>7369</id><name>SMITH</name></keys>
</XMLELEMENT_x0028__x0022_KEYS_x0022__x002C_XMLELEMENT_x0028__x0022_ID_x0022__x0
02C_EMPNO_x0029__x002C_XMLELEMENT_x0028__x0022_NAME_x0022__x002C_ENAME_x0029__x0
029_>
<XMLELEMENT_x0028__x0022_EMPLOYEE_x0022__x002C_DEPTNO_x0029_>

<employee>20</employee>
</XMLELEMENT_x0028__x0022_EMPLOYEE_x0022__x002C_DEPTNO_x0029_>

</ROW>
</ROWSET>
Running the XMLELEMENT function outside DBMS_XMLGEN in SQLPLUS the output is as expected:

SQL> select xmlelement(“keys”, xmlelement(“id”,empno),
xmlelement(“name”,ename)),
xmlelement(“employee”, deptno) FROM EMP where empno=7369;
XMLELEMENT(“KEYS”,XMLELEMENT(“ID”,EMPNO),XMLELEMENT(“NAME”,ENAME))
——————————————————————————–
XMLELEMENT(“EMPLOYEE”,DEPTNO)
——————————————————————————–
<keys><id>7369</id><name>SMITH</name></keys>
<employee>20</employee>

 

Cause

DBMS_XMLGEN takes a normal SQL query and converts the output to xml.
However if there is a query already returning xml it’s trying to further convert that.
Hence the generated tag names based on the query contents.
This can be controled by the rowset and row tag names by DBMS_XMLGEN calls but
the element tag names come from the query itself so to
change them from what DBMS_XMLGEN generates you need to use aliases.

See also Bug 6997578 XMLELEMENT FUNCTION RETURNS STRANGE TAGS USED WITH DBMS_XMLGEN
.

Solution

To implement the solution, please execute the following steps:

1. To avoid the strange tags you need to use aliases. Modify the code the following way:

declare
Result CLOB;
qryctx dbms_xmlgen.ctxhandle;
begin
qryCtx:= dbms_xmlgen.newContext(
'select xmlelement("keys", '||
' xmlelement("id",empno), ' ||
' xmlelement("name",ename)) "Id"' ||
',xmlelement("employee", deptno) "Department"'||
' FROM EMP' ||
' where empno=7369' );
dbms_xmlgen.SETROWSETTAG(qryCtx,'Employees');
dbms_xmlgen.SETROWTAG(qryCtx,'EmployeeDetails');
result := dbms_xmlgen.getXml(qryCtx);
dbms_xmlgen.closeContext(qryCtx);
dbms_output.put_line(result);
end;

OUTPUT:
<?xml version="1.0"?>
<Employees>
<EmployeeDetails>
<Id>

<keys><id>7369</id><name>SMITH</name></keys> </Id>
<Department>

<employee>20</employee> </Department>
</EmployeeDetails>
</Employees>

 

Subject: How to Escape Special XML Characters Using XML SQL Utility
  Doc ID: 171951.1 Type: PROBLEM
  Modified Date: 20-AUG-2009 Status: PUBLISHED

In this Document
  Symptoms
  Cause
  Solution


Applies to:

Oracle XML SQL Utility (XSU) – Version: 9.0.1.0 to 11.1.0.7
This problem can occur on any platform.

Symptoms

You have an ‘&’ (ampersand) stored as a part of the data inside the database table. You want to retrieve the data from table in XML format using the Oracle provided XML SQL Utility(XSU) and retain the format of the text (for example if you have data that is already in XML format).

You use DBMS_XMLQuery package to query the data and the output result shows ‘&amp;’ instead of ‘&’ (ampersand).

Cause

There is no way of easily handling this using the DBMS_XMLQUERY package, use the DBMS_XMLGEN package that comes with 9i database onwards instead.

In raw XML, the ampersand(&), less-than sign(<), greater-than sign(>), double quotes(“) and apostrophes(‘) are interpreted as part of the markup instead of character data. Therefore the XSU encodes the raw ampersand as ‘&amp;’ when converting the text data stored inside the database into a well-formed XML
document. This can be a problem if the data you are processing is already in XML format and so Oracle has provided an API call to make this easier to do in the DBMS_XMLGEN package.

Solution

Starting with 10g DBMS_XMLQUERY package(written java) is deprecated and ct’s are recommended to use DBMS_XMLGEN package DBMS_XMLGEN  package is similar to the DBMS_XMLQuery package, except that it is written in C and compiled into the database kernel.

The package has a procedure “setConvertSpecialChars(ctxHandle, boolean);” through which you can set whether special characters should be converted or not to their escaped representation. The default is to perform the conversion.

Code Sample using DBMS_XMLGEN Pacakge to Escape special XML Characters

REM serveroutput on so we can see the results
set serveroutput on
REM scan off to prevent parsing of ampersands in sql*plus
set scan off

REM Drop table if present

drop table xmltab;

REM Create table
  
create table xmltab (specchar varchar2(4000));

REM Insert the data that has special XML characters like '&', '<' and '>'

   insert into xmltab values ('my text &amp; comment');
   insert into xmltab values ('<index>1</index><data>My Data</data>');

REM  Run the anonymous PL/SQL block to retrieve the data in XML format from
REM  the "xmltab" table 
  
declare
  
  -- declare variables
  queryCtx DBMS_XMLGEN.ctxhandle;
  result CLOB;
  xmlstr varchar2(32767);
  line varchar2(4000);

begin

   -- create a new context handle from a passed in SQL query
   queryCtx := DBMS_XMLGEN.newContext('select * from xmltab');

   -- set the conversion of special xml characters to false
   DBMS_XMLGEN.setConvertSpecialChars(queryCtx, false);
    
   -- get XML
   result := DBMS_XMLGEN.getXML(queryCtx);

   -- to print the data
   xmlstr := dbms_lob.SUBSTR(result,32767);
   loop
      exit when xmlstr is null;
      line := substr(xmlstr,1,instr(xmlstr,chr(10))-1);
      dbms_output.put_line('| '||line);
      xmlstr := substr(xmlstr,instr(xmlstr,chr(10))+1);
   end loop;

   -- close the context
   DBMS_XMLGEN.closeContext(queryCtx);  
end;
/

Result :-

| <?xml version=”1.0″?>
| <ROWSET>
| <ROW>
| <SPECCHAR>my text &amp; comment</SPECCHAR>
| </ROW>
| <ROW>
| <SPECCHAR><index>1</index><data>My Data</data></SPECCHAR>
| </ROW>
| </ROWSET>

 

References
———-
Oracle® XML DB Developer’s Guide 11g Release 1 (11.1)
Part Number B28369-01

 

Subject: How To Generate XML Output From a Relational Table?
  Doc ID: 395699.1 Type: HOWTO
  Modified Date: 14-DEC-2006 Status: PUBLISHED

In this Document
  Goal
  Solution


Applies to:

Oracle Server – Enterprise Edition – Version: 9.2.0 to 10.2.0
Information in this document applies to any platform.

Goal

How to generate XML output from a relational table?

Solution

DBMS_XMLGEN is one method to create XML documents from a regular Oracle Table.
For a quick example:
From SQL*Plus connected to the database as SCOTT/TIGER enter:

set pagesize 999
set linesize 100
set long 2000

create table temp_emp_xml (xml xmltype);

declare
  qryctx dbms_xmlgen.ctxhandle;
  result clob;
begin
  qryctx := dbms_xmlgen.newcontext('select ename,deptno from emp');
  dbms_xmlgen.setrowtag(qryctx,'employee');
  result:=dbms_xmlgen.getxml(qryctx);
  insert into temp_emp_xml values(XMLType(result));
  dbms_xmlgen.closecontext(qryctx);
end;
/

select * from temp_emp_xml;
RESULT
——————————————————————————-

<?xml version=”1.0″?>
<ROWSET>
<employee>
<ENAME>SMITH</ENAME>
<DEPTNO>20</DEPTNO>
</employee>
<employee>
<ENAME>ALLEN</ENAME>
<DEPTNO>30</DEPTNO>
</employee>
<employee>
<ENAME>WARD</ENAME>
<DEPTNO>30</DEPTNO>
</employee>
<employee>
<ENAME>JONES</ENAME>
<DEPTNO>20</DEPTNO>
</employee>
<employee>
<ENAME>MARTIN</ENAME>
<DEPTNO>30</DEPTNO>
</employee>

<employee>
<ENAME>MILLER</ENAME>
<DEPTNO>10</DEPTNO>
</employee>
</ROWSET>

 

 

 

Subject: How To Show The Column Tag When There Is No Data (NULL Value)
  Doc ID: 560462.1 Type: HOWTO
  Modified Date: 02-MAR-2009 Status: PUBLISHED

In this Document
  Goal
  Solution


Applies to:

Oracle Server Enterprise Edition – Version: 10.1 to 11.1
Oracle XML Developers Kit – Version: 10.2 to 11.1
Information in this document applies to any platform.

Goal

How to get an empty tag generated from the DBMS_XMLGEN.setNullHandling procedure?

Solution

Pass the DBMS_XMLGEN.EMPTY_TAG constant to the flag argument.

set long 10000;
create table xml_test (a number, b varchar2(15), c varchar2(20));
insert into xml_test values (1, NULL, 'text');
commit;

var i CLOB;

DECLARE
SqlQuery VARCHAR2(4000) := 'select * from xml_test';
QueryCursor sys_refcursor;
XMLReport CLOB;
QryCtx DBMS_XMLGEN.CTXHANDLE;
BEGIN
dbms_lob.createtemporary(XMLReport,FALSE);
OPEN QueryCursor FOR SqlQuery;
QryCtx := DBMS_XMLGEN.NEWCONTEXT(QueryCursor);
DBMS_XMLGEN.setNullHandling(QryCtx,DBMS_XMLGEN.EMPTY_TAG);
:i := DBMS_XMLGEN.GETXML(QryCtx,DBMS_XMLGEN.NONE);
DBMS_XMLGEN.closeContext(QryCtx);
dbms_lob.freetemporary(XMLReport);
end;
/

print i;

Result:

<?xml version=”1.0″?>
<ROWSET>
<ROW>
<A>1</A>
<B/>
<C>text</C>
</ROW>
</
ROWSET>

The <B/> tag is an accepted output in the XML standard.

 

 

 

 

To workaround the invalid characters

 

SELECT

    XMLELEMENT(name “row”,

        XMLELEMENT( NAME “Description”,

            

            REPLACE(REPLACE(REPLACE(

            REPLACE(REPLACE(REPLACE(   

            REPLACE(REPLACE(REPLACE(

                  REPLACE(REPLACE(REPLACE(           

                  REPLACE(REPLACE(REPLACE(

                  REPLACE(REPLACE(REPLACE(

                  REPLACE(REPLACE(REPLACE(

                  REPLACE(REPLACE(REPLACE(

                  REPLACE(REPLACE(REPLACE(

                  REPLACE(t.description,

                 

                  /* \x0e-\x1f*/

                  CHAR(31), ”),CHAR(30), ”),CHAR(29), ”),

                  CHAR(28), ”),CHAR(27), ”),CHAR(26), ”),

                  CHAR(25), ”),CHAR(24), ”),CHAR(23), ”),

                  CHAR(22), ”),CHAR(21), ”),CHAR(20), ”),

                  CHAR(19), ”),CHAR(18), ”),CHAR(16), ”),

                  CHAR(15), ”),CHAR(14), ”),

 

                  /* \x0b-\x0c */

                  CHAR(12), ”),CHAR(11), ”),

 

                  /* \x00-\x08 */

                  CHAR(8), ”),CHAR(7), ”),CHAR(6), ”),

                  CHAR(5), ”),CHAR(4), ”),CHAR(3), ”),

                  CHAR(2), ”),CHAR(1), ”),CHAR(0), ”)

            )

      )

FROM AbacBill.dbo.InitialTicketsImport t

 

 

 

The System.Xml.XmlTextWriter does not know which version XML is being generated. There is no API to configure it one way or the other. The XmlLayout does not generate a full XML document, only a fragment which must be included in a document.

If the XML output in included in an XML 1.1 document then the numeric character references in the additional ranges allowed by the 1.1 spec will be valid. However this is outside of the scope of log4net to enforce.

The XmlLayout must be told which XML version is being targeted and must default to 1.0 not to 1.1.

For invalid characters such as 0x1e there are 3 possible solutions:

1) Discard the character from the output.

2) Replace the character with a numeric representation e.g. “0x1E”.

3) Replace the character with an XML element e.g. <char code=”30″/>

Regardless of the output version (1.0 or 1.1) selected one of the above choices will need to be made. XML version 1.1 does not allow a NULL (0x0) character to appear un-encoded or as a numeric character reference, therefore this will need to be represented in some way.

Note that the invalid characters cannot be included in a CDATA block, however there are issues with some parsers that do allow them there when they should not.

I favour option 3 above because information is not lost. In options 1 and 2 information is lost. In 2 the encoding is not reversible. With 3 the application reading the data requires additional smarts to pickup on the encoded values in element, but all the original information is preserved. If the app just asks for the text nodes, ignoring the child elements, then they will get back the same result as from 1.

[ Show » ]

Nicko Cadell added a comment – 10/Apr/05 09:02 PM The System.Xml.XmlTextWriter does not know which version XML is being generated. There is no API to configure it one way or the other. The XmlLayout does not generate a full XML document, only a fragment which must be included in a document. If the XML output in included in an XML 1.1 document then the numeric character references in the additional ranges allowed by the 1.1 spec will be valid. However this is outside of the scope of log4net to enforce. The XmlLayout must be told which XML version is being targeted and must default to 1.0 not to 1.1. For invalid characters such as 0x1e there are 3 possible solutions: 1) Discard the character from the output. 2) Replace the character with a numeric representation e.g. “0x1E”. 3) Replace the character with an XML element e.g. <char code=”30″/> Regardless of the output version (1.0 or 1.1) selected one of the above choices will need to be made. XML version 1.1 does not allow a NULL (0x0) character to appear un-encoded or as a numeric character reference, therefore this will need to be represented in some way. Note that the invalid characters cannot be included in a CDATA block, however there are issues with some parsers that do allow them there when they should not. I favour option 3 above because information is not lost. In options 1 and 2 information is lost. In 2 the encoding is not reversible. With 3 the application reading the data requires additional smarts to pickup on the encoded values in element, but all the original information is preserved. If the app just asks for the text nodes, ignoring the child elements, then they will get back the same result as from 1.
 

[ Permlink | « Hide ]

Niall Daley added a comment – 24/Aug/05 03:31 PM

By default characters that can not be specified in XML will now be masked by a ?. This can be changed by setting InvalidCharReplacement to a different string. Alternatively set Base64EncodeMessage or Base64EncodeProperties to true, as appropriate, to Base64 encode the data. This allows all values to be output safely.

 

 

Here is what I did:
I created a function that checks for invalid characters like chr(0) and chr(10) etc… and replaces them with something else. See below. What is happening when you get the invalid character 0x0 is basically an end of string is being passed to the XML parser. Since the end of string causes the XML to become malformed you get the message I produced in an earlier posting on this topic. This function when called from another process will help to clean up the data in such a way that the chr(0) will not malform the data. Like the chr(0) the remaining values have also been determined to cause issues when transforming data from structured query to XML. This replace is function is just an example and while it can be used as is yuou should make your own determination as to what to replace the values with. Remember too that this was based on a Western European character set and you would need to change some of the values and add others for Unicode.

CREATE OR REPLACE function OLTP.change_illegal_char
(p_val in varchar2)
return varchar2
is
v_val varchar2(10000);
begin
v_val := replace(p_val,chr(0),’~0′);
v_val := replace(v_val,’&’,’&amp;’);
v_val := replace(v_val,”,’~na~’);
v_val := replace(v_val,'<‘,’&lt;’);
v_val := replace(v_val,'”‘,’&quot;’);
v_val := replace(v_val,’>’,’&gt;’);
v_val := replace(v_val,””,’&apos;’);
v_val := replace(v_val,chr(12),’FF’);
v_val := replace(v_val,chr(10),’LF’);
v_val := replace(v_val,’/’,’/’);
v_val := replace(v_val,’¿’,’¿’);
v_val := replace(v_val,’.’,’ð’);
return(v_val);
end;
/ [ Show » ]

Niall Daley added a comment – 24/Aug/05 03:31 PM By default characters that can not be specified in XML will now be masked by a ?. This can be changed by setting InvalidCharReplacement to a different string. Alternatively set Base64EncodeMessage or Base64EncodeProperties to true, as appropriate, to Base64 encode the data. This allows all values to be output safely.

 

Author: admin