this is my consolidated notes that i gathered when i had issues with xml functions in oracle 10g.so it also includes data from other sites as well as metalink which i found useful.The intention is to give credit to the original authors of some of these articles.
a new oracle dba can find here all detailed information on how to use xml functions in the oracle database and how to troubleshoot problems with different character sets.
http://it.toolbox.com/blogs/oracle-guide/xml-in-the-database-a-brief-overview-4410
SELECT length(XMLELEMENT(“Description”,REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(main.shortname,CHR(31),”),CHR(30),”),CHR(29),”),CHR(28),”),CHR(27),”),CHR(26),”),CHR(25),”),CHR(24),”),CHR(23),”),CHR(22),”),CHR(21),”),CHR(20),”),CHR(19),”),CHR(18),”),CHR(16),”),CHR(15),”),CHR(14),”),CHR(12),”),CHR(11),”),CHR(8), ”),CHR(7),”),CHR(6),”),CHR(5),”),CHR(4),”),CHR(3),”),CHR(2),”),CHR(1),”),CHR(0),”) )) into j
from
YOURTABLEA main WHERE main.rowid In (‘AABEsGAAGAAEDLhAAH’);
Subject: | SQL Using XML Functions Fails ORA-31011, ORA-19202, LPX-217 | |||
Doc ID: | 242166.1 | Type: | PROBLEM | |
Modified Date: | 09-APR-2009 | Status: | PUBLISHED |
Checked for relevance on 28-Mar-2007
The information in this article applies to:
DBAs and developers attempting to manipulate data using XML functions.
Symptom(s)
~~~~~~~~~~
The following query gives error:
SELECT XMLFOREST(sys_xmlagg(XMLELEMENT("dummy",chr(1))) as "test") from dual;
ORA-31011: XML parsing failed
ORA-19202: Error occurred in XML processing
LPX-00217: invalid character 1 (\u0001)
The following query works:
SELECT XMLFOREST(sys_xmlagg(XMLELEMENT("dummy",chr(50))) as "test") from dual;
XMLFOREST(SYS_XMLAGG(XMLELEMEN
<test><ROWSET>
<dummy>2</dummy>
</ROWSET>
</test>
Change(s)
~~~~~~~~~~
The number in the chr() function is a code for a control character.
Cause
~~~~~~~
XML doees not support all valid characters in charactersets. There are limitations as to what is supported by XML.
Valid characters supported by XML are listed on the W3.ORG website and are currently listed as:
http://www.w3.org/TR/2000/REC-xml-20001006#charsets
Character Range
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
Therefore, only characters in the ranges above are supported by XML Parsers.
For more detail on characters supported by the XML DB:
Note 248344.1 Valid Characters Supported by XDB
Fix
~~~~
Do not use characters that are not supported by XML standards set forth by W3.ORG.
References
~~~~~~~~~~~
Note 248344.1 Valid Characters Supported by XDB
http://www.w3.org
http://www.w3.org/TR/2000/REC-xml-20001006#charsets
[Insert related documents. This can be manuals, white papers, SURe Notes,
Bugs, or Patches. Be sure to include the title and valid hyperlink syntax.]
Note 248344.1 Valid Characters Supported by XDB
ASCII Table
Character | Instruction | Binary | Hex | Decimal | HTML Numeric Entity |
VBScript |
Null Zero | Null | 00000000 | 00 | 00 | Chr(00) | |
Smiley Face | Start Heading | 00000001 | 01 | 01 | Chr(01) | |
Smiley(black) | Start Text | 00000010 | 02 | 02 | Chr(02) | |
Heart | End Text | 00000011 | 03 | 03 | Chr(03) | |
Diamond | End Transmission | 00000100 | 04 | 04 | Chr(04) | |
Club | Enquire | 00000101 | 05 | 05 | Chr(05) | |
Spade | Acknowledge | 00000110 | 06 | 06 | Chr(06) | |
Circle(black) | bell | 00000111 | 07 | 07 | Chr(07) | |
Rectangle(black) | backspace | 00001000 | 08 | 08 | Chr(08) | |
Circle(white) | tab | 00001001 | 09 | 09 | 	 | Chr(09) |
Square(black) | line feed | 00001010 | 0A | 10 | 
 | Chr(10) |
Mars/male | vertical tab | 00001011 | 0B | 11 | Chr(11) | |
Venus/female | form feed | 00001100 | 0C | 12 | Chr(12) | |
Music note | RETURN | 00001101 | 0D | 13 | 
 | Chr(13) |
Two music notes | shift-out | 00001110 | 0E | 14 | Chr(14) | |
Sun | shift-in | 00001111 | 0F | 15 | Chr(15) | |
Right arrow | data link escape | 00010000 | 10 | 16 | Chr(16) | |
Left arrow | device control 1 | 00010001 | 11 | 17 | Chr(17) | |
Up and down arrow | device control 2 | 00010010 | 12 | 18 | Chr(18) | |
!! | device control 3 | 00010011 | 13 | 19 | Chr(19) | |
paragraph symbol | device control 4 | 00010100 | 14 | 20 | Chr(20) | |
Site mark | negative acknowlege | 00010101 | 15 | 21 | Chr(21) | |
– | synch idle | 00010110 | 16 | 22 | Chr(22) | |
Double arrow(U&D w/line) | end trans block | 00010111 | 17 | 23 | Chr(23) | |
Up arrow | cancel | 00011000 | 18 | 24 | Chr(24) | |
Down arrow | end of medium | 00011001 | 19 | 25 | Chr(25) | |
arrow right | substitute | 00011010 | 1A | 26 | Chr(26) | |
arrow left | escape | 00011011 | 1B | 27 | Chr(27) | |
90 degree angle | file separator | 00011100 | 1C | 28 | Chr(28) | |
double arrow | group separator | 00011101 | 1D | 29 | Chr(29) | |
up triangle | record separator | 00011110 | 1E | 30 | Chr(30) | |
down triangle | unit separator | 00011111 | 1F | 31 | Chr(31) | |
Space | Space | 00100000 | 20 | 32 | Chr(32) | |
! | 00100001 | 21 | 33 | ! | Chr(33) | |
“ | 00100010 | 22 | 34 | " | Chr(34) | |
# | 00100011 | 23 | 35 | # | Chr(35) | |
$ | 00100100 | 24 | 36 | $ | Chr(36) | |
% | 00100101 | 25 | 37 | % | Chr(37) | |
& | 00100110 | 26 | 38 | & | Chr(38) | |
‘ | 00100111 | 27 | 39 | ' | Chr(39) | |
( | 00101000 | 28 | 40 | ( | Chr(40) | |
) | 00101001 | 29 | 41 | ) | Chr(41) | |
* | 00101010 | 2A | 42 | * | Chr(42) | |
+ | 00101011 | 2B | 43 | + | Chr(43) | |
, | 00101100 | 2C | 44 | , | Chr(44) | |
– | 00101101 | 2D | 45 | - | Chr(45) | |
. | 00101110 | 2E | 46 | . | Chr(46) | |
/ | 00101111 | 2F | 47 | / | Chr(47) | |
0 | 00110000 | 30 | 48 | 0 | Chr(48) | |
1 | 00110001 | 31 | 49 | 1 | Chr(49) | |
2 | 00110010 | 32 | 50 | 2 | Chr(50) | |
3 | 00110011 | 33 | 51 | 3 | Chr(51) | |
4 | 00110100 | 34 | 52 | 4 | Chr(52) | |
5 | 00110101 | 35 | 53 | 5 | Chr(53) | |
6 | 00110110 | 36 | 54 | 6 | Chr(54) | |
7 | 00110111 | 37 | 55 | 7 | Chr(55) | |
8 | 00111000 | 38 | 56 | 8 | Chr(56) | |
9 | 00111001 | 39 | 57 | 9 | Chr(57) | |
: | 00111010 | 3A | 58 | : | Chr(58) | |
; | 00111011 | 3B | 59 | ; | Chr(59) | |
< | 00111100 | 3C | 60 | < | Chr(60) | |
= | 00111101 | 3D | 61 | = | Chr(61) | |
> | 00111110 | 3E | 62 | > | Chr(62) | |
? | 00111111 | 3F | 63 | ? | Chr(63) | |
@ | 01000000 | 40 | 64 | @ | Chr(64) | |
A | 01000001 | 41 | 65 | A | Chr(65) | |
B | 01000010 | 42 | 66 | B | Chr(66) | |
C | 01000011 | 43 | 67 | C | Chr(67) | |
D | 01000100 | 44 | 68 | D | Chr(68) | |
E | 01000101 | 45 | 69 | E | Chr(69) | |
F | 01000110 | 46 | 70 | F | Chr(70) | |
G | 01000111 | 47 | 71 | G | Chr(71) | |
H | 01001000 | 48 | 72 | H | Chr(72) | |
I | 01001001 | 49 | 73 | I | Chr(73) | |
J | 01001010 | 4A | 74 | J | Chr(74) | |
K | 01001011 | 4B | 75 | K | Chr(75) | |
L | 01001100 | 4C | 76 | L | Chr(76) | |
M | 01001101 | 4D | 77 | M | Chr(77) | |
N | 01001110 | 4E | 78 | N | Chr(78) | |
O | 01001111 | 4F | 79 | O | Chr(79) | |
P | 01010000 | 50 | 80 | P | Chr(80) | |
Q | 01010001 | 51 | 81 | Q | Chr(81) | |
R | 01010010 | 52 | 82 | R | Chr(82) | |
S | 01010011 | 53 | 83 | S | Chr(83) | |
T | 01010100 | 54 | 84 | T | Chr(84) | |
U | 01010101 | 55 | 85 | U | Chr(85) | |
V | 01010110 | 56 | 86 | V | Chr(86) | |
W | 01010111 | 57 | 87 | W | Chr(87) | |
X | 01011000 | 58 | 88 | X | Chr(88) | |
Y | 01011001 | 59 | 89 | Y | Chr(89) | |
Z | 01011010 | 5A | 90 | Z | Chr(90) | |
[ | 01011011 | 5B | 91 | [ | Chr(91) | |
\ | 01011100 | 5C | 92 | \ | Chr(92) | |
] | 01011101 | 5D | 93 | ] | Chr(93) | |
^ | 01011110 | 5E | 94 | ^ | Chr(94) | |
_ | 01011111 | 5F | 95 | _ | Chr(95) | |
$ | 01100000 | 60 | 96 | ` | Chr(96) | |
a | 01100001 | 61 | 97 | a | Chr(97) | |
b | 01100010 | 62 | 98 | b | Chr(98) | |
c | 01100011 | 63 | 99 | c | Chr(99) | |
d | 01100100 | 64 | 100 | d | Chr(100) | |
e | 01100101 | 65 | 101 | e | Chr(101) | |
f | 01100110 | 66 | 102 | f | Chr(102) | |
g | 01100111 | 67 | 103 | g | Chr(103) | |
h | 01101000 | 68 | 104 | h | Chr(104) | |
i | 01101001 | 69 | 105 | i | Chr(105) | |
j | 01101010 | 6A | 106 | j | Chr(106) | |
k | 01101011 | 6B | 107 | k | Chr(107) | |
l | 01101100 | 6C | 108 | l | Chr(108) | |
m | 01101101 | 6D | 109 | m | Chr(109) | |
n | 01101110 | 6E | 110 | n | Chr(110) | |
o | 01101111 | 6F | 111 | o | Chr(111) | |
p | 01110000 | 70 | 112 | p | Chr(112) | |
q | 01110001 | 71 | 113 | q | Chr(113) | |
r | 01110010 | 72 | 114 | r | Chr(114) | |
s | 01110011 | 73 | 115 | s | Chr(115) | |
t | 01110100 | 74 | 116 | t | Chr(116) | |
u | 01110101 | 75 | 117 | u | Chr(117) | |
v | 01110110 | 76 | 118 | v | Chr(118) | |
w | 01110111 | 77 | 119 | w | Chr(119) | |
x | 01111000 | 78 | 120 | x | Chr(120) | |
y | 01111001 | 79 | 121 | y | Chr(121) | |
z | 01111010 | 7A | 122 | z | Chr(122) | |
{ | 01111011 | 7B | 123 | { | Chr(123) | |
| | 01111100 | 7C | 124 | | | Chr(124) | |
} | 01111101 | 7D | 125 | } | Chr(125) | |
~ | 01111110 | 7E | 126 | ~ | Chr(126) | |
DELETE | 01111111 | 7F | 127 | Chr(127) |
Subject: | DBMS_XMLQUERY Gives XML Document Which Contains Invalid Character | |||
Doc ID: | 397671.1 | Type: | HOWTO | |
Modified Date: | 03-NOV-2006 | Status: | MODERATED |
In this Document
Goal
Solution
References
This document is being delivered to you via Oracle Support’s Rapid Visibility (RaV) process, and therefore has not been subject to an independent technical review. |
Applies to:
Oracle Server – Enterprise Edition – Version: 10.2.0.2.0
Information in this document applies to any platform.
Goal
Running the following, where chr(25) is an invalid character per W3 standard, gives back an XML
document:
select DBMS_XMLQUERY.getxml(‘select ”Illegal Character: ” || CHR(25) as TEXTCOL from dual’) from
dual;
<?xml version = ‘1.0’?>
<ROWSET>
<ROW num=”1″>
<TEXTCOL>Illegal Character: </TEXTCOL>
</ROW>
</ROWSET>
If a valid character is used in the query it is replaced with a character entity as expected:
select DBMS_XMLQUERY.getxml(‘select ”Illegal Character: ” || CHR(38) as TEXTCOL from dual’) from
dual;
<?xml version = ‘1.0’?>
<ROWSET>
<ROW num=”1″>
<TEXTCOL>Illegal Character: &</TEXTCOL>
</ROW>
</ROWSET>
Solution
The XML output which includes the invalid character is expected behavior. If XML parsing of the resulting XML is needed (check for invalid characters) use DBMS_XMLGEN, which has a setCheckInvalidChars function for turning on/off the invalid character test.
References
Note 248344.1 – Valid Characters Supported by XDB
Help us improve our service. Please email us your comments for this document. .
Subject: | XMLELEMENT Function Returns Strange Tags Used With DBMS_XMLGEN | |||
Doc ID: | 565787.1 | Type: | PROBLEM | |
Modified Date: | 09-FEB-2009 | Status: | PUBLISHED |
In this Document
Symptoms
Cause
Solution
References
Applies to:
Oracle XML Developers Kit – Version: 9.2.0.8 to 11.1.0.6
This problem can occur on any platform.
Symptoms
The XMLELEMENT function returns strange tags used inside DBMS_XMLGEN. The following code is showing that behavior:
declare
Result CLOB;
qryctx dbms_xmlgen.ctxhandle;
begin
qryCtx:= dbms_xmlgen.newContext(‘select xmlelement(“keys”, ‘||
‘ xmlelement(“id”,empno), ‘ ||
‘ xmlelement(“name”,ename))’ ||
‘,xmlelement(“employee”, deptno)’||
‘ FROM EMP’ ||
‘ where empno=7369’ );
result := dbms_xmlgen.getXml(qryCtx);
dbms_xmlgen.closeContext(qryCtx);
dbms_output.put_line(result);
end;
/
OUTPUT:
<?xml version=”1.0″?>
<ROWSET>
<ROW>
<XMLELEMENT_x0028__x0022_KEYS_x0022__x002C_XMLELEMENT_x0028__x0022_ID_x0022__x00
2C_EMPNO_x0029__x002C_XMLELEMENT_x0028__x0022_NAME_x0022__x002C_ENAME_x0029__x00
29_>
<keys><id>7369</id><name>SMITH</name></keys>
</XMLELEMENT_x0028__x0022_KEYS_x0022__x002C_XMLELEMENT_x0028__x0022_ID_x0022__x0
02C_EMPNO_x0029__x002C_XMLELEMENT_x0028__x0022_NAME_x0022__x002C_ENAME_x0029__x0
029_>
<XMLELEMENT_x0028__x0022_EMPLOYEE_x0022__x002C_DEPTNO_x0029_>
<employee>20</employee>
</XMLELEMENT_x0028__x0022_EMPLOYEE_x0022__x002C_DEPTNO_x0029_>
</ROW>
</ROWSET>
Running the XMLELEMENT function outside DBMS_XMLGEN in SQLPLUS the output is as expected:
SQL> select xmlelement(“keys”, xmlelement(“id”,empno),
xmlelement(“name”,ename)),
xmlelement(“employee”, deptno) FROM EMP where empno=7369;
XMLELEMENT(“KEYS”,XMLELEMENT(“ID”,EMPNO),XMLELEMENT(“NAME”,ENAME))
——————————————————————————–
XMLELEMENT(“EMPLOYEE”,DEPTNO)
——————————————————————————–
<keys><id>7369</id><name>SMITH</name></keys>
<employee>20</employee>
Cause
DBMS_XMLGEN takes a normal SQL query and converts the output to xml.
However if there is a query already returning xml it’s trying to further convert that.
Hence the generated tag names based on the query contents.
This can be controled by the rowset and row tag names by DBMS_XMLGEN calls but
the element tag names come from the query itself so to
change them from what DBMS_XMLGEN generates you need to use aliases.
See also Bug 6997578 XMLELEMENT FUNCTION RETURNS STRANGE TAGS USED WITH DBMS_XMLGEN
.
Solution
To implement the solution, please execute the following steps:
1. To avoid the strange tags you need to use aliases. Modify the code the following way:
declare
Result CLOB;
qryctx dbms_xmlgen.ctxhandle;
begin
qryCtx:= dbms_xmlgen.newContext(
'select xmlelement("keys", '||
' xmlelement("id",empno), ' ||
' xmlelement("name",ename)) "Id"' ||
',xmlelement("employee", deptno) "Department"'||
' FROM EMP' ||
' where empno=7369' );
dbms_xmlgen.SETROWSETTAG(qryCtx,'Employees');
dbms_xmlgen.SETROWTAG(qryCtx,'EmployeeDetails');
result := dbms_xmlgen.getXml(qryCtx);
dbms_xmlgen.closeContext(qryCtx);
dbms_output.put_line(result);
end;
OUTPUT:
<?xml version="1.0"?>
<Employees>
<EmployeeDetails>
<Id>
<keys><id>7369</id><name>SMITH</name></keys> </Id>
<Department>
<employee>20</employee> </Department>
</EmployeeDetails>
</Employees>
Subject: | How to Escape Special XML Characters Using XML SQL Utility | |||
Doc ID: | 171951.1 | Type: | PROBLEM | |
Modified Date: | 20-AUG-2009 | Status: | PUBLISHED |
In this Document
Symptoms
Cause
Solution
Applies to:
Oracle XML SQL Utility (XSU) – Version: 9.0.1.0 to 11.1.0.7
This problem can occur on any platform.
Symptoms
You have an ‘&’ (ampersand) stored as a part of the data inside the database table. You want to retrieve the data from table in XML format using the Oracle provided XML SQL Utility(XSU) and retain the format of the text (for example if you have data that is already in XML format).
You use DBMS_XMLQuery package to query the data and the output result shows ‘&’ instead of ‘&’ (ampersand).
Cause
There is no way of easily handling this using the DBMS_XMLQUERY package, use the DBMS_XMLGEN package that comes with 9i database onwards instead.
In raw XML, the ampersand(&), less-than sign(<), greater-than sign(>), double quotes(“) and apostrophes(‘) are interpreted as part of the markup instead of character data. Therefore the XSU encodes the raw ampersand as ‘&’ when converting the text data stored inside the database into a well-formed XML
document. This can be a problem if the data you are processing is already in XML format and so Oracle has provided an API call to make this easier to do in the DBMS_XMLGEN package.
Solution
Starting with 10g DBMS_XMLQUERY package(written java) is deprecated and ct’s are recommended to use DBMS_XMLGEN package DBMS_XMLGEN package is similar to the DBMS_XMLQuery package, except that it is written in C and compiled into the database kernel.
The package has a procedure “setConvertSpecialChars(ctxHandle, boolean);” through which you can set whether special characters should be converted or not to their escaped representation. The default is to perform the conversion.
Code Sample using DBMS_XMLGEN Pacakge to Escape special XML Characters
REM serveroutput on so we can see the results
set serveroutput on
REM scan off to prevent parsing of ampersands in sql*plus
set scan off
REM Drop table if present
drop table xmltab;
REM Create table
create table xmltab (specchar varchar2(4000));
REM Insert the data that has special XML characters like '&', '<' and '>'
insert into xmltab values ('my text & comment');
insert into xmltab values ('<index>1</index><data>My Data</data>');
REM Run the anonymous PL/SQL block to retrieve the data in XML format from
REM the "xmltab" table
declare
-- declare variables
queryCtx DBMS_XMLGEN.ctxhandle;
result CLOB;
xmlstr varchar2(32767);
line varchar2(4000);
begin
-- create a new context handle from a passed in SQL query
queryCtx := DBMS_XMLGEN.newContext('select * from xmltab');
-- set the conversion of special xml characters to false
DBMS_XMLGEN.setConvertSpecialChars(queryCtx, false);
-- get XML
result := DBMS_XMLGEN.getXML(queryCtx);
-- to print the data
xmlstr := dbms_lob.SUBSTR(result,32767);
loop
exit when xmlstr is null;
line := substr(xmlstr,1,instr(xmlstr,chr(10))-1);
dbms_output.put_line('| '||line);
xmlstr := substr(xmlstr,instr(xmlstr,chr(10))+1);
end loop;
-- close the context
DBMS_XMLGEN.closeContext(queryCtx);
end;
/
Result :-
| <?xml version=”1.0″?>
| <ROWSET>
| <ROW>
| <SPECCHAR>my text & comment</SPECCHAR>
| </ROW>
| <ROW>
| <SPECCHAR><index>1</index><data>My Data</data></SPECCHAR>
| </ROW>
| </ROWSET>
References
———-
Oracle® XML DB Developer’s Guide 11g Release 1 (11.1)
Part Number B28369-01
Subject: | How To Generate XML Output From a Relational Table? | |||
Doc ID: | 395699.1 | Type: | HOWTO | |
Modified Date: | 14-DEC-2006 | Status: | PUBLISHED |
In this Document
Goal
Solution
Applies to:
Oracle Server – Enterprise Edition – Version: 9.2.0 to 10.2.0
Information in this document applies to any platform.
Goal
How to generate XML output from a relational table?
Solution
DBMS_XMLGEN is one method to create XML documents from a regular Oracle Table.
For a quick example:
From SQL*Plus connected to the database as SCOTT/TIGER enter:
set pagesize 999
set linesize 100
set long 2000
create table temp_emp_xml (xml xmltype);
declare
qryctx dbms_xmlgen.ctxhandle;
result clob;
begin
qryctx := dbms_xmlgen.newcontext('select ename,deptno from emp');
dbms_xmlgen.setrowtag(qryctx,'employee');
result:=dbms_xmlgen.getxml(qryctx);
insert into temp_emp_xml values(XMLType(result));
dbms_xmlgen.closecontext(qryctx);
end;
/
select * from temp_emp_xml;
RESULT
——————————————————————————-
–
<?xml version=”1.0″?>
<ROWSET>
<employee>
<ENAME>SMITH</ENAME>
<DEPTNO>20</DEPTNO>
</employee>
<employee>
<ENAME>ALLEN</ENAME>
<DEPTNO>30</DEPTNO>
</employee>
<employee>
<ENAME>WARD</ENAME>
<DEPTNO>30</DEPTNO>
</employee>
<employee>
<ENAME>JONES</ENAME>
<DEPTNO>20</DEPTNO>
</employee>
<employee>
<ENAME>MARTIN</ENAME>
<DEPTNO>30</DEPTNO>
</employee>
…
<employee>
<ENAME>MILLER</ENAME>
<DEPTNO>10</DEPTNO>
</employee>
</ROWSET>
Subject: | How To Show The Column Tag When There Is No Data (NULL Value) | |||
Doc ID: | 560462.1 | Type: | HOWTO | |
Modified Date: | 02-MAR-2009 | Status: | PUBLISHED |
In this Document
Goal
Solution
Applies to:
Oracle Server Enterprise Edition – Version: 10.1 to 11.1
Oracle XML Developers Kit – Version: 10.2 to 11.1
Information in this document applies to any platform.
Goal
How to get an empty tag generated from the DBMS_XMLGEN.setNullHandling procedure?
Solution
Pass the DBMS_XMLGEN.EMPTY_TAG constant to the flag argument.
set long 10000;
create table xml_test (a number, b varchar2(15), c varchar2(20));
insert into xml_test values (1, NULL, 'text');
commit;
var i CLOB;
DECLARE
SqlQuery VARCHAR2(4000) := 'select * from xml_test';
QueryCursor sys_refcursor;
XMLReport CLOB;
QryCtx DBMS_XMLGEN.CTXHANDLE;
BEGIN
dbms_lob.createtemporary(XMLReport,FALSE);
OPEN QueryCursor FOR SqlQuery;
QryCtx := DBMS_XMLGEN.NEWCONTEXT(QueryCursor);
DBMS_XMLGEN.setNullHandling(QryCtx,DBMS_XMLGEN.EMPTY_TAG);
:i := DBMS_XMLGEN.GETXML(QryCtx,DBMS_XMLGEN.NONE);
DBMS_XMLGEN.closeContext(QryCtx);
dbms_lob.freetemporary(XMLReport);
end;
/
print i;
Result:
<?xml version=”1.0″?>
<ROWSET>
<ROW>
<A>1</A>
<B/>
<C>text</C>
</ROW>
</
ROWSET>
The <B/> tag is an accepted output in the XML standard.
To workaround the invalid characters
SELECT
XMLELEMENT(name “row”,
XMLELEMENT( NAME “Description”,
REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(
REPLACE(t.description,
/* \x0e-\x1f*/
CHAR(31), ”),CHAR(30), ”),CHAR(29), ”),
CHAR(28), ”),CHAR(27), ”),CHAR(26), ”),
CHAR(25), ”),CHAR(24), ”),CHAR(23), ”),
CHAR(22), ”),CHAR(21), ”),CHAR(20), ”),
CHAR(19), ”),CHAR(18), ”),CHAR(16), ”),
CHAR(15), ”),CHAR(14), ”),
/* \x0b-\x0c */
CHAR(12), ”),CHAR(11), ”),
/* \x00-\x08 */
CHAR(8), ”),CHAR(7), ”),CHAR(6), ”),
CHAR(5), ”),CHAR(4), ”),CHAR(3), ”),
CHAR(2), ”),CHAR(1), ”),CHAR(0), ”)
)
)
FROM AbacBill.dbo.InitialTicketsImport t
The System.Xml.XmlTextWriter does not know which version XML is being generated. There is no API to configure it one way or the other. The XmlLayout does not generate a full XML document, only a fragment which must be included in a document.
If the XML output in included in an XML 1.1 document then the numeric character references in the additional ranges allowed by the 1.1 spec will be valid. However this is outside of the scope of log4net to enforce.
The XmlLayout must be told which XML version is being targeted and must default to 1.0 not to 1.1.
For invalid characters such as 0x1e there are 3 possible solutions:
1) Discard the character from the output.
2) Replace the character with a numeric representation e.g. “0x1E”.
3) Replace the character with an XML element e.g. <char code=”30″/>
Regardless of the output version (1.0 or 1.1) selected one of the above choices will need to be made. XML version 1.1 does not allow a NULL (0x0) character to appear un-encoded or as a numeric character reference, therefore this will need to be represented in some way.
Note that the invalid characters cannot be included in a CDATA block, however there are issues with some parsers that do allow them there when they should not.
I favour option 3 above because information is not lost. In options 1 and 2 information is lost. In 2 the encoding is not reversible. With 3 the application reading the data requires additional smarts to pickup on the encoded values in element, but all the original information is preserved. If the app just asks for the text nodes, ignoring the child elements, then they will get back the same result as from 1.
[ Show » ]
Nicko Cadell added a comment – 10/Apr/05 09:02 PM The System.Xml.XmlTextWriter does not know which version XML is being generated. There is no API to configure it one way or the other. The XmlLayout does not generate a full XML document, only a fragment which must be included in a document. If the XML output in included in an XML 1.1 document then the numeric character references in the additional ranges allowed by the 1.1 spec will be valid. However this is outside of the scope of log4net to enforce. The XmlLayout must be told which XML version is being targeted and must default to 1.0 not to 1.1. For invalid characters such as 0x1e there are 3 possible solutions: 1) Discard the character from the output. 2) Replace the character with a numeric representation e.g. “0x1E”. 3) Replace the character with an XML element e.g. <char code=”30″/> Regardless of the output version (1.0 or 1.1) selected one of the above choices will need to be made. XML version 1.1 does not allow a NULL (0x0) character to appear un-encoded or as a numeric character reference, therefore this will need to be represented in some way. Note that the invalid characters cannot be included in a CDATA block, however there are issues with some parsers that do allow them there when they should not. I favour option 3 above because information is not lost. In options 1 and 2 information is lost. In 2 the encoding is not reversible. With 3 the application reading the data requires additional smarts to pickup on the encoded values in element, but all the original information is preserved. If the app just asks for the text nodes, ignoring the child elements, then they will get back the same result as from 1.
[ Permlink | « Hide ]
Niall Daley added a comment – 24/Aug/05 03:31 PM
By default characters that can not be specified in XML will now be masked by a ?. This can be changed by setting InvalidCharReplacement to a different string. Alternatively set Base64EncodeMessage or Base64EncodeProperties to true, as appropriate, to Base64 encode the data. This allows all values to be output safely.
Here is what I did:
I created a function that checks for invalid characters like chr(0) and chr(10) etc… and replaces them with something else. See below. What is happening when you get the invalid character 0x0 is basically an end of string is being passed to the XML parser. Since the end of string causes the XML to become malformed you get the message I produced in an earlier posting on this topic. This function when called from another process will help to clean up the data in such a way that the chr(0) will not malform the data. Like the chr(0) the remaining values have also been determined to cause issues when transforming data from structured query to XML. This replace is function is just an example and while it can be used as is yuou should make your own determination as to what to replace the values with. Remember too that this was based on a Western European character set and you would need to change some of the values and add others for Unicode.
CREATE OR REPLACE function OLTP.change_illegal_char
(p_val in varchar2)
return varchar2
is
v_val varchar2(10000);
begin
v_val := replace(p_val,chr(0),’~0′);
v_val := replace(v_val,’&’,’&’);
v_val := replace(v_val,”,’~na~’);
v_val := replace(v_val,'<‘,’<’);
v_val := replace(v_val,'”‘,’"’);
v_val := replace(v_val,’>’,’>’);
v_val := replace(v_val,””,’'’);
v_val := replace(v_val,chr(12),’FF’);
v_val := replace(v_val,chr(10),’LF’);
v_val := replace(v_val,’/’,’/’);
v_val := replace(v_val,’¿’,’¿’);
v_val := replace(v_val,’.’,’ð’);
return(v_val);
end;
/ [ Show » ]
Niall Daley added a comment – 24/Aug/05 03:31 PM By default characters that can not be specified in XML will now be masked by a ?. This can be changed by setting InvalidCharReplacement to a different string. Alternatively set Base64EncodeMessage or Base64EncodeProperties to true, as appropriate, to Base64 encode the data. This allows all values to be output safely.