
XML CDATA

All text in an XML document is parsed by the parser.

Only the text inside a CDATA section is not parsed; the parser passes it through as literal character data.


PCDATA - parsed character data

XML parsers normally parse all the text in an XML document.

When an XML element is parsed, the text between its tags is parsed as well:

<message> This text is also parsed </message>

The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (<first> and <last>):

<name><first>Bill</first><last>Gates</last></name>

The parser will break it down into sub-elements like this:

<name>
<first>Bill</first>
<last>Gates</last>
</name>

Parsed character data (PCDATA) is the term for text data that the XML parser parses.
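
The breakdown into sub-elements can be observed with any XML parser. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module (not something this tutorial prescribes); the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Parse the <name> example; the parser turns the nested tags into child elements.
root = ET.fromstring("<name><first>Bill</first><last>Gates</last></name>")

# Each child element carries its tag name and its parsed character data.
children = [(child.tag, child.text) for child in root]
print(children)  # [('first', 'Bill'), ('last', 'Gates')]
```

The parser never reports the text "<first>Bill</first>" as a string; it reports a first element whose text is "Bill".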


CDATA - (unparsed) character data

The term CDATA refers to text data that should not be parsed by the XML parser.

Characters like "<" and "&" are illegal in the text of an XML element.

"<" will generate an error because the parser interprets it as the start of a new element.

"&" will generate an error because the parser interprets it as the start of a character entity.
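
Both errors are easy to reproduce. The sketch below assumes Python's standard xml.etree.ElementTree module; is_well_formed is a hypothetical helper written for this illustration, not part of any library.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# A raw "<" is read as the start of a tag, a raw "&" as the start of an entity.
print(is_well_formed("<message>a < b</message>"))              # False
print(is_well_formed("<message>a & b</message>"))              # False

# The escaped forms &lt; and &amp; are accepted.
print(is_well_formed("<message>a &lt; b &amp; c</message>"))   # True
```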

Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors, script code can be defined as CDATA.

Everything inside a CDATA section is left unparsed by the parser.

A CDATA section starts with "<![CDATA[" and ends with "]]>":

<script>
<![CDATA[
// Returns 1 when a is less than b and a is negative, otherwise 0.
function matchwo(a, b)
{
  if (a < b && a < 0)
  {
    return 1;
  }
  else
  {
    return 0;
  }
}
]]>
</script>

In the example above, the parser does not parse anything inside the CDATA section; the "<" and "&" characters are treated as plain text.
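
To see what an application receives from such a document, here is a short sketch, again assuming Python's standard xml.etree.ElementTree module. The one-line script body is a shortened stand-in for the function above.

```python
import xml.etree.ElementTree as ET

# A <script> element whose content is wrapped in a CDATA section.
doc = """<script>
<![CDATA[
if (a < b && a < 0) { return 1; }
]]>
</script>"""

script = ET.fromstring(doc)

# The CDATA content arrives as ordinary text, with "<" and "&" untouched.
print(script.text.strip())

# No child elements were created from the characters inside the section.
print(len(script))
```

The CDATA markers themselves do not appear in the text; only the content between them does.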

Notes on CDATA sections:

A CDATA section cannot contain the string "]]>". Nested CDATA sections are not allowed.

The "]]>" that marks the end of a CDATA section cannot contain spaces or line breaks.
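
Because "]]>" cannot appear inside a CDATA section, text that contains it is commonly split across two adjacent sections. The sketch below illustrates that workaround and the rule about spaces, assuming Python's standard xml.etree.ElementTree module; to_cdata is a hypothetical helper, not a library function.

```python
import xml.etree.ElementTree as ET

def to_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" across two adjacent sections."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

# This payload contains "]]>", which would otherwise end the section early.
payload = "if (a[b[0]]>c) { ok(); }"
doc = "<script>" + to_cdata(payload) + "</script>"
roundtrip = ET.fromstring(doc).text
print(roundtrip == payload)  # True

# "]] >" (with a space) is not an end marker, so it stays part of the text.
spaced = ET.fromstring("<a><![CDATA[x]] >y]]></a>").text
print(spaced)  # x]] >y
```

The parser joins the two adjacent sections back into a single text value, so the application sees the original string.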