XML
Extensible Markup Language - structured data with semantic meaning.
XML (Extensible Markup Language) is a markup language for encoding structured data in a format readable by both humans and machines. The World Wide Web Consortium (W3C) developed XML, and it became a W3C recommendation in 1998.
Syntax
XML uses tags to wrap data, providing both values and semantic context:
<agent>
<name>Claude</name>
<type>assistant</type>
<capabilities>
<capability>reasoning</capability>
<capability>coding</capability>
<capability>analysis</capability>
</capabilities>
<parameters temperature="0.7" maxTokens="4096" />
</agent>
XML supports attributes on elements, namespace declarations for mixing vocabularies, and self-describing document structure.
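As a rough illustration of how the agent example above might be read by a program, the following sketch uses Python's standard xml.etree.ElementTree module. The function name summarize_agent and the keys of the returned dictionary are assumptions made for this example, not part of any defined format.

```python
import xml.etree.ElementTree as ET

# The agent document from the Syntax section, indented for readability.
AGENT_XML = """
<agent>
  <name>Claude</name>
  <type>assistant</type>
  <capabilities>
    <capability>reasoning</capability>
    <capability>coding</capability>
    <capability>analysis</capability>
  </capabilities>
  <parameters temperature="0.7" maxTokens="4096" />
</agent>
"""


def summarize_agent(xml_text: str) -> dict:
    """Extract element text and attribute values into a plain dict.

    summarize_agent and the dict keys are hypothetical names for this sketch.
    """
    root = ET.fromstring(xml_text.strip())
    params = root.find("parameters")
    return {
        # Element content is always text; findtext returns it as a string.
        "name": root.findtext("name"),
        "type": root.findtext("type"),
        # iter() walks the whole subtree, so nested elements are found.
        "capabilities": [c.text for c in root.iter("capability")],
        # Attribute values are strings too and must be converted explicitly.
        "temperature": float(params.get("temperature")),
        "max_tokens": int(params.get("maxTokens")),
    }


summary = summarize_agent(AGENT_XML)
```

Element text and attribute values both arrive as strings, so any numeric interpretation (as for temperature and maxTokens here) is up to the consuming code unless a schema supplies the types.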
Use in AI Systems
Large language models generally handle XML well because its hierarchical structure is explicit. Prompts for Claude commonly use XML-style tags to separate parts of the input:
<document>
<section>Introduction</section>
<content>The main body of text...</content>
</document>
Because every element has an explicit opening and closing tag, the boundaries of each piece of nested context are unambiguous to the model.
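A minimal sketch of how such a tagged prompt might be assembled in code follows. The helper names wrap and build_document are hypothetical, and the sample content string is invented for illustration. Escaping the text with xml.sax.saxutils.escape is a design choice made here so that the result stays well-formed XML; a model reading XML-style tags does not necessarily require it.

```python
from xml.sax.saxutils import escape


def wrap(tag: str, text: str) -> str:
    """Wrap text in an XML-style tag, escaping &, < and > in the text.

    wrap is a hypothetical helper for this sketch.
    """
    return f"<{tag}>{escape(text)}</{tag}>"


def build_document(section: str, content: str) -> str:
    """Assemble the <document> structure shown above (hypothetical helper)."""
    inner = "\n".join([wrap("section", section), wrap("content", content)])
    return f"<document>\n{inner}\n</document>"


# Sample input containing characters that would otherwise break the markup.
prompt = build_document("Introduction", "Costs fell when x < 5 & y > 2.")
```

Without escaping, a stray < or & in the content could be mistaken for markup by a strict XML parser, which matters if the same text is later parsed rather than only read by a model.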
Validation
XML supports schema validation through several mechanisms. XSD (XML Schema Definition) provides type-rich validation with complex constraints. RELAX NG offers an alternative schema language with simpler syntax. DTD (Document Type Definition) is the original validation mechanism from XML's early days.
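The difference between a well-formed document and a valid one can be sketched in a few lines. Python's standard xml.etree.ElementTree module checks well-formedness only and does not validate against XSD, RELAX NG, or DTD; a validating parser or a third-party library such as lxml is needed for that. The hand-written rules below are therefore only a stand-in for a schema, and the names REQUIRED_CHILDREN and check_response, as well as the response document shape, are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set standing in for a schema:
# required child element name -> type its text must convert to.
REQUIRED_CHILDREN = {"status": str, "data": int}


def check_response(xml_text: str) -> list[str]:
    """Return a list of problems; an empty list means all checks passed."""
    # Step 1: well-formedness. The parser rejects mismatched or unclosed tags.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    # Step 2: structure and types, the kind of rule a schema would express.
    problems = []
    if root.tag != "response":
        problems.append(f"unexpected root element: {root.tag}")
    for name, caster in REQUIRED_CHILDREN.items():
        text = root.findtext(name)
        if text is None:
            problems.append(f"missing element: {name}")
            continue
        try:
            caster(text)
        except ValueError:
            problems.append(f"{name} is not a valid {caster.__name__}")
    return problems
```

For example, check_response("<response><status>ok</status></response>") reports a missing data element, while an unclosed tag is rejected in the first step before any structural rule is consulted. A real schema language expresses such rules declaratively and covers far more cases than this sketch.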
Comparison with JSON
XML requires more characters than JSON for equivalent data:
<?xml version="1.0" encoding="UTF-8"?>
<response xmlns="http://example.com/api/v1">
<status>success</status>
<data>42</data>
</response>
Equivalent JSON:
{"status": "success", "data": 42}
JSON is typically faster to parse and uses less bandwidth, because the same data takes fewer characters.
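The size difference can be checked directly. The sketch below parses both representations with Python's standard library and compares their encoded lengths. It measures size only, not parse speed, which depends on the parser used. The variable names and the api namespace prefix are choices made for this example.

```python
import json
import xml.etree.ElementTree as ET

XML_TEXT = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<response xmlns="http://example.com/api/v1">\n'
    "<status>success</status>\n"
    "<data>42</data>\n"
    "</response>"
)
JSON_TEXT = '{"status": "success", "data": 42}'

# The default namespace applies to every element, so lookups must be
# namespace-qualified. The "api" prefix is local to this sketch.
NS = {"api": "http://example.com/api/v1"}

root = ET.fromstring(XML_TEXT.encode("utf-8"))
from_xml = {
    "status": root.findtext("api:status", namespaces=NS),
    # XML content is text, so the number needs an explicit conversion.
    "data": int(root.findtext("api:data", namespaces=NS)),
}

# JSON carries the number type in the syntax itself.
from_json = json.loads(JSON_TEXT)

xml_bytes = len(XML_TEXT.encode("utf-8"))
json_bytes = len(JSON_TEXT.encode("utf-8"))

print(f"XML: {xml_bytes} bytes, JSON: {json_bytes} bytes")
```

Both parses yield the same two values, but the XML version also carries a declaration and a namespace, and its numeric value has to be converted by the reader, whereas JSON distinguishes numbers from strings in the syntax.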
Common Use Cases
XML is used for document formats that mix text and structure, such as XHTML and DocBook. It is also common in configuration files that require validation, in legacy system integration, and in formats that are XML-based by design, including SVG, RSS, and SOAP.
JSON is typically preferred for web APIs, structured LLM output, and configuration files edited by humans.