XML and Languages

I've written in the past about XML and languages, and about why you might care which language a piece of text is in.

Text with no language is just not quite there
Impact of text language on WPF  
Text, language and sorting

For dealing with languages, xml:lang is your friend, as you can tell from these older posts.

Something that is a bit special about xml:lang is that xml is a reserved prefix, bound to a reserved namespace. From the Namespaces in XML recommendation at https://www.w3.org/TR/REC-xml-names/#xmlReserved:

The prefix xml is by definition bound to the namespace name http://www.w3.org/XML/1998/namespace. It MAY, but need not, be declared, and MUST NOT be bound to any other namespace name. Other prefixes MUST NOT be bound to this namespace name, and it MUST NOT be declared as the default namespace.
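To see what that means in practice, here is a minimal sketch. The class and method names are made up for illustration. It prints the namespace name that LINQ to XML exposes through XNamespace.Xml, and checks that a document can use xml:lang without ever declaring the xml prefix.

```csharp
using System;
using System.Xml.Linq;

public static class XmlPrefixDemo
{
    // Returns the namespace name that LINQ to XML associates with the xml prefix.
    public static string ReservedNamespaceName()
    {
        return XNamespace.Xml.NamespaceName;
    }

    // Parses a document that uses xml:lang with no xmlns:xml declaration
    // and reports whether the attribute can be found in the reserved namespace.
    public static bool ParsesWithoutDeclaration()
    {
        XElement t = XElement.Parse("<t xml:lang='en-US'>Hello</t>");
        XAttribute lang = t.Attribute(XNamespace.Xml + "lang");
        return lang != null && lang.Value == "en-US";
    }

    public static void Main()
    {
        Console.WriteLine(ReservedNamespaceName());
        Console.WriteLine(ParsesWithoutDeclaration());
    }
}
```

Note that the namespace name is an identifier that is compared as a literal string, so it has to match character for character, including the http scheme.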

Here is the code you can use to write an xml:lang attribute using an XmlWriter.

// Needs: using System.Diagnostics; using System.IO; using System.Xml;
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
using (StringWriter textWriter = new StringWriter())
using (XmlWriter writer = XmlWriter.Create(textWriter, settings))
{
    writer.WriteStartElement("e");

    // The xml prefix is reserved, so no namespace URI needs to be passed.
    writer.WriteStartElement("t1");
    writer.WriteAttributeString("xml", "lang", null, "en-US");
    writer.WriteString("Hello, world!");
    writer.WriteEndElement();

    writer.WriteStartElement("t2");
    writer.WriteAttributeString("xml", "lang", null, "es-AR");
    writer.WriteString("¡Hola, mundo!");
    writer.WriteEndElement();

    writer.WriteEndElement();

    // Flush before reading the text back from the underlying StringWriter.
    writer.Flush();
    Trace.WriteLine(textWriter.ToString());
}

Here is the traced output.

<?xml version="1.0" encoding="utf-16"?>
<e>
  <t1 xml:lang="en-US">Hello, world!</t1>
  <t2 xml:lang="es-AR">¡Hola, mundo!</t2>
</e>
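Going the other way, an XmlReader can tell you which language is in scope as it reads. The following is a rough sketch rather than anything definitive. XmlLangScan and ReadTextWithLanguage are hypothetical names, and the sketch assumes the XmlReader.XmlLang property reports the xml:lang value in effect for the current node.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlLangScan
{
    // Collects "language: text" pairs for every text node in the document.
    public static List<string> ReadTextWithLanguage(string xml)
    {
        var results = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // XmlLang reflects the xml:lang scope of the current node.
                if (reader.NodeType == XmlNodeType.Text)
                    results.Add(reader.XmlLang + ": " + reader.Value);
            }
        }
        return results;
    }

    public static void Main()
    {
        string xml = "<e><t1 xml:lang=\"en-US\">Hello, world!</t1>" +
                     "<t2 xml:lang=\"es-AR\">¡Hola, mundo!</t2></e>";
        foreach (string line in ReadTextWithLanguage(xml))
            Console.WriteLine(line);
    }
}
```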

If you are not using XmlWriter, but instead prefer to use LINQ to XML, it is even easier. This is because LINQ to XML has support for the xml namespace built in. Here is the code you could use to set the language on elements by adding the xml:lang attribute after creating the XDocument. Notice the lack of the xml:lang attribute in the source.

// Needs: using System.Diagnostics; using System.Xml.Linq;
XDocument doc = XDocument.Parse(@"
    <e>
        <t1>Hello, world!</t1>
        <t2>¡Hola, mundo!</t2>
    </e>");

// XNamespace.Xml is the built-in namespace for the reserved xml prefix.
XNamespace xmlNs = XNamespace.Xml;
foreach (var element in doc.Root.Descendants())
{
    if (element.Name.LocalName == "t1")
        element.SetAttributeValue(xmlNs + "lang", "en-US");
    if (element.Name.LocalName == "t2")
        element.SetAttributeValue(xmlNs + "lang", "es-AR");
}

Trace.WriteLine(doc.ToString());

Here is the traced output.

<e>
  <t1 xml:lang="en-US">Hello, world!</t1>
  <t2 xml:lang="es-AR">¡Hola, mundo!</t2>
</e>
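Reading the language back is just as short. Here is a minimal sketch. GetEffectiveLanguage is a hypothetical helper, not something built into LINQ to XML. It relies on the rule from the XML specification that xml:lang applies to an element and its content unless overridden, so it checks the element first and then walks up through its ancestors.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlLangReader
{
    // Hypothetical helper: returns the xml:lang value in effect for an element,
    // looking at the element itself first and then at its ancestors.
    // Returns null when no xml:lang is in scope.
    public static string GetEffectiveLanguage(XElement element)
    {
        return element.AncestorsAndSelf()
            .Select(e => (string)e.Attribute(XNamespace.Xml + "lang"))
            .FirstOrDefault(value => value != null);
    }

    public static void Main()
    {
        XElement root = XElement.Parse(
            "<e xml:lang='en-US'><t1>Hello, world!</t1>" +
            "<t2 xml:lang='es-AR'>¡Hola, mundo!</t2></e>");

        // t1 has no xml:lang of its own, so it inherits the value from e.
        Console.WriteLine(GetEffectiveLanguage(root.Element("t1")));
        // t2 overrides the inherited value.
        Console.WriteLine(GetEffectiveLanguage(root.Element("t2")));
    }
}
```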

Enjoy!

Marcelo Lopez Ruiz

https://blogs.msdn.com/marcelolr/