In this article, we will explore how to create XML files in C#.
Let’s dive in.
Create Custom XML Files
To create custom XML documents in .NET, we use an object of the type XDocument. Some resources may mention objects of the type XmlDocument, but for us, as we’ll explore in a moment, the clear choice is XDocument.
XDocument or XmlDocument?
In modern .NET, XDocument is generally preferred over XmlDocument. XmlDocument was the original class in the .NET Framework for managing XML documents. When LINQ arrived in the .NET Framework, Microsoft introduced the XDocument class. This class integrates closely with LINQ, offering an easy way to create XML documents using LINQ or, in this context, LinqToXml. Let’s explore this as our first option.
Prepare Data That We Will Use to Create XML Files
To facilitate the creation of XML documents, we’ll implement various methods for transforming data. Let’s start by defining a Person type with a few essential properties:
public record Person(
    string FirstName,
    string LastName,
    string Email,
    DateTime Birthday)
{
    public int Age
    {
        get
        {
            var age = DateTime.Today.Year - Birthday.Year;
            if (Birthday > DateTime.Today.AddYears(-age)) age--;

            return age;
        }
    }
}
A person possesses a first name, last name, email, and date of birth. As a bonus, we have also included a calculated age based on today’s date and the birthdate.
Now, with our Person object, we can easily create a method to generate a specified number of random individuals:
public static class People
{
    private static readonly string[] firstNames =
        ["John", "Jane", "Robert", "Emily", "Michael", "Sarah", "William", "Olivia", "James", "Emma"];
    private static readonly string[] lastNames =
        ["Smith", "Johnson", "Williams", "Jones", "Brown", "Davis", "Miller", "Wilson", "Moore", "Taylor"];

    public static Person GetOne() => Generate().First();

    public static Person[] Get(int numberOfPeople) => Generate().Take(numberOfPeople).ToArray();

    private static IEnumerable<Person> Generate()
    {
        while (true)
        {
            var firstName = firstNames[Random.Shared.Next(firstNames.Length)];
            var lastName = lastNames[Random.Shared.Next(lastNames.Length)];

            yield return new Person(
                FirstName: firstName,
                LastName: lastName,
                Email: $"{firstName.ToLower()}.{lastName.ToLower()}@code-maze.com",
                Birthday: DateTime.Today.AddDays(-Random.Shared.Next(1_000, 25_000))
            );
        }
    }
}
For data generation, we introduce a People class housing a Get() method to create an array of random people and GetOne() to create one random Person.
Leveraging two arrays with common English names and surnames, we randomly select entries to form names and last names. These details, along with a generated email address and random birthdate, compose our random individual data.
If you want to learn about how to serialize and deserialize objects to and from XML, you can do that in the articles titled Serializing Objects to XML in C# and XML Deserialization in C#.
Create XML Files With LinqToXml
Creating XML with LinqToXml is as straightforward as executing a regular LINQ query.
Let’s create an XML document representing our Person class:
<person>
  <name>
    <firstName>Emma</firstName>
    <lastName>Brown</lastName>
  </name>
  <email>emma.brown@code-maze.com</email>
  <age>58</age>
</person>
To create an object of the XDocument type named xmlPerson, we call the XDocument constructor:
public string CreateSimpleXML(Person person)
{
    var xmlPerson = new XDocument(
        new XElement("person",
            new XElement("name",
                new XElement("firstName", person.FirstName),
                new XElement("lastName", person.LastName)),
            new XElement("email", person.Email),
            new XElement("age", person.Age))
    );

    return xmlPerson.ToString();
}
Starting with the root element <person>, we employ XElement to create nested elements such as <name>, within which we further nest <firstName> and <lastName>. The same pattern applies to the <email> and <age> elements.
Using these nested constructor calls, we construct an XDocument object, convert it to a string, and then return it to the calling code.
Understanding XML Element Relationships
It’s crucial to recognize the relationships between various XElements within the XML structure. Some XElements are nested within others, while some exist side by side.
For instance, the <firstName> element is enclosed within the <name> element, which in turn resides within the <person> element. On the other hand, the <email> and <age> elements are positioned adjacent to each other since they share the same level.
Another observation is that the code mirrors the resultant XML structure. The <firstName> element is slightly indented to the right of the <name> element, while <email> and <age> are indented similarly.
However, it’s essential to note that indentation in the code is a matter of preference and doesn’t impact the XML relationships. The arrangement of elements in terms of indentation doesn’t signify sibling or child relationships; it’s the use of parentheses that defines these relationships.
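The parent, child, and sibling relationships described above can also be inspected at run time. The following is a minimal sketch, not part of the article’s sample code: the class name RelationshipDemo, its method names, and the sample values are assumptions made for illustration. The tree is deliberately written on few lines to show that indentation plays no role.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class RelationshipDemo
{
    // Builds the same <person> tree as CreateSimpleXML(), with sample values
    // and with almost no indentation: only the parentheses define the nesting.
    public static XElement BuildPerson() =>
        new XElement("person", new XElement("name",
            new XElement("firstName", "Emma"), new XElement("lastName", "Brown")),
            new XElement("email", "emma.brown@code-maze.com"), new XElement("age", 58));

    // Returns the name of the parent of the first descendant with the given name.
    public static string ParentOf(XElement root, string elementName) =>
        root.Descendants(elementName).First().Parent?.Name.LocalName ?? string.Empty;

    // Returns the names of the direct children of the given element, in document order.
    public static string[] ChildrenOf(XElement element) =>
        element.Elements().Select(e => e.Name.LocalName).ToArray();
}
```

Asking for the parent of firstName yields name, while email reports person as its parent, which matches the nesting of the constructor calls rather than the layout of the source code.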
XML Elements With Attributes
Now, let’s create a similar XML document, but this time we want the <name> element to possess attributes ‘firstName’ and ‘lastName’ rather than sub-elements:
<person>
  <name firstName="Emma" lastName="Davis" />
  <email>emma.davis@code-maze.com</email>
  <age>57</age>
</person>
The code structure remains nearly identical:
public string CreateSimpleXMLWithAttributes(Person person)
{
    var xmlPerson = new XDocument(
        new XElement("person",
            new XElement("name",
                new XAttribute("firstName", person.FirstName),
                new XAttribute("lastName", person.LastName)),
            new XElement("email", person.Email),
            new XElement("age", person.Age))
    );

    return xmlPerson.ToString();
}
The difference is that we now use two XAttributes, ‘firstName’ and ‘lastName’, within the <name> element. In the CreateSimpleXML() method, we used two XElements instead.
XML Namespaces
Now, let’s consider a scenario where the <person> element declares a namespace identified by https://www.code-maze.com/sample-schema and bound to the prefix ‘xsi’ (for illustration purposes only, as this namespace does not actually exist). Our goal is to create such an XML document:
<person xmlns:xsi="https://www.code-maze.com/sample-schema">
  <name>
    <firstName>Jane</firstName>
    <lastName>Jones</lastName>
  </name>
  <email>jane.jones@code-maze.com</email>
  <age>58</age>
</person>
This is similar to the XAttribute we’ve seen earlier, but this time the attribute declares a namespace rather than holding ordinary text. The code structure remains very similar to the CreateSimpleXML() method, with the only difference being the addition of an XAttribute for the <person> element:
public string CreateXMLWithNamespace(Person person)
{
    var xmlPerson = new XDocument(
        new XElement("person",
            new XAttribute(XNamespace.Xmlns + "xsi", "https://www.code-maze.com/sample-schema"),
            new XElement("name",
                new XElement("firstName", person.FirstName),
                new XElement("lastName", person.LastName)),
            new XElement("email", person.Email),
            new XElement("age", person.Age))
    );

    return xmlPerson.ToString();
}
Instead of using the string constant ‘xmlns:’, we employ the constant XNamespace.Xmlns and add ‘xsi’ as the name.
The namespace URI mentioned above becomes the value of the attribute.
Using more than one namespace and defining elements as part of selected namespaces can clutter the code and make it harder to read:
<p:person xmlns:p="https://www.code-maze.com/sample-person"
          xmlns:o="https://www.code-maze.com/sample-other"
          xmlns:n="https://www.code-maze.com/sample-name">
  <n:name>
    <n:firstName>James</n:firstName>
    <n:lastName>Williams</n:lastName>
  </n:name>
  <o:email>james.williams@code-maze.com</o:email>
  <o:age>38</o:age>
</p:person>
Now, we are using three namespaces with prefixes ‘p’ for ‘person,’ ‘n’ for ‘name,’ and ‘o’ for others.
To handle this, it’s best to predefine all namespaces and then prepend the appropriate one to the name of each XElement object:
public string CreateXmlWithThreeNamespaces(Person person)
{
    var namespaceP = XNamespace.Get("https://www.code-maze.com/sample-person");
    var namespaceN = XNamespace.Get("https://www.code-maze.com/sample-name");
    var namespaceO = XNamespace.Get("https://www.code-maze.com/sample-other");

    var xmlPerson = new XDocument(
        new XElement(namespaceP + "person",
            new XAttribute(XNamespace.Xmlns + "p", namespaceP.ToString()),
            new XAttribute(XNamespace.Xmlns + "o", namespaceO.ToString()),
            new XAttribute(XNamespace.Xmlns + "n", namespaceN.ToString()),
            new XElement(namespaceN + "name",
                new XElement(namespaceN + "firstName", person.FirstName),
                new XElement(namespaceN + "lastName", person.LastName)),
            new XElement(namespaceO + "email", person.Email),
            new XElement(namespaceO + "age", person.Age))
    );

    return xmlPerson.ToString();
}
The <person> element has three namespaces defined, so as before, we add them as three XAttributes. While the code may appear less clean, it remains readable, with each XML element prefixed by the corresponding ‘namespaceX’ object.
XML Documents as an Array of People
Until now, our XML documents were concise, describing only one person. What if we want to create an XML document from an array of people? Suppose we want an XML document with multiple <person> elements:
<people>
  <person>
    <!-- data of the first person -->
  </person>
  <person>
    <!-- data of the second person -->
  </person>
  <person>
    <!-- data of the third person -->
  </person>
  <!-- and so on ... -->
</people>
We’ve already learned how to create an XML document for a single person, incorporating various possibilities such as using attributes or namespaces. Now, the task is to efficiently organize the array of sub-elements under the root element:
public string CreateAnArrayOfPeople(Person[] people)
{
    var xmlPeople = new XDocument(
        new XElement("people",
            from person in people
            select new XElement("person",
                new XElement("name",
                    new XElement("firstName", person.FirstName),
                    new XElement("lastName", person.LastName)),
                new XElement("email", person.Email),
                new XElement("age", person.Age)))
    );

    return xmlPeople.ToString();
}
Achieving this is quite straightforward.
While previously the document contained only one <person> element, now we want multiple instances beneath the <people> element. To accomplish this, we use the LINQ query expression ‘from person in people’ to produce one <person> element for each entry in the people collection.
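The same projection can also be written with LINQ method syntax. The sketch below assumes a simplified input (an array of name tuples instead of the article’s Person record) so that it stays self-contained; the class name PeopleXmlSketch is hypothetical.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class PeopleXmlSketch
{
    // Same idea as CreateAnArrayOfPeople(), but using Select() (method syntax)
    // instead of the "from ... select" query expression.
    public static string Create((string FirstName, string LastName)[] people)
    {
        var xmlPeople = new XDocument(
            new XElement("people",
                people.Select(p =>
                    new XElement("person",
                        new XElement("firstName", p.FirstName),
                        new XElement("lastName", p.LastName)))));

        return xmlPeople.ToString();
    }
}
```

Both forms hand the XElement constructor a sequence of elements, and each element of that sequence becomes a child of <people>. Which one to use is a matter of taste.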
Create XML Files Using XmlWriter
As mentioned at the beginning of the article, LinqToXml and the XDocument object were introduced later in the .NET journey. In the early days, XML documents were crafted using the XmlWriter class. While the code using XmlWriter tends to be longer, it offers a clear and explicit way to understand the XML document creation process.
Basic XML Documents
Let’s replicate the same XML documents as before using XmlWriter. We’ll maintain the same method names to facilitate a comparison of both implementations:
public string CreateSimpleXML(Person person)
{
    var sb = new StringBuilder();
    using var xmlWriter = XmlWriter.Create(sb, new XmlWriterSettings { Indent = true });

    xmlWriter.WriteStartElement("person");
    xmlWriter.WriteStartElement("name");
    xmlWriter.WriteElementString("firstName", person.FirstName);
    xmlWriter.WriteElementString("lastName", person.LastName);
    xmlWriter.WriteEndElement(); // closes <name>
    xmlWriter.WriteElementString("email", person.Email);
    xmlWriter.WriteElementString("age", person.Age.ToString());
    xmlWriter.WriteEndElement(); // closes <person>
    xmlWriter.Flush();

    return sb.ToString();
}
We create an XmlWriter object, specifying a location to write to, which in this case is a StringBuilder object. The XmlWriterSettings allow us to define various settings like encoding, new-line characters, and indentation.
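To make the role of the settings more concrete, here is a small sketch that sets a few XmlWriterSettings properties explicitly. The class name WriterSettingsSketch and the chosen values are assumptions for illustration only; the properties themselves (Indent, IndentChars, NewLineChars, OmitXmlDeclaration) are part of XmlWriterSettings.

```csharp
using System.Text;
using System.Xml;

public static class WriterSettingsSketch
{
    public static string Write()
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,             // put child elements on their own lines
            IndentChars = "\t",        // indent with a tab instead of the default two spaces
            NewLineChars = "\n",       // always use LF as the line break
            OmitXmlDeclaration = true  // skip the <?xml ...?> line
        };

        var sb = new StringBuilder();
        using (var xmlWriter = XmlWriter.Create(sb, settings))
        {
            xmlWriter.WriteStartElement("person");
            xmlWriter.WriteElementString("age", "58");
            xmlWriter.WriteEndElement();
        } // disposing the writer flushes everything into the StringBuilder

        return sb.ToString();
    }
}
```

Here the writer is disposed before the StringBuilder is read, which is an alternative to calling Flush() explicitly.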
To create the XML document, we follow the document structure, starting with the WriteStartElement() method and ending with the WriteEndElement() method for each element. The code is more verbose, but its structure is straightforward and intelligible.
XML Elements With Attributes
As before, if we want attributes instead of elements, we can achieve this by using the WriteAttributeString() method instead of WriteElementString():
public string CreateSimpleXMLWithAttributes(Person person)
{
    var sb = new StringBuilder();
    using var xmlWriter = XmlWriter.Create(sb, new XmlWriterSettings { Indent = true });

    xmlWriter.WriteStartElement("person");
    xmlWriter.WriteStartElement("name");
    xmlWriter.WriteAttributeString("firstName", person.FirstName);
    xmlWriter.WriteAttributeString("lastName", person.LastName);
    xmlWriter.WriteEndElement();
    xmlWriter.WriteElementString("email", person.Email);
    xmlWriter.WriteElementString("age", person.Age.ToString());
    xmlWriter.WriteEndElement();
    xmlWriter.Flush();

    return sb.ToString();
}
Here, only two lines have changed because we are now writing ‘firstName’ and ‘lastName’ as attributes instead of elements. The overall structure and process remain consistent with our previous XmlWriter example.
XML Namespaces
To incorporate namespaces inside an XML document using XmlWriter, we once again employ the WriteAttributeString() method within the element where we want the namespace to be defined:
public string CreateXMLWithNamespace(Person person)
{
    var sb = new StringBuilder();
    using var xmlWriter = XmlWriter.Create(sb, new XmlWriterSettings { Indent = true });

    xmlWriter.WriteStartElement("person");
    xmlWriter.WriteAttributeString("xmlns", "xsi", null, "https://www.code-maze.com/sample-schema");
    xmlWriter.WriteStartElement("name");
    xmlWriter.WriteElementString("firstName", person.FirstName);
    xmlWriter.WriteElementString("lastName", person.LastName);
    xmlWriter.WriteEndElement();
    xmlWriter.WriteElementString("email", person.Email);
    xmlWriter.WriteElementString("age", person.Age.ToString());
    xmlWriter.WriteEndElement();
    xmlWriter.Flush();

    return sb.ToString();
}
We use the WriteAttributeString() method to introduce the ‘xsi’ namespace within the <person> element. The rest of the code structure and elements remain consistent with the previous XmlWriter examples.
When working with multiple namespaces and defining elements as belonging to specific namespaces, the code can become a bit more cluttered and harder to read. We use XmlWriter to achieve the same XML document structure as before:
public string CreateXmlWithThreeNamespaces(Person person)
{
    var sb = new StringBuilder();
    var namespaceP = XNamespace.Get("https://www.code-maze.com/sample-person");
    var namespaceN = XNamespace.Get("https://www.code-maze.com/sample-name");
    var namespaceO = XNamespace.Get("https://www.code-maze.com/sample-other");

    using var xmlWriter = XmlWriter.Create(sb, new XmlWriterSettings { Indent = true });

    xmlWriter.WriteStartElement("p", "person", namespaceP.ToString());
    xmlWriter.WriteAttributeString("xmlns", "p", null, namespaceP.ToString());
    xmlWriter.WriteAttributeString("xmlns", "n", null, namespaceN.ToString());
    xmlWriter.WriteAttributeString("xmlns", "o", null, namespaceO.ToString());
    xmlWriter.WriteStartElement("n", "name", namespaceN.ToString());
    xmlWriter.WriteElementString("n", "firstName", namespaceN.ToString(), person.FirstName);
    xmlWriter.WriteElementString("n", "lastName", namespaceN.ToString(), person.LastName);
    xmlWriter.WriteEndElement();
    xmlWriter.WriteElementString("email", namespaceO.ToString(), person.Email);
    xmlWriter.WriteElementString("age", namespaceO.ToString(), person.Age.ToString());
    xmlWriter.WriteEndElement();
    xmlWriter.Flush();

    return sb.ToString();
}
We declare three namespaces and then explicitly pass the namespace, and where needed the prefix, for each XML element using the overloaded versions of the WriteStartElement(), WriteAttributeString(), and WriteElementString() methods that accept these additional parameters.
The resulting XML document structure remains the same as before, but the code is more intricate due to the explicit handling of namespaces.
XML Documents as Array of People
When using XmlWriter, we write each element into an XML document using various WriteXXX() methods. Writing an array of people into an XML document entails using a loop to write each person:
public string CreateAnArrayOfPeople(Person[] people)
{
    var sb = new StringBuilder();
    using var xmlWriter = XmlWriter.Create(sb, new XmlWriterSettings { Indent = true });

    xmlWriter.WriteStartElement("people");

    foreach (var person in people)
    {
        xmlWriter.WriteStartElement("person");
        xmlWriter.WriteStartElement("name");
        xmlWriter.WriteElementString("firstName", person.FirstName);
        xmlWriter.WriteElementString("lastName", person.LastName);
        xmlWriter.WriteEndElement();
        xmlWriter.WriteElementString("email", person.Email);
        xmlWriter.WriteElementString("age", person.Age.ToString());
        xmlWriter.WriteEndElement();
    }

    xmlWriter.WriteEndElement();
    xmlWriter.Flush();

    return sb.ToString();
}
This time, our code may be even easier to understand than the code using LinqToXml. Now, the foreach loop and the creation of a <person> element are explicitly visible.
Each iteration of the loop appends a new <person> element to the <people> element, resulting in an XML document with an array of people.
Differences Between LinqToXml and XmlWriter
Having examined the two approaches, there are certainly differences between them. Let’s delve into those distinctions.
Closing Elements
When using LinqToXml, closing elements are automatically managed, preventing the creation of syntactically incorrect XML documents.
However, with XmlWriter, it is possible to forget to call WriteEndElement(). Let’s explore what happens in that case:
public string CreateWrongXML(Person person)
{
    var sb = new StringBuilder();
    using var xmlWriter = XmlWriter.Create(sb, new XmlWriterSettings { Indent = true });

    xmlWriter.WriteStartElement("person");
    xmlWriter.WriteStartElement("name");
    xmlWriter.WriteElementString("firstName", person.FirstName);
    xmlWriter.WriteElementString("lastName", person.LastName);
    xmlWriter.WriteElementString("email", person.Email);
    xmlWriter.WriteElementString("age", person.Age.ToString());
    xmlWriter.WriteEndElement();
    xmlWriter.Flush();

    return sb.ToString();
}
A problem arises after writing our lastName element, where the WriteEndElement() method is not invoked to close the <name> element. Our document is incorrect, missing the </person> closing tag:
<person>
  <name>
    <firstName>John</firstName>
    <lastName>Jones</lastName>
    <email>john.jones@code-maze.com</email>
    <age>42</age>
  </name>
The method is behaving exactly as instructed. Since the WriteEndElement() method for <name> was not called, the next tag <email> is considered to be on the same level as <lastName>. When the program encounters the WriteEndElement() call, it closes the last open element, which is <name>. Consequently, the <person> element is never closed, resulting in an incomplete XML document.
This underscores the importance of managing the WriteEndElement() calls diligently to ensure the correct structure of the XML document.
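One way to catch such a mistake early, for example in a unit test, is to parse the generated string again. The following is a minimal sketch under that assumption; the XmlCheck class and its IsWellFormed() method are hypothetical helpers, not part of the article’s code.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class XmlCheck
{
    // Returns true when the string parses as a well-formed XML document.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            // Thrown for unclosed elements, mismatched tags, and similar errors.
            return false;
        }
    }
}
```

Passing a string with an unclosed <person> element, such as the result of CreateWrongXML(), to this helper would return false, while the output of the correct methods would return true.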
Examining XML Declaration Differences in LinqToXml and XmlWriter
By examining the results of methods using LinqToXml or XmlWriter, we can observe a small but significant difference: the presence of an XML declaration.
When utilizing LinqToXml, the XML document obtained will not include a declaration:
<person>
  <!-- other elements -->
</person>
When utilizing XmlWriter, the XML document obtained will include a declaration:
<?xml version="1.0" encoding="utf-16"?>
<person>
  <!-- other elements -->
</person>
The ToString() method of the XDocument does not include the XML declaration, but the Save() method does:
public string CreateSimpleXMLWithXmlDeclaration(Person person)
{
    var xmlPerson = new XDocument(
        new XElement("person",
            new XElement("name",
                new XElement("firstName", person.FirstName),
                new XElement("lastName", person.LastName)),
            new XElement("email", person.Email),
            new XElement("age", person.Age))
    );

    var stringWriter = new StringWriter();
    xmlPerson.Save(stringWriter);

    return stringWriter.ToString();
}
To include the XML declaration, the Save() method must be used, and for that, we must prepare a TextWriter object.
In our CreateSimpleXMLWithXmlDeclaration() method, we have chosen the StringWriter object. This illustrates how the choice of method can impact the inclusion of the XML declaration in the resulting XML document.
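Because a StringWriter reports UTF-16 as its encoding, the declaration written this way states encoding="utf-16". If a UTF-8 declaration is wanted, one common workaround is a small StringWriter subclass that overrides the Encoding property. The sketch below assumes that approach; Utf8StringWriter and DeclarationSketch are hypothetical names, not types from the article or from .NET.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class DeclarationSketch
{
    // Saves the document through the UTF-8 reporting writer, so the
    // declaration reads encoding="utf-8" instead of encoding="utf-16".
    public static string Save(XDocument document)
    {
        using var writer = new Utf8StringWriter();
        document.Save(writer);
        return writer.ToString();
    }
}
```

The returned value is still an ordinary .NET string, so it should be encoded as UTF-8 when it is eventually written to a file or a stream in order to match the declaration.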
Create Custom XML Files in Practice
Now that we know how to create XML files, let’s create something practical.
Converting CSV Files to XML Files
When dealing with tables of information, we often come across CSV files. These files, commonly utilized by applications such as Excel, serve as a popular format for storing data. CSV files have a simple structure, delimiting data elements with commas.
Let’s explore two instances of CSV files:
First CSV Example:
Name,Email,Telephone,Age
John Doe,[email protected],555-1234,30
Jane Smith,[email protected],555-5678,25
Bob Johnson,[email protected],555-9876,35

Second CSV Example:
Manchester United,Manchester
Liverpool,Liverpool
Chelsea,London
Arsenal,London
The first file includes people data with headers in the first line, while the second file lists football clubs and their cities without headers.
Implement Class to Convert CSV to XML
When a CSV file contains captions, we typically use these captions as element names:
public string[] GetCaptions(string firstLine)
{
    return firstLine
        .Split(separator)
        .Select(value => value.Trim().Replace(" ", "_"))
        .ToArray();
}
The GetCaptions() method takes the initial line of a CSV file, which contains captions.
First, we split the entire line using a separator. Since XML element names cannot contain spaces, in the next step, we trim each value and replace all remaining spaces with underscores. Finally, we gather all these strings into an array.
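Spaces are not the only characters that are invalid in XML element names; a caption that starts with a digit, for instance, would also be rejected by XElement. As a hedged alternative to the manual replacement, the sketch below uses XmlConvert.EncodeLocalName() from System.Xml, which escapes invalid characters. The class name CaptionSketch and the method name GetSafeCaptions() are assumptions for illustration.

```csharp
using System.Linq;
using System.Xml;

public static class CaptionSketch
{
    // Splits the caption line and escapes every character that is not
    // allowed in an XML element name (a space becomes _x0020_).
    public static string[] GetSafeCaptions(string firstLine, string separator = ",") =>
        firstLine
            .Split(separator)
            .Select(value => XmlConvert.EncodeLocalName(value.Trim()))
            .ToArray();
}
```

The trade-off is readability: the caption ‘First Name’ becomes First_x0020_Name rather than First_Name, but the original text can be recovered later with XmlConvert.DecodeName().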
However, in the case of the second CSV file, where there are no captions in the first line, we need to generate names dynamically. In such cases, we might opt for names like Field0, Field1, and so forth. Creating such a method still requires the first line of the file to determine the number of fields:
public string[] GetCaptionReplacements(string firstLine)
{
    return firstLine
        .Split(separator)
        .Select((_, index) => $"Field{index}")
        .ToArray();
}
The GetCaptionReplacements() method closely resembles the GetCaptions() method, differing mainly in the middle part. Previously, with captions available, we replaced spaces with underscores in the second step. However, without captions, we generate strings like Field{index}, where the index serves as the column ID.
Now we can proceed to read the remainder of the CSV file:
public class ConvertCsv2Xml(
    IEnumerable<string> csvLines,
    bool hasCaptionLine,
    string mainTag = "rows",
    string rowTag = "row",
    string separator = ",")
{
    public XDocument Convert()
    {
        if (!csvLines.Any())
            return new XDocument();

        var elementNames = hasCaptionLine
            ? GetCaptions(csvLines.First())
            : GetCaptionReplacements(csvLines.First());

        return new XDocument(
            new XElement(mainTag,
                csvLines
                    .Skip(hasCaptionLine ? 1 : 0)
                    .Select(line => new XElement(rowTag,
                        line.Split(separator)
                            .Select((value, index) => new XElement(elementNames[index], value)))))
        );
    }
}
We start with a primary constructor for our class, accepting csvLines, hasCaptionLine, mainTag, rowTag, and separator as parameters.
To start, we check for the presence of data. If there is no data, we return an empty XML document. In the presence of data, the class proceeds to generate element names. Once the element names are obtained, we create an XML document using LinqToXml, starting with the mainTag. Following this, if the first line contains captions, we skip it as it has already been utilized for element names.
In the LINQ expression, we process each line, splitting it by the separator and using the values as content for XML elements. We create a corresponding XML element for every column by combining the caption and field value.
CSV to XML Conversion in Action
Now, we can test our class on CSV data without captions, such as our teams CSV data:
var converter = new ConvertCsv2Xml(csvLines, false, "teams", "team");
var document1 = converter.Convert();
The outcome will be an XML document containing elements named FieldX for each team and the corresponding city:
<teams>
  <team>
    <Field0>Manchester United</Field0>
    <Field1>Manchester</Field1>
  </team>
  <team>
    <Field0>Liverpool</Field0>
    <Field1>Liverpool</Field1>
  </team>
  <!-- ... -->
</teams>
Next, we convert the CSV with people’s data, which includes captions:
var converter = new ConvertCsv2Xml(csvLines, true, "people", "person");
var document2 = converter.Convert();
The resulting XML document now features appropriately named XML elements for each set of personal data:
<people>
  <person>
    <Name>John Doe</Name>
    <Email>[email protected]</Email>
    <Telephone>555-1234</Telephone>
    <Age>30</Age>
  </person>
  <person>
    <Name>Jane Smith</Name>
    <Email>[email protected]</Email>
    <Telephone>555-5678</Telephone>
    <Age>25</Age>
  </person>
  <!-- ... -->
</people>
Conclusion
Despite JSON evolving into the de facto standard for API communications, XML remains an integral part of many industries due to its longstanding presence. As we inevitably encounter XML standards, it becomes essential to familiarize ourselves with XML handling in .NET, utilizing classes such as XDocument and XmlWriter.
For most tasks, XDocument proves to be a versatile choice, offering functions to create diverse XML documents and facilitating string or stream-based writing. With the support of LinqToXml, creating XML documents becomes a straightforward process.
However, for very large documents, or when output must be streamed without holding the whole document in memory, the forward-only XmlWriter may be the better choice.