In this article, we are going to cover the DataTable class in C#.
DataTable
is a data structure that is a member of the System.Data
namespace. It consists of columns that define the data type and rows that hold individual records, making it ideal for structured data sets. Good use cases include reading information from a relational database or files organized in columns and rows.
Create a DataTable and Define the Structure
Creating a DataTable
in C# requires two steps:
- Declaring and instantiating a
DataTable
object - Defining the structure by calling the
Add()
method on theColumns
property
Let’s jump in with an example.
First, let’s declare and instantiate dt
, a DataTable
:
var dt = new DataTable();
Next, let’s define the structure of our table:
dt.Columns.Add("Ticker", typeof(string)); dt.Columns.Add("Date", typeof(DateTime)); dt.Columns.Add("Price", typeof(double));
Calling the Add()
method creates a new DataColumn
object and assigns a reference to it in the Columns
property of dt
, our DataTable
. Passing Ticker
, Date
and Price
as string
parameters defines the ColumnName
property of the new columns. We use the typeof
operator to explicitly define the DataType
property of our DataColumn
objects.
Populate a DataTable in C#
Now that we’ve created and defined our DataTable
, let’s add some data. There are three steps to adding a record:
- Declaring and instantiating a
DataRow
object by calling theNewRow()
method on theDataTable
- Populating the
DataRow
by setting each column to the data we want to populate - Calling the
Add()
method on theRows
property to add the newDataRow
to ourDataTable
Let’s apply these steps by adding a record for Microsoft’s stock price.
First, let’s create a DataRow
object:
DataRow row = dt.NewRow();
Next, let’s update row
by referencing each column to insert the data we want to add:
row["Ticker"] = "MSFT"; row["Date"] = new DateTime(2023, 3, 3); row["Price"] = 255.29;
Here, we use the ColumnName
property of each DataColumn
object to populate the data. However, we could populate the columns by index:
row[0] = "MSFT"; row[1] = new DateTime(2023, 3, 3); row[2] = 255.29;
The Ordinal
property of the DataColumn
object assigns a zero-based int
index to each column in the order it’s added. Referencing 0
, 1
and 2
has identical behavior to referencing Ticker
, Date
and Price
. We’re just referencing different properties of the same object.
Common DataTable Properties
In addition to the Columns
and Rows
properties, DataTable
has a couple of common properties that allow us to apply relational data modeling concepts in memory in C#.
Primary Key
A primary key is defining one or more columns as a unique identifier for records in a table. We can use the PrimaryKey
property to define this in C# with a DataColumn[]
object
.
Building off our example, let’s create a new column and assign it as the primary key of our table:
var pkCol = new DataColumn("Key", typeof(int)); dt.Columns.Add(pkCol); var pkCols = new DataColumn[1]; pkCols[0] = pkCol; dt.PrimaryKey = pkCols;
First, we’re defining pkCol
, a variable referencing Key
, a new DataColumn
we’re adding to our table. Next, we create pkCols
, which is holding a reference to Key
, the column we want as our primary key. Finally, we’re setting the PrimaryKey
property to the pkCols
column array, defining the primary key of our table.
Constraints
Constraints allow us to set rules to limit what can be inserted into a table. This is beneficial as it guarantees reliability in tables and relationships between multiple tables. In C#, the Constraints
property references a ConstraintsCollection
object that stores defined constraints.
Let’s look at what constraints exist on our table:
foreach (Constraint constraint in dt.Constraints) { Console.WriteLine(constraint.ConstraintName); Console.WriteLine(constraint.GetType()); }
Here, we loop through each item in our table’s ConstraintCollection
and print the ConstraintName
and type
.
Let’s run the code and view the output:
Constraint1 System.Data.UniqueConstraint
Constraint1
is our only Constraint
object in our Constraints
property. It is a UniqueConstraint
that was automatically created by defining the PrimaryKey
property for our table. It prevents adding duplicates to our Key
column, ensuring integrity for the unique identifier of our table.
Common Methods of DataTable Class in C#
DataTable
has several common methods that make wrangling structured data sets easy.
Clone
The Clone()
method takes a copy of the schema and columns but does not copy any of the actual data.
Let’s clone dt
as an example:
DataTable dtClone = dt.Clone();
Next, let’s print out the columns. First, we’ll create a helper method to do this:
public static void PrintColumns(DataTable dt) { foreach (DataColumn col in dt.Columns) Console.WriteLine(col.ColumnName); }
Then, we’ll call it on our cloned table:
PrintColumns(dtClone);
The output shows the identical column structure:
Date Price Key
Copy
The Copy()
method copies a table’s underlying structure and the actual content of the table.
Let’s try it on dt
. First, we’ll create a helper method to print out the content of a DataTable
object:
public static void PrintData(DataTable dt) { foreach (DataRow row in dt.Rows) { foreach(DataColumn col in dt.Columns) { Console.WriteLine($"{col}: {row[col]}"); } } }
Next, we’ll copy our table, and print out the contents:
DataTable dtCopy = dt.Copy(); PrintData(dtCopy);
The output shows identical data to what we’ve added in dt
:
Ticker: MSFT Date: 3/3/2023 12:00:00 AM Price: 255.29 Key: 0
Select
The Select()
method allows us to easily access data rows. We can pass a string
argument with an expression to filter records. Let’s say we add an additional record to dt
:
row = dt.NewRow(); row["Ticker"] = "AMZN"; row["Date"] = new DateTime(2023, 3, 6); row["Price"] = 93.75; row["Key"] = 1; dt.Rows.Add(row);
Now, let’s access our new record. We can do this by writing an expression that filters the table by the Ticker
:
string expression = "Ticker='AMZN'"; DataRow[] result = dt.Select(expression);
Next, let’s write a helper method to print out the results:
public static void PrintSelectedRows(DataRow[] rows, DataTable dt) { foreach (DataRow dr in rows) { foreach (DataColumn col in dt.Columns) { Console.WriteLine($"{col}: {dr[col]}"); } } }
Finally, we can call this method to view the output of our Select()
method call:
PrintSelectedRows(result, dt);
Which prints out the expected result of the DataRow
objects that have a Ticker
value of AMZN:
Ticker: AMZN Date: 3/6/2023 12:00:00 AM Price: 93.75 Key: 1
Practical Example With DataTable Class in C#
Let’s take everything we’ve learned and apply it to an example. A common scenario is loading data into a program from a .csv
file. A .csv
file is structured so columns are separated by a comma, and rows are separated by a new line.
First, let’s create a method to load a file into a DataTable
:
public static DataTable CsvToDataTable(string filePath) { var dtRes = new DataTable(); using (var parser = new TextFieldParser(filePath)) { parser.TextFieldType = FieldType.Delimited; parser.TrimWhiteSpace = true; parser.SetDelimiters(","); var columns = parser.ReadFields(); foreach (var col in columns) dtRes.Columns.Add(col); while (!parser.EndOfData) { var row = parser.ReadFields(); dtRes.LoadDataRow(row, true); } } return dtRes; }
Here, we use parser
, a TextFieldParser
object that has built-in methods for reading structured text files to read a .csv
file. We then use the LoadDataRow()
method to read content from a .csv
into dtRes
, the DataTable
that is returned with the file contents.
Next, let’s write a method to load the .csv
veggie-sales
into our program:
public static void ProcessSales() { var filePath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "vegetable-sales.csv"); DataTable veggieSales = CsvToDataTable(filePath); }
Now that we’ve loaded the file into a table, let’s apply a filter to analyze sales for carrots:
DataRow[] carrotSales = veggieSales.Select("Vegetable='Carrot'");
Finally, we can view the output using our PrintSelectedRows()
method from our previous example:
Program.PrintSelectedRows(carrotSales, veggieSales);
Which prints out the records for carrot sales to the console:
Vegetable: Carrot Quantity: 1 Price: 1 Sales: 1 Vegetable: Carrot Quantity: 1 Price: 3 Sales: 1
Conclusion
DataTable class in C# is an excellent choice for storing and manipulating structured data sets. It has built-in methods to easily define columns and populate rows with data. Properties like PrimaryKey
and Constraints
enable the application of data modeling best practices. Methods like Copy()
, Clone()
and Select()
allow easy data transformation in memory.
Good article
Expression evaluation capabilities of data table?