Cryptography is the security backbone of modern society. It keeps data and communications from being compromised or stolen, and we rely on it everywhere from personal password managers to HTTPS. In this article, we will learn more about cryptography and how we can apply it in code.
Let’s start.
What is Cryptography
Cryptography is the study, development, and application of mathematical algorithms that protect data and communications from being read or altered by unauthorized parties. Cryptographic algorithms serve several goals: we can use them to generate keys, hash data, digitally sign files and programs, enable safe online payments, and encrypt data or communications.
We can access the .NET library of cryptographic algorithms through the System.Security.Cryptography namespace.
In this article, we will discuss what hashing is and how it is different from encryption. We will cover the major points of encryption and dive into different encryption methods.
Hash Functions in Cryptography
Cryptographic hashing functions are one-way mathematical algorithms. Hash functions map a data set of any size down to a byte array of fixed size. It is essential to understand that hashing data is irreversible. This is an important feature of hashing and one we can take advantage of to validate that data is correct without having to reveal it. Another property of hash functions is that they are deterministic: they produce the same result for a given input every time. Lastly, it should be infeasible to find two distinct data sets that produce the same hash value. When a hashing algorithm produces the same output for different inputs, we call it a collision. These three properties are what make hashing a great method to use for security in cryptography.
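To make the determinism and fixed-size properties concrete, here is a minimal sketch. It assumes .NET 5 or later for the static SHA256.HashData() helper, and the sample inputs are arbitrary.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hash the same input twice, a slightly different input, and a much larger input.
byte[] first = SHA256.HashData(Encoding.UTF8.GetBytes("hello"));
byte[] second = SHA256.HashData(Encoding.UTF8.GetBytes("hello"));
byte[] other = SHA256.HashData(Encoding.UTF8.GetBytes("hello!"));
byte[] large = SHA256.HashData(new byte[1_000_000]);

// Deterministic: the same input always yields the same hash.
bool deterministic = first.AsSpan().SequenceEqual(second);
// A one-character change yields a different hash.
bool differs = !first.AsSpan().SequenceEqual(other);

Console.WriteLine($"Same input, same hash: {deterministic}");
Console.WriteLine($"Different input, different hash: {differs}");
// Fixed size: SHA-256 always produces 32 bytes, whatever the input size.
Console.WriteLine($"Output sizes: {first.Length} and {large.Length} bytes");
```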
Because hash functions take inputs of any size and always produce an output of fixed size, collisions must exist. For a well-designed algorithm they are extremely unlikely to occur in practice, but they become a real risk when an algorithm is deprecated or cryptographically broken, so we should avoid such algorithms. Moreover, using a hash with a larger output makes collisions statistically even less probable. For example, although SHA256 is about 60% slower than MD5, the chance of a collision is on the order of 4.3 × 10^-60.
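The article does not say how many inputs the collision figure assumes. One reading that reproduces it, offered here as an assumption rather than the original derivation, is the standard birthday-bound approximation for n hashed inputs and a b-bit output, evaluated for one billion inputs:

```latex
p(n) \approx \frac{n^{2}}{2^{\,b+1}}
\qquad\Longrightarrow\qquad
p_{\text{SHA-256}}(10^{9}) \approx \frac{10^{18}}{2^{257}} \approx 4.3 \times 10^{-60},
\qquad
p_{\text{MD5}}(10^{9}) \approx \frac{10^{18}}{2^{129}} \approx 1.5 \times 10^{-21}
```

Under this approximation, doubling the output size from 128 to 256 bits lowers the chance of an accidental collision by a factor of 2^128.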
How Hash Functions Are Utilized
We see examples of hashing being used in a variety of settings. For example, it is standard practice that companies with user accounts do not store passwords in plain text, but rather hashes of the passwords. We also use hashing when verifying the integrity of a file.
Let’s say we received a sensitive executable file. We would first like to verify the file has not been tampered with before running it. If we also receive a hash of the original file from a trusted source, we can run the same hashing algorithm on our copy and compare the two values to see if the file is the same. Remember, hashing algorithms produce the same output for a given input. Lastly, hashing is used in blockchain technology as proof of work.
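The file check described above can be sketched as follows. The temporary file stands in for a downloaded executable, and HashFile is a hypothetical helper name. In a real scenario the publisher supplies the expected hash, whereas this sketch computes it locally.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical stand-in for a file we received.
string path = Path.GetTempFileName();
File.WriteAllBytes(path, new byte[] { 1, 2, 3, 4 });

// Assumption: normally the publisher provides this value alongside the file.
string publishedHash = HashFile(path);

bool intactBefore = HashFile(path) == publishedHash;

// Simulate tampering by appending a single character.
File.AppendAllText(path, "x");
bool intactAfter = HashFile(path) == publishedHash;

Console.WriteLine($"Matches before tampering: {intactBefore}");
Console.WriteLine($"Matches after tampering: {intactAfter}");
File.Delete(path);

// Hashes the file contents with SHA-256 and returns the hash as a hex string.
static string HashFile(string filePath)
{
    using var sha256 = SHA256.Create();
    using var file = File.OpenRead(filePath);
    return Convert.ToHexString(sha256.ComputeHash(file));
}
```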
MD5
The MD5 message-digest algorithm is a widely used hashing algorithm, but it is cryptographically broken: attackers can deliberately construct different inputs that share the same hash, which violates the collision property we discussed earlier. MD5 can still be used as a checksum to detect unintentional corruption, but not to detect deliberate tampering. It is still preferred in some cases where speed matters more than the added security of the SHA family, because it is less computationally expensive than those functions:
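using System.Security.Cryptography;
using System.Text;

// Wrap the UTF-8 bytes of the payload in a stream so it can be hashed asynchronously
var strStreamOne = new MemoryStream(Encoding.UTF8.GetBytes("This is my password! Dont read me!"));
byte[] hashOne;
using (var hasher = MD5.Create())
{
    hashOne = await hasher.ComputeHashAsync(strStreamOne);
}
// Render the raw hash bytes as hexadecimal for display
var hashAsString = Convert.ToHexString(hashOne);
Console.WriteLine("Hash Value:\n" + hashAsString);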
We use Create() to make a default instance of the MD5 class. We then pass ComputeHashAsync() a stream that wraps the byte[] created from our data payload.
The hash is an array of bytes, but we can convert it to a string format for readability. To visualize the data we convert it to a hex string by calling Convert.ToHexString():
Hash Value: 5347BC359818CD57401561F4FBF5B0BF
SHA Family
Secure Hash Algorithm, or SHA, is a family of hashing algorithms published by the National Institute of Standards and Technology (NIST). Currently, the most recent member of the family is SHA-3, which follows SHA-1 and SHA-2.
SHA-2 is more popular as it has been in use safely for many years. There are multiple flavors of SHA-2. We will discuss SHA-256 and SHA-512. Although they have different block and output sizes, both follow the same algorithmic steps.
Released in 2015, SHA-3 is still in the process of adoption. New cryptographic algorithms take time to be widely adopted as they must be studied for long periods. SHA-3 is fundamentally different from its predecessors as it is based on an algorithm called Keccak. SHA-1 and SHA-2 instead use the Merkle-Damgård construction, the same design that underlies MD5. In .NET, SHA-3 is available only from .NET 8 onward, and only on platforms whose native cryptographic library supports it.
Let’s look at an example of the SHA-256:
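using System.Security.Cryptography;
using System.Text;

// Wrap the UTF-8 bytes of the payload in a stream so it can be hashed asynchronously
var strStreamOne = new MemoryStream(Encoding.UTF8.GetBytes("This is my password! Dont read me!"));
byte[] hashOne;
using (var sha256 = SHA256.Create())
{
    hashOne = await sha256.ComputeHashAsync(strStreamOne);
}
// SHA-256 always yields 32 bytes, shown here as 64 hex characters
var hashAsString = Convert.ToHexString(hashOne);
Console.WriteLine("Hash Value:\n" + hashAsString);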
And then visualize the data:
Hash Value: 9F4C6711B093950EBA527F0AD86F17C6F64F956A341A2CD321213F0536245774
HMAC
HMAC, or hash-based message authentication code, is a hashing function that requires a secret key to hash data. We can use it to verify both the integrity and the authenticity of a message. As with any hashing algorithm, we can hash the data ourselves and compare the results to verify that the data has not changed. Since HMAC requires a secret key, a matching hash also confirms that someone who holds the key produced it.
Let’s perform an HMAC hash:
using System.Security.Cryptography;
using System.Text;

var strStreamOne = new MemoryStream(Encoding.UTF8.GetBytes("This is my password! Dont read me!"));
byte[] hashOne;
// The secret key; only parties who hold it can produce or verify the hash
byte[] key = Encoding.UTF8.GetBytes("superSecretH4shKey1!");
using (var hmac = new HMACSHA256(key))
{
    hashOne = await hmac.ComputeHashAsync(strStreamOne);
}
var hashAsString = Convert.ToHexString(hashOne);
Console.WriteLine("Hash Value:\n" + hashAsString);
And visualize the converted hex:
Hash Value: 1B24C3C6AC68397C8C58076C77C85DC6CB4713153369F1B92C183BCBF9DB8927
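To show the verification side as well, here is a minimal sketch of how a receiver might check an HMAC. It assumes .NET 6 or later for the static HMACSHA256.HashData() and RandomNumberGenerator.GetBytes() helpers. The message text and the Verify helper are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

byte[] key = RandomNumberGenerator.GetBytes(32); // secret shared by sender and receiver
byte[] message = Encoding.UTF8.GetBytes("Transfer 100 to account 42");

// The sender computes the HMAC and sends it along with the message.
byte[] tag = HMACSHA256.HashData(key, message);

// The receiver recomputes the HMAC over what arrived and compares.
bool genuine = Verify(key, message, tag);

// A modified message no longer matches the original HMAC.
byte[] altered = Encoding.UTF8.GetBytes("Transfer 900 to account 42");
bool forgedAccepted = Verify(key, altered, tag);

Console.WriteLine($"Original message accepted: {genuine}");
Console.WriteLine($"Altered message accepted: {forgedAccepted}");

// Recomputes the HMAC and compares it in constant time to avoid timing leaks.
static bool Verify(byte[] secretKey, byte[] received, byte[] expectedTag)
{
    byte[] actualTag = HMACSHA256.HashData(secretKey, received);
    return CryptographicOperations.FixedTimeEquals(actualTag, expectedTag);
}
```

The constant-time comparison is a design choice: a plain byte-by-byte comparison can reveal, through timing, how many leading bytes of a guessed HMAC were correct.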
For more information about hashing visit this Code Maze article.
Symmetric Encryption in Cryptography
Encrypting involves changing data in a way that obscures it. Decryption is the reversal of encryption.
The first of two ways to encrypt data is symmetric encryption. Symmetric encryption is a class of algorithms that encrypt and decrypt data with the same secret key. Anyone who has the key can decrypt our data, so we need a secure way to share it with the intended recipient. We can also refer to symmetric encryption as private key encryption.
AES
AES is the most popular symmetric encryption algorithm. This algorithm is fast and can encrypt data of any size. This is very important because we often need to encrypt large amounts of data. A slow algorithm can make a task like this very time-consuming.
AES features a 128-bit block size and key sizes of 128, 192, or 256 bits. The cipher text does not reveal which key size was used. Additionally, these keys are long compared with those of older algorithms such as DES. This all makes for a safer, harder-to-break algorithm.
Let’s perform an AES encryption:
using System.Security.Cryptography;
using System.Text;

var dataStr = "This is corporate research! Dont read me!";
var data = Encoding.UTF8.GetBytes(dataStr);
var key = GenerateAESKey();
var encryptedData = Encrypt(data, key, out var iv);
var encryptedDataAsString = Convert.ToHexString(encryptedData);
Console.WriteLine("Encrypted Value:\n" + encryptedDataAsString);

// The helper methods below are members of the enclosing class
public static byte[] Encrypt(byte[] data, byte[] key, out byte[] iv)
{
    using (var aes = Aes.Create())
    {
        aes.Mode = CipherMode.CBC; // CBC is also the default mode
        aes.Key = key;
        aes.GenerateIV(); // IV = Initialization Vector
        using (var encryptor = aes.CreateEncryptor())
        {
            iv = aes.IV; // hand the IV back; it is needed again for decryption
            return encryptor.TransformFinalBlock(data, 0, data.Length);
        }
    }
}

public static byte[] Decrypt(byte[] data, byte[] key, byte[] iv)
{
    using (var aes = Aes.Create())
    {
        aes.Key = key;
        aes.IV = iv;
        aes.Mode = CipherMode.CBC; // same as for encryption
        using (var decryptor = aes.CreateDecryptor())
        {
            return decryptor.TransformFinalBlock(data, 0, data.Length);
        }
    }
}

public static byte[] GenerateAESKey()
{
    // 16 random bytes = 128-bit key; zero bytes are allowed so no key values are excluded
    var b = new byte[16];
    RandomNumberGenerator.Fill(b);
    return b;
}
Finally, let’s visualize the data:
Encrypted Value: 46180CD8EF50860880F7D5885FF1B96A3ABA3A319FBB2374CF14705C64D66EEB050551E2E799D6300DB4F3E654A56F6D
In this example, we use AES to encrypt our data. It is important to note that if we provide a key of the wrong size, .NET throws an exception. In this case, we are using a 16-byte (128-bit) key. We are using the CBC cipher mode. We use GenerateIV() to create an IV, or initialization vector, which is a random value. The purpose of the IV is to produce different cipher text for the same input. It is not secret, but we must keep it because it is also required for decryption. Aes.CreateEncryptor() returns an object on which we can call TransformFinalBlock() to perform the encryption.
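As a complement, the following sketch shows the round trip back to plain text and the effect of the IV. It is not the article's helper code: it assumes .NET 6 or later and uses the one-shot EncryptCbc() and DecryptCbc() methods instead of CreateEncryptor().

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

byte[] key = RandomNumberGenerator.GetBytes(16); // 128-bit key
byte[] plaintext = Encoding.UTF8.GetBytes("This is corporate research! Dont read me!");

using var aes = Aes.Create();
aes.Key = key;

// Encrypt the same data twice with two different random IVs.
byte[] iv1 = RandomNumberGenerator.GetBytes(16);
byte[] iv2 = RandomNumberGenerator.GetBytes(16);
byte[] cipher1 = aes.EncryptCbc(plaintext, iv1);
byte[] cipher2 = aes.EncryptCbc(plaintext, iv2);

// Decryption needs the same key and the same IV that was used to encrypt.
byte[] decrypted = aes.DecryptCbc(cipher1, iv1);
string roundTrip = Encoding.UTF8.GetString(decrypted);

bool sameCipherText = cipher1.AsSpan().SequenceEqual(cipher2);

Console.WriteLine($"Decrypted: {roundTrip}");
Console.WriteLine($"Cipher text length: {cipher1.Length} bytes");
Console.WriteLine($"Same cipher text for both IVs: {sameCipherText}");
```

The 41-byte input is padded up to the next multiple of the 16-byte block size, so the cipher text is 48 bytes long.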
Cryptography Cipher Modes
We can set the cipher mode we want to use by setting aes.Mode to one of the values of the CipherMode enum. We describe four of them here. The enum also defines an Output Feedback (OFB) value.
The Cipher Block Chaining (CBC) mode utilizes feedback from the previous block when encrypting the current block. Each plain text block is combined (XORed) with the cipher text of the previous block before it is encrypted, and the first block is combined with the IV. This method assures that identical blocks result in different encrypted values.
The Electronic Codebook (ECB) mode encrypts each block individually. If we use the same key, identical blocks of plain text will have the same encryption value. This mode is not recommended as it reveals patterns in the data and so introduces vulnerabilities to the encryption.
The Cipher Feedback (CFB) mode processes data in increments smaller than the block size.
The Cipher Text Stealing (CTS) mode can produce an encrypted output the same size as the original unencrypted data set. CTS behaves exactly like CBC for all blocks except the last two blocks.
In this example, we used CBC because it is a safer option than ECB when encrypting. It closes off opportunities someone can exploit to retrieve our secret data.
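The difference between ECB and CBC is easy to observe. This sketch assumes .NET 6 or later for the one-shot EncryptEcb() and EncryptCbc() methods. It encrypts two identical plain text blocks and compares the resulting cipher text blocks, with padding disabled so the output contains exactly those two blocks.

```csharp
using System;
using System.Security.Cryptography;

byte[] key = RandomNumberGenerator.GetBytes(16);
byte[] iv = RandomNumberGenerator.GetBytes(16);

// Two identical 16-byte blocks of plain text (all zeros).
byte[] plaintext = new byte[32];

using var aes = Aes.Create();
aes.Key = key;

byte[] ecb = aes.EncryptEcb(plaintext, PaddingMode.None);
byte[] cbc = aes.EncryptCbc(plaintext, iv, PaddingMode.None);

// Compare the first and second cipher text block in each mode.
bool ecbBlocksMatch = ecb.AsSpan(0, 16).SequenceEqual(ecb.AsSpan(16, 16));
bool cbcBlocksMatch = cbc.AsSpan(0, 16).SequenceEqual(cbc.AsSpan(16, 16));

Console.WriteLine($"ECB blocks identical: {ecbBlocksMatch}");
Console.WriteLine($"CBC blocks identical: {cbcBlocksMatch}");
```

In ECB the repetition in the input shows through in the output, which is exactly the pattern leak that chaining in CBC hides.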
DES and Triple DES
.NET also offers support for DES and Triple DES. Data Encryption Standard (DES) is a symmetric encryption algorithm that preceded AES. Triple DES applies DES three times to each block. DES was once a formidable encryption, but by today’s standards, it is no longer practical to use DES. This is because DES uses a 56-bit key length which makes it easier to break when compared to other algorithms with longer keys.
Asymmetric Encryption in Cryptography
Asymmetric encryption is an encryption method where data is encrypted with a publicly available key and decrypted with a private key. The public and private keys are mathematically related, so only the matching private key can decrypt data that a given public key encrypts. These algorithms can typically encrypt only a limited amount of data in a single operation. We can refer to asymmetric encryption as public key encryption.
Typically asymmetric encryption is much slower than symmetric encryption. It is not practical to use asymmetric encryption on a large data set.
RSA
RSA is a popular public key algorithm and a standard choice for asymmetric encryption. It is old for a cryptographic algorithm that is still considered secure: given a long enough key, there is no known practical method for breaking it. This means it has withstood the test of time. RSA can only encrypt a block of data somewhat smaller than the key itself, because the padding takes up part of the space. With a 2048-bit key and OAEP SHA-256 padding, for example, the limit is 190 bytes. Larger data sets must be split into multiple blocks. It is traditional to use key sizes that are powers of two, but this is not a requirement.
We commonly use RSA for key exchange and digital signing. This is because encryption of large data sets would be slow using RSA. Practical uses of RSA include digital signing, secure messaging applications, and secure connections such as SSL and VPN connections:
using System.Security.Cryptography;
using System.Text;

var dataStr = "This is corporate research! Dont read me!";
var data = Encoding.UTF8.GetBytes(dataStr);
var keyLength = 2048; // size in bits
GenerateKeys(keyLength, out var publicKey, out var privateKey);
var encryptedData = Encrypt(data, publicKey);
var encryptedDataAsString = Convert.ToHexString(encryptedData);
Console.WriteLine("Encrypted Value:\n" + encryptedDataAsString);

// The helper methods below are members of the enclosing class
public void GenerateKeys(int keyLength, out RSAParameters publicKey, out RSAParameters privateKey)
{
    using (var rsa = RSA.Create())
    {
        rsa.KeySize = keyLength;
        // Export the public half only, then the full key pair
        publicKey = rsa.ExportParameters(includePrivateParameters: false);
        privateKey = rsa.ExportParameters(includePrivateParameters: true);
    }
}

public byte[] Encrypt(byte[] data, RSAParameters publicKey)
{
    using (var rsa = RSA.Create())
    {
        rsa.ImportParameters(publicKey);
        var result = rsa.Encrypt(data, RSAEncryptionPadding.OaepSHA256);
        return result;
    }
}

public byte[] Decrypt(byte[] data, RSAParameters privateKey)
{
    using (var rsa = RSA.Create())
    {
        rsa.ImportParameters(privateKey);
        // The padding must match the one used for encryption
        return rsa.Decrypt(data, RSAEncryptionPadding.OaepSHA256);
    }
}
In this example, we set rsa.KeySize and call ExportParameters() to generate a public and private key. We can use the public key to encrypt our data in the Encrypt() method. Similarly, we can decrypt data using our private key in the Decrypt() method. It is important to note that if the key size is too small we will not be able to create a set of cryptographic keys. A standard and safe key size is 2048 bits.
Let’s check how the encrypted data looks in a hex string format:
Encrypted Value: 5B63ADA5C1ADC2E2E4884C6B7157BE7A5C2F562CF089E1DE9225968F4F0226E0234A685E7CD47D02B4AD1653100F7F3B5A8050B5 0B5CCBD162EAF62C65A3C61E284C7337352173403A71FF395843A0F3B9A50E85BF25E5344285EFF054B82B29B56BF9AC77A33D8C DCBDD6FD055B2E843B2FC467F64D879D974CC8CBDEC9B78714F3286B27CD316175F6FB453E6C2264BC45645FBCAB1D6844E83B3D 37FE2DA00732EA5F11F45BF40BE16810BA1FE15B88C267D90C6164A960A2690CB095F0D2A8AF816DB337E67A30882EAD94B68DF3 2F9B1A27FDC2A5467ED8810A95239D7D73CACF9AD0C5FE1D717D7C730328BB191D743DE04549CA0C165D600ACD57FF60
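To round this out, here is a small sketch of decryption and of the size limit on RSA input. It is an illustration rather than the article's helper code, and it uses a single RSA instance for brevity. The maxPayload formula is the standard OAEP limit: the key size in bytes, minus twice the hash size, minus two.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

using var rsa = RSA.Create(2048);
byte[] data = Encoding.UTF8.GetBytes("This is corporate research! Dont read me!");

// Encrypt with the public key, decrypt with the private key.
byte[] encrypted = rsa.Encrypt(data, RSAEncryptionPadding.OaepSHA256);
byte[] decrypted = rsa.Decrypt(encrypted, RSAEncryptionPadding.OaepSHA256);
string roundTrip = Encoding.UTF8.GetString(decrypted);

// OAEP with SHA-256: key bytes (256) - 2 * hash bytes (32) - 2 = 190 bytes.
int maxPayload = rsa.KeySize / 8 - 2 * 32 - 2;

// One byte over the limit is rejected.
bool tooLargeRejected = false;
try
{
    rsa.Encrypt(new byte[maxPayload + 1], RSAEncryptionPadding.OaepSHA256);
}
catch (CryptographicException)
{
    tooLargeRejected = true;
}

Console.WriteLine($"Decrypted: {roundTrip}");
Console.WriteLine($"Cipher text length: {encrypted.Length} bytes");
Console.WriteLine($"Largest payload: {maxPayload} bytes, oversized input rejected: {tooLargeRejected}");
```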
DSA
The Digital Signature Algorithm, or DSA, is another public key algorithm. We use DSA for key generation and digital signing. Unlike RSA, DSA is a signature-only algorithm and cannot encrypt data. Compared to RSA, DSA is generally faster at key generation and digital signing. On the other hand, RSA is faster at digital signature verification:
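using System.Security.Cryptography;
using System.Text;

var dsa = DSA.Create();
var dataStr = "This is corporate research! Dont read me!";
var data = Encoding.UTF8.GetBytes(dataStr);
var signedData = Sign(dsa, data);
// Verification succeeds only while the data and the signature are unchanged
var isValid = VerifySignature(dsa, data, signedData);
dsa.Dispose();

// The helper methods below are members of the enclosing class
public byte[] Sign(DSA dsa, byte[] data)
{
    if (dsa is null)
        throw new ArgumentNullException(nameof(dsa));
    // Hashes the data with SHA-256 and signs the hash with the private key
    var result = dsa.SignData(data, HashAlgorithmName.SHA256);
    return result;
}

public bool VerifySignature(DSA dsa, byte[] data, byte[] signedData)
{
    if (dsa is null)
        throw new ArgumentNullException(nameof(dsa));
    return dsa.VerifyData(data, signedData, HashAlgorithmName.SHA256);
}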
Here, we use DSA to digitally sign data. This data can be a text file, executable, or email. We sign the data by calling SignData(). We can verify the validity of signed data by calling VerifyData().
Choosing the Right Algorithm in Cryptography
When deciding which encryption method and implementation to use, it is a good idea to base our decision on the parameters of our application of the encryption. What does our data look like? Do we have a lot of data to encrypt? Is performance a factor? What level of security must we ensure? How can we share or transmit this data? Do we need to transfer data or just prove the data is authentic? All of these answers will lead us to make an appropriate choice of which algorithm to employ.
Security must always be a top priority in these decisions. We must strive to choose the safest method considering our use case.
Asymmetric Encryption
If a data set is not large, asymmetric encryption is a safe choice because no secret has to be shared: the party who needs to decrypt the data keeps the private key and only has to publish the public key. Asymmetric encryption is not applicable to all cases, however. Large data sets take much longer to encrypt. Asymmetric encryption is also not a good choice if we only need to validate that data is authentic rather than keep it secret. This would be a job more suited for a hashing algorithm.
Symmetric Encryption
Symmetric encryption is useful when the data set is large. This is because symmetric encryption is typically much faster than asymmetric. The trade-off we must consider is key distribution: both parties need the same secret key, and anyone who intercepts it while it is being shared can decrypt the data. The algorithms themselves are safe, so if the situation calls for performant encryption and we have a secure way to share the key, symmetric encryption is a good choice.
Hash Function
Sometimes the issue we want to solve is not to keep our data secret but rather that the data transmission is accurate. We must verify that there are no changes between the data we sent and the data received. We can employ hashing algorithms in a situation like this. Hashing algorithms are typically faster than encryption. We should use hashing rather than encryption when the goal is to validate data rather than hide it.
Hybrid Encryption
Ideally, we can take advantage of public and private key encryption in unison to achieve performant and safe encryption. We can encrypt a large data set with the speed of symmetric encryption and the safe key sharing of asymmetric encryption. First, a symmetric algorithm encrypts the data. Next, an asymmetric algorithm encrypts the symmetric key with the recipient’s public key. This way, we can share the symmetric key safely while not taking a hit on performance.
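The two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete protocol: it assumes .NET 6 or later, uses AES in CBC mode with RSA OAEP SHA-256 padding, and all variable names are hypothetical. A production design would also authenticate the cipher text.

```csharp
using System;
using System.Security.Cryptography;

// Recipient: owns the key pair and publishes only the public half.
using var recipientRsa = RSA.Create(2048);
RSAParameters recipientPublicKey = recipientRsa.ExportParameters(includePrivateParameters: false);

// Sender: encrypts a large payload with a fresh AES key.
byte[] payload = RandomNumberGenerator.GetBytes(100_000);
byte[] aesKey = RandomNumberGenerator.GetBytes(32); // 256-bit key
byte[] iv = RandomNumberGenerator.GetBytes(16);
byte[] cipherText;
using (var aes = Aes.Create())
{
    aes.Key = aesKey;
    cipherText = aes.EncryptCbc(payload, iv);
}

// Sender: encrypts only the small AES key with the recipient's public key.
byte[] wrappedKey;
using (var senderRsa = RSA.Create())
{
    senderRsa.ImportParameters(recipientPublicKey);
    wrappedKey = senderRsa.Encrypt(aesKey, RSAEncryptionPadding.OaepSHA256);
}

// Recipient: recovers the AES key with the private key, then decrypts the payload.
byte[] unwrappedKey = recipientRsa.Decrypt(wrappedKey, RSAEncryptionPadding.OaepSHA256);
byte[] recovered;
using (var aes = Aes.Create())
{
    aes.Key = unwrappedKey;
    recovered = aes.DecryptCbc(cipherText, iv);
}

bool match = recovered.AsSpan().SequenceEqual(payload);
Console.WriteLine($"Payload recovered intact: {match}");
```

Only the wrapped key, the IV, and the cipher text travel to the recipient. RSA handles just 32 bytes, so its slowness does not matter.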
We have covered many topics on encryption, but for further reading, visit the official Microsoft .NET documentation.
Cryptographic Random Numbers
It is important to note the difference between ordinary random values and the cryptographic random values that we use in cryptography. This is important because the security of a system rests on the assumption that its random numbers cannot be predicted. Using System.Random is a performant, easy way to get random values, but this implementation is not truly random. System.Random uses a seed to generate values from. The same seed will produce the same sequence of values. This could lead to vulnerabilities a bad actor can use, because anyone who learns or guesses the seed can reproduce every value.
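A short sketch makes the seed problem visible. The seed value 12345 is arbitrary. Any two System.Random instances created with the same seed behave this way.

```csharp
using System;

// Two generators created with the same seed.
var first = new Random(12345);
var second = new Random(12345);

int[] fromFirst = { first.Next(), first.Next(), first.Next() };
int[] fromSecond = { second.Next(), second.Next(), second.Next() };

// The sequences are identical, so knowing the seed means knowing every value.
bool identical = fromFirst.AsSpan().SequenceEqual(fromSecond);

Console.WriteLine($"Same seed, same sequence: {identical}");
```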
RandomNumberGenerator provides a great way to safely generate random numbers. Using this class will increase the security of our code:
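// Illustrative wrapper; the method name is not part of the .NET API
public static byte[] GenerateRandomBytes(int length)
{
    using var randomNumGenerator = RandomNumberGenerator.Create();
    var data = new byte[length];
    // Fill the whole array with cryptographically strong random bytes
    randomNumGenerator.GetBytes(data);
    return data;
}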
In this snippet, we use the Create method of RandomNumberGenerator to get a generator and use it to fill an array with length random bytes. We can convert these bytes to other types like int, char, and string.
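A few such conversions might look like the sketch below. It assumes .NET 6 or later for the static GetBytes() helper, and the variable names are illustrative. For a number within a range, RandomNumberGenerator.GetInt32() avoids the bias that a manual conversion with the modulo operator can introduce.

```csharp
using System;
using System.Security.Cryptography;

byte[] bytes = RandomNumberGenerator.GetBytes(16);

// string: 16 bytes become 32 hexadecimal characters, useful as a token.
string token = Convert.ToHexString(bytes);

// int: reinterpret the first four bytes as a 32-bit integer.
int asInt = BitConverter.ToInt32(bytes, 0);

// int in a range: lower bound inclusive, upper bound exclusive, so this is 1 to 6.
int dieRoll = RandomNumberGenerator.GetInt32(1, 7);

// char: pick a random letter from an alphabet.
const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char letter = alphabet[RandomNumberGenerator.GetInt32(alphabet.Length)];

Console.WriteLine($"Token: {token}");
Console.WriteLine($"Integer: {asInt}");
Console.WriteLine($"Die roll: {dieRoll}");
Console.WriteLine($"Letter: {letter}");
```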
Find further reading on Code Maze about random numbers in this article.
Conclusion
In conclusion, there are many real-world applications for cryptography. Many current technologies rely on encryption to protect our information, often in ways we do not see, such as HTTPS and SSL. There are many algorithms in cryptography with varying levels of security and performance. Choosing the right algorithm for an application can provide great security for the user and ensure that we utilize cryptography in the right way.