Performance Comparison: List vs. HashSet in .NET

Introduction

Choosing the right data structure can significantly impact the performance and efficiency of your code. Two commonly used data structures for storing collections of objects are List and HashSet.

In this blog post, we’ll dive into the performance differences between List and HashSet in .NET, and provide C# examples to help you make informed decisions based on your specific requirements.

Understanding List and HashSet

Before we jump into performance comparisons, let’s briefly explain what List and HashSet are:

  1. List: List<T> is a dynamic array-based data structure in .NET that allows you to store and manipulate a collection of elements of a specified type T. Lists are ordered collections, which means the elements are stored in the order they were added.
  2. HashSet: HashSet<T> is a collection of unique elements in .NET. It does not allow duplicate values and provides faster lookups compared to lists. HashSets are unordered collections, which means there is no specific order in which elements are stored.
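
To make the two definitions concrete, here is a minimal sketch of how each collection treats the same input. The sample values are made up for illustration, and the snippet assumes a console project that supports top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data containing a duplicate value.
int[] values = { 3, 1, 3, 2 };

var list = new List<int>(values);       // keeps all four items, in insertion order
var hashSet = new HashSet<int>(values); // keeps only the distinct values

Console.WriteLine(list.Count);    // 4
Console.WriteLine(hashSet.Count); // 3
Console.WriteLine(list[0]);       // 3 (List<T> supports access by index)

// HashSet<T>.Add reports whether the value was actually inserted.
bool added = hashSet.Add(3);
Console.WriteLine(added);         // False, because 3 is already present
```

Note that HashSet<T> has no indexer, so there is no equivalent of list[0] for a set.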

Performance Comparison

Now, let’s explore the performance differences between List and HashSet in various scenarios.

1. Element Retrieval

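When it comes to checking whether a collection contains a given value, HashSet outperforms List as the collection grows. HashSet<T> hashes the value to locate its bucket directly, so Contains runs in O(1) time on average (it degrades toward O(n) only when many elements share the same hash code), while List<T>.Contains scans the elements one by one and takes O(n) time: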

HashSet<int> hashSet = new HashSet<int>();
List<int> list = new List<int>();

for (int i = 0; i < 1000000; i++)
{
    hashSet.Add(i);
    list.Add(i);
}

// Checking for a value in the HashSet
var hashSetResult = hashSet.Contains(999999); // O(1) on average

// Checking for a value in the List (worst case: the value is the last element)
var listResult = list.Contains(999999); // O(n)
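
If you want to see the difference on your own machine, one rough way is to time repeated lookups with Stopwatch. This is only a sketch: the collection size and the number of lookups are arbitrary assumptions, and a proper benchmark harness would give more trustworthy numbers than a hand-rolled loop.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

const int Count = 1_000_000; // assumed collection size
const int Lookups = 100;     // assumed number of repeated lookups

var list = Enumerable.Range(0, Count).ToList();
var hashSet = new HashSet<int>(list);

// Look up the last value, which is the worst case for the list scan.
bool foundInList = false;
var listTimer = Stopwatch.StartNew();
for (int i = 0; i < Lookups; i++)
{
    foundInList = list.Contains(Count - 1);
}
listTimer.Stop();

bool foundInSet = false;
var setTimer = Stopwatch.StartNew();
for (int i = 0; i < Lookups; i++)
{
    foundInSet = hashSet.Contains(Count - 1);
}
setTimer.Stop();

Console.WriteLine($"List:    found={foundInList}, {listTimer.Elapsed.TotalMilliseconds} ms");
Console.WriteLine($"HashSet: found={foundInSet}, {setTimer.Elapsed.TotalMilliseconds} ms");
```

The absolute timings depend on hardware and runtime, so treat them as indicative only.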

2. Element Insertion and Removal

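Appending is cheap for both collections: List<T>.Add is amortized O(1) and HashSet<T>.Add is O(1) on average, with either one occasionally paying O(n) to grow its internal storage. The difference shows up when removing by value and when uniqueness must be enforced: HashSet<T>.Remove is O(1) on average, whereas List<T>.Remove has to search for the element and then shift everything after it, which is O(n):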

HashSet<int> hashSet = new HashSet<int>();
List<int> list = new List<int>();

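// Adding elements
hashSet.Add(1); // O(1) on average; O(n) when the set has to grow
list.Add(1);    // amortized O(1); O(n) when the backing array has to grow

// Removing elements
hashSet.Remove(1); // O(1) on average
list.Remove(1);    // O(n): linear search, then shifts the following elements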
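
The uniqueness point is easiest to see side by side. The sketch below uses a made-up input array: with a List you have to check Contains yourself before each Add, while HashSet<T>.Add does the duplicate check as part of the insert and returns false when the value is rejected.

```csharp
using System;
using System.Collections.Generic;

int[] incoming = { 5, 7, 5, 9, 7 }; // hypothetical input with duplicates

// List: uniqueness must be enforced manually, and Contains scans the list.
var uniqueList = new List<int>();
foreach (int value in incoming)
{
    if (!uniqueList.Contains(value)) // O(n) check before every insert
    {
        uniqueList.Add(value);
    }
}

// HashSet: Add performs the duplicate check itself.
var uniqueSet = new HashSet<int>();
int rejected = 0;
foreach (int value in incoming)
{
    if (!uniqueSet.Add(value))
    {
        rejected++; // Add returned false, so the value was already present
    }
}

Console.WriteLine(string.Join(", ", uniqueList)); // 5, 7, 9
Console.WriteLine(uniqueSet.Count);               // 3
Console.WriteLine(rejected);                      // 2
```

For a handful of items the difference is negligible, but the List approach does a linear scan per insert, so the total work grows quadratically with the input size.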

3. Memory Consumption

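For the same elements, a HashSet typically consumes more memory than a List. A List<T> stores its items in a single backing array and over-allocates spare capacity so it can grow, whereas a HashSet<T> maintains a bucket array plus an entry for each element that holds the cached hash code and a link to the next entry alongside the value. A rough way to compare the two is to measure the managed heap before and after filling each collection:

// Measure the list on its own
long before = GC.GetTotalMemory(true);
List<int> list = new List<int>();
for (int i = 0; i < 1000000; i++)
{
    list.Add(i);
}
long listMemory = GC.GetTotalMemory(true) - before;

// Measure the hash set on its own
before = GC.GetTotalMemory(true);
HashSet<int> hashSet = new HashSet<int>();
for (int i = 0; i < 1000000; i++)
{
    hashSet.Add(i);
}
long hashSetMemory = GC.GetTotalMemory(true) - before;

// Keep both collections reachable so the forced collections above cannot reclaim them
GC.KeepAlive(list);
GC.KeepAlive(hashSet);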
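
The spare capacity that a List sets aside for growth can be observed directly through its Capacity property. The following sketch uses an arbitrary item count chosen for illustration; it shows the over-allocation, how TrimExcess releases it, and how passing an initial capacity avoids it when the size is known up front.

```csharp
using System;
using System.Collections.Generic;

const int ItemCount = 600; // assumed size, chosen only for illustration

// Grown one Add at a time, the list over-allocates its backing array.
var grown = new List<int>();
for (int i = 0; i < ItemCount; i++)
{
    grown.Add(i);
}
int capacityBeforeTrim = grown.Capacity; // at least ItemCount, usually more
Console.WriteLine($"Count = {grown.Count}, Capacity = {capacityBeforeTrim}");

// TrimExcess releases the unused slots (it does nothing when the list is
// already above 90 percent of its capacity).
grown.TrimExcess();
Console.WriteLine($"Capacity after TrimExcess = {grown.Capacity}");

// When the size is known up front, both collections accept an initial capacity.
var presizedList = new List<int>(ItemCount);
var presizedSet = new HashSet<int>(ItemCount);
Console.WriteLine($"Presized list capacity = {presizedList.Capacity}");
Console.WriteLine($"Presized set count = {presizedSet.Count}");
```

Presizing also avoids the repeated reallocate-and-copy steps that happen while a collection grows.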

Conclusion

Choosing between List and HashSet in .NET should depend on your specific use case.

If you require a collection with unique elements and efficient lookup operations, HashSet is the better choice.

However, if you need ordered elements or often perform operations like indexing and inserting at specific positions, List may be more suitable.

Consider the performance characteristics and the specific requirements of your application when making your choice. Using the right data structure can lead to improved performance and efficient memory usage in your .NET projects.

You can find the official HashSet<T> documentation here: https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.hashset-1?view=net-8.0

Stephen

Hi, my name is Stephen Finchett. I have been a software engineer for over 30 years and have worked on complex, business-critical, multi-user systems for all of my career. For the last 15 years, I have been concentrating on web-based solutions using the Microsoft stack, including ASP.NET, C#, TypeScript, and SQL Server, and running everything at scale within Kubernetes.