C#: How to get unique values in a List?

By Josip Miskovic

To get the unique values from a List:

  1. Import the System.Linq namespace.
  2. Call the Distinct method on your list.
  3. (optional) Convert the result back to a list with the ToList method.

Full code example:

C#
using System;
using System.Collections.Generic;
using System.Linq;
List<int> numbers = new() { 1, 2, 2, 2, 5, 6 };
List<int> unique = numbers.Distinct().ToList();
Console.WriteLine(string.Join(",", unique)); // 1,2,5,6

Distinct is an extension method from the System.Linq namespace. It uses the default equality comparer (GetHashCode and Equals) to return the unique values from an IEnumerable<T> collection.

C#/.NET source code of the Distinct method

Distinct does not iterate over the collection at the moment you call it. Instead, it uses the yield keyword to defer execution, so the source is only enumerated when you iterate over the result, for example by calling ToList.
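
To see the deferred execution in practice, here's a minimal sketch (the variable names are mine, not from the .NET source). The list is modified after Distinct is called, and the change still shows up because nothing is enumerated until ToList runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<int> numbers = new() { 1, 2, 2 };

// Calling Distinct only builds the query; the list is not read yet.
IEnumerable<int> query = numbers.Distinct();

// These values are added after the Distinct call...
numbers.Add(3);
numbers.Add(3);

// ...but they are still picked up, because enumeration happens here.
List<int> unique = query.ToList();
Console.WriteLine(string.Join(",", unique)); // 1,2,3
```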

What about objects?

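By default, Distinct won't deduplicate class instances by their contents. Unless a class overrides Equals and GetHashCode, two objects that contain the same values are still different references in memory, so Distinct treats them as different items.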

Here's an example:

C#

List<User> userList = new() { new User(1, "Joe"), new User(2, "Matt"), new User(1, "Joe") };

userList.Distinct().ToList().ForEach(Console.WriteLine);

// Prints:
// 1: Joe
// 2: Matt
// 1: Joe

public class User {
    public int Id { get; init; }
    public string Name { get; init; }
    
    public User (int id, string name)
    {
        Id = id;
        Name = name;
    }
    
    public override string ToString() => $"{Id}: {Name}";
}

We can see that Distinct didn't filter out the duplicate User(1, "Joe") object.
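
If you own the type, one way around this is to give it value-based equality. The sketch below assumes a hypothetical UserRecord record (it is not part of the example above). Records compare by their property values, so Distinct treats the two "Joe" entries as equal. Overriding Equals and GetHashCode on a class would have the same effect.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<UserRecord> users = new()
{
    new UserRecord(1, "Joe"),
    new UserRecord(2, "Matt"),
    new UserRecord(1, "Joe"),
};

// Records implement value equality, so the duplicate "Joe" is removed.
List<UserRecord> unique = users.Distinct().ToList();
Console.WriteLine(unique.Count); // 2

// Hypothetical type used only for this illustration.
public record UserRecord(int Id, string Name);
```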

The easiest way to filter out duplicates when you have a unique key is to use DistinctBy, which is available in .NET 6 and later:

C#
userList.DistinctBy(user => user.Id).ToList().ForEach(Console.WriteLine);

// Prints: 
// 1: Joe
// 2: Matt

The code above bases the distinction on the Id property, which gives us the expected result.
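
When no single property is unique, the key selector can return a tuple instead. This is only a sketch under my own assumptions: the Order record and its properties are hypothetical and exist purely to show a composite key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<Order> orders = new()
{
    new Order(1, "A-100", 5),
    new Order(1, "A-100", 7), // same customer and product as the first order
    new Order(1, "B-200", 1),
    new Order(2, "A-100", 2),
};

// The tuple acts as a composite key; tuples compare by value.
List<Order> unique = orders
    .DistinctBy(o => (o.CustomerId, o.ProductCode))
    .ToList();

Console.WriteLine(unique.Count); // 3

// Hypothetical type used only for this illustration.
public record Order(int CustomerId, string ProductCode, int Quantity);
```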

Besides Distinct, there are 3 major alternatives to get unique values from a List:

1. Use a HashSet

A HashSet is a data structure that contains only unique elements.

We can use the HashSet(IEnumerable<T> collection) constructor to get unique values from our list because List<T> implements the IEnumerable<T> interface. We can then turn the newly created HashSet into a List to get all the unique values.

For example:

C#
List<int> numbers = new() { 1, 2, 2, 2, 5, 6 };
List<int> unique = new HashSet<int>(numbers).ToList();
Console.WriteLine(string.Join(",", unique)); // 1,2,5,6

This works because the HashSet(IEnumerable<T> collection) constructor calls an overload that takes an IEnumerable<T> collection and an IEqualityComparer<T> as parameters. If the collection passed in is already a HashSet of the same type and the equality comparers are equal, it copies the elements from that HashSet. If not, it uses the collection's count (when one is available) to set the initial capacity and then adds the elements from the collection to the HashSet.
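
The comparer parameter is also something you can pass yourself. Here's a small sketch, with sample data I made up, that uses the built-in StringComparer.OrdinalIgnoreCase so that names differing only in casing count as duplicates:

```csharp
using System;
using System.Collections.Generic;

List<string> names = new() { "joe", "Joe", "MATT", "matt" };

// Default comparer: strings are compared case-sensitively.
HashSet<string> caseSensitive = new(names);

// Custom comparer: "joe" and "Joe" are treated as the same value.
HashSet<string> caseInsensitive = new(names, StringComparer.OrdinalIgnoreCase);

Console.WriteLine(caseSensitive.Count);   // 4
Console.WriteLine(caseInsensitive.Count); // 2
```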

2. Use GroupBy

You can also use GroupBy to get unique values from a collection:

C#
List<int> numbers = new() { 1, 2, 2, 2, 5, 6 };
List<int> unique = numbers.GroupBy(x => x).Select(x => x.Key).ToList();
Console.WriteLine(string.Join(",", unique)); // 1,2,5,6

This approach is not very efficient when it comes to primitives, but it was a very popular way of getting unique objects from a list by key before the introduction of DistinctBy. For example:

C#
List<User> userList = new() { new User(1, "Joe"), new User(2, "Matt"), new User(1, "Joe") };

userList.GroupBy(u => u.Id).Select(g => g.First()).ToList().ForEach(Console.WriteLine);

// Prints:
// 1: Joe
// 2: Matt

3. Use a for loop

Using a for loop is my least favorite way of getting unique values from a list because it lacks many of the performance optimizations built into the .NET code, e.g. Distinct and HashSet.

However, it can be useful if you're going to iterate over your list anyway for something else.

C#
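List<int> numbers = new() { 1, 2, 2, 2, 5, 6 };
var result = new List<int>();
foreach (var value in numbers)
{
    // do something else with `value`

    // also build the list of unique values
    if (!result.Contains(value))
        result.Add(value);
}
Console.WriteLine(string.Join(",", result)); // 1,2,5,6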
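
If you do end up with a loop, a common tweak is to track the values you've already seen in a HashSet, so each membership check doesn't have to scan the whole result list. Here's a rough sketch of that variation. It isn't one of the methods benchmarked below, and the names numbers and seen are just for illustration.

```csharp
using System;
using System.Collections.Generic;

List<int> numbers = new() { 1, 2, 2, 2, 5, 6 };

var seen = new HashSet<int>();
var result = new List<int>();

foreach (var value in numbers)
{
    // HashSet.Add returns false when the value is already in the set,
    // so only first occurrences reach the result list.
    if (seen.Add(value))
        result.Add(value);
}

Console.WriteLine(string.Join(",", result)); // 1,2,5,6
```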

 

What's the fastest way to get unique values in a List?

I've used BenchmarkDotNet to test out the performance of all these methods:

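| Method      | Mean      | Error    | StdDev   |
|-------------|-----------|----------|----------|
| HashSet     | 16.24 us  | 0.597 us | 1.645 us |
| Distinct    | 17.90 us  | 0.357 us | 0.978 us |
| LinqGroupBy | 66.41 us  | 1.321 us | 3.768 us |
| ForLoop     | 179.73 us | 3.557 us | 7.883 us |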

Distinct and HashSet are the fastest ways to get unique values. The for loop is about 10 times slower, because List.Contains scans the result list on every iteration, so I'd definitely avoid that one.

Conclusion: Even though HashSet performed slightly better than Distinct, Distinct is my number one choice for getting unique values from a list. That's because it clearly communicates the intent.

Full test code:

C#
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;

public class UniqueValuesBenchmark
{
    private readonly List<int> _data;

    public UniqueValuesBenchmark()
    {
        _data = Enumerable.Range(0, 1000).ToList();
    }

    [Benchmark]
    public List<int> HashSet() => new HashSet<int>(_data).ToList();

    [Benchmark]
    public List<int> Distinct() => _data.Distinct().ToList();

    [Benchmark]
    public List<int> LinqGroupBy() => _data.GroupBy(x => x).Select(x => x.Key).ToList();

    [Benchmark]
    public List<int> ForLoop()
    {
        var result = new List<int>();
        foreach (var value in _data)
        {
            if (!result.Contains(value))
                result.Add(value);
        }

        return result;
    }
}