Use a SortedSet to avoid duplicates and sort items | Code4IT


Using the right data structure is crucial to building robust and efficient applications. So, why use a List or a HashSet to sort items (and remove duplicates) when you have a SortedSet?

As you probably know, you can create collections of items without duplicates by using a HashSet<T> object.

It is quite useful to remove duplicates from a list of items of the same type.

How can we ensure that we always have sorted items? The answer is simple: SortedSet<T>!

HashSet: a collection without duplicates

A simple HashSet creates a collection of unordered items without duplicates.

This example

var hashSet = new HashSet<string>();
hashSet.Add("Turin");
hashSet.Add("Naples");
hashSet.Add("Rome");
hashSet.Add("Bari");
hashSet.Add("Rome");
hashSet.Add("Turin");


var resultHashSet = string.Join(',', hashSet);
Console.WriteLine(resultHashSet);

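prints this string: Turin,Naples,Rome,Bari. The duplicates are gone. In this run the items come out in insertion order, but HashSet<T> does not guarantee any ordering, so you should not rely on it.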
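To see the duplicate handling in isolation, here is a minimal sketch (the variable names are only illustrative, and it assumes C# 9+ top-level statements). It relies on the fact that Add returns a bool telling you whether the item was actually inserted:

```csharp
using System;
using System.Collections.Generic;

var cities = new HashSet<string>();

// Add returns true when the item is new, false when it is already in the set.
bool firstRome = cities.Add("Rome");
bool secondRome = cities.Add("Rome");

Console.WriteLine(firstRome);    // True
Console.WriteLine(secondRome);   // False
Console.WriteLine(cities.Count); // 1
```

The return value is handy when you need to react to duplicates (for example, to log them) instead of silently dropping them.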

SortedSet: a sorted collection without duplicates

To sort those items, we have two approaches.

You can simply sort the collection once you’ve finished adding items:

var hashSet = new HashSet<string>();
hashSet.Add("Turin");
hashSet.Add("Naples");
hashSet.Add("Rome");
hashSet.Add("Bari");
hashSet.Add("Rome");
hashSet.Add("Turin");

var items = hashSet.OrderBy(s => s);


var resultHashSet = string.Join(',', items);
Console.WriteLine(resultHashSet);

Or, even better, use the right data structure, a SortedSet<T>:

var sortedSet = new SortedSet<string>();

sortedSet.Add("Turin");
sortedSet.Add("Naples");
sortedSet.Add("Rome");
sortedSet.Add("Bari");
sortedSet.Add("Rome");
sortedSet.Add("Turin");


var resultSortedSet = string.Join(',', sortedSet);
Console.WriteLine(resultSortedSet);

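Both approaches print Bari,Naples,Rome,Turin. But the second one never needs a separate sorting step or an intermediate collection: a SortedSet<T> keeps its items ordered as you add them (each insertion costs O(log n)), which is especially convenient when the collection keeps changing and you need it sorted at all times.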
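If the items already live in another collection, you don't have to add them one by one. The following sketch (illustrative names, C# 9+ top-level statements assumed) builds the set from an array, adds one more city later, and reads the smallest and largest items through the Min and Max properties:

```csharp
using System;
using System.Collections.Generic;

var cities = new[] { "Turin", "Naples", "Rome", "Bari", "Rome", "Turin" };

// The constructor accepts any IEnumerable<T>; duplicates are dropped on the way in.
var sortedCities = new SortedSet<string>(cities);

// Items added later land in the right position: no re-sorting needed.
sortedCities.Add("Milan");

var joined = string.Join(',', sortedCities);
Console.WriteLine(joined);           // Bari,Milan,Naples,Rome,Turin
Console.WriteLine(sortedCities.Min); // Bari
Console.WriteLine(sortedCities.Max); // Turin
```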

Use custom sorting rules

What if we wanted to use a SortedSet with a custom object, like User?

public class User {
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public User(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

Of course, we can do that:

var set = new SortedSet<User>();

set.Add(new User("Davide", "Bellone"));
set.Add(new User("Scott", "Hanselman"));
set.Add(new User("Safia", "Abdalla"));
set.Add(new User("David", "Fowler"));
set.Add(new User("Maria", "Naggaga"));
set.Add(new User("Davide", "Bellone"));//DUPLICATE!

foreach (var user in set)
{
    Console.WriteLine($"{user.LastName} {user.FirstName}");
}

But we will get an error at runtime, as soon as the set has to compare two users: our class doesn’t know how to compare things!

That’s why we must update our User class so that it implements the IComparable interface:

public class User : IComparable
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public User(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

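    public int CompareTo(object obj)
    {
        // By convention, every instance sorts after null.
        if (obj == null) return 1;

        var other = (User)obj;

        // Sort by last name first; fall back to the first name on ties.
        var lastNameComparison = LastName.CompareTo(other.LastName);

        return (lastNameComparison != 0)
            ? lastNameComparison
            : FirstName.CompareTo(other.FirstName);
    }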
}

In this way, everything works as expected:

Abdalla Safia
Bellone Davide
Fowler David
Hanselman Scott
Naggaga Maria

Notice that the second Davide Bellone has disappeared since it was a duplicate.
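If you can't (or don't want to) change the User class, another option is to pass the sorting rule to the SortedSet<T> constructor as an IComparer<T>. The sketch below is only an illustration: the byFirstName comparer and its "first name, then last name" rule are hypothetical, and the User class is a trimmed stand-in that deliberately does not implement IComparable. It assumes C# 9+ top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rule: order by first name, then by last name (ordinal comparison).
var byFirstName = Comparer<User>.Create((a, b) =>
{
    var first = string.CompareOrdinal(a.FirstName, b.FirstName);
    return first != 0 ? first : string.CompareOrdinal(a.LastName, b.LastName);
});

// The comparer decides both the order and what counts as a duplicate.
var set = new SortedSet<User>(byFirstName)
{
    new User("Davide", "Bellone"),
    new User("Scott", "Hanselman"),
    new User("Safia", "Abdalla"),
    new User("David", "Fowler"),
    new User("Davide", "Bellone"), // compares as equal to the first item: dropped
};

foreach (var user in set)
{
    Console.WriteLine($"{user.FirstName} {user.LastName}");
}
// David Fowler
// Davide Bellone
// Safia Abdalla
// Scott Hanselman

// Trimmed stand-in for the User class; note that it does not implement IComparable.
public class User
{
    public string FirstName { get; }
    public string LastName { get; }

    public User(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}
```

With this approach the same class can be sorted in different ways in different sets, because the rule lives in the set and not in the type.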

This article first appeared on Code4IT

Wrapping up

Choosing the right data type is crucial for building robust and performant applications.

In this article, we’ve used a SortedSet to keep the items of a collection sorted and free of duplicates.

I’ve never used it in a project. So, how did I know that? I just explored the libraries I was using!

From time to time, spend some minutes reading the documentation, have a glimpse of the most common libraries, and so on: you’ll find lots of stuff that you’ve never thought existed!

Toy with your code! Explore it. Be curious.

And have fun!

🐧


