Say that you have an array of N items and you need to access an element counting from the end of the collection.
Usually, we tend to use the Length property of the array, and then subtract the number corresponding to the position we want to reach:
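As a quick sketch (the values array and its content here are made up for illustration), the two equivalent ways to read the same element look like this:

```cs
string[] values = ["alfa", "bravo", "charlie", "delta", "echo"];

// Classic approach: subtract the offset from the array length
string thirdFromEnd = values[values.Length - 3]; // "charlie"

// Index-from-end operator: ^N counts N positions from the end (^1 is the last item)
string sameValue = values[^3]; // "charlie"
```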
Yes, that’s just syntactic sugar, but it can help make your code more readable. In fact, the IL code generated by both examples is perfectly identical. IL is quite difficult to read and understand, but you can verify that both syntaxes are equivalent by looking at the decompiled C# code.
Performance is not affected by this operator, so it’s just a matter of readability.
Clearly, you still have to take care of array bounds: if you access values[^55] on an array with fewer than 55 items, you’ll get an IndexOutOfRangeException.
Using ^ is a nice trick that many C# developers don’t know. There are some special characters that can help us but are often not used. Like the @ operator!
You have a collection of items. You want to retrieve N elements randomly. Which alternatives do we have?
One of the most common operations when dealing with collections of items is to retrieve a subset of these elements taken randomly.
Before .NET 8, the most common way to retrieve random items was to order the collection using a random value and then take the first N items of the now sorted collection.
From .NET 8 on, we have a new method in the Random class: GetItems.
So, should we use this method or stick to the previous version? Are there other alternatives?
For the sake of this article, I created a simple record type, CustomRecord, which just contains two properties.
```cs
public record CustomRecord(int Id, string Name);
```
I then stored a collection of such elements in an array. This article’s final goal is to find the best way to retrieve a random subset of such items. Spoiler alert: it all depends on your definition of best!
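For reference, a minimal way to build such a collection could look like this sketch (the item count and the naming pattern are arbitrary choices of mine):

```cs
// Generate 100 CustomRecord instances with incremental Ids
CustomRecord[] Items = Enumerable
    .Range(0, 100)
    .Select(i => new CustomRecord(i, $"Name-{i}"))
    .ToArray();
```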
Method #1: get random items with Random.GetItems
Starting from .NET 8, released in 2023, we now have a new method belonging to the Random class: GetItems.
There are three overloads:
```cs
public T[] GetItems<T>(T[] choices, int length);
public T[] GetItems<T>(ReadOnlySpan<T> choices, int length);
public void GetItems<T>(ReadOnlySpan<T> choices, Span<T> destination);
```
We will focus on the first overload, which accepts an array of items (choices) as input and returns an array of size length.
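A minimal usage sketch, assuming we want 5 random items out of the Items array (the count is arbitrary):

```cs
// Returns a new array with 5 items chosen at random from Items
CustomRecord[] randomItems = Random.Shared.GetItems(Items, 5);
```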
Method #2: shuffle the array, then take N elements

Another option is to shuffle the whole collection in place using Random.Shuffle and then take the first N items.

If you need to preserve the initial order of the items, you should create a copy of the initial array and shuffle only the copy. You can do this by using the following syntax:
```cs
CustomRecord[] copy = [.. Items];
```
If you just need some random items and don’t care about the initial array, you can shuffle it without making a copy.
Once we’ve shuffled the array, we can pick the first N items to get a subset of random elements.
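Putting the pieces together, a sketch of this method could look like this (again, 5 is an arbitrary count):

```cs
// Shuffle a copy so the original Items array keeps its order
CustomRecord[] copy = [.. Items];
Random.Shared.Shuffle(copy);

// The first N items of the shuffled copy are a random subset
CustomRecord[] randomItems = copy.Take(5).ToArray();
```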
Method #3: order by Guid, then take N elements
Before .NET 8, one of the most used approaches was to order the whole collection by a random value, usually a newly generated Guid, and then take the first N items.
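A sketch of this approach with plain LINQ (the count of 5 is, again, arbitrary):

```cs
CustomRecord[] randomItems = Items
    .OrderBy(_ => Guid.NewGuid()) // a brand-new Guid per item acts as the random sort key
    .Take(5)
    .ToArray();
```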
This approach works fine but has the disadvantage that it instantiates a new Guid value for every item in the collection, which is an expensive operation, memory-wise.
Method #4: order by Number, then take N elements
Another approach was to generate a random number for each item, use it as a discriminator to order the collection, and then, again, take the first N items.
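The sketch is almost identical to the previous one, just with a random integer instead of a Guid as the sort key:

```cs
CustomRecord[] randomItems = Items
    .OrderBy(_ => Random.Shared.Next()) // one random int per item as the discriminator
    .Take(5)
    .ToArray();
```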
Benchmarking the solutions

We are going to run the benchmarks on arrays of different sizes: we start with a small array with 100 items and move up to a bigger one with one million items.
We generate the initial array of CustomRecord instances for every iteration and store it in the Items property. Then, we randomly choose the number of items to get from the Items array and store it in the TotalItemsToBeRetrieved property.
We also generate a copy of the initial array at every iteration; this way, we can run Random.Shuffle without modifying the original array.
Finally, we define the body of the benchmarks using the implementations we saw before.
Notice: I marked the benchmark for the GetItems method as a baseline, using [Benchmark(Baseline = true)]. This way, we can easily see the results ratio for the other methods compared to this specific method.
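Putting it all together, the benchmark class looks roughly like this sketch. The Items and TotalItemsToBeRetrieved members, the array sizes, and the five benchmark names come from the article and the results table; everything else (the class name, the setup details, the naming pattern) is an assumption of mine:

```cs
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;

public class RandomItemsBenchmark
{
    [Params(100, 10_000, 1_000_000)]
    public int Size;

    public CustomRecord[] Items;
    private CustomRecord[] _copy;
    public int TotalItemsToBeRetrieved;

    [IterationSetup]
    public void Setup()
    {
        // Regenerate the source array and a fresh copy at every iteration
        Items = Enumerable.Range(0, Size)
            .Select(i => new CustomRecord(i, $"Name-{i}"))
            .ToArray();
        _copy = [.. Items];

        // Randomly choose how many items to retrieve
        TotalItemsToBeRetrieved = Random.Shared.Next(Size);
    }

    [Benchmark(Baseline = true)]
    public CustomRecord[] WithRandomGetItems() =>
        Random.Shared.GetItems(Items, TotalItemsToBeRetrieved);

    [Benchmark]
    public CustomRecord[] WithRandomGuid() =>
        Items.OrderBy(_ => Guid.NewGuid()).Take(TotalItemsToBeRetrieved).ToArray();

    [Benchmark]
    public CustomRecord[] WithRandomNumber() =>
        Items.OrderBy(_ => Random.Shared.Next()).Take(TotalItemsToBeRetrieved).ToArray();

    [Benchmark]
    public CustomRecord[] WithShuffle()
    {
        // Clone inside the benchmark, so the copy's allocation is measured too
        CustomRecord[] copy = [.. Items];
        Random.Shared.Shuffle(copy);
        return copy.Take(TotalItemsToBeRetrieved).ToArray();
    }

    [Benchmark]
    public CustomRecord[] WithShuffleNoCopy()
    {
        // Shuffle the pre-made copy in place: no extra clone inside the benchmark
        Random.Shared.Shuffle(_copy);
        return _copy.Take(TotalItemsToBeRetrieved).ToArray();
    }
}
```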
When we run the benchmark, we can see this final result (for simplicity, I removed the Error, StdDev, and Median columns):
| Method | Size | Mean | Ratio | Allocated | Alloc Ratio |
| --- | --- | ---: | ---: | ---: | ---: |
| WithRandomGetItems | 100 | 6.442 us | 1.00 | 424 B | 1.00 |
| WithRandomGuid | 100 | 39.481 us | 6.64 | 3576 B | 8.43 |
| WithRandomNumber | 100 | 22.219 us | 3.67 | 2256 B | 5.32 |
| WithShuffle | 100 | 7.038 us | 1.16 | 1464 B | 3.45 |
| WithShuffleNoCopy | 100 | 4.254 us | 0.73 | 624 B | 1.47 |
| WithRandomGetItems | 10000 | 58.401 us | 1.00 | 5152 B | 1.00 |
| WithRandomGuid | 10000 | 2,369.693 us | 65.73 | 305072 B | 59.21 |
| WithRandomNumber | 10000 | 1,828.325 us | 56.47 | 217680 B | 42.25 |
| WithShuffle | 10000 | 180.978 us | 4.74 | 84312 B | 16.36 |
| WithShuffleNoCopy | 10000 | 156.607 us | 4.41 | 3472 B | 0.67 |
| WithRandomGetItems | 1000000 | 15,069.781 us | 1.00 | 4391616 B | 1.00 |
| WithRandomGuid | 1000000 | 319,088.446 us | 42.79 | 29434720 B | 6.70 |
| WithRandomNumber | 1000000 | 166,111.193 us | 22.90 | 21512408 B | 4.90 |
| WithShuffle | 1000000 | 48,533.527 us | 6.44 | 11575304 B | 2.64 |
| WithShuffleNoCopy | 1000000 | 37,166.068 us | 4.57 | 6881080 B | 1.57 |
By looking at the numbers, we can notice that:
- GetItems is the most performant method on medium and large arrays, both for execution time and memory allocation;
- using Guid.NewGuid is the worst approach: it’s 7 to 66 times slower than GetItems, and it allocates 7 to 59 times more memory;
- sorting by a random number is a bit better: it’s 4 to 56 times slower than GetItems, and it allocates 5 to 42 times more memory;
- shuffling the array in place and taking the first N elements is about 4.5 times slower than GetItems on the bigger arrays (it’s actually faster on the 100-item one); if you also have to preserve the original array, you lose some additional time and memory because you must allocate the cloned array.
Here’s the chart with the performance values. Notice that, for better readability, I used a Log10 scale.
If we move our focus to the array with one million items, we can better understand the impact of choosing one approach over the other. Notice that here I used a linear scale, since the values are within the same order of magnitude.
The purple line represents the memory allocation in bytes.
So, should we use GetItems all over the place? Well, no! Let me tell you why.
The problem with Random.GetItems: repeated elements
There’s a huge problem with the GetItems method: it picks items with replacement, so the returned array can contain duplicates. If you need to get N items without duplicates, GetItems is not the right choice.
Here’s how you can demonstrate it.
First, create an array of 100 distinct items. Then, using Random.Shared.GetItems, retrieve 100 items.
The final array will have 100 items; the array may or may not contain duplicates.
```cs
int[] source = Enumerable.Range(0, 100).ToArray(); // 100 distinct values

StringBuilder sb = new StringBuilder();

for (int i = 1; i <= 200; i++)
{
    // GetItems returns 100 items; the HashSet keeps only the distinct ones
    HashSet<int> ints = Random.Shared.GetItems(source, 100).ToHashSet();
    sb.AppendLine($"run-{i}, {ints.Count}");
}

var finalCsv = sb.ToString();
```
To check the number of distinct elements, I put the resulting array in a HashSet<int>. The final size of the HashSet will give us the exact percentage of unique values.
If the HashSet size is exactly 100, it means that GetItems retrieved each element from the original array exactly once.
For simplicity, I formatted the result in CSV format so that I could generate plots with it.
Plotting the resulting CSV shows that, on average, we get 65% unique items and 35% duplicate items.
Further readings
I used the Enumerable.Range method to generate the initial items.
I wrote an article that explains how to use it, what to consider when using it, and more.