برچسب: size

  • Initialize lists size to improve performance | Code4IT

    Initialize lists size to improve performance | Code4IT


    Lists have an inner capacity. Every time you add more items than the current Capacity, you add performance overhead. How to prevent it?

    Table of Contents

    Just a second! 🫷
    If you are here, it means that you are a software developer.
    So, you know that storage, networking, and domain management have a cost .

    If you want to support this blog, please ensure that you have disabled the adblocker for this site.
    I configured Google AdSense to show as few ADS as possible – I don’t want to bother you with lots of ads, but I still need to add some to pay for the resources for my site.

    Thank you for your understanding.
    Davide

    Some collections, like List<T>, have a predefined initial size.

    Every time you add a new item to the collection, there are two scenarios:

    1. the collection has free space, allocated but not yet populated, so adding an item is immediate;
    2. the collection is already full: internally, .NET resizes the collection, so that the next time you add a new item, we fall back to option #1.

    Clearly, the second approach has an impact on the overall performance. Can we prove it?

    Here’s a benchmark that you can run using BenchmarkDotNet:

    [Params(2, 100, 1000, 10000, 100_000)]
    public int Size;
    
    [Benchmark]
    public void SizeDefined()
    {
        int itemsCount = Size;
    
        List<int> set = new List<int>(itemsCount);
        foreach (var i in Enumerable.Range(0, itemsCount))
        {
            set.Add(i);
        }
    }
    
    [Benchmark]
    public void SizeNotDefined()
    {
        int itemsCount = Size;
    
        List<int> set = new List<int>();
        foreach (var i in Enumerable.Range(0, itemsCount))
        {
            set.Add(i);
        }
    }
    

    Those two methods are almost identical: the only difference is that in one method we specify the initial size of the list: new List<int>(itemsCount).

    Have a look at the result of the benchmark run with .NET 7:

    Method Size Mean Error StdDev Median Gen0 Gen1 Gen2 Allocated
    SizeDefined 2 49.50 ns 1.039 ns 1.678 ns 49.14 ns 0.0248 104 B
    SizeNotDefined 2 63.66 ns 3.016 ns 8.507 ns 61.99 ns 0.0268 112 B
    SizeDefined 100 798.44 ns 15.259 ns 32.847 ns 790.23 ns 0.1183 496 B
    SizeNotDefined 100 1,057.29 ns 42.100 ns 121.469 ns 1,056.42 ns 0.2918 1224 B
    SizeDefined 1000 9,180.34 ns 496.521 ns 1,400.446 ns 8,965.82 ns 0.9766 4096 B
    SizeNotDefined 1000 9,720.66 ns 406.184 ns 1,184.857 ns 9,401.37 ns 2.0142 8464 B
    SizeDefined 10000 104,645.87 ns 7,636.303 ns 22,395.954 ns 99,032.68 ns 9.5215 1.0986 40096 B
    SizeNotDefined 10000 95,192.82 ns 4,341.040 ns 12,524.893 ns 92,824.50 ns 31.2500 131440 B
    SizeDefined 100000 1,416,074.69 ns 55,800.034 ns 162,771.317 ns 1,402,166.02 ns 123.0469 123.0469 123.0469 400300 B
    SizeNotDefined 100000 1,705,672.83 ns 67,032.839 ns 186,860.763 ns 1,621,602.73 ns 285.1563 285.1563 285.1563 1049485 B

    Notice that, in general, they execute in a similar amount of time; for instance when running the same method with 100000 items, we have the same magnitude of time execution: 1,416,074.69 ns vs 1,705,672.83 ns.

    The huge difference is with the allocated space: 400,300 B vs 1,049,485 B. Almost 2.5 times better!

    Ok, it works. Next question: How can we check a List capacity?

    We’ve just learned that capacity impacts the performance of a List.

    How can you try it live? Easy: have a look at the Capacity property!

    List<int> myList = new List<int>();
    
    foreach (var element in Enumerable.Range(0,50))
    {
        myList.Add(element);
        Console.WriteLine($"Items count: {myList.Count} - List capacity: {myList.Capacity}");
    }
    

    If you run this method, you’ll see this output:

    Items count: 1 - List capacity: 4
    Items count: 2 - List capacity: 4
    Items count: 3 - List capacity: 4
    Items count: 4 - List capacity: 4
    Items count: 5 - List capacity: 8
    Items count: 6 - List capacity: 8
    Items count: 7 - List capacity: 8
    Items count: 8 - List capacity: 8
    Items count: 9 - List capacity: 16
    Items count: 10 - List capacity: 16
    Items count: 11 - List capacity: 16
    Items count: 12 - List capacity: 16
    Items count: 13 - List capacity: 16
    Items count: 14 - List capacity: 16
    Items count: 15 - List capacity: 16
    Items count: 16 - List capacity: 16
    Items count: 17 - List capacity: 32
    Items count: 18 - List capacity: 32
    Items count: 19 - List capacity: 32
    Items count: 20 - List capacity: 32
    Items count: 21 - List capacity: 32
    Items count: 22 - List capacity: 32
    Items count: 23 - List capacity: 32
    Items count: 24 - List capacity: 32
    Items count: 25 - List capacity: 32
    Items count: 26 - List capacity: 32
    Items count: 27 - List capacity: 32
    Items count: 28 - List capacity: 32
    Items count: 29 - List capacity: 32
    Items count: 30 - List capacity: 32
    Items count: 31 - List capacity: 32
    Items count: 32 - List capacity: 32
    Items count: 33 - List capacity: 64
    Items count: 34 - List capacity: 64
    Items count: 35 - List capacity: 64
    Items count: 36 - List capacity: 64
    Items count: 37 - List capacity: 64
    Items count: 38 - List capacity: 64
    Items count: 39 - List capacity: 64
    Items count: 40 - List capacity: 64
    Items count: 41 - List capacity: 64
    Items count: 42 - List capacity: 64
    Items count: 43 - List capacity: 64
    Items count: 44 - List capacity: 64
    Items count: 45 - List capacity: 64
    Items count: 46 - List capacity: 64
    Items count: 47 - List capacity: 64
    Items count: 48 - List capacity: 64
    Items count: 49 - List capacity: 64
    Items count: 50 - List capacity: 64
    

    So, as you can see, List capacity is doubled every time the current capacity is not enough.

    Further readings

    To populate the lists in our Benchmarks we used Enumerable.Range. Do you know how it works? Have a look at this C# tip:

    🔗 C# Tip: LINQ’s Enumerable.Range to generate a sequence of consecutive numbers

    This article first appeared on Code4IT 🐧

    Wrapping up

    In this article, we’ve learned that just a minimal change can impact our application performance.

    We simply used a different constructor, but the difference is astounding. Clearly, this trick works only if already know the final length of the list (or, at least, an estimation). The more precise, the better!

    I hope you enjoyed this article! Let’s keep in touch on Twitter or on LinkedIn, if you want! 🤜🤛

    Happy coding!

    🐧





    Source link

  • Improve memory allocation by initializing collection size | Code4IT


    Sometimes just a minor change can affect performance. Here’s a simple trick: initialize your collections by specifying the initial size!

    Table of Contents

    Just a second! 🫷
    If you are here, it means that you are a software developer.
    So, you know that storage, networking, and domain management have a cost .

    If you want to support this blog, please ensure that you have disabled the adblocker for this site.
    I configured Google AdSense to show as few ADS as possible – I don’t want to bother you with lots of ads, but I still need to add some to pay for the resources for my site.

    Thank you for your understanding.
    Davide

    When you initialize a collection, like a List, you create it with the default size.

    Whenever you add an item to a collection, .NET checks that there is enough capacity to hold the new item. If not, it resizes the collection by doubling the inner capacity.

    Resizing the collection takes time and memory.

    Therefore, when possible, you should initialize the collection with the expected number of items it will contain.

    Initialize a List

    In the case of a List, you can simply replace new List<T>() with new List<T>(size). By specifying the initial size in the constructor’s parameters, you’ll have a good performance improvement.

    Let’s create a benchmark using BenchmarkDotNet and .NET 8.0.100-rc.1.23455.8 (at the time of writing, .NET 8 is still in preview. However, we can get an idea of the average performance).

    The benchmark is pretty simple:

    [MemoryDiagnoser]
    public class CollectionWithSizeInitializationBenchmarks
    {
        [Params(100, 1000, 10000, 100000)]
        public int Size;
    
        [Benchmark]
        public void WithoutInitialization()
        {
            List<int> list = new List<int>();
    
            for (int i = 0; i < Size; i++)
            {
    
                list.Add(i);
            }
        }
    
        [Benchmark(Baseline = true)]
        public void WithInitialization()
        {
            List<int> list = new List<int>(Size);
    
            for (int i = 0; i < Size; i++)
            {
                list.Add(i);
            }
        }
    }
    

    The only difference is in the list initialization: in the WithInitialization, we have List<int> list = new List<int>(Size);.

    Have a look at the benchmark result, split by time and memory execution.

    Starting with the execution time, we can see that without list initialization, we have an average 1.7x performance degradation.

    Method Size Mean Ratio
    WithoutInitialization 100 299.659 ns 1.77
    WithInitialization 100 169.121 ns 1.00
    WithoutInitialization 1000 1,549.343 ns 1.58
    WithInitialization 1000 944.862 ns 1.00
    WithoutInitialization 10000 16,307.082 ns 1.80
    WithInitialization 10000 9,035.945 ns 1.00
    WithoutInitialization 100000 388,089.153 ns 1.73
    WithInitialization 100000 227,040.318 ns 1.00

    If we talk about memory allocation, we waste an overage of 2.5x memory if compared to collections with size initialized.

    Method Size Allocated Alloc Ratio
    WithoutInitialization 100 1184 B 2.60
    WithInitialization 100 456 B 1.00
    WithoutInitialization 1000 8424 B 2.08
    WithInitialization 1000 4056 B 1.00
    WithoutInitialization 10000 131400 B 3.28
    WithInitialization 10000 40056 B 1.00
    WithoutInitialization 100000 1049072 B 2.62
    WithInitialization 100000 400098 B 1.00

    Initialize an HashSet

    Similar to what we’ve done with List’s, we can see significant improvements when initializing correctly other data types, such as HashSet’s.

    Let’s run the same benchmarks, but this time, let’s initialize a HashSet<int> instead of a List<int>.

    The code is pretty similar:

     [Benchmark]
     public void WithoutInitialization()
     {
         var set = new HashSet<int>();
    
         for (int i = 0; i < Size; i++)
         {
             set.Add(i);
         }
     }
    
     [Benchmark(Baseline = true)]
     public void WithInitialization()
     {
         var set = new HashSet<int>(Size);
    
         for (int i = 0; i < Size; i++)
         {
             set.Add(i);
         }
     }
    

    What can we say about performance improvements?

    If we talk about execution time, we can see an average of 2x improvements.

    Method Size Mean Ratio
    WithoutInitialization 100 1,122.2 ns 2.02
    WithInitialization 100 558.4 ns 1.00
    WithoutInitialization 1000 12,215.6 ns 2.74
    WithInitialization 1000 4,478.4 ns 1.00
    WithoutInitialization 10000 148,603.7 ns 1.90
    WithInitialization 10000 78,293.3 ns 1.00
    WithoutInitialization 100000 1,511,011.6 ns 1.96
    WithInitialization 100000 810,657.8 ns 1.00

    If we look at memory allocation, if we don’t initialize the HashSet, we are slowing down the application by a factor of 3x. Impressive!

    Method Size Allocated Alloc Ratio
    WithoutInitialization 100 5.86 KB 3.28
    WithInitialization 100 1.79 KB 1.00
    WithoutInitialization 1000 57.29 KB 3.30
    WithInitialization 1000 17.35 KB 1.00
    WithoutInitialization 10000 526.03 KB 3.33
    WithInitialization 10000 157.99 KB 1.00
    WithoutInitialization 100000 4717.4 KB 2.78
    WithInitialization 100000 1697.64 KB 1.00

    Wrapping up

    Do you need other good reasons to initialize your collection capacity when possible? 😉

    I used BenchmarkDotNet to create these benchmarks. If you want an introduction to this tool, you can have a look at how I used it to measure the performance of Enums:

    🔗 Enum.HasFlag performance with BenchmarkDotNet | Code4IT

    I hope you enjoyed this article! Let’s keep in touch on Twitter or LinkedIn! 🤜🤛

    Happy coding!

    🐧





    Source link