A common scenario in web applications is that multiple threads (i.e. concurrent page requests) try to access the same external resource at the same time. For optimal performance, threads that need the same data should trigger only one call to the resource, i.e. the first thread should call the resource while all subsequent threads just wait for the result instead of calling the resource over and over. With .NET Core, this kind of concurrent access to external resources can be achieved very easily.
Concurrent dictionary and Lazy: Working together
Enter a concurrent dictionary with lazy tasks of T.
private static readonly ConcurrentDictionary<string, Lazy<Task<string>>> ExternalResources =
    new ConcurrentDictionary<string, Lazy<Task<string>>>(StringComparer.OrdinalIgnoreCase);
The dictionary will mainly be used by its GetOrAdd method.
The first thread that requires data from the external resource will call this method and end up adding a new instance of Lazy<>. The task is created and started when the Lazy's Value property is read for the first time.
All subsequent threads that need the same data call GetOrAdd with the same key and end up getting the same Lazy<>, which in turn returns the same instance of Task<>. Awaiting the task waits if it is still running, but returns the result immediately if the task has already completed.
All threads needing the same data end up waiting for the same instance of Task<>.
IHttpClientFactory httpClientFactory = null; // TODO: Get IHttpClientFactory (via DI).
var url = "https://my.externalresource.net/0123";

// GetOrAdd returns the Lazy<> stored for this URL; reading Value starts the task only once.
var task = ExternalResources.GetOrAdd(url, key => new Lazy<Task<string>>(async () =>
{
    using var httpClient = httpClientFactory.CreateClient();
    var response = await httpClient.GetAsync(key);
    var content = await response.Content.ReadAsStringAsync();
    return content;
})).Value;

// Every caller awaits the same task instance.
var data = await task;
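To see the effect without a real HTTP endpoint, the pattern can be tried in a small console program. The following is a minimal sketch under a few assumptions: FetchAsync is a hypothetical stand-in for the HTTP call (it just waits 100 ms), and the names SharedFetchDemo, GetAsync and RunAsync are made up for this illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SharedFetchDemo
{
    private static readonly ConcurrentDictionary<string, Lazy<Task<string>>> ExternalResources =
        new ConcurrentDictionary<string, Lazy<Task<string>>>(StringComparer.OrdinalIgnoreCase);

    // Counts how often the simulated external resource is actually called.
    private static int _resourceCalls;

    // Hypothetical stand-in for the HTTP call: waits briefly, then returns a payload.
    private static async Task<string> FetchAsync(string key)
    {
        Interlocked.Increment(ref _resourceCalls);
        await Task.Delay(100);
        return "content of " + key;
    }

    // Same pattern as above: all callers for one URL share a single task.
    public static Task<string> GetAsync(string url) =>
        ExternalResources.GetOrAdd(url, key => new Lazy<Task<string>>(() => FetchAsync(key))).Value;

    // Starts 20 concurrent callers for the same URL and returns the number of resource calls.
    public static async Task<int> RunAsync()
    {
        var url = "https://my.externalresource.net/0123";
        var callers = Enumerable.Range(0, 20).Select(_ => Task.Run(() => GetAsync(url)));
        string[] results = await Task.WhenAll(callers);
        Console.WriteLine($"callers: {results.Length}, resource calls: {_resourceCalls}");
        return _resourceCalls;
    }

    public static async Task Main()
    {
        await RunAsync();
    }
}
```

All 20 callers receive the same content, while the counter shows that the simulated resource was called only once.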
Why so Lazy?
But why the Lazy<>? Why not just use a ConcurrentDictionary<string, Task<string>>?
To put it simply, ConcurrentDictionary locks access to the internal dictionary, but not to the factory delegates passed to it (executing the delegate inside the lock would lock unknown code and potentially cause deadlocks).
If you call GetOrAdd simultaneously on different threads, valueFactory may be called multiple times, but only one key/value pair will be added to the dictionary.
https://docs.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentdictionary-2.getoradd
That means that when multiple threads need the same data simultaneously and call GetOrAdd, each thread's factory delegate executes if the key is not yet present. The first thread whose delegate finishes acquires the lock and adds its value to the dictionary. The other threads then find the key present and do not add their data. Still, the external resource has been called multiple times for the same data.
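The race is normally a matter of timing, but it can be forced for demonstration purposes. The sketch below assumes a plain ConcurrentDictionary<string, string> without Lazy<> and uses a Barrier to hold both threads inside the factory delegate until each has arrived. The names FactoryRaceDemo, CountFactoryCalls and Load are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class FactoryRaceDemo
{
    // Returns how many times the value factory ran for a single key.
    public static int CountFactoryCalls()
    {
        var dictionary = new ConcurrentDictionary<string, string>();
        var factoryCalls = 0;

        // The barrier keeps each thread inside the factory until both have arrived,
        // which forces the race that otherwise happens only occasionally.
        using var bothInsideFactory = new Barrier(2);

        // Stand-in for the expensive call to the external resource.
        string Load(string key)
        {
            Interlocked.Increment(ref factoryCalls);
            bothInsideFactory.SignalAndWait();
            return "data for " + key;
        }

        var first = new Thread(() => dictionary.GetOrAdd("0123", Load));
        var second = new Thread(() => dictionary.GetOrAdd("0123", Load));
        first.Start();
        second.Start();
        first.Join();
        second.Join();

        Console.WriteLine($"factory calls: {factoryCalls}, entries: {dictionary.Count}");
        return factoryCalls;
    }

    public static void Main() => CountFactoryCalls();
}
```

Because neither thread can leave the factory before the other has entered it, the factory runs twice even though only one entry ends up in the dictionary.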
When we add the Lazy<>, each thread's factory delegate only creates its own instance of Lazy<>, but does not execute the task. Only the first instance of Lazy<> will be added to the dictionary, and since no call to the external resource has been made yet, no expensive code has run.
The task only executes when Lazy's Value property is read. Since GetOrAdd returns the same single instance of Lazy<> to every thread, the external resource is only called once.
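The same forced race can be repeated with Lazy<> in place to check this reasoning. This is again only a sketch: it uses Lazy<string> instead of Lazy<Task<string>> to keep the example synchronous, and LazyRaceDemo, Run and CreateLazy are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class LazyRaceDemo
{
    // Returns the number of Lazy<> instances created and the number of resource calls.
    public static (int LazyCreated, int ResourceCalls) Run()
    {
        var dictionary = new ConcurrentDictionary<string, Lazy<string>>();
        var lazyCreated = 0;
        var resourceCalls = 0;

        // Holds both threads inside the factory so that each one creates its own Lazy<>.
        using var bothInsideFactory = new Barrier(2);

        Lazy<string> CreateLazy(string key)
        {
            Interlocked.Increment(ref lazyCreated);
            bothInsideFactory.SignalAndWait();
            // Creating the Lazy<> is cheap; the expensive part runs only when Value is read.
            return new Lazy<string>(() =>
            {
                Interlocked.Increment(ref resourceCalls);
                return "data for " + key;
            });
        }

        // Each thread reads Value from whichever Lazy<> GetOrAdd hands back.
        void Request()
        {
            _ = dictionary.GetOrAdd("0123", CreateLazy).Value;
        }

        var first = new Thread(Request);
        var second = new Thread(Request);
        first.Start();
        second.Start();
        first.Join();
        second.Join();

        Console.WriteLine($"Lazy instances created: {lazyCreated}, resource calls: {resourceCalls}");
        return (lazyCreated, resourceCalls);
    }

    public static void Main() => Run();
}
```

Two Lazy<> instances are created, but only the one stored in the dictionary is ever evaluated, so the simulated resource is called once. This relies on the default thread-safety mode of Lazy<>, which lets only one thread run the value factory.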
Now we have easily created concurrent access to external resources with .NET Core.