Resiliency Upgrade Is Now Easier Than Ever

I have been working on a small project whose components have some real-time interdependencies. Most of these dependencies exist to make sure that a request carrying an access token (JWT/OpenID) comes from a genuine user and that the user has the correct access rights. Although the token and the UserPrincipal carry enough information for most cases, I still have to do a real-time check for critical operations in some scenarios. In the sample architecture diagram, these dependencies are highlighted in yellow.

As you can see, the availability of the Identity Context matters to the other contexts for some important operations. With that in mind, I was looking through the latest docs and remembered this great article on improving resiliency:

https://docs.microsoft.com/en-us/dotnet/architecture/microservices/implement-resilient-applications/implement-http-call-retries-exponential-backoff-polly

To apply it, I didn’t need to change much and still got what was promised. Keep in mind that I follow an ApiClientLibrary convention in which all calls go through our famous friend HttpClient.

A sample API call in one of my ApiClientLibrary implementations looks like this:

public async Task<ApiResponse<UserResponse>> GetUserDetailsByEmailAsync(string email)
{
    var url = UserApiUri.UserDetailsByEmail(email);
    var result = await httpClient.GetAsync(url);

    if (result.IsSuccessStatusCode)
    {
        var response = await result.Content.ReadAsAsync<UserResponse>();
        return ApiResponse<UserResponse>.Success(response);
    }

    var resultContent = await result.Content.ReadAsStringAsync();
    return ApiResponse<UserResponse>.Failure(resultContent, result.StatusCode, result);
}

The call on line 4 of the snippet, await httpClient.GetAsync(url), can easily become a point of failure and drastically reduce your availability.

Packages Required

To increase availability, you can follow the steps below and get a nice retry policy for transient network errors.

Start by adding the following package references to your csproj:

<PackageReference Include="Microsoft.Extensions.Http" Version="2.2.0" />
<PackageReference Include="Microsoft.Extensions.Http.Polly" Version="2.2.0" />

Startup.cs Changes

In your Startup.cs, register your client with the AddHttpClient extension method. This method comes from the packages above and is pretty much where the magic starts. It registers your implementation as a typed client through the HttpClientFactory pattern, so whenever the client is resolved it receives an HttpClient backed by pooled message handlers. The constructor of my client looked something like this:

private readonly HttpClient _httpClient;

// HttpClient is injected by the factory once the typed client is registered via AddHttpClient.
public UserApiClient(HttpClient httpClient, IOptions<NavigationConfiguration> navigationConfigurationOptions)
{
    _httpClient = httpClient;
    // All relative request URLs are resolved against the configured API root.
    _httpClient.BaseAddress = new Uri(navigationConfigurationOptions.Value.ApiUrl);
}

The next step is to change the registration of IUserApiClient as below:

services.AddHttpClient<IUserApiClient, UserApiClient>()
        .SetHandlerLifetime(TimeSpan.FromMinutes(5))
        .AddPolicyHandler(GetRetryPolicy());


private static IAsyncPolicy<HttpResponseMessage> GetRetryPolicy()
{
    var jitterer = new Random();
    return HttpPolicyExtensions
        .HandleTransientHttpError()
        .OrResult(msg => msg.StatusCode == System.Net.HttpStatusCode.NotFound)
        .WaitAndRetryAsync(5, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)) + TimeSpan.FromMilliseconds(jitterer.Next(0, 100)));
}

Retry Policy & The Goal

In Polly’s own words and from the Microsoft Docs:

With Polly, you can define a Retry policy with the number of retries, the exponential backoff configuration and the actions to take when there’s an HTTP exception, such as logging the error. In this case, the policy is configured to retry 5 times with an exponential backoff, starting at two seconds. So it will try up to six times in total, and the wait between attempts grows exponentially, starting at two seconds.
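To make the arithmetic concrete, here is a small sketch that uses the same formula as the sleepDurationProvider in GetRetryPolicy, but leaves out the random jitter. The class and method names (BackoffSchedule, DelayFor) are hypothetical and exist only for this illustration; it relies on Polly passing retryAttempt starting at 1.

```csharp
using System;

public static class BackoffSchedule
{
    // Same formula as the policy's sleepDurationProvider, minus the jitter term.
    public static TimeSpan DelayFor(int retryAttempt) =>
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt));

    public static void Main()
    {
        const int retryCount = 5; // matches WaitAndRetryAsync(5, ...)
        var total = TimeSpan.Zero;

        // retryAttempt is 1-based, so the waits are 2, 4, 8, 16 and 32 seconds.
        for (var retryAttempt = 1; retryAttempt <= retryCount; retryAttempt++)
        {
            var delay = DelayFor(retryAttempt);
            total += delay;
            Console.WriteLine($"Retry {retryAttempt}: wait {delay.TotalSeconds} s");
        }

        // One initial attempt plus five retries: six attempts in total.
        Console.WriteLine($"Total wait before giving up: {total.TotalSeconds} s");
    }
}
```

In other words, a caller whose dependency stays down waits at least 62 seconds (plus jitter and the time each request takes) before the failure finally surfaces.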

A regular Retry policy can impact your system in cases of high concurrency and scalability and under high contention. To overcome peaks of similar retries coming from many clients in case of partial outages, a good workaround is to add a jitter strategy to the retry algorithm/policy. This can improve the overall performance of the end-to-end system by adding randomness to the exponential backoff.
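As a rough illustration of what the jitter does, the sketch below simulates a few clients computing their wait for the same retry attempt with the formula from GetRetryPolicy. The client count, the seeds and the JitterDemo class are made up for the example; they are not part of my project.

```csharp
using System;

public static class JitterDemo
{
    // Same formula as GetRetryPolicy: exponential backoff plus 0-99 ms of random jitter.
    public static TimeSpan DelayFor(Random jitterer, int retryAttempt) =>
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))
        + TimeSpan.FromMilliseconds(jitterer.Next(0, 100));

    public static void Main()
    {
        const int clientCount = 4;  // hypothetical number of clients hit by the same outage
        const int retryAttempt = 2; // all of them are about to make their second retry

        for (var client = 1; client <= clientCount; client++)
        {
            // Each simulated client owns its own Random, as separate processes would.
            // The seed is only there so that the simulated clients differ from each other.
            var jitterer = new Random(client);
            var delay = DelayFor(jitterer, retryAttempt);
            Console.WriteLine($"Client {client} waits {delay.TotalMilliseconds} ms");
        }
    }
}
```

Every delay lands between 4000 and 4099 ms, so with this formula the retries of different clients are spread over a window of only about 100 ms. If many clients tend to fail at the same moment, a wider jitter range may be worth considering.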

Testing

After these steps, I started my web app, intentionally left the identity API stopped, and hit a URL that requires the user details. As you can see from this screenshot, it retried 5 times and then finally failed. All of this happened without me changing a single line of code in my services or business logic, just by adjusting some registrations in Startup.cs and using extension methods.
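If you want to watch the retries happen yourself, one option is to pass an onRetry callback to WaitAndRetryAsync. This is a minimal sketch, assuming the Polly 6.x/7.x API that Microsoft.Extensions.Http.Polly pulls in; the RetryPolicies class, the GetLoggingRetryPolicy name and the console message are illustrative and not taken from my project.

```csharp
using System;
using System.Net;
using System.Net.Http;
using Polly;
using Polly.Extensions.Http;

public static class RetryPolicies
{
    public static IAsyncPolicy<HttpResponseMessage> GetLoggingRetryPolicy()
    {
        var jitterer = new Random();
        return HttpPolicyExtensions
            .HandleTransientHttpError()
            .OrResult(msg => msg.StatusCode == HttpStatusCode.NotFound)
            .WaitAndRetryAsync(
                5,
                retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))
                                + TimeSpan.FromMilliseconds(jitterer.Next(0, 100)),
                onRetry: (outcome, delay, retryAttempt, context) =>
                {
                    // outcome.Exception is set when the call threw (e.g. connection refused);
                    // otherwise outcome.Result holds the handled HTTP response.
                    var reason = outcome.Exception?.Message
                                 ?? $"HTTP {(int)outcome.Result.StatusCode}";
                    Console.WriteLine(
                        $"Retry {retryAttempt} in {delay.TotalSeconds:F1} s because: {reason}");
                });
    }
}
```

Passing GetLoggingRetryPolicy() to AddPolicyHandler instead of GetRetryPolicy() should then print one line per retry while the identity API is down. In a real application you would probably write to an ILogger rather than the console.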

Summary

With the rise of distributed systems and increasing pressure from users who demand more from businesses, these kinds of tricks are becoming more important. On the other hand, if you use an eventual consistency pattern or an event-driven architecture, they can become even handier and reduce code complexity a lot by standing on the shoulders of giants…

Life is short, .NET Core is awesome, Polly is great, Microservices are hard!