Description
So I'm noticing what looks like an anti-pattern: Azure client instances are never cached, and are instead re-materialized all the way from configuration every single time they're needed.
It starts with the AzureIAuthJanitorProviderExtensions::GetAzure method. This method uses the fluent Azure SDK to construct the base IAzure instance, which includes network calls to authenticate, retrieve subscriptions, and ultimately "bind" to the target Azure subscription. It also allocates a new RestClient instance which, if you dive deep enough, you'll find allocates a new HttpClientHandler each time. This is the root cause of the well-known HttpClient anti-pattern, which can (and ultimately will) result in TCP connection/socket exhaustion.
By itself, the current design of this method is probably as good as it can be, but it puts the onus on the caller (in this case, the various IAuthJanitorProvider implementations) to know they need to cache the resulting client instance... spoiler alert though: none of them do. 😞
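To make the "cache it once" idea concrete, here is a minimal sketch of one possible fix. Everything here is hypothetical (the class `AzureClientCache` and its shape are not in the codebase); it just shows the standard `ConcurrentDictionary` + `Lazy<Task<T>>` memoization pattern applied to the expensive IAzure construction:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using Microsoft.Azure.Management.Fluent;

// Hypothetical sketch, not the project's actual API: cache the authenticated
// IAzure per subscription so the authentication round-trips and the
// RestClient/HttpClientHandler allocation happen once instead of per call.
public static class AzureClientCache
{
    // Lazy<Task<IAzure>> ensures that even under concurrent first access,
    // only one caller pays the construction cost; everyone else awaits it.
    private static readonly ConcurrentDictionary<string, Lazy<Task<IAzure>>> _clients =
        new ConcurrentDictionary<string, Lazy<Task<IAzure>>>();

    public static Task<IAzure> GetAzureAsync(string subscriptionId, Func<Task<IAzure>> factory)
    {
        return _clients.GetOrAdd(
            subscriptionId,
            _ => new Lazy<Task<IAzure>>(factory)).Value;
    }
}
```

A real fix would also need to account for credential rotation/expiry (i.e., an eviction policy), but the core idea is that GetAzure() stops being a per-call factory.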
Now that we understand the implementation and its costs, let's dive into why it ends up being a problem. If you look at the various IAuthJanitorProvider implementations, you will find they constantly call the extension method via this.GetAzure() whenever they want to do something against their respective Azure resources. For example, here's one such usage from the AppSettingsWebAppApplicationLifecycleProvider class which illustrates one of the bigger problems with the current design:
Lines 62 to 86 in eb2c5e6
```csharp
private async Task ApplySecrets(string slotName, List<RegeneratedSecret> secrets)
{
    if (secrets.Count > 1 && secrets.Select(s => s.UserHint).Distinct().Count() != secrets.Count)
    {
        throw new Exception("Multiple secrets sent to Provider but without distinct UserHints!");
    }
    IUpdate<IDeploymentSlot> updateBase = (await GetDeploymentSlot(slotName)).Update();
    foreach (RegeneratedSecret secret in secrets)
    {
        var appSettingName = string.IsNullOrEmpty(secret.UserHint) ? Configuration.SettingName : $"{Configuration.SettingName}-{secret.UserHint}";
        _logger.LogInformation("Updating AppSetting '{0}' in slot '{1}' (as {2})", appSettingName, slotName,
            Configuration.CommitAsConnectionString ? "connection string" : "secret");
        updateBase = updateBase.WithAppSetting(appSettingName,
            Configuration.CommitAsConnectionString ? secret.NewConnectionStringOrKey : secret.NewSecretValue);
    }
    _logger.LogInformation("Applying changes.");
    await updateBase.ApplyAsync();
    _logger.LogInformation("Swapping to '{0}'", slotName);
    await (await GetWebApp()).SwapAsync(slotName);
}
```
In this method you first see a call to GetDeploymentSlot, which actually calls GetWebApp, which in turn calls GetAzure(). The method then later calls GetWebApp directly, which again calls GetAzure(). So within that one method call, you've gone through the full creation and configuration of two Azure clients, and the subsequent leaking of their underlying TCP connections. This is just one highly localized example of where the problem surfaces. Here's another one from WebAppApplicationLifecycleProvider::Test:
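One possible localized mitigation (again, purely a sketch; the field and method names below are illustrative and not in the codebase) is for each provider instance to memoize the IAzure handle it gets back, so repeated GetWebApp/GetDeploymentSlot calls within a workflow reuse one client:

```csharp
// Hypothetical provider-level memoization sketch: the first call pays the
// full GetAzure() cost; subsequent calls on this provider instance reuse
// the same Task<IAzure> (and therefore the same underlying RestClient).
private Task<IAzure> _cachedAzure;

private Task<IAzure> GetAzureCached()
{
    // Note: not thread-safe as written; a real version would use
    // Lazy<Task<IAzure>> or a lock if providers are called concurrently.
    return _cachedAzure ?? (_cachedAzure = this.GetAzure());
}
```

This doesn't fix the cross-provider or cross-workflow duplication, but it would at least collapse the multiple GetAzure() calls inside a single method like ApplySecrets above down to one.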
AuthJanitor/AuthJanitor.Providers.AppServices/WebApps/WebAppApplicationLifecycleProvider.cs
Lines 13 to 23 in eb2c5e6
```csharp
public override async Task Test()
{
    var sourceDeploymentSlot = await GetDeploymentSlot(SourceSlotName);
    if (sourceDeploymentSlot == null) throw new Exception("Source Deployment Slot is invalid");
    var temporaryDeploymentSlot = await GetDeploymentSlot(TemporarySlotName);
    if (temporaryDeploymentSlot == null) throw new Exception("Temporary Deployment Slot is invalid");
    var destinationDeploymentSlot = await GetDeploymentSlot(DestinationSlotName);
    if (destinationDeploymentSlot == null) throw new Exception("Destination Deployment Slot is invalid");
}
```
Here we see the GetDeploymentSlot method being called three distinct times in just this one method, each one triggering the full client construction described above.
But wait, it gets worse. 😫 If you follow the rekeying workflow of AuthJanitor, every provider is retrieved and goes through the following method call sequence:
Test -> GetSecretToUseDuringRekeying -> BeforeRekeying -> Rekey -> CommitNewSecrets -> AfterRekeying -> OnConsumingApplicationSwapped
Each one of those methods, if you dig into each provider implementation, usually makes one or more calls to GetAzure(). If you extrapolate out N providers calling GetAzure() M times per workflow, every T interval of time (right now 2 minutes), you can see how this becomes problematic pretty quickly.
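As a purely illustrative back-of-envelope calculation (the N and M values below are assumptions, not measurements from this codebase):

```csharp
// Illustrative only: pick plausible values for N, M and the 2-minute cadence
// mentioned above to see the scale of the churn.
const int providers = 10;          // N: providers in a workflow
const int getAzureCallsPerRun = 7; // M: roughly one per lifecycle step above
const int runsPerHour = 30;        // T = 2 minutes between workflow runs

int handlersPerDay = providers * getAzureCallsPerRun * runsPerHour * 24;
// = 10 * 7 * 30 * 24 = 50,400 new HttpClientHandler instances (and their
// sockets, auth round-trips, and subscription lookups) per day.
```

Even if the real N and M are smaller, the allocation count grows linearly with time, and sockets left in TIME_WAIT from discarded handlers are exactly how the HttpClient anti-pattern manifests as port exhaustion.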
There are several ways to solve this, but I would personally want to think about it more before recommending a solution. I just wanted to surface this problem so that we understand it exists right now and can begin to have discussions about it.