Saturday, May 30, 2026

AD PKI vs Cloud-based PKI


Public Key Infrastructure is one of those foundational technologies that most organizations installed once, pointed at Active Directory, and largely forgot about — until something breaks, a certificate expires at 2am on a Sunday, or an auditor starts asking uncomfortable questions about your root CA.

Now, with cloud-based PKI services maturing rapidly and Microsoft pushing hard toward cloud-native identity, the question of whether to keep running an on-premises AD-integrated PKI or migrate to a cloud-based alternative is coming up in more and more architecture conversations.

The honest answer is: it depends. But the factors it depends on are worth understanding clearly before you make the call.

In this post I am focusing on endpoint certificates for Wi-Fi authentication.

Over the years Microsoft has never put much effort into issuing certificates to Windows devices that are not integrated with AD. This is something you should use an MDM solution for, and that MDM solution can either use a cloud-based PKI or keep using the AD PKI. Other aspects, such as RADIUS authentication, require a separate discussion. On cost, all cloud-based solutions are priced by the number of users, while the AD PKI requires more or less only the server license.



What We Mean by "AD PKI"

When most organizations say they have a PKI, they mean Microsoft Active Directory Certificate Services (AD CS) — a Windows Server role that has been the default enterprise PKI for two decades. It integrates tightly with Active Directory, auto-enrolls certificates to domain-joined machines via Group Policy, and handles everything from user authentication certificates to machine identity, S/MIME email signing, Wi-Fi EAP-TLS, and internal TLS for servers and web applications.

It works well. It is deeply embedded. And for many organizations, it is also unmaintained, under-documented, and running on hardware that predates the current IT team.

Cloud-based PKI comes in several flavors: Microsoft Cloud PKI (part of the Microsoft Intune Suite, or available as an Intune add-on), Glueck Kanja SCEPman, SCEP and PKCS integrations with third-party services like DigiCert, Sectigo, or GlobalSign, and purpose-built cloud PKI platforms like Smallstep, Keyfactor, or Venafi. Each has a different architecture and a different value proposition, but they share a common premise: move certificate lifecycle management off your infrastructure and into a managed service.


The Case for Keeping AD PKI

It works — and you've already paid for it

AD CS is included in Windows Server. If you are running a hybrid Active Directory environment, you already have the licensing, the infrastructure, and the integration in place. For organizations with a significant on-premises footprint - domain-joined endpoints, on-premises servers, legacy applications - AD CS is the path of least resistance. 

Deep Group Policy integration

AD CS integrates with Group Policy in ways that cloud PKI services cannot fully replicate for legacy workloads. Certificate auto-enrollment via GPO is mature, well-understood, and effectively zero-touch for end users and devices once configured correctly. For Windows environments with a large base of domain-joined endpoints, this integration eliminates a significant category of operational overhead. GPO-based deployment stops working, however, once endpoints become cloud-only.

Air-gapped and high-security environments

For organizations with genuine air-gap requirements — classified government environments, operational technology networks, critical infrastructure — an on-premises PKI is not optional, it is a compliance requirement. A cloud-based PKI that requires internet connectivity for certificate issuance is architecturally incompatible with these environments regardless of how mature the service is.

Control and auditability

With AD CS you own every component: the root CA (ideally offline), the issuing CA, the certificate database, the CRL distribution points. For organizations with strict compliance requirements around cryptographic key custody (financial services, healthcare, federal contractors), that level of control has real value. You know exactly where your private keys are. You can demonstrate it to an auditor. In many cases this also means that Hardware Security Modules are required, which adds complexity and cost.


The Case for Moving to Cloud PKI

AD CS has an operational debt problem

The average enterprise AD CS deployment was designed and implemented years ago by someone who may no longer work there. Certificate templates have accumulated. The root CA may be running on aging hardware with no documented recovery procedure. CRL distribution points may reference internal hostnames that no longer exist. The private keys for the root CA may be stored on the CA server itself rather than in an HSM.

None of this is unusual. It is, in fact, typical. And it means the organization is carrying significant operational and security risk in infrastructure that nobody owns, nobody reviews, and nobody has tested recovering from a failure.

Cloud PKI eliminates most of that operational burden. The provider manages the underlying infrastructure, handles availability, and maintains the service. You configure certificate profiles and policies; they handle the rest.

Modern device management requires cloud PKI

If your endpoint management strategy has shifted to Microsoft Intune — managing Entra-joined devices, BYOD, or remote workers who are never on the corporate network — AD CS creates a fundamental problem. Domain-joined auto-enrollment via GPO only works for devices that are on the domain. Intune-managed devices that are Entra-joined (not hybrid-joined) cannot reach an on-premises CA for SCEP or PKCS certificate issuance without a Network Device Enrollment Service (NDES) proxy or equivalent.

That proxy adds complexity, becomes a single point of failure, and requires ongoing maintenance. Microsoft Cloud PKI, by contrast, is natively integrated with Intune — no proxy, no on-premises dependency, no VPN requirement. For organizations moving toward a cloud-first or zero-trust endpoint posture, this is a significant architectural simplification.

Certificate lifecycle visibility

One of the most common failure patterns in enterprise environments is the unexpected certificate expiration — a server goes down, a Wi-Fi authentication policy stops working, an internal application throws TLS errors. AD CS does not provide enterprise-wide certificate visibility out of the box. You need additional tooling (or manual PowerShell scripting) to build a picture of what certificates have been issued, to what, and when they expire.

Modern cloud PKI platforms are built around visibility as a core feature. Certificate inventory, expiry alerting, and revocation management are first-class capabilities rather than afterthoughts.
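To make the visibility gap concrete, here is a minimal sketch of the kind of expiry report that otherwise has to be scripted by hand. It assumes the issued-certificate inventory has already been exported into a list of records; the `IssuedCert` fields, the sample hostnames, and the 30-day window are illustrative assumptions, not an AD CS or cloud PKI API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IssuedCert:
    subject: str          # who or what the certificate was issued to
    template: str         # hypothetical template or profile name
    not_after: datetime   # expiry timestamp


def expiring_soon(certs, now, window_days=30):
    """Return certificates that expire within the window, soonest first.

    Already-expired certificates are included, since they also need action.
    """
    cutoff = now + timedelta(days=window_days)
    due = [c for c in certs if c.not_after <= cutoff]
    return sorted(due, key=lambda c: c.not_after)


# Illustrative inventory; in practice this would come from a CA export.
now = datetime(2026, 5, 30)
inventory = [
    IssuedCert("nps01.corp.example", "RADIUS Server", datetime(2026, 6, 10)),
    IssuedCert("LAPTOP-0042", "Wi-Fi Device", datetime(2027, 1, 15)),
    IssuedCert("intranet.corp.example", "Web Server", datetime(2026, 5, 1)),
]

report = expiring_soon(inventory, now)
for cert in report:
    days_left = (cert.not_after - now).days   # negative means already expired
    print(f"{cert.subject:25} {cert.template:15} {days_left:4d} days")
```

The point of the sketch is the shape of the data, not the script: what was issued, to what, and when it expires is exactly the inventory that has to exist somewhere before alerting is possible.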

Talent and knowledge transfer

AD CS expertise is narrowing. The pool of engineers who deeply understand certificate templates, OID structures, CDP and AIA extensions, and HSM integration is smaller than it was ten years ago and getting smaller. Cloud PKI services abstract away most of that complexity, which means more engineers on your team can own and operate the infrastructure without specialized PKI training.


The Middle Path: Hybrid PKI

For most organizations, the answer is not a binary choice. A practical approach that is gaining traction:

Keep the root CA on-premises, offline, and air-gapped. The root CA is the trust anchor for your entire PKI. It should be rarely used (only to sign issuing CA certificates), stored offline, and protected with the highest level of physical and cryptographic control your organization can provide. This part of the infrastructure rarely changes and benefits from the control that on-premises ownership provides.

Move issuing CA functions to the cloud. Microsoft Cloud PKI supports Bring Your Own CA (BYOCA) — you can subordinate a cloud issuing CA to your existing on-premises root, preserving your trust chain while offloading the operational complexity of certificate issuance and lifecycle management to a managed service.

Use cloud PKI for Intune-managed endpoints, AD CS for domain-joined legacy workloads. Run both in parallel during transition, with a defined roadmap for retiring the on-premises issuing CA as domain-joined workloads migrate to Entra join.


A Framework for the Decision

Factor | Favor AD CS | Favor Cloud PKI
Endpoint management | Primarily domain-joined via GPO | Intune-managed, Entra-joined, or BYOD
Network model | On-premises or hybrid | Cloud-first, zero-trust, remote-heavy
Compliance / key custody | Strict HSM and key control requirements | Standard enterprise compliance posture, AKV HSM backed
Air-gap requirements | Present (OT, classified, critical infra) | Not applicable
PKI operational maturity | Well-documented, actively maintained | Undocumented, aging, no clear owner
Team expertise | AD CS skills in-house | Generalist team, cloud-native preference
Legacy app dependencies | Significant (Kerberos PKINIT, S/MIME), data-at-rest encryption, contract signing | Minimal or already migrated, authentication only


AD PKI is not legacy technology that needs to be replaced on principle, but we would like to reduce our dependency on it as we work to minimize or remove AD. In the right environment (hybrid, well-maintained, with genuine on-premises requirements) it is still the right answer. The problem is not the technology; it is the organizational reality that most AD CS deployments are not well-maintained, and the expertise to maintain them correctly is becoming harder to find and retain.

A cloud-based PKI (SCEPman, Microsoft Cloud PKI, ...) is not a silver bullet. It introduces its own dependencies, its own vendor relationships, and its own constraints. But for organizations moving toward cloud-first endpoint management, a zero-trust network model, or simply trying to reduce the operational debt carried by aging infrastructure, it removes a category of complexity that has historically been a source of quiet, chronic risk.

The question worth asking is not "which is better?" It is: what does our PKI actually look like today, who owns it, and what would happen if we had to rebuild it from scratch tomorrow?

If you can answer all three confidently — keep what you have and invest in improving it. If you cannot — that is the case for change.


The Hidden Cost of a Username Change

 

There's a change request that looks routine on the surface. An employee gets married. A legal name change comes through HR. A company completes a domain consolidation after an acquisition. Someone just wants their email to finally match their preferred name.

The ticket says: Update UPN from jsmith@frontoso.com to john.smith@frontoso.com.

To the helpdesk, it looks like a two-minute task. To the identity engineer who understands what that UPN touches, it looks like a change management project.

I've been cataloguing the UPN change behavior of 65 SaaS applications (everything from Microsoft 365 workloads to physical access control systems), and the pattern I keep seeing is the same: the identity change is easy. The downstream consequences are not.


Why a UPN Change Is Never Just a UPN Change

In a modern enterprise, the User Principal Name is not just a login credential. It is the primary key that dozens of systems use to identify, provision, and authorize a user. Change it in one place, and you have not changed it everywhere — you have created a mismatch that each system will resolve in its own way.

Some systems handle it gracefully. They anchor user identity to an immutable internal object ID, accept the UPN change via SCIM, and update their records without any service disruption. These are the well-architected ones.

Others use the UPN or email address as the primary key in their user store. Change the UPN upstream, and the next time that user tries to log in, the application does not recognize them. It creates a new account. The old account — with its permissions, its content, its history — sits orphaned.

And then there is a third category that is in some ways the most dangerous: applications where the UPN change succeeds silently, but a downstream business process quietly stops working. Approval workflows that route to the old email. Certificates that still carry the old UPN in the Subject Alternative Name. SIP addresses that were set at provisioning time and never updated.

The user can log in. Everything appears fine. And then three weeks later someone asks why their Oracle approval workflow notifications stopped arriving, or why their Wi-Fi drops every time their device certificate is checked.


The Failure Modes, Categorized

After testing and documenting 65 applications, the failure modes cluster into a few distinct patterns.

1. The Duplicate Account Problem

This is the most common failure mode for applications that use JIT (Just-In-Time) provisioning. JIT creates a user account on first login by matching the incoming SAML assertion to an existing record — or creating a new one if no match is found. If the UPN changes, the assertion no longer matches, and a new account is created.

The old account is not deleted. It just sits there, with all of its permissions and content, attached to an identity the user can no longer access. Examples: Zendesk, ArcGIS Online, Cisco Webex (if SCIM attribute mapping is misconfigured), SendGrid.

ArcGIS Online is a particularly painful case. The internal username is derived from the SAML NameID at first login and is immutable after that. A UPN change means a new ArcGIS account. All content, group memberships, and licenses on the old account must be manually migrated — there is no automated path.
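The duplicate-account behavior described above can be sketched in a few lines. This is a toy model of an application that keys its user store on the SAML NameID, not any vendor's actual implementation; `Account`, `jit_login`, and the sample content are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    name_id: str
    content: list = field(default_factory=list)


def jit_login(store, name_id):
    """Return the account matching the assertion's NameID, creating one if absent."""
    if name_id not in store:
        store[name_id] = Account(name_id)   # JIT: no match means a new account
    return store[name_id]


store = {}  # application user store keyed by NameID (the fragile design)

# First login under the old UPN; the user then creates some content.
old = jit_login(store, "jsmith@frontoso.com")
old.content.append("quarterly-report")

# The UPN changes upstream; the next assertion carries the new value.
new = jit_login(store, "john.smith@frontoso.com")

print(len(store))                             # accounts that now exist for one person
print(new.content)                            # what the new account can see
print(store["jsmith@frontoso.com"].content)   # what stayed behind on the old account
```

Nothing in the login path raises an error, which is why this failure mode tends to be reported by the user rather than detected by monitoring.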

2. The Silent Business Process Failure

These applications can survive a UPN change without breaking authentication — but something downstream stops working in a way that is not immediately obvious.

Oracle Fusion ERP is the clearest example in my research. Oracle approval workflows send notification emails using the email address stored in the Oracle user record. If that email address does not match the user's actual primary email after a UPN change, approval notifications route to a dead address. Approvals time out. Business processes stall. Nobody gets an error message — the workflow just stops producing results.

Microsoft Teams has a similar pattern. The Teams identity survives the UPN change because it is anchored to the Entra ID objectId. But the SIP address — which was provisioned from the UPN at the time the Teams account was created — does not automatically update. Federated calls still route to the old SIP address. The user's outbound SIP identity mismatches. External meeting invites show the old address. This requires a manual PowerShell operation to correct: Set-CsUser -Identity <newUPN> -SipAddress sip:<newUPN>.

Microsoft Intune surfaces the most technically subtle failure. Device enrollment and policy sync survive the UPN change cleanly — Intune uses objectId as the anchor. But SCEP and PKCS certificates issued to enrolled devices contain the old UPN in the Subject Alternative Name field. Those certificates remain valid until expiry. Applications that validate the SAN against the current Entra UPN — Wi-Fi using EAP-TLS, VPN with certificate authentication — will fail after the UPN change because the certificate no longer matches the identity. New certificates are issued on the next policy sync cycle, but the window between UPN change and certificate renewal is a silent authentication failure waiting to happen.
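One way to size that window before the change is to list the unexpired certificates that would still carry the old UPN. The sketch below assumes the certificate inventory has already been exported into simple records; `DeviceCert`, `stale_san_certs`, and the device names are hypothetical, and no Intune or Graph API is being called.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DeviceCert:
    device: str
    san_upn: str          # UPN embedded in the Subject Alternative Name
    not_after: datetime   # expiry timestamp


def stale_san_certs(certs, current_upn, now):
    """Return unexpired certificates whose SAN UPN differs from the current UPN."""
    return [
        c for c in certs
        if c.not_after > now and c.san_upn.lower() != current_upn.lower()
    ]


now = datetime(2026, 5, 30)
certs = [
    DeviceCert("LAPTOP-0042", "jsmith@frontoso.com", datetime(2027, 3, 1)),
    DeviceCert("PHONE-0007", "john.smith@frontoso.com", datetime(2027, 3, 1)),
    DeviceCert("TABLET-0003", "jsmith@frontoso.com", datetime(2026, 1, 1)),
]

# Only still-valid certificates with the old UPN are at risk of silent failure.
for cert in stale_san_certs(certs, "john.smith@frontoso.com", now):
    print(f"{cert.device}: SAN still carries {cert.san_upn}")
```

Every device in that list is a candidate for a forced sync or a manual check on Wi-Fi and VPN immediately after the change.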

3. The MFA Device Orphan Problem

Duo Security deserves its own category. Duo identifies users by username — not by an immutable object ID. If the UPN is used as the Duo username (which is the default configuration with Entra ID directory sync), a UPN change causes Duo to create a new user record and mark the old one for deletion. Enrolled MFA devices — phones, hardware tokens — do not transfer automatically. The user must re-enroll every device on the new username.

For most applications this is inconvenient. For healthcare organizations using Duo with Epic for EPCS (electronic prescribing of controlled substances), it is a compliance-critical workflow failure. Duo sends the UPN as the identity claim to Epic by default. A UPN change without a coordinated Duo update breaks e-prescribing until the Duo record is corrected.

The mitigation requires specific sequencing: add the new UPN as a username alias on the existing Duo record before the UPN change syncs. This preserves the device enrollment through the transition. Then update the primary username after sync completes. Most organizations discover this only after the first failure.
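Why the ordering matters is easier to see in a toy model. The sketch below simulates a username-keyed MFA store with aliases; it is not the Duo Admin API, and `MfaDirectory`, `sync`, and `promote_alias` are hypothetical names standing in for the behavior described above.

```python
class MfaDirectory:
    """Toy username-keyed MFA store with aliases (not the Duo API)."""

    def __init__(self):
        self.users = {}     # primary username -> list of enrolled devices
        self.aliases = {}   # alias -> primary username

    def resolve(self, username):
        """Map a username or alias to its primary username, or None."""
        if username in self.users:
            return username
        return self.aliases.get(username)

    def add_alias(self, primary, alias):
        self.aliases[alias] = primary

    def sync(self, username):
        """Directory sync: reuse a matching record, otherwise create an empty one."""
        primary = self.resolve(username)
        if primary is None:
            self.users[username] = []
            primary = username
        return primary

    def promote_alias(self, alias):
        """Make an alias the primary username, carrying the devices across."""
        old = self.aliases.pop(alias)
        self.users[alias] = self.users.pop(old)
        return alias


def devices_after_change(add_alias_first):
    d = MfaDirectory()
    d.users["jsmith@frontoso.com"] = ["phone", "hardware-token"]
    if add_alias_first:
        d.add_alias("jsmith@frontoso.com", "john.smith@frontoso.com")
    primary = d.sync("john.smith@frontoso.com")   # the UPN change arrives
    if add_alias_first:
        primary = d.promote_alias("john.smith@frontoso.com")
    return d.users[primary]


print(devices_after_change(add_alias_first=False))  # sync found no match
print(devices_after_change(add_alias_first=True))   # sync matched via the alias
```

The only difference between the two runs is whether the alias exists at the moment the sync arrives, which is the whole argument for sequencing the change.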

4. The Consent and Access Grant Problem

Quest OnDemand Migration is an example of a category that is easy to overlook: tooling platforms where the impacted accounts are administrators and operators, not end users.

Quest OnDemand identifies administrators by UPN via Microsoft OAuth. A UPN change for the subscription owner breaks portal access entirely — Quest support must be involved to correct backend records. All per-workload OAuth consents (Exchange, Teams, SharePoint) are tied to the consenting admin's UPN and become orphaned on change. The mitigation is strict: add the new UPN to the OnDemand org RBAC and re-grant all consents before deactivating the old identity.


A Reference Table: How Common SaaS Applications Handle It

The following table summarizes the UPN change behavior of a selection of applications from my full 65-app research set. The full dataset (including SCIM availability, stable ID anchor, JIT risk rating, admin steps, and test cases) is available as a companion spreadsheet.

Application | Support Level | Key Risk
Microsoft 365 / Entra ID | ✅ Full | Downstream app friction; verify SCIM matching uses objectId
Salesforce | ✅ Full | SCIM 2.0; internal SF User ID is stable anchor
Slack | ✅ Full | Member ID is stable; email propagates via SCIM
AWS IAM Identity Center | ✅ Full | Map objectId to SCIM externalId for stable anchor
Asana | ✅ Full | Business+ plan required for SCIM
SharePoint / OneDrive | ⚠️ Partial | OneDrive URL changes; all shared links break; user must re-share
Microsoft Teams | ⚠️ Partial | SIP address does not auto-update; manual PowerShell required
Exchange Online | ⚠️ Partial | Primary SMTP may not update; old address retained as alias
Microsoft Intune | ⚠️ Partial | SCEP/PKCS cert SAN contains old UPN; Wi-Fi/VPN auth fails until renewed
Atlassian (Jira/Confluence) | ⚠️ Partial | Default SCIM matching uses UPN not objectId; must reconfigure
GitHub Enterprise | ⚠️ Partial | Admin-only change; SAML re-link required; CODEOWNERS breaks
Cisco Webex | ⚠️ Partial | UPN mismatch creates new account; old user goes Inactive (30-day soft delete)
SAP Concur | ⚠️ Partial | Entra gallery connector deprecated/broken; route via SAP Cloud Identity Services
Duo Security | ⚠️ Partial | New Duo user created; enrolled devices orphaned; add alias BEFORE change
Oracle Fusion ERP | ⚠️ Partial | Email mismatch breaks approval workflow routing
Figma | ⚠️ Partial | UPN/SCIM mismatch = stuck 'Pending SCIM' state
KnowBe4 | ⚠️ Partial | SCIM source attribute must match SSO NameID; update both consistently
8x8 | ⚠️ Partial | Federation ID update on CREATE only; manual admin update required after change
ArcGIS Online | ❌ None | Internal username is immutable; UPN change = new account; old content orphaned
Zendesk | ❌ None | Email-keyed; UPN change via JIT creates new account
SentinelOne | ❌ None | Email cannot be changed via API; deactivate and recreate required
Epic (Hyperspace) | ❌ None | No SCIM; Hyperspace analyst must update; Duo EPCS breaks separately
Lenel OnGuard | ❌ None | On-premises PACS; manual operator record update; no SCIM
SendGrid | ❌ None | No native SCIM; JIT creates new account on mismatch

What Good Looks Like

The applications that handle UPN changes well share a common design principle: they separate identity from addressing. The user is identified internally by an immutable object ID — Slack's Member ID, Salesforce's User ID, Entra's objectId. The UPN or email address is just an attribute on that object, and attributes can be updated without disrupting the identity relationship.

Applications that fail at this use the UPN or email as the primary key. The address is the identity. Change the address and you have created a new identity — which means a new account, orphaned content, and lost history.

SCIM 2.0 is the mechanism that makes clean UPN changes possible — but only when it is configured correctly. The most common misconfiguration is using UPN as the SCIM matching attribute instead of objectId. Atlassian's default configuration still does this. It is the first thing I check in any environment before approving a bulk UPN change.
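The effect of the matching attribute can be shown with a small reconciliation sketch. It borrows the SCIM attribute names `externalId` and `userName`, but the `reconcile` function and the `objid-0001` value are hypothetical and do not represent any provisioning engine's real logic.

```python
def reconcile(records, incoming, match_on):
    """Upsert an incoming identity into records, matching on one attribute.

    Returns "updated" if an existing record matched, otherwise "created".
    """
    for record in records:
        if record[match_on] == incoming[match_on]:
            record.update(incoming)      # attributes change, identity stays
            return "updated"
    records.append(dict(incoming))       # no match: a second record appears
    return "created"


existing = {"externalId": "objid-0001", "userName": "jsmith@frontoso.com"}
changed = {"externalId": "objid-0001", "userName": "john.smith@frontoso.com"}

# Matching on the immutable ID: the UPN is just an attribute that gets updated.
by_object_id = [dict(existing)]
print(reconcile(by_object_id, changed, match_on="externalId"), len(by_object_id))

# Matching on the UPN: the same change looks like a brand-new user.
by_upn = [dict(existing)]
print(reconcile(by_upn, changed, match_on="userName"), len(by_upn))
```

The incoming payload is identical in both runs; only the matching attribute differs, which is why I treat it as a configuration check rather than an application limitation.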


Before You Touch That UPN

If there is one operational takeaway from this research, it is this: a UPN change requires an application audit before it can be treated as routine.

The questions I ask before any UPN change in a client environment:

  1. Which applications does this user access?
  2. Of those, which use JIT provisioning with email as the primary key?
  3. Which have business process workflows that route notifications by email?
  4. Which issue certificates or tokens that embed the UPN?
  5. Which have MFA enrollments tied to the current username?
  6. Which have admin or operator access where the UPN is used for consent grants?

The answers to those questions determine whether a UPN change is a two-minute ticket or a coordinated multi-system change management event.
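Those questions translate naturally into a small pre-change checklist. The sketch below is one possible encoding, assuming the answers to questions 2 through 6 have already been collected per application; `AppProfile`, its field names, and `CHECKS` are hypothetical, and the example flags simply mirror the failure modes described earlier in this post.

```python
from dataclasses import dataclass


@dataclass
class AppProfile:
    name: str
    jit_email_keyed: bool = False            # question 2
    email_routed_workflows: bool = False     # question 3
    embeds_upn_in_credentials: bool = False  # question 4
    mfa_tied_to_username: bool = False       # question 5
    upn_bound_admin_consent: bool = False    # question 6


CHECKS = [
    ("jit_email_keyed", "JIT keyed on email: expect a duplicate account"),
    ("email_routed_workflows", "workflow notifications route by email"),
    ("embeds_upn_in_credentials", "certificates or tokens embed the UPN"),
    ("mfa_tied_to_username", "MFA enrollment is tied to the username"),
    ("upn_bound_admin_consent", "admin consent grants are bound to the UPN"),
]


def audit(apps):
    """Return (app name, finding) pairs for every risk flag that is set."""
    findings = []
    for app in apps:
        for attr, message in CHECKS:
            if getattr(app, attr):
                findings.append((app.name, message))
    return findings


# Question 1: the applications this user actually accesses (illustrative).
user_apps = [
    AppProfile("Slack"),
    AppProfile("Zendesk", jit_email_keyed=True),
    AppProfile("Oracle Fusion ERP", email_routed_workflows=True),
    AppProfile("Microsoft Intune", embeds_upn_in_credentials=True),
    AppProfile("Duo Security", mfa_tied_to_username=True),
]

findings = audit(user_apps)
for name, message in findings:
    print(f"{name}: {message}")
print("routine change" if not findings else "coordinated change required")
```

An empty findings list is the only case where the two-minute ticket is actually a two-minute ticket.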

The two-minute assumption is how things break quietly.