ADFS service ‘starting’ | Chickens and Eggs


One of my colleagues built a really simple lab to test ADFS logon to Dropbox.

The lab was a single domain controller, running Windows Server 2012, with ADFS installed and configured. Generally speaking, Microsoft don’t have a big problem with ADFS on domain controllers any more (check out the 0 – 1000 users section of this document) but it’s still something that will make any old school AD person cringe a little (and probably rightly so).

So, at first, things were great in this lab. But then, we had to perform a reboot…

After the reboot, SSO login to Dropbox no longer worked.

Attempting to navigate to the ADFS service endpoint manually to check if it was available also failed with “HTTP 503, The service is unavailable”.

We logged on to the DC/ADFS server and noticed that the ADFS service was stuck on “starting”.

We set the service to “manual”, then restarted the server.

We attempted to start the service – same problem.

One of my close friends is still at Microsoft AD field engineering and reminded me of one of the possible pitfalls of any application residing on a single domain controller lab; the chicken and egg problem with the KDC. (You might have seen another version of this issue in old domain controller labs with a single DC that sometimes take 20 minutes to allow you to log on…).

The simplified way to describe the chicken and egg problem with a single DC is:

If any security principal for a service or application requires a Kerberos ticket before the KDC is up and running, it obviously won’t be able to get a ticket until the KDC is up, and it will end up in a crappy state. If it’s something that blocks the KDC in the list of startup items, it’s pretty serious.

The reason this only applies to a single Domain Controller environment is that with multiple DCs, the security principals can get the Kerberos ticket they need from another domain controller that is already up.
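A toy way to picture that ordering problem, as a minimal sketch: the service labels below ("network", "adfs", "key-service") are made up for illustration and are not real Windows service names, and the model only looks at start order, nothing else.

```python
def find_stuck_services(start_order, needs, already_up=()):
    """Return (service, missing_dependencies) pairs for services that are
    started before something they need is available.

    start_order: labels in the order the machine starts them.
    needs: dict mapping a label to the labels it must reach when it starts.
    already_up: labels reachable elsewhere (for example on a second DC).
    """
    started = set(already_up)
    stuck = []
    for service in start_order:
        missing = [dep for dep in needs.get(service, []) if dep not in started]
        if missing:
            # The service sits in "starting" because its dependency is not there yet.
            stuck.append((service, missing))
        else:
            started.add(service)
    return stuck


needs = {"adfs": ["key-service"]}

# Single box: ADFS is started before the thing it needs.
print(find_stuck_services(["network", "adfs", "key-service"], needs))

# Same box, dependency started right after the network.
print(find_stuck_services(["network", "key-service", "adfs"], needs))

# Original order, but a second DC already offers the dependency.
print(find_stuck_services(["network", "adfs"], needs, already_up={"key-service"}))
```

Only the first call reports ADFS as stuck; the other two come back empty, which mirrors the two ways out described here (fix the order, or have another DC already up).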

So what’s the solution?

Another Domain Controller would do it.

Moving the ADFS server off the domain controller would also do the trick… (because restarting the ADFS server would have no impact on the KDC)

But neither is practical when you are just messing around in a lab.

The answer that let us move forward quickly in the lab came from Brian on the AD forums.

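We can set the key distribution service to come up right after the network is available (before other things). Note that kdssvc is the Microsoft Key Distribution Service, which group managed service accounts rely on, rather than the Kerberos KDC service itself. The command is: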

sc triggerinfo kdssvc start/networkon

[restart the server]

Once this was in play, the ADFS service was able to work with the KDC in the right order and everything worked out.

Note: Doing some further reading on the topic and chatting to my Microsoft contact, I learned that this issue popped up when the ADFS service was set to use a managed service account which is the cause of this particular chicken and egg with the KDC.

Snip20160712_64

Dropbox, Single Sign On & You


Dropbox Business can be configured to let users authenticate with their usual identity provider (IDP) credentials rather than a traditional Dropbox password.

The most common approach for Dropbox Business customers is to use Microsoft ADFS (Active Directory Federation Services) to allow folks to log on using Active Directory (Domain Services). Online identity providers (IDPs) like Okta, Ping and Azure AD work just fine as well though; in some ways they are even better. Any solution which is able to authenticate you, then send Dropbox a nice slice of SAML 2.0 to say that you should be trusted, should do the trick.

You can get all the information about setting up ADFS for Single Sign On from the Dropbox help site. That’s not what this note is about. Also, the instructions on the Dropbox site assume you already have ADFS configured. If you don’t, I find this article by Kelsey Epps really helpful.

Disclaimer: This big long (boring) note is going to trace the whole thing and cover most of the stuff people ask about using the most common case: ADFS service provider initiated SSO. You probably definitely don’t need to read it all. I suggest using the time to hang out with your family instead. I’m only writing it so I don’t need to go over it again to know what ‘normal’ looks like, and it would be great if it saves you some time as well.

tl;dr:

The client is heavily involved in the process (your web browser is the client)

more detail:

When you are logging on to Dropbox it might appear as though the servers involved are ‘talking’ to each other. Like: You navigate to Dropbox, the Dropbox server asks your ADFS server to check you out, then the ADFS server tells Dropbox you are ok and you are logged in.

That is not quite the way it goes down, but the difference is subtle. 

In reality, everything is via the client (web browser).

  1. You navigate to Dropbox with your client
  2. Dropbox tells your client that it needs to go to your ADFS server to log in (via a redirect).
  3. Your client (browser) redirects to the ADFS server that Dropbox told it about.
  4. You log in using the ADFS interface.
  5. If you get your password right, ADFS generates a SAML token for you and sends it back to your client (web browser).
  6. The client redirects back to Dropbox attaching the SAML token to the message.
  7. Dropbox lets you in (Dropbox Business was told to trust tokens from the ADFS server by your admin, and because the token has been signed by the ADFS server and Dropbox has a copy of the certificate, Dropbox knows if the token is legit).
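The seven steps above can be sketched as a list of hops. This is only a model of who sends what to whom; the labels "browser", "dropbox" and "adfs" and the message descriptions are illustrative, not real protocol fields.

```python
from collections import namedtuple

Hop = namedtuple("Hop", "sender receiver message")

# One entry per step in the list above.
SSO_FLOW = [
    Hop("browser", "dropbox", "open the login page"),
    Hop("dropbox", "browser", "redirect to ADFS, carrying a SAML request"),
    Hop("browser", "adfs", "follow the redirect with the SAML request"),
    Hop("browser", "adfs", "submit Active Directory credentials"),
    Hop("adfs", "browser", "signed SAML token"),
    Hop("browser", "dropbox", "post the signed SAML token"),
    Hop("dropbox", "browser", "logged-in session"),
]


def servers_talk_directly(flow, client="browser"):
    """True if any hop has the client at neither end."""
    return any(client not in (hop.sender, hop.receiver) for hop in flow)


print(servers_talk_directly(SSO_FLOW))  # False
```

Every hop has the browser at one end, which is the whole point: Dropbox and the ADFS server never need a network path to each other.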

Maybe some visuals would help as well…

  1. The user navigates to Dropbox.com, and they see the familiar login UI:

Snip20160708_30

2. Dropbox knows that SSO is configured for the user (because the admin has configured it) and sends enough information back to the browser to get to the IDP in the form of a SAML request:

Snip20160708_31

3. The client redirects to the IDP to log in:

Snip20160708_33

4. If they are authenticated, a signed token is generated and sent back to the client.

Snip20160708_34

5. Now the client has the golden ticket. The client heads back to Dropbox.com with token attached. If everything checks out, Dropbox authorizes the user:

Snip20160708_36

That’s a really long way of saying: your client (browser) is the go between for the whole conversation, the servers are not talking directly to each other at any time.

Moving on…

So far we have a pretty good hand wavy bullshit explanation of how this all pieces together.

But I think we can do better…

better:

Let’s go back and look at what Dropbox needs configured to make this work:

Snip20160708_37

Using the green numbers:

  1. Simply tells Dropbox that this team is planning to use Single Sign On. Dropbox now knows that it needs to offer up the little SSO UI shown below to all the people on your Dropbox team.

    Snip20160708_38
    (SSO UI)
  2. Decides whether ‘normal’ users should be able to see that little blue link that says ‘Log in with Dropbox credentials’. (Admins will always see this as a failsafe for situations when the IDP is down).
  3. Is where the magic happens – when people click ‘continue’, this is where Dropbox will tell the browser they need to go (via a redirect). So in the example, they would redirect to https://adfs.dbtests.info/adfs/ls.
  4. Is the certificate that Dropbox should use to verify the SAML token when it comes back signed by your ADFS server. If you want to know where that certificate came from – check step 20 in the setup guide.

Now let’s see if we can spot this action going down in some traces:

I am going to use “SAML chrome panel” even though it’s my first time trying it out. I usually use “SAML tracer” for Firefox. Either should be fine though:

Once SAML chrome panel is installed, you should see a new pane in your Chrome Developer Tools (view > developer > developer tools).

  1. Navigate to Dropbox and enter the username of my SSO enabled account:
    Snip20160708_40
    Notice that as soon as Dropbox works out that the username is associated with a Single Sign On enabled team, I am directed to a different route on the Dropbox service – https://www.dropbox.com/sso_state
  2. When we click continue (to log in via SSO) we see this in the trace:
    Snip20160708_41
    There are actually four requests logged in our trace, but the bottom two are concerned with dragging in the images and styles that your browser needs to display the page (zzzzz…..). It’s the top two that are interesting to our flow.
    The first is another route change, where we are directed to https://www.dropbox.com/ajax_login, which handles setting up the SAML request that can be seen in the next line.
    The next line is the result of the redirect. Most importantly, this is where we can get the information about the SAML request that is being bundled up to go to the ADFS server (via the client).
    This is what Dropbox will generate for a good session (note this part for troubleshooting later):
    Snip20160708_47
    That’s kinda hard to read, so here are the important parts:
    - The request is for SAML 2.0.
    - It provides a time stamp via the “IssueInstant” attribute.
    - It indicates to the IDP where the completed SAML token is headed via the “AssertionConsumerServiceURL” attribute.
    - The request describes what we will need for successful authentication on the Dropbox side via the “Format” attribute. In this case, the NameID field of the SAML token needs to take the format of an email address. (Note, check the SSO setup guide step 17 to see where you will have configured this requirement on your ADFS server.)
    - Also, check out the ID attribute.
  3. Now we type our Active Directory credentials into the ADFS logon page and inspect the SAML that is created:
    First of all, note the redirect back to https://www.dropbox.com/saml_login
    Snip20160708_45
    Then, the content of the SAML token that the ADFS server created for us in response to the SAML request:
    Snip20160708_48
    Again, too big and ugly to read easily (and I had to paste it into Sublime to get it to fit well for a screen capture), but here are the important parts:
    - InResponseTo="id-7d39d47ceecd487daf36f317c4a1a2fa". Tip: Go back and look at the “ID” attribute in the SAML request screenshot (spoiler: they match).
    - IssueInstant="2016-07-08T21:08:29.017Z". Tip: very important, Dropbox is going to reject anything more than 5 minutes out of alignment with their servers.

    – <samlp:Status><samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/></samlp:Status> Tip: anything other than Success is going to need a closer look.
    – The <ds:X509Certificate> is a representation of the certificate that was used to sign the SAML token. Tip: If there is any possibility that the certificates have changed, check this.
    Slight detour… How the heck would you compare that value to the certificate you think you are using?
    Copy it:
    Snip20160708_49
    Go here: https://www.samltool.com/fingerprint.php

    Paste it:
    Snip20160708_50
    “Calculate Fingerprint”:

    Snip20160708_51
    Compare to the certificate you think you have configured on Dropbox (whichever way you prefer):
    Snip20160708_52

    If they don’t match, then the certificate you think your ADFS server is using, is probably not the certificate your ADFS server is using.

    – Then look in the <Subject> tag, and note that the NameID that we are expecting is the main event: <NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress">chad@dbtests.info</NameID>. It even reminds us of the format that was requested. After NameID there is some additional information in the <SubjectConfirmation> that can be used to verify the information, including the InResponseTo and the time stamp that we saw earlier in the SAML token. Tip: The interesting part is that for Dropbox you should only see two elements in the Subject tag: the <NameID> and the <SubjectConfirmation>.

    – Next, the <Conditions> tag includes interesting information about the lifespan of the token: <Conditions NotBefore="2016-07-08T21:08:29.017Z" NotOnOrAfter="2016-07-08T22:08:29.017Z">. Tip: Dropbox will ignore tokens with time issues.

  4. The token is sent off to Dropbox (the URL for authorization is given as the AssertionConsumerServiceURL in the request, and as the Destination in the response – Destination="https://www.dropbox.com/saml_login") and as long as the following things check out the user is authorized:
    – The NameID is in email form, and matches the email of a valid user on the team.
    – The NotBefore / NotOnOrAfter window of time is valid.
    – The SignatureValue represents a hash of the SAML token, signed with the private key that belongs to the “Token Signing Certificate” on the ADFS server. Dropbox is able to check because you have uploaded the certificate to the Dropbox admin console, so it can verify the signature and make sure nothing has been tampered with in transit.
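If you would rather script those sanity checks than squint at a screenshot, here is a minimal sketch using only the Python standard library. The sample XML is a trimmed, unsigned stand-in built from the values quoted in this note, not a real ADFS response. The function name check_response is made up for this example. It does not verify the signature and it is not how Dropbox validates tokens; it just flags the things worth eyeballing.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
}
EMAIL_FORMAT = "urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress"
SUCCESS = "urn:oasis:names:tc:SAML:2.0:status:Success"

# Trimmed, unsigned sample (assumption: real responses carry much more).
SAMPLE_RESPONSE = """\
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    InResponseTo="id-7d39d47ceecd487daf36f317c4a1a2fa"
    IssueInstant="2016-07-08T21:08:29.017Z"
    Destination="https://www.dropbox.com/saml_login">
  <samlp:Status>
    <samlp:StatusCode Value="urn:oasis:names:tc:SAML:2.0:status:Success"/>
  </samlp:Status>
  <Assertion xmlns="urn:oasis:names:tc:SAML:2.0:assertion">
    <Subject>
      <NameID Format="urn:oasis:names:tc:SAML:1.1:nameid-format:emailAddress">chad@dbtests.info</NameID>
    </Subject>
    <Conditions NotBefore="2016-07-08T21:08:29.017Z"
                NotOnOrAfter="2016-07-08T22:08:29.017Z"/>
  </Assertion>
</samlp:Response>
"""


def parse_instant(value):
    """Parse a UTC timestamp with fractional seconds, as seen in the trace."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def check_response(xml_text, request_id, now):
    """Return a list of problems; an empty list means it looks normal."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.get("InResponseTo") != request_id:
        problems.append("InResponseTo does not match the request ID")
    status = root.find("samlp:Status/samlp:StatusCode", NS)
    if status is None or status.get("Value") != SUCCESS:
        problems.append("status is not Success")
    name_id = root.find("saml:Assertion/saml:Subject/saml:NameID", NS)
    if name_id is None or name_id.get("Format") != EMAIL_FORMAT:
        problems.append("NameID is missing or not in email format")
    conditions = root.find("saml:Assertion/saml:Conditions", NS)
    if conditions is None:
        problems.append("no Conditions element")
    else:
        not_before = parse_instant(conditions.get("NotBefore"))
        not_on_or_after = parse_instant(conditions.get("NotOnOrAfter"))
        if not (not_before <= now < not_on_or_after):
            problems.append("current time is outside the validity window")
    return problems


request_id = "id-7d39d47ceecd487daf36f317c4a1a2fa"
in_window = datetime(2016, 7, 8, 21, 10, tzinfo=timezone.utc)
too_late = datetime(2016, 7, 8, 23, 0, tzinfo=timezone.utc)
print(check_response(SAMPLE_RESPONSE, request_id, in_window))
print(check_response(SAMPLE_RESPONSE, request_id, too_late))
```

The first call prints an empty list; the second reports the validity window problem, which is the kind of thing clock drift on the ADFS server tends to cause.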

That’s the bulk of it. As you trace the connection you are going to see all sorts of images and stylesheets being loaded and plenty of traffic that is unrelated to the login experience, but if you narrow your troubleshooting to the SAML exchange request, response – most of the time you will catch any of the quirky issues. Also – use the examples to see if your SAML looks ‘normal’.
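Two small helpers can make a trace easier to compare against ‘normal’, sketched here under some assumptions. The first assumes the SAML request travels in the URL using the SAML HTTP-Redirect binding (raw DEFLATE, then base64, then URL encoding); if your trace shows the request sent some other way, it won’t apply. The second computes a certificate fingerprint locally from the text inside <ds:X509Certificate>, as an alternative to pasting it into a web tool. The function names and the sample request are made up for this example.

```python
import base64
import hashlib
import zlib
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs, quote, urlparse


def decode_redirect_request(url):
    """Decode the SAMLRequest query parameter of a redirect URL to XML text."""
    query = parse_qs(urlparse(url).query)
    compressed = base64.b64decode(query["SAMLRequest"][0])
    # The redirect binding uses raw DEFLATE (no zlib header), hence wbits=-15.
    return zlib.decompress(compressed, -15).decode("utf-8")


def cert_fingerprint(x509_text, algorithm="sha1"):
    """Colon-separated fingerprint of the base64 certificate text."""
    der = base64.b64decode("".join(x509_text.split()))
    digest = hashlib.new(algorithm, der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# Build a sample redirect URL so the decoder has something to chew on.
# The request below is a cut-down illustration, not a captured Dropbox request.
sample_xml = (
    '<samlp:AuthnRequest xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
    'ID="id-7d39d47ceecd487daf36f317c4a1a2fa" Version="2.0" '
    'AssertionConsumerServiceURL="https://www.dropbox.com/saml_login"/>'
)
compressor = zlib.compressobj(wbits=-15)
raw = compressor.compress(sample_xml.encode("utf-8")) + compressor.flush()
encoded = quote(base64.b64encode(raw).decode("ascii"), safe="")
url = "https://adfs.dbtests.info/adfs/ls/?SAMLRequest=" + encoded

request = ET.fromstring(decode_redirect_request(url))
print(request.get("ID"))
print(request.get("AssertionConsumerServiceURL"))

# Placeholder bytes, not a real certificate: paste your own base64 text here.
print(cert_fingerprint(base64.b64encode(b"not a real certificate").decode("ascii")))
```

Compare the fingerprint output with the thumbprint of the certificate you uploaded to Dropbox (most tools show SHA-1 by default; pass algorithm="sha256" if yours shows SHA-256).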

Update – Note from Customer: A lot of the documentation for Dropbox SSO talks about “Relying party identifier”, but systems like RSA Federated Identity Manager will be looking for an “Entity Id” – in both cases, this will be “Dropbox”.

/usr/share/dict/words


Interesting tip from Robert on Stack Overflow today showed me that there is a nifty little word file built into most Unix based systems (including OSX).

(yep, probably old news to *nix gurus, but for a lifetime Windows guy who only made the switch six months ago this was yet another nice surprise).

The list is here: /usr/share/dict/words

(I have a teammate who has a habit of putting simple passwords on zip files and forgetting to share them with the team).


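import sys
import zipfile

file_name = "test.zip"
password_file_path = "/usr/share/dict/words"

# Try each dictionary word as the archive password until one extracts cleanly.
with zipfile.ZipFile(file_name) as zipped_file, open(password_file_path) as password_file:
    for line in password_file:
        password_guess = line.strip()
        try:
            # zipfile expects the password as bytes, not str
            zipped_file.extractall(pwd=password_guess.encode("utf-8"))
        except Exception:
            # Wrong password (or bad data): show progress and keep going
            sys.stdout.write(".")
            sys.stdout.flush()
        else:
            print("\n[+] Pass: " + password_guess + "\n")
            sys.exit(0)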
 

Nice one funnyman 🙂

Snip20160624_25

  • Of course better word lists are available, and would suit this case but I was pretty stoked with this built in one 🙂

from a previous life:


(Dragging my Microsoft posts into one place to get this thing started):

Quickly identifying accounts with pre-auth disabled:

https://blogs.msdn.microsoft.com/canberrapfe/2013/03/18/identify-accounts-with-kerberos-pre-authentication-disabled-in-the-ui/

DirectAccess Connection Process:

https://blogs.msdn.microsoft.com/canberrapfe/2013/01/12/the-direct-access-connection-process-according-to-netmon/

DirectAccess issue on Win7

https://blogs.msdn.microsoft.com/canberrapfe/2012/12/07/direct-access-for-windows-7-works-and-then-it-stops/

Faking “Connected to Internet” status in your lab (NCSI):

https://blogs.msdn.microsoft.com/canberrapfe/2012/10/09/fake-internet-connectivity-for-your-lab-tricking-ncsi/

“Real World” DirectAccess on Server 2012

https://blogs.msdn.microsoft.com/canberrapfe/2012/07/12/real-world-direct-access-installation-using-windows-server-2012/

Going deep with AD: un-hosting/re-hosting partitions:

https://blogs.msdn.microsoft.com/canberrapfe/2012/04/13/un-hosting-re-hosting-active-directory-partitions/

Going deep with AD: Granular replication techniques:

https://blogs.msdn.microsoft.com/canberrapfe/2012/04/11/granular-active-directory-replication-for-advanced-troubleshooting-scenarios/

Basic Network trace using ETL (no wireshark, no netmon):

https://blogs.msdn.microsoft.com/canberrapfe/2012/03/30/capture-a-network-trace-without-installing-anything-capture-a-network-trace-of-a-reboot/

Quick XPERF traces:

https://blogs.msdn.microsoft.com/canberrapfe/2013/05/21/xperf-boot-traces/

Going deep with AD: Change notification:

https://blogs.msdn.microsoft.com/canberrapfe/2012/03/25/active-directory-replication-change-notification-you/

Kerberos Troubleshooting:

https://blogs.msdn.microsoft.com/canberrapfe/2012/01/01/kerberos-troubleshooting/

Bluescreen debug: Beware ‘verifier’ settings in production:

https://blogs.msdn.microsoft.com/canberrapfe/2011/09/02/blue-screen-beware-verifier-settings-on-production-machines/

Quick 100 test users with Powershell:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/create-a-quick-100-users-with-powershell/

Going Deep with AD: Who are my ISTG’s?:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/identify-the-istgs/

Going deep with AD: Dumping the AD database (ntds.dit) to text:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/dumping-the-ad-database/

Going deep with AD: Playing with DCLOCATOR:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/testing-the-dclocator-process/

Going deep with AD: Multiple domain controllers in a site with RODC:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/multiple-domain-controllers-in-a-site-with-a-rodc/

Going deep with AD: Adding attributes to the RODC filtered attribute set:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/adding-attributes-to-the-rodc-filtered-attribute-set/

Forest functional levels – What you get:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/forest-functional-levels-what-you-get/

Domain functional levels – what you get:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/domain-functional-levels-what-you-get/

Troubleshooting dynamic ports:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/portqry-exe-to-troubleshoot-dynamic-ports/

(MCM) Group policy notes:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/group-policy-notes/

(MCM) Kerberos delegation lab:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/kerberos-delegation-lab/

(MCM) Kerberos notes:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/kerberos-notes/

“Preparing Network Connections” promoting a domain controller in a lab:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/preparing-network-connections-domain-controller-in-a-lab/

The “Branch Office Deployment Guide” the original and best way to really understand AD:

https://blogs.msdn.microsoft.com/canberrapfe/2011/07/08/2003-branch-office-deployment-guide/