Wikis - Page

Troubleshooting pam_ldap Against eDirectory Issues

0 Likes

One of the great uses for a directory service is to point things against it to get identity information.



For example, Unix systems can authenticate against a number of different systems. The default is known as 'files', which means use the /etc/passwd to store user names, passwords, and other needed information. There was some added complexity over time, such that the passwords (hashed of course) which used to be stored in the /etc/passwd file itself, they got moved into the shadow password file.



Next came NIS, the Network Information Service, which I think was formerly known as yellow pages (thus most of the tools start with yp, like ypcat passwd). In this case it was possible to have a NIS server and configure other servers to authenticate to the NIS server.



A further twist was the addition of more functionality into the next version of NIS, called NIS (NIS plus). Honestly I lost track of the new features in NIS over NIS, but both are out there, but that is neither here nor there.



Depending on the Unix variant (AIX, HPUX, Solaris, or Linux family) the implementation is slightly different.



It also turns out that there is an excellent driver for Novell Identity Manager that will synchronize users to and from NIS, NIS , or even files on Unix machines. There are actually two different versions, one called the bidirectional driver for connecting to say one or few NIS/NIS servers. The other is called the Fanout Driver, to handle the case when you have hundreds to thousands of Unix machines that need to be managed.



Lets focus on Linux for now though, since each traditional Unix is sufficiently different to need its own explanation.



Linux added a really neat functionality called pam. Pluggable Authentication Modules. What this means is that there is an API (Application Programing Interface) that someone writing a tool on Linux that authenticates would use to handle authentication and authorization events. Then the PAM API would follow the configuration and use whichever module it is told to try the authentication event.



The choice of modules is controlled by the /etc/nsswitch.conf file. It can specify for a variety of services which modules to use. As expected from the discussion above there is files, nis, and others.



The pam module of choice that I find most compelling is pam_ldap. This pam module looks to an LDAP directory service to authenticate users. This is great because you basically just point your Unix or Linux servers or desktops at eDirectory. eDirectory can handle the load, and if you are finding you are generating too much traffic it is the work of but moments to add additional replicas just to serve the load.



There are lots of docs out there on how to set up pam_ldap, and using pam_ldap against eDirectory ought to be really straightforward. In fact, if you use Novell's Open Enterprise Server (Linux kernel) it is basically mostly configured already. If you are just using SuSE Linux Enterprise Server (SLES) you would just enable it via yast2, Network Services, LDAP Client.



This is really an easy thing to set up and configure, and very straightforward.



However, and there is always a however in my life... I was asked to configure this on a bunch of SLES 10 SP2 servers at a client, and it was took about five minutes to get everything set up as far as I could tell. But of course it just would not work.



Ok, so troubleshooting pam_ldap, where do we start... Well the name is something of a giveaway, pluggable authentication module for LDAP. Well I know what LDAP servers I am pointing against and they are all eDirectory, so I know how to troubleshoot eDirectory and LDAP.



On to Dstrace with the LDAP switch. Just like watching trace in Novell Identity Manager, (see the best article on the topic I have yet seen by an Novell Technical Services employee, Fernando Frietas, at this link: Capturing and Reading Novell Identity Manager Traces As it turns out Fernando's article is focused on reading Dstrace for Identity Manager events, but LDAP events are pretty straightforward to understand.). You can use whatever version of Dstrace you want, I happened to be using the web interface at port 8028/8030 since I was using a SLES 10 server. You can read about the different versions of Dstrace floating around, in this article: The Many Faces of DSTRACE

Now one of the problems with doing this against production LDAP servers is that there are LOTS of events happening at any given moment in time. If you have a couple of replicas available, it would make sense to pick the least busy, and if possible an idle server so that you can more easily find the events in trace.



The SLES 10 SP2 server I was trying to get pam_ldap running on was 10.1.1.10 as you will see in the trace log.



Here is the LDAP trace of an attempt to ssh to the server.




13:03:28 B5FD8BA0 LDAP: New cleartext connection 0x9776c80 from 10.1.1.10:42942, monitor = 0xb63dcba0, index = 110
13:03:28 803A9BA0 LDAP: (10.1.1.10:42942)(0x0001:0x77) Implied anonymous bind by operation 0x1:0x77 on connection 0x9776c80
13:03:28 803A9BA0 LDAP: (10.1.1.10:42942)(0x0001:0x77) DoExtended on connection 0x9776c80
13:03:28 803A9BA0 LDAP: (10.1.1.10:42942)(0x0001:0x77) DoExtended: Extension Request OID: 1.3.6.1.4.1.1466.20037
13:03:28 803A9BA0 LDAP: (10.1.1.10:42942)(0x0001:0x77) Start TLS request issued from connection 0x9776c80
13:03:28 803A9BA0 LDAP: (10.1.1.10:42942)(0x0001:0x77) Sending operation result 0:"":"" to connection 0x9776c80
13:03:28 B63DCBA0 LDAP: Monitor 0xb63dcba0 initiating TLS handshake on connection 0x9776c80
13:03:28 B66DFBA0 LDAP: (10.1.1.10:42942)(0x0000:0x00) DoTLSHandshake on connection 0x9776c80
13:03:28 B66DFBA0 LDAP: BIO ctrl called with unknown cmd 7
13:03:28 B66DFBA0 LDAP: (10.1.1.10:42942)(0x0000:0x00) Completed TLS handshake on connection 0x9776c80
13:03:28 8175DBA0 LDAP: (10.1.1.10:42942)(0x0002:0x60) DoBind on connection 0x9776c80
13:03:28 8175DBA0 LDAP: (10.1.1.10:42942)(0x0002:0x60) Bind name:cn=ldapibm,ou=Services,o=acme, version:3, authentication:simple
13:03:28 8175DBA0 LDAP: (10.1.1.10:42942)(0x0002:0x60) Sending operation result 0:"":"" to connection 0x9776c80
13:03:28 8150ABA0 LDAP: (10.1.1.10:42942)(0x0003:0x63) DoSearch on connection 0x9776c80
13:03:28 8150ABA0 LDAP: (10.1.1.10:42942)(0x0003:0x63) Search request:
base: "dc=acme,dc=corp"
scope:2 dereference:0 sizelimit:1 timelimit:0 attrsonly:0
filter: "(&(objectclass=user)(uid=jsmith))"
no attributes
13:03:28 8150ABA0 LDAP: (10.1.1.10:42942)(0x0003:0x63) Empty attribute list implies all user attributes
13:03:28 8150ABA0 LDAP: (10.1.1.10:42942)(0x0003:0x63) Sending search result entry "cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp" to connection 0x9776c80
05/28/09
13:03:28 8150ABA0 LDAP: (10.1.1.10:42942)(0x0003:0x63) Sending operation result 0:"":"" to connection 0x9776c80
13:03:28 B72EBBA0 LDAP: (10.1.1.10:42942)(0x0004:0x60) DoBind on connection 0x9776c80
13:03:28 B72EBBA0 LDAP: (10.1.1.10:42942)(0x0004:0x60) Bind name:cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp, version:3, authentication:simple
13:03:31 B72EBBA0 LDAP: (10.1.1.10:42942)(0x0004:0x60) Failed to authenticate local on connection 0x9776c80, err = failed authentication (-669)
13:03:31 B72EBBA0 LDAP: (10.1.1.10:42942)(0x0004:0x60) Sending operation result 49:"":"NDS error: failed authentication (-669)" to connection 0x9776c80



You can see a number of things in the trace. First off, I told pam_ldap to use the ldapibm account to search as, and that it succeeded and was able to find my jsmith user.



13:03:28 8175DBA0 LDAP: (10.1.1.10:42942)(0x0002:0x60) DoBind on connection 0x9776c80
13:03:28 8175DBA0 LDAP: (10.1.1.10:42942)(0x0002:0x60) Bind name:cn=ldapibm,ou=Services,o=acme, version:3, authentication:simple



I had specified a search context, and the user I was trying to ssh in as (jsmith) was found, as you can see here:



13:03:28 8150ABA0 LDAP: (10.1.1.10:42942)(0x0003:0x63) Search request:
base: "dc=acme,dc=corp"
scope:2 dereference:0 sizelimit:1 timelimit:0 attrsonly:0
filter: "(&(objectclass=user)(uid=jsmith))"
no attributes
13:03:28 8150ABA0 LDAP: (10.1.1.10:42942)(0x0003:0x63) Empty attribute list implies all user attributes
13:03:28 8150ABA0 LDAP: (10.1.1.10:42942)(0x0003:0x63) Sending search result entry "cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp" to connection 0x9776c80



But alas, the login failed with a 669 error, which is a bad password, usually, as you can see at this point:



13:03:28 B72EBBA0 LDAP: (10.1.1.10:42942)(0x0004:0x60) DoBind on connection 0x9776c80
13:03:28 B72EBBA0 LDAP: (10.1.1.10:42942)(0x0004:0x60) Bind name:cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp, version:3, authentication:simple
13:03:31 B72EBBA0 LDAP: (10.1.1.10:42942)(0x0004:0x60) Failed to authenticate local on connection 0x9776c80, err = failed authentication (-669)
13:03:31 B72EBBA0 LDAP: (10.1.1.10:42942)(0x0004:0x60) Sending operation result 49:"":"NDS error: failed authentication (-669)" to connection 0x9776c80



Well, I was pretty darn sure I typed the password correctly. I tried it a couple of times just to be sure, so something else funny had to be going on.



But you can see that pretty much everything else looks like it was working. SSL (actually TLS), the proxy user, the search context, the search for the user, and the bind attempt.



Well when you have password issues with eDirectory your first thought should be to look at NMAS issues (Novell Modular Authentication Services). NMAS is the way Novell handles different authentication methods from things as simple as passwords to more complex things like smart cards, biometrics, or even Active Directory via CIFS in domain mode.



So I tried again with NMAS tracing enabled in DStrace, and look what I saw as the bind with a bad password happened:



13:03:28 B72EBBA0 NMAS: 43: Create NMAS Session
13:03:28 B72EBBA0 NMAS: 43: Failed to find login sequence for proxy client can do: 0x9
13:03:28 B72EBBA0 NMAS: 43: Client Session Destroy Request
13:03:28 B72EBBA0 NMAS: 43: Destroy NMAS Session



If you have ever had to troubleshoot CIFS or AFP logins, then you may be familiar with this issue. You can look at an article I wrote on troubleshooting CIFS and AFP logins here: Example of Troubleshooting AFP NMAS Issues

Login Sequence 0x9 is the Simple Password login sequence, so first place to look is in the Security container at the root of the tree, and at the Login Policy.Security object. Here I saw that while we have the Simple Password Login Method in the Authorized Login Methods.Security container, the Login Policy.Security object does not have a sequence defined for it. Well that is easy to fix. I used Console One with the NMAS snapins, but the iManager NMAS snapins can do this as well.



Click add New Sequence, named Simple Password. Add Simple Password then NDS (tried with OR here) to the allowed sequences. Save. Trace is below:



Saw this refresh event:


13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:22:06 80551BA0 NMAS: NMAS Login Policy Refresh Started
13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:22:06 80551BA0 NMAS: NMAS Audit not enabled
13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:22:06 80551BA0 NMAS: Sequence name NDS
13:22:06 80551BA0 NMAS: Type: AND Grade: 0x800000
13:22:06 80551BA0 NMAS: Login Methods: 0x7
13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of Challenge Response.CN=Authorized Login Methods.CN=Security
13:22:06 80551BA0 NMAS: Sequence name Challenge Response
13:22:06 80551BA0 NMAS: Type: AND Grade: 0x800000
13:22:06 80551BA0 NMAS: Login Methods: 0x1F
13:22:06 80551BA0 NMAS: Accessing local replica of Security
13:22:06 80551BA0 NMAS: Accessing local replica of Simple Password.CN=Authorized Login Methods.CN=Security
13:22:06 80551BA0 NMAS: Sequence name Simple Password
13:22:06 80551BA0 NMAS: Type: OR Grade: 0x0
13:22:06 80551BA0 NMAS: Login Methods: 0x9, 0x7
13:22:06 80551BA0 NMAS: SASL mechanism name: NMAS_LOGIN
13:22:06 80551BA0 NMAS: NMAS Login Policy Refresh Finished



There we can see that the login method was reloaded as NMAS refreshed after a change was detected. Still getting the same 669 error on login, so I tried what my other tree had shown in the Simple Password sequence, which was just the one, and it looks like this:



13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:25:05 7F870BA0 NMAS: NMAS Login Policy Refresh Started
13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:25:05 7F870BA0 NMAS: NMAS Audit not enabled
13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:25:05 7F870BA0 NMAS: Sequence name NDS
13:25:05 7F870BA0 NMAS: Type: AND Grade: 0x800000
13:25:05 7F870BA0 NMAS: Login Methods: 0x7
13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of Challenge Response.CN=Authorized Login Methods.CN=Security
13:25:05 7F870BA0 NMAS: Sequence name Challenge Response
13:25:05 7F870BA0 NMAS: Type: AND Grade: 0x800000
13:25:05 7F870BA0 NMAS: Login Methods: 0x1F
13:25:05 7F870BA0 NMAS: Accessing local replica of Security
13:25:05 7F870BA0 NMAS: Accessing local replica of Simple Password.CN=Authorized Login Methods.CN=Security
13:25:05 7F870BA0 NMAS: Sequence name Simple Password
13:25:05 7F870BA0 NMAS: Type: AND Grade: 0x4000000
13:25:05 7F870BA0 NMAS: Login Methods: 0x9
13:25:05 7F870BA0 NMAS: SASL mechanism name: NMAS_LOGIN
13:25:05 7F870BA0 NMAS: NMAS Login Policy Refresh Finished



Tried to bind again and I saw:



13:28:02 8165CBA0 LDAP: (10.1.1.10:60515)(0x0004:0x60) Bind name:cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp, version:3, authentication:simple
13:28:02 8165CBA0 NMAS: 54: Create NMAS Session
13:28:02 8165CBA0 NMAS: 54: Found login sequence Simple Password for proxy client
13:28:02 8165CBA0 NMAS: 54: NMAS Client supplied user DN .jsmith.Migrated.americas.acme.corp.ACME-LDAP.
13:28:02 8165CBA0 NMAS: Accessing local replica of Security
13:28:02 8165CBA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:28:02 8165CBA0 NMAS: Accessing local replica of CN=All User Policy.CN=Password Policies.CN=Security
13:28:02 8165CBA0 NMAS: 54: Actual user DN CN=jsmith.OU=Migrated.dc=americas.dc=acme.dc=corp
13:28:02 8165CBA0 NMAS: 54: Create thread request
13:28:02 8165CBA0 NMAS: 54: Using thread 0xe281d70
13:28:02 8165CBA0 NMAS: 54: Server thread started
13:28:02 8165CBA0 NMAS: 54: Proxy client started local server session
13:28:02 7D569BA0 NMAS: 54: Pool thread 0xe281d70 awake with new work
13:28:02 7D569BA0 NMAS: Accessing local replica of Security
13:28:02 7D569BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
13:28:02 7D569BA0 NMAS: 54: CanDo
13:28:02 7D569BA0 NMAS: 54: No client network address
13:28:02 7D569BA0 NMAS: 54: Selected requested login sequence == "Simple Password"
13:28:02 7D569BA0 NMAS: 54: Login Method 0x00000009
13:28:02 7D569BA0 NMAS: 54: PWD LSM: started
13:28:02 7D569BA0 NMAS: 54: Begin Server Module 0x00000009
13:28:02 8165CBA0 NMAS: 54: PWD LCM: started
13:28:02 8165CBA0 NMAS: 54: Begin Client Module 0x00000009
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Get attribute AID: 6
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Get attribute AID: 7
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Read
13:28:02 7D569BA0 NMAS: 54: ERROR: -1665 Server Module 0x00000009 Get attribute AID: 22
13:28:02 7D569BA0 NMAS: Accessing local replica of Security
13:28:02 7D569BA0 NMAS: Accessing local replica of Simple Password.CN=Authorized Login Methods.CN=Security
13:28:02 7D569BA0 NMAS: 54: ERROR: -16049 Server Module 0x00000009 Get attribute AID: 18
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Get Password
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Read
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Read
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: PWD LCM: MAF_Read(wire_pwdInfo) replyLen 16
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Read
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: PWD LCM: hash id 8 requested
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Read
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Read
13:28:02 7D569BA0 NMAS: 54: PWD LSM: multiHashes = 0
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Read
13:28:02 7D569BA0 NMAS: 54: PWD LSM: Calling verifyPassword
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Get Password
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: PWD LCM: login failed
13:28:02 8165CBA0 NMAS: 54: ERROR: -1642 Client Module 0x00000009 End
13:28:02 8165CBA0 NMAS: 54: Client Session Destroy Request
13:28:02 7D569BA0 NMAS: 54: PWD LSM: login failed
13:28:02 7D569BA0 NMAS: 54: ERROR: -1642 Server Module 0x00000009 End
13:28:02 7D569BA0 NMAS: 54: Server get data detected that the session was cleared
13:28:02 7D569BA0 NMAS: 54: Failed login delay 3 seconds
13:28:05 8165CBA0 LDAP: (10.1.1.10:60515)(0x0004:0x60) Failed to authenticate local on connection 0xf266780, err = failed authentication (-669)
13:28:05 8165CBA0 LDAP: (10.1.1.10:60515)(0x0004:0x60) Sending operation result 49:"":"NDS error: failed authentication (-669)" to connection 0xf266780
13:28:05 7D569BA0 NMAS: 54: Failed login
13:28:05 7D569BA0 NMAS: 54: Server put data detected that the session was cleared
13:28:05 7D569BA0 NMAS: 54: Server get data detected that the session was cleared
13:28:05 7D569BA0 NMAS: 54: ERROR: -1670 NMAS Manager
13:28:05 7D569BA0 NMAS: 54: Server thread exited
13:28:05 7D569BA0 NMAS: 54: Pool thread 0xe281d70 work complete
13:28:05 803A9BA0 LDAP: (10.1.1.10:60515)(0x0005:0x60) DoBind on connection 0xf266780
13:28:05 803A9BA0 LDAP: (10.1.1.10:60515)(0x0005:0x60) Bind name:cn=ldapibm,ou=Services,o=acme, version:3, authentication:simple
13:28:05 803A9BA0 LDAP: (10.1.1.10:60515)(0x0005:0x60) Sending operation result 0:"":"" to connection 0xf266780



Lets try and parse this error down:



13:28:02 8165CBA0 NMAS: 54: Found login sequence Simple Password for proxy client
13:28:02 8165CBA0 NMAS: 54: NMAS Client supplied user DN .jsmith.Migrated.americas.acme.corp.ACME-LDAP.



Ok, so the users DN is good, and the Simple Password login sequence definition is found. But then things start to go south!



13:28:02 8165CBA0 NMAS: 54: Begin Client Module 0x00000009
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Get attribute AID: 6
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Get attribute AID: 7
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: Client Module 0x00000009 Read
13:28:02 7D569BA0 NMAS: 54: ERROR: -1665 Server Module 0x00000009 Get attribute AID: 22
13:28:02 7D569BA0 NMAS: Accessing local replica of Security
13:28:02 7D569BA0 NMAS: Accessing local replica of Simple Password.CN=Authorized Login Methods.CN=Security
13:28:02 7D569BA0 NMAS: 54: ERROR: -16049 Server Module 0x00000009 Get attribute AID: 18




Well -1665 is NMAS_E_LOGIN_ATTRIBUTE_NOT_FOUND, looks like no Simple password set on the user? But I know that my Password Policy in this tree says to synchronize the Universal Password to Simple Password and to NDS password. However there is a great tool to troubleshoot these sorts of issues. Novell has one called diagpwd, but I like the one Jim Willeke wrote, at Dump Password Information Tool



The output for this user shows:



# dn: cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp
Password Policy for Entry: cn=All User Policy,cn=Password Policies,cn=Security
Does Current password meet password policy assigned to user? true
===> Password Status <===
==> Universal Password <==
Is UPwd Enabled: true
Is the UPwd history full: false
Does UPwd match NDSPwd: true
Does UPwd match SimplePwd: true
Is UPwd older than NDSPwd: false
==> Simple Password <==
Is Simple Password Set: true
Is Simple Password Clear Text: true
Does Simple Password match NDSPwd: true



Well here we see that Simple and NDS password match the Universal Password, and I know what that is. So all should be good!



Next error we see is a little bit later:


13:28:02 7D569BA0 NMAS: 54: PWD LSM: Calling verifyPassword
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Get Password
13:28:02 7D569BA0 NMAS: 54: Server Module 0x00000009 Write
13:28:02 8165CBA0 NMAS: 54: PWD LCM: login failed
13:28:02 8165CBA0 NMAS: 54: ERROR: -1642 Client Module 0x00000009 End



Well -1642, is NMAS_E_LOGIN_FAILED as expected, after all, it does not like the password on the user.



Finally we see a -1670 NMAS_E_CONN_CLEARED error:



13:28:05 7D569BA0 NMAS: 54: Server get data detected that the session was cleared
13:28:05 7D569BA0 NMAS: 54: ERROR: -1670 NMAS Manager



But that is just closing the connection I think. Looks like those last two are cascading errors because of the first one.



I tried a number of things. I got rid of the ldapibm proxy user, and granted the LDAP Group object supervisor rights to the tree, just to see if it was a rights issue. I had only granted the ldapibm user a limited set of rights.



13:14:00 B5FD8BA0 LDAP: New cleartext connection 0xa39da00 from 10.1.1.10:47546, monitor = 0xb63dcba0, index = 113
13:14:00 B68E1BA0 LDAP: (10.1.1.10:47546)(0x0003:0x63) DoSearch on connection 0xa39da00
13:14:00 B68E1BA0 LDAP: (10.1.1.10:47546)(0x0003:0x63) Search request:
base: "dc=acme,dc=corp"
scope:2 dereference:0 sizelimit:1 timelimit:0 attrsonly:0
filter: "(&(objectclass=user)(uid=jsmith))"
no attributes
13:14:00 B68E1BA0 LDAP: (10.1.1.10:47546)(0x0003:0x63) Empty attribute list implies all user attributes
13:14:00 B68E1BA0 LDAP: (10.1.1.10:47546)(0x0003:0x63) Sending search result entry "cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp" to connection 0xa39da00
13:14:00 B68E1BA0 LDAP: (10.1.1.10:47546)(0x0003:0x63) Sending operation result 0:"":"" to connection 0xa39da00
13:14:00 811B6BA0 LDAP: (10.1.1.10:47546)(0x0004:0x60) DoBind on connection 0xa39da00
13:14:00 811B6BA0 LDAP: (10.1.1.10:47546)(0x0004:0x60) Bind name:cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp, version:3, authentication:simple
13:14:00 811B6BA0 NMAS: 51: Create NMAS Session
13:14:00 811B6BA0 NMAS: 51: Failed to find login sequence for proxy client can do: 0x9
13:14:00 811B6BA0 NMAS: 51: Client Session Destroy Request
13:14:00 811B6BA0 NMAS: 51: Destroy NMAS Session
13:14:03 811B6BA0 LDAP: (10.1.1.10:47546)(0x0004:0x60) Failed to authenticate local on connection 0xa39da00, err = failed authentication (-669)
13:14:03 811B6BA0 LDAP: (10.1.1.10:47546)(0x0004:0x60) Sending operation result 49:"":"NDS error: failed authentication (-669)" to connection 0xa39da00
13:14:03 B68E1BA0 LDAP: (10.1.1.10:47546)(0x0005:0x60) DoBind on connection 0xa39da00
13:14:03 B68E1BA0 LDAP: (10.1.1.10:47546)(0x0005:0x60) Bind name:cn=ldapibm,ou=Services,o=acme, version:3, authentication:simple
13:14:03 B68E1BA0 LDAP: (10.1.1.10:47546)(0x0005:0x60) Sending operation result 0:"":"" to connection 0xa39da00
13:14:03 B63DCBA0 LDAP: (10.1.1.10:47546)(0x0000:0x00) TLS read failure 5 on connection 0xa39da00, setting err = -5875. Error stack:
13:14:03 B63DCBA0 LDAP: Monitor 0xb63dcba0 found connection 0xa39da00 socket failure, err = -5875, 0 of 0 bytes read
13:14:03 B63DCBA0 LDAP: Monitor 0xb63dcba0 initiating close for connection 0xa39da00
13:14:03 8150ABA0 LDAP: Server closing connection 0xa39da00, socket error = -5875
13:14:03 8150ABA0 LDAP: Connection 0xa39da00 closed



Ok, no joy with that.



Tried an older tree, that was doing AIX equivalent of pam_ldap (it has a different name in AIX, of course, what doesn't?), and same exact error.



Just to confirm that LDAP was even working in this tree, I used a Java based LDAP browser I like, and connected just to see if it would work, using the jsmith user. This also nicely confirmed that I had the password correct. I was pretty darn sure LDAP was working properly since I had been using this tool regularly to troubleshoot other issues. Additionally we had GroupWise doing LDAP authentication against this tree which was working as well.



Here is what a successful bind looks like:



Dstrace of an LDAP Browser bind:


14:21:51 956DE580 LDAP: New TLS connection 0x44b98840 from 192.168.98.137:1532, monitor = 0x359, index = 28
14:21:51 7C05C520 LDAP: Monitor 0x359 initiating TLS handshake on connection 0x44b98840
14:21:51 41811500 LDAP: DoTLSHandshake on connection 0x44b98840
14:21:51 41811500 LDAP: BIO ctrl called with unknown cmd 7
14:21:51 41811500 LDAP: Completed TLS handshake on connection 0x44b98840
14:21:51 41811500 LDAP: DoBind on connection 0x44b98840
14:21:51 41811500 LDAP: Bind name:cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp, version:3, authentication:simple
14:21:51 41811500 LDAP: Sending operation result 0:"":"" to connection 0x44b98840
14:21:51 41811500 LDAP: DoSearch on connection 0x44b98840
14:21:51 41811500 LDAP: Search request:
base: ""
scope:1 dereference:3 sizelimit:0 timelimit:0 attrsonly:0
filter: "(objectclass=*)"
attribute: "objectclass"
14:21:51 41811500 LDAP: Sending search result entry "o=acme" to connection 0x44b98840
14:21:51 41811500 LDAP: Sending search result entry "cn=Security" to connection 0x44b98840



Well the only error I had so far was Simple Password related, so I made a user with no Simple Password whatsoever... Harder than it sounds! Had to make a new password policy, assign it to a container to override the tree wide policy that I wanted to apply to everyone, and create the user in there. We were using Identity Manager to synchronize users, so I had to cheat and break some of the rules there, that would otherwise have prevented me from doing that.



Slightly different error this time, just a 16049 error, which is no NMAS secret found, which makes sense as there was no Simple Password set, but later now, we get a 1642 no nmas login attribute found.



17:50:18 A8B4ABA0 LDAP: (10.1.1.10:47743)(0x0001:0x60) DoBind on connection 0x94f4280
17:50:18 A8B4ABA0 LDAP: (10.1.1.10:47743)(0x0001:0x60) Treating simple bind with empty DN and no password as anonymous
17:50:18 A8B4ABA0 LDAP: (10.1.1.10:47743)(0x0001:0x60) Bind name:NULL, version:3, authentication:simple
17:50:18 A8B4ABA0 LDAP: (10.1.1.10:47743)(0x0001:0x60) Sending operation result 0:"":"" to connection 0x94f4280
17:50:18 A9251BA0 LDAP: (10.1.1.10:47743)(0x0002:0x63) DoSearch on connection 0x94f4280
17:50:18 A9251BA0 LDAP: (10.1.1.10:47743)(0x0002:0x63) Search request:
base: "dc=acme,dc=corp"
scope:2 dereference:0 sizelimit:1 timelimit:0 attrsonly:0
filter: "(uid=tuser2)"
no attributes
17:50:18 A9251BA0 LDAP: (10.1.1.10:47743)(0x0002:0x63) Empty attribute list implies all user attributes
17:50:18 A9251BA0 LDAP: (10.1.1.10:47743)(0x0002:0x63) Sending search result entry "cn=tuser2,ou=Migrated,dc=americas,dc=acme,dc=corp" to connection 0x94f4280
17:50:18 A9251BA0 LDAP: (10.1.1.10:47743)(0x0002:0x63) Sending operation result 0:"":"" to connection 0x94f4280
17:50:18 A8C4BBA0 LDAP: (10.1.1.10:47743)(0x0003:0x60) DoBind on connection 0x94f4280
17:50:18 A8C4BBA0 LDAP: (10.1.1.10:47743)(0x0003:0x60) Bind name:cn=tuser2,ou=Migrated,dc=americas,dc=acme,dc=corp, version:3, authentication:simple
17:50:18 A8C4BBA0 NMAS: 5: Create NMAS Session
17:50:18 A8C4BBA0 NMAS: 5: Found login sequence Simple Password for proxy client
17:50:18 A8C4BBA0 NMAS: 5: NMAS Client supplied user DN .tuser2.Migrated.americas.acme.corp.ACME-LDAP.
17:50:18 A8C4BBA0 NMAS: Accessing local replica of CN=NoSimple.CN=Password Policies.CN=Security
17:50:18 A8C4BBA0 NMAS: 5: Actual user DN CN=tuser2.OU=Migrated.dc=americas.dc=acme.dc=corp
17:50:18 A8C4BBA0 NMAS: 5: Create thread request
17:50:18 A8C4BBA0 NMAS: 5: Using thread 0x9782cf8
17:50:18 A86F8BA0 NMAS: 5: Pool thread 0x9782cf8 awake with new work
17:50:18 A86F8BA0 NMAS: Accessing local replica of Security
17:50:18 A86F8BA0 NMAS: Accessing local replica of CN=Login Policy.CN=Security
17:50:18 A8C4BBA0 NMAS: 5: Server thread started
17:50:18 A8C4BBA0 NMAS: 5: Proxy client started local server session
17:50:18 A86F8BA0 NMAS: 5: CanDo
17:50:18 A86F8BA0 NMAS: 5: No client network address
17:50:18 A86F8BA0 NMAS: 5: Selected requested login sequence == "Simple Password"
17:50:18 A86F8BA0 NMAS: 5: Login Method 0x00000009
17:50:18 A86F8BA0 NMAS: 5: PWD LSM: started
17:50:18 A86F8BA0 NMAS: 5: Begin Server Module 0x00000009
17:50:18 A86F8BA0 NMAS: 5: ERROR: -16049 Server Module 0x00000009 Get attribute AID: 22
17:50:18 A86F8BA0 NMAS: Accessing local replica of Security
17:50:18 A86F8BA0 NMAS: Accessing local replica of Simple Password.CN=Authorized Login Methods.CN=Security
17:50:18 A86F8BA0 NMAS: 5: ERROR: -16049 Server Module 0x00000009 Get attribute AID: 18
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Get Password
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Read
17:50:18 A8C4BBA0 NMAS: 5: PWD LCM: started
17:50:18 A8C4BBA0 NMAS: 5: Begin Client Module 0x00000009
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Get attribute AID: 6
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Get attribute AID: 7
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Write
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Read
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Write
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Write
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Write
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Read
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Read
17:50:18 A8C4BBA0 NMAS: 5: PWD LCM: MAF_Read(wire_pwdInfo) replyLen 16
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Read
17:50:18 A8C4BBA0 NMAS: 5: PWD LCM: hash id 8 requested
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Write
17:50:18 A86F8BA0 NMAS: 5: PWD LSM: multiHashes = 0
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Read
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Write
17:50:18 A86F8BA0 NMAS: 5: PWD LSM: Calling verifyPassword
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Get Password
17:50:18 A86F8BA0 NMAS: 5: Server Module 0x00000009 Write
17:50:18 A86F8BA0 NMAS: 5: PWD LSM: login failed
17:50:18 A86F8BA0 NMAS: 5: ERROR: -1642 Server Module 0x00000009 End
17:50:18 A8C4BBA0 NMAS: 5: Client Module 0x00000009 Read
17:50:18 A8C4BBA0 NMAS: 5: PWD LCM: login failed
17:50:18 A8C4BBA0 NMAS: 5: ERROR: -1642 Client Module 0x00000009 End
17:50:18 A8C4BBA0 NMAS: 5: Client Session Destroy Request
17:50:18 A86F8BA0 NMAS: 5: ERROR: -1645 Server timed out waiting for data
17:50:18 A86F8BA0 NMAS: 5: Failed login delay 3 seconds
17:50:21 A8C4BBA0 LDAP: (10.1.1.10:47743)(0x0003:0x60) Failed to authenticate local on connection 0x94f4280, err = failed authentication (-669)
17:50:21 A8C4BBA0 LDAP: (10.1.1.10:47743)(0x0003:0x60) Sending operation result 49:"":"NDS error: failed authentication (-669)" to connection 0x94f4280
17:50:21 A86F8BA0 NMAS: 5: Failed login
17:50:21 A86F8BA0 NMAS: 5: Server put data detected that the session timed out
17:50:21 A86F8BA0 NMAS: 5: Server get data detected that the session timed out
17:50:21 A86F8BA0 NMAS: 5: ERROR: -1645 NMAS Manager
17:50:21 A86F8BA0 NMAS: 5: Server thread exited
17:50:21 A86F8BA0 NMAS: 5: Pool thread 0x9782cf8 work complete
17:50:21 B6034BA0 LDAP: (10.1.1.10:47743)(0x0004:0x60) DoBind on connection 0x94f4280
17:50:21 B6034BA0 LDAP: (10.1.1.10:47743)(0x0004:0x60) Treating simple bind with empty DN and no password as anonymous
17:50:21 B6034BA0 LDAP: (10.1.1.10:47743)(0x0004:0x60) Bind name:NULL, version:3, authentication:simple
17:50:21 B6034BA0 LDAP: (10.1.1.10:47743)(0x0004:0x60) Sending operation result 0:"":"" to connection 0x94f4280



So far this was leading no where. Lets confirm what the Linux side is seeing.



Jun  3 09:36:12 and601lnx sshd[13421]: Invalid user jsmith from 10.1.1.10
Jun 3 09:36:20 and601lnx sshd[13423]: pam_ldap: error trying to bind as user "cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp" (Invalid credentials)
Jun 3 09:36:20 and601lnx sshd[13421]: error: PAM: User not known to the underlying authentication module for illegal user jsmith from acme601lnx.acme.com
Jun 3 09:36:20 and601lnx sshd[13421]: Failed keyboard-interactive/pam for invalid user jsmith from 10.1.1.10 port 59155 ssh2



Nothing unexpected there.



Finally I gave up and opened an incident with Novell, since I know this works, and clearly something was goofy.



Well the answer was so ridiculous I still cannot believe it!



Turns out all that troubleshooting is interesting and a useful exercise it turns out to be completely unrelated to anything! This was the worst kind of red herring. It looked completely plausible, but in the end had absolutely nothing to do with the problem.



It looks like either NMAS or LDAP knows how to recognize a pam_ldap authentication attempt, and the user it finds, MUST have the Auxiliary class posixAccount added to the users Object Class.

For a long variety of reasons, this particular tree happened to have the Domain Services for Windows schema included, and it looks like that or something else extended base schema so that User objects included the needed POSIX attributes, (uidNumber, gidNumber, gecos, loginShell, homeDirectory, etc) and since I was adding them via Identity Manager, when they got created in an Active Directory tree and synched into eDirectory, I fell into a minor trap, based on an actually useful feature of Identity Manager. You can read more about it in this article Auxiliary Classes and Identity Manager
but the gist of it is, that IDM will add any needed auxiliary classes that are needed, so I did not bother explicitly adding posixAccount to the Users, confident that IDM would do it for me.



The solution was then to add the posixAccount value to the Object Class attribute of all the users.

I used a toolkit rule to do this, based on the model I discuss in this series of articles:










I will probably write an article explaining the particular approach I took in this toolkit rule, but it was pretty easy once you understand my approach.



It happens that if any other attributes are wrong on the user, for example a loginShell pointing at an AIX specific shell that was not installed on Linux by default, and you would see the same error. If fixed that by sym-linking the listed loginShell value to /bin/bash with a command, that was something like:



ln -s /bin/bash /usr/bin/pksh-sh


Once both those issues were wrapped up I was finally able to get it all working. Here is what a proper successful bind looks like:




09:54:27 B5FD8BA0 LDAP: New cleartext connection 0x92b8000 from 10.1.1.10:37683, monitor = 0xb63dcba0, index = 20
09:54:27 8150ABA0 LDAP: (10.1.1.10:37683)(0x0001:0x77) Implied anonymous bind by operation 0x1:0x77 on connection 0x92b8000
09:54:27 8150ABA0 LDAP: (10.1.1.10:37683)(0x0001:0x77) DoExtended on connection 0x92b8000
09:54:27 8150ABA0 LDAP: (10.1.1.10:37683)(0x0001:0x77) DoExtended: Extension Request OID: 1.3.6.1.4.1.1466.20037
09:54:27 8150ABA0 LDAP: (10.1.1.10:37683)(0x0001:0x77) Start TLS request issued from connection 0x92b8000
09:54:27 8150ABA0 LDAP: (10.1.1.10:37683)(0x0001:0x77) Sending operation result 0:"":"" to connection 0x92b8000
09:54:27 B63DCBA0 LDAP: Monitor 0xb63dcba0 initiating TLS handshake on connection 0x92b8000
09:54:27 B72EBBA0 LDAP: (10.1.1.10:37683)(0x0000:0x00) DoTLSHandshake on connection 0x92b8000
09:54:27 B72EBBA0 LDAP: BIO ctrl called with unknown cmd 7
09:54:27 B72EBBA0 LDAP: (10.1.1.10:37683)(0x0000:0x00) Completed TLS handshake on connection 0x92b8000
09:54:27 8150ABA0 LDAP: (10.1.1.10:37683)(0x0002:0x60) DoBind on connection 0x92b8000
09:54:27 8150ABA0 LDAP: (10.1.1.10:37683)(0x0002:0x60) Bind name:cn=ldapibm,ou=Services,o=acme, version:3, authentication:simple
09:54:27 8150ABA0 LDAP: (10.1.1.10:37683)(0x0002:0x60) Sending operation result 0:"":"" to connection 0x92b8000
09:54:27 811B6BA0 LDAP: (10.1.1.10:37683)(0x0003:0x63) DoSearch on connection 0x92b8000
09:54:27 811B6BA0 LDAP: (10.1.1.10:37683)(0x0003:0x63) Search request:
base: "dc=acme,dc=corp"
scope:2 dereference:0 sizelimit:1 timelimit:0 attrsonly:0
filter: "(&(objectclass=posixAccount)(uid=jsmith))"
no attributes
09:54:27 811B6BA0 LDAP: (10.1.1.10:37683)(0x0003:0x63) Empty attribute list implies all user attributes
09:54:27 811B6BA0 LDAP: (10.1.1.10:37683)(0x0003:0x63) Sending search result entry "cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp" to connection 0x92b8000
09:54:27 811B6BA0 LDAP: (10.1.1.10:37683)(0x0003:0x63) Sending operation result 0:"":"" to connection 0x92b8000
09:54:27 8175DBA0 LDAP: (10.1.1.10:37683)(0x0004:0x60) DoBind on connection 0x92b8000
09:54:27 8175DBA0 LDAP: (10.1.1.10:37683)(0x0004:0x60) Bind name:cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp, version:3, authentication:simple
09:54:27 8175DBA0 LDAP: (10.1.1.10:37683)(0x0004:0x60) Sending operation result 0:"":"NDS error: password expired (-223)" to connection 0x92b8000
09:54:27 811B6BA0 LDAP: (10.1.1.10:37683)(0x0005:0x60) DoBind on connection 0x92b8000
09:54:27 811B6BA0 LDAP: (10.1.1.10:37683)(0x0005:0x60) Bind name:cn=ldapibm,ou=Services,o=acme, version:3, authentication:simple
09:54:27 811B6BA0 LDAP: (10.1.1.10:37683)(0x0005:0x60) Sending operation result 0:"":"" to connection 0x92b8000
09:54:27 7F870BA0 LDAP: (10.1.1.10:37683)(0x0006:0x6e) DoCompare on connection 0x92b8000
09:54:27 7F870BA0 LDAP: (10.1.1.10:37683)(0x0006:0x6e) compare: dn (cn=slesadmins,ou=admins,o=acme) attr (uniquemember) value (cn=jsmith,ou=Migrated,dc=americas,dc=acme,dc=corp)
09:54:27 7F870BA0 LDAP: (10.1.1.10:37683)(0x0006:0x6e) Sending operation result 5:"":"" to connection 0x92b8000
09:54:27 811B6BA0 LDAP: Connection 0x92b8000 closed



As you can see, an almost perfect bind, in fact this error:


09:54:27 8175DBA0 LDAP: (10.1.1.10:37683)(0x0004:0x60) Sending operation result 0:"":"NDS error: password expired (-223)" to connection 0x92b8000



is actually a GOOD error! The test users password had expired based on all my crazy testing, and getting that as an 'error' was perfect!



Crazy that a missing unneeded auxiliary class on a user object was the root cause, and looked like a Simple Password login problem, but that seems to have really been it! Hope this helps anyone else who runs into a similar situation. If you have seen something as ridiculous as this before, let me know, I am curious as to what else could cause something like this.

Comment List
Related
Recommended