How does Linux know that the new password is similar to the previous one?

Question

A few times I tried to change a user password on various Linux machines and when the new password was similar to the old one, the OS complained that they were too similar.

I always wondered: how does the system know this? I thought that the password is saved as a hash. Does the fact that the system can compare the new password for similarity mean that the old one is actually saved as plain text?

In Windows, it keeps a history of past passwords you have used; I didn't know Linux did this too. But the hash would probably be in the `/etc/user` file that you're thinking of, not the password history. — xR34P3Rx, Dec 27 '14 at 17:33
1st off: plain text? no. If(!) saved you save the hash and compare hashes. In Linux though it checks current password with new password. BOTH are supplied by the user when changing passwords. — Rinzwind, Dec 27 '14 at 17:43
@Rinzwind But comparing hashes won't work because a one character difference should result in a completely different hash — slhck, Dec 27 '14 at 18:01
See also [Does Facebook store plain-text passwords?](http://security.stackexchange.com/questions/53481/does-facebook-store-plain-text-passwords) on [security.se] for other ways to detect similarity given only the hash of the old password and plaintext of the new password (no plaintext for old). — Bob, Dec 28 '14 at 06:56
You can actually test for similarity between a hashed old password and a plaintext new password. Simply generate a list of passwords similar to the new one, hash them all, and compare the resulting hashes to the old password hash. If any match, then it's similar. — BWG, Dec 29 '14 at 07:11
@BWG: That is a slight oversimplification – current hashing schemes salt the hash, so first you have to extract the salt from the old password hash and make sure you use that salt for your similar-to-new passwords. (I'm pointing this out because it's possible that the API wouldn't expose a way to force a specific salt.) — Ulrich Schwarz, Jan 01 '15 at 14:07
@UlrichSchwarz: There is generally a way to force checking with the right salt - otherwise it's very difficult to verify the user's password at login, for example ;-) — psmears, Jan 01 '15 at 18:53
@psmears: but checking with the right salt is not at all creating with a given salt. Creating turns a password into an opaque string, nothing outside the API knows anything about it, not even if salt is there, it just knows that if you hand that string and a password to the API again, you get `true` exactly if you have the password right. — Ulrich Schwarz, Jan 01 '15 at 19:11
@UlrichSchwarz: In practice the whole operation *is* exactly the same: checking whether a string (that happens to be a variant of a proposed new password, for "similarity" checking) matches the stored hash+salt is *exactly* the same operation as checking whether a string (that happens to be the password the user gave at login) matches the stored hash+salt. Yes, you might have to do the "generate hash with appropriate salt and compare" parts all inside the API, rather than having the API spit out the hashes, but it's not going to be the case (in any sensible API) that the API doesn't allow it :) — psmears, Jan 01 '15 at 19:42

slhck · Accepted Answer · 2015-01-02T19:22:58.760

156

Since you need to supply both the old and the new password when using passwd, they can be easily compared in plaintext, in memory, without writing them somewhere on the drive.

Indeed your password is hashed when it's finally stored, but until that happens, the tool where you're entering your password can of course just access it directly like any other program can access things you entered on your keyboard while it was reading from STDIN.

This is a feature of the PAM system which is used in the background of the passwd tool. PAM is used by modern Linux distributions.

More specifically, pam_cracklib is a PAM module which can reject passwords that show any of several weaknesses that would make them very vulnerable.

It's not just passwords which are too similar that can be considered insecure. The source code has various examples of what can be checked, e.g. whether a password is a palindrome or what the edit distance is between two words. The idea is to make passwords more resistant against dictionary attacks.
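To make those two checks concrete, here is a minimal sketch in C of a palindrome test and an edit-distance (Levenshtein) calculation. It is not pam_cracklib's code; the function names `is_palindrome` and `edit_distance` and the `MAX_LEN` limit are assumptions made for this illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit for this sketch; longer inputs are rejected. */
#define MAX_LEN 255

/* True if s reads the same forwards and backwards. */
bool is_palindrome(const char *s)
{
    size_t n = strlen(s);

    for (size_t i = 0; i < n / 2; i++) {
        if (s[i] != s[n - 1 - i]) {
            return false;
        }
    }
    return true;
}

/*
 * Levenshtein distance between a and b: the minimum number of
 * single-character insertions, deletions and substitutions needed
 * to turn a into b. Uses one row of the usual DP table.
 * Returns -1 if either string is longer than MAX_LEN.
 */
int edit_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    int row[MAX_LEN + 1];

    if (la > MAX_LEN || lb > MAX_LEN) {
        return -1;
    }

    for (size_t j = 0; j <= lb; j++) {
        row[j] = (int)j;
    }

    for (size_t i = 1; i <= la; i++) {
        int diag = row[0];              /* previous row, column j-1 */
        row[0] = (int)i;
        for (size_t j = 1; j <= lb; j++) {
            int up = row[j];            /* previous row, column j */
            int best = diag + ((a[i - 1] == b[j - 1]) ? 0 : 1);

            if (up + 1 < best) {
                best = up + 1;          /* delete a[i-1] */
            }
            if (row[j - 1] + 1 < best) {
                best = row[j - 1] + 1;  /* insert b[j-1] */
            }
            diag = up;
            row[j] = best;
        }
    }
    return row[lb];
}
```

With this sketch, `edit_distance("secret1", "secret2")` is 1, so a policy that rejects new passwords within a small distance of the old one would catch that change. What threshold a real module uses is a configuration matter and is not shown here.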

See also the pam_cracklib manpage.

edited Jan 02 '15 at 19:22

answered Dec 27 '14 at 19:27

slhck


Do you have any idea how your explanation fits with the arguments reported in my answer? Are there two different approaches, taken by the "passwd" application, when the host is **not** PAM-aware? P.S.: No criticism at all. I'm just wondering (as PAM, BTW, was my first guess... just before grepping the source code). – Damiano Verzulli Dec 27 '14 at 19:38
Ah, I did actually not consider a system not using PAM, but I suppose both are possible. Good answer; I had not seen it. – slhck Dec 27 '14 at 19:44
More disturbing are the corporate password rules that alert you if you've used the same or similar password among any of the last four. – Nick T Dec 30 '14 at 03:58
@NickT How is that (necessarily) disturbing - couldn't they just save your last 4 hashes, then compare each of those to your proposed new one in the same way as this question? – neminem Dec 30 '14 at 17:14
@neminem "...or similar" – Nick T Dec 30 '14 at 17:20
@NickT Ah, fair enough, because in this particular case you're comparing against the "old password" that's input by the user to change the password, rather than against a saved hash. Still, you *could* hypothetically use the method BWG posted in a comment, for at least checking really simple changes (one character substitution, one character added/removed, etc.). – neminem Dec 30 '14 at 17:34

Damiano Verzulli · Answer 2 · 2014-12-27T20:02:45.390

At least on my Ubuntu, the "too similar" message comes out thanks to the PAM support, as clearly explained in @slhck's answer.

On other platforms, where PAM is not used, the "too similar" message comes out when the new password fails this test from the source: "...more than half of the characters are different ones...." (see below for details).

To check this statement on your own, you can look at the source code. Here is how.

The "passwd" program is included in the passwd package:

verzulli@iMac:~$ which passwd
/usr/bin/passwd
verzulli@iMac:~$ dpkg -S /usr/bin/passwd
passwd: /usr/bin/passwd

As we're dealing with Open Source technologies, we have unrestricted access to source code. Getting it is as simple as:

verzulli@iMac:/usr/local/src/passwd$ apt-get source passwd

Afterwards it's easy to find the relevant fragment of code:

verzulli@iMac:/usr/local/src/passwd$ grep -i -r 'too similar' .
[...]
./shadow-4.1.5.1/NEWS:- new password is not "too similar" if it is long enough
./shadow-4.1.5.1/libmisc/obscure.c:     msg = _("too similar");

A quick look at "obscure.c" turns up this (I'm cutting and pasting only the relevant piece of code):

static const char *password_check (
    const char *old,
    const char *new,
    const struct passwd *pwdp)
{
    const char *msg = NULL;
    char *oldmono, *newmono, *wrapped;

    if (strcmp (new, old) == 0) {
            return _("no change");
    }
    [...]
    if (palindrome (oldmono, newmono)) {
            msg = _("a palindrome");
    } else if (strcmp (oldmono, newmono) == 0) {
            msg = _("case changes only");
    } else if (similar (oldmono, newmono)) {
            msg = _("too similar");
    } else if (simple (old, new)) {
            msg = _("too simple");
    } else if (strstr (wrapped, newmono) != NULL) {
            msg = _("rotated");
    } else {
    }
    [...]
    return msg;
}
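The excerpt elides how `oldmono`, `newmono` and `wrapped` are built. Judging only from how they are used, a plausible reading is that the `*mono` strings are lower-cased copies and `wrapped` is the old password written twice in a row, so that any rotation of it shows up as a substring. The sketch below follows that assumption; the helper names `lower_copy`, `case_changes_only` and `looks_rotated` are invented for this illustration and do not come from the shadow source.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated lower-case copy of s (caller frees), or NULL. */
char *lower_copy(const char *s)
{
    size_t n = strlen(s);
    char *out = malloc(n + 1);

    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        out[i] = (char)tolower((unsigned char)s[i]);
    }
    out[n] = '\0';
    return out;
}

/* True if old and new_pw are equal once letter case is ignored. */
bool case_changes_only(const char *old, const char *new_pw)
{
    char *o = lower_copy(old);
    char *n = lower_copy(new_pw);
    bool same = (o != NULL && n != NULL && strcmp(o, n) == 0);

    free(o);
    free(n);
    return same;
}

/*
 * True if the lower-cased new password occurs inside the lower-cased
 * old password concatenated with itself ("abcdef" -> "abcdefabcdef").
 * Every rotation of the old password is a substring of that string.
 */
bool looks_rotated(const char *old, const char *new_pw)
{
    char *o = lower_copy(old);
    char *n = lower_copy(new_pw);
    bool rotated = false;

    if (o != NULL && n != NULL) {
        size_t len = strlen(o);
        char *wrapped = malloc(2 * len + 1);

        if (wrapped != NULL) {
            memcpy(wrapped, o, len);
            memcpy(wrapped + len, o, len + 1);   /* second copy plus NUL */
            rotated = (strstr(wrapped, n) != NULL);
            free(wrapped);
        }
    }
    free(o);
    free(n);
    return rotated;
}
```

Under these assumptions, changing "abcdef" to "defabc" would be reported as rotated, and changing "Secret" to "sECRET" as a case change only. Both checks work on the two plaintext strings that `passwd` already holds in memory.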

So now we know that there's a "similar" function that takes the old and the new password and checks whether they are similar. Here's the snippet:

/*
 * more than half of the characters are different ones.
 */
static bool similar (const char *old, const char *new)
{
    int i, j;

    /*
     * XXX - sometimes this fails when changing from a simple password
     * to a really long one (MD5).  For now, I just return success if
     * the new password is long enough.  Please feel free to suggest
     * something better...  --marekm
     */
    if (strlen (new) >= 8) {
            return false;
    }

    for (i = j = 0; ('\0' != new[i]) && ('\0' != old[i]); i++) {
            if (strchr (new, old[i]) != NULL) {
                    j++;
            }
    }

    if (i >= j * 2) {
            return false;
    }

    return true;
}

I haven't reviewed the C code. I limited myself to trusting the comment just before the function definition :-)
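For readers who do want to step through it, the function can be compiled on its own. The copy below keeps the logic of the excerpt; the only changes are that `static` is dropped, the second parameter is renamed to `new_pw`, and the `XXX` comment is shortened. Treat it as a sketch for experimentation rather than the exact code shipped in any particular release.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Stand-alone copy of the excerpt's logic.
 * Returns true when the new password is considered "too similar".
 */
bool similar(const char *old, const char *new_pw)
{
    int i, j;

    /* New passwords of 8 or more characters are never flagged. */
    if (strlen(new_pw) >= 8) {
        return false;
    }

    /*
     * Walk both strings in parallel until either one ends.
     * i counts the positions visited; j counts how many of the
     * visited characters of old occur anywhere in new_pw.
     */
    for (i = j = 0; ('\0' != new_pw[i]) && ('\0' != old[i]); i++) {
        if (strchr(new_pw, old[i]) != NULL) {
            j++;
        }
    }

    /* At most half of the visited characters are shared: not similar. */
    if (i >= j * 2) {
        return false;
    }

    return true;
}
```

Tracing a few inputs by hand: `similar("secret1", "secret2")` returns true (7 positions visited, 6 shared characters), `similar("abcdef", "abcxyz")` returns false (6 visited, 3 shared, so exactly half differ), and `similar("password1", "password2")` returns false simply because the new password has 9 characters. That last case matches the NEWS entry found by the grep above: a new password is not "too similar" if it is long enough.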

The differentiation between PAM and NON-PAM aware platforms is defined in the "obscure.c" file that is structured like:

#include <config.h>
#ifndef USE_PAM
[...lots of things, including all the above...]
#else                           /* !USE_PAM */
extern int errno;               /* warning: ANSI C forbids an empty source file */
#endif                          /* !USE_PAM */

This is a long answer that doesn't seem to directly answer the question of how it can compare against the old password when passwords are hashed. — jamesdlin, Dec 27 '14 at 21:47
@jamesdlin : as stated in Rinzwind's comment on the original question, hashes do **NOT** play any role in this matter: when you issue the "passwd" command to change your password, you're required to provide both the "old" and the "new" password. So the "passwd" code has no problem at all comparing/checking both passwords at once (in clear form; not hashed at all). — Damiano Verzulli, Dec 27 '14 at 22:10
@DamianoVerzulli Nevertheless, this doesn't really address the question. The question wasn't "what C code do you use to tell if two strings are similar;" that's exactly the same for passwords as for anything else. The thing about _passwords_ that makes them interesting is that they're never stored in plaintext, and that's what the question asks about. This answers "what criteria are used and how is it done in C," but it's way too long for "what criteria" and "how would I do this in C" is an SO question, not an SU question. — cpast, Dec 27 '14 at 22:55
@DamianoVerzulli And the fact that `passwd` asks for both old and new passwords *is the answer*. The rest of this answer is irrelevant. — jamesdlin, Dec 28 '14 at 05:18
+1 for an extremely relevant and interesting answer! It is nice to see that the actual code comparing passwords works on the plaintext and, as expected, not on the hash. — nico, Dec 29 '14 at 20:15
@jamesdlin The source code is the proof that the answers are compared in plaintext. The other answers are just conjecture that it would have to be done that way. After that, the explanation of what exactly is similar is given as further information. And further information has never been discouraged here. The deficiency in this answer is just that it doesn't spell this out. — trlkly, Jan 02 '15 at 11:41
@cpast See question title: "How does Linux know that the new password is similar to the previous one?" - Answer, by checking the old against the new _in these specific ways_. This most definitely does answer the question, and better than most of the other answers — Izkata, Jan 02 '15 at 16:01
@Izkata No, see the question _body_: "I thought the password was saved as a hash, does the comparison mean it's saved as plaintext?" This doesn't answer that question at all, or if it does it implies the wrong answer (implying the password is saved as plaintext). — cpast, Jan 03 '15 at 16:03

score 36 · Answer 3 · edited Dec 28 '14 at 23:46

The answer is far simpler than you think. In fact, it almost qualifies as magic, because once you explain the trick, it's gone:

$ passwd
Current Password:
New Password:
Repeat New Password:

Password changed successfully

It knows your new password is similar... Because you typed the old one in just a moment before.

edited Dec 28 '14 at 23:46

Peter Mortensen


answered Dec 28 '14 at 18:14

Cort Ammon


"... or candy." – Nick T Dec 30 '14 at 17:39
Silly rabbit, trix are for kids! – iAdjunct Dec 30 '14 at 17:55
What it doesn't explain is when it knows your past n passwords :) "Password has been used too recently", which prevents swapping the same few passwords in a corporate environment. – Juha Untinen Dec 31 '14 at 09:48
@Juha Untinen: That is true, but that can be handled by simply remembering the last N hashes. Catching "same as Nth password" is easy; it's the "*similar* to Nth password" that is hard. As far as I am aware, these systems only check for similarity with the last password, and sameness with the last N. If they do check for similarity with the last N... that's a really interesting trick now, isn't it! I have no idea how they'd do that. – Cort Ammon Dec 31 '14 at 17:34

score 7 · Answer 4 · answered Dec 29 '14 at 13:41

Although the other answers are right, it may be worth mentioning that you don't need to supply the old password for this to work!

In fact, one can generate a bunch of passwords similar to the new password you supplied, hash them, and then check whether any of these hashes matches the old one. If this is the case, then the new password is judged similar to the old one! :)
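A rough sketch of that idea in C follows. Several things here are assumptions made for illustration: `toy_hash` is a non-cryptographic FNV-1a stand-in for a real salted password hash such as crypt(3), the notion of "similar" is limited to one substituted or one removed character, and the names `similar_to_old_hash`, `ALPHABET` and `MAX_PW` are invented. The candidates are hashed with the salt taken from the old entry, as pointed out in the comments on the question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PW 127   /* hypothetical buffer limit for this sketch */

static const char ALPHABET[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

/*
 * TOY stand-in for a real salted password hash: 64-bit FNV-1a over
 * the salt followed by the password. NOT suitable for real use.
 */
uint64_t toy_hash(const char *salt, const char *pw)
{
    uint64_t h = 14695981039346656037ULL;

    for (const char *p = salt; *p != '\0'; p++) {
        h ^= (unsigned char)*p;
        h *= 1099511628211ULL;
    }
    for (const char *p = pw; *p != '\0'; p++) {
        h ^= (unsigned char)*p;
        h *= 1099511628211ULL;
    }
    return h;
}

/*
 * True if new_pw itself, or new_pw with one character substituted or
 * one character removed, hashes (with the old salt) to old_hash.
 */
bool similar_to_old_hash(const char *new_pw, const char *salt,
                         uint64_t old_hash)
{
    size_t n = strlen(new_pw);
    char cand[MAX_PW + 1];

    if (n > MAX_PW) {
        return false;               /* too long for this sketch */
    }

    /* Candidate 0: the new password unchanged. */
    if (toy_hash(salt, new_pw) == old_hash) {
        return true;
    }

    /* Candidates with one character substituted. */
    for (size_t i = 0; i < n; i++) {
        memcpy(cand, new_pw, n + 1);
        for (const char *c = ALPHABET; *c != '\0'; c++) {
            cand[i] = *c;
            if (toy_hash(salt, cand) == old_hash) {
                return true;
            }
        }
    }

    /* Candidates with one character removed. */
    for (size_t i = 0; i < n; i++) {
        memcpy(cand, new_pw, i);
        memcpy(cand + i, new_pw + i + 1, n - i);   /* tail plus NUL */
        if (toy_hash(salt, cand) == old_hash) {
            return true;
        }
    }

    return false;
}
```

For an n-character new password this sketch hashes at most 1 + 62n + n candidates, so the cost grows quickly once more than one edit is allowed or a deliberately slow password hash is used.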

Ant


While this is indeed a means to achieve this feat (and is used by many websites), that's not what's going on in this instance. – Brian S Dec 29 '14 at 15:43
That is a neat trick! A wee bit more computationally intensive, but clever! – Cort Ammon Dec 29 '14 at 18:55
You should at least give some estimation of how many similar passwords would need to be generated to have a meaningful check, or link to external resource. Otherwise this is just an idea of possible alternative, not a substantiated answer. – hyde Dec 31 '14 at 07:41
@hyde That depends on the criteria someone has in mind. For me, passwords are similar if at most 3 characters were added / removed / modified. So that's 62 hashes for every character (and that's if we only use alphanumerics) times the number of combinations of 3 positions from the password's length (`n`), which is `62 * (n!)/(6 * (n - 3)!)`, which equals 13640 for a 12-character password. But if anyone thinks of something different the equation is useless, so why bother? – Killah Dec 31 '14 at 14:20
Stupid answer, but an insight nevertheless. Why stupid? 1. You'd have to generate an unimaginable number of hashes. 2. Such setup would weaken the security of the original password. If someone obtained hashes of all similar passwords instead of just one hash, they'd have much easier time to crack it. – Rok Kralj Jan 03 '15 at 20:58

score 5 · Answer 5 · answered Jan 02 '15 at 11:06

One aspect was not covered: password history. Some systems support this. To do that, the system keeps a history of passwords and encrypts them with the current password. When you change your password, it uses the "old" password to decrypt the list and verify. And when it sets a new password, it saves the list (again) encrypted with a key derived from the new password.

This is how remember=N works in PAM (stored in /etc/security/opasswd). Windows and other Unix vendors also offer similar functions.
