Congratulations, your program is vulnerable to a length-extension attack [0].
What you want is something secret determining the output of something public... so instead of reinventing the wheel (the number 1 source of issues in crypto), use the standard, i.e. HMAC [1].
Now for the rest of the algorithm, IANAC, but truncating sounds like a bad idea.
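For comparison, here is a minimal sketch of the two constructions. The plain-hash format follows the SHA256(base_phrase + " - " + door_id) description given elsewhere in this thread. The function names are made up for illustration, and this is not pastor's actual code.

```python
import hashlib
import hmac


def derive_plain(base_phrase: str, door_id: str) -> str:
    """Plain SHA-256 over secret + separator + public id (format assumed from the thread)."""
    data = (base_phrase + " - " + door_id).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def derive_hmac(base_phrase: str, door_id: str) -> str:
    """HMAC-SHA256: the secret phrase is the key, the public door id is the message."""
    return hmac.new(
        base_phrase.encode("utf-8"),
        door_id.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
```

Both return 64 hex characters; any truncation or re-encoding into a password alphabet would happen afterwards.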
> The password quality is just adequate, but I think the idea has potential to be very secure.
The op makes no broad claims. I think your insights are helpful, but I would change your wording from "Congratulations" to "unfortunately" and then politely point out how to resolve the issue.
There is no issue. Length extension requires the full (or as close to full as to make no difference) output of the hash function.
The problem with hash functions like SHA-1, SHA-2 and MD5 is that their output represents their entire internal state.
In this case, let's say an attacker recovered the password for door 1. They could then compute the passwords associated with doors whose number starts with 1.
As the password is substantially truncated, this does not represent the final state of the hash, and the attack is not useful.
That aside, this still suffers from weak key derivation, allowing a more direct bruteforce attack, as others have mentioned.
I agree that HMAC is the better option, but I don't see any reasonable scenario where length-extension is actually relevant for this application.
Also, I think that truncating actually improves security: An attacker who knows n (door_id, password) pairs and is bruteforcing the base_phrase can only check each base_phrase candidate for the first m bytes. This increases the odds of a 'false collision' in the brute forcing; that is, they find a different base_phrase which happens to work for their (door_id, password) pairs, but won't work for door_ids which they don't know.
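A rough back-of-the-envelope for the 'false collision' argument, assuming each truncated password behaves like kept_bits independent, uniformly random bits (an idealization; the function name and the numbers below are made up for illustration):

```python
def expected_false_matches(candidates: int, kept_bits: int, known_pairs: int) -> float:
    """Expected number of wrong base phrases that still match every known pair.

    Each wrong candidate matches one truncated password with probability
    2**-kept_bits, and the known pairs are treated as independent.
    """
    return candidates * 2.0 ** (-kept_bits * known_pairs)


# 2**40 candidate phrases, 32 bits kept per password:
one_pair = expected_false_matches(2**40, 32, 1)   # 256.0
two_pairs = expected_false_matches(2**40, 32, 2)  # 2**-24, effectively zero
```

Under these assumptions the effect is real with a single known pair, but it shrinks very quickly once the attacker knows a second one.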
1. Various services have differing ideas about what characters can be used in a password and how long it can be. My default settings are to generate 20-character passwords with all kinds of characters and I frequently have to adjust those settings if the service in question only allows me to use letters. Or just 8 characters. Granted, if MaxLength for input fields is set consistently then generating a too-long password will do no harm, but character set might remain problematic (or insecure).
2. Changing a password (e.g. after a leak) requires one to use a different door id, which then leads to remembering the appropriate ID for the currently-valid password. To properly mitigate this I guess the only option is to sort-of go back to traditional password managers by having a lookup table door id to enter → door id that is used to generate the password.
Solving these issues was a fun challenge when I built a project similar to this a few years ago. It's one of the few things I've built that I use every single day.
My project, Cryptasia (www.cryptasia.com), uses a Google Spreadsheet as a data-source where each line defines a single set of generation parameters: the allowed character set, number of characters, domain name, hash seed, and name of the key. When you pick a service (e.g. gmail) and enter your passphrase, it creates a password for you using that unique combination. It also hashes the passphrase and displays an image based on what you typed, so if you recognize your image, you know you typed it properly. Some advantages of using a data-source like this are that it can be entirely client-hosted (the google doc need not be public) and you can regenerate your password for a site at any time just by changing the seed.
One interesting difference in OP's implementation is the use of a PIN as well as a passphrase. What led to this implementation?
(unfortunately) The more I think about this, the more flaws I find... I looked through pastor.py and essentially you're just creating a different password. There's no difference between using this generated password and another password (you could argue that the generated password is harder to brute force, but that's it really).
"The generated password is also site-unique and thus leaves you more resilient against sites losing their password databases or being outright malicious"
Assuming this tool became popular, I do not think it would make any difference in the scenario you described. If I know that a lot of people use this to generate their passwords, I can:
* guess the door id (e.g. facebook or fb for facebook.com),
* concatenate it to usual attack guess,
* hash the result one more time and continue in exactly the same way as usually.
Bonus for the attacker: if any of the password databases leaks and the attacker manages to acquire your passphrase this way, he needs only a few door id guesses to get access to any of your accounts.
That's the usual purpose of a password manager. Freeing you from having to remember long and/or complex passwords so you can effortlessly have stronger passwords and more convenience (also different passwords for different services without having to remember them all).
Keep in mind that this is something for personal use to retrieve passwords used somewhere, not for storing passwords for users within a service (at least your confusion sounds like you might be confusing those two things).
The problem is that the 'door' is your password now - you have to remember all the different doors, or use a password manager to store them for you... But then why not just keep the actual passwords in the manager?
The door is just the identifier you use for retrieving a certain password. You can just use your username, or the e-mail address you used to sign up. Said identifier is not a password in that it's not secret. You can even write them down.
The generated password is also site-unique and thus leaves you more resilient against sites losing their password databases or being outright malicious, but the rekeying problem in the grandparent post is a major (essentially fatal) downside.
Many commenters claim that pastor is not watertight, and that is of course 100% true. I completely agree! There is much room for improvement and I appreciate these comments very much.
However, we all have an Achilles heel in our online security management strategies. I'm guessing that master email addresses (used to reset passwords and authenticate) and smartphones are a much, much weaker link than a system like pastor, even when it is only used to generate parts of passwords and everything else is written down. Would you continue reading this comment if you suddenly became aware that you had no clue where your phone was and vaguely remembered putting it on some counter in some store somewhere? I even had to check for my phone while writing that. Twice.
Also, there are a lot of integrated systems for managing passwords. Centralized paid services can of course sport far higher quality assurance, but pastor is a completely self-contained script that can be remembered and kept anywhere. By using a simple solution one removes a lot of the pitfalls that are present in more complex solutions. The weaknesses, which are always there in any system, can be more easily understood and managed.
Disregarding the security of this particular implementation and its trade-offs, the problem is that while it's a simple solution, it's not convenient: password managers are more usable. I'm saying this as a guy who once wrote a similar password generator, used it for some time, then switched to a password manager.
When you need to change a password for some website, you'll have to add something to your "door id", e.g. a counter. Then you'll have to remember such counters for every website where you decide to change the password. Due to this I was a bit reluctant to change passwords, which is dangerous.
If you want simplicity, the simpler and more convenient solution is to keep an encrypted text file.
--
As for security, the best approach to deterministic password generators I've seen is this:
This is true. I have door ids with version numbers in them, but then again I store a list of door ids in clear text.
I used to have an encrypted list of passwords, but I was reliant on more complicated software to encrypt and decrypt the password file. Also there was more bookkeeping involved. Whenever I added a password I had to remember to save and encrypt the file and then decrypt it again to verify that I had used the right master password.
Do you think pwclip [1] would work well for you? The key file is optional, if you provide a -c argument that prompts for your passphrase (and ideally returns scrypt(pw)).
I used to use something like this, but then you realize it won't work for 50% of the sites out there, as you can't customize how the passwords are generated.
Can't use special characters? Must use a special character? Can use $%^& but not * ?
Amex has a limit of 8 characters, how many does this generate? I want to use more than 8 for most sites, how do I do that? etc.
The Javascript bookmarklet let me change these settings, but then I had to remember them for each site that had custom settings.
Yup, sites that impose their own password restrictions, with the intent of strengthening passwords, that actually serve to weaken them, have a lot to answer for.
You can add a standard string to all passwords, for example "Ab1#", so that it satisfies like 99% of sites. You do get into situations where you have to remember some more information, but this can mostly be written down.
"You do get into situations where you have to remember some more information, but this can mostly be written down."
That boils down to "this idea won't work"; "writing down my passwords" is the exact failure case we're trying to get away from. And I don't have that problem with LastPass.
The name of the game isn't to try to make this idea work at all costs. The name of the game is to find the best way to manage passwords. It's important not to lose sight of that in the argument. (If you look around, you'll see this particular cognitive problem come up a lot in engineering... never lose sight of the overall goal no matter how far into the trees you go.)
Unfortunately, at least a dozen iterations of this idea have come to my attention, to say nothing of who-knows how many hundreds or thousands of implementations of this basic idea there are, and they haven't succeeded because they really don't work in the real world.
If I write it down, I will lose it. I do have some passwords that follow a specific pattern that I can remember easily for sites I might need to access outside of having KeePass on a machine (haven't put it on my phone).
I love this idea. I hate having an actual database to look after and secure, and I wish it could be done in a way that wouldn't require that, but sites have too many variables for passwords; for me, literally, this wouldn't work on sites I use every day.
If your pass phrase is compromised you've compromised all your passwords, even future passwords you've not yet used if you're unaware of the compromised phrase.
You could easily increase security by adding another rule that only you know to the seed.
For example, if I am generating a password for [email protected], I could have a rule that I always append the first four letters of the domain name to my seed.
In this case, I would actually generate a password for [email protected]+exam.
In this scenario, my master password may be compromised, but the hacker would still have a lot of difficulty using the master passphrase to access any of my accounts.
This, of course, only works if no one else knows your rule, and you could get far more clever with the seeding rule than I was just now before my first cup of coffee.
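The rule described above is tiny when written down. This sketch uses a placeholder address and domain (hypothetical values), and the function name is made up:

```python
def seeded_door_id(door_id: str, domain: str, letters: int = 4) -> str:
    """Append the first `letters` characters of the domain to the door id."""
    return door_id + "+" + domain[:letters]


# Placeholder values, not taken from the thread:
seed = seeded_door_id("user@example.com", "example.com")  # "user@example.com+exam"
```

The result is what gets fed to the generator in place of the bare door id. The rule adds very little entropy, so it only helps while it stays private.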
Thanks, I came here to make pretty much those same points. From the link:
Normal Encrypted DB:
To steal your passwords, the attacker has to steal the database. Then she has two options: steal the master password or guess/bruteforce it.
Password Generation Method:
To steal your password, the attacker has to steal your master password, or steal one of the derived passwords and guess/bruteforce the master password. Once she has it, she can generate any current and future passwords, until you stop using the generator with the same master password.
> None of these passwords are ever stored. The pass-phrase is not a master password to some stored list of passwords.
True... but what makes this method superior to simply having a master password to an (encrypted) stored list of passwords? What makes it more secure than that ordinary way of doing things?
It seems to be less convenient in some ways (there are more things you have to remember), although perhaps more convenient in others (there is no password database to keep sync'd across multiple computers or to lose).
This strategy deftly sidesteps three potential problems:
* Oh great, I forgot to sync my password list on my MacBook before I left the house with only my iPhone
* Oh great, my MacBook got stolen and my backups are all corrupted
* Oh great, my MacBook got stolen and I don't know the password to my backup service that has all my passwords because the password was stored on the machine that just got stolen
Yes, all solvable problems. But, essentially, if you can commit this algorithm to memory, you can always retrieve your passwords, no matter what hardware, backups, or 3rd-party systems go missing.
Hi, building on what a lot of other commenters have said, there are some major improvements you should consider:
* Do not use plain SHA-256. Use a KDF, like PBKDF2, bcrypt or scrypt.
* There must be some efficient method by which I can re-issue compromised passwords.
* For convenience, allow me to have different character sets for different sites, and different lengths.
* Make copying directly to my clipboard the default.
* A way to recover from the compromise of the master passphrase
The way I would do this is to have a small file, detailing, for each door_id:
* Random seed. This is effectively a version number. Change this, and get a new password for only that door_id.
* Character set
* Length
This file, although it is not advisable to expose it, is not enough to break the other generated passwords. Given a strong enough passphrase (I'd say... a 5 or 6 word diceware password), it can be synced to all sorts of places, kept on a USB stick, etc.
I cannot, off the top of my head, see any way to re-issue the master passphrase effectively.
I could not, in good conscience, recommend that anyone use pastor as it stands, although a few relatively minor modifications could make it quite an interesting tool.
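A minimal sketch of the per-door settings idea from this comment, with the "file" shown as an in-memory dict. The door names, seeds and the HMAC-based derivation are assumptions for illustration, not pastor's code, and a real version would put a slow KDF in front of the passphrase.

```python
import hashlib
import hmac
import string

# Hypothetical per-door settings; in the proposal these would live in a small file.
DOORS = {
    "gmail": {"seed": 1, "charset": string.ascii_letters + string.digits, "length": 20},
    "amex": {"seed": 3, "charset": string.ascii_lowercase + string.digits, "length": 8},
}


def generate(passphrase: str, door_id: str, doors: dict = DOORS) -> str:
    """Derive a password for door_id from the passphrase and that door's settings."""
    cfg = doors[door_id]
    message = f"{door_id}:{cfg['seed']}".encode("utf-8")
    digest = hmac.new(passphrase.encode("utf-8"), message, hashlib.sha256).digest()
    if cfg["length"] > len(digest):
        raise ValueError("length exceeds the 32 bytes available from one digest")
    charset = cfg["charset"]
    # Byte-modulo mapping is slightly biased; acceptable for a sketch only.
    return "".join(charset[b % len(charset)] for b in digest[: cfg["length"]])
```

Bumping the seed for one door changes only that door's password, which is the re-issue mechanism the comment asks for.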
Using a file on disk to keep track of version numbers, lengths, and character sets seems quite extreme, given the aim is to remove the need for files on disk!
Perhaps some other way of keeping track of these, without the user needing to remember them could be devised?
If you use anything other than a very strong master password, this scheme leaves you unprotected. When you sign up to a site, you are effectively giving that site, and anybody intercepting that communication SHA256(base_phrase + " - " + door_id). This enables anybody with one of the generated passwords to crack the hash and obtain the master password. This should use a very slow hash function instead.
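One way to get the slow hash suggested here is PBKDF2 from Python's standard library, sketched below. Using the door id as the salt and the iteration count shown are assumptions for illustration, not recommendations from the thread; scrypt or bcrypt would be drop-in alternatives.

```python
import hashlib


def derive_slow(base_phrase: str, door_id: str, iterations: int = 600_000) -> str:
    """PBKDF2-HMAC-SHA256 with the door id as salt; returns 64 hex characters."""
    key = hashlib.pbkdf2_hmac(
        "sha256",
        base_phrase.encode("utf-8"),
        door_id.encode("utf-8"),
        iterations,
        dklen=32,
    )
    return key.hex()
```

Every guess at the master password now costs the attacker the full iteration count instead of a single SHA-256 call, while the legitimate user pays that cost only once per lookup.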
I'm not sure if I quite understand this - so you're hashing the pass-phrase+doorID, and then storing that? Doesn't that mean that the hashed, well, generated password is always the same, regardless which website you are on?
I like the idea of supergenpass, as it hashes a password+URL to create unique passwords on a per-website basis.
I think the idea is that you choose a unique door id for every website, something you can easily remember.
It's also a deterministic (but difficult to reverse, I assume) algorithm, so it doesn't ever store anything, you just regenerate the password to "look it up".
The password is essentially hash(passphrase + door id). Which means, nothing is stored, because both of those are inputs to the program. At least that's how I understood it.
But what I don't understand is, if the password is never stored, how can you allow for logins, etc.? I mean, there must be a way of comparison, or something? I think perhaps I'm having a brain melt, or I'm misunderstanding this completely. To me, this looks like it's creating a password from hashing the pass-phrase and doorID, but that just then generates another password...
There are scores of these sites out there, like http://oplop.appspot.com/. The nice thing about having a small application do it is that you don't have to run it on a browser, which I inherently trust less... all the same, it's pretty convenient to have it on a web page (e.g., when you're trying to enter something in on your mobile phone, or on a device that isn't yours).
My favorite password manager, pass, is fantastic, but unfortunately I can't use it if I ever leave my laptop.
It takes the domain of the website you need a password for, the login you want to use, and a PIN that can be website-dependent. It outputs a deterministic password that's 12 characters long and contains at least one uppercase letter, one lowercase letter, one number, and one symbol (since some websites require you to have that).
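A sketch of how a deterministic password with guaranteed character classes could be produced. This is not the tool's actual algorithm: the symbol set, the use of the PIN as an HMAC key, and the function name are all assumptions for illustration.

```python
import hashlib
import hmac
import string

CLASSES = [string.ascii_uppercase, string.ascii_lowercase, string.digits, "!@#$%^&*"]
ALL_CHARS = "".join(CLASSES)


def site_password(domain: str, login: str, pin: str, length: int = 12) -> str:
    """Deterministic password with at least one character from every class."""
    if not len(CLASSES) <= length <= 16:
        raise ValueError("length must be between 4 and 16 for a single digest")
    message = f"{domain}\n{login}".encode("utf-8")
    stream = iter(hmac.new(pin.encode("utf-8"), message, hashlib.sha256).digest())
    # One guaranteed character per class, then fill from the combined alphabet.
    chars = [cls[next(stream) % len(cls)] for cls in CLASSES]
    chars += [ALL_CHARS[next(stream) % len(ALL_CHARS)] for _ in range(length - len(CLASSES))]
    # Fisher-Yates shuffle driven by the remaining digest bytes, so the
    # guaranteed characters do not always sit in the first four positions.
    for i in range(len(chars) - 1, 0, -1):
        j = next(stream) % (i + 1)
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)
```

The modulo mapping is slightly biased, and sites that reject some of these symbols would still need their own character set.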
(And no, there are no benefits in using a KDF if you think for a second about the constraints and the threat model)
I've built a similar tool called Password Wallet for Android two years ago, but there was little interest. I made the generation process customizable so that you could choose to have special characters, numbers etc.
First, it's inconvenient because the same door/user id and password on multiple sites will just result in the same hash, so getting different hashes means having different user id's, something not everyone does or can do.
Second, it seems insecure. All you have to do is know the door id (username) and you can brute force the password just like any other salted password hash.
ok, so [email protected]@site.com would be my unique door id, so each site would have a unique hash. that solves the first problem, basically.
if someone compromises that site, they get the unique hash. presumably they're not using an expensive hashing algorithm (most sites don't). since they know my username and the site name, they can start brute forcing my unique hash to determine the password. sure, the program could be using an expensive hash algo that would make this take some time, but the whole point (i thought) was to prevent passwords at rest. with this you still have passwords at rest... they're just distributed out across the internet.
there is benefit here, in that you have to compromise each site to get each unique set of credentials. but the downside is a much bigger single point of failure. you trade off the inherent security of memorized complex passwords for one complex master password and the hope that nobody will ever discover or brute force it.
at the end of the day, passwords are still vulnerable to the same flaw: you only need one attack vector to succeed in compromising accounts. with 2-factor auth you need two attack vectors to succeed, which isn't impossible, but is harder, which is really what security is all about.
If you looked at a random cryptographically secure hash on the internet, would you be able to determine what hashing algorithm was used? Would you know how many rounds it was sent through, or whether any other hashing techniques were used prior to the final hash?
Yes, yes, yes. Password hashes that use salts or rounds are prefixed with said information, either so you know how to decrypt it, or know how to strip away the salt, or know what number of rounds to decrypt to.
So if I did a salted sha256 followed by 8 rounds of bcrypt each with their own unique salt (probably a bad idea and I know it) you would know that I started with sha256? Wouldn't the final round of hashing obscure all previous rounds? And how would you strip away a salt without having the original, unhashed information? The whole point of a salt is to be factored in prior to the hashing process.
e.g. see www.supergenpass.com (who use an identicon as a visual hash of your passphrase for verification, rather than a 4-digit code, so might be easier to recognise)
I'm personally a big fan, at least for low-importance logins. As other commenters have mentioned, there are definitely security trade-offs that need to be considered.
Thought of this a while ago and it probably already exists... but I hadn't seen the one with a pin code yet to verify that your passphrase is correct. That's a nice addition!
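A check code like this could be derived as follows. This is a hypothetical scheme for illustration, not necessarily how pastor computes its PIN.

```python
import hashlib


def check_pin(passphrase: str) -> str:
    """Four-digit check code derived from the passphrase (hypothetical scheme)."""
    digest = hashlib.sha256(passphrase.encode("utf-8")).digest()
    return f"{int.from_bytes(digest[:4], 'big') % 10_000:04d}"
```

A mistyped passphrase would show the same code only about once in 10,000 tries. The flip side is that anyone who sees the code learns roughly 13 bits about the passphrase.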
My Chrome extension (which I don't use anymore, but here it is: https://chrome.google.com/webstore/detail/cryptopass/hegbhhp...) uses a similar technique, but instead of displaying this "PIN code" it uses different colors for the master password input box text: the color changes as you type.
Saw the colors thing before. Interesting but I think I prefer the PIN because it's easier to check (you can't compare color shades on different screens), though I must say I haven't used the PIN method before (I have used the color method and found that it didn't really add much).
I have used a similar approach, but I don't hash them. Here's an example:
Suppose I have the master password "t3st1ng" and "!@#" as separator. When I want to register on site www.reddit.com, I just use the password "reddit.com!@#t3st1ng".
This way I always have a strong password and I can use different passwords in every site, and I just have to remember the master password.
Let's say badsite.com stores your password in plaintext and their database is compromised (or they're malicious actors in the first place who created the site with the purpose of gathering login credentials).
Now, an attacker who sees this will try going to gmail.com and entering the password gmail.com!@#t3st1ng (with your email address), or to bankofamerica.com and trying bankofamerica.com!@#t3st1ng.
I would never recommend using this for mission-critical passwords like your bank or Gmail, but I think for most throwaway sites in the past, this was OK. Now with the availability of password managers, I think the clear winner is to use a password manager.
I prefer running a simple algorithm in my head so if my plaintext password gets leaked, an attacker can't just replace "reddit.com" with the name of another site.
[0] https://en.wikipedia.org/wiki/Length_extension_attack
[1] https://en.wikipedia.org/wiki/HMAC