Passwords on Linux

Arabesque has been publishing an interesting series on various aspects of user security on Linux (the index is at http://blog.sanctum.geek.nz/linux-crypto-introduction/).

We may come back to the series, but today I would like to bring up the topic of passwords: Linux Crypto: Passwords, which presents pass as a utility for managing the passwords we use day to day.

For fans of programs with a text interface, it can be a good alternative to other, more frequently advertised methods.

Should websites disclose their internal mechanisms?

Troy Hunt asks the question in Should websites be required to publicly disclose their password storage strategy?: we have forgotten our password for a site and ask to recover it; an email arrives containing our password; a cold sweat starts running down our back.

Or fine, they do not send it, because they say they cannot or will not. Are they doing it well enough? Do I care? Do users care?

On password reminders:

Another reason this doesn’t make much sense is that many websites leak information about the password storage mechanism anyway. Ever used a “forgot password feature” and been emailed your password? There’s disclosure that they’re not hashing it so it’s either immediately accessible once the database is disclosed or accessible once the key is obtained and once a box is popped, this is very, very frequently a trivial task. There’s an entire site dedicated to naming these purveyors of poor password management over at plaintextoffenders.com so certainly there is voluminous public data on them already.

And it is not that simple either:

Password storage isn’t always just as simple as “we use this hashing algorithm with this salt” and indeed the protections offered by, say, symmetric encryption may be as good as null and void if the key management strategy is bad. So how much information should be disclosed? Where do you draw the line between a simple statement as seen in the badges above and a more comprehensive – and perhaps revealing – statement of a website’s security position?

In any case, it is worth thinking the issue over before making decisions in our own projects.

Secure password storage

Storing Passwords Securely is an interesting read.

Be careful with the hashes we use to store passwords:

Typically, system designers choose one of two ways to store their users’ passwords: 1. in their original format, as plain text, or 2. as the digest (output) of a one-way hash function. It probably goes without saying that the first option is a bad idea considering that any kind of compromise of the users/password database immediately exposes login credentials clients may be using on many other sites—but would it surprise you that the latter, as implemented in the majority of web systems, only provides marginally stronger security?

The advice would be:

If you create a digest of a password, then create a digest of the digest, and a digest of that digest, and a digest of that digest, you’ve made a digest that is the result of four iterations of the hash function. You can no longer create a digest from the password and compare it to the iterated digest, since that is the digest of the third digest, and the third digest is the digest of the second digest. To compare passwords, you have to run the same number of iterations, then compare against the fourth digest. This is called stretching.

A good password storage system takes so long to process a single input, e.g. 0.2 seconds on a modern computer, that guessing a password using a brute force will take significantly longer. (With a hash algorithm like SHA-256, this might be 100,000 iterations or more.) Where, previously, one might have been able to compare digests 5.6 billion times per second, it might now be 5 times per second on the same computer without parallelization; more, maybe a few hundred or thousand attempts per second using hardware like GPUs—but still significantly less than 5,600,000,000!
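The stretching described in the quote can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the article: the function names are made up for the example, the default iteration count is taken from the figure quoted above, and salting is deliberately left out to keep the focus on the iterations.

```python
import hashlib
import hmac


def stretched_digest(password: str, iterations: int = 100_000) -> str:
    """Hash the password once, then keep hashing the previous digest.

    iterations=4 gives the "digest of the digest of the digest of the
    digest" from the quote. No salt is used here (illustration only).
    """
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    for _ in range(iterations - 1):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


def matches(candidate: str, stored: str, iterations: int = 100_000) -> bool:
    """Run the same number of iterations on the candidate and compare."""
    return hmac.compare_digest(stretched_digest(candidate, iterations), stored)
```

Every guess costs the attacker the full number of iterations, which is where the slowdown comes from. In practice one would not hand-roll this loop but use one of the key derivation functions discussed next.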

So what should we actually do?

Adaptive key derivation functions are exactly what we’ve discussed above: Functions that generate digests from passwords whilst applying salting and stretching. They implement all of the above features, and often in a way that would be difficult to achieve using just a programming language’s standard library. For instance, they might work such that the digest computation can’t easily be parallellized—something that is very doable with plain MD5 and all members of the SHA family. In effect, attackers can’t easily apply specialized hardware like GPUs or FPGAs to greatly improve the speed at which passwords can be guessed using a brute force approach.

And, by way of conclusion:

Here is my view:

– MD5, SHA-1, SHA-256, SHA-512, et al, are not “password hashes.” By all means use them for message authentication and integrity checking, but not for password authentication.
– If you are a government contractor, want to be compliant with security certifications or regulations like ISO 27001 or FIPS 140-2, or don’t want to depend on third-party or less-scrutinized libraries, use PBKDF2-HMAC-SHA-256/SHA-512 with a large number of iterations to generate digests of your users’ passwords. (Ideally it should take a second or more to generate a single digest.)
– If you want very strong password digests, and a system that is very easy to use, use bcrypt. Simple, easy-to-use libraries exist for nearly every programming language. (Just google “bcrypt ”, and chances are you’ll find a solid implementation.)
– If you are designing a new system which either relies on encryption to store very sensitive information using a weak secret (user passwords), or where it is imperative that nobody guesses any of the passwords in any reasonable amount of time, you should investigate if there is a solid implementation of scrypt for the language or framework you’re using.
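As a minimal sketch of the PBKDF2 option from the list above, Python's standard library exposes hashlib.pbkdf2_hmac. The function names, the 16-byte salt and the iteration count below are assumptions chosen for the example, not values from the article; the count should be tuned until one digest takes as long as the quoted advice suggests.

```python
import hashlib
import hmac
import os

# Assumption for the example: tune this on the target hardware.
ITERATIONS = 100_000


def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, iterations, digest) for storage alongside the user."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int,
                    expected: bytes) -> bool:
    """Recompute the digest with the stored parameters and compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)
```

Storing the salt and the iteration count next to the digest makes it possible to raise the count later without invalidating existing accounts. bcrypt and scrypt follow the same store-and-verify pattern, but bcrypt needs a third-party library.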

More and more is known about these things, and there are more and more issues to keep an eye on.

We had already covered this in Elegir un hash (“Choosing a hash”).

It is also well worth taking a look at The History of Password Security, although since it is only a slide deck some details do not fully come across (at least not to me).

Improving passwords in companies

How Companies Can Beef Up Password Security offers some ideas for improving password quality in companies. It is an interview with Thomas H. Ptacek. I will keep a couple of paragraphs:

On hashes for storing passwords, we should not forget that one of their goals is to be slow to compute: that is no problem for the user, who is barely held up, but it is a problem for a hypothetical attacker.

Well, that’s the opposite of what you want with a password hash. You want a password hash to be very slow. The reason for that is a normal user logs in once or twice a day if that — maybe they mistype their password, and have to log in twice or whatever. But in most cases, there are very few interactions the normal user has with a web site with a password hash. Very little of the overhead in running a Web application comes from your password hashing. But if you think about what an attacker has to do, they have a file full of hashes, and they have to try zillions of password combinations against every one of those hashes. For them, if you make a password hash take longer, that’s murder on them.

And another on the process of applying a hash repeatedly:

– Can you explain in layman’s terms what it is that makes a password hash like Bcrypt take so much longer to crack?

Ptacek: It’s similar to if you said, let’s take SHA-1 password, but instead of just using SHA-1, let’s run SHA-1 on itself thousands of times. So, we’ll take the output of SHA-1 and feed it back to SHA-1, and we’ll do that thousands and thousands of times, and you’ll only know if your password hash is right, when you look at the result of that 1,000th SHA-1 run. So, in order to make that password hash work, you have to run the algorithm 1,000 times for each guess. That’s roughly the tactic that the modern, secure password hashes take. These are algorithms that are designed so that you can’t arrive at the result without lots and lots of work. I mean, we’re talking about 100 milliseconds [one-tenth of a second] worth of work on modern hardware to get the results of a single password attempt.

Secure password storage

When I read Storing Passwords Securely, it struck me as recommended reading on the subject:

Time and time again you hear about a company having all of their users’ passwords, or “password hashes”, compromised, and often there’s a press response including one or more prominent security researchers demonstrating how 1,000 users had the password “batman”, and so on. It’s surprising how often this happens considering we’ve had ways to do password authentication that don’t expose users’ passwords, or at least makes it significantly harder to crack them, for several decades.

Personally, I think it boils down to a fundamental misunderstanding about what cryptographic hash functions are and what they are—or should be—used for, and a failure on the part of security researchers and advocates, myself included, to properly explain and emphasize the differences. So here’s an attempt to explain why “SHA 256-bits enterprise-grade password encryption” is only slightly better than storing passwords in plain text.

It covers hashes and other issues that need to be taken into account. It also includes The History of Password Security as a ‘bonus’.

Password managers and XSS

Abusing Password Managers with XSS describes a way to trick the browser's password manager.

An issue with both in-browser as well as third-party password managers that gets hardly any attention is how these can be abused by XSS. Because many of these password managers automatically fill login forms, an attacker can use JavaScript to read the contents of the form once it has been filled. The lack of attention this topic receives made me curious to see how exploitable it actually would be. For the purpose of testing, I built a simple PHP application with a functional login page aswell as a second page that is vulnerable to XSS (find them here). I then proceded to experiment with different JavaScript, attempting to steal user credentials with XSS from the following password managers:

LastPass (Current version as of April 2012)
Chrome (version 17)
Firefox (version 11)
Internet Explorer (version 9)

I first visited my login page and entered my password. If the password manager asked me if I wanted it to be remembered, I said yes. I then went to the XSS vulnerable page in my application and experimented with different JavaScript, attempting to access the credentials stored by the browser or password manager.

I am very much in favor of using a password manager, but I am not comfortable with it living inside the browser itself, because things like this can happen.

The fragility of personal identification numbers

[PDF] A Birthday Present Every Eleven Wallets? The Security of Customer-chosen Banking PINs analyzes data obtained on a number of PINs (Personal Identification Numbers).
Among the findings: trying the owner's birthday would give access to the ATM for 1 in every 11 cards (in reality, somewhere between 1 in 11 and 1 in 18).

The numbers themselves are analyzed, and some of them correspond to words (remember that numeric keypads sometimes carry letters as well): LOVE (5683) would be the most frequent, but there are others.
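To see how a word turns into a PIN, here is a small sketch that assumes the usual telephone-style letter layout on the keypad (2 = ABC, 3 = DEF, and so on). The names are invented for the example, and the paper's own analysis is of course more involved.

```python
# Assumed layout: the standard telephone keypad letter groups.
KEYPAD = {
    "2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
    "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ",
}

# Invert the table so each letter points at its digit.
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in KEYPAD.items() for letter in letters
}


def word_to_pin(word: str) -> str:
    """Translate a word into the digits that spell it on the keypad."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.upper())


print(word_to_pin("LOVE"))  # 5683
```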

Even so, it seems that users' choice of PINs is not as bad as with other kinds of passwords (perhaps because we are handed a PIN and never change it?).

More data on passwords

The science of password selection is (yet another) report on passwords obtained in various ways from real users on the net.

From my notes.

How do users choose their passwords?

In this case the data come from several sources:

The data I’m going to analyse comes from a variety of sources including the Sony and Gawker breaches I referenced in the previous post as well as other LulzSec releases including pron.com and a collection of their random logins.

Names are used (14% of passwords are derived from names):

I also suspect they feature heavily when someone reaches into the recesses of their mind to come up with a password. Now of course the name is not necessarily the name of the account holder; it could be a spouse, the kids or even the family dog. Furthermore, it could be a first name, a middle name or a last name.

25% are dictionary words (including the word ‘password’):

A huge 25% of passwords are derived directly from dictionary words. In reality, it’s probably somewhat higher than this as my dictionary had less than a couple of hundred thousand words. And they’re all only English language.

Top among the dictionary favourites are:

password (oh dear)
monkey
dragon

Most numeric passwords (83%) are four, six or eight digits long, but there is a fair number of length one, for example.

Why is this interesting? Well firstly, within a spread of numeric password lengths which range from 1 (yes, 1, and there’s a heap of ‘em) to 21, 83% of the passwords are either four, six or eight digits long. Is this a propensity for even numbered password lengths or something else?

The four-digit ones could correspond to an ATM PIN.
The six-digit ones would be dates with a two-digit year.
The eight-digit ones would be dates with the full year.

Passwords made of one word repeated twice (blabla) would be fewer than 3%, but it is a pattern people do use.
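The patterns mentioned so far (numeric passwords of four, six or eight digits, and a word repeated twice) are easy to detect mechanically. The following sketch is a hypothetical classifier written for this post, not the method used in the report; the labels only restate the guesses above.

```python
def classify(password: str) -> str:
    """Label a password with one of the simple patterns discussed above."""
    if password.isascii() and password.isdigit():
        length = len(password)
        if length == 4:
            return "possible PIN"
        if length == 6:
            return "possible date, two-digit year"
        if length == 8:
            return "possible date, four-digit year"
        return "other numeric"
    half, remainder = divmod(len(password), 2)
    # Even length and both halves identical: a repeated word like "blabla".
    if remainder == 0 and half > 0 and password[:half] == password[half:]:
        return "repeated word"
    return "other"


for sample in ["1234", "010185", "01011985", "7", "blabla", "monkey"]:
    print(sample, "->", classify(sample))
```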

Some people also use short phrases along the lines of ‘letmein’ or similar.

Naturally, I have made a selection according to my own tastes and biases, but it is worth reading the whole thing to get the full picture.

Password handling in programs

In security you always have to assume the worst (hard as that may be), above all when the problems that could arise are easy to avoid. Something we often point out is that when a program reads a password, it should do whatever it needs with it and then erase it from memory as soon as possible (to avoid problems with writes to disk for various reasons: operating system swapping, memory dumps after crashes, ...).

This is the kind of thing that sounds more theoretical than realistic, but in the ‘real world’ ™, in Passwords left in memory using SSH keyboard-interactive auth, we can see that someone really does take it seriously. In this case, the developers of PuTTY.
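As a rough illustration of the "use it and erase it" habit, here is a best-effort sketch in Python. It is not how PuTTY fixed the issue (PuTTY is written in C), and the helper names are made up. It assumes the secret is held in a mutable bytearray, because Python str and bytes objects are immutable and cannot be overwritten in place; the interpreter or the operating system may still keep other copies, so this reduces exposure rather than eliminating it.

```python
def wipe(buffer: bytearray) -> None:
    """Overwrite every byte of the buffer with zeros, in place."""
    for i in range(len(buffer)):
        buffer[i] = 0


def use_secret(secret: bytearray, action):
    """Run action(secret) and wipe the buffer afterwards, even on error."""
    try:
        return action(secret)
    finally:
        wipe(secret)


password = bytearray(b"hunter2")
length = use_secret(password, len)  # stand-in for the real authentication step
print(length, password)
```

After the call the buffer still exists but holds only zero bytes, so a later swap-out or crash dump of that buffer would not reveal the password.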