If you protect the cells of an Excel sheet (or, in the same fashion, an Open Document sheet) against modification with a password, you may be aware that this only guards against accidental changes. Anyone who wants to change the file can do so easily by opening it in an archive manager and removing the "sheet protection" tag from the worksheet XML files. Detailed instructions for this are abundant on the internet.

Apart from making the file editable, one might also be interested in the password the creator of the file chose. This note demonstrates that recovering actual passwords that remove the protection inside Excel (or LibreOffice Calc) is almost as easy as removing the protection itself. Code for this purpose is provided.

Contents

1 TL;DR
2 What is the problem?
3 How fast can passwords be recovered?
4 So why did MS never change the algorithm?
5 Why publish code to recover passwords?
6 What about responsible disclosure?
7 And where is the code now?

TL;DR

Do not use any valuable password for protecting Excel sheets! A password is valuable if you also use it (or a similar one) for other, more sensitive purposes, or if the password itself contains personal information. Neither should ever be the case. If you have reused such a password, change it wherever it protects something sensitive.

What is the problem?

There is a clear and concise description of the "algorithm" at http://chicago.sourceforge.net/devel/docs/excel/encrypt.html, written in 2001. The scheme shifts the ASCII character bytes of the password and XORs them with each other, with the password length and with a certain constant, leading to a two-byte hash which is then stored in the XLS file. Of course, so much information from the initial password is lost that the set of shortest passwords matching a given hash usually numbers in the billions. Finding a suitable password takes the blink of an eye, but this is not really a security issue since the file can be made editable even more easily anyway.

The real question is how Microsoft protects the password intended by the creator of the file from being identified. The creator, being unaware of the weakness of the algorithm, might have reused the same password elsewhere, so this is a privacy matter. If the attacker has the opportunity to attack that more sensitive data by brute-forcing billions of passwords, identifying the intended password beforehand is not even necessary. By the way, if you protect a sheet in an ODS file in LibreOffice, a proper hash function is used and a brute-force attack with off-the-shelf equipment does not seem to be viable.
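To make the scheme above concrete, here is a minimal sketch of the forward hash in C. It is not taken from Xlspwd.c. It assumes the commonly documented form of the legacy algorithm: each character is rotated left within 15 bits by its 1-based position, the results are XORed together, and the outcome is XORed with the password length and the constant 0xCE4B. The function names `rotl15` and `xls_sheet_hash` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Rotate a 15-bit value left by n bits (n is reduced modulo 15). */
static uint16_t rotl15(uint16_t v, unsigned n)
{
    n %= 15;
    return (uint16_t)(((v << n) | (v >> (15 - n))) & 0x7FFF);
}

/*
 * Sketch of the legacy sheet-protection hash (assumed form, see lead-in).
 * The constant 0xCE4B is an assumption; the description linked above
 * only needs to be consulted to confirm it.
 */
uint16_t xls_sheet_hash(const char *password)
{
    size_t len = strlen(password);
    uint16_t hash = 0;

    for (size_t i = 0; i < len; i++) {
        /* Character i (0-based) is rotated by its 1-based position. */
        hash ^= rotl15((unsigned char)password[i], (unsigned)(i + 1));
    }
    hash ^= (uint16_t)len;   /* mix in the password length */
    hash ^= 0xCE4B;          /* fixed constant (assumed value) */
    return hash;
}
```

Under these assumptions, `xls_sheet_hash("test")` evaluates to 0xCBEB. Because the constant sets the top bit and everything else stays within 15 bits, only 32768 distinct hash values can occur, which is why so many passwords collide.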

How fast can passwords be recovered?

It seems that Microsoft took the view that hiding a tree in a forest is the best protection for a secret. Admittedly, this more or less works if the intended password is long and random enough. Since there are billions of working passwords, how can you recover them and identify the intended one? The recovery problem is easy to solve. The operations of the algorithm, XOR and left shift, are very fast bitwise operations, and so are their inverses, XOR and right shift. So passwords can be recovered at a rate, in passwords per second, on the order of a few percent of the CPU clock frequency. The main limiting factor is writing the results to disk. In practice, an old laptop can recover and write to disk half a million passwords per second. So you are done with a full set of shortest passwords after anything from a few seconds to a day, depending on the password length and your hardware. Microsoft probably chose such fast operations twenty years ago in order not to slow down the user experience any further when protecting or unprotecting a sheet.

The second problem, how to identify the intended password, is more of a psychological and statistical one. Since this page is only meant to raise awareness of the issue, we do not go into detail here, but under certain conditions the set of probable passwords can surely be narrowed down to a few hundred.
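The reversal described above can be sketched as follows. This is an illustration under stated assumptions, not the published Xlspwd.c: it assumes the same hash form as the commonly documented legacy scheme (15-bit rotation by 1-based position, XOR with the length and the constant 0xCE4B), enumerates all but the last character over the arbitrary alphabet 'a' to 'z', and solves for the last character directly with one XOR chain and one right rotation. All function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define XLS_MAGIC 0xCE4Bu /* assumed constant of the legacy scheme */

/* Rotate a 15-bit value left by n bits (n is reduced modulo 15). */
static uint16_t rotl15(uint16_t v, unsigned n)
{
    n %= 15;
    return (uint16_t)(((v << n) | (v >> (15 - n))) & 0x7FFF);
}

/* Rotate a 15-bit value right by n bits: the inverse of rotl15. */
static uint16_t rotr15(uint16_t v, unsigned n)
{
    return rotl15(v, 15 - (n % 15));
}

/* Forward hash, in the assumed form described in the lead-in. */
uint16_t xls_sheet_hash(const char *password)
{
    size_t len = strlen(password);
    uint16_t hash = 0;
    for (size_t i = 0; i < len; i++)
        hash ^= rotl15((unsigned char)password[i], (unsigned)(i + 1));
    return (uint16_t)(hash ^ len ^ XLS_MAGIC);
}

/*
 * Find one password of exactly len characters (1..8) that hashes to target.
 * out must hold len + 1 bytes. Returns 1 on success, 0 if nothing was found
 * in the searched space or the target cannot be a valid hash.
 */
int xls_find_password(uint16_t target, size_t len, char *out)
{
    if (len < 1 || len > 8)
        return 0;
    if ((target ^ XLS_MAGIC) & 0x8000)
        return 0; /* valid hashes always have the top bit set */

    size_t free_n = len - 1; /* characters that are enumerated */
    memset(out, 'a', free_n);
    out[len] = '\0';

    for (;;) {
        /* Undo the constant, the length and the enumerated characters. */
        uint16_t acc = (uint16_t)(target ^ XLS_MAGIC ^ len);
        for (size_t i = 0; i < free_n; i++)
            acc ^= rotl15((unsigned char)out[i], (unsigned)(i + 1));

        /* What remains is the last character, rotated left by len. */
        uint16_t last = rotr15(acc, (unsigned)len);
        if (last >= 0x21 && last <= 0x7E) { /* printable, non-space ASCII */
            out[free_n] = (char)last;
            return 1;
        }

        /* Advance the enumerated prefix like an odometer over 'a'..'z'. */
        size_t i = 0;
        while (i < free_n && out[i] == 'z')
            out[i++] = 'a';
        if (i == free_n)
            return 0; /* prefix space exhausted */
        out[i]++;
    }
}
```

Each candidate costs only a handful of XORs and rotations, which is consistent with the recovery rates quoted above. Collecting every match instead of returning the first one would yield the full candidate set from which the intended password has to be singled out.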

So why did MS never change the algorithm?

Don't ask me. Ask MS.

Why publish code to recover passwords?

This is for demonstration purposes only. The code works only for passwords with at most eight characters. The chance that a password recovered this way is the intended one is very slim if the creator followed basic modern guidelines for choosing passwords. Nevertheless, the code shows how fast and easy the hash reversal is, and it will hopefully encourage users to use more secure products or MS to change the hash algorithm.

What about responsible disclosure?

Nothing is disclosed here. The algorithm has been public for about twenty years, and anybody who wants to grab passwords from Excel sheet protection has probably been doing so all along. This note just wants to draw attention to the problem.

And where is the code now?

Here is the C source code: Datei:Xlspwd.c
