Credit card numbers checksums and hashes, the story of a scamming attempt
by Jeffrey Goldberg on
As Lívia and I were out walking Molly and Patty on Monday evening, I received a telephone call from an unknown number.
I decided to answer the phone anyway, and I was greeted by a recorded voice telling me that my Bank of America debit card beginning with 4217 has been limited and whether I would like to revalidate it.
I don’t have a Bank of America debit card (though I did many years ago), and I know US rules bar most calls that begin with recorded messages, otherwise known as “robocalls.” I also know that the first few digits of credit card and debit card numbers indicate the card issuer and so are not secret if you know that it is a card issued by a particular bank. It would be easy for a scammer to appear to know my card number. Once I got home, I was able to confirm that 4217 is, indeed, how most Bank of America debit card numbers begin.
Naturally I pressed 1 to “revalidate” my card (not-so-sidenote: you should not try this at home. Indeed, you really should just hang up when you get such calls). I wanted to see how this scam proceeded. At this point I was told a few more details explaining that my card beginning with 4217 needed to be revalidated because of a internal security failure and that I should press 1 to continue. I pressed 1 again.
I was then asked to enter my Bank of America debit card number. Unfortunately, I was not able to construct a number off of the top of my head that would pass their validation check. While I knew it should begin with 4127 (thanks to the robocall), I did not know at the time that the next two digits should be between 64, 65, or 66 for Bank of America. I did know that the last digit of the sixteen digit number should be calculable from the preceding 15 digits by the Luhn checksum algorithm. But I couldn’t recall the details of the algorithm at the time, and I certainly would not have been able to construct something on the fly quickly enough to enter a “valid” number.
As a consequence, I wasn’t able to get past the “enter your debit card number” step, and so never learned what other information they would ask for, such as my Social Security Number, billing address, or PIN. But it is a safe bet that those would have been requested at some point. Those of course, would have been easier to fabricate as they have no quick validation system.
The Luhn checksum is not some super secret security feature, it was a simple system designed to identify errors when exchanging credit card numbers in common use. At best, it can play a minor role in basic security checks to weed out those who just try random numbers.
The system is designed to catch common errors. If one digit of the 15 digits is entered incorrectly the checksum will fail. For example if the correct number is 4217 6601 2345 6784 (here 4 is the checksum) then if the number is given as 4217 6681 2345 6784, where the “0” has been mistyped as an” 8″, the check digit will be 7, and so 4217 6681 2345 6784, with a “4” as the last digit, will not validate.
Some people accidentally type “2435” instead of “2345”, swapping the “3” and the “4”. It happens to the best of us, and it’s another example of the the type of errors the Luhn system can catch.
Check digits of this sort, where the checksum is just one digit long, have properties that make them poorly suited for cryptographic purposes. The most obvious is that they are too small. That is, the check digit can only be one of ten possibilities—0 through 9. Put another way, there is a pretty much a 1 in 10 chance that two numbers will have the same check digit.
When two numbers result in the same checksum, we have a “collision”. If it is feasible to create two numbers that have the same check digit, the system is not collision resistant.
If we also have one number that has a particular checksum then we can easily construct another number that yields the same checksum. So the Luhn algorithm isn’t second-preimage resistant. The difference between collision resistance and second-preimage resistance is subtle. For collision resistance, the problem is “find two numbers whose checksums collide.” For second-preimage resistance, the problem is “here is a number m, now go and find another number that has the same checksum as m.”
It is also the case that if we have a checksum, we can easily construct a number that shares the same checksum. If you give me a checksum of “5”, I can construct a 15 digit number that has 5 as its checksum. That is, it is possible to work backwards from the checksum to an original. Hence, this checksum system is not preimage resistant. Functions that are preimage resistant are sometimes called “one way functions”.
The obvious reason why the Luhn algorithm fails to be preimage resistant, second-preimage resistant, and collision resistant is because it is just one digit. There just isn’t anything you can do when you only have one digit (about 3 bits) with which to work. But there are other reasons as well. I won’t discuss them at all, other than to say that the Luhn algorithm doesn’t have those three properties because it was never designed to.
Cryptographic hash functions, however, are used for more than just checking for accidental errors in data entry or data transmission. This means they do need to be one way functions (preimage resistant); they need to be second-preimage resistant; and they really should be collision resistant.
We’ve discussed how preimage resistance is one of several essential properties of proper treatment of passwords on servers. We’ve also discussed how second-preimage resistance plays a vital role in network security. Over the next few posts I will be talking about how hash functions are used for data integrity checks and specifically for authenticated encryption. Authenticated encryption will play a big role in our next data format.
I am relieved to say that both Patty and Molly did not really mind that I spent the remainder of our walk talking about the nature of the scam; how the criminals’ knowledge of the issuer number on a card can easily fool a target; and, of course, on the difference between checksums and cryptographic hashes. Lívia did not seem quite as complacent as the dogs. “The difference between us, Jeff,” she said, “is that I get really annoyed when something like that happens; but you seem to be really enjoying yourself.”
Robocalls are a lot like email spam:
As I write this, the US Federal Trade Commission is hosting a summit on robocalls. There should be more information at that site in coming days and weeks.
Correction October 22, 2012: I incorrectly suggested above that the Luhn algorithm will catch all single transcription errors involving the replacement of one digit with another and all cases where adjacent distinct digits are transposed. I was wrong. The ISBN checksum has these properties, but the Luhn Algorithm used for credit cards does not. The crucial difference revolves around the fact that the ISBN-10 system uses a prime modulus (11), while the Luhn system uses a composite modulus (10). This is also why the check digit for ISBN numbers can sometimes be “X”. The check digit there works out to be a number from 0 through 10, but “X” is used to indicate 10.
I should have spotted this instantly when I looked at the Luhn algorithm, but it required a different path for me to learn of my error. Today is a student holiday in our school district, and so I assigned my daughter the task of proving that the Luhn system will have the properties that I claimed it had. She quickly constructed some counter-examples, where changing an individual digit leads to the same check digit.