Researchers at Carnegie Mellon University said on Monday that they have developed a method for predicting much, if not all, of a person's Social Security number from publicly available data. The findings were published in an article in the Proceedings of the National Academy of Sciences.
Here's what the researchers said in the preface to the report:
Information about an individual’s place and date of birth can be exploited to predict his or her Social Security number (SSN). Using only publicly available information, we observed a correlation between individuals’ SSNs and their birth data and found that for younger cohorts the correlation allows statistical inference of private SSNs. The inferences are made possible by the public availability of the Social Security Administration’s Death Master File and the widespread accessibility of personal information from multiple sources, such as data brokers or profiles on social networking sites. Our results highlight the unexpected privacy consequences of the complex interactions among multiple data sources in modern information economies and quantify privacy risks associated with information revelation in public forums.
The researchers, Alessandro Acquisti, an associate professor of information technology and public policy, and Ralph Gross, a postdoctoral researcher, tested their algorithm on 500,000 publicly available records in the Social Security Administration’s Death Master File.
Prediction accuracy depends heavily on the state and the year in which a person was born. The researchers state that, in a single try, it was possible to identify the first five digits for 44% of deceased individuals who were born after 1988 and for 7% of those born from 1973 to 1988. It was possible to identify all nine digits for 8.5% of those born after 1988 in fewer than 1,000 attempts.
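To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It is not the researchers' method: it assumes, purely for illustration, an attacker who guesses uniformly at random among all nine-digit strings (not every such string is a valid SSN, so the baseline is approximate), and the function name blind_guess_success is hypothetical.

```python
from fractions import Fraction

TOTAL_DIGITS = 9  # an SSN has nine digits


def blind_guess_success(attempts: int, known_digits: int = 0) -> Fraction:
    """Chance that `attempts` distinct, uniformly chosen guesses hit one
    target number when `known_digits` leading digits are already known.

    Illustrative assumption: every remaining digit string is equally likely.
    """
    remaining = 10 ** (TOTAL_DIGITS - known_digits)
    # More guesses than candidates cannot push the chance above certainty.
    return Fraction(min(attempts, remaining), remaining)


# Baseline: 1,000 blind guesses over the full nine-digit space.
baseline = blind_guess_success(1000)

# With the first five digits known, only the last four remain to be guessed.
with_prefix = blind_guess_success(1000, known_digits=5)

print(f"blind, 1,000 tries:            {float(baseline):.6%}")
print(f"first five known, 1,000 tries: {float(with_prefix):.0%}")
print("reported for post-1988 births: 8.5% in fewer than 1,000 tries")
```

Under these assumptions, 1,000 blind guesses succeed about once in a million, so the reported 8.5% is tens of thousands of times better than chance.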
The smaller the state, the more accurate the prediction, and the methodology was also more accurate for those born after 1988. The researchers attributed part of this to the SSA's Enumeration at Birth (EAB) initiative, which began in 1989. For people born in a small state, this can lead to high predictability:
prediction accuracies are as high as 5% for certain years and states (such as Delaware, 1996), corresponding to 1 of every 20 SSNs issued in those years and states identifiable with 10 or fewer attempts.
SSNs were never designed to be the be-all and end-all of identification that they have become. Unfortunately, they are now used and misused as ID for just about everything.
A spokesperson for the SSA, Mark Lassiter, cautioned against panic:
“The public should not be alarmed by this report because there is no foolproof method for predicting a person’s Social Security number. The method by which Social Security assigns numbers has been a matter of public record for years. The suggestion that Mr. Acquisti has cracked a code for predicting an S.S.N. is a dramatic exaggeration.”
Lassiter also said the agency was in the process of creating a random system for assigning numbers, which will be put in place next year. Of course, this does nothing to fix predictability for SSNs already issued, but coming from a huge state (California) and having been born before 1988, we're not too worried personally.