Privacy in context : the costs and benefits of a new deidentification method
Author(s)
Trepetin, Stanley
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Peter Szolovits.
Abstract
The American public continues to be concerned about medical privacy, and policy research continues to show public demand for health organizations to protect patient-specific data. Health organizations need personally identifiable data for unhampered decision making; however, identifiable data are often the basis of information abuse when improperly disclosed. This thesis shows that health organizations may use deidentified data for key routine organizational operations. I construct a technology adoption model and investigate whether a for-profit health insurer could use deidentified data for key internal software quality management applications. If privacy-related data are analyzed without rigor, little support is found for incorporating more privacy protections into such applications: legal and financial motivations appear lacking, adding privacy safeguards to such software programs apparently does not improve policy-holder care quality, and existing technical approaches do not readily allow for data deidentification while permitting key computations within the applications.
A closer analysis of the data reaches different conclusions. I describe the bills currently passing through Congress to mitigate abuses of identifiable data within organizations. I create a cost and medical benefits model demonstrating the financial losses to the insurer and medical losses to its policy-holders due to less privacy protection within the routine software applications. One of the model components describes the Predictive Modeling application (PMA), used to identify an insurer's chronically ill policy-holders. Disease management programs can enhance the care of such individuals, and improving their health can in turn reduce costs to the paying organization. The model quantifies the decrease in care and the rise in the insurer's claim costs when the PMA must work with suboptimal data due to policy-holders' privacy concerns regarding the routine software applications. I create a model for selecting variables to improve data linkage in software applications in general. I then construct an encryption-based approach that allows for the secure linkage of records despite errors in linkage variables. I test this approach as part of a general data deidentification method on an actual PMA used by health insurers. The PMA's performance is found to be the same as when executing on identifiable data.
Description
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (leaves 131-150).
Date issued
2006
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.