| Foreword | 6 |
|---|
| Acknowledgments | 8 |
|---|
| Contents | 9 |
|---|
| Contributors | 11 |
|---|
| Part I Overview | 13 |
|---|
| 1 Introduction to Privacy and Anonymity in Information Management Systems | 14 |
| 1.1 Background and Motivation | 14 |
| 1.2 Organization of the Book | 15 |
| 1.2.1 Part II: Theory of SDC | 15 |
| 1.2.2 Part III: Preserving Privacy in Distributed Applications | 16 |
| 2 Advanced Privacy-Preserving Data Management and Analysis | 18 |
| 2.1 Introduction | 18 |
| 2.2 Managing Anonymized Data | 20 |
| 2.2.1 Randomization-Based Anonymization Techniques | 20 |
| 2.2.2 Aggregation-Based Anonymization Techniques | 22 |
| 2.3 Managing Time-Varying Anonymized Data | 23 |
| 2.3.1 Anonymizing Multiple Releases | 24 |
| 2.3.2 Anonymizing Data Streams | 26 |
| 2.4 Privacy-Preserving Data Analysis (PPDA) | 27 |
| 2.4.1 Privacy-Preserving Association Rule Mining | 27 |
| 2.4.2 Privacy-Preserving Classification | 29 |
| 2.4.3 Privacy-Preserving Clustering | 33 |
| 2.5 Conclusions | 35 |
| References | 36 |
| Part II Theory of SDC | 39 |
|---|
| 3 Practical Applications in Statistical Disclosure Control Using R | 40 |
| 3.1 Microdata Protection Using sdcMicro | 40 |
| 3.1.1 Software Issues | 41 |
| 3.1.2 The sdcMicro GUI | 41 |
| 3.1.3 Anonymization of Categorical Variables | 43 |
| 3.1.4 Anonymization of Numerical Variables | 52 |
| 3.1.5 Disclosure Risk | 55 |
| 3.1.6 Case Study Using Real-World Data | 57 |
| 3.2 Tabular Data Protection Using sdcTable | 59 |
| 3.2.1 Frequency and Magnitude Tables | 59 |
| 3.2.2 Primary Sensitive Cells | 60 |
| 3.2.3 Secondary Cell Suppression | 61 |
| 3.2.4 Software Issues | 61 |
| 3.2.5 Anonymizing Tables Using sdcTable -- A Guided Tour | 63 |
| 3.2.6 Summary | 68 |
| 3.3 Summary | 68 |
| References | 69 |
| 4 Disclosure Risk Assessment for Sample Microdata Through Probabilistic Modeling | 72 |
| 4.1 Introduction | 72 |
| 4.2 Disclosure Risk Measures and Their Estimation | 75 |
| 4.2.1 Notation and Definitions | 75 |
| 4.2.2 Estimating the Disclosure Risk | 76 |
| 4.2.3 Model Selection and Goodness-of-Fit Criteria | 78 |
| 4.3 Complex Survey Designs | 80 |
| 4.4 Measurement Error Models for Disclosure Risk Measures | 81 |
| 4.5 Variance Estimation for Global Disclosure Risk Measures | 83 |
| 4.6 Examples of Applications | 85 |
| 4.6.1 Estimating Disclosure Risk Measures Under No Misclassification | 85 |
| 4.6.2 Estimating Disclosure Risk Measures Under Misclassification | 90 |
| 4.6.3 Variance Estimation and Confidence Intervals | 93 |
| 4.7 Extensions to Probabilistic Modeling for Disclosure Risk Estimation | 93 |
| References | 97 |
| 5 Exploiting Auxiliary Information in the Estimation of Per-Record Risk of Disclosure | 99 |
| 5.1 Introduction | 100 |
| 5.2 Risk Measures and Models for Risk Estimation | 101 |
| 5.2.1 Superpopulation Models for Risk Estimation with Survey Data | 102 |
| 5.2.2 SPREE-Type Estimators for Cross-Classifications | 104 |
| 5.3 Simulation Plan and Data | 108 |
| 5.4 Risk Estim
|