I’m working on a project that might be exempt from IRB review…if I can convert a particular ID code to an anonymous code. The ID codes are an eight digit number. Here’s a simple approach that should work for this project.
The steps are
Create a random number between 100,000,000 and 900,000,000.
Add the same random number to each eight digit id.
Link the eight and nine digit ids to the raw data.
Store the merged data with the nine digit id, but not the eight digit id.
Shred the random number.
The process of shredding the random number will prevent anyone from converting the nine digit id back into an eight digit id.
1. Create a random number between 100,000,000 and 900,000,000.
There are several easy ways to create a random number with a specific range. In Microsoft Excel, you would place the formula
=randbetween(100000000, 900000000)
Count your zeros carefully, there should be eight of them for both numbers.
This will produce a value. Mine is 480035489, but yours will be different.
There is a complication in Microsoft Excel that you have to work around. The randbetween function will recalculate everytime you make a change anywhere else in the workbook. You need to “freeze” the random number by copying and pasting it to a different cell using the “past values” option under “paste special.”
2. Add the same random number to each eight digit id.
Here’s some data with some made up id values
private_id |
---|
12345678 |
23608461 |
34871244 |
46134027 |
57396810 |
68659593 |
79922376 |
91185159 |
Here’s what those ID numbers look like when you add my random number. We now have a random nine digit id associated with each eight digit id.
private_id | random_id |
---|---|
12345678 | 492381167 |
23608461 | 503643950 |
34871244 | 514906733 |
46134027 | 526169516 |
57396810 | 537432299 |
68659593 | 548695082 |
79922376 | 559957865 |
91185159 | 571220648 |
3. Link the eight and nine digit ids to the raw data.
Here’s the file with the actual data and the private ID code
private_id | report_date | data |
---|---|---|
12345678 | 2023-08-21 | 161 |
12345678 | 2023-08-30 | 113 |
12345678 | 2023-09-16 | 101 |
12345678 | 2023-08-21 | 137 |
23608461 | 2023-09-04 | 184 |
23608461 | 2023-07-25 | 138 |
23608461 | 2023-07-30 | 143 |
34871244 | 2023-09-16 | 101 |
34871244 | 2023-08-18 | 179 |
… | … | … |
Merge the two files together.
private_id | random_id | report_date | data |
---|---|---|---|
12345678 | 492381167 | 2023-08-21 | 161 |
12345678 | 492381167 | 2023-08-30 | 113 |
12345678 | 492381167 | 2023-09-16 | 101 |
12345678 | 492381167 | 2023-08-21 | 137 |
23608461 | 503643950 | 2023-09-04 | 184 |
23608461 | 503643950 | 2023-07-25 | 138 |
23608461 | 503643950 | 2023-07-30 | 143 |
34871244 | 514906733 | 2023-09-16 | 101 |
… | … | … | … |
4. Store the merged data with the nine digit id, but not the eight digit id.
You now have a new random_id in place of the private id.
random_id | report_date | data | |
---|---|---|---|
2 | 492381167 | 2023-08-21 | 161 |
3 | 492381167 | 2023-08-30 | 113 |
4 | 492381167 | 2023-09-16 | 101 |
5 | 492381167 | 2023-08-21 | 137 |
6 | 503643950 | 2023-09-04 | 184 |
7 | 503643950 | 2023-07-25 | 138 |
8 | 503643950 | 2023-07-30 | 143 |
9 | 514906733 | 2023-09-16 | 101 |
1 | … | … | … |
5. Shred the random number.
Once the random number is destroyed, the nine digit id cannot be converted back to the eight digit id.