
Anonymization on GDPR GET possibly anonymizes too much #173

@artemagvanian


Currently, if a record is owned via multiple paths, every anonymization attached to any of those paths is applied to the record. As a result, GDPR GET can return data that is more anonymized than necessary. Applying only the intersection of the anonymizations across all paths may be a better approach.

For instance, consider this example adapted from commento:

CREATE DATA_SUBJECT TABLE commenters ( \
  commenterHex TEXT NOT NULL UNIQUE PRIMARY KEY \
);
INSERT INTO commenters VALUES ('0');
INSERT INTO commenters VALUES ('1');
INSERT INTO commenters VALUES ('2');
CREATE TABLE comments ( \
  commentHex TEXT NOT NULL UNIQUE PRIMARY KEY, \
  commenterHex TEXT NOT NULL, \
  parentHex TEXT NOT NULL, \
  FOREIGN KEY (commenterHex) OWNED_BY commenters(commenterHex), \
  ON DEL parentHex DELETE_ROW, \
  FOREIGN KEY (parentHex) ACCESSED_BY comments(commentHex), \
  ON GET parentHex ANON (commenterHex) \
);
INSERT INTO comments VALUES ('0', '0', NULL);
INSERT INTO comments VALUES ('1', '0', '0');
INSERT INTO comments VALUES ('2', '1', '1');
INSERT INTO comments VALUES ('3', '2', '2');
INSERT INTO comments VALUES ('4', '0', '3');
INSERT INTO comments VALUES ('5', '1', '4');
GDPR GET commenters '0';

Here is the output associated with the last GDPR GET statement (for the comments table):

+------------+--------------+-----------+
| commentHex | commenterHex | parentHex |
+------------+--------------+-----------+
| 2          | NULL         | 1         |
| 3          | NULL         | 2         |
| 5          | NULL         | 4         |
| 0          | 0            | NULL      |
| 1          | NULL         | 0         |
| 4          | NULL         | 3         |
+------------+--------------+-----------+

Since the commenter with commenterHex = '0' owns the records with commentHex = '1' and commentHex = '4', anonymizing commenterHex in those rows is unnecessary: the hidden value is the requesting commenter's own identifier.
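
To make the proposal concrete, here is a rough Python sketch contrasting the current behavior (union of anonymizations over all paths) with the suggested one (intersection). It is only an illustration of the idea, not the actual implementation: `columns_to_anonymize`, `anonymize`, and the representation of each path as a set of column names are all hypothetical, and it assumes a direct ownership path contributes an empty set of columns to anonymize.

```python
def columns_to_anonymize(path_rules, combine="intersection"):
    """Combine the per-path anonymization rules for a single record.

    path_rules: one collection of column names per path through which the
    record is reached (hypothetical representation; an ownership path with
    no ANON clause contributes an empty set).
    """
    rules = [set(r) for r in path_rules]
    if not rules:
        return set()
    if combine == "union":
        # Behavior described in this issue: apply every rule from every path.
        return set().union(*rules)
    # Proposed behavior: hide a column only if every path hides it.
    return set.intersection(*rules)


def anonymize(row, columns):
    """Return a copy of row with the given columns replaced by None (NULL)."""
    return {col: (None if col in columns else val) for col, val in row.items()}


# Comment '1' from the example: owned by commenter '0' via commenterHex
# (no anonymization) and also reached via parentHex (ANON commenterHex).
comment_1 = {"commentHex": "1", "commenterHex": "0", "parentHex": "0"}
paths_1 = [set(), {"commenterHex"}]

# Comment '2': reached only via parentHex, so the single path's rule applies.
comment_2 = {"commentHex": "2", "commenterHex": "1", "parentHex": "1"}
paths_2 = [{"commenterHex"}]

print(anonymize(comment_1, columns_to_anonymize(paths_1, "union")))
print(anonymize(comment_1, columns_to_anonymize(paths_1, "intersection")))
print(anonymize(comment_2, columns_to_anonymize(paths_2, "intersection")))
```

Under these assumptions, the union nulls out commenterHex for comment '1', while the intersection leaves it intact. Comment '2', which commenter '0' does not own, is anonymized in both cases.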
