This repository was archived by the owner on Jan 26, 2026. It is now read-only.

Shallow scan should recognize phone, credit card, person and location from column names #68

@vrajat

It is not surprising that deep and shallow scans show different results. Shallow scan only looks at column names. Deep scan looks at a sample of the data. I've even noticed that two different runs of deep scan show different results because the sampled rows differ. This is the challenge with not scanning all of the data. It's a trade-off between performance/cost and accuracy. There is no right answer.

With respect to the output in particular, my observations are:

  1. Shallow scan should recognize phone, credit card, person and location from column names
  2. Deep scan did not recognize PII in a few columns. I need to look at the data to figure out whether that's a bug or the columns simply did not contain any relevant data.
  3. Deep scan should also scan column names for candidates
  4. Along with the array of detected PII types, PIICatcher should report confidence numbers.
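
To make observations 1, 3 and 4 concrete, here is a minimal sketch of column-name matching with confidence numbers. It is not PIICatcher's actual implementation: the pattern table, the scores, and the names `COLUMN_NAME_PATTERNS`, `scan_column_name` and `merge_scores` are all hypothetical and chosen only for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical patterns and confidence scores, NOT PIICatcher's real rules.
# Each PII type maps to (regex, confidence) pairs tried against the column name.
COLUMN_NAME_PATTERNS: Dict[str, List[Tuple[str, float]]] = {
    "PHONE": [(r"phone|mobile|telephone", 0.9), (r"(^|_)(tel|cell)(_|$)", 0.6)],
    "CREDIT_CARD": [
        (r"credit_?card|card_?(number|num|no)", 0.9),
        (r"(^|_)cc(_|$)", 0.6),
    ],
    "PERSON": [
        (r"(first|last|full|sur|given)_?name", 0.9),
        (r"(^|_)name(_|$)", 0.5),
    ],
    "LOCATION": [(r"address|city|country|zip|postal|state", 0.8)],
}


def scan_column_name(column_name: str) -> Dict[str, float]:
    """Shallow scan: guess PII types from a column name alone."""
    # Lowercase and collapse punctuation/whitespace to underscores.
    normalized = re.sub(r"[^a-z0-9]+", "_", column_name.lower()).strip("_")
    matches: Dict[str, float] = {}
    for pii_type, patterns in COLUMN_NAME_PATTERNS.items():
        scores = [score for pattern, score in patterns if re.search(pattern, normalized)]
        if scores:
            # Keep the strongest match for each PII type.
            matches[pii_type] = max(scores)
    return matches


def merge_scores(name_scores: Dict[str, float], data_scores: Dict[str, float]) -> Dict[str, float]:
    """Deep scan: combine column-name candidates with scores from sampled data."""
    merged = dict(data_scores)
    for pii_type, score in name_scores.items():
        # Take the higher confidence when both sources report the same type.
        merged[pii_type] = max(score, merged.get(pii_type, 0.0))
    return merged
```

With this sketch, `scan_column_name("Phone_Number")` yields a `PHONE` candidate, and `merge_scores` lets a deep scan keep a column-name candidate even when the sampled rows happen to contain no matching values. The plain substring patterns will produce false positives (for example, `state` also matches `estate`), which is one reason to report a confidence number instead of a bare list.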

Originally posted by @vrajat in #67 (comment)
