Understanding Category and Match Type

Learn the difference between an entity type's category and match type.


Overview

An entity type defines what DryvIQ detects. The DryvIQ Platform comes with preinstalled entity types that cover a wide range of data, and you can also create custom entity types. Each entity type is assigned a category, which identifies the type of data being detected, and a match type, which identifies how the data is detected. This page provides a high-level overview to help you understand entity types. To learn more about a specific match type, refer to the DryvIQ Platform documentation page for that match type.

Full Entity Type List

You can download a complete list of the preinstalled entity types to review descriptions for over 160 entity types in the DryvIQ Platform. 

 

Categories

Categories identify the type of data being detected. DryvIQ comes with the default categories listed below. You can also create custom categories if one of the default categories doesn’t capture the entity type you want to identify. (Refer to “Managing Entity Type Categories” for information about creating and managing custom categories.)

AI Classifier

This category includes entity types that use DryvIQ’s Artificial Intelligence (AI) engine to classify documents. These include:

 
 

Domestic Identifiers

These entity types identify identification numbers for the United States of America, including:

  • IRS Tax Identification Numbers (EIN, ATIN, PTIN)
  • US Driver’s License Numbers (including driver’s license numbers for each US state)
  • US Individual Taxpayer Identification Number (ITIN)
  • US Passport Numbers
  • US Social Security Numbers.
 
 

Duplicate Data Identification

These entity types work together to identify duplicate content. 

 
 

Employee Data

These entity types identify common employee data, including:

  • Employee Compensation
  • Employee ID Template
  • Employee Hire Date
  • Employee Termination Date. 
 
 

File Properties

These entity types leverage file details to classify documents. This category includes:  

  • Content Category
  • File Details
  • File Status.
 
 

Finance

These entity types are for financial information, such as:

  • ABA Routing Numbers
  • BIC/SWIFT Codes
  • Credit Card Numbers
  • International Bank Account Numbers
  • International Securities Identification Numbers
  • US Bank Account Numbers.
 
 

General

This is a catch-all category for any entity type that doesn’t fall into one of the other categories. Custom entity types are assigned to this category.

 
 

International Identifiers

These entity types use regular expressions to identify driver’s license numbers, passport numbers, and other national identifiers for specific countries.

 
 

IT Confidential 

This category includes entity types related to the technology field, such as:

  • Client & Platform IDs
  • Client Secret
  • Common Product License Keys
  • Domain Names
  • Encryption Keys
  • Generic API Keys
  • IMEI Number
  • IP Addresses - v4
  • IP Addresses - v6
  • MAC Addresses
  • Most Common Passwords
  • Password - No Format
  • Password - Secure Format
  • Tenant ID
  • User IDs.
 
 

Metadata

These entity types leverage file metadata to classify documents. Supported metadata includes:

  • Custom Metadata
  • File Versions
  • Microsoft Purview Information Protection (MPIP)
  • Tags.
 
 

Permissions

This entity type provides file permissions details, such as permissions rules, shared links, inheritance information, and the user’s active status.

 
 

Personal Details

These entity types identify personal information that doesn’t fall under the PII category, such as:

  • COVID-19 Vaccination Status
  • Date of Birth
  • Disability Status
  • Email Addresses
  • Marital Date
  • Marital Status
  • Military and Veteran Status
  • US Phone Numbers
  • US Vehicle Identification Number (VIN).
 
 

PII

This entity type can identify personally identifiable information (PII) in a document. 

 
 

Privacy

This category identifies GDPR keywords related to healthcare and insurance. 

 
 

Regulation

Regulation entity types identify information that is generally associated with regulatory agencies, such as:

  • Ethnic Characteristics
  • GDPR Banking and Finance Related Keywords
  • GDPR Government Identification Related Keywords
  • GDPR Personal Profile Related Keywords
  • GDPR Travel Related Keywords
  • Gender Identities
  • Precise Geolocation (Global, DD Format)
  • Precise Geolocation (USA Only, DD Format)
  • Racial Characteristics
  • Religious And Philosophical Characteristics
  • Sexual Orientation
  • Trade Union Memberships.
 
 

Match Type

The match type identifies how the data is detected. DryvIQ supports four match types.
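
To make the relationship between category and match type concrete, the following is a minimal Python sketch of an entity type as a simple record. It is an illustration only and not DryvIQ's data model or API: the `MatchType` enum, the `EntityType` class, and the example entity name are hypothetical. The four enum values mirror the match types described in this section, and the `General` category is used because custom entity types are assigned to it.

```python
from dataclasses import dataclass
from enum import Enum


class MatchType(Enum):
    """How the data is detected (the four match types on this page)."""
    BLOCK_LIST = "Block List"
    CLASSIFICATION = "Classification"
    REGULAR_EXPRESSION = "Regular Expression"
    TRANSFORMATION = "Transformation"


@dataclass(frozen=True)
class EntityType:
    """Hypothetical record: not DryvIQ's actual schema."""
    name: str
    category: str          # what type of data is detected
    match_type: MatchType  # how the data is detected


# Hypothetical custom entity type; the name and match type are made up.
example = EntityType(
    name="Example Custom Entity",
    category="General",
    match_type=MatchType.REGULAR_EXPRESSION,
)

print(f"{example.name}: category={example.category}, "
      f"match type={example.match_type.value}")
```

The point of the sketch is that the two attributes are independent: the category answers "what kind of data is this?" and the match type answers "how is it found?".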

Block List

A block list entity type allows matching on a given list of keywords within a file. 
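
As a rough illustration of keyword-list matching, the Python sketch below checks a piece of text against a list of keywords. It is not DryvIQ's implementation: the function name `find_block_list_matches` and the sample keywords are hypothetical, and the choice to match whole words case-insensitively is an assumption rather than documented DryvIQ behavior.

```python
import re


def find_block_list_matches(text, keywords):
    """Return the keywords that occur in text.

    Assumption: matching is case-insensitive and on whole words or phrases.
    """
    found = []
    for keyword in keywords:
        # re.escape treats the keyword as literal text, not as a pattern.
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(keyword)
    return found


sample = "This memo is Confidential and must not leave the Finance team."
print(find_block_list_matches(sample, ["confidential", "internal only", "finance"]))
# → ['confidential', 'finance']
```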

 
 

Classification

A classification entity type uses a custom classification model plugged into the system. DryvIQ has the following preinstalled classification entity types:

 
 

Regular Expression

A regular expression entity type uses a pattern to identify text. It allows you to control the content DryvIQ detects at a fine-grained level.
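
The Python sketch below shows the general idea of pattern-based detection. The pattern is an illustrative one for text shaped like a US Social Security number (three digits, two digits, four digits, separated by hyphens). It is not the pattern used by any preinstalled DryvIQ entity type, and the names `SSN_LIKE` and `find_pattern_matches` are hypothetical.

```python
import re

# Illustrative pattern only: matches the NNN-NN-NNNN shape and does not
# validate that the number is a real, issued SSN.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_pattern_matches(text, pattern=SSN_LIKE):
    """Return every substring of text that matches the pattern."""
    return pattern.findall(text)


sample = "Applicant SSN: 123-45-6789; callback number 555-0100."
print(find_pattern_matches(sample))
# → ['123-45-6789']
```

The word boundaries (`\b`) keep the pattern from matching inside longer digit runs, which is one example of the fine-grained control a regular expression provides.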

 
 

Transformation

A transformation entity type allows you to generate a new value by applying a custom metadata expression against any prior matches. This option is available only through the DryvIQ API.