DryvIQ's Classification Entity Types

Learn about the information each classification entity type identifies.


Overview

A classification entity type is a custom classification model plugged into the DryvIQ Platform. You can add an entity type to a policy so the policy finds the associated information, or you can upload files against an entity type to check whether that information appears in them. (Refer to Uploading Samples to learn how to upload individual files for analysis against any entity type.) DryvIQ has the following preinstalled classification entity types.

Content Signature

The content signature is a unique identifier generated based on the file contents. The Content Signature entity type can identify the signature for a given document, which can help identify duplicate data. While this entity type was added as part of the duplicate detection for content scans, you can upload samples against it to identify the content signature.
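
To illustrate how a content-based signature can surface duplicates, here is a minimal sketch. It assumes a SHA-256 hash of the raw file bytes stands in for the signature. DryvIQ's actual signature algorithm is not described on this page, and the names content_signature and group_duplicates are hypothetical.

```python
import hashlib
from typing import Dict, List


def content_signature(data: bytes) -> str:
    """Hypothetical stand-in: SHA-256 hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def group_duplicates(files: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group file names by signature and keep only groups with 2+ members."""
    groups: Dict[str, List[str]] = {}
    for name, data in files.items():
        groups.setdefault(content_signature(data), []).append(name)
    return {sig: names for sig, names in groups.items() if len(names) > 1}


# Illustrative in-memory "files"; real input would be bytes read from storage.
files = {
    "a.txt": b"quarterly report",
    "copy_of_a.txt": b"quarterly report",
    "b.txt": b"meeting notes",
}
duplicates = group_duplicates(files)
```

In this sketch, files with identical bytes share a signature regardless of their names, so a.txt and copy_of_a.txt land in the same group.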

Document Type

The Document Type Classifier is an AI classifier trained to identify and classify documents as one of 188 document types. You can also upload individual documents to the classifier, which will determine the document type and provide a confidence score.

Some documents are categorized into groups. When a document matches one of these types, the group name will be displayed rather than the individual document type. Use the link below to download and review a complete list of the document types DryvIQ will identify.

⤓ Download the Document Classifier List

File Metadata Signature

The file metadata signature is a unique identifier generated based on the file name and size. The File Metadata Signature entity type can identify a document's file metadata signature, which can help identify duplicate data. While this entity type was added as part of the duplicate detection for content scans, you can upload samples against it to identify the file metadata signature.
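
A signature built only from file name and size might look like the sketch below. The hashing scheme and the function name file_metadata_signature are assumptions for illustration, not DryvIQ's documented implementation.

```python
import hashlib


def file_metadata_signature(file_name: str, size_bytes: int) -> str:
    """Hypothetical stand-in: hash the file name and size together."""
    # Assumption: a NUL separator keeps the name and the size from running
    # together, since NUL cannot appear in a file name on common file systems.
    key = f"{file_name}\0{size_bytes}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()


sig_original = file_metadata_signature("report.pdf", 10240)
sig_same_meta = file_metadata_signature("report.pdf", 10240)
sig_renamed = file_metadata_signature("report_v2.pdf", 10240)
```

Because only the name and size feed the signature, two files with the same name and size produce the same value even if their contents differ, while a renamed copy produces a different one.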

File Name

The File Name Classifier is an AI classifier trained to identify and classify a document as one of almost 700 document types based on the file name. You can also upload individual documents to the classifier, and it will identify the document type and provide a confidence score. Use the link below to download and review a complete list of the document types DryvIQ will identify.

⤓ Download the File Name Classifier List

Form Matcher

The Form Matcher is an AI classifier trained to identify and classify a document as one of over 6,000 government and other commonly used organization forms. It compares a “query” document against every indexed form and returns the indexed form with the highest similarity score. When you upload a file against the matcher, DryvIQ will include the confidence level for the matched form. The list of forms is too extensive to include on this page, but you can download the complete list using the link below.

⤓ Download the Forms Matcher List
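
The "match against all indexed documents and return the highest score" behavior described above can be sketched as follows. This is a simplified illustration only: it assumes Jaccard similarity over word sets as the score, whereas the Form Matcher is an AI classifier whose actual similarity measure is not documented here. The form IDs and text are hypothetical.

```python
from typing import Dict, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two texts, from 0.0 (none) to 1.0 (identical)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def best_match(query: str, index: Dict[str, str]) -> Tuple[str, float]:
    """Score the query against every indexed form; return the top (id, score)."""
    if not index:
        raise ValueError("index must contain at least one form")
    scored = ((form_id, jaccard(query, text)) for form_id, text in index.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical index: form ID -> extracted form text.
index = {
    "tax-form": "request for taxpayer identification number and certification",
    "employment-form": "employment eligibility verification",
}
form_id, score = best_match("taxpayer identification number request", index)
```

Here the query shares four of seven distinct words with the hypothetical tax-form entry and none with the other, so tax-form is returned along with its score.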

Language Detection

Language Detection is an AI classifier that identifies the language of a document. It currently detects 176 languages. When you upload a file against the module, DryvIQ will also include the confidence level for the detected language. Use the link below to download the complete list of languages DryvIQ will identify.

⤓ Download the Language Detection List

Microsoft Purview Information Protection

The Microsoft Purview Information Protection extension allows you to identify the Microsoft Purview Information Protection (MPIP) security labels applied to content. This extension requires you to register an application in your Microsoft Azure account to obtain the Application (Client) ID and Directory (Tenant) ID required to allow DryvIQ to access the security labels through the Microsoft Purview Information Protection Sync Service. Refer to “Microsoft Purview Information Protection Classifier Extension” for more information.

PII Extraction Module

The Personally Identifiable Information (PII) Extraction Module is a pre-trained AI model that can reliably identify and extract PII elements in unstructured data. You can also upload individual documents to the classifier, and it will identify any PII found in them and provide a confidence score. Use the link below to download a list of the PII types DryvIQ detects.

⤓ Download the PII Extraction List

Sensitive Object Detection

Sensitive Object Detection is trained to identify and classify images of sensitive data, such as identification cards, fingerprints, and license plates. You can also upload individual documents to the classifier, and it will identify any images of sensitive information. If an image contains multiple sensitive objects, all items will be identified. For example, if the document includes an image of a driver’s license, the scan will identify both the ID card and signature as detected sensitive objects. Use the link below to download the list of sensitive objects DryvIQ will identify.

⤓ Download the Sensitive Object Detector List

Supported Image Types

BM
BMP
GIF
ICB
JFIF
JPEG
JPG
PBM
PDF
PNG
TGA
TIFF (See note below.)
VDA
VST
WEBP
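
If you want to pre-check files before uploading them, a simple extension filter along these lines may help. It is a sketch that assumes the types listed above correspond to file extensions; the function name is_supported_image is hypothetical, and passing this check does not guarantee a successful scan (see the TIFF note below).

```python
from pathlib import Path

# Extensions taken from the supported image types listed above.
# Assumption: each listed type maps directly to a file extension.
SUPPORTED_IMAGE_EXTENSIONS = frozenset({
    "bm", "bmp", "gif", "icb", "jfif", "jpeg", "jpg", "pbm",
    "pdf", "png", "tga", "tiff", "vda", "vst", "webp",
})


def is_supported_image(file_name: str) -> bool:
    """Return True if the file's extension appears in the supported list."""
    extension = Path(file_name).suffix.lower().lstrip(".")
    return extension in SUPPORTED_IMAGE_EXTENSIONS


candidates = ["scan.PNG", "license.jfif", "notes.docx", "README"]
uploadable = [name for name in candidates if is_supported_image(name)]
```

The comparison is case-insensitive, and files without an extension are treated as unsupported.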

TIFF Image Types

A TIFF is a complex image file made up of multiple parts, so DryvIQ cannot scan every TIFF file. If DryvIQ identifies a TIFF file but can’t scan it, it logs an error in the Activity Log.