About Visual Classification Technology

Visual classification is the only technology that classifies both electronic and scanned documents regardless of the amount or quality of text associated with them.

From the user perspective, visual classification is extremely easy to understand and work with. Once documents are collected, visual classification groups documents based on their appearance. This normalizes documents regardless of the types of files holding the content. The Word document that was saved to PDF will be grouped with that PDF and with the TIF that was made from scanning a paper copy of either document.
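
To make the idea of grouping by appearance concrete, here is a minimal sketch. It is not BR's algorithm, which this page does not describe. It assumes each document's first page has already been rendered to a small grayscale grid (the hypothetical `Grid` type) and swaps in a simple average hash as the visual signature. The names `signature`, `hamming`, `group_by_appearance`, and `max_distance` are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical input: a page rendered to a small grayscale grid (values 0-255).
Grid = List[List[int]]


def signature(page: Grid) -> int:
    """Average hash: one bit per cell, set when the cell is brighter than the page mean."""
    cells = [value for row in page for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def group_by_appearance(pages: Dict[str, Grid], max_distance: int = 1) -> List[List[str]]:
    """Put each document in the first group whose representative signature is close enough."""
    groups: List[Tuple[int, List[str]]] = []
    for name, page in pages.items():
        sig = signature(page)
        for representative, members in groups:
            if hamming(representative, sig) <= max_distance:
                members.append(name)
                break
        else:
            groups.append((sig, [name]))  # no close group found: start a new one
    return [members for _, members in groups]


invoice = [[250, 250, 20, 20], [250, 250, 20, 20], [250, 250, 250, 250], [20, 20, 20, 20]]
pages = {
    "invoice.docx": invoice,
    "invoice.pdf": invoice,
    # A scan of the same page: same layout, slightly noisy pixel values.
    "invoice_scan.tif": [[240, 245, 30, 25], [243, 238, 28, 35], [236, 241, 244, 239], [31, 27, 33, 29]],
    "memo.pdf": [[20, 20, 20, 20], [250, 250, 250, 250], [250, 250, 250, 250], [250, 250, 250, 250]],
}
grouped = group_by_appearance(pages)
print(grouped)
```

In this toy data the Word file, the PDF, and the noisy scan share a signature and land in one group, while the memo starts a group of its own, regardless of file type.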

The grouping is automatic: there are no rules to write up front, no exemplars to select, and no seed sets to tune. This is what a collection of documents might look like before visual classification is applied, with no order and no way to classify the documents:

[Figure: Visual Classification - Before]

When the initial results of visual classification are presented to the client, the groups are arranged by the number of documents in each group, so reviewing the first group affects the most documents. After reviewing one or two documents per group, the reviewer can determine (a) whether the documents in the group should be retained and (b) if so, what document-type label to associate with the group.

[Figure: Visual Classification - After]
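
The review workflow can be sketched as follows. This is an illustration under stated assumptions, not the product's interface: groups are modeled as a plain dictionary from a hypothetical group id to document names, and a decision is either a document-type label or `None` for a group that is not retained. The function names `review_order` and `apply_decisions` are invented for this example.

```python
from typing import Dict, List, Optional, Tuple


def review_order(groups: Dict[str, List[str]]) -> List[Tuple[str, int, float]]:
    """Return (group id, size, cumulative share of all documents), largest group first."""
    total = sum(len(docs) for docs in groups.values())
    if total == 0:
        return []
    ordered = sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
    result = []
    covered = 0
    for group_id, docs in ordered:
        covered += len(docs)
        result.append((group_id, len(docs), covered / total))
    return result


def apply_decisions(
    groups: Dict[str, List[str]], decisions: Dict[str, Optional[str]]
) -> Dict[str, str]:
    """Give every document in a retained group that group's document-type label."""
    labeled = {}
    for group_id, docs in groups.items():
        label = decisions.get(group_id)
        if label is None:
            continue  # group was discarded or has not been reviewed yet
        for doc in docs:
            labeled[doc] = label
    return labeled


groups = {
    "g-fax": ["f1.tif", "f2.tif", "f3.tif"],
    "g-invoice": ["i1.pdf", "i2.docx", "i3.tif", "i4.pdf", "i5.pdf", "i6.pdf"],
    "g-lease": ["l1.pdf"],
}
order = review_order(groups)
decisions = {"g-invoice": "Invoice", "g-fax": None, "g-lease": "Lease"}
retained = apply_decisions(groups, decisions)
print(order)
print(retained)
```

With these ten sample documents, one decision on the largest group covers 60% of the collection, and three decisions cover all of it.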

By eliminating groups that have no business or regulatory value, content collections can be dramatically reduced. Groups that remain can have granular retention policies applied, be kept under appropriate access restrictions, and be assigned business unit owners. The document-type labels also help users find specific documents.

Visual classification is persistent, meaning that as new documents are processed, the decisions made about the initial documents are extended to the new documents. At some point the process reaches convergence: the documents being processed all fall into groups that have already been examined.
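
Persistence and convergence can be pictured with a small sketch. It assumes that an earlier grouping step has already assigned each incoming document a group id. The `DecisionStore` class, its methods, and the sample group ids are hypothetical names for this example, not part of any BR product.

```python
from typing import Dict, List, Optional, Tuple


class DecisionStore:
    """Remembers the reviewer's decision for each group that has been examined."""

    def __init__(self) -> None:
        self.decisions: Dict[str, Optional[str]] = {}

    def record(self, group_id: str, label: Optional[str]) -> None:
        """Store a decision: a document-type label, or None for 'do not retain'."""
        self.decisions[group_id] = label

    def process_batch(
        self, batch: List[Tuple[str, str]]
    ) -> Tuple[Dict[str, Optional[str]], List[str], float]:
        """Extend stored decisions to a batch of (document name, group id) pairs.

        Returns the inherited decisions, the documents whose group still needs
        review, and the share of the batch that fell into known groups.
        """
        inherited: Dict[str, Optional[str]] = {}
        needs_review: List[str] = []
        for doc, group_id in batch:
            if group_id in self.decisions:
                inherited[doc] = self.decisions[group_id]
            else:
                needs_review.append(doc)
        known_share = len(inherited) / len(batch) if batch else 1.0
        return inherited, needs_review, known_share


store = DecisionStore()
store.record("invoice-layout", "Invoice")
store.record("fax-cover", None)

batch1 = [("a.pdf", "invoice-layout"), ("b.tif", "fax-cover"),
          ("c.pdf", "w9-layout"), ("d.pdf", "invoice-layout")]
inherited1, review1, share1 = store.process_batch(batch1)

# The reviewer examines the one new group; later batches inherit that decision too.
store.record("w9-layout", "Tax form")
batch2 = [("e.pdf", "w9-layout"), ("f.pdf", "invoice-layout")]
inherited2, review2, share2 = store.process_batch(batch2)
print(share1, share2)
```

In this sketch, convergence is the point at which the known share reaches 1.0 and the review list stays empty.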

Because the documents have been grouped by visual similarity, users can use zonal attribute extraction technology to identify and extract attributes that are apparent on the face of the documents.
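
As a rough illustration of zonal extraction, the sketch below assumes that the words on a page are available with normalized coordinates (0 to 1, origin at the top left, in reading order) and that a zone template has been defined once for a group of look-alike documents. The `Word` and `Zone` types, the `extract_attributes` function, and the sample invoice values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Word:
    text: str
    x: float  # horizontal center, 0.0 (left) to 1.0 (right)
    y: float  # vertical center, 0.0 (top) to 1.0 (bottom)


@dataclass
class Zone:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        """True when the word's center lies inside the zone rectangle."""
        return self.left <= word.x <= self.right and self.top <= word.y <= self.bottom


def extract_attributes(words: List[Word], zones: Dict[str, Zone]) -> Dict[str, str]:
    """Join the words that fall inside each named zone, keeping their input order."""
    return {
        name: " ".join(word.text for word in words if zone.contains(word))
        for name, zone in zones.items()
    }


# One template serves every document in the group because they share a layout.
invoice_zones = {
    "invoice_number": Zone(left=0.7, top=0.0, right=1.0, bottom=0.1),
    "total": Zone(left=0.7, top=0.9, right=1.0, bottom=1.0),
}
page_words = [
    Word("Acme", 0.10, 0.05),
    Word("Corp", 0.20, 0.05),
    Word("INV-1042", 0.85, 0.05),
    Word("$1,250.00", 0.90, 0.95),
]
attributes = extract_attributes(page_words, invoice_zones)
print(attributes)
```

The design point is that zones are defined per group rather than per document, which is only practical because the grouping step has already put same-layout documents together.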


Beyond legal discovery and review (BeyondReview), typical use cases for visual classification include digitizing paper archives, remediating file shares or content collections, and migrating content to a common platform.

About BeyondRecognition

BeyondRecognition (BR) is a technology company that enables data-driven information governance. 

BR’s visual classification technology supports data management, analysis, and governance tasks by automating the collection, reduction, classification, and governance of large volumes of data. It is unique in that it supports data in any file structure, format, or type.

Contact us today to learn more.