With the continuing expansion of privacy regulations, the Enterprise File Fabric’s Content Discovery engine can play a huge role in identifying and classifying sensitive data that you may have within your organization’s unstructured data sources.
In a nutshell, our Content Discovery engine analyses the textual content of your unstructured data, such as office documents, videos, images and more, and identifies and tags / classifies documents that contain information that might be sensitive to an organization.
All data can be reviewed and searched in-place via a single pane of glass, no matter where that data is stored, whether in the public cloud on infrastructure such as Amazon S3 or Azure, or on-premises on infrastructure such as Windows Filers, NAS, SAN, on-premises object storage etc.
This is made possible by The File Fabric’s connectors, of which there are currently over 60. Additional connectors are always being developed to further help facilitate company compliance efforts.
The Content Discovery engine comes out-of-the-box with over 50 (and growing) Content Detectors, all powered from our open-source collection of Content Detectors. This includes everything from basic Telephone Number detectors through to, for example, Social Security Numbers detectors.
Although the built-in detectors will prove helpful for a core of sensitive information, often organizations may want to find information that is specific to them, for example, Customer Reference Numbers.
To create a Content Detector specific to your own organization, visit the Content Discovery section from the File Fabric’s Organization menu.
In the Available Content Detectors section, click the button Create New Detector. We’ll give this a name, for example Customer Reference Number. We’re also prompted to input a Tag, which will be the tag/label applied to any matching files. We could input crn, for example.
When this has been created you will see something similar to the following:
To activate our new detector we will need to apply the rules around our Customer Reference Numbers.
Imagining that our CRN pattern is the characters “AC” followed by a series of numbers, suffixed with the character “L”, e.g. “AC234902L”, we would click the button labeled Add new filter, provide the filter with a friendly Title like CRN Pattern, choose Regex as the Type, and input the following as the Value:
Here is an example of that in the user interface.
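Before entering a pattern into the filter, it can help to try it against some sample text. The following is a minimal sketch in Python, assuming the CRN format described above (“AC”, one or more digits, then “L”). The expression `AC[0-9]+L`, the function name `find_crns` and the sample text are illustrative assumptions, not the value used in the product, and the regex flavor accepted by the File Fabric may differ from Python’s.

```python
import re

# Hypothetical pattern for the CRN format described in the text:
# the letters "AC", one or more digits, then the letter "L".
CRN_PATTERN = re.compile(r"AC[0-9]+L")


def find_crns(text: str) -> list[str]:
    """Return every substring of `text` that matches the assumed CRN pattern."""
    return CRN_PATTERN.findall(text)


# Made-up sample text: two valid CRNs and two near misses.
sample = "Invoice for AC234902L and AC17L; ignore ACME and AC12X."
print(find_crns(sample))
```

Note that this simple pattern is not anchored, so it would also match a CRN embedded in a longer token; tightening it (for example with word boundaries) depends on what the target regex engine supports.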
Once you have added the detector, it can be enabled either in an existing Detection Category or in a new category of its own.
Once the detector is enabled in a Detection Category, any files scanned by the Content Discovery engine from that point on will be flagged if they potentially contain our Customer Reference Numbers.
Here is an example of a document containing Customer Reference Numbers being flagged.
Expanding the matches will reveal the sensitive information that has been found.
Furthermore, you can search across the rest of your storage estate for any Customer Reference Numbers, helping you more easily identify your risk and exposure.
For compliance and security purposes, it is vital for companies to know what kind of sensitive information they store, where it is, how it’s used, and who has access to it.
With its content discovery capabilities, the File Fabric identifies sensitive files of all types and is easily extended with detectors specific to your organization.
** Image by Mudassar Iqbal from Pixabay