Data, data everywhere… and then… what?
That is the issue with digital data. As digital data doubles every year, it leads to vast quantities of unstructured data as employees create office documents, PDFs, videos and so on. Searching and classifying this data, however, presents a challenge, particularly as the data within companies is spread across a multitude of on-cloud and on-premises systems and is simply not joined up.
One of the basic precepts of the Storage Made Easy® Enterprise File Fabric™ is creating a unified view of company storage assets that span on-site and on-cloud data sets. The solution facilitates the indexing of file metadata across each of these storage silos.
The File Fabric indexes data to, amongst other things, speed up reads, and to provide on-demand content search and content discovery across the corporate data estate.
Where files are stored should be company approved, but the reality, as we know, is that more often than not employees save files wherever is convenient to them. This leads to cloud sprawl and Shadow IT issues, amongst other things (security, corporate governance and regulatory compliance instantly come to mind, e.g. GDPR, HIPAA etc.).
This therefore becomes a problem to solve. How do you find files? The latest versions of files? Files pertaining to a particular project? Is it in someone’s mailbox, on a Cloud CRM such as Salesforce, or on the corporate fileserver? Without solutions such as the File Fabric that tie together the corporate file estate and enable easy search and discovery, it does not take long before employee efficiency suffers as files cannot be found or, worse, old or out-of-date versions are used.
This is where metadata comes in. Metadata is data that provides information about other data. When the File Fabric first indexes a storage provider it does not try to cache the files; it extracts a very small amount of data, e.g. filename, date, timestamp, size etc. This enables the File Fabric to provide details of the file without actually owning or storing the file, and importantly this is done for every storage system connected to the Fabric.
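To make the idea concrete, here is a minimal sketch of metadata-only indexing. It is not the File Fabric's implementation: the `FileRecord` type, the `index_provider` function and the provider labels are all hypothetical, and two local temporary directories stand in for separate storage systems. The point it illustrates is that only the name, size and timestamp are collected, and file content is never read.

```python
import os
import tempfile
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FileRecord:
    """Metadata only: no file content is read or stored."""
    provider: str       # hypothetical label for the storage system
    path: str           # path relative to the provider root
    name: str
    size: int           # size in bytes
    modified: datetime


def index_provider(provider: str, root: str) -> list[FileRecord]:
    """Walk a directory tree and collect metadata for every file in it."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)  # reads file metadata, not file content
            records.append(FileRecord(
                provider=provider,
                path=os.path.relpath(full, root),
                name=name,
                size=info.st_size,
                modified=datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
            ))
    return records


def build_demo_index() -> list[FileRecord]:
    """Index two temporary directories standing in for two storage systems."""
    index = []
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        with open(os.path.join(a, "contract.pdf"), "wb") as fh:
            fh.write(b"x" * 10)
        with open(os.path.join(b, "contract-v2.pdf"), "wb") as fh:
            fh.write(b"x" * 25)
        index += index_provider("fileserver", a)
        index += index_provider("object-store", b)
    return index
```

Because every record carries its provider label, a single list (or database table) can answer questions such as "where is the newest copy of this file?" across all connected systems.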
Many companies have legacy storage assets that were not built to handle large data volumes. Although the File Fabric is still able to index and work with these, many companies have either started or are undergoing digital transformation programmes to future-proof their data by investing in scalable software-defined storage infrastructures such as object storage.
Moving data from legacy storage to object storage can solve access and performance issues, but companies can quickly end up in the same boat with regard to data management, as there is still a need to find and make sense of the data.
The File Fabric has long worked with Object Storage vendors to ensure that it can index object storage repositories, and today Storage Made Easy can index all of the major object storage solutions, both on-cloud and on-premises.
The File Fabric also supports the concept of classifications and logical tagging. Classifications are entity hierarchies which can be subsequently tagged. The concept is easiest to demonstrate with an example. Consider a legal company in which there are domain-specific entities such as Judge, Witness, and Case. These entities can be created as classifications that can have tags applied against them. For example:
Classification:Witness Tag:Mr F. Bloggs
Classification:Case Tag:Samsung v Apple
In large domain sets, having classifications enables searching to be more precise and quicker. These classifications and tags span storage subsystems and enable users to quickly search against all files across all storage subsystems.
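As a rough illustration of how classification and tag pairs can cut across storage systems, consider the sketch below. The `TagIndex` class, its method names and the file paths are assumptions made up for this example and do not reflect the File Fabric API; only the Witness and Case values come from the example above.

```python
from collections import defaultdict


class TagIndex:
    """Maps (classification, tag) pairs to files on any storage system."""

    def __init__(self):
        # (classification, tag) -> set of (provider, path)
        self._files = defaultdict(set)

    def tag(self, provider, path, classification, tag):
        """Apply a tag under a classification to one file."""
        self._files[(classification, tag)].add((provider, path))

    def search(self, classification, tag):
        """Return all files carrying this classification/tag pair."""
        return sorted(self._files.get((classification, tag), set()))

    def search_all(self, *pairs):
        """Return only the files that carry every given pair."""
        sets = [self._files.get(pair, set()) for pair in pairs]
        return sorted(set.intersection(*sets)) if sets else []


tags = TagIndex()
tags.tag("fileserver", "cases/2012/statement.docx", "Witness", "Mr F. Bloggs")
tags.tag("object-store", "archive/deposition.pdf", "Witness", "Mr F. Bloggs")
tags.tag("object-store", "archive/deposition.pdf", "Case", "Samsung v Apple")

# Files for one witness, regardless of where they are stored
witness_files = tags.search("Witness", "Mr F. Bloggs")

# Files tagged with both the witness and the case
both = tags.search_all(("Witness", "Mr F. Bloggs"), ("Case", "Samsung v Apple"))
```

Here `witness_files` contains one entry from each storage system, while `both` narrows the result to the single deposition that carries both tags.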
Another domain-specific example is the media industry. Consider the storing of video files along with the associated scripts. The video files are indexed for the metadata they contain and the scripts are indexed for the content they contain. Later, when a production team wants to jump to a scene that contains a particular line, they can easily search on the line and instantly be taken to the relevant video scene and associated script.
The File Fabric also has another trick up its sleeve in which it can provide content discovery and content validation. What this means is that it can index the contents of files, providing in essence a type of private Google search facility, but for a company’s corporate data. This search is available not only from the web but also from applications and common user productivity apps such as Microsoft Office.
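Content search of this kind is commonly built on an inverted index, which maps each word to the documents that contain it. The sketch below shows that general technique only; it is an assumption about how such a search could work, not a description of the File Fabric's internals, and the `ContentIndex` class, document names and sample text are hypothetical. It matches documents containing all query words rather than exact phrases.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ContentIndex:
    """A tiny inverted index: word -> set of document identifiers."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index every word of a document's text."""
        for term in tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return documents that contain every word in the query."""
        terms = tokenize(query)
        if not terms:
            return []
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return sorted(result)


content = ContentIndex()
content.add("scripts/episode1.txt", "We need a bigger boat, said the captain.")
content.add("scripts/episode2.txt", "The captain returned to the harbour.")

hits = content.search("bigger boat")
```

In this example `hits` contains only the first script, because it is the only document holding both query words.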
Once the content has been indexed and discovered it can be validated and any enforcement rules can be applied to it. An example of this is the Personally Identifiable Information (PII) File Fabric module that identifies metadata content that has personal information (such as credit card numbers, passport numbers etc.) within it and flags it for review.
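One common way to detect credit card numbers in text is to pair a pattern match with the Luhn checksum, which filters out most digit runs that merely look like card numbers. The sketch below shows that generic approach under stated assumptions: the function names and sample documents are hypothetical, and nothing here is taken from the PII module itself.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings in the text that look like valid card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def flag_for_review(documents: dict[str, str]) -> list[str]:
    """Return the names of documents that contain a likely card number."""
    return sorted(name for name, text in documents.items()
                  if find_card_numbers(text))


docs = {
    "notes.txt": "Card 4111 1111 1111 1111 on file.",           # well-known test number
    "invoice.txt": "Order reference 1234 5678 9012 3456.",      # fails the checksum
}
flagged = flag_for_review(docs)
```

Only `notes.txt` is flagged, since the order reference in `invoice.txt` fails the checksum. A real module would need further patterns (passport numbers, national identifiers and so on) and a human review step, as the text describes.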
In summary, unification of metadata can turn files spread across separate storage systems into a single searchable repository that becomes a tangible business asset with substantial business value and a quantifiable return on investment.
[Post syndicated from Storage Made Easy CEO’s personal blog]