By now we are all familiar with the tremendous amount of data produced every single day by the “smart” devices around us.
Before going any further, let us differentiate between structured and unstructured data, and look at why storage techniques built for structured data are no longer enough on their own.
Structured Data: This type of data starts from a clear-cut model of the information that will be stored in the database, with firm, pre-defined rules for how it will be stored, accessed and processed. For example: which fields the data will be stored in, the data type of each field (numeric, symbolic, alphabetic), how the fields relate to each other, the capacity of each field, and so on.
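To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample values are all made up for illustration; the point is only that the fields, their types, their limits and their relations are declared before any data goes in.

```python
import sqlite3

# In-memory database; every table, column and value here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Fields, data types and per-field limits are fixed up front.
conn.execute("""
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT    NOT NULL CHECK (length(name) <= 50),
        age  INTEGER NOT NULL CHECK (age >= 0)
    )
""")

# The relation between tables is also declared in advance.
conn.execute("""
    CREATE TABLE address (
        id        INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person(id),
        city      TEXT    NOT NULL
    )
""")

conn.execute("INSERT INTO person (id, name, age) VALUES (1, 'Asha', 31)")
conn.execute("INSERT INTO address (person_id, city) VALUES (1, 'Pune')")

# Because the structure is known, a lookup is a simple query.
row = conn.execute("""
    SELECT p.name, p.age, a.city
    FROM person AS p
    JOIN address AS a ON a.person_id = p.id
    WHERE p.name = 'Asha'
""").fetchone()
print(row)
```

Looking up a person's name, age and city is straightforward here precisely because the model was decided before the data arrived.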
Unstructured Data: This type of data is the kind we cannot query with simple logic to retrieve or derive information. For example, your photos: how would you search a database for the photo with a yellow backdrop in which you were smiling with your wife? You can’t, can you? With structured data, on the other hand, you can look up a person’s name, age and other stored information without any trouble.
Okay, so we’ve got the definitions out of the way.
Another point I will touch on is why the techniques used for structured data are no longer enough.
To work with structured data we mostly use a language like SQL (Structured Query Language). If you have any previous experience with SQL, you will know that it does not handle dynamic changes well. In layman’s terms: you have fields with pre-defined rules, you follow those rules, you fill in those fields, and the result is stored in the database.
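A small sketch of what that rigidity looks like in practice, again with Python's sqlite3 module. The person table and the favourite_colour field are hypothetical; other database engines report the error differently, but the idea is the same: a field nobody planned for is rejected until the schema itself is changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with exactly two pre-defined fields.
conn.execute("CREATE TABLE person (name TEXT NOT NULL, age INTEGER NOT NULL)")

insert_sql = (
    "INSERT INTO person (name, age, favourite_colour) "
    "VALUES ('Asha', 31, 'yellow')"
)

# Data with a field the schema does not know about is refused.
try:
    conn.execute(insert_sql)
    rejected = False
except sqlite3.OperationalError as exc:
    rejected = True
    print("rejected:", exc)

# The schema has to be altered first; only then does the same insert work.
conn.execute("ALTER TABLE person ADD COLUMN favourite_colour TEXT")
conn.execute(insert_sql)

rows = conn.execute(
    "SELECT name, age, favourite_colour FROM person"
).fetchall()
print(rows)
```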
But look around now. All the data you just created by reading this post: did you fill in any fields? If you are wondering how you created any data, you added to your browsing history while reading. Even in incognito mode a record of your visit can still exist; it just does not show up in your browser.
What about when you uploaded a picture to Instagram? That’s data too.
So how do we handle this data, which is so different from what it used to be, when we are producing trillions of gigabytes of it?
Firstly, we have to evolve our old practices so that they can properly manage this sharp increase in data production.
Secondly, we have to come up with new ways to store, access and work on this data in order to extract information from it.
What is Object Storage?
Think of storing files in a pool: no folders, directories or hierarchies. This is how objects are stored: in a flat structure. All that is required to store an Object is its Object ID. Since objects are stored in a flat manner, they are retrieved the same way, i.e., through their Object IDs.
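The flat pool can be sketched in a few lines of Python. This is a toy, in-memory illustration and not how any real object storage product is implemented; the FlatObjectStore class and its put and get methods are names invented for this post.

```python
import uuid


class FlatObjectStore:
    """Toy flat store: one pool of objects, addressed only by Object ID."""

    def __init__(self):
        # A single flat mapping; there are no folders or paths anywhere.
        self._objects = {}

    def put(self, data: bytes) -> str:
        # Assumption: a random UUID is good enough as an Object ID here.
        object_id = uuid.uuid4().hex
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        # Retrieval goes through the Object ID and nothing else.
        return self._objects[object_id]


store = FlatObjectStore()
photo_id = store.put(b"raw bytes of a photo")
note_id = store.put(b"raw bytes of a text note")

print(photo_id)
print(store.get(photo_id))
```

Notice that nothing in the store knows or cares where an object "lives"; the ID is the only handle you need.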
The other important term that you need to know is metadata. Metadata is data about data, so in this case the data describing an Object is what we will refer to as its metadata.
An Object’s metadata is arbitrary, which means that we can store any information in it. It is not limited to what our storage system thinks has merit. You can choose what kind of information is attached to an Object’s metadata: for example, the level of protection you want to assign to the Object, or whether you want to replicate the Object to create a backup or send it to other servers.
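Here is a rough sketch of what arbitrary metadata could look like, returning to the photo example from earlier. Everything in it is an assumption made for illustration: the StoredObject class, the put and find helpers, and metadata keys such as backdrop, smiling, replicas and retention_days are not taken from any particular storage system.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    # Free-form key/value pairs; the store does not dictate which keys exist.
    metadata: dict = field(default_factory=dict)


objects = {}  # flat pool: Object ID -> StoredObject


def put(data: bytes, **metadata) -> str:
    """Store an object together with whatever metadata the caller supplies."""
    object_id = uuid.uuid4().hex
    objects[object_id] = StoredObject(data, dict(metadata))
    return object_id


def find(**wanted) -> list:
    """Return the IDs of objects whose metadata matches every given pair."""
    return [
        object_id
        for object_id, obj in objects.items()
        if all(obj.metadata.get(key) == value for key, value in wanted.items())
    ]


# Two objects carrying completely different sets of metadata keys.
photo_a = put(b"photo bytes", backdrop="yellow", smiling=True, replicas=3)
photo_b = put(b"other photo bytes", backdrop="blue", retention_days=30)

print(find(backdrop="yellow", smiling=True))
```

With descriptive metadata attached, the "yellow backdrop, smiling" photo becomes searchable, and keys like replicas can carry handling instructions alongside the data.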
Why Object Storage?
As you have read, a lot of unstructured data is being generated, and it needs to be put to proper use. For that, we need a solution.
Why do we need a new solution, and why can’t the old ways work?
Because of the speed at which this unstructured data is generated, its quantity, and its variety.
File-system storage methods struggle to cope with this combination.
We need to store this rapidly generated data, and storing it is not enough: we also need to extract useful information from it.
This is where the concept of Object Storage comes in: it can work on file streams in real time, and it can handle, manipulate and store data even when an overwhelming amount is fed to it.