Like the other “things” in life you can hoard, the data produced by the Internet of Things (IoT) piles up quickly, which makes IoT a playground for data hoarders.
Many people are hoarders, whether it is clothes, gadgets, or - in the recent pandemic - toilet rolls. And if you do hoard a bit, you know that you should regularly clear out what you don’t need and are unlikely to ever need.
In IoT, sensors collect data, some of it in real time, and all of that data has to go somewhere. It is rarely sorted into nice little boxes and organized into a pile of “what you need now.” It stays in drawers, so that what you do need is mixed with “what you don’t ever need” (and can go to the charity shop), as well as “what you don’t need now, but know you need for later.”
As IoT is relatively new for many companies, most are still in the “just gather it and fill up the house with it” stage: they haven’t made any decisions about their data, nor have they realized the cost impact of storage. The data is often placed in the cloud and left there in case it is needed later.
What to keep - and what not to
But as costs rise for storing data in the cloud - and the costs only go up every day as data is collected from all the sensors - what are you planning to do with the data?
Where you store the data has a direct impact on how much it is going to cost you. There are various options, and assuming that all data persistence is equal can be an expensive mistake. A little time taken to understand what you need your data for can pay you back in saved budget. Some questions you should be asking yourself include:
- Do you need to keep your data at all? Is it only of use immediately when you create it and then you can ditch it?
- Do you have data that only needs to stay in an operational store for a week and then you can get rid of it?
- Do you have data that you need to analyze in future? Maybe you want to use it to train machine learning models? Could you put it into a data lake, where storage is far cheaper and structured for long-term retention?
- Do you have any regulatory requirements on the amount of time you need to keep data?
- Do you need to retain the data at a granular or atomic level, or can you aggregate or summarize it? For example, could you take the average, minimum or maximum of a value over a period of time, thereby retaining less data and using less storage?
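The last question, whether you can aggregate rather than keep every reading, can be sketched in a few lines. This is a minimal illustration, assuming readings arrive as (timestamp, value) pairs; the function name `summarize_hourly` and the sample temperature values are hypothetical, not taken from any particular IoT platform.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def summarize_hourly(readings):
    """Collapse (timestamp, value) readings into one min/avg/max row per hour."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Truncate each timestamp to the start of its hour to pick the bucket.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {
        hour: {
            "min": min(values),
            "avg": mean(values),
            "max": max(values),
            "count": len(values),
        }
        for hour, values in sorted(buckets.items())
    }


# Hypothetical sensor readings, for illustration only.
readings = [
    (datetime(2024, 1, 1, 9, 0), 20.0),
    (datetime(2024, 1, 1, 9, 20), 22.0),
    (datetime(2024, 1, 1, 9, 40), 24.0),
    (datetime(2024, 1, 1, 10, 5), 19.0),
]
summary = summarize_hourly(readings)
```

Four raw readings become two summary rows here. Whether an hourly summary is enough, or whether you need the atomic values, depends on what you plan to do with the data later.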
Not all data is equal
With hoarding generally, we worry that if we throw something away, we might need it in future. With data, some is useful now, some has no use, and some has historical or analytical value. But what if you don’t need it? Then you are keeping it for the sake of it – while paying for the privilege.
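One way to make “not all data is equal” concrete is to write down a retention rule per type of data and let the rule decide where a record belongs. The sketch below is only an illustration: the `RetentionPolicy` class, the `placement` function, and every number of days are assumptions made up for the example, not recommendations or regulatory figures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetentionPolicy:
    operational_days: int        # days the data stays in the operational store
    archive_days: Optional[int]  # total days to keep in cheaper storage; None = never archive


def placement(age_days, policy):
    """Return where a record of the given age should live under the policy."""
    if age_days <= policy.operational_days:
        return "operational"
    if policy.archive_days is not None and age_days <= policy.archive_days:
        return "data_lake"
    return "delete"


# Hypothetical policies for three made-up data types.
policies = {
    "telemetry": RetentionPolicy(operational_days=7, archive_days=730),
    "heartbeat": RetentionPolicy(operational_days=1, archive_days=None),
    "audit_log": RetentionPolicy(operational_days=30, archive_days=2555),
}

print(placement(3, policies["telemetry"]))    # still needed day to day
print(placement(100, policies["telemetry"]))  # only of historical use
print(placement(2, policies["heartbeat"]))    # no further use
```

Writing the rules down like this forces the decision the questions above ask for: each type of data gets an explicit answer instead of a default of “keep everything.”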
Seven-step plan for data hoarders
If you need it, keep it. If you don’t, get rid of it. Here are some suggested steps to go through for analyzing the value of your data:
Step 1: Work out the different types of data you have.
Step 2: Work out the cost of storing this data and consider the financial, legal and business cost of losing the data.
Step 3: Analyze how you used the different types of data over the last 18 months to two years. What business value was gained from doing so?
Step 4: Consider how it could have been used to provide business value.
Step 5: Look at the processes you have or need in place to create value from your data. If you don’t have the skills, can’t outsource or see no way to use it, is it worth keeping? If you can use it to create value or can see this happening in future, it’s worth storing it.
Step 6: Decide how you want to store it if you are keeping it. Data lake options are far cheaper but you need to make sure it’s converted correctly from the operational store so it can be stored and used properly.
Step 7: Put together your case and discussion points, and decide together with key stakeholders in the business.
If you go through these seven steps, you will have a much better idea of what data you need to keep.
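For the cost questions in Steps 2 and 6, even a rough model shows how quickly “keep everything in the operational store” adds up. The following is a back-of-the-envelope sketch, assuming a constant daily volume, 30-day months, and billing on the volume held at the end of each month. The function `storage_cost` and both prices are hypothetical placeholders, not any provider’s actual rates.

```python
def storage_cost(daily_gb, price_per_gb_month, months, retention_days=None):
    """Estimate total spend over `months`, billing the volume held at each month end."""
    total = 0.0
    for month in range(1, months + 1):
        days_collected = month * 30
        # With a retention window, older data is deleted and stops costing money.
        if retention_days is None:
            days_held = days_collected
        else:
            days_held = min(days_collected, retention_days)
        total += days_held * daily_gb * price_per_gb_month
    return total


# Made-up inputs: 10 GB per day, 0.10 per GB-month operational, 0.02 per GB-month data lake.
hoard_everything = storage_cost(10, 0.10, months=12)
tiered = (
    storage_cost(10, 0.10, months=12, retention_days=30)  # one month kept in the operational store
    + storage_cost(10, 0.02, months=12)                   # full history kept in the data lake
)
print(round(hoard_everything, 2), round(tiered, 2))
```

With these placeholder numbers, the tiered approach costs roughly a third of hoarding everything in the operational store over a year. Your own volumes and prices will differ, so treat the shape of the comparison, not the figures, as the point.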
It is also worth considering whether hoarding data is such a bad thing after all: you never know when the data will prove useful. Humans are not always the best judges of what is useful; data mining algorithms, less constrained by human preconceptions, can sometimes surface value that people would miss. But that can mean keeping a little too much data, especially if it is kept in a data store designed for short-term storage only.
In my next article I will explore data lakes, and how they can be deployed to help you hoard efficiently. After all, just like you might need that gadget in your closet one day, you might also need that data you threw away.