Guest column: 5 data protection considerations for government agencies
Jarrett Potts, director of strategic marketing for STORServer, is a FedScoop contributor.
As government security and IT authorities make plans for moving into the new year, data protection should be top of mind. It is no secret that data volumes are growing at an alarming rate. According to International Data Corporation’s Digital Universe Study, in 2012, less than one-fifth of the world’s data was protected, despite 35 percent requiring such action. Levels of data protection are significantly lagging behind the expansion in volume.
This raises the question: how do we protect this ever-growing tide of data, particularly in government settings where data protection is a priority? Below are five key items for consideration as governments move forward with data protection plans for 2014.
Planning is key
When considering your IT environment and how to manage your data protection, upfront planning is necessary for successful execution. Plan for the birth, life and death of every bit and block of data.
To begin, the classification of data is the most important part of all planning procedures. Planning for the worst means you have to decide which data is important and which data needs to be recovered first. This process is also known as tiering.
Tiering your data helps align the value of the data with the cost of protecting it. It helps stretch your backup budget and makes data protection and recovery more efficient. The recovery point objective and recovery time objective should vary for each application and its data, and not all data should be treated the same way during backup and recovery procedures. Not all data is created equal.
Tiering is not simple, because it requires many different parties to agree on which data is most important. One easy call is that any data that is historical (not used on a daily or monthly basis) should be the lowest tier. This data needs to come back after a disaster, but not until everything else is up and running. Tier-one data should stay on disk for fast restore.
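To make the idea concrete, here is a minimal sketch of a tier table in Python. The tier names, the hour values and the data set names are all hypothetical, chosen only for illustration; the real numbers have to come from the parties who own the data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    rpo_hours: int      # recovery point objective: tolerable data loss (assumed value)
    rto_hours: int      # recovery time objective: tolerable downtime (assumed value)
    keep_on_disk: bool  # whether backups stay on disk for fast restore


# Hypothetical tiers; tier 1 is the most important.
TIERS = {
    1: Tier("mission-critical", rpo_hours=1, rto_hours=4, keep_on_disk=True),
    2: Tier("important", rpo_hours=24, rto_hours=24, keep_on_disk=False),
    3: Tier("historical", rpo_hours=168, rto_hours=336, keep_on_disk=False),
}

# Hypothetical data sets mapped to the tier each one was assigned.
DATASETS = {"2009-archive": 3, "case-management-db": 1, "email": 2}


def recovery_order(datasets):
    """Return data set names sorted so tier 1 is restored first."""
    return sorted(datasets, key=lambda name: (datasets[name], name))


print(recovery_order(DATASETS))
```

The point of writing it down this way is that the restore order after a disaster falls out of the classification, instead of being argued about during the outage.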
Reduce the front-end data footprint
Primary storage is just that: primary storage. It should be used for data you frequently access. It is not for data that does not get used. That being said, offloading nonessential data from your primary storage is not only a smart idea; it should be mandatory.
If data is not essential or not being used on a daily, weekly or monthly basis, it should not be stored on an expensive and highly available disk subsystem. Both archiving and hierarchical storage management are options that should be considered as they will extend the life of your disk and make recovery faster and easier.
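As a rough sketch of how you might find candidates for archiving, the Python below walks a directory tree and lists files that have not changed in a given number of days. The function name and the 30-day default are assumptions for illustration. It uses modification time as a stand-in for "not being used," because access times are often not recorded reliably; a real archiving or hierarchical storage management product would apply its own policy.

```python
import os
import time


def archive_candidates(root, max_idle_days=30, now=None):
    """Yield files under root not modified in the last max_idle_days days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # 86400 seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # Assumption: modification time approximates last use.
            if os.stat(path).st_mtime < cutoff:
                yield path
```

Running it over a share, for example `list(archive_candidates("/srv/shared"))` with a path of your own, gives a first estimate of how much data could leave primary storage.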
Reduce the back-end data footprint
There are three main ways to reduce your back-end data footprint. Compression is the most prominent way to shrink the size of your data. All data protection products and solutions compress data in one way or another. Compression should get you about 50 percent of your space back. Some vendors quote a higher number, but in all honesty, they are skewing the numbers. Usually, they are testing with Word documents or flat files.
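You can see how much the test data matters with a few lines of Python. This is only an illustration using the standard zlib module, not any vendor's algorithm; the sample text is made up, and random bytes stand in for data that is already compressed or encrypted.

```python
import os
import zlib


def space_saved(data: bytes) -> float:
    """Return the fraction of space saved by zlib compression."""
    compressed = zlib.compress(data)
    return 1 - len(compressed) / len(data)


# Highly repetitive text, similar in spirit to flat files (made-up sample).
text_like = b"Quarterly report: no incidents recorded this period.\n" * 2000
# Random bytes stand in for encrypted or already-compressed data.
random_like = os.urandom(len(text_like))

print(f"repetitive text: {space_saved(text_like):.0%} saved")
print(f"random bytes:    {space_saved(random_like):.0%} saved")
```

The repetitive text shrinks dramatically while the random bytes do not shrink at all, which is why a ratio measured on one kind of data tells you little about your own mix.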
Second is deduplication. The process of comparing all of your files and information and storing each unique piece only once is commonly called deduplication or single-instance storage. This process should net you about 40 percent of your back-end storage.
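A minimal sketch of the idea, assuming data arrives as a list of chunks and using SHA-256 digests to spot duplicates, looks like this. The function names are hypothetical, and real products differ in how they cut data into chunks and where they keep the index.

```python
import hashlib


def deduplicate(chunks):
    """Store each distinct chunk once; return (store, recipe).

    store maps a SHA-256 digest to the chunk's bytes.
    recipe lists digests in the original order so the data can be rebuilt.
    """
    store = {}
    recipe = []
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep only the first copy
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original sequence of chunks from the store."""
    return [store[digest] for digest in recipe]
```

Four chunks of which three are identical end up as two stored chunks plus a short recipe, and `restore` returns the original four.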
The final way to reduce your back-end data footprint is to only back up data that has changed. Incremental forever or progressive incremental backup relieves your network and systems from having to do a full backup on a weekly basis. This can net you 90 percent or more of your back-end storage as it only transfers and stores changed data.
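The logic behind backing up only changed data can be sketched in a few lines. This is an illustration under simplifying assumptions, not how any particular product works: files are represented as an in-memory mapping of path to contents, and a manifest of SHA-256 digests from the previous run decides what has changed. The function and variable names are hypothetical.

```python
import hashlib


def incremental_backup(files, manifest):
    """Return (paths_to_back_up, new_manifest).

    files maps path -> current contents (bytes).
    manifest maps path -> digest recorded by the previous backup.
    """
    new_manifest = {}
    to_back_up = []
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        new_manifest[path] = digest
        # New files have no previous digest; changed files have a different one.
        if manifest.get(path) != digest:
            to_back_up.append(path)
    return sorted(to_back_up), new_manifest
```

The first run, with an empty manifest, backs up everything. Every run after that sends only new or modified files, which is where the savings in network traffic and back-end storage come from.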
Cloud is not the answer
The dirty thing about cloud is that there are only a few cloud providers that are government certified, and this produces a high cost. Having only a few to choose from also means the providers have you right where they want you. They can charge for recovery of data at whatever rate they wish.
The foggy thing about the cloud is transfer rate. It is solely dependent on the bandwidth from your site to the hosted cloud. Usually, there is also a point at which recovering the amount of data you transfer costs more than the cloud service itself. For instance, if you are trying to transfer 1 terabyte of data daily, it may cost you less to have a secondary server for data protection on one of your other sites. Cost is important, and this cloud will cost you much more in the long run.
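The bandwidth side of this is simple arithmetic, sketched below in Python. The function name is hypothetical, 1 terabyte is taken as 10^12 bytes, and the 80 percent link efficiency is an assumed allowance for protocol overhead and competing traffic, not a measured figure.

```python
def transfer_hours(data_terabytes, link_megabits_per_second, efficiency=0.8):
    """Estimate hours needed to move data over a network link.

    Assumes 1 TB = 10**12 bytes and that only `efficiency` of the
    nominal link speed is usable (assumed value, adjust for your site).
    """
    bits = data_terabytes * 1e12 * 8
    usable_bits_per_second = link_megabits_per_second * 1e6 * efficiency
    return bits / usable_bits_per_second / 3600


print(f"{transfer_hours(1, 100):.1f} hours for 1 TB over a 100 Mbps link")
```

Under these assumptions, 1 terabyte over a 100 Mbps link takes roughly 28 hours, so a daily terabyte would never finish within the day. Run the numbers for your own link before you compare cloud pricing against a secondary server.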
Testing is important
Disaster recovery is no exception to the old adage “practice makes perfect.” If you do not test at least once a year, then you have no idea what your timing looks like or if you can even recover your data.
It may seem simple, but just try it. Declare a disaster at your organization and see how fast you can line up the right resources to pull off the recovery. See how fast all parties could get to your recovery site. You do not even have to do the recovery. Perform this simple assessment, and you will start to see the need for thorough testing.
These five facts and techniques will save you from a lot of heartache. Planning for growth while managing your front-end and back-end storage is extremely important, providing you with a future that allows for better decisions and use of budgets. While the cloud may seem like a good idea, it usually is not the best thing for budgets and/or recovery. Lastly, testing of all your recovery scenarios becomes not only prudent but required.