An unstructured data migration plan template to consider

This is part of Solutions Review’s Premium Content Series, a collection of reviews written by industry experts in maturing software categories. In this submission, Komprise Chief Customer Success Architect Benjamin Henry offers a template for unstructured data migration planning, along with tools to consider.

Data migrations have never been easy. But now the need to do them intelligently and painlessly is urgent, as enterprises simply have too much unstructured data sitting on their best-in-class storage technologies in legacy environments. Even though storage prices have come down in recent years, data growth over the same period has been exponential. It is imperative to continuously assess what data is stored on your best-performing tiers and whether that data can be migrated to a lower-cost solution or to one that better meets organizational needs, such as provisioning cloud data lakes or complying with ever-changing regulations.

There are many options for storing unstructured data these days, from storage as a service (STaaS) to object storage, cloud network-attached storage (NAS), and deep archives such as Amazon S3 Glacier and Azure Archive Storage. These choices mean that IT teams responsible for unstructured data need a detailed understanding of their data and the ability to pivot at any time to adapt to change. And let’s face it: pivoting workloads of any size and scale, on-premises or to the cloud, can be time-consuming and disruptive without a plan.

Creating a plan that captures what you need to know before migrating will help you avoid errors, delays, and the cost overruns that come with them, while ensuring you meet your overall unstructured data management goals. For most organizations, this means moving to the cloud faster and maintaining an agile, hybrid cloud environment.

The plan should encompass the key questions:

  • What tier? What cloud?
  • What about rules and regulations?
  • What are our common data types and workloads?
  • What topology requirements do we have?
  • Do I really need to test?
  • Free tools or enterprise solution?
  • How do you write communications that people will read?

Unstructured data migration plan: steps to follow during development

Map sources and targets

First, you will want to get the lay of the land by mapping your sources and targets. When you develop your plan, make sure it details the locations of points A and B and that you have a process for identifying and resolving potential complications with your source and target storage.

Rules and regulations

Rules are usually established and governed within your organization, such as retention policy, legal hold, deletion policy, and disaster recovery. Regulations are usually set by a governing body that can impose fines for non-compliance, such as HIPAA, SOX, GDPR, and GxP. It is essential to work in partnership with your HR, security, legal and compliance teams to ensure that everyone is doing their part to meet or exceed applicable rules and regulations. Additionally, consider collaborating with data owners or subject matter experts who can shed light on potential roadblocks and provide feedback while establishing the best unstructured data management strategy.

Data Discovery

When you perform proper data discovery, you understand your workloads and potential speed bumps. Are you migrating one share or thousands? Are they millions of small files, terabytes of large files, or a mixture of everything? Can you identify orphaned data and move it to an archive or quarantine it for deletion? Tools can build a central index that gives you global visibility and supports better decisions. That visibility, by the way, usually gets the full attention of legal and compliance teams and opens up partnership opportunities for funding or for demonstrating cost avoidance.
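The discovery questions above can be answered approximately with a simple scan. The following is a minimal sketch, not a substitute for an indexing product: the function name `summarize_tree`, the 64 KiB "small file" threshold, and the one-year cold cutoff are all assumptions chosen for illustration, and modification time is used as a rough stand-in for last access.

```python
import os
import time

SMALL_FILE_BYTES = 64 * 1024  # assumed threshold for a "small" file


def summarize_tree(root, cold_after_days=365, now=None):
    """Walk `root` and count files, bytes, small files, and cold data.

    A file is treated as cold when its modification time is older than
    `cold_after_days`. This is an illustrative heuristic only.
    """
    now = time.time() if now is None else now
    cutoff = now - cold_after_days * 86400
    summary = {
        "files": 0,
        "bytes": 0,
        "small_files": 0,
        "cold_files": 0,
        "cold_bytes": 0,
        "unreadable": 0,
    }
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                # Permission problems and vanished files are worth counting:
                # they tend to resurface during the real migration.
                summary["unreadable"] += 1
                continue
            summary["files"] += 1
            summary["bytes"] += st.st_size
            if st.st_size < SMALL_FILE_BYTES:
                summary["small_files"] += 1
            if st.st_mtime < cutoff:
                summary["cold_files"] += 1
                summary["cold_bytes"] += st.st_size
    return summary
```

Running this per share gives a first answer to "millions of small files or terabytes of large ones?" and a rough cold-data percentage to take into budget discussions.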

Simplify and standardize

Just because you did things a certain way in the past doesn’t mean it will be the right way tomorrow. Legacy standards that haven’t evolved over time, or that were inherited through mergers, can wreak havoc on migrations, including cloud adoption strategies. You’ll need to consider whether to carry over the old permissions or standardize them on the new target, for example. Another example is choosing how shares are presented: SMB or NFS with one protocol taking precedence, or a mixed-protocol architecture. In the latter case, both protocols can set permissions and those permissions can clash, which usually leads to support issues.

Budget optimization

A huge benefit of data visibility is that it allows you to make tiered decisions about your data rather than taking a one-size-fits-all approach. Instead of moving 2PB of unstructured data directly to another platform, you might want to consider archiving or tiering cold data to cost-effective object storage, which can save you a lot of money year over year. Organizations implementing data visibility strategies can identify 60-80% of their data as cold. Reducing hot storage capacity also directly reduces data protection and replication costs, which can represent a significant percentage of your overall storage budget.
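The tiering argument is easy to check with back-of-the-envelope arithmetic. The sketch below compares keeping everything on hot storage with tiering the cold fraction to cheaper storage. The function name and all prices are hypothetical placeholders, not vendor quotes, and the model deliberately ignores egress, retrieval, and data protection costs.

```python
def annual_storage_cost(total_tb, cold_fraction, hot_price_tb_month, cold_price_tb_month):
    """Compare yearly cost of all-hot storage with a hot/cold tiered layout.

    Prices are per TB per month. All inputs are assumptions supplied by the
    caller; substitute your own contract or list prices.
    """
    cold_tb = total_tb * cold_fraction
    hot_tb = total_tb - cold_tb
    all_hot = total_tb * hot_price_tb_month * 12
    tiered = (hot_tb * hot_price_tb_month + cold_tb * cold_price_tb_month) * 12
    return {"all_hot": all_hot, "tiered": tiered, "savings": all_hot - tiered}


# Example: 2 PB (2000 TB) with 70% cold data and placeholder prices.
estimate = annual_storage_cost(2000, 0.70, hot_price_tb_month=25.0, cold_price_tb_month=4.0)
```

Plugging in the low and high ends of the 60-80% cold range gives a savings band rather than a single number, which is usually more defensible in a budget review.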

Topological analysis

Network and security configurations can have a huge impact on migrations. Are you moving data between sites or regions, cloud to cloud, or even out of the cloud? Define your path and understand your round-trip latency, total versus usable bandwidth, and security requirements. Security technologies, especially antivirus and IDS/IPS, are known to negatively impact migrations when they are not configured to accommodate the increased workload. The purpose of understanding the topology is to identify bottlenecks in advance, before they slow down or even completely stop the migration.
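Two quick calculations make the latency and bandwidth questions concrete. This is a rough sketch under stated assumptions: the function names are invented for illustration, `usable_fraction` is whatever share of the link you are actually allowed to consume, and `efficiency` is a guessed allowance for protocol overhead. The second function applies the standard bound that a single TCP stream cannot exceed its window size divided by the round-trip time.

```python
def estimate_transfer_days(data_tb, link_mbps, usable_fraction=0.5, efficiency=0.8):
    """Estimate days needed to move `data_tb` terabytes over a link.

    Uses decimal units (1 TB = 1e12 bytes, 1 Mbps = 1e6 bits per second).
    `usable_fraction` and `efficiency` are planning assumptions, not measurements.
    """
    bits = data_tb * 1e12 * 8
    effective_bps = link_mbps * 1e6 * usable_fraction * efficiency
    return bits / effective_bps / 86400


def max_single_stream_mbps(window_bytes, rtt_ms):
    """Upper bound on one TCP stream's throughput: window size / round-trip time."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


# Example: 500 TB over a 10 Gbps link where half the bandwidth is usable.
days = estimate_transfer_days(500, 10_000, usable_fraction=0.5)
```

If the single-stream bound is far below the link speed on a high-latency path, that is a sign the migration needs parallel streams or larger windows rather than more bandwidth.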

Test, test, test

Pre-migration testing is equally critical. Some of the most common issues include overutilized nodes or clusters, misconfigurations, and vendor-specific technology limitations around shares that contain a million files or more, short filenames (for example, 8.3 names), long pathnames, and Unicode versus non-Unicode naming (which affects how data is stored because of differences in character standards). Oversubscribed or saturated networks, asymmetric routing, and security systems can also cause problems: frequent packet drops, out-of-order packets, and retransmissions usually trace back to one of these. Starting with the basic tools included in most operating systems is a good idea. Nothing is more basic than ping, traceroute, and nslookup, which test network connectivity, network path, and DNS configuration. iPerf can be used to measure bandwidth, while Wireshark is excellent for showing blocked, dropped, failed, or retransmitted packets.
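Some of the naming issues above can be flagged before any data moves. The following minimal sketch checks a list of paths for two of them: paths longer than a limit and names containing non-ASCII characters. The function name is hypothetical, and the default of 260 characters reflects the traditional Windows MAX_PATH limit; your target platform's real limit is an assumption you should replace.

```python
def flag_risky_paths(paths, max_path_len=260):
    """Return {path: [reasons]} for paths likely to need attention.

    Checks only path length and non-ASCII characters. It does not detect
    8.3 short-name collisions or per-component length limits.
    """
    findings = {}
    for p in paths:
        reasons = []
        if len(p) > max_path_len:
            reasons.append("path_too_long")
        if not p.isascii():
            reasons.append("non_ascii")
        if reasons:
            findings[p] = reasons
    return findings
```

Feeding this the path list from a discovery scan gives a short worklist to test against the target system, rather than discovering failures mid-cutover.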

Free Copy Tools vs Enterprise Migration Software

Robocopy and rsync are common free tools (rsync is open source, while Robocopy ships with Windows) that are designed only to copy data and lack the functionality of enterprise migration software. Look for a solution that can efficiently run, monitor, and manage hundreds of data migrations to hybrid cloud storage; identify the right files to migrate to maximize efficiency and reduce expenses; minimize network usage; retry automatically if the network or storage is unavailable; migrate with or without full file permissions and access controls; and preserve data integrity by performing MD5 checksums on entire files, not just on parts of them.
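To illustrate what a whole-file integrity check involves, here is a minimal sketch using Python's standard `hashlib` module. The function names `file_md5` and `verify_copy` are invented for this example and do not represent how any particular migration product implements verification.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of the entire file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Reading in fixed-size chunks keeps memory use flat for large files.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(src, dst):
    """Return True when source and destination have identical MD5 digests."""
    return file_md5(src) == file_md5(dst)
```

A check like this reads every byte on both sides, so it doubles the read load; that cost is part of what to weigh when comparing free copy tools with software that verifies as it migrates.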

Communication plan

The most effective way to keep people informed of the migration plan and milestones is to send an email update to all stakeholders at a predetermined interval during the migration. Keep the subject and summary brief and put the supporting details near the end. Less is sometimes more, so consider applying color-coded status updates: red, yellow, and green. The subject line can be used strategically, for example by leading with a green status. People want to hear the positives. Avoid the blame game and celebrate wins.

Final Thoughts

Cloud migrations are a team sport. While there are many useful tools and metrics, bringing teams together across IT and lines of business promotes shared responsibility, which is imperative to achieving a successful outcome that meets the objectives of the organization and the end user. The more you plan and test in advance, the less chance there is of later issues that will erode confidence in your cloud data management strategy.

Benjamin Henry
