
SaaS DR/BC: If You Think Cloud Data is Forever, Think Again


Key Takeaways

  • SaaS is quickly becoming the default way we build and scale businesses. It’s cheaper and faster than ever before. However, this reliance on SaaS comes with one glaring risk that’s rarely discussed.
  • The “Shared Responsibility Model” doesn’t just govern your relationship with AWS; it applies across all of cloud computing. Even with SaaS, users are on the hook for protecting their own data.
  • Human error, cyber threats, and integrations gone wrong are the main causes of data loss in SaaS. And it’s not uncommon: in one study, about 40% of users said they had lost data in SaaS applications.
  • It’s possible to create your own in-house solution to automate some of the manual work around backing up SaaS data. However, this approach has limitations, and none of these options will restore data to its original state.
  • A data continuity strategy is essential in SaaS; otherwise you may be scrambling to restore the information you rely on each and every day.
     

The Cloud is Not Forever and Neither is Your Data 

When I began my career in technical operations (mostly what we call DevOps today), the world was dramatically different. This was before the dawn of the new millennium, when the world’s biggest and most well-known SaaS company, Salesforce, was still operating out of an apartment in San Francisco.

Back then, on-premises systems ruled the roost. Rows of towers filled countless rooms. These systems were expensive to set up and maintain, from both a labour and a parts perspective. Building a business using only SaaS applications was technically possible back then, but logistically it was a nightmare. On-prem would continue to be the default way of running software for years to come.

But technology always progresses at lightspeed. So just three years after Salesforce began preaching the “end of software”, Amazon Web Services came online and changed the game completely.

Today a new SaaS tool can be built and deployed across the world in mere days. Businesses are now embracing SaaS solutions at a record pace. The average small to medium-sized business can easily have over 100 SaaS applications in its technology stack. Twenty years ago, running a business on this many applications was unthinkable and would have cost millions of dollars in operational resources. Yet at Rewind, where I oversee technical operations, I look after our software needs with a modem and a laptop.

SaaS has created a completely different reality for modern businesses. We can build and grow businesses cheaper and faster than ever before. But like most “too good to be true” things, there’s a catch. All this convenience comes with one inherent risk. It’s a risk that people rarely discussed in my early days in DevOps, and it is still rarely talked about. Yet this risk is important to understand; otherwise, all the vital SaaS data you rely on each and every day could disappear in the blink of an eye.

And it could be gone for good. 

The Shared Responsibility of SaaS

This likely goes without saying, but you rent SaaS applications, you don’t own them. Those giant on-prem server rooms companies housed years ago now rest with the SaaS provider. You simply access their servers (and your data) through their application or API. Now you are probably thinking, “Dave, I know all this. So what?”

Well, this is where the conundrum lies. 

If you look at the terms of service for SaaS companies, they do their best to ensure their applications are up and running at all times. Whether servers are compromised by fire, a meteor strike, or simple human error, SaaS companies strive to ensure that every time a user logs in, the software is available. The bad news is that this is where their responsibility ends.

You, the user, are on the hook for backing up and restoring whatever data you’ve entered and stored in their services. Hence the term “Shared Responsibility Model”. This term is most associated with AWS but this model actually governs all of cloud computing.

The above chart breaks down the various scenarios for protecting elements of the cloud computing relationship. You can see that with the SaaS model, the largest onus is on the software provider. Yet there are still things a user is responsible for: user access and data.

I’ve talked to other folks in DevOps, site reliability, or IT roles in recent years, and I can tell you that the level of skepticism is high. They often don’t believe that their data isn’t being backed up by the SaaS provider in real time. I empathize with them, though, because I was once in their shoes. So when I meet this resistance, I just point people to the terms of service laid out by each SaaS provider. Here is GitHub’s, here is Shopify’s, and here is the one for Office 365. It’s all there in black and white.

The reason the Shared Responsibility Model exists in the first place essentially comes down to the architecture of each application. A SaaS provider builds its software to keep its own platform running efficiently, not to continually snapshot and store the millions or billions of data points created by users. Now, this is not a “one-size-fits-all” scenario. Some SaaS providers may be able to restore lost data. However, when they do, in my experience it’s often an old snapshot, it’s incomplete, and the process of getting everything back can take days, if not weeks.

Again, it’s simply because SaaS providers are lumping all user data together, in a way that makes sense for the provider. Trying to find it again, once it’s deleted or compromised, is like looking for a needle in a haystack, within a field of haystacks.    

How Data Loss Happens in SaaS

The likelihood of losing data from a SaaS tool is the next question that inevitably comes up. One study conducted by Oracle and KPMG found that 49% of SaaS users have previously lost data. Our own research found that 40% of users have previously lost data. There are really three ways this happens, and they are risks you may already be very aware of: human error, cyberthreats, and third-party app integrations.

Humans and technology have always had co-dependent challenges. Let’s face it, it’s one of the main reasons my career exists! So it stands to reason that human interference, whether deliberate or not, is a common reason for losing information. This can be as innocuous as uploading a CSV file that corrupts data sets, accidentally deleting product listings, or overwriting code repositories with a forced push.
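
To make that last example concrete, here is a minimal illustration (the remote and branch names are just placeholders):

    # A forced push replaces the remote branch with whatever is local,
    # discarding any commits that only existed on the remote:
    git push --force origin main

    # A safer habit is --force-with-lease, which refuses to overwrite
    # commits someone else pushed since you last fetched:
    git push --force-with-lease origin main

If history has already been overwritten and nobody has an up-to-date clone, that data is effectively gone, which is exactly the kind of loss a backup guards against.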

There’s also intentional human interference. This means someone with authorized access deliberately nuking a bunch of stuff. It may sound far-fetched, but we have seen terminated employees or third-party contractors cause major issues. It’s not very common, but it happens.

Cyberthreats are next on the list; these are issues most technical operations teams are used to dealing with. Most of my peers are aware that the level of attacks increased during the global pandemic, but the rate of attacks had already been climbing prior to COVID-19. Ransomware, phishing, DDoS, and more are all being used to target and disrupt business operations. If this happens, data can be compromised or completely wiped out.

Finally, third-party app integrations can be a source of frustration when it comes to data loss. Go back and read the terms of service for apps connected to your favourite SaaS tool. They may save a ton of time, but they may also have a lot of control over all the data you create and store in these tools. We’ve seen apps override and permanently delete reams of data. By the time teams catch it, the damage is already done.

There are some other ways data can be lost, but these are the most common. The good news is that you can take steps to mitigate downtime. I’ll outline a common one: writing your own backup script for a Git repository.

One approach to writing a GitHub backup script

There are a lot of ways to approach this. Simply Google “git backup script” and lots of options pop up. All of them have their quirks and limitations. Here is a quick rundown of some of them.

Creating a local backup with cron scripts

Essentially, you are writing a script to clone a repo at various intervals using cron jobs. (Note that the cron tool you use will depend on your OS.) This method essentially takes snapshots over time. To restore a lost repo, you just pick the snapshot you want to bring back. For a complete copy, use git clone --mirror to mirror your repositories. This ensures all remote and local branches, tags, and refs get included.
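
A minimal sketch of this approach might look like the following; the repository URL, paths, and schedule are placeholders you would replace with your own:

    #!/usr/bin/env bash
    # backup-repo.sh - take a dated mirror snapshot of one repository
    set -euo pipefail

    REPO_URL="git@github.com:example-org/example-repo.git"   # placeholder
    BACKUP_DIR="$HOME/git-backups"
    SNAPSHOT="$BACKUP_DIR/example-repo-$(date +%Y%m%d-%H%M%S).git"

    mkdir -p "$BACKUP_DIR"
    # --mirror copies all branches, tags, and other refs
    git clone --mirror "$REPO_URL" "$SNAPSHOT"

    # To restore, push a chosen snapshot back to a new, empty remote:
    #   cd <snapshot dir> && git push --mirror <new remote URL>

Scheduled nightly with a crontab entry such as 0 2 * * * /path/to/backup-repo.sh >> /var/log/git-backup.log 2>&1, this quietly accumulates point-in-time snapshots.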

The pros of using this method are that you don’t rely on any external backup tools and the only cost is your time.

There are a few cons. You won’t actually have a full backup: the clone won’t include hooks, reflogs, configuration, description files, or other metadata. It’s also a lot of manual work, and it becomes more complex if you try to add error monitoring, logging, and notifications. And finally, as the snapshots pile up, you’ll need to account for cleanup and archiving.
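
For the cleanup problem, a simple retention rule is often enough. This one-liner, assuming the snapshot layout from the script above, deletes mirror snapshots older than 30 days (both the path and the retention window are arbitrary choices):

    # Remove snapshot directories older than 30 days
    find "$HOME/git-backups" -maxdepth 1 -type d -name '*.git' -mtime +30 -exec rm -rf {} +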

Using Syncthing

Syncthing is a GUI/CLI application that allows for file syncing across many devices. All the devices need to have Syncthing installed on them and be configured to connect with one another. Keep in mind that syncing and backing up are different, as you are not creating a copy, but rather ensuring a file is identical across multiple devices.  

The pros are that it is free and, since it provides a GUI, one of the more intuitive methods for a DIY “backup”. The cons: Syncthing only works between individual devices, so you can’t directly back up your repository from a code hosting provider. Manual fixes are needed when errors occur. Syncing a git repo can also lead to corruption and conflicts in the repository, especially if people work on different branches. Syncthing also consumes a lot of resources with its continuous scanning, hashing, and encryption. Lastly, it only maintains one version, not multiple snapshots.

Using SCM Backup

SCM Backup creates an offline clone of a GitHub or BitBucket repository. It makes a significant difference if you are trying to back up many repos at once. After the initial configuration, it grabs a list of all the repositories through an API. You can also exclude certain repos if need be. 

SCM lets you specify backup folder location, authentication credentials, email settings, and more. 
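
To get a feel for what such tools are doing under the hood, here is a rough sketch of pulling a repository list from the GitHub REST API with curl; the $GITHUB_TOKEN variable and the jq filter are illustrative, and pagination is ignored for brevity:

    # List clone URLs for the repositories the token can access (first 100 only)
    curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \
         "https://api.github.com/user/repos?per_page=100" | jq -r '.[].clone_url'

Each URL can then be fed into the same git clone --mirror approach described earlier.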

Here’s the drawback, though: the copied repositories do not contain hooks, reflogs, or configuration files, nor metadata such as issues, pull requests, or releases. Configuration settings can also change across different code hosting providers. Finally, in order to run it, you need to have .NET Core installed on your machine.

Now, those are just three ways to back up a git repository. As I mentioned before, type a few words into Google and a litany of options comes up. But before you get the dev team to build a homegrown solution, keep these two things in mind.

First, any DIY solution will still require a significant amount of manual work because it only clones and/or backs up; it can’t restore data. In fact, that’s the case with most SaaS tools, not just in-house backup solutions. So although you may have some snapshots or cloned files, they will likely be in a format that needs to be re-uploaded into the SaaS tool. One way around this is to build a backup-as-a-service program, but that will likely eat up a ton of developer time.

That brings us to the second thing to keep in mind: the constantly changing state of APIs. Let’s say you build a rigorous in-house tool: you’ll need a team constantly checking for API updates and then making the necessary changes so the tool keeps working. I can only speak for myself, but I’m constantly trying to help dev teams avoid repetitive menial tasks. So although creating a DIY backup script can work, you need to decide where you want development teams to spend their time.

Data Continuity Strategies for SaaS

So what’s the way forward in all of this? There are a few things to consider, and these steps won’t be unfamiliar to most technical operations teams. First, figure out whether you want to DIY or outsource your backup needs. We already covered the in-house options and the challenges they present. If you decide to look for a backup and recovery service, just remember to do your homework. There are a lot of choices, so as you go through due diligence, look at reviews, talk to peers, read the technical documentation and, honestly, figure out if company X seems trustworthy. They will have access to your data, after all.

Next, audit all your third-party applications. I won’t sugarcoat it, this can be a lot of work. But remember those “terms of service” agreements? There are always a few surprises to be found, and you may not like what you see. I recommend doing this about once a year and making a pros/cons list. Is the value you get from an app worth the level of access it has? If it’s not, you may want to look for another tool. Fun fact: compliance standards like SOC 2 require a “vendor assessment” for a reason. External vendors and apps are a common culprit when it comes to accidental data loss.

And finally, limit who has access to each and every SaaS application. Most people acknowledge the benefits of a least-privilege approach, but it isn’t always put into practice. So make sure the right people have the right access, ensure all users have unique login credentials (use a password manager to tame the multiple-login hellscape), and enable MFA everywhere you can.

It’s not a laundry list of things nor is it incredibly complex. I truly believe that SaaS is the best way to build and run organizations. But I hope now it’s glaringly obvious to any DevOps, SRE or IT professional that you need to safeguard all the information that you are entrusting to these tools. There is an old saying I learned in those early days of my career, “There are two types of people in this world – those who have lost data and those who are about to lose data”. 

You don’t want to be the person who has to inform your CIO that you are now one of those people. Of course, if that happens, feel free to send them my way. I’m certain I’ll be explaining the Shared Responsibility Model of SaaS until my career is over!  

About the Author

Dave North has been a versatile member of the Ottawa technology sector for more than 25 years. Dave is currently working at Rewind leading three teams (DevOps, trust, IT) as the director of technical operations. Prior to Rewind, Dave was a long-time member of Signiant, holding many roles in the organization including sales engineer, professional services, technical support manager, product owner, and DevOps director. A proven leader and innovator, Dave holds 5 US patents and helped drive Signiant’s move to a cloud SaaS business model with the award-winning Media Shuttle product. Prior to Signiant, Dave held several roles at Nortel, Bay Networks and ISOTRO Network Management working on the NetID product suite. Dave is fanatical about cloud computing, automation, gadgets and Formula 1 racing.
