Changed lots of text and wording

Joshua Boniface 2017-02-22 01:03:24 -05:00
parent 9f5fc95070
commit 1e1faa01ea
4 changed files with 16 additions and 19 deletions

@ -8,10 +8,10 @@ class="post first"
RAID is a common technique to provide _resiliency_ and _availability_ to a set of data and protect against one of the most common data loss scenarios: the failure of a disk.
The simplest type of RAID is a 'mirror', which does just what it sounds like: keeps two (or more) copies of data on two (or more) different disks. If one disk fails, the second copy is still available and no data loss has occurred. You would usually see this for system disks in uptime-critical servers.
The simplest type of RAID is a 'mirror', which keeps two or more copies of data on two or more different disks. If one disk fails, the second copy is still available and no data loss has occurred. You would usually see this for system disks in uptime-critical servers.
There also exist more advanced modes, the most common of which is called RAID-5, and consists of 3 or more disks with data striped (written sequentially), along with parity information, across the disks.
There also exist more advanced modes, the most common of which are RAID-5 and RAID-6, which need at least 3 and 4 disks respectively, with data striped (written sequentially), along with parity information, across all disks.
It's worth noting that a pure 'stripe', also called RAID-0, is not really a RAID level - it *increases* the risk of data loss rather than decreasing it,since one disk failure destroys the whole array, and should not ever be used for redundancy.
It's worth noting that a pure 'stripe', also called RAID-0, is not really a RAID level - it *increases* the risk of data loss rather than decreasing it, since one disk failure destroys the whole array. It should never be used for redundancy or any critical data.
The [Wikipedia page for RAID](http://en.wikipedia.org/wiki/RAID) provides some helpful information about the history and benefits of the various RAID implementations.
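
To make the parity idea concrete, here is a minimal sketch of how a single-parity array like RAID-5 can rebuild a lost block. It assumes the parity is a plain XOR of the data blocks in a stripe; the block contents and the `xor_blocks` helper are made up for illustration, and real implementations also rotate the parity across the disks, which is left out here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of one stripe, each on a different (hypothetical) disk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # stored on a fourth disk

# Simulate losing the second disk: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[1]
```

Lose a second block from the same stripe and there is nothing left to XOR against, which is exactly why single parity only survives one disk failure.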

@ -11,11 +11,9 @@ RAID protects you against one and only one thing: a disk failure.
It does _not_ protect you against any of the following things:
* Multiple disk failures beyond the RAID level chosen (e.g. both disks in a mirror, or 3 disks in a RAID-6), including possible [UREs](https://holtstrom.com/michael/blog/post/588/RAID-5-URE-Failures.html).
* Failure of the RAID controller itself (especially when using hardware RAID), the computer running the RAID, or the environment (a flood, fire, theft, etc.).
* Data corruption on-disk from filesystem bugs, cosmic rays, or minor hardware or firmware failures.
* Malicious or accidental deletion or modification of files by yourself or another party, including viruses, bad application writes, or administrative mistakes (e.g. `rm`-ing the wrong file or `mkfs` on an existing filesystem).
* Multiple disk failures beyond the RAID level chosen (e.g. both disks in a mirror, or 3 disks in a RAID-6), including possible [UREs](https://holtstrom.com/michael/blog/post/588/RAID-5-URE-Failures.html) - on that latter subject, RAID-5 should be considered harmful these days for any disks larger than 1TB.
* Failure of the RAID controller itself (especially when using hardware RAID), the computer running the RAID, or the environment containing the servers (a flood, fire, theft, etc.).
* Data corruption on-disk from filesystem bugs, cosmic rays, or minor hardware or firmware failures, which can and do happen all the time - you usually just don't notice and software works around it.
* Malicious or accidental deletion or modification of files by yourself or another party, including viruses, bad application writes, or administrative mistakes (e.g. `rm`-ing the wrong file or `mkfs` on an existing filesystem), which any seasoned sysadmin has done at least once (and hopefully not to production data)!
The adage is simple: "RAID replicates _everything_, instantly, even the stuff you don't want." Like the deletion of a file or corruption.
For these reasons and more, RAID IS NOT A BACKUP!
The adage is simple: "RAID replicates _everything_, instantly, even the stuff you don't want it to."
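
The URE point above is easier to appreciate with some rough arithmetic. The sketch below assumes an unrecoverable read error rate of 1 per 10^14 bits read (a figure commonly quoted on consumer drive spec sheets, not something measured here), that errors are independent, and a hypothetical 4-disk RAID-5 where a rebuild has to read the 3 surviving disks in full. Real drives may well do better or worse than their spec sheet.

```python
import math


def p_ure_during_rebuild(bytes_to_read, ure_rate_per_bit=1e-14):
    """Probability of at least one URE while reading bytes_to_read bytes."""
    bits = bytes_to_read * 8
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-ure_rate_per_bit))


TB = 10**12
for disk_tb in (1, 2, 4):
    # Assumed layout: 4-disk RAID-5, so 3 surviving disks are read end to end.
    p = p_ure_during_rebuild(3 * disk_tb * TB)
    print(f"{disk_tb} TB disks: {p:.0%} chance of a URE during rebuild")
```

Under these assumptions the odds of hitting an error mid-rebuild are already around one in five with 1TB disks, and they climb quickly as the disks get bigger.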

@ -5,7 +5,6 @@ weight = 2
type = "post"
+++
There exist a number of file and storage systems with some advanced, RAID-like features. These include ZFS, btrfs, and Ceph. On the surface, these might give you the illusion of protection, but don't be deceived. You can still trash your whole system (or cluster, for Ceph). You can still `rm` files or other destructive commands. A fire can still destroy your whole rack.
Like RAID, ADVANCED FILESYSTEMS STILL AREN'T BACKUPS!
There exist a number of file and storage systems with some advanced, RAID-like features. These include ZFS, btrfs, and Ceph. On the surface, these might give you the illusion of protection, but don't be deceived. You can still trash your whole system (or cluster, for Ceph). You can still `rm` files or run other destructive commands accidentally. A fire can still destroy your whole rack. A malicious user could overwrite your database. Even the smartest, most advanced storage engine is still susceptible to at least one, and almost always several, fatal failure modes.
Like RAID, ADVANCED FILESYSTEMS STILL AREN'T BACKUPS! Just make another copy of the data, okay!?

@ -5,12 +5,12 @@ weight = 3
type = "post"
+++
* Always back up in _some way_. While a copy of the data on the same array won't protect you against all problems, it will protect you against some.
* Always back up in _some way_. While a copy of the data on the same array won't protect you against all, or even very many, failure modes, it will protect you against some, and those are usually the most common.
* A _backup on the same server_ is susceptible to the _same failures as the original data set_ (hardware failure, natural disasters, and the like).
* A good rule of thumb is _three copies_ (the RAID is only one copy for this purpose): the _original_, one _onsite copy_, and one _offsite copy_. Store the offsite copy in the cloud, or at a friend's house.
* _Make backups regularly_, at least once a week, and automate if possible; the day you need a backup is the day you realize you hadn't run it in 6 months and what you need isn't backed up.
* _Test backups regularly_, at least once a month; _a backup is worthless if you can't restore from it_. Just because you have a backup doesn't mean you're protected; always test them.
* A good rule of thumb is _three copies_: the _original_ (RAID or otherwise); one _onsite copy_ on a different, preferably offline, medium; and one _offsite copy_. Store the offsite copy in the cloud, a data vault, or at a friend's house; just keep it somewhere else.
* _Make backups regularly_, at least once a week, preferably more, and automate it! Forgetting to back something up and then needing just that backup is never fun, and the more frequently you back up, especially incrementally, the better your recovery resolution.
* _Test backups regularly_, at least once a month; _a backup is worthless if you can't restore from it_. Just because you have a backup doesn't mean you're protected; always test them and fix any problems. If you never test your backup, you will almost certainly find it doesn't work, right when you need it.
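
As one way to act on the "test your backups" rule above, here is a minimal sketch that compares a test restore against the original data by checksum. The `tree_digests` and `verify_restore` helpers and the paths in the usage comment are hypothetical; it assumes you have already restored the backup into a scratch directory with whatever tool you use, and it reads each file fully into memory, so it is only meant for modest data sets.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def verify_restore(original, restored):
    """Return relative paths that are missing or different in the restored tree."""
    want, got = tree_digests(original), tree_digests(restored)
    return sorted(name for name in want if got.get(name) != want[name])


# Example usage (hypothetical paths):
#   problems = verify_restore("/srv/data", "/tmp/restore-test")
#   if problems:
#       print("Restore test failed for:", problems)
```

Files that changed since the backup ran will show up as differences too, so this works best against data you know was quiet, or against a snapshot taken at backup time.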
There are dozens of backup utilities out there; I'm not going to proselytize for any one of them, but I personally use [BackupPC](http://backuppc.sourceforge.net/) for my server and workstation backups.
There are dozens of backup utilities out there; I'm not going to proselytize for any one of them, but I personally use [BackupPC](http://backuppc.sourceforge.net/) and good ol' fashioned `rsync` for my server and workstation backups.
Do you need to back up everything? Of course not. That's up to you to decide. Some data is replaceable, some isn't. If it isn't, back it up!
Only you can determine what you need to back up, but if you can't replace some data, you should definitely back it up - Murphy's Law applies here as much as anywhere.