Hi there, I’ve been meaning to get more serious about my data. I have minimal backups, and some stuff is not backed up at all. I’m begging for disaster.
Here’s what I’ve got: 2 8tb drives, almost full, in universal external enclosures. A small form factor PC as a server, with one 8tb drive connected. An unused raspberry pi. No knowledge of how to properly use zfs.
Here’s what I want: I’ve decided I don’t need raid. I don’t want the extra cost of drives or electricity, and I don’t need uptime. I just need backups. I want to use what drives I have, and an additional 16tb drive I’ll buy.
My thought was that I would replace the 8tb drive with a 16tb one, format it with zfs (primarily to avoid bit rot. I’ll need to learn how to check for this), then back it up across the two 8tb drives as a cold backup. Either as two separate drives somehow? Btrfs volume extension? Or a jbod connected to the raspberry pi, that I leave unplugged except for when it’s time to sync the new data?
Or do you have a similarly cheap solution that’s less janky?
I just want to back up my data, with an amount of rot protection, cheaply.
I understand that it might make sense to invest in something a bit more robust right now, and fill it with drives as needed.
But the thing I keep coming to is the cold backup. How can you keep cold backups over several hard drives, without an entire second server to do the work?
Thanks for listening to my rambling.
A pet subject of mine.
Firstly - sit down and consider what you need to backup.
- Tier 1 - unique data. Stuff you created that doesn’t exist elsewhere.
- Tier 2 - Stuff that would take a few days to repeat. Local configs, etc.
- Tier 3 - Stuff you can just download again. (Steam library, media etc)
Don’t backup Tier 3. I’m betting the size of data you need to back up shrinks a lot.
Secondly - automate it. If there’s anything manual, then you’ll eventually stop doing it. Automate, automate, automate - and throw in some manual or automated checks of the backups to verify they’re actually usable.
Thirdly - airgap it if you can, and if there’s much Tier 1 data. Offline disks. This gives you some protection against ransomware. Consider the risks and how to protect yourself. Obviously media failure, accidental deletion and ransomware, but also consider theft and fire. Do you really want your backups in the same location? Do they need encryption?
I wrote quite a long blog on the subject if you’re interested in more.
Thanks for this! This is a really good rundown.
One question on the media though. While things like games and media can be redownloaded, that’s a significant effort. And also, how do I know what I’ve lost once it’s gone? Do I backup a directory of what I had somehow? I have a terrible memory, and will forget things ever existed.
If I can get it later, personally I don’t back it up. I have lost 100tb of media before and since then I’ve redownloaded most of it. For that stuff, if I forget about it then it wasn’t important. When I care about it, I’ll remember and just go get it then.
I agree with this, though if you have something like Sonarr or Radarr, the titles you have (or had) would all be on there, so reacquiring isn’t quite as significant a task.
Unfortunately I’ve recently had to put this into practice because I didn’t understand hardlinks and zfs subpools…
I recently lost a media drive and Radarr was a godsend. I’ve made database backups a priority. It’s also much easier to recover from a dead drive with access to a private BitTorrent tracker that allows free leeching.
After I stopped other programs except for Radarr and qBitTorrent, I let those two run for two days and got most of everything back. There are a few more movies that I need to manually recover and I should properly back those up. Besides that, it worked very well.
You’re welcome.
Yes, you can create a list of files that takes little space. On Linux that’s just “tree” to produce a list of directories and files (I don’t know about Windows, sorry).
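If “tree” isn’t available (on Windows, say), here is a minimal Python sketch of the same idea: walk a directory and write one relative path per line to a text file. The function name write_manifest and any paths you pass in are made up for illustration, not part of any tool mentioned in this thread.

```python
import os
from pathlib import Path


def write_manifest(root: str, manifest_path: str) -> int:
    """Write every file path under `root` (relative to it) to a text file.

    Returns the number of files listed. Hypothetical helper for illustration.
    """
    root_path = Path(root)
    count = 0
    # surrogateescape keeps odd, non-UTF-8 file names from crashing the write.
    with open(manifest_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        for dirpath, dirnames, filenames in os.walk(root_path):
            dirnames.sort()  # visit subdirectories in a stable order
            for name in sorted(filenames):
                rel = Path(dirpath, name).relative_to(root_path)
                out.write(f"{rel.as_posix()}\n")
                count += 1
    return count
```

Something like write_manifest("/mnt/media", "media-manifest.txt") would produce a small text file that can go in the Tier 1 backup, so you at least know what you had.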
But only you can answer what you need to back up. If you judge the effort to re-download this data is more than the effort of backing it up (especially if you’re on a slow link), then backing it up makes more sense. Everyone has their own appetite for risk and their own shape of what they can spend in both time and money in sorting this. The important thing is that you’re thinking about it before you need it, that’s good!
It’s nice to see that people are not just throwing shit to the “cloud”. The “cloud” is just someone else’s disk space and from there, anything can happen.
No advice here but whatever method you choose, script in a notification that all went well with backups so you know if you need to troubleshoot or not.
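A rough sketch of what that could look like in Python: wrap whatever backup command you use, then pass a one-line result to a notifier. Both run_and_report and the notify callback are hypothetical names; how the message actually reaches you (email, a push service, a log file) is left as an assumption.

```python
import subprocess
from typing import Callable, Sequence


def run_and_report(cmd: Sequence[str], notify: Callable[[str], None]) -> bool:
    """Run a backup command and send a one-line success or failure message."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as exc:
        # The command could not even be started (e.g. not installed).
        notify(f"backup FAILED (could not start): {exc}")
        return False
    if result.returncode == 0:
        notify("backup OK: " + " ".join(cmd))
        return True
    # Include the last few lines of stderr so the message says what went wrong.
    tail = result.stderr.strip().splitlines()[-3:]
    notify(f"backup FAILED (exit {result.returncode}): " + " | ".join(tail))
    return False
```

For a first version, notify can simply be print, with the script run from cron or a systemd timer; the point is that a message goes out on success as well as failure, so silence itself becomes a warning sign.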
Thanks! I do have some stuff backed up to Dropbox, but I’m hoping to cancel that before it renews as I don’t use it anymore
Using Linux, I just got an Xtb drive, set it up with btrfs, then with a little bit of scripting that took less than an hour I set up automated daily incremental backups using the copy-on-write method.
I can see backups on a per day basis for years back now
I only buy used drives. The money I save from buying and using them is more than enough to backup everything multiple times. I personally think a new drive but only one back up is not safe.
Other thought, if you want to be seriously safe about your backup you should put it at your parents house or work in case something happens to yours.
But the thing I keep coming to is the cold backup. How can you keep cold backups over several hard drives, without an entire second server to do the work?
You could just plug them into your server or computer and move them there. Also, a server doesn’t need to be expensive. You could just buy a used optiplex for 40$.
Used all the way. I haven’t looked at prices recently but I have gotten 8TB SAS drives for $40 each. Hard to beat that.
That’s about what I pay too :)
- where are you getting used drives from?
- my dad and I both have hypervisors. Any suggestions on how to share backups without a vpn?
I originally got my used drives from an electronic recycling company. They were selling them around 3-5$ per TB.
Then I got a ton from my sysadmin family member.
my dad and I both have hypervisors. Any suggestions on how to share backups without a vpn?
Just use Tailscale on the devices instead of VPN. Secure and no faffing around with port forwarding.
@droolio @Galapagon
Tailscale literally is a VPN. More than just a traditional VPN. @Galapagon@sh.itjust.works was concerned about security and it being always-on. Tailscale is an overlay network that links devices directly, deals with authentication, and punches holes through NAT.
Why no VPN? It is what the tech is for, making a private Network between places.
If you are both behind cgnat then you will need a third place to relay through like a dirt cheap VM or VPS.
It’d have to be always up, essentially merging our networks. I thought it’d be better to keep them separate for security.
A VPN doesn’t have to merge networks, you can still keep them as 2 separate broadcast domains. like your house could be 192.168.1.0/24 while your parents could be 192.168.5.0/24
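To make the “two separate networks” point concrete, here is a tiny check using Python’s standard ipaddress module and the two example subnets above. The which_site helper is just an illustrative name.

```python
import ipaddress

# The two example subnets from the comment above.
home = ipaddress.ip_network("192.168.1.0/24")
parents = ipaddress.ip_network("192.168.5.0/24")


def which_site(addr: str) -> str:
    """Say which of the two LANs an address belongs to."""
    ip = ipaddress.ip_address(addr)
    if ip in home:
        return "home"
    if ip in parents:
        return "parents"
    return "unknown"


# The ranges do not overlap, so traffic has to be routed between them
# rather than the two LANs becoming one broadcast domain.
print(home.overlaps(parents))     # False
print(which_site("192.168.5.20"))  # parents
```

As long as the two ranges don’t overlap, the link between the sites is routed, and each end can still decide which hosts it exposes to the other.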
Thanks for the recommendations, especially used drives. I was looking at water panther drives because of their warranty. Do you have a preferred source?
That’s a fair point about off-site backups. Though at this point, any backups at all would be much better than what I’ve got. Baby steps 🤷♂️
Plug them in and move them there? Sorry I’m confused. Cold backups have to be disconnected and unpowered, right? I could keep them bare, and plug them into a dock sequentially, but what does that look like on the software side?
And yeah, my needs are very light. A used optiplex would be plenty. Right now I’m using an m920q with an 8th Gen i5. While it doesn’t have drive bays, it idles at like 10 watts, full tilt is 25. Plus it’s small and silent, I’m in an apartment.
I’m not opposed to upgrading in the future, I guess I just want to get a handle on some basics with what I have, before I decide what I want.
I mean I’ve got a spare computer sitting around with a 2600x in it, that could be a server pretty easy, maybe I’ll look into that 🤔
At any rate, what do you think about the software side of things? Specifically the basic procedure for updating cold backups?
https://serverpartdeals.com/ is frequently recommended as reliable and with a good warranty. When I first started I got mine from a local electronics recycler with only a 30 day warranty. I was only caring about price. Now I’m just using bigger used ones. My only warranty is them being cheap to free and backing up data multiple places.
I agree it’s a bigger step. I currently only have one at my office and it took me forever to get around to setting it up.
I mean just plug them into your computer or server, then move your data. If it was me I would copy the data with rsync -avz source destination :)
An Optiplex only has one maybe two bays. But the low power and cost are worth it!
I think it really doesn’t matter what hardware you have as long as it won’t randomly break.
Maybe I’ll give my specific example. I have a QNAP 1u server that was given to me so I decided to use it as a cold backup. I turn it on once a week and then ssh into my main servers and rsync important data into their respective locations on the QNAP.
If I was making a new system I would make a fedora server 41 usb, install that to a new system using the gui, restart it and ssh in to set up my folder structure, then rsync all my stuff there and be done :) Would probably take less than 30-60 minutes to set up
You’re limiting yourself somewhat if you’re not able to plug in multiple drives at the same time. Otherwise, I might suggest mergerfs for basic JBOD. You won’t be able to use a single ZFS drive to avoid bit rot - only detect it. SnapRAID - ideal for offline setups - would be the next step up if you could dedicate one of your drives to parity.
In your position, I’d do Duplicacy backups split/spanned over multiple backup drives (however you connect them).
It has a pretty cool Erasure Coding feature that protects individual chunks from bit rot and possibly even bad sectors, plus the whole database-less architecture makes it very robust. De-duplication, high levels of compression, and encryption. Plus you can keep historic snapshots, so you can avoid the risk of accidentally sync’ing ransomware over the top.
Edit: the CLI is free for personal use, and is source-available. Written in Go and extremely performant.
Thanks for this! I’m starting to consider repurposing an old PC with room for drives. The power draw isn’t that much more.
I got an external ORICO USB hard drive dock and two 3.5" hard drive cases (also by Orico). Every month or so, I plug them in and rsync files over. I’m lucky that one specific folder is about the size of one drive, so I just manually split them. There’s probably a better way to do that if you don’t have an obvious split.
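If there is no obvious split, one simple option is a “first-fit decreasing” assignment: sort the top-level folders by size, largest first, and put each one on the first drive that still has room. Below is a minimal sketch under that assumption; split_across_drives, the folder names and the sizes are all made up, and it only plans the split without copying anything.

```python
from typing import Dict, List


def split_across_drives(folder_sizes: Dict[str, int],
                        drive_capacities: List[int]) -> List[List[str]]:
    """Assign folders to drives using first-fit decreasing.

    Sizes and capacities must use the same unit (e.g. GB).
    Raises ValueError if some folder fits on no remaining drive.
    """
    free = list(drive_capacities)                       # space left per drive
    plan: List[List[str]] = [[] for _ in drive_capacities]
    # Place the biggest folders first; they are the hardest to fit.
    by_size = sorted(folder_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        for i, space in enumerate(free):
            if size <= space:
                plan[i].append(name)
                free[i] -= size
                break
        else:
            raise ValueError(f"{name} ({size}) does not fit on any drive")
    return plan


# Hypothetical sizes in TB for two 8 TB drives.
print(split_across_drives({"movies": 6, "tv": 5, "music": 2, "photos": 1}, [8, 8]))
# [['movies', 'music'], ['tv', 'photos']]
```

This is a heuristic, not an optimal packing, and it treats each folder as indivisible, so a single folder larger than any one drive still needs a manual split.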
Then the 3.5" drives in their colorful cases go into a fireproof safe in the basement. I also added another pair for semiannual backups that go to my inlaws.
Does rsync merge files and folders? I think that’s the key I’m missing
Rsync will compare the contents of the source directory with the target directory. If it finds a match, it won’t copy the files, if there’s no match, it copies.
If your goal is to have 1:1 copies, you can use the --delete flag to remove extraneous files in the target directory that aren’t in the source directory.
If you use the -a flag, it’ll maintain all of your permissions.
You can literally rsync a linux installation from one machine to another. While the source machine is running. Pretty nuts.
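Here is a small Python sketch that puts those flags together, for anyone who wants to script the cold-backup sync. The helper names and any paths are hypothetical; the rsync options used (-a, --delete, --dry-run) are standard ones. It defaults to a dry run, because --delete on the wrong target is an easy way to lose data.

```python
import subprocess
from typing import List


def build_rsync_command(source: str, target: str,
                        mirror: bool = False, dry_run: bool = True) -> List[str]:
    """Build an rsync argument list for syncing a directory to a backup drive."""
    cmd = ["rsync", "-a"]          # -a: archive mode, keeps permissions and times
    if mirror:
        cmd.append("--delete")     # remove target files that are gone from source
    if dry_run:
        cmd.append("--dry-run")    # only report what would change
    # A trailing slash on the source means "the contents of this directory",
    # so files land directly in `target` instead of in target/<dirname>/.
    cmd += [source.rstrip("/") + "/", target]
    return cmd


def sync(source: str, target: str, mirror: bool = False, dry_run: bool = True) -> None:
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_rsync_command(source, target, mirror, dry_run), check=True)
```

A typical routine would be: plug in and mount the cold drive, run sync with dry_run=True to see what would change, then run it again with dry_run=False.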
I used to mail a drive to a relative in a different state with all the media that I created. Offsite ftw in case of natural disaster.
What sort of media?
Music that I put together in Logic Audio.
I’m worried about bit rot because I think it happened to me. Maybe it was some other corruption. But I have a video backed up on Dropbox, it’s been there for over a decade, and somewhere along the line it has developed glitches and artifacts. It’s still playable, but I’m annoyed, and worried about other media that I haven’t manually checked.
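One filesystem-independent way to check for this kind of silent change is to record a checksum for every file and compare again later. Below is a minimal Python sketch of that idea using SHA-256; the function names are illustrative. Note that it can only detect a change, not repair it, and it cannot tell a deliberate edit from corruption, so it suits files that should never change, like finished videos.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def hash_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its current checksum."""
    root_path = Path(root)
    return {
        p.relative_to(root_path).as_posix(): hash_file(p)
        for p in sorted(root_path.rglob("*"))
        if p.is_file()
    }


def changed_files(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Files present in both snapshots whose contents no longer match."""
    return sorted(name for name in old if name in new and old[name] != new[name])
```

The snapshot dictionary can be saved with json.dump next to the backup and compared against a fresh snapshot at each sync; anything changed_files reports is a candidate for restoring from the other copy.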