I know that for data storage the best bet is a NAS and RAID1 or something in that vein, but what about all the docker containers you are running, carefully configured services on your rpi, installed *arr services on your PC, etc.?

Do you have a simple way to automate backups and re-installs of these as well or are you just resigned to having to eventually reconfigure them all when the SD card fails, your OS needs a reinstall or the disk dies?

  • emax_gomax@lemmy.world
    7 months ago

    I use docker, so I don’t really have to worry about reproducibility of the services or configurations. Docker will fetch the right services and versions. I’ve documented the core configurations so I can set them back up relatively easily. Anything custom I haven’t documented I’ll just have to remember, or rediscover when I find I need to set it up again.

  • simpleslipeagle@lemmynsfw.com
    8 months ago

    My server has a raid1 mdadm boot drive and an 8 drive raid6 with zfs. It’s been running for 14 years now. The only thing that I haven’t replaced over its lifetime is the chassis. In fact the proc let out the magic smoke a few weeks ago; after some new parts it’s still going strong.

  • RegalPotoo@lemmy.world
    8 months ago

    Infrastructure as code/config as code.

    The configuration of all the actual machines is managed by Puppet, with all its configs in a git repo. All the actual applications are deployed on top of Kubernetes, with all the configurations managed by helmfile and also tracked in git. I don’t set anything up - I describe how I want things configured, and the tools do the actual work.

    There is a “cold start” issue in my scheme: puppet requires a server component that runs on Kubernetes, but I can’t deploy onto Kubernetes until the host machines have had their puppet manifests applied. At that point, though, I can just read the code and do enough of the config by hand to bootstrap everything up from scratch if I have to.
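The “cold start” problem described above is a dependency cycle. A minimal Python sketch of that idea, assuming hypothetical component names (`puppet_server`, `kubernetes`, `hosts_configured`) that stand in for the real setup; it is an illustration, not the commenter's tooling:

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical component names; each key maps to the things it needs first.
DEPENDS_ON = {
    "puppet_server": {"kubernetes"},
    "kubernetes": {"hosts_configured"},
    "hosts_configured": {"puppet_server"},
}


def bootstrap_order(depends_on):
    """Return a valid start-up order, or None if the dependencies form a cycle."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


def with_manual_step(depends_on, component):
    """Copy the graph, treating `component` as configured by hand (no prerequisites)."""
    patched = {name: set(deps) for name, deps in depends_on.items()}
    patched[component] = set()
    return patched


if __name__ == "__main__":
    print(bootstrap_order(DEPENDS_ON))
    print(bootstrap_order(with_manual_step(DEPENDS_ON, "hosts_configured")))
```

Configuring the hosts by hand removes one edge from the cycle, after which an ordinary ordering exists again.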

  • HeartyBeast@kbin.social
    8 months ago

    carefully configured services on your rpi

    I have a backup on a spare SD card waiting for the day the SD card fails. Slot it in and reboot.

    • desentizised@lemm.ee
      8 months ago

      I recently “upgraded” the SD card in one of my Raspberry Pis to an industrial grade one. Those seem to be a lot slower, but for that particular use case it doesn’t matter to me. What matters is that the card doesn’t die. It runs noticeably cooler when lots of data is being written to it, so I feel like I must be onto something there.

  • dr_robot@kbin.social
    8 months ago

    My configuration and deployment is managed entirely via an Ansible playbook repository. In case of absolute disaster, I just have to redeploy the playbook. I do run all my stuff on top of mirrored drives so a single failure isn’t disastrous if I replace the drive quickly enough.

    For when that’s not enough, the data itself is backed up hourly (via ZFS snapshots) to a spare pair of drives and nightly to S3 buckets in the cloud (via restic). Everything automated with systemd timers and some scripts. The configuration for these backups is part of the playbooks of course. I test the backups every 6 months by trying to reproduce all the services in a test VM. This has identified issues with my restoration procedure (mostly due to potential UID mismatches).
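The UID mismatch issue mentioned above can be checked before a restore. A rough Python sketch, assuming you keep a copy of the original machine's passwd file alongside the backup (that convention, and the user names in the example, are assumptions and not something the comment describes):

```python
def parse_passwd(text):
    """Map user name -> numeric UID from passwd-format text (name:pw:uid:gid:...)."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) >= 3 and fields[2].isdigit():
            users[fields[0]] = int(fields[2])
    return users


def uid_mismatches(backup_users, target_users):
    """Return {name: (uid_in_backup, uid_on_target_or_None)} for users that differ."""
    return {
        name: (uid, target_users.get(name))
        for name, uid in backup_users.items()
        if target_users.get(name) != uid
    }


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    backup = parse_passwd("app:x:1001:1001::/srv/app:/usr/sbin/nologin\nmedia:x:1002:1002::/srv/media:/bin/sh")
    target = parse_passwd("app:x:1001:1001::/srv/app:/usr/sbin/nologin\nmedia:x:1005:1005::/srv/media:/bin/sh")
    print(uid_mismatches(backup, target))
```

Any user reported here would own the wrong files after a restore that preserves numeric ownership, so it is worth fixing the UIDs on the target first.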

    And yes, I have once been forced to reinstall from scratch and I managed to do that rather quickly through a combination of playbooks and well tested backups.

  • tetris11@lemmy.ml
    8 months ago

    Radical suggestion:

    • Once a year you buy a hard drive that can handle all of your data.
    • rsync everything to it
    • unplug it, put it back in cold storage

    • atzanteol@sh.itjust.works
      8 months ago

      Once a… year? There’s a lot that can change in a year. Cloud storage can be pretty cheap these days. Back up nightly to something like Backblaze, S3 or Glacier instead.

  • vividspecter@lemm.ee
    8 months ago

    I put all docker data in one directory (or rather, a btrfs subvolume) and both snapshot and back it up daily to multiple machines. docker-compose files are also kept in the same subvolume.
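A daily read-only snapshot of such a subvolume might be scripted roughly as follows. This Python sketch only builds the `btrfs subvolume snapshot -r` command and decides which old snapshots to prune; the paths (`/srv/docker`, `/srv/.snapshots`), the naming scheme and the retention count are assumptions, not the commenter's actual setup:

```python
from datetime import date
from pathlib import PurePosixPath


def snapshot_command(subvolume, snapshot_dir, day):
    """Build the command for a read-only, date-stamped snapshot of `subvolume`."""
    name = f"{PurePosixPath(subvolume).name}-{day.isoformat()}"
    dest = PurePosixPath(snapshot_dir) / name
    return ["btrfs", "subvolume", "snapshot", "-r", str(subvolume), str(dest)]


def snapshots_to_prune(names, keep):
    """Return the oldest snapshot names, leaving the newest `keep` in place.

    ISO dates in the names sort chronologically as plain strings.
    """
    ordered = sorted(names)
    return ordered[:-keep] if keep > 0 else ordered


if __name__ == "__main__":
    # Printing instead of running: pass the list to subprocess.run() as root to execute it.
    print(snapshot_command("/srv/docker", "/srv/.snapshots", date.today()))
```

Keeping the command construction separate from execution makes it easy to dry-run the script before letting a timer call it.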

    My latest server is NixOS, so I don’t even bother backing up the root subvolume, since the actual config is tracked on git and replicated on multiple machines. If I want to reinstall, I can just install NixOS and deploy the config, then just copy over the docker subvolume, and rebuild the containers. Some of this could be automated further (nixos-anywhere and disko look promising for the actual OS install) but my systems don’t typically break often enough for that to be a significant issue.

    You can go even further and either just use nix for the services, or use nix to build containers themselves, but I have a working setup already and it’s good enough, and I can easily switch to another distribution if issues start occurring in NixOS.

  • ikidd@lemmy.world
    8 months ago

    I run everything on a 2 node proxmox cluster with ZFS mirror volumes and replication of the VMs and CTs between them, run PBS with hourly snapshots, and sync that to multiple USB drives I swap off site.

    The docker VM can be ZFS snapshotted before major updates so I can rollback.

    • twei@feddit.de
      8 months ago

      You should get another node, otherwise when node1 fails node2 will reboot itself and then do nothing because it has no quorum
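The quorum arithmetic behind this advice, as a small Python sketch. It assumes every node (and any extra vote-holder) carries exactly one vote and that quorum means a strict majority of the expected votes; check your own cluster's vote settings before relying on it:

```python
def votes_needed(total_votes):
    """Strict majority of the expected votes."""
    return total_votes // 2 + 1


def is_quorate(total_votes, votes_present):
    """True if the surviving members still hold a majority."""
    return votes_present >= votes_needed(total_votes)


if __name__ == "__main__":
    print(is_quorate(total_votes=2, votes_present=1))  # two nodes, one down
    print(is_quorate(total_votes=3, votes_present=2))  # three votes, one down
```

With two votes the majority is two, so losing either node stalls the cluster; a third vote lets the cluster survive one failure.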

        • twei@feddit.de
          8 months ago

          I know, but every time I had to do that it felt like it’s a jank solution. If you have a raspberry pi or smth like that you can also set it up as a qdevice.

          …and if you’re completely fine with how it is you can also just leave it like it is

          • ikidd@lemmy.world
            8 months ago

            So I started to write a reply that said basically that I was OK doing that manually, but thought that “hell, I have a PBS box on the network that would do that fine”. So it took about 3 minutes to install the corosync-qdevice packages on all three and enable it. Good to go.

            Thanks for the kick in the ass.

          • ikidd@lemmy.world
            8 months ago

            So since I now had a “quorate” cluster again, I thought I’d try out HA. I’d always been under the impression that unless you had a shared storage LUN, you couldn’t HA anything. But I thought I’d trigger a replication and then down the 2nd node just as a test. And lo and behold, the first node brought up my OPNsense VM from the replicated image about 2 minutes after the second node lost contact, and internet starts working again.

            I’m really excited about having that feature working now. This was a good night, thank you.

            • twei@feddit.de
              7 months ago

              If you need another thing to do, you could try to make your OPNsense HA and never have your internet stop working while rebooting a node. It’s pretty simple to set up; you might finish it in 1-2 evenings. Happy clustering!

  • CameronDev@programming.dev
    8 months ago

    I rsync my root and everything under it to a NAS, which will hopefully save my data. I wrote some scripts by hand to do that.

    I think the next best thing to do is to document your setup as much as possible. Whether it’s typed-up notes or ansible/packer/whatever, any documentation is better than nothing if you have to rebuild.

  • atzanteol@sh.itjust.works
    8 months ago

    1. Most systems are provisioned in proxmox with terraform.
    2. Configuration and setup is handled via ansible playbooks after the server is available.
       a. Do NOT make changes on the server without updating your ansible scripts - except during troubleshooting.
       b. Once troubleshooting is done, delete and re-create the VM from scratch using only scripts to ensure it works.
    3. VM storage is considered to be ephemeral. All long-term data/config that can’t be re-created with ansible is either stored on an NFS server with a RAID5 drive configuration or backed up to that same file-server using rsnapshot.
    4. NFS server is backed up nightly to backblaze using duplicacy.
    5. Any other non-VM systems like personal laptops and the like are backed up nightly to the file-server using rsnapshot. Those snapshots are then backed up to backblaze using duplicacy.

  • lemmyvore@feddit.nl
    8 months ago

    • Install Debian stable with the ssh server included.
    • Keep a list of the packages that were installed after (there aren’t many but still).
    • All docker containers have their compose files and persistent user data on a RAID1 array.
    • Have a backup running that, once a day, rsyncs /etc, /home/user and /mnt/array1/docker to daily/ on another RAID1; once a week, rsyncs daily/ to weekly/; and once a month, writes a timestamped tarball of weekly/ to monthly/. Once a month I also bring out a HDD from the drawer and do a backup of monthly/ with Borg.

    For recovery:

    • Reinstall Debian + extra packages.
    • Restore the docker compose and persistent files.
    • Run docker compose on containers.

    Note that some data may need additional handling, for example databases should be dumped, not rsynced.
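The daily/weekly/monthly rotation in this comment could be driven by a small scheduler like the Python sketch below. The choice of Sunday for the weekly run, the 1st for the monthly run, and the tarball naming are assumptions; the comment only says “once a week” and “once a month”:

```python
from datetime import date


def tiers_due(day, weekly_weekday=6, monthly_day=1):
    """Return the backup tiers to run on `day`, in the order they should run.

    weekly_weekday uses date.weekday() numbering (Monday=0 ... Sunday=6).
    Both defaults are assumptions, not taken from the original setup.
    """
    tiers = ["daily"]
    if day.weekday() == weekly_weekday:
        tiers.append("weekly")
    if day.day == monthly_day:
        tiers.append("monthly")
    return tiers


def tarball_name(day):
    """Hypothetical naming scheme for the timestamped monthly tarball."""
    return f"backup-{day.isoformat()}.tar.gz"


if __name__ == "__main__":
    today = date.today()
    print(tiers_due(today), tarball_name(today))
```

Running the tiers in this order means each level copies from one that was refreshed moments earlier on the same day.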

  • Decronym@lemmy.decronym.xyz
    7 months ago

    Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

    Fewer Letters   More Letters
    Git             Popular version control system, primarily for code
    HA              Home Assistant automation software
                    ~ High Availability
    LXC             Linux Containers
    NAS             Network-Attached Storage
    Plex            Brand of media server package
    RAID            Redundant Array of Independent Disks for mass storage
    RPi             Raspberry Pi brand of SBC
    SBC             Single-Board Computer
    SSD             Solid State Drive mass storage

    [Thread #287 for this sub, first seen 18th Nov 2023, 10:35]