I woke up this morning to a text from my ISP: “There is an outage in your area, we are working to resolve the issue.”

I laugh. This is what I live for! Almost all of my services are self-hosted, so I’m barely going to notice the difference!

Wrong.

When the internet went out, the power also went out for a few seconds. Four small computers host all of my services. Of those, one shut down and three rebooted. On the three that rebooted uncleanly, some services came back online and some didn’t.

30 minutes later, the ISP sends out a text saying service is back online.

2 hours later, I’m still finding services that are down on my network.

Moral of the story: A UPS has moved to the top of the shopping list! Any suggestions??

  • AnarchistArtificer@slrpnk.net
    4 months ago

    Though I wonder if, even besides adding an uninterruptible power supply (UPS) (writing the acronym out for anyone else who would’ve had to Google it), this might be a useful exercise in recovering from outages in general. This is coming from someone who hasn’t actually done any self-hosting of my own, but you saying you’re still finding down services reminds me of when I learned the benefit of testing system backups as part of making them.

    I was lucky in that I didn’t have any data loss, but restoring from my backup took a lot more manual work than I’d anticipated, and it came at an awkward time. Since then, my restore-from-backup process has been way more streamlined.

  • notannpc@lemmy.world
    4 months ago

    Could also be a good opportunity to add a service monitor like Uptime Kuma. That way you know which services are still down once things come back online, with less manual discovery on your part.

  • oldfart@lemm.ee
    4 months ago

    Two pitfalls I had that you can avoid:

    • Look at efficiency. The loss isn’t always negligible; it was like 40% of my energy usage because I oversized the UPS. The efficiency is calculated from the top power the UPS can supply. A 96% efficient 3 kW UPS eats 4% of 3 kW, 120 watts, even if the load you connected is much smaller than 3 kW.
    • Look at noise level. Mine was almost as loud as a rack server because of all the fans.

    I replaced that noisy, power-hungry beast with a small, quiet 900W APC and I couldn’t be happier.
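
A quick sketch of the arithmetic in this comment, for anyone sizing a UPS. It assumes the commenter's simplified model (a fixed loss equal to the inefficiency times the rated power); real units vary with load, so check the datasheet. The 180 W load is a made-up figure, picked only because it reproduces the "like 40%" share mentioned above.

```python
def ups_overhead_watts(rated_watts, efficiency):
    """Fixed loss under the simplified model: (1 - efficiency) of rated power."""
    return rated_watts * (1.0 - efficiency)


def overhead_share(rated_watts, efficiency, load_watts):
    """Fraction of total wall draw that is UPS overhead rather than useful load."""
    overhead = ups_overhead_watts(rated_watts, efficiency)
    return overhead / (overhead + load_watts)


if __name__ == "__main__":
    # 3 kW unit at 96% efficiency, as in the comment.
    print(f"overhead: {ups_overhead_watts(3000, 0.96):.0f} W")
    # Hypothetical 180 W homelab load (not from the thread).
    print(f"share of total draw: {overhead_share(3000, 0.96, 180):.0%}")
```

Under that model, a smaller unit scales the overhead down with its rating, which is the point of the comment.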

  • CameronDev@programming.dev
    4 months ago

    Did the services fail to come back due to the bad reboot, or would they have failed to come back on a clean reboot too? I ugly reboot my stuff all the time, and unless the hardware fails, I can be pretty sure it’s all going to come back. Getting your stuff to survive a reboot is probably a better use of effort.

    • Padook@feddit.nl (OP)
      4 months ago

      I didn’t mean to imply that services actually broke, only that they didn’t come back after a reboot. A clean reboot may have caused some of the same issues because I’m learning as I go. Some services are restarted by systemctl, some by cron, some… manually. This is certainly a wake-up call that I need to standardize and simplify the way the services are started.
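
One possible way to standardize, sketched in Python: generate the same minimal systemd unit for every service, so everything starts at boot and restarts on failure in the same way. This is only an illustration, not the OP's setup; the service description and the `/usr/local/bin/myapp` path are hypothetical.

```python
UNIT_TEMPLATE = """\
[Unit]
Description={description}
Wants=network-online.target
After=network-online.target

[Service]
ExecStart={exec_start}
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
"""


def render_unit(description, exec_start):
    """Return the text of a minimal systemd service unit."""
    return UNIT_TEMPLATE.format(description=description, exec_start=exec_start)


if __name__ == "__main__":
    # Hypothetical service; replace with a real command and absolute path.
    print(render_unit("My self-hosted app", "/usr/local/bin/myapp --serve"))
```

Saved as something like `/etc/systemd/system/myapp.service`, the unit would be picked up with `systemctl daemon-reload` and set to start at boot with `systemctl enable --now myapp.service`.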

      • CameronDev@programming.dev
        4 months ago

        We’ve all committed that sin before. It’s better to rely on it surviving the reboot than to try to prevent the reboot.

        Also worth looking into some form of uptime monitoring software. When something goes down, you want to know about it ASAP.

        And documenting your setup never hurts :D

        • Nimmo@lem.nimmog.uk
          4 months ago

          On the uptime monitoring, I’ve been quite happy with Uptime Kuma, but… if you put it on the same host that’s down… well, that’s not going to work :p (I nearly made that mistake)

          • CameronDev@programming.dev
            4 months ago

            Same, Uptime Kuma is fantastic. I put it on my most critical server; if Kuma is down, everything is down :D

          • elvith@feddit.de
            4 months ago

            It’s not the most detailed thing, but I just use a free account on cron-job.org to send a HEAD request every two minutes to a few services that are reachable from the internet (either just their homepage or some ping endpoint in the API), and then use the status page functionality to have a simple second status page on a third-party server.

            You can do a bit more on their paid tier, but so far I haven’t needed that.

            On the other hand, you could check whether a free-tier or cheap small VPS on one of the many cloud providers is sufficient for an Uptime Kuma installation. Just don’t use the same cloud provider that all your other services run on.
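
For anyone who would rather run the check themselves from a box outside the network, here is a rough sketch of the HEAD-request idea using only the Python standard library. It is not what cron-job.org runs; the function names are made up for illustration, and a 4xx/5xx answer is treated as "down" by assumption.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=5.0):
    """Send one HEAD request; return the HTTP status code, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, and similar


def find_down(urls, probe=head_status):
    """Return the URLs whose probe failed or answered with a 4xx/5xx status."""
    down = []
    for url in urls:
        status = probe(url)
        if status is None or status >= 400:
            down.append(url)
    return down
```

Calling `find_down([...])` with your own service URLs from a scheduled job every couple of minutes gives roughly the same signal; the `probe` parameter exists only so the logic can be tested without a network.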

            • Nimmo@lem.nimmog.uk
              4 months ago

              Oh, I’m fine with my setup. I have a couple of external servers that can monitor all my web-accessible stuff with Kuma, and then I’ve got another local one to monitor my non-web-accessible stuff.

              Thanks for those tips though, definitely useful to consider other options.

      • iknowitwheniseeit@lemmynsfw.com
        4 months ago

        I reboot every box monthly to flush out such issues. It’s not perfect, since it won’t catch things like circular dependencies or clusters failing to start if every member is down, but it gets lots of stuff.

    • fuckwit_mcbumcrumble@lemmy.world
      4 months ago

      Yeah, an unclean reboot shouldn’t break anything as long as the machine wasn’t doing anything when it went down. I’ve never had any issues when I’ve had to crash a computer, unless it was stuck doing an update.

  • Deebster@programming.dev
    4 months ago

    A general tip on buying UPSes: look for second-hand ones. People often don’t realise you can just replace the battery in them (or can’t be bothered), so you can get fancier/larger ones very cheap.

    • elucubra@sopuli.xyz
      4 months ago

      Also, a larger capacity one is better, and it’s likely you’ll find a secondhand one with more capacity/features for a similar price.

      • ElderWendigo@sh.itjust.works
        4 months ago

        Why? If the power has gone out there are very few situations (I can’t actually think of any except brownouts or other transient power loss) where it would be useful to power my server for much longer than it takes to shut down safely.

        • Deebster@programming.dev
          4 months ago

          Longer means you’re more likely to be able to ride out a power cut, and gives you more options if you want/need to complete something more involved than saving and shutting down.

  • /bin/bash/@lemmy.world
    4 months ago

    When you say some services on your network, are you talking about machines or software?

    For machines, yes, a UPS makes sense. For software, writing some scripts to run on startup should be enough. Another alternative is setting up Wake-on-LAN; that way you can bring everything up again wherever you are.
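
To make the Wake-on-LAN suggestion concrete, here is a minimal sketch of sending a magic packet (6 bytes of 0xFF followed by the target MAC repeated 16 times) over UDP broadcast. The function names and the example MAC are illustrative. Two caveats: Wake-on-LAN has to be enabled in the machine's firmware and network card, and a broadcast only reaches the local network, so waking things "wherever you are" assumes a VPN or an always-on box inside the LAN to send it from.

```python
import socket


def build_magic_packet(mac):
    """Build a Wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits in MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_wake_on_lan(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet on the local network (UDP port 9 by convention)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Usage would be a single call such as `send_wake_on_lan("00:11:22:33:44:55")` with the real MAC of the machine to wake.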

  • recapitated@lemmy.world
    4 months ago

    I’m a big fan of running home stuff on old laptops for this reason. Most UPSes give you a few minutes to shut down; laptops (depending on what you run) could give you plenty of extra run time and plenty of margin for a shutdown contingency.

    • Drewelite@lemmynsfw.com
      4 months ago

      Small, good value, quiet, power efficient, built-in battery backup and server terminal. Laptops are dope for home labs!

  • Shimitar@feddit.it
    4 months ago

    I use a laptop and an external JBOD, covered by a low-power UPS. As others said, the point is to bridge power gaps, not to keep working long-term without power. I live in the countryside, so short power gaps happen, especially when my photovoltaic panels aren’t producing (no, I have no battery storage, too expensive).

    • BreakDecks@lemmy.ml
      4 months ago

      My favorite part about using an old laptop as a 24/7/365 plugged-in server is the anticipation of when the lithium battery will explode from overcharging.

      • skilltheamps@feddit.de
        4 months ago

        “Overcharging” doesn’t exist. There are two circuits preventing the battery from being charged beyond 100%: the usual battery controller, and normally another protection circuit in the battery cell. Sitting at 100% and being warm all the time is enough for a significant hit on the cell’s longevity though. An easy measure that is possible on many laptops (like ThinkPads) is to set a threshold at which charging stops. Ideal for longevity is around 60%. Also ensure good cooling.

        Sorry for being pedantic, but as an electrical engineer it annoys me that there’s more wrong information about Li-Po/Li-ion batteries, chargers and even USB wall warts and USB Power Delivery than there’s correct information.
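
On Linux, laptops whose driver supports a charge limit expose it through the sysfs file `charge_control_end_threshold`. The sketch below writes the 60% value suggested above. It assumes the battery is named `BAT0`, that the script runs as root, and that the firmware supports the setting at all; many laptops do not, and on some the value does not persist across reboots.

```python
from pathlib import Path

SYSFS_POWER_SUPPLY = Path("/sys/class/power_supply")


def threshold_file(battery="BAT0", root=SYSFS_POWER_SUPPLY):
    """Path of the charge-stop threshold for the given battery (name is an assumption)."""
    return Path(root) / battery / "charge_control_end_threshold"


def set_charge_limit(percent, battery="BAT0", root=SYSFS_POWER_SUPPLY):
    """Ask the kernel to stop charging at `percent`; needs root and driver support."""
    if not 1 <= percent <= 100:
        raise ValueError(f"percent must be between 1 and 100, got {percent}")
    target = threshold_file(battery, root)
    if not target.exists():
        raise FileNotFoundError(f"{target} not found; this laptop may not support it")
    target.write_text(f"{percent}\n")
```

Calling `set_charge_limit(60)` once at boot (for example from a small systemd unit) would cover machines where the setting resets.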

        • Aganim@lemmy.world
          4 months ago

          Isn’t dendrite formation and the shorts they can cause a much bigger concern when dealing with old batteries that are being charged 24/7? Asking a genuine question here, so please don’t shoot me if I’m wrong. 🙂 I’d love to hear more about the most common failure modes and causes for Li-Po/Li-ion batteries.

          • skilltheamps@feddit.de
            4 months ago

            Those are symptoms of sitting at that operating point permanently, and they are of course a concern. What I’m getting at is that people think energy gets put into the battery, i.e. it gets charged, as long as a “charger” is connected to the device (hence terms like “overcharged”). But that is not true, because what is commonly referred to as a “charger” is no charger. It is just a power supply and has literally zero say in if, how and when the battery gets charged. The battery only gets charged if the charge controller in the device decides to do that, and if the protection circuit allows it. And that is designed to happen only if the battery is not full. When it is full, nothing more happens; no current flows in or out of the battery anymore. There’s no damage due to being charged all the time, because no device keeps on pumping energy into the cell if it is full.

            There is however damage from sitting (!) at 100% charge with medium to high heat. That happens independently of whether a power supply is connected to the device or not. You can just as well damage your cells by charging them to 100% and storing them in a warm place while topping them off once in a while. This is why you want to have them at lower room temperature and at ~60%, no matter if a device/“charger” is connected or not.

            (Of course keeping a battery at 60% all the time defeats the purpose of the battery. So just try to keep it cool, charged to >20% and <80% most of the time, and you’re fine.)

  • ChojinDSL@discuss.tchncs.de
    4 months ago

    A UPS with USB allows you to configure a script to properly shut down your server when a power outage happens and the UPS battery is about to run out.
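
A rough sketch of what such a script could look like, assuming the UPS is managed by Network UPS Tools (NUT) and is registered under the hypothetical name `myups`. The thread does not name a tool, so NUT is an assumption here. NUT reports `ups.status` as space-separated flags, where `OB` means on battery and `LB` means low battery. In practice NUT's own `upsmon` daemon already handles this shutdown for you; the code only illustrates the logic.

```python
import subprocess


def should_shut_down(ups_status):
    """True when the UPS reports both 'on battery' (OB) and 'low battery' (LB)."""
    flags = set(ups_status.split())
    return "OB" in flags and "LB" in flags


def read_ups_status(ups="myups@localhost"):
    """Query NUT for the status flags; 'myups' is a hypothetical UPS name."""
    result = subprocess.run(
        ["upsc", ups, "ups.status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def main():
    # Intended to run periodically (cron or a systemd timer) with root privileges.
    if should_shut_down(read_ups_status()):
        subprocess.run(["systemctl", "poweroff"], check=True)
```

Keeping the decision in `should_shut_down` separate from the commands makes it easy to check the logic without a UPS attached.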

    • ripcord@lemmy.world
      4 months ago

      I’d like that, but also a really long-running UPS. Multi-hour power outages are surprisingly common in my area.

      • towerful@programming.dev
        4 months ago

        That’s no longer a UPS.
        You could get something like a Powerwall, something designed to power things from batteries for a long time.
        Or get a generator with an automatic failover. The UPS then covers the downtime between the power failure and the generator taking the load.

          • towerful@programming.dev
            4 months ago

            Generally, UPS (lead-acid) batteries are not designed for long-cycle deep discharge.
            They are designed to hold their rated load for a minute or so until the power is restored (generators start, power-uncuts) or the servers have a chance to shut down.
            But maybe that’s dated information, and modern UPSes are designed to run from batteries for a few hours.

    • bitwolf@lemmy.one
      4 months ago

      Does this require a lot of gear? Or does it simply act as another gateway?

      • BoofStroke@sh.itjust.works
        4 months ago

        There are devices like the Netgear LM1200 that can do it inline by themselves.

        I have that device, but configured as a second gateway. My firewall manages the failover based on primary packet loss and latency.

      • themoonisacheese@sh.itjust.works
        4 months ago

        It requires an LTE-capable gateway and a data plan. As for the rest, you can simply write your routing tables so that if the main gateway doesn’t work, traffic uses the secondary gateway, which has a lower priority.
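
The failover decision described in these two replies (prefer the main gateway, fall back to the lower-priority one when packet loss or latency gets bad) can be sketched as a small selection function. Everything here is illustrative: the gateway names, the thresholds and the field names are assumptions, and a real firewall or router does the probing and the route switching itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gateway:
    name: str
    priority: int       # lower number = preferred
    loss_pct: float     # measured packet loss towards a probe target
    latency_ms: float   # measured round-trip time


def is_healthy(gw, max_loss_pct=20.0, max_latency_ms=500.0):
    """Thresholds are arbitrary example values, not recommendations."""
    return gw.loss_pct <= max_loss_pct and gw.latency_ms <= max_latency_ms


def pick_gateway(gateways):
    """Pick the most preferred healthy gateway; if none is healthy, keep the preferred one."""
    healthy = [gw for gw in gateways if is_healthy(gw)]
    candidates = healthy or list(gateways)
    return min(candidates, key=lambda gw: gw.priority)
```

With the primary link showing 100% loss, `pick_gateway` returns the LTE gateway; once the primary recovers, it wins again because of its lower priority number.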

  • Decronym@lemmy.decronym.xyz (bot)
    4 months ago

    Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

    Fewer Letters   More Letters
    CSAM            Child Sexual Abuse Material
    DNS             Domain Name Service/System
    NAS             Network-Attached Storage
    PiHole          Network-wide ad-blocker (DNS sinkhole)
    Plex            Brand of media server package
    RAID            Redundant Array of Independent Disks for mass storage
    SATA            Serial AT Attachment interface for mass storage
    VPS             Virtual Private Server (opposed to shared hosting)
    ZFS             Solaris/Linux filesystem focusing on data integrity

    8 acronyms in this thread; the most compressed thread commented on today has 10 acronyms.

    [Thread #567 for this sub, first seen 3rd Mar 2024, 16:05]