• 0 Posts
  • 54 Comments
Joined 1 year ago
Cake day: July 30th, 2023


  • Others have some good information here - all I’d like to add to the root is that Windows and Mac have a built-in DNS cache, and it’s pretty straightforward to add one to systemd distros (if it’s not already installed or in use) using systemd-resolved, or dnsmasq if you really dislike systemd. Some distros enable this at install time.

    Systems that utilize a DNS cache will keep copies of DNS query results for a period of time, making the application-level name lookup speed essentially 0ms for a cached result. Cold results obviously incur the latency of the DNS server itself.
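    To make the cached-vs-cold distinction concrete, here is a minimal sketch of a TTL-based cache. It is a toy model, not how systemd-resolved or dnsmasq are implemented; the class name, the `resolver` callback and the `upstream_queries` counter are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """Toy in-memory cache: each answer is kept until its TTL runs out."""

    def __init__(
        self,
        resolver: Callable[[str], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # Hypothetical upstream lookup: hostname -> (address, ttl_seconds).
        self._resolver = resolver
        self._clock = clock
        # hostname -> (address, expiry time on the injected clock)
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.upstream_queries = 0

    def lookup(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            # Warm hit: answered from memory, no network round trip.
            return cached[0]
        # Cold or expired: pay the latency of the upstream DNS server.
        address, ttl = self._resolver(name)
        self.upstream_queries += 1
        self._entries[name] = (address, now + ttl)
        return address
```

    Only the first lookup inside each TTL window reaches the upstream server; everything else is a dictionary read, which is where the "essentially 0ms" comes from.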




  • TLDR: a lot of people probably keep using the thing they know, as long as it works well enough not to be a bother.

    Many many years ago when I learned, I think the only ones I found were Apache and IIS. I had a Mac at the time which came preinstalled with Apache2, so I learned Apache2 and got okay at it. While by release dates Nginx and HAProxy most definitely existed, I don’t think I came across either in my research. I don’t have any notes from the time because I was in high school and didn’t take any.

    When I started Linux things, I kept using Apache for a while because I knew it. Found Nginx, learned it in a snap because the config is more natural language and hierarchical than Apache’s XMLish monstrosity. Then for the next decade I kept using Nginx whenever I needed a webserver fast because I knew it would work with minimal tinkering.

    Now, as of a few years ago, I knew that haproxy, caddy, and traefik all existed. I even tried out Caddy on my homelab reverse proxy server (which has about a dozen applications routed through it), and the first few sites were easy - just let the auto-LetsEncrypt do its job - but once I got to the sites that needed manual TLS (I have both an internal CA and utilize Cloudflare’s origin HTTPS cert), and other special config, Caddy started becoming as cumbersome as my Nginx conf.d directory. At the time, I also didn’t have a way to get software updates easily on my then-CentOS 7 server, so Caddy was okay-enough, but it was back to Nginx with me because it was comparatively easier to manage.

    HAProxy is something I’ve added to my repertoire more recently. It took me quite a while and lots of trial and error to figure out the config syntax, which is quite different from anything I’d used before (except maybe kinda like Squid, which I had learned not a year prior…), but once it clicked, it clicked. Now I have an internal high availability (+keepalived) load balancer that can handle so many backend servers and do wildcard TLS termination and validate backend TLS certs. I even got LDAP and LDAPS load balancing to AD working on that for services like Gitea that don’t behave well when there’s more than one LDAPS backend server.

    So, at some point I’ll get around to converting that everything reverse proxy to HAProxy. But I’ll probably need to deploy another VM or two because the existing one also has a static web server and I’ve been meaning to break up that server’s roles anyways (long ago, it was my everything server before I used VMs).








  • On/off:
    I have 5 main chassis excluding desktops. Prod cluster is all flash, standalone host has one flash array, one spinning rust array, NAS is all spinning rust. I have a big enough server disk array that spinning it up is actually a power sink and the Dell firmware takes a looong time to get all the drives up on reboot.

    TLDR: Not off as a matter of day/night, off as a matter of summer/winter for heat.

    Winter: all on

    Summer:

    • prod cluster on (3x vSAN - it gets really angry if it doesn’t have cluster consistency)
    • NAS on
    • standalone server off, except to test ESXi patches and when vCenter reboots cause it to be WoL’d (vpxd sends a wake to all standby hosts on program init)
    • main desktop on
    • alt desktops off

    VMs are a different story. Normally I just turn them on and off as needed regardless of season, though I will typically turn off more of my “optional” VMs to reduce summer workload in addition to powering off the one server. Rough goal is to reduce thermal load so as not to kill my AC as quickly, which is probably running above its duty cycle to keep up. Physically, these servers are virtualized, so turning VMs on and off doesn’t cycle the disk array.

    Because all four of my main servers are the same hypervisor (for now, VMware ESXi), VMs can move among the prod cluster to balance load autonomously, and I can move VMs on or off the standalone host by drag-and-drop. When the standalone host is off, I usually turn its VMs off and move them onto the prod cluster so I don’t get daily “backup failure” emails from the NAS.

    UPS: Power in my area is pretty stable, but has a few phase hiccups in the summer. (I know it’s a phase hiccup because I mapped out which wall plugs are on which phase, confirmed with a multimeter that I’m on two legs of a 3-phase grid hand-off, and watched which devices blip off during an event.) For something like a light that will just flicker or a laptop/phone charger that has a high capacitance, such blips are a non-issue. Smaller ones can even be eaten by the massive power supplies my Dell servers have. But my Cisco switches are a bit sensitive to it and tend to sing me the song of their people when the power flickers - aka fan speed 100% boot up whining. Larger blips will also boop the Dell servers, but I don’t usually see breaks more than 3-5m.

    Current UPS setup is:

    • rack split into A/B power feeds, with servers plugged into both and every other one flipped A or B as its primary
    • single plug devices (like NAS) plugged into just one
    • “common purpose” devices on the same power feed (ex: my primary firewall, primary switches, and my NAS for backups are on feed A, but my backup disks and my secondary switches are on feed B)
    • one 1500VA UPS per feed (two total) - aggregate usage is 600-800w
    • one 1500VA desktop UPS handling my main tower, one monitor, and my PS5 (which gets unreasonably upset about losing power, so it gets the battery backup)

    With all that set up, the gauges on the front of the 3 UPSes all show roughly 15-20m run time in summer, and 20-25m in winter. I know one may be lower than displayed because its battery is older, but even if it fails and dumps its redundant load onto the main newer UPS, I’ll still have 7-10m of battery at worst case, and that’s all I really need to weather most power related issues at my location.



  • Apologies for being late, I wanted to be as correct as I could be.

    So, straight to the point: Nextcloud by default uses plain files if you don’t configure the primary storage to be an S3/object store. As far as I can tell, this is not automatic and is an intentional change at system creation by the original admin. There is a third-party migration script, but there does not appear to be a first-party method of converting between the two. That’s very good news for you! (I think/hope)

    My instance was set up as a standalone, so I cannot speak for the all-in-one image. Poking around the root data directory (datadirectory in the config.php), I was able to locate my user account by internal username - which if you do not use LDAP will be the shortened login name. On default LDAP configs, this internal username may be a GUID, but that can be changed during the LDAP enablement process by overriding the Internal Username field in the Expert LDAP settings.

    Once in the user’s home folder in the root data directory, my subdirectory options are cache, files, files_trashbin, files_versions, uploads.

    • files contains the “live” structure of how I perceive my Nextcloud home folder in the Web UI and the Nextcloud Desktop sync engine
    • files_trashbin is an unstructured data folder containing every file that was deleted by this user and kept per the trash folder’s retention policy (this can be configured at the site level). Files retain their original name, but have a suffix added which takes the form .d######... where the numbers appear to be a Unix timestamp, likely the deletion date. A quick scan of these with the file command in Linux showed that each one had an expected file header based on its extension (i.e., a .png showed as a PNG image with an expected resolution). In the Web UI, there is metadata about which folder the file originally resided in, but I was not able to quickly identify this in the file structure. I believe this info is coming in from the SQL database.
    • files_versions is how Nextcloud stores its file version history (if enabled). Old versions are cleaned up per a set of default behaviors to keep more copies of more recent changes, up to a maximum age deletion threshold set at the site level. This folder is stored in approximately the same structure as the main files live structure, however each copy of each version has a suffix .v######... appended, where the number appears to be the Unix timestamp the version was taken (*I have not verified that this exactly matches what the UI shows, nor have I read the source code that generates this). I’ve spot checked via the Linux file command and sha256 that the files in this versions structure appear to be real data - tested one Excel doc and one plain text doc.
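    The suffix pattern described for the trash bin and versions folders can be sketched as a small parser. This assumes the `.d<digits>` and `.v<digits>` suffixes really are Unix timestamps, which is only an observation from poking at the folders and has not been checked against Nextcloud's source; `parse_stamped_name` and `StampedName` are hypothetical names, not part of any Nextcloud tooling.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# Trailing ".d<digits>" (trash bin) or ".v<digits>" (version history).
# Assumption: the digits are a Unix timestamp in seconds.
_SUFFIX = re.compile(r"^(?P<name>.+)\.(?P<kind>[dv])(?P<ts>\d+)$")


class StampedName(NamedTuple):
    original_name: str
    kind: str            # "trash" or "version"
    timestamp: datetime  # timezone-aware, UTC


def parse_stamped_name(filename: str) -> Optional[StampedName]:
    """Split an on-disk name into the original name and its timestamp.

    Returns None when the name carries no recognised suffix.
    """
    match = _SUFFIX.match(filename)
    if match is None:
        return None
    kind = "trash" if match.group("kind") == "d" else "version"
    stamp = datetime.fromtimestamp(int(match.group("ts")), tz=timezone.utc)
    return StampedName(match.group("name"), kind, stamp)
```

    For example, `parse_stamped_name("photo.png.d1700000000")` yields the original name `photo.png`, kind `trash`, and a timestamp in November 2023, which is handy for sorting recovered files by deletion date when restoring from a raw copy of the data directory.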

    I think that should get a fairly rough answer to your original question, but if I left something out you’re curious about, let me know.


    Finally, I wanted to thank you for making me actually take a look at how I had decided to configure and back up my Nextcloud instance and ngl it was kind of a mess. The trash bin and versions can both get out of hand if you have frequently changing or deleting/recreating files (I have network synchronization glued onto some of my games that do not have good remote save support). Retention policy on trash and versions cleaned up extraneous data a lot, as only one of those was partially configured.

    I can see a lot of room for improvements… just gotta rip the band-aid off and make intelligent decisions rather than just slapping an rsync job that connects to the Nextcloud instance and replicates down the files and backend database. Not terrible, but not great.

    In the backend I’m already using ZFS for my files and Redis database, but my core SQL database was located on the server’s root partition (which is XFS - I’d rather not mess with a DKMS module from a boot CD if something happens and upstream borks the compile, which is precisely what happened when I upgraded to OpenZFS 2.1.15).

    I do not have automatic ZFS snapshots configured at this time, but based on the above, I’m reasonably confident that I could get data back from a ZFS snapshot if any of the normal guardrails within Nextcloud failed or did not work as intended (trash bin and internal version history). Plus, the data in that cursed rsync backup should be at least 90% functional.



  • I don’t have a full answer to snapshots right now, but I can confirm Nextcloud has VFS support on Windows. I’ve been working on a project to move myself over to it from Syno drive. Client wise, the two have fairly similar features with one exception - Nextcloud generates one Explorer sidebar object per connection, which I think Synology handles as shortcuts in the one directory. I’d prefer if NC did the latter or allowed me to choose, but I’m happier with what I got for now.

    As for the snapshotting, you should be able to snapshot the underlying FS/DB at the same time, but I haven’t poked deeply at that. Files I believe are plain (I will disassemble my nextcloud server to confirm this tonight and update my comment), but some do preserve version history so I want to be sure before I give you final confirmation. The Nextcloud root data directory is broken up by internal user ID, which is an immutable field (you cannot change your username even in LDAP), probably because of this filesystem.

    One thing that may interest you is the external storage feature, which I’ve been working on migrating a large data set of mine to:

    • can be configured per-user or system-wide
    • password can be per-user, system-wide, or re-use the login password on the fly
    • data is stored raw on an external file server - supports a bunch of protocols, off hand SMB, S3, WebDAV, FTP
    • shows up as a normal-ish folder in the base user folder
    • can template names, such as including your username as part of the share name
    • Nextcloud does not independently contribute versioning data to the backend file server, so the only version control is what your backing server natively implements

    Admin docs for reference: https://docs.nextcloud.com/server/latest/admin_manual/configuration_files/external_storage_configuration_gui.html

    I use LDAP user auth to my nextcloud, with two external shares to my NAS using a pass-through session password (the NAS is AD joined to the same domain as Nextcloud uses for LDAPS). I don’t know if/how the “store password in database” option is encrypted, but if anyone knows I would be curious, because using session passwords prevents the user from sharing the folder to at least a federated destination (I tried with my friend’s NC server, haven’t tried with a local user yet but I assume the same limitations apply). If that’s your vibe, then this is a feature XD.

    One of my two external storage mounts is a “common” share with multiple users accessing the same directory, and the second share is \\nas.example.com\home\nextcloud. Internally, these are, I believe, handled by PHP spawning smbclient subprocesses, so if you have lots of remote files and don’t want to nuke your Nextcloud, you will probably need to increase the PHP child limits (that took me too long to solve lol)

    That funny sub-mount name above handles an edge case where Nextcloud/DAV can’t handle directories with certain characters - notably the # that Synology uses to expose their #recycle and #snapshot structures. This means that remote mount to SMB has a limitation at the moment where you can’t mount the base share of a Synology NAS that has this feature enabled. I tried a server-side Nextcloud plugin to filter this out before it was exposed to DAV, but it was glitchy. Unsure if this was because I just had too many files for it to handle thanks to the way Synology snapshots are exposed or if it actually was something else - either way I worked around the problem for now by not ever mounting a base share of my Synology NAS. Other snapshot exposure methods may be affected - I have a ZFS TrueNAS Core, so maybe I’ll throw that at it and see if I can break Nextcloud again :P

    Edit addon: OP just so I answer your real question when I get to this again this evening - when you said that Nextcloud might not meet your needs, was your concern specifically the server-side data format? I assume from the rest of your questions that you’re concerned with data resilience and the ability to get your data back without any vendor tools - that it will just be there when you need it.



  • Adding on one aspect to things others have mentioned here.

    I personally have both ports/URLs opened and VPN-only services.

    IMHO, it also depends on the exposure tolerance the software has or risk of what could get compromised if an attacker were to find the password.

    Start by thinking of the VPN itself (Tailscale, Wireguard, OpenVPN, IPSec/IKEv2, Zerotier) as a service just like the service you’re considering exposing.

    Almost all (working on the all part lol) of my external services require TOTP/2FA and are required to be directly exposed - i.e. VPN gateway, jump host, file server (nextcloud), git server, PBX, music reflector I used for D&D, game servers shared with friends. Those ones I either absolutely need to be external (VPN, jump) or are external so that I don’t have to deal with the complicated networking of per-user firewalls so my friends don’t need to VPN to me to get something done.

    The second part for me is tolerance to be external and what risk it is if it got popped. I have a LOT of things I just don’t want on the web - my VM control panels (proxmox, vSphere, XCP), my UPS/PDU, my NAS control panel, my monitoring server, my SMB/RDP sessions, etc. That kind of stuff is super high risk - there’s a lot of damage that someone could do with that, a LOT of attack surface area, and, especially in the case of embedded firmware like the UPSs and PDUs, potentially software that the vendor hasn’t updated in years with who-knows-what bugs lurking in it.

    So there’s not really a one size fits all kind of situation. You have to address the needs of each service you host on a case by case basis. Some potential questions to ask yourself (but obviously a non-exhaustive list):

    • does this service support native encryption?
      • does the encryption support reasonably modern algorithms?
      • can I disable insecure/broken encryption types?
      • if it does not natively support encryption, can I place it behind a reverse proxy (such as nginx or haproxy) to mitigate this?
    • does this service support strong AAA (Authentication, Authorization, Accounting/Auditing)?
      • how does it log attempts, successful and failed?
      • does it support strong credentials, such as appropriately complex passwords, client certificate, SSH key, etc?
      • if I use an external authenticator (such as AD/LDAP), does it support my existing authenticator?
      • does it support 2FA?
    • does the service appear to be resilient to internet traffic?
      • does the vendor/provider indicate that it is safe to expose?
      • are there well known un-patched vulnerabilities or other forum/social media indicators that hosting even with sane configuration is a problem?
      • how frequently does the vendor release regular patches (too few and too many can be a problem)?
      • how fast does the vendor/provider respond to past security threats/incidents (if information is available)?
    • is this service required to be exposed?
      • what do I gain/lose by not exposing it?
      • what type of data/network access risk would an attacker gain if they compromised this service?
      • can I mitigate a risk to it by placing a well understood proxy between the internet and it? (for example, a well configured nginx or haproxy could mitigate some problems like a TCP SYN DoS or an intermediate proxy that enforces independent user authentication if it doesn’t have all the authentication bells and whistles)
      • what VLAN/network is the service running on? (*if you have several VLANs you can place services on and each have different access classes)
      • do I have an appropriate means to access this service remotely other than exposing it? (Is VPN the right option? some services may have alternative connection methods)

    So, as you can see, it’s not just cut and dry. You have to think about each service you host and what it does.

    Larger well known products - such as Guacamole, Nextcloud, Owncloud, strongswan, OpenVPN, Wireguard - are known to behave well under these circumstances. That’s going to factor in to this too. Many times the right answer will be to expose a port - the most important thing is to make an active decision to do so.


  • I’m not the commenter but I can take a guess - I would assume “data source” refers to a machine readable database or aggregator.

    Making the system capable of turning off a generic external service in an automated way isn’t necessarily trivial, but it’s doable given appropriate systems.

    Knowing when to turn a service off is going to be the million dollar question. It not only has to determine what the backend application version is during its periodic health check, it also needs to then make an autonomous decision that a vulnerability exists and is severe enough to take action.

    Home Assistant probably provides a “safe list” of versions that instances regularly pull down and automatically disconnect if they determine themselves to be affected, or, if the remote UI connection passes through the Home Assistant Central servers, the Central servers could maintain that safety database and off switch. (Note - I don’t have a Home Assistant so I can’t check myself)
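    As a rough sketch of the decision step only: assuming the "data source" is a list of affected version ranges, the check an instance (or a central server) runs could look like the code below. This is a guess at the general shape, not how Home Assistant actually does it; the function names, the dotted-integer version format and the example ranges are all made up for illustration.

```python
from typing import Iterable, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as "2023.3.2" into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def should_disconnect(
    installed: str,
    affected_ranges: Iterable[Tuple[str, str]],
) -> bool:
    """Return True if `installed` lies in any [first_affected, first_fixed) range.

    `affected_ranges` stands in for the machine-readable data source:
    pairs of (first affected version, first fixed version).
    """
    current = parse_version(installed)
    return any(
        parse_version(first_affected) <= current < parse_version(first_fixed)
        for first_affected, first_fixed in affected_ranges
    )
```

    With a made-up range of `("2023.1.0", "2023.3.2")`, an instance on `2023.2.5` would be told to disconnect while one on `2023.3.2` would not. The hard part stays where the comment puts it: deciding which vulnerabilities are severe enough to be published in that list at all.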