SBCs are great for public web servers and can save you quite a bit in energy costs. I've used a Raspberry Pi 4B for about 5 years with around 10k human visitors (~5k bots) per year just fine. I'd like to try a RISC-V SBC as a server, but I may have a few more years to wait.
I don't run into resource issues on the Pi4B, but resource paranoia (like range anxiety in EVs) keeps me on my toes about bandwidth use and encoding anyway. I did actually repurpose my former workstation and put it in a rackmount case a couple weeks ago to take over duties and take on some new ones, but it consumes so much electricity that it embarrasses me and I turned it off. Not sure what to do with it now; it is comically over-spec'd for a web server.
Most helpful thing to have is a good router; networking is a pain in the butt, and there's a lot to do when you host your own, especially once you start serving Flask apps or whatever. MikroTik has made more things doable for me.
Crudely: apache2 logs are parsed every 5 minutes. If the IP address already exists in the post-processed database, the entry is ignored. If it doesn't, a script parses the user agent string and checks it against a whitelist of known "consumer" browsers. If it matches, we assume the visitor is human. We then delete the detailed apache2 logs and store just the IP address, when we first saw it (date, not datetime), and whether it was deemed human or bot. Faking the user agent string or using something like Playwright would fool the script, and in the other direction the browser list will inherently never cover every existing "consumer" browser.
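To make that concrete, here's a rough sketch of the 5-minute pass, not my actual script. Assumptions: the logs are in Apache's "combined" format, the database is SQLite, and the `visitors` table, the `BROWSER_TOKENS` list, and the function names are all made up for illustration. A real whitelist would need to be much longer and stricter than four substrings.

```python
import re
import sqlite3
from datetime import datetime

# Apache "combined" format: IP, identd, user, [time], "request", status, bytes,
# "referer", "user agent". Assumes no escaped quotes inside the quoted fields.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Hypothetical whitelist of "consumer" browser tokens (an assumption, far from complete).
BROWSER_TOKENS = ("Firefox/", "Chrome/", "Safari/", "Edg/")


def looks_human(user_agent):
    """True if the user agent contains any whitelisted browser token."""
    return any(token in user_agent for token in BROWSER_TOKENS)


def init_db(conn):
    # Hypothetical schema: one row per IP, first-seen date only, human/bot flag.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS visitors ("
        "ip TEXT PRIMARY KEY, first_seen TEXT NOT NULL, is_human INTEGER NOT NULL)"
    )


def process_lines(conn, lines):
    """Record each previously unseen IP with its first-seen date and human/bot flag."""
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # not a combined-format line; skip it
        ip = m["ip"]
        if conn.execute("SELECT 1 FROM visitors WHERE ip = ?", (ip,)).fetchone():
            continue  # already in the database: ignore the entry
        # Keep only the date, not the full timestamp.
        seen = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z").date().isoformat()
        conn.execute(
            "INSERT INTO visitors VALUES (?, ?, ?)",
            (ip, seen, int(looks_human(m["ua"]))),
        )
    conn.commit()
```

You'd call `process_lines(conn, open(path))` from cron and then truncate or delete the raw log. Note the substring match shares the weakness described above: anything that sends a browser-looking user agent gets counted as human.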
Every day, a script checks all IP addresses in the post-processed database to see if there are "clusters" on the same subnet. I think the rule is that if we see 3 visitors on the same subnet, we consider them likely bots and retroactively switch those entries to bot in the database. Since I'm not taking in millions of visitors, I think this is reasonable, but it can introduce errors, too.
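The daily subnet check could look roughly like this. Again a sketch under assumptions, not the real script: I'm assuming a SQLite table `visitors(ip, first_seen, is_human)`, a /24 grouping for IPv4 and /64 for IPv6 (the subnet size is my guess here), and the threshold of 3 mentioned above. All names are hypothetical.

```python
import ipaddress
import sqlite3
from collections import defaultdict

CLUSTER_THRESHOLD = 3  # visitors on one subnet before we call it a bot cluster
V4_PREFIX = 24         # assumed subnet size for IPv4
V6_PREFIX = 64         # assumed subnet size for IPv6


def subnet_of(ip):
    """Return the network an address falls in, using the assumed prefix lengths."""
    addr = ipaddress.ip_address(ip)
    prefix = V4_PREFIX if addr.version == 4 else V6_PREFIX
    # strict=False lets us pass a host address and get its enclosing network.
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def flag_clusters(conn):
    """Mark every IP in a crowded subnet as a bot; return how many IPs were in clusters."""
    groups = defaultdict(list)
    for (ip,) in conn.execute("SELECT ip FROM visitors").fetchall():
        groups[subnet_of(ip)].append(ip)

    flagged = [
        ip
        for ips in groups.values()
        if len(ips) >= CLUSTER_THRESHOLD
        for ip in ips
    ]
    # Retroactively switch those entries to "bot" (is_human = 0).
    conn.executemany(
        "UPDATE visitors SET is_human = 0 WHERE ip = ?", [(ip,) for ip in flagged]
    )
    conn.commit()
    return len(flagged)
```

The obvious failure mode is the one already mentioned: three real people behind the same ISP block or campus network get flipped to bots, so the threshold and prefix lengths are worth tuning against your own traffic.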