As somebody who's been looking really hard at a project/side business that'd use Spaces (DO's object storage system), this makes me super, super nervous. To say nothing of block storage--yikes.
Can anyone speak to quality/reliability of other object storage providers that have S3-compatible (including presigned URL) APIs? S3's pricing is absolutely ridiculous by comparison, but they have the reliability argument on their side...
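To be clear about what I need from the API, this is roughly the workflow; a minimal sketch with boto3, where the endpoint, bucket, and credentials are all hypothetical placeholders (any S3-compatible provider should look about the same):

    import boto3

    # Hypothetical endpoint, bucket, and credentials -- substitute your provider's values.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://nyc3.digitaloceanspaces.com",
        aws_access_key_id="SPACES_KEY",
        aws_secret_access_key="SPACES_SECRET",
        region_name="nyc3",
    )

    # Presigned GET URL, valid for one hour.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "my-bucket", "Key": "path/to/object"},
        ExpiresIn=3600,
    )
    print(url)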
>S3's pricing is absolutely ridiculous by comparison, but they have the reliability argument on their side...
Well, unless your S3 buckets are in us-east-1. For some reason Amazon keeps having issues with S3 in that region.
Since the storage costs appear to be the same between Spaces and S3 ($0.02/GB/month) and neither charge for inbound transfer, I'm assuming your problem is with the outbound transfer pricing (S3 charges 9x what DO charges) and/or the per-request pricing. GCP's Regional Cloud Storage has the same storage pricing, even higher outbound transfer pricing, and the same request pricing. I haven't looked at any other providers, but if you want reliability, you're going to have to pay for it.
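To put rough numbers on it (a sketch using the per-GB prices above; the DO egress rate is inferred from the "9x" figure, and per-request pricing is left out):

    # Rough monthly cost for 1 TB stored plus 1 TB of outbound transfer.
    gb_stored = 1024
    gb_egress = 1024

    s3 = gb_stored * 0.02 + gb_egress * 0.09       # ~ $112.64
    spaces = gb_stored * 0.02 + gb_egress * 0.01   # ~ $30.72 (assuming $0.01/GB egress)
    print(s3, spaces)

So the gap is real, but it's entirely in the egress column.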
Their us-east-1 region is notorious for being the region with the least reliability. Unless there's a good reason not to I always recommend people default to us-west-2.
I can't speak about S3, but we use GCS (Google Cloud Storage) heavily and haven't experienced any problems. From time to time we see a slowdown, but we've never seen an outage.
I also looked at B2 [0] once or twice. The price is great, but the traffic cost (egress from GCE) renders it unusable for us.
There are actually a few options in egress land for us if cost is your primary concern.
If you're serving over http(s), you should probably be using Cloud CDN with your bucket [1], or put one of our partners like Cloudflare or Fastly in front via CDN Interconnect [2]. Both of these get you closer to $0.04-$0.08/GB depending on src/dest.
If not, and you don't care that we have a global backbone, you can get a more AWS-like network with our Standard Tier [3] (curiously with pricing squirreled away at [4], I'll file a bug). The packets will hop off our network in a hot potato / asap fashion, so you're not riding our backbone as much.
I know about Standard Tier, though it had slipped my mind--thanks for pointing it out.
It's still way too expensive. And Cloud CDN isn't an appropriate tool for my use case. I really do just need a bunch of egress from a single location that isn't insanely expensive. $0.085/GB is in that insanely-expensive tier, for me.
Don't know yet! But I expect to see between 100x and 1000x on a per-megabyte basis. Not evenly distributed across objects (objects between 40MB and 150MB), but as a rough estimate.
I'd be happy to talk further via email; it's not secret, just not public.
Your bandwidth pricing is a joke. Yes, you've got a nice network, and yes, you pay premiums for transit from providers that are "hard to work with". And yes, you have dark fiber between your locations, which costs a lot of money, but even considering those facts you are still charging your customers at least 10x what that bandwidth should cost.
How have you even calculated those prices?
"Let's look at AWS and make it even more expensive"?
The bandwidth charges are hefty if you're moving a lot of bits, but I wouldn't use anything other than S3 or GCS probably -- the other guys just don't have a track record of reliability yet.
But, you can build a poor-man's CDN -- varnish caches on DO/Linode/whatever where you get multiple terabytes of bandwidth for a small VM. So, you use the best object storage provider, but move most of the bits cheaply using Varnish + Route53 geo-dns.
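The geo-DNS half of that is pretty mechanical. A rough sketch with boto3, assuming a hypothetical hosted zone, domain, and two Varnish cache IPs (all names made up); each cache then just proxies misses back to the object store:

    import boto3

    route53 = boto53 = boto3.client("route53")

    ZONE_ID = "Z0000000000000"  # hypothetical hosted zone
    caches = [
        {"id": "us-cache", "continent": "NA", "ip": "203.0.113.10"},
        {"id": "eu-cache", "continent": "EU", "ip": "203.0.113.20"},
    ]

    changes = [
        {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "cdn.example.com",
                "Type": "A",
                "SetIdentifier": c["id"],
                "GeoLocation": {"ContinentCode": c["continent"]},
                "TTL": 60,
                "ResourceRecords": [{"Value": c["ip"]}],
            },
        }
        for c in caches
    ]

    # You'd also want a default record (GeoLocation CountryCode "*") as a catch-all.
    route53.change_resource_record_sets(
        HostedZoneId=ZONE_ID, ChangeBatch={"Changes": changes}
    )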
I mean, Digital Ocean itself has been “jokeish”. I had a VM go down for 12 hours, and it took 6 hours just to get them to confirm they had an issue on that machine. It was stupid.
There is the Sheepdog distributed file system, which recently hit version 1.0. Sheepdog is similar to Ceph but seems simpler to operate.
For distributed object storage: I have also used MooseFS and LizardFS, and both run very steadily under production workloads. Steady as in: set it up and then no ops issues.
Also on the short list is BeeGFS, created by Fraunhofer; it's a seriously fast distributed file system.
I can't speak for the quality of other object storage providers, but being in the storage business I can say that if someone is running Ceph, find another provider.
If you are relying on a single object storage provider and cannot survive downtime, data loss, or simply being very slow at times, you will never find a good one. Expect things to fail. Distributed systems are not trivial, and a random object storage provider rarely has enough expertise to run Ceph or any other open source solution at scale with no issues.
Or you should purchase commercial support for said storage...
Salesforce runs several large Ceph clusters, and they have a dedicated team to run it. If you can't invest in the employees, you should invest in commercial support.
Salesforce also commits a lot of updates and patches back to the Ceph community
The issue with Ceph isn't that it's somehow deficient. It's amazing. The problem is that it's difficult to engineer correctly and hard to troubleshoot.
There is no other software, open source or otherwise, that works quite as well as Ceph for providing durability and scale.
ScaleIO gets high marks for block storage performance compared to Ceph. It's not quite as durable and lacks some other features, but people seem to like it.
A lot of companies using Ceph at scale are facing huge issues (OVH, etc.), so he is not wrong. Why take the risk of going with a solution that is known to cause issues?
I've talked to a lot of large-ish commercial Ceph customers and they seem to spend a lot of time building kludge-arounds for support. And tend to live terrified that the whole clumsy edifice will come crashing down at the cost of their jobs.
Also, Ceph is block, object, and file. Block is OK up to a point, object is dubious, and file is utterly untrustworthy. At least at any kind of real scale - 3 servers in a rack aren't "scale".
Why must someone who isn't a Ceph fan (and I fail to see why storage systems are a "fan" activity) live in the evil pockets of EMC? I know people who've smoked for years and don't have any sign of lung cancer either.
OVH isn't exactly a shining example of a quality engineering organization. Simple web searches show how they have misused things and caused large outages.
Ceph is very reliable and durable. We've actually gone out of our way to try and corrupt data, but we failed every time. It always repaired the data correctly and brought things back into a good working state.
CERN and Yahoo run very large Ceph clusters at scale, too.
You can use Ceph together with OpenStack. They used Ceph for their cloud services but had huge problems. If I'm not mistaken, they have completely thrown out Ceph by now.
Any idea what the underlying issues with Ceph were?
My story is a bit dated, but we went from gluster to ceph to moosefs at one startup. Gluster had odd performance problems (slow metadata operations - scatter/gather rpcs and whatnot I would guess) and it was hard to know from the logs what was going on.
Ceph was very very early at this point, but part of it ran as a kernel module and the first time it oops'd, I deleted that with fire. MooseFS ran all in userspace, had good tools for observability into the state of the cluster, and the source code was simple and clean. It didn't have a good story around multi-master at that time, but I think that is improved now.
Ceph is extraordinarily complicated to run correctly. The docs aren't great and commercial support is pretty mediocre.
It's an amazing piece of software, but takes a great deal of engineering to get right. Most folks won't invest that much engineering into their storage.
This is why providers like EMC and NetApp can extract 10x the cost of the raw storage from enterprises.
The RedHat ceph docs are great and open to everyone for free.
The RedHat commercial support has been pretty good for us. We presented them with 2 bugs, and they addressed both. One took a few weeks but one only took a few hours to get a hotfix started.
EMC storage is absolute trash post Dell merger. Pure 100% dumpsterfire. Their customers know their systems better than they do. It's pathetic.
No clue what the underlying issue was, but here's what they reported:
"We have about 200 harddisk in this cluster... 1 of the disks was broken and we removed it. For some reasons, Ceph stopped to working : 17 objectfs are missed. It should not."
This isn't related to block storage at all, but I was a big fan of DO until I hit a weird issue where they wanted me to prepay via PayPal to spin up more than 50 droplets at a time. I work in an organization that is spinning up many nodes at once for a short time and then destroying them soon after, for various but totally legitimate reasons. One look at our account history can demonstrate that this is almost exclusively how we use their services, so it's not like this was a weird request. And we've never missed a payment or paid late or otherwise ever given Digital Ocean any reason to think we wouldn't be good for the charges at the end of the month (especially considering we were already spending sometimes in the thousands of dollars every month). This was so off-putting. I stopped using Digital Ocean that day.
We do have a business account, if I'm reading the console correctly. I was wrong about the limit of 50. It is a limit of 100 droplets but this is an artificial limitation that they wouldn't budge on. It's clear from the (lengthy) account history that this is normal behavior for us, and I work for a company whose name everyone knows, so it's not like we're some no-name scammers. Regardless, asking customers to buy vouchers for what really is only moderate use of a service is really off-putting. It was so off-putting actually that, more than halfway through writing it, I scrapped a driver for some pretty popular software that would have enabled the use of Digital Ocean as a backend.
Minio is cool. Unfortunately, performance is anyone's guess.
It has erasure coding as well. You could deploy it on bare VMs with local storage in any cloud provider and have no dependency on network block storage.
With k8s 1.10 you get persistent local storage as well, so you could probably build a fairly highly available system. Pro tip: do it in GCP, as they have nice local SSDs you can attach to any instance. They're 375GB, 25k IOPS, and $0.08/GB, way cheaper than AWS I2 instances.
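And whichever way you deploy it, the client side stays the same S3-ish workflow. A quick sketch with the minio Python SDK against a hypothetical self-hosted endpoint (names and credentials made up):

    from minio import Minio

    # Hypothetical self-hosted endpoint and credentials.
    client = Minio(
        "minio.internal.example.com:9000",
        access_key="MINIO_KEY",
        secret_key="MINIO_SECRET",
        secure=False,  # set True once it's behind TLS
    )

    if not client.bucket_exists("backups"):
        client.make_bucket("backups")

    # Upload a local file and hand out a time-limited download link.
    client.fput_object("backups", "dumps/db-2018-01-01.sql.gz", "/tmp/dump.sql.gz")
    print(client.presigned_get_object("backups", "dumps/db-2018-01-01.sql.gz"))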
So I have no problem with network storage. I have a cost problem (the project I'm working on is not intended to make a bunch of money, and I'm trying to keep costs low so I can keep prices extremely low). What you're describing would functionally be even more expensive than just using S3 directly, if it were done in AWS or GCP.
Right now the leading (uncomfortable) solution is probably DigitalOcean Spaces and a little bit of prayer.
But...I don't care about the technology. I care about the object storage available to me without caring about the technology. So what's this do for me?
Thanks for the suggestion, but a dollar per gigabyte per month is ridiculous unless you're downloading everything in your store eleven times a month. Even S3 only costs $0.02/GB storage and $0.09/GB egress.
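The "eleven times" figure falls straight out of those numbers:

    # Full downloads of the stored data per month before S3 hits $1/GB/month.
    storage, egress = 0.02, 0.09       # S3 $/GB-month storage, $/GB egress
    print((1.00 - storage) / egress)   # ~10.9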
I would suggest you take a look at Wasabi for object storage. I'm just a customer, but have been using them for close to a year for off-site backup storage and it's been great.
Reposting from my comment yesterday [0]: how are the speeds from your location (and where is that)? It has been ridiculously slow from Northern Europe when I've tried it, like not even 1 MB/s down.
> Wasabi’s hot cloud storage service is not designed to be used to serve up (for example) web pages at a rate where the downloaded data far exceeds the stored data or any other use case where a small amount of data is served up a large amount of times