Ask HN: How much would you pay for an extremely scalable, resilient database?
8 points by twokei on Nov 10, 2019 | 14 comments
Let's say that there was a database with a magic consensus protocol that allows it to easily scale from one to tens to hundreds to thousands to tens of thousands of nodes.

Tens of thousands of transactions may be performed per second, each taking about 1 to 4 seconds to be applied (which suffices for any use case that does not need to be near-real-time).
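
As a rough back-of-the-envelope check, Little's law (in-flight = throughput x latency) gives a feel for how many transactions would be pending at once under these numbers. The 10,000 tx/s figure below is an assumed lower bound for "tens of thousands", not a measured value, and the function name is only illustrative.

```python
def in_flight(throughput_tps: float, latency_s: float) -> float:
    """Little's law: average number of transactions in progress at once."""
    return throughput_tps * latency_s

# Assumed illustrative figures: 10,000 tx/s with 1 s and 4 s apply latency.
low = in_flight(10_000, 1.0)
high = in_flight(10_000, 4.0)
print(low, high)  # 10000.0 40000.0
```

So under these assumptions, somewhere between ten and forty thousand transactions would be waiting to be applied at any given moment.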

The database supports procedural functions written in all kinds of hipster/systems-oriented programming languages, and may be bootstrapped to support all kinds of data structures and queries.

It was made to be as developer-friendly as possible.

The database can run on minimal hardware, from as little as 512 MB of RAM and 2 vCPUs to more along the lines of 2 GB of RAM and 4 vCPUs. The database may be hosted for a minimal cost.

The database is resilient in spite of all but one of your nodes crashing, allowing you to fulfill Tier 4 SLAs in terms of uptime without the need for a team of Site Reliability Engineers (SREs), or engineers with domain expertise in distributed systems.

My question to those of you on HN in managerial or development positions at startups or enterprises, small or large: how much would you pay for such a database?

How would the price differ if it was through a one-time/monthly licensing fee? What fee structure would you personally prefer?

If instead this database were bundled with a developer productivity framework like Meteor that would significantly speed up time-to-production, where you can build an entire platform - from backend to frontend - in a single codebase, would you pay more?



I would prefer -

1). Either "open source code solution" - allowing other members to fix, enhance, optimize the app.

2). OR, "one-time licensing fees with open source code allowing a change in code for own use case -AGPL"

3). "if monthly then - expecting constant updates and innovation on the product on a regular basis" to justify the subscription fees [still self-hosted and not on other parties cloud].


Gotcha! Makes sense to me :). What price tag would you put on this software?

There's been a pretty interesting development from what I've seen on annual licensing fees for enterprise features, with the code open-sourced.


> The database is resilient in spite of all but one of your nodes crashing,

Well, once I hear that, I wouldn't be willing to pay anything. I admit, you did say magic consensus protocol. But by definition, this is impossible, unless you can reliably detect a node has gone down and isn't up and serving requests somehow. TBF that is to some extent possible. But a network partition will still take it down.
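
A minimal sketch of why this is a problem, assuming a majority-quorum protocol (an assumption; the post does not say what kind of consensus is involved): any two quorums must share a node for the cluster to agree, and that caps the number of crashes it can survive well below "all but one". The function names are illustrative.

```python
def majority_quorum(n: int) -> int:
    """Smallest group size such that any two such groups share a node."""
    return n // 2 + 1

def tolerated_crashes(n: int) -> int:
    """Crashes a majority-quorum cluster can survive and still form a quorum."""
    return n - majority_quorum(n)

for n in (3, 5, 1000):
    print(n, majority_quorum(n), tolerated_crashes(n))
# 3 2 1
# 5 3 2
# 1000 501 499
```

Surviving n - 1 crashes would need a quorum of size 1, at which point two nodes that are merely cut off from each other could both accept conflicting writes.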

At a certain point, the biggest reliability factor is the complexity of the software, not the reliability of the hardware. You want the clustering logic as simple as possible. You need parts of query evaluation, hipster programming language evaluation, to be precisely deterministic.


It would be possible if, say, the database were fully replicated in a masterless fashion.

The tradeoff is slower writes (though there are ways to keep write latency from growing as you replicate across more nodes!).

Reads will scale linearly with the number of nodes you replicate across.

So, the novelty in some sense is fast replication of data in a masterless fashion (without a leader of the kind used in, for example, Raft or Paxos).

If this protocol was surprisingly simple (which dumbs down the complexity of the software significantly), would you pay for this sort of database?


Yeah, no, the database being replicated "in a masterless fashion" doesn't help. The problem is that under some network partition conditions, you can't do some writes or any writes, unless you want to sacrifice resilience and/or consensus.


Why wouldn't there be failures only under a complete network partition? So long as one node in one partition may communicate with another node in another partition, then writes may still be performed.

Availability would be what is sacrificed in the event that a node is partitioned away from the main network.


Only when that's true of those two nodes and no other disjoint subset of nodes.
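
To make that condition concrete, here is a small sketch that groups nodes by who can still reach whom and checks which groups hold a majority. The node names and link list are made up for illustration, and the majority rule is an assumption, since the hypothetical protocol is unspecified.

```python
from collections import defaultdict

def components(nodes, links):
    """Return the groups of nodes that can still reach each other."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adj[node] - group)
        seen |= group
        groups.append(group)
    return groups

def writable_groups(nodes, links):
    """Groups holding a strict majority of all nodes (at most one can)."""
    return [g for g in components(nodes, links) if len(g) > len(nodes) / 2]

nodes = ["a", "b", "c", "d", "e"]
# Hypothetical partition: {a, b, c} on one side, {d, e} on the other.
links = [("a", "b"), ("b", "c"), ("d", "e")]
print([sorted(g) for g in writable_groups(nodes, links)])  # [['a', 'b', 'c']]
```

If both sides kept accepting writes without a rule like this, the two groups would diverge; with it, the minority side gives up availability until the partition heals.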


I'm reading this post and really confused. If you have something, it might be easier to show it. It also seems like some of what you are selling is devoid of any understanding of the real-world nuance and specificity that you encounter at scale.

Whatever you have built clearly doesn't do all of these things today, if ever. What have you built or what are you building? How is it different, today, from whatever else is in the market?

I have a sushi restaurant. But, instead of us making sushi for you, we show you and your guests how to make it and by the end of the night, you're doing it like a pro and having a great experience, with wonderful food.

Then, we'll let you know if it's worth giving it a shot.


OP sounded very hypothetical to me - not asking as a startup idea...


Yep! Just to clarify, this is entirely hypothetical! I want to know the value-add people would put on such a database if it were to exist.

There's little insight from what I've seen in the database market as to what people really want the most out of a database, or what makes a database "appealing" to people apart from its reputation for being time-tested.


> Tens of thousands of transactions may be performed per second, each taking about 1 to 4 seconds to be applied.

That is just too slow for anything that I would need. I'd be impressed with hundreds of thousands of transactions per second with double-digit millisecond latencies.


Difficult question.

But perhaps look at the closest competitors and see what they charge?

For example, take a look at Cockroach Labs, or MySQL InnoDB Cluster, or one of the many database-as-a-service providers, or whatever you think is the closest competitor, and see what their licensing terms and pricing are...


Look at the cost of existing databases that do this. Snowflake, Elasticsearch, etc. Their pricing models are public.


One bajillion snake oil dollars.



