I have worked on a few projects where graph databases were used. I have not personally seen a case where I feel they add much value relative to their complexity and tradeoffs.
One of the projects was a business workflow application centered around validating business processes by collecting and reporting on process data (think manufacturing quality control). A graph database was used in an attempt to give application users more control in defining their workflows and more expressive semantic reporting abilities. We tried several graph databases. In reality, the schema became implicit and performance was truly awful. The choice of a graph db was a strategic decision; we wanted to enable a different user experience. We probably could have done this project in 20% of the time with a standard database and wound up with a better result.
I have also worked on a problem related to storing and retrieving graph data for image processing. The graph db was obscenely slow and inefficient despite the data models being actual graphs.
Both of the projects I worked on involved people who are experts in graph databases. The level of nuance and complexity was astounding. Even simple tasks like trying to visualize the data became monumentally complex.
My takeaway from both of these experiences was that unless you intend to ask questions about the relationships, a graph might not be a very good fit. Even in that case, other databases will likely perform just as well.
Holy crap, have we worked on some of the same projects? GraphDB inappropriately applied to a process definition/exploration application. In my case I'm pretty sure the correct solution taking into account all desired functionality and existing support was a desktop (sigh, maybe Electron I guess) app and good ol' SQLite, but nooooo, we did a web app with server-side storage in Neo4j. I tried to sell PostgreSQL as it was 100% for sure a better fit for the kind of queries we'd be running, but that didn't fly. They had a Neo4j "expert" to whom I sometimes had to explain how Neo4j worked. The highest-tier tech manager at the client with whom I interacted was learning about Neo4j from what was mostly a marketing book from the Neo4j folks, turns out.
They burned a shitload of money on those bad decisions, on that and other products they'd previously stuck on Neo4j for no good reason, which were also seeing poor and unpredictable performance and having a rough time with immature supporting tools for the DB. Whole thing's closely related to the "we have big data, it's in the single digit GB range, so big! We need big data tools!" error, I think.
> My takeaway from both of these experiences was that unless you intend to ask questions about the relationships, a graph might not be a very good fit. Even in that case, other databases will likely perform just as well.
Precisely the same conclusion I reached, at least in the case of Neo4j. If the main thing you need to do is answer questions about graphs, it might be an OK DB to use. If the main thing you need to do is extract data from graphs, then you sure as hell don't want it as your primary datastore. Maybe—maybe—some kind of supplement to a SQL DB or whatever depending on your exact needs, but it shouldn't be what you're actually storing most of the data in.
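For what it's worth, here is a minimal sketch of the kind of relationship question a plain SQL database can answer, using Python's built-in sqlite3 module and a recursive CTE. The `step` and `edge` tables, the sample workflow, and the `reachable_from` helper are all made up for illustration; none of it comes from the projects described above, and it says nothing about performance at scale.

```python
import sqlite3

# Hypothetical schema: workflow steps as rows, transitions as an edge table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE step (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE edge (
        src INTEGER NOT NULL REFERENCES step(id),
        dst INTEGER NOT NULL REFERENCES step(id),
        PRIMARY KEY (src, dst)
    );
    INSERT INTO step VALUES
        (1, 'receive'), (2, 'inspect'), (3, 'rework'), (4, 'approve'), (5, 'ship');
    -- Note the inspect <-> rework cycle.
    INSERT INTO edge VALUES (1, 2), (2, 3), (3, 2), (2, 4), (4, 5);
""")


def reachable_from(conn, start_id):
    """Return names of all steps reachable from start_id, excluding the start itself."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(id) AS (
            SELECT ?
            UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.id
        )
        SELECT step.name
        FROM step JOIN reach ON step.id = reach.id
        WHERE step.id != ?
        ORDER BY step.id
        """,
        (start_id, start_id),
    ).fetchall()
    return [name for (name,) in rows]


print(reachable_from(conn, 1))
print(reachable_from(conn, 4))
```

The explicit tables also keep the schema visible, which is the part that went implicit in the graph DB version of this kind of app.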