My company has three offices. We have about 200GB of shared storage. We often need to access and write files that are 10-20MB in size. Currently we have a centralized file server at the main office and a VPN. Using the file server at the main office is a pleasure, but using it from the other two offices is slow. I am looking for a stable distributed filesystem, so that I can have a full copy of the data on a server at each of the three locations. I wish to share between each server and its local clients via both NFS and Samba. Clients should be able to read and write to their local server, and the three servers should collaborate to keep all their data in sync. Can Ori offer this? Something else?
The goal of FS-Cache and CacheFS is to reduce network traffic: some data requests are satisfied from local storage (CacheFS) instead of going over the network. The load on the server should also drop, since it no longer has to satisfy every data request. This reduction may make up for the increased file lookup and file read times that the cache introduces.
Depending on the exclusivity/locking requirements and the amount of change, you might be better off with complete replication among offices, using something like Dropbox (if you trust them), SparkleShare (if you want to host it yourself, though sync is less efficient), or simply rsync scripts going back and forth.
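To make the "rsync scripts going back and forth" option concrete, here is a minimal sketch, not a tested production setup. The paths and the `office2` host name are hypothetical, and the two-pass approach is an assumption about what such a script might look like: it relies on rsync's `--update` flag so that the newer copy of a file wins, which means deletions do not propagate and simultaneous edits to the same file are not merged.

```python
import subprocess


def rsync_command(src, dest, dry_run=True):
    """Build an rsync command that copies the contents of src into dest.

    --archive keeps permissions and timestamps, --update skips files that
    are newer on the receiving side, --compress helps on slow WAN links.
    """
    cmd = ["rsync", "--archive", "--update", "--compress"]
    if dry_run:
        # Only report what would be transferred; change nothing.
        cmd.append("--dry-run")
    # A trailing slash on the source means "the contents of this directory".
    cmd += [src.rstrip("/") + "/", dest]
    return cmd


def sync_both_ways(local_dir, remote_dir, dry_run=True, run=subprocess.run):
    """Push local changes to the remote side, then pull remote changes back."""
    for src, dest in ((local_dir, remote_dir), (remote_dir, local_dir)):
        run(rsync_command(src, dest, dry_run), check=True)


# Example (hypothetical paths and host name):
# sync_both_ways("/srv/share/", "office2:/srv/share/", dry_run=True)
```

Run from cron at each branch office, something like this gives every site a full local copy, but it offers no locking at all, so it only fits if two offices rarely touch the same file at the same time.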
A key to convenience, which Dropbox delivers, is fetching the data before you need it. 200GB is not that much in the grand scheme of things today - if you only have 1GB/day of changing data, it could be viable.
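A rough back-of-the-envelope check of that viability claim, as a sketch: the link speeds and the 80% efficiency factor below are assumptions for illustration, not figures from this thread. Only the 200GB total, the 1GB/day of changes, and the 20MB file size come from the discussion.

```python
GB = 1000 ** 3  # decimal gigabyte, in bytes
MB = 1000 ** 2  # decimal megabyte, in bytes


def transfer_seconds(size_bytes, link_mbps, efficiency=0.8):
    """Seconds needed to move size_bytes over a link of link_mbps megabits/s.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable after VPN and protocol overhead.
    """
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return size_bytes * 8 / usable_bits_per_second


if __name__ == "__main__":
    # Hypothetical inter-office link speeds, in megabits per second.
    for mbps in (10, 50, 100):
        initial = transfer_seconds(200 * GB, mbps) / 3600   # hours
        daily = transfer_seconds(1 * GB, mbps) / 60          # minutes
        one_file = transfer_seconds(20 * MB, mbps)           # seconds
        print(f"{mbps:>4} Mbit/s: initial copy {initial:6.1f} h, "
              f"daily changes {daily:5.1f} min, one 20MB file {one_file:5.1f} s")
```

Under these assumptions the initial 200GB copy is the expensive part; the daily 1GB of changes is small by comparison, and syncing it in the background means users never wait on an individual 20MB file crossing the WAN.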
I set up Syncthing, and it's syncing now. I think this software will do exactly what I want. It's a standalone binary that runs as a standard user-space app, keeping data synchronized between all the systems without the need to mount a new filesystem or do any advanced configuration.
I tried it ~2 weeks ago and was unable to figure out how to get it to synchronize across multiple computers as promised (could have been operator error or a lack of documentation). A few times I also unmounted the virtual file system (or maybe failed to) and ended up with a repository in a broken state, with an error message saying that fixing it was not implemented yet.
I've also played with SparkleShare, which has been working well in my very limited testing. The downside is that it is written in .NET, which has a large footprint (pre-paid if you are in the Microsoft world). With SparkleShare, I'm using my own git repositories (gitolite), which works well but requires some hand editing of the configuration file.
It sounds like it might work (caveat: I haven't tried it yet). The best info I could find on it was this paper[1]. It gives some command line examples, explains how the data model works (spoiler: it's similar to git) and provides some benchmarks. The NFS comparison benchmarks were very interesting (generally faster over a WAN than NFS is over a LAN).
Thanks for the link. "Ori over a WAN outperforms NFS over a LAN" is a ridiculous claim, except maybe if you ignore bandwidth. I might have to give this a shot though. I hope it's stable.
Can't say about Ori or a lot of the distributed file systems being showcased, but Panzura's global file system addresses this exact use case. But since they make enterprise products that run on dedicated hardware, they might be a bit pricey.