One way to get an error when retrieving a git hash is by building inside a Docker container. If you mount the root directory of a work tree, but not the “git-common-dir”, git will fail to give you the commit hash.
From experience, I can also recommend using SQLite as an application file format. I landed on SQLite after looking for a file format for an educational app we made for simulating biological neural networks. The app is cross-platform, written in Qt, and the simulations needed to be stored as JSON describing the network, along with a thumbnail and some metadata. The format also had to be extensible with more features and backwards compatible when new versions were released. I considered creating our own simple format, using ZIP files, HDF5, Qt resource files, or SQLite.
I landed on SQLite for many of the reasons outlined in this article and in particular because of how easy it was to implement and maintain. SQLite is supported natively in QtSql, which made it extremely easy to write the save and load functions, and later extend these with more data fields. In addition, we did not have to worry about cross-platform support since this was covered by SQLite and Qt already.
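For anyone curious what this looks like outside Qt, here is a minimal sketch of the same idea in plain Python with the built-in sqlite3 module. The table and column names are made up for illustration, not our actual schema:

```python
import json
import sqlite3

def save_simulation(path, network, thumbnail_png, metadata):
    """Save a simulation into a single SQLite file.

    Table and column names are illustrative, not the app's real schema.
    """
    con = sqlite3.connect(path)
    with con:  # commits on success
        con.execute("""CREATE TABLE IF NOT EXISTS simulation (
                           id INTEGER PRIMARY KEY,
                           network TEXT,        -- JSON description of the network
                           thumbnail BLOB,      -- PNG bytes
                           metadata TEXT        -- JSON key/value metadata
                       )""")
        con.execute(
            "INSERT INTO simulation (network, thumbnail, metadata) VALUES (?, ?, ?)",
            (json.dumps(network), thumbnail_png, json.dumps(metadata)))
    con.close()

def load_simulation(path):
    """Load the most recently saved simulation from the file."""
    con = sqlite3.connect(path)
    network, thumbnail, metadata = con.execute(
        "SELECT network, thumbnail, metadata FROM simulation "
        "ORDER BY id DESC LIMIT 1").fetchone()
    con.close()
    return json.loads(network), thumbnail, json.loads(metadata)
```

Adding a new data field later is just an ALTER TABLE plus a couple of lines in these two functions, which is what made maintenance so pleasant.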
It sounds like future schema changes may be a concern for your application. One thing worth looking into is the SQLite user_version pragma. We use it right now to roll our own migrators, and it's light years better than how migrations work in Entity Framework et al.
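A minimal sketch of that pattern (the table names here are hypothetical): each migration upgrades the schema from the version currently stored in user_version, and the pragma is bumped in the same transaction.

```python
import sqlite3

# Migrations keyed by the schema version they upgrade *from*.
# The tables are invented for illustration.
MIGRATIONS = {
    0: "CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)",
    1: "ALTER TABLE notes ADD COLUMN created_at TEXT",
}

def migrate(con):
    """Apply pending migrations; returns the final schema version."""
    version = con.execute("PRAGMA user_version").fetchone()[0]
    while version in MIGRATIONS:
        with con:  # each step commits atomically
            con.execute(MIGRATIONS[version])
            version += 1
            # PRAGMA does not accept bound parameters, hence the formatting.
            con.execute("PRAGMA user_version = %d" % version)
    return version
```

A fresh database starts at user_version 0, so the same code path handles both creating the schema and upgrading old files.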
Interesting. I used HDF5 in a similar situation because we needed to save a lot of same-sized rows of data (simulation time steps), so a matrix-oriented format seemed to make sense. It wasn't entirely without a need for cross-referencing between tables, though, so it does make me wonder now whether SQLite would have been a comparable or better choice. Any reason for rejecting HDF5 in your case?
It may be a regional difference, but the link goes to a book narrated by Wil Wheaton for me. I found Wheaton's narration to be a great listen. He's not the slow-paced, calm narrator you will find in many other non-fiction audiobooks, but rather an enthusiastic storyteller. For this book, I think it was a good fit.
We recently published a paper suggesting an alternative to HDF5 [1] using directories for objects, YAML for metadata and NumPy for data. Many of the points in this article were raised by the reviewers or were worries we had about choosing YAML as the metadata format. In the end, we decided to use a subset of YAML with only basic tags, enforced quoted strings, no directives, and no block scalar styles (fancy multiline strings). So far it has worked out great. I hope it will make the format easier to understand for users and make it possible to write faster parsers in the future.
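To illustrate the general idea (this is a sketch, not the published format; the file names and layout are hypothetical): each object is a directory, and its metadata file is emitted in the restricted YAML subset, with all strings quoted and only basic scalar types.

```python
import os

def dump_yaml_subset(attrs):
    """Serialize a flat dict in the restricted YAML subset described above:
    basic types only, all strings quoted, no directives, no block scalars."""
    lines = []
    for key, value in attrs.items():
        if isinstance(value, str):
            value = '"%s"' % value.replace('\\', '\\\\').replace('"', '\\"')
        elif isinstance(value, bool):
            value = "true" if value else "false"
        lines.append('"%s": %s' % (key, value))
    return "\n".join(lines) + "\n"

def write_object(root, name, attrs):
    """Each object is a directory; metadata sits next to the data files.
    The layout and file name here are illustrative, not the spec."""
    obj_dir = os.path.join(root, name)
    os.makedirs(obj_dir, exist_ok=True)
    with open(os.path.join(obj_dir, "attributes.yaml"), "w") as f:
        f.write(dump_yaml_subset(attrs))
    return obj_dir
```

Because the subset avoids tags, anchors and block scalars, the output is trivially parseable and unambiguous across YAML implementations, which is exactly what we were after.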
You have a good point, but there are some cases where I have a hard time seeing how you would avoid getters and setters completely.
Say you are creating a UI library which has a TextField class. What should the TextField API have in place of the usual getText/setText and isEnabled/setEnabled methods? Say, if the library should support developers who want to create a text field that is automatically filled with data on a button press and can be disabled when a checkbox is ticked.
Note that I'm not arguing against you - I'm honestly curious about what the TextField should be replaced with or what its API should look like.
In this case I would separate the data (probably the model the text field is wrapping) from the UI widgets that expose it... the model being a dumb struct and the text field being a smart object with methods that look (if you squint) more like getters/setters.
But that’s just the thing, the text field in this case isn’t something you’d ever confuse with a dumb struct... it has methods that accept key input, or disable/enable it, or calculate its clipping area, etc. I don’t look at these as “getters” or “setters”, but instead just methods on an object like any other.
It may seem like I’m shifting the goalpost here and I’m sorry, I really do have trouble articulating this, but getters/setters seem like an anti-pattern to me precisely because they let the consumer think they’re just dumb properties (along with invariants like the data you get out matching the data you put in, which you can never actually assume) when they’re anything but.
Thanks for clarifying! I really don't think you're shifting the goalpost. What you're saying is basically that the getText/setText methods are part of the API of the visual TextField, like a moveLeft method would be part of the API of a robot controller. A dumb struct with mutable members, such as a Point with public x, y, z members, is on the other hand both the API and the data in the same object.
It seems to be important to separate data from logic, and API from implementation in discussions like these. Sure, I can make a nice API for a visual Rectangle object with dynamic width and height, and add a check to my setWidth method to make sure the width is never negative and the changes trigger a repaint. But if I need a Rectangle struct to store data in the implementation of my library, it can be way simpler to just make the Rectangle have a public final/const width member, and set and check the width in the constructor.
Sometimes you need the smart Rectangle, and sometimes you need a dumb struct with width and height.
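To make that concrete, here is a rough Python sketch of the two flavors; the class names and the repaint hook are invented for illustration:

```python
from dataclasses import dataclass

class SmartRectangle:
    """UI-facing rectangle: validates on every change and triggers a repaint."""
    def __init__(self, width, height):
        self._width = self._height = 0
        self.set_width(width)
        self.set_height(height)

    def set_width(self, width):
        if width < 0:
            raise ValueError("width must be non-negative")
        self._width = width
        self._repaint()

    def set_height(self, height):
        if height < 0:
            raise ValueError("height must be non-negative")
        self._height = height
        self._repaint()

    def _repaint(self):
        pass  # placeholder: a real widget would schedule a redraw here

@dataclass(frozen=True)
class Rect:
    """Dumb struct for internal use: checked once in the constructor,
    then immutable, so no per-mutation validation is ever needed."""
    width: float
    height: float

    def __post_init__(self):
        if self.width < 0 or self.height < 0:
            raise ValueError("dimensions must be non-negative")
```

The smart version earns its setters because each mutation has a side effect; the dumb version makes invalid states unrepresentable instead.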
Why wouldn't you want to keep the constraints as close to the data as possible? That way you avoid the issue that the Rectangle struct is reused somewhere else, but without the correct constraints.
IMO the constraints should be as close as possible to the thing that needs the data. Your object is an expression of data that may be valid in some contexts and invalid in others; trying to pick any one of them is difficult (and leads to things like complex inheritance trees to express all the different flavors of constraints).
I think this is the essence of the "composition over inheritance" idea that seems to be the most violated piece of sound OO advice out there.
That link says: blocking access due to copyright issues. Respectfully, I don't think you're doing justice to the Turkish freedom of speech issue of Wikipedia being blocked by bringing up The Pirate Bay.
> That link says: blocking access due to copyright issues. Respectfully, I don't think you're doing justice to the Turkish freedom of speech issue of Wikipedia being blocked by bringing up The Pirate Bay.
Can we both agree that both of these things shouldn't happen and ThePirateBay and Wikipedia shouldn't be censored by any country?
Sure, but there's a time and a place, if you know what I mean. One directly undermines access to information, whereas another has to do with copyright law.
I did not mean to compare the two cases at all and am sorry if it came out that way. I just wanted to answer the parent post on 'what the "anomalies" this site detects in Denmark, Finland, and France represent' by pointing out that TPB is blocked/censored in these countries. I only meant to point out one possible reason why the above site listed these countries.
Denmark censors a lot more than just TPB, though it is probably the one website that causes most people to circumvent the censorship.
In Denmark it all started out with blocking child porn sites, then it was expanded to copyright infringement, then it was expanded to companies operating illegally (counterfeit goods stores, illegal gambling, alternative medicine stores) and most recently the government started blocking "terror propaganda" [1] - which means they have begun censoring political speech.
This is in spite of a constitution forbidding censorship, with the counterpoint being that since it's technically private companies doing the censorship, it doesn't violate the constitution. Even though ISPs will be taken to court if they don't censor sites [2]
Sorry, but I did not intend to compare or conflate the two cases. I wanted to give an answer to the parent poster why those countries might be listed on the above site.
If you want to quickly add new tasks to your todo.txt from the command line, I strongly recommend http://todotxt.com/ (a fantastically lightweight solution). It also has an Android app for those times when you are not in front of a computer; the app just reads and writes the same todo.txt file over Dropbox.
It amazes me that Syncthing handles rapidly-changing files better for me than Dropbox. Where Dropbox has left different versions of files on different computers even after they have changed multiple times, Syncthing happily synchronizes them without issues. And the fact that Syncthing is open source and allows me to keep files on my own machines is a big plus. However, I would love to see a paid option for easy backup of selected files to a machine in the cloud. Some people want their files backed up and accessible through a web service and I wish I could recommend Syncthing to them as well.
Just a heads up: that almost certainly means Syncthing is failing to recognize some classes of conflicts and concurrency races, and is destroying some changes. Doing this right is a hard problem. I know conflicts can be a pain to deal with, but we don't spin them out for no reason!
Do you have any specific bug or design reference to the Syncthing codebase that illustrates this, or do you simply believe that it's impossible to do it faster than Dropbox does it?
FWIW, I frequently get notifications from Syncthing that it has found a conflict it can't resolve and asks me to resolve it manually.
Sorry, not super familiar with Syncthing. It's definitely not impossible to do better than Dropbox, but to some degree, as in all things, it's a function of engineering resources, telemetry, and usage.
We've likely put 25-100x more resources into solving this problem over the last 10 years, and we just have a lot more data b/c we have 100s of millions of users on lots of platforms using Dropbox with every application you can imagine. So we're able to tease out the "long tail" of weird file system and application behavior in a way that's very difficult for smaller projects. Truly durable conflict management in the face of arbitrary mutations by user applications on the filesystem ends up being a really, really hard problem to cover exhaustively. The Dropbox client handles literally hundreds of special cases.
So, yeah, I believe it is (in general) safe to assume that Dropbox is probably doing A-more-correct-thing for a complicated (and admittedly confusing) reason when it comes to sync behavior. But we're not perfect--we do still find surprises from time to time, so feel free to contact support if you see something that looks wrong!
That was my initial thought too. However, after rereading his post, he seems to be referring to changing files quickly on a single system and seeing the changes replicated to multiple additional systems. On this reading, Dropbox fails to keep up and eventually decides that some of its own not-yet-finished syncs on the other systems are conflicts.
Are you saying that you solve conflicts specifically for different types of applications (e.g., Word, Photoshop, Excel)? That's impressive. However, now I'm wondering how I get my own apps supported.
I think that Dropbox typically handles conflicts quite well, and the issues I had are more likely bugs outside the conflict resolution implementation. I was a bit brief in my comment above, so let me elaborate in case you or someone else is interested:
The issues I had didn't result in conflicted files. Rather, after making a big change (i.e. switching git branches) some files were never updated or synced. Dropbox stopped picking up changes in the folder and eventually removed new changes once restarted.
The order of events was something along the lines of:
1) Did work on computer A that caused massive file changes (i.e. moving between git branches).
2) Moved to computer B to continue work.
3) Noticed files were old or missing on B.
4) Syncing files in some other folders worked, but nothing happened in the folder with missing files.
5) Restarted Dropbox on both machines in hope that this would trigger a fresh sync.
6) Observed files being reverted to old versions or deleted on machine A.
The end result was that Dropbox threw away the changes I had made on A and left me with the original state of B. I was able to recover the changes from a backup, so it was no big deal in the end (although it left me a bit scared I could have lost those files without noticing).
I was in contact with Dropbox support about the issue and explained in detail what I had done and what happened. I was offered help to recover the files, but since I had already done so, I just told them I didn't need any more support on the issue. I thought it might be because /proc/sys/fs/inotify/max_user_watches had a low value on one machine, so I wrote back that they might want to add back the old warning about this. However, the same problem with deleted files happened again after I had verified that this value was high enough on all machines.
I have also seen how a script run by a colleague managed to confuse Dropbox. The script was running a test which repeatedly created and deleted the same file before checking its correctness. Running the script in the Dropbox folder left him with some old version of this file and a failed test. Running the script in a folder outside Dropbox left him with the correct final version of the file. He was only working on one machine.
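A rough reconstruction of the kind of script I mean (the file name and iteration count are invented): run inside a synced folder, this rapid create/check/delete churn on a single name is what tripped up the sync client.

```python
import os

def churn(path, iterations=100):
    """Repeatedly create, verify, and delete the same file, then leave a
    final version behind, mimicking the test script described above."""
    for i in range(iterations):
        with open(path, "w") as f:
            f.write("iteration %d" % i)
        with open(path) as f:
            # Outside a synced folder, this always sees the latest write.
            assert f.read() == "iteration %d" % i
        os.remove(path)
    with open(path, "w") as f:
        f.write("final")
```

On a plain filesystem the final content is always "final"; in his Dropbox folder, an older intermediate version sometimes won.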
And yes, I know it's "bad" to run scripts like this or switch git branches on top of sync software, but it happens, and it is interesting to see how different software handles these cases.
It should be noted that Dropbox usually handles these massive file changes well, so moving to Syncthing has for me been more about it being open source and the possibility to keep files on my own machines. I was just glad to see that Syncthing also handles heavy use cases gracefully.
Last I recall it didn't, which kind of soured me on it - in an ideal world every sync operation would be prefaced with "send a backup to one of my backup endpoints first".
Backing up your filesystem can and really should be part of an entirely different process. Let Syncthing focus on syncing. Get CrashPlan, rsync, or some other process going on your filesystem for backups, unless you are hoping to back up every single version of your documents.
For big repositories with long histories (like Qt), worktrees also save some disk space.