XFS: the filesystem of the future?

owenmarshall · on Feb 2, 2012

It's somewhat amusing that the filesystem of the future was developed by SGI in 1994 ;)

I've had little to complain about when I've used XFS. It seems like a very good file system, and has handled the large datasets I've thrown at it without any problems.

mariuz · on Feb 2, 2012

Yes, it's funny that they invented the future of 3D graphics programming too: OpenGL in 1992 http://en.wikipedia.org/wiki/OpenGL

sespindola · on Feb 2, 2012

As mentioned in the article's comments, when I used XFS about 8 years ago, I found it exceptionally prone to data corruption.

It remains the only filesystem that I couldn't at least partially recover.

For a good mix of speed, reliability and huge data capacity on Linux, I stick to ReiserFS, at least until btrfs becomes more stable.

owenmarshall · on Feb 2, 2012

8 years ago would make you an extremely early adopter of the Linux port. XFS has seen significant development since then. I don't think experiences based on eight year old software hold much water.

XFS has adopted some nasty stigma of being a filesystem that eats your data. But it seems that for every user that complains about data loss, another does not.

Terretta · on Feb 2, 2012

> But it seems that for every user that complains about data loss, another does not.

So, a coin toss then? You're not helping!

owenmarshall · on Feb 2, 2012

No, we need a careful investigation of the filesystem.

Listening to my "works fine" comment is as useless as any other "didn't work for me" comment.

What might be helpful is a comment saying "I encountered the following issues with the following configuration, reported the bug, and the maintainers said ..." What would be even better would be for actual experts to audit the code, look through the bug reports, and give their opinions.

But "works for me"/"broke for me" comments are, unfortunately, as useless as most filesystem benchmarks. Indeed, any time filesystem discussions come up, a stunning majority of the opinions are unhelpful. Unfortunately, I jumped right in with one as well :(

andrewvc · on Feb 2, 2012

Agreed, I was using XFS aggressively on ~20 servers with wildly varying IO loads 4-5 years ago and it was rock solid.

RexRollman · on Feb 2, 2012

My experience with XFS and Arch Linux is that it is rock solid.

cbsmith · on Feb 2, 2012

The sad part is that a lot of what was perceived as data loss from XFS was actually due to bugs in apps & the kernel that XFS exposed. These same issues would show up with ext4 & btrfs, but XFS has already flushed them out so they've been fixed.

jonhohle · on Feb 2, 2012

I was using XFS on a production web/ftp server (RedHat 7.x with XFS patches) in or around 2001. The port was done in 2000. I don't think a 4 year old port of a 10 year old file system is that extreme. That said, I've been using ZFS in FreeBSD since FreeBSD 7.

moe · on Feb 2, 2012

It's a FAQ:

http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with...

In summary: XFS used to default to having write-barriers disabled (for performance), which makes it prone to data-loss during power outages. This default has changed nowadays, but you must still review your controller-/disk-cache settings to ensure stuff is really flushed to disk when XFS thinks it is.
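A quick way to see whether any mounted XFS filesystem still has barriers turned off is to scan the mount table for the `nobarrier` option. The following is only a sketch: it assumes the usual `/proc/mounts` line layout (device, mount point, type, options), and the function name and sample data are made up for illustration. It says nothing about controller or disk cache settings, which still have to be checked separately.

```python
def xfs_mounts_without_barriers(mounts_text):
    """Return mount points of XFS filesystems mounted with 'nobarrier'.

    mounts_text is expected to look like the contents of /proc/mounts:
    device, mount point, filesystem type, comma-separated options, ...
    """
    risky = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type == "xfs" and "nobarrier" in options.split(","):
            risky.append(mount_point)
    return risky


# Hypothetical sample data, not taken from a real machine.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /data xfs rw,noatime,nobarrier 0 0
/dev/sdc1 /scratch xfs rw,relatime 0 0
"""

for mount_point in xfs_mounts_without_barriers(SAMPLE_MOUNTS):
    print("write barriers disabled on", mount_point)
```

On a Linux box you would pass the text read from `/proc/mounts` instead of the sample string.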

simcop2387 · on Feb 2, 2012

I've actually been moving off of ReiserFS myself. Mostly for some of the known bugs and unfixable problems with it relating to storing another ReiserFS image on it and then trying to recover data from a crash. I've run into that more than once and that made me move to something that was a little more sane during recovery (ext3 at the time, now ext4).

gnaffle · on Feb 2, 2012

I've never had a problem with XFS, but I had several silent data corruption issues with ReiserFS after power losses, where I would find the contents of one open file intermixed in another open file. I think such problems have been fixed now, but I ended up losing trust in ReiserFS at that point.

Wilya · on Feb 2, 2012

5 or 6 years ago, it took a single power failure to basically screw up my filesystem. From the comments on the article, it seems it's not really a solved issue (or rather, people don't consider that an issue).

I'd think again before using XFS on my desktop.

(I've had power failures on quite a few other occasions, with all sorts of other filesystems. I did lose the occasional file, but at least I could still boot those systems.)

forcefsck · on Feb 2, 2012

Same experience here. And after xfs_repair, there were still files and directories remaining which couldn't be deleted or moved. The only solution was to move the parent folder or reformat. Though performance-wise, it was really fast.

asabil · on Feb 2, 2012

I had the same experience with XFS: I experienced repeated data loss on power failure about 4 years ago. These days I just stick with ext4.

mathnode · on Feb 2, 2012

SGI was munging Terabytes of data before Windows had the "hi-color icons" tickbox.

yason · on Feb 2, 2012

A file system for generic purpose has only one primary feature: "Don't lose data. Ever." If I pop the SATA cable off my drive while writing, the filesystem should later remount in 100% uncorrupted condition with whatever data it had time to write to the disk. I don't want to run fsck or use debugfs to recover from a hairy state. Backups or RAID-1 take care of physical failures of the disk.

Beyond avoiding data loss, anything else is ultimately secondary. Speed is nice but near-average filesystem performance is all right for a HDD. SSD gives you more speed. Totally abysmal speed might be a reason to switch filesystems—though, I'd still take the abysmal speed if it saved me my data and the faster filesystem wouldn't.

Due to this conservative approach, switching filesystems is really hard. I've been using ReiserFS since 2000 or so because it hasn't failed me once. I've had HDDs slowly going bad and lost some individual files until I cloned the old disk to a new one. But I've never had to fsck, defrag, recover, or do anything else to my ReiserFS partitions. Never. It's getting harder and harder to switch. A conservative alternative would be ext3 but ext3 has lost me data on another computer.

I have some interest in btrfs but I probably won't switch until I have to. XFS would be very interesting but just not worth the risk because ReiserFS is good enough.

cbsmith · on Feb 2, 2012

It must seem very strange to you that most filesystems don't do data journaling and that they have options which increase the risk of data integrity problems.

gcb · on Feb 2, 2012

you are lucky because if you ever need to use data rescue, well, there's nothing that works well with reiserfs.

that being said, i'm a reiserFS fan too. been using it for almost a decade. two months ago, installed debian sid on my box, and ext4 ate my data after the third boot! kid you not. tried to find the cause to submit a report, but ran out of time, just reinstalled with ext3 for now.

kijin · on Feb 2, 2012

> For I/O-heavy workloads with a lot of metadata changes - unpacking a tarball was given as an example - Dave said that ext4 could be 20-50 times faster than XFS.

A while ago, I read somewhere that XFS is good for a small number of large files, whereas ReiserFS is good for a large number of small files. I don't know whether that's true anymore if it ever was; but perhaps extracting a tarball with thousands of small files in it is not the best way to bring out XFS's strength?

By the way, when are we expecting the "experimental" label to be taken off of btrfs?

Tuna-Fish · on Feb 2, 2012

> A while ago, I read somewhere that XFS is good for a small number of large files, whereas ReiserFS is good for a large number of small files. I don't know whether that's true anymore if it ever was

It used to be true. XFS has for a long time been the very fastest real filesystem available when it comes to moving data, and one of the slowest ones in real-world use when it comes to changing metadata. Untarring a kernel tarball a second time (on top of the first) is pretty much the canonical benchmark for metadata operations, and the one where XFS used to be at its absolute worst.

XFS is in the news right now exactly because they have fixed that deficiency. It's still faster than ext*, zfs and btrfs in pretty much everything else, but now it's comparable to them when modifying metadata.

So this is exactly the right benchmark to use, because it's the one where there has been recent change.

Video of Dave Chinner presenting this at linux.conf.au: https://www.youtube.com/watch?v=FegjLbCnoBw

noahdesu · on Feb 2, 2012

> By the way, when are we expecting the "experimental" label to be taken off of btrfs?

I believe distributions are waiting on an fsck for btrfs.

waitwhat · on Feb 2, 2012

"It looks like this will finally be released by the middle of February." -- http://www.phoronix.com/vr.php?view=MTA0Njk

sneak · on Feb 2, 2012

Hahaha. That's pretty much /thread, innit?

eru · on Feb 2, 2012

What are you trying to say?

yusufg · on Feb 2, 2012

Is XFS available in stock RHEL 6.2 ? or does it require an additional purchase of the Scalable File System Add-On

http://www.redhat.com/products/enterprise-linux-add-ons/file...

Red Hat doesn't list prices of its add-ons, and it's generally onerous if one has to call sales to find the price of a filesystem.

Reminds me of Veritas VxFS

antoncohen · on Feb 2, 2012

I think it's in the base RHEL 6, try 'yum list xfsprogs', if it's there you have XFS.

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6...

waffle_ss · on Feb 2, 2012

I've had good experiences with XFS on my home file server. The only things to keep in mind are that you cannot shrink the filesystem once it's created, and that fsck doesn't run when booted (you can write an init script to run xfs_repair if you like).

Also, my fragmentation often gets quite high - over 90% - but it doesn't seem to really affect performance.
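A possible reason a figure like 90% can look alarming without hurting much: assuming the reported fragmentation factor is computed as (actual extents - ideal extents) / actual extents, which is my understanding of the `xfs_db` frag report and should be treated as an assumption here, the percentage climbs very quickly even for a modest number of extents per file. A small sketch with made-up numbers:

```python
def fragmentation_factor(actual_extents, ideal_extents):
    """Percentage, assuming the formula (actual - ideal) / actual."""
    if actual_extents <= 0:
        return 0.0
    return 100.0 * (actual_extents - ideal_extents) / actual_extents


def extents_per_ideal_extent(actual_extents, ideal_extents):
    """How many extents exist for each extent an ideal layout would need."""
    return actual_extents / ideal_extents


# Hypothetical counts: 100 ideal extents that ended up as 1000 actual extents.
print(fragmentation_factor(1000, 100))      # percentage under the assumed formula
print(extents_per_ideal_extent(1000, 100))  # average split per ideal extent
```

Under that assumption, a 90% factor corresponds to roughly ten extents where one would have been ideal, which for multi-gigabyte files is still a small number of seeks.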

RexRollman · on Feb 2, 2012

XFS filesystems can be defragmented. The tool is called xfs_fsr

http://linux.die.net/man/8/xfs_fsr

waffle_ss · on Feb 2, 2012

Yes, I have a cron job that does that.

RexRollman · on Feb 2, 2012

And yet you are getting 90% fragmentation? I wonder why that is happening to you.

colanderman · on Feb 2, 2012

I want to throw in another XFS data-eating anecdote. When I used it, I found it had a tendency to zero out files which were open for writing during a system crash.

I've been using JFS now for 6 years or so and have never had files go missing on me. JFS routinely takes second place to XFS in various benchmarks but still far outperforms ext3 and the like.

gcp · on Feb 2, 2012

> When I used it, I found it had a tendency to zero out files which were open for writing during a system crash.

This was pretty much a FAQ and very much by design. See for example:

http://madduck.net/blog/2006.08.11:xfs-zeroes/

colanderman · on Feb 2, 2012

Interesting. I don't buy the security argument though. If, after a crash, the file size is larger than the actual extents, why not just truncate the file size to match the extents?

gcp · on Feb 3, 2012

Because there are no guarantees the extents actually point to the file data.
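On the application side, the usual way to avoid ending up with a zeroed or half-written file after a crash is to write a temporary file, fsync it, and only then rename it over the original. This is a general POSIX pattern rather than anything specific to XFS, and the sketch below is just one way to do it; `atomic_write` is a name made up for this example, and the directory fsync assumes a POSIX system.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at path with data (bytes) via temp file + fsync + rename.

    After a crash the path should hold either the old or the new contents,
    as far as the filesystem and hardware honor fsync.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the file data to stable storage
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)     # do not leave a stray temp file behind
        raise
    # Persist the rename itself by syncing the containing directory.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An editor saving a paper would call something like `atomic_write("paper.txt", contents)` instead of truncating and rewriting the file in place.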

Zak · on Feb 2, 2012

I had this happen once too. As I recall, the machine lost power after a cached write and before the data was actually written to disk. The file was zeroed and all its former contents lost. My friend, who had borrowed the computer to write a paper and lost hours of work, was very upset.

I switched all my XFS filesystems to something else that day.

RexRollman · on Feb 2, 2012

I always thought that was interesting because on my computer, JFS feels faster than XFS does (especially when using Pacman, on Arch Linux, which creates lots of small files).

Personally, I have never lost data with XFS or JFS.

calloc · on Feb 2, 2012

How does XFS stack up with ZFS when it comes to speed, reliability and protecting my data?

jamesfmilne · on Feb 3, 2012

At our company, we have been shipping XFS as the filesystem for our products for over 8 years. We've shipped many petabytes of storage in that time, all of which is being heavily hammered every day in feature film & TV post-production.

We've found XFS to be a very robust filesystem, and any problems have generally been traceable to modifications we've made to tools like xfs_fsr.

In general it's been considerably more frequent for people to end up in a panic after making a mistake replacing a disk in a 3ware array and deleting a RAID unit! Luckily our resident RAID ninja has been able to get the majority of people out of that hole. :-)

tytso · on Feb 2, 2012

All open source developers want users. Users mean more bug reports and often more developers to contribute to the project. In addition, the greater relevance also means it's more likely they will be able to get companies to pay for them to do what they do, conferences to invite them to talk about their projects, etc. (It's for this reason that, when the IMHO over-aggressive enforcement of the busybox license, which forces people who get sued to release code unrelated to busybox, caused FUD amongst Linux embedded companies, the emergence of a replacement toybox project which avoids these issues caused so much dismay amongst the busybox development community.)

Given that Dave Chinner is one of the primary developers of XFS, it's not surprising that he wants to promote XFS. And to be fair, XFS has not gotten as much attention as it has perhaps deserved based on technical considerations alone (as have other perfectly capable filesystems in their time, such as JFS) and that's no doubt frustrated him. In addition, the work he has done to improve XFS removes one of the significant performance bottlenecks often seen by desktop users and developers, and he should be saluted for that.

That being said, it's also true that in many cases the file system is not the bottleneck, and so other issues that aren't tightly focused on performance can matter more: the quality of the userspace tools (such as e2fsck and debugfs for ext4, and their equivalent or lack thereof in other file systems), familiarity by sysadmins, data recovery services, ease of upgrading existing large production file systems, and so on.

In addition, it's dangerous to draw conclusions from a single microbenchmark such as fs_mark alone. It's not common that you have workloads which create thousands and thousands of small (< 64k) files in parallel across lots of CPU cores at the same time, on the same file system. So using this benchmark alone to say that file system X is more scalable than file system Y is just not going to tell the whole story. Personally, I like to use microbenchmarks as a tool for improving a file system, and not as an argument to try to get people to switch from one file system to another. Unless someone's use case is exactly mirrored by the microbenchmark, I personally find this approach to be a little dishonest.
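To make the shape of that workload concrete, here is a rough sketch of a "many small files created in parallel" test. It is not fs_mark and does not reproduce its options or methodology; the function names, worker counts and file sizes are all illustrative assumptions, and Python threads will not stress a filesystem the way a native multi-core benchmark does.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def create_small_files(directory, worker_id, count, size):
    """Create `count` files of `size` bytes each; return how many were made."""
    payload = b"x" * size
    for i in range(count):
        path = os.path.join(directory, f"w{worker_id}_{i}.dat")
        with open(path, "wb") as f:
            f.write(payload)
    return count


def run_small_file_benchmark(directory, workers=4, files_per_worker=1000, size=4096):
    """Create small files from several threads; return (files created, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(create_small_files, directory, w, files_per_worker, size)
            for w in range(workers)
        ]
        created = sum(f.result() for f in futures)
    elapsed = time.perf_counter() - start
    return created, elapsed


if __name__ == "__main__":
    # Runs in a throwaway directory; point it at the filesystem under test instead.
    with tempfile.TemporaryDirectory() as scratch:
        created, elapsed = run_small_file_benchmark(scratch, workers=2, files_per_worker=200)
        print(f"created {created} files in {elapsed:.3f}s")
```

Whatever number this prints only describes this one access pattern, which is exactly the limitation being pointed out: it tells you little about a workload that looks different.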

I will say that at the moment, many of the developers who have been working on ext4 are employed by companies who are using ext4 as part of a cloud data storage stack. This is why there have been changes such as no journal mode (which is great when you have consistency guarantees being provided by a cluster file system above you, since they have to provide the file even if an entire server's power supply has exploded), and good performance when under severe memory pressure (funny how most benchmarks are done when the only thing running on the server is the benchmark, so there is no competition for the CPU and for system memory --- XFS in particular have proven to be a memory hog, and others have noted severe performance degredations and in some cases stability problems, under memory pressure; not a problem on a stand-alone file server, but not so great if you are also trying to run VMs or other applications using the file system on the same machine). Arguably some of these improvements don't mean as much for desktop users, although I believe some of the performance enhancements we've made have also trickled down to help the desktop.

XFS, in contrast, has been focusing a lot of attention on the desktop use case, and they've traditionally owned the big stream writes, HPC workloads using huge servers, huge memory, and huge RAID arrays. It's good that XFS has made these improvements, and I salute them for it. But to state that these two workloads are the only ones which are important, and therefore they are the file system of the future, may be overstating matters --- with all due respect to Dave and his many years of experience working on XFS.

tytso · on Feb 2, 2012

I can no longer edit the above article, and I just realized that I made a very critical typo. When I wrote: "XFS in particular have proven to be a memory hog, and others have noted severe performance degredations and in some cases stability problems, under memory pressure", I had intended to say that "ZFS in particular..."

My apologies to the XFS developers and fans of XFS. I currently have no information about how XFS behaves under extreme memory pressure. I don't find competitive benchmarking to be all that interesting, so I've never gotten around to doing that experiment. My information about ZFS requiring lots of memory for good performance/stability was taken from an Open Solaris mailing list, and not from personal experience, just to be clear.

zanny · on Feb 2, 2012

Reading the post, a lot of what he says is that XFS scales better than ext4 due to better algorithmic implementations of various "things".

Why not just get active with ext4 (more likely 5) development and work to introduce his performance improvements into the main line file system already in use?

tytso · on Feb 2, 2012

His performance improvements are very specific to XFS; arguably they were fixing a problem/fundamental design issue in the original journaling code in XFS. Ext4's journaling code is very, very different from XFS. So a design improvement that applies to XFS is not necessarily applicable to ext4. That being said, there have been times when we've looked at what XFS has done with some feature (such as delayed allocation writeback for example) and written code which has been inspired by XFS's algorithms. I'm an engineer, I'll take good ideas from wherever I can. (Unless I suspect there may be legal reasons why I had better stay away from certain techniques :-/)

Also, the scalability issues that Dave has been talking about aren't ones that matter for any of the workloads that the ext4 developers or their employers care about. We don't have 32 CPU cores all writing small files and requiring small block allocations to a single file system at the same time. So yes, there are scalability problems which he has identified with that specific benchmark that Dave was experimenting with (fs_mark) where ext4 has its own performance problems that could be fixed with the appropriate developer attention. It's on my list to look at and hopefully address, but I have higher priority things that do benefit the workloads that my company is interested in.

For example, I am currently working on making Async I/O truly Async even when we need to read metadata blocks, which is a bug that all file systems suffer from under Linux; AIO is not truly "A". This is something that has been known for over a decade, but up until now, no one who was funding fs development, for ext4 or XFS, has had a workload where this has mattered enough. When I do fix this, it will be an area where ext4 will have an advantage over XFS. Will I then trumpet this as the reason why they should switch from XFS to ext4? Of course not. Not everyone has a need for true Async I/O. On the other hand, if ext4 has such a feature, it may appeal to some application writers, probably of various different storage servers or perhaps web servers, and if they start using it, then there may be more workloads where true AIO is relevant. And that in turn might inspire XFS developers to add a similar feature. This is why I've always believed competition is a good thing, and why I've never argued that fs developers should abandon one file system and go work on another file system. (As Dave has done on the linux-ext4 list, but never mind that.)

mburns · on Feb 2, 2012

There will not be an ext5. ext4 was only created as a stop-gap. Btrfs is the future 'blessed' filesystem.

aidenn0 · on Feb 2, 2012

I'll have to give XFS a look again. We use subversion at my work and an "svn up" took 4-8x longer on XFS than on reiser3.6; it was insanely faster for just about everything else though.

It definitely is not great for sudden power-failure, but really if you're not using a laptop, buy a UPS, it's not that expensive.