This has been moved to my personal website.

24 Comments

  1. I hear you, chief. It makes me sick too. Too many "experts."

    Right now I'm building a file server, and wherever I go I hear NAS, ZFS, FreeNAS, BTRFS.

    I seek stability and reliability. I really love my photos taken around the world, with so many stories and "suffering" behind them.

    So far I have written a paranoid script for copying data, and I can check all data integrity with checksums.

    I'm also paranoid (perhaps it comes from the extreme sports I have been teaching for over 30 years).

    So far I have found nothing smarter, so I think I'll use standard ext4 on Debian, keep checksums created right from the beginning of each file's life, and run scripts that check the file system. If anything gets corrupted it usually starts on the flash cards in the camera, and I cannot do anything about that.

    What do you suggest as an optimal, intelligent solution for secure file storage/archiving? I won't run the file server 24/7. I just want to satisfy my paranoia and sleep better 😀

    • There is no substitute for a good backup. Store your files on two different kinds of media at two different physical locations and sync those media as often as possible. If something goes sour on the server, you’ll have backups of most of it. Since nothing can guarantee data will never bit rot, it’s best to skip things like filesystem checksums and use redundancy instead. For the paranoia you can use a tool like md5deep to make a list of (and verify) data checksums periodically, but I’d only bother doing such a thing very infrequently (maybe every half-year) because it takes a ton of time and if you’re rsync-ing for backup it’s not going to transfer a rotted file unless the source file’s change time is also different.
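
      As a concrete sketch, a manifest-and-verify routine with standard tools might look like this (the /data path and manifest filename are just placeholders; md5deep's recursive mode does the same job):

      cd /data && find . -type f -print0 | xargs -0 sha256sum > ~/manifest.sha256
      cd /data && sha256sum --quiet -c ~/manifest.sha256

      The second command prints only the files whose contents no longer match the recorded checksums, so an occasional run is enough to spot silent changes.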

  2. Cheers, finally some logical points regarding ZFS, bit rot, RAID, and backing up your data.

  3. I’ve been in threads about the maths behind RAID5 failures. If they were taken at face value, I’d be sitting on six nines likelihood of seeing a URE take an array offline for any given year – but I’ve never seen it happen. People suggest I have to be lying when I say how many RAID5 arrays I have in production without a failure. It’s absurd.

  4. I don't know much about btrfs, so I'll stick to ZFS-related comments. ZFS does not use CRC; by default it uses the fletcher4 checksum. Fletcher's checksum is designed to approach CRC properties without the computational overhead usually associated with CRC.

    Without a checksum, there is no way to tell if the data you read back is different from what you wrote down. As you said, corruption can happen for a variety of reasons, due to bugs or hardware failure anywhere in the storage stack. Just like other filesystems, ZFS will not catch every type of corruption, especially on the write-to-disk side. However, ZFS will catch bit rot and a host of other corruptions, while non-checksumming filesystems will just pass the corrupted data back to the application. Hard drives don't do it better; they have no idea if their contents have bit rotted over time, and there are many other components that can and do corrupt data. It's not as rare as you think. The longer you hold data and the more data you have, the higher the chance you will see corruption at some point.

    I want to do my best to avoid corrupting data and then giving it back to my users, so I would like to know if my data has been corrupted (not to mention I'd like it to self-heal as well, which is what ZFS will do if there is a good copy available). If you care about your data, use a checksumming filesystem, period. Ideally, a checksumming filesystem that doesn't keep the checksum next to the data. A typical checksum is less than 0.14 KB, while the block it protects is 128 KB by default. I'll take that 0.1% "waste of space" to detect corruption all day, any day. Now let's remember that ZFS can also do inline compression, which will easily save you 3-50% of storage space (depending on the data you're storing), so calling a checksum a "waste of space" is even more laughable.

    I do want to say that I wholeheartedly agree with "Nothing replaces backups", no matter what filesystem you're using. Backing up between two OpenZFS pools on machines in different physical locations is super easy using ZFS snapshotting and the send/receive functionality.
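
    As a rough sketch (pool, dataset, and host names here are made up, and the destination pool is assumed to already exist on the remote host), an initial copy plus a later incremental looks like:

    zfs snapshot tank/data@2018-01-01
    zfs send tank/data@2018-01-01 | ssh backuphost zfs receive backup/data
    zfs snapshot tank/data@2018-02-01
    zfs send -i tank/data@2018-01-01 tank/data@2018-02-01 | ssh backuphost zfs receive backup/data

    The incremental send only transfers the blocks that changed between the two snapshots, which is why it is so quick for off-site syncs.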

    [Admin edit: I got mad when senpai didn’t notice me]

    • It does not matter what algorithm is used for the CRC/checksum/hash. In all cases it is a smaller number generated from data that (if taken as one string of bits) constitutes a massively larger number, and it takes time to compute and storage to keep around. The question is this: is it worth the extra storage and the extra computation time for every single I/O operation performed on the filesystem? I say it isn't.

      Hard drives DO in fact know if something has bit rotted, assuming the rot isn’t so severe that it extends beyond the error detection capabilities of the on-disk ECC. Whenever a drive reports an “uncorrectable error” it’s actually reporting an on-disk ECC error that was severe enough that the data couldn’t be corrected. In my opinion, on-disk checksums (CRCs, hashes, whatever term is preferred) are targeting a few types of very rare hardware failures (they must mangle data despite all hardware error checking mechanisms AND must not cause any other damage that crashes the program or machine which would process or write that data out to disk) and do so at significant expense (a check must be done for every piece of data that is read from disk). Even ZFS checksums are not foolproof; for example, if data is damaged in RAM or even in a CPU register before being sent to ZFS, the damaged data will still be treated as valid by ZFS because it has no way to know anything is wrong.

      As discussed in my post, ZFS checksums are useless without a working backup of the data to pull from, preferably a ZFS-specific RAID configuration that enables real-time “self-healing” as you’ve mentioned. Without some sort of redundancy…well, what are you going to do? You know it’s damaged but you have no way to fix it.

      You seem to take particular issue with my assertion that checksums are a waste of space. Granted, they're relatively small compared to file data; however, the space issue pales in comparison to the processing time and additional I/O for storing and retrieving those checksums. If the checksums aren't beside the data then that 128K read will incur at least one 4K read to fetch the checksum which is not nearby, resulting in a disk performance hit. Enough read operations with checksum checking at once and streaming read speeds approach the speed of fully random I/O a lot faster than they would otherwise. It also takes CPU time to calculate a hash value over a 128K block; while some algorithms are faster than others, all take CPU time and large enough block sizes will repeatedly blow away CPU D-cache lines during the checksum work, reducing overall system performance. Since many ZFS users seem to pair it with FreeNAS and relatively small, weak systems like NAS enclosures, the implications of all this extra CPU hammering should be obvious. Of course, a Core i7 machine with 16GB of DDR4 RAM might do it so fast that it doesn't matter as much, but being able to buy a bigger box to minimize the impact of lower efficiency does not change the fact that such a drop exists.

      In computing, we have to choose a set of compromises since rarely does any given solution satisfy speed, precision, reliability, etc. all at the same time. In my opinion, ZFS data checksums are not worth the added cost, particularly since the problem surface area is very small, and the failures it covers are unlikely to ever happen, once the error checking coverage of hard drive ECC, RAM and on-CPU ECC where applicable, and various bus-level transceiver error detection methods is taken into account. The beauty of computing is that you are free to make a different trade-off in favor of bit rot paranoia if it makes you sleep better at night. What's right for me may not be right for you. I do not consider the very tiny risk of highly specific and unlikely corruption circumstances that can be detected to be worth covering, ESPECIALLY since the same cosmic rays that can bit-flip the data in a detectable place could just as easily flip it in an undetectable place, but I'm not in your situation and making your choices.

      tl;dr: one of us is less risk-averse, and that’s okay.

  5. Perhaps this thread is long since dead, but I wanted to give an example where "bit rot" is quite common. Plenty of laptops still have 2.5″ mechanical HDDs, and if the drive is spinning when you pick up the laptop, that is quite likely to cause a few kilobytes of sequential broken data. Switch to ZFS, activate copies=2, and the errors which the drive could notice but not fix are no longer a problem. Drive abuse to be sure, but quite common nonetheless.
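
    For reference, turning that on is a one-liner (the dataset name is made up), and it only applies to data written after the property is set:

    zfs set copies=2 tank/laptop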

    • It’s pretty hard to cause the damage you’re talking about, but the damage to the disk surface will be caught by the on-disk error correcting code if this happens. It is extremely unlikely that physical damage to the platter surface will cause data damage that can fool the ECC.

        • Ryt
        • Posted January 1, 2018 at 10:43 am

        I don't like the idea that you simply assume the hard drive hardware ALWAYS catches these errors, and I think that is the major flaw in all of your arguments. I worked in aerospace, where you are not allowed to assume any hardware is flawless at any point in time, therefore redundant checks are always needed. Imagine a plane with only one alarm system on the hardware to tell you if something is wrong… that's a jet I don't want to be on, just in case something goes wrong and isn't detected in time to stop further problems.

        The CPU overhead of ZFS is almost nothing with modern hardware that usually sits 80% idle anyway. 1% of storage for some extra security literally cost me less than $10 of hard drive space across my 7 drives, and the time savings of using "zfs send" to save backups and replicate data are far more valuable. Time for me is way more important than saving a few bucks worth of storage.

        If there were more integrity features that cost less than 10% of anything on the system, I'd enable them, which is why I have an automated script that sha256s each and every file every couple of months and checks against previous known values, to let me know if there is a problem.

        It's not rocket science: more checks are just better when it comes to the value of your data… more than one backup, too.

      • Boy, Albert was right… human stupidity is infinite.

        I consider myself extremely paranoid but not dumb. I must confess I almost fell for the ZFS/BTRFS propaganda.

        My build cost less than an average ZFS server, even with a 900VA UPS, and it is the most secure and redundant system I have ever built… still with 70% of system resources idle.

        I can write even more paranoid system checks to utilise more system resources.

        The only weaknesses are fire and hardware failures that nobody can avoid… that's why you still have at least two backups/mirrors in two different places.

        Laziness and stupidity have caused more failures than the theoretical fears this whole industry is based on.

        • admin
        • Posted January 1, 2018 at 11:26 am

        I made no such assumption. Where did I ever say that the hard drive hardware ALWAYS catches these errors? You need to re-read what I’ve written and fully understand what the point of my post was. It is extremely unlikely that your hard drive will send bad data to your motherboard, but it is always possible.

        The ultimate stinger is that no matter how clever ZFS is, once the data is in RAM (even ECC RAM) it can be damaged due to a variety of hardware issues. Anything from the right series of bit flips to a CPU defect to a cosmic ray hitting a bus trace can mangle your data with no way to detect the damage.

        At some point the law of diminishing returns kicks in too hard. For most people ZFS in the setup required for the oft-touted integrity boost is impractical, and my greater point is that ZFS can’t just be deployed and voila, integrity! RAID-Z is mandatory. ZFS without RAID-Z adds little value over traditional RAID with external backups. Advocates of ZFS do not always or even frequently point this out while telling newbies to use ZFS and this bothers me.

        If you take the time to fully know and understand what you’re doing and you ultimately choose to deploy ZFS, that’s totally fine. I’m not saying your choice is not valid or that you should not have made it. I’m simply challenging the practical value of what ZFS brings to the table for people that are slapping together a machine to store stuff. Most ZFS talk is by ZFS zealots that loudly scream its virtues; I offer counter-points that are sorely needed for a user to fully understand what they’re getting into if they go the ZFS route and why it may not be as much of a miraculous data-saving black box as it is touted to be, especially without RAID-Z.

  6. And this is basically what I've thought about ZFS ever since I started hearing about it. Now, I'm not sure I would talk about ZFS independently of RAID-Z. The two are basically always paired. I will give them that the array expandability could be nice, but I have never seen a detailed speed test, and we highly value speed.

    That brings me to what I actually wanted to comment on. We will never use RAID-5 ever again. It's not because of anything you mention, but rather, because the write speed is atrocious. After some problems, we found that our 5-drive array averaged about 5MB/s on write operations. This compared to a single drive averaging around 45MB/s. We tracked down the problem to something inherent in RAID-5. The data and parity are saved in different locations on each disk. This means that each write requires the head to write, then seek to another location on the drive, and write again. The reason random I/O is slow is all the head seeking, and RAID-5 forces this for every write. For $100 more we moved to a RAID-10 with slightly less space and 200MB/s writes. This is, of course, for mechanical drives. SSDs are not nearly as heavily affected, but are still slowed by random writes.

    Now, it does seem like to get the most out of modern drives and SMART, something would need to periodically force a read of every used bit on the drive to prevent bad bits from building up undetected. A full backup would do this, but they take forever. zpool scrub would do this. Does a full drive rsync do this as well? It’s much faster than a full backup.

    • I disagree that ZFS + RAID-Z are usually paired. The entire reason for my article is that people constantly sing the praises of ZFS without making it clear that RAID-Z (specifically, as opposed to ZFS on md/LVM RAID) is mandatory for many of the touted integrity features, specifically the magical self-healing that is such a huge draw. I feel that it is dangerous to advocate the features of ZFS without also explaining the requirements for those features to work, yet that’s what you see going on in most “what filesystem for my NAS/server?” threads: “ZFS, it magically stops bit rot and fixes damage! [But I’m not going to tell you about RAID-Z or emphasize good backups, nor about how detecting bit rot is useless without a non-broken backup copy!]”

      Your RAID-5 issue might be the same one I discovered if you’re using the md raid5 driver: very large stripe sizes cause massive write speed degradation and the default Linux md raid5 stripe cache size is too small. You’ll often see raid5 how-to guides say to use larger stripes for faster throughput but they are written by people that don’t understand that RAID-5 must be updated for an entire stripe at a time; it’s a form of write amplification just like SSDs, so even just writing one 4K sector requires reading not only a stripe width worth of sectors (minus the one being updated) from every disk excluding the parity disk but also writing one stripe width of parity in addition to the modified sector. For sequential workloads this tends to be of little consequence but for random writes it is simply a disaster. That’s why Linux caches up the stripe updates and tries to write them out more optimally, but the stripe cache is usually too small. It maxes out at 32768. Try using a 64k stripe width and setting the stripe cache size for all md raid5 arrays to 32768 after booting; you’ll probably notice a big difference in performance.
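
      For the record, the knob lives in sysfs (md0 is just an example device, and the value does not persist across reboots, so put it in a boot script):

      echo 32768 > /sys/block/md0/md/stripe_cache_size
      cat /sys/block/md0/md/stripe_cache_size

      The only cost is some extra RAM per array while the larger cache is in use.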

      RAID-10 has some issues of its own. I tried out md raid10 (far2) and found the overall performance to be quite poor relative to RAID-5. Of course, I didn’t try any sort of tuning knobs so I may not have given it a fair shake; however, I find that a well-tuned RAID-5 with a properly formatted and aligned XFSv5 filesystem performs well enough to easily handle dumping lossless compressed video data to it in real time while still serving up random small reads without issue, so it’s good enough for my situation. I can understand others choosing a different path though, and that’s what is so wonderful about the Linux ecosystem in general: everyone has options and can pick the one that suits them.

      A full-drive rsync will force reading of file data and most of the filesystem metadata but if you really want to force a full disk or array read from end-to-end, there’s an elegant and absurdly simple solution (though it’ll surely starve other tasks trying to perform I/O):

      cat /dev/md0 > /dev/null

      Or if you have the wonderful amazing glorious pv utility and want a progress indicator:

      pv -pterab /dev/md0 > /dev/null

    • A 6TB RAID-10 XFS array scrub takes about 8 hours on my file server.

      Read/write over SMB is a tragedy (most likely an Apple vs Linux vs Windows issue).

      NFS does 100+ MB/s over gigabit LAN.

      Otherwise, max speed is 250 MB/s off the RAID-1.

  7. Bit rot is a problem now; it isn't 1995, and you are just incorrect.

    Read the studies on hard drives and what the ACTUAL hard drive manufacturers say.

    The entire reason for the extra checksum and checking/correcting on every read is the sheer size of hard drives now.

    No, hard drive ECC/CRC will not save you. Statistically, for every 12TB of data read there will be a silent read error, and that is what the manufacturers say, not some ZFS zealots.
    The read error rate hasn't changed much since 1995, and hardly anyone in 1995 would have been reading 12TB.
    You can buy a single 12TB hard drive now; the problem is you cannot read all 12TB without an error.
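
    That 12TB figure presumably comes from the common consumer-drive spec of at most one unrecoverable read error per 10^14 bits read:

    10^14 bits / 8 = 1.25 x 10^13 bytes, or roughly 12.5 TB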

    Finally, basically all OSes are going down the same route that ZFS did, checking checksums of data on the fly: Linux has btrfs (or use ZFS on Linux), macOS has the new APFS, and Microsoft has ReFS.

    Read more here https://web.archive.org/web/20090228135946/http://www.sun.com/bigadmin/content/submitted/data_rot.jsp

    • You are objectively wrong and I can prove it any night of the week. I have a 12TB RAID-5 array sitting eight feet from me. If your “can’t read 12TB without an error” assertion is true for a single drive then five drives should be five times worse off, yet I’ve run a weekly data scrub on the array since I built it and there has not been a single parity mismatch. Even if the drive had a set of bit flips that happened to pass by ECC, the RAID-5 parity check would almost certainly still fail. For the parity check to pass despite the bit flips they’d have to be extremely specific and possibly span multiple disks in that specific manner.
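
      (For anyone who wants to run the same check on a Linux md array, a manual scrub is just the following, where md0 is your array device and mismatch_cnt shows how many inconsistencies the check found:

      echo check > /sys/block/md0/md/sync_action
      cat /proc/mdstat
      cat /sys/block/md0/md/mismatch_cnt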

      You also cite an article that cites studies from nearly a decade ago. Storage technology has changed a lot since 2008. The article is ultimately a marketing article, not a technical article. It’s written by a Sun “evangelist” which is a stupid name for “obnoxious marketing guy.”

      ReFS is being disabled as a new FS option in Windows 10 Pro SKUs soon, APFS is slow and has a lot of growing pains, btrfs is wonky in all sorts of ways and not trustworthy…what’s your point with all that other stuff? None of those are ZFS and none of those are seeing mass adoption.

      How do you explain my 12TB RAID-5 scrub consistently passing? Am I just super lucky and somehow blessed by God himself to the point that I never experience these data errors or is your assertion based on grossly outdated knowledge and the bit rot panic hype pushed by ZFS fanboys?

  8. You do make some very interesting points. I certainly agree that one would be foolish to use ZFS solely on the basis of a few anecdotes. What I am curious about is whether there have been large scale studies on bit rot and their results. Unless we have such data, we can’t make an informed decision about the best fs suited to our needs.

    • See, that’s the problem: there are statistics on bit rot out there but they’re accepted without question, passed around, and as with all technological statistics they become outdated. A hard drive (say 80GB) from the early 2000s might be statistically guaranteed to have data loss after 10TB of reads, but that’s irrelevant to a 3TB drive today which uses completely different magnetic storage and retrieval methods. Of course, if one were to (incorrectly) quote the 10TB figure for the 3TB drive, that means the drive can only be read three times before it is guaranteed to lose data…but while that figure may be one of many passed around during a ZFS bit rot paranoia pow-wow, it is not applicable to the modern 3TB drive for multiple reasons. One of the other reasons is that uncorrectable read errors in HDDs and bit rot are two different things: one is a set of bit flips that fails drive ECC checking while the other is a set of bit flips that either fools the ECC method used or that happen in hardware beyond the drive read hardware.

      In my personal experience I have seen many incidents of damaged data due to hardware issues such as bad capacitors or power supplies or a power failure, but I have not ever been bitten by bit rot that I am aware of (and if I have been, it clearly didn’t matter since it has not affected me.)

  9. Thanks for this post. You've got some great points here. It is indeed fair to say that the need for filesystems like ZFS is mostly mitigated by technology that has been built into hard drives for years and years now.

    It’d be great if we had some more complete and up-to-date statistics, but we don’t. While it’s definitely not reasonable to assume error rates of old hard drives apply to new drives like some of your critics have, that doesn’t mean new drives don’t have error rates. I just wonder what those error rates are. I note that you haven’t had any undetected issues that you know of with your disk arrays, and I can’t say I’ve come across any with mine (I only use ZFS on one of my arrays). I don’t want to come across any, though, which is why I make use of ZFS/ReFS in some circumstances.

    I disagree with you on two points, though. First and foremost IMO it is absolutely reasonable to assume that any ZFS deployment will involve mirroring/striping/RAIDZ. Any complaints you have regarding ZFS zealots which only apply when ZFS is used on single disks are almost redundant IMO. I’d bet money that almost nobody* uses ZFS on single disks (*relative to total ZFS users). Just go ahead and google ZFS guides, all the ones I just found assume more than one hard drive and that the reader already knows about RAID, and most of them cover why you probably shouldn’t bother with ZFS on a single disk. Nobody I know would try to use ZFS on a single disk. I think it’s reasonable to assume that at this point in time (perhaps not in the future, if somebody creates a click-to-magically-ZFS-all-the-things for Windows/Mac then it will be different), someone who is interested in deploying ZFS (who didn’t hear about it by stumbling across it on BuzzFeed or whatever) is already using multiple disks in RAID arrays for their critical data.

    Second, regarding RAID5. Whilst it is true that anybody who cares about their data should have working, regular backups, this does not negate the availability feature of RAID. Using RAID5 on large disks will mean rebuilds take ages as you acknowledged. But having a backup doesn’t make a failed rebuild OK. The downtime might be acceptable depending on the installation, but you seemed to dismiss the implied downtime outright, or perhaps I’m misinterpreting you.

    At the end of the day though we’re talking about problems that are very small and may not even matter. It puts the ZFS zealotry in perspective. And the Anti-RAID5 brigade too, though RAID5 still isn’t great.

    • “It is absolutely reasonable to assume that any ZFS deployment will involve mirroring/striping/RAIDZ” – no, it is not. It is reasonable to assume that someone who takes the time to fully understand what is required for the ZFS auto-healing magic to work will usually choose to deploy it properly. The problem is that ZFS advocates are all over the place in forums, particularly forums (I’m thinking of big tech discussion sites like Reddit, Tom’s Hardware, Ars Technica) relating to data storage, and they often say “use ZFS, it does [insert list here] that others don’t!” The caveat that ZFS detecting bit rot is useless without a way to recover the data from the rot (backups or RAID-Z) rarely comes up. It’s good that many guides will go over this, but I think you’ve made three bad assumptions: one, that other people who are finding out about or attempting to deploy ZFS are technically competent and take the time required to understand something technical before trusting it with their data; two, that people wanting to deploy ZFS will find guides that steer them into making the correct choices about how to do it; and three, that people who find guides that suggest RAID-Z will actually do it. The people in the third group are probably beyond help, but the other two may screw up through pure naivete.

      With RAID-5, I was trying to say that the rebuild time may not be a big deal to some installations like mine (I can afford to wait for a 12TB array to rebuild) and having proper backups makes the very unlikely failure of a second disk during rebuilding a moot point. If I had a 28-disk array instead of a 5-disk array, I would probably want something else with faster rebuilds. RAID-5 has a space economy advantage that no other non-RAID-0 array formats (obviously ignoring RAID-2/3/4) offer, so it has its place. When you’re dumping 3TB 7200RPM drives into an array at $90 a pop plus assembling external 3TB backup drives for the same array, it’s nice to minimize the total cost by using RAID-5 instead of RAID-6 or RAID-10. The point was that the “RAID-5 is dead” hype is not necessarily accurate.

      Absolutely agreed on the last point. Arguing over very unlikely problems is the way of the nerd! 😉

  10. Let's start off by stating that I am by no means an expert; novice would more accurately describe me. As such, I would like to hear some answers from the other side of the ZFS-fanatics debate.
    Yes, there is no replacement for backups, but some data is not important enough to be backed up (with the cost that comes with it). Instead, redundancy may be enough for some data in the home-user scenario. In that respect I'd like to have data storage that is as safe as possible, with redundancy.
    Is it not a large advantage of the ZFS RAID forms that they work at the block level, so that in the RAID-5 case a URE or other kind of error during a rebuild will kill all your data, while it will only corrupt some of your data with RAID-Z1?
    Additionally, is it not an advantage that ZFS is a complete solution? It includes snapshots, checksumming/scrubbing, compression, deduplication and RAID configuration (maybe more that I don't know of). I would imagine this is an advantage for a novice end user, who doesn't have to read up on and install/configure multiple tools and hope they will work together. It also helps general compatibility, since there are fewer dependencies than with a mix of multiple tools, so updates are less likely to break things (in a way that is probably easily fixable, but again not by me).

  11. First of all, nice article, I enjoyed reading it.
    As kop noted before me, ZFS is not just about the bit rot fuss. Once you properly shape your physical layer, which in my opinion is much less flexible than Linux software raid, it gives you flexibility on a very different level. True, you can’t reshape RAID-Z to MIRROR/RAID-Z2/3 (or vice versa), but once you have your pool(s) you can easily make logical filesystems on top of that which you can expand/shrink at will, create block devices and other filesystems on top of these block devices (like XFS/ext3/ext4/etc), change various filesystem options on the fly – like compression, record size, atime (without the need to remount)… and the list goes on and on.
    Given that ZFS is expensive on resources, it gives back some of its toll in things like cheap snapshots, which can be used with send/receive to transfer logical filesystems over the network (for example over ssh), and later to transfer only what differs between the last snapshot and the current filesystem state. That's a very neat feature in my opinion, very useful if you need to transfer a VM block device from one server to another with minimal downtime, for example.
    I agree that the hardware problems that ZFS claims to protect us from are indeed very rare and when there are proper backups that does not even matter. But the combination of physical management, volume groups and logical volumes and the interaction between these components greatly simplifies storage administration tasks. And it’s not without a reason they made the filesystem aware of what’s going on in the physical layer and vice versa.
    Of course these benefits come at a price. People often neglect the hardware requirements to properly run a filesystem as complex as ZFS. Lack of ECC RAM may not only cause it to fail to detect problems (like bit rot) but may even cause these problems, like any other filesystem by the way. And it’s not just the RAM that matters.
    Compared to BTRFS it is very mature and stable, even under Linux. I made the mistake of trying btrfs once with somewhat-production backup data, and after that I would probably not dare to touch it again for at least the next few years. That's not the case with ZFS; I already use it extensively and feel very happy about it.
    All that said I still feel great love for XFS, I think this filesystem is greatly underrated compared to some more famous choices like ext4. I use it for all kinds of needs and I think it’s just the greatest for general purpose use from these “simpler” filesystems out there.

    • Excellent points all around. I would like to point out that LVM on Linux provides a lot of the storage pool functionality that ZFS does, so people have plenty of choices. My biggest concern with ZFS fanaticism lies with the danger to newbies and the questionable need for the added complexity given the apparent rarity of the bit rot problem. I don’t have anything against ZFS itself since it does have purposes that it apparently serves quite well, I just think that fanaticism must be tempered with cold hard reality and it’s dangerous to explain bit rot and ZFS without explaining the “proper setup” required to make that actually work as expected. I think the average user is hugely more likely to have an “I don’t back up” problem than any “I lost data to bit rot!” problems and ZFS without proper hardware and configuration can make that situation even worse.
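
      To illustrate, pooling a couple of disks with LVM and carving out a volume is only a few commands (devices and names are placeholders):

      pvcreate /dev/sdb /dev/sdc
      vgcreate tank /dev/sdb /dev/sdc
      lvcreate -L 500G -n data tank
      mkfs.xfs /dev/tank/data

      Volumes can later be grown with lvextend and xfs_growfs, which covers a lot of the pool-management convenience people attribute to ZFS.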

      What little experience I’ve had with btrfs makes me prefer to avoid it like the plague.

      The only thing I don’t like about XFS is that there can still be rare hiccups that truncate extents to zero-length. It has happened to me roughly twice in the past six years, but I have snapshot-based rsync backups 😉 so I catch the problem and restore from backups.

        • Bozhin
        • Posted December 28, 2017 at 11:17 am

        I totally agree, it's nice to have the possibility to choose from such diverse software projects. LVM is a wonderful piece of software, long proven by now. I just tend to prefer ZFS over it lately.
        It's worth noting that Ceph RBDs also have some logical volume capabilities similar to thinly provisioned LVM, but they're useful in different use cases.
        I guess I've been lucky enough to have never encountered any kind of truncate problem with XFS. Or, if I have encountered problems like these, I didn't notice or I ignored them because of other, bigger problems with the same setup that already had me resorting to backups.

