SnapRAID Split Parity Sync Script

SnapRAID v11 adds a sweet new feature: split parity. In the past, adding a larger data disk always came with the issue of needing parity disks as large as or larger than your data disks. For example, let's say you have an array made up of (4) 4TB data disks and (1) 4TB parity disk. What if you want to buy one of those 6 or 8TB disks to use in your array? In the past, you had to either use the new larger disk as your parity disk, or risk having part of your new disk unprotected. With split parity, you can use the new 8TB disk as a data disk and then use (2) of your old 4TB disks joined together as one complete set of parity (or you could create parity in this scenario with (4) 2TB disks or even (8) 1TB disks). Pretty neat!

Going forward, this allows you to add 6 or 8TB data disks and keep all your data protected, without having to buy an extra one or two larger disks just for parity. Now that we've discussed split parity, how can we automate syncing like we did with my previous script? We can't use that script as-is because of the split parity files. I already had a modified version of my script, but when mtompkins presented his cleaned-up version, I thought I'd extend it for split parity and add a couple of extra functions. So here is the new split parity script (this version is set up for dual parity, with four disks used to complete the split parity).

As a sidenote, I would love it if someone could provide a BASH method to read the snapraid.conf file and automatically build the array, rather than having to manually set that up in the config. I fear that with split parity, complex grepping may be over many users' heads.

Here's how I have the parity files set up for this example in my /etc/snapraid.conf file.
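Each parity level lists its split files comma-separated on one line. A sketch with illustrative mount points (yours will differ):

# Two 4TB disks complete each level of dual parity
parity /mnt/parity1/snapraid.parity,/mnt/parity2/snapraid.parity
2-parity /mnt/parity3/snapraid.2-parity,/mnt/parity4/snapraid.2-parity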


63 Responses

  1. woodensoul2k says:

    Hey Zack, is it still possible to use this script for just a standard dual parity setup? I plan on eventually using split parity in the future, but for now I was wondering if I could use this new script after upgrading to SnapRAID v11. I'm using your older script at the moment on SnapRAID v10.

    • Zack says:

      Great question. You can't just drop this in and have it work, but with a very small modification it should work fine. You would need to remove these lines:
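      (A sketch of what those lines look like in the split parity version of the script; the paths are examples and should match your snapraid.conf.)

      PARITY_FILES[0]=/mnt/parity1/snapraid.parity
      PARITY_FILES[1]=/mnt/parity2/snapraid.parity
      PARITY_FILES[2]=/mnt/parity3/snapraid.2-parity
      PARITY_FILES[3]=/mnt/parity4/snapraid.2-parity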

      and replace them with something like this instead.
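      (Single parity sketch; adjust the path to your own parity file.)

      PARITY_FILES[0]=/mnt/parity1/snapraid.parity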

      or… if you have dual parity…
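      (Dual parity sketch; one entry per parity level.)

      PARITY_FILES[0]=/mnt/parity1/snapraid.parity
      PARITY_FILES[1]=/mnt/parity2/snapraid.2-parity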

      You just need to adjust the names of the parity files that you have set up in your snapraid.conf file, and that should be it 🙂

  2. woodensoul2k says:

    Thanks for the easy explanation. Can’t wait to give it a go.

  3. Dulanic says:

    Is there any reason it unpauses the containers twice?

    • Zack says:

      It only unpauses once, unless you have an error in your script that keeps it from exiting gracefully. Here's the output of my run last night.

      • Dulanic says:

        FYI, it happened again. I don't know what the heck is causing this, because I also get a "normal" email that shows it worked properly. The cron process seems to think it failed, and I guess it did, at least partially, since it hit the second restart and the containers fail.

        The first email fires off successfully, and then the second fires off immediately after, when it fails trying to restart again. I think it doesn't send me the failure email when I run it manually, since the cron failure email only fires when it's run automatically.

        • Dulanic says:

          OK, sorry to spam your comments, but I did find that somehow the trap keeps being triggered, though I have no idea how. I edited it to show "trap triggered" by replacing the function, and it triggered at the end again. So it looked like there is an issue with the clean_desc function on my system, but I can't tell why. So I tried replacing the trap with…

          trap 'err_report $LINENO' ERR

          but now it isn't triggering the trap. I did try changing the trap to a new function I made that echoes "trap triggered", and it always fired at the end with clean_desc. Since that worked, for now I'll keep it this way and see if it eventually triggers, so I can see which line it errors on.
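          (For reference, the err_report used in the trap above would be a small helper along these lines; this is an assumed shape, not the script's actual function:)

          err_report() {
              echo "Error on line $1"
          }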

  4. Dulanic says:

    So odd. I did a Notepad++ text compare and the only two changes were the services and the thresholds, yet I re-pasted it and it worked OK this time. Maybe a line break or something got messed up; no idea, because even Notepad++ couldn't tell a difference. That was with a forced run, so we'll see overnight, I guess.

    • Zack says:

      Thanks for the info. Keep me posted.

      • EmanuelW says:

        Hi Zack and Dulanic,

        First of all, thank you Zack for sharing this script!

        To you both: Did you gain any more insight into this issue? In my testing it seems that the trap gets executed on a clean exit as well (I did not find any errors or weird exit values from commands, at least). Separating service_restore and clean_desc seems to fix it for me; see the patch below. It relies on the trap being executed on the intended exit call in main. This was tested on Ubuntu 18.04, bash version 4.4.20(1)-release. What are your thoughts on this: should the trap not be executed on the exit call in main (row 285 above)?

        Also worth noting: during my debugging I ran shellcheck and corrected some of the warnings, but it did not make a difference in this case. The patch below might look a bit weird because of that, so please tell me if I should post one that is directly applicable to the script above.
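        (A minimal sketch of the separation described, using the function names from this thread; the rest is illustrative and the actual patch differed in its details:)

        # Register only the final cleanup on EXIT so it cannot run twice
        trap clean_desc EXIT

        main() {
            # ... diff/sync work happens here ...
            service_restore   # restore services explicitly on the normal path
            exit 0            # the EXIT trap then runs clean_desc exactly once
        }

        main "$@"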

        • Zack says:

          This looks like a good approach. I just patched a local test version of the script. I'm also on Ubuntu 18.04, with bash version 4.4.20(1)-release. The trap order seems to work fine and logically makes sense. Thank you for this! I will continue to test, but this seems to be a good fix. I'd love to hear back from others with newer BASH versions to see if this remedies their issues.

  5. Dulanic says:

    So I have been using your script and mergerfs setup for a month now, and twice I have run into an issue where it says something along the lines of…

    I have a feeling it is likely due to a file that mergerfs assigns to d4 that is later moved or deleted between syncs. This is frustrating, and the force-empty sync takes forever. Is there any way around this?

    • Dulanic says:

      FYI what is the right way to post code? lol I keep trying things but can’t figure it out.

    • Zack says:

      If you are running into needing to use a --force-empty sync, you really need to check the files in question. In this case, it means that all of the files that were on /mnt/data/disk4 have been removed or deleted. If you haven't intentionally removed all content from that disk, something has gone wrong: for example, a disk that didn't mount, or a script that deleted all the files, etc.

      To put this in perspective, I have never had to run a --force-empty sync unless I had manually removed all files on a disk. I would do some more investigation.

      • Dulanic says:

        It looks like it was because of torrents, soon after a disk got full. It would sync, and then later on the files would be removed, and boom, that error showed up. What I did was add a DONOTDELETE.txt file to all of the drives to avoid the issue.

        • Zack says:

          I would suggest handling all of your torrent downloading outside of SnapRAID. Or just make a torrents folder for in-progress files, and then move the final, completed files to a location that SnapRAID is including. That way you don't run the risk of a failed restore because of missing files that were there during your last sync.

  6. oxzhor says:

    Hi Zack,

    I'm trying to use your script on CentOS 7 and have checked that all the paths are the same as on Ubuntu.
    But when I run the script I get the following error:

    [root@media scripts]# sh snapraid_diff_n_sync.sh
    snapraid_diff_n_sync.sh: line 121: syntax error near unexpected token `>'
    snapraid_diff_n_sync.sh: line 121: `exec > >(tee -ia "${TMP_OUTPUT}" ) 2>&1'

    I only modified the script so I can use dual parity; the rest is untouched.

    • Zack says:

      Did you try the commented-out method above to see if that works instead (obviously commenting out the current exec line)?

  7. chad says:

    Zack –

    Thanks for your work on this; it's been working very well for many months now! I have two related issues I need to ask you about. The background is that I've recently begun using Zoneminder for video surveillance. I have a single camera and keep recorded video for about 15 days. The way Zoneminder works, it saves tens of thousands, sometimes hundreds of thousands, of files per day. Then, after 15 days, those files are deleted in order to save space. This happens daily with just one camera, and it would grow significantly as cameras are added.

    You probably know where I'm going with this: your script sees these as massive changes to the system and tries to run the SnapRAID operation, but I'm receiving the following error:


    remove zoneminder/events/2/17/07/10/.1119
    remove zoneminder/events/2/17/07/10/.1120
    remove zoneminder/events/2/17/07/10/.1121
    remove zoneminder/events/2/17/07/10/.1122
    remove zoneminder/events/2/17/07/10/.1123

    613286 equal
    63606 added
    15725 removed
    1 updated
    0 moved
    0 copied
    0 restored
    There are differences!

    DIFF finished [Tue Jul 25 23:30:46 PDT 2017]
    **ERROR** – failed to get one or more count values. Unable to proceed.
    Exiting script. [Tue Jul 25 23:30:46 PDT 2017]

    As you can see, 63k+ files were added and 15k+ files deleted, but I'm not sure why the script is erroring out. Can you help with this? I've set my delete threshold to 10,000, but it's still too low.

    The second question: is there any way to exclude a folder from the delete threshold count? That way I could exclude my Zoneminder folder and avoid triggering the threshold nightly. Also, since I have to increase the threshold, this opens the opportunity for the script to continue even if many files have been deleted in error.

    Again thanks for your great work!

    Chad

    • Zack says:

      Thank you for the kind words! The only way to figure out why it's erroring is to re-create what the script does by hand (run a snapraid diff, and then run the greps that the script runs). I would think it's due to the massive number of adds/removes throwing my greps off.

      As a sidenote, I would strongly suggest you manage Zoneminder outside of SnapRAID, either by moving it to a different setup (maybe a separate ZFS mirror array) or by excluding that path in SnapRAID. Rapidly changing files are not what SnapRAID is designed for at all, and a rapidly changing filesystem can make recovery suffer or fail in the event of a disk failure.

      • chad says:

        Agreed. As I think about the use case of SnapRAID, it makes sense that it's not designed for what I'm trying to do. Therefore I'm trying to exclude the folder, but it's not working, and I think I'm not understanding the exclude option properly. In snapraid.conf I'm simply trying 'exclude /mnt/storage/zoneminder/', but it's not working; neither is 'exclude /mnt/storage/zoneminder/*'. Reading through the docs, it seems as though it doesn't exclude recursively through all the subfolders (and Zoneminder has A LOT of them: one for each day AND timestamp). I saw a post where a user was able to exclude using 'exclude /rootfolder/subfolder/*/*.jpg' and 'exclude /rootfolder/subfolder/*/*/*.jpg' and so on, but that seems tedious.

        Any idea on how to handle this situation? My /mnt/storage is my main array and I want to keep my video storage on that.

        • Zack says:

          Good question. The exclude is actually really easy: it is a path relative to the array root. So, if your folder is at /mnt/storage/zoneminder, you would exclude that whole folder recursively by adding just this one line to your snapraid.conf file.
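          (The leading slash anchors it to each data disk's root, and the trailing slash marks it as a directory:)

          exclude /zoneminder/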

  8. chad says:

    That did it, thanks!

  9. blotsome says:

    Hey Zack, great work all around! I was hoping to get some advice on how to modify this script. Does the output get written locally to a log file somewhere, or is it only emailed? If it is logged locally, where? If not, how would I set the output to be recorded locally? Also, I don't want daily emails every time it runs successfully; I only want an email if it fails. The prepare_mail function has a number of if statements that change the subject based on warning conditions. I'd like to set something up so it only runs send_mail if the subject contains a warning, something along those lines. Any advice?

    • Zack says:

      Hello! Thanks for the kind words. The script does write to a local file; you can see that in the INIT variables (/tmp/snapRAID.out). If you don't want to receive emails unless it fails, you will need to wrap the send_mail call in an if statement (I'm writing this from my phone, so this is untested).
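      (A sketch along those lines; SUBJECT is assumed to be the variable that prepare_mail sets:)

      if [[ "${SUBJECT}" == *WARNING* ]]; then
          send_mail
      fi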

      Honestly, I like the nightly emails, though. They ensure that the script ran correctly and didn't silently fail. That way I always KNOW my data is safe. I hope that helps.

  10. kocane says:

    Thanks for the script.

    Does this script spin down disks? If so, how do I disable this?

    • Zack says:

      Yes, it does. You can see it on this line.
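      (A hedged guess at the line in question, since the script spins the array down via SnapRAID's down command; the exact form in your copy may differ:)

      snapraid down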

      Just comment it out…

  11. clayboyx says:

    I'm having an issue with line 127 where services need to be stopped. It says 'missing'; not sure what to do. Everything else works. I'm using triple parity, so I disabled the make-services array.

    # Stop any services that may inhibit optimum execution
    if [ $MANAGE_SERVICES -eq 1 ]; then
        echo "###Stop Services [`date`]"
        stop_services
    fi

    • Zack says:

      That line is for managing Docker containers. Are you using Docker containers, and if so, would you like to stop them? If so, you need to add the names of the services you’d like to stop to this line.
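      (Hedged example; the variable name and container names here are illustrative and should match your copy of the script:)

      SERVICES='plex sonarr radarr'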

      If you are not using Docker containers, just change this line from a 1 to a 0.
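      That is:

      MANAGE_SERVICES=0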

      As a general rule, it's a good idea to read through the commented lines in any script you are using, to try to understand roughly how it works. All of the configuration options are at the top, and I have tried to provide comments for every option.

      I hope that helps,

  12. clayboyx says:

    I got the script working now, but when I do crontab -e and add the script, the script runs but doesn't send any emails. Can you assist? Mutt is set up correctly. Emails do get sent if I run the script from a terminal.

    # Run a SnapRAID diff and then sync
    30 23 * * * /root/Backup/snapraid.sh

  13. clayboyx says:

    Yes, the script is chmod +x. I set up a cron job through Webmin as a test to have it run immediately; the script runs, but no email is sent.

    • Zack says:

      As I said in my email, the script appears to be working fine. It looks like you need to properly configure ssmtp to work with Gmail and Mutt. Once that is done, the email should work fine 🙂

  14. nerdfury says:

    Hey, not sure if this helps, but this might be a solution for supporting different parity setups:

    https://gist.github.com/nerdfury/7b5de21e8f8c54616feca73638f97fe1#file-snapraid-sh-L106

    It should work with the parity, 2-parity, and z-parity options.

  15. kiwijunglist says:

    Hi, thanks for this. Can you please explain how I would adjust this for my single parity setup?

    content /var/snapraid.content
    content /mnt/disk-3tb1/snapraid.content
    content /mnt/disk-3tb2/snapraid.content
    content /mnt/disk-3tb3/snapraid.content
    content /mnt/disk-3tb4/snapraid.content
    content /mnt/disk-4tb1/snapraid.content
    content /mnt/disk-6tb1/snapraid.content
    content /mnt/disk-8tb1/snapraid.content
    content /mnt/disk-8tb2/snapraid.content
    content /mnt/disk-8tb3/snapraid.content

    data d1 /mnt/disk-3tb1/
    data d2 /mnt/disk-3tb2/
    data d3 /mnt/disk-3tb3/
    data d4 /mnt/disk-3tb4/
    data d5 /mnt/disk-4tb1/
    data d6 /mnt/disk-6tb1/
    data d7 /mnt/disk-8tb1/
    data d8 /mnt/disk-8tb2/
    data d9 /mnt/disk-8tb3/

    parity /mnt/parity/snapraid.parity

  16. Sejrup says:

    Any issues using this script with Debian? I have had it working with Ubuntu, but no luck so far in Debian.

    I get the following errors when executing the script:
    /root/scripts/snapraid_diff_n_sync.sh: line 308: unexpected EOF while looking for matching `)'
    /root/scripts/snapraid_diff_n_sync.sh: line 457: syntax error: unexpected end of file

    I have adjusted it to dual parity as per your instructions in a previous post, but I don't think that has anything to do with it?

    Line 308: UPDATE_COUNT=$(grep -w '^ \{1,\}[0-9]* updated

    • sburke says:

      Add a ' after each of the following => removed, added, moved, updated, and copied. Move the commands onto a single line. You should get the following:
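      (Reconstructed example for the updated line; the grep pattern comes from the error above, while the pipeline after it is an assumption about the rest of the script's line:)

      UPDATE_COUNT=$(grep -w '^ \{1,\}[0-9]* updated' "${TMP_OUTPUT}" | sed 's/^ *//g' | cut -d ' ' -f1)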

  17. sburke says:

    Hey,
    First off, great script Zack, thank you for publishing it. It’s come in very handy.
    Secondly, I was using this version of the script on Debian 9 and it worked without issue; it needed some minor formatting, but nothing major. I upgraded to Debian 10 and the wait commands got stuck waiting forever. I know the simple solution was probably to use the old version of the script, but I overlooked this until just now.

    So, if anybody out there is running Debian 10 or a newer version of bash (not quite sure at what version this kicks in, but I'm using > 5.0 on Debian 10, so at least that version or greater) and is running into the same issue, I'd advise you to look here as to why:
    https://unix.stackexchange.com/questions/530457/script-hanging-when-using-tee-and-wait-why

    Save yourself the headache and use the old script.
    https://zackreed.me/updated-snapraid-sync-script/

    • Zack says:

      I didn't realize that. It would be nice to actually fix this script so that it works. It seems like mosvy's fix would look something like this…
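      (A sketch of the idea from the linked answer, adapted to this script; under newer bash, wait also waits on the tee process substitution, so the pipe feeding tee has to be closed first. The descriptor numbers are arbitrary choices:)

      # Near the top of the script: save the real stdout/stderr, then redirect
      exec 3>&1 4>&2
      exec > >(tee -ia "${TMP_OUTPUT}") 2>&1

      # Before any wait: point stdout/stderr back at the saved descriptors,
      # which closes the pipe to tee and lets it exit
      exec 1>&3 2>&4
      wait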

      • sburke says:

        Sorry for the late reply. Here is how I have butchered your script, but it seems to work: https://pastebin.com/PBzrBXq0
        I have only tested the 'happy path' flow.
        Your above solution solves the initial issue, but what was happening to me was that any time sed_me is called, it redirects again and opens a new tee process, and then the next time we hit a wait, we're back to square one and wait forever. Does your above fix solve both problems?
        What I think I've done is essentially kill the tee process before each wait and open a new one after waiting.

        • Zack says:

          Hello 🙂 No, my above solution was just proposed based on the solution offered on Stack Exchange. You are correct: the sed_me function would put you right back at square one. And I wouldn't call what you did a butchering; you made new functions, and it looks pretty good. Your assumption is also correct: you are allowing the tee process to close before each wait. That's why it is working. Good job!

    • realbosselarsson says:

      It took me about a month to realize I was not getting any mails from SnapRAID, and then I remembered it was about a month since I updated Proxmox; bash probably got updated too.
      Your changes sorted it for me, thanks!

  18. MMauro says:

    Hi, thanks for the great work! There's a syntax error with the IFS variable on line 110 that renders the script unusable. I don't know what that variable is supposed to be, so I'm stuck.

  19. Egleu says:

    Nice script. My only suggestion is to either default to sync -h for pre-hashing or to leave a comment about it. I feel that pre-hashing is almost necessary if you aren't using ECC memory.
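    (That is, in the script's sync step:)

    snapraid sync -h   # -h / --pre-hash: hash new data in a preliminary pass before parity is computed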

  20. svh1985 says:

    Love the script, thanks! One thing I'm wondering is how to set up SMTP authentication. I'm using Gmail, so I have to authenticate. Is this something I can set up?

  21. svh1985 says:

    It looks like there is a typo in the script on line 116 which breaks the script.
    The IFS= is never used in the script, I think; should it even be there?

    Line 116:
    IFS=
    \n’ PARITY_FILES=(cat /etc/snapraid.conf | grep "^[^#;]" | grep "^\([2-6z]-\)*parity" | cut -d " " -f 2 | tr ',' '\n')

    • Zack says:

      Thanks for letting me know. That is supposed to be there; WordPress just threw in a random line break after my last edit. It is supposed to be like this…
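      (Rejoined onto one line; the $( ) command substitution is a reconstruction and the original may have used backticks:)

      IFS=$'\n' PARITY_FILES=($(cat /etc/snapraid.conf | grep "^[^#;]" | grep "^\([2-6z]-\)*parity" | cut -d " " -f 2 | tr ',' '\n'))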

      IFS stands for “internal field separator”. It is used by the shell to determine how to do word splitting, i.e., how to recognize word boundaries.

  22. EmanuelW says:

    Hi (again) Zack,

    When doing some debugging of another issue (see the comment originally made by Dulanic about double un-pausing of Docker services), I found that when waiting for processes to exit, sometimes the value of the pgrep in "close_output_and_wait()" would be invalid (a "not a valid PID" error thrown from wait). This would happen, for example, with a diff that has no changes (and thus runs very briefly?). A temporary workaround for me was:
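    (Hedged sketch of that workaround; the pgrep pattern is an assumption, and the point is simply to skip the wait when no PID comes back:)

    TEE_PID=$(pgrep -P $$ -x tee) || true
    if [[ -n "${TEE_PID}" ]]; then
        wait "${TEE_PID}"
    fi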

    But preferably one would instead save the PID when spawning the process. Does this seem reasonable?

    • Zack says:

      Maybe I’m dense, but I don’t have, or see, a close_output_and_wait function in my version of the script. Where did you use this? And, thanks for all the questions and ideas!

  23. mbourd25 says:

    Hi gang, I'm using OpenMediaVault for my NAS software. I have two 2TB data drives and another two 2TB drives for SnapRAID parity. I also have a 6TB hard drive that I back up to directly.

    Would this script be overkill for my usage? I have very few personal documents saved on my NAS; the storage is mostly for my movie and music collection.

    If I can use this script, would anyone have a good tutorial on how I could make it work in OMV? Right now, I just run a snapraid sync every 2 hours.

    Thanks.

    • Zack says:

      Hello, this script “should” work fine in OMV, as it is just a Debian-based distribution like Ubuntu. You'd need to paste the contents of this script into a file, change the variables to the correct email addresses and disk locations, and finally chmod +x your script and add it to your crontab. Before you do any of that, just get the script set up and try to run it outside of crontab first. Below, I've saved the script to /root/scripts/SnapRAID-sync-script.
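      (Illustrative commands for that setup; the schedule is borrowed from the crontab example earlier in the thread:)

      chmod +x /root/scripts/SnapRAID-sync-script
      /root/scripts/SnapRAID-sync-script    # test run outside of cron first

      Then add it to your crontab (crontab -e):

      30 23 * * * /root/scripts/SnapRAID-sync-script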
