Choosing the correct RAID configuration


Oliver Daniel

This morning, I plugged in a G-Speed Studio Thunderbolt 2 4-bay (24TB) to edit video on my iMac 5K, and I'm looking for a little advice from people who know better.

Previously, I've just edited from a single Thunderbolt drive (with external backups), and that was fine until recently, when I started needing more drive space and more speed for more complex tasks. Hence the purchase of this recommended item. I've looked into setting up the best configuration, but my brain has melted into mush. (I've had a lot of personal/business changes to deal with recently, so very technical things are not computing at the moment! I need a break!)

The drive is currently set up as RAID 5, but I'm still trying to get my head around the workflow for such configurations.

These tasks will be done: 

  • Editing 4k and 1080p footage... finishing in 1080p on FCPX. 
  • Some grading in DaVinci Resolve. (not too heavy duty). 
  • Codec types are ProRes (mostly LT), XAVC I,S & L, H.264, Cinema DNG (only a few shots). I transcode to ProRes if possible but may not due to time. 
  • Basic VFX work (quick logo animations). 

It would be ideal to get the benefit of both great speed and redundancy, and from what I see RAID 5 looks good. My question about this configuration is the workflow. It seems excessive to spread the data across all the drives and not get the benefit of the full capacity, so the objective would be to remove the data from the drives once a project is complete (to clear space), but keep a copy of the files/project data elsewhere in case I need them again in the future (very likely).

While the process was simple beforehand, this RAID stuff is spinning my battered delicate head. (yes, I need a break). 

What workflows do you recommend and in which RAID configuration? 

Any comments are much appreciated. 

 

Link to comment
Share on other sites

If you're looking for both speed and redundancy, RAID-10 or RAID 0+1 will be better.

The RAID levels that use parity data (RAID-5, RAID-6 and such) will generally be slower, especially for writes.

But, with a fast enough processor for the raid storage you can get decent speeds out of RAID-5 and RAID-6 levels too. I'm not sure what kind of CPU that device has.

 

Here's a test you can do if you have an internal SSD:

Just create a large file and put it on the SSD, preferably a few TB in size. In Terminal you can run the following command:

time cp /path/to/testfile /path/to/destination

This will test the write speed of your volume (as long as the internal SSD is fast enough). time is a command that will give you the exact time it takes to execute a command. cp is the unix command for copying files.

Do this test both for RAID-10 and RAID-5 on your device. For reading from the volume you just switch the source & destination paths and write to your internal SSD instead.

 

A note here: If you don't have a fast enough SSD, you'll be benchmarking the read/write speed of the SSD instead of the thunderbolt volume. Then you'll have to find some software/script that can create semi-random data fast enough in realtime for benchmarking reads & writes.
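
For the case described in the note above, a small script can generate the test data in memory so that the internal SSD is never the bottleneck. This is only a minimal Python sketch, not a proper benchmarking tool: the function name `measure_write_speed`, the default sizes and the example path in the comment are all assumptions made up for illustration, and a real test should write far more data than fits in RAM or in the enclosure's cache.

```python
import os
import time


def measure_write_speed(path, total_mb=256, block_mb=8):
    """Write incompressible data to `path` and return the speed in MiB/s.

    The random block is generated once in memory and reused, so neither
    another disk nor the random generator limits the measurement.
    """
    block = os.urandom(block_mb * 1024 * 1024)
    blocks = total_mb // block_mb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        # Ask the OS to push buffered data to the device before stopping the clock.
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return (blocks * block_mb) / elapsed


# Hypothetical usage; the mount point below is an assumption, not a real path:
# print(measure_write_speed("/Volumes/MyRaidVolume/benchmark.tmp", total_mb=20000))
```

Run it once per RAID mode and compare the numbers. Delete the test file afterwards.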

 

Anyway, how fast your volume will be with RAID-5 is very CPU dependent. I'm using a NAS I built myself, which runs ZFS raidz2 (it functions similarly to RAID-6). The network connection is my limitation, but I've reached write speeds of 400-500 MB/s when benchmarking on the machine itself with no network involved. Read speeds are even better. But that machine has a quad-core Xeon CPU doing the parity calculations.

 

In the end what matters for you when you do your copy tests:

- What sequential read & write speed is fast enough?

- Is the most speed or the most available disk space more important?

If you have the time to test with real data (test with the highest bitrate material you will be using), try that on RAID-5 first. If that is too slow - you will get a bit more speed out of RAID-10 and can rebuild the array with that.

Link to comment
Share on other sites

Thank you for this. 

I think I need a break to clear up the brainwaves... nothing is making much sense, not even my sandwich does. :flushed:

Link to comment
Share on other sites

Here's a very simple explanation, I hope!  A drive head can only be in read or write mode, and drives slow down when they switch between the two.  The less the drive has to go back and forth between reading and writing (checking what it reads), the better.  Also, the hard drive head writes in chunks and can only write so much before it must stop, go into read mode (catch its breath, so to speak), then write another chunk.

In RAID 0, the drives are configured so that while one drive is writing a chunk, the other drive is preparing to write the next chunk.  This means that the first drive gets the first part of the data, the second drive the second part, and so on, switching back and forth.  Obviously, if anything happens to either drive, ALL the data is lost, because each drive holds only part of the data.
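
To make the back-and-forth concrete, here is a toy Python sketch of how RAID 0 deals chunks out to the drives in turn. The function name `stripe` and the tiny chunk size are made up for illustration; a real controller works on much larger chunks and on raw disk blocks, not on Python bytes.

```python
def stripe(data, n_disks, chunk_size):
    """Split `data` into chunks and deal them out round-robin, RAID 0 style."""
    disks = [[] for _ in range(n_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, then back to disk 0, ...
        disks[chunk_index % n_disks].append(data[offset:offset + chunk_size])
    return disks


layout = stripe(b"ABCDEFGH", n_disks=2, chunk_size=2)
# layout[0] holds b"AB" and b"EF"; layout[1] holds b"CD" and b"GH".
# Neither disk alone contains the whole file, which is why losing one loses everything.
```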

RAID 0 is all about speed!

HOWEVER, for RAID 0 to be maximized, it needs an internal card.  Going through Thunderbolt will probably remove a lot of the internal memory bus benefits, so I would skip RAID 0 over Thunderbolt.  I'd go RAID 5 if you want redundancy, though I'd KISS (keep it simple, stupid) and just create backups every day.

In short, be wary of advertising from drive makers.  If you don't see a difference, go back to KISS.

Every time I study this stuff myself, I end up in tears too!  So you are not alone. 

Link to comment
Share on other sites

I might not be too professional about this, but I came to the conclusion that complex RAID setups are not for me, and as long as you're not a data host or anything like that, I don't believe they have to be for you, either.

Consider this: you only need the really high transfer rates for editing. So my setup is this: a RAID 0 of two inexpensive 7,200 RPM HDs for editing and temporary storage, plus two SSDs (one for the OS, one for caching), so you can always read from one, cache on another and write to a third array/disk. If you need more speed, just use an SSD RAID 0 or whatever. For storage and backup I use inexpensive USB 3 solutions like these:

http://www.amazon.co.uk/Tool-free-Inateck-Including-External-Comaptible/dp/B00GIDNLI6/ref=sr_1_10?ie=UTF8&qid=1440515061&sr=8-10&keywords=inateck+usb+3

Windows 10 works perfectly with USB 3 solutions (previous Windows versions did have issues); I don't know about Mac, though. I just dump the data onto inexpensive 7,200 RPM HDs (write speed around 120 to 140 MB/s), handwrite something on the label and put them in a closet. If the data is super precious, just use two disks. I do have a Synology 8-bay NAS, but I hate using it: if something goes wrong, rebuild times are tedious, the damn thing has to run all the time, and it's not as fast as your main workstation. So nowadays I'm all about JBOD (just a bunch of disks), as long as you have good control over your environment. It works as long as you're no corporate bigwig. Inexpensive, uncomplicated and imho reliable.

Oh, if anyone in Europe wants to buy the Synology DS1813+, give me a shout. :p

Link to comment
Share on other sites

I need it simple, because I don't have the spare brainpower to study this nor do I have time for it. 

I just need this huge 24TB trashcan to give me speedy editing, and also some room to hot swap a disk in the 4-bay that I can use for backup (as well as a bunch of other external backups - let's keep that separate because I'll do that anyway). Make sense?

When it comes to setting it up... blank!

Link to comment
Share on other sites

 

One thing to keep in mind: if/when these proprietary hardware solutions fail - you won't be able to get to the data on them until you get an identical unit delivered back that can detect the disks.

Hence it's smart to back up to other Thunderbolt / USB3 disks, preferably also while you're in the middle of working on a project.

If you keep that in mind, it doesn't matter much what mode you set the device to, as long as the controller in this hardware device is up to the processing.

Personally I wouldn't go for RAID-0 with four disks. If one single hard disk fails (with 4 disks, the chance of this happening is roughly four times higher than with a single hard disk), the whole array fails and won't work until you've plugged in another disk and recreated the array. Since all the data is lost when one disk fails in RAID-0, everything that was on the array also has to be copied back from backup, and with large disks this takes quite a while. So it can cost you time...
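
The "four times higher" figure can be checked with a couple of lines of Python. This is a rough sketch that assumes disks fail independently and uses a made-up 3% per-disk failure chance over some period; the real number for these drives is not known from this thread.

```python
def any_disk_fails(p_single, n_disks):
    """Chance that at least one of n independent disks fails."""
    return 1 - (1 - p_single) ** n_disks


p = 0.03  # assumed failure chance of one disk over some period (illustrative only)
p_array = any_disk_fails(p, 4)
ratio = p_array / p
# For small p the ratio comes out slightly under 4, so "four times" is a fair rule of thumb.
```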

If you run the array in RAID-10 or RAID-5 mode, you can lose one physical disk and keep on working (actually, in RAID-10 mode you can lose two disks and keep on working if you're lucky and lose the right disks). You can then insert a new disk and start the rebuild a few days later, when you have the time to wait for it.
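
How lucky you need to be with RAID-10 can be counted directly. The Python sketch below assumes a 4-disk RAID-10 built from two mirrored pairs, with disks 0+1 as one mirror and disks 2+3 as the other; that pairing is an assumption for illustration, not something read from the G-Speed's documentation.

```python
from itertools import combinations

MIRROR_PAIRS = [(0, 1), (2, 3)]  # assumed layout: two mirrors striped together


def survives(failed_disks):
    """The array survives as long as no mirror pair has lost both of its disks."""
    failed = set(failed_disks)
    return not any(a in failed and b in failed for a, b in MIRROR_PAIRS)


two_disk_failures = list(combinations(range(4), 2))
survivable = [combo for combo in two_disk_failures if survives(combo)]
# Of the 6 possible two-disk failures, the array survives the 4 where the
# failed disks belong to different mirrors.
```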

I've been building file servers for myself for 15 years, so quite a few things that seem obvious to me might not be obvious for others. But feel free to hit me up with questions if something seems unclear.

Link to comment
Share on other sites

I got this product brand new for nearly half the retail price (hence why I've gone from very simple to a little monster!).

From what I can see it's come configured as RAID-5... although it's 24TB it says the capacity is 18TB. I'm guessing you lose a 1/4 of the space in all the mathematical TB calculational thingy whatever? (yes, my head is working this way right now and I have a night shoot looming!).  

So the best way would be to keep it this way, but have separate external drives just to backup the final data and projects for archive, then when the G-Speed RAID 5 fills up... just delete it all and re-fill with new stuff (which is fine, as I will have all the final projects/media backed up/saved anyway). 

Thanks for the help so far!

 

Link to comment
Share on other sites

 

From what I can see it's come configured as RAID-5... although it's 24TB it says the capacity is 18TB. I'm guessing you lose a 1/4 of the space in all the mathematical TB calculational thingy whatever? (yes, my head is working this way right now and I have a night shoot looming!). 

Correct!

With RAID-5 you lose the size of one single disk to parity information (the redundancy). With 3 disks, 1/3 goes to redundancy, with 4 disks 1/4 goes to redundancy.
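
The same arithmetic for the common RAID levels, as a small Python sketch. It assumes the 24TB unit holds four 6TB disks (24 / 4, not confirmed in this thread), and it ignores filesystem overhead and the TB vs TiB difference. The function name `usable_tb` is made up for illustration.

```python
def usable_tb(level, disks, disk_tb):
    """Rough usable capacity in TB for an array of identical disks."""
    if level == "raid0":
        return disks * disk_tb          # no redundancy, all space usable
    if level == "raid5":
        return (disks - 1) * disk_tb    # one disk's worth goes to parity
    if level == "raid6":
        return (disks - 2) * disk_tb    # two disks' worth go to parity
    if level == "raid10":
        return (disks // 2) * disk_tb   # every disk is mirrored
    raise ValueError("unknown RAID level: " + level)


raid5_capacity = usable_tb("raid5", 4, 6)    # 18, matching what the unit reports
raid10_capacity = usable_tb("raid10", 4, 6)  # 12
```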

 

So the best way would be to keep it this way, but have separate external drives just to backup the final data and projects for archive, then when the G-Speed RAID 5 fills up... just delete it all and re-fill with new stuff (which is fine, as I will have all the final projects/media backed up/saved anyway). 

Thanks for the help so far!

 

Yes - if it came preconfigured like that, I'd assume it has the processing needed to get proper speed out of it in RAID-5 mode.

When speed is enough, RAID-5 should be a good choice for size vs redundancy for something like a video editing array (in other words: you want a bit of redundancy so that one hard disk failure doesn't stop you from working immediately, but you don't need A LOT of redundancy - since you can back up the projects on other external disks).

Link to comment
Share on other sites

Cheers! Making more sense now. 

I'm going to spin a few projects through the drives and see how it performs. Hopefully it will be ok and i can get back to focusing on the things I do best :)

I'm just this minute saving 163GB of video footage onto the drive... the waiting time is just 4 minutes! 

Link to comment
Share on other sites

Oliver, SSDs like the Samsung 850 Pro will offer you better speeds than a RAID 5. I am slowly shifting to SSDs as the Media/EDIT drives as my projects rarely go above 6-700 GB.  RAID is a luxury I cannot afford and it does have its caveats. A simpler, more economical suggestion would be to use SSDs (the 850 Pros) as your Media and Cache drives, and back up to external disks. While this requires a little more effort, it gives you similar or better speeds at a much lower cost. With 4K and even high-bitrate 1080p footage, the faster seek times that SSDs offer will give you better performance while scrubbing through tracks. Stick these SSDs into Thunderbolt external enclosures.

Link to comment
Share on other sites

I want to 2nd what Dahlfors said.  I have photos on an old WD Worldbook.  It failed, and because the card has proprietary encryption whatever, I was about to buy an old unit to try swapping the circuitry out when someone on the Internet said I could snip the diode and it would work again (not something you'll obviously do!).  Electronics die over time.  I was lucky.  That's why I strongly advocate having a few backups.

HEAT is the enemy of ALL THINGS ELECTRONIC.  Put a fan on it if you aren't in a cooled atmosphere.  Also, make sure you have good surge protection, etc.  

 

Link to comment
Share on other sites

I've gone through lots of RAID nightmares and I'd say just go for the max speed and space, i.e. level 0, and then as stated above: use hardware to manage it. Then back up to tape often or nightly. I also don't keep project files on RAID; I just use it for source footage (which always exists somewhere else as a master), renders and cache, and keep all project files on internal SATA. I think RAID is still useful for overall size and speed, but I would just forget about the redundancy benefits; rebuilding a RAID is something you never want to do. For backup I do nightly partial LTO backups with Bacula on RHEL; it uses MySQL or SQLite and is also available on Mac, Windows and Linux. Then I literally keep my archived LTOs in a fireproof go box... and I still can't sleep at night.

Link to comment
Share on other sites

I want to 2nd what Dahlfors said.  I have photos on an old WD Worldbook.  It failed, and because the card has proprietary encryption whatever, I was about to buy an old unit to try swapping the circuitry out when someone on the Internet said I could snip the diode and it would work again (not something you'll obviously do!).  Electronics die over time.  I was lucky.  That's why I strongly advocate having a few backups.

This is the reason why I've been keeping my most important data on software RAID running on normal PCs with Linux/FreeBSD. As long as the disks are alive, I can just use new off-the-shelf PC hardware, put the disks in and have Linux/FreeBSD detect the RAID setup. No need to contact manufacturers or scout eBay for some discontinued hardware.

Besides that, by running raidz2 I need to lose more than 2 physical hard disks in the disk array before I lose data - and in addition to that I sync & backup the most important data to other machines as well.

I once got burnt and lost a full month of work (that I had to redo...) when a computer's PSU went up in smoke. I've been very careful with my data ever since. Since the 90s I've had 15-20 disks die on me, without data loss.

 

I am slowly shifting to SSDs as the Media/EDIT drives as my projects rarely go above 6-700 GB. 

Soon we'll all be shifting to SSD's. Both Samsung and Toshiba have announced 16 TB SSD's, while it seems like hard disk manufacturers are getting stuck at around the 8-10 TB limit for a while.

Here's a shot from a Toshiba presentation: http://www.nordichardware.se/images/labswedish/nyhetsartiklar/Lagring/toshiba.nand/fullimages/toshiba-qlc-roadmap-640x0.jpg

For enterprise, they're expecting to deliver 128 TB drives by 2018. Hopefully this will drive down the prices on moderately sized flash drives :)

Oh, and about speed... The latest flash drives (needing PCI express 3.0 interfaces) have read speeds of 5.5 GB/s and plenty fast write speeds too. I'm not sad to see the demise of hard disks...

Link to comment
Share on other sites

I've been editing a music video project - the first one on this new G-Speed Studio in RAID 5. I've not really decided how to configure it in the long term (I'll have to decide after I've done this project), but I've been able to play full quality 4k ProRes LT video on 9 multi-cam windows simultaneously in FCPX, with the main timeline playing too. (in full quality 4k). Editing is buttery smooth with absolutely no issues whatsoever. 

Also - 128 GB of footage from my SD card takes 3 minutes to save to the drive (before, this was an overnight job).

While I'm very happy with this performance - I'm not sure how that compares with other solutions explained in this thread. 

Link to comment
Share on other sites

If it's 128 GB in 180 seconds, that will be around 700 MB/s in write speed. That's good speed with so few spinning hard disks and means that the disks are writing near the limit of their max write speeds. You will most likely not notice any improvements in speed by switching raid mode.
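
For anyone who wants to redo that estimate with their own copy times, here is the calculation as a tiny Python sketch. It assumes decimal units (1 GB = 1000 MB), and `throughput_mb_s` is just an illustrative name. The second figure uses the 163 GB in 4 minutes mentioned earlier in the thread.

```python
def throughput_mb_s(size_gb, seconds):
    """Average transfer speed in MB/s, assuming 1 GB = 1000 MB."""
    return size_gb * 1000 / seconds


sd_card_copy = throughput_mb_s(128, 180)     # about 711 MB/s
footage_copy = throughput_mb_s(163, 4 * 60)  # about 679 MB/s
```

Both copies land close to the same figure, which supports the "around 700 MB/s" estimate.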

Link to comment
Share on other sites

That beats a regular SATA 3 SSD too! With 4 disks it works out to a speed of about 175 MB/s PER disk, which is superb. Stick with the current config.

Link to comment
Share on other sites
