MaxVeloSSD

Maximum Velocity Caching Software

Hardware RAID 0 problem

talaverde
Posts: 2


3/26/2019
I have a 3 x 10 TB RAID 0 array using Intel Rapid Storage. It had been working fine since I set it up, with no issues. In fact, I've never had an Intel RAID 0 array go bad except through user error (and I can't even remember the last time that happened). Anyway, it was working completely fine. I added the MaxVeloSSD cache using a 1 TB NVMe SSD. Within 24 hours, the drive array crashed, losing about 15 TB of data. Fortunately, I had about 95% of it replicated to another drive/server, so I didn't lose much data, but it took me about 48 hours to recover the system. I'm very reluctant to use it again.

I bought a 3-system license. I have the same setup on another 3 x 10 TB RAID 0 array, but with a different RAID card. That one has been working fine. If that had crashed too, I would have been screwed.


Anyway, any ideas why the cache failed? It MUST have been because of the cache. I just have no idea why or how it failed.

Oh, also, there is roughly 3-6 GB of RAM cache on top of it.
edited by talaverde on 3/26/2019
bodo
Administrator
Posts: 98


3/26/2019
I am sorry for the error on your system. There are certainly some starting points I would like to mention in this connection. First, let me make clear that our software does not use any proprietary specifics of the controller; it accesses the array completely generically. It is therefore not possible that a crash on your Intel RAID controller was caused by our software. Furthermore, it is generally known that RAID 0 is the most error-prone mode: the array fails completely when a single disk fails. It is also generally not recommended to use RAID 0 unless you keep at least one complete backup in reserve. (Sorry, I had to type this in German; thanks to Google Translate.)

I would double-check the hard disks with SMART tools and create at least one complete backup. Ideally, use new hard disks for the cached system and keep the old ones as a second, safe backup. And to be safer in the long run, upgrade to a higher RAID level.
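For anyone following along, here is a minimal sketch of the kind of SMART check meant above. It assumes `smartctl` from the smartmontools package is installed, and the `/dev/sdX` device names are placeholders you would replace with your actual array members:

```python
import subprocess

def smart_health_passed(smartctl_output: str) -> bool:
    """Return True if smartctl reports an overall-health PASSED verdict."""
    for line in smartctl_output.splitlines():
        if "overall-health self-assessment test result" in line:
            return line.rstrip().endswith("PASSED")
    return False  # verdict line missing: treat the disk as suspect

def check_disk(device: str) -> bool:
    """Run `smartctl -H` on one device and parse the verdict."""
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    return smart_health_passed(result.stdout)

if __name__ == "__main__":
    # Placeholder device names for the three array members; adjust to your system.
    try:
        for dev in ["/dev/sda", "/dev/sdb", "/dev/sdc"]:
            status = "PASSED" if check_disk(dev) else "CHECK THIS DISK"
            print(f"{dev}: {status}")
    except FileNotFoundError:
        print("smartctl not found; install smartmontools first")
```

Note that SMART health is a coarse check; a full `smartctl -a` dump of the attribute table (reallocated sectors, pending sectors) is more informative for disks behind a working controller.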
edited by bodo on 3/26/2019
talaverde
Posts: 2


3/27/2019
I agree with everything you said. RAID 0 is the riskiest: one drive failure and it's toast. However, to be frank, in my 20+ years of RAID experience, I've had more data loss from parity failures than from a drive failing. I've found that redundant RAID 0 arrangements work better than RAID 5/6. RAID 10 is good too (pretty much the same thing).

I also agree that this type of cache shouldn't cause a failure, which is why I invested in it. The cache is just a copy of the actual data. This is why I was SO surprised by the failure. It's been running flawlessly on my other server, which uses an LSI RAID controller (RAID 5/6). Anyway, for the RAID 0 array, I simply reformatted the drives, NOT using the cache, and it's back to its usual performance. No issues. It could have been just a fluke (a coincidence, but an extremely unlikely one). Once I get all the data replicated again and running efficiently with redundancy, I'll try again.

There was a time when the software locked up and became unresponsive, at about the same time as this failure. There is no way to know cause vs. effect.

Could the caching be overloading the RAID 0 array? I realize you'd be reluctant to share, but are there any other cases where there were issues with RAID 0? I'm fine if there are; I just need to know the limitations. I'm guessing not, since I don't see how it would matter. In fact, I would think RAID 0 would handle caching better than a parity array.

This was ** A LOT ** of data to lose. As I said, I had redundancies, but it took a lot of time to recover. I had never had any issues with these drives, EVER, until two days after installing the cache.


Can you give me more detail on the different performance profiles? Clearly, since you offer the different options, the risks vary. Could that be my issue? Should I stick with a lower, 'conservative' performance profile? Beyond your standard documentation, do you have any guidelines for which to use and when? I'm sorry if I'm not being specific; it's only because I don't know the right questions to ask. Thanks for your help. Don't worry, I'll continue to use your software.

Oh, BTW, these drives are less than two months old: used enough to know they aren't defective, but not enough to wear them out. I can't believe it's the drives. I could see, however, that it could be the RAID software. Intel Rapid Storage is 'software RAID'. Could that software have conflicted with the caching software somehow?


The Rapid Storage software has SMART monitoring built in. Any SMART events would have registered immediately, which did not happen; none of the drives failed. The array simply went from readable to unreadable, requiring a reformat to become usable again. From the observable symptoms, it looks like the boot sector was corrupted. (I have a feeling that if I didn't have redundancies and this data were invaluable, I could have had the array rebuilt. It was easier to just re-copy from the replicated source.)
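One cheap, read-only way to test a boot-sector hypothesis like this is to check whether the classic 0x55AA signature is still present at the end of the first sector. This is a sketch under assumptions: it applies to MBR layouts (and to the protective MBR that GPT disks carry), and the device path below is a placeholder, not the actual array device from this thread:

```python
MBR_SIZE = 512
MBR_SIGNATURE = b"\x55\xaa"  # expected at byte offsets 510-511 of a valid MBR

def has_boot_signature(first_sector: bytes) -> bool:
    """Check the 0x55AA signature at the end of the first sector."""
    return len(first_sector) >= MBR_SIZE and first_sector[510:512] == MBR_SIGNATURE

def check_device(path: str) -> bool:
    """Read sector 0 of a block device (needs root; path is a placeholder)."""
    with open(path, "rb") as disk:
        return has_boot_signature(disk.read(MBR_SIZE))

if __name__ == "__main__":
    # Hypothetical array device name; adjust to your system.
    try:
        print("signature OK" if check_device("/dev/sda") else "signature missing")
    except (PermissionError, FileNotFoundError, OSError) as exc:
        print(f"could not read device: {exc}")
```

A missing signature would point at exactly the "readable to unreadable" symptom described above; a present signature would suggest the corruption sits deeper, e.g. in the partition table entries or filesystem metadata.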

Several terabytes of data take quite a while to recover and then sync up with the replication. Once this is finished, I'll try again. I guess we'll see.
edited by talaverde on 3/28/2019
bodo
Administrator
Posts: 98


4/3/2019
We run some hardware arrays too and had kept them running with VeloSSD and MaxVeloSSD for some time before your post. So far we have had no issues. However, to be honest, we have not yet built a 3 x 10 TB RAID 0 array. We might build one in the near future. Sorry for the delayed response. I'll be back when I know more.