The Disk Drive Array

Postby jbv » Thu Oct 18, 2012 7:41 am

Drive Pooling has now been completed and is implemented from Beta-4

As I am now ready to start plugging hard drives into FoxyRoxyLinux, it might be prudent to start a discussion of what I have planned and why.

Along the way, I may deviate from what has become the accepted norm or what is traditionally done in Linux systems. My intent is to not spear off in a totally radical direction and I will be using proven techniques, principles, and packages. While the direction I am heading is not totally out of left-field, it may not be main-stream.

For starters, I want my FoxyRoxyLinux to mount Hard Drives by Volume Label rather than the traditional /dev/sdx mechanism. The primary reason for this is that my system may have 20 Hard Drives loaded and I do not want a hard drive to be tied to a specific drive-bay. In short, I want to be able to plug any drive into any of the 20 drive bays that my machine will have, without needing to reconfigure or remap anything. I see this as being quite important in the event of a drive failure, or if I want to load a suspect drive from another FoxyRoxyLinux machine to see if I can read data from it.
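A minimal sketch of the label lookup, written in Python purely for illustration (nothing in this thread says FoxyRoxyLinux does it this way). It assumes the usual udev-maintained directory /dev/disk/by-label, where each entry is a symlink named after a volume label and pointing at the real device node. The function name labels_to_devices is hypothetical.

```python
import os

def labels_to_devices(by_label_dir="/dev/disk/by-label"):
    """Map each volume label to the device node its symlink points at."""
    mapping = {}
    if not os.path.isdir(by_label_dir):
        return mapping  # no labelled volumes, or udev does not populate this path
    for label in sorted(os.listdir(by_label_dir)):
        link = os.path.join(by_label_dir, label)
        # realpath follows the symlink, e.g. ../../sdb1 -> /dev/sdb1
        mapping[label] = os.path.realpath(link)
    return mapping

if __name__ == "__main__":
    for label, device in labels_to_devices().items():
        print(f"{label} -> {device}")
```

Because the lookup goes from label to device, it does not matter which bay (and therefore which /dev/sdX name) a drive ends up with.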

I will also be deviating from the norm with regard to the type of RAID I will be using. I do not intend to use any form of "live" RAID, as in RAID-5 or RAID-6 etc. There are a few reasons behind my thinking in this area. One of the prime factors here is that as a media server, my "box" will spend a large part of its time in sleep mode with all of the hard drives spun-down and in stand-by. With a traditional RAID, you need to wait for all of the hard drives to spin-up before anything can be read from them. My goal is to have all of the data from any individual file on one physical drive, which will mean that the only drive that will be spun-up will be the one containing the file that you want to read. This also has a side benefit in that you can take any drive out of the system and put it into any machine and read everything from it. In a traditional RAID arrangement, you cannot move anything, and if you took a drive out of an array and put it into another machine, all you would see would be garbage.

Instead of a real-time RAID, I will be using a form of RAID that is referred to as a "snapshot RAID". This is where, on a set schedule, the system takes a "snapshot" of the data and writes the RAID (parity) details across the drives in such a way that any disk can be rebuilt from the data. You will be able to choose between RAID-5 or RAID-6 depending upon your needs/feelings in this area. Yes, this means that the RAID data may be out of date until the next snapshot, although as I intend to take a snapshot at least once every 24 hours and most of my data will be static, this will work for me.
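To make the snapshot idea concrete, here is a toy sketch in Python of single-parity (RAID-5 style) protection using XOR. It is an illustration only: the thread does not name the tool that will be used, the function names are made up, and real tools work on whole disks or files rather than tiny equal-length byte strings. Dual parity (RAID-6 style) needs a second, different code and is not shown.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def take_snapshot(data_blocks):
    """Compute the parity block for the data as it exists right now."""
    return xor_blocks(data_blocks)

def rebuild(surviving_blocks, parity):
    """Recover the one missing block from the survivors plus the parity."""
    return xor_blocks(surviving_blocks + [parity])

if __name__ == "__main__":
    disks = [b"movie-01", b"music-02", b"photo-03"]
    parity = take_snapshot(disks)
    # Pretend the middle disk died after the snapshot was taken.
    print(rebuild([disks[0], disks[2]], parity))
```

The rebuild only works for data that was present when the snapshot was taken, which is exactly the "may be out of date until the next snapshot" trade-off described above.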

Volume Management or disk pooling will be another area where I will be deviating slightly. My intent is to have every disk drive appear as a huge single volume or directory path. From this single directory path, I will create my own sub directories. When a file is copied to any path within my directory tree, the system will find the disk with the least amount of space that can hold the entire file and place the file on that physical disk drive. In a way, this becomes a form of load balancing such that files will be placed on disk drives in a balanced order and they will always be complete.
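The placement rule described here (the fullest disk that can still hold the whole file) can be sketched in a few lines of Python. This is only an illustration of the policy as planned in this post; the function name and the dictionary of free space are assumptions, and a later post in this thread reports the working pool picking the drive with the most free space instead.

```python
def choose_disk(free_bytes, file_size):
    """Return the disk with the least free space that still fits the file.

    free_bytes maps a disk name to its free space in bytes.
    Returns None when no single disk can hold the whole file.
    """
    candidates = [(free, name) for name, free in free_bytes.items()
                  if free >= file_size]
    if not candidates:
        return None
    # min() compares the free-space figure first, so the tightest fit wins.
    return min(candidates)[1]

if __name__ == "__main__":
    free = {"DISK1": 100, "DISK2": 50, "DISK3": 10}
    print(choose_disk(free, 40))
```

Either way, the file is never split: it lands complete on one physical drive or it is not placed at all.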

I hope that the tools and techniques I intend to use will not consume system resources and will provide optimal performance on any machine configuration.

I will be making more notes over the next few days and expanding on (and cleaning up) some of this.

Cheers, Brenton
jbv
 
Posts: 600
Joined: Sat Jul 14, 2012 2:02 am
Location: Sydney, Australia

Re: The Disk Drive Array

Postby jbv » Fri Oct 19, 2012 9:50 am

Things have started to get a little serious.
To date all development on FoxyRoxy, and I do mean all, has been done on a single 4GB USB stick.
While I have "a few" 2TB Samsung HD204 drives, these have been filled while I have been building FoxyRoxy.
- minor detail :(

Over the last few days I managed to scrounge a few SATA Hard Drives.

To be specific, the drives I managed to scrounge are older SATA drives. I have:
2 x 160GB 7200 RPM Seagate drives
2 x 80GB 7200 RPM Seagate drives
1 x 80GB 5400 RPM WD-Green drive

For those who have been following this saga from the beginning, you may recall that I purchased 2 x Gigabyte GA525-TUD main boards and enough RAM to fully populate each with 4GB, which is the maximum RAM an Intel Atom D525 CPU will see. I also purchased 4 x 5-in-3 drive enclosures, 4 x 5-port SATA port expanders, and 2 x 15-bay cases with a PSU capable of powering all HDDs if required. After a very quick and dirty test 15 months ago to verify that the on-board SATA chips would work with the port-expanders, I packed everything back into the boxes and just left one of the GA525-TUD main-boards sitting on my desk with a crude on/off switch hard wired to it and set about playing with stuff on USB sticks.

Tonight I got a dramatic, "must not turn left or right", 100% confirmation that I am on the right track!

After a rather productive day at "the office", I thought I'd push my luck a little. I pulled out the enclosure I've been borrowing the power-supply from to power the FoxyRoxy development machine, and pulled the main FoxyRoxy Development machine (board) down. To make sure that I didn't start digging a hole I couldn't climb out of, this was a simple matter of unplugging everything and sticking the main-board on the shelf ;)

I then pulled the box for the enclosure apart (which to date has been serving hard-time as the "plinth" for the PSU so everything reached my desk :lol:). I then took FoxyRoxy's sister board and fitted it to the enclosure and started putzing around with power-cables, SATA cables, and a veritable plethora of other cables. Like a kid so excited with a new "toy", I didn't make any notes as to the BIOS settings I had in the original FoxyRoxy mainboard, but knowing she was safe-and-sound on the shelf, I figured, what the heck, and just started playing with her sister. It took a little while, but I finally got a BIOS config that worked with the USB wireless keyboard I now had plugged into the dev-machine and got the new main-board to see the SATA drives. A little more playing and we have it....

With no changes to FoxyRoxy I now have a machine booting FoxyRoxyLinux from a USB stick that sees 5 x SATA Hard Drives using only 1 of the (4) on-board SATA ports [20 HDDs here we come]. The only "tweaks" made were to the BIOS and basic machine config, which I will sort out and document later, although I expect these settings may be different for various main-boards and port-expanders etc... But this does prove that FoxyRoxyLinux can "do it". Yep, this little baby can get down and dirty when needed.....

As the "scrounged" HDD's came from "windows" machines, these drives have NTFS partitions on them. While Gparted can see the drives, the mount-all script does not mount them. I may take some time tomorrow to see if I can debug this before moving on, but the real good-news is that ... we have "left orbit" and nothing can stop us now ...

I may need to spend this week-end sorting out some basic hardware stuff that will be machine specific to me, but I can assure you that we are definitely on solid-ground here.

We are as smart as a Fox and this baby really rocks.

If you'll give me a little moment, I've just got to "party-rock" one more time .....

I'll keep you updated over the next few days, but if I vanish for a few days that doesn't mean anything other than I have a life or things are going so well, that I really don't want to waste any time with short notes. If I fall into a hole, I will let you know....

Cheers, Brenton (chillin and party-rockin)

Re: The Disk Drive Array

Postby jbv » Fri Oct 19, 2012 10:04 am

Oh, I forgot to mention that "hot-plugging" SATA drives works too :D

This baby really rox

From where we are now, and what I have learned over the last few days/weeks, we are literally a hair's breadth away from a fully blown NAS. What may really blow you away is that we won't need much (if any) more RAM or disk space .... YIKES :twisted:

Re: The Disk Drive Array

Postby jbv » Sat Oct 20, 2012 8:38 am

Spent the day playing with udev and the docs are definitely up to Linux standards.

I've got the system auto-magically mounting and un-mounting Hot-Swap SATA drives, at the moment this is by volume and not by label as I want, but I'm still taking baby steps here.

The USB stick that FoxyRoxyLinux boots from is not mounted as /media/FoxyRoxy and instead is being mounted as /dev/sda1 and other USB sticks are being mounted by volume instead of label.

The issue here is that my udev rule is stepping on the "pmount" udev stuff we currently use to auto-mount USB sticks.

What is sending me nuts is that I can't work out how to create a udev rule (my rule) to exclude a specific match. In short, I want my udev rule to not kick in on any USB devices and I can't work out how to do it. I was pretty sure I had it at one point and it all seemed to work magically as intended, until the next reboot :(

My head is spinning at the moment, so it's time to call it quits. I hope this doesn't take long to sort out...

After giving this a little more thought, perhaps I should ignore this for a moment and just get it mounting the volumes by label as I want. If I can get this happening, then I can probably remove the "pmount" stuff and just let my own udev stuff do all drive auto-mounting. I think I might push down this path a little tomorrow.

At the moment, I'm only mounting NTFS drives in rw mode using ntfs-3g.
As I have two 160GB drives, I may repartition them so that they each have an 80GB NTFS partition and an 80GB XFS partition, as this will help me with the auto-mount stuff with different filesystems.

Cheers, Brenton

Re: The Disk Drive Array

Postby jbv » Sat Oct 20, 2012 10:13 am

I now have the two 160GB drives partitioned so that each drive has two 80GB partitions.
The first partition is NTFS; the second partition is XFS.

I can now confirm that I am seeing weird stuff and the drives are not getting the same mount point on a reboot as they do on a power-up. The solution to this will most probably be in mounting by Volume Label as I had planned from the start. With some luck, I will have this sorted tomorrow ...

Re: The Disk Drive Array

Postby jbv » Sun Oct 21, 2012 5:20 am

Sorted.

FoxyRoxyLinux now supports hot-swapping USB and SATA hard drives.
Drives are mounted in /media by the Partition Label.
All SATA and USB drives attached to the machine are also mounted at power-up.

If a device without a partition label is inserted, the volume partition ID is used.

In other words:

If you insert a drive with a single partition that does not have a label and the drive would traditionally be mounted as /dev/sdb1 then the drive will be mounted as /media/sdb1

If you insert a drive with a single partition that has a label of MYDISK it will be mounted as /media/MYDISK

Drives are automatically unmounted when removed.
If a drive with multiple partitions is removed, then all of the mounted partitions are unmounted.

There are a few areas that could do with some tweaking.

At the moment, the system checks to see if the partition is mounted (using the mount point it intends to use) and if it is, it will not mount the drive/partition. This could have implications if two drives have the same label, whereby the second drive/partition probably will not be mounted.
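The naming rule and the "already mounted" check described above can be sketched as follows. This is a Python illustration under stated assumptions, not the actual FoxyRoxyLinux script: the function name, its arguments, and the idea of passing in a set of mount points already in use are all hypothetical.

```python
import os

def mount_point_for(device, label, mounted, media_root="/media"):
    """Pick the mount point for a partition, or None if it is taken.

    device  -- device node such as "/dev/sdb1"
    label   -- partition label, or "" / None when there is none
    mounted -- set of mount points already in use
    """
    # Prefer the label; fall back to the device name (sdb1) when unlabelled.
    name = label if label else os.path.basename(device)
    target = os.path.join(media_root, name)
    if target in mounted:
        return None  # behaviour described above: skip, do not mount
    return target

if __name__ == "__main__":
    in_use = {"/media/MYDISK"}
    print(mount_point_for("/dev/sdb1", "", in_use))
    print(mount_point_for("/dev/sdc1", "MYDISK", in_use))
```

The second call shows the weak spot mentioned above: a second partition that carries an already-used label is simply skipped.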

There was a slight conflict with the previous USB auto-mount, and as a consequence, the injection for the new AutoMount removes the old auto-mount from 02-FoxyDesktop.squashfs when you run it. The accompanying injection also turns the power off on your machine, instead of just doing a "reboot" as previous injections have done. The reason for this is as follows:

Sorting this out also revealed another minor problem. The original .sqfs have a file /etc/mtab which should not be in the filesystem during boot as it can mess with the way the machine maps USB sticks if you have hard-drives attached. Therefore, the injection that accompanies this sqf removes this file from both 01-FoxyRoxy and 02-FoxyDesktop

The sqf with installation instructions will be posted in the FoxyRoxyLinux - Scripts section.

Over the next few days, I will be doing some benchmark tests on various ways to access the hard drives and working on our disk pooling stuff. I will also be working on some scripts to unmount all partitions on a Volume and spin-down the disk ready for removal.

Little side note: Not all SATA controllers support port-multipliers, or at least Linux does not support port-multipliers on all chipsets. My main FoxyRoxyLinux development machine has 4 x SATA ports, but the 2 x SATA ports on the Intel NM10 chipset are not supported with port-multipliers. This means that I can only plug 12 HDDs into this machine instead of the 20 I had hoped for.
So, I will probably just plug 10 into the machine as it would be a waste of a hot-swap bay for just 2 HDDs
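The drive counts above are simple arithmetic, sketched here in Python. It assumes 5-port multipliers as listed earlier in this thread, and that a port without multiplier support still takes one drive directly. The function name is made up.

```python
def max_drives(ports_with_multiplier, ports_without, drives_per_multiplier=5):
    """Multiplier-capable ports fan out; the remaining ports take one drive each."""
    return ports_with_multiplier * drives_per_multiplier + ports_without

if __name__ == "__main__":
    print(max_drives(4, 0))  # the original hope: all four ports fan out
    print(max_drives(2, 2))  # two NM10 ports limited to one drive each
    print(max_drives(2, 0))  # only the multiplier-capable ports used
```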

Cheers, Brenton

Re: The Disk Drive Array

Postby jbv » Sun Oct 28, 2012 4:55 am

While getting the spindown script right, I found some very strange things with BASH that were not behaving as the documentation indicates they should. Specifically, accessing strings in arrays by the array element. I ended up doing my own thing here. I also learned that the output of mount should not really be trusted, and that you should use /proc/mounts instead.
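On the /proc/mounts point: on systems of this era the mount command typically reports from /etc/mtab, an ordinary file that can go stale, whereas /proc/mounts is generated by the kernel. A minimal Python sketch of reading it is below. The function names are hypothetical, and only the \040 (space) escape is decoded, which is an intentional simplification.

```python
def parse_mounts(text):
    """Parse /proc/mounts content into (device, mount_point, fstype) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fstype = fields[:3]
        # The kernel writes a space inside a path as the octal escape \040.
        mount_point = mount_point.replace("\\040", " ")
        entries.append((device, mount_point, fstype))
    return entries

def is_mounted(mount_point, mounts_file="/proc/mounts"):
    """True if something is currently mounted at mount_point."""
    with open(mounts_file) as handle:
        return any(mp == mount_point for _, mp, _ in parse_mounts(handle.read()))

if __name__ == "__main__":
    print(is_mounted("/"))
```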

I may revisit the auto-mount/dismount scripts and change them when I add the capability for them to configure hard drives on insertion and configure the drive pooling. Both of these things will be best handled by having a text configuration file. I am currently thinking of creating a text file /etc/driveconfig or similar to allow these things to be easily setup and changed.

I learned another little thing over the last few days, and right now I am not sure if it is a global thing or specific to VFAT. If you use the mv command with the -f switch to rename a file and the destination filename exists, I would have expected the mv command to delete the existing destination file before renaming/moving the new file. It seems that it does not do this. It appears that it does something to the name of the file that is about to be replaced and then renames the file you have told it to mv. This leaves all of the old file in a state where it can't be seen or read by the system, yet the space is not made free or available either. This means that after a while, you end up with a lot of "missing" space on the drive. The only way to recover that space (with a USB stick) is to put it into a Windows machine and do a "disk check" on it.

The implications for FoxyRoxyLinux are that I will need to revisit all of the xx-save scripts and look to see if a backup already exists before doing a rename. If a backup exists, I will need to delete the old backup before creating a current backup. At least, I now know why Windows was reporting my USB sticks as being suspect after I had been using the stick to work on FoxyRoxy.
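The "delete the old backup first, then rename" order can be sketched like this in Python. It is only an illustration of the sequence described above: the xx-save scripts are shell scripts whose contents are not shown in this thread, and the function name and the ".bak" suffix are assumptions.

```python
import os

def make_backup(path, backup_suffix=".bak"):
    """Rename path to path + suffix, deleting any older backup first."""
    backup = path + backup_suffix
    if os.path.exists(backup):
        os.remove(backup)  # free the old copy explicitly before renaming
    os.rename(path, backup)
    return backup
```

Removing the old backup as a separate step means the rename never has to overwrite an existing file, which is the case that appeared to leak space.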

I've tested the spindown script pretty extensively. So far, no real problems or issues. It properly handles all cases I've thrown at it, including a potentially ambiguous volume label that could exist on more than one physical drive, and warns that it can only spindown one spindle. If you give it a single partition label and there are multiple partitions on the spindle, it shows you all of the partitions and volumes it is going to unmount and waits for a keypress before doing anything. My 5-in-3 enclosures also turn an LED on when the drive is spun-down and in "sleep" mode, which is neat.

I think I will play (manually) with the drive pooling next, then revisit the auto-mount stuff and double-check what happens if you try to mount two partitions with the same label, as I'm not 100% convinced that this is really rock-solid yet. Once I've got the basics of the disk-pooling right, and double-triple-quadruple checked the auto-mount stuff, I will look to make the auto-mount semi-intelligent and self-extending to do stuff like configure the hard-drive sleep timeouts and related stuff on insertion.

As this will really be a simple extension to adding a drive to the disk-pool, that is probably the time and place to get it right.

Cheers, Brenton

Re: The Disk Drive Array

Postby jbv » Sat Nov 03, 2012 9:49 am

What a day. I've been to hell and back.

I managed to trash the partition on my primary development USB stick. Fortunately, I realised I had done this before I rebooted to test something else and was able to copy the USB stick (with the last week's work) onto a hot-swap hard drive, then make a fresh USB stick and copy from the hard drive back to the USB stick, so nothing was lost.

I have the auto-mount stuff working absolutely perfectly and it is now self-extending too. From here it will be somewhat straightforward to auto-magically add and remove drives (or partitions) to and from our mediapool and also tweak stuff like spindown time etc on a drive by drive basis.

I even documented it all and very technically explained how the auto-mount stuff works in fine detail. As I was trying to be precise, I opened a few more tabs to check that I was using the correct nomenclature and had the terms right for things like options/switches etc. It seems I was on one tab for too long, because when I came back, the forum had logged me out and the hour it took to create the detailed explanation was wasted. I'm not up to doing it again tonight, as I think from here I need a beer.

Brief Update on current status:

FoxyRoxyLinux will now Automatically mount all drives and partitions at power-up.
All filesystems are mounted in rw mode. The following filesystems are supported:

ext2
ext3
ext4
xfs
fat16
fat32
ntfs

Hot-Swap works perfectly - insertion and removal
Any drive can be manually spun-down for removal
Removing a drive properly unmounts everything regardless of whether the drive has been manually "spun-down" or just "yanked" from the array.

Adding and Removing drives from the mediapool is working perfectly (manually)
Hooks are in-place for automatically adding drives/partitions to a mediapool
The mediapool can consist of drives/partitions with different file-systems
- you can mix and match the file-system
- this means you can have any file-system or a mix of file-systems in the drivepool
The mediapool overlays all drives/partitions into a single mount-point
The mediapool automatically places files into the drive with the most space if more than one drive has the same directory

Drive pooling is done using AUFS, so there should be next to no overhead in terms of memory usage, nor much of a performance hit, as this is done at the kernel level. I will play around with some benchmark stuff a little later, once the drive-pooling is automated.
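For anyone wondering what an AUFS pool mount looks like, here is a hedged Python sketch that only builds the command line and does not run it. The br: branch list and the create=mfs (most free space) policy are standard aufs mount options, but the thread does not show the options actually used, and the pool path /media/mediapool, the branch paths, and the function name are all assumptions.

```python
def aufs_mount_command(branches, pool="/media/mediapool", policy="mfs"):
    """Build the argument list for mounting an aufs union over the branches."""
    if not branches:
        raise ValueError("an aufs mount needs at least one branch")
    # Every branch is writable here; aufs also accepts =ro for read-only ones.
    branch_spec = ":".join(f"{path}=rw" for path in branches)
    options = f"br:{branch_spec},create={policy}"
    return ["mount", "-t", "aufs", "-o", options, "none", pool]

if __name__ == "__main__":
    print(" ".join(aufs_mount_command(["/media/DISK1", "/media/DISK2"])))
```

With create=mfs, a new file goes to the writable branch with the most free space, which matches the behaviour reported in the list above.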

FoxyRoxyLinux and all of the above now fully support GUID Partition Table (GPT) disks
- This means that we should be able to use drives larger than 2TB
... Added bonus, this also means we can (should be able to) use native 4k sectors without translation
... That means not only will 3TB plus Hard Disks work, they will be FAST

I will buy a 3TB drive sometime next week and test/confirm this.
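The 2TB ceiling mentioned above comes from the old MBR partition table, which stores sector numbers in 32-bit fields. A quick Python check of the arithmetic (the function name is made up for illustration):

```python
TIB = 2 ** 40

def mbr_limit_bytes(sector_size=512):
    """Largest disk an MBR partition table can address: 2**32 sectors."""
    return (2 ** 32) * sector_size

if __name__ == "__main__":
    print(mbr_limit_bytes(512) / TIB)            # limit with 512-byte sectors, in TiB
    print(mbr_limit_bytes(4096) / TIB)           # limit with 4096-byte sectors, in TiB
    print(3 * 10 ** 12 > mbr_limit_bytes(512))   # does a 3TB drive exceed it?
```

With 512-byte sectors the limit works out to 2 TiB, so a 3TB drive needs GPT to be fully usable.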

Sorry about the lack of "tech notes", which is what I wanted this area to be.

If you want some light-reading to see what I've been playing with, and have working perfectly in FoxyRoxyLinux, complete with auto-mount, hot-swap, and file-system independence ... have a look at these links:
IBM - Make the most of large drives with GPT and Linux
IBM - Linux on 4KB-sector disks

I really did write some great stuff, and it got eaten.
I'm a little depressed about that right now, and am just not up to re-doing it.
Right now, I really do feel as though I have just crawled my way out of hell.
So I'm going to shut down for the night and have a beer or two.

Cheers, Brenton

Re: The Disk Drive Array

Postby jbv » Sat Nov 03, 2012 11:02 am

A picture tells a thousand words ...
[Attachment: screenshot (Large).jpg, 124.95 KiB]

Re: The Disk Drive Array

Postby jbv » Sun Nov 04, 2012 2:42 am

I've just been told about exFAT/64 and after a quick peek in our favorite repos,
there is a pretty recent build in squeeze-backports.
Needless to say, I've added it and it will be in our next injection.

At the moment, I am trying to work out how to best handle the configuration file we will need for our drive-pooling.
The reason for the configuration file is that this will be the cleanest and easiest way for people to configure their own drive-pool and have it look however they want it to look.

At the moment, I am thinking of placing a single configuration file drivepool.conf in /etc/udev
My logic behind placing it there is that everything is being driven by udev anyway.

The basic principle and flow of things is as follows:

/etc/udev/rules.d/10-disk-hotswap.rules contains the rules that, when triggered by udev, fire off ...

/etc/udev/scripts/disk-hotmount.sh at power-on and when a drive is hot-inserted
/etc/udev/scripts/disk-hotdismount.sh when a drive is hot-removed

The above two scripts have the following helper or extension scripts:

/etc/udev/scripts/postmount_partition.sh is called when a new partition is mounted
/etc/udev/scripts/postmount_volume.sh is called when all partitions on a disk have been mounted
/etc/udev/scripts/predismount.sh is called by disk-hotdismount.sh before removing all the mount points

With the drive-pooling, the system should automatically add a drive to the pool when it is inserted or at power-up (both are sort-of the same thing). However, a drive may have more than one partition on it and you may want all or none of the partitions in the drive-pool. Another scenario is that you may only want a particular sub-directory mounted into the drive-pool. This is where our configuration file starts to come into play.

As the script postmount_partition.sh gets called when a partition has been mounted, this is the logical place to work out if the partition we have just mounted needs to be added to the drive-pool and if so, what rules apply.
To do this, the postmount_partition.sh script needs to know what, if anything, it needs to do.
Therefore it seems logical that it can just get the info from /etc/udev/drivepool.conf
This file will be a pure text-file so that you can easily add/change the configuration with a text editor.
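Since the layout of drivepool.conf is still undecided, the following Python sketch is purely a strawman to make the discussion concrete. The format is invented for illustration: one rule per line, a partition label followed by an optional sub-directory to pool, with # starting a comment. The function names are hypothetical too, and the real helper would be a shell script.

```python
import os

def parse_drivepool_conf(text):
    """Parse the hypothetical format: 'LABEL [subdirectory]' per line."""
    rules = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        parts = line.split(None, 1)
        label = parts[0]
        # Strip any leading slash so the path stays under the mount point.
        subdir = parts[1].strip().lstrip("/") if len(parts) > 1 else ""
        rules[label] = subdir
    return rules

def branch_for(label, rules, media_root="/media"):
    """Directory to add to the pool for this partition, or None to skip it."""
    if label not in rules:
        return None  # partition is mounted but stays out of the pool
    return os.path.join(media_root, label, rules[label]).rstrip("/")

if __name__ == "__main__":
    conf = "# label   sub-directory\nMOVIES\nMIXED     media/video\n"
    rules = parse_drivepool_conf(conf)
    print(branch_for("MOVIES", rules), branch_for("MIXED", rules))
```

A partition whose label is not listed is mounted as usual but left out of the pool, which covers the "all or none of the partitions" and "only a particular sub-directory" cases described above.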

There are a few "hoops" to jump through when un-mounting a drive or volume. Before the drive can be removed, the system needs to unmount any partitions that were mounted and this is exactly what /etc/udev/scripts/disk-hotdismount.sh does. However, with drive-pooling, we need to remove the mount point(s) from the drive-pool before we remove the mount points themselves. This is what /etc/udev/scripts/predismount.sh will do. However, there are a few other tricks that need to be performed while removing drives (or partitions) from the pool and we need to know if this is the last drive (partition) in the pool, because if it is, there are some special handling techniques that we need to be aware of. To help in this area, I am going to need to create a temporary file so that our predismount script can keep track of what is and isn't mounted in the drive-pool.

For this, I am thinking it will be best to create a small file /var/log/drivepool.map
As the /var/log directory is really a ram-drive, this won't write anything to a physical disk and we don't need to keep any of this during power-off/restart, so in many ways it is the ideal place for it.
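One possible shape for that tracking file, sketched in Python: one pooled branch path per line, with the remove step reporting whether the pool has just become empty. The file format and all function names here are assumptions for illustration; the thread does not define the contents of drivepool.map.

```python
import os

def read_map(map_file):
    """Return the list of branches recorded in the map file (may be empty)."""
    if not os.path.exists(map_file):
        return []
    with open(map_file) as handle:
        return [line.strip() for line in handle if line.strip()]

def write_map(map_file, branches):
    """Rewrite the map file with one branch path per line."""
    with open(map_file, "w") as handle:
        handle.writelines(branch + "\n" for branch in branches)

def add_branch(map_file, branch):
    """Record a branch as pooled; adding it twice has no further effect."""
    branches = read_map(map_file)
    if branch not in branches:
        branches.append(branch)
        write_map(map_file, branches)

def remove_branch(map_file, branch):
    """Remove a branch; return True if it was the last one in the pool."""
    branches = read_map(map_file)
    if branch not in branches:
        return False
    branches.remove(branch)
    write_map(map_file, branches)
    return not branches
```

The True return from remove_branch is the hook for the "last drive in the pool" special handling mentioned above.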

I will also need to revisit the spindown script and have it call some of this stuff now.
When I first created spindown I wanted it to be somewhat of a stand-alone thing, although drive-pooling adds another layer of complexity and it really needs to work properly with this, so I will also "tweak" it a little.

On the drive-pooling itself, I have done quite a bit of testing including sharing the drive pool and reading/writing to/from the drive-pool from Windows machines. I haven't tested it from NFS yet, although I will soon. I have tested the write balancing in so far as it does put new files on the volume/disk that has the most space if both drives have the same directory structure, and the balancing thing is working quite well. I haven't yet tested to see what happens when one drive is full and there isn't another drive with the same branch (directory), so I don't know what will happen then, but I will test this before we are done.

Anyway, that is all a little way off yet. Right now, I'm still umming-and-ahhing about these config files - what to put in them and the best way to lay them out internally etc. If anyone has any thoughts/comments/suggestions, then please make them now.

Cheers, Brenton
