From: Perry and Lorae Merritt (plmerritt, hypermall dot net)
Date: 2001.11.03 - 16.01 MST
X-Mailer: Microsoft Outlook Express 5.50.4522.1200

Wow, quite the list. Here are my comments. You can see them annotated as --------> in the text. If I didn't say anything, it was because I didn't need to or I agreed.

P

----- Original Message -----
From: "June Mullins" <junemullins, earthlink dot net>
To: <antera, cryptofreak dot org>
Sent: Saturday, November 03, 2001 2:05 PM
Subject: Opie components

Here's a list of components for Opie. I've added some highlights (or lowlights) and some issues. Some of this might look strikingly familiar to you. I have not written a formal ARVM architecture doc yet. The gist of my thoughts about how ARVM and snapshot are implemented is outlined in the bullets. I'm also attaching this as a Word doc for people who prefer it.

June

1 Opie Components

1.1 RAID5
· Implements RAID5 in software.
· Minimum RAID set size = 2 ISU's.
---------> We'll probably get hit with creating an unprotected single storage unit and then need to have the ability to stripe it if a second storage unit is added.
· When new ISU's are added, the entire RAID group is morphed
· Need to work out data integrity issues around morphing
· How to handle different geometries (e.g., adding a bigger disk to a RAID group)
---------> I'd say that for the initial release, we can make it like every other RAID system. Assume the new storage units are the same size as the older ones.
· Always add to existing RAID group or form new one with different geometry?
· RAID5 group = logical volume (or part of volume if multiple extents)
· Support multiple logical volumes?
· Implemented by fixed mapping tables
· Any need for multiple extents / volume?
· Any need for multiple logical volumes per physical volume?
· Do we still want to use MD (RAID0) at the ISU level?
---------> Interesting, need to think about that one.

1.2 NBD
· Allows a networked device to be accessed as a block device.
· Need to tackle NBD performance issues
· Consider iSCSI?
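As a rough illustration of the parity scheme behind the RAID5 items in 1.1 (a sketch only, not a description of how Opie or ARVM would implement it), the Python below computes XOR parity over the data chunks of one stripe and rebuilds a lost chunk from the survivors. The names xor_blocks, make_stripe, and rebuild are hypothetical, and the parity chunk sits at a fixed position for brevity, whereas real RAID5 rotates it across the units. Note that with the minimum set of 2 units a stripe holds one data chunk and one parity chunk, so the parity is simply a copy of the data.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must be the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_stripe(data_chunks):
    """Return the data chunks followed by one parity chunk (hypothetical layout)."""
    return list(data_chunks) + [xor_blocks(data_chunks)]


def rebuild(stripe, missing_index):
    """Recompute the chunk at missing_index from all the other chunks."""
    survivors = [c for i, c in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


# Three data chunks plus parity: losing any one chunk is recoverable.
stripe = make_stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
assert rebuild(stripe, 1) == b"\x10\x20"

# Two-unit set: one data chunk, and the parity chunk mirrors it.
two_unit = make_stripe([b"\xab\xcd"])
assert two_unit[1] == b"\xab\xcd"
```

The same rebuild step is what a morph from 2 units to 3 would have to stay consistent with while chunks are being moved, which is one way to frame the data integrity question raised in 1.1.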
1.3 Snapshot Copy
· Snapshots are for backup purposes only
· Can we hide snapshots from the user, since we use them only for backup?
---------> I think we can, but don't we need to expose it to the toolkit interface anyway so that a backup can instantiate the snapped volume?
· Implement via hashed mapping table - mark mapping table of source and snap as snapshot. On receiving a write request to the source, first copy the segment for the snapshot, then add it to the hash. Then satisfy the write request to the source.
· Since we are in charge of backup, we delete the snapshot as soon as backup is done
· What are the interactions between morphing and snapshot?
---------> Doh. Why are you bringing up hard stuff like this?
· Space must be reserved for copy-on-write data
· Would it be reasonable to force write-through for copy-on-write data?
---------> Since the volume manager is doing this, isn't it write through by default? I assume the ARVM will need a buffer to read the original data into and then write it to a new place, then write the updated data to the original spot.

1.4 Backups
· Initiated one-time by user or as scheduled jobs?
---------> Yes
· No HSM yet
---------> Correct
· What is the backup medium? If disk, should be RAID5 protected also.

1.5 File System
· Needs to support large file systems
· Should be journaled
· Is ACL support necessary? POSIX or Windows ACL's?

1.6 Active / Passive Heads
· Need to send config changes from active to passive head
· Need to send Snapshot changes from active to passive head
· Mapping hash table
· Copy-on-write data (or force write-through - see above)
---------> You're not thinking you need to send the copy-on-write data to the passive head are you?

1.7 Configuration Management

1.7.1 Persistent store
· Do we need to keep multiple permanent copies of the configuration?
---------> I think it needs to be saved in the RAID-protected volume in a well defined place. That way any ISU can pick it up.
· Should ISU's store their own configurations (or everybody's configuration)?
---------> There should only be one configuration per system.
· Do we allow for spare (or unassigned) ISU's?
---------> Yes
· Some configuration data we need:
· ISU MAC address, name(?), and IP address
---------> We need to know every path to every other component for redundancy and failure detection.
· ISU function (active, unassigned)
· If active, which RAID5 group(s) is ISU part of?
· Each ISU should also know at least the head's IP address or something to identify its head.
---------> Or heads

1.7.2 Existing device discovery
· There needs to be a mechanism for devices that are already configured to announce themselves.
· Needed both at boot up time and when a previously broken ISU is now functional.

1.7.3 New device discovery and configuration
· New, not yet configured devices need to be able to announce themselves. The head has to be able to recognize that a new device has been added to the system.
· How do we do this? We can make the head a DHCP server and use DHCP - but we were not able to figure out how to get from DHCP discovering a new device to getting that fact over to our code.
---------> Could we do a couple of things:
---------> When a new device comes up, have it broadcast a message looking for a head.
---------> In the event it comes up before the head and the broadcast is not heard, can we have the head broadcast a search message when it comes up? Have it listen for a head to respond.
· How about writing our own equivalent of a simplified DHCP - simply have the ISU's broadcast to a well-known port behind which is our code? Our code would dole out from a (user-specified?) list of reserved IP addresses. Since the assumption is that the ISU's are isolated behind a switch, this should work.
---------> That might work too.
· We might also make the user configure the ISU IP addresses via something like the serial port.
---------> Kinda gross

1.7.4 Configuration of heads
· How do heads get their IP addresses?
---------> Can we do an IPSET kinda thing?
· Need to configure which is active and which is passive
---------> Done through the GUI

1.8 System Initialization
· Each ISU can boot independently, but the file system cannot be started until sufficient ISU's are operational.

1.8.1 ISU initialization
· Each ISU starts its own RAID0 configuration (if used).
· Then starts its nbd servers (if used).
· Then reports status to the head.

1.8.2 Active Head Initialization
· Once sufficient ISU's are operational, the active head starts its nbd clients.
· Then activates its RAID5 volumes
· Then activates share export.

1.8.3 Passive Head Initialization
---------> I think this is the same as the active head.
· ?????

1.9 Status Monitor
· Responsible for detecting changes in ISU status, or ISU status transitions.
· Responsible for detecting failures in active or passive head??
---------> Both, the active head needs to know if the passive head is Abbey Normal as does the passive need to know about the active head.
· Individual ISU's monitor their own status (Node booting, normal, hard disk failure, MD broken, NBD server or client down)
· Network connectivity is monitored (or do we just rely on the nbd layer for this?)
---------> We're probably going to have to do a fair amount of testing to make sure the NBD is solid.
· Nodes report their status to big kahuna status monitor (maybe recovery component).

1.10 Recovery
· Responsible for initiating actions to recover from "degraded" mode.
· If hot spare available, it must activate hot spares automatically after a user-defined timeout period.
· Responsible for initiating failover to passive node if active node fails
· CIFS issues - in-flight transactions lost?
---------> Are you talking about losing a head?
· Buffer cache issues

1.11 Dual Head Synchronization
· Responsible for sending configuration updates (and buffer cache updates?)
from active to passive head

1.12 Shares Management
· What types of shares must we support? CIFS? NFS (2 or 3)? Apple?
---------> Yep
· Support CIFS via security=domain?
---------> Yep

1.13 GUI
· What needs to be supported in the GUI?
· Head configuration (active or passive)
· Add ISU (and make it active or unassigned)
· Administer backups
· Administer shares
· Display status
· Display storage usage
· Display logs
· ?????

--
This is the antera mailing list. To unsubscribe, email majordomo, cryptofreak dot org with message body `unsubscribe antera'. Or, for more information, visit http://www.cryptofreak.org/.
This archive was generated by hypermail 2b30 : 2001.11.04 - 04.02 MST