From: June Mullins (junemullins, earthlink dot net)
Date: 2001.11.03 - 14.05 MST

Next message: Perry and Lorae Merritt: "Re: ARVM discussion"
Previous message: June Mullins: "Re: ARVM discussion"
Next in thread: Brian Kreider: "RE: Opie components"
Reply: Brian Kreider: "RE: Opie components"
Reply: Perry and Lorae Merritt: "Re: Opie components"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Here's a list of components for Opie. I've added some highlights (or
lowlights) and some issues. Some of this might look strikingly familiar
to you.

I have not written a formal ARVM architecture doc yet. The gist of my
thoughts about how ARVM and snapshot are implemented is outlined in the
bullets.

I'm also attaching this as a Word doc for people who prefer it.

June

1 Opie Components
1.1 RAID5
· Implements RAID5 in software.
· Minimum RAID set size = 2 ISUs
· When new ISUs are added, the entire RAID group is morphed
· Need to work out data integrity issues around morphing
· How to handle different geometries (e.g., adding a bigger disk to a
  RAID group)
· Always add to existing RAID group or form new one with different geometry?
· RAID5 group = logical volume (or part of volume if multiple extents)
· Support multiple logical volumes?
· Implemented by fixed mapping tables
· Any need for multiple extents per volume?
· Any need for multiple logical volumes per physical volume?
· Do we still want to use MD (RAID0) at the ISU level?
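
To make the parity idea behind the RAID5 bullets concrete, here is a minimal sketch of XOR parity over one stripe. It is only an illustration: the function names are made up, it ignores parity rotation across members and the mapping tables, and it treats each ISU as holding one equal-sized block.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must be the same size")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block (hypothetical layout)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]

def rebuild_member(stripe, lost_index):
    """Recompute the block of one lost member from all surviving members."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

With a single data block, the parity block is identical to the data, so a 2-ISU set under this scheme behaves like a mirror.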
1.2 NBD
· Allows a networked device to be accessed as a block device.
· Need to tackle NBD performance issues
· Consider iSCSI?
1.3 Snapshot Copy
· Snapshots are for backup purposes only
· Can we hide snapshots from the user, since we use them only for backup?
· Implement via hashed mapping table - mark the mapping tables of source
  and snap as snapshot. On a write request to the source, first copy the
  segment for the snapshot, then add it to the hash, then satisfy the
  write request to the source.
· Since we are in charge of backup, we delete the snapshot as soon as the
  backup is done
· What are the interactions between morphing and snapshot?
· Space must be reserved for copy-on-write data
· Would it be reasonable to force write-through for copy-on-write data?
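
The copy-on-write sequence in the bullets above could look roughly like the sketch below. It is a toy model under several assumptions: a Python dict stands in for the hashed mapping table, there is at most one snapshot at a time, and the class and method names are invented for illustration.

```python
class SnapshotVolume:
    """Toy source volume with at most one copy-on-write snapshot."""

    def __init__(self, segments, reserve):
        self.segments = list(segments)  # live source data, one entry per segment
        self.reserve = reserve          # max segments the copy-on-write area may hold
        self.cow = None                 # hash: segment index -> preserved old data

    def take_snapshot(self):
        self.cow = {}

    def delete_snapshot(self):
        # Called as soon as the backup is done; frees the reserved space.
        self.cow = None

    def write(self, index, data):
        if self.cow is not None and index not in self.cow:
            if len(self.cow) >= self.reserve:
                raise RuntimeError("copy-on-write reserve exhausted")
            # First copy the old segment for the snapshot and add it to the hash...
            self.cow[index] = self.segments[index]
        # ...then satisfy the write request to the source.
        self.segments[index] = data

    def read_snapshot(self, index):
        if self.cow is None:
            raise RuntimeError("no snapshot active")
        # Segments never written since the snapshot are read from the source.
        return self.cow.get(index, self.segments[index])
```

Only the first write to a segment after the snapshot costs a copy, which is why the reserved space bounds how many distinct segments may change while a backup is running.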
1.4 Backups
· Initiated one-time by user or as scheduled jobs?
· No HSM yet
· What is the backup medium? If disk, should be RAID5 protected also.
1.5 File System
· Needs to support large file systems
· Should be journaled
· Is ACL support necessary? POSIX or Windows ACLs?
1.6 Active / Passive Heads
· Need to send config changes from active to passive head
· Need to send Snapshot changes from active to passive head
  · Mapping hash table
  · Copy-on-write data (or force write-through - see above)
1.7 Configuration Management
1.7.1 Persistent store
· Do we need to keep multiple permanent copies of the configuration?
· Should ISUs store their own configurations (or everybody's
  configuration)?
· Do we allow for spare (or unassigned) ISUs?
· Some configuration data we need:
  · ISU MAC address, name(?), and IP address
  · ISU function (active, unassigned)
  · If active, which RAID5 group(s) is the ISU part of?
· Each ISU should also know at least the head's IP address or something
  to identify its head.
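
As a strawman for the persistent store, the configuration data listed above could be kept as one record per ISU. The field names, the JSON format, and the rule that an unassigned ISU belongs to no RAID5 group are all assumptions made for this sketch, not decisions.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class IsuConfig:
    mac: str                       # ISU MAC address
    ip: str                        # ISU IP address
    head_ip: str                   # something to identify the ISU's head
    name: str = ""                 # optional, see the "name(?)" bullet
    function: str = "unassigned"   # "active" or "unassigned"
    raid5_groups: list = field(default_factory=list)

    def __post_init__(self):
        if self.function not in ("active", "unassigned"):
            raise ValueError("unknown ISU function: " + self.function)
        # Assumption: only active ISUs are members of a RAID5 group.
        if self.function == "unassigned" and self.raid5_groups:
            raise ValueError("an unassigned ISU cannot belong to a RAID5 group")

def dump_config(isus):
    """Serialize all ISU records to text for the persistent store."""
    return json.dumps([asdict(i) for i in isus], indent=2, sort_keys=True)

def load_config(text):
    """Rebuild (and re-validate) the ISU records from stored text."""
    return [IsuConfig(**entry) for entry in json.loads(text)]
```

A text format like this would make it cheap to keep several copies, whether on the heads or on every ISU.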
1.7.2 Existing device discovery
· There needs to be a mechanism for devices that are already configured
  to announce themselves.
· Needed both at boot time and when a previously broken ISU becomes
  functional again.
1.7.3 New device discovery and configuration
· New, not yet configured devices need to be able to announce
  themselves. The head has to be able to recognize that a new device has
  been added to the system.
· How do we do this? We can make the head a DHCP server and use DHCP -
  but we were not able to figure out how to get from DHCP discovering a
  new device to getting that fact over to our code.
· How about writing our own equivalent of a simplified DHCP - simply
  have the ISUs broadcast to a well-known port behind which is our code?
  Our code would dole out from a (user-specified?) list of reserved IP
  addresses. Since the assumption is that the ISUs are isolated behind a
  switch, this should work.
· We might also make the user configure the ISU IP addresses via
  something like the serial port.
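
The head-side half of the "simplified DHCP" idea might be as small as the sketch below. It leaves out the broadcast transport entirely and only shows the bookkeeping: doling out addresses from a reserved list, keyed by MAC address, and telling the caller whether the device is new. The class name and the return convention are hypothetical.

```python
import ipaddress

class AddressPool:
    """Doles out addresses from a reserved list, one per ISU MAC address."""

    def __init__(self, reserved):
        # Parsing with ipaddress rejects malformed entries in the list up front.
        self.free = [ipaddress.ip_address(a) for a in reserved]
        self.leases = {}  # MAC address -> assigned IP address

    def handle_announce(self, mac):
        """Return (address, is_new) for an ISU that announced itself."""
        mac = mac.lower()
        if mac in self.leases:
            # Already known: a rebooted or repaired ISU gets its old address.
            return self.leases[mac], False
        if not self.free:
            raise RuntimeError("reserved address list exhausted")
        address = self.free.pop(0)
        self.leases[mac] = address
        return address, True
```

Because our own code answers the announcement, the `is_new` flag is exactly the "new device was added" fact that was hard to get out of a stock DHCP server.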
1.7.4 Configuration of heads
· How do heads get their IP addresses?
· Need to configure which is active and which is passive
1.8 System Initialization
· Each ISU can boot independently, but the file system cannot be started
  until sufficient ISUs are operational.
1.8.1 ISU initialization
· Each ISU starts its own RAID0 configuration (if used).
· Then starts its nbd servers (if used).
· Then reports status to the head.
1.8.2 Active Head Initialization
· Once sufficient ISUs are operational, the active head starts its nbd
  clients.
· Then activates its RAID5 volumes
· Then activates share export.
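
One possible reading of "sufficient ISUs" and the active head's start-up order is sketched below. The threshold is an assumption (a RAID5 group is allowed to start with at most one member missing), and the function names and step strings are placeholders for whatever really starts the nbd clients, volumes, and shares.

```python
def group_can_start(members, operational):
    """Assumption: a RAID5 group may start with at most one member missing."""
    missing = [m for m in members if m not in operational]
    return len(missing) <= 1

def head_startup_steps(groups, operational):
    """Return the ordered start-up steps, or [] if the head must keep waiting.

    groups maps a RAID5 group name to the list of its member ISUs;
    operational is the set of ISUs that have reported status to the head.
    """
    if not all(group_can_start(m, operational) for m in groups.values()):
        return []
    steps = ["start nbd clients"]
    steps += ["activate RAID5 volume " + name for name in sorted(groups)]
    steps.append("activate share export")
    return steps
```

For example, `head_startup_steps({"r0": ["isu1", "isu2", "isu3"]}, {"isu1", "isu2"})` returns the three steps in order, while the same call with only `isu1` operational returns an empty list.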
1.8.3 Passive Head Initialization
· ?????
1.9 Status Monitor
· Responsible for detecting changes in ISU status, or ISU status
  transitions.
· Responsible for detecting failures in active or passive head??
· Individual ISUs monitor their own status (node booting, normal, hard
  disk failure, MD broken, NBD server or client down)
· Network connectivity is monitored (or do we just rely on the nbd layer
  for this?)
· Nodes report their status to the big kahuna status monitor (maybe the
  recovery component).
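
A rough sketch of the central status monitor follows: nodes report their status, and only actual transitions are passed on to listeners such as the recovery component. The status values come from the list above; the class, the listener callback, and the tuple shape are invented for illustration.

```python
from enum import Enum

class IsuStatus(Enum):
    BOOTING = "node booting"
    NORMAL = "normal"
    DISK_FAILURE = "hard disk failure"
    MD_BROKEN = "MD broken"
    NBD_DOWN = "NBD server or client down"

class StatusMonitor:
    """Remembers the last status per ISU and forwards only transitions."""

    def __init__(self):
        self.last = {}        # ISU name -> last reported IsuStatus
        self.listeners = []   # callables taking (isu, old_status, new_status)

    def report(self, isu, status):
        previous = self.last.get(isu)
        if previous is status:
            return None  # repeated report, not a transition
        self.last[isu] = status
        transition = (isu, previous, status)
        for listener in self.listeners:
            listener(transition)
        return transition
```

The first report from an ISU shows up as a transition from `None`, which also covers a previously unknown or broken ISU announcing itself.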
1.10 Recovery
· Responsible for initiating actions to recover from "degraded" mode.
· If a hot spare is available, it must be activated automatically
  after a user-defined timeout period.
· Responsible for initiating failover to passive node if active node fails
· CIFS issues - in-flight transactions lost?
· Buffer cache issues
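
The hot-spare rule could be sketched as below: a failed ISU is replaced by a spare only after it has stayed failed for the user-defined timeout. Times are plain numbers in seconds supplied by the caller, and the class and method names are hypothetical; head failover is not modeled.

```python
class RecoveryPolicy:
    """Decides when a failed ISU should be replaced by a hot spare."""

    def __init__(self, spare_timeout, spares):
        self.spare_timeout = spare_timeout  # user-defined, in seconds
        self.spares = list(spares)          # unassigned ISUs usable as spares
        self.failed_since = {}              # ISU name -> time the failure was first seen

    def isu_failed(self, isu, now):
        # Keep the earliest failure time if the failure is reported again.
        self.failed_since.setdefault(isu, now)

    def isu_recovered(self, isu):
        self.failed_since.pop(isu, None)

    def tick(self, now):
        """Return (failed_isu, spare) pairs that should be activated now."""
        actions = []
        for isu, since in sorted(self.failed_since.items()):
            if now - since >= self.spare_timeout and self.spares:
                actions.append((isu, self.spares.pop(0)))
        for isu, _ in actions:
            del self.failed_since[isu]
        return actions
```

An ISU that comes back before the timeout expires never consumes a spare, and a failure with no spare available simply stays pending.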
1.11 Dual Head Synchronization
· Responsible for sending configuration updates (and buffer cache
  updates?) from active to passive head
1.12 Shares Management
· What types of shares must we support? CIFS? NFS (2 or 3)? Apple?
· Support CIFS via security=domain?
1.13 GUI
· What needs to be supported in the GUI?
  · Head configuration (active or passive)
  · Add ISU (and make it active or unassigned)
  · Administer backups
  · Administer shares
  · Display status
  · Display storage usage
  · Display logs
  · ?????

application/msword attachment: OpieComponents.doc

-- This is the antera mailing list. To unsubscribe, email majordomo, cryptofreak dot org with message body `unsubscribe antera'. Or, for more information, visit http://www.cryptofreak.org/.

This archive was generated by hypermail 2b30 : 2001.11.04 - 04.02 MST