Sun cluster. SVM state databases

1. Place the cluster device group in maintenance state

# scswitch -m -D <device-group-name>

   

2. Manually take over the diskset.

# metaset -C take -f -s <disk-set-name>

3. Check the diskset rdb.

# metadb -i -s  <disk-set-name>

flags           first blk       block count
a m     luo t      16              1034            /dev/did/dsk/d2s7
a       luo t      1050            1034            /dev/did/dsk/d2s7
M                  unknown         unknown         /dev/did/dsk/d3s7
M                  unknown         unknown         /dev/did/dsk/d3s7

So half of the replicas are down: both replicas on d3s7 are flagged M, with unknown block information.
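The check above can be scripted. The following is a minimal sketch, not part of the original procedure: count_replicas is a hypothetical helper name, and it assumes that any of the capital flags W, M, D, F, S or R in the flags column marks a replica with errors, while lowercase flags mean a healthy replica. On Solaris, nawk or /usr/xpg4/bin/awk may be a safer choice than the default awk.

```shell
#!/bin/sh
# Summarize `metadb -i` output: count healthy and failed replicas per device.
# Assumption: a capital W, M, D, F, S or R in the flag columns marks an error.
# count_replicas is an illustrative name, not an SVM command.
count_replicas() {
    awk '
    # Only lines that end in a device path describe a replica.
    $NF ~ /^\/dev\// {
        dev = $NF
        flags = ""
        # Everything except the last three fields (first blk, block count,
        # device path) is treated as the flag columns.
        for (i = 1; i <= NF - 3; i++)
            flags = flags $i
        if (flags ~ /[WMDFSR]/) {
            bad[dev]++
        } else {
            good[dev]++
        }
        seen[dev] = 1
    }
    END {
        for (dev in seen)
            printf "%s good=%d bad=%d\n", dev, good[dev], bad[dev]
    }' | sort
}
```

It would be used as `metadb -i -s <disk-set-name> | count_replicas`. A device that reports good=0 is a candidate for the replica deletion in the next step.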

4. The quickest way to fix this is to delete the replicas on the faulty
disk (d3s7) from the rdb. In effect, this makes the replica set 100%
available from the good disk (d2s7).

# metadb -s <disk-set-name> -d /dev/did/dsk/d3s7

NOTE: To find out which physical disk d3s7 is, go to /cluster/did/ and list the DID mappings:

# scdidadm -L  | grep d3

   

3        node2:/dev/rdsk/c2t0d0   /dev/did/rdsk/d3
3        node1:/dev/rdsk/c2t0d0   /dev/did/rdsk/d3
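A plain `grep d3` also matches instances such as d30 or d31 on clusters with many DID devices. The sketch below matches the DID instance exactly; did_paths is a hypothetical helper name, and it assumes the three-column `scdidadm -L` layout shown above (instance number, node:physical path, DID path). It uses `awk -v`, so on Solaris nawk or /usr/xpg4/bin/awk is likely needed.

```shell
#!/bin/sh
# Print the node:physical-path entries for exactly one DID instance.
# did_paths is an illustrative name; input is `scdidadm -L` output on stdin.
did_paths() {
    # $1 is a DID instance name such as d3.
    # Compare the whole third column so d3 does not match d30 or d31.
    awk -v want="/dev/did/rdsk/$1" '$3 == want { print $2 }'
}
```

It would be used as `scdidadm -L | did_paths d3`.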

5. Release the diskset.

# metaset -C release -s <disk-set-name>

6. Bring the device group back online.

# scswitch -z -D <device-group-name> -h node1

7. The cluster should now come up all the way.
Run any additional commands needed to let the other node join the cluster.

8. Once the T3 is fixed, let LVM rescan the disks.

NOTE: metastat -i is available from LVM 4.2.1 and up.

# metastat -i -s <disk-set-name>

9. Add a new rdb on d3s7.

# metadb -s <disk-set-name> -a -c 2 /dev/did/dsk/d3s7

10. Re-enable ALL the diskset volumes that are in "Needs maintenance" state:

# metareplace -s <disk-set-name> -e d1 /dev/did/rdsk/d3s0
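The isolation part of the procedure (steps 1, 2, 4, 5 and 6) can be collected into a small dry-run wrapper. This is only a sketch under stated assumptions: the function name isolate_bad_replica, the DRY_RUN variable and the argument order are all hypothetical, and the commands and flags are copied from the steps above without change. By default it only prints what it would run.

```shell
#!/bin/sh
# Dry-run sketch of steps 1, 2, 4, 5 and 6 above.
# Commands are printed, not executed, unless DRY_RUN=0 is set.
# All names here (DRY_RUN, run, isolate_bad_replica) are illustrative.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

isolate_bad_replica() {
    # $1 = device group name, $2 = diskset name,
    # $3 = slice holding the faulty replicas, $4 = node to bring the group up on
    run scswitch -m -D "$1" &&
    run metaset -C take -f -s "$2" &&
    run metadb -s "$2" -d "$3" &&
    run metaset -C release -s "$2" &&
    run scswitch -z -D "$1" -h "$4"
}
```

For example, `isolate_bad_replica <device-group-name> <disk-set-name> /dev/did/dsk/d3s7 node1` prints the five commands in order. The `&&` chain stops at the first failing command when DRY_RUN=0, so a failed takeover is not followed by a replica deletion.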

============================================================

Recovery procedure if the diskset is used outside of SunCluster:

1. Force a takeover of the diskset:

# metaset -s <disk-set-name> -t -f

2. Delete the faulty replica:

# metadb -s <disk-set-name> -d <faulty-replica-slice>

3. Release the diskset:

# metaset -s <disk-set-name> -r

4. Take over the diskset normally:

# metaset -s <disk-set-name> -t

5. Fix the disk problem and follow steps 8 to 10 above.

from dhelios.blogspot.com

Yet another Solaris user
