Sun Cluster: SVM state databases

1. Place the cluster device group in maintenance state
# scswitch -m -D <device-group-name>
   

2. Manually take over the diskset.
# metaset -C take -f -s <disk-set-name>
   
3. Check the diskset rdb.
# metadb -i -s  <disk-set-name>
flags           first blk       block count
a m     luo t      16              1034            /dev/did/dsk/d2s7
a       luo t      1050            1034            /dev/did/dsk/d2s7
M                  unknown         unknown         /dev/did/dsk/d3s7
M                  unknown         unknown         /dev/did/dsk/d3s7 
So half of the replicas are down: both copies on d3s7 show the M error flag.
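To pick out the bad replicas without reading the flags by eye, something like the following may help. It is only a sketch: `faulty_replicas` is a hypothetical helper name, and it assumes that the last three fields of each replica line are first blk, block count and device, and that an uppercase letter in the flags marks an error.

```shell
#!/bin/sh
# Hypothetical helper (not part of SVM): read `metadb -i` style output on
# stdin and print each device that has at least one replica in error.
# Assumptions: the last three fields are first blk, block count and device;
# uppercase flag letters mean an error, lowercase ones are plain status.
faulty_replicas() {
    LC_ALL=C awk '
        $NF ~ /^\/dev\// {
            flags = ""
            # Everything before the last three fields is the flags column.
            for (i = 1; i <= NF - 3; i++) flags = flags $i
            if (flags ~ /[A-Z]/ && !seen[$NF]++) print $NF
        }
    '
}

# Sample input copied from the metadb output shown in step 3.
sample='flags           first blk       block count
a m     luo t      16              1034            /dev/did/dsk/d2s7
a       luo t      1050            1034            /dev/did/dsk/d2s7
M                  unknown         unknown         /dev/did/dsk/d3s7
M                  unknown         unknown         /dev/did/dsk/d3s7'

printf '%s\n' "$sample" | faulty_replicas
```

On a live system the input would come from `metadb -i -s <disk-set-name>` instead of the sample text.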
4. The quickest way to fix this is to delete the replicas on the faulty
disk (d3s7) from the rdb. In effect, all remaining replicas then sit on
the good disk (d2s7), so the rdb is 100% available again.
  
# metadb -s <disk-set-name> -d /dev/did/dsk/d3s7
 
NOTE: To find out which physical disk d3s7 is, go to /cluster/did/ and list the DID mappings:
  
# scdidadm -L | grep d3
   

3        node2:/dev/rdsk/c2t0d0   /dev/did/rdsk/d3
3        node1:/dev/rdsk/c2t0d0   /dev/did/rdsk/d3 
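One catch with `grep d3` is that it also matches d30, d31 and so on. A possible way around it is to compare the third column exactly. This is a sketch only: `did_paths` is a hypothetical helper name, the three-column layout is taken from the output above, and the d30 line in the sample is made up to show the difference.

```shell
#!/bin/sh
# Hypothetical helper: read `scdidadm -L` style output on stdin and print
# the node:path entries for exactly one DID device (for example d3).
# Assumption: the columns are instance, node:path and DID path, as above.
did_paths() {
    awk -v dev="/dev/did/rdsk/$1" '$3 == dev { print $2 }'
}

# The d3 lines come from the output above; the d30 line is invented
# purely to show that it is not matched.
sample_did='3        node2:/dev/rdsk/c2t0d0   /dev/did/rdsk/d3
3        node1:/dev/rdsk/c2t0d0   /dev/did/rdsk/d3
30       node1:/dev/rdsk/c3t1d0   /dev/did/rdsk/d30'

printf '%s\n' "$sample_did" | did_paths d3
```

On a cluster node the input would be `scdidadm -L` itself rather than the sample text.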

5. Release the diskset. 
# metaset -C release -s <disk-set-name>
6. Bring the device group back online.
  
# scswitch -z -D <device-group-name> -h node1
 
7. The cluster should now come up all the way.
Run any additional commands needed to let the other node join the cluster.
8. Once the T3 is fixed, let LVM re-scan the disks.
NOTE: metastat -i is available from LVM 4.2.1 and up.
# metastat -i -s <disk-set-name>

9. Add a new rdb on d3s7.
# metadb -s <disk-set-name> -a -c 2 /dev/did/dsk/d3s7
   
10. Re-enable ALL the diskset volumes that are in "Needs maintenance" state:
  
# metareplace -s <disk-set-name> -e d1 /dev/did/rdsk/d3s0
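Before touching a live cluster it can help to print the immediate recovery commands (steps 1, 2, 4, 5 and 6) with the real names filled in and review them first. The sketch below only echoes the commands and runs nothing. `print_recovery_plan` is a hypothetical helper name, and the diskset and device group names in the example call are placeholders, not values from a real system.

```shell
#!/bin/sh
# Dry run only: print the commands for steps 1, 2, 4, 5 and 6 in order.
# Nothing is executed. All four arguments are supplied by the caller.
print_recovery_plan() {
    if [ $# -ne 4 ]; then
        echo "usage: print_recovery_plan <disk-set> <device-group> <faulty-slice> <node>" >&2
        return 1
    fi
    set_name=$1
    group=$2
    bad_slice=$3
    node=$4
    cat <<EOF
scswitch -m -D $group
metaset -C take -f -s $set_name
metadb -s $set_name -d $bad_slice
metaset -C release -s $set_name
scswitch -z -D $group -h $node
EOF
}

# Example call with placeholder names.
print_recovery_plan myset myset-dg /dev/did/dsk/d3s7 node1
```

Steps 8 to 10 are left out on purpose, since they only apply after the hardware has been repaired.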
  
============================================================
Recovery procedure if the diskset is used outside of Sun Cluster:
1. Force a takeover of the diskset:
# metaset -s <disk-set-name> -t -f
2. Delete the faulty replica:
# metadb -s <disk-set-name> -d <faulty-replica-slice>
3. Release the diskset:
# metaset -s <disk-set-name> -r
4. Take over the diskset normally:
# metaset -s <disk-set-name> -t
5. Fix the disk problem and follow steps 8 - 10 above.

from dhelios.blogspot.com
