πŸ’Ύ Rebooting Petabyte Control Node πŸ’Ύ

am rebooting one of the control nodes for a petabyte+ storage array, after 504 days of system uptime..

watching kernel log_level 6 debug info scroll by on the SoL terminal via iDRAC..

logs scrolling, the array of SAS3 DE3-24C double-redundant SFF linked Oracle/Sun drive enclosures spin up and begin talking to multipathd...
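(for anyone following along: once the enclosures settle, a quick path sanity-check looks roughly like this — placeholder commands, not this array's exact config)

> # rough sketch: confirm every LUN shows its expected active paths over both SFF links
> multipath -ll | less
> # and count how many disks the HBAs have surfaced so far (132 expected on this box)
> lsscsi | grep -c disk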

waiting for zpool cache file import..
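(the import step is roughly this shape — the cachefile path shown is just the stock default, so treat it as a sketch rather than the exact invocation used here)

> # sketch, assuming the default cachefile location; -a imports every pool listed in it, -N holds off on mounting datasets
> zpool import -c /etc/zfs/zpool.cache -aN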

waiting.. 131 / 132 drives online across all enclosures.. hmm.. what's this now...

> transport_port_remove: removed: sas_addr(0x500c04f2cfe10620)

well ffs πŸ˜’

> 12:0:10:0: SATA: handle(0x0017), sas_addr(0x500c04f2cfe10620), phy(32),

oh, that's a SATA drive on the system's local enclosure bay for scratch data, it's not part of the ZFS pool.. 😌
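(how I pinned that down, more or less — the sysfs path is the generic one, nothing specific to this chassis)

> # rough sketch: find which end device claims that sas_addr
> grep -l 0x500c04f2cfe10620 /sys/class/sas_device/*/sas_address
> # or list transport addresses next to every SCSI device
> lsscsi -t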

next step (not today): move the control nodes to a higher-performance + lower-wattage pair of FreeBSD servers πŸ’—

[image: terminal screen showing the storage node's system uptime prior to a service/maintenance reboot; text shows 504 days of uptime with load average: 33.12, 32.96, 32.94]
[image: terminal screen showing a ZFS pool with many SAS3 drives in draid1, with dual Optane NVMe cache drives + dual deduplication drives and dual spare drives; top of screen shows a system log with some SAS-related info from the kernel]
[image: terminal screen showing kernel logs from SAS3 enclosure debugging, post-reboot stage]
[image: terminal screen showing the same ZFS pool, now imported after the node reboot cycle; scrub resilvered nice and quickly, no errors and all spares available]