A horrible but functional way to shut down a home lab for the evening

stackoverflow

Warning: the following barf– er, bash script might make you ill.

Over the last couple of days I’ve been trying to figure out ways to improve my home lab experience, including automating the tedious shutdown procedure. Most ESXi environments are 24/7, but of course with a home lab there is typically no such need. The trick is to power off the guests gracefully with minimal effort. Enter: scripting!

I currently have five VMs deployed in my lab: two are powered off, one is an embedded ESXi host (that currently runs zero guests), one is a vCenter Server Appliance, and one is the vSphere Management Assistant. The vMA is where all the scripting magic occurs.

The script that I have come up with is admittedly very bad.

#!/bin/bash
# Shutdown script for home lab
#set -x

IFS=""
OUTPUT=( $(esxcli --sessionfile ~/tali01-session vm process list | grep "/vmfs" | cut -c17-256) )
OUTPUTARR=()

while read -r line; do
    OUTPUTARR=("${OUTPUTARR[@]}" "$line")
done <<< "$OUTPUT"

while true; do
    read -p "Are you sure you want to shut down the lab (Y/N)? " yn
    case $yn in
        [Yy]* )
            echo "Shutting down vSphere lab..."
            for ((i = 0; i < ${#OUTPUTARR[@]}; i++))
            do
                echo ${OUTPUTARR[$i]}
                vmware-cmd --sessionfile ~/tali01-session ${OUTPUTARR[$i]} getstate
                vmware-cmd --sessionfile ~/tali01-session ${OUTPUTARR[$i]} stop soft
                vmware-cmd --sessionfile ~/tali01-session ${OUTPUTARR[$i]} getstate
                sleep 60
            done; exit;;
        [Nn]* )
            echo "Exiting."
            exit;;
        * ) echo "Please answer yes or no.";;
    esac
done

What the script does is this: it invokes the special IFS variable in bash so that it doesn’t choke on whitespace, then it run grabs the output of esxcli. This is where it starts getting awful. –sessionfile is something you can use to authenticate to an ESXi host without having to repeatedly type in a username and password. Here’s the problem: it expires if you don’t use the file within 30 minutes. Already we encounter problem one: you have to generate a new sessionfile every time you want to use this script by way of using –savesessionfile.

Problem number two immediately follows. We’re grepping for config files that we can use in order to power down VMs. I use cut in order to choose the exact part of the string that I want to follow. This is a horrendous way of grabbing a part of a string, because if the output ever changes in how esxcli vm process list works, you’ll get an erroneous cut back. Furthermore, notice that I cut up to character 256… what if our vmx file goes well beyond that by even a single character?

Up next: reading a multi-line output into an array. I banged my head against this part for the last two days. It should be so simple, and ultimately it was a modification of the ubiquitous “x = x + 1” or “x++” method. Take an empty but initialized array, add a single read line, and append it to the array. Repeat until done.

Next up: the actual shutdown. Prompt the user to make he or she actually wants to shut down the lab. If affirmative, start walking through each config file stored in the array and softly power off the guest, waiting 60 seconds before looping back to the beginning. Repeat for each system.

That’s it. The concept is simple enough but there are a few other reasons why the script sucks, such as the fact that it shuts down the very guest it exists on, and there’s no intelligence behind what it shuts down first; it so happens to collect the config files in the order that I want them to shut down but that is horribly hacky and not scalable whatsoever.

I had a good friend of mine (whom is a bazillion times the programmer that I am) look the script over and said that “you told bash to stop being bash”. I thought it was funny.

So, where do we go from here?

  1. Figure out a way to prioritize the shutdown of certain VMs. I have barely scratched the surface of the commands available on vMA, so there’s got to be a way to tag or otherwise identify VMs for position in a shutdown list.
  2. Port this to PowerCLI or figure out some way to make it so that the very VM it runs on isn’t shut down when executing the script.
  3. Come up with options, such as “fast” shutdown where the sleep time is 30 seconds or less, “hard” shutdown where we want everything to go down as quickly as possible (ie yank the power cord), and so on.
  4. The authentication method with the ESXi server is downright miserable. Use something like –credstore instead to make the authentication persistent.

If you have any suggestions that focus on either features or scripting, let me know!

-ajs