Instrument Rating: Approaches

When one thinks of what an instrument rating entails, two things are likely to come to mind:

  • Flying in the clouds
  • Instrument approaches

An approach is the “fun part” of an instrument rating. It is, after all, one of the most tangible new topics that an instrument student learns as part of their training. The student learns how to read, brief, and interpret an approach plate, and determines from that plate what course to fly, what altitudes to hold, what ground speed to maintain, and what to do if the approach needs to be called off (essentially the equivalent of a go-around, called the “missed approach”). The pilot workload is high during an instrument approach, but with training and a strong basic attitude instrument flying skill set, approaches can be flown confidently and safely.

CDIs and Glideslopes: Understanding the different types of approaches

Pilots will frequently refer to two different sorts of approaches: non-precision approaches and precision approaches. A non-precision approach is one that provides lateral guidance only. That lateral guidance can come from a variety of sources: VOR or NDB transmitters, localizers, and certain RNAV (GPS) approaches all provide this capability. The aircraft descends toward the runway using predefined stepdown altitudes at waypoints along the approach. These waypoints can be identified through a GPS navigator, a DME readout, or a supporting VOR.

A CDI containing lateral and vertical guidance needles

A precision approach, on the other hand, is an instrument approach that provides both lateral and vertical guidance. This is done through the addition of a glideslope on the CDI, and the objective is to keep both needles centered. An ILS (instrument landing system) is one such example of a precision approach. A PAR approach is another such approach, where a controller guides the aircraft down using radio callouts. These approaches are rare, but do exist.

If you’d like some trivia to impress (or annoy) your friends at the airport, there is a third distinct category of approach, as defined by the FAA: an APV approach, or Approach Procedure with Vertical guidance. A GPS approach with a glideslope (i.e., an LPV approach) is NOT a precision approach but rather an APV approach. The reason for this distinction is that ICAO defines a precision approach as a very specific thing, and it was ultimately easier for the FAA to create a third category of approach than to try to retrofit RNAV approaches into the legacy category of “precision”.

If you have read the ACS, you may recall that an applicant must accomplish at least two non-precision approaches and one precision approach, and “acceptable instrument approaches for this [precision] part of the practical test are the ILS and GLS [approaches].” Thankfully, the FAA allows for an LPV approach as well by stating: “..if the installed equipment and database is current and qualified for IFR flight and approaches to LPV minima, an LPV minima approach can be flown to demonstrate precision approach proficiency if the LPV DA is equal to or less than 300 feet HAT [height above touchdown].”

So for all intents and purposes, LPV approaches can be treated much the same way as an ILS: a very precise approach that gets you down low.

Approach Plates

The VOR-B approach into KSFZ

Let’s look at the VOR-B approach into KSFZ for a reference. We won’t go over absolutely everything, but we will cover the important parts that apply to civilian aircraft. The way we brief an approach plate like this is top to bottom, right to left.

  • The top briefing strip tells us that the navigation facility for the approach is the Putnam VOR/DME, frequency 117.4.
  • The approach course to be dialed into the CDI and flown is the 111 degree radial FROM the PUT VOR.
  • The “T” and “A” embedded in a triangle refer to the fact that non-standard takeoff minimums and non-standard alternate minimums exist. The local altimeter setting is required to fly down to 980 feet MSL after the final approach fix; otherwise, the Providence altimeter setting must be used and the minimum descent altitude must be increased to 1080 feet MSL. Furthermore, if the approach is flown at night, landing on Runway 33 or 15 is not authorized.
  • The missed approach instructions are to make a climbing right turn to 2500 feet MSL via the Norwich VOR 057-degree radial, and to hold at the FOSTY intersection.
  • The plan view shows “the big picture” of the approach. We start the approach at the initial approach fix at the Putnam VOR and it terminates with the missed approach point at JOPVO. Each of the waypoints can be identified in a variety of ways: DME distance from the PUT VOR, identification of VOR radials from the Providence transmitter along the approach course, or using a GPS navigator.
    • The inverted “V” with a dot in the middle represents obstructions. There aren’t a ton on this approach, but one in particular jumps out at us: right near the MAP near the airport! The altitude listed is 573; thankfully, this is in MSL, not AGL, which means that if we remain at or above our MDA of 980 MSL, we are assured roughly 400 feet of clearance.
    • The circle in the top right corner of the plan view shows the Minimum Sector Altitude within a 25 NM radius of the Putnam VOR. This reference provides the minimum altitude that assures obstacle clearance within the sectors depicted.
    • The text in the top left corner of the plan view states that the VOR-B approach is not authorized for aircraft arriving at the PUT VOR via the V146-151 airway. Why is this? It’s due to a national policy called the United States Standard for Terminal Instrument Procedures, also known as “TERPS”. TERPS 2-4-1.a states: “When the IAF is part of the en route structure, the angle of intersection between the en route structure and a ground-based initial approach segment course must not exceed 120 degrees. For RNAV routes, apply Order 8260.58.” In plain terms, the turn from V146-151 northwest-bound onto the approach course is too sharp, so the entry is not authorized; pilots wishing to execute the VOR-B approach from that airway should expect radar vectors instead.
    • In the bottom left corner of the plan view, an alternate missed approach fix exists. Instead of holding at the FOSTY intersection, a pilot may elect to hold back at the PUT VOR, allowing for positioning to reattempt the approach.
    • The bottom left corner of the approach plate shows a zoomed-in view of the airport and where the approach terminates. Assuming the approach is flown correctly, the pilot should find themselves approximately midfield of the 5/23 runway. The distance between the final approach fix and the missed approach point is 4.9 NM, and if flown at 90 knots groundspeed, should take three minutes and sixteen seconds (a quick sanity check of this math appears just below these notes).
  • The bottom right corner of the plate shows the profile view. This view shows the approach course to be flown (111 degrees FROM PUT) and the stepdown altitudes. The line underneath an altitude indicates the minimum altitude for that segment: beginning the approach and until intercepting the BIRDS intersection, the pilot must not descend below 2500 feet. After BIRDS, the pilot may descend no lower than 1800 feet. At the final approach fix, identified on the profile view with a Maltese Cross, the pilot may descend no lower than 980 feet MSL (539 feet AGL) with 1 mile of visibility. Even when executing the circling approach, the aircraft may not descend below the MDA until three conditions are met:
    1. The aircraft is continuously in a position from which a descent to a landing on the intended runway can be made at a normal rate of descent using normal maneuvers.
    2. The flight visibility (or enhanced flight visibility, if so equipped) is not less than the visibility prescribed in the standard instrument approach being used.
    3. At least one authorized visual reference for the intended runway is distinctly visible and identifiable to the pilot, such as the threshold, its markings or lights, the runway or runway lights, the touchdown zone, etc. If the approach light system is in view, the pilot may descend to 100 feet above the touchdown zone elevation, but only if the red terminating bars or the red side row bars are also visible and identifiable.
  • Finally, a short reminder of the missed approach procedure is listed in the profile view.
The VOR-B approach into KSFZ is not authorized if flying northwest bound on the V146-151-405 Victor airway, due to TERPS restrictions.
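
To sanity-check that FAF-to-MAP timing, the math is simply distance divided by groundspeed. Here is a minimal Python sketch using the 4.9 NM and 90-knot figures from the notes above (the 120-knot line is just an extra illustration):

def faf_to_map_time(distance_nm, groundspeed_kt):
    """Return (minutes, seconds) needed to cover distance_nm at groundspeed_kt."""
    total_seconds = distance_nm / groundspeed_kt * 3600
    return int(total_seconds // 60), round(total_seconds % 60)

print(faf_to_map_time(4.9, 90))   # -> (3, 16), matching the 3:16 in the profile view notes
print(faf_to_map_time(4.9, 120))  # -> (2, 27) at a faster groundspeed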

As mentioned earlier, this particular type of approach doesn’t provide the pilot with any vertical guidance, hence the “non-precision” nature of the approach. Let’s look at another plate, this time being a precision approach.

Introducing the Glideslope and DAs

ILS RWY 5 at KPVD

With an ILS approach, the glideslope is introduced to the pilot to guide them down from the final approach fix. We brief the approach in much the same manner as we did the VOR-B into KSFZ by reviewing the briefing strips, the plan view, and the profile view of the plate. Some key differences between the VOR-B and this approach:

  • A procedure turn exists if intercepting the localizer between the CUTSI and KENTE intersections. However, if we are on radar vectors to the FAF (which we almost certainly will be), we will not fly the procedure turn.
  • The glideslope should be intercepted and alive at CUTSI. At this point, we may fly the glideslope, but we may not descend below 740 feet until crossing FEXUX. After FEXUX, we may descend to an altitude of 253 feet MSL, the decision altitude (DA). At the DA, we MUST execute a missed approach if we cannot meet the three criteria listed previously (position to land normally, adequate visibility, at least one visual reference); the “decision” has to be made by that point. (A quick descent-rate calculation follows this list.)
  • There is a V denoted on the profile view 2.8NM DME out from the localizer. This is the Visual Descent Point. This would apply if we were flying the LOC approach, but doesn’t apply to the ILS approach. A VDP is a point on the final approach course of a non-precision straight-in approach from which normal descent from the MDA to the runway touchdown point may begin provided adequate visual reference is established. We’ll go over VDPs in a separate post in the future, as there is a lot of confusion surrounding what they are and what they do.
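
Flying the glideslope still means managing a descent rate, and a handy cross-check is that a 3-degree glideslope works out to roughly five times your groundspeed in feet per minute. The exact trigonometry is sketched below; the groundspeeds are just examples, not figures from the plate:

import math

def required_descent_rate(groundspeed_kt, glideslope_deg=3.0):
    """Feet per minute of descent needed to hold a glideslope at a given groundspeed."""
    feet_per_min_per_knot = 6076.12 / 60   # one knot of groundspeed ~ 101.3 ft/min forward
    return groundspeed_kt * feet_per_min_per_knot * math.tan(math.radians(glideslope_deg))

print(round(required_descent_rate(90)))   # ~478 fpm; the "groundspeed x 5" rule of thumb gives 450
print(round(required_descent_rate(120)))  # ~637 fpm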

The main advantage of a precision approach versus a non-precision approach is that with a precision approach, you can fly an established, stabilized descent for most of the approach. The Instrument Procedures Handbook states:

“A constant-rate descent has many safety advantages over non-precision approaches that require multiple level-offs at stepdown fixes or manually calculating rates of descent. A stabilized approach can be maintained from the FAF to the landing when a constant-rate descent is used. Additionally, the use of an electronic vertical path produced by onboard avionics can serve to reduce CFIT, and minimize the effects of visual illusions on approach and landing. Some countries even mandate the use of continuous descent final approaches (CDFAs) on non-precision approaches.”

Conclusion

There is a lot of information on approach plates, but the information is always presented in a logical, standardized manner. It is always wise to brief approaches before takeoff and during cruise so that we are not rushing to familiarize ourselves with this critical and information-rich chart. Given enough practice, an instrument student will find themselves able to read plates quickly and effectively so that any approach will make sense to them.

Instrument Rating Training: Honing the BAI-Blade and Introducing Approaches

Things are starting to get a little more interesting!

The two weeks of instrument training since the first lesson on the 13th of June were very productive! In fact, since that first lesson on basic attitude flying, I managed to fly (or log time in a sim) nine times within a two-week period. Paradoxically, the best way to save time and money while pursuing any rating or certificate is to fly as often as your schedule and instructor availability allow. By building on top of previous lessons quickly, you forget less over time and refine technique without letting the concepts lapse. (In psychology, this is referred to as the Ebbinghaus forgetting curve and is a fundamental topic that is taught to flight and ground instructors. We’ll cover that one at another time…)

The way that Doug has been introducing new instrument flying concepts is by layering something new on top of old topics. So, day one was all about basic attitude flying: turns and basic navigation with reference only to the instruments. Then we started introducing wind correction and course tracking — keeping the CDI needle aligned and centered under less-than-perfect conditions (hint: it’s always less than perfect out… especially with IFR weather!).

Once Doug was satisfied enough with my BAI flying and my ability to control the airplane with foggles on, it was time to introduce basic procedures. The first procedures we started with were holds and hold entries. The FAA makes a big production out of the proper entry type for holds on the written, but holds are also important because many approaches use a hold-in-lieu-of-procedure-turn, which is flown the same way. Holds aren’t terribly difficult in and of themselves; unless otherwise stated, a hold has a fix and the legs are supposed to be one minute in length each. The GPS in my plane makes holds incredibly easy to manage, but we still timed each leg at one minute using a timer before making a 180-degree standard-rate turn. One thing that I need to get better at is tightening up the turn onto the inbound leg so that I’m not overshooting into the nonholding side…

Anatomy of a hold. Things get more interesting when you factor in wind correction.

After introducing altitude changes into holds, we reviewed the VOR-A approach into Danielson. My plane has two VORs in addition to the GPS, so it makes for a very capable instrument trainer. I actually like VOR approaches, as I learned how to fly them back when I had a Cessna 172K with only two VOR receivers.

As of mid-2020, KLZD only has a single instrument approach (the VOR-A depicted here).

The VOR-A into KLZD is relatively simple on paper, but it’s actually a somewhat challenging approach. You start with a hold at the Putnam VOR no lower than 2600 feet (i.e., a hold-in-lieu-of-procedure-turn), and then you turn inbound to the 211 approach course. Crossing the VOR inbound is your final approach fix, and you descend to an MDA of 1120 feet MSL, with obstructions all around. The approach dumps you perpendicular to the runway, and your only option for landing is to circle to either 13 or 31. A missed approach has you climb back to 2600, head back to the Putnam VOR, and hold.

We were supposed to fly this approach in my plane, but a thunderstorm popped up right in our flight path before we took off (realities of summer). Thankfully, AirVentures has a basic aviation training device — in other words, a certified computer with a flight simulator program on it. A BATD can be used for up to ten hours of simulated instrument training, so if the weather is too bad for us to go flying, we can still get useful time out of the day! I shot the approach into Danielson using the sim and worked on VOR tracking.

Doug had to go on a multi-day trip, so he left me to my own devices for a few days. You need forty hours of actual or simulated instrument time to take the checkride, and twenty-five of those hours can be accumulated with a safety pilot — in other words, that time doesn’t have to be flown with an instructor; it can be flown with other private pilots. I have two friends (both commercial pilots and instrument rated) who are happy to provide safety pilot time, and once I had something to work on (i.e., the approaches), it was a no-brainer to get them engaged.

Holding altitude, being precise, and demanding excellence

While flying with my friend Bill, we shot two VOR approaches (VOR-A LZD, VOR-B SFZ) and a localizer approach (LOC 05 SFZ), as well as two RNAV approaches into 23 SFZ (ignoring the glideslope and using LNAV minimums). During this time, I got myself down to the MDA, but I also flew below the MDA a handful of times. The MDA is a minimum descent altitude, meaning that busting through it would flunk the checkride. The ACS states the following:

“For the final approach segment, maintain no more than a 3/4-scale deflection of the CDI, maintain airspeed +/- 10 knots, and altitude, if applicable, above MDA, +100/-0 feet, to the Visual Descent Point (VDP) or Missed Approach Point (MAP).”

The ACS (Airman Certification Standards) has replaced the old PTS for most certificates and ratings. It prescribes the expected knowledge, risk assessment, and skills expected from an airman applicant, and is in essence an open-book test. No surprises, which means no excuses…

A side note about a pilot’s certificate being “a license to learn”: it’s critical that as pilots, we demand excellence from ourselves and not be complacent with lax tolerances. In VFR flying, being sloppy with your altitude or heading leads to sloppy performance further into the flight. In IFR flying, sloppy flying will get you violated or, worse, careening into terrain or an obstruction. The message here is that even if you’re not a professional pilot, strive for professionalism. I’m certainly not perfect at this, nor is any weekend-warrior private pilot, but the constant pursuit should be ever present. We owe it to ourselves and to our passengers, and in fact, this is why I am pursuing an instrument rating in the first place.

Conclusion

The biggest takeaway that I have had so far is that the parts that will eat non-proficient instrument pilots alive are the transitions. Takeoff to cruise is a transition, cruise to approach is a transition, and approach to landing is a transition. Each one of these segments has a sudden burst of workload and it’s up to the instrument pilot to manage them appropriately. The biggest and most important task, however, is to maintain positive aircraft control through these transitions. That is why there has been so much emphasis on basic attitude instrument flying, so far. As the time and lessons have progressed, I have been tasked with doing more and more items that an instrument pilot would be expected to do. The “big picture” is starting to come into focus as I become more proficient and comfortable with being a pilot without reference to the ground. Stay tuned for what comes next: flying in the actual clouds…

Instrument Rating Training: Learning to Fly All Over Again!

One of my 2020 New Year’s resolutions was to finally stop talking about how I wanted to get an instrument rating and to actually go and get it done. With the COVID-19 pandemic injecting chaos across the world, cancelling all of my work trips and with me working from home, this seemed like the ideal time to make this happen. My plane came out of annual in April, I flew for a couple of months, and decided to make good on my journey.

A little background: I had actually started training for the instrument rating about five years prior, going so far as to take the written exam and to start taking some lessons. My head was unfortunately not in the game as much as it should have been, and I was not in a position to fly as often as the instrument rating training regimen really demands. The adage of “fly as often as you can” rings true. So, I let my training lapse and pretty much had to start from the beginning once more.

The Written Exam

The IRA (Instrument Rating – Airplane) written exam is known for being pretty challenging. I got a 93% thanks to Sporty’s, Sheppard Air, and being heads-down for a week and a half.

Before starting to train in earnest, I decided to focus all my efforts on passing the written exam first and foremost. There is a two-year window for passing your checkride that begins once you pass the written exam; real or perceived, that deadline gives me a sense of urgency to take my flying seriously and to focus.

After researching various home study course options for the written, I purchased the Sporty’s Instrument Rating course. There are other perfectly fine options out there, including the ever-present King courses, free options on YouTube, and some others, but I picked Sporty’s for two reasons:

  1. The material seems up to date and well presented. The presentation itself is professional and accessible.
  2. Sporty’s makes their content easily accessible and synced across multiple devices. While my PC is usually my primary method of studying, I have a Roku and there is a Sporty’s channel that you can add. I found myself using the Roku channel for learning almost exclusively after a couple of hours, and then I would use the smartphone app to take post-lesson quizzes.

After finishing the Sporty’s course, I used Sheppard Air for test prep. I would say that I was prepared to pass the written after finishing the Sporty’s course, but Sheppard Air helped me identify the gaps that I had. I started studying with Sporty’s on May 28th, purchased Sheppard Air on June 1, and passed my written with a 93 on June 9. Twelve days from zero to hero wasn’t a bad turnaround, I’d say!

The topics I found myself having the most difficulty with prior to the test were hold entries and the “which airplane are you, based on this HSI readout?” questions. For the HSI questions, I found this blog article particularly useful, and for hold entries I used a circular hold entry calculator to help visualize everything (a rough sketch of that sector logic appears below).

A holding pattern entry calculator like this one can really help visualize entries and will help you score a few extra points on the written.
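
For the curious, the logic such a calculator implements boils down to comparing your course to the fix against the hold’s outbound course. Below is a rough Python sketch of the textbook sectors for a standard (right-turn) hold; the 70/110-degree breakpoints and their orientation should be verified against AIM 5-3-8 before trusting this for anything beyond visualization, and left-turn holds mirror the logic. The example numbers are purely illustrative:

def hold_entry(heading_to_fix, inbound_course):
    """Rough sketch of entry sectors for a STANDARD (right-turn) hold.

    heading_to_fix: your course to the fix, in degrees.
    inbound_course: the hold's published inbound course, in degrees.
    Verify the sector boundaries against AIM 5-3-8 -- visualization aid only.
    """
    outbound_course = (inbound_course + 180) % 360
    rel = (heading_to_fix - outbound_course) % 360  # angle measured from the outbound leg
    if rel < 110:
        return "parallel"
    if rel >= 290:
        return "teardrop"
    return "direct"

# Illustrative numbers only:
print(hold_entry(360, 90))  # arriving northbound for an 090 inbound course -> "parallel"
print(hold_entry(90, 90))   # arriving straight up the inbound course -> "direct"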

I don’t know the exact questions I got wrong, but the FAA tells you the knowledge topic codes of the questions that you got wrong, which you can then cross reference with the Airman Certification Standards. In my case, I had five incorrect answers. They were on the topics of:

  • temperature (IR.I.B.K3c)
  • personal weather minimums (IR.I.B.R1b)
  • IFR airworthiness, to include airplane inspection requirements and required equipment for IFR flight (IR.II.C.K2)
  • Determine airplane position relative to the navigational facility or waypoint (IR.V.A.S2) (I’m assuming this was one of those HSI questions!)
  • Procedures and limitations associated with a precision approach, including determining required descent rates and adjusting minimums in the case of inoperative equipment (IR.VI.B.K1)

Prior to giving an endorsement to take the practical exam (checkride), the CFII will ensure that any missed FAA knowledge test questions are remediated (i.e., they’ll go over the topics you missed).

After the pass, I let my instructor know I was ready to begin in earnest and we agreed to start flying as often as our schedules would allow. Time to get out of the books and into the cockpit…

Day 1 of Instrument Flight Training: Basic Attitude Flying: Ground Lesson

Lesson 1 was on the 13th of June. We had a brief ground lesson before flying where Doug emphasized that the number one issue that pilots run into in IFR flying is having a weak instrument scan. This can manifest in a variety of ways, including fixation on things like the airspeed indicator or turn coordinator, or just not switching your focus between instruments often enough. Another problem is incorrectly interpreting what the instruments are telling you. The attitude indicator is the only instantaneous source of information we have that tells us what the plane is doing; chasing the altimeter or directional gyro and fixating on them is how we find ourselves deviating from our assigned altitude or overshooting our course. The result of “chasing the needles” is a course or altitude track that oscillates around where we want to be. Another big problem that instrument pilots run into is a loss of situational awareness. With modern avionics (which I am lucky enough to have in the form of a GTN 750) and with EFBs, instrument pilots have more access to information than ever before, and those tools should be incorporated into the instrument scan.

The golden rule, as Doug put it, is always know your heading. We will be making liberal use of the heading bug, and any time you are assigned a heading, we will be bugging it.

If a deviation does occur, we don’t want to over-correct; this is how an unstable situation arises and how those oscillations that we spoke of manifest. What we do instead is correct to neutral; that is, get back onto the course or stop the climb/descent, and then apply correction as necessary. This also applies to wind correction, as we were about to find out during our flight.

Finally, Doug said that boiled down, instrument flying is not just about maintaining control of the aircraft via a rock-solid instrument scan and maintaining situational awareness, but it is doing those things successfully during transition points. Takeoff transitions to the en-route phase, en-route transitions to the terminal phase, and the terminal phase transitions to landing. Each of these points is a “hot spot” for pilots, and being able to tackle these transitions is key to safe IFR flying.

With our ground lesson out of the way, we went to the plane for our flight!

Day 1 of Instrument Flight Training: Basic Attitude Flying: Flight Lesson

The conditions at North Central State Airport were winds flitting between variable and out of the north at around six to nine knots, with scattered clouds at 6000 feet. A cold front off the coast was moving away to the south, and visibility was excellent. After starting the engine, we programmed the GTN750 with a flight plan from KSFZ to the PUT VOR, the ORW VOR, the SEY VOR, and then back to KSFZ. We also explored the screen setup of the GTN750; I am constantly amazed at how much information this unit can display, and how versatile it is. We set up the screen to have a variety of IFR-related functions on it, including ground speed, time to next waypoint, and a few other things.

We took off from runway 33 at North Central, the shorter of the two runways at 3200 feet, but more than sufficient for the Archer II. Shortly after reaching 1400 feet, Doug had me put on the foggles and we began. We worked on my scan and on setting appropriate power settings for different phases of flight (2300-2400 RPM for cruise, 2000 for cruise descent), and we worked on different maneuvers, including standard-rate and half-standard-rate turns (more than 30 degrees of directional change? Standard rate. Less than 30? Half standard.), leveling off from climbs, and so on. This all involves a rock-solid instrument scan, and I found myself focusing mostly on the attitude indicator. Despite encountering light chop the entire flight, I found myself able to maintain a reasonably precise heading and altitude at all times, even after “transitions”. Certainly not checkride ready, of course, but there does seem to be some truth to all of this “trust the instruments” stuff 🙂
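
As an aside on standard-rate turns: the bank angle required grows with true airspeed, which is why the usual rule of thumb is roughly fifteen percent of your airspeed in knots. The exact relationship for a coordinated turn is sketched below; the 100-knot figure is just an example:

import math

def standard_rate_bank(tas_knots, turn_rate_deg_per_sec=3.0):
    """Bank angle in degrees for a coordinated turn at the given rate and true airspeed."""
    v = tas_knots * 0.51444                      # knots -> meters per second
    omega = math.radians(turn_rate_deg_per_sec)  # turn rate in radians per second
    return math.degrees(math.atan(v * omega / 9.81))

print(round(standard_rate_bank(100)))       # ~15 degrees, close to the "15% of TAS" rule
print(round(standard_rate_bank(100, 1.5)))  # a half-standard-rate turn needs roughly half the bank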

A beautiful day out… and I couldn’t look outside! Pic taken by CFII near Block Island

Other things of note on this flight: the vacuum-driven directional gyro has a tendency to drift. You must set the DG against the magnetic compass every fifteen minutes or so to ensure you’re on the correct path, and this requires you to be in straight-and-level, unaccelerated flight. Another thing of note: when on approach, you should be configured for landing well ahead of time. By the time that we were approaching KSFZ and Doug had me take the foggles off, I was pretty fatigued from the constant instrument scan (which did degrade after a while, something that I was warned would happen), and I found myself diving at the runway because I delayed putting in flaps and taking power out. Staying ahead of the airplane is critical!

We landed and wrapped up for the day with 1.4 hours logged, 1.3 simulated instrument. All in all, it was a successful end to our flight, and we scheduled the next lesson for a few days’ time.

Nearly-instant Zerto deployments to AWS via CloudFormation

As one of Zerto’s public cloud solution architects, I serve in an advisory role to prospective customers, solutions engineers, channel partners, and everyone that is looking for guidance in making their “public cloud journey” a little easier and more efficient. One of the things that I have been asked to do on a variety of occasions is to help deploy brand-new environments into the cloud.

We actually have a “Zerto from scratch” AWS deployment guide that we are in the process of updating (written by my super-smart and talented colleague Gene Torres, who you should follow on Twitter at @EugeneJTorres), and while this is certainly a great start to a brand new environment, there are a lot of steps involved, and as anyone that has deployed environments in AWS knows, it’s a lot of effort to start from scratch! So, if you’re not actively deploying VPCs on a regular basis, or if you want a better way to do things, you may want to consider AWS CloudFormation.

Before I get to the good stuff: AWS CloudFormation is a free service provided by AWS that enables you to write and deploy infrastructure stacks using a JSON or YAML template. This enables a concept referred to as “Infrastructure as Code”: you can write a reusable template that builds out an entire environment programmatically, significantly reducing time to deployment and human error.
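
We’ll walk through the console below, but anything you can do in the Designer can also be driven entirely from code. Here’s a minimal, hedged sketch using boto3; the stack name and template file name are made up for illustration, and CAPABILITY_NAMED_IAM is acknowledged because the template creates an IAM user:

import boto3

cfn = boto3.client("cloudformation", region_name="us-west-1")

# "zerto-aws-environment.json" and "zerto-poc" are placeholder names for illustration.
with open("zerto-aws-environment.json") as f:
    template_body = f.read()

response = cfn.create_stack(
    StackName="zerto-poc",
    TemplateBody=template_body,
    Capabilities=["CAPABILITY_NAMED_IAM"],  # required because the template creates IAM resources
    Tags=[{"Key": "purpose", "Value": "zerto-poc"}],
)
print(response["StackId"])

# Block until the stack finishes building (usually about a minute for this template).
cfn.get_waiter("stack_create_complete").wait(StackName="zerto-poc")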

The template that I wrote takes Gene’s deployment guide and makes it a turn-key script. All you need to do is run the stack and it will build out a ready-to-use Zerto environment in AWS (with caveats, which I’ll get to) that you can use for your own purposes, including production, POC, patch testing, or the like. It will create your VPC with DHCP options, three subnets, an internet gateway, a routing table with a route to the internet, a security group for Zerto Cloud Appliance / Zerto Virtual Manager / Virtual Replication Appliance interaction, a NACL for the “test” subnet, a VPC endpoint to S3 for the region you are deploying to, and an IAM user with the required permissions attached as a policy. In other words, this will save you at least an hour of manual work.

The caveats here are as follows (READ THEM CAREFULLY):

  • You will need to create your own access key and grab the secret access key for the Zerto IAM user manually in IAM. I could automate this, but I haven’t figured out a truly secure way of doing this and not logging it to CloudTrail, so I decided to forgo it.
  • You will need a VPN (or Direct Connect link) set up manually back on premise. I recommend IPSec.
  • You will need to create a key pair manually and name it “zertokey0001” (without quotes). There is no way to create a key pair in CloudFormation, and you will need it to log into the ZCA. Creating a key pair is very easy: follow this guide here. (A scripted sketch of both this step and the access key step follows this list.)
  • The Security Group ingress rules should be tightened up after the VPN is set up. As a general rule of thumb, you don’t want 0.0.0.0/0 anywhere, so after the VPN is configured make sure that’s edited!
  • The template was designed to work in us-west-1 (the Northern California region), but it can be easily adapted to other regions as well. us-west-1 is small with only two AZs, which is why the ZertoZCASubnet and the ZertoProdSubnet exist in the same AZ. If you are deploying to a larger region, I recommend that you separate the three subnets into separate AZs.
  • The Zerto ZCA instance is launched using an ImageId, which is unique to that region. If you deploy outside of us-west-1/N. California, either update the ImageId with the correct Zerto Community AMI, or delete the reference to ZERTOZCAINSTANCE and deploy a Windows 2019 m5.xlarge instance for the ZCA manually.
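
If you’d rather script the manual steps around the stack (the key pair has to exist before you launch the stack, and the access key can only be created after the stack has created the IAM user), here’s a rough boto3 sketch. The key pair name comes from the caveat above; the IAM user name is a placeholder, so substitute whatever the stack actually creates:

import boto3

ec2 = boto3.client("ec2", region_name="us-west-1")
iam = boto3.client("iam")

# Before launching the stack: the template expects a key pair named exactly "zertokey0001".
pair = ec2.create_key_pair(KeyName="zertokey0001")
with open("zertokey0001.pem", "w") as f:
    f.write(pair["KeyMaterial"])  # needed later to retrieve the Windows password for the ZCA

# After the stack completes: "zerto-user" is a placeholder for the IAM user the stack created.
key = iam.create_access_key(UserName="zerto-user")["AccessKey"]
print("Access key ID:    ", key["AccessKeyId"])
print("Secret access key:", key["SecretAccessKey"])  # shown only once -- store it securely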

So, how do you use this?

Copy the JSON template located in my github repository here, save it somewhere on your desktop, and open CloudFormation.


Click on “Designer”. Here, you can upload a template from your computer or an S3 bucket. (By the way, if you are interested in pursuing the AWS Solutions Architect certification, CloudFormation is critical, so start playing around with it!)


Click on the Create Stack button in the top left corner of the screen (the one that looks like a cloud with an arrow), and create the stack. Give your stack a name, assign it a Key/Value tag if desired/necessary, acknowledge the custom IAM changes, and finally “create stack”.


In about a minute or less you will have a fully deployed environment ready to go! Make sure you create that access key and secret access key so you can give Zerto programmatic access to AWS, log into your new instance with your previously-created keypair, and install Zerto.

Let me know what you think! I know that each environment is different, and if you run into things that you think should be automated, please let me know!

alex.schenck@zerto.com

Keeping AWS Charges Under Control with VPC Endpoints

We have all heard the horror stories of the sysadmin who racked up a multi-thousand-dollar bill because of a misconfigured environment, and sadly these tales of woe are indeed not myths. One of my customers very recently discovered a $4000 charge due to sending over 105 TB worth of data to S3! What happened there, how did we fix it, and how can we prevent this from happening in the future?

First, the way I describe AWS to my customers is that it’s not one monolithic organization in which one service seamlessly communicates with other services via hidden channels. Instead, it’s easier to think of AWS as a huge conglomerate of separate teams (because that’s really what they are), and those teams’ products talk to each other largely via the use of APIs, accessed over HTTP. That means that if you want to send a file from an instance in EC2 (originating from an EBS block volume) to an S3 bucket, you need to establish a connection to that bucket in the same manner you would access a website over the internet.

Here’s the problem: AWS charges you for every gigabyte of data processed through a NAT gateway as well as egress out of EC2 to the Internet (as well as between regions and other AWS services). If you are moving 105 TB from EC2 over a NAT gateway to S3, you’re going to be facing a serious bill.
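
To put rough numbers on that bill: NAT gateway data processing is billed per gigabyte on top of the hourly charge (roughly $0.045/GB in most US regions at the time of writing; treat that rate as an assumption and check your region’s pricing). A quick back-of-the-envelope sketch:

def nat_gateway_processing_cost(terabytes, rate_per_gb=0.045):
    """Rough NAT gateway data-processing estimate; the rate is an assumption, check your region."""
    gigabytes = terabytes * 1024
    return gigabytes * rate_per_gb

print(round(nat_gateway_processing_cost(105), 2))  # ~4838.4 dollars, before any egress charges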


Ouch! So how do we avoid something like this in the future? The answer is thankfully very simple, and it comes in the form of a VPC endpoint. A VPC endpoint allows you to connect your existing VPC to another AWS service privately, effectively creating an internal route within AWS’ “walls”. In this fashion, your data never traverses the gateway of last resort (whether that is a NAT gateway, your VPN, or an IGW), and you are therefore not on the hook for those data processing charges!

To create a VPC endpoint, select the appropriate region you want to configure, and go to the VPC dashboard here:


Click on Create Endpoint. There are two separate types of endpoints: interface endpoints, which are driven by AWS PrivateLink, and gateway endpoints. Only two services are supported by gateway endpoints: S3 and DynamoDB. Find S3 and select it:


Finally, select the appropriate VPC and the route tables that are associated with the subnets you want to ensure utilize the endpoint. In the context of Zerto and keeping transfer costs from the ZCA to S3 buckets low, this will be the VPC that your ZCA is installed in and the route table(s) that your ZCA’s subnet is associated with. You may also decide to customize your endpoint policy at this time: if your security rules require a minimally permissive policy to control which resources may use the endpoint, you can edit that policy now or later.

Click on the create endpoint button when finished! The endpoint will be created and S3-bound traffic originating from the subnets that are associated with the route tables you have specified will now flow over the endpoint.
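
If you prefer to script it, the same endpoint can be created with a single API call. Here’s a hedged sketch with boto3; the VPC and route table IDs are placeholders, and the service name changes with the region:

import boto3

ec2 = boto3.client("ec2", region_name="us-west-1")

# Placeholder IDs -- substitute the VPC your ZCA lives in and the route table(s) its subnet uses.
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-west-1.s3",   # gateway endpoints support only S3 and DynamoDB
    RouteTableIds=["rtb-0123456789abcdef0"],
)
print(response["VpcEndpoint"]["VpcEndpointId"])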


Best of all, gateway endpoints are free! There are edge cases where a VPC endpoint may not be an appropriate fit, and I encourage you to check out https://docs.aws.amazon.com/vpc/latest/userguide/vpce-gateway.html#vpc-endpoints-limitations on the matter, but for Zerto customers wanting to protect their data in AWS, this is a very important topic to understand and I strongly advise you to consider it. It takes literally moments to set up, is free, and will save you the headache of begging your AWS rep for cost forgiveness (or explaining to your boss why there’s suddenly a massive unexpected charge this month).

I welcome any tips, suggestions, or comments here or on Twitter. Hit me up at @vPilotSchenck.

-ajs

VM Types and Sizes in Azure — Picking the right fit for your needs

One of the most-asked questions that I answer for my customers concerns what their VMs will look like in the cloud post-migration or failover. Zerto makes it very simple to select a VM type and size while creating a virtual protection group, but there is a dizzying number of VM types and sizes to pick from!

“A simple problem to resolve,” an administrator may say. “I shall just pick a VM type that has the same hardware as what I have running on-premise in vSphere!” Sadly, this approach is often not possible. VMs in Azure are not built with individual virtual hardware components but rather via pre-defined templates. Thus, one may not be able to make a VM that has the same specifications as on premise. What is an administrator or architect to do?

In this article, we shall explore the topic of virtual machines in Azure. We shall talk about VM types, how to read the Microsoft Azure shorthand to identify those VMs, and how to pick the best VM type for the performance that you need at the price you are willing to pay.

VM Types

When you create a VM in Azure, you are greeted with a window that looks like this:


Here, you will specify the name of the VM, the region it will run in, and other important details. What we want to focus on is the size. Clicking on “Change size” presents you with this:


You are presented with a variety of different options, including the VM size, whether it’s a promo or a standard offering, the family type, and the hardware profile.

So what do the VM size letters and numbers mean? Here is a table of what you may see:

Type – Purpose
A – General purpose
B – Burstable: discounted VMs capable of bursting when needed. Do not use for consistently high-CPU workloads.
D – General purpose compute, ideal for enterprise applications
DC – Security optimized: built-in encryption capabilities and the use of secure enclaves
E – Memory optimized: high memory-to-CPU-core ratio
F – CPU optimized: high CPU-core-to-memory ratio
G – Giant or “Godzilla”: massive VMs that exceed the capabilities of the D-series family
H – High-performance computing: extremely high memory bandwidth, ideal for scientific simulations (e.g., fluid dynamics, quantum simulation, rendering)
Ls – Storage optimized: high-throughput, low-latency, directly mapped NVMe storage. Ideal for databases and data warehousing.
M – (Massive) memory optimized: the largest memory-optimized VMs available in Azure. Ideal for SAP HANA and other in-memory-critical workloads.
N – GPU optimized: ideal for graphics rendering, deep learning, gaming, etc.

You may also notice that your VM type may have additional letters appended to it. For example, you may see something like “DS3_v2” or “B8ms”. These additional letters refer to VM capabilities or a revision of a VM family type. So DS3_v2 refers to:

  • D = General purpose enterprise compute
  • S = SSD, machine capable of supporting Premium Managed disks
  • 3 = Size “3”, which is an indicator of how “big” the VM is in terms of CPU, memory, etc.
  • V2 = Version 2 of the VM type.

Similarly, B8ms refers to:

  • B = Burstable
  • 8 = Size “8”
  • m = Memory multiplier, indicating there is more memory available for this VM than usual for this family type
  • s = Premium storage options available (i.e., premium managed disks).

There are other codes as well, such as “i” (isolated) and “r” (remote direct memory access). When in doubt, check the official Microsoft page to verify what features a VM may have at https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes.
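
To make the decoding concrete, here’s a rough sketch of how you might pull a size string apart in Python. It only understands the handful of codes described above; it’s for illustration and is not an official parser of Azure’s naming scheme:

import re

FAMILIES = {"A": "general purpose", "B": "burstable", "D": "general purpose compute",
            "DC": "security optimized", "E": "memory optimized", "F": "CPU optimized",
            "G": "giant", "H": "high-performance computing", "L": "storage optimized",
            "M": "massive memory optimized", "N": "GPU optimized"}
FEATURES = {"s": "premium storage capable", "m": "extra memory",
            "i": "isolated", "r": "RDMA capable"}

def decode_size(name):
    """Very rough decode of names like 'B8ms', 'E4s_v3', or 'DS3_v2' -- illustration only."""
    base, _, version = name.partition("_")
    match = re.match(r"(DC|[A-Z])(S?)(\d+)(?:-(\d+))?([a-z]*)$", base)
    if not match:
        return None
    family, premium, size, constrained, feature_letters = match.groups()
    features = ["premium storage capable"] if premium else []
    features += [FEATURES[c] for c in feature_letters if c in FEATURES]
    return {"family": FAMILIES.get(family, "unknown"),
            "size": int(size),
            "constrained_vcpus": int(constrained) if constrained else None,
            "features": features,
            "version": version or "v1"}

print(decode_size("DS3_v2"))     # general purpose compute, size 3, premium storage, v2
print(decode_size("B8ms"))       # burstable, size 8, extra memory plus premium storage
print(decode_size("DS13-4_v2"))  # size 13 constrained to 4 vCPUs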

Performance, Cost, and Compatibility

Now that you understand what the VM name code means, you will want to compare the actual underlying performance of the VM relative to other options. One way to do this is to look at how many Azure Compute Units (ACUs) a VM has. ACUs are a way to compare CPU performance between different types of virtual machines in Azure. The benchmark starts at 100 for A1 through A4; as a rough rule of thumb, a VM rated at 200 ACU per vCPU is roughly twice as performant as one rated at 100 ACU per vCPU. Use the following link to show the ACU count for any particular VM SKU: https://docs.microsoft.com/en-us/azure/virtual-machines/windows/acu.

Next, you will want to consider the cost of running a VM in the cloud. To do this, navigate to https://azure.microsoft.com/en-us/pricing/details/virtual-machines/windows/ and find your instance type. You may notice that certain VM types are not available in the region of your choice (for example, at the time that this article was written, G-type VMs were not available in the East US region). Make sure you account for that when planning your deployment!

Now that we have determined the VM type that we want to run our migrated or failed-over VMs on, and we have determined that the cost is acceptable for the performance we expect, we want to finish our planning by making sure that the VMs themselves are supported in the cloud. I often tell my customers that Zerto doesn’t care what you’re replicating, but Azure does. Microsoft maintains a supportability matrix for both Windows and Linux, so it is wise to ensure that you are failing over a VM that has a supported OS.

https://support.microsoft.com/en-us/help/2721672/microsoft-server-software-support-for-microsoft-azure-virtual-machines

https://docs.microsoft.com/en-us/azure/virtual-machines/linux/endorsed-distros

Example:

Your customer is running SQL Server 2018 on Windows Server 2016. The virtual machine consists of 4 vCPUS, 32GB of RAM, and 16 VMDKs. The customer wants to utilize premium managed disks in Azure. Which VM size will most closely fulfill the customer’s requirements?

  1. DS13-4_v2
  2. E4s_v3
  3. A4m_v2
  4. E4_v3

You can immediately discount options 3 and 4, as neither of those instance types supports premium managed disks. Looking at the VM size matrix, you can see that DS13-4_v2 will be enough, though it has more RAM and disks than required. E4s_v3 matches the 4 vCPUs and 32GB of memory, but it supports only 8 data disks, and you cannot exceed that number without moving to a different VM type, so it cannot hold the customer’s 16 VMDKs. Thus, out of the four options available, the best answer is 1.

Note that for the aforementioned example, an even better option exists in Azure: E8-4s_v3, which provides 4 vCPUs, 64GB of RAM, and 16 data disks, at a cost of $374.98 per month, versus the DS13-4_v2, which costs $551.30. Also note that the ACU for Es_v3 is 160-190 versus 210-250 for the DS13-4_v2. However the DS13-4_v2 VM is not hyperthreaded while the E4s_v3 is. Take into consideration these details before committing to a specific VM family type!

Conclusion

It is ultimately the end-user’s responsibility to choose the correct VM type and size for their workloads when moving or rearchitecting VMs to run in the cloud. However, as trusted advisors and solutions engineers, we can help guide our customers through the often-confusing journey to the cloud. Picking the right size VM requires an understanding of your customer’s data and applications, but by investing the time and effort to understanding their requirements, you will significantly improve their chances of success.

A selfish [virtualized] look at 2016-2017

(image credit to Ken Horn, https://twitter.com/kdmhorn)

So the last year and a half has been wild.

In 2016, I left my employer and pursued a new opportunity with a data protection company called Zerto, which is where I reside and work today as a solutions engineer. Zerto provides an enterprise-class solution for business continuity, disaster recovery, and application mobility, and as someone that has been doing data protection for about a decade, it’s very refreshing to work on a platform that a.) works well and b.) gets people excited about data protection, a traditionally dry topic.

I can get into Zerto and what it does in detail in later posts, but I want to reflect on the opportunities that this platform has given me, especially as it revolves around virtualization.

First off, my new employer has given me a huge opportunity to get into the inner workings of ESXi and vSphere as a whole. As a technologist, I’m here to discover new and interesting ways to solve problems, and vSphere is a rich platform to develop creative solutions with. I hold current VCP5 and VCP6 certifications but didn’t constantly touch many of the things that VCP gets into. With Zerto, I have had the opportunity to explore production installs of VSAN, get into the weeds of the vSphere API, and truly understand what those esoteric VMX files are really telling us about a VM!

Secondly, I have had the pleasure of meeting a ton of folks in the virtualization world. Fellow geeks and tinkerers that have the same goal of solving problems and doing cool stuff to further their missions are always fun to talk to. Jonathan Frappier (@jfrappier) and Mike Marseglia (@mmars) really gave me the push I needed to get into the VMware and virtualization community, and from there I started networking with the Providence VMUG crowd as well as the Boston VMUG UserCon regulars.

Third, I now have a paper trail of presentations that I have given to the VMUGs at Rochester, Boston (Twice!), Hartford, and Providence. The great thing about VMUGs is that they are opportunities not just to talk about your company or product but to really connect regarding virtualization and the challenges and problems that you can solve with them. For example, my presentation at the Providence VMUG was a “back-to-basics” discussion regarding having a plan surrounding disaster recovery. Far too many organizations rely on the “throw your hands up and panic” mode of disaster recovery, and I discussed how to incorporate vSphere and Zerto into a higher-level holistic level of DR planning.

So here’s where and when I presented at VMUGs in 2016 and 2017:

So, with that said, where do we go from here? Public cloud integration. vSphere on AWS. 2016 and the first half of 2017 were pretty darn good… let’s see how the next year and a half shapes up!!


-ajs

Basic Building Blocks: Setting up bind in CentOS 7 for DNS

featured image retrieved from http://www.techrepublic.com/blog/it-security/dns-resource-record-integrity-is-still-a-big-big-problem/

I had the chance to buckle down and get some DNS servers running in the lab. The process, although not painless, was not terribly difficult, and the upshot is that my lab is far more scalable than it was previously. I used a good guide that I found at Unixmen.

First, a primer on DNS. We all use DNS and we know that it provides name resolution against IP addresses, resulting in a far easier way to access our favorite servers than remembering a set of numbers. What do the core components of DNS do? I am going to assume you know what an IP address, host name, FQDN, etc. mean, so here’s a quick vocabulary list for us to get started:

  • Name server: this is a system running DNS server software and is intended to provide name resolution. DNS server software may include Microsoft’s built-in DNS server or something like BIND. Speaking of BIND…
  • BIND: Stands for Berkeley Internet Name Domain. This is the software package that I used for my DNS servers.
  • Zone files: Text documents that associate names to IP addresses (and vice versa). Zone files contain records, which map individual names to addresses and indicate what kind of entry each one is. We’ll talk about specific types of records shortly.

Now that we know what we’re talking about, how do we go about setting up BIND? First, start with installing a minimal CentOS 7 image.


You’re going to want to do this twice, assuming that you want a primary and secondary DNS server. Read the Unixmen guide above for the differences in slave setup; if you’re able to set up a master DNS server you will have no problem setting up a slave. A primary server will be sufficient for some, but I plan on abusing these systems a bit down the line.

With CentOS installed and running, assign the systems static IP addresses and make sure that they can access the internet. The tool you want to use in a minimal install of CentOS 7 is nmtui.


OS installed, networking set up and blinking… let’s install bind!

sudo yum install bind bind-utils -y

After that’s done, we need to edit /etc/named.conf.

[aschenck@dns1 ~]$ sudo cat /etc/named.conf
//
// named.conf
//
// Provided by Red Hat bind package to configure the ISC BIND named(8) DNS
// server as a caching only nameserver (as a localhost DNS resolver only).
//
// See /usr/share/doc/bind*/sample/ for example named configuration files.
//

options {
        listen-on port 53 { 127.0.0.1; 192.168.1.2;};
#       listen-on-v6 port 53 { ::1; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        allow-query     { localhost; 192.168.1.0/24;};
        allow-transfer  { localhost; 192.168.1.3;};

        /*
         - If you are building an AUTHORITATIVE DNS server, do NOT enable recursion.
         - If you are building a RECURSIVE (caching) DNS server, you need to enable
           recursion.
         - If your recursive DNS server has a public IP address, you MUST enable access
           control to limit queries to your legitimate users. Failing to do so will
           cause your server to become part of large scale DNS amplification
           attacks. Implementing BCP38 within your network would greatly
           reduce such attack surface
        */
        recursion yes;

        dnssec-enable yes;
        dnssec-validation yes;

        /* Path to ISC DLV key */
        bindkeys-file "/etc/named.iscdlv.key";

        managed-keys-directory "/var/named/dynamic";

        pid-file "/run/named/named.pid";
        session-keyfile "/run/named/session.key";
};

logging {
        channel default_debug {
                file "data/named.run";
                severity dynamic;
        };
};

zone "." IN {
        type hint;
        file "named.ca";
};

zone "vlab.local" IN {
        type master;
        file "forward.vlab";
        allow-update { none; };
};

zone "1.168.192.in-addr.arpa" IN {
        type master;
        file "reverse.vlab";
        allow-update { none; };
};

include "/etc/named.rfc1912.zones";
include "/etc/named.root.key";

 

The key lines to edit are the listen-on address, the allow-query and allow-transfer statements, and the two zone definitions at the bottom (the allow-transfer line needs to be added). What we are telling named.conf to do is listen for DNS traffic on our IP and port, allow anyone on our subnet to query for DNS information, and transfer zone files to any specified slaves. Finally, we need to add references to our forward and reverse zone files. Obviously, change things like “vlab.local” to whatever your own personal private domain happens to be.

With /etc/named.conf ready, we now need to create  our zone files, located at /var/named/.

Start with the forward zone file.

[aschenck@dns1 ~]$ sudo cat /var/named/forward.vlab
[sudo] password for aschenck:
$TTL 86400
@   IN  SOA    dns1.vlab.local. root.vlab.local. (
        00001       ;Serial
        3600        ;Refresh
        1800        ;Retry
        604800      ;Expire
        86400       ;Minimum TTL
)
@       IN  NS        dns1.vlab.local.
@       IN  NS        dns2.vlab.local.
@       IN  A         192.168.1.2
@       IN  A         192.168.1.3
@       IN  A         192.168.1.175
dns1           IN  A  192.168.1.2
dns2           IN  A  192.168.1.3
aveproxy       IN  A  192.168.1.175
[aschenck@dns1 ~]$

 

Now, what the heck does all of this mean? Let’s go through it line by line:

  • $TTL 86400 – This specifies the time to live. This is the amount of time in seconds after which a DNS client must discard an old record and grab an updated version.
  • IN SOA – “Internet Start of Authority”
  • dns1.vlab.local. – Primary master domain server.
  • root.vlab.local. – Email address of administrator for this zone. The . after root automatically becomes a @.
  • Serial – This is an arbitrary number that must be incremented every time you update the zone file so that it propagates correctly.
  • Refresh – Amount of time slaves will wait before polling the master for changes.
  • Retry – Amount of time a slave will wait before polling the master in case it is unreachable.
  • Expire – Amount of time a slave will wait before no longer returning DNS results as authoritative in case of master failure.
  • Minimum TTL – Amount of time that the name server will cache an error if it cannot find the requested name.

Finally, let’s talk about the record types found in the forward zone file:

  • NS records: These are name servers. Notice that we have NS records for both dns1.vlab.local as well as dns2.vlab.local.
  • A records: These map a host name to an IPv4 address. Notice I have records for three hosts, including the already-mentioned dns1 and dns2 hosts.

Our reverse zone file looks similar, but serves a different purpose. Forward lookups resolve hostnames to IP addresses; reverse lookups associate IP addresses to hostnames!

[aschenck@dns1 ~]$ sudo cat /var/named/reverse.vlab
[sudo] password for aschenck:
$TTL 86400
@   IN  SOA    dns1.vlab.local. root.vlab.local. (
        2011071001  ;Serial
        3600        ;Refresh
        1800        ;Retry
        604800      ;Expire
        86400       ;Minimum TTL
)
@       IN  NS        dns1.vlab.local.
@       IN  NS        dns2.vlab.local.
@       IN  PTR       vlab.local.
dns1        IN  A     192.168.1.2
dns2        IN  A     192.168.1.3
aveproxy    IN  A     192.168.1.175
2       IN  PTR       dns1.vlab.local.
3       IN  PTR       dns2.vlab.local.
175     IN  PTR       aveproxy.vlab.local.
[aschenck@dns1 ~]$

As you can see, the beginning of these files is the same, but now we introduce PTR (pointer) records. PTR records do the bulk of the work when it comes to reverse DNS lookups.

With our zone files created, we can now enable and start the DNS service on CentOS, as well as poking the appropriate hole through the firewall.

systemctl enable named
systemctl start named
firewall-cmd --permanent --add-port=53/tcp
firewall-cmd --permanent --add-port=53/udp
firewall-cmd --reload

Just a few more things to do… change the group of /var/named to named, change the owner and group of /etc/named.conf, and tell SELinux about the new defaults for /var/named as well as /etc/named.conf.

chgrp named -R /var/named
chown -v root:named /etc/named.conf
restorecon -rv /var/named
restorecon /etc/named.conf

The last thing you need to do, assuming all is well, is go back into nmtui and change your DNS IP to your DNS server’s static IP address!

Now, the moment of truth… let’s test our DNS server using dig.

[aschenck@dns1 ~]$ dig google.com

; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.1 <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 58864
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 5

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             300     IN      A       216.58.219.206

;; AUTHORITY SECTION:
google.com.             148671  IN      NS      ns1.google.com.
google.com.             148671  IN      NS      ns3.google.com.
google.com.             148671  IN      NS      ns2.google.com.
google.com.             148671  IN      NS      ns4.google.com.

;; ADDITIONAL SECTION:
ns2.google.com.         148671  IN      A       216.239.34.10
ns1.google.com.         148671  IN      A       216.239.32.10
ns3.google.com.         148671  IN      A       216.239.36.10
ns4.google.com.         148671  IN      A       216.239.38.10

;; Query time: 260 msec
;; SERVER: 192.168.1.2#53(192.168.1.2)
;; WHEN: Mon Dec 21 18:37:13 EST 2015
;; MSG SIZE  rcvd: 191

[aschenck@dns1 ~]$

If you get output similar to what you see above, and the “SERVER” line shows your DNS server’s IP address, you’re all set: you can start using your DNS server to resolve names on your lab’s network.
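
It’s also worth confirming that your own records resolve in both directions. A couple of quick spot checks, with @192.168.1.2 forcing dig to query dns1 directly:

# forward lookup of a lab host, then a reverse lookup of its IP
dig @192.168.1.2 aveproxy.vlab.local +short
dig @192.168.1.2 -x 192.168.1.175 +short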

Now that DNS is up and running, I have been able to install the Avamar image proxy on my machine without any problems 🙂 We’ll go over Avamar in a future post.

aveproxy

Things learned today: take the time to set up basic services like DNS as opposed to relying on your consumer router to do it for you. It’s a lot less frustrating to get it set up correctly and then have a dedicated VM to serve your lab than it is to fight with a router that’s marketed towards people who would have zero clue what a pointer record or zone file is!

-ajs

A little prep goes a long way: thoughts on basic services for a home lab

badcable

Image credit: http://imgur.com/gallery/HYO11dd/new

What’s the saying? “If you find yourself in a hole, the first step is to stop digging.” Sage advice, assuming you actually follow it; I’m guilty of not following my own advice here.

As I continue to build my lab up, I am finding it increasingly difficult to scale. I am already up to nine VMs and it’s becoming a challenge to manage all of the different DNS names and IPs. Part of the problem is that my router, an ASUS RT-AC68U, is acting as both a DNS and a DHCP server for everything, lab and home network included. While it has worked very well for me as a standard home router and has provided excellent 802.11ac coverage, it is not really well equipped to do things like record editing and manual forward/reverse lookup changes.

In the back of my mind, I knew that this problem was a ticking time bomb. Sooner or later, my inability to create or edit DNS records ahead of deploying a new VM was going to rear its ugly head, and sure enough, today was the day it decided to do so.

I managed to successfully install EMC Avamar Virtual Edition in my lab (hooray! also, I’ll post about this soon!) and I am preparing to do image-level, proxy-driven backups of a couple of VMs. The problem I am encountering, however, is that Avamar doesn’t recognize the DNS name of the proxy I am attempting to create, since there is no A record to reference, and it won’t let me fix the problem after installation because it simply uninstalls the proxy at the first sign of trouble. Clearly, this just won’t do.

The short-term fix is to deploy a system that acts as a DNS server for my lab, which I’ll be working on soon. Thankfully, I have a ton of different options for doing this.

  1. Linux is an obvious choice. DNS is not exactly a high-stress service that requires a lot of power, so it may make sense to deploy a standard CentOS 7 server from a template, install the DNS server packages, and point all new lab clients to that server for future use. aboutdebian.com has an excellent writeup on the matter.
  2. There’s another, perhaps more interesting option: pfSense. pfSense is a FreeBSD-based open source firewall/router that a lot of home labbers like to throw around as a good services OS. Again, this would be a low-impact VM, and I don’t mind using something outside of Linux if it makes sense and is purpose-built.

I’ll likely deploy this either tomorrow or over the weekend, but for now, here are my thoughts for future lab builds to make life a little easier:

  1. Start with a plan. It doesn’t have to be a great plan and it can just be a sheet of notebook paper that you quickly sketched out, but don’t just deploy ESXi, throw VMs on top of it, and hope for the best. Have some sort of hierarchy. Will you be using a separate subnet outside of your home network? Will you be deploying VLANs? What will you do for separate physical devices, such as routers, hosts, etc.?
  2. Use Excel or Google Docs to keep track of your VM names, DNS names, IP settings, and so forth. Again, this doesn’t need to be fancy, but you’ll thank yourself down the line: it’ll make both troubleshooting and expanding your lab so much easier in the future.
  3. If you have access to Visio, use that to diagram your lab as it grows. A visual representation of your lab will go hand in hand with your spreadsheet in both troubleshooting and expanding.
  4. If your lab is going to grow beyond a few VMs (and environments like mine are pretty much guaranteed to do so!), spend time deploying service VMs. DHCP and DNS should absolutely be part of this, and if you are planning on doing anything with Active Directory, plan on doing that as well.
    1. By extension, that means you should also plan on deploying app servers, such as SQL, if you foresee the need for them!


For my next post, I’ll spend a little time on deploying pfSense and reviewing DNS, and then we’ll talk about Avamar Virtual Edition and why it’s absolutely awesome.


-ajs

A horrible but functional way to shut down a home lab for the evening

stackoverflow

Warning: the following barf– er, bash script might make you ill.

Over the last couple of days I’ve been trying to figure out ways to improve my home lab experience, including automating the tedious shutdown procedure. Most ESXi environments are 24/7, but of course with a home lab there is typically no such need. The trick is to power off the guests gracefully with minimal effort. Enter: scripting!

I currently have five VMs deployed in my lab: two are powered off, one is an embedded ESXi host (that currently runs zero guests), one is a vCenter Server Appliance, and one is the vSphere Management Assistant. The vMA is where all the scripting magic occurs.

The script that I have come up with is admittedly very bad.

#!/bin/bash
# Shutdown script for home lab
#set -x

IFS=""
OUTPUT=( $(esxcli --sessionfile ~/tali01-session vm process list | grep "/vmfs" | cut -c17-256) )
OUTPUTARR=()

while read -r line; do
    OUTPUTARR=("${OUTPUTARR[@]}" "$line")
done <<< "$OUTPUT"

while true; do
    read -p "Are you sure you want to shut down the lab (Y/N)? " yn
    case $yn in
        [Yy]* )
            echo "Shutting down vSphere lab..."
            for ((i = 0; i < ${#OUTPUTARR[@]}; i++))
            do
                echo ${OUTPUTARR[$i]}
                vmware-cmd --sessionfile ~/tali01-session ${OUTPUTARR[$i]} getstate
                vmware-cmd --sessionfile ~/tali01-session ${OUTPUTARR[$i]} stop soft
                vmware-cmd --sessionfile ~/tali01-session ${OUTPUTARR[$i]} getstate
                sleep 60
            done; exit;;
        [Nn]* )
            echo "Exiting."
            exit;;
        * ) echo "Please answer yes or no.";;
    esac
done

What the script does is this: it sets bash’s special IFS variable to an empty string so that the esxcli output doesn’t get split on whitespace, then it grabs the output of esxcli. This is where it starts getting awful. --sessionfile is something you can use to authenticate to an ESXi host without having to repeatedly type in a username and password. Here’s the problem: the session file expires if you don’t use it within 30 minutes. Already we encounter problem one: you have to generate a new session file with --savesessionfile every time you want to use this script.
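
For reference, generating that session file from the vMA looks roughly like this; tali01.vlab.local is just a stand-in for your ESXi host’s name, and it should prompt you for the password:

# authenticate once and cache the session (it expires after ~30 minutes of inactivity)
esxcli --server tali01.vlab.local --username root --savesessionfile ~/tali01-session system version get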

Problem number two immediately follows. We’re grepping for config files that we can use to power down VMs, and I use cut to pick out the exact part of the string that I want. This is a horrendous way of grabbing part of a string, because if the output format of esxcli vm process list ever changes, the cut will hand back garbage. Furthermore, notice that I only cut up to character 256… what if a .vmx path runs past that by even a single character?
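
For what it’s worth, a slightly less fragile approach would be to match the path itself rather than hardcoding character positions, for example with grep -o:

# grab just the .vmx paths, however long they happen to be
esxcli --sessionfile ~/tali01-session vm process list | grep -o '/vmfs/volumes/.*\.vmx'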

Up next: reading a multi-line output into an array. I banged my head against this part for the last two days. It should be so simple, and ultimately it was a modification of the ubiquitous “x = x + 1” or “x++” method: take an empty but initialized array, read a single line, append it to the array, and repeat until done.

Next up: the actual shutdown. Prompt the user to make sure he or she actually wants to shut down the lab. If the answer is yes, start walking through each config file stored in the array and softly power off the guest, waiting 60 seconds before looping back to the beginning. Repeat for each system.

That’s it. The concept is simple enough, but there are a few other reasons why the script sucks: it shuts down the very guest it runs on, and there’s no intelligence behind what it shuts down first. It just so happens to collect the config files in the order that I want them shut down, but that is horribly hacky and not scalable whatsoever.

I had a good friend of mine (who is a bazillion times the programmer that I am) look the script over, and his verdict was that I had “told bash to stop being bash”. I thought it was funny.

So, where do we go from here?

  1. Figure out a way to prioritize the shutdown of certain VMs. I have barely scratched the surface of the commands available on vMA, so there’s got to be a way to tag or otherwise identify VMs for position in a shutdown list.
  2. Port this to PowerCLI or figure out some way to make it so that the very VM it runs on isn’t shut down when executing the script.
  3. Come up with options, such as a “fast” shutdown where the sleep time is 30 seconds or less, a “hard” shutdown where everything goes down as quickly as possible (i.e. yank the power cord), and so on.
  4. The authentication method with the ESXi server is downright miserable. Use something like --credstore instead to make authentication persistent (a rough sketch follows below).
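
On that last point, the vMA ships with a credential store helper script. A rough sketch of what that might look like; the script path, host name, and credstore location are from memory and from my lab respectively, so double-check them on your own vMA:

# store the host's credentials once ('changeme' is a placeholder password)
/usr/lib/vmware-vcli/apps/general/credstore_admin.pl add --server tali01.vlab.local --username root --password 'changeme'
# subsequent esxcli calls can then authenticate from the credential store
esxcli --server tali01.vlab.local --credstore ~/.vmware/credstore/vicredentials.xml vm process list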

If you have any suggestions that focus on either features or scripting, let me know!

-ajs