Is scripting an important skill?

So, yesterday, a colleague posed an interesting question: “Is scripting an important skill for a system administrator?”

I’d like to answer that question with a very resounding “YES!” Frankly, I have to question whether one is truly a system administrator without the skill to write a script (even if it’s simply a quick hack involving shell redirection). This is not about any particular scripting language. bash, ksh, PowerShell, it doesn’t matter. Scripting is scripting. They’re all just different roads that lead to the same place: automation.

Perhaps my strong feelings about this come from my days as a Unix administrator (Solaris, primarily, but I have run Linux systems and dabbled in HP-UX). As an old-ish Unix jockey, I had mentors who always reinforced a common theme: if you have to do it more than once, script it! This mantra is not just because we can do more with less by way of a script, but is also for one’s sanity, really.

Let me share an example. A number of years ago, I was hired as a Unix admin for a rather large telecommunications organization. I joined their managed hosting group, and was expecting a reasonably boring life of MACs (Moves, Adds, Changes, for those unfamiliar with the acronym) – shuffling user accounts and data, adding users, groups, software, changing things on a request basis. But upon walking in the door on Day 1, they sat me down with the existing backup guy. 3 months later, this guy was no longer there, and it was just me running the backup system (we ran the backup infrastructure entirely on Solaris at that point). What’s worse, I was a rookie backup guy, and now I had 12 data centers to deal with, and I was running solo. Sure, I could have pointed and clicked my way through the X-Windows interface to the backup software, gotten RSI on my mousing arm, and gone absolutely mad knowing I wasn’t going to get any real help. But I’m all about working to live, rather than living to work. So I spent a little time learning the CLI underpinnings of my backup software, and everything that could be done (quite a bit more than via the GUI, mind you). Sure, my daily care and feeding fell behind for a little bit, but it was oh, so worth it in the end.

So I started building scripts. Reporting scripts, maintenance scripts, add-a-backup-job scripts. If there was something that had to be done daily, I scripted something out. Weekly? Script. Monthly? Script. You get the idea. And then I loaded up my crontab in each datacenter, and I realized how much time I had to actually learn about backups and tape. All of a sudden, I had time to work with my counterparts in our engineering team to help shape the backup infrastructure. I could share pain points with them, and have data to back them up rather than anecdotes. We could dive into the real problems while I had everything nicely automated behind me. I also took the opportunity to start working much more closely with our FC SAN guys, which is a move that shaped my career for years after. Now, 6 months after I did all that work, I managed to start getting a team built up under me. Was the work all for naught? No way. Their lives were made a bit easier as well. Were I better about my own personal data backups, I’d still have those scripts today 🙂

The moral of the story: Automation is a good thing. Not because we’re letting “the man” win and doing more with less. That’s just a side-effect. The real win for automation is that it frees us up from repetitive tasks so we can spend time doing more valuable things. And value is still what it’s all about. When your manager asks, in your annual review, “What value have you brought to the company this past year?”

Your answer could be “I worked long hard hours pointing and clicking my way through to make sure the bases got covered.” But wouldn’t you rather say “I automated the daily care and feeding of these 4 infrastructure applications my team owns, and have been working with {other team} to help them increase their efficiency by automating applications X, Y, and Z”?

I know which I’d choose.


Ramblings on ESXi

As VMware continues the push to a Service-Console-less world with ESXi, there are things that we may want to contemplate with our customers.

Something that came to mind earlier today was logging. ESXi, by default, has a built-in syslog, but it writes logs to a local memory-based file system. That means that when the host goes offline, the logs just go away. There is a method by which one can redirect those messages to a specific Data Store, but let’s face it, centralized logging is all the rage! If nothing else, it provides a remote facility that won’t be modified if someone gets into the ESXi host and cleans entries up after they’re done. To me, that’s some pretty important security. So how does one redirect syslog on an ESXi host, you ask?

It’s as simple as changing a single Advanced Setting via the vSphere client. Take a look at this brief blog entry at VirtualizationAdmin.com by David Davis: http://blogs.virtualizationadmin.com/davis/2010/02/22/how-to-redirect-esxi-system-logs-to-a-central-syslog-server/
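If you want to confirm in a lab that a host really is forwarding its messages after the change, a throwaway listener is enough. What follows is only a minimal sketch in Python, assuming the host sends plain syslog datagrams over UDP; the function names, the log file name, and the one-line-per-message layout are all my own choices for illustration, and a real central log server would be a proper syslog daemon rather than this.

```python
import socket


def open_listener(bind_addr="0.0.0.0", port=514):
    """Open a UDP socket for incoming syslog datagrams.

    Port 514 is the traditional syslog port and usually needs root
    privileges to bind; pass a higher port for unprivileged testing.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def collect(sock, log_path, max_messages=None):
    """Append each received datagram to log_path, prefixed with the sender's IP.

    Runs until max_messages datagrams have been written, or forever
    when max_messages is None. Returns the number of messages written.
    """
    received = 0
    with open(log_path, "a", encoding="utf-8") as log:
        while max_messages is None or received < max_messages:
            data, (host, _port) = sock.recvfrom(8192)
            text = data.decode("utf-8", errors="replace").rstrip("\n")
            log.write(f"{host} {text}\n")
            log.flush()  # keep the file current even if the process is killed
            received += 1
    return received


# Hypothetical lab usage (the file name is an illustrative assumption):
# collect(open_listener(), "esxi-hosts.log")
```

Because the file lives on a different machine than the ESXi host, it survives a host reboot and is out of reach of anyone tidying up entries on the host itself, which is the whole point of logging remotely.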

The other things we want to think about in the transition are ultimately all related to the Console Operating System (COS), its absence being the biggest difference between ESX and ESXi.

Does the customer have agents running in the COS for anything?

  • Backup agents – Perhaps it’s time to revisit backup strategies and methodologies.
  • Hardware management agents – Insight Manager, OpenManage, etc. – Many of these functions are being replaced through vendor-specific CIM providers. VMware offers 4 ISOs in total for ESXi Installable – one for each of the major vendors (HP, IBM, Dell), plus the basic ESXi. The vendor-specific distributions have the appropriate CIM providers cleanly integrated. We should work with our customers in their labs to determine if the CIM providers have the functionality necessary for their specific environments.

Scripts in the COS – customers have developed many scripts to help with management activities in the ESX environment. It is time to begin investigating how to move these scripts to a remote environment. There are a few directions that a customer could take in porting their scripts:

  • vCLI – the vCLI is a set of tools available from VMware to provide much of the COS toolkit on a remote host. The vCLI is available in 3 forms: a Windows installable package, a Linux installable package, and the vSphere Management Assistant (vMA). The two installable packages can be installed on and run from a Windows or Linux environment. The vMA is a Linux-based Virtual Appliance that can be integrated into a customer’s environment and is designed to provide a prepackaged remote scripting environment for a virtual infrastructure. The vMA provides a number of benefits over the installable vCLI tools such as FastPath Authentication to streamline session authentication functionality without compromising security and simplified deployment as an OVF appliance.
  • PowerShell/PowerCLI – PowerShell is fast becoming a favorite management and scripting toolkit of ESX administrators, partially due to the overwhelming number of Windows administrators that have inherited the responsibility of managing the virtual infrastructure. The PowerCLI toolkit from VMware is a robust set of cmdlets and objects to be used from PowerShell scripts to work with a virtual infrastructure.
  • Other SDKs from VMware – VMware provides SDKs for API access from Perl and Java as well, if those languages are more to a customer’s liking.
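Whichever direction a customer picks, the first step is knowing what their existing scripts actually call. Here is a rough sketch in Python of a way to take that inventory: it scans a script’s text for esxcfg-* commands and suggests the similarly named vicfg-* command as a place to start looking. The function names and the report format are mine, the comment stripping is deliberately crude, and the suggested name is only a guess based on the naming convention, since not every COS command has a vCLI counterpart. Check each one against the vCLI documentation.

```python
import re

# Matches service-console command names such as esxcfg-nics or esxcfg-vswitch.
ESXCFG_PATTERN = re.compile(r"\besxcfg-([a-z0-9]+)\b")


def find_cos_commands(script_text):
    """Return {command: [line numbers]} for esxcfg-* commands found in a script."""
    hits = {}
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        # Crude comment handling: drop everything after the first '#'.
        code = line.split("#", 1)[0]
        for match in ESXCFG_PATTERN.finditer(code):
            hits.setdefault(match.group(0), []).append(lineno)
    return hits


def candidate_vcli_name(command):
    """Guess a vCLI name by swapping the esxcfg- prefix for vicfg- (unverified)."""
    return "vicfg-" + command[len("esxcfg-"):]


def porting_report(script_text):
    """Return one human-readable line per COS command, sorted by command name."""
    report = []
    for command, linenos in sorted(find_cos_commands(script_text).items()):
        where = ", ".join(str(n) for n in linenos)
        report.append(
            f"{command} (lines {where}) -> check vCLI for {candidate_vcli_name(command)}"
        )
    return report
```

Run over a directory of COS scripts in the lab, a report like this gives a rough list of what needs a remote equivalent before anyone commits to a migration date.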

There are still some pieces of functionality that are missing from this stack, admittedly. I’ve spoken with customers about the lack of tools available to manage things like RAID controllers from ESXi. Many of these things are up to the hardware vendors to implement, but VMware can be a conduit for functionality requests as well. We can work with customers to file feature requests through VMware (http://www.vmware.com/support/policies/feature.html). When filing such a request, please be as specific as possible regarding what functionality is being requested. Using the above-mentioned RAID controller management as an example, a good feature request may document that a user would like to be able to add disks to a RAID array, create a new RAID array, destroy a RAID array, and rebuild a RAID array after disk replacement. The more specific the requests are, the more VMware can help implement the functionality.

Expanded functionality seems to be the focus of the next release of vSphere (from the small amounts of info flowing out of VMware’s recent Partner Exchange), and the product continues to improve. Just because a customer doesn’t want to migrate now is no reason to put off testing, evaluation, and porting of the customer-developed management tools.

Just remember, I’m a consultant and a trainer, and these are the kinds of things I think about 🙂

-jk