beyond this horizon.

We're working on a virtualization API and web management tool, just like everyone else in the world.  What will differentiate ours is that it will be something we want to use, and it will display our. . . idiosyncratic approach to UI, authentication, and features.

Actually, I'm pretty enthused.  We're starting from the Rackspace API and throwing in Luke's coercive provisioning ideas, and it's kind of turning into something interesting.  (But hungry.  Always hungry.)  I've thrown myself wholeheartedly into the project, as I tend to do when confronted with the prospect of great effort for uncertain reward.

Yeah, I don't know why that is either.  I should probably get that checked.

anticipate.

Okay, I guess that was an adequate break.  I worked through some of the book examples with my dad, and man -- I was rusty.  It was pretty embarrassing, although it all seemed to come back pretty quickly.

Anyway, though, we've hit a bunch of errata.  Mostly little things.  Some absolute deal-breakers.  There are a couple of bits that we need to expand our explanations of.  Some sections where I really just want to say "skip ahead and skim the next two chapters, then come back and this'll make more sense."  Not sure how to deal with those.

I guess these things are never done.

automation is the cloud


So, Brandon Burton linked to his Automation is the cloud article on the lopsa-discuss mailing list today, and he requested that I post my response here, so those of you in blog-land can see it.

First, I very much like your 'automation is the cloud' thesis. I think that is the only sane way to define 'the cloud'. (Of course, I think leaving 'the cloud' to the marketing department, then going back to automating our stuff, might be an even better choice.)

I am very interested in this, because I'm trying to pull the good parts of 'the cloud' and use them in my own service.

Right now, my intent is to build something that emulates the existing pxeboot/rebooter setup we have with physical boxes... something that can suck in a dhcpd.conf created by cobbler or the like and do the correct thing. This will make it easier to integrate with existing tools so that you can manage the servers you rent using the same tools as the servers you own.

Most of the 'API' stuff looks silly to me. You want Python bindings to provision new servers? Really? But then, I'm a janitor, not a developer, so maybe it's just that much easier for them to use Python than to set up ssh keys and a script?

Personally, I believe that if the cloud is going to be anything more than a super-high-margin playground for those who don't care about money or performance, we need to decouple virtualization from the cloud.

Live migration and failover both require shared storage. (Failover requires doubling your other compute resources as well.) That usually makes them a no-go when both cost and performance/reliability matter. You can get good, fast shared storage, but it's not cheap, etc... (Amazon seems to have stood smack dab in the middle of the good/fast/cheap triangle with its elastic block storage project... It's not super reliable, but it's not horrible. It's not super expensive, but it's not horrible. It's not that fast, but it's usable. I think they probably made reasonable choices with what they had. I also think their decision to go with local disk (thus you can't use live migration) was probably a good one, considering the choices.)

Without those things, virtualization becomes nothing more than a tool to turn big servers into many little servers. Which is great if you need little servers. It is way cheaper to run one 32GiB ram/8 core box than to run eight 4GiB/1 core boxes, let me tell you.

This means that if you do need a 32GiB/8 core box, you lose out by virtualizing. Without shared storage, pxeboot and rebooting power strips give you almost everything virtualization gives you in terms of automation. Virtualization also has a cost in terms of security (remember the hyperthreading cache-peeking vulnerability?); having a box all to yourself will always be more secure than sharing one. With virtualization, the amount of cpu/disk bandwidth available is either unknown or split very hard (like the above example, where I've dedicated a core to each virtual, which is exaggerating a bit: you need to dedicate a core to the control server, too, if you want reasonable performance).

My next project is to set up prgmr.com so that customers can upload dhcpd.conf files or similar from cobbler and my system can kickstart the DomUs they own, for a more seamless transition between servers they own, and virtual and physical servers they have with me.
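To make the "suck in a dhcpd.conf" idea a little more concrete, here is a minimal sketch of the first step: pulling the name, MAC and fixed address out of each host block. This is not what prgmr.com actually runs. The function name list_dhcp_hosts is made up for illustration, and it assumes the tidy one-statement-per-line layout that generated files tend to have; a hand-edited dhcpd.conf would want a real parser.

```shell
#!/bin/sh
# Hypothetical helper: print "name mac ip" for every host block in a
# dhcpd.conf-style file. Assumes one statement per line, e.g.
#   host web1 {
#       hardware ethernet 00:16:3e:00:00:01;
#       fixed-address 10.0.0.5;
#   }
list_dhcp_hosts() {
    awk '
        # start of a host block: remember the name, reset the fields
        $1 == "host" { name = $2; sub(/[{]$/, "", name); mac = ""; ip = "" }
        # statements inside the block: strip the trailing semicolon
        $1 == "hardware" && $2 == "ethernet" { mac = $3; sub(/;$/, "", mac) }
        $1 == "fixed-address" { ip = $2; sub(/;$/, "", ip) }
        # closing brace of a host block (other blocks have no name set)
        $1 == "}" && name != "" { print name, mac, ip; name = "" }
    ' "$1"
}
```

Something like `list_dhcp_hosts /etc/dhcpd.conf` would then give one line per machine, which is about the minimum a provisioning system needs to know which DomU to kickstart.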

That's the other problem: owning is much cheaper than renting. When I say this, most people point at the 'sysadmin time' thing, which is expensive, but the only part of sysadmin time that 'the cloud' covers is hardware installation and replacement, and setup of your provisioning system. You still need a sysadmin for troubleshooting (is it hardware vs. is it my OS), though the provisioning system makes that a little easier, as you can just move your junk to new hardware. You still need to handle configuration management. (EC2 has some basic tools, but realistically you still need someone on hand who knows puppet, chef, cfengine or who really knows your OS and can code up some fancy perl scripts.) What I'm saying is that the cloud only saves you from schlepping hardware. You still need a sysadmin.

But what that means is that you need a system that can handle servers you rent as well as servers you own, in a seamless manner. (For short-term stuff, it's reasonable to pay a lot extra to rent if you only need the servers for a few days.)

project plowshare.

Wow, it's been a while since the last time I wrote here.  I'm not apologizing or anything -- we've got good reasons for that.  Ish.  Good-ish.  Anyway.

First, most exciting, the book is done.  That's writing, tech review, copyedit, layout, and proofreading, with the attendant reviews by us at each step.  It looks increasingly real, and it's actually pretty scary.  Look for it in stores this year.  Ask for it by name.  Any customers who buy it shall have such thanks as befit a king's remembrance.

Second, prgmr's been growing at a pretty intense clip.  That's not really an excuse, but we're pretty happy with how things have been going, so it bears mentioning.

Finally, I've been taking a break from virtualization.  I spent several months in there saying "I want this book to be somebody else's problem," and so I basically headed off for a break as soon as humanly possible.  In spirit at least, if not in body.  But I think I'm over that now, and we've got a lot more to work on here.

More on future plans later.

| | Comments (0)
Writing quote of the day:

"i mean, we shouldn't slavishly avoid double negatives when they make our meaning less unclear."

Take that, English language.

The sole problem that I had setting up a pvops domU on a pvops dom0 (mostly following the instructions at http://bderzhavets.wordpress.com/2009/03/29/setup-xen-unstable-dom0-with-2629-tip-pvops-enabled-kernel/ ) was in trying to get the Ubuntu domU's console to work.  Eventually I solved it by adding an 'hvc0' file to the domU's /etc/event.d/ directory, like this:

# cd /etc/event.d
# cp tty1 hvc0
# sed -i -e "s/tty1/hvc0/g" hvc0


(Credit to Mr. bderzhavets, again.  There's a lot of good and useful information there.)

I also included an "extra='xencons=tty'" in the domU config.  (I believe.  Might have been hvc.  Try it and see.)
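Since I'm hazy on exactly which console device name ends up being the right one, here's the same trick as a small function that takes the names as arguments. It's a sketch, not anything official: make_console_job is a name I made up, and it assumes the upstart job files live one-per-console in a single directory the way Ubuntu's /etc/event.d does.

```shell
#!/bin/sh
# Hypothetical helper: copy an existing console job (e.g. tty1) to a new
# one (e.g. hvc0), rewriting every mention of the old device name.
# Usage: make_console_job DIR SRC DST
make_console_job() {
    dir=$1 src=$2 dst=$3
    # refuse to run if the template job is missing
    [ -f "$dir/$src" ] || { echo "missing $dir/$src" >&2; return 1; }
    # same substitution as the sed one-liner, written to the new file
    sed -e "s/$src/$dst/g" "$dir/$src" > "$dir/$dst"
}

# Example (run as root inside the domU):
#   make_console_job /etc/event.d tty1 hvc0
```

If the first guess at the device name doesn't give you a login prompt, rerun it with the other one.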

onward, not forward.

Ah, progress!  Thankfully, this problem seems to be fixed in both Solaris Express and OpenSolaris, so I guess we can take out this section.  I'm pretty happy about that, yeah.


Devices and Hotplug

Solaris' hotplug system may give you some trouble. The problem is that the default hotplug configuration (in 2008.05) doesn't react correctly to the Xen virtual devices. This causes domain creation to fail with an error message like "Device 0 (vif) could not be connected. Hotplug scripts not working."

To fix the hotplug scripts, create and run a script like the following, which adds appropriate rules for Xen devices:[1]

BASEDIR=${BASEDIR:-/}

# add the xendev event rule unless one is already registered
/usr/sbin/syseventadm list -R $BASEDIR -c EC_xendev > /dev/null 2>&1
if [ $? -ne 0 ]
then
    /usr/sbin/syseventadm add -R $BASEDIR -c EC_xendev \
        /usr/lib/xen/scripts/xpvd-event 'action=$subclass' \
        'domain=$domain' 'vdev=$vdev' 'device=$device' \
        'devclass=$devclass' 'febe=$febe'
fi

# add the xpvsys event rule unless one is already registered
/usr/sbin/syseventadm list -R $BASEDIR -c EC_xpvsys > /dev/null 2>&1
if [ $? -ne 0 ]
then
    /usr/sbin/syseventadm add -R $BASEDIR -c EC_xpvsys \
        /usr/lib/xen/scripts/xpvsys-event 'subclass=$subclass' \
        'shutdown=$shutdown'
fi

# restart daemon if the package is being added to the running system
if [ "$BASEDIR" = "/" -a $? -eq 0 ]
then
    /usr/sbin/syseventadm restart
fi


[1] Script from http://blogs.sun.com/dkumar/entry/problem_bring_up_domu

Some days it seems like the caches have churned just right, and El Goog is giving me more useful information than I know what to do with. . . circa two years ago.  This is that day.  I know I've used these search strings before, but some of these links are positively delicious.

Anyway, here's an interesting post on using Perl to communicate with Xen via the XenAPI.

http://unixfoo.blogspot.com/2007/12/perl-xen-api.html
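
For the curious, the Xen API is XML-RPC underneath, so you can poke at it without any language bindings at all. Below is a rough sketch (mine, not from the linked post) that builds the request body for session.login_with_password. The function name is made up, the host and port in the usage comment are placeholders for wherever your xend is configured to listen, and nothing here escapes XML special characters in the arguments.

```shell
#!/bin/sh
# Hypothetical helper: print an XML-RPC request body that logs in to the
# Xen API. $1 is the user name, $2 the password (not XML-escaped).
xenapi_login_request() {
    cat <<EOF
<?xml version="1.0"?>
<methodCall>
  <methodName>session.login_with_password</methodName>
  <params>
    <param><value><string>$1</string></value></param>
    <param><value><string>$2</string></value></param>
  </params>
</methodCall>
EOF
}

# Example (placeholder URL; adjust to your xend's xen-api-server setting):
#   xenapi_login_request root secret |
#       curl -s -H 'Content-Type: text/xml' --data-binary @- \
#            http://dom0.example.com:9363/
```

The response should carry a session reference that you pass as the first argument to later calls.
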

So, when installing CentOS 5.2 (and possibly other distros) with virt-manager, console output disappears after the "mounting /sys filesystem" line.  This might only happen in low-memory conditions.  (The fact that anything less than 512 MB is considered low-memory for an _installer_, by the way, I find appalling.)

To work around this issue, install using the extra kernel argument "xencons=tty" (no quotes.)
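
If you'd rather do the install from the command line, virt-install takes the same kernel argument via --extra-args. The sketch below only prints a command line for you to look over. The function name, the 4 GB disk size, and the example values in the usage comment are all placeholders of mine, and I haven't checked this exact flag set against every virt-install version.

```shell
#!/bin/sh
# Hypothetical helper: print a virt-install command for a paravirtualized
# install with xencons=tty passed to the installer kernel.
# Usage: centos_install_cmd NAME RAM_MB DISK_PATH INSTALL_TREE_URL
centos_install_cmd() {
    printf 'virt-install --paravirt --name %s --ram %s --disk path=%s,size=4 --location %s --nographics --extra-args "xencons=tty"\n' \
        "$1" "$2" "$3" "$4"
}

# Example (placeholder values):
#   centos_install_cmd c52 256 /var/lib/xen/images/c52.img \
#       http://mirror.example.com/centos/5.2/os/i386/
```

Once the printed command looks right for your setup, paste it into a root shell on the dom0.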

on xen vanilla kernels

Interesting thread on merging Xen dom0 support into the kernel:

http://thread.gmane.org/gmane.linux.kernel/800658/focus=800714

Obviously I'm biased toward moving Xen completely in-kernel -- it's ridiculous to rely on out-of-kernel patches that haven't been updated to work with newer kernels -- but I don't have the deep familiarity needed to give reasons why this is bad off the top of my head.  Jeremy Fitzhardinge does, and it's quite enlightening.