Lately (for the past 3 weeks) I have been doing a lot of work on Azure, Microsoft's cloud service. It still has a long way to go before it is even remotely comparable to AWS. I say this mostly from an ease-of-use point of view. One thing I hear a lot from Microsoft folks is how powerful the Azure PowerShell tools are, and how more and more users prefer Azure PowerShell to the UI. Well, I can't speak for all Azure users, but I can say this: I have used AWS for almost 2 years, and they too provide a powerful CLI. I have used the AWS CLI about 10 times in those 2 years. Why just 10 times? Because pretty much anything I want to do, I can do in their UI. It's called making life easier. Microsoft Azure has two different UIs, each with several limitations, thereby forcing users (or at least me) into using PowerShell. The issue is not whether PowerShell is good or bad. It's more about the fact that Azure has not yet matured enough to provide users with a simple, intuitive, yet powerful interface to manage a cloud-based data center.
This article is mostly about setting up resources in Azure (classic mode) and the quirks I noticed along the way. Perhaps these are not quirks so much as things the UI does not warn you about or prevent you from doing, and then all hell breaks loose. I will keep updating this blog post as and when I find issues and fixes for them.
My setup is as follows: a single Virtual Network (named VNET1) with a single subnet (Subnet1) in it (for now). I have one resource group (RG1). I actually use more meaningful names, but for the purpose of this blog post these names should suffice. I also have one cloud service (CS1), one availability set (AS1), and one storage account (SA1). I do not recall the sequence in which I built these, and it may play a role in what I observed later.
Now I start creating VMs in classic mode and select RG1, CS1 (domain), VNET1, etc. All works fine. I am not exactly sure of the relationship between Resource Groups, Availability Sets, and Cloud Services. But they are related…LOL. The problems I faced were:
Problem 1: Adding a new machine to a new cloud service within a virtual network (when another cloud service already exists)
When trying to add a machine to a new cloud service, I tried several approaches.
- Create a new cloud service, then create a new VM and select the cloud service you just created. FAILS. The moment you select the domain name of the new cloud service, you are forced away from VNET1 and Subnet1: some other VNET is selected and locked. At least on portal.azure.com.
- So instead, create the domain (cloud service) as part of creating the VM: when it is time to select a domain, click on create new one. This too fails.
- That's when I realized that I should first create an availability set (while creating the VM), then a new domain/cloud service, and things would work fine.
Problem 2: Adding a new machine to a new cloud service and a new storage account
- In the above example, suppose you also need a new storage account and try to create it while creating the VM, cloud service/domain, and availability set. The moment I made the selection to create a storage account, the UI would throw an error and discard all my changes.
- So I created a storage account first, then followed the steps above. That seemed to work.
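The order that ended up working in Problems 1 and 2 can be written down as a small dependency graph. This is only a rough sketch: the resource names and the edges between them are assumptions taken from the workaround above, not documented Azure requirements, and nothing here talks to Azure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is created only after everything in its set exists.
# ASSUMPTION: these edges encode the order that happened to work in the
# portal (storage account, availability set, cloud service, VM).
# They are not an official statement of what classic mode requires.
creation_deps = {
    "availability_set": {"storage_account"},
    "cloud_service": {"availability_set"},
    "vm": {"storage_account", "availability_set", "cloud_service"},
}


def creation_order(deps):
    """Return a creation order that respects every dependency edge."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for step, resource in enumerate(creation_order(creation_deps), start=1):
        print(f"{step}. {resource}")
```

Writing the order down this way at least makes it a checklist you can follow before clicking through the portal, instead of rediscovering it by trial and error.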
Problem 3: In classic mode almost every action is single threaded. You cannot build more than one machine at a time. You cannot do more than one of almost anything at a time. WTF!!!! Can you imagine the time it takes to build out an architecture that requires 100 machines?
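To put a rough number on it, here is a back-of-the-envelope sketch. The 10 minutes per VM is a made-up figure for illustration (I did not time the builds), and `parallel_slots` is a hypothetical knob for how many builds a platform lets you run at once.

```python
import math


def build_time_minutes(vm_count, minutes_per_vm, parallel_slots=1):
    """Wall-clock minutes when at most `parallel_slots` builds run at once."""
    if vm_count < 0 or minutes_per_vm < 0 or parallel_slots < 1:
        raise ValueError("counts must be non-negative and slots at least 1")
    # Builds run in batches; each batch takes as long as a single build.
    batches = math.ceil(vm_count / parallel_slots)
    return batches * minutes_per_vm


MINUTES_PER_VM = 10  # ASSUMPTION: illustrative only, not a measured value

serial = build_time_minutes(100, MINUTES_PER_VM)          # one at a time
batched = build_time_minutes(100, MINUTES_PER_VM, 10)     # ten at a time

print(f"one at a time: {serial / 60:.1f} hours")
print(f"ten at a time: {batched / 60:.1f} hours")
```

Under that assumption, 100 machines built one at a time come to 1000 minutes, or about 16.7 hours of waiting, versus under 2 hours if ten builds could run side by side.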
Problem 4: Creating clones/images of existing machines is really not all that simple. So you create a VM, configure it to perfection, start using it, and then realize you need 3 more. You can capture this VM to create an image, and use that image to create additional VMs. But… hold on there, cowboy… when you capture the image from a VM, Azure deletes the original VM. Say what?
Problem 5: Adding additional drives to a VM has to happen after the VM is created; it is not part of the original creation. Want to build a VM with 5 drives? Build the VM, then add drive one, wait……., add drive two… wait… tick tock….. and so on and so forth. Honestly! Uncrating, rack mounting, and configuring a Dell server is faster.
Problem 6: More on creating images. So you build a VM, load it up to your perfect specs, and then run sysprep on it (assuming a Windows VM here). Now you select the VM and press the capture button, which should generate an image of the virtual machine. Well, if you use portal.azure.com you may get an error that says “Failed to capture the virtual machine 'Machine Name' to image name 'IMAGE_NAME'. The Resource 'Microsoft.ClassicCompute/virtualMachines/Machine Name' under resource group 'RG1' was not found.” And as I mentioned earlier, your virtual machine has been deleted. So WTF just happened? Well, I am not really sure, but I noticed that the image does show up under Virtual Machine Images (classic). OK, so a quirky error that may or may not have any meaning.
Problem 7: Now you take the image you created and try to launch a new virtual machine from it. Way up in this post I said I had one storage account, SA1. Let's say I created a second storage account, SA2, and a third, SA3, somewhere along the line. The image I created was in SA1. I wanted to create a new machine from this image in SA2, but the UI would always select SA3 and lock the storage account setting (on portal.azure.com). Why? Hell if I know. On manage.windowsazure.com I don't get to choose which storage account the VM is created in at all. It's always created in the same storage account as the image, i.e., SA1. So now this is interesting: Azure has two web portals, and each imposes a different restriction on which storage account my new VM will be created in. Why can’t I choose? This is insane. So I can’t create an image that can be used to provision machines in different environments? What am I missing here? I am sure I must be doing something wrong, because this is just stupid…..
Problem 8: Deleting VMs. When you delete a VM in portal.azure.com, you have the option to select Disks, which should delete all disks associated with that VM. Well, guess what? It does not delete any disk. If you go to your storage account, you will see all the disks from your deleted VM still sitting there. Now, from the storage account you should be able to select these disks and delete them. But…… guess what, folks… you cannot. You will see that the lease on these disks is still active. So how the hell do you delete these disks? I mean, you said delete when you deleted the VM, then you went to the storage account and said delete again, yet they live on. The way out: go to manage.windowsazure.com, go to virtual machines, then disks, select a disk, and use the delete option at the bottom of the screen. There you can actually delete these disks.
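One way to read this behavior, and it is my assumption rather than anything the portal tells you, is that the disk entry listed under virtual machines holds the lease on the underlying file in the storage account, so the file can only go once that entry is deleted. The toy model below illustrates that reading. Every class and method name in it is hypothetical; it is plain in-memory Python and does not call any Azure API.

```python
class LeasedBlobError(Exception):
    """Raised when a delete is attempted on a blob that still has a lease."""


class ToyStorageAccount:
    """In-memory stand-in for a storage account (NOT an Azure SDK class)."""

    def __init__(self):
        self._leased = {}  # blob name -> True while a lease is held

    def add_blob(self, name, leased=False):
        self._leased[name] = leased

    def release_lease(self, name):
        self._leased[name] = False

    def delete_blob(self, name):
        # Mirrors what the storage account view did: refuse while leased.
        if self._leased[name]:
            raise LeasedBlobError(f"{name} still has a lease")
        del self._leased[name]

    def blobs(self):
        return sorted(self._leased)


class ToyDiskRegistry:
    """Stand-in for the disks list under virtual machines (hypothetical)."""

    def __init__(self, storage):
        self._storage = storage
        self._disks = set()

    def register(self, blob_name):
        # ASSUMPTION: registering a disk is what takes the lease.
        self._storage.add_blob(blob_name, leased=True)
        self._disks.add(blob_name)

    def delete_disk(self, blob_name):
        # Removing the disk entry releases the lease, then the blob can go.
        self._disks.remove(blob_name)
        self._storage.release_lease(blob_name)
        self._storage.delete_blob(blob_name)


storage = ToyStorageAccount()
disks = ToyDiskRegistry(storage)
disks.register("vm1-os.vhd")

try:
    storage.delete_blob("vm1-os.vhd")  # the storage account route
except LeasedBlobError as err:
    print("storage-side delete failed:", err)

disks.delete_disk("vm1-os.vhd")  # the virtual machines > disks route
print("remaining blobs:", storage.blobs())
```

If that reading is right, the storage account view was never going to work while the disk entry existed, which would explain why only the disks list in the older portal got the job done.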