Posts

Why is three nines better than four in cloud availability?

In building our TietoEVRY Hybrid IaaS on VMware solution, there has been a lot of discussion about availability SLAs as well. We're changing quite a few things in this area too, so I wanted to take a bit deeper look into it: what's different, and how to make the best use of the new model. To get into the topic, let's first go through a few key concepts related to availability.

Availability zone

An availability zone is a single physical location, essentially a data center, which is then subdivided into one or more failure domains. At IaaS scope, an availability zone is the maximum scope for any high availability. Any resiliency action that goes across availability zones is always disaster recovery and is expected to be disruptive by default, meaning an RTO greater than zero.

Failure domain

From a facility perspective, failure domains are separate fireproof sections, which are expected to contain local failures in power, cooling or fire. From a cloud perspective, they're clusters.
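To put concrete numbers behind the nines, here's a quick Python sketch of the basic arithmetic: how much downtime a given availability level allows per month, and what combining independent zones does. The 30-day month and the independence assumption are simplifications of any real SLA wording.

```python
# A quick sketch of the arithmetic behind "nines" and combining zones.
# Assumes a 30-day month and independent zone failures, which is a
# simplification of how real SLAs are worded.

def downtime_minutes_per_month(availability: float, days: int = 30) -> float:
    """Allowed downtime per month for a given availability level."""
    return (1.0 - availability) * days * 24 * 60

def parallel_availability(*zone_availabilities: float) -> float:
    """Availability of a service that stays up as long as any one zone is up."""
    unavailability = 1.0
    for a in zone_availabilities:
        unavailability *= (1.0 - a)
    return 1.0 - unavailability

for a in (0.999, 0.9999):
    print(f"{a:.4%} -> {downtime_minutes_per_month(a):.1f} min/month")

# Two independent 99.9% zones behind a (disruptive) DR switch-over:
print(f"combined: {parallel_availability(0.999, 0.999):.6f}")
```

Three nines allows roughly 43 minutes of downtime per month, four nines roughly 4 minutes, which is exactly why the cross-zone, RTO-greater-than-zero question matters so much for the SLA discussion.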

Join VMware Photon to Active Directory

A quick and simple task, I assumed: get an individual VMware Photon instance to host one container for a specific purpose and plug it into Active Directory for proper management. After all, it's the baseline image for most VMware appliances, and all of those plug into AD very nicely. As it turns out, not so simple. Not so simple before you know how it's done, that is; it took a bit of reverse engineering to figure it out. The first round of Googling resulted in a solution called Lightwave, which actually looks like a very nice solution, but it's ultimately for a different purpose: more for replacing Active Directory than integrating into it. That is actually something I've been looking for in my home lab, AD functionality but running on a Raspberry Pi, so it's definitely worth looking into, but that in detail is a story for another day. By trying to do the integration with Lightwave, I then accidentally also stumbled onto the right approach, which is Likewise. Firstly installing it
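To give an idea of what the join step ends up looking like, here's a minimal sketch assuming the Likewise tooling is already installed on the Photon host. The binary path, domain name and join account are placeholders, and the exact package layout may differ between Photon versions.

```python
# Minimal sketch of joining a Photon host to AD via Likewise's domainjoin-cli.
# The install path, domain and account below are assumptions / placeholders.
import subprocess

DOMAIN = "example.local"                          # hypothetical AD domain
JOIN_ACCOUNT = "joinuser"                         # hypothetical account with join rights
DOMAINJOIN = "/opt/likewise/bin/domainjoin-cli"   # assumed install location on Photon

def join_domain() -> None:
    # domainjoin-cli prompts interactively for the join account's password.
    subprocess.run([DOMAINJOIN, "join", DOMAIN, JOIN_ACCOUNT], check=True)

def query_status() -> str:
    # "query" reports whether the host is currently joined and to which domain.
    result = subprocess.run(
        [DOMAINJOIN, "query"], check=True, capture_output=True, text=True
    )
    return result.stdout

if __name__ == "__main__":
    join_domain()
    print(query_status())
```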

Reverse engineering VMware Cloud Director API

As a continuation of my Cloud Director automation story, I started to dig a bit more into the API calls used for configuring Cloud Director. The process is fairly simple: just turn on network monitoring in the Chrome developer tools and you can see which API calls the HTML5 portal makes when actions are initiated. I got a couple of basics done, adding vCenter and the NSX-T Manager, and already learned quite a lot. One thing is that I now know why the API calls are not documented: they're completely inconsistent, with pretty much no common pattern across them. It would be quite difficult to document :) In practice those two actions mean four API calls, three of them different:

Add certificate as trusted (done twice, for both the vCenter and NSX-T certificate): https://{vcd_url}/cloudapi/1.0.0/ssl/trustedCertificates
Register vCenter: https://{vcd_url}/api/admin/extension/action/registervimserver
Register NSX-T: https://{vcd_url}/api/admin/extension/nsxtManagers

First when
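As a rough illustration, this is roughly how the trusted-certificate call could be replayed with Python requests. The API version, token handling and payload field names here are assumptions rather than anything documented, so treat it as a sketch of the pattern, not a reference.

```python
# Rough sketch of replaying the portal's calls with Python requests.
# Token handling, API version and payload field names are assumptions.
import requests

VCD_URL = "https://vcd.example.com"   # hypothetical Cloud Director address
TOKEN = "..."                          # bearer token from a provider session login
HEADERS = {
    "Authorization": f"Bearer {TOKEN}",
    "Accept": "application/json;version=36.0",   # API version is an assumption
    "Content-Type": "application/json",
}

def trust_certificate(pem: str) -> dict:
    """Add a certificate (vCenter or NSX-T) as trusted; done once per endpoint."""
    resp = requests.post(
        f"{VCD_URL}/cloudapi/1.0.0/ssl/trustedCertificates",
        headers=HEADERS,
        json={"certificate": pem},   # field name is an assumption
    )
    resp.raise_for_status()
    return resp.json()

# The two registration calls live under the legacy /api tree and take XML
# bodies, which is part of the inconsistency mentioned above:
REGISTER_VCENTER_URL = f"{VCD_URL}/api/admin/extension/action/registervimserver"
REGISTER_NSXT_URL = f"{VCD_URL}/api/admin/extension/nsxtManagers"
```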

Automating VMware Cloud Director deployment on Azure

Overview

This story is about automating the deployment of VMware Cloud Director on Azure, done purely as an exercise, but there are quite a few reusable components here. The setup lives in GitHub and the overall orchestration is done by GitHub Actions. Everything is in code in GitHub, and each time there's a new push the previous setup is first deleted and then re-created. Everything except binaries and certificates, which are stored in Azure blob storage and fetched from there using keys stored in GitHub secrets. The deployment consists of the following components:

VM running VMware Cloud Director
Database in Azure Postgres PaaS
Application gateway for portal access "load balancing"
DNS in AWS Route 53

Yes, the title says on Azure, but there's actually a bit of AWS included as well, in the form of public DNS hosting. In the details there's a bit of reverse engineering around using undocumented API calls and other interesting bits. The actual code and pipel
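As an example of the fetch-from-blob-storage part, here's a minimal Python sketch using the azure-storage-blob SDK. The container and blob names are placeholders, and the connection string is assumed to be injected into the job from a GitHub secret.

```python
# Minimal sketch of fetching binaries and certificates from Azure blob storage.
# Container and blob names are placeholders; the connection string is assumed
# to come from a GitHub secret exposed as an environment variable.
import os
from azure.storage.blob import BlobServiceClient

conn_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]
service = BlobServiceClient.from_connection_string(conn_str)

def fetch(container: str, blob_name: str, target: str) -> None:
    """Download one artifact (installer binary or certificate) to disk."""
    blob = service.get_blob_client(container=container, blob=blob_name)
    with open(target, "wb") as f:
        f.write(blob.download_blob().readall())

fetch("artifacts", "vmware-vcloud-director-distribution.bin", "/tmp/vcd.bin")
fetch("artifacts", "portal-certificate.pem", "/tmp/portal.pem")
```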

Managing containers (@home)

As part of my home infra renovations I've been transferring more and more functions to run in containers on top of Raspberry Pis. Even though the specifics are extremely hacky and home-use only, there have been quite a few lessons learned that link into enterprise-grade ways of working as well. This journey mainly started off with Hypriot, which provides great automation for setting up the Docker basics on a Pi. So getting to the initial state of delivering everything in one infrastructure-as-code model was straightforward.

Version 1.0

Hypriot provides support for cloud-init, which makes automating the deployment very easy. The essential flow was / is like this:

Deployment image = Hypriot (GIT)
Deployment configuration = cloud-init user-data
Persistent storage = NFS
(Re-)Deployment model = flash SD card with image & config

The whole magic is in the cloud-init user-data, which actually has quite a few shortcomings in this model, but we'll get to that later. Activities that I placed
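To make the flow a bit more concrete, below is a minimal sketch of how a per-host user-data file could be rendered and dropped onto a freshly flashed card. The boot-partition mount path, the NFS export and the container are all placeholders, and the real user-data in this setup carries a lot more than this.

```python
# Minimal sketch of staging per-host cloud-init user-data onto a flashed card.
# The mount path, NFS export, SSH key and container are placeholders.
from pathlib import Path

USER_DATA_TEMPLATE = """#cloud-config
hostname: {hostname}
manage_etc_hosts: true
users:
  - name: pirate
    ssh_authorized_keys:
      - {ssh_key}
runcmd:
  - [ mount, -t, nfs, "nas.example.local:/containers", /mnt/persistent ]
  - [ docker, run, -d, --restart=always, -v, "/mnt/persistent:/data", nginx ]
"""

def write_user_data(boot_mount: Path, hostname: str, ssh_key: str) -> None:
    """Render the cloud-config and drop it next to the image's boot files."""
    (boot_mount / "user-data").write_text(
        USER_DATA_TEMPLATE.format(hostname=hostname, ssh_key=ssh_key)
    )

# e.g. after flashing:
# write_user_data(Path("/Volumes/HypriotOS"), "pi-node1", "ssh-ed25519 AAAA...")
```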

Home infrastructure renovations

Starting point

The title can and will mean very different things to different people; to some it means building a toolshed, to begin with. Anyhow, we're in an IT context now, and in my personal context this renovation means getting rid of a traditional server park running on a Dell VRTX and moving more functions into both containers and public cloud. These activities themselves can be understood in many ways, but hopefully the story will become clearer as we go along. This activity has actually been ongoing for a while already, and now that I had an otherwise free weekend I got things a bit further forward, to a, let's say, midway state on the way to the actual goal.

Midway physical architecture

Since pictures are always nice, here's the physical architecture of my home infra. This is now almost the current state; at the moment the clusterfw is actually behind worksw, as there are a few open actions going on with the devices. As one might guess, worksw is in my office and backendsw is then in my cl