(Cloud computing platforms with auto-scaling functionality)
I’m currently in the middle of a GCP to AWS migration and the user experience in GCP is nicer in a thousand little ways.
GCP developers made sure it’s a nice place to work.
AWS developers made it work just enough to be feature complete. Usability wasn’t part of the spec.
We use both Azure and AWS at work. Azure is less expensive, but comparing them is like comparing a corner 7-Eleven to a city-sized shopping mall. AWS is freaking huge, and it seems like they add new services every week. It’s kind of overwhelming.
There’s also Google Cloud and IBM and Oracle but they all suck for the same reasons AWS and Azure suck.
You might want to look into Alibaba Cloud, Digital Ocean, and/or Linode/Akamai.
Same, we use AWS, Azure, and a third-party VMware cloud suite. The VMware is superior by far IMO because I like to have full control of my systems and roll my own stuff. I think the big clouds make their money by saving time on DevOps. I come from a systems engineering background and transitioned to development, so none of that stuff is very difficult. I’ve tried Linode, Hetzner, Digital Ocean, and a few more, but I think VMware does all I need.
The VMware is superior by far IMO because I like to have full control of my systems and roll my own stuff.
This was a strong argument before the Broadcom acquisition of VMware. Have you looked at your renewal costs yet? VMware is a great product but is far less attractive at the new pricing. Every customer I have is asking how they can get out of VMware as quickly as possible because of the new pricing.
Yeah, we were hit hard by the cost projections. It really sucks. But the HCI stack from MS remains even more expensive. We have decided to bring as much as we can in house and only put the workloads that have strict contractual uptime agreements on our VMware or HCI stack. The rest of the stuff goes on KVM or bare metal to save costs.
We have decided to bring as much as we can in house and only put the workloads that have strict contractual uptime agreements on our VMware or HCI stack. The rest of the stuff goes on KVM or bare metal to save costs.
This is similar to the recommendations I give my customers, but it’s never this easy.
Entire teams are trained on managing VMware. Years of VMware-compatible tools are in place and configured to support those workloads. Making the decision to change the underlying hypervisor is easy; implementing that change is very difficult. An example of this is a customer that was all-in on VMware and using VMware’s SaltStack to orchestrate OS patching. Now workloads they move off of VMware need an entirely new patching orchestration tool chosen, licensed, deployed, staff trained, and operationalized. You’ve also doubled your patching burden, because you have to first patch the VMs remaining on VMware using the legacy method, then patch the non-VMware workloads with the second solution. Multiply this by all the toolsets for monitoring, alerting, backup, etc., and the switching costs skyrocket.
Broadcom knows all of this. They are counting on customers willing to choose to bleed from the wrist under Broadcom rather than bleed from the throat by switching.
We take a cloud-agnostic approach to systems development so we have flexibility. Our team is quite small; we use ManageEngine for patching servers and Atera for patching user systems. We only use a few cloud-native services, like AWS EventBridge, load balancers, S3, Lambda, Azure DNS, Azure Storage, and Azure App Service. But if needed we could pull any one of those and move to an open source solution without too much fuss. The red tape comes from exec level and their appetite for risk. For some reason they think cloud is more stable than our own servers. But we had to move VMs off Azure because of instability!
There’s a cost to keeping an agnostic solution that maintains that portability. It means forgoing many of the features that make cloud attractive. If your enterprise is small enough it is certainly doable, but if you ever need to scale the cracks start to show.
For some reason they think cloud is more stable than our own servers. But we had to move VMs off Azure because of instability!
If you’re treating Azure VMs as simply a replacement for on-prem VMs (running in VMware or KVM), then I can see where that might cause reliability issues. Getting the best results means taking a different approach to running in the cloud. Cattle, not pets, etc. If you run two VMs in different Availability Zones, with your application architecture supporting the redundancy and failing over cleanly, you can have a much more reliable experience. If you can evolve your application to run in k8s (AKS in the Azure world), then even more reliability can be had in cloud. However, if instead you’re putting a single VM in a single AZ for a business-critical application, then yes, that is not a recipe for a good time. Nonprod? Sure, do it all the time, who cares. You can get away with that for a while with prod workloads, but some events will mean downtime that is avoidable with other, more cloud-native approaches.
I did the on-prem philosophy for about 18 years before bolting the cloud philosophy onto my knowledge. There are pros and cons to both. Anyone who tells you that one is always the best, irrespective of the circumstances and business requirements, should be treated as unreliable.
Kubernetes is extremely expensive in the cloud, so we run our own in house.
Our problems with VMs on Azure were:
- The Azure Linux Agent incrementing versions and breaking stuff.
- The availability zone becoming over-utilized and our non-reserved VM clusters failing to start up.
- Changes to Azure Automation runbooks breaking scripts and schedules (unrelated to the stuff they warned about).
- Azure’s invisible proxy terminating SSH sessions as inactive during long-running tasks, forcing us to use the awful serial console.
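For the SSH-timeout item specifically, client-side keepalives are a common workaround for idle-connection proxies. A minimal sketch using standard OpenSSH client options (nothing Azure-specific; the host pattern is a placeholder, match your own VM hostnames):

```
# ~/.ssh/config
# Send an application-level keepalive every 60 s so intermediate
# proxies don't classify a long-running session as idle.
# The Host pattern below is a placeholder; adjust to your VM hostnames.
Host *.cloudapp.azure.com
    ServerAliveInterval 60
    ServerAliveCountMax 3
```

Running long tasks under tmux or screen on the VM also helps, since a dropped session then doesn’t kill the task.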
We use Digital Ocean for a pre-production k8s environment, as well as other stuff; no complaints. Terraform works great with it. My only issue is that the worker nodes’ IPs change during/after an update, so we have to update our firewalls a few times, both while the update is running and after it’s over.
For the firewall issue, could you keep the cluster on its own VPC and then use load balancer annotations to do per-service firewalls?
https://docs.digitalocean.com/products/kubernetes/how-to/configure-load-balancers/#firewall-rules
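A sketch of what that could look like, assuming the allow-rules annotation described on that page (verify the exact annotation name and value format against your cloud controller version; the service name and CIDR are placeholders):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web            # placeholder service name
  annotations:
    # Restrict inbound traffic at the DO load balancer itself, so the
    # worker nodes' changing IPs no longer matter for this service.
    # Annotation name per the linked docs; verify before relying on it.
    service.beta.kubernetes.io/do-loadbalancer-allow-rules: "cidr:203.0.113.0/24"
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 443
      targetPort: 8443
```

Since the rules live on the load balancer rather than on node-level firewalls, node replacement during upgrades shouldn’t require any firewall updates.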
❤️🌊
If your scale is right, both Hetzner and Digital Ocean support the Kubernetes Cluster Autoscaler.
https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner
https://docs.digitalocean.com/products/kubernetes/how-to/autoscale/
Digital Ocean is super easy for beginners, Hetzner is a bit more technical but like half the cost.
Autoscaling only outweighs the per-node overhead, though, if you’re scaling entire 4 vCPU/8 GiB nodes up and down and/or running multiple applications that can borrow CPU/RAM from each other.
If you’re small scale, microVM platforms like Lambda or fly.io are the only way to go for meaningful scaling under 4 vCPU/8 GiB of daily variation. At that scale you can also ask yourself whether you really need autoscaling, since you can get servers that big from Hetzner for like $20/month. Simple static scaling is better at that scale unless you have more dev time than money.
Interesting, I already used Hetzner for my current VPS. I’ll look into the stuff you listed, thanks!
Go for it!
Hetzner currently doesn’t have a managed Kubernetes option, so you have to set it up yourself with Terraform, but there are a few Terraform modules out there that have everything you need. The rumor is that they are working on a managed Kubernetes offering, so there will be something simpler in the future.
Their API is compatible with all the Kubernetes automation, so the autoscaling is automatic once you have it set up, and bullet-proof. Just use the k8s HPA to start and stop containers based on CPU (or Prometheus metrics if you’re feeling fancy), and the Kubernetes Cluster Autoscaler will create and delete nodes for you automatically based on your containers’ CPU/RAM reservations.
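For reference, a minimal HPA doing the CPU-based scaling described above, using the `autoscaling/v2` API (the names are placeholders; the thresholds are just illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app              # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # the Deployment to scale (placeholder)
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above 70% of requested CPU
```

When the HPA adds pods that no longer fit on the existing nodes, the Cluster Autoscaler sees the unschedulable pods and provisions new nodes; when pods shrink back, under-utilized nodes get drained and deleted.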
Let me know if you need documentation links for something.
I second Digital Ocean. I’m not a beginner, but I really appreciate their simplicity. They also have a CLI (doctl) that can do pretty much everything the UI can.
Their Terraform support is top notch too, better than AWS.
Not sure if it’s “best,” but despite my hatred for Microsoft, I actually like Azure. I’ve also self-hosted via Hetzner in the past, but I’m not sure if that meets your needs.
It depends entirely on what you want to accomplish. You could argue for anything, but without context it’s ultimately just noise.