Tom Callway
on 19 February 2015

Ubuntu, Hortonworks and Microsoft = Big Data Hosted Solution


The first Microsoft Azure hosted service to run Linux (on Ubuntu) announced at Strata Conference

This week thousands of people are in California at Strata + Hadoop World to learn more about the technology and business of big data. At the conference, Microsoft yesterday announced the preview of Azure HDInsight on Ubuntu. This is recognition that Ubuntu, the leading scale-out and cloud Linux, is great for running Big Data solutions.

Microsoft’s Ranga Rengarajan, corporate vice president, Data Platform, and Joseph Sirosh, corporate vice president, Machine Learning, noted that Azure HDInsight is Microsoft’s Apache Hadoop-based service in the Azure cloud. It is designed to make it easy for customers to evaluate petabytes of all types of data with fast, cost-effective scale on demand, and it offers programming extensions so developers can use their favorite languages. Microsoft customers such as Virginia Tech, Chr. Hansen, Mediatonic and many others are using it to find important data insights. Yesterday, Microsoft announced that customers can run HDInsight on Ubuntu clusters (the leading scale-out Linux), in addition to Windows, with simple deployment, a managed service level agreement and full technical support. This is particularly compelling for people who already run Hadoop on Linux on-premises, for example on Hortonworks Data Platform, because they can use common Linux tools, documentation, and templates and extend their deployment to Azure with hybrid cloud connections.

Combined with Juju, Canonical’s cloud orchestration tool, Ubuntu makes it a breeze to test, deploy, scale and manage Big Data architectures. This is the result of years of effort by our development teams to optimize Big Data workloads on Ubuntu.

For over a decade, DevOps teams have been working with “classical” configuration management tools. They have become very successful at ensuring that each server under their watch runs in perfect accordance with their desires and policies.

However, when it comes to Big Data, whether to process vast data sets, to run real-time analytics on unpredictable data streams, or to offer Data as a Service, new questions arise. How can teams embrace fast-paced scaling of their architectures, out when the flow grows and in when business slows? How can they stay ahead of the game in a world of technologies that change faster than ever? Add multiple clouds to the equation to prevent single points of failure and you end up with a nightmare for every decision maker.

Containerization has received a lot of positive reviews as an attempt to fix some of these issues by maintaining a single, lightweight “image” of an application that becomes cloud-agnostic. But it also comes with a list of new and still-to-be-fixed concerns regarding security and, to come back to the first point, orchestration.

So what is good cloud orchestration? To answer that question we have to get back to the requirements for such a tool:

  • Be portable: orchestration is valuable if and only if it is adaptable to each and every substrate: public cloud, private cloud, hybrid cloud, bare metal, containers…
  • Manage scalability: deploying an architecture and not being able to scale it from the management tool doesn’t make sense. From the orchestrator’s point of view, deployment targets should be unlimited. The tool must be able to claim any share of that capacity and change its mind at any point in time.
  • Manage services: to consumers of the architecture, the knowledge of each machine involved in a scale-out service is pointless. What is important is to know how to access the service that the cluster provides.
  • Manage relations: at cloud scale, what matters is that pieces of architecture can communicate with each other.

What is our answer to those requirements? Juju.

  • Juju creates portable architectures: When deploying a service, Juju makes the minimum number of assumptions regarding the substrate. It always starts with a vanilla OS image, and adds software or containers on top. All configuration information is processed dynamically. It can then export the deployment to a standard YAML file and reproduce the same architecture regardless of the provider.
  • Juju can scale architectures in and out: Juju offers commands to add or remove service units, effectively providing ways to scale in both directions. Paired with a system that collects performance metrics and calls Juju’s API, it becomes very easy to design autoscalable solutions that do not rely on a cloud provider to function.
  • Juju manages services: The best illustration of Juju’s focus on service is its GUI: whether a cluster has 2 or 200 nodes, it still comes up as a single box.
  • Juju manages relations: Juju can create and manage relations between services by exposing parameters to other services and consuming the variables they expose. Juju plugs services into each other, adds credentials, and provides the smoothest way to run complex architectures.
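
To make those four points concrete, here is a minimal Python sketch of the underlying model. It is not Juju's API: the Service and Model classes, the method names, the charm names and the export layout are all hypothetical, chosen only to show services treated as single entities, scaling in both directions, relations between services, and a provider-independent export.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """A scale-out service seen as one entity, whatever its unit count."""
    name: str
    charm: str
    units: int = 1


@dataclass
class Model:
    """Hypothetical in-memory stand-in for an orchestrated architecture."""
    services: dict = field(default_factory=dict)
    relations: set = field(default_factory=set)

    def deploy(self, name, charm, units=1):
        self.services[name] = Service(name, charm, units)

    def scale(self, name, delta):
        # Scale out (delta > 0) or in (delta < 0); never drop below one unit.
        svc = self.services[name]
        svc.units = max(1, svc.units + delta)

    def relate(self, a, b):
        # A relation links two services, not individual machines.
        if a not in self.services or b not in self.services:
            raise KeyError("both services must be deployed before relating them")
        self.relations.add(tuple(sorted((a, b))))

    def export(self):
        # Provider-independent description: no machine or cloud details.
        return {
            "services": {
                s.name: {"charm": s.charm, "num_units": s.units}
                for s in self.services.values()
            },
            "relations": sorted(list(r) for r in self.relations),
        }


model = Model()
model.deploy("hadoop-master", charm="hadoop")           # illustrative names
model.deploy("hadoop-worker", charm="hadoop", units=2)
model.relate("hadoop-master", "hadoop-worker")
model.scale("hadoop-worker", +3)   # scale out when the flow grows
model.scale("hadoop-worker", -1)   # scale in when business slows
bundle = model.export()
```

Note that the exported description mentions only services, unit counts and relations. Nothing in it names a machine or a cloud, which is what would let the same description be replayed on another substrate.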

On top of that, Juju comes with a centralized Charm Store, a unique marketplace where all charms are stored and exchanged. The main benefit of this approach is that you’ll always find the best charm currently available for a service. If it doesn’t match your own preferences, you can fork it and share your views with others, thus helping to create an even better experience for future users. For enterprises, this is a guarantee that their DevOps teams are always up to date and as agile as they can be when it comes to building new services for the company.

So bring together Juju, the best-in-class cloud orchestration tool, Ubuntu, the best OS for Big Data deployment, and Azure, the most advanced enterprise cloud, to make it easy for customers to evaluate petabytes of all types of data fast.
