Sunday, 10 July 2011

IEEE CLOUD 2011 Conference, Day 3

Day 3 started with a keynote from Dan Reed (Corporate VP of Microsoft). The keynote was titled “Clouds: From Both Sides, Now” and Dan discussed a number of cloud computing research challenges from a provider and a consumer perspective.

From a provider’s perspective, Dan described the development of Microsoft’s datacenters between 2005 and 2011. Microsoft has gone through 4 generations of datacenters in that timeframe. Their Chicago datacenter was 3rd generation; the 4th generation consists of factory pre-assembled units and pre-manufactured buildings, which enable rapid deployment of extra capacity on demand. Dan suggested that “the questions don’t change, the answers do”, referring to the fact that the question of how to build such enormous datacenters has remained the same but the answer has changed over the years (e.g. cooling is often unnecessary as hardware can work under higher temperatures). Dan also argued that a change in a system’s magnitude causes some components to break, so scale does matter. This is in line with the aim of the UK’s Large Scale Complex IT Systems initiative, which is investigating the challenges of developing systems at scale.

From a consumer’s perspective, Dan mentioned cost savings (from economies of scale), organizational efficiencies (focusing on core competencies), transfer of responsibilities (to cloud providers), and just-in-time provisioning (pay-per-use model) as the main motivations for organizations using the cloud. The usual security and privacy concerns need to be addressed but Dan argued that these issues are not resolvable by technology alone; there are legal, governmental, business and societal issues that need to be resolved.

Dan’s advice to researchers interested in cloud computing was to “work on problems that industry is not working on, and if it can be solved with money then don’t bother because industry will do it”.

Panel: The Federal Cloud
Moderator: Simon Liu, Director of The National Agricultural Library (NAL), USA
Panelists: Chris Smith, CIO of USDA, USA
Linda Cureton, CIO of NASA, USA
George Strawn, CIO of NSF, also Director of NITRD, USA


Each panelist gave a 10-minute talk about the use of cloud computing in their organization. NASA had many under-the-desk servers, yet at the same time its datacenters were under-utilized. They surveyed the market and decided to develop their own private cloud software stack, called Nebula.

Chris Smith talked about the United States Department of Agriculture (USDA) and how they support America’s 2 million farmers. The USDA Cloud Strategy Framework enables them to investigate their cloud migration decisions by looking at the benefits and opportunities: lower IT costs and improved transparency, improved business agility, creation of new value drivers and innovation. USDA has a private cloud (supports Linux, Windows and AIX VMs), although I did not see on-demand self-service of VMs anywhere on their slides. USDA is looking to move into PaaS next and use it as a platform for their developers to quickly develop new applications.

George Strawn from the NSF mentioned that the Federal government funds the latest and most advanced research in technology but trails in technology uptake. One of George’s main arguments was that most of the challenges that arise during the uptake of cloud computing are cultural rather than technical. This is in line with some of our early research that investigated the socio-technical issues that arise during the migration of IT systems to the cloud (Research Challenges for Enterprise Cloud Computing, Cloud Migration: A Case Study of Migrating an Enterprise IT System to IaaS).

George discussed some of the reasons for governments being late adopters of new technology. He said that they cannot risk tax-payer money with risky new technology (e.g. security, vendor lock-in), and working through contractors is not always easy. The flow goes from governments funding research in universities; the research trickles through to consultancies and contractors, who then sell it back to the government. So until the consultancies and contractors are ready to sell it, the government is not interested.

I attended a number of presentations after the keynote and panel session:

Self-Configuration of Distributed Applications in the Cloud
Xavier Etchevers, Thierry Coupaye, Fabienne Boyer, Noel de Palma (Orange Labs; LIG Labs, France)
In the field of cloud computing, current solutions dedicated to PaaS are only partially automated. This limitation is due to the lack of an architectural model for describing a distributed application in terms of its software stacks (operating system, middleware, application), their instantiation as virtual machines, and their configuration interdependencies. This article puts forward (i) a component-based application model for defining any kind of distributed applications composed of a set of interconnected virtual machines, (ii) an automated line for deploying such a distributed application in the cloud, which includes a decentralized protocol for self-configuring the virtual application machines, (iii) a first performance evaluation demonstrating the viability of the solution.

Delivering High Resilience in Designing Platform-as-a-Service Clouds
Qianhui Liang, Bu-Sung Lee (HP Labs; Nanyang Technological University, Singapore)
One issue in designing PaaSs is how to make the development process deliver applications resilient to potential changes of the constraints. The first type of dynamic constraints we need to consider is the compatibility between possible components of the application. PaaSs must only engage compatible components to collaborate with each other in the same instance of applications. Other constraints include the environment that the application is running in and the preferences of the users. We present a data-flow based approach, for PaaS clouds, to designing cloud-based applications that are resilient to failures due to dynamic constraints on resources and on component compatibility. The uniqueness of our approach is the following: The procedure of building cloud-based applications is time-stamped. We have designed a graph structure called Instance Dependency Graphs (IDGs), and have used time-based IDGs to capture, analyze and optimize the resilience of the application. A case study is also reported.

What Are You Paying for? Performance Benchmarking for Infrastructure-as-a-Service Offerings
Alexander Lenk, Michael Menzel, Johannes Lipsky, Stefan Tai, Philipp Offermann (FZI Forschungszentrum Informatik; Deutsche Telekom Laboratories, Germany)
They used benchmarking tools to compare CPU performance of different IaaS clouds. Their short-term study involved running the tests once every hour over 20 instances. Their long-term study involved repeating the test suite in August 2010 and November 2010 and comparing the difference. How much do instances differ on average? Conclusion: some applications run better on Intel CPUs and some run better on AMD CPUs, so depending on your application you have to keep asking for instances until you get the CPU you want. I asked how their work differs from CloudHarmony.com, and the answer was that when they started their study CloudHarmony was not as advanced, but it is now quite advanced and has a database of results etc. I guess this is one of those research areas that industry will lead.

Efficient Autoscaling in the Cloud Using Predictive Models for Workload Forecasting
Nilabja Roy, Abhishek Dubey, Aniruddha Gokhale (Vanderbilt University, USA)
During autoscaling you cannot provision for the average as you lose the peak, but you also don’t want to provision for the peak because you’ll pay too much. So how can you predict the future workload to enable the system to autoscale effectively? They used data from the IRCache Project (www.ircache.net) to evaluate their technique. Abstract:
In the context of Cloud computing, autoscaling mechanisms hold the promise of assuring QoS properties to the applications while simultaneously making efficient use of resources and keeping operational costs low for the service providers. Despite the perceived advantages of autoscaling, realizing the full potential of autoscaling is hard due to multiple challenges stemming from the need to precisely estimate resource usage in the face of significant variability in client workload patterns. This paper makes three contributions to overcome the general lack of effective techniques for workload forecasting and optimal resource allocation. First, it discusses the challenges involved in autoscaling in the cloud. Second, it develops a model-predictive algorithm for workload forecasting that is used for resource autoscaling. Finally, empirical results are provided.
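
To make the average-versus-peak trade-off concrete, here is a minimal sketch of forecast-driven scaling. It is not the model-predictive algorithm from the paper: it swaps in a plain moving average as the forecaster, and the function names, the per-server capacity and the 20% headroom are all assumptions made up for illustration.

```python
import math

def forecast_next(history, window=3):
    """Forecast the next interval's request rate as the mean of the
    last `window` observations (a simple stand-in forecaster)."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def servers_needed(predicted_rate, capacity_per_server, headroom=0.2):
    """Servers required to serve the predicted rate plus a safety margin.
    `capacity_per_server` and `headroom` are hypothetical parameters."""
    return max(1, math.ceil(predicted_rate * (1 + headroom) / capacity_per_server))

# Hypothetical requests-per-second observed over the last five intervals.
history = [120, 150, 180, 240, 300]

predicted = forecast_next(history)                       # mean of 180, 240, 300
plan = servers_needed(predicted, capacity_per_server=100)
print(predicted, plan)
```

A rising workload like the one above is exactly where a lagging forecaster under-predicts, which is why the choice of forecasting model matters so much for autoscaling.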

Performance Modeling of Virtual Machine Live Migration
Yangyang Wu, Ming Zhao (Florida International University, USA)
Live migration is resource intensive (it needs CPU, network and memory), so managing resource allocation efficiently during live migration can improve cloud performance (guaranteeing the QoS of the migrated VM and of the other VMs running on the host). The aim here is to predict VM migration time based on resource allocation. They created models to predict migration time based on source host and destination host resource availability; they control the amount of CPU given to Dom0 (the assumption is that Dom0’s CPU allocation is used for VM migration). They ran a CPU-intensive application on a VM (using the IBM CPU benchmarking tool) and studied the effect of live migration. They also studied migration of memory-intensive VMs (a memory-read test that writes 0.5GB to memory and keeps reading it, and a memory-write test that writes to 0.5GB of memory over and over). The same set of experiments was done for disk I/O and network-intensive tasks.
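
As a rough illustration of why memory-write workloads are harder to migrate than memory-read ones, here is a sketch of the commonly described iterative pre-copy scheme. This is not the model from the paper, which relates migration time to the resources allocated on the hosts. The function name, the stop threshold, the round limit and all the numbers are assumptions; the effective transfer rate here stands in for whatever the available CPU and network allow.

```python
def precopy_migration_time(memory_mb, bandwidth_mb_s, dirty_rate_mb_s,
                           stop_threshold_mb=50.0, max_rounds=30):
    """Estimate (total_time_s, downtime_s) for an iterative pre-copy migration.

    Each round sends the pages dirtied during the previous round. Once the
    remaining dirty set is small enough (or the round limit is hit), the VM
    is paused and the remainder is sent in a final stop-and-copy step.
    """
    if bandwidth_mb_s <= 0:
        raise ValueError("bandwidth must be positive")
    to_send = float(memory_mb)
    total = 0.0
    for _ in range(max_rounds):
        round_time = to_send / bandwidth_mb_s
        total += round_time
        # Pages written while this round was in flight must be re-sent.
        to_send = dirty_rate_mb_s * round_time
        if to_send <= stop_threshold_mb:
            break
    downtime = to_send / bandwidth_mb_s
    return total + downtime, downtime

# Hypothetical 1GB VM over a 100MB/s link: read-mostly vs write-heavy.
idle_total, idle_down = precopy_migration_time(1024, 100, 0)
busy_total, busy_down = precopy_migration_time(1024, 100, 50)
print(idle_total, busy_total, busy_down)
```

In this toy model a VM that only reads memory migrates in a single pass, while one that keeps dirtying pages needs several extra rounds and a non-zero pause at the end.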

DACAR Platform for eHealth Services Cloud
L. Fan, W. Buchanan, C. Thümmler, O. Lo, A. Khedim, O. Uthmani, A. Lawson, D. Bell (Edinburgh Napier University; Imperial College London, UK)
Concerns over service integration, large scale deployment, and security, integrity and confidentiality of sensitive medical data still need to be addressed. This paper presents a solution proposed by the Data Capture and Auto Identification Reference (DACAR) project to overcoming these challenges. The key contributions of this paper include a Single Point of Contact (SPoC), a novel rule based information sharing policy syntax, and Data Buckets hosted by a scalable and cost-effective Cloud infrastructure. These key components and other system services constitute DACAR’s eHealth platform, which allows the secure capture, storage and consumption of sensitive health care data. Currently, a prototype of the DACAR platform has been implemented. To assess the viability and performance of the platform, a demonstration application, namely the EarlyWarningScore (EWS), has been developed and deployed within a private Cloud infrastructure at Edinburgh Napier University.
