Ali Khajeh-Hosseini @ University of St Andrews

Sunday 10 July 2011

IEEE CLOUD 2011 Conference, Day 4

Day 4 included a panel, a keynote and a number of research presentations:

Panel: Opportunities of Services Business in Cloud Age
This panel was a bit mixed: different speakers talked about their research/industry projects and interests. I don’t think anything new was said here. One of the panelists argued for standards in the cloud, and mentioned that there are 45 groups working on this. He pointed out the following two IEEE meetings where standards are going to be discussed:
P2301 - Guide for Cloud Portability and Interoperability Profiles (CPIP)
P2302 - Standard for Intercloud Interoperability and Federation (SIIF)

Keynote: Web Services in the Scientific Wilds
Carole Goble, University of Manchester, UK
Carole discussed the use of web services in the sciences (in particular the biological sciences). Most scientists are not trained software engineers and have a hacking attitude towards software development. This, along with other reasons, has resulted in a mess of services emerging from such scientific fields: different data formats, services that are just wrappers for command-line tools, inconsistent APIs etc.

Carole also predicted “the death of scientific papers”, where scientists would provide web services of their experiments, data, algorithms etc. She pointed to the idea of Executable Journals and the use of VMs for papers. For this to happen, research funding bodies should give credit to scientists who provide web services that are used by the scientific community (not just publications).

Towards Pay-As-You-Consume Cloud Computing
Shadi Ibrahim, Bingsheng He, Hai Jin (Huazhong University, China; Nanyang Technological University, Singapore)
Our case studies demonstrate significant variations in the user costs, indicating significant unfairness among different users from the micro-economic perspective. Further studies reveal the reason for such variations is interference among concurrent virtual machines. The amount of interference cost depends on various factors, including workload characteristics, the number of concurrent VMs, and scheduling in the cloud. In this paper, we adopt the concept of pricing fairness from micro economics, and quantitatively analyze the impact of interference on the pricing fairness. To solve the unfairness caused by interference, we propose a pay-as-you-consume pricing scheme, which charges users according to their effective resource consumption excluding interference. The key idea behind the pay-as-you-consume pricing scheme is a machine learning based prediction model of the relative cost of interference.
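To make the pricing idea concrete, here is a minimal sketch; it is not the authors' model. It assumes the provider already has some prediction of the relative cost of interference, expressed as the fraction of a VM's measured runtime caused by co-located VMs. The function name, the hourly rate and the numbers are all hypothetical.

```python
def pay_as_you_consume_charge(measured_hours, predicted_interference, hourly_rate):
    """Charge for effective consumption, excluding predicted interference.

    predicted_interference is an assumed fraction (0 <= f < 1) of the measured
    runtime that was caused by interference from concurrent VMs.
    """
    if not 0.0 <= predicted_interference < 1.0:
        raise ValueError("predicted_interference must be in [0, 1)")
    effective_hours = measured_hours * (1.0 - predicted_interference)
    return effective_hours * hourly_rate


# Two users run the same job; the second suffers interference and takes longer.
rate = 0.10  # hypothetical price per VM-hour
quiet = pay_as_you_consume_charge(10.0, 0.0, rate)
noisy = pay_as_you_consume_charge(12.5, 0.2, rate)
print(round(quiet, 2), round(noisy, 2))
```

Under plain pay-as-you-go the second user would be billed for all 12.5 hours; in this sketch both users pay for the same effective work.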

Price Heuristics for Highly Efficient Profit Optimization of Service Composition
Xianzhi Wang, Zhongjie Wang, Xiaofei Xu (Harbin Institute of Technology, China)
As the de facto provider of composite services, the broker charges the consumers; on the other hand, it awards cost to the providers whose services are involved in the composite services. Besides traditional quality-oriented optimization from the consumers’ point of view, the profit that a broker could earn from the composition is another objective to be optimized. But just as the quality optimization, service selection for profit optimization suffers from dramatic efficiency decline along with the growth in the number of candidate services. On the premise that the expected quality are guaranteed, this paper presents a “divide and select” approach for high-efficiency profit optimization, with price as heuristics. This approach can be applied to both static and dynamic pricing scenarios of service composition. Experiments demonstrate the feasibility.

Differentiated Service Pricing on Social Networks Using Stochastic Optimization
Alexei A.Gaivoronski, Denis Becker (Norwegian University, Norway)
This paper develops a combined simulation and optimization model that allows to optimize different service pricing strategies defined on the social networks under uncertainty. For a specific reference problem we consider a telecom service provider whose customers are connected in such network. Besides the service price, the acceptance of this service by a given customer depends on the popularity of this service among the customer’s neighbors in the network. One strategy that the service provider can pursue in this situation is to stimulate the demand by offering the price incentives to the most connected customers whose opinion can influence many other participants in the social network. We develop a simulation model of such social network and show how this model can be integrated with stochastic optimization in order to obtain the optimal pricing strategy. Our results are reported.

Energy-efficient Management of Virtual Machines in Eucalyptus
Pablo Graubner, Matthias Schmidt, Bernd Freisleben (University of Marburg, Germany)
In this paper, an approach for improving the energy efficiency of infrastructure-as-a-service clouds is presented. The approach is based on performing live migrations of virtual machines to save energy. In contrast to related work, the energy costs of live migrations including their pre- and post-processing phases are taken into account, and the approach has been implemented in the Eucalyptus open-source cloud computing system by efficiently combining a multi-layered file system and distributed replication block devices. To evaluate the proposed approach, several short- and long-term tests based on virtual machine workloads produced with common operating system benchmarks, web-server emulations as well as different MapReduce applications have been conducted.

Exploiting Spatio-Temporal Tradeoffs for Energy-aware MapReduce in the Cloud
Michael Cardosa, Aameek Singh, Himabindu Pucha, Abhishek Chandra (University of Minnesota; IBM Research, Almaden, USA)
MapReduce is a distributed computing paradigm widely used for building large-scale data processing applications. When used in cloud environments, MapReduce clusters are dynamically created using virtual machines (VMs) and managed by the cloud provider. In this paper, we study the energy efficiency problem for such MapReduce clusters in private cloud environments, that are characterized by repeated, batch execution of jobs. We describe a unique spatio-temporal tradeoff that includes efficient spatial fitting of VMs on servers to achieve high utilization of machine resources, as well as balanced temporal fitting of servers with VMs having similar runtimes to ensure a server runs at a high utilization throughout its uptime. We propose VM placement algorithms that explicitly incorporate these tradeoffs. Our algorithms achieve energy savings over existing placement techniques.
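The spatio-temporal tradeoff can be illustrated with a small greedy sketch. This is not the paper's placement algorithm: it is a simple next-fit heuristic of my own, and the `VM` and `Server` classes, the CPU units and the runtimes are hypothetical. Sorting by runtime approximates the temporal fitting (similar runtimes end up together), and the capacity check is the spatial fitting.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    cpu: int      # CPU units requested
    runtime: int  # expected runtime in minutes


@dataclass
class Server:
    capacity: int
    vms: list = field(default_factory=list)

    def free(self):
        return self.capacity - sum(vm.cpu for vm in self.vms)


def place(vms, capacity):
    """Next-fit placement over VMs sorted by runtime (longest first)."""
    servers = []
    for vm in sorted(vms, key=lambda v: v.runtime, reverse=True):
        if vm.cpu > capacity:
            raise ValueError(f"{vm.name} does not fit on any server")
        # Only the most recently opened server is considered, so each server
        # is filled with VMs whose runtimes are adjacent in the sorted order.
        if servers and servers[-1].free() >= vm.cpu:
            servers[-1].vms.append(vm)
        else:
            servers.append(Server(capacity, [vm]))
    return servers


vms = [VM("a", 2, 60), VM("b", 2, 10), VM("c", 2, 55), VM("d", 2, 12)]
servers = place(vms, capacity=4)
for s in servers:
    print([vm.name for vm in s.vms])
```

With these made-up inputs the two long-running VMs share one server and the two short ones share another, so the second server can be switched off early instead of idling at low utilization.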

Low Carbon Virtual Private Clouds
Fereydoun Farrahi Moghaddam, Mohamed Cheriet, Kim Khoa Nguyen (Ecole de technologie superieure, Canada)
With the introduction of live WAN VM migration, however, the challenge of energy efficiency extends from a single data center to a network of data centers. In this paper, intelligent live migration of VMs within a WAN is used as a reallocation tool to minimize the overall carbon footprint of the network. We provide a formulation to calculate carbon footprint and energy consumption for the whole network and its components, which will be helpful for customers of a provider of cleaner energy cloud services. Simulation results show that using the proposed Genetic Algorithm (GA)-based method for live VM migration can significantly reduce the carbon footprint of a cloud network compared to the consolidation of individual datacenter servers. In addition, the WAN data center consolidation results show that an optimum solution for carbon reduction is not necessarily optimal for energy consumption, and vice versa. Also, the simulation platform was tested under heavy and light VM loads, the results showing the levels of improvement in carbon reduction under different loads.
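The point that a carbon-optimal placement need not be energy-optimal is easy to see with a toy calculation. The sketch below is not the paper's formulation or its GA; it simply assumes carbon = energy × carbon intensity per data center, and the data center names and figures are invented for illustration.

```python
def footprint(placement, datacenters):
    """Return (energy_kwh, carbon_g) for a placement {datacenter: vm_count}."""
    energy = 0.0
    carbon = 0.0
    for name, vm_count in placement.items():
        dc = datacenters[name]
        dc_energy = vm_count * dc["kwh_per_vm"]
        energy += dc_energy
        carbon += dc_energy * dc["g_co2_per_kwh"]
    return energy, carbon


# Hypothetical data centers: one efficient but coal-powered, one less
# efficient but hydro-powered.
datacenters = {
    "coal_efficient": {"kwh_per_vm": 1.0, "g_co2_per_kwh": 900.0},
    "hydro_inefficient": {"kwh_per_vm": 1.5, "g_co2_per_kwh": 20.0},
}

all_coal = footprint({"coal_efficient": 10}, datacenters)
all_hydro = footprint({"hydro_inefficient": 10}, datacenters)
print("all coal :", all_coal)
print("all hydro:", all_hydro)
```

In this made-up case, putting every VM in the coal-powered site uses less energy, while putting every VM in the hydro-powered site emits far less carbon.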

Portability and Interoperability in Cloud Computing
Lee Badger, Tim Grance, Bill MacGregor, NIST
Three presentations by NIST, whose goal is to accelerate the Federal government’s adoption of cloud computing by building a roadmap and leading efforts to develop standards and guidelines.

Exploring Alternative Approaches to Implement an Elasticity Policy
Hamoun Ghanbari, Bradley Simmons, Marin Litoiu, Gabriel Iszlai (York University; IBM Toronto Lab, Canada)
An elasticity policy governs how and when resources (e.g., application server instances at the PaaS layer) are added to and/or removed from a cloud environment. The elasticity policy can be implemented as a conventional control loop or as a set of heuristic rules. In the control-theoretic approach, complex constructs such as tracking filters, estimators, regulators, and controllers are utilized. In the heuristic, rule-based approach, various alerts (e.g., events) are defined on instance metrics (e.g., CPU utilization), which are then aggregated at a global scale in order to make provisioning decisions for a given application tier. This work provides an overview of our experiences designing and working with both approaches to construct an autoscaler for simple applications. We enumerate different criteria such as design complexity, ease of comprehension, and maintenance upon which we form an informal comparison between the different methods. We conclude with a brief discussion of how these approaches can be used in the governance of resources to better meet a high-level goal over time.
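A minimal sketch of the heuristic, rule-based style described above, under my own assumptions: per-instance alerts fire when CPU utilization crosses a threshold, and a tier-level decision is taken only when a majority of instances agree. The thresholds, the majority rule and the function name are hypothetical and are not taken from the paper.

```python
def scaling_decision(cpu_by_instance, high=0.8, low=0.3, min_instances=1):
    """Return 1 (add an instance), -1 (remove one) or 0 (no change)."""
    if not cpu_by_instance:
        return 1  # nothing is running, so start an instance
    # Per-instance alerts: count instances above or below the thresholds.
    hot = sum(1 for cpu in cpu_by_instance if cpu > high)
    cold = sum(1 for cpu in cpu_by_instance if cpu < low)
    n = len(cpu_by_instance)
    # Global aggregation: act only when a strict majority of the tier agrees.
    if hot * 2 > n:
        return 1
    if cold * 2 > n and n > min_instances:
        return -1
    return 0


print(scaling_decision([0.90, 0.95, 0.50]))  # mostly hot
print(scaling_decision([0.10, 0.20, 0.50]))  # mostly cold
print(scaling_decision([0.50, 0.60]))        # neither
```

A control-theoretic version would replace these fixed thresholds with an estimator and a controller tracking a target utilization, which is where the extra design complexity mentioned in the abstract comes from.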

IEEE CLOUD 2011 Conference, Day 3

Day 3 started with a keynote from Dan Reed (Corporate VP of Microsoft). The keynote was titled “Clouds: From Both Sides, Now” and Dan discussed a number of cloud computing research challenges from a provider and a consumer perspective.

From a provider’s perspective, Dan described the development of Microsoft’s datacenters between 2005 and 2011. Microsoft have gone through 4 generations of datacenters in that timeframe: their Chicago datacenter was 3rd generation, and the 4th generation consists of factory pre-assembled units and pre-manufactured buildings, which enable rapid deployment of extra capacity on demand. Dan suggested that “the questions don’t change, the answers do”, referring to the fact that the question of how to build such enormous datacenters has remained the same but the answer has changed over the years (e.g. cooling is often unnecessary as hardware can work under higher temperatures). Dan also argued that a change in a system’s magnitude causes some components to break, so scale does matter. This is in line with the aim of the UK’s Large Scale Complex IT Systems initiative, which is investigating the challenges of developing systems at scale.

From a consumer’s perspective, Dan mentioned cost savings (from economies of scale), organizational efficiencies (focusing on core competencies), transfer of responsibilities (to cloud providers), and just-in-time provisioning (pay-per-use model) as the main motivations for organizations using the cloud. The usual security and privacy concerns need to be addressed but Dan argued that these issues are not resolvable by technology alone; there are legal, governmental, business and societal issues that need to be resolved.

Dan’s advice to researchers interested in cloud computing was to “work on problems that industry is not working on, and if it can be solved with money then don’t bother because industry will do it”.

Panel: The Federal Cloud
Moderator: Simon Liu, Director of The National Agricultural Library (NAL), USA
Panelists: Chris Smith, CIO of USDA, USA
Linda Cureton, CIO of NASA, USA
George Strawn, CIO of NSF, also Director of NITRD, USA

Each panelist gave a 10-minute talk about the use of cloud computing in their organization. NASA had many under-the-desk servers in their organization but at the same time their datacenters were under-utilized. They surveyed the market and decided to develop their own private cloud software stack, called Nebula.

Chris Smith talked about the United States Department of Agriculture (USDA) and how they support America’s 2 million farmers. The USDA Cloud Strategy Framework enables them to investigate their cloud migration decisions by looking at the benefits and opportunities: lower IT costs and improved transparency, improved business agility, creation of new value drivers and innovation. USDA has a private cloud (supports Linux, Windows and AIX VMs), although I did not see on-demand self-service of VMs anywhere on their slides. USDA is looking to move into PaaS next and use it as a platform for their developers to quickly develop new applications.

George Strawn from the NSF mentioned that the Federal government funds the latest and most advanced research in technology but trails in technology uptake. One of George’s main arguments was that the challenges that arise during the uptake of cloud computing are not technical; most are cultural. This is in line with some of our early research that investigated the socio-technical issues that arise during the migration of IT systems to the cloud (Research Challenges for Enterprise Cloud Computing, Cloud Migration: A Case Study of Migrating an Enterprise IT System to IaaS).

George discussed some of the reasons for governments being late adopters of new technology. He said that they cannot risk tax-payer money with risky new technology (e.g. security, vendor lock-in), and working through contractors is not always easy. The flow goes from governments funding research in universities; the research trickles through to consultancies and contractors, who then sell it back to the government. So until the consultancies and contractors are ready to sell it, the government is not interested.

I attended a number of presentations after the keynote and panel session:

Self-Configuration of Distributed Applications in the Cloud
Xavier Etchevers, Thierry Coupaye, Fabienne Boyer, Noel de Palma (Orange Labs; LIG Labs, France)
In the field of cloud computing, current solutions dedicated to PaaS are only partially automated. This limitation is due to the lack of an architectural model for describing a distributed application in terms of its software stacks (operating system, middleware, application), their instantiation as virtual machines, and their configuration interdependencies. This article puts forward (i) a component-based application model for defining any kind of distributed applications composed of a set of interconnected virtual machines, (ii) an automated line for deploying such a distributed application in the cloud, which includes a decentralized protocol for self-configuring the virtual application machines, (iii) a first performance evaluation demonstrating the viability of the solution.

Delivering High Resilience in Designing Platform-as-a-Service Clouds
Qianhui Liang, Bu-Sung Lee (HP Labs; Nanyang Technological University, Singapore)
One issue in designing PaaSs is how to make the development process deliver applications resilient to potential changes of the constraints. The first type of dynamic constraints we need to consider is the compatibility between possible components of the application. PaaSs must only engage compatible components to collaborate with each other in the same instance of applications. Other constraints include the environment that the application is running and the preferences of the users. We present a data-flow based approach, for PaaS clouds, to designing cloud-based applications that are resilient to failures due to dynamic constraints on resources and on component compatibility. The uniqueness of our approach is the following: The procedure of building cloud-based applications is time-stamped. We have designed a graph structure called Instance Dependency Graphs (IDGs), and have used time-based IDGs to capture, analyze and optimize the resilience of the application. A case study is also reported.

What Are You Paying for? Performance Benchmarking for Infrastructure-as-a-Service Offerings
Alexander Lenk, Michael Menzel, Johannes Lipsky, Stefan Tai, Philipp Offermann (FZI Forschungszentrum Informatik; Deutsche Telekom Laboratories, Germany)
They used benchmarking tools to compare CPU performance of different IaaS clouds. Their short-term study involved running the tests once every hour over 20 instances. Their long-term study involved repeating the test suite in August 2010 and November 2010 and comparing the difference. How much do instances differ on average? Conclusion: some applications run better on Intel CPUs and some run better on AMD CPUs, so depending on your application you have to keep asking for instances until you get the CPU you want. I asked how their work differs from CloudHarmony.com, and the answer was that CloudHarmony was not as advanced when they started their study, but it is now quite advanced and has a database of its results etc. I guess this is one of those research areas that industry will lead.

Efficient Autoscaling in the Cloud Using Predictive Models for Workload Forecasting
Nilabja Roy, Abhishek Dubey, Aniruddha Gokhale (Vanderbilt University, USA)
During autoscaling you cannot provision for the average as you lose the peak, but you also don’t want to provision for the peak because you’ll pay too much. So how can you predict the future workload to enable the system to autoscale effectively? They used data from the IRCache Project (www.ircache.net) to evaluate their technique. Abstract:
In the context of Cloud computing, autoscaling mechanisms hold the promise of assuring QoS properties to the applications while simultaneously making efficient use of resources and keeping operational costs low for the service providers. Despite the perceived advantages of autoscaling, realizing the full potential of autoscaling is hard due to multiple challenges stemming from the need to precisely estimate resource usage in the face of significant variability in client workload patterns. This paper makes three contributions to overcome the general lack of effective techniques for workload forecasting and optimal resource allocation. First, it discusses the challenges involved in autoscaling in the cloud. Second, it develops a model-predictive algorithm for workload forecasting that is used for resource autoscaling. Finally, empirical results are provided.
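As a rough illustration of forecast-driven autoscaling, the sketch below uses simple exponential smoothing to predict the next interval's load and then sizes the tier for that prediction plus some headroom. This is a stand-in technique, not the model-predictive algorithm from the paper; the smoothing factor, the headroom, the per-instance capacity and the request rates are all assumed values.

```python
import math


def forecast_next(history, alpha=0.5):
    """Exponentially smoothed forecast of the next interval's request rate."""
    if not history:
        raise ValueError("history must not be empty")
    level = history[0]
    for observed in history[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


def instances_needed(history, capacity_per_instance, headroom=1.2):
    """Instances to provision for the forecast load plus a safety margin."""
    predicted = forecast_next(history) * headroom
    return max(1, math.ceil(predicted / capacity_per_instance))


history = [100, 200, 400]  # hypothetical requests per second
print(forecast_next(history))
print(instances_needed(history, capacity_per_instance=100))
```

The forecast sits between the average and the latest peak of the history, which is the compromise described in the notes: not provisioning for the average, but not paying for the peak all the time either.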

Performance Modeling of Virtual Machine Live Migration
Yangyang Wu, Ming Zhao (Florida International University, USA)
Live migration is resource intensive (it needs CPU, network and memory), so managing resource allocation efficiently during live migration can improve cloud performance (guaranteeing the QoS of the migrated VM and of the other VMs running on the host). The aim here is to predict VM migration time based on resource allocation. They created models to predict migration time based on source host and destination host resource availability; they control the amount of CPU given to Dom0 (the assumption is that Dom0’s CPU allocation is used for VM migration). They ran a CPU-intensive application on a VM (using the IBM CPU benchmarking tool) and studied the effect of live migration. They also studied the migration of a memory-intensive VM (a memory-read test that writes 0.5GB to memory and keeps reading it, and a memory-write test that writes to 0.5GB of memory over and over). The same set of experiments was done for disk I/O and network-intensive tasks.

DACAR Platform for eHealth Services Cloud
L. Fan, W. Buchanan, C. Thümmler, O. Lo, A. Khedim, O. Uthmani, A. Lawson, D. Bell (Edinburgh Napier University; Imperial College London, UK)
Concerns over service integration, large scale deployment, and security, integrity and confidentiality of sensitive medical data still need to be addressed. This paper presents a solution proposed by the Data Capture and Auto Identification Reference (DACAR) project to overcoming these challenges. The key contributions of this paper include a Single Point of Contact (SPoC), a novel rule based information sharing policy syntax, and Data Buckets hosted by a scalable and cost-effective Cloud infrastructure. These key components and other system services constitute DACAR’s eHealth platform, which allows the secure capture, storage and consumption of sensitive health care data. Currently, a prototype of the DACAR platform has been implemented. To assess the viability and performance of the platform, a demonstration application, namely the EarlyWarningScore (EWS), has been developed and deployed within a private Cloud infrastructure at Edinburgh Napier University.

Wednesday 6 July 2011

IEEE CLOUD 2011 Conference, Day 2

Day 2 started with some early research talks; these were followed by a panel session on “Security in the Cloud”, which was similar to last year’s security panel/keynote. There was also a lunchtime panel about “Enterprise Clouds vs Commodity Clouds”.

Tonight was the conference banquet; a pretty good 5-course meal, during which Paul Hofmann from SAP Research talked about SAP’s worldwide presence. If you didn’t know this already, Paul provided a couple of slides full of stats to convince you. He also talked about the future of cloud in enterprises; how ERP will be in the clouds in 10 to 15 years; and differences between electrification and cloud computing. His key point was that there will be fewer CIOs in the future and their role will, primarily, be strategic.

The following is a list of talks from today; I didn’t take good notes as I was running around too much, so I’ve just copied the details from the conference proceedings. It was interesting to see that IBM (especially their TJ Watson Lab) are interested in tools to support the migration of IT systems to the cloud. I guess their main aim is to develop such tools so that IBM’s consultants can start to use them over the next few years.

A Pattern-Based Approach to Cloud Transformation
Yi-Min Chee, Nianjun Zhou, Fan Jing Meng, Peide Zhong, Saeed Bagheri (Arizona State University, USA)
One problem clients face in migrating to cloud is a lack of experience and knowledge as to how best to accomplish this transformation. We propose a Cloud Transformation Advisor (CTA) which helps users to select appropriate enablement patterns from a knowledge base of best practices when performing transformation planning. This knowledge base uses a structured representation to capture application information, cloud platform capability information, and enablement pattern information in order to facilitate pattern selection. We describe this representation and a mathematical model which leverages it to choose the "best" combination of patterns for a given transformation problem. We present an example which illustrates the approach, and describe the usage of the CTA.

A Saasify Tool for Converting Traditional Web-Based Applications to Saas Application
Jie Song, Feng Han, Zhenxing Yan, Guoqi Liu, Zhiliang Zhu (Northeastern University, China)
SaaS is increasingly used by web-based applications. It is significative if service providers can automatically convert traditional applications into SaaS mode, a SaaSify tool is needed urgently. In this paper, we analyze and conclude the new challenges of automatically SaaSify web-based application, propose several key technologies for SaaSifying, and further propose SaaSify Flow Language (SFL) to model and implement SaaSify process; finally, we use a case study to show the effects of proposed tool, and the performance experiments prove that the proposed approach is efficient and effective.

Migrating Service-Oriented System to Cloud Computing: An Experience Report
Muhammad Aufeef Chauhan, Muhammad Ali Babar (Mälardalen University, Sweden; IT University of Copenhagen, Denmark)

Since cloud-oriented migration projects are likely to encounter several kinds of challenges, it is important to identify and share the process and logistical requirements of migration projects in order to build a body of knowledge of appropriate process, methods, and tools. This paper purports to contribute to the growing knowledge of how to migrate existing systems to cloud computing by reporting our effort aimed at migrating an Open Source Software (OSS) framework, Hackystat, to cloud computing. We report the main steps followed, the process and technical challenges faced, and some of the strategies that helped us to address those challenges. We expect the reported experiences can provide readers with useful insights into the process and technical aspects that should be considered when migrating existing software systems to cloud computing infrastructures.

Variations in Performance and Scalability when Migrating n-Tier Applications to Different Clouds
Deepal Jayasinghe, Simon Malkowski, Qingyang Wang, Jack Li, Pengcheng Xiong, Calton Pu (Georgia Tech, USA)
We aim to evaluate performance and scalability when an n-tier application is migrated from a traditional datacenter environment to an IaaS cloud. We used a representative n-tier macro-benchmark (RUBBoS) and compared its performance and scalability in three different testbeds: Amazon EC2, Open Cirrus (an open scientific research cloud), and Emulab (academic research testbed). Interestingly, we found that the best-performing configuration in Emulab can become the worst-performing configuration in EC2. Subsequently, we identified the bottleneck components, high context switch overhead and network driver processing overhead, to be at the system level. These overhead problems were confirmed at a finer granularity through micro-benchmark experiments that measure component performance directly. We describe concrete alternative approaches as practical solutions.

Migration to Multi-Image Cloud Templates
Birgit Pfitzmann, Nikolai Joukov (IBM T. J. Watson Research Center, USA)
A key vehicle by which enterprises hope to achieve reducing IT costs is cloud computing, and they start to show interest in clouds outside the initial sweet spot of development and test. As business applications typically contain multiple images with dependencies, one is starting to standardize on multi-image structures. Enterprises have huge investments in their existing business applications. The promises of clouds can only be realized if a significant fraction of these existing applications can be migrated into the clouds. We therefore present analysis techniques for mapping existing IT environments to multi-image cloud templates. We propose multiple matching criteria, leading to tradeoffs between the number of matches and the migration overhead, and present efficient algorithms for these special graph matching problems. We present results from analyzing an existing enterprise environment with about 1600 servers.

Flexible Process-based Applications in Hybrid Clouds
Christoph Fehling, Ralf Konrad, Frank Leymann, Ralph Mietzner, Michael Pauly, David Schumm (University of Stuttgart; T-Systems International GmbH Frankfurt, Germany)
Cloud applications target large customer groups to leverage economies of scale. To increase the number of customers, a flexible application design may enable customers to adjust the application to their individual needs in a self-service manner. In this paper, we classify the required variability of these flexible applications: data variability – changes to handled data structures; functional variability – changes to the processes that the application supports; user interface variability – changes to the appearance of the application; provisioning variability – the ability of the application to be deployed in different runtime environments. Existing and new technologies and tools are leveraged to realize these classes of variability. Further, we cover architectural principles to follow during the design of flexible cloud applications and we introduce an abstract architectural pattern to enable data variability.

Elastically Ruling the Cloud: Specifying Application’s Behavior in Federated Clouds
Daniel Morán, Luis M. Vaquero, Fermín Galán (Telefonica Investigacion y Desarrollo; HP Labs, Spain)
Most IaaS clouds present limited capabilities to control how a service behaves at runtime, beyond basic low-level scalability rules for VMs. Higher-level approaches fail to provide mechanisms for a fine grained level of control of the service at runtime, being only focused on scaling. These scalability rules are based on an ad hoc “grammar” that is not expressive enough to reflect other desired control mechanisms at runtime (e.g., reconfigurations, dynamic changes in the rules or in the components of the application, re-tiering, etc.). Here, we present an analysis on different alternatives for supporting such features. The Rule Interchange Format (RIF) emerges as a likely candidate to support the required flexibility and so it is proved in a typical use case. Also, a preliminary implementation of a mapping mechanism is offered to parse RIF rules to widespread rule engines such as Drools and Jess.

SLA Based Dynamic Virtualized Resources Provisioning for Shared Cloud Data Centers
Zhiliang Zhu, Jing Bi, Haitao Yuan, Ying Chen (Northeastern University; IBM Research-China)
Cloud computing focuses on delivery of reliable, secure, sustainable, dynamic and scalable resources provisioning for hosting virtualized application services in shared cloud data centers. For an appropriate provisioning mechanism, we developed a novel cloud data center architecture based on virtualization mechanisms for multitier applications, so as to reduce provisioning overheads. Meanwhile, we proposed a novel dynamic provisioning technique and employed a flexible hybrid queuing model to determine the virtualized resources to provision to each tier of the virtualized application services. We further developed meta-heuristic solutions, which is according to different performance requirements of users from different levels. Simulation experiments are reported.

Tuesday 5 July 2011

IEEE CLOUD 2011 Conference, Day 1

Conference Opening Session

The IEEE Computer Society President, Sorel Reisman, opened the conference and provided an overview of the IEEE’s cloud computing initiative. In addition to their on-going efforts of running conferences and publications, the IEEE is going to focus on standards for the cloud. An interesting announcement was made just before the first keynote: from next year, the conference will have a “Journal” track, where authors can submit 14-page papers. The idea of having a conference with a “Journal” track sounds a bit strange but let’s see how it turns out…

Keynote 1: Data, Data, Data: The Core of Cloud/Services Computing
Peter Chen, Louisiana State University (LSU) & Carnegie-Mellon University (CMU)
Peter started by putting up the Wikipedia definition of cloud computing and only a few people in the audience agreed with it, so he went on to describe the high-level pros/cons of the cloud.

His keynote argued that thinking about the cloud from a computational viewpoint is wrong, and we should instead take a data viewpoint. We should think about data explosion problems and view clouds as data warehouses, not just compute-cycle generators. The research questions here are: how to store large data amounts? How to retrieve them efficiently? How should data security be managed? How should we preserve data for long-term archival purposes?

Peter concluded by stressing that the ultimate vision of cloud computing should be “information utility” as defined by his definition:
Anybody should be able to get any information (based on access rights), organised in any presentation form specified, in any place, in any time, on any device, in a timely manner, at reasonable costs (or free).

Presentations

Decision Support Tools for Cloud Migration in the Enterprise
Ali Khajeh-Hosseini, Ian Sommerville, Jurgen Bogaerts, Pradeep Teregowda
My talk went well, apart from me bumping into a 2-meter IEEE logo and tipping it behind the projector screen (this classic clip has been recorded and will probably find its way onto YouTube).

The main questions from my talk were cost-related. Someone asked if the Cost Modelling Tool can be used for private clouds; the answer is yes, we can add pricing models for private clouds alongside public clouds. Another person asked whether we can use the tool to study hybrid cloud deployments; again the answer is yes. We can define different groups in a model, where one group is a public-cloud option and another is a private-cloud option, and then study the overall hybrid costs.

A non-cost-related question was: whether our premise that “cloud = organisational change” also holds for enterprises that have already experienced IT outsourcing, because for them, migrating to the cloud might be simpler as they have already gone through some of the risk assessment exercises that are relevant for cloud migration. The organisations that we’ve worked with so far have not mentioned this issue, and it would be interesting to do cloud migration case studies with organisations that have IT outsourcing experience.

MADMAC: Multiple Attribute Decision Methodology for Adoption of Clouds
Prasad Saripalli, Gopal Pingali (IBM T.J. Watson Research, USA)
Prasad talked about IBM’s Multi-Attribute Decision-Making (MADM) based approach to helping CIOs make rational decisions during the migration of IT systems to the cloud. The decision area is a choice of public/private IaaS/PaaS/SaaS clouds and once a platform has been selected, a vendor needs to be chosen from that category (although Prasad did not present this part of the research as IBM would obviously be biased and recommend IBM’s cloud).

A brief overview of MADMAC: given an existing system and a set of migration options (one of which is simply to do nothing and keep the legacy system), they ask an expert, or a group of experts, to weigh the importance of each decision-attribute (e.g. costs, security) by assigning a numeric value to each attribute. So if the attribute under investigation were cost, the value would be the cost of that option. If it were latency, the value would be the latency of that option in milliseconds. For attributes that are not easily measured, they ask the expert to pick from a Likert scale range (e.g. the importance of security). The sum of the decision-attribute values is then calculated for each available option to judge which option is best for that system. I asked how they handle the socio-technical aspects of migration decisions, such as the politics in the workplace or the hidden agendas of IT managers, and Prasad said that they hold a meeting with the group of stakeholders and use the Wide-Band Delphi method to arrive at a consensus after several iterations.
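To make the scoring idea concrete, here is a minimal sketch of a generic weighted-sum (simple additive weighting) ranking in Python. This is my own illustration, not IBM's implementation: the min-max normalisation step, the option names, the weights and all of the numbers are assumptions made up for the example.

```python
from typing import Dict

# Hypothetical raw attribute values per migration option (all numbers invented).
RAW: Dict[str, Dict[str, float]] = {
    "cost": {"keep_legacy": 10000, "public_iaas": 6000, "private_iaas": 8000},
    "latency_ms": {"keep_legacy": 20, "public_iaas": 60, "private_iaas": 30},
    "security": {"keep_legacy": 4, "public_iaas": 2, "private_iaas": 5},  # Likert 1-5
}
# Hypothetical expert weights (importance of each attribute), summing to 1.
WEIGHTS: Dict[str, float] = {"cost": 0.5, "latency_ms": 0.2, "security": 0.3}
# Direction of each attribute: lower cost and latency are better, higher security is better.
HIGHER_IS_BETTER: Dict[str, bool] = {"cost": False, "latency_ms": False, "security": True}


def normalise(values: Dict[str, float], higher_is_better: bool) -> Dict[str, float]:
    """Min-max scale one attribute to [0, 1], where 1 is the best option."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:  # all options are equal on this attribute
        return {option: 1.0 for option in values}
    if higher_is_better:
        return {o: (v - lo) / (hi - lo) for o, v in values.items()}
    return {o: (hi - v) / (hi - lo) for o, v in values.items()}


def score_options(raw, weights, higher_is_better) -> Dict[str, float]:
    """Return the weighted sum of normalised attribute values for each option."""
    scores: Dict[str, float] = {}
    for attribute, values in raw.items():
        scaled = normalise(values, higher_is_better[attribute])
        for option, value in scaled.items():
            scores[option] = scores.get(option, 0.0) + weights[attribute] * value
    return scores


if __name__ == "__main__":
    scores = score_options(RAW, WEIGHTS, HIGHER_IS_BETTER)
    for option, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{option}: {score:.2f}")
```

The normalisation is there because raw costs, milliseconds and Likert scores are in different units and cannot be summed directly; how MADMAC itself handles this was not covered in my notes.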

Cost-wait Trade-offs in Client-side Resource Provisioning with Elastic Clouds
Stéphane Genaud, Julien Gossa (Université de Strasbourg)
Stéphane described their work that attempts to solve the following problem: given a set of requests (or jobs), when should a VM be started to serve the requests and when should a VM be re-used to serve the requests? The cheapest option is to have one VM for all requests, but the fastest option is to start a new VM for each request that comes in when there are no free VMs. They studied the optimisation of cost vs. performance by using a bin-packing algorithm and evaluating different strategies (first-fit, best-fit, worst-fit). They found a very small difference between the cost savings of the studied strategies (a few percent). As he acknowledged, they did not consider different types of instances, memory and storage I/O requirements of the requests and their effect on performance.
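For readers unfamiliar with the three strategies, here is a minimal sketch of textbook first-fit, best-fit and worst-fit bin packing in Python. It is a simplification of my own and not the authors' algorithm: I assume each VM is a bin with a fixed capacity (say, 60 minutes of a billed hour), each job has a known duration, and arrival times and waiting are ignored. The job list is invented.

```python
from typing import Callable, List, Optional

# A chooser picks the index of an existing VM for a job, or None to start a new VM.
Chooser = Callable[[List[int], int], Optional[int]]


def first_fit(free: List[int], size: int) -> Optional[int]:
    """Pick the first VM with enough remaining capacity."""
    for i, room in enumerate(free):
        if room >= size:
            return i
    return None


def best_fit(free: List[int], size: int) -> Optional[int]:
    """Pick the VM that fits the job with the least remaining capacity."""
    candidates = [i for i, room in enumerate(free) if room >= size]
    return min(candidates, key=lambda i: free[i]) if candidates else None


def worst_fit(free: List[int], size: int) -> Optional[int]:
    """Pick the VM that fits the job with the most remaining capacity."""
    candidates = [i for i, room in enumerate(free) if room >= size]
    return max(candidates, key=lambda i: free[i]) if candidates else None


def pack(jobs: List[int], capacity: int, choose: Chooser) -> List[int]:
    """Place jobs in order; return the remaining capacity of every VM started."""
    free: List[int] = []
    for size in jobs:
        if size > capacity:
            raise ValueError(f"job of size {size} exceeds VM capacity {capacity}")
        index = choose(free, size)
        if index is None:
            free.append(capacity - size)  # no VM has room: start a new one
        else:
            free[index] -= size  # re-use an existing VM
    return free


if __name__ == "__main__":
    jobs = [30, 40, 20, 10, 50, 30]  # hypothetical job durations in minutes
    for strategy in (first_fit, best_fit, worst_fit):
        remaining = pack(jobs, 60, strategy)
        print(strategy.__name__, len(remaining), "VMs, free minutes:", remaining)
```

In this toy example the three strategies place jobs differently but start the same number of VMs, which at least does not contradict the small differences reported in the talk.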

Real Time Collaborative Video Annotation using Google App Engine and XMPP Protocol
Abbas Attarwala, Deepak Jagdish, Ute Fischer (University of Toronto, Canada; Nokia Research Center; Georgia Tech, USA)
Abbas gave a technical overview of their video annotation application, which is deployed on Google App Engine.

DIaaS: Data Integrity as a Service in the Cloud
Surya Nepal, Shiping Chen, Jinhui Yao, Danan Thilakanathan (CSIRO ICT Centre, Australia)
I missed the main talk but caught the end of the questions, and it was pretty intense. The questioner argued that the paper’s contributions were irrelevant to cloud computing as they dealt with networking and data transfer issues, which raises the question: what can be categorized as relevant research to cloud computing?

A Home Healthcare System in the Cloud – Addressing Security and Privacy Challenges
Mina Deng, Milan Petkovic, Marco Nalin, Ilaria Baroni (Philips Research Europe, The Netherlands; Scientific Institute Hospital San Raffaele, Italy)
Mina talked about the TClouds project, an EU project with a 10 million euro budget that started in Oct 2010 and will keep going until Oct 2013. The project aims to architect internet-scale ICT infrastructures for different business domains. Mina’s talk was focused on the healthcare domain, where Philips is one of the industry partners working with them. Philips’ healthcare monitoring devices are being used to collect health data and the group has set up a private cloud (based on CloudStack) that is used to store and process the data, which is currently captured using wrist-watch-like devices.

Efficient Bidding for Virtual Machine Instances in Clouds
Sharrukh Zaman, Daniel Grosu (Wayne State University, USA)
I missed the main part of this talk but talked to Sharrukh after his talk. He described their research that investigated alternative markets for IaaS clouds. An interesting question that was asked at the end of this talk was: why would providers have markets for computing resources? Markets are for rare resources but if there are plenty of resources then why would cloud providers need to develop such complicated market mechanisms? Why not just keep using the simple pricing models they currently have? Sharrukh argued that although these market mechanisms might not be needed now, they are likely to be needed in the future when the demand for clouds increases.

Multi-Dimensional SLA-Based Resource Allocation for Multi-Tier Cloud Computing Systems
Hadi Goudarzi, Massoud Pedram (University of Southern California, USA)
Hadi talked about their resource allocation model that takes into account the heterogeneous nature of datacenters and the operational cost of servers (a fixed cost when a server is on, plus a proportional cost related to CPU utilization, which corresponds to energy use). They are interested in the placement of applications in IaaS clouds where SLAs have to be considered to ensure that performance metrics are not violated.
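The server cost structure mentioned above (fixed cost when on, plus a cost proportional to CPU utilization) can be sketched in a few lines of Python. This is only my illustration of that cost shape, not the authors' model: I assume an idle server is switched off and costs nothing, I ignore SLAs entirely, and the cost figures are invented.

```python
from typing import List


def server_cost(utilisation: float, fixed_cost: float, cost_per_full_load: float) -> float:
    """Operational cost of one server for a given CPU utilisation in [0, 1]."""
    if not 0.0 <= utilisation <= 1.0:
        raise ValueError("utilisation must be between 0 and 1")
    if utilisation == 0.0:
        return 0.0  # assumption: an unused server is switched off
    return fixed_cost + cost_per_full_load * utilisation


def placement_cost(utilisations: List[float], fixed_cost: float,
                   cost_per_full_load: float) -> float:
    """Total operational cost of a placement, one utilisation value per server."""
    return sum(server_cost(u, fixed_cost, cost_per_full_load) for u in utilisations)


if __name__ == "__main__":
    # Two applications needing 30% and 40% of a server's CPU (hypothetical).
    spread = placement_cost([0.3, 0.4], fixed_cost=100.0, cost_per_full_load=50.0)
    packed = placement_cost([0.7, 0.0], fixed_cost=100.0, cost_per_full_load=50.0)
    print(f"one application per server: {spread:.2f}")
    print(f"both on one server:         {packed:.2f}")
```

Under these assumptions consolidation wins because the fixed cost is paid only once; the real problem is harder because packing applications together must not push response times past what the SLAs allow.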

Modelling Contract Management for Cloud Services
Mario A. Bochicchio, Antonella Longo (University of Salento, Italy)
Antonella argued that public clouds are currently black boxes: you don’t know much about their location, security levels, who has access to them etc. This is not very comforting to businesses that want to use clouds. The aim of this research is to develop a tool to support contract management between providers and consumers. I guess their underlying assumption is that cloud providers will provide tailored contracts, but I have not come across a public cloud provider that does this. Antonella presented some preliminary work about the requirements for such a tool and an information model of the data that needs to be captured in such a tool.

Panel: Science of Cloud Computing
Co-Moderators: Ling Liu, Georgia Institute of Technology, USA
Manish Parashar, Rutgers University, USA
Panelists: Geoffrey Charles Fox, Indiana University, USA
Robert Grossman, University of Chicago, USA
Jean-Francois Huard, CTO, Netuitive, Inc.
Vanish Talwar, HP Labs, USA
This panel attempted to discuss some of the main research challenges for cloud computing. Each panelist was asked to give a 10-minute talk about their views on what’s important for cloud research, and this was followed by an open floor discussion of the viewpoints and research challenges. Some of the research challenges that were discussed by the panel were:
- long-term data preservation (e.g. for scientific experiments)
- management of large-scale datacenters (e.g. over 1M servers)
- simple and elegant languages and platforms for large-scale science simulations (e.g. something better than High-Performance Fortran)
- training students for application development in the cloud (e.g. distributed and parallel algorithms that can deal with node failures)

IEEE CLOUD 2011 Conference

This year's IEEE CLOUD Conference is being held in Washington DC. The conference dates coincide with Independence Day and we checked out the parades and fireworks yesterday.

The conference started yesterday with a number of tutorials. I attended one of the tutorials and it was clear that it was designed for newbies as, 30 minutes into the talk, they were still discussing "what is cloud?". The main research talks started this morning and I'll be updating this blog on a daily basis to include notes from the presentations that I attend.

I will be presenting our research paper (Decision Support Tools For Cloud Migration in the Enterprise) at 1pm, right after the first keynote and lunch... My Prezi (slides) are here.

Thursday 30 September 2010

ULSS Doctoral School - Day 9

Dag Sjøberg (University of Oslo) talked about Empirical Methods for ULSS. This lecture was a good introduction to research methods, and would be ideal for first-year PhD students who are going to be researching socio-technical issues in ULSS. The lecture included many references for each research method; the methods covered were experiments, case studies, ethnography, surveys, action research, systematic literature reviews, and meta-analysis.

The rest of today will be spent preparing for tomorrow's project presentations.

Wednesday 29 September 2010

ULSS Doctoral School - Day 8

Paolo Atzeni (University of Roma Tre) talked about model-independent schema and data translation. The goal of the database research community is to develop the “information utility”: make it easy for everyone to store, organize, access and analyze the majority of human information online. Data is not just in databases; it is in mail messages, social networks, web pages, spreadsheets, textual documents, palmtop devices, mobiles, multimedia annotations, XML documents etc. Heterogeneity at the system level, model level, structural and semantic level means that it is challenging to achieve the information utility. Paolo talked about the challenges of schema and data translation, schema and data integration and data exchange between databases.