Ali Khajeh-Hosseini @ University of St Andrews: September 2010

Thursday, 30 September 2010

ULSS Doctoral School - Day 9

Dag Sjøberg (University of Oslo) talked about Empirical Methods for ULSS. This lecture was a good introduction to research methods, and would be ideal for first-year PhD students who are going to be researching socio-technical issues in ULSS. The lecture included many references for each research method; the methods covered were experiments, case studies, ethnography, surveys, action research, systematic literature reviews, and meta-analysis.

The rest of today will be spent preparing for tomorrow's project presentations.

Wednesday, 29 September 2010

ULSS Doctoral School - Day 8

Paolo Atzeni (University of Roma Tre) talked about model-independent schema and data translation. The goal of the database research community is to develop the “information utility”: make it easy for everyone to store, organize, access and analyze the majority of human information online. Data is not just in databases; it is also in mail messages, social networks, web pages, spreadsheets, textual documents, palmtop devices, mobiles, multimedia annotations, XML documents, etc. Heterogeneity at the system, model, structural and semantic levels makes it challenging to achieve the information utility. Paolo talked about the challenges of schema and data translation, schema and data integration, and data exchange between databases.

Tuesday, 28 September 2010

ULSS Doctoral School - Day 7

Emanuela Merelli (University of Camerino) talked about agent-based modelling in Systems Biology and how large-scale systems biology simulations could be seen as a ULSS. Systems biology aims to understand how biological molecules in living systems integrate to form complex systems and how these systems function and evolve. This is done at both the intra-cellular and inter-cellular levels by simulating events that have been discovered from data gathered in genomic, transcriptomic, proteomic and metabolomic experiments. They want to create multi-level models ranging from individual cells to complete organisms; these models can then be used in drug research to understand the effects and side-effects of drugs (e.g. to develop personalised medicine). The second part of the talk was a general introduction to agent-based simulation and its main concepts.

For those who are interested in Systems Biology, Emanuela recommended a book called The Music of Life by Denis Noble (2008).

Domenico Saccà (University of Calabria) talked about workflow management systems in distributed architectures, where the goal is to manage the flow of work such that the work is done at the right time and by the right person/component.

Monday, 27 September 2010

ULSS Doctoral School - Day 6

Murray Cantor (IBM) talked about ULSS from an industrial perspective, with some hypothetical examples of SoS, the challenges they bring, the state of practice today, and some opportunities and research questions for academia. Murray is a Distinguished Engineer at IBM, but before that he was at Berkeley, working in the area of non-linear mathematics – so he has a good background in both academic research and the state of the art in practice. His lectures mostly focused on the engineering problems of SoS, much like Linda Northrop’s first ULSS lecture but without the military motivations – Murray used examples from healthcare and transport to illustrate the problems. His last lecture was about governance in SoS, and the different approaches that can be taken to deal with uncertainty and risk. He also talked about calculating the expected value of projects that alter or add to an SoS (so you can make decisions about whether they should proceed or not).
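To make the expected-value idea concrete, here is a minimal sketch in Python. It is not Murray's method: the outcome list, the probabilities, the monetary values and the "proceed if the expected value is positive" rule are all assumptions made up for illustration.

```python
import math


def expected_value(outcomes):
    """Return the probability-weighted net value of a project.

    `outcomes` is a list of (probability, net_value) pairs; both the
    structure and the numbers used below are hypothetical.
    """
    total_probability = sum(p for p, _ in outcomes)
    if not math.isclose(total_probability, 1.0):
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * value for p, value in outcomes)


def should_proceed(outcomes, threshold=0.0):
    """Assumed decision rule: go ahead only if the expected value beats the threshold."""
    return expected_value(outcomes) > threshold


# Hypothetical project that adds a component to an SoS (values in thousands).
project = [
    (0.5, 400.0),    # integrates cleanly and delivers the full benefit
    (0.25, 0.0),     # works, but the benefit is cancelled out by rework
    (0.25, -200.0),  # disrupts other systems and has to be rolled back
]

print(expected_value(project), should_proceed(project))
```

A real decision would also have to account for how uncertain the probabilities themselves are, which is where the governance and risk discussion comes in.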

Friday, 24 September 2010

ULSS Doctoral School - Day 5

Alex Wolf continued from where he left off yesterday and discussed how their group has been addressing the challenges of automating experiments in large-scale systems. He described how realistic workloads can be generated and how user behaviour can be modelled (using computer programs to specify sequences of actions, choices and decisions, and the communication between actors). The challenges around (and approaches to) the repeatability of experiments in distributed environments were also discussed, e.g. using the PlanetLab test-bed and repeating experiments over the internet until the required confidence level is obtained, or using emulation to create and control all parameters in a network. Finally, Alex talked about their current research, which aims to get their tool to a stage where a programmer can provide it with a hypothesis about the system under investigation and ask the tool to come up with experimental designs that can be used to validate or disprove the hypothesis. This is difficult as there are many parameters that could be studied (e.g. bandwidth, latency, failure rates, response times), and it is not obvious how these parameters should be sampled or at what scale they should be studied.
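The idea of repeating an experiment until the required confidence level is obtained can be sketched as a simple stopping rule. This is only an illustration and not how Weevil works: the function names, the normal-approximation confidence interval, the 5% relative tolerance and the fake latency measurement are all assumptions.

```python
import random
import statistics
from statistics import NormalDist


def repeat_until_confident(run_once, confidence=0.95, rel_halfwidth=0.05,
                           min_runs=5, max_runs=1000):
    """Repeat run_once() until the confidence interval is tight enough.

    Stops when the half-width of the (normal-approximation) confidence
    interval is at most rel_halfwidth * |mean|, or when max_runs is reached.
    Returns (mean, halfwidth, number_of_runs).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    samples = [run_once() for _ in range(min_runs)]
    while True:
        mean = statistics.fmean(samples)
        halfwidth = z * statistics.stdev(samples) / len(samples) ** 0.5
        if halfwidth <= rel_halfwidth * abs(mean) or len(samples) >= max_runs:
            return mean, halfwidth, len(samples)
        samples.append(run_once())


# Hypothetical stand-in for one run of a distributed experiment:
# a noisy response-time measurement in milliseconds.
rng = random.Random(0)


def simulated_latency_ms():
    return rng.gauss(100.0, 10.0)


mean, halfwidth, runs = repeat_until_confident(simulated_latency_ms)
print(f"{mean:.1f} ms +/- {halfwidth:.1f} ms after {runs} runs")
```

On a shared test-bed like PlanetLab the runs are not really independent or identically distributed, so a rule like this is optimistic; that is part of why emulation, where all network parameters are controlled, is attractive.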

Fausto Giunchiglia (University of Trento) talked about the complexity of knowledge representation and management, where the complexity arises from the diversity that is present in large-scale data sets (e.g. the web). He highlighted the need for new methodologies for knowledge representation and management, and described his group's efforts in developing such methods and tools for the management, control and use of emergent knowledge properties (http://entitypedia.org). Fausto’s take-home message was that no matter how good we make these technologies, they won’t work until we find the right incentives to motivate people to share their knowledge across organisations and departments, and this is currently an open question.

Thursday, 23 September 2010

ULSS Doctoral School - Day 4

Carlo Ghezzi (Politecnico di Milano) talked about adaptive evolvable systems (where adaptation refers to the ability of software to detect changes and react in a self-managed manner, and evolution requires the intervention of a designer). He started with a brief history of early software engineering approaches, which did not follow a precisely formulated process and assumed that organizations are monolithic and stable (so change is avoided). But this assumption is usually incorrect and so maintenance is required. Traditionally, maintenance has been done offline, and can be corrective maintenance to fix bugs, adaptive maintenance to satisfy changes in the environment, or perfective maintenance to satisfy changes in the requirements. Adaptive systems attempt to reduce the maintenance effort by modelling and reasoning about the goals, requirements and assumptions that a system has (I guess the underlying idea is related to autonomic computing). One approach that could be taken is to use inheritance (in the object-oriented sense) with dynamic code generation to create code that extends existing classes. As long as the method interfaces do not change, the method body can be dynamically changed at run-time using inheritance and polymorphism.
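A rough way to picture the inheritance-based approach is the Python sketch below, which creates a subclass at run-time and overrides one method while keeping its interface. This is my own illustration, not code from the lecture: the `Greeter` class, the `make_adapted_class` helper and the use of Python's built-in `type()` in place of a real code generator are all assumptions.

```python
class Greeter:
    """Existing class whose public interface must stay stable."""

    def greet(self, name: str) -> str:
        return f"Hello, {name}"


def make_adapted_class(base, method_name, new_body):
    """Create a subclass of `base` at run-time that overrides one method.

    The override keeps the method name, so callers written against the
    base-class interface keep working through polymorphism.
    """
    if not callable(getattr(base, method_name, None)):
        raise AttributeError(f"{base.__name__} has no method {method_name!r}")
    return type(f"Adapted{base.__name__}", (base,), {method_name: new_body})


def formal_greet(self, name: str) -> str:
    # New behaviour with the same signature as Greeter.greet.
    return f"Good day, {name}"


def welcome(greeter: Greeter, name: str) -> str:
    # Client code depends only on the Greeter interface.
    return greeter.greet(name)


# The adapted class is generated while the program is running.
FormalGreeter = make_adapted_class(Greeter, "greet", formal_greet)

print(welcome(Greeter(), "Ada"))
print(welcome(FormalGreeter(), "Ada"))
```

The client function `welcome` never changes; only the object passed to it does, which is the point of keeping the method interfaces fixed.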

Alexander Wolf (Imperial College) talked about the automation of experiments on large-scale systems (computer science experiments such as investigating properties of distributed hash tables or web applications under different workloads). The challenges in experimenting with such systems are: how do you generate workloads that are realistic, how do you ensure that the experiments are repeatable, and how do you design the actual experiments – e.g. what properties do you measure, how do you measure them. Alex then introduced the Weevil tool that addresses some of these challenges; this will be discussed in more detail in tomorrow’s lecture.

Local village near the hotel

Dinner last night at the Pappone

Lunch at the hotel

Wednesday, 22 September 2010

ULSS Doctoral School - Day 3

Dave Cliff kicked off day 3 with a story of technology failures and snapshots of engineering failures in history. He also talked about the NY stock exchange crash on 6 May 2010, when 1 trillion dollars disappeared from the market and then re-appeared within a single day. The rest of the first talk was the LSCITS story, why it was started and where it’s going...

Lecture 2 was about Market-Based LSCITS and data centre resource management using market-based approaches (e.g. cooling units and servers all trade their “offerings” in an artificial market in the data centre). Dave took us through a brief background in economics and trading, then described his work that led to the ZIP trading algorithm.

Lecture 3 was about the growth, scale and failure of LSCITS and summarised a dozen books about: the problem (e.g. Eating the IT Elephant, 2008), the scale of the problem and what happens if it’s not addressed (e.g. Management of Scale, 1992; The Challenger Launch Decision, 1997), and how resilient engineering can address some of the problems (e.g. Resilient Engineering, 2006). The books and papers from Dave’s talk can be found on his Mendeley page.

Overall, a day full of pointers and ideas that sparked a lot of discussion amongst the students.
