Alex Wolf continued from where he left off yesterday, discussing how his group has been addressing the challenges of automating experiments in large-scale systems. He described how realistic workloads can be generated and how user behaviour can be modelled (using computer programs to specify sequences of actions, choices and decisions, and communication between actors). He also discussed the challenges around (and approaches to) the repeatability of experiments in distributed environments, e.g. using the PlanetLab test-bed and repeating experiments over the internet until the required confidence level is obtained, or using emulation to create and control all parameters in a network. Finally, Alex talked about their current research aimed at getting their tool to a stage where a programmer can provide it with a hypothesis about the system under investigation and ask the tool to come up with experimental designs that can validate or disprove that hypothesis. This is difficult because there are many parameters that could be studied (e.g. bandwidth, latency, failure rates, response times), and it is not obvious how these parameters should be sampled or at what scale they should be studied.
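The idea of repeating an experiment until the required confidence level is obtained can be sketched as follows. This is a minimal illustration, not Alex's actual tool: the `run_experiment` function is a hypothetical stand-in for one noisy trial (e.g. a latency measurement over the internet), and the stopping rule uses a simple normal-approximation confidence interval.

```python
import math
import random

def run_experiment():
    # Hypothetical stand-in for one distributed trial: a noisy
    # latency measurement (true mean 100 ms, std dev 5 ms).
    return 100.0 + random.gauss(0.0, 5.0)

def repeat_until_confident(trial, z=1.96, max_half_width=1.0, max_runs=10_000):
    """Repeat a noisy experiment until the ~95% confidence interval
    for the mean is narrower than max_half_width (normal approximation).
    Returns (sample mean, CI half-width, number of runs)."""
    samples = []
    half_width = float("inf")
    while len(samples) < max_runs:
        samples.append(trial())
        n = len(samples)
        if n < 2:
            continue  # need at least two samples to estimate variance
        mean = sum(samples) / n
        var = sum((x - mean) ** 2 for x in samples) / (n - 1)
        half_width = z * math.sqrt(var / n)
        if half_width <= max_half_width:
            return mean, half_width, n
    return sum(samples) / len(samples), half_width, len(samples)

random.seed(0)  # fixed seed so the sketch is reproducible
mean, hw, n = repeat_until_confident(run_experiment)
```

In an uncontrolled environment like PlanetLab, the appeal of this stopping rule is that the number of repetitions adapts to the observed variance rather than being fixed in advance.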
Fausto Giunchiglia (University of Trento) talked about the complexity of knowledge representation and management, where the complexity arises from the diversity present in large-scale data sets (e.g. the web). He highlighted the need for new methodologies for knowledge representation and management, and described his group's efforts in developing such methods and tools for the management, control and use of emergent knowledge properties (http://entitypedia.org). Fausto's take-home message was that no matter how good we make these technologies, they won't work until we find the right incentives to motivate people to share their knowledge across organisations and departments, and this is currently an open question.