Carmela Comito, Simon Patarin, and Domenico Talia. A semantic overlay network for P2P schema-based data integration. In Proceedings of the 11th IEEE Symposium on Computers and Communications (ISCC'06), pages 88-94, Pula-Cagliari, Italy, June 2006.
[ bib | doi | pdf ]
Today data sources are pervasive and their number is growing tremendously. Current tools are not prepared to exploit this unprecedented amount of information and to cope with this highly heterogeneous, autonomous and dynamic environment. In this paper, we propose a novel semantic overlay network architecture, PARIS, aimed at addressing these issues. In PARIS, the combination of decentralized semantic data integration with gossip-based (unstructured) overlay topology management and (structured) distributed hash tables provides the required level of flexibility, adaptability and scalability, while still allowing rich queries to be performed on a number of autonomous data sources. We describe the logical model that supports the architecture and show how its original topology is constructed. We present the usage of the system in detail, in particular the algorithms used to let new peers join the network and to execute queries on top of it, and show simulation results that assess the scalability and robustness of the architecture.
Sidath Handurukande, Anne-Marie Kermarrec, Fabrice Le Fessant, Laurent Massoulié, and Simon Patarin. Peer sharing behaviour in the eDonkey network, and implications for the design of server-less file sharing systems. In Proceedings of the First EuroSys Conference (EuroSys 2006), pages 359-371, Leuven, Belgium, April 2006.
[ bib | pdf ]
In this paper we present an empirical study of a workload gathered by crawling the eDonkey network - a dominant peer-to-peer file sharing system - for over 50 days. We first confirm the presence of some known features, in particular the prevalence of free-riding and the Zipf-like distribution of file popularity. We also analyze the evolution of document popularity. We then provide an in-depth analysis of several clustering properties of such workloads. We measure the geographical clustering of peers offering a given file. We find that most files are offered mostly by peers of a single country, although popular files don't have such a clear home country. We then analyze the overlap between contents offered by different peers. We find that peer contents are highly clustered according to several metrics of interest. We propose to leverage this property by allowing peers to search for content without server support, by querying suitably identified semantic neighbours. We find via trace-driven simulations that this approach is generally effective, and is even more effective for rare files. If we further allow peers to query both their semantic neighbours and, in turn, their neighbours' neighbours, we attain hit rates of over 55% for neighbour lists of size 20.
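The server-less search described in this abstract can be illustrated with a minimal Python sketch. It assumes each peer already holds a list of semantic neighbours; how those lists are identified is not given here, so they are plain inputs. The names `semantic_search`, `neighbours` and `shared` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def semantic_search(
    peer: str,
    wanted: str,
    neighbours: Dict[str, List[str]],
    shared: Dict[str, Set[str]],
    two_hop: bool = False,
) -> List[str]:
    """Return the peers able to serve `wanted`, asking only semantic neighbours.

    `neighbours` maps a peer to its (assumed, precomputed) neighbour list;
    `shared` maps a peer to the set of files it offers.
    """
    first_hop = neighbours.get(peer, [])
    candidates = list(first_hop)
    if two_hop:
        # Also ask the neighbours of each neighbour, skipping the requester
        # and any peer that is already a candidate.
        for n in first_hop:
            for nn in neighbours.get(n, []):
                if nn != peer and nn not in candidates:
                    candidates.append(nn)
    return [p for p in candidates if wanted in shared.get(p, set())]


# Toy data: "a" knows "b", and "b" knows "c".
neighbours = {"a": ["b"], "b": ["c"], "c": []}
shared = {"a": {"f1"}, "b": {"f1", "f2"}, "c": {"f3"}}

one_hop_hits = semantic_search("a", "f3", neighbours, shared)
two_hop_hits = semantic_search("a", "f3", neighbours, shared, two_hop=True)
```

In this toy setting, the file held only by "c" is missed by the one-hop query from "a" and found once the neighbours' neighbours are included, which is the effect the abstract quantifies with hit rates on real traces.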
Huaigu Wu, Bettina Kemme, Alberto Bartoli, and Simon Patarin. A replication toolkit for J2EE application servers. In Proceedings of the ACM/IFIP/USENIX 6th International Middleware Conference (MiddleWare '05) - Demonstrators Track, Grenoble, France, November 2005.
[ bib | pdf ]
Web service technology allows organizations to provide programmatic interfaces to the services they export. In most cases, these services are implemented with a multi-tier architecture consisting of a client external to the organization, a middle-tier and a back-end tier. The middle-tier typically uses the infrastructure of an application server (AS) whereas the back-end tier consists of a database system. In this demo we present our work for enhancing current AS technology with flexible and transparent failure management. We consider multi-tiered services based on the J2EE architecture and replicate the middle-tier for fault-tolerance. The novelty of our contribution consists in the guarantees we provide with respect to failures.
Alberto Bartoli, Ricardo Jiménez-Peris, Bettina Kemme, Cesare Pautasso, Simon Patarin, Stuart Wheater, and Simon Woodman. The ADAPT framework for adaptable and composable web services. IEEE Distributed Systems Online, 6(9), September 2005.
[ bib | html ]
Organizations are increasingly using the Web not only to sell products and deliver information, but also for providing their services to businesses and individual customers. Typically, the provision of such services by organizations requires the construction of applications that integrate existing enterprise information systems to offer new business functions. Organizations need to ensure that these services are available, scalable and also autonomic to guarantee that user interactions are promptly processed even under highly volatile conditions. In most cases, organizations deliver these services by means of application servers with a multi-tier architecture whose functionalities are accessed as web services. We refer to services not relying on other web services as Basic Services (BSs). The presence of a wide variety of BSs over the Internet has created an exciting new business opportunity of providing value added, inter-organizational services by composing multiple BSs into new Composite Services (CSs).
Carmela Comito, Simon Patarin, and Domenico Talia. PARIS: A peer-to-peer architecture for large-scale semantic data integration. In Proceedings of the Third International Workshop on Databases, Information Systems, and Peer-to-Peer Computing (DBISP2P 2005), volume 4125 of Lecture Notes in Computer Science, Trondheim, Norway, August 2005.
[ bib ]
Assia Hachichi, Cyril Martin, Gaël Thomas, Bertil Folliot, and Simon Patarin. A generic language for dynamic adaptation. In Proceedings of the 11th International Euro-Par Conference (Euro-Par 2005), pages 40-49, Lisboa, Portugal, August 2005.
[ bib | doi | pdf ]
Today, component-oriented middleware is used to design, develop and deploy distributed applications easily. It ensures the heterogeneity, interoperability, and reuse of software modules. Several standards address this issue: CCM (CORBA Component Model), EJB (Enterprise Java Beans) and .Net. However, they offer a limited and fixed number of system services, and their deployment and configuration mechanisms cannot be used dynamically from an arbitrary language or API. As a solution, we present a generic high-level language to adapt system services dynamically in existing middleware. This solution is based on a highly adaptable platform which enforces adaptive behaviours, and offers a means to specify and adapt system services dynamically. A first prototype was implemented for the OpenCCM platform, and good performance results were obtained.
David Hales and Simon Patarin. Computational sociology for systems "in the wild": the case of BitTorrent. IEEE Distributed Systems Online, 6(7), July 2005.
[ bib | doi | html | pdf ]
It is generally agreed that future software systems should be open, distributed, self-organizing, scalable and robust. Fully distributed systems cannot rely on centralized control and open systems cannot ensure that malicious and/or selfish components do not invade the system. The requirement for high scalability means that systems should run at least as well, and ideally better, when scaled to millions of units. How does one begin to formulate methods, techniques and protocols that can deliver on these tough demands? One approach, often adopted within the Multi-Agent Systems (MAS) community, is to start from scratch, designing agents and platforms with provable properties using specialized logics and/or sophisticated simulation models. However, this approach is particularly difficult when dealing with open systems containing adaptive agents. This is because the designer cannot be sure how other agents will behave in future states of the system. Worse, much of the desirable behavior of the system as a whole, such as high levels of altruism or cooperation, often results from emergent properties which are little understood and not easily reducible to individual behaviors. However, progress is being made.
Simon Patarin and Mesaac Makpangou. Pandora: an efficient platform for the construction of autonomic applications. In Self-Star Properties in Complex Information Systems, volume 3460 of Lecture Notes in Computer Science, pages 291-306, May 2005.
[ bib | doi | pdf ]
Autonomic computing has been proposed recently as a way to address the difficult management of applications whose complexity is constantly increasing. Autonomic systems will have to diagnose the problems they face themselves, devise solutions and act accordingly. In consequence, they require a very high level of flexibility and the ability to constantly monitor themselves. This work presents a framework, Pandora, which eases the construction of applications that satisfy this double goal. Pandora relies on an original application programming pattern - based on stackable layers and message passing - to obtain a minimalist model and architecture that allows control of the overhead imposed by the full reflexivity of the framework. A prototype of the framework has been implemented in C++, freely available for download on the Internet. A detailed performance study is given, together with examples of use, to assess the usability of the platform in real usage conditions.
David Hales and Simon Patarin. How to cheat BitTorrent and why nobody does. Technical Report UBLCS-2005-12, University of Bologna, May 2005.
[ bib | pdf ]
The BitTorrent peer-to-peer file-sharing system attempts to build robustness to free-riding by implementing a tit-for-tat-like strategy within its protocol. It is often believed that this strategy alone is responsible for the high levels of cooperation found within the BitTorrent system. However, we highlight some of the weaknesses of the approach and indicate where it would be easy to cheat and free-ride. Given that cheating of this kind currently appears rare, this motivates the question: why is the system not dominated by free-riders? We advance a hypothesis which argues that BitTorrent may resist free-riders in a way that has not previously been fully understood. Ironically, this process relies on what is commonly believed to be a weakness of BitTorrent - the lack of meta-data search. One consequence of this is to partition the BitTorrent network into numerous isolated swarms - often with several independent swarms for an identical file - which is one of the necessary conditions for a kind of evolutionary group selective process, a process that has been recently identified in similar simulated systems. A further implication of the hypothesis is that, given the choice, users may choose unconditional altruism rather than the more restrictive reciprocal tit-for-tat approach as a result of the same group selective process.
Ozalp Babaoglu, Alberto Bartoli, Vance Maverick, Simon Patarin, Jaksa Vuckovic, and Huaigu Wu. A Framework for Prototyping J2EE Replication Algorithms. In Proceedings of the International Symposium on Distributed Objects and Applications (DOA 2004), volume 3291 of Lecture Notes in Computer Science, pages 1413-1426, Larnaca, Cyprus, October 2004.
[ bib | doi | pdf ]
In application server systems, such as J2EE, replication is an essential strategy for reliability and efficiency. Many J2EE implementations, both commercial and open-source, provide some replication support. However, the range of possible strategies is wide, and the choice of the best one, depending on the expected application profile, remains an open research question. To support research in this area, we introduce a framework for prototyping J2EE replication algorithms. In effect, it divides replication code into two layers: the framework itself, which is common to all replication algorithms, and a specific replication algorithm, which is "plugged in" to the framework. The division is defined by an API. The framework simplifies development in two ways. First, it keeps much of the complexity of modifying a J2EE implementation within the framework layer, which is implemented only once. Second, through the API, the replication algorithm sees a highly abstracted view of the components in the server. This frees the designer to concentrate on the important issues that are specific to a replication algorithm, such as communication. We have implemented the framework by extending the open-source J2EE server. Compared to an unmodified server, the framework adds a performance cost of about 22%. Thus, it is quite practical for the initial development and evaluation of replication algorithms. Several algorithms have already been implemented within the framework.
Keywords: J2EE, replication, pluggable framework
Assia Hachichi, Cyril Martin, Gaël Thomas, Simon Patarin, and Bertil Folliot. Reconfigurations dynamiques de services dans un intergiciel à composants CORBA CCM. In Actes de la 1ère Conférence Francophone sur le Déploiement et la (Re) Configuration de Logiciels (DECOR'04), pages 159-170, Grenoble, France, October 2004. In French.
[ bib | ps | pdf ]
Today, component-based middleware is used to design, develop and deploy distributed applications easily, and to ensure heterogeneity and interoperability, as well as the reuse of software modules and the separation between the business code encapsulated in components and the system code managed by containers. Many standards fit this definition, such as CCM (CORBA Component Model), EJB (Enterprise Java Beans) and .NET. However, these standards offer a limited and fixed number of system services, which rules out adding system services or dynamically reconfiguring the middleware. Our work proposes mechanisms for dynamically adding and adapting system services, based on a reconfiguration language that can itself be adapted dynamically to the needs of the reconfiguration, and on a dynamic reconfiguration tool. A prototype has been built for the OpenCCM platform from LIFL.
Keywords: dynamic adaptation and reconfiguration, adaptable containers, CCM, MVV
Simon Patarin and Mesaac Makpangou. Pandora: une plate-forme efficace pour la construction d'applications autonomes. In Actes de la 1ère Conférence Francophone sur le Déploiement et la (Re) Configuration de Logiciels (DECOR'04), pages 15-26, Grenoble, France, October 2004. In French.
[ bib | ps | pdf ]
Autonomic computing has recently been proposed as an answer to the difficulty of managing, day to day, applications whose complexity keeps increasing. Autonomic applications will have to be particularly flexible and able to monitor themselves continuously. This study presents a platform, Pandora, that eases the construction of applications satisfying this double goal. Pandora relies on an original application programming style, based on layer composition and message passing, to arrive at a minimalist model and architecture that let it control the overheads imposed by the full reflexivity of the platform. A working prototype of the platform has also been developed in C++. A detailed performance study, together with usage examples, completes this presentation.
Keywords: autonomic computing, component model, dynamic reconfiguration
Simon Patarin. Pandora: support pour des services de métrologie à l'échelle d'Internet. PhD thesis, Université Pierre et Marie Curie - Paris 6, June 2003.
[ bib | ps.gz | pdf ]
This thesis presents an architecture model for the design of monitors that collect, in a distributed manner, the information needed to adapt distributed Internet applications to the ever-changing conditions of their environment. Flexible monitors, deployed at the participating sites, are coordinated through a distributed control and dissemination service. This architecture is founded on the notion of highly flexible components, assembled into stacks that define the processing to perform in order to capture a given metric. A prototype (Pandora) implementing this architecture has been developed and used as the execution support for several applications; one of them, in particular, relies on Pandora to perform detailed monitoring of the HTTP protocol based on passive capture of network packets.
Keywords: metrology, components, flexibility, Internet
Frédéric Ogel, Simon Patarin, Ian Piumarta, and Bertil Folliot. C/SPAN: a Self-Adapting Web Proxy Cache. In Proceedings of the Autonomic Computing Workshop (AMS 2003), pages 178-185, Seattle, WA, June 2003.
[ bib | pdf ]
In response to the exponential growth of Internet traffic, web proxy caches are deployed everywhere. Nonetheless, their efficiency relies on a large number of intrinsically dynamic parameters, most of which cannot be predicted statically. Furthermore, in order to react to changing execution conditions (such as network resources, user behavior or flash crowds), or to update the web proxy with new protocols, services or even algorithms, the entire system must be dynamically adapted. Our response to this problem is a self-adapting Web proxy cache, C/SPAN, that applies administrative strategies to adapt itself and react to external events. Because it is completely flexible, even these adaptation policies can be dynamically adapted.
Simon Patarin and Mesaac Makpangou. Continuous Measurement of Web Proxy Cache Efficiency. In Electronic Proceedings of the 12th International World Wide Web Conference (WWW2003), Budapest, Hungary, May 2003.
[ bib | html | ps.gz | pdf ]
This abstract presents how Pandora, our flexible monitoring platform, can be used to continuously measure the efficiency of a system of cooperating proxy caches.
Keywords: network monitoring, World-Wide Web, proxy cache, evaluation
Fabrice Le Fessant and Simon Patarin. MLdonkey, a Multi-Network Peer-to-Peer File-Sharing Program. Research Report RR-4797, INRIA, April 2003.
[ bib | ps.gz | pdf ]
Many designers of functional languages have one dream: finding a killer application, outside of the world of symbolic programming (compilers, theorem provers, DSLs), that would make their language spread in the open-source community. One year ago, we tackled this problem, and decided to use Objective Caml to program a network application in the emerging world of peer-to-peer systems. The result of our work, MLdonkey, has exceeded our hopes: it is currently the most popular peer-to-peer file-sharing client on the well-known 'freshmeat.net' site, with about 10,000 daily users. Moreover, MLdonkey is the only client able to connect to several peer-to-peer networks, to download and share files. It works as a daemon, running unattended on the computer, and can be controlled remotely using three different kinds of interfaces. In this paper, we present the lessons we learnt from its design and implementation.
Keywords: peer-to-peer, file sharing, functional programming
Simon Patarin and Mesaac Makpangou. On-line Measurement of Web Proxy Cache Efficiency. Research Report RR-4782, INRIA, March 2003.
[ bib | ps.gz | pdf ]
This report presents how Pandora, our flexible monitoring platform, can be used to continuously measure the efficiency of a system of cooperating proxy caches. It circumvents many of the drawbacks of existing tools: Pandora integrates all stages involved in the evaluation process, it operates in real-time, it does not depend on specific cache software, and it can be adapted to any specific system configuration. We detail how this can be achieved using the flexibility offered by Pandora. We also present two experiments that illustrate the use of these techniques: the first one evaluates the proxy cache deployed at INRIA Rocquencourt, the second one measures the efficiency of cooperating caches in an artificial environment. Finally, we describe how we plan to integrate these measurements inside a self-adaptive Web proxy cache.
Keywords: network monitoring, Web proxy cache, measurement, efficiency
Simon Patarin and Mesaac Makpangou. Pandora: A Flexible Network Monitoring Platform. In Proceedings of the USENIX 2000 Annual Technical Conference, pages 27-40, San Diego, CA, June 2000.
[ bib | html | ps.gz | pdf ]
This paper presents Pandora, a network monitoring platform that captures packets using purely passive techniques. Pandora addresses current needs for improving Internet middleware and infrastructure by providing both in-depth understanding of network usage and metrics to compare existing protocols. Pandora is flexible and easy to use and deploy. The elementary monitoring tasks are encapsulated as independent entities we call monitoring components. The actual packet analysis is performed by stacking the appropriate components. Pandora also preserves user privacy by allowing control of the "anonymization" policy. Finally, the evaluation we conducted shows that overheads due to Pandora's flexibility do not significantly affect performance. Pandora is fully functional and has already been used to collect Web traffic traces at INRIA Rocquencourt.
Keywords: network monitoring, passive capture, flexibility, components
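The idea of analysing packets by stacking independent monitoring components can be sketched as follows. This is only an illustration in Python and not Pandora's actual interface: the component names (`keep_http`, `anonymize`), the packet representation as a dictionary, and the `run_stack` helper are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Iterable, List, Optional

# A component takes one item and returns a transformed item, or None to drop it.
Component = Callable[[dict], Optional[dict]]


def keep_http(packet: dict) -> Optional[dict]:
    """Hypothetical filter component: keep only packets sent to port 80."""
    return packet if packet.get("dst_port") == 80 else None


def anonymize(packet: dict) -> Optional[dict]:
    """Hypothetical privacy component: replace the source address by a hash prefix."""
    out = dict(packet)
    out["src"] = hashlib.sha256(packet["src"].encode()).hexdigest()[:8]
    return out


def run_stack(stack: List[Component], packets: Iterable[dict]) -> List[dict]:
    """Push each packet through the components in order; stop at the first drop."""
    results = []
    for packet in packets:
        item: Optional[dict] = packet
        for component in stack:
            item = component(item)
            if item is None:
                break
        if item is not None:
            results.append(item)
    return results


packets = [
    {"src": "10.0.0.1", "dst_port": 80},
    {"src": "10.0.0.2", "dst_port": 22},
]
traces = run_stack([keep_http, anonymize], packets)
```

Changing the analysis then amounts to changing the list passed to `run_stack`, for instance dropping `anonymize` or inserting another component, without touching the others.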
Simon Patarin. Pandora : un système de collecte de traces du trafic Web de communautés d'utilisateurs réparties. Rapport de Recherche RR-3743, INRIA, July 1999. In French.
[ bib | ps.gz | pdf ]
Pandora collects the information needed to characterize the Web traffic of a distributed community of users. The information is obtained by reconstructing HTTP traffic directly from network packets. Architecturally, Pandora consists of three cooperating software components: a collector, an observer and a coordinator, which can be deployed at different points in the network. Internally, each component is implemented as a series of filters. This architecture allows great flexibility in use and deployment. The traces produced by Pandora give detailed information about user profiles, servers, accessed documents, the network and caches. They can be used to determine the caching or replication policy that offers users the best possible quality of service.
Keywords: trace, packet capture, cache, Web, characterization