Challenges of Large Applications in Distributed Environments, 2006 IEEE

Date: 19 June 2006

Displaying results 1-21 of 21
  • Proceedings. Challenges of Large Applications in Distributed Environments (IEEE Cat. No.06EX1397)

    Publication Year: 2006
    PDF (116 KB)
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2006, Page(s): ii
    PDF (55 KB)
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2006, Page(s): iii-vi
    PDF (102 KB)
    Freely Available from IEEE
  • Welcome from the Workshop Chairs

    Publication Year: 2006, Page(s): vii-viii
    PDF (86 KB) | HTML
    Freely Available from IEEE
  • Gridcast - A next generation broadcasting infrastructure?

    Publication Year: 2006, Page(s): 1-2
    PDF (110 KB) | HTML

    Summary form only given. Dr. Harmer will address the development of Gridcast, a prototype broadcasting grid developed in conjunction with the BBC, which has been deployed in the field for two years. He will address what grid ideas bring to the development of a secure, high-performance broadcasting infrastructure, and to the evolution, deployment and management of grids in a dynamic and highly demanding real-time environment such as broadcasting.

  • Applications and data [breaker page]

    Publication Year: 2006, Page(s): 3-4
    PDF (108 KB)
    Freely Available from IEEE
  • A new paradigm in data intensive computing: Stork and the data-aware schedulers

    Publication Year: 2006, Page(s): 5-12
    Cited by: Papers (1)
    PDF (394 KB) | HTML

    The unbounded increase in the computation and data requirements of scientific applications has necessitated the use of widely distributed compute and storage resources to meet the demand. In a widely distributed environment, data is no longer locally accessible and must therefore be retrieved from and stored at remote locations. Efficient and reliable access to data sources and archiving destinations in such an environment brings new challenges. Placing data on temporary local storage devices offers many advantages, but such "data placements" also require careful management of storage resources and data movement, i.e. allocating storage space, staging in input data, staging out generated data, and de-allocating local storage after the data is safely stored at the destination. Traditional systems closely couple data placement and computation, treating data placement as a side effect of computation. Data placement is either embedded in the computation, delaying it, or performed by simple scripts that do not have the privileges of a job. The insufficiency of traditional systems and existing CPU-oriented schedulers in dealing with this complex data handling problem has given rise to a new class of schedulers: data-aware schedulers. One of the first examples of such schedulers is the Stork data placement scheduler. In this paper, we discuss the limitations of traditional schedulers in handling the challenging data scheduling problems of large-scale distributed applications, give our vision for the new paradigm in data-intensive scheduling, and elaborate on our case study, the Stork data placement scheduler.

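The core idea in the Stork abstract above, treating each data-placement step (allocate, stage in, stage out, release) as a first-class, retryable job rather than a side effect of computation, can be sketched in a few lines. This is a hypothetical illustration, not Stork's actual interface; all class, job and file names are invented.

```python
# Sketch of data placement as first-class, schedulable jobs (names invented).
from dataclasses import dataclass
from typing import Callable

@dataclass
class PlacementJob:
    """A data-placement step with its own lifecycle, like a compute job."""
    name: str
    action: Callable[[], object]
    max_retries: int = 2

class DataAwareScheduler:
    """Runs placement jobs in order, retrying transient failures."""
    def __init__(self):
        self.queue: list[PlacementJob] = []
        self.log: list[str] = []

    def submit(self, job: PlacementJob) -> None:
        self.queue.append(job)

    def run(self) -> None:
        for job in self.queue:
            for attempt in range(1 + job.max_retries):
                try:
                    job.action()
                    self.log.append(f"{job.name}: done")
                    break
                except OSError:
                    self.log.append(f"{job.name}: retry {attempt + 1}")
            else:
                self.log.append(f"{job.name}: failed")

# A toy "storage system": allocate scratch, stage in, stage out, release.
storage = {}
sched = DataAwareScheduler()
sched.submit(PlacementJob("allocate", lambda: storage.setdefault("scratch", [])))
sched.submit(PlacementJob("stage-in", lambda: storage["scratch"].append("input.dat")))
sched.submit(PlacementJob("stage-out", lambda: storage.setdefault("archive", []).append("out.dat")))
sched.submit(PlacementJob("release", lambda: storage.pop("scratch")))
sched.run()
```

After `run()`, the scratch space has been de-allocated and only the archived output remains, with every placement step logged and retryable independently of any compute job.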
  • Implementation of a distributed rendering environment for the TeraGrid

    Publication Year: 2006, Page(s): 13-21
    Cited by: Papers (7)
    PDF (558 KB) | HTML

    This paper discusses the implementation of a distributed rendering environment (DRE) utilizing the TeraGrid. Using the new system, researchers and students across the TeraGrid have access to available resources for distributed rendering. Previously, researchers at universities and national labs using high-end rendering software, such as the RenderMan-compliant Pixie, were often limited by the amount of time it takes to calculate (render) their final images. The time required to render introduces several potential complications in a research setting. In contrast, a typical animation studio has a render farm: a cluster of computers (nodes) used to render 3D images, known as a distributed rendering environment. By spreading the rendering across hundreds of machines, the overall render time is reduced significantly. Unfortunately, most researchers do not have access to a distributed rendering environment. Our university has been developing a DRE for local use. However, because we are a TeraGrid site, we recently modified our DRE implementation to make use of open-source rendering tools and grid tools such as Condor, in order to make the DRE available to other TeraGrid users.

  • Cross-Grid Applications

    Publication Year: 2006, Page(s): 23-24
    PDF (108 KB)

  • CyberInfrastructure for the analysis of ecological acoustic sensor data: a use case study in grid deployment

    Publication Year: 2006, Page(s): 25-33
    Cited by: Papers (1)
    PDF (755 KB) | HTML

    The LTER grid pilot study was conducted by the National Center for Supercomputing Applications, the University of New Mexico, and Michigan State University to design and build a prototype grid for the ecological community. The featured grid application, the Biophony Grid Portal, manages acoustic data from field sensors and allows researchers to conduct real-time digital signal processing analysis on high-performance systems via a Web-based portal. Important characteristics addressed during the study include the management, access, and analysis of a large set of field-collected acoustic observations from microphone sensors, single sign-on, and data provenance. During the development phase of this project, new features were added to standard grid middleware and have already been successfully leveraged by other, unrelated grid projects. This paper provides an overview of the Biophony Grid Portal application and its requirements, discusses considerations regarding grid architecture and design, details the technical implementation, and summarizes key experiences and lessons learned that are generally applicable to developers and administrators in a grid environment.

  • NEKTAR, SPICE and Vortonics: using federated grids for large scale scientific applications

    Publication Year: 2006, Page(s): 34-42
    Cited by: Papers (3)
    PDF (373 KB) | HTML

    In response to a joint call from the US NSF and the UK EPSRC for applications that aim to utilize the combined computational resources of the US and UK, three computational science groups from UCL, Tufts and Brown Universities teamed up with a middleware team from NIU/Argonne to meet the challenge. Although the groups had three distinct codes and aims, the projects shared an underlying common feature: they comprised large-scale distributed applications that required high-end networking and advanced middleware to be effectively deployed. For example, cross-site runs were found to be a very effective strategy for overcoming the limitations of a single resource. The seamless federation of a grid-of-grids remains difficult. Even if interoperability at the middleware and software-stack levels were to exist, it would not guarantee that the federated grids could be utilized for large-scale distributed applications. There are important additional requirements, for example compatible and consistent usage policies, automated advance reservations and, most important of all, co-scheduling. This paper outlines the scientific motivation and describes why distributed resources are critical for all three projects. It documents the challenges encountered in using a grid-of-grids and some of the solutions devised in response.

  • Exploring the hyper-grid idea with grand challenge applications: the DEISA-TeraGrid interoperability demonstration

    Publication Year: 2006, Page(s): 43-52
    Cited by: Papers (3)
    PDF (663 KB) | HTML

    A supercomputing hyper-grid spanning two continents was created as a step towards the interoperability of leading grids. A dedicated network connection was established between DEISA, the leading European supercomputing grid, and TeraGrid, the leading American supercomputing grid. Both grids have adopted the approach of establishing a common, high-performance global file system, the wide-area version of IBM's GPFS. TeraGrid's approach is based on a single-site server solution under Linux, hosted by the San Diego Supercomputer Center; DEISA's approach is a multi-site server solution, with servers currently in France, Germany and Italy. These two grid-internal global file systems were interconnected over a dedicated, trusted network connection. During the Supercomputing Conference 2005, grand challenge applications were carried out both within DEISA and within TeraGrid, and results were written transparently to the combined global file system, with physically distributed locations of the involved disk systems. Simulations were carried out in Europe and in America, results were directly written to the respective remote continent, accessible to all participating scientists on both continents, and were then directly processed further for visualization in a third location, the SC05 exhibition hall in Seattle. Grand challenge applications used for the demo included a protein structure prediction and a cosmological simulation carried out at the San Diego Supercomputer Center (SDSC), US (www.sdsc.edu), and a gyrokinetic turbulence simulation and another cosmological simulation carried out at the Garching Computing Centre of the Max Planck Society (RZG), Germany (www.rzg.mpg.de).

  • Applications and workflows [breaker page]

    Publication Year: 2006, Page(s): 53-54
    PDF (108 KB)
    Freely Available from IEEE
  • Enabling parallel scientific applications with workflow tools

    Publication Year: 2006, Page(s): 55-60
    Cited by: Papers (3)
    PDF (295 KB) | HTML

    Electron tomography is a powerful tool for deriving three-dimensional (3D) structural information about biological systems within the spatial scale spanning 1 nm³ to 10 mm³. With this technique, it is possible to derive detailed models of sub-cellular components such as organelles and synaptic complexes and to resolve the 3D distribution of their protein constituents in situ. Due in part to exponentially growing raw data sizes, there continues to be a need for increased integration of high-performance computing (HPC) and grid technologies with traditional electron tomography processes to provide faster data processing throughput. This is increasingly relevant because emerging mathematical algorithms that provide better data fidelity are more computationally intensive for larger raw data sizes. Progress has been made towards the transparent use of HPC and grid tools for launching scientific applications without passing the necessary administrative overhead and complexity (resource administration, authentication, scheduling, data delivery) on to the non-computer-scientist end user. There is still a need, however, to simplify the use of these tools for application developers who are developing novel algorithms for computation. Here we describe the architecture of the Telescience project (http://telescience.ucsd.edu), specifically the use of layered workflow technologies to parallelize and execute scientific codes across a distributed and heterogeneous computational resource pool (including resources from the TeraGrid and OptIPuter projects) without the need for the application developer to understand the intricacies of the grid.

  • Distributed dynamic event tree generation for reliability and risk assessment

    Publication Year: 2006, Page(s): 61-70
    Cited by: Papers (1)
    PDF (780 KB) | HTML

    Level 2 probabilistic risk assessments of nuclear plants (analysis of radionuclide release from containment) may require hundreds of runs of severe accident analysis codes such as MELCOR or RELAP/SCDAP to analyze possible sequences of events (scenarios) that may follow given initiating events. With the advances in computer architectures and ubiquitous networking, it is now possible to utilize multiple computing and storage resources for such computational experiments. This paper presents a system software infrastructure that supports the execution and analysis of multiple dynamic event-tree simulations on distributed environments. The infrastructure allows for 1) the testing of event-tree completeness, and 2) the assessment and propagation of uncertainty in the plant state in the quantification of event trees.

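The dynamic event-tree idea the abstract above describes can be illustrated with a toy breadth-first expansion, in which every branch of the tree is an independent scenario that could be dispatched as a separate simulation job on a distributed resource. The event names, branching rules and probabilities below are invented for illustration; they are not taken from MELCOR or RELAP/SCDAP.

```python
# Toy dynamic event-tree expansion (all events and probabilities invented).
from collections import deque

def expand_tree(initiating_event, branch_rules, max_depth):
    """Breadth-first expansion: each scenario either terminates or branches
    according to branch_rules[event] -> list of (next_event, probability)."""
    scenarios = []
    frontier = deque([([initiating_event], 1.0)])
    while frontier:
        path, prob = frontier.popleft()
        children = branch_rules.get(path[-1], [])
        if not children or len(path) > max_depth:
            scenarios.append((path, prob))  # a leaf: one complete scenario
            continue
        for event, p in children:
            frontier.append((path + [event], prob * p))
    return scenarios

rules = {
    "pipe_break": [("pump_on", 0.9), ("pump_fail", 0.1)],
    "pump_fail": [("backup_on", 0.7), ("core_damage", 0.3)],
}
scenarios = expand_tree("pipe_break", rules, max_depth=3)
```

A simple completeness check of the kind the paper mentions falls out for free: the leaf probabilities of a fully expanded tree must sum to 1.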
  • Applications and infrastructure [breaker page]

    Publication Year: 2006, Page(s): 71-72
    PDF (108 KB)
    Freely Available from IEEE
  • Distribution and partitioning techniques for NVEs: the case of EVE

    Publication Year: 2006, Page(s): 73-82
    PDF (603 KB) | HTML

    The majority of the systems and platforms developed for supporting distributed virtual environments are based on the concept of distribution from the very beginning of their development. In this paper we present the migration to a distributed virtual environment from a traditional client-server architecture. In particular, this paper describes the case of EVE, a networked virtual environment originally aimed at supporting small-scale applications. EVE started as a standard client-multi-server architecture, which could support multiple concurrent virtual worlds with a maximum of seventeen simultaneous participants in each of these worlds. However, the need to support larger-scale applications revealed that the traditional architecture upon which EVE was based is insufficient to meet the needs of these applications, which are large both in the sense of virtual space and graphics and in the number of concurrent participants. This paper discusses the migration of EVE to a distributed platform able to support large-scale networked virtual environments. In particular, the paper describes the modifications made to the architectural model of the initial platform to support large-scale applications effectively. The basic entities of the distributed model are presented, along with their operations and the interconnections among them. In addition, the paper presents an initial approach to the algorithm adopted for the efficient partitioning of the virtual world and the assignment of clients to the entities and resources of the distributed platform. The approach presented is space-object driven, in the sense that both the actual size of the virtual space and the number of objects with which the user can interact are taken into account during the partitioning.

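A minimal sketch of a space-object-driven partitioning rule of the kind the abstract above describes: each region's load is a weighted combination of its spatial extent and its interactive object count, and regions are assigned greedily to the least-loaded server. The weight coefficients, region data and greedy strategy are assumptions for illustration, not the paper's actual algorithm.

```python
# Hypothetical space-object-driven partitioning (weights and data invented).
def weight(region, alpha=1.0, beta=2.0):
    """Combined load of a region: spatial extent plus interactive objects."""
    area, objects = region
    return alpha * area + beta * objects

def partition(regions, n_servers):
    """Greedy assignment: heaviest regions first, each to the currently
    lightest server, balancing combined space-object load."""
    loads = [0.0] * n_servers
    assignment = [[] for _ in range(n_servers)]
    for i, region in sorted(enumerate(regions), key=lambda r: -weight(r[1])):
        s = loads.index(min(loads))
        assignment[s].append(i)
        loads[s] += weight(region)
    return assignment, loads

regions = [(100, 5), (40, 30), (60, 1), (80, 20)]  # (area, object count)
assignment, loads = partition(regions, 2)
```

The point of weighting by objects as well as area is that a small region dense with interactive objects (like region 1 here) can load a server as heavily as a large, sparse one.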
  • Managing grid messaging middleware

    Publication Year: 2006, Page(s): 83-91
    Cited by: Papers (1)
    PDF (375 KB) | HTML

    Management in distributed systems has gained much importance in recent years. With the increasing complexity of applications, there is a need for effective management of application components. As application components span different administrative domains, differing security policies restrict access to these components. The problem gets more complicated in a dynamic environment where the application components and the environment are in a constant state of flux, so that failure is the norm. In this paper we explore the issues related to management in dynamic and heterogeneous environments. We propose a scalable, fault-tolerant and Web-services-compliant management architecture that addresses these issues, and illustrate the functioning of our framework with respect to the NaradaBrokering messaging middleware.

  • Best paper

    Publication Year: 2006, Page(s): 93-94
    PDF (108 KB)
    Freely Available from IEEE
  • Creating personal adaptive clusters for managing scientific jobs in a distributed computing environment

    Publication Year: 2006, Page(s): 95-103
    Cited by: Papers (9)
    PDF (619 KB) | HTML

    We describe a system for creating personal clusters in user space to support the submission and management of thousands of compute-intensive serial jobs to the network-connected compute resources of the NSF TeraGrid. The system implements a robust infrastructure that submits and manages job proxies across a distributed computing environment. These job proxies contribute resources to personal clusters created dynamically for a user on demand. The system adapts to the prevailing job-load conditions at the distributed sites by migrating job proxies to sites expected to provide resources more quickly. The version of the system described in this paper allows users to build large personal Condor and Sun Grid Engine clusters on the TeraGrid. Users can then submit, monitor and control their scientific jobs through a single uniform interface, using the feature-rich functionality found in these job management environments. Up to 100,000 user jobs have been submitted through the system to date, enabling approximately 900 teraflops of scientific computation.

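The adaptive behavior described in the abstract above, steering job proxies toward the site expected to provide resources soonest, can be caricatured with a tiny wait-time model. The site names, the queue model and the heap-based placement loop are all invented for illustration; the real system submits proxies to actual Condor and Sun Grid Engine schedulers.

```python
# Toy proxy placement driven by an expected-wait estimate (all data invented).
import heapq

def expected_wait(site):
    """Toy model: expected wait grows with pending proxies per free slot."""
    return site["pending"] / max(site["free_slots"], 1)

def place_proxies(sites, n_proxies):
    """Send each proxy to the site with the lowest expected wait, updating
    the estimate as proxies accumulate there."""
    heap = [(expected_wait(s), name) for name, s in sites.items()]
    heapq.heapify(heap)
    placement = {name: 0 for name in sites}
    for _ in range(n_proxies):
        _, name = heapq.heappop(heap)          # current fastest site
        placement[name] += 1
        sites[name]["pending"] += 1            # our proxy now queues there too
        heapq.heappush(heap, (expected_wait(sites[name]), name))
    return placement

sites = {
    "siteA": {"pending": 10, "free_slots": 5},  # expected wait 2.0
    "siteB": {"pending": 2, "free_slots": 4},   # expected wait 0.5
}
placement = place_proxies(sites, 8)
```

Because each placement raises the chosen site's estimate, proxies naturally spill over to the slower site once the fast one's queue fills, which is the self-balancing effect the paper attributes to proxy migration.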
  • Author index

    Publication Year: 2006, Page(s): 105-106
    PDF (123 KB)
    Freely Available from IEEE