    Scalable Computation of Acyclic Joins

    TR-2005-75, Authors: Anna Pagh and Rasmus Pagh





    December 2005



    Abstract


    The join operation of relational algebra is a cornerstone of relational database systems. Computing the join of several relations is NP-hard in general, whereas special (and typical) cases are tractable. This paper primarily considers joins having an acyclic join graph, for which current methods initially apply a full reducer to efficiently eliminate tuples that will not contribute to the result of the join. The previously best worst-case time for computing an acyclic join of k fully reduced relations, occupying a total of n blocks on disk, is Omega(sort(n) log k + zk) I/Os, where sort(n) is the time for sorting the data of n disk blocks, and z is the size of the output in blocks. Even if the output is small, the log k factor gives a significant overhead when joining many relations.
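
As a rough illustration of what a full reducer does, the following in-memory sketch removes dangling tuples from a chain join R1(A,B), R2(B,C), R3(C,D) with one forward and one backward pass of semi-joins. The function names `semijoin` and `full_reduce_chain`, the restriction to binary relations, and the chain-shaped join graph are assumptions made for this example; it ignores the I/O model and is not taken from the report.

```python
def semijoin(left, right, left_col, right_col):
    """Keep the tuples of `left` whose left_col value occurs in right_col of `right`."""
    keys = {t[right_col] for t in right}
    return [t for t in left if t[left_col] in keys]


def full_reduce_chain(relations):
    """Fully reduce a chain of binary relations (hypothetical helper).

    relations[i] is a list of (x, y) tuples, and relations[i] joins with
    relations[i + 1] on relations[i][1] == relations[i + 1][0].
    """
    rels = [list(r) for r in relations]
    # Forward pass: reduce each relation by its (already reduced) predecessor.
    for i in range(1, len(rels)):
        rels[i] = semijoin(rels[i], rels[i - 1], 0, 1)
    # Backward pass: reduce each relation by its (already reduced) successor.
    for i in range(len(rels) - 2, -1, -1):
        rels[i] = semijoin(rels[i], rels[i + 1], 1, 0)
    return rels


r1 = [(1, 10), (2, 20), (3, 30)]
r2 = [(10, 100), (20, 200), (40, 400)]
r3 = [(100, "x"), (300, "y")]
reduced = full_reduce_chain([r1, r2, r3])
```

After both passes every remaining tuple takes part in at least one result tuple, which is the property the abstract refers to as "fully reduced".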

    In this paper we show how to compute the join in a time bound that is within a constant factor of the cost of running a full reducer plus sorting the output. For a broad class of acyclic join graphs this is O(sort(n+z)) I/Os, removing the dependence on k from previous bounds. Traditional methods decompose the join into a number of binary joins, which are then carried out one at a time (with some parallelism if pipelining is possible). Departing from this approach, our technique is based on computing the size of certain subsets of the result, and using these sizes to compute the location(s) of each data item in the result. We can then assemble the result using a single sorting step.
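
The idea of computing result sizes first and then deriving output locations can be illustrated on the simplest case, a join of two relations R(A,B) and S(B,C). The sketch below is an in-memory toy under stated assumptions, not the report's algorithm: the function name `join_by_positions`, the record layout, and the restriction to two relations are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def join_by_positions(R, S):
    """Join R(a, b) with S(b, c) by computing output positions, then sorting once."""
    groups_r, groups_s = defaultdict(list), defaultdict(list)
    for a, b in R:
        groups_r[b].append(a)
    for b, c in S:
        groups_s[b].append(c)

    # Size of the result subset for each join value b; prefix sums give offsets.
    offsets, total = {}, 0
    for b in sorted(groups_r.keys() & groups_s.keys()):
        offsets[b] = total
        total += len(groups_r[b]) * len(groups_s[b])

    # Each input tuple computes every output position it contributes to.
    placed = []
    for b, start in offsets.items():
        cr, cs = len(groups_r[b]), len(groups_s[b])
        for i, a in enumerate(groups_r[b]):
            for j in range(cs):
                placed.append((start + i * cs + j, 0, (a, b)))
        for j, c in enumerate(groups_s[b]):
            for i in range(cr):
                placed.append((start + i * cs + j, 1, c))

    # A single sort by position brings the two halves of each result tuple together.
    placed.sort(key=lambda rec: (rec[0], rec[1]))
    return [placed[k][2] + (placed[k + 1][2],) for k in range(0, len(placed), 2)]


result = join_by_positions(
    [(1, "x"), (2, "x"), (3, "y")],
    [("x", 10), ("x", 20), ("z", 30)],
)
```

No intermediate binary join result is materialized here: the only global step is the final sort of the position-tagged records, which is the aspect of the approach this toy is meant to show.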

    Finally, as an initial study of cyclic joins in the I/O model, we show how to compute a join whose join graph is a 3-cycle in O(n^2/m + sort(n+z)) I/Os, where m is the number of blocks in internal memory. Previous techniques also have a quadratic dependence on n, but do not utilize internal memory this well.
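
To make concrete what a join with a 3-cycle join graph looks like, the sketch below joins R(A,B), S(B,C) and T(C,A) in memory using a hash index. It only defines the result being computed; it does not model disk blocks or internal memory and is not the I/O-efficient method of the report. The name `triangle_join` and the attribute layout are assumptions for this example.

```python
from collections import defaultdict


def triangle_join(R, S, T):
    """Return all (a, b, c) with (a, b) in R, (b, c) in S and (c, a) in T."""
    s_by_b = defaultdict(list)
    for b, c in S:
        s_by_b[b].append(c)
    t_set = set(T)

    out = []
    for a, b in R:
        # Extend (a, b) along S, then close the cycle with a lookup in T.
        for c in s_by_b.get(b, []):
            if (c, a) in t_set:
                out.append((a, b, c))
    return out


triangles = triangle_join(
    [(1, 2), (1, 3)],
    [(2, 4), (3, 5)],
    [(4, 1), (5, 9)],
)
```

Unlike the acyclic case, no sequence of semi-joins is guaranteed to remove all dangling tuples here, which is one reason cyclic joins are treated separately.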


    Technical report TR-2005-75 in IT University Technical Report Series, December 2005.

    Available as PDF.

