Class QuadTreeRTPartitioner

java.lang.Object
org.apache.spark.Partitioner
All Implemented Interfaces:
Serializable

public class QuadTreeRTPartitioner extends QuadTreePartitioner
This class implements spatial partitioner based on the principles outlined in:

It uses the Quad-tree partitioning strategy to create balanced partitions This partitioning is essential for efficiently executing kNN joins. The process involves the following steps:

1. **Quad-tree Partitioning**: - Data is partitioned into partitions using the Quad-tree data structure. - This ensures balanced partitions and spatial locality preservation.

2. **Global R-Tree Index Construction**: - A set of random samples S' from the dataset S is taken. - An R-tree T is built over S' in the master node (driver program).

3. **Distance Bound Calculation**: - For each partition Ri, the distance ui from the furthest point in Ri to the centroid cri is calculated. - k-nearest neighbors of each centroid cri are found using the R-tree T. - A distance bound γi is derived for each Ri, defined as γi = 2ui + |cri, sk|, where sk is the k-th nearest neighbor of cri.

4. **Partitioning Neighbors**: - For each partition Ri, a subset Si ⊂ S is identified such that for any r ∈ Ri, knn(r, S) = knn(r, Si) using a circle range query centered at cri with radius γi. - This guarantees that the k-nearest neighbors of any point in Ri can be found within the subset Si.

5. **Parallel Local kNN Joins**: - Each combined partition (Ri, Si) is processed in parallel. - An R-tree is built over Si, and a local kNN join is performed for each record in Ri. - The results from all partitions are combined to produce the final kNN join results.

Reference: Xie, Dong, et al. "Simba: Efficient in-memory spatial analytics." In Proceedings of the 2016 ACM SIGMOD International Conference on Management of Data (SIGMOD '16), 2016, DOI: 10.1145/2882903.2915237.

See Also:
  • Constructor Details

    • QuadTreeRTPartitioner

      public QuadTreeRTPartitioner(ExtendedQuadTree<?> extendedQuadTree)
  • Method Details

    • nonOverlappedPartitioner

      public QuadTreeRTPartitioner nonOverlappedPartitioner()
    • getBoundary

      public org.locationtech.jts.geom.Envelope getBoundary()
    • placeObject

      public Iterator<scala.Tuple2<Integer,org.locationtech.jts.geom.Geometry>> placeObject(org.locationtech.jts.geom.Geometry spatialObject) throws Exception
      Depending on overlappedPartitioner, return the expanded boundaries or the original boundaries.
      Overrides:
      placeObject in class QuadTreePartitioner
      Parameters:
      spatialObject - the spatial object
      Returns:
      Iterator<Tuple2<Integer, Geometry>>
      Throws:
      Exception
    • getDedupParams

      public DedupParams getDedupParams()
      Overrides:
      getDedupParams in class QuadTreePartitioner
    • getGrids

      public List<org.locationtech.jts.geom.Envelope> getGrids()
      Overrides:
      getGrids in class SpatialPartitioner
    • numPartitions

      public int numPartitions()
      Overrides:
      numPartitions in class QuadTreePartitioner
    • getOverlappedGrids

      public Map<Integer,List<org.locationtech.jts.geom.Envelope>> getOverlappedGrids()
    • getSTRForOverlappedGrids

      public org.locationtech.jts.index.strtree.STRtree getSTRForOverlappedGrids()