Accelerate Apache Spark SQL Queries
Running SQL queries using IgniteRDD is orders of magnitude faster than running SQL queries using Spark native RDDs or DataFrame APIs.
Spark does not support SQL indexes, so every SQL query has to perform a full scan across the whole data set. Such full-scan queries in Spark can take minutes and introduce significant wait times, especially when many queries run within the same Spark application.
Apache Ignite, on the other hand, supports SQL with in-memory indexing.
Because of its advanced in-memory indexing capabilities, IgniteRDD executes SQL queries hundreds of times faster than Spark native RDDs or DataFrames.
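The difference between a full scan and an indexed lookup can be illustrated with a small sketch. This is plain Python, not Spark or Ignite code, and it makes no claim about how either system is implemented; the `rows` data set, the `salary` column, and the function names are hypothetical and exist only for illustration. The scan checks every row, while the indexed version binary-searches a sorted structure built once and reads only the matching entries.

```python
import bisect

# Hypothetical data set of (row_id, salary) pairs in no useful order.
rows = [(i, (i * 37) % 1000) for i in range(10_000)]


def full_scan(rows, low, high):
    """Check every row, as a query without an index has to."""
    return sorted(row_id for row_id, salary in rows if low <= salary < high)


def build_index(rows):
    """Build a sorted (salary, row_id) list once; it plays the role of an index."""
    return sorted((salary, row_id) for row_id, salary in rows)


def index_lookup(index, low, high):
    """Binary-search both range boundaries, then read only the matching entries."""
    start = bisect.bisect_left(index, (low,))   # first entry with salary >= low
    end = bisect.bisect_left(index, (high,))    # first entry with salary >= high
    return sorted(row_id for _, row_id in index[start:end])


index = build_index(rows)
matches = index_lookup(index, 100, 110)
```

Both functions return the same rows. The scan does work proportional to the size of the whole data set on every query, whereas the lookup does work proportional to the logarithm of that size plus the number of matches, which is why the gap widens as more queries run against the same data.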
Ignite can store data in on-heap as well as off-heap memory. If data is cached in off-heap memory, the query indexes are stored off-heap as well and do not introduce any additional JVM garbage collection (GC) overhead.