Flare is a drop-in accelerator for Apache Spark 2.0 that achieves order-of-magnitude speedups on DataFrame and SQL workloads.
Flare closes the performance gap between JVM-based and native execution by compiling Catalyst query plans to native code.
Flare’s low-level implementation takes full advantage of native execution: techniques such as NUMA-aware data layout and scheduling exploit mechanical sympathy with the hardware and bring execution closer to the metal than current JVM-based techniques allow.
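To make the idea of compiling a query plan more concrete, here is a minimal sketch in Python. It is a toy illustration, not Flare's implementation and not Catalyst's API: the `Filter` and `SumColumn` operators, the `interpret` and `compile_plan` functions, and the sample rows are all hypothetical names invented for this example. The sketch also only generates Python source and compiles it to bytecode, whereas Flare targets native code. It does show the general technique: instead of evaluating a plan operator by operator, a compiler emits one fused loop specialized to that plan.

```python
from dataclasses import dataclass


# Hypothetical mini query plan: a filter followed by a sum aggregate.
@dataclass
class Filter:
    column: str
    threshold: int  # keep rows where row[column] > threshold


@dataclass
class SumColumn:
    column: str


def interpret(plan, rows):
    """Evaluate the plan operator by operator, materializing the filtered rows."""
    filt, agg = plan
    kept = [r for r in rows if r[filt.column] > filt.threshold]
    return sum(r[agg.column] for r in kept)


def compile_plan(plan):
    """Generate source for a single fused loop specialized to this plan."""
    filt, agg = plan
    src = (
        "def run(rows):\n"
        "    total = 0\n"
        "    for r in rows:\n"
        f"        if r[{filt.column!r}] > {filt.threshold!r}:\n"
        f"            total += r[{agg.column!r}]\n"
        "    return total\n"
    )
    namespace = {}
    exec(src, namespace)  # compiles to Python bytecode here, not native code
    return namespace["run"]


rows = [
    {"price": 5, "qty": 2},
    {"price": 20, "qty": 3},
    {"price": 30, "qty": 4},
]
plan = (Filter("price", 10), SumColumn("qty"))

run = compile_plan(plan)
# Both strategies must agree on the result for the same plan and data.
assert interpret(plan, rows) == run(rows)
```

The generated function has the column names and the threshold baked in as constants and makes a single pass without building an intermediate list. A native-code query compiler applies the same specialization at a much lower level.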
Modern server-class hardware provides memory in the terabyte range and dozens of CPU cores. Such machines are readily available, for example as X1 instances on Amazon EC2 or for purchase from Dell.
They are as powerful as a small cluster, but do not require network communication or fault tolerance, and they consume less energy, which makes them cheaper to operate.
While Spark was primarily designed for scaling out on clusters, Flare makes scaling up on server-class hardware an attractive alternative in terms of both performance and operating costs.
Flare is under active development, and we will continue to share information on our blog and on Twitter.
We are currently running a private beta program: if you are interested in using Flare, please use the contact form below.