ROAPI is perhaps the closest project to Seafowl. Both are built on top of Apache DataFusion, so their performance characteristics are very similar and we didn't benchmark them against each other.
Seafowl stores separate metadata for partition pruning, which lets it discard partitions without looking at the original data file. It also has a local cache for data files loaded over HTTP. It's therefore possible that Seafowl can slightly outperform ROAPI on certain queries.
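To illustrate the idea of pruning from metadata alone, here is a minimal sketch. It assumes the metadata holds a minimum and maximum value per partition for one column; the `PartitionMeta` class, the file names and the value ranges are hypothetical and do not reflect Seafowl's actual catalog layout.

```python
from dataclasses import dataclass


@dataclass
class PartitionMeta:
    """Hypothetical per-partition metadata: the value range of one column."""
    path: str
    min_value: int
    max_value: int


def prune(partitions, lower, upper):
    """Keep only partitions whose [min, max] range overlaps [lower, upper].

    The decision uses the metadata alone; no data file is opened.
    """
    return [
        p for p in partitions
        if p.max_value >= lower and p.min_value <= upper
    ]


partitions = [
    PartitionMeta("part-0.parquet", 0, 99),
    PartitionMeta("part-1.parquet", 100, 199),
    PartitionMeta("part-2.parquet", 200, 299),
]

# A filter such as `WHERE id BETWEEN 120 AND 150` can only match part-1,
# so the other two files never need to be fetched or scanned.
survivors = prune(partitions, 120, 150)
print([p.path for p in survivors])
```

The fewer partitions survive this check, the less data has to be read, which is where a small advantage on selective queries could come from.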
Data sources and writing
ROAPI supports considerably more data sources and data formats (see the documentation) and can also expose other databases. Seafowl only supports Parquet and CSV files, either loaded over HTTP or uploaded to it.
On the other hand, Seafowl is a database and supports direct writes. You can write to Seafowl with standard SQL statements, as well as run CREATE TABLE AS queries to transform data.
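As a rough sketch of what direct writes could look like from a client, the snippet below prepares three HTTP requests: one that creates a table, one that inserts a row and one that derives a new table with CREATE TABLE AS. It assumes a Seafowl instance at `http://localhost:8080` that accepts a JSON body of the form `{"query": "..."}` on `POST /q`. The URL, table names and columns are hypothetical, and a real deployment may also require a write credential. The requests are only built here, not sent.

```python
import json
import urllib.request


def build_write_request(base_url: str, sql: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying one SQL statement."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/q",  # assumed query endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical statements: create a table, append to it, then transform it.
statements = [
    "CREATE TABLE readings (ts TIMESTAMP, sensor TEXT, value DOUBLE)",
    "INSERT INTO readings VALUES ('2022-01-01T12:00:00', 'a', 1.5)",
    "CREATE TABLE daily_avg AS "
    "SELECT sensor, avg(value) AS avg_value FROM readings GROUP BY sensor",
]

prepared = [build_write_request("http://localhost:8080", s) for s in statements]

# To execute against a running instance, pass each request to
# urllib.request.urlopen(request).
for request in prepared:
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```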
ROAPI can be queried through GraphQL, REST, or SQL sent over HTTP. Seafowl currently only supports executing SQL queries.
Caching and querying
Both ROAPI and Seafowl are HTTP-first and queryable directly from the end user's browser.
However, Seafowl's query API allows you to use various HTTP caches (CDNs, dedicated caches, the browser cache) to deliver query results to your end users much faster.
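The reason HTTP caches can help is that a read-only query can be expressed as a GET request whose URL is determined by the query text, so identical queries map to the same cacheable URL. The sketch below assumes the URL takes the form `/q/<SHA-256 of the query>` with the query itself passed in an `X-Seafowl-Query` header; treat the exact endpoint shape, the header encoding and the whitespace trimming as assumptions to check against the Seafowl documentation. The host and table name are hypothetical, and no request is actually sent.

```python
import hashlib
import urllib.parse
import urllib.request


def build_cached_get(base_url: str, sql: str) -> urllib.request.Request:
    """Prepare (but do not send) a cache-friendly GET request for a query."""
    query = sql.strip()  # assumption: surrounding whitespace is not significant
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/q/{digest}",
        # Percent-encode so the header value stays on one line of plain ASCII.
        headers={"X-Seafowl-Query": urllib.parse.quote(query)},
        method="GET",
    )


first = build_cached_get("http://localhost:8080", "SELECT count(*) FROM readings")
second = build_cached_get("http://localhost:8080", "SELECT count(*) FROM readings")
other = build_cached_get("http://localhost:8080", "SELECT max(value) FROM readings")

# Identical queries share one URL, so a CDN or browser cache can answer repeats.
print(first.full_url == second.full_url)
# A different query gets its own URL and its own cache entry.
print(first.full_url == other.full_url)
```

Because the URL is stable for a given query, any standard HTTP cache placed in front of the server can store and revalidate the response without knowing anything about SQL.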
You can write UDFs for Seafowl in any language that compiles to WebAssembly (WASM). It is not possible to extend ROAPI in the same way without compiling a custom version of it.
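As a hedged illustration of how a compiled WASM module might be registered, the helper below builds a CREATE FUNCTION statement that embeds the module as base64 inside a JSON description. The JSON field names (`entrypoint`, `language`, `input_types`, `return_type`, `data`) follow my understanding of Seafowl's syntax and should be verified against its documentation; the function name `add_one` and the placeholder bytes are hypothetical.

```python
import base64
import json


def create_function_sql(name, wasm_bytes, input_types, return_type):
    """Build a CREATE FUNCTION statement embedding a WASM module as base64.

    The field names below are an assumption about Seafowl's UDF syntax.
    """
    spec = {
        "entrypoint": name,  # exported WASM function to call
        "language": "wasm",
        "input_types": input_types,
        "return_type": return_type,
        "data": base64.b64encode(wasm_bytes).decode("ascii"),
    }
    return f"CREATE FUNCTION {name} AS '{json.dumps(spec)}'"


# Placeholder: only the 8-byte WASM header (magic number and version).
# A real module would come from compiling Rust, C, AssemblyScript, etc.
placeholder_module = b"\x00asm\x01\x00\x00\x00"

sql = create_function_sql("add_one", placeholder_module, ["int"], "int")
print(sql)
```

The resulting statement would be sent to Seafowl like any other write query. The placeholder module exports no functions, so it only demonstrates the shape of the statement.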
When to use ROAPI over Seafowl
- Your dataset is static
- You want to be able to use GraphQL/REST APIs to query your data
- You want to expose a backend database as a read-only API
When to use Seafowl over ROAPI
- Your dataset might occasionally change or you want to be able to append data to it
- You're comfortable with SQL and don't need a REST/GraphQL API
- You want to write a custom user-defined function
- You want to be able to rely on HTTP caches and CDNs to cache and deliver query results