FAQ
What are the use cases of Seafowl?
Seafowl's main purpose is providing a fast, CDN and HTTP cache-friendly analytical query execution API to your data-driven applications.
Fast interactive notebooks and data visualizations
This is the most immediate use case of Seafowl. It makes it much easier to publish visualizations and dashboards powered by large datasets.
Here's an example of a notebook powered by Observable and Seafowl:
In particular:
- You don't need to attach the data to the notebook (data attachments have size limits): you can treat a Seafowl deployment as a publicly facing database.
- Data processing is performed on the server side by Seafowl. The notebook sends SQL queries to it and there's no need for the client to download the data or aggregate it.
- Query results are cached by the user's browser and by a CDN. A person opening the notebook or using popular values for various toggles will see the results immediately, because the same query has already been executed and its results are stored in the CDN's cache.
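To illustrate why identical queries can be served from a cache, here is a minimal sketch of how a client might build a cache-friendly GET request. It assumes a scheme in which the URL path contains the SHA-256 hash of the query text and the query itself travels in a request header. The `/q/<hash>` path, the `X-Seafowl-Query` header name, the `build_cached_query_request` helper and the example host are illustrative assumptions that this FAQ does not spell out, so check them against the Seafowl HTTP API documentation before relying on them.

```python
import hashlib
from urllib.parse import quote


def build_cached_query_request(base_url: str, sql: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a cache-friendly GET of one SQL query.

    Assumption: the server exposes GET <base_url>/q/<sha256 of query> and
    reads the query text from an X-Seafowl-Query header.
    """
    # Trim surrounding whitespace so trivially different copies of the
    # same query map to the same URL (and therefore the same cache entry).
    normalized = sql.strip()
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    url = f"{base_url.rstrip('/')}/q/{digest}"
    # Percent-encode the query so it is safe to send as a header value.
    headers = {"X-Seafowl-Query": quote(normalized)}
    return url, headers


# Hypothetical deployment URL, used only for illustration.
url, headers = build_cached_query_request(
    "https://seafowl.example.com",
    "SELECT country, COUNT(*) FROM visits GROUP BY country",
)
print(url)
print(headers)
```

Because the URL depends only on the query text, every visitor who triggers the same query requests the same URL, which is what lets a browser cache or CDN answer without reaching the server.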
In our tutorial, you will be able to build a similar interactive visualization.
How fast is it?
We publish some basic benchmarks of Seafowl as compared to PostgreSQL,
PostgreSQL with the parquet_fdw
extension, and DuckDB. These aren't as
in-depth as TPC-DS, but do show that Seafowl makes
a lot of seemingly heavy SQL queries now feasible for powering interactive
applications.
In short, on analytical queries, Seafowl is:
- up to 10x faster than vanilla PostgreSQL
- ...or up to 4x faster than PostgreSQL with the parquet_fdw extension
- ...or around the level of DuckDB in performance, sometimes beating it (apart from some pathological queries where it has memory issues, as well as queries that use a lot of subqueries where the optimizer might not be able to build an optimal query plan)
Where can I deploy it?
We provide a Docker image that is ready to be deployed to a PaaS like Fly.io (see our full tutorial) or Google Cloud Run. You can also download a single binary for deployment on other cloud providers or on bare metal.
What's the difference between Splitgraph and Seafowl?
Seafowl was built at Splitgraph but is a standalone project. You can think of it as a from-scratch rewrite of the Splitgraph DDN, using the lessons we learned from the last several years of working on Splitgraph.
- Splitgraph is PostgreSQL-first, Seafowl is HTTP-first.
- Splitgraph can live query a great variety of remote data sources, while Seafowl has limited support for a couple of well-known ones (note: you can export from Splitgraph to a Seafowl instance!).
- Splitgraph DDN has its own query result cache. Seafowl leverages CDNs and HTTP caches to cache and deliver query results globally in milliseconds.
- Splitgraph is built around PostgreSQL and is written in Python. Seafowl is built around Apache DataFusion and is written in Rust, making it much faster, with a smaller resource footprint.
Should I use Seafowl or...?
For an in-depth comparison of Seafowl with other projects (PostgreSQL, DuckDB, ROAPI, Snowflake), see this page.