PostgreSQL is an advanced open-source transactional database and is probably one of the first projects one thinks of when they hear the word "database".
However, PostgreSQL is different from Seafowl in terms of the use cases it's designed for.
Seafowl uses Apache DataFusion as its
query engine and its
SELECT syntax is a subset of the PostgreSQL syntax. See
the syntax reference for more information on the
Notably, Seafowl doesn't support:
- the JSON data type and JSON operators
- certain window expressions
For analytical queries (queries that have to do a lot of aggregations and summarizing), Seafowl vastly outperforms PostgreSQL. We've seen 3x-10x performance improvements with Seafowl against PostgreSQL for these kinds of queries (see more on our benchmarking page).
This is due to two factors:
- Seafowl stores data in the Parquet format, which is columnar and compresses much better than PostgreSQL. The subset of the Taxi dataset used in our benchmarks is 11GB in PostgreSQL 14 (without indexes) and 1.3GB in Seafowl. This means that Seafowl is going to be faster at reading data from disk, since there's less data to read.
- DataFusion is a columnar query execution engine. It's optimized for scanning through lots of data in large batches, which means that the per-row overhead of performing a scan is much smaller than in PostgreSQL.
On the other hand, Seafowl is extremely poorly suited to transaction processing use cases. It's missing all features that are vital for this:
- primary, foreign key and other constraints
- transactions, ACID compliance
- triggers and stored procedures (though you can write a Seafowl function in WASM)
If you need to use PostgreSQL, a cheap way to accelerate analytical queries is
parquet_fdw. This is a PostgreSQL extension that lets it query Parquet
files. More information in
While this lets PostgreSQL use the faster (for analytical queries) columnar
Parquet format, this doesn't change the fact that PostgreSQL has a row-oriented
query execution engine. In our
benchmarks, PostgreSQL with
parquet_fdw outperforms plain PostgreSQL on some queries, but can still be up
to 5x slower than Seafowl.
Seafowl is HTTP-first and is designed to be accessible directly from the Internet, letting your application run SQL queries on it straight from the client. This contradicts classical Web application architectures, where the database is only accessed by a trusted backend component.
In addition, Seafowl's query API allows you to use various HTTP caches (CDNs, dedicated caches, the browser cache) to deliver query results to your end users. This is different from PostgreSQL where you'd have to implement this in your backend application, potentially relying on a non-HTTP cache like Redis or Memcached.
When to use PostgreSQL over Seafowl
- You are building a "classic" application that relies on transactional features (a digital store front, a forum, etc) and needs to perform a large volume of single-row reads and writes with transactional consistency.
- You need some PostgreSQL-specific query features:
- Unstructured JSON data type support
- Geographical data support with PostGIS
When to use Seafowl over PostgreSQL
- You are building a Web application that needs to run analytical queries (aggregations, reporting). For example, this could be a data visualization or an interactive dashboard
- You won't be writing to the database often, or only writing data in large batches
- You want to avoid deploying and paying for both a database and a backend
- You want to be able to cache query results at the edge