DuckDB

DuckDB is an in-process OLAP database that offers high performance and low latency for analytical queries.

Comparison

Syntax

DuckDB supports a larger subset of the PostgreSQL SELECT syntax than Seafowl. Often, you can take a PostgreSQL SELECT query and run it on DuckDB without any modifications.

Performance

In our benchmarks, Seafowl and DuckDB are very close in terms of performance. However, there are some pathological queries on which Seafowl either runs out of memory or underperforms DuckDB:

  • Top-k (SELECT ... ORDER BY ... LIMIT 1).
  • Low-latency queries (SELECT 1). Being an in-process database, DuckDB takes microseconds to execute these kinds of queries. The HTTP connection and various other overheads in Seafowl make these queries take longer.
  • Certain other aggregations.

Deployment patterns

Seafowl is HTTP-first and is designed to be accessible directly from the Internet, letting your application run SQL queries on it straight from the client.

In addition, Seafowl's query API allows you to use various HTTP caches (CDNs, dedicated caches, the browser cache) to deliver query results to your end users.
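As a rough illustration of why this works well with HTTP caches, the sketch below builds a GET request whose URL depends only on the query text, so identical queries map to identical, cacheable URLs. It assumes a `/q/<SHA-256 of the query>` path and an `X-Seafowl-Query` header carrying the URL-encoded SQL; treat both, as well as the `cached_query_request` helper and the example host, as assumptions to check against the Seafowl HTTP API reference.

```python
import hashlib
from urllib.parse import quote


def cached_query_request(base_url: str, sql: str) -> tuple[str, dict[str, str]]:
    """Build a cache-friendly GET URL and headers for a SQL query.

    Assumption: the server accepts GET /q/<sha256(query)> with the query
    text passed in an X-Seafowl-Query header (URL-encoded).
    """
    normalized = sql.strip()
    # The same query text always yields the same digest, hence the same URL,
    # which is what lets a CDN or browser cache reuse the response.
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    url = f"{base_url.rstrip('/')}/q/{digest}"
    headers = {"X-Seafowl-Query": quote(normalized)}
    return url, headers


# Hypothetical host, used only for illustration.
url, headers = cached_query_request("https://seafowl.example.com", "SELECT 1")
```

Because the URL is a pure function of the query, any cache sitting between the client and the server can serve a repeated query without reaching the database.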

DuckDB can also be deployed at the edge, or you can even let your users download a WASM build of DuckDB and query the data from within their browser.

However, you'd have to write an HTTP interface to DuckDB in order to make it power a Web application without making the user download the raw data files.
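To give a sense of what such an interface involves, here is a minimal sketch built on Python's standard `http.server` and the `duckdb` package. The JSON request shape (`{"query": "..."}`), the port, and the helper names `run_query`, `make_handler`, and `serve` are all hypothetical choices made for illustration, not part of DuckDB.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_query(con, sql):
    """Execute sql on a DB-API style connection; return rows as dicts."""
    cur = con.execute(sql)
    columns = [col[0] for col in cur.description]
    return [dict(zip(columns, row)) for row in cur.fetchall()]


def make_handler(con):
    """Create a request handler class bound to one database connection."""

    class QueryHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            try:
                # Hypothetical request shape: {"query": "SELECT ..."}
                sql = json.loads(self.rfile.read(length))["query"]
                body = json.dumps(run_query(con, sql), default=str).encode()
                status = 200
            except Exception as exc:  # malformed request or SQL error
                body = json.dumps({"error": str(exc)}).encode()
                status = 400
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return QueryHandler


def serve(port=8080):
    """Serve queries against an in-memory DuckDB database (blocks forever)."""
    import duckdb  # third-party dependency: pip install duckdb

    con = duckdb.connect()  # no arguments: in-memory database
    HTTPServer(("127.0.0.1", port), make_handler(con)).serve_forever()
```

Calling `serve()` starts the server. This sketch runs whatever SQL the client sends, so anything exposed to the Internet would also need authentication, read-only access, and resource limits, all of which you would have to build yourself.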

When to use DuckDB over Seafowl

  • You're performing analytics on a single machine and not building a Web application
  • You need more complete analytical syntax support
  • You want more stability and maximum speed from your queries

When to use Seafowl over DuckDB

  • You are building a Web application that needs to run analytical queries (aggregations, reporting). For example, this could be a data visualization or an interactive dashboard
  • You want to be able to cache query results at the edge