Writing data with SQL queries

In addition to uploading files, you can also write to Seafowl using standard INSERT, UPDATE, DELETE queries.

curl -XPOST -H "Authorization: Bearer some_password" http://localhost:8080/q [email protected]<<EOF
{"query": "INSERT INTO new_table (col_1, col_2) VALUES (1, 'one')"}

Seafowl also supports CREATE TABLE AS statements. These can be useful to transform existing Seafowl tables:

CREATE TABLE weather_by_month AS
  SELECT date_trunc(timestamp, 'month') AS month,
  AVG(temp_c_10 * 0.1) AS avg_temp
FROM cambridge_weather GROUP BY 1 ORDER BY 1 ASC

...or "freeze" external tables into Seafowl tables:

  LOCATION 'https://parqueth-sample.s3.us-west-1.amazonaws.com/mainnet/transactions/dt=2021-07-01/blocks-0012738509-0012739509.parquet';
CREATE TABLE parqueth_sample AS SELECT * FROM staging.data;

Seafowl doesn't support primary keys, foreign keys or unique constraints. If you insert the same row multiple times, it's not going to get deduplicated or raise an error.

For more information on DML statements, see the reference.

Best practices

Batching INSERTs

Every INSERT in Seafowl makes at least one new partition. This means that it's more efficient to write to Seafowl using INSERT statements with multiple rows:

INSERT INTO some_table (col_1, col_2) VALUES
(1, 'one'),
(2, 'two'),
(3, 'three')

Insertion order

It's best to insert data in the order of the key that you're going to be filtering on the most. This can make Seafowl utilize partition pruning to speed up queries by scanning through less data.