Commit e66ddcd

New backup method: simple data dumps
This adds two monthly data dumps: one containing the public tables and one containing the private tables. One pair of files is stored per month.
1 parent 7192d60 commit e66ddcd

9 files changed

Lines changed: 95 additions & 68 deletions

deploy/Caddyfile.example

Lines changed: 8 additions & 0 deletions
@@ -15,3 +15,11 @@ umami.hackorum.dev {
         output stdout
     }
 }
+
+dumps.hackorum.dev {
+    root * /dumps/public
+    file_server browse
+    log {
+        output stdout
+    }
+}
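
With this site block, files under `/dumps/public` are served from the site root, so a month's public dump would be reachable at a URL of the form `https://dumps.hackorum.dev/public-YYYY-MM.sql.gz`. A minimal sketch of fetching one, assuming the optional site is enabled; `dump_url` and `fetch_dump` are hypothetical helper names, not part of this commit:

```bash
# Hypothetical helper: build the download URL for one month's public dump.
# Assumes Caddy serves /dumps/public at the root of dumps.hackorum.dev.
dump_url() {
  local stamp="$1"   # month stamp, e.g. 2025-01
  printf 'https://dumps.hackorum.dev/public-%s.sql.gz\n' "$stamp"
}

# Hypothetical helper: download that dump into the current directory.
# --fail makes curl exit non-zero on HTTP errors such as 404.
fetch_dump() {
  local stamp="$1"
  curl --fail --silent --show-error --location \
    --output "public-${stamp}.sql.gz" "$(dump_url "$stamp")"
}

# Example (not run here): fetch_dump "$(date +%Y-%m)"
```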

deploy/README.md

Lines changed: 20 additions & 12 deletions
@@ -3,16 +3,16 @@
 This is a minimal, single-host setup for running Hackorum on a VPS (e.g., Hetzner) with Docker Compose. It includes:
 - Web app (Rails / Puma)
 - IMAP runner (continuous)
-- Postgres with WAL archiving to a local volume
+- Postgres
 - Caddy for TLS / reverse proxy
 - Umami analytics (self-hosted)
 - Autoheal watchdog to restart unhealthy containers
-- Local base backups + WAL retention scripts (no external storage)
+- Monthly SQL dumps (public + private split)

 ## Prerequisites
 - Docker + Docker Compose v2 on the VPS
 - A domain pointing to the VPS (for Caddy/HTTPS)
-- Enough disk for Postgres data + backups (base backups + WAL archives)
+- Enough disk for Postgres data + monthly dumps

 ## Setup steps
 1) Copy env template and fill in secrets:
@@ -30,6 +30,7 @@ This is a minimal, single-host setup for running Hackorum on a VPS (e.g., Hetzne
 3) Update Caddyfile domain:
 - Edit `deploy/Caddyfile` and replace `hackorum.example.com` and contact email.
 - Ensure the Umami host is set to `umami.hackorum.dev` (see `deploy/Caddyfile.example`).
+- Optional: add `dumps.hackorum.dev` to serve public dumps (see `deploy/Caddyfile.example`).

 4) Configure Umami analytics:
 - Set `UMAMI_APP_SECRET` and `UMAMI_HASH_SALT` in `deploy/.env`.
@@ -44,7 +45,7 @@ This is a minimal, single-host setup for running Hackorum on a VPS (e.g., Hetzne
 Services:
 - `web`: Rails/Puma on port 3000 (internal)
 - `imap_worker`: continuous IMAP ingest
-- `db`: Postgres 18 with WAL archiving to `/var/lib/postgresql/wal-archive`
+- `db`: Postgres 18
 - `caddy`: TLS + reverse proxy on :80/:443
 - `umami`: self-hosted analytics UI/API on port 3000 (internal)
 - `autoheal`: restarts containers whose healthchecks fail
@@ -94,18 +95,25 @@ Access:
 - `GOOGLE_REDIRECT_URI` (e.g., https://your-domain/auth/google_oauth2/callback)
 - Rails runtime: `RAILS_ENV=production`, `RAILS_LOG_TO_STDOUT=1`, `RAILS_SERVE_STATIC_FILES=1`

-## Backups (local, WAL + base backups)
-Postgres is configured with `archive_mode=on` and copies WAL files into a dedicated volume (`pgwal`). Use the provided scripts to create compressed base backups and prune old WAL/base backups.
+## Backups (monthly SQL dumps)
+The database dumps are split into public and private data and written to a Docker volume mounted at `/dumps` inside the Postgres container. Each run writes two files named for the current month, overwriting any earlier files from the same month:
+- `public/public-YYYY-MM.sql.gz` (full schema + data, excluding private tables)
+- `private/private-YYYY-MM.sql.gz` (data-only for private tables)
+
+The table list lives in `deploy/backup/private_tables.txt` and is used for both dumps.
+If you enable the `dumps.hackorum.dev` site in Caddy, the dumps volume is mounted read-only in the Caddy container and only `/dumps/public` is used as the site root, so private dumps are not served.

 Run (from `deploy/`):
 ```bash
-./backup/run_base_backup.sh # creates tarred base backup under /backups
-RETAIN=3 ./backup/prune_backups.sh # keep 3 most recent base backups, prune old WAL (>14 days)
+./backup/run_monthly_dumps.sh
 ```
 Recommended cadence:
-- Base backup weekly (or more often if you prefer).
-- Prune after each base backup.
-- Monitor disk usage; adjust retention or add external storage later if needed.
+- Run monthly (or more often if you want fresher dev data).
+
+Example crontab (runs at 02:15 on the 1st of each month):
+```bash
+15 2 1 * * cd /path/to/hackorum/deploy && ./backup/run_monthly_dumps.sh >> /var/log/hackorum-dumps.log 2>&1
+```

 ## Initial archive import (mbox)
 If you need to import the historical mailing list archive before running the app:
@@ -147,5 +155,5 @@ docker compose up -d --build
 ```

 ## Notes / future improvements
-- Swap local backups for remote object storage later by replacing the backup scripts with wal-g or pgbackrest.
+- Swap local dumps for remote object storage later if needed.
 - Add log shipping/metrics if needed; for now Docker logs go to the host.
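
For the dev-data use case the README mentions, a public dump can be loaded with `gunzip` and `psql`. A minimal sketch, assuming an empty target database and a `psql` client that can already connect to it; `public_dump_path` and `restore_public` are hypothetical helpers, not part of this commit. The private dump is data-only, so its tables must already exist before it can be loaded; that step is not shown.

```bash
# Hypothetical helper: path of one month's public dump under a dumps directory.
public_dump_path() {
  local dir="$1" stamp="$2"
  printf '%s/public/public-%s.sql.gz\n' "$dir" "$stamp"
}

# Hypothetical helper: load a public dump into an (assumed empty) database.
# Runs in a subshell so pipefail does not leak into the caller.
restore_public() {
  local dir="$1" stamp="$2" db="$3"
  (
    set -o pipefail
    gunzip -c "$(public_dump_path "$dir" "$stamp")" \
      | psql --set ON_ERROR_STOP=1 --dbname "$db"
  )
}

# Example (not run here): restore_public /dumps 2025-01 hackorum_dev
```

`ON_ERROR_STOP=1` makes `psql` stop at the first failing statement rather than continuing through a partially applied dump.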

deploy/backup/private_tables.txt

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# Tables that contain private user data.
+# Listed tables are included in the private dump and excluded from the public dump.
+activities
+identities
+message_read_ranges
+mentions
+name_reservations
+note_edits
+note_mentions
+note_tags
+notes
+team_members
+teams
+thread_awarenesses
+user_tokens
+users
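
The dump script reads this file with `grep -E '^[a-z0-9_]+$'`, so comment lines, blank lines, and any line that is not purely lowercase letters, digits, and underscores are skipped. A small sketch of that filter, using a hypothetical `table_args` helper and an illustrative sample file (neither is part of this commit):

```bash
# Hypothetical helper: print one pg_dump option per listed table.
# $1: path to the table list, $2: option name such as --exclude-table
table_args() {
  local file="$1" flag="$2" t
  while IFS= read -r t; do
    printf '%s=public.%s\n' "$flag" "$t"
  done < <(grep -E '^[a-z0-9_]+$' "$file")
}

# Demo with an illustrative list; the last entry has a trailing space.
sample="$(mktemp)"
printf '%s\n' '# a comment' '' 'users' 'user_tokens' 'notes ' > "$sample"
table_args "$sample" --exclude-table
rm -f "$sample"
```

In the demo the `notes ` entry is dropped because of its trailing space. Under this filter a mistyped line is skipped silently, which would leave that table in the public dump, so the list is worth checking after edits.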

deploy/backup/prune_backups.sh

Lines changed: 0 additions & 21 deletions
This file was deleted.

deploy/backup/run_base_backup.sh

Lines changed: 0 additions & 17 deletions
This file was deleted.

deploy/backup/run_monthly_dumps.sh

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# Create monthly public/private SQL dumps (compressed) into the /dumps volume.
+# Public dump: full schema + data, excluding private tables.
+# Private dump: data-only, only private tables.
+
+ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+cd "$ROOT"
+
+TABLES_FILE="${TABLES_FILE:-${ROOT}/backup/private_tables.txt}"
+STAMP="$(date +%Y-%m)"
+
+if [[ ! -f "${TABLES_FILE}" ]]; then
+  echo "Private tables file not found: ${TABLES_FILE}" >&2
+  exit 1
+fi
+
+readarray -t PRIVATE_TABLES < <(grep -E '^[a-z0-9_]+$' "${TABLES_FILE}")
+
+if [[ ${#PRIVATE_TABLES[@]} -eq 0 ]]; then
+  echo "No private tables found in ${TABLES_FILE}" >&2
+  exit 1
+fi
+
+EXCLUDE_ARGS=()
+INCLUDE_ARGS=()
+for table in "${PRIVATE_TABLES[@]}"; do
+  EXCLUDE_ARGS+=("--exclude-table=public.${table}")
+  INCLUDE_ARGS+=("--table=public.${table}")
+done
+
+EXCLUDE_ARGS_STR="$(printf ' %q' "${EXCLUDE_ARGS[@]}")"
+INCLUDE_ARGS_STR="$(printf ' %q' "${INCLUDE_ARGS[@]}")"
+
+echo "Writing dumps for ${STAMP} to /dumps/public and /dumps/private..."
+
+docker compose -f docker-compose.yml exec -T db bash -lc \
+  "set -o pipefail && mkdir -p /dumps/public /dumps/private \
+  && pg_dump -U \${POSTGRES_USER:-postgres} -d \${POSTGRES_DB:-hackorum} \
+    --format=plain --no-owner --no-privileges${EXCLUDE_ARGS_STR} \
+    | gzip -9 > /dumps/public/public-${STAMP}.sql.gz \
+  && pg_dump -U \${POSTGRES_USER:-postgres} -d \${POSTGRES_DB:-hackorum} \
+    --format=plain --data-only --no-owner --no-privileges${INCLUDE_ARGS_STR} \
+    | gzip -9 > /dumps/private/private-${STAMP}.sql.gz"
+
+echo "Done."
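
The dump commands run in a separate `bash -lc` shell inside the container, which does not inherit the `set -euo pipefail` at the top of the script. In bash, a pipeline's exit status is that of its last command unless `pipefail` is set, so a failing `pg_dump` piped into `gzip` can still look successful. A self-contained sketch of that behaviour, using a hypothetical `failing_dump` function as a stand-in for `pg_dump`:

```bash
# Stand-in for a failing pg_dump: writes partial output, then exits with 3.
failing_dump() { echo "partial output"; return 3; }

# Report the exit status of "failing_dump | gzip" with pipefail on or off.
# $1: "on" or "off"
pipeline_status() {
  (
    if [ "$1" = "on" ]; then set -o pipefail; else set +o pipefail; fi
    if failing_dump | gzip -c > /dev/null; then echo 0; else echo "$?"; fi
  )
}

echo "without pipefail: $(pipeline_status off)"   # gzip's status
echo "with pipefail:    $(pipeline_status on)"    # failing_dump's status
```

Without `pipefail` the status is 0 even though the producer failed and the `.sql.gz` file is truncated; with `pipefail` it is 3.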

deploy/docker-compose.yml

Lines changed: 3 additions & 6 deletions
@@ -3,7 +3,6 @@ version: "3.9"
 services:
   db:
     image: postgres:18
-    entrypoint: ["/usr/local/bin/custom-entrypoint.sh"]
     environment:
       POSTGRES_USER: ${POSTGRES_USER:-postgres}
       POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-postgres}
@@ -14,11 +13,9 @@ services:
     volumes:
       # Postgres 18+ stores data under /var/lib/postgresql/<major>/main
       - pgdata:/var/lib/postgresql
-      - pgwal:/var/lib/postgresql/wal-archive
-      - pgbackups:/backups
+      - pgdumps:/dumps
       - ./postgres/postgresql.conf:/etc/postgresql/postgresql.conf:ro # copy from postgresql.conf.example
       - ./postgres/init:/docker-entrypoint-initdb.d:ro
-      - ./postgres/entrypoint.sh:/usr/local/bin/custom-entrypoint.sh:ro
     command:
       [
         "postgres",
@@ -115,6 +112,7 @@ services:
       - "443:443"
     volumes:
       - ./Caddyfile:/etc/caddy/Caddyfile:ro
+      - pgdumps:/dumps:ro
       - caddy_data:/data
       - caddy_config:/config
     depends_on:
@@ -143,7 +141,6 @@ services:

 volumes:
   pgdata:
-  pgwal:
-  pgbackups:
+  pgdumps:
   caddy_data:
   caddy_config:
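
The `pgdumps` volume is written by `db` and read by `caddy`, so a truncated or empty dump would be published as-is. A minimal sketch of an integrity check that could be run against a dump file, assuming `gzip` is available wherever the check runs; `check_dump` is a hypothetical helper, not part of this commit:

```bash
# Hypothetical helper: succeed only if the file exists, is non-empty,
# and passes gzip's built-in integrity test.
check_dump() {
  local f="$1"
  [ -s "$f" ] && gzip -t "$f" 2>/dev/null
}

# Example (not run here):
#   check_dump /dumps/public/public-2025-01.sql.gz || echo "dump is damaged" >&2
```

`gzip -t` only confirms that the archive decompresses cleanly; it does not confirm that the SQL inside is complete.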

deploy/postgres/entrypoint.sh

Lines changed: 0 additions & 8 deletions
This file was deleted.

deploy/postgres/postgresql.conf.example

Lines changed: 1 addition & 4 deletions
@@ -11,11 +11,8 @@ shared_preload_libraries = 'pg_stat_statements'
 pg_stat_statements.max = 10000
 pg_stat_statements.track = all

-# WAL and archiving
+# WAL
 wal_level = replica
 wal_compression = on
-archive_mode = on
-archive_timeout = 60s
-archive_command = 'test ! -f /var/lib/postgresql/wal-archive/%f && cp %p /var/lib/postgresql/wal-archive/%f'
 max_wal_size = 8GB
 min_wal_size = 2GB
