@@ -9,8 +9,8 @@ This directory contains Ansible playbooks and configuration for automating the d
99- [Prerequisites](#prerequisites)
1010- [Initial Setup](#initial-setup)
1111- [Deployment](#deployment)
12+ - [Redeployment](#redeployment)
1213- [Service Management](#service-management)
13- - [Verification](#verification)
1414- [Troubleshooting](#troubleshooting)
1515- [Advanced Usage](#advanced-usage)
1616
@@ -292,6 +292,154 @@ ssh app@agg-mode-mainnet-sender 'tmux attach -t task_sender'
292292# Press Ctrl+B then D to detach without stopping
293293```
294294
295+ ## Redeployment
296+
297+ ### Idempotent Deployment
298+
299+ Idempotent deployment skips building if the binary already exists. Use this when you only want to update configuration files.
300+
301+ ``` bash
302+ # For Hoodi
303+ make gateway_deploy ENV=hoodi
304+
305+ # For Mainnet
306+ make gateway_deploy ENV=mainnet
307+ ```
308+
309+ ### Force Rebuild
310+
311+ Force rebuild always rebuilds binaries from the latest code, even if they already exist. Use this when you want to deploy code changes.
312+
313+ ``` bash
314+ # For Hoodi
315+ make gateway_deploy ENV=hoodi FORCE_REBUILD=true
316+ make gateway_primary_deploy ENV=hoodi FORCE_REBUILD=true
317+ make gateway_secondary_deploy ENV=hoodi FORCE_REBUILD=true
318+
319+ # For Mainnet
320+ make gateway_deploy ENV=mainnet FORCE_REBUILD=true
321+ make gateway_primary_deploy ENV=mainnet FORCE_REBUILD=true
322+ make gateway_secondary_deploy ENV=mainnet FORCE_REBUILD=true
323+ ```
324+
325+ This will:
326+ 1. Pull latest code from the configured branch (staging for hoodi, main for mainnet)
327+ 2. Delete existing binaries
328+ 3. Rebuild gateway and poller from source
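
The idempotent and force-rebuild modes differ only in the build decision. The following is a minimal shell sketch of that decision, not the playbook's actual logic: `should_build` and its two arguments are hypothetical names introduced here for illustration.

```bash
# Hypothetical helper (not part of the playbooks): decide whether a deploy
# needs to build the binary.
#   $1 = path to the installed binary (assumed location)
#   $2 = value of FORCE_REBUILD ("true" forces a build; defaults to "false")
# Returns 0 when a build is needed, 1 when it can be skipped.
should_build() {
    if [ "${2:-false}" = "true" ]; then
        return 0   # force rebuild: always build
    fi
    if [ -x "$1" ]; then
        return 1   # idempotent mode: binary already exists, skip the build
    fi
    return 0       # no binary yet: build
}
```

For example, `should_build "$HOME/.cargo/bin/gateway" "$FORCE_REBUILD" && echo "build needed"` prints the message only when the binary is missing or a rebuild is forced. The binary path is an assumption based on the default `cargo install` location.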
329+
330+ ### Migrations
331+
332+ To run database migrations:
333+
334+ ``` bash
335+ # For Hoodi
336+ make postgres_migrations ENV=hoodi
337+
338+ # For Mainnet
339+ make postgres_migrations ENV=mainnet
340+ ```
341+
342+ ### Task Sender
343+
344+ To redeploy the task sender:
345+
346+ ``` bash
347+ # For Hoodi
348+ make task_sender_deploy ENV=hoodi
349+
350+ # For Mainnet
351+ make task_sender_deploy ENV=mainnet
352+ ```
353+
354+ ### Metrics Stack
355+
356+ To redeploy the metrics stack (Prometheus and Grafana):
357+
358+ ``` bash
359+ # For Hoodi
360+ make metrics_deploy ENV=hoodi
361+ make prometheus_deploy ENV=hoodi
362+ make grafana_deploy ENV=hoodi
363+
364+ # For Mainnet
365+ make metrics_deploy ENV=mainnet
366+ make prometheus_deploy ENV=mainnet
367+ make grafana_deploy ENV=mainnet
368+ ```
369+
370+ ### Manual Update
371+
372+ If you prefer to update manually:
373+
374+ **Gateway:**
375+ ``` bash
376+ # Hoodi
377+ ssh app@agg-mode-hoodi-gateway-1
378+ cd ~/repos/gateway/aligned_layer
379+ git pull origin staging
380+ cargo install --path aggregation_mode/gateway --bin gateway --features tls --locked
381+
382+ # Mainnet
383+ ssh app@agg-mode-mainnet-gateway-1
384+ cd ~/repos/gateway/aligned_layer
385+ git pull origin main
386+ cargo install --path aggregation_mode/gateway --bin gateway --features tls --locked
387+ ```
388+
389+ **Poller:**
390+ ``` bash
391+ # Hoodi
392+ ssh app@agg-mode-hoodi-gateway-1
393+ cd ~/repos/poller/aligned_layer
394+ git pull origin staging
395+ cargo install --path aggregation_mode/payments_poller --bin payments_poller --locked
396+
397+ # Mainnet
398+ ssh app@agg-mode-mainnet-gateway-1
399+ cd ~/repos/poller/aligned_layer
400+ git pull origin main
401+ cargo install --path aggregation_mode/payments_poller --bin payments_poller --locked
402+ ```
403+
404+ **Task Sender:**
405+ ``` bash
406+ # Hoodi
407+ ssh app@agg-mode-hoodi-sender
408+ cd ~/repos/sender/aligned_layer
409+ git pull origin staging
410+ cargo install --path aggregation_mode/cli --bin agg_mode_cli --locked
411+
412+ # Mainnet
413+ ssh app@agg-mode-mainnet-sender
414+ cd ~/repos/sender/aligned_layer
415+ git pull origin main
416+ cargo install --path aggregation_mode/cli --bin agg_mode_cli --locked
417+ ```
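
When scripting the manual steps, the branch can be derived from the environment using the mapping given under Force Rebuild (staging for hoodi, main for mainnet). A minimal sketch, where `branch_for_env` is a hypothetical helper name and not something the repository provides:

```bash
# Hypothetical helper: map an environment name to the branch it deploys from.
# Mapping taken from the Force Rebuild notes: staging for hoodi, main for mainnet.
branch_for_env() {
    case "$1" in
        hoodi)   echo "staging" ;;
        mainnet) echo "main" ;;
        *)       echo "unknown environment: $1" >&2; return 1 ;;
    esac
}
```

On the target host this could be used as `git pull origin "$(branch_for_env mainnet)"`, which avoids pulling the wrong branch when commands are copied between environments.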
418+
419+ **Prometheus:**
420+ ``` bash
421+ # Hoodi
422+ ssh admin@agg-mode-hoodi-metrics
423+ # Update prometheus.yaml configuration manually
424+ systemctl --user restart prometheus
425+
426+ # Mainnet
427+ ssh admin@agg-mode-mainnet-metrics
428+ # Update prometheus.yaml configuration manually
429+ systemctl --user restart prometheus
430+ ```
431+
432+ **Grafana:**
433+ ``` bash
434+ # Hoodi
435+ ssh admin@agg-mode-hoodi-metrics
436+ sudo systemctl restart grafana-server
437+
438+ # Mainnet
439+ ssh admin@agg-mode-mainnet-metrics
440+ sudo systemctl restart grafana-server
441+ ```
442+
295443## Service Management
296444
297445### Check Service Status
@@ -398,142 +546,6 @@ ssh app@agg-mode-mainnet-sender 'tmux capture-pane -t task_sender -p'
398546# Press Ctrl+B then D to detach
399547```
400548
401- ## Verification
402-
403- ### PostgreSQL Cluster Health
404-
405- 1. **Check cluster state:**
406- ``` bash
407- # For Hoodi
408- make postgres_status ENV=hoodi
409-
410- # For Mainnet
411- make postgres_status ENV=mainnet
412- ```
413-
414- 2. **Test password authentication:**
415- ``` bash
416- # For Hoodi
417- ssh admin@agg-mode-hoodi-postgres-1 "PGPASSWORD='your_password' psql -U autoctl_node -h localhost -d agg_mode -c 'SELECT 1'"
418- # For Mainnet
419- ssh admin@agg-mode-mainnet-postgres-1 "PGPASSWORD='your_password' psql -U autoctl_node -h localhost -d agg_mode -c 'SELECT 1'"
420- ```
421-
422- 3. **Verify replication:**
423- ``` bash
424- # For Hoodi
425- ssh admin@agg-mode-hoodi-postgres-1 "sudo -u postgres psql -d agg_mode -c 'SELECT * FROM pg_stat_replication'"
426-
427- # For Mainnet
428- ssh admin@agg-mode-mainnet-postgres-1 "sudo -u postgres psql -d agg_mode -c 'SELECT * FROM pg_stat_replication'"
429- ```
430-
431- 4. **Test failover (optional):**
432- ``` bash
433- # For Hoodi
434- ssh admin@agg-mode-hoodi-postgres-1 "sudo systemctl stop pgautofailover"
435- # Wait 30 seconds, check status
436- make postgres_status ENV=hoodi
437- # Secondary should now be primary
438- ssh admin@agg-mode-hoodi-postgres-1 "sudo systemctl start pgautofailover"
439-
440- # For Mainnet
441- ssh admin@agg-mode-mainnet-postgres-1 "sudo systemctl stop pgautofailover"
442- # Wait 30 seconds, check status
443- make postgres_status ENV=mainnet
444- # Secondary should now be primary
445- ssh admin@agg-mode-mainnet-postgres-1 "sudo systemctl start pgautofailover"
446- ```
447-
448- ### Gateway Health
449-
450- 1. **Check HTTP health endpoint:**
451- ``` bash
452- # For Hoodi
453- curl -k https://agg-mode-hoodi-gateway-1/
454-
455- # For Mainnet
456- curl -k https://agg-mode-mainnet-gateway-1/
457- ```
458-
459- 2. **Check metrics:**
460- ``` bash
461- # For Hoodi
462- curl http://agg-mode-hoodi-gateway-1:9094/metrics
463-
464- # For Mainnet
465- curl http://agg-mode-mainnet-gateway-1:9094/metrics
466- ```
467-
468- 3. **Verify database connectivity:**
469- ``` bash
470- # For Hoodi
471- ssh app@agg-mode-hoodi-gateway-1
472- PGPASSWORD='your_password' psql -U autoctl_node -h agg-mode-hoodi-postgres-1 -d agg_mode -c "SELECT 1"
473-
474- # For Mainnet
475- ssh app@agg-mode-mainnet-gateway-1
476- PGPASSWORD='your_password' psql -U autoctl_node -h agg-mode-mainnet-postgres-1 -d agg_mode -c "SELECT 1"
477- ```
478-
479- ### Poller Health
480-
481- 1. **Check last processed block:**
482- ``` bash
483- # For Hoodi
484- ssh app@agg-mode-hoodi-gateway-1 "cat ~/config/proof-aggregator.last_block_fetched.json"
485-
486- # For Mainnet
487- ssh app@agg-mode-mainnet-gateway-1 "cat ~/config/proof-aggregator.last_block_fetched.json"
488- ```
489-
490- The block number should increase over time.
491-
492- 2. **Check metrics:**
493- ``` bash
494- # For Hoodi
495- curl http://agg-mode-hoodi-gateway-1:9095/metrics
496-
497- # For Mainnet
498- curl http://agg-mode-mainnet-gateway-1:9095/metrics
499- ```
500-
501- ### Metrics Stack
502-
503- 1. **Prometheus targets:**
504- - Navigate to `http://<metrics-ip>:9090/targets`
505- - All targets should show as "UP"
506-
507- 2. **Grafana datasources:**
508- - Navigate to `http://<metrics-ip>:3000`
509- - Go to Configuration → Data Sources
510- - Verify Prometheus and PostgreSQL datasources are connected
511-
512- ### Task Sender
513-
514- 1. **Check tmux session is running:**
515- ``` bash
516- # For Hoodi
517- make task_sender_status ENV=hoodi
518-
519- # For Mainnet
520- make task_sender_status ENV=mainnet
521- ```
522-
523- 2. **View recent logs:**
524- ``` bash
525- # For Hoodi
526- ssh app@agg-mode-hoodi-sender 'tmux capture-pane -t task_sender -p'
527-
528- # For Mainnet
529- ssh app@agg-mode-mainnet-sender 'tmux capture-pane -t task_sender -p'
530- ```
531-
532- 3. **Verify proof submissions:**
533- - Check logs for successful proof submissions
534- - Look for transaction hashes in the output
535- - Verify proofs are appearing on the network
536-
537549## Troubleshooting
538550
539551### PostgreSQL Issues
@@ -850,90 +862,6 @@ ansible-playbook infra/aggregation_mode/ansible/playbooks/gateway.yaml \
850862 -e "force_rebuild=true"
851863```
852864
853- ### Updating Services
854-
855- **Update gateway and poller with latest code:**
856-
857- The easiest way to update services is using the `FORCE_REBUILD` parameter:
858-
859- ``` bash
860- # For Hoodi
861- make gateway_deploy ENV=hoodi FORCE_REBUILD=true
862- make gateway_primary_deploy ENV=hoodi FORCE_REBUILD=true
863- make gateway_secondary_deploy ENV=hoodi FORCE_REBUILD=true
864-
865- # For Mainnet
866- make gateway_deploy ENV=mainnet FORCE_REBUILD=true
867- make gateway_primary_deploy ENV=mainnet FORCE_REBUILD=true
868- make gateway_secondary_deploy ENV=mainnet FORCE_REBUILD=true
869- ```
870-
871- This will:
872- 1. Pull latest code from the configured branch (staging for hoodi, main for mainnet)
873- 2. Delete existing binaries
874- 3. Rebuild gateway and poller from source
875- 4. Restart the services
876-
877- **Manual update (alternative):**
878-
879- If you prefer to update manually:
880-
881- ``` bash
882- # Gateway (Hoodi)
883- ssh app@agg-mode-hoodi-gateway-1
884- cd ~/repos/gateway/aligned_layer
885- git pull origin staging
886- cargo install --path aggregation_mode/gateway --bin gateway --features tls --locked
887- sudo systemctl restart gateway
888-
889- # Gateway (Mainnet)
890- ssh app@agg-mode-mainnet-gateway-1
891- cd ~/repos/gateway/aligned_layer
892- git pull origin staging
893- cargo install --path aggregation_mode/gateway --bin gateway --features tls --locked
894- sudo systemctl restart gateway
895-
896- # Poller (Hoodi)
897- ssh app@agg-mode-hoodi-gateway-1
898- cd ~/repos/poller/aligned_layer
899- git pull origin staging
900- cargo install --path aggregation_mode/payments_poller --bin payments_poller --locked
901- systemctl --user restart poller
902-
903- # Poller (Mainnet)
904- ssh app@agg-mode-mainnet-gateway-1
905- cd ~/repos/poller/aligned_layer
906- git pull origin staging
907- cargo install --path aggregation_mode/payments_poller --bin payments_poller --locked
908- systemctl --user restart poller
909- ```
910-
911- ### Redeploy with Latest Code
912-
913- **Idempotent deployment (skip if binary exists):**
914-
915- ``` bash
916- # For Hoodi
917- make gateway_deploy ENV=hoodi
918-
919- # For Mainnet
920- make gateway_deploy ENV=mainnet
921- ```
922-
923- This pulls the latest code but skips building if the binary already exists. Use this when you only want to update configuration files.
924-
925- **Force rebuild (always rebuild binaries):**
926-
927- ``` bash
928- # For Hoodi
929- make gateway_deploy ENV=hoodi FORCE_REBUILD=true
930-
931- # For Mainnet
932- make gateway_deploy ENV=mainnet FORCE_REBUILD=true
933- ```
934-
935- This always rebuilds binaries from the latest code, even if they already exist. Use this when you want to deploy code changes.
936-
937865### Changing Configuration
938866
9398671. Update INI files in `playbooks/ini/`