@@ -48,14 +48,16 @@ PostgreSQL documentation
4848 </para>
4949
5050 <para>
51- The result is equivalent to replacing the target data directory with the
52- source one. Only changed blocks from relation files are copied;
53- all other files are copied in full, including configuration files. The
54- advantage of <application>pg_rewind</application> over taking a new base backup, or
55- tools like <application>rsync</application>, is that <application>pg_rewind</application> does
56- not require reading through unchanged blocks in the cluster. This makes
57- it a lot faster when the database is large and only a small
58- fraction of blocks differ between the clusters.
51+ After a successful rewind, the state of the target data directory is
52+ analogous to a base backup of the source data directory. Unlike taking
53+ a new base backup or using a tool like <application>rsync</application>,
54+ <application>pg_rewind</application> does not require comparing or copying
55+ unchanged relation blocks in the cluster. Only changed blocks from existing
56+ relation files are copied; all other files, including new relation files,
57+     configuration files, and WAL segments, are copied in full. As such, the
58+ rewind operation is significantly faster than other approaches when the
59+ database is large and only a small fraction of blocks differ between the
60+ clusters.
5961 </para>
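+
+  <para>
+   For example, a rewind might be invoked as follows (a minimal sketch;
+   the data directory path and connection string are illustrative and
+   must be adapted to your setup):
+<programlisting>
+pg_rewind --target-pgdata=/var/lib/postgresql/target \
+    --source-server="host=srchost port=5432 user=postgres dbname=postgres"
+</programlisting>
+  </para>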
6062
6163 <para>
@@ -77,16 +79,18 @@ PostgreSQL documentation
7779 </para>
7880
7981 <para>
80- When the target server is started for the first time after running
81- <application>pg_rewind</application>, it will go into recovery mode and replay all
82- WAL generated in the source server after the point of divergence.
83- If some of the WAL was no longer available in the source server when
84- <application>pg_rewind</application> was run, and therefore could not be copied by the
85- <application>pg_rewind</application> session, it must be made available when the
86- target server is started. This can be done by creating a
87- <filename>recovery.signal</filename> file in the target data directory
88- and configuring suitable <xref linkend="guc-restore-command"/>
89- in <filename>postgresql.conf</filename>.
82+ After running <application>pg_rewind</application>, WAL replay needs to
83+ complete for the data directory to be in a consistent state. When the
84+    target server is started again, it will enter archive recovery and replay
85+ all WAL generated in the source server from the last checkpoint before
86+ the point of divergence. If some of the WAL was no longer available in the
87+ source server when <application>pg_rewind</application> was run, and
88+ therefore could not be copied by the <application>pg_rewind</application>
89+ session, it must be made available when the target server is started.
90+ This can be done by creating a <filename>recovery.signal</filename> file
91+ in the target data directory and by configuring a suitable
92+ <xref linkend="guc-restore-command"/> in
93+ <filename>postgresql.conf</filename>.
9094 </para>
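+
+  <para>
+   As a minimal sketch (the signal file path and archive location are
+   illustrative, not defaults):
+<programlisting>
+# create the signal file in the target data directory
+touch /var/lib/postgresql/target/recovery.signal
+</programlisting>
+   together with a suitable entry in <filename>postgresql.conf</filename>:
+<programlisting>
+# copy archived WAL segments back; adjust the archive path to your setup
+restore_command = 'cp /mnt/server/archivedir/%f %p'
+</programlisting>
+  </para>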
9195
9296 <para>
@@ -105,6 +109,15 @@ PostgreSQL documentation
105109 recovered. In such a case, taking a new fresh backup is recommended.
106110 </para>
107111
112+ <para>
113+ As <application>pg_rewind</application> copies configuration files
114+    entirely from the source, it may be necessary to correct the configuration
115+ used for recovery before restarting the target server, especially if
116+ the target is reintroduced as a standby of the source. If you restart
117+ the server after the rewind operation has finished but without configuring
118+ recovery, the target may again diverge from the primary.
119+ </para>
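+
+  <para>
+   For example, to reintroduce the target as a standby of the source (a
+   sketch; the host and user in the connection string are placeholders):
+<programlisting>
+# request standby mode on the next startup of the target
+touch /var/lib/postgresql/target/standby.signal
+</programlisting>
+   and in <filename>postgresql.conf</filename>:
+<programlisting>
+# connection to the new primary; parameters are illustrative
+primary_conninfo = 'host=srchost port=5432 user=replicator'
+</programlisting>
+  </para>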
120+
108121 <para>
109122 <application>pg_rewind</application> will fail immediately if it finds
110123 files it cannot write directly to. This can happen for example when
@@ -342,34 +355,45 @@ GRANT EXECUTE ON function pg_catalog.pg_read_binary_file(text, bigint, bigint, b
342355 Copy all those changed blocks from the source cluster to
343356 the target cluster, either using direct file system access
344357 (<option>--source-pgdata</option>) or SQL (<option>--source-server</option>).
358+ Relation files are now in a state equivalent to the moment of the last
359+ completed checkpoint prior to the point at which the WAL timelines of the
360+      source and target diverged, plus the current state on the source of any
361+ blocks changed on the target after that divergence.
345362 </para>
346363 </step>
347364 <step>
348365 <para>
349- Copy all other files such as <filename>pg_xact</filename> and
350- configuration files from the source cluster to the target cluster
351- (everything except the relation files) . Similarly to base backups,
352- the contents of the directories <filename>pg_dynshmem/</filename>,
366+ Copy all other files, including new relation files, WAL segments,
367+ <filename>pg_xact</filename>, and configuration files from the source
368+      cluster to the target cluster. Similarly to base backups, the contents
369+ of the directories <filename>pg_dynshmem/</filename>,
353370 <filename>pg_notify/</filename>, <filename>pg_replslot/</filename>,
354371 <filename>pg_serial/</filename>, <filename>pg_snapshots/</filename>,
355- <filename>pg_stat_tmp/</filename>, and
356- <filename>pg_subtrans/</filename> are omitted from the data copied
357- from the source cluster. Any file or directory beginning with
358- <filename>pgsql_tmp</filename> is omitted, as well as are
372+ <filename>pg_stat_tmp/</filename>, and <filename>pg_subtrans/</filename>
373+ are omitted from the data copied from the source cluster. The files
359374 <filename>backup_label</filename>,
360375 <filename>tablespace_map</filename>,
361376 <filename>pg_internal.init</filename>,
362- <filename>postmaster.opts</filename> and
363- <filename>postmaster.pid</filename>.
377+ <filename>postmaster.opts</filename>, and
378+ <filename>postmaster.pid</filename>, as well as any file or directory
379+ beginning with <filename>pgsql_tmp</filename>, are omitted.
380+ </para>
381+ </step>
382+ <step>
383+ <para>
384+ Create a <filename>backup_label</filename> file to begin WAL replay at
385+ the checkpoint created at failover and configure the
386+ <filename>pg_control</filename> file with a minimum consistency LSN
387+ defined as the result of <literal>pg_current_wal_insert_lsn()</literal>
388+      when rewinding from a live source, or the last checkpoint LSN when
389+ rewinding from a stopped source.
364390 </para>
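+     <para>
+      These LSNs can be inspected by hand, for example (illustrative
+      commands, not run by <application>pg_rewind</application> itself):
+<programlisting>
+-- on a live source server
+SELECT pg_current_wal_insert_lsn();
+</programlisting>
+<programlisting>
+# on a stopped source; the data directory path is an example
+pg_controldata /var/lib/postgresql/source | grep 'Latest checkpoint location'
+</programlisting>
+     </para>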
365391 </step>
366392 <step>
367393 <para>
368- Apply the WAL from the source cluster, starting from the checkpoint
369- created at failover. (Strictly speaking, <application>pg_rewind</application>
370- doesn't apply the WAL, it just creates a backup label file that
371- makes <productname>PostgreSQL</productname> start by replaying all WAL from
372- that checkpoint forward.)
394+ When starting the target, <productname>PostgreSQL</productname> replays
395+ all the required WAL, resulting in a data directory in a consistent
396+ state.
373397 </para>
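+     <para>
+      Replay progress can be monitored once the target is up, for example
+      (a sketch using standard monitoring functions):
+<programlisting>
+-- pg_is_in_recovery() returns false once recovery has finished
+SELECT pg_is_in_recovery(), pg_last_wal_replay_lsn();
+</programlisting>
+     </para>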
374398 </step>
375399 </procedure>