@@ -665,6 +665,108 @@ IsForeignRelUpdatable (Relation rel);
665665
666666 </sect2>
667667
668+ <sect2 id="fdw-callbacks-row-locking">
669+ <title>FDW Routines For Row Locking</title>
670+
671+ <para>
672+ If an FDW wishes to support <firstterm>late row locking</> (as described
673+ in <xref linkend="fdw-row-locking">), it must provide the following
674+ callback functions:
675+ </para>
676+
677+ <para>
678+ <programlisting>
679+ RowMarkType
680+ GetForeignRowMarkType (RangeTblEntry *rte,
681+ LockClauseStrength strength);
682+ </programlisting>
683+
684+ Report which row-marking option to use for a foreign table.
685+ <literal>rte</> is the <structname>RangeTblEntry</> node for the table
686+ and <literal>strength</> describes the lock strength requested by the
687+ relevant <literal>FOR UPDATE/SHARE</> clause, if any. The result must be
688+ a member of the <literal>RowMarkType</> enum type.
689+ </para>
690+
691+ <para>
692+ This function is called during query planning for each foreign table that
693+ appears in an <command>UPDATE</>, <command>DELETE</>, or <command>SELECT
694+ FOR UPDATE/SHARE</> query and is not the target of <command>UPDATE</>
695+ or <command>DELETE</>.
696+ </para>
697+
698+ <para>
699+ If the <function>GetForeignRowMarkType</> pointer is set to
700+ <literal>NULL</>, the <literal>ROW_MARK_COPY</> option is always used.
701+ (This implies that <function>RefetchForeignRow</> will never be called,
702+ so it need not be provided either.)
703+ </para>
704+
705+ <para>
706+ See <xref linkend="fdw-row-locking"> for more information.
707+ </para>
708+
709+ <para>
710+ <programlisting>
711+ HeapTuple
712+ RefetchForeignRow (EState *estate,
713+ ExecRowMark *erm,
714+ Datum rowid,
715+ bool *updated);
716+ </programlisting>
717+
718+ Re-fetch one tuple from the foreign table, after locking it if required.
719+ <literal>estate</> is global execution state for the query.
720+ <literal>erm</> is the <structname>ExecRowMark</> struct describing
721+ the target foreign table and the row lock type (if any) to acquire.
722+ <literal>rowid</> identifies the tuple to be fetched.
723+ <literal>updated</> is an output parameter.
724+ </para>
725+
726+ <para>
727+ This function should return a palloc'ed copy of the fetched tuple,
728+ or <literal>NULL</> if the row lock couldn't be obtained. The row lock
729+ type to acquire is defined by <literal>erm->markType</>, which is the
730+ value previously returned by <function>GetForeignRowMarkType</>.
731+ (<literal>ROW_MARK_REFERENCE</> means to just re-fetch the tuple without
732+ acquiring any lock, and <literal>ROW_MARK_COPY</> will never be seen by
733+ this routine.)
734+ </para>
735+
736+ <para>
737+ In addition, <literal>*updated</> should be set to <literal>true</>
738+ if what was fetched was an updated version of the tuple rather than
739+ the same version previously obtained. (If the FDW cannot be sure about
740+ this, always setting it to <literal>true</> is recommended.)
741+ </para>
742+
743+ <para>
744+ Note that by default, failure to acquire a row lock should result in
745+ raising an error; a <literal>NULL</> return is only appropriate if
746+ the <literal>SKIP LOCKED</> option is specified
747+ by <literal>erm->waitPolicy</>.
748+ </para>
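<para>
 The return-value rules for <function>RefetchForeignRow</> described above
 can be illustrated with a minimal, self-contained sketch. It is not code from
 the PostgreSQL sources: the enums are local stand-ins that reuse the names
 <literal>RowMarkType</> and <literal>LockWaitPolicy</>, and
 <literal>DemoRow</>, <literal>DemoRowMark</>, <literal>DemoStatus</> and
 <function>demo_refetch_row</> are hypothetical. A real callback would return a
 palloc'ed <type>HeapTuple</> and raise an error instead of reporting
 <literal>DEMO_ERROR</>; the sketch also assumes it cannot wait for a lock.
</para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Local stand-ins that reuse the PostgreSQL names; not the real headers. */
typedef enum RowMarkType
{
    ROW_MARK_EXCLUSIVE,
    ROW_MARK_NOKEYEXCLUSIVE,
    ROW_MARK_SHARE,
    ROW_MARK_KEYSHARE,
    ROW_MARK_REFERENCE,
    ROW_MARK_COPY
} RowMarkType;

typedef enum LockWaitPolicy
{
    LockWaitBlock,
    LockWaitSkip,
    LockWaitError
} LockWaitPolicy;

/* Hypothetical remote row and row-mark description used by this sketch. */
typedef struct DemoRow
{
    int  version;          /* bumped on every remote update */
    bool locked_by_other;  /* a conflicting lock is held elsewhere */
    int  value;
} DemoRow;

typedef struct DemoRowMark
{
    RowMarkType    markType;
    LockWaitPolicy waitPolicy;
} DemoRowMark;

typedef enum DemoStatus { DEMO_OK, DEMO_SKIPPED, DEMO_ERROR } DemoStatus;

/*
 * Re-fetch one row, locking it unless markType is ROW_MARK_REFERENCE.
 * Returns a malloc'ed copy, or NULL when the lock could not be obtained.
 * seen_version is the row version obtained by the earlier scan.
 */
static DemoRow *
demo_refetch_row(const DemoRow *remote, const DemoRowMark *erm,
                 int seen_version, bool *updated, DemoStatus *status)
{
    DemoRow *copy;

    assert(remote != NULL && erm != NULL && updated != NULL && status != NULL);
    /* ROW_MARK_COPY rows travel through the join and are never re-fetched. */
    assert(erm->markType != ROW_MARK_COPY);

    *updated = false;

    if (erm->markType != ROW_MARK_REFERENCE && remote->locked_by_other)
    {
        /* Only SKIP LOCKED justifies a quiet NULL; otherwise it is an error. */
        *status = (erm->waitPolicy == LockWaitSkip) ? DEMO_SKIPPED : DEMO_ERROR;
        return NULL;
    }

    copy = malloc(sizeof(DemoRow));
    if (copy == NULL)
    {
        *status = DEMO_ERROR;
        return NULL;
    }
    *copy = *remote;

    /* Report whether a newer version than the one seen before was fetched. */
    *updated = (remote->version != seen_version);
    *status = DEMO_OK;
    return copy;
}
```

<para>
 The caller owns the returned copy. In this sketch a <literal>NULL</> result
 with <literal>DEMO_SKIPPED</> corresponds to the <literal>SKIP LOCKED</>
 case, while <literal>DEMO_ERROR</> marks the places where a real FDW would
 raise an error.
</para>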
749+
750+ <para>
751+ The <literal>rowid</> is the <structfield>ctid</> value previously read
752+ for the row to be re-fetched. Although the <literal>rowid</> value is
753+ passed as a <type>Datum</>, it can currently only be a <type>tid</>. The
754+ function API is chosen in the hope that other data types can be allowed
755+ for row IDs in the future.
756+ </para>
757+
758+ <para>
759+ If the <function>RefetchForeignRow</> pointer is set to
760+ <literal>NULL</>, attempts to re-fetch rows will fail
761+ with an error message.
762+ </para>
763+
764+ <para>
765+ See <xref linkend="fdw-row-locking"> for more information.
766+ </para>
767+
768+ </sect2>
769+
668770 <sect2 id="fdw-callbacks-explain">
669771 <title>FDW Routines for <command>EXPLAIN</></title>
670772
@@ -1092,31 +1194,125 @@ GetForeignServerByName(const char *name, bool missing_ok);
10921194 structures that <function>copyObject</> knows how to copy.
10931195 </para>
10941196
1095- <para>
1096- For an <command>UPDATE</> or <command>DELETE</> against an external data
1097- source that supports concurrent updates, it is recommended that the
1098- <literal>ForeignScan</> operation lock the rows that it fetches, perhaps
1099- via the equivalent of <command>SELECT FOR UPDATE</>. The FDW may also
1100- choose to lock rows at fetch time when the foreign table is referenced
1101- in a <command>SELECT FOR UPDATE/SHARE</>; if it does not, the
1102- <literal>FOR UPDATE</> or <literal>FOR SHARE</> option is essentially a
1103- no-op so far as the foreign table is concerned. This behavior may yield
1104- semantics slightly different from operations on local tables, where row
1105- locking is customarily delayed as long as possible: remote rows may get
1106- locked even though they subsequently fail locally-applied restriction or
1107- join conditions. However, matching the local semantics exactly would
1108- require an additional remote access for every row, and might be
1109- impossible anyway depending on what locking semantics the external data
1110- source provides.
1111- </para>
1112-
11131197 <para>
11141198 <command>INSERT</> with an <literal>ON CONFLICT</> clause does not
11151199 support specifying the conflict target, as remote constraints are not
11161200 locally known. This in turn implies that <literal>ON CONFLICT DO
11171201 UPDATE</> is not supported, since the specification is mandatory there.
11181202 </para>
11191203
1204+ </sect1>
1205+
1206+ <sect1 id="fdw-row-locking">
1207+ <title>Row Locking in Foreign Data Wrappers</title>
1208+
1209+ <para>
1210+ If an FDW's underlying storage mechanism has a concept of locking
1211+ individual rows to prevent concurrent updates of those rows, it is
1212+ usually worthwhile for the FDW to perform row-level locking with as
1213+ close an approximation as practical to the semantics used in
1214+ ordinary <productname>PostgreSQL</> tables. There are multiple
1215+ considerations involved in this.
1216+ </para>
1217+
1218+ <para>
1219+ One key decision to be made is whether to perform <firstterm>early
1220+ locking</> or <firstterm>late locking</>. In early locking, a row is
1221+ locked when it is first retrieved from the underlying store, while in
1222+ late locking, the row is locked only when it is known that it needs to
1223+ be locked. (The difference arises because some rows may be discarded by
1224+ locally-checked restriction or join conditions.) Early locking is much
1225+ simpler and avoids extra round trips to a remote store, but it can cause
1226+ locking of rows that need not have been locked, resulting in reduced
1227+ concurrency or even unexpected deadlocks. Also, late locking is only
1228+ possible if the row to be locked can be uniquely re-identified later.
1229+ Preferably the row identifier should identify a specific version of the
1230+ row, as <productname>PostgreSQL</> TIDs do.
1231+ </para>
1232+
1233+ <para>
1234+ By default, <productname>PostgreSQL</> ignores locking considerations
1235+ when interfacing to FDWs, but an FDW can perform early locking without
1236+ any explicit support from the core code. The API functions described
1237+ in <xref linkend="fdw-callbacks-row-locking">, which were added
1238+ in <productname>PostgreSQL</> 9.5, allow an FDW to use late locking if
1239+ it wishes.
1240+ </para>
1241+
1242+ <para>
1243+ An additional consideration is that in <literal>READ COMMITTED</>
1244+ isolation mode, <productname>PostgreSQL</> may need to re-check
1245+ restriction and join conditions against an updated version of some
1246+ target tuple. Rechecking join conditions requires re-obtaining copies
1247+ of the non-target rows that were previously joined to the target tuple.
1248+ When working with standard <productname>PostgreSQL</> tables, this is
1249+ done by including the TIDs of the non-target tables in the column list
1250+ projected through the join, and then re-fetching non-target rows when
1251+ required. This approach keeps the join data set compact, but it
1252+ requires inexpensive re-fetch capability, as well as a TID that can
1253+ uniquely identify the row version to be re-fetched. By default,
1254+ therefore, the approach used with foreign tables is to include a copy of
1255+ the entire row fetched from a foreign table in the column list projected
1256+ through the join. This puts no special demands on the FDW but can
1257+ result in reduced performance of merge and hash joins. An FDW that is
1258+ capable of meeting the re-fetch requirements can choose to do it the
1259+ first way.
1260+ </para>
1261+
1262+ <para>
1263+ For an <command>UPDATE</> or <command>DELETE</> on a foreign table, it
1264+ is recommended that the <literal>ForeignScan</> operation on the target
1265+ table perform early locking on the rows that it fetches, perhaps via the
1266+ equivalent of <command>SELECT FOR UPDATE</>. An FDW can detect whether
1267+ a table is an <command>UPDATE</>/<command>DELETE</> target at plan time
1268+ by comparing its relid to <literal>root->parse->resultRelation</>,
1269+ or at execution time by using <function>ExecRelationIsTargetRelation()</>.
1270+ An alternative possibility is to perform late locking within the
1271+ <function>ExecForeignUpdate</> or <function>ExecForeignDelete</>
1272+ callback, but no special support is provided for this.
1273+ </para>
1274+
1275+ <para>
1276+ For foreign tables that are specified to be locked by a <command>SELECT
1277+ FOR UPDATE/SHARE</> command, the <literal>ForeignScan</> operation can
1278+ again perform early locking by fetching tuples with the equivalent
1279+ of <command>SELECT FOR UPDATE/SHARE</>. To perform late locking
1280+ instead, provide the callback functions defined
1281+ in <xref linkend="fdw-callbacks-row-locking">.
1282+ In <function>GetForeignRowMarkType</>, select rowmark option
1283+ <literal>ROW_MARK_EXCLUSIVE</>, <literal>ROW_MARK_NOKEYEXCLUSIVE</>,
1284+ <literal>ROW_MARK_SHARE</>, or <literal>ROW_MARK_KEYSHARE</> depending
1285+ on the requested lock strength. (The core code will act the same
1286+ regardless of which of these four options you choose.)
1287+ Elsewhere, you can detect whether a foreign table was specified to be
1288+ locked by this type of command by using <function>get_plan_rowmark</> at
1289+ plan time, or <function>ExecFindRowMark</> at execution time; you must
1290+ check not only whether a non-null rowmark struct is returned, but that
1291+ its <structfield>strength</> field is not <literal>LCS_NONE</>.
1292+ </para>
1293+
1294+ <para>
1295+ Lastly, for foreign tables that are used in an <command>UPDATE</>,
1296+ <command>DELETE</> or <command>SELECT FOR UPDATE/SHARE</> command but
1297+ are not specified to be row-locked, you can override the default choice
1298+ to copy entire rows by having <function>GetForeignRowMarkType</> select
1299+ option <literal>ROW_MARK_REFERENCE</> when it sees lock strength
1300+ <literal>LCS_NONE</>. This will cause <function>RefetchForeignRow</> to
1301+ be called with that value for <structfield>markType</>; it should then
1302+ re-fetch the row without acquiring any new lock. (If you have
1303+ a <function>GetForeignRowMarkType</> function but don't wish to re-fetch
1304+ unlocked rows, select option <literal>ROW_MARK_COPY</>
1305+ for <literal>LCS_NONE</>.)
1306+ </para>
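<para>
 One way to picture the choices available to
 <function>GetForeignRowMarkType</> is the self-contained sketch below. It is
 an assumption-laden illustration rather than a definitive implementation: the
 enums are local stand-ins that reuse the PostgreSQL names,
 <literal>DemoRangeTblEntry</> and its <structfield>can_refetch_cheaply</>
 field are hypothetical, and <function>demo_get_row_mark_type</> is not part
 of any real FDW.
</para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins that reuse the PostgreSQL names; not the real headers. */
typedef enum LockClauseStrength
{
    LCS_NONE,
    LCS_FORKEYSHARE,
    LCS_FORSHARE,
    LCS_FORNOKEYUPDATE,
    LCS_FORUPDATE
} LockClauseStrength;

typedef enum RowMarkType
{
    ROW_MARK_EXCLUSIVE,
    ROW_MARK_NOKEYEXCLUSIVE,
    ROW_MARK_SHARE,
    ROW_MARK_KEYSHARE,
    ROW_MARK_REFERENCE,
    ROW_MARK_COPY
} RowMarkType;

/* Hypothetical stand-in for the RangeTblEntry argument. */
typedef struct DemoRangeTblEntry
{
    unsigned relid;
    bool     can_refetch_cheaply;  /* assumed per-table capability flag */
} DemoRangeTblEntry;

/* Pick a row-marking option for a late-locking FDW. */
static RowMarkType
demo_get_row_mark_type(const DemoRangeTblEntry *rte, LockClauseStrength strength)
{
    assert(rte != NULL);

    switch (strength)
    {
        case LCS_NONE:
            /* Not locked: re-fetch by reference only if that is cheap. */
            return rte->can_refetch_cheaply ? ROW_MARK_REFERENCE : ROW_MARK_COPY;
        case LCS_FORKEYSHARE:
            return ROW_MARK_KEYSHARE;
        case LCS_FORSHARE:
            return ROW_MARK_SHARE;
        case LCS_FORNOKEYUPDATE:
            return ROW_MARK_NOKEYEXCLUSIVE;
        case LCS_FORUPDATE:
            return ROW_MARK_EXCLUSIVE;
    }
    /* Unknown strength: fall back to copying whole rows. */
    return ROW_MARK_COPY;
}
```

<para>
 Returning <literal>ROW_MARK_COPY</> for <literal>LCS_NONE</> keeps the
 default whole-row behavior for tables where re-fetching would be expensive,
 while the four locking options differ only in what the FDW itself does when
 the row is later re-fetched.
</para>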
1307+
1308+ <para>
1309+ See <filename>src/include/nodes/lockoptions.h</>, the comments
1310+ for <type>RowMarkType</> and <type>PlanRowMark</>
1311+ in <filename>src/include/nodes/plannodes.h</>, and the comments for
1312+ <type>ExecRowMark</> in <filename>src/include/nodes/execnodes.h</> for
1313+ additional information.
1314+ </para>
1315+
11201316 </sect1>
11211317
11221318 </chapter>