0

I have a table called fragments which hold a bunch of words, and sentence fragments. I want to create a random "sentence" by selecting a random fragment appending it to the string I'm building (separated by a delimiter). I want to keep doing this until I reach a specified length and then return the resulting string. What is the best way to do this in PostgreSQL?

id | text
----------------
1  | hello
2  | world

Given the example table above and a desired length of 20 I would expect to get back something like:

world hello hello world

http://sqlfiddle.com/#!15/0a88ca/1

1 Answer 1

1

You could use a recursive CTE for this:

 WITH RECURSIVE recCte (sentence, length) AS
 (
      --recursive seed
      SELECT
        CAST("text" AS VARCHAR(200)) as sentence,
        1 as length
      FROM (SELECT "text" FROM test.fragments ORDER BY random() LIMIT 1)frag

      UNION ALL

      --recursive term
      SELECT
          CAST(recCTE.sentence || ' ' || frag.text as VARCHAR(200)),
          recCTE.length + 1
      FROM
          reccte, (SELECT "text" from test.fragments ORDER BY random() LIMIT 1)frag
      WHERE recCTE.length <= 20 --Sentence word length
    )
    --Select the full sentence made
    SELECT sentence FROM recCTE WHERE length = 20;

This is a little involved, but it's perfect for your needs. A recursive CTE query is made up of three parts.

  1. The recursive seed - This is the portion of the query that is non-recursive. It's the starting point for the query. We just grab one random word from the table.
  2. The recursive term - This is the portion of the query that is recursive. It refers back to itself ...FROM recCTE .... Again, we just grab one word from the table at random and stick it onto the sentence. While doing this we keep track of how deep we are in the iterations so we can stop after 20 loops.
  3. The final select statement to select the full sentence from the recursive CTE. Each iteration creates a record, so we just grab the record that has made it to 20. Change "20" in the recursive CTE and final Select statement to change the length of the sentence.

Editting to add a version that uses character length:

This is a little trickier because we have to use ORDER BY random() LIMIT 1 to get a random text from the fragments table, but you can't ORDER BY and LIMIT in a recursive CTE. So getting exactly 20 characters is difficult, but.. we can get <= 20, which is pretty close:

 WITH RECURSIVE recCte (sentence, length) AS
    (
      SELECT
        CAST("text" AS VARCHAR(200)) as sentence,
        length("text") as length
      FROM (SELECT "text" FROM test.fragments ORDER BY random() LIMIT 1)frag

      UNION ALL

      SELECT
          CAST(cte.sentence || ' ' || frag.text as VARCHAR(200)),
          cte.length + 1 + length(frag.text)
      FROM
          reccte cte, (SELECT text FROM test.fragments ORDER BY random() LIMIT 1) frag
      WHERE
        length(frag.text) < (20-cte.length)
        AND cte.length <= 20 --Sentence word length

    )

SELECT sentence, length FROM recCTE ORDER BY length DESC LIMIT 1;

The big changes here are changing the Length field to calculate the character length and to add that restriction in the WHERE clause of the recursive term. Finally we ORDER BY length DESC to sort the cte's records by character length, and LIMIT 1 to grab the biggest one we created through the iterations.

Sign up to request clarification or add additional context in comments.

1 Comment

Awesome! I figured it'd be something w a recursive query but I was lost on the implementation. Thanks for the indepth answer.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.