Create postgres index for table with inner join in RubyOnRails

Question

I have an app based on RubyOnRails 4.0. I have two models: Stores and Products. There are about 1.5 million products in the system, which makes it quite slow if I do not use indices properly.

Some basic info

Store has_many Products
Store.affiliate_type_id is used where 1=Affiliated 2=Not affiliated
Products have attributes like "category_connection_id" (integer) and "is_available" (boolean)

In FeededProduct model:

scope :affiliated, -> { joins(:store).where("stores.affiliate_type_id = 1") }

This query takes about 500ms which basically interrupts the website:

FeededProduct.where(:is_available => true).affiliated.where(:category_connection_id => @feeded_product.category_connection_id)

Corresponding PostgreSQL query:

FeededProduct Load (481.4ms)  SELECT "feeded_products".* FROM "feeded_products" INNER JOIN "stores" ON "stores"."id" = "feeded_products"."store_id" WHERE "feeded_products"."is_available" = 't' AND "feeded_products"."category_connection_id" = 345 AND (stores.affiliate_type_id = 1)

Update: PostgreSQL EXPLAIN output:

                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Hash Join  (cost=477.63..49176.17 rows=21240 width=1084)
   Hash Cond: (feeded_products.store_id = stores.id)
   ->  Bitmap Heap Scan on feeded_products  (cost=377.17..48983.06 rows=38580 width=1084)
         Recheck Cond: (category_connection_id = 5923)
         Filter: is_available
         ->  Bitmap Index Scan on cc_w_store_index_on_fp  (cost=0.00..375.25 rows=38580 width=0)
               Index Cond: ((category_connection_id = 5923) AND (is_available = true))
   ->  Hash  (cost=98.87..98.87 rows=452 width=4)
         ->  Seq Scan on stores  (cost=0.00..98.87 rows=452 width=4)
               Filter: (affiliate_type_id = 1)
(10 rows)

Question: How can I create an index that will take the inner join into consideration and make this faster?

Thanks, but I could not find a good method to do that in RubyOnRails. Any advice here?
– Christoffer, Commented Nov 14, 2016 at 15:12
The thing is that I don't use PostgreSQL directly, but just indirectly through RubyOnRails. I am really not all that good at db administration so I use Rails commands. In this case I used .explain, and there doesn't seem to be any .explain_and_analyze or the like.
– Christoffer, Commented Nov 14, 2016 at 15:29
Learning new things is fun. Isn't it fun? This is supposed to be fun, dang it. ;-)
– Mike Sherrill 'Cat Recall', Commented Nov 14, 2016 at 18:07

Community · Accepted Answer · 2020-06-20 09:12:55Z

That depends on the join algorithm that PostgreSQL chooses. Use EXPLAIN on the query to see how PostgreSQL processes the query.
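To find out which case applies, look for the join node in the plan text. The following is a minimal Ruby sketch, not part of the original answer: `join_algorithms` is a hypothetical helper that scans EXPLAIN output (for example the string returned by Rails' `.explain`) for the three join node names PostgreSQL prints. The sample plan is abbreviated from the one in the question.

```ruby
# Hypothetical helper: lists the join algorithms named in EXPLAIN text.
# Matches node names such as "Nested Loop", "Hash Join", "Merge Left Join".
JOIN_NODE = /\b(Nested Loop|Hash(?: \w+)* Join|Merge(?: \w+)* Join)\b/

def join_algorithms(plan_text)
  plan_text.scan(JOIN_NODE).flatten.map do |node|
    case node
    when /\ANested Loop/ then :nested_loop
    when /\AHash/        then :hash_join
    else                      :merge_join
    end
  end.uniq
end

# Abbreviated plan from the question.
plan = <<PLAN
 Hash Join  (cost=477.63..49176.17 rows=21240 width=1084)
   Hash Cond: (feeded_products.store_id = stores.id)
   ->  Bitmap Heap Scan on feeded_products  (cost=377.17..48983.06 rows=38580 width=1084)
   ->  Hash  (cost=98.87..98.87 rows=452 width=4)
         ->  Seq Scan on stores  (cost=0.00..98.87 rows=452 width=4)
PLAN

p join_algorithms(plan)  # prints [:hash_join]
```

For the plan in the question this reports a hash join, so the "hash join" case below is the relevant one. In Rails, any index that turns out to be missing would normally be added with `add_index` in a migration.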

These are the answers depending on the join algorithm:

nested loop join

Here you should create an index on the join condition for the inner relation (the bottom table in the EXPLAIN output). You may further improve things by adding columns that appear in the WHERE clause and significantly improve selectivity (i.e., significantly reduce the number of rows filtered out during the index scan).

For the outer relation, an index on the columns that appear in the WHERE clause will speed up the query if these conditions filter out most of the rows in the table.

hash join

Here it helps to have indexes on both tables on those columns in the WHERE clause where the conditions filter out most of the rows in the table.

merge join

Here you need indexes on the columns in the merge condition to allow PostgreSQL to use an index scan for sorting. Additionally, you can append columns that appear in the WHERE clause.

Always test with EXPLAIN if your indexes get used. If not, odds are that either they cannot be used or that using them would make the query slower than a sequential scan, e.g. because they do not filter out enough rows.
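If you only work through Rails, one way to get actual timings rather than estimates is to send EXPLAIN ANALYZE through the database connection yourself. This is a sketch under assumptions: `explain_analyze_sql` and `explain_analyze` are hypothetical helper names, and the result rows are assumed to behave like the PostgreSQL adapter's result (enumerable hashes keyed by "QUERY PLAN"). Keep in mind that ANALYZE really executes the statement.

```ruby
# Hypothetical helpers, not part of Rails: wrap a query in EXPLAIN ANALYZE.
# ANALYZE executes the statement, so only use this on SELECT queries.

# Builds the EXPLAIN statement for a SQL string.
def explain_analyze_sql(sql, buffers = true)
  options = ["ANALYZE"]
  options << "BUFFERS" if buffers
  "EXPLAIN (#{options.join(', ')}) #{sql}"
end

# Runs the statement on the given connection and returns the plan as text.
# `relation` only needs to respond to `to_sql`, as ActiveRecord relations do.
def explain_analyze(connection, relation)
  rows = connection.execute(explain_analyze_sql(relation.to_sql))
  rows.map { |row| row["QUERY PLAN"] }.join("\n")
end

# Usage in a Rails console (assumes the models from the question):
#   relation = FeededProduct.where(:is_available => true).affiliated
#   puts explain_analyze(ActiveRecord::Base.connection, relation)
```

Comparing the actual row counts and times per node with the estimates shows where the 500ms go, and whether a new index is picked up at all.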

edited Jun 20, 2020 at 9:12

CommunityBot

answered Nov 14, 2016 at 11:41

Laurenz Albe

5 Comments

Christoffer Over a year ago

Thanks Laurenz, I never really used explain so I didn't think about it. I added the output to my question but to be honest, it doesn't say much to me since I don't really know how to interpret it. Could you give me a hand?

Laurenz Albe Over a year ago

You need the EXPLAIN output from the production system. On the test system, it looks like everything is fine (only 453 rows in the inner table, index used for outer table).

Christoffer Over a year ago

I re-did it for the production system. Note that FeededProducts has about 1.5M rows, whereas Stores has some 1,200 (with the affiliate filter, probably 453). The loading of FeededProducts is the one that takes time.

Laurenz Albe Over a year ago

If the plan is the same, and stores is small, the query is probably as good as can be. You cannot avoid fetching all matching rows from feeded_products, right?

Christoffer Over a year ago

I guess not. Thanks, I will investigate this a bit further. The entire code is old and there might be workarounds that do not rely on indexing.
