Segregating data or using UNIQUE index for optimization

Solution for Segregating data or using UNIQUE index for optimization
is Given Below:

I have a table;

Orders
* id INT NN AN PK
* userid INT NN
* is_open TINYINT NN DEFAULT 1
* amount INT NN
* desc VARCHAR(255)

and the query SELECT * FROM orders WHERE userid = ? AND is_open = 1; that I run frequently. I would like to optimize the database for this query and I currently have two options;

  • Move closed orders (is_open = 0) to a different table since currently open orders will be relatively smaller than closed orders thereby minimizing rows to scan on lookup
  • Set a unique key constraint: ALTER TABLE orders ADD CONSTRAINT UNIQUE KEY(id, userid);

I don’t know how the latter will perform and I know the former will help performance but I don’t know if it’s a good approach in terms of best practices.

Any other ideas would be appreciated.

The table is of orders; there can be multiple open/closed orders for each userid.

  • WHERE userid = ? AND is_open = 1 would benefit from either of these ‘composite’ indexes: INDEX(userid, is_open) or INDEX(is_open, user_id). The choice of which is better depends on what other queries might benefit from one more than the other.

  • Moving “closed” orders to another table is certainly a valid option. And it will help performance. (I usually don’t recommend it, only because of the clumsy code needed to move rows and/or to search both tables in the few cases where that is needed.)

  • I see no advantage with UNIQUE(id, userid). Presumably id is already “unique” because of being the PRIMARY KEY? Also, in a composite index, the first column will be checked first; that is what the PK is already doing.

  • Another approach… The AUTO_INCREMENT PK leads to the data BTree being roughly chronological. But you usually reach into the table by userid? To make that more efficient, change PRIMARY KEY(id), INDEX(userid) to PRIMARY KEY(userid, id), INDEX(id). (However… without knowing the other queries touching this table, I can’t say whether this will provide much overall improvement.)

  • This might be even better:

      PRIMARY KEY(userid, is_open, id),  -- to benefit many queries
      INDEX(id)   -- to keep AUTO_INCREMENT happy
    
  • The cost of an additional index (on the performance of write operations) is usually more than compensated for by the speedup of Selects.

Setting a unique index on id and user_id will gain you nothing since the id is already uniquely indexed as a primary key, and doesn’t feature in your query anyway.

Moving closed orders to a different table will give some performance improvement, but since the closed orders are probably distributed throughout the table, that performance improvement won’t be as great as you might expect. It also carries an administrative overhead, requiring that orders be moved periodically, and additional complications with reporting.

Your best solution is likely to be to add an index on user_id so that MySQL can go straight to the required User Id and search only those rows. You might get a further boost by indexing on user_id and is_open instead, but the additional benefit is likely to be small.

Bear in mind that each additional index incurs a performance penalty on every table update. This won’t be a problem if your table is not busy.