-
-
Notifications
You must be signed in to change notification settings - Fork 371
Description
Shlink version
4.5.2
PHP version
8.3
How do you serve Shlink
Docker image
Database engine
MySQL
Database version
8.4.5
Current behavior
I have shlink setup in a high loaded environment, and i noticed one particular query pattern bloat the db and eventually crash or slow down.
The query is as below. Since original_url is not indexed, the query makes full table scan and create a load. If multiple queries appear at the same time, they saturate all cpu cores.
Here is the relevant file and line
SELECT
s0_.id AS id_0,
s0_.original_url AS original_url_1,
s0_.short_code AS short_code_2,
s0_.date_created AS date_created_3,
s0_.valid_since AS valid_since_4,
s0_.valid_until AS valid_until_5,
s0_.max_visits AS max_visits_6,
s0_.import_source AS import_source_7,
s0_.import_original_short_code AS import_original_short_code_8,
s0_.title AS title_9,
s0_.title_was_auto_resolved AS title_was_auto_resolved_10,
s0_.crawlable AS crawlable_11,
s0_.forward_query AS forward_query_12,
s0_.domain_id AS domain_id_13,
s0_.author_api_key_id AS author_api_key_id_14
FROM
short_urls s0_
INNER JOIN short_urls_in_tags s2_ ON s0_.id = s2_.short_url_id
INNER JOIN tags t1_ ON t1_.id = s2_.tag_id
AND (t1_.name = 'kvacc8280')
INNER JOIN short_urls_in_tags s4_ ON s0_.id = s4_.short_url_id
INNER JOIN tags t3_ ON t3_.id = s4_.tag_id
AND (t3_.name = 'kivacrm')
INNER JOIN short_urls_in_tags s6_ ON s0_.id = s6_.short_url_id
INNER JOIN tags t5_ ON t5_.id = s6_.tag_id
WHERE
s0_.original_url = 'https://myapp.com/?q=...'
GROUP BY
s0_.id,
s0_.original_url,
s0_.short_code,
s0_.date_created,
s0_.valid_since,
s0_.valid_until,
s0_.max_visits,
s0_.import_source,
s0_.import_original_short_code,
s0_.title,
s0_.title_was_auto_resolved,
s0_.crawlable,
s0_.forward_query,
s0_.domain_id,
s0_.author_api_key_id
HAVING
COUNT(t5_.id) = 2
LIMIT
1;Expected behavior
Since the url is a text column and it can be very long, it is not clever to index it but maybe create a hash string, index it, and query it. I would recommend creating/indexing a generated virtual column such as sha256(original_url), and use it instead of this.
-- create the column and the index
ALTER TABLE short_urls
ADD COLUMN original_url_hash BINARY(32) AS (UNHEX(SHA2(original_url, 256))) STORED,
ADD INDEX idx_original_url_hash (original_url_hash);
-- query hash instead of the url
SELECT ... FROM short_urls WHERE url_hash = UNHEX(SHA2('a long url', 256));Another recommendation in addition to this, adding a configurable (or maybe hardcoded) max execution time limit to these kind of select queries by using (e.g. /*+ MAX_EXECUTION_TIME(5000) */), to prevent executing these queries for a very long time not to crash the db.
Note that, some people (like me :)) might be using a shared database for shlink and other apps.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status