
Reducing MySQL query execution time significantly can involve a combination of optimizing your query, database schema, and server configuration. Here are several strategies to consider:
1. Analyze the Query Execution Plan
- Use the EXPLAIN statement to analyze how MySQL executes your query. This will help you identify bottlenecks, such as full table scans or inefficient joins.
```sql
EXPLAIN SELECT ...; -- Your query here
```
2. Optimize Joins
- Limit the Number of Joins: Evaluate if all joins are necessary. Consider if you can reduce the number of tables being joined.
- Use Appropriate Join Types: Ensure you’re using the most efficient join type (INNER JOIN vs. LEFT JOIN) based on your requirements.
3. Indexing
- Composite Indexes: Instead of individual indexes, consider creating composite indexes on multiple columns that are often used together in WHERE clauses or JOIN conditions.
- Covering Indexes: Ensure that your indexes cover all the columns used in your query (SELECT, WHERE, JOIN).
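As a minimal sketch (the orders table and its columns here are hypothetical, not taken from your schema), composite and covering indexes might be created like this:
```sql
-- Hypothetical table: adjust names to your own schema.
-- Composite index matching a WHERE filter plus a JOIN/sort column.
CREATE INDEX idx_orders_status_created ON orders (status, created_at);

-- Covering index: the query below can be answered from the index alone,
-- because every column it touches (customer_id, status, total) is in the index.
CREATE INDEX idx_orders_customer_cover ON orders (customer_id, status, total);
SELECT status, total FROM orders WHERE customer_id = 42;
```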
4. Limit Result Set
- Use LIMIT to reduce the amount of data being processed if you only need a subset of the results.
```sql
SELECT ... LIMIT 100; -- Adjust as necessary
```
5. Optimize WHERE Clauses
- Ensure that your WHERE clauses are sargable (search-argument-able), which means they can take advantage of indexes. Avoid functions on indexed columns in WHERE clauses.
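For illustration (hypothetical orders table), the first form below is not sargable because the indexed column is wrapped in a function, while the second can use an index range scan on created_at:
```sql
-- Not sargable: the function on the indexed column forces a full scan.
SELECT id FROM orders WHERE YEAR(created_at) = 2023;

-- Sargable rewrite: the bare column lets an index on created_at be used.
SELECT id FROM orders
WHERE created_at >= '2023-01-01' AND created_at < '2024-01-01';
```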
6. Database Schema Optimization
- Normalization: Ensure your database schema is properly normalized to reduce redundancy but also consider denormalization for performance if necessary.
- Partitioning: For very large tables, consider partitioning to improve query performance.
7. Query Caching
- Utilize the MySQL query cache if your data doesn’t change frequently; it can significantly speed up repeated queries. Note, however, that the query cache was deprecated in MySQL 5.7 and removed in MySQL 8.0, so this only applies to older versions.
8. Server Configuration
- Increase Memory Allocation: Ensure that your MySQL server has enough memory allocated for buffers, caches, and other operations.
- Optimize MySQL Configuration: Parameters like innodb_buffer_pool_size, query_cache_size, and others can impact performance.
9. Consider Using Temporary Tables
- If your query involves complex calculations or aggregations, consider breaking it down into smaller queries and using temporary tables to store intermediate results.
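A rough sketch of this approach (table and column names are made up): compute the expensive aggregate once into a temporary table, then join against it.
```sql
-- Intermediate result: one row per customer with their spend.
CREATE TEMPORARY TABLE tmp_order_totals AS
SELECT customer_id, SUM(total) AS total_spent
FROM orders
WHERE created_at >= '2023-01-01'
GROUP BY customer_id;

-- The final query joins against the much smaller intermediate table.
SELECT c.name, t.total_spent
FROM customers c
JOIN tmp_order_totals t ON t.customer_id = c.id
WHERE t.total_spent > 1000;
```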
10. Profile the Query
- Use the SHOW PROFILE command to get detailed timing information about the various stages of your query execution. This can help identify specific areas that are slow.
```sql
SET profiling = 1;
SELECT ...; -- Your query here
SHOW PROFILES;
```
11. Review Data Types
- Ensure that the data types of your columns are appropriate and consistent. Mismatched data types can lead to performance issues.
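One common example of this (the schema below is hypothetical): joining an INT column to a VARCHAR column forces an implicit conversion, which usually prevents index use on the converted side. Aligning the types avoids it.
```sql
-- If users.id is INT but sessions.user_id was declared VARCHAR, the join
-- ON users.id = sessions.user_id cannot use the index on sessions.user_id.
-- Aligning the types removes the implicit conversion:
ALTER TABLE sessions MODIFY user_id INT NOT NULL;
```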
Example of Optimization
Suppose your original query looks like this:
```sql
SELECT a.*, b.*, c.*, d.*, e.*
FROM table_a a
JOIN table_b b ON a.id = b.a_id
JOIN table_c c ON b.id = c.b_id
JOIN table_d d ON c.id = d.c_id
JOIN table_e e ON d.id = e.d_id
WHERE a.status = 'active' AND b.date > '2023-01-01';
```
After applying some of the strategies, it might look like:
```sql
SELECT a.id, b.name, c.value
FROM table_a a
JOIN table_b b ON a.id = b.a_id
JOIN table_c c ON b.id = c.b_id
WHERE a.status = 'active'
  AND b.date > '2023-01-01'
LIMIT 100;
```
Conclusion
By systematically applying these strategies, you should be able to reduce the execution time of your MySQL query significantly. Each database and query is unique, so it may take some trial and error to find the best combination of optimizations.
Optimizing MySQL queries is a fairly straightforward process. Solving your specific problem would require a lot of information not provided: the structure of the database and tables, the configuration of the server engine and storage architecture, load and contention, and network latency on the full round trip, for starters. That said, here are the most basic things to consider:
A 20k row table is quite small. Sub-second results in 6-way joins with million-row tables should be closer to expectations.
1. list tables with smallest expected results first in the 'from' clause
2. avoid functions and computations in where clauses
3. avoid 'distinct', the 'like' operator, and outer joins
4. use 'explain' and 'explain extended' to ensure you have the right indexes and your correlated sub-queries are cacheable
5. if the optimizer is not cooperating (MySQL's index hints only go so far), select the smallest two-table result set into a temporary table, and then join against that denormalized result.
Basically, the idea is to reduce the number of bytes that need to be read & written to resolve the query, and the amount of nested result-set parsing. These 5 steps will go a long way towards reducing the i/o footprint of most multi-table queries.
Once the query itself is reasonably close to optimal, it’s time to look at contention and server configuration. Is the query waiting for other queries to release locks on either the data or the metadata, and is the server spinning on cache stampedes, or touching the disk a lot due to under- or over-allocated buffers and caches? These are more involved problems to solve, but they are also the basis for server efficiency, and can have a deep impact on the performance of the system.
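As a quick check on the lock-wait question raised above (assuming MySQL 5.7 or later with the bundled sys schema), something like this shows who is currently blocked and by whom:
```sql
-- Current InnoDB lock waits: blocked statement, blocking statement, wait time.
SELECT * FROM sys.innodb_lock_waits;

-- And a quick look at what every connection is doing right now.
SHOW FULL PROCESSLIST;
```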
Looks like you’re on the right track.
It is difficult to diagnose your specific situation without seeing the query and knowing how many tables are involved in the query, or how many rows each table contains, etc. Having said that, optimizing a query tends to involve the same steps.
At the company where I work now, we have very stringent standards of query performance, like in the sub 1 second range. I have a good idea of what the database developers do to make them that fast, thanks to the back-end developers. The data architect himself, Tom, is a first class A-hole, but the developers who suffer every day working with him do wonderful work.
When it comes to query performance, there are always many factors at play; for instance, network traffic and other connection issues. Therefore, be sure to try optimization techniques locally first, in order to take network factors out of the equation.
Here are a few optimization tricks for you to try:
Query Profiling
You can use query profiling to measure query execution time. Here’s how to do that in MySQL:
1. Start the profiler with
```sql
SET profiling = 1;
```
2. Then execute your query, and list the statements the profiler has recorded with
```sql
SHOW PROFILES;
```
3. You’ll then see a list of queries the profiler has statistics for. Choose which query to examine with the statement
```sql
SHOW PROFILE FOR QUERY 1;
```
…or whatever number is assigned to your query.
You’ll then get a list showing exactly how much time was spent during each stage of the query:
```
mysql> SHOW PROFILE FOR QUERY 1;
+--------------------+----------+
| Status             | Duration |
+--------------------+----------+
| query end          | 0.000107 |
| freeing items      | 0.000008 |
| logging slow query | 0.000015 |
| cleaning up        | 0.000006 |
+--------------------+----------+
4 rows in set (0.00 sec)
```
You can also get a profile of CPU usage:
```
mysql> SHOW PROFILE CPU FOR QUERY 1;
+----------------------+----------+----------+------------+
| Status               | Duration | CPU_user | CPU_system |
+----------------------+----------+----------+------------+
| checking permissions | 0.000040 | 0.000038 | 0.000002   |
| creating table       | 0.000056 | 0.000028 | 0.000028   |
| After create         | 0.011363 | 0.000217 | 0.001571   |
| query end            | 0.000375 | 0.000013 | 0.000028   |
| freeing items        | 0.000089 | 0.000010 | 0.000014   |
| logging slow query   | 0.000019 | 0.000009 | 0.000010   |
| cleaning up          | 0.000005 | 0.000003 | 0.000002   |
+----------------------+----------+----------+------------+
7 rows in set (0.00 sec)
```
The EXPLAIN Command
A good way to see what a query needs in order to perform better is to use the EXPLAIN command. It returns a formatted description of the query optimizer's execution plan for the specified statement. You can use this information to analyze and troubleshoot the query.
By default, EXPLAIN output represents the query plan as a hierarchy in which each level corresponds to a single database operation the optimizer uses to execute the query. It takes a bit of practice to get accustomed to EXPLAIN’s output, but the more you use it, the better you’ll get at understanding where your queries are lacking.
In my Navicat database client, there’s a button in the SQL Editor that runs EXPLAIN for me, and the results are displayed in an easy-to-read grid format.
Analyzing Query Performance using a Monitoring Tool
Finally, I would strongly recommend that you analyze your query performance using a tool like Navicat Monitor. It has a Query Analyzer that shows information about all executing queries. Moreover, it can help identify slow queries and detect deadlocks, i.e. when two or more queries permanently block each other.
The Query Analyzer screen is divided into several sections:
- Latest Deadlock Query: Shows the transaction information of the latest deadlock detected in the selected instance.
- Process List: Displays the total number of running processes for the selected instance, and lists the last 5 processes including ID, command type, user, database and time information.
- Query Analyzer: Displays information about query statements with customizable and sortable columns.
In the end, you’ll likely have to add some indexes to make the query run faster. If that doesn’t do it, you may have to rewrite your query with speed in mind. To do that, start with the core of the query and then build it up from there.
Hope that helps.
Best regards!
Adam
Don't just add indexes on every field that appears in your WHERE clause.
Try EXPLAIN on your MySQL query.
This command will tell you the number of rows being scanned and whether your indexes are being used or not. To optimize a query, we need to avoid full table scans and reduce the number of rows being scanned. You've got 20k rows with 6 joins and it takes about 25 seconds. Make sure you have your indexes on the correct fields (e.g. scanning 2k rows instead of 20k is always going to be quicker). A table with just 20 records might not need indexes. Use your indexes wisely, since every index slows down writes.
Clauses like GROUP BY, ORDER BY, LIKE, and DISTINCT are relatively more time consuming. If you are using them, consider whether that work can be done in your application code instead. Avoid sub-queries. Do not compute inside your query; evaluate values earlier if possible.
If you have decent RAM, try caching if your data doesn't change that frequently.
Eventually, it will be a lot of trial and error to reach an optimized query. Also, your hardware matters: the more power it has, the faster your queries will execute.
I used to PARTITION my MySQL tables with great effect. See if partitioning fits your use case.
In short: sensible indexing + relevant caching + partitioning (if required) should solve your problem. 20K rows is still a very small number.
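If partitioning does fit your case, a range-partitioned table might look like this; the table, columns, and year boundaries are illustrative only:
```sql
-- Queries that filter on logged_at only have to touch the matching partition(s).
CREATE TABLE access_log (
    id BIGINT NOT NULL,
    logged_at DATETIME NOT NULL,
    payload VARCHAR(255),
    PRIMARY KEY (id, logged_at)  -- the partitioning column must be part of every unique key
)
PARTITION BY RANGE (YEAR(logged_at)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```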
I undertake performance optimizations on a daily basis with clients.
Some optimizations are easy, some are not. You need to know your tools, such as EXPLAIN, CREATE TABLE, SHOW INDEXES and the INFORMATION_SCHEMA (I_S), and you need to understand the iterative process of verification.
Recently I tuned a 10-table join for a client that runs 15,000 times per second across hundreds of servers. They had invested a great amount of time in creating indexes, and indeed all 10 tables were using indexes.
By creating better indexes only, with no code changes and no configuration changes, I was able to reduce the time from 175ms to 10ms.
Creating Indexes is only part of the process of query optimization. Knowing how to create better indexes is a topic for an upcoming talk at http://effectiveMySQL.com
20K rows isn’t a lot. We have a bunch of tables that are >10B rows, and our requirement for most queries is 500 milliseconds or less.
Getting your query time reduced should be straightforward.
A few suggestions for high-performance schemas:
- As others have posted, make sure you’re using EXPLAIN.
- Make sure every “leg” of your join is using an index. Ideally, it should be joining on the primary key.
- Use InnoDB and not MyISAM. Nowadays, InnoDB is quite a bit faster than MyISAM, particularly if joins are involved, and especially if you’re joining on the primary key.
- Make sure your InnoDB buffer pool is decently tuned, and the machine isn’t overloaded.
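A rough way to sanity-check that last point (the 4G figure is only an example, not a recommendation):
```sql
-- Current buffer pool size and how its pages are being used.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW STATUS LIKE 'Innodb_buffer_pool_pages%';

-- MySQL 5.7+ can resize the buffer pool online; size it to fit your hot data.
SET GLOBAL innodb_buffer_pool_size = 4 * 1024 * 1024 * 1024;
```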
There’s no way to know for sure unless you tell us what the query is.
But let me give you some pointers.
1. First of all, make sure you add indexes to the fields used in your joins and WHERE clauses. This alone can improve your query performance tenfold.
2. Make sure you’re using INNER JOIN instead of LEFT or RIGHT JOIN where your logic allows it; INNER JOIN is faster than the others.
20k rows is really not that much, so I'd guess it's an indexing issue.
1. Read about normalization of relational databases (RDBMS)
2. Check if you have a primary key defined for each table
3. Check if you have a foreign key defined for each relation between tables
4. If you do not have a foreign key for one of the relations, think about whether your model is correct
5. If you want to understand your problem, imagine two tables, each with 20,000 rows, without an index. Then you can easily calculate that MySQL has to test 20,000 × 20,000 = 400,000,000 combinations :-(
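To make point 5 concrete (hypothetical tables): once the join column is indexed, MySQL can look up matching rows directly instead of testing every combination.
```sql
-- Index the join column and declare the relationship explicitly.
ALTER TABLE order_items
    ADD INDEX idx_order_items_order_id (order_id),
    ADD CONSTRAINT fk_order_items_order
        FOREIGN KEY (order_id) REFERENCES orders (id);
```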
Try using a series of simpler queries - I've sped up similar queries by breaking down the joins and using large 'IN (...)' statements instead. This change alone sped up an address search using wildcard matches from over 30 seconds to just under 0.5 seconds.
The numbers you provide indicate that you may not be running MySQL on a proper server. Perhaps you are using magnetic drives? Those have 5–10 ms seek times and can get very slow if your data is spread across the drive; using an SSD may significantly improve your performance. Another option is doing it in memory: load your whole table into memory (20K rows is not a lot) and use some old-school data structures like hash tables and sorted lists; you can easily get to sub-second performance.
Always keep in mind that a lot also depends on server resources, i.e. RAM, processor, etc.
Also try using temporary tables or views in MySQL; they are quite useful too.
OK - I opened up the query itself…
```sql
SELECT `ringtunes`.*,
  count(case when ringtune_history.Action = 'Download' then ringtune_history.Action end) as Org_Downloads,
  count(case when ringtune_history.Action = 'View' then ringtune_history.Action end) as Org_Views,
  count(case when ringtune_history.Action = 'Play' then ringtune_history.Action end) as Total_Plays,
  count(case when ringtune_history.Action = 'Like' then ringtune_history.Action end) as Total_Likes,
  `categories`.`Name` as `Category_Name`
FROM `ringtunes`
LEFT JOIN `ringtune_history` ON `ringtune_history`.`Ringtune_Id` = `ringtunes`.`Id`
LEFT JOIN `categories` ON `categories`.`Id` = `ringtunes`.`Category`
WHERE `ringtunes`.`Deleted` = 0 AND `ringtunes`.`Status` = 1 AND `categories`.`Deleted` = 0 AND `categories`.`Status` = 1
GROUP BY `ringtunes`.`Id`
ORDER BY `ringtunes`.`Id` DESC
LIMIT 20;
```
The LIMIT 20 doesn’t do anything for you other than restrict the return result rows. It isn’t going to improve the internal performance of the query.
It’s worth mentioning the zeroth rule of DB query optimizers: they do what they’re coded to do, not what you imagine they could do if they were super-smart and clever. Here, you may imagine that LIMIT 20 could apply to the anchor table of the join, ringtunes, and restrict *it* before executing the join. But it doesn’t; MySQL applies LIMIT after the body of the query has been otherwise completely processed. But there is a way to use LIMIT to improve query performance, as I’ll show below…
The GROUP BY will have to do a full sort on the result of all the joins, and the LEFT JOIN means you’re going to end up with a working set whose rowcount is at least equal to all the rows in the ringtunes table. The WHERE clause will filter out some of the rows, and this result set will then be ordered and GROUP’d.
Depending on how big the ringtunes table is, this is a pretty beefy query that does a ton of joining and active sorting. The COUNTs, even if they may look icky, shouldn’t be a problem.
As for improving performance:
- First, fetch the 20 IDs you want separately. The LIMIT 20 will *not* do this.
- Get rid of the ringtunes.*. This is probably an illegal GROUP BY, as the only allowed select-list members in a GROUP BY are explicitly GROUP’d columns (in this case ringtunes.ID) or the results of aggregates like COUNT(). The results you’ll get are going to be random, and unfortunately MySQL allows this sort of stuff.
- I assume that at least one of the joins is 1:many, as otherwise grouping on ringtunes.ID doesn’t make any sense. Even so, make sure you’re joining on indexed columns.
So, a possible new query could be something like
```sql
SELECT ringtunes.ID, -- get rid of ringtunes.* - fetch later if needed
  count(case when ringtune_history.Action = 'Download' then ringtune_history.Action end) as Org_Downloads,
  count(case when ringtune_history.Action = 'View' then ringtune_history.Action end) as Org_Views,
  count(case when ringtune_history.Action = 'Play' then ringtune_history.Action end) as Total_Plays,
  count(case when ringtune_history.Action = 'Like' then ringtune_history.Action end) as Total_Likes,
  `categories`.`Name` as `Category_Name`
FROM (SELECT DISTINCT ringtunes.ID FROM ringtunes
      WHERE ringtunes.deleted = 0 and ringtunes.status = 1
      ORDER BY ringtunes.ID DESC LIMIT 20) AS r_id
JOIN ringtunes ON r_id.id = ringtunes.id
LEFT JOIN `ringtune_history` ON `ringtune_history`.`Ringtune_Id` = `ringtunes`.`Id`
LEFT JOIN `categories` ON `categories`.`Id` = `ringtunes`.`Category`
WHERE `categories`.`Deleted` = 0 AND `categories`.`Status` = 1
-- ringtunes WHERE filters are in the subquery above
GROUP BY `ringtunes`.`Id`
ORDER BY `ringtunes`.`Id` DESC;
```
This may look ickier as it has the initial subquery to fetch your 20 ordered IDs, but it should be faster, and is at least a legal GROUP BY.
If ringtunes.ID is actually the PK for ringtunes, you can get rid of the DISTINCT above.
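Whether these help depends on the existing schema, but indexes along these lines would support the rewritten query. The column names come from the query above; the index choices themselves are an assumption, so verify them with EXPLAIN:
```sql
-- Lets the derived table filter on Deleted/Status and walk Id in descending order.
CREATE INDEX idx_ringtunes_deleted_status_id ON ringtunes (Deleted, Status, Id);

-- Lets the LEFT JOIN find a ringtune's history rows and count them by Action.
CREATE INDEX idx_history_ringtune_action ON ringtune_history (Ringtune_Id, Action);

-- categories.Id is presumably already the primary key, so the join to
-- categories should be fine without an extra index.
```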
- Proper structure of tables and normalization
- Sometimes (very rarely) denormalization, or creation of reporting tables (let's say a count and last-access date for users, updated on login)
- Creation of proper indexes
- Making the SQL engine work with specific indexes in your queries. Simple short queries are often slower than longer queries where you guide the engine by forcing it to use particular indexes based on the conditions you write (a sketch follows below)
Most important is to know relational database theory basics and indexes, and how to utilize them. The last point, which gives the magnitudes-faster queries, takes years of experience to learn.
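On point 4: MySQL does let you steer the optimizer with index hints. A minimal sketch, with a hypothetical table, index, and columns:
```sql
-- Tell the optimizer to consider only the named index for this table.
SELECT o.id, o.total
FROM orders o FORCE INDEX (idx_orders_status_created)
WHERE o.status = 'active'
  AND o.created_at > '2023-01-01';
```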
Some possibilities:
1. The INSERT itself may be slow. If you have a really huge table with lots of secondary indexes, adding new rows could be slow (although if the server is otherwise not busy, a single-row INSERT shouldn't take more than a few dozen milliseconds on reasonably modern, healthy hardware even if the table is huge). Note that adding the data to the table isn't the slow part as much as finding out where in the indexes the data needs to go - in other words, INSERT can actually be a READ-intensive operation if you have a lot of indexes, particularly if they're "cold" (ie, not in the buffer pool).
2. There may be an issue with I/O. INSERT in MySQL involves updates to the xact log and in-memory changes to buffer pool pages. Base tables and indexes are typically worked on in RAM and flushed to disk over time (at least in InnoDB). Once the INSERTing transaction commits (and immediately if you have AUTOCOMMIT=ON), the INSERT's db operation will hit the xact/redo log.
3. You may have a very busy server and don't have enough threads configured for what you're doing with the server. Check the variable "innodb_thread_concurrency" - if it's a lot smaller than the number of routinely active database connections, you may need to increase it (or just put up with slow inserts if you can't increase it). MySQL will queue database operations and not finish them until other operations have "cleared" if the server is busy in a manner similar to how an OS multitasks processes on a CPU.
4. There may be other things locking the table. This is more likely the case with MyISAM, but I've seen it happen with InnoDB on occasion; there appear to be cases where a table or page can effectively be locked and INSERTs will wait for these operations to clear.
I’m not exactly sure how “PDO” works, but a few things can help at the application level:
- If you’re routinely loading large numbers of records, use a “multi-row” insert statement, i.e.
```sql
INSERT INTO MYTAB (colnames) VALUES
  (row1), (row2), …, (rowN);
```
You would just build up a huge string in your client code and send the string in a single “execsql()” call (or whatever it is in your programming language). Note that if you’re building a very long string, you may need to change max_allowed_packet to allow longer query strings.
I’ve found that an insert of a thousand rows can be nearly as fast as inserting a single row in many databases, so this approach is almost “free”.
- If you can’t do the above, at least do your INSERTs in a large transaction. Most database engines - including MySQL - have some notion of AUTOCOMMIT, and in most of them, it’s always “on”, which means every INSERT has an implied COMMIT. To only COMMIT at the end of your load, either do AUTOCOMMIT OFF, or (my recommended way), do
```sql
START TRANSACTION;
<insert1>;
<insert2>;
…
<insertN>;
COMMIT;
```
(This will work whether AUTOCOMMIT is ON or not.)
This is often several dozen times faster than having a COMMIT - implicit or otherwise - after every INSERT.
If possible, you may be able to combine the above approaches to load your data quickly.
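Putting the two ideas together, a batched load might look like the sketch below (hypothetical table and values; in practice the client code builds the VALUES lists as strings and sends them in chunks):
```sql
START TRANSACTION;

-- Each statement inserts many rows in one round trip.
INSERT INTO MYTAB (col1, col2) VALUES
    (1, 'a'),
    (2, 'b'),
    (3, 'c');

INSERT INTO MYTAB (col1, col2) VALUES
    (4, 'd'),
    (5, 'e');

-- One commit at the end instead of one per row.
COMMIT;
```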
I'm still learning about MySQL, so feel free to correct me if some of these are inaccurate. These are some tips that I picked up from the folks at Percona:
- Disk is too slow
Are you RAIDing your drives? Using RAID10 is useful not just for performance, but also for resilience against drive failures. If you control your hardware, also consider upgrading to SSD or FusionIO.
Link: http://www.mysqlperformanceblog.com/2009/05/01/raid-vs-ssd-vs-fusionio/
- Disk is overloaded
If you are running a logging database on the same machine as your production database, then both those databases will have to compete for the same resources. If possible, break your database apart into more functional units to prevent contention.
- InnoDB transactions are flushing too often
If you are using InnoDB, consider setting innodb_flush_log_at_trx_commit to 2 (a sketch appears after this list). Transaction flushes default to 1 (write and flush after every commit). Setting the flush log variable to 2 will not flush as aggressively, but will cost some reliability.
If you’re not concern [sic] about ACID and can loose [sic] transactions for last [sic] second or two in case of full OS crash than set this value. It can dramatic [sic] effect especially on a lot of short write transactions.
Link: http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/
- InnoDB log file is too small
The default for the InnoDB log file size is an extremely small, conservative value so that it is guaranteed to work on any system, but not work efficiently on most systems.
Link: http://www.mysqlperformanceblog.com/2008/11/21/how-to-calculate-a-good-innodb-log-file-size/
- Tables are too large
Can your largest tables be broken apart? Breaking your million-row tables into partitions will effectively reduce the algorithmic complexity of table operations, as well as make more efficient use of MySQL's buffers, at the expense of making JOINs and SELECTs across partition boundaries more expensive.
Link: http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
Link: http://www.mysqlperformanceblog.com/2006/06/09/why-mysql-could-be-slow-with-large-tables/
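For reference, the two InnoDB settings mentioned in this list can be inspected and (in the first case) changed like this; as noted above, relaxing flushing is a durability trade-off, not a blanket recommendation:
```sql
-- Flush the redo log roughly once per second instead of at every commit.
SET GLOBAL innodb_flush_log_at_trx_commit = 2;

-- Check the redo log size; changing innodb_log_file_size requires editing
-- my.cnf and restarting the server on the MySQL versions discussed here.
SHOW VARIABLES LIKE 'innodb_log_file_size';
```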
MySQL 5.7 is painfully slow and lacks many of the tools that I’m used to on Oracle systems. I’m stuck with it until my ISP goes to something newer/better. I take a defensive approach.
- Try to keep your table joins to two tables when possible. 2nd normal form may perform better, but requires you to do more maintenance to keep everything in sync.
- Use EXPLAIN (EXPLAIN PLAN in Oracle terms) often. Just because you add an index doesn’t mean the optimizer will use it.
- When you find a join that works, create a view and use it instead. You don’t want to reinvent the wheel.
- If you need to do a three way join, try to join two of the tables first through a view. That may give you some control on the join order.
- Consider creating intermediate results tables. Again, in a three way join situation, you want to reduce as many rows as you can. You might have to break it up into two steps. You can do your order by as a separate step as well.
- Consider changing ISP hosts or upgrading your hardware. You might not be able to change your application to get better performance.
- If you have a choice, check out MySQL 8. It has analytic functions and may run faster. But if you are dependent on an ISP, you’re out of luck.
Indexes (including the PK), triggers, or constraints are likely the problem. Each of those may do some work on an insert. The slightly outdated book The Art of SQL qualitatively outlines some of the impact of indexes. The author's findings on an Oracle DB were that even two indexes slowed insert speed 2-4x. Batching many inserts into a single transaction can help.
(Asked to answer)
I'll answer from the perspective of SQL Server queries. Other database management systems operate similarly enough that the general points of the answer still apply, but the details will differ.
SQL is a so-called declarative programming language. This means that the developer doesn't tell the computer HOW to perform the task, but rather tells the computer WHAT RESULTS they want and then the computer figures out how to achieve those results.
If a query is slow, there can be multiple broad categories of reasons.
Perhaps you just asked for a huge load of work. Doing some complex arithmetic on all five billion sales records of the past five years will generally never be faster than at least the time it takes to read this five billion rows of data from disk (unless someone anticipated the question and did some smart prep work).
Perhaps the database is not properly designed for performance. Optimal performance means making very smart decisions on your indexing strategy, so that a high number of important queries can reap the benefits of a small set of indexes. And sometimes you'll need to tell users that certain queries simply cannot be optimized without causing too many unwanted side effects.
And perhaps the database management system made a bad choice when deciding how to execute the query. This can be infuriating because we have very limited options to influence these choices. But on the other hand, once we know how and why such bad decisions are made it often becomes fairly trivial to fix the actual underlying root cause.
For the second and third categories, it is absolutely essential that you learn to read and understand the execution plan (for SQL Server), or its equivalent (for other databases). For SQL Server, I highly recommend reading “SQL Server execution plans, 3rd edition” (not the earlier editions, please), which is available as a free download from RedGate, or for money as a printed copy from Amazon.
Not a MySQL expert, but a T-SQL one, and from my perspective it depends.
- Do you have joins? If so…
- Verify the data-type of the fields you’re comparing to.
- Does the table have indexes? If so… What type of index?
- Are you querying with linked servers?
- Are there useless fields?
- Any useless calculations?
- Verify calculated fields. (Maybe you are converting instead of casting?)
If none of this works, go to the execution plan (in MySQL, that's the EXPLAIN statement).
Use the EXPLAIN statement to see the query plan. The explained plan shows how the table(s) are scanned and what indexes are used. The indexes are used to JOIN tables or filter WHERE clauses. Any full table scans are good candidates for new indexes.
Use the ANALYZE TABLE statement to update planner statistics. This records approximately how many rows a table has and tracks the key cardinality used by each index; e.g. an age column with only 5 distinct values (17, 18, 19, 20, 21) is not a good index for pulling many rows.
- See EXPLAIN Join Types
- See possible_keys - the indexes available
- See key - the index selected for use by the query planner
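A minimal walk-through of those two statements (the tables here are hypothetical):
```sql
-- Refresh the optimizer's statistics for the table.
ANALYZE TABLE orders;

-- Then read the plan: check type (join type), possible_keys, key, and rows
-- for each table; a type of ALL means a full table scan.
EXPLAIN
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'active';
```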
First: try to normalize the database. Database normalization is very important when you reach a certain number of rows.
Second: always use numbers. If you have to run SELECTs and you have to order by a field or put a condition on a field, try to encode that field as a number.
Third: indexes. That's your word: indexes. Every field you use in a WHERE or in an ORDER BY must be indexed.
Fourth: sometimes it is better to do some of the work in application code rather than making the database core do everything.
I've worked with a table with 50 million rows (yes, 50 million), and I joined that table with two other tables of 8 million rows each. The time was less than a second.
Sorry for my spelling mistakes.
I have seen some good answers, so I’d like to add a few observations of my own:
- As Greg Kemnitz suggested, avoid SELECT ‘*’ queries; they waste time and resources, since rendering the full result takes a lot of time and most of it is not needed;
- Here I add a note: using a view instead of a table is a useful shortcut; if you repeat a lot of similar queries the system will not cache them all, while a view can help consistently;
- As Mark Twaine suggested, the development phase is a time when you struggle to get results, while real life is another thing; as with the first point, you have to clean up your code and your SQL calls to avoid bottlenecks;
- In addition to this latter point, there are tools to assess where you may have room to improve, like the EXPLAIN command or mysqlcheck, which examine logs and queries so you can adjust configuration variables to fit your needs; but first revise your code and your SQL calls, since you know best where there may be room for improvement;
- As in other environments, assess each change, since it may or may not be a benefit; for this you can use your system’s ‘time’ function;
- My last suggestion is to write your queries cleanly, not as a matter of elegance but to show that you know what you are doing and that your code is stable and mature.
Standard answer valid for most relational databases:
1. Tweak database parameters, use GOOD judgment and read the documentation properly before you do it.
2. Define proper indices on your database tables, but do not overindex. Overindexing can be a killer.
3. Evaluate your queries by examining execution plans and how the data are structured (table cardinalities, column cardinalities, “soft” relationships etc). Rewrite queries and/or restructure your code where necessary. In particular:
4. If a task can be accom...
It depends on what you mean by "with" an OR condition…
I.e.
colA = xxx AND colB IN (111, 222, 333, …)
colA BETWEEN xxx AND yyy OR colA …
colA … OR colB …
…
How complex is the selection, what are the cardinalities involved….
Some situations will be assisted by the creation of multi column indexes, others would benefit from indexes on separate columns…
As always look at the statistics of the data involved and test the performance of possible Indexes…
You should also consider whether consolidation of the OR conditions is possible, e.g. using BETWEEN or range conditions, or CASE/grouping tables to conform the data being selected… and appropriate use of subqueries or even UNION queries to accomplish the desired processing…
There are lots of ways to do query optimization for MySQL. Here are three:
- Turn on the slow query log in MySQL. This gets you a specific log file of every query that isn’t optimal. Setting this option can be tricky so my advice is to set it and then explicitly run a crappy query to make certain that the logging is actually turned on. You may have to do this a few times to be certain.
- Understand that MySQL uses a single index for query and ordering so if you want to query by one column and order by another both those columns need to be present in the index. This is probably the most common mistake people make.
- Run the EXPLAIN option on your query, i.e. EXPLAIN SELECT * FROM articles WHERE posted=1; EXPLAIN runs the query planner and shows you the result. Be particularly careful when you see “Using filesort”, which means MySQL can’t get the ordering from an index and has to run a separate sort pass, possibly spilling to a file on disk. Anytime the sort isn’t done in memory, well, that’s a problem.
- Look into development environment specific performance tools like New Relic which often have query monitoring.
- Bear in mind that date queries when you store date-times, such as SELECT * FROM articles WHERE DATE(created_at) = '2017-06-01', get translated into range queries under the hood, and range queries are expensive. I often store both a created_at and a date_created_at value on my databases to get around this, since usually date queries are what people want more than date-time queries (see the sketch after this list).
- If you really have performance issues then consider sharding data across multiple tables but that’s an advanced approach which is harder to discuss / describe.
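Putting the slow query log, the combined filter/sort index, and a date-range rewrite into one minimal sketch; the `articles` table, its columns, and the index name are assumptions for illustration:

```sql
-- Enable the slow query log (also settable in my.cnf); long_query_time is in seconds.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

-- MySQL will generally use one index per table reference, so cover both the
-- filter column and the ordering column with a single composite index.
ALTER TABLE articles ADD INDEX idx_posted_created (posted, created_at);

-- Check the plan: look for the index above in the "key" column and watch
-- for "Using filesort" in "Extra".
EXPLAIN SELECT * FROM articles WHERE posted = 1 ORDER BY created_at DESC;

-- A range comparison keeps the index on created_at usable, unlike DATE(created_at) = '2017-06-01'.
SELECT * FROM articles
WHERE created_at >= '2017-06-01' AND created_at < '2017-06-02';
```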
60M isn't that much if a table is well-indexed and your mysql instance is well-provisioned and configured correctly. We have many tables that are 20x bigger than this that support queries that return in a few dozen milliseconds.
A lot depends on your queries - some queries can't be easily tuned - and a lot depends on how you're indexing or partitioning your table.
Also, InnoDB is much faster than MyISAM for the vast majority of operations, particularly if indexes are used, so make sure you're using InnoDB.
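If you are unsure which engine your tables use, a quick sketch of the check and the conversion (the table name is hypothetical, and rebuilding a large table takes time and disk space):

```sql
-- See which storage engine each table in the current database uses.
SELECT table_name, engine
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = DATABASE();

-- Convert a MyISAM table to InnoDB (rebuilds the table).
ALTER TABLE big_table ENGINE = InnoDB;
```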
Hi,
It's not clear from your question what kind of performance issue you're talking about. But let me give you some general pointers:
- You could try to build custom indexes which would cater to your select statements
- If your data is partitionable, then partition the table (see the sketch after this list)
- If possible have a DBA shrink/compact the tablespace. Trust me, it works wonders.
- Use subquery refactoring if possible.
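To illustrate the partitioning suggestion above, here is a minimal sketch; the table, columns, and year boundaries are hypothetical, and RANGE partitioning only pays off when your queries filter on the partitioning key:

```sql
-- Hypothetical log table partitioned by year; queries that filter on created_year
-- only touch the relevant partition(s) (partition pruning).
CREATE TABLE access_log (
  id BIGINT NOT NULL AUTO_INCREMENT,
  created_year SMALLINT NOT NULL,
  message VARCHAR(255),
  PRIMARY KEY (id, created_year)   -- the partitioning column must be part of every unique key
)
PARTITION BY RANGE (created_year) (
  PARTITION p2022 VALUES LESS THAN (2023),
  PARTITION p2023 VALUES LESS THAN (2024),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```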
Query optimization includes various steps. A few of them are as follows:
Optimizing SELECT Statements
Queries, in the form of SELECT statements, perform all the lookup operations in the database. Tuning these statements is a top priority, whether to achieve sub-second response times for dynamic web pages, or to chop hours off the time to generate huge overnight reports.
Besides SELECT statements, the tuning techniques for queries also apply to constructs such as CREATE TABLE...AS SELECT, INSERT INTO...SELECT, and WHERE clauses in DELETE statements. Those statements have additional performance considerations because they combine write operations with the read-oriented query operations.
In MySQL NDB Cluster 7.2 and later, the NDB storage engine supports a join pushdown optimization whereby a qualifying join is sent in its entirety to NDB Cluster data nodes, where it can be distributed among them and executed in parallel. For more information about this optimization, see Conditions for NDB pushdown joins.
The main considerations for optimizing queries are:
- To make a slow SELECT ... WHERE query faster, the first thing to check is whether you can add an index. Set up indexes on columns used in the WHERE clause, to speed up evaluation, filtering, and the final retrieval of results. To avoid wasted disk space, construct a small set of indexes that speed up many related queries used in your application.
- Indexes are especially important for queries that reference different tables, using features such as joins and foreign keys. You can use the EXPLAIN statement to determine which indexes are used for a SELECT.
- Isolate and tune any part of the query, such as a function call, that takes excessive time. Depending on how the query is structured, a function could be called once for every row in the result set, or even once for every row in the table, greatly magnifying any inefficiency.
- Minimize the number of full table scans in your queries, particularly for big tables.
- Keep table statistics up to date by using the ANALYZE TABLE statement periodically, so the optimizer has the information needed to construct an efficient execution plan.
- Learn the tuning techniques, indexing techniques, and configuration parameters that are specific to the storage engine for each table. Both InnoDB and MyISAM have sets of guidelines for enabling and sustaining high performance in queries.
- Avoid transforming the query in ways that make it hard to understand, especially if the optimizer does some of the same transformations automatically.
- If a performance issue is not easily solved by one of the basic guidelines, investigate the internal details of the specific query by reading the EXPLAIN plan and adjusting your indexes, WHERE clauses, join clauses, and so on. (When you reach a certain level of expertise, reading the EXPLAIN plan might be your first step for every query.)
- Adjust the size and properties of the memory areas that MySQL uses for caching. With efficient use of the InnoDB buffer pool, MyISAM key cache, and the MySQL query cache, repeated queries run faster because the results are retrieved from memory the second and subsequent times.
- Even for a query that runs fast using the cache memory areas, you might still optimize further so that they require less cache memory, making your application more scalable. Scalability means that your application can handle more simultaneous users, larger requests, and so on without experiencing a big drop in performance.
- Deal with locking issues, where the speed of your query might be affected by other sessions accessing the tables at the same time.
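A minimal sketch tying together the points above about indexing the WHERE/JOIN columns, keeping statistics fresh with ANALYZE TABLE, and reading the EXPLAIN plan; the table, column, and index names are hypothetical:

```sql
-- Index the columns used in the WHERE clause and the join.
ALTER TABLE orders ADD INDEX idx_customer_status (customer_id, status);

-- Refresh table statistics so the optimizer has accurate cardinality estimates.
ANALYZE TABLE orders;

-- Verify which index the optimizer actually picks.
EXPLAIN
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'open';
```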
Subquery Optimization
Certain optimizations are applicable to comparisons that use the IN operator to test subquery results (or that use =ANY, which is equivalent). This section discusses these optimizations, particularly with regard to the challenges that NULL values present. The last part of the discussion suggests how you can help the optimizer.
Optimizing INFORMATION_SCHEMA Queries
Applications that monitor databases may make frequent use of INFORMATION_SCHEMA tables. Certain types of queries for INFORMATION_SCHEMA tables can be optimized to execute more quickly. The goal is to minimize file operations (for example, scanning a directory or opening a table file) to collect the information that makes up these dynamic tables.
Try to use constant lookup values for database and table names in the WHERE clause
You can take advantage of this principle as follows:
- To look up databases or tables, use expressions that evaluate to a constant, such as literal values, functions that return a constant, or scalar subqueries.
- Avoid queries that use a nonconstant database name lookup value (or no lookup value) because they require a scan of the data directory to find matching database directory names.
- Within a database, avoid queries that use a nonconstant table name lookup value (or no lookup value) because they require a scan of the database directory to find matching table files.
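For example, a query that sticks to constant lookup values might look like this (the database and table names are placeholders):

```sql
-- Constant schema and table names let the server avoid scanning the data directory
-- and the database directory to find matching files.
SELECT table_name, table_rows, engine
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'mydb'
  AND table_name = 'orders';
```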
Optimizing Data Change Statements
This section explains how to speed up data change statements: INSERT, UPDATE, and DELETE. Traditional OLTP applications and modern web applications typically do many small data change operations, where concurrency is vital. Data analysis and reporting applications typically run data change operations that affect many rows at once, where the main consideration is the I/O to write large amounts of data and keep indexes up to date. For inserting and updating large volumes of data (known in the industry as ETL, for “extract-transform-load”), sometimes you use other SQL statements or external commands that mimic the effects of INSERT, UPDATE, and DELETE statements.
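One broadly applicable technique for such workloads is batching, so the per-statement and per-commit overhead is paid far less often than once per row. A minimal sketch with hypothetical tables:

```sql
-- One statement, many rows: much cheaper than many single-row INSERTs.
INSERT INTO measurements (sensor_id, recorded_at, value) VALUES
  (1, '2023-01-01 00:00:00', 20.1),
  (1, '2023-01-01 00:01:00', 20.3),
  (2, '2023-01-01 00:00:00', 19.8);

-- Or group single-row statements into one transaction to avoid a commit per row.
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;
UPDATE accounts SET balance = balance + 10 WHERE id = 2;
COMMIT;
```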
Optimizing Database Privileges
The more complex your privilege setup, the more overhead applies to all SQL statements. Simplifying the privileges established by GRANT statements enables MySQL to reduce permission-checking overhead when clients execute statements. For example, if you do not grant any table-level or column-level privileges, the server need not ever check the contents of the tables_priv and columns_priv tables. Similarly, if you place no resource limits on any accounts, the server does not have to perform resource counting. If you have a very high statement-processing load, consider using a simplified grant structure to reduce permission-checking overhead.
Other Optimization Tips
This section lists a number of miscellaneous tips for improving query processing speed:
If your application makes several database requests to perform related updates, combining the statements into a stored routine can help performance. Similarly, if your application computes a single result based on several column values or large volumes of data, combining the computation into a UDF (user-defined function) can help performance. The resulting fast database operations are then available to be reused by other queries, applications, and even code written in different programming languages.
- To fix any compression issues that occur with ARCHIVE tables, use OPTIMIZE TABLE.
- If possible, classify reports as “live” or as “statistical”, where data needed for statistical reports is created only from summary tables that are generated periodically from the live data.
- If you have data that does not conform well to a rows-and-columns table structure, you can pack and store data into a BLOB column. In this case, you must provide code in your application to pack and unpack information, but this might save I/O operations to read and write the sets of related values.
- With Web servers, store images and other binary assets as files, with the path name stored in the database rather than the file itself. Most Web servers are better at caching files than database contents, so using files is generally faster. (Although you must handle backups and storage issues yourself in this case.)
- If you need really high speed, look at the low-level MySQL interfaces. For example, by accessing the MySQL InnoDB or MyISAM storage engine directly, you could get a substantial speed increase compared to using the SQL interface.
- Replication can provide a performance benefit for some operations. You can distribute client retrievals among replication servers to split up the load. To avoid slowing down the master while making backups, you can make backups using a slave server.
Unless you have some type of Bitmap index, you don’t.
If the tables are of nontrivial size, B-tree indexes on a column with this few distinct values (low cardinality) won’t be any better than a table scan. In fact, they would be much worse in terms of the number of pages accessed.
So, you may want to index some other field that has a higher selectivity.
If you have a bunch of different Boolean values for the same rows, and the _combinations_ are reasonably selective, you may want to use some type of bitmask column, and index your mask.
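A minimal sketch of that bitmask idea, with hypothetical flag assignments; note that the index only helps when the mask is queried with an equality or a small IN list rather than with bitwise arithmetic applied to every row:

```sql
-- Pack several booleans into one small integer column and index it.
-- Bit 0 = is_active, bit 1 = is_verified, bit 2 = is_premium (hypothetical flags).
ALTER TABLE users ADD COLUMN flags TINYINT UNSIGNED NOT NULL DEFAULT 0,
                  ADD INDEX idx_flags (flags);

-- "Active AND verified AND NOT premium" is exactly the value 3, so the index applies directly.
SELECT * FROM users WHERE flags = 3;

-- A predicate like (flags & 1) = 1 cannot use the index; enumerate the matching
-- combinations instead when there are few of them.
SELECT * FROM users WHERE flags IN (1, 3, 5, 7);
```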
A2A, thanks...
I don't know the exact MySQL query optimizer internals, but it appears to be a fairly standard statistics-driven cost-based query optimizer. For a general description of cost-based query optimizers, this wiki page is a decent start: Query optimization, but at the end of the day, what query optimizers want to do is find the route to the query answer that visits the fewest number of pages possible.
A related post I wrote a bit ago:
Anyway, to start with you have to define exactly what query optimizers do; they're most analogous to a programming language compiler. They take your SQL input, after it's been processed by a parser, and build an "execution plan" data structure that is used by the execution part of the RDBMS to actually answer the query.
In query optimizing, there are a lot of considerations that an optimizer has to account for in its analysis:
- How to order any joins? Join order is the sequence used to walk through the tables in the query as the join is executing. As a rule, you want the tables that produce the smaller working set visited earliest, especially for the "anchor table" in the join, as it (or a subset of it if there's other WHERE-clause predicates on it) will have to be fully traversed by the query. In MySQL, the STRAIGHT_JOIN hint lets the user dictate left-to-right FROM-clause join order.
- What indexes to use, or is it cheaper to skip the index and use a tablescan? Note that indexes aren't always best, particularly if a query ends up visiting a significant chunk of the table's rows. The first two considerations are informed by table statistics, so it's a good idea to run ANALYZE TABLE to update them, especially if the relative sizes of tables change (i.e., a table suddenly becomes large or has a bunch of rows deleted).
- What types of join algorithms can be considered? Different RDBMSs have different join algorithm capabilities. MySQL is limited in that it only supports Nested loop join, although it can use Nested loop over index/PK to do the lookup on the target table. So, in this criterion, MySQL's query optimizer has it easy :)
- Whether temporary structures can be used versus always hitting the disk/buffer pool for rows every time we visit the table?
- Does the storage engine or storage media have any special properties that may make certain routes better than others? For instance, InnoDB uses a table's primary key to organize base table data, making lookups using the PK particularly fast versus using secondary indexes (and this is why you should _always_ have a defined PK on any nontrivial InnoDB table). Also, some optimizers, and apparently some versions of MySQL, can account for the fact that a table is stored in SSD versus spinning disk. This would make a difference if joining two tables stored on different storage media.
- How to figure out all this stuff without taking a long time simply doing the analysis? This is a far bigger consideration than one may think, especially for complex joins involving lots of tables as the solution space can get huge very easily. Query optimizers have heuristics that prune unpromising query paths quickly so they don't have to bother doing a full analysis on them, but like any heuristic of this kind, this occasionally means the best query plan gets thrown out.
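To make the join-order and statistics points above concrete, here is a minimal sketch; the tables and columns are hypothetical, and STRAIGHT_JOIN is only worth using when you are confident you know the data better than the optimizer does:

```sql
-- Keep the statistics the optimizer relies on up to date.
ANALYZE TABLE orders, small_lookup;

-- Force left-to-right join order from the FROM clause: small_lookup is visited first.
SELECT STRAIGHT_JOIN o.id, s.label
FROM small_lookup s
JOIN orders o ON o.type_id = s.id
WHERE s.label = 'refund';
```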
What you need is to make sure your queries have the indexes they need. If you have an index on some fields but a query needs an index on a different field, then adding the index needed will be an improvement.
One thing which can make queries faster, when there is an appropriate index, is if that index covers all the fields the query needs. Sometimes people have indexes for each of the fields queried, but all of their indexes only index one field.
For example, if you search names by first and last name then having an index on both first and last name will make that query faster than two indexes one on the first name and one on the last name.
Keep in mind the order of the fields in the index. A query can use an index which has more fields than it needs, as long as the fields that it needs are listed before the fields it does not need indexed. So if you sometimes query by first and last name, but sometimes query by last name alone, then an index on last name and first name (in that order) can be used by both queries without keeping an index on last name alone.
Less important but also a concern is don’t have too many indexes. Every index has to be updated for every change to the database. If you index every combination of fields which could be useful but are not actually used then you are slowing down your database on updates without any benefit to queries.
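A minimal sketch of the first/last name example above (table and index names are hypothetical):

```sql
-- One composite index serves both query shapes described above.
ALTER TABLE people ADD INDEX idx_last_first (last_name, first_name);

-- Uses the full index.
SELECT * FROM people WHERE last_name = 'Garcia' AND first_name = 'Ana';

-- Uses the leftmost prefix (last_name), so no separate single-column index is needed.
SELECT * FROM people WHERE last_name = 'Garcia';
```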
This basically depends on the type of queries you are working on.
But there are a few concepts or rules that we need to take into consideration:
- Try to reduce subqueries if possible.
- Try to maximise the use of the table's indexes, e.g. use indexed columns in join conditions.
- In some places UNION ALL works better than nested subqueries.
- Finally, try to debug the slow queries: look at MySQL's execution plan via EXPLAIN (or DESC) on the query. This will show the records being scanned to produce the result. Hope you have understood.
Many times, a long-running query is the result of not using a unique key for an inner join. This can easily result in many instances of duplicate records in your view. Each of those records takes time to generate. Always remember: SQL is doing EXACTLY what you told it to do. You can’t blame SQL for not giving you the results you thought you asked for.
If you are trying to import the contents of a file into a MySQL table, you will be better off using the mysqlimport program, which is made for this sort of thing.
See these manual sections:
4.5.5 mysqlimport - A Data Import Program
13.2.6 LOAD DATA INFILE Syntax
Although I don’t know the details of your PDO connection, I would assume it’s doing all of your 10,000 INSERTs one-at-a-time, and might even have to do foreign key lookups, etc. If so, you are using the wrong tool for the job.
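A minimal sketch of the bulk-load route; the file path, table, and column list are hypothetical, a server-side LOAD DATA INFILE needs the FILE privilege and a server-readable path, and mysqlimport is essentially a command-line wrapper around the same statement:

```sql
-- Server-side bulk load of a CSV file into an existing table.
LOAD DATA INFILE '/var/lib/mysql-files/users.csv'
INTO TABLE users
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES                 -- skip the header row
(name, email, created_at);
```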
I see you are discovering the pain of MySQL.
You have two options:
- Upgrade to MySQL 5.6.
- Use the old "copy the table and add or modify the index and then rename the table" trick.
The latter is a complete pain, but is necessary if you have large tables and need to keep things running while you make the change.
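Sketched with a hypothetical table, the copy-and-rename trick looks roughly like this; in practice you also have to handle writes that arrive during the copy, which is what tools like pt-online-schema-change automate:

```sql
-- 1. Create an empty copy of the table and add the new index to it.
CREATE TABLE users_new LIKE users;
ALTER TABLE users_new ADD INDEX idx_email (email);

-- 2. Copy the data across (optionally in chunks for very large tables).
INSERT INTO users_new SELECT * FROM users;

-- 3. Swap the tables atomically, keeping the old one around as a fallback.
RENAME TABLE users TO users_old, users_new TO users;
```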
Fun times!
Usually the answer is re-organise your SQL or add an index. In more extreme situations use temporary or not-so-temporary tables as an intermediate step. Index any intermediate tables.
Look at the JOINs and WHERE clauses in your SQL. Think about how the database will implement them. Will a WHERE clause reduce a table to fewer than 1,000 rows using indexed fields before it has to consider non-indexed fields? Is one side of every join indexed? Where the answer to these questions is “No” or “It’s too complex”, try re-organising the SQL.
You want us to help you optimize a query you haven’t posted to get from an execution time you haven’t posted to a desired execution time you haven’t posted?!
If I’m allowed a little pun … Just a sec!
As Greg Kemnitz writes, you usually cannot make use of such a field without a bitmap index. And you have to watch out more to boot:
If the distribution is very skewed (like what you indicate, or worse, like 99.9% vs. 0.1%), and you make that column the first column of a multi-column index, and your optimizer does not do histograms (I am not sure if MySQL does that), or it does but the “true”/“false” value to compare against is in a bind variable, then it will assume that the values are in a 50–50 distribution. That might lead it to use the index to scan almost all rows of the table, resulting in roughly twice the effort that a full table scan would require.
OP asked: How can I optimize MySQL composite index in InnoDB?
Typically, the most optimal way to write a MySQL index is to consider the statements that will use it. If the goal is to use the index to fulfill a SELECT statement, I use this rule of thumb:
Put (in)equalities first, then one range, then any covering columns.
The most important thing about this index would be the order of the columns to the left.
Do be aware that the cost of maintaining such an index is paid on every INSERT/UPDATE/DELETE/REPLACE. Depending on how frequently the table you’re indexing changes and how large it is, it may not be worth indexing some or all of the columns.
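A minimal sketch of that rule of thumb, using a hypothetical `orders` table and query shape:

```sql
-- Query shape: equality on customer_id and status, a range on created_at, selecting total.
-- Equalities first, then the one range column, then a covering column so the
-- query can be answered from the index alone.
ALTER TABLE orders
  ADD INDEX idx_cust_status_created_total (customer_id, status, created_at, total);

EXPLAIN
SELECT total
FROM orders
WHERE customer_id = 42
  AND status = 'shipped'
  AND created_at >= '2023-01-01';
```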
Need more info? Consult with your friendly local database architect, engineer or administrator.
Try creating an index on the creation timestamp if that's what you are querying with. From your description, it looks like MySQL is doing a table scan, resulting in poor performance. To begin with, run EXPLAIN on the exact SELECT command you are using now to understand MySQL's query plan. The output of EXPLAIN will tell you how many rows are scanned to return the output you've requested. You can then add an index or use other optimization strategies to speed up queries. FWIW, MySQL can handle 1.5 million rows very easily.
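A minimal sketch of that advice, with hypothetical table and column names:

```sql
-- Check the current plan; type=ALL and a large "rows" estimate indicate a full table scan.
EXPLAIN SELECT * FROM events WHERE created_at >= '2023-06-01';

-- Add an index on the timestamp you filter by, then re-check the plan.
ALTER TABLE events ADD INDEX idx_created_at (created_at);
EXPLAIN SELECT * FROM events WHERE created_at >= '2023-06-01';
```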