MySQL 8hr/timeout/communications link failure – Or how you shouldn’t always trust the internet
I saw the errors ‘The last packet successfully received from the server was x milliseconds ago’ and ‘Communications link failure’ a lot lately. So much it almost drove me crazy. I knew it had something to do with the timeout of 8 hours that MySQL has by default for every connection.This is a simple problem, but the internet is so polluted with everybody parroting each other that it didn’t get me any further.
After a few days I fixed the problem, it was indeed very simple but because I did something stupid with JPA it became very complex.
In this article I’ll explain the simple solution and tell you about my complex problem, maybe you’re having the same trouble.
How to reproduce the problem without waiting 8 hours
First step is decreasing the timeout by running the SQL :
This sets the timeout to 60 seconds, a much more useful period than 8 hours.
To revert the setting execute:
Another option is just killing connections with MySQL workbench (Admin, server status). This is also a good option to test how easy your application recovers from failures you don’t control.
The WRONG solutions
Increasing the timeout is just plain stupid, but a lot of people on the interwebs take this ‘solution’ for granted. I know that they were probably as desperate as me, but it’s just a dirty fix. Yes but the application is accessed at least once during that new timeout period I hear you think. Ok, what happens when your MySQL server crashes and restarts? Try it, not much huh?
autoReconnect=true must be the most popular solution around. It sounds very good, I must agree with that. But the MySQL documentation says : “The use of this feature is not recommended”. When you read further it gets a little bit better, but is still wrong. They’re even talking about increasing a timeout!
The simple solution
I will skip the solutions without connection pooling, because I assume you won’t use that in a production environment.
Since I’m using Tomcat Apache DBCP is your default connection pooling option. I suggest to ditch it and use C3P0. DBCP probably works, but isn’t very tweakable. C3P0 has great documentation, logging and many useful options to make it work for your environment.
This is the important part from my context.xml:
The only extra options you actually need are idleConnectionTestPeriod and the preferredTestQuery. The idleConnectionTestPeriod tests idle connections every 5 minutes and then refreshes them. This means that the error can still occur. By decreasing this timeout and increasing the MySQL GLOBAL.wait_timeout you can suit it to your needs. I increased the timeout to a week. This means that if the applications isn’t accessed in a week there is a 5 minute window where you get an error. Quite reasonable I think. And of course when the MySQL server crashes there is that 5 minute window, but I think we can live with that.
The test query is just a simple query to see if your database is alive.
The complex problem (with also a simple solution)
And now the complex problem just so that I have written it down somewhere and there probably is someone who exactly did the same thing
Since I’m using Hibernate Envers and that’s quite new to me I tried some stuff out. Setting the PersistenceContextType to EXTENDED for example. It gave better results, but I actually had no idea what I was doing. The javadoc for this option is just completely useless. With help from Red Hat documentation and the Spring forum I found out what was going wrong
“Extended context means that you have to take care of the transaction and the entity manager closing” and of course nobody took care of the transaction. That explains a lot.
When you’re stuck ask you colleagues. They will confirm that you’re not stupid (and when they do ask them why they don’t have the solution) and will help you get closer to the solution.
Don’t work for several days on a problem. Go out for a walk, when you don’t get any further after an hour just leave it and do something else.
I’m very happy with documentation of C3P0, a lot of open source projects should take a look there and stop wondering why nobody is using their project