Spicing up a high-load, low-latency REST service
Quiz question. You have a low-latency, high-load service running on 42 virtual machines, each with 2 CPU cores. One day, you migrate your application nodes to 5 beasts of physical servers, each with 32 CPU cores. Given that each virtual machine had a heap of 2GB, what size should the heap be on each physical server?
So, you divide 42 * 2 = 84GB of total memory over 5 machines. That boils down to 84 / 5 = 16.8GB per machine. To take no chances, you round this number up to 25GB. Sounds plausible, right? Well, the correct answer turns out to be less than 2GB, because that is the number we got by calculating the heap size based on the live data size (LDS). Can't believe it? No worries, we couldn't believe it either. Therefore, we decided to run an experiment.
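To make the two ways of reasoning concrete, here is a minimal sketch of the arithmetic. The naive calculation uses the numbers from the quiz. The LDS-based calculation is only an illustration: the 0.5GB live data size and the multiplier of 3 are assumptions chosen for the example (a commonly cited rule of thumb sizes the heap at a small multiple of the LDS), not measurements or settings from this service.

```python
# Numbers taken from the quiz above.
OLD_NODES = 42
OLD_HEAP_GB = 2
NEW_NODES = 5


def naive_heap_gb(old_nodes: int, old_heap_gb: float, new_nodes: int) -> float:
    """Spread the old total heap evenly over the new machines."""
    return old_nodes * old_heap_gb / new_nodes


def lds_based_heap_gb(lds_gb: float, factor: float = 3.0) -> float:
    """Size the heap as a multiple of the live data size (LDS).

    ASSUMPTION: `factor` is a rule-of-thumb multiplier picked for
    illustration; the article does not state which one was used.
    """
    return lds_gb * factor


naive = naive_heap_gb(OLD_NODES, OLD_HEAP_GB, NEW_NODES)

# ASSUMPTION: a hypothetical LDS of 0.5GB, not a measured value.
lds_based = lds_based_heap_gb(0.5)

print(f"naive sizing:     {naive:.1f}GB per server")
print(f"LDS-based sizing: {lds_based:.1f}GB per server")
```

The point of the sketch is that the LDS-based result depends only on how much data the application keeps alive, not on how many machines or cores the old setup had.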
Experiment setup
We have 5 application nodes, so we can run our experiment with 5 differently sized heaps. We give node one 2GB, node two 4GB, node three 8GB, node four 12GB, and node five 25GB. (Yes, we are not brave enough to run our application with a heap below 2GB.)
As a next step, we fire up our performance tests, generating a stable, production-like load of a baffling 56K requests per second. Throughout the whole run of this experiment, we measure the number of requests each node receives to ensure that the load is equally balanced. What's more, we measure this service's key performance indicator: latency.
Because we got weary of downloading the GC logs after each test, we invested in Grafana dashboards that show us the GC's pause times, throughput, and heap size after a garbage collection. This way we can easily inspect the GC's health.
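The balance check described above can be sketched as follows. This is a minimal illustration, not the tooling used in the experiment: the per-node request rates and the 5% tolerance are hypothetical values chosen so that they add up to the 56K requests per second mentioned above.

```python
def is_balanced(requests_per_node: list[int], tolerance: float = 0.05) -> bool:
    """Return True if every node is within `tolerance` of the even share.

    ASSUMPTION: a 5% tolerance is an arbitrary example threshold.
    """
    if not requests_per_node:
        raise ValueError("need at least one node")
    expected = sum(requests_per_node) / len(requests_per_node)
    return all(
        abs(count - expected) <= tolerance * expected
        for count in requests_per_node
    )


# Hypothetical requests per second for nodes one to five (sums to 56,000).
observed = [11_200, 11_150, 11_260, 11_190, 11_200]

print("total rps:", sum(observed))
print("balanced: ", is_balanced(observed))
```

Only when the load is evenly spread can differences in latency between the nodes be attributed to their heap sizes rather than to one node simply doing more work.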
Results
This blog is about GC tuning, so let's start with that. The following figure shows the GC's pause times and throughput. Recall that pause times indicate how long the GC freezes the application while sweeping out memory. Throughput then specifies the percentage of time the application is not paused by the GC.
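The definition of throughput given above translates directly into a small calculation. The sketch below assumes you already have a list of pause durations for some observation window; the function name, the pause values, and the 10-second window are hypothetical and only serve as an example.

```python
def gc_throughput_percent(pauses_ms: list[float], window_ms: float) -> float:
    """Percentage of the window in which the application was NOT paused by the GC."""
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    paused_ms = sum(pauses_ms)
    if paused_ms > window_ms:
        raise ValueError("total pause time exceeds the observation window")
    return 100.0 * (window_ms - paused_ms) / window_ms


# Hypothetical example: three pauses observed during a 10-second window.
pauses = [20.0, 30.0, 50.0]
throughput = gc_throughput_percent(pauses, window_ms=10_000.0)

print(f"longest pause: {max(pauses):.0f}ms")
print(f"throughput:    {throughput:.1f}%")
```

Note that the two metrics are related but not interchangeable: many short pauses and a few long ones can yield the same throughput, while having a very different effect on latency.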