How to Scale Your Java EE Application
The advent of the Internet has created a technological boom that has changed the world even more than the Industrial Revolution. By facilitating global interconnectivity on such an enormous scale (it is estimated that there are some 3.578 billion internet users worldwide), the Internet has arguably been the single most revolutionary human invention ever.
However, without web platforms such as Java EE, the internet would not be the powerful tool that we have all come to rely on so heavily.
In this article, I aim to examine the common problems that arise when attempting to scale your Java EE application. I hope this will allow developers who are inexperienced in scaling applications to avoid the main pitfalls that the rest of us have had to overcome in the past.
An inability to achieve scalability will not only compromise application performance but may also result in a system crash.
Why Java EE?
Java EE is the enterprise edition of the Java Programming Language Platform. It is built on the Java SE platform and is a great runtime environment for large-scale app development. It is much loved because it offers important features like scalability, as well as being an excellent platform with which to create reliable applications.
Underlying the platform is a set of standards defined by Oracle. These bring greater simplicity to the process of app development. The platform includes libraries for database access, Web services, XML support, and many other web development tools.
Java EE is able to achieve simplicity by standardizing and automating many aspects of development. This makes it a top choice for many software developers around the globe.
Because of this standardization, it becomes easier to follow good development practices which make apps more reliable when building in scalability.
With more and more enterprises using Java EE as a platform to design and run increasingly complex applications, scalability has become an extremely important issue for Java developers.
What Do We Mean By Scalability?
Scalability, while often thought to be synonymous with performance, actually refers to the ability of a system to expand the number of available resources to accommodate a surge in demand.
To compare the two terms, performance is the measure of how fast a system can respond to the requests it receives, while scalability refers to the number of requests that a system can handle simultaneously.
If, for example, a system performs very well but cannot handle more than, say, 100 requests at a time, then the usefulness of the application running on it is limited. Just imagine the effect on the usability of an app like Uber if it were unable to scale effectively.
“After all, if Uber doesn’t work, people get stranded, and that was really, really interesting to me.” – Scalable Systems & Scalable Careers: A Chat with Uber’s Sumbry
With more than 3 billion people potentially able to access applications and websites, these programs must be able to scale effectively.
Scaling any application, no matter how simple or complex it might be, can be a nightmare. When not done properly, it can lead to inadequate fixes and patches that actually have a negative impact on system performance.
Scaling a Java EE application should be a well-planned process that takes into consideration a myriad of different issues. To give you an idea of what scaling involves, first I will take a more in-depth look at the different types of scaling.
Types of Scaling: Which One to Choose?
Scaling vertically (also called scaling up) refers to adding physical resources to an existing server. This may take the form of more CPUs, more memory, or a larger resource allocation. This type of scaling can also be used to enhance virtualization on a Web server by providing more resources to the virtual operating systems installed on a virtual machine.
Although scaling up is a good choice when the number of users is still relatively small, for larger applications, scaling up proves to be a lot more expensive.
The main resources that can be added to a server to scale up are memory and processing power (i.e., CPU). For those companies looking to add scalability to small-scale applications, I will briefly go over the advantages and disadvantages of adding each of these resources to scale up your application.
Having sufficient memory is of the utmost importance when scaling any application. It is particularly important for I/O- and database-intensive applications: instead of querying the database again and again, which is a time-intensive task, vital information can be cached in memory so that it is readily available upon request.
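The caching idea described above can be sketched in a few lines of Java. This is only a minimal illustration, not a production cache: the class name `ReadThroughCache`, the `loader` function, and the fake "database" lookup are all hypothetical, and the cache never evicts entries, so a real application would need a size limit or an expiry policy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache: keeps loaded values in memory so the
// slow lookup (for example, a database query) runs once per key.
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                  // the slow lookup (assumed)
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, calling the loader only if the key is not cached yet.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Number of times the slow lookup was actually executed.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        // The lambda stands in for a database query.
        ReadThroughCache<Integer, String> cache =
                new ReadThroughCache<>(id -> "user-" + id);
        cache.get(42);
        cache.get(42);
        System.out.println(cache.get(42) + " was loaded " + cache.loadCount() + " time(s)");
    }
}
```

The more memory the server has, the more entries a cache like this can hold before something has to be evicted, which is why adding memory tends to help database-heavy applications.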
As a result, scaling up memory has a huge effect on system performance.
Java incorporates Garbage Collection, which automatically recycles dynamically allocated memory from objects no longer in use. This frees up memory that applications can then use to increase their performance.
However, while the garbage collector works, most of the application threads are put on hold. This can lead to critical delays in latency-sensitive applications such as IoT systems.
In scenarios like this, where longer intervals are needed between consecutive garbage collections, a larger memory helps the system by delaying the point at which the garbage collector has to free up memory in the first place.
Most Web applications are designed on the basis of a simple request-and-response server architecture. The user sends a request and data is fetched from the database and relayed to the user in a specific format. Generally speaking, these applications are not CPU centric, meaning they don’t need bigger, more powerful CPUs in order to allow for scalability.
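One way to observe this effect is to ask the JVM how much heap it may use and how often its collectors have run. The sketch below uses the standard `java.lang.management` API; the class name `GcReport` and its helper methods are hypothetical, and the numbers printed will differ from machine to machine.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper that reports heap size and garbage collection activity.
class GcReport {
    // Maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    // Total number of collections reported so far by all collectors.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount()); // -1 means "not available"
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the same workload with a larger `-Xmx` setting and comparing the collection counts is a simple way to check whether extra memory actually reduces collection frequency for a given application.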
Scaling up CPUs for such applications leads to a waste of resources because the CPU only utilizes a small percentage of its total processing power.
Scaling up also has a slight disadvantage when it comes to system availability. Availability refers to the amount of time the application is up and running, otherwise known as Uptime. If all the resources are spent on a single server, then any problems with the server will lead to your application going offline to some if not all users.
If there is no backup system in place, then this can lead to losses, both monetary and in the form of lost users.
Scaling horizontally (also called scaling out) is the process of adding more nodes to the system. One example of this strategy can be seen with major tech companies like Facebook and Google. These companies add more and more servers around the globe to cater to an ever-increasing number of user requests.
Such examples show the enormous value of scaling out. Without such an approach, companies would be forced to purchase massive supercomputers, which would be impractical due to the massive cost.
Advantages of scaling out
Scaling out gives your Java EE application a better level of system availability than scaling up does. If there are multiple servers catering to requests concurrently, then in case any server crashes, there will be other servers that can bear the workload while the crashed server is fixed. This helps keep users happy because while running slower, the application never goes offline.
Scaling out also has a number of strategic advantages when it comes to application performance. With Facebook, for example, the ability to store data in local data centers helps keep performance levels high. And this is no small thing.
As this article highlights, “At Facebook, we have unique storage scalability challenges when it comes to our data warehouse. Our warehouse stores upwards of 300 PB of Hive data, with an incoming daily rate of about 600 TB. In the last year, the warehouse has seen a 3x growth in the amount of data stored.”
How to scale out
Scaling out is usually done with the help of a load balancer. It receives all the incoming requests and then routes them to different servers based on availability. This makes sure that no single server receives all the traffic, so the workload is distributed evenly.
If one server in the cluster fails, the load balancer routes the incoming requests to other servers in the cluster. Load balancers also make scaling out your Java EE application much easier.
Should you want to add another server to the cluster, then the load balancer can start directing traffic to the server right away. This saves valuable time that is otherwise used in complicated configurations of the servers.
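The routing behavior described above can be illustrated with a small round-robin sketch. It is a simplified model under stated assumptions, not how any particular load balancer product works: the class `RoundRobinBalancer` and the server names are hypothetical, servers are represented by plain strings, and a failed server is assumed to be removed by some external health check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical round-robin load balancer: hands each request to the next server in turn.
class RoundRobinBalancer {
    private final List<String> servers = new ArrayList<>();
    private int next = 0;

    // A newly added server starts receiving traffic on a following rotation.
    synchronized void addServer(String name) {
        servers.add(name);
    }

    // Called when a server fails; remaining servers absorb its share of requests.
    synchronized void removeServer(String name) {
        servers.remove(name);
    }

    // Picks the server that should handle the next incoming request.
    synchronized String route() {
        if (servers.isEmpty()) {
            throw new IllegalStateException("no servers available");
        }
        String chosen = servers.get(next % servers.size());
        next = (next + 1) % servers.size();
        return chosen;
    }

    public static void main(String[] args) {
        RoundRobinBalancer balancer = new RoundRobinBalancer();
        balancer.addServer("app-1");
        balancer.addServer("app-2");
        System.out.println(balancer.route());
        System.out.println(balancer.route());
        balancer.removeServer("app-2");   // simulate a crash
        System.out.println(balancer.route());
    }
}
```

Round-robin is only one possible policy; real load balancers may also weigh servers by current load or response time.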
Pitfalls of using a Load Balancer
Some applications use states to store session information for the client. These states, like HTTP session objects, store the user information, like shopping cart information, and must be present on the server for the requests to be executed.
With a simple load-balanced architecture, the load balancer can redirect subsequent requests to different servers based on availability. The new server will not have the session state, and the user request cannot be processed.
Every time a user accesses the application, they may be directed to a different server. This means that the user will have to submit all their previous data again to the new server, something which slows network performance.
To counter this problem, Sticky Sessions can be used. Sticky sessions are implemented on the load balancer to make sure that subsequent requests from a client go to the same server every time. This is also known as server-affinity.
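A minimal sketch of server affinity follows, assuming the load balancer can see a session identifier on each request. Hashing the identifier is just one way to get stickiness (many load balancers use a cookie instead), and the class `StickyRouter` and the server names are hypothetical.

```java
import java.util.List;

// Hypothetical sticky router: the same session ID always maps to the same server.
class StickyRouter {
    private final List<String> servers;

    StickyRouter(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers);
    }

    // Maps the session ID to a fixed position in the server list.
    String route(String sessionId) {
        int index = Math.floorMod(sessionId.hashCode(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter(List.of("app-1", "app-2", "app-3"));
        System.out.println(router.route("session-abc"));
        System.out.println(router.route("session-abc")); // same server as the line above
    }
}
```

Note that in this sketch the mapping depends on the size of the server list, so adding or removing a server would move some sessions to a different server.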
This helps resolve the aforementioned problem but does, however, give rise to another problem. If the server that created the session for the client crashes, then the next request will be forwarded to another server in the cluster which will not have the state information. This eventuality brings us back to square one.
To counter this problem, we move beyond a single server and focus on creating integrated server networks. If all the state data is stored in a single database that every server in the cluster can access, then when any server goes offline, the state data is still available to the next server that responds to the client’s requests.
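The idea of keeping session state outside any one server can be sketched as follows. This is a toy model: the `SessionStore` class is an in-process map standing in for the shared database, and `AppServer`, the session ID, and the cart items are all hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class SharedSessionDemo {
    // Stand-in for the shared store that every server in the cluster can reach.
    static class SessionStore {
        private final Map<String, List<String>> carts = new ConcurrentHashMap<>();

        void addItem(String sessionId, String item) {
            carts.computeIfAbsent(sessionId, id -> new CopyOnWriteArrayList<>()).add(item);
        }

        List<String> cart(String sessionId) {
            return List.copyOf(carts.getOrDefault(sessionId, List.of()));
        }
    }

    // An application server that keeps no session state of its own.
    static class AppServer {
        private final String name;
        private final SessionStore store;

        AppServer(String name, SessionStore store) {
            this.name = name;
            this.store = store;
        }

        void addToCart(String sessionId, String item) {
            store.addItem(sessionId, item);
        }

        List<String> viewCart(String sessionId) {
            return store.cart(sessionId);
        }
    }

    public static void main(String[] args) {
        SessionStore store = new SessionStore();
        AppServer first = new AppServer("app-1", store);
        AppServer second = new AppServer("app-2", store);

        first.addToCart("session-abc", "book");
        // Suppose app-1 crashes here; app-2 can still serve the same session.
        System.out.println(second.viewCart("session-abc"));
    }
}
```

Because neither server holds the cart itself, it does not matter which one the load balancer picks for the next request.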
However, the drawback of such a system is that database accesses can be a time-intensive process that can decrease the performance of the system.
Solutions like Oracle Coherence provide an in-memory distributed solution for clustered application servers. This provides a fast messaging service between servers that can exchange critical data like the user states.
For now, this is the best solution for keeping the user states present in all the servers simultaneously. There is no need for expensive database operations, and the data is reliably transferred to all the servers of an application.
Java EE applications are unique to the architecture they were designed on. Even though they may use the same standard libraries and APIs, two separate applications with the same functionality can be designed with different priorities in mind.
This creates problems when a single strategy is used to scale two different Java EE applications.
How you scale a Java EE application depends largely upon the design of the application. Scaling out by adding more servers is generally a good way to scale the application to cope with a bigger load. But as I have discussed, the way to do this also depends upon the type of application that needs to be scaled.
Understanding the architecture of your application and the different types of scaling, including the strengths and weaknesses of each approach, will ensure that you will scale your Java EE application successfully.