
How to Scale Your Java EE Application


Interested in how to scale your Java EE application? 

You’ve come to the right place.

Creating an innovative new software application that improves people's lives not only helps to create a better society but also enriches the company that creates it. Here are a few amazing case studies of companies that hired DevTeam.Space to build their software products:

  1. High Speed Vehicle Recognition – Machine Learning Image Recognition Application
  2. Algo Trading Solution – Cryptocurrency Trading Bot
  3. YADRO – AI-powered High Speed Vehicle Identification System


Why Java EE?
What Do We Mean By Scalability?
Types of Scaling: Which One to Choose?
Scaling Up
Scaling Out
How to scale out
Final Thoughts

Why Java EE?

Java EE is the enterprise edition of the Java platform. It is built on the Java SE platform and is a great runtime environment for large-scale app development. It is much loved because it offers important features like scalability and provides an excellent foundation for creating reliable applications.

Underlying the platform is a set of standards defined by Oracle. These bring greater simplicity to the process of app development. The platform includes libraries for database access, Web services, XML support, and many other superb web development tools.

Java EE is able to achieve simplicity by standardizing and automating many aspects of development. This makes it a top choice for many software developers around the globe.

Because of this standardization, it becomes easier to follow good development practices that make apps more reliable and easier to scale.

With more and more enterprises using Java EE as a platform to design and run increasingly complex applications, scalability has become an extremely important issue for Java developers.

What Do We Mean By Scalability?


Scaling, while often thought synonymous with performance, actually refers to the ability of a system to expand its available resources to accommodate a surge in demand.

To compare the two terms, performance is the measure of how fast a system can respond to the requests it receives, while scalability refers to the number of requests that a system can handle simultaneously.

If, for example, a system has a very good performance level but cannot handle more than, say, 100 requests at a time, then that limits the usefulness of the application running on it. Just imagine the effect on the usability of an app like Uber were it not able to scale effectively.

“After all, if Uber doesn’t work, people get stranded, and that was really, really interesting to me.” – Scalable Systems & Scalable Careers: A Chat with Uber’s Sumbry

With billions of people able to access applications and websites, these programs must be able to scale effectively.

Scaling any application, no matter how simple or complex it might be, can be a nightmare. When not done properly, it can lead to inadequate fixes and patches that actually have a negative impact on system performance.

Scaling a Java EE application should be a well-planned process that takes into consideration a myriad of different issues. To give you an idea of what scaling involves, first I will take a more in-depth look at the different types of scaling.

Types of Scaling: Which One to Choose?

A comparison of vertical vs. horizontal scaling

There are two types of scaling that can be incorporated into your application.

Scaling Up

Scaling vertically (also called scaling up) refers to adding physical resources to a machine, such as more CPUs or more memory. This type of scaling can be used to enhance virtualization on a web server and to provide more resources to the operating systems installed on its virtual machines.

Although scaling up is a good choice when the number of users is still relatively small, for larger applications, scaling up proves to be a lot more expensive.

The main resources that can be added to a server to scale up are memory and processing power, i.e., CPUs. For those companies looking to add scalability to small-scale applications, I will briefly go over the advantages and disadvantages of adding each of these resources to scale up your application.

Adding Memory

Having sufficient memory is of the utmost importance when scaling any network. It is particularly important for IO- and database-intensive applications: instead of querying the database again and again, which is a time-intensive task, vital information can simply be cached in memory so it is readily available upon request.

As a result, scaling up memory has a huge effect on system performance.
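The caching idea above can be sketched in a few lines of Java. This is a minimal illustration, not a production cache (it has no eviction or expiry); the `QueryCache` name and the loader function are my own, not part of any Java EE API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal read-through cache: the first lookup for a key runs the
// (expensive) loader -- e.g. a database query -- and later lookups
// for the same key are served straight from memory.
class QueryCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    QueryCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent runs the loader only on a cache miss
        return cache.computeIfAbsent(key, loader);
    }
}
```

A real deployment would use a bounded cache library or a data grid rather than an unbounded map, since a map that grows forever will eventually exhaust the very memory it was meant to exploit.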

Java incorporates Garbage Collection, which automatically recycles dynamically allocated memory from objects no longer in use. This frees up memory that applications can then use to increase their performance.

However, while the garbage collector runs, most application threads are paused. This can lead to critical delays in real-time applications such as IoT systems.

In scenarios like this, where there need to be bigger intervals between consecutive garbage collections, a larger memory helps the system by delaying the need for garbage collection to free up memory in the first place.
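As a quick illustration, the JVM can report its heap limits at runtime, which is handy when judging whether adding memory is likely to reduce garbage-collection pressure (the class name here is my own):

```java
// Prints the JVM's heap figures: the hard ceiling (maxMemory),
// the heap currently claimed from the OS (totalMemory), and the
// unused portion of that claimed heap (freeMemory).
public class HeapInfo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        System.out.println("Max heap (MB):   " + rt.maxMemory() / mb);
        System.out.println("Total heap (MB): " + rt.totalMemory() / mb);
        System.out.println("Free heap (MB):  " + rt.freeMemory() / mb);
    }
}
```

The maximum heap is typically set with the standard `-Xmx` JVM option; raising it widens the interval between collections for allocation-heavy workloads.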

Adding CPUs

Most web applications are designed on the basis of a simple request-and-response server architecture. The user sends a request, and data is fetched from the database and relayed to the user in a specific format. Generally speaking, these applications are not CPU-centric, meaning they don't need bigger, more powerful CPUs in order to scale.

Scaling up CPUs for such applications leads to a waste of resources because the CPU only utilizes a small percentage of its total processing power.

Scaling up also has a disadvantage when it comes to system availability. Availability refers to the amount of time the application is up and running, otherwise known as uptime. If all resources are concentrated on a single server, then any problem with that server will take your application offline for some, if not all, users.

If there is no backup system in place, then this can lead to losses, both monetary and in the form of lost users.

Scaling up is therefore mostly used in specific scenarios, such as memory-bound workloads, where it is deemed the best solution.

Scaling Out


Scaling horizontally (also called scaling out) is the process of adding more nodes to the system. One example of this strategy can be seen with major tech companies like Facebook and Google. These companies add more and more servers around the globe to cater to an ever-increasing number of user requests.

Such examples show the enormous value of scaling out. Without such an approach, companies would be forced to purchase massive supercomputers, which would be impractical due to the cost.

Advantages of scaling out

Scaling out gives your Java EE application a better level of system availability than scaling up does. If there are multiple servers catering to requests concurrently, then should any server crash, there will be other servers that can bear the workload while the crashed server is fixed. This helps keep users happy because, while it may run slower, the application never goes offline.

Scaling out also has a number of strategic advantages when it comes to application performance. With Facebook, for example, the ability to store data in local data centers helps keep performance levels high. And this is no small thing.

As this article highlights, “At Facebook, we have unique storage scalability challenges when it comes to our data warehouse. Our warehouse stores upwards of 300 PB of Hive data, with an incoming daily rate of about 600 TB. In the last year, the warehouse has seen a 3x growth in the amount of data stored.”

How to scale out

An illustration of a load balancer

Scaling out is usually done with the help of a load balancer. It receives all the incoming requests and then routes them to different servers based on availability. This ensures that no single server becomes a bottleneck and that the workload is distributed evenly.

If one server in the cluster fails, the load balancer routes the incoming requests to other servers in the cluster. Load balancers also make scaling out your Java EE application much easier.

Should you want to add another server to the cluster, then the load balancer can start directing traffic to the server right away. This saves valuable time that is otherwise used in complicated configurations of the servers.
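The routing logic a load balancer performs can be sketched with simple round-robin selection. This is an illustration of the idea, not a real load balancer; the class and server names are my own:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round-robin routing: each incoming request is sent to the next
// server in the cluster in turn, spreading the workload evenly.
class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger(0);

    RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    String pick() {
        // floorMod keeps the index valid even after the counter wraps
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }
}
```

With `new RoundRobinBalancer(List.of("app-1", "app-2", "app-3"))`, successive calls to `pick()` hand out app-1, app-2, app-3, app-1, and so on. Production balancers layer health checks on top, skipping servers that fail to respond.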

Pitfalls of using a Load Balancer

Some applications use state to store session information for the client. This state, such as an HTTP session object, stores user information like shopping cart contents and must be present on the server for requests to be executed.

With a simple load-balanced architecture, the load balancer can redirect subsequent requests to different servers based on availability. The new server will not have the session state, and the user's request cannot be processed.

Every time a user accesses the application, they may be directed to a different server. This means that the user will have to submit all their previous data again to the new server, something which slows network performance.

To counter this problem, sticky sessions can be used. Sticky sessions are implemented on the load balancer to make sure that subsequent requests from a client go to the same server every time. This is also known as server affinity.
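One way to picture server affinity is to derive the target server from the session id, so the same client always maps to the same node. Real load balancers usually pin clients with a cookie instead; this hash-based sketch (all names are my own) just demonstrates the affinity property:

```java
import java.util.List;

// Sticky routing: the same session id always hashes to the same
// server, so that server keeps seeing the same client's requests.
class StickyBalancer {
    private final List<String> servers;

    StickyBalancer(List<String> servers) {
        this.servers = servers;
    }

    String pick(String sessionId) {
        // floorMod guards against negative hash codes
        int i = Math.floorMod(sessionId.hashCode(), servers.size());
        return servers.get(i);
    }
}
```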

This helps resolve the aforementioned problem but does, however, give rise to another problem. If the server that created the session for the client crashes, then the next request will be forwarded to another server in the cluster which will not have the state information. This eventuality brings us back to square one.

To counter this problem, we move beyond a single server and focus on creating integrated server networks. If all the state data were stored in a single database accessible to multiple servers, then if any server went offline, the state data would still be accessible to the next server that responds to the client's requests.

However, the drawback of such a system is that database accesses can be a time-intensive process that can decrease the performance of the system.
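The shared-store idea can be sketched as follows, with an in-process map standing in for the external database or data grid that every server in the cluster would actually talk to (the class and method names are my own):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Shared session store: session state lives outside any one app
// server, so whichever server handles the next request can read it.
// A ConcurrentHashMap stands in for the real networked store here.
class SharedSessionStore {
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    void put(String sessionId, String key, Object value) {
        sessions.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>())
                .put(key, value);
    }

    Object get(String sessionId, String key) {
        Map<String, Object> session = sessions.get(sessionId);
        return session == null ? null : session.get(key);
    }
}
```

In a real cluster the `put` and `get` calls would cross the network, which is exactly the latency cost the paragraph above describes.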

Solutions like Oracle Coherence provide an in-memory distributed solution for clustered application servers. This provides a fast messaging service between servers that can exchange critical data like the user states.

For now, this is one of the best solutions for making user state available on all servers simultaneously. There is no need for expensive database operations, and the data is reliably shared across all of an application's servers.

Final Thoughts

Java EE applications are unique to the architecture they were designed on. Even though they may use the same standard libraries and APIs, two separate applications with the same functionality can be designed with different priorities in mind.

This creates problems when a single strategy is used to scale two different Java EE applications.

How you scale a Java EE application depends largely on the design of the application. Scaling out by adding more servers is generally a good way to scale the application to cope with a bigger load. But as I have discussed, the way to do this also depends on the type of application that needs to be scaled.

Understanding the architecture of your application and the different types of scaling, including the strengths and weaknesses of each approach, will ensure that you will scale your Java EE application successfully.

DevTeam.Space is a vetted community of expert dev teams supported by an AI-powered agile process.

Companies like Samsung, Airbus, NEC, and startups rely on us to build great online products. We can help you too, by making it easy to hire and manage expert developers.
