JVM Memory Model
Hello guys, in this blog we are going to talk about the different areas of memory that the JVM creates and manages throughout the lifecycle of a program. I have tried my best to explain them in detail with examples and diagrams.
Before diving deep, let us see how the JVM resides in memory (RAM).
Like other applications, the JVM also occupies some part of the RAM. The occupied RAM is divided into the following categories:
- Method Area
- Code Cache
- Heap Memory
- Native Stack
Before looking into each one of them, one must know how classes are loaded into the memory. Let us understand with the help of an example.
The main purpose of a class loader is to locate the .class file using the path and classpath, read its data, and pass it to the JVM. There are 3 different types of class loaders.
- Bootstrap (displayed as null)
- Extension (called the Platform class loader since Java 9)
- Application (also called the System class loader)
A bootstrap class loader is the parent of all other class loaders. For example, the ArrayList class is loaded by the bootstrap loader. Calling getClassLoader() on it returns null because the bootstrap loader is written in native code and not in Java, which is why it doesn’t show up as a class. The Logging class is loaded by the extension class loader, as this class belongs to an extension of the standard core Java libraries. All user-defined classes are loaded by the application class loader.
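A minimal sketch of a program that prints these class loaders is shown below. The class name ClassLoaderDemo and the helper loaderOf are my own choices for illustration, and the loader class names that get printed differ between Java versions, so treat the output as something to explore rather than a fixed result.

```java
import java.util.ArrayList;

// Prints which class loader loaded a few representative classes.
class ClassLoaderDemo {

    // Returns a readable name for the loader of the given class.
    // getClassLoader() returns null for classes loaded by the bootstrap loader.
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap (null)" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // Core library class: loaded by the bootstrap loader.
        System.out.println("ArrayList       -> " + loaderOf(ArrayList.class));
        // User-defined class: normally loaded by the application loader.
        System.out.println("ClassLoaderDemo -> " + loaderOf(ClassLoaderDemo.class));

        // Walk up the parent chain, starting from the application loader.
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        while (loader != null) {
            System.out.println("loader chain    -> " + loader.getClass().getName());
            loader = loader.getParent();
        }
        // The chain ends at the bootstrap loader, which is represented by null.
        System.out.println("loader chain    -> bootstrap (null)");
    }
}
```

Walking the parent chain is a handy way to see the delegation order on whatever JVM you are running, without relying on a specific library class being present.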
Now that we have some background on how classes are loaded, let us see how they are mapped into the method area.
Method Area: The JVM extracts the information from the binary data that was given by the class loader and stores it in the method area. Memory for the class variables (static variables) declared in a class is also taken from the method area. For each type it loads, the JVM stores the following information.
- The fully qualified name of the type
- The fully qualified name of the type’s immediate superclass
- Whether the type is a class or an interface
- The type’s modifiers
Inside the Java class file and the Java virtual machine, type names are always stored as fully qualified names.
- Runtime constant pool
- Field information (field’s name, field’s type, field’s modifiers)
- Method information (method’s name, method’s return type, the number and types of the method’s parameters, method’s modifiers, method’s bytecode)
- Class variables: Class variables are shared among all instances of a class. These variables are associated with the class, not with the instances of the class. So they are logically part of the class data in the method area.
- A reference to the class loader
- A reference to class Class: An instance of the class java.lang.Class is created by the JVM for every type it loads. Basically, it is a corresponding representation of the type loaded in the method area. You can use this java.lang.Class instance to access the metadata of the type.
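To make the list above a bit more concrete, here is a small sketch that reads some of this metadata back through java.lang.Class and the reflection API. The Counter type and the describe helper are hypothetical names made up for this example; the order in which fields and methods are reported is not guaranteed.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical type used only for illustration.
class Counter {
    static int created = 0;   // class variable: one copy shared by all instances
    private final int id;     // instance variable: one copy per object

    Counter() {
        created++;
        id = created;
    }

    int id() {
        return id;
    }
}

class TypeInfoDemo {

    // Builds a short textual summary of the metadata available for a type.
    static String describe(Class<?> type) {
        StringBuilder sb = new StringBuilder();
        Class<?> parent = type.getSuperclass(); // null for interfaces and for Object
        sb.append("name: ").append(type.getName()).append('\n');
        sb.append("superclass: ").append(parent == null ? "none" : parent.getName()).append('\n');
        sb.append("kind: ").append(type.isInterface() ? "interface" : "class").append('\n');
        sb.append("modifiers: ").append(Modifier.toString(type.getModifiers())).append('\n');
        sb.append("class loader: ").append(type.getClassLoader()).append('\n');
        for (Field f : type.getDeclaredFields()) {
            sb.append("field: ")
              .append(Modifier.toString(f.getModifiers())).append(' ')
              .append(f.getType().getName()).append(' ')
              .append(f.getName()).append('\n');
        }
        for (Method m : type.getDeclaredMethods()) {
            sb.append("method: ").append(m.getName())
              .append(" returns ").append(m.getReturnType().getName()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        new Counter();
        new Counter();
        // Both objects incremented the same class variable.
        System.out.println("instances created: " + Counter.created);
        System.out.print(describe(Counter.class));
    }
}
```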
An example of Method Area in use
To run the Volcano application, you give the name “Volcano” to a Java Virtual Machine. Given the name “Volcano”, the virtual machine finds and reads in the file “Volcano.class”. It extracts the definition of class Volcano from the binary data and places the information into the method area. The virtual machine then invokes the main() method, by interpreting the byte codes stored in the method area. As the virtual machine executes main() method, it maintains a pointer to the constant pool for the class Volcano.
The Java Virtual machine has already begun to execute the bytecodes for main() method in the class Volcano even though it hasn’t yet loaded class Lava. The JVM loads classes only as it needs them.
The main() method’s first instruction tells the JVM to load the class Lava.
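A program along the lines of this walkthrough could look like the sketch below. The class names Volcano and Lava and the field speed come from the text; the events list and the static initializers are my own additions, there only to make the order of events visible. Note that a static initializer shows when a class is initialized, which is used here as an observable stand-in for its first use.

```java
import java.util.ArrayList;
import java.util.List;

class Lava {
    // Runs once, when the JVM initializes Lava on its first active use.
    static {
        Volcano.events.add("Lava initialized");
    }

    // The field starts at its default value 0; the constructor then assigns 5.
    private int speed = 5;

    int speed() {
        return speed;
    }
}

class Volcano {
    // Hypothetical event log, added only to make the ordering observable.
    static final List<String> events = new ArrayList<>();

    static {
        events.add("Volcano initialized");
    }

    public static void main(String[] args) {
        Lava lava = new Lava(); // first use of Lava: triggers loading and initialization
        events.add("Lava created, speed = " + lava.speed());
        events.forEach(System.out::println);
    }
}
```

After compiling, `javap -v Volcano` prints the constant pool and the bytecode of main(), and `java -verbose:class Volcano` logs classes as they are loaded. The exact constant pool indices depend on the compiler, so they may not match the ones quoted in this post.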
If you look at the corresponding bytecode, it has an instruction and an index. This index points to an entry in the constant pool. Here, entry 2 of the constant pool says that it is a class and that the name of the class is at index 16. When the virtual machine discovers that it hasn’t yet loaded a class named “Lava”, it proceeds to find and read in the file Lava.class. It extracts the definition of class Lava from the binary data and places the information in the method area.
The Java virtual machine then replaces the symbolic reference in Volcano’s constant pool entry, which is just the string “Lava”, with a pointer to the class data for Lava. If the virtual machine ever has to use Volcano’s constant pool entry again, it won’t have to go through the relatively slow process of searching through the method area for the class Lava. It can just use the pointer to quickly access the class data for Lava. This process of replacing symbolic references with direct references is called constant pool resolution.
Finally, the virtual machine is ready to allocate memory for a new Lava object. Once again, the virtual machine consults the information stored in the method area. It uses the pointer to the Lava data to find out how much heap space is required by a Lava object.
Once the Java virtual machine has determined the amount of heap space required by the Lava object, it allocates that space on the heap and initializes the instance variable speed to 0, its default initial value. Any explicit initial value, such as 5, is assigned only afterwards, when the constructor runs. If class Lava’s superclass, Object, has any instance variables, those are also initialized to their default initial values.
The first instruction of the main() method completes by pushing a reference to the new Lava object onto the stack. Let us now look into some other important points of the method area.
The method area is shared among all JVM threads and is created on JVM startup. It is only a logical area described by the JVM specification; every JVM implementation must provide its own implementation of it.
Prior to Java 8, the HotSpot virtual machine implemented the method area as the PermGen (permanent generation) space. PermGen was managed alongside the heap and had a fixed maximum size, set with -XX:MaxPermSize.
When PermGen was introduced, there was no dynamic class unloading, so once a class was loaded, it stayed in memory until the JVM shut down. As a result, the JVM could end up throwing java.lang.OutOfMemoryError: PermGen space.
To overcome this out-of-memory error, the PermGen space was replaced by Metaspace in Java 8.
Metaspace is part of non-heap (native) memory and grows automatically, limited only by what the underlying OS provides unless a cap is set with -XX:MaxMetaspaceSize. The garbage collector automatically triggers the cleanup of dead classes and class loaders once class metadata usage reaches a threshold (the high-water mark, initially set by -XX:MetaspaceSize).
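If you want to see these areas on your own machine, the standard java.lang.management API can list the memory pools that the running JVM exposes. The sketch below is just one way to do it; the class name MemoryPoolsDemo is made up, and the pool names you get depend on the JVM and the garbage collector in use (on a HotSpot JVM from Java 8 onwards I would expect a non-heap pool named Metaspace, but that is an assumption about your setup).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {

    // Returns the names of all memory pools of the given type (HEAP or NON_HEAP).
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```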
Code Cache: A purely interpreted language runtime (a classic PHP interpreter, for example) interprets each line every time it executes, even if that line runs many times. This repeated interpretation leads to slower execution. To avoid this situation, the JVM makes use of the JIT compiler.
The Just-In-Time (JIT) compiler is the one that fills the code cache. The JVM monitors which parts of the bytecode are executed most often. It then speeds up execution by having the JIT compiler convert this frequently used bytecode into native code. The frequently used bytecode can be a method that is executed many times or a hot code block such as a loop. The machine can execute this native code directly. The JIT-compiled native code is stored in the code cache.
The process of compiling the frequently used byte code into the native code is done in a separate thread. Initially, the JVM will continue to use the byte code and once the JIT compilation is ready, it then makes use of the native code.
Let us understand the native code compilation with the help of an example.
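A minimal sketch of such a program is shown here. The names CodeCacheDemo and PrimeNumber::isPrime are taken from the discussion in this section, but the body is my own guess at a suitable workload: a method that is called many times so that the JIT compiler has a reason to compile it.

```java
// Hypothetical helper; the names mirror the ones discussed in this section.
class PrimeNumber {

    // Trial division: returns true if n is a prime number.
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        for (int i = 2; (long) i * i <= n; i++) {
            if (n % i == 0) {
                return false;
            }
        }
        return true;
    }

    // Counts the primes in the range [2, limit].
    static int countPrimesUpTo(int limit) {
        int count = 0;
        for (int n = 2; n <= limit; n++) {
            if (isPrime(n)) {
                count++;
            }
        }
        return count;
    }
}

class CodeCacheDemo {
    public static void main(String[] args) {
        // Calling isPrime a couple of million times makes it "hot" enough
        // for the JIT compiler to consider compiling it.
        int limit = 2_000_000;
        System.out.println("Primes up to " + limit + ": " + PrimeNumber.countPrimesUpTo(limit));
    }
}
```

Which methods get compiled, and at which point, depends on the JVM version and its settings, so the compilation log will vary from run to run.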
To see what is happening in the JIT compilation, we need to run the above program using a flag.
java -XX:+PrintCompilation CodeCacheDemo
The output will be something like this
The first column indicates the time in milliseconds since the JVM started. The 2nd column is the compilation ID, which reflects the order in which the code was compiled. The next column holds a set of indicators, given below.
- n: the method is a native method (a wrapper is generated for it)
- s: the method is synchronized
- %: the compilation is an on-stack replacement (OSR), that is, a loop was compiled and swapped in while the method was still running
- !: the method contains exception handlers
In the output, java.lang.Object::hashCode has the indicator “n” because this method is not written in Java but in native code. PrimeNumber::isPrime has been called a lot of times and has the indicator “%”. That means isPrime has been compiled into native code through on-stack replacement, and the result is stored in the code cache to increase the performance of the program.
The number next to the indicators is the compilation tier. It ranges from 0 to 4: 0 means the code is interpreted and not compiled, 1 to 3 are produced by the C1 compiler with increasing amounts of profiling, and 4 is the highest level of optimization, produced by the C2 compiler. One thing to note: in HotSpot, compiled code at every tier is stored in the code cache, which has a limited size (adjustable with -XX:ReservedCodeCacheSize).
Heap Memory: Whenever a class instance or an array is created in a running Java application, its memory is allocated from a single heap. As there is only one heap inside the JVM, all threads share it. With a generational garbage collector, the heap is divided into 2 parts:
- The young generation: It is further divided into the Eden space and 2 survivor spaces (S0 and S1). Newly created objects are allocated in the Eden space. When Eden fills up, a minor GC is performed and the surviving objects are copied into one of the survivor spaces. On each following minor GC, the live objects in Eden and in the occupied survivor space are copied into the other, empty survivor space, and the two survivor spaces swap roles. Objects that survive enough rounds of GC are promoted to the old generation.
- The old generation: This is the area where long-lived objects reside. Once this area gets full, a major GC is performed to reclaim the memory held by objects that are no longer reachable.
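To watch the heap and the collectors at work, you can query the JVM through java.lang.management. The sketch below allocates a lot of short-lived objects and then reports how many collections the JVM has run. HeapDemo and its helper methods are made-up names, and whether any collection actually happens during the loop depends on your heap size and on the collector, so no particular numbers should be expected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapDemo {

    // Current heap usage in bytes, as reported by the JVM.
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Total number of collections reported by all collectors so far.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long collectionsBefore = totalCollections();

        // Allocate many short-lived objects; most of them become garbage right away.
        long checksum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            byte[] garbage = new byte[128];
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }

        System.out.println("checksum (keeps the loop alive): " + checksum);
        System.out.println("heap used now (bytes): " + heapUsedBytes());
        System.out.println("collections during the loop: " + (totalCollections() - collectionsBefore));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("collector: " + gc.getName() + ", count = " + gc.getCollectionCount());
        }
    }
}
```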
Stack: When a new thread is launched, the JVM creates a new Java stack for that thread. A Java stack stores the thread’s state in discrete frames. The JVM can perform only 2 operations directly on the Java stacks: it pushes and pops frames. When a thread invokes a method, the JVM creates and pushes a new frame onto the thread’s stack. All the data on a thread’s Java stack is private to that thread.
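The push and pop behaviour can be observed from Java code by counting the frames on the current thread’s stack. In this sketch (StackDemo and framesAfter are names I picked for illustration), every extra method call adds one frame, and a new thread starts counting from its own, separate stack.

```java
class StackDemo {

    // Calls itself `calls` more times, then reports how many frames
    // are currently on the calling thread's stack.
    static int framesAfter(int calls) {
        if (calls == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return framesAfter(calls - 1);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("main thread, 0 extra calls: " + framesAfter(0));
        System.out.println("main thread, 5 extra calls: " + framesAfter(5));

        // A new thread gets its own stack, independent of the main thread's stack.
        Thread worker = new Thread(() ->
                System.out.println("worker thread, 5 extra calls: " + framesAfter(5)));
        worker.start();
        worker.join();
    }
}
```

The absolute numbers depend on how the program is launched, but the difference between the two main-thread lines should be the number of extra calls.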
To know more about the Java stack, see my other blog post, “The Ultimate Stack Frame”, where I discuss the local variable array and the operand stack.
Native Method Stack: When a thread invokes a native method, it enters a new world in which the structures and security restrictions of the JVM no longer hamper its freedom. A native method can likely access the runtime data areas of the virtual machine, but can also do anything else it wants. It may use registers inside the native processor, allocate memory on any number of native heaps, or use any kind of stack.
When a thread invokes a Java method, the virtual machine creates a new frame and pushes it onto the Java stack. When a thread invokes a native method, however, that thread leaves the Java stack behind. Instead of pushing a new frame onto the thread’s Java stack, the Java virtual machine will simply dynamically link to and directly invoke the native method.
If an implementation’s native method interface uses a C-linkage model, then the native method stacks are C stacks.
A native method interface will likely be able to call back into the Java virtual machine and invoke a Java method. In this case, the thread leaves the native method stack and enters another Java stack.
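Writing your own native method needs a compiled native library, which is out of scope here, but you can at least check which JDK methods are declared native. The sketch below uses reflection for that; NativeMethodDemo and isNative are hypothetical names, and the check only looks at public methods without parameters.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {

    // Returns true if the named public no-argument method is declared native.
    static boolean isNative(Class<?> type, String methodName) {
        try {
            Method method = type.getMethod(methodName);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Object.hashCode is implemented in native code, String.length in Java.
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode"));
        System.out.println("String.length native?   " + isNative(String.class, "length"));
    }
}
```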
Program Counter: Each thread of a running program has its own pc register, or program counter, which is created when the thread is started. The pc register is one word in size, so it can hold both a native pointer and a returnAddress. As a thread executes a Java method, the pc register contains the address of the current instruction being executed by the thread. An “address” can be a native pointer or an offset from the beginning of a method’s bytecodes. If a thread is executing a native method, the value of the pc register is undefined.
So that's it from this blog. I hope you guys liked it.