Sunday, April 1, 2012

What's Cooking in Java 8 - Project Jigsaw

What is Project Jigsaw: Project Jigsaw is the project to make the Java compiler module-aware. For years the Java API has been monolithic, i.e. the whole API is equally visible from any part of the code. There has also been no way to declare a codebase's dependency on other user libraries. Project Jigsaw attempts to solve these problems, along with others, in a very elegant way. In this article, I will highlight the basic concepts of the Jigsaw module system and walk through the commands involved to give a real feel for it. Currently, Jigsaw is targeted for inclusion in the release of Java 8. In my opinion, this is a bigger change than generics, which came with version 5 of the Java platform.

What is Achieved by Project Jigsaw: As I explained earlier, Project Jigsaw solves the problem of the whole Java API being used as a single monolithic codebase. The following points highlight the main advantages.

1. Dependency Graph: Jigsaw gives a way to uniquely identify a particular codebase, and also to declare a codebase's dependencies on other codebases. This creates a complete dependency graph for a particular set of classes. Say, for example, you want to write a program that depends on the Apache BCEL library. Until now, there was no way to express this requirement in the code itself. With Jigsaw you can, which allows tools to resolve the dependency.

2. Multiple Versions of the Same Code: Suppose you write a program that depends on both library A and library B. Now suppose library A depends on version 1.0 of library C and library B depends on version 2.0 of library C. In the current Java runtime, you cannot use libraries A and B at the same time without creating a complex hierarchy of custom classloaders, and even that would not work in all cases. Once Jigsaw becomes part of Java, this is no longer a problem, because a class will see only those versions of the classes it depends on that belong to the module versions required by its containing module. That is to say, since module A depends on version 1.0 of module C, and module B depends on version 2.0 of module C, the Java runtime can figure out which version of the classes in module C should be visible to module A and which to module B. This is similar to what the OSGi project does.

3. Modularization of Java Platform Itself: The current Java platform API is huge, and not all parts of it are relevant in every case. For example, a Java platform intended to run a Java EE server does not need to implement the Swing API, as that would not make any sense. Similarly, embedded environments could strip out APIs that are less important for them, like the compiler API, to become smaller and faster. Under the current Java platform this is not possible, as any certified Java platform must implement all the APIs. Jigsaw will provide a way to implement only the part of the API set that is relevant to a particular platform. Since a module can explicitly declare its dependency on any particular Java API module, it will run only when the platform has an implementation of the modules it requires.

4. Integration with OS native installation: Since the module system is very similar to what modern operating systems already offer for installing programs and libraries, Java modules can be integrated with those systems. This is in fact out of the scope of the Jigsaw project itself, but OS vendors are encouraged to enable it and would most likely do so. For example, the rpm based repository system available in Red Hat based Linux systems and the apt based repository system available in Debian based Linux systems can easily be enhanced to support Java modules.

5. Module Entry Point: Java modules can specify an entry point class, just as jars can. When a module is run, the entry point's main method is invoked. Since the OS can install a Java module and the Java module can then be executed, this is very similar to installing one of the OS's native programs.

6. Efficiency: Currently, every time a JVM is run, it verifies the integrity of every single class that is loaded during the run of the program. This takes a considerable amount of time. The classes are also accessed individually from the OS file system. Since modules can be installed before running, the installation itself can include the verification step, which eliminates the need to verify the classes at runtime. This will lead to a considerable performance improvement. The module system can also store the classes in its own optimized manner, improving performance further.

7. Module Abstraction: It is possible to provide an abstraction for a particular module. Say module A depends on module X. Module D can then declare that it provides module X, thus supplying its implementation. For example, the Apache Xerces modules would want to provide jdk.jaxp and would then be able to satisfy a dependency requirement for jdk.jaxp.
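To make the dependency graph, multiple versions and module abstraction points a little more concrete, here is a toy model in plain Java. It is only a sketch of the idea, not how Jigsaw's resolver is implemented: the class ModuleGraphSketch, its methods, and the module names (app, libA, libB, libC, D, X) are all made up for illustration, and versions are matched by exact string equality, which is a simplifying assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ModuleGraphSketch {

    // A toy module: an id of the form "name@version", the ids it requires,
    // and the ids it can stand in for ("provides"). All hypothetical.
    static final class Module {
        final String id;
        final List<String> requires;
        final List<String> provides;

        Module(String id, List<String> requires, List<String> provides) {
            this.id = id;
            this.requires = requires;
            this.provides = provides;
        }
    }

    // The "module library": looked up by a module's own id or by an id it provides.
    private final Map<String, Module> library = new HashMap<>();

    public void install(Module m) {
        library.put(m.id, m);
        for (String alias : m.provides) {
            library.put(alias, m);
        }
    }

    // Returns the ids of all modules reachable from rootId, depth first.
    public Set<String> resolve(String rootId) {
        Set<String> seen = new LinkedHashSet<>();
        visit(rootId, seen);
        return seen;
    }

    private void visit(String id, Set<String> seen) {
        Module m = library.get(id);
        if (m == null) {
            throw new IllegalStateException("Unresolved dependency: " + id);
        }
        if (!seen.add(m.id)) {
            return; // already resolved
        }
        for (String dep : m.requires) {
            visit(dep, seen);
        }
    }

    // Builds the scenario from points 2 and 7 with made-up module names.
    public static ModuleGraphSketch demoLibrary() {
        List<String> none = Collections.emptyList();
        ModuleGraphSketch lib = new ModuleGraphSketch();
        lib.install(new Module("libC@1.0", none, none));
        lib.install(new Module("libC@2.0", none, none));
        lib.install(new Module("libA@1.0", Arrays.asList("libC@1.0"), none));
        lib.install(new Module("libB@1.0", Arrays.asList("libC@2.0"), none));
        lib.install(new Module("D@1.0", none, Arrays.asList("X@1.0")));
        lib.install(new Module("app@1.0",
                Arrays.asList("libA@1.0", "libB@1.0", "X@1.0"), none));
        return lib;
    }

    public static void main(String[] args) {
        ModuleGraphSketch lib = demoLibrary();
        // prints [app@1.0, libA@1.0, libC@1.0, libB@1.0, libC@2.0, D@1.0]
        System.out.println(lib.resolve("app@1.0"));
        // prints [libA@1.0, libC@1.0]
        System.out.println(lib.resolve("libA@1.0"));
    }
}
```

Both versions of libC end up in the graph of app, but resolving libA on its own only ever reaches libC@1.0, and the requirement for X@1.0 is satisfied by D@1.0 because D declares that it provides X.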

Basics of Modular Codebase: All of the above discussion is pretty vague without a real example of a modular codebase and its usage. A modular codebase can be either single-module or multi-module. In the single-module case, all we need to do to enable modules is create a file named module-info.java at the base of the source path, outside any package. This file is a special Java file written in a special syntax designed to declare module information. The following is an example of such a file.

module com.a @ 1.0{

        requires com.b @ 1.0;
        class com.a.Hello;
}

In this case the module is named com.a and it has a dependency on com.b. It also declares an entry point, com.a.Hello. Note that the package structure is not required to resemble the module name, although that would probably be a best practice.

Now you might be wondering: if this is single-module mode, why is there a dependency on a different module? Doesn't that make it two modules? Notice that even when only one dependency is declared explicitly, there is an implicit dependency on all Java API modules. If none of the Java API modules is declared explicitly as a dependency, all of them are included. The only reason it is still single-module is that com.b must already be available in binary form in the module library. It is multi-module when more than one module is compiled at the same time. Compiling a single-module source is as simple as compiling a non-modular source. The only difference is that module-info.java will be present in the source root.

Multi-module Source: If the source contains multiple modules, they must follow a directory structure. It is pretty simple though. The source of a particular module must be kept in a directory named after the module. For example, the source for the class com.a.Hello in the module com.a must be kept in [source-root]/com.a/com/a/ and its module-info.java must be kept in the directory [source-root]/com.a
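The layout rule can be written down as a small path computation. This is just an illustrative sketch; the class ModuleSourceLayout and its two methods are hypothetical helpers, not part of any Jigsaw tool.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ModuleSourceLayout {

    // Expected location of a class's source file:
    // [source-root]/<module name>/<package as directories>/<Class>.java
    public static Path sourceFile(Path sourceRoot, String moduleName, String className) {
        String relative = className.replace('.', '/') + ".java";
        return sourceRoot.resolve(moduleName).resolve(relative);
    }

    // Expected location of the module declaration:
    // [source-root]/<module name>/module-info.java
    public static Path moduleInfo(Path sourceRoot, String moduleName) {
        return sourceRoot.resolve(moduleName).resolve("module-info.java");
    }

    public static void main(String[] args) {
        Path src = Paths.get("src");
        System.out.println(sourceFile(src, "com.a", "com.a.Hello"));
        System.out.println(moduleInfo(src, "com.a"));
    }
}
```

Note that the module name com.a stays a single directory, while the package name com.a is split into com/a below it.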

Compiling Multi-module Source: For this, let us consider an example of compiling two modules, com.a and com.b. Let us first take a look at the directory structure, as below:

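 src
 |--com.a
 |  |--module-info.java
 |  |--com
 |     |--a
 |        |--Hello.java
 |--com.b
    |--module-info.java
    |--com
       |--b
          |--Printer.java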

The code for module-info.java in com.a would be like this.

module com.a @ 1.0{

        requires com.b @ 1.0;
        class com.a.Hello;
}

The module-info.java in com.b

module com.b @ 1.0{
        exports com.b;
}

Printer.java in com.b/com/b

package com.b;

public class Printer{
        public static void print(String toPrint){
                System.out.println(toPrint);
        }
}

Hello.java in com.a/com/a

package com.a;
import com.b.Printer;

public class Hello{
        public static void main(String [] args){
                Printer.print("Hello World!");
        }
}

The code is pretty self-explanatory: we are trying to use the com.b.Printer class in module com.b from the com.a.Hello class in module com.a. For this, it is mandatory for com.a to declare com.b as a dependency with the requires keyword. We want the output class files to be created in the classes directory. The following javac command would do that.

javac -d classes -modulepath classes -sourcepath src `find src -name '*.java'`

Note that we have used the find command in backquotes (`) so that the command's output is included as the file list. This will work in Linux and Unix environments. In other environments we can simply type in the list of files.

After compilation, the classes directory will contain the class files in a similar per-module structure. Now we can install the modules using the jmod command.

jmod create -L mlib
jmod install -L mlib classes com.b
jmod install -L mlib classes com.a

We first created a module library, mlib, and then installed our modules in it. We could also have used the default library by not specifying the -L option to the install command in jmod.

Now we can simply run module com.a using

java -L mlib -m com.a

Here too we could have used the default module library. It is also possible to create a distributable module package [equivalent to a jar in today's distribution mechanism] that can be installed directly. For example, the following will create com.a@1.0.jmod for com.a

jpkg -m classes/com.a jmod com.a

I have tried to outline the module infrastructure in the upcoming Java release. However, Project Jigsaw is being modified every day and could turn out to be a completely different thing altogether in the end. But the basic concepts are expected to remain the same. The full module concepts are more complex, and I will cover the details in an upcoming article.


jar command examples said...

when is the beta for JDK8 getting available ?

Debasish Ray Chawdhuri said...

I have no idea (although initially Oracle was targeting mid 2012). But considering the current state of development, it's going to take a while. These are of course only my guesses.

Hugi Thordarson said...

Great overview, thanks for taking the time! Jigsaw will be a huge improvement.

Svetlana said...

Thanks for the useful tips.

bouquetf said...

Hi, you may have a look at openjdk which you can get involved in :

