Geeky Articles: An Anatomy of Java Generics

1. What are Generics: Generics are a feature of the Java language for creating parameterized types. A parameterized type is a type that takes other types as arguments, mainly so that the compiler can catch errors that would otherwise surface as runtime exceptions.

For example, consider the programs below. The first one does not use generics. We use the Java Collections API to store and retrieve objects. In this program, the compiler cannot know what is actually stored in the collection. It is the programmer's job to make sure that the objects stored are of the expected type and that they are not cast to a wrong type on retrieval. Not only does this take a lot of nasty boilerplate code, it also obscures the purpose of the collection altogether. Generics address this by passing type arguments to a type to specify which types it works with, as we will see.

WithoutGenerics.java WithoutGenericsError.java WithGenerics.java WithGenericsError.java

package com.geekyarticles.generics;

import java.util.ArrayList;
import java.util.List;

public class WithoutGenerics {
    public static void main(String [] args){
        List list=new ArrayList();
        list.add("Hello");
        list.add("Hi");

        String firstValue=(String)list.get(0);//An explicit cast is necessary here
    }
}

package com.geekyarticles.generics;

import java.util.ArrayList;
import java.util.List;

public class WithoutGenericsError {

    /**
     * @param args
     */
    public static void main(String[] args) {
        List list=new ArrayList();
        list.add("Hello");
        list.add("Hi");

        Integer firstValue=(Integer)list.get(0);//Oops! runtime exception
    }

}

package com.geekyarticles.generics;

import java.util.ArrayList;
import java.util.List;

public class WithGenerics {

    /**
     * @param args
     */
    public static void main(String[] args) {
        List<String> list=new ArrayList<String>();
        list.add("Hello");
        list.add("Hi");

        String firstValue=list.get(0);//No casting is required

    }

}

package com.geekyarticles.generics;

import java.util.ArrayList;
import java.util.List;

public class WithGenericsError {

    /**
     * @param args
     */
    public static void main(String[] args) {
        List<String> list=new ArrayList<String>();
        list.add(Integer.valueOf(2));//Compile time error: an Integer is not a String
        list.add("Hi");

        String firstValue=list.get(0);//No casting is required

    }

}

In WithoutGenericsError.java, we try to cast a String object to Integer. Since the return type of the get method is Object, the compiler has no way of knowing that there is a problem. At runtime, a ClassCastException is thrown.

Now let us look at WithGenerics.java. Here we tell the compiler that we are creating a List of String objects, not just any objects. That way, the compiler not only knows that the get method will return a String, it also raises an error if we try to add an Integer to the List, as WithGenericsError.java does. In the rest of the article, we will discuss the various semantics and constraints of Java generics.

This tutorial is designed for people with some exposure to Java generics. However, even if you have none, you can catch up with a little more effort.

2. Generic Class Declarations: The following code gives an example of a generic class declaration.

GenericClass.java

package com.geekyarticles.generics;

import java.util.ArrayList;
import java.util.List;

public class GenericClass<E , T extends E> {
    public E getFirst(List<T> list){
        if(list.size()>0)
            return list.get(0);
        else
            return null;
    }

    public static void main(String [] args){
        GenericClass<List<?>, List<String>> genObject=new GenericClass<List<?>, List<String>>();
        List<String> aList=null;//new ArrayList<String>();
        List<?> x=new ArrayList<String>();
        List<List<String>> listList=new ArrayList<List<String>>();
        listList.add(aList);
        genObject.getFirst(listList);

    }
}

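Here E and T are called type variables. Each stands for some type. There is also a restriction on the relation between E and T: T must be a subtype of E, whatever E is. A note on ordering: the compilers of the Java 6 era treated type variables like local variables and rejected a bound that refers to a type variable declared later in the list, reporting an illegal forward reference. The language specification puts every type parameter in scope for the whole type parameter section, and javac from Java 7 onward accepts the following declaration.

public class GenericClass<T extends E, E> { ... } //Error only on older compilers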

It is possible to constrain a type variable to extend more than one type by separating the bounds with an & symbol, as shown in the following.

public class GenericClass<E, T extends List<E>& Comparable<E> > { ... }

At most one of the types in the &-separated list may be a class, and it must be listed first. Every other bound must be an interface. Anything else is a compile-time error.
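As a minimal sketch of a type variable with several bounds, consider the following. The class BoundsDemo and the method maxOf are hypothetical names made up for illustration; they are not part of the article's code. The class bound (Number) comes first and the interface bound (Comparable<T>) follows it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example class, not from the article.
class BoundsDemo {
    // T must be a Number (the class bound, listed first) and also Comparable<T>
    // (an interface bound). Methods of both are usable on values of type T.
    static <T extends Number & Comparable<T>> T maxOf(List<T> values) {
        T best = values.get(0);
        for (T candidate : values) {
            if (candidate.compareTo(best) > 0) { // compareTo comes from Comparable<T>
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 9, 4);
        Integer max = maxOf(numbers);
        System.out.println(max.intValue()); // intValue comes from Number; prints 9
    }
}
```

Integer qualifies as T here because it extends Number and implements Comparable<Integer>.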

3. Wildcard Type Arguments: Wildcards are introduced to support constraints that are less strict. The following shows an example.

WildCardExample.java

package com.geekyarticles.generics;

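public class WildCardExample <E>{

    /**
     * @param args
     */
    public static void main(String[] args) {
        //See that a weaker restriction is imposed on wildCardExample
        WildCardExample<? extends A1> wildCardExample=new WildCardExample<A2>();

    }


}

//A1 and A2 are not shown in the original listing. They are assumed to look like
//this, since the assignment above needs A2 to be a subtype of A1.
class A1 {}
class A2 extends A1 {}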

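Wildcards can be used as type arguments in the declared types of references, method parameters and return types, and values of such types can be passed in method invocations. Note that a wildcard is not allowed as the type argument when creating an object (new WildCardExample<?>() is an error) or in the supertypes named in a class declaration.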

3.1. Bounds: If the wildcard is just a ?, it is an unbounded wildcard: its upper bound is Object and its lower bound is the null type. A wildcard of the form ? extends T has the upper bound T. A wildcard of the form ? super T has the lower bound T.
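A small sketch may help to see what the two kinds of bounds buy us. The class WildcardBounds and its methods sum and fill are hypothetical names chosen for this illustration. An upper bound makes reading safe, a lower bound makes writing safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not from the article.
class WildcardBounds {
    // Upper bound: every element is at least a Number, so reading as Number is safe.
    static double sum(List<? extends Number> source) {
        double total = 0;
        for (Number n : source) {
            total += n.doubleValue();
        }
        return total;
    }

    // Lower bound: the list holds Integer or some supertype of Integer,
    // so adding an Integer is safe.
    static void fill(List<? super Integer> sink, int count) {
        for (int i = 1; i <= count; i++) {
            sink.add(i);
        }
    }

    public static void main(String[] args) {
        List<Number> numbers = new ArrayList<Number>();
        fill(numbers, 3);                 // List<Number> fits ? super Integer
        System.out.println(sum(numbers)); // List<Number> fits ? extends Number; prints 6.0
    }
}
```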

3.2. Type Argument Containment and Equivalence: A type argument is said to contain another if the set of types represented by the first is a superset (not to be confused with supertype) of the set of types represented by the second. For example, ? extends Number contains ? extends Integer, because every subtype of Integer is also a subtype of Number.

? extends T contains ? extends S if and only if S is a subtype of T
? super T contains ? super S if and only if S is a supertype of T
T itself contains T
? extends T contains T
? super T contains T
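Containment is what decides whether one parameterized type can be assigned to another. The following sketch shows both directions with Integer and Number. The class ContainmentDemo and its methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not from the article.
class ContainmentDemo {
    // Compiles because ? extends Number contains ? extends Integer.
    static List<? extends Number> widenUpper(List<? extends Integer> narrow) {
        return narrow;
    }

    // Compiles because ? super Integer contains ? super Number.
    static List<? super Integer> widenLower(List<? super Number> narrow) {
        return narrow;
    }

    // The opposite directions do not compile, for example:
    // static List<? extends Integer> bad(List<? extends Number> wide) { return wide; }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<Integer>();
        ints.add(1);
        List<Object> objects = new ArrayList<Object>();

        List<? extends Number> upper = widenUpper(ints);
        List<? super Integer> lower = widenLower(objects);
        lower.add(upper.get(0).intValue()); // an Integer may be written through ? super Integer
        System.out.println(objects);        // prints [1]
    }
}
```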

3.3. Capture Conversion: A non-static method in a generic class can use the class's type parameters in its own parameter list, and it can also use wildcards there. A reference to the class can in turn have wildcards among its type arguments. To handle this, the compiler replaces each wildcard with an unnamed, imaginary type called a capture (that is why error messages mention things like capture#2 of ? extends E). A capture carries all the constraints that apply to it: the bounds of the wildcard in the reference type and the bounds declared for the corresponding type parameter. If the arguments of a method call satisfy all the constraints thus obtained, the call compiles.

Note that if, after capture conversion, a method parameter has the type of a capture without a lower bound, the method cannot be called with any argument other than null. This is because an argument must be a subtype of the parameter type, and without a lower bound the compiler cannot confirm that for any real type. However, if the method itself declares a parameter using a wildcard without a lower bound, there is no problem, because the method is written to cater for that. The following code shows why.

WildCardExample2.java

package com.geekyarticles.generics;

public class WildCardExample2 <E>{

    private E var;
    public void setVar(E n){
        var=n;
    }
    public E getVar(){
        return var;
    }
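    public static void main(String[] args) {
        //A1 and A2 are assumed to be classes in this package, with A2 extending A1.
        //See that a weaker restriction is imposed on wildCardExample
        WildCardExample2<? extends A1> wildCardExample=new WildCardExample2<A2>();
        A2 a2=new A2();
        //After capture conversion, the parameter type of setVar is "capture of ? extends A1",
        //which has no lower bound. No type other than the null type is known to be a
        //subtype of this capture, so setVar() cannot be invoked with a2 on this instance.
        //wildCardExample.setVar(a2); //Compile time error
        A1 a1=wildCardExample.getVar(); //Reading is fine: the capture has upper bound A1

    }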


}
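Capture conversion is also what makes the well-known capture helper idiom work. In the sketch below, which uses the hypothetical names CaptureHelperDemo, swapFirstTwo and swapHelper, a method taking List<?> cannot write back what it has read, but it can hand the list to a generic helper. In the helper, the captured type gets the name T, so reads and writes agree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example class, not from the article.
class CaptureHelperDemo {
    // The element type is unknown here, so list.set(0, list.get(1)) would not
    // compile: the compiler cannot prove that the value fits "capture of ?".
    static void swapFirstTwo(List<?> list) {
        swapHelper(list); // capture conversion: the unknown type is bound to T
    }

    // Inside the helper the captured type has a name, so get and set line up.
    private static <T> void swapHelper(List<T> list) {
        T first = list.get(0);
        list.set(0, list.get(1));
        list.set(1, first);
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<String>(Arrays.asList("Hello", "Hi"));
        swapFirstTwo(words);
        System.out.println(words); // prints [Hi, Hello]
    }
}
```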

4. Generic Methods: Generic methods are methods that declare their own type variables. During method invocation, those variables are substituted with real types. The types to substitute may be passed explicitly in the invocation, or they can be inferred implicitly. In both cases, the type arguments must satisfy the declared bounds.

4.1. Generic Method Definition: To define a generic method, you need to declare the type parameters before the return type declaration in angle brackets, as shown in the following example.

MethodArguments.java

package com.geekyarticles.generics;

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class MethodArguments {

    /**
     * @param args
     */
    public static void main(String[] args) {
        List<? super String> aList=new ArrayList<String>();
        //Here the arguments are implicitly inferred. Note that since aList has lower bound
        //of String, the constraint on parameters of addToCollection is satisfied.
        addToCollection(aList, "Hello");
    }

    public static <E, T extends E> void addToCollection(Collection<E> collection, T object){
        //Here after capture, the type argument of add is E. Now since T extends E,
        //argument of type T can be passed.
        collection.add(object);
    }

}

4.1.1. Implicit Type Inference: In the above example, the type arguments are inferred implicitly. Roughly, the inference proceeds in the following steps:

First, a set of initial constraints is formed from the argument types (after capture conversion).
If the arguments passed do not involve any wildcards, the initial constraints give the solution directly. If the initial constraints cannot be satisfied in this case, it is a compile-time error.
If the initial constraints involve wildcards, the resulting relations between the type variables are solved. If multiple solutions exist, the most specific one is chosen.
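The steps above can be seen at work in a small sketch. The class InferenceDemo and the method firstOr are hypothetical names for illustration. In the first call both arguments agree on Integer. In the second call the arguments give Integer and Double, and the compiler settles on a common supertype that is assignable to Number.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not from the article.
class InferenceDemo {
    // Returns the first element of the list, or the fallback if the list is empty.
    static <T> T firstOr(List<? extends T> list, T fallback) {
        if (list.isEmpty()) {
            return fallback;
        }
        return list.get(0);
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<Integer>();
        ints.add(7);

        // Both arguments lead to Integer, so T is inferred as Integer.
        Integer a = firstOr(ints, 0);

        // The arguments lead to Integer and Double; T is inferred as a
        // common supertype of the two, which can be assigned to Number.
        Number b = firstOr(ints, 2.5);

        System.out.println(a + " " + b); // prints 7 7
    }
}
```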

4.1.2. Explicit Type: Types can be explicitly passed as shown in the example.

MethodArgumentsExplicit.java

package com.geekyarticles.generics;

import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class MethodArgumentsExplicit {

    /**
     * @param args
     */
    public static void main(String[] args) {
        List<Object> aList=new ArrayList<Object>();
        //Here the type arguments are passed explicitly. Note that when type arguments are
        //passed explicitly, they are used verbatim. Also note that wildcards are not allowed
        //in the explicit type arguments.
        MethodArgumentsExplicit.<Object,String> addToCollection(aList, "Hello");
    }

    public static <E, T extends E> void addToCollection(Collection<E> collection, T object){
        //Here after capture, the type argument of add is E. Now since T extends E,
        //argument of type T can be passed.
        collection.add(object);
    }

}

It is not possible to pass explicit type arguments without naming the class (for a static method) or the object (for a non-static method) on which the method is called.
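The following sketch shows what that qualifier looks like from inside the class itself. The class ExplicitQualifierDemo and its methods none, echo and run are hypothetical names for illustration; Collections.emptyList is the real JDK method.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical example class, not from the article.
class ExplicitQualifierDemo {
    static <T> List<T> none() {
        // The class name qualifies the call to a static generic method.
        return Collections.<T>emptyList();
    }

    <T> T echo(T value) {
        return value;
    }

    String run() {
        // Writing <String>echo("Hello") without a qualifier is a syntax error.
        // Inside the class, "this" or the class name serves as the qualifier.
        String viaThis = this.<String>echo("Hello");
        List<String> viaClass = ExplicitQualifierDemo.<String>none();
        return viaThis + viaClass.size();
    }

    public static void main(String[] args) {
        System.out.println(new ExplicitQualifierDemo().run()); // prints Hello0
    }
}
```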

5. Type Erasure: During compilation, the information about type arguments is erased and each parameterized type is flattened to a non-generic type. Note that a parameterized type is an instance of a generic type with type arguments supplied. Generic type declarations retain their type variable declarations in the class file, so that they can be instantiated as parameterized types when other code is compiled against the binary. Erasure happens in the following way:

Parameterized types are stripped to raw types
Type variables are replaced by their leftmost bounds, or by Object if they have no bound (this applies to their uses, not their declarations)
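Both rules can be observed at runtime with a little reflection. This is a minimal sketch under the assumption of a hypothetical class ErasureDemo with a generic method accept; only the reflection calls (getClass, getDeclaredMethod, getParameterTypes) are real JDK API.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not from the article.
class ErasureDemo {
    // The leftmost bound of T is Number, so the erased parameter type is Number.
    static <T extends Number & Comparable<T>> void accept(T value) {
    }

    // Parameterized types are stripped to raw types: both lists share one class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> ints = new ArrayList<Integer>();
        return strings.getClass() == ints.getClass();
    }

    // Looks up accept by its erased signature and reports the parameter type.
    static Class<?> erasedParameterType() {
        try {
            Method m = ErasureDemo.class.getDeclaredMethod("accept", Number.class);
            return m.getParameterTypes()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());      // prints true
        System.out.println(erasedParameterType());   // prints class java.lang.Number
    }
}
```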

5.1 Reifiable Types: Reifiable types are types whose type information is fully available at runtime, that is, types that do not change under type erasure. Only the following types are reifiable.

Non-generic class and interface types
Raw types
Primitives
Parameterized types with only unbounded wildcards as type arguments
Arrays of reifiable types
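Reifiability matters in practice for instanceof checks and array creation. The sketch below uses a hypothetical class ReifiableDemo to show the allowed forms, with the rejected forms left in comments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not from the article.
class ReifiableDemo {
    static boolean isList(Object o) {
        // Allowed: List<?> is reifiable, so the check can be done at runtime.
        // "o instanceof List<String>" is rejected here, because the element
        // type is not available at runtime.
        return o instanceof List<?>;
    }

    static List<?>[] makeArray() {
        // Allowed: the element type List<?> is reifiable.
        // "new List<String>[2]" is a compile-time error (generic array creation).
        List<?>[] lists = new List<?>[2];
        lists[0] = new ArrayList<String>();
        lists[1] = new ArrayList<Integer>();
        return lists;
    }

    public static void main(String[] args) {
        System.out.println(isList(new ArrayList<String>())); // prints true
        System.out.println(isList("not a list"));            // prints false
        System.out.println(makeArray().length);              // prints 2
    }
}
```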

6. Backward Compatibility: For backward compatibility, it is possible to assign a raw type instance to a parameterized reference, or an instance of a parameterized type to a raw type reference. The same applies to method argument passing. The first direction, from raw to parameterized, is called unchecked conversion. Mixing raw and parameterized types this way can cause a problem called heap pollution.

6.1 Heap Pollution: Through raw types and unchecked conversion, it is possible to put objects of the wrong type into a generic type, potentially causing a runtime exception. This is why the compiler raises a warning for unchecked operations. The following code shows an example.

Unchecked.java

package com.geekyarticles.generics;

import java.util.ArrayList;
import java.util.List;

public class Unchecked {

    /**
     * @param args
     */
    public static void main(String[] args) {
        List<String> parameterized=new ArrayList<String>();
        List raw=parameterized; //Allowed without a cast: parameterized to raw
        raw.add(Integer.valueOf(2));//Compiles with an unchecked warning, since raw is a raw type
        String myString=parameterized.get(0);//Oops! runtime ClassCastException,
                                            //and did not even explicitly cast anything
    }

}

7. Java 7 Enhancement: From Java 7 onward, the diamond operator can be used when instantiating objects of parameterized types, letting the compiler infer the type arguments. I already explained it here (see diamond operator).
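For completeness, here is a minimal sketch of the diamond operator, using a hypothetical class DiamondDemo. It needs Java 7 or later to compile.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class, not from the article.
class DiamondDemo {
    static Map<String, List<Integer>> build() {
        // Before Java 7 the right-hand side had to repeat the type arguments:
        // new HashMap<String, List<Integer>>()
        Map<String, List<Integer>> index = new HashMap<>();
        List<Integer> positions = new ArrayList<>();
        positions.add(1);
        index.put("Hello", positions);
        return index;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints {Hello=[1]}
    }
}
```

The empty angle brackets must not be left out: new HashMap() without them creates a raw type and brings back the unchecked warnings discussed above.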

Saturday, July 30, 2011
