Java Sleight of Hand

Posted by Paulo Moreira on Oct 21, 2014. Estimated reading time: 9 minutes

Every now and then we all come across some code whose behaviour is unexpected. The Java language contains plenty of peculiarities, and even experienced developers can be caught by surprise.

Let’s be honest, we’ve all had a junior colleague come to us and ask “what is the result of executing this code?”, catching us unprepared. Now, instead of using the usual “I could tell you but I think it will be far more educational if you find it by yourself”, we can distract his attention for a moment (hmmm.... I think I just saw Angelina Jolie hiding behind our build server. Can you quickly go and check?) while we rapidly browse through this article.

“Java Sleight of Hand” will present some of these peculiarities, aiming to help developers become better prepared when dealing with portions of code that produce unexpected outcomes.

Each trick exposes some code that appears to be simple, but whose behaviour at compile and/or runtime is not straightforward. Each will shed some light on the rationale behind the whys and the hows. The level of complexity will range from very simple remarks to serious brain teasers.

Mad identifiers

We are familiar with the rules that define a legal Java identifier:

  • An identifier is a set of one or more characters consisting of letters, digits, currency characters, or underscores (_).
  • An identifier must begin with a letter, a currency character, or an underscore.
  • A Java keyword cannot be used as identifier.
  • There is no limit to the number of characters that can be used in an identifier.
  • Unicode characters from \u00c0 to \ud7a3 can also be used.
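The character rules above can be checked programmatically. The following is a minimal sketch that relies on the standard Character.isJavaIdentifierStart and Character.isJavaIdentifierPart methods; the class and the helper name hasLegalIdentifierCharacters are hypothetical, and the helper deliberately ignores the keyword rule.

```java
class IdentifierCheck {
    // Hypothetical helper, not part of the JDK: applies the character rules only.
    // It does not reject keywords such as "class" or "int".
    static boolean hasLegalIdentifierCharacters(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        if (!Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // "\u20ac" is the euro sign and "\u00a3" is the pound sign
        System.out.println(hasLegalIdentifierCharacters("$\u20ac\u00a3")); // true
        System.out.println(hasLegalIdentifierCharacters("_name"));         // true
        System.out.println(hasLegalIdentifierCharacters("1abc"));          // false
        System.out.println(hasLegalIdentifierCharacters("a-b"));           // false
    }
}
```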

The rules are quite simple, but there are tricky cases which might raise some eyebrows. For example, nothing prohibits the developer from using a class name as an identifier:

//Class names can be used as identifiers 
String String = "String"; 
Object Object = null; 
Integer Integer = new Integer(1); 
//What about making the code unreadable? 
Float Double = 1.0f; 
Double Float = 2.0d; 
if (String instanceof String) {
      if (Float instanceof Double) {
          if (Double instanceof Float) {
                System.out.print("Can anyone read this code???");
            }
      }
 }

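All the following are legal identifiers as well (note that a lone underscore generates a compiler warning in Java 8 and is no longer accepted as an identifier from Java 9 onwards):

int $ = 1;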
int € = 2;
int £ = 3;
int _ = 4;
long $€£ = 5;
long €_£_$ = 6;
long $€£$€£$€£$€£$€£$€£$€_________$€£$€£$€£$€£$€£$€£$€£$€£$€£_____ = 7;

Additionally, keep in mind that the same name can be used both for a variable and a label. The compiler knows which one you refer to by analysing the context.

int £ = 1;
£: for (int € = 0; € < £; €++) {
     if (€ == £) {
         break £;
     }
}

And of course, remember that the rules for identifiers apply to names of variables, methods, labels and classes as well:

class $ {} 
interface _ {} 
class € extends $ implements _ {}

So you have just learned a great way to create code that is hardly readable by anyone, including yourself!

Where is that NullPointerException coming from?

Autoboxing was introduced in Java 5 and made our lives easier when having to jump back and forth between primitive types and wrapper types:

int primitiveA = 1;
Integer wrapperA = primitiveA;
wrapperA++;
primitiveA = wrapperA;

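The runtime did not change in order to accommodate this feature; most of the work is done at compilation time. The compiler looks at the previous piece of code and generates something like the following (Oracle's compiler actually boxes through Integer.valueOf rather than new Integer, but the effect on this example is the same):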

int primitiveA = 1;
Integer wrapperA = new Integer(primitiveA);
int tmpPrimitiveA = wrapperA.intValue();
tmpPrimitiveA++;
wrapperA = new Integer(tmpPrimitiveA);
primitiveA = wrapperA.intValue(); 

The autoboxing obviously applies to method invocation as well:

public static int calculate(int a) {
     int result = a + 3;
     return result;
}
public static void main(String args[]) {
     int i1 = 1;
     Integer i2 = new Integer(1);
     System.out.println(calculate(i1));
     System.out.println(calculate(i2));
}

This is great: we can pass Number wrappers to all our methods that take primitive types as parameters, and leave it up to the compiler to perform the translation:

public static void main(String args[]) {
     int i1 = 1;
     Integer i2 = new Integer(1);
     System.out.println(calculate(i1));
     int i2Tmp = i2.intValue();
     System.out.println(calculate(i2Tmp));
} 

Let's now try that code with a slight variation:

public static void main(String args[]) {
     int i1 = 1;
     Integer i2 = new Integer(1);
     Integer i3 = null;
     System.out.println(calculate(i1));
     System.out.println(calculate(i2));
     System.out.println(calculate(i3));
}

As seen before, this code translates into:

public static void main(String args[]) {
     int i1 = 1;
     Integer i2 = new Integer(1);
     Integer i3 = null;
     System.out.println(calculate(i1));
     int i2Tmp = i2.intValue();
     System.out.println(calculate(i2Tmp));
     int i3Tmp = i3.intValue(); // i3 is null, so this call throws
     System.out.println(calculate(i3Tmp));
}

And of course, this code leads us to our old friend NullPointerException. The same would happen with something even simpler:

public static void main(String args[]) {
     Integer iW = null;
     int iP = iW;
}

Therefore be very careful with the use of autoboxing: it can lead to NullPointerExceptions in code where it would have been impossible to get such exceptions before this feature was introduced. To make matters worse, it is not always easy to identify these code patterns. If you have to convert a wrapper into a primitive, and you are not sure whether the wrapper might be null, protect your code!
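One possible way to protect such a conversion is sketched below. The class SafeUnboxing and the helper unboxOrDefault are hypothetical names introduced for illustration, and falling back to a default value is only one option; throwing a more descriptive exception may suit other cases better.

```java
class SafeUnboxing {
    // Hypothetical helper: returns the wrapped value, or the fallback when the wrapper is null.
    static int unboxOrDefault(Integer wrapper, int fallback) {
        return wrapper != null ? wrapper.intValue() : fallback;
    }

    // Same method as in the examples above
    static int calculate(int a) {
        return a + 3;
    }

    public static void main(String[] args) {
        Integer i3 = null;
        // No implicit unboxing of null takes place here, so no NullPointerException
        System.out.println(calculate(unboxOrDefault(i3, 0))); // prints 3
    }
}
```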

My Wrapper Types have an identity crisis...

Let’s continue with autoboxing and look at the following code:

Short s1 = 1;
Short s2 = s1;
System.out.println(s1 == s2);

It naturally prints true. Let's now make it a bit more interesting:

Short s1 = 1;
Short s2 = s1;
s1++;
System.out.println(s1 == s2);

The output becomes false. Wait a moment, shouldn't s1 and s2 reference the same object? Crazy JVM!
Let's apply the code translation mechanism seen in the previous trick.

Short s1 = new Short((short)1);
Short s2 = s1;
short tempS1 = s1.shortValue();
tempS1++;
s1 = new Short(tempS1);
System.out.println(s1 == s2);

Hmmm... makes more sense now, doesn't it? Always be very careful with autoboxing!
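When the intent is to compare the values held by two wrappers rather than their identity, equals is the safer choice. The sketch below illustrates this under the assumption of a hypothetical class WrapperIdentity with a hypothetical helper sameValue.

```java
class WrapperIdentity {
    // Hypothetical helper: compares the wrapped values, not the references.
    static boolean sameValue(Short a, Short b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        Short s1 = 1;
        Short s2 = s1;
        s1++; // s1 now refers to a Short holding 2, s2 is untouched
        System.out.println(s2);                // prints 1
        System.out.println(sameValue(s1, s2)); // prints false, the values differ
        s2++;
        System.out.println(sameValue(s1, s2)); // prints true, both hold 2
    }
}
```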

Look mom, no exception!

This one is very simple, but it catches some experienced Java developers by surprise. Let’s look at the following code:

NullTest myNullTest = null;
System.out.println(myNullTest.getInt());

When faced with this code, most people would assume that it would cause a NullPointerException. Is that so? Let’s look at the rest of the code:

class NullTest {
     public static int getInt() {
         return 1;
     }
}

Always remember that the use of class variables and methods depends only on the reference type. So even if your reference is null, you can still invoke them. In terms of good practice it is advisable to use NullTest.getInt() instead of myNullTest.getInt(), but you never know when you'll bump into such code.

Varargs and Arrays, Mutatis Mutandis

The variable arguments feature introduced a powerful concept which helps developers simplify their code. Nevertheless, behind the scenes the varargs are nothing more and nothing less than an array.

public void calc(int... myInts) {} 
calc(1, 2, 3);

The previous code gets translated by the compiler into something like:

int[] ints = {1, 2, 3};
calc(ints);

Empty invocations correspond to passing an empty array as parameter.

calc();
is equivalent to
int[] ints = new int[0];
calc(ints);

And of course, the following will cause a compilation error, as the two declarations are equivalent:

public void m1(int[] myInts) { ...    } 
public void m1(int... myInts) { ...    }
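The equivalence between varargs and arrays can be observed directly. The following sketch assumes a hypothetical class VarargsDemo with a hypothetical method count that simply reports the length of the array it receives.

```java
class VarargsDemo {
    // Hypothetical method: the varargs parameter is seen as a plain int[] inside the body.
    static int count(int... values) {
        return values.length;
    }

    public static void main(String[] args) {
        System.out.println(count());                     // prints 0, an empty array is passed
        System.out.println(count(1, 2, 3));              // prints 3
        System.out.println(count(new int[] {1, 2, 3}));  // prints 3, an array is accepted as is
    }
}
```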

Mutable constants

Most developers assume that whenever the final keyword is applied to a variable, it indicates a constant, that is, a variable which is assigned an immutable value. That is not entirely correct: the final keyword, when applied to a variable, only indicates that the variable can be assigned a value once.

class MyClass {
     private final int myVar;
     private int myOtherVar = getMyVar();
     public MyClass() {
         myVar = 10;
     }
     public int getMyVar() {
         return myVar;
     }
     public int getMyOtherVar() {
         return myOtherVar;
     }
     public static void main(String args[]) {
         MyClass mc = new MyClass();
         System.out.println(mc.getMyVar());
         System.out.println(mc.getMyOtherVar());
     }
}

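The previous code will print the sequence 10 0. The field initializer of myOtherVar runs before the body of the constructor, so getMyVar() is called while myVar still holds its default value of 0; only afterwards is myVar assigned 10. Therefore, while dealing with final variables, we must distinguish the ones which have a default value assigned at compile time, which work as constants, from the ones which get their values initialized at runtime.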
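A related consequence of "assigned once" is that final constrains the variable, not the object it refers to. The sketch below, built around a hypothetical class FinalReferenceDemo, shows a final field whose contents still change.

```java
import java.util.ArrayList;
import java.util.List;

class FinalReferenceDemo {
    // The reference can never be reassigned, but the list itself remains mutable.
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        names.add(name); // legal: this mutates the object, it does not reassign the field
    }

    int size() {
        return names.size();
    }

    public static void main(String[] args) {
        FinalReferenceDemo demo = new FinalReferenceDemo();
        System.out.println(demo.size()); // prints 0
        demo.add("first");
        System.out.println(demo.size()); // prints 1
    }
}
```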

Overriding flavors

Keep in mind that since Java 5 an overriding method may have a different return type than the overridden method. The only rule is that the return type of the overriding method must be a subtype of the return type of the overridden method. Therefore the following code became valid with Java 5:

class A {
     public A m() {
         return new A();
     }
}

class B extends A {
     public B m() {
         return new B();
     }
}
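The practical benefit is that callers holding the subtype no longer need a cast. The sketch below uses hypothetical classes Shape and Circle (rather than A and B) to make the effect visible.

```java
class Shape {
    Shape copy() {
        return new Shape();
    }

    String name() {
        return "shape";
    }
}

class Circle extends Shape {
    @Override
    Circle copy() { // covariant return type: Circle is a subtype of Shape
        return new Circle();
    }

    @Override
    String name() {
        return "circle";
    }

    double radius() {
        return 1.0;
    }
}

class CovariantDemo {
    public static void main(String[] args) {
        Circle c = new Circle().copy();      // no cast needed
        System.out.println(c.radius());      // prints 1.0

        Shape s = new Circle();
        System.out.println(s.copy().name()); // prints circle, the override is still used
    }
}
```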

Overload that operator!

Java is not particularly strong concerning operator overloading, but it does support it for the operator '+'. We can use it either for mathematical addition or for string concatenation, depending on the context.

int val = 1 + 2;
String txt = "1" + "2";

It gets trickier whenever a numerical value is mixed with a string. But the rule is simple: the expression is evaluated from left to right, and mathematical addition is performed until a string is encountered as an operand. As soon as a string is found, both operands are converted into strings (if necessary) and string concatenation is performed. The following examples illustrate the different combinations.

System.out.println(1 + 2); //Performs addition and prints 3 

System.out.println("1" + "2"); //Performs concatenation and prints 12
System.out.println(1 + 2 + 3 + "4" + 5); //Performs addition until "4" is found and then concatenation, prints 645

System.out.println("1" + "2" + "3" + 4 + 5); //Performs concatenation and prints 12345
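Parentheses change the order of evaluation and therefore the result. The following sketch, using a hypothetical class PlusDemo, contrasts the left-to-right default with an explicitly grouped addition.

```java
class PlusDemo {
    static String additionFirst() {
        return 1 + 2 + "3";   // 1 + 2 is added first, then "3" is concatenated
    }

    static String stringFirst() {
        return "1" + 2 + 3;   // concatenation from the start
    }

    static String grouped() {
        return "1" + (2 + 3); // the parentheses force the addition to happen first
    }

    public static void main(String[] args) {
        System.out.println(additionFirst()); // prints 33
        System.out.println(stringFirst());   // prints 123
        System.out.println(grouped());       // prints 15
    }
}
```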

Tricky Date Format

This trick is related to the way the DateFormat implementations work, how their usage can be misleading, and how issues can sometimes go undiscovered until the code hits production.

The DateFormat parse method parses a String and produces a date. The parsing is made according to the defined date format mask. According to the JavaDoc, the method throws a ParseException whenever “the beginning of the specified string cannot be parsed”. This definition is very vague and allows for various interpretations. Most developers assume that if the string parameter does not match the defined format then a ParseException is thrown. That is not always the case.

One ought to be very careful with SimpleDateFormat. When faced with the code below, most developers would assume that a ParseException would be thrown.

String date = "16-07-2009";

SimpleDateFormat sdf = new SimpleDateFormat("ddmmyyyy");
try {
     Date d = sdf.parse(date);
     System.out.println(DateFormat.getDateInstance(DateFormat.MEDIUM,
                     new Locale("US")).format(d));
} catch (ParseException pe) {
     System.out.println("Exception: " + pe.getMessage());
}

When run, the code produces the following output: "Jan 16, 0007". Surprisingly enough, there is no complaint about the string not matching the expected format: the implementation simply goes ahead and tries its best to parse the text. Note that there are two hidden tricks. First, the mask for month is MM, while mm is used for minutes, which explains why the month is set to January. Second, the parse method of the DecimalFormat class, which SimpleDateFormat relies on for numeric fields, parses the text until an unparsable character is found and returns the number processed up to that point. Therefore “7-20” translates into year 7. This discrepancy would be easily identifiable, but it gets trickier if "yyyymmdd" is used, as the output will be “Jan 7, 0016”. “16-0” is parsed up to the first unparsable character, translating into 16 as the year. Then “-0” does not impact the result, as it is understood as 0 minutes. Finally, “7-” maps to day 7.
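For comparison, the java.time API introduced in Java 8 is stricter by default. This is an alternative to the DateFormat classes discussed above, not a description of how they behave. The sketch below assumes a hypothetical class StrictDateParsing with a hypothetical helper parseOrNull that turns a parsing failure into null.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class StrictDateParsing {
    // Hypothetical helper: returns the parsed date, or null when the text does not match the pattern.
    static LocalDate parseOrNull(String text, String pattern) {
        try {
            return LocalDate.parse(text, DateTimeFormatter.ofPattern(pattern));
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The pattern has no separators, so the dashes in the text are rejected
        System.out.println(parseOrNull("16-07-2009", "ddMMyyyy"));   // prints null
        // A pattern that matches the text parses as expected
        System.out.println(parseOrNull("16-07-2009", "dd-MM-yyyy")); // prints 2009-07-16
    }
}
```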

About the Author

Paulo Moreira is a Portuguese freelance Software Engineer, currently working in the financial sector in Luxembourg. Graduated from University of Minho with a Master’s Degree in Computer Science and Systems Engineering, he has been working with Java on the server side since 2001 in the telecom, retail, software and financial markets.


Usage of underscore is discouraged by Tomasz Kowalczewski

It might be good to mention that usage of "_" as of Java 8 is discouraged and generates compiler warning. AFAIK it will be removed as a valid identifier in Java 9.

Re: Usage of underscore is discouraged by Victor Grazi

Hard to believe Java would forgo backward compatibility! They almost never remove any deprecations!

autoboxing by 臧 秀涛

autoboxing is through 'Integer wrapperA = Integer.valueOf(primitiveA);', not 'Integer wrapperA = new Integer(primitiveA);'

Re: autoboxing by Paulo Moreira

The language specification does not impose how boxing is done (docs.oracle.com/javase/specs/jls/se8/html/jls-5...):
"If p is a value of type int, then boxing conversion converts p into a reference r of class and type Integer, such that r.intValue() == p".
The invariant to respect is that "r.intValue() == p" but the compiler is free to instantiate r in any way.

The unboxing is a different story, there the language spec is more restrictive (docs.oracle.com/javase/specs/jls/se8/html/jls-5...):
"If r is a reference of type Integer, then unboxing conversion converts r into r.intValue()"

Therefore I believe the code in the article respects the language specification and makes no assumption about the underlying compiler: the way the boxing is implemented (instantiating the wrapper directly, using valueOf, etc.) is up to the compiler’s implementation.
I presume that the comment is targeting Oracle's compiler. If that is the case, then I agree, Oracle’s compiler implements boxing by using valueOf.

Empty vararg invocation passes empty array, not null by Markus Krüger

calc() is equivalent to calc(new int[] {}), not calc(null).

As a demonstration, the following program will print "arg length = 0"; it will not throw a NullPointerException:


public class VarArg {

    public static void calc(int... args) {
        System.out.println("arg length = " + args.length);
    }

    public static void main(String[] args) {
        calc();
    }
}

Re: Empty vararg invocation passes empty array, not null by Victor Grazi

Corrected. Thanks for pointing this out

Re: Empty vararg invocation passes empty array, not null by Richard Richter

Corrected in code, not in text: Careful with empty invocations, they correspond to passing a null as parameter.

Re: Empty vararg invocation passes empty array, not null by Paulo Moreira

Hi Markus, correct, clear in 15.12.4.2 where argument evaluation is defined for variable arity methods:
docs.oracle.com/javase/specs/jls/se8/html/jls-1...
Thanks
