Saturday, January 12, 2013

Java to Scala: Motivation

If you are a Java programmer and haven't heard about Scala already you might wonder if it is worth the effort learning. Here I show two examples that motivated me to switch from Java to Scala.

This first one I stole from somewhere ― I just can't remember where from, sigh ― and modified it slightly. It is a simple implementation of a class that stores the contact information (name and phone number) of a person.

public class Contact implements Serializable {
    private final String name;
    private final String phone;

    public Contact(String name, String phone) {
        this.name = name;
        this.phone = phone;
    }

    public String getName() {
        return name;
    }

    public String getPhone() {
        return phone;
    }

    public Contact withName(String name) {
        return new Contact(name, phone);
    }

    public Contact withPhone(String phone) {
        return new Contact(name, phone);
    }

    public boolean equals(Object o) {
        if (this == o)  return true;
        if (o == null || getClass() != o.getClass()) return false;
        
        Contact p = (Contact) o;
        if (name != null ? !name.equals(p.name) : p.name != null) {
            return false;
        }
        if (phone != null ? !phone.equals(p.phone) : p.phone != null) {
            return false;
        }
        return true;
    }

    public int hashCode() {
        int result = name != null ? name.hashCode() : 0;
        result = 31 * result + (phone != null ? phone.hashCode() : 0);
        return result;
    }

    public String toString() {
        return "Contact(" + name + "," + phone + ")";
    }
} 

There are no surprises or difficulties in it. Just a lot of boilerplate code. Any Java programmer will have written hundreds of classes similar to this one - most of them a lot more complex. With the Contact class one can create new contacts from scratch or use one of the copy constructors as shown below:

Contact stefanOffice = new Contact("Stefan", "62606");
Contact johnOffice   = new Contact("John", stefan.getPhone());
Contact stefanHome   = stefanOffice.withPhone("401251497");

These 20 plus lines of the the Contact class in Java boil down to a single line in Scala with essentially identical functionality:

case class Contact(name: String, phone: String)


val stefanOffice = Contact("Stefan", "62606")
val johnOffice   = Contact("John", stefan.phone)
val stefanHome   = stefanOffice.copy(phone = "401251497")

A case class in Scala provides default implementations for setters, getters, equality tests and copy constructors. In many cases they are sufficient and if not they can easily be overridden.

While this example is a bit extreme with respect to the amount of code reduction (typically the factor is between two and five) it is not an unrealistic example. But I also wanted to demonstrate the benefits of Scala on a slightly more complex problem.

We are going to look at some code that generates an index of over keywords. For instance, let's say we have the keywords "Apple", "Ananas", "Mango", "Banana", "Blackberry" and we would like to have an alphabetically sorted index that lists the keywords (again alphabetically sorted) starting with the same letter:

A: Ananas, Apple
B: Banana, Blackberry
M: Mango

Not very challenging; should be easy. Well, here comes the Java implementation and as you can see it is far from elegant:

import java.util.*;

public class Index {    
  public static void main(String[] args) {
    List<String> keywords = 
       Arrays.asList("Apple", "Ananas", "Mango", "Banana", "Blackberry"); 
    Map<Character, List<String>> ch2words = new TreeMap<>(); 
    for(String keyword : keywords) {   
      char firstChar = keyword.charAt(0);     
      if(!ch2words.containsKey(firstChar))      
        ch2words.put(firstChar, new ArrayList<String>());       
      ch2words.get(firstChar).add(keyword); 
    } 
    
    for(List<String> list : ch2words.values()) 
      Collections.sort(list); 
      
    StringBuilder str = new StringBuilder();  
    for(Character firstChar : ch2words.keySet()) {
      str.append(firstChar+": ");
      Iterator<String> words = ch2words.get(firstChar).iterator(); 
      while(words.hasNext()) {
        str.append(words.next());  
        str.append(words.hasNext() ? ", " : "\n");  
      }
    }
    System.out.print(str);      
  }
}

We take the keywords and put them into a TreeMap to map the first letter to the list of words. I chose a TreeMap to have the map entries automatically sorted according to the first character. We then need to sort the keywords in each of the entries of the TreeMap and finally use a StringBuilder to write the formatted index to a String.

Now I am going to write some bad and truly disgusting Scala code that essentially is a one-to-one mapping from Java to Scala. While far from optimal it shows what already be gained without using any novel concepts such as functional programming or pattern matching.

import scala.collection.mutable.{Map,StringBuilder}

val keywords = List("Apple", "Ananas", "Mango", "Banana", "Blackberry")
val ch2words = Map[Char, List[String]]()
for(keyword <- keywords) {
  val firstChar = keyword(0)
  if(!ch2words.contains(firstChar))      
    ch2words(firstChar) = List()       
  ch2words(firstChar) = keyword::ch2words(firstChar)
}
val str = new StringBuilder()
for(ch <- ch2words.keys.toSeq.sorted) {
  str ++= ch2words(ch).sorted.mkString(ch+": ", ", ", "\n")      
}  
println(str)

Largely due to type inference in Scala the code becomes less cluttered and the logical flow is easier to recognize. Also the confusing distinction between primitive data types such as char and their corresponding objects (Character) is gone.

Now let's do it again and see how a solution that takes advantage of Scala features could look like:

val keywords = List("Apple", "Ananas", "Mango", "Banana", "Blackberry")
val ch2words = keywords.groupBy(_(0)).mapValues(_.sorted).toList.sortBy(_._1)
val lines = ch2words.map{case (ch,words) => ch+": "+words.mkString(", ")}
print(lines.mkString("\n"))

First of all it is much shorter. But in addition, it is also more readable provided an understanding of some basic concepts in Scala has been acquired. Let's have a closer lock at the code.

The first line is trivial. It creates a list of all the keywords and that's it. The second line is interesting. First the keywords get grouped by their first character (keywords.groupBy(_(0))). That produces a map of letters to keyword lists:

Map(M -> List(Mango), A -> List(Apple, Ananas), B -> List(Banana, Blackberry))

As you can see neither the letters nor the lists of keyword are sorted. We take the values of the map (those are the lists of keywords) and sort them. mapValues(_.sorted) does the job.

Map(M -> List(Mango), A -> List(Ananas, Apple), B -> List(Banana, Blackberry))

To sort the map according to letters we convert the map to a list, which returns a list of tuples, containing letters and their keywords, and sort the list according to the first tuple component ― the letter (toList.sortBy(_._1)).

List((A,List(Ananas, Apple)), (B,List(Banana, Blackberry)), (M,List(Mango)))

Now the data structure is sorted and we simply need to create a formatted string. Pattern matching is used to extract the letters and the corresponding keyword lists (case (ch,words)) and the lines of the index are constructed by mapping (ch,words) tuples to strings. Finally, all lines are glued together via mkString("\n") and printed out.

A: Ananas, Apple
B: Banana, Blackberry
M: Mango

To conclude, in almost all cases Scala allows one to write shorter code than in Java; and it is frequently even more readable. But it also makes it possible to write code that causes a PERL coder to blush. To quote Spider-man: "With great powers come great responsibilities!".

Wednesday, January 9, 2013

An example class in Scala

This is just a little example class to remind me of the syntax and some of the nifty features in Scala that I tend to forget. I essentially take the example of the Rational class from the excellent book "Programming in Scala" by Odersky et al. and modify it slightly.

Let's start very simple. We call the class "Frac" for fraction instead of Rational and its constructor takes the numerator and denominator of the fraction. The val specifier ensures that the class is immutable. We also override the toString method to get a nice print out.

class Frac(val num:Int, val den:Int) {
  override def toString = "%d/%d" format (num,den)
}

With this class at hand we can now create a fraction and print it and its components:

val a = new Frac(1,2)
println(a)
println(a.num)
1/2
1

Now let's add a companion object Frac that allows us to omit the new keyword when creating a fraction. We also want to ensure that the denominator is greater than zero and Scala provides the handy require function to test pre-conditions. Furthermore, We add a toDouble variable that contains the floating point value of the fraction. Note that we could also have said def toDouble = .. instead of val toDouble = ... . It is a trade off between speed and memory consumption. The nice thing about Scala is that the call interface remains the same and the implementation could be changed without any dramas.

class Frac(val num:Int, val den:Int) {
  require(den > 0)
  val toDouble = num/den.toDouble 
  override def toString = "%d/%d" format (num,den)
}

object Frac {
  def apply(num:Int, den:Int) = new Frac(num,den)
}  

println(Frac(1,2))
println(Frac(1,2).toDouble)
1/2
0.5

So far so good. But one problem with the current implementation is that comparisons such as Frac(1,2) == Frac(1,2) return false. We need to override the equals() method and consequently the method hashCode should also be overridden:

class Frac(val num:Int, val den:Int) {
  ...
  override def equals(other:Any) = other match {
    case that:Frac => (this eq that) || 
       (that.num == this.num && this.den == that.den)
    case _ => false
  }

  override def hashCode = 13*(num+13*den)
  ...
}

Apart from testing for equality it would be convenient if fractions could be ordered. The current class implementation does not allow to say Frac(1,3) < Frac(1,2), for instance. But implementing the compare method of the Ordered trait does the trick.

class Frac(val num:Int, val den:Int) extends Ordered[Frac] {
  ...
  def compare(that:Frac):Int = 
    (this.num*that.den) - (that.num*this.den)     
  ...
}

Operator overloading is another feature of Scala and useful when applied appropriately. For a fraction class it certainly makes sense. To keep things simple the following code is limited to the multiplication of fractions with fractions and fractions with scalars.

class Frac(val num:Int, val den:Int)  {
  ...
  def *(that:Frac):Frac = Frac(this.num*that.num, this.den*that.den)
  def *(c:Int):Frac = Frac(num*c, den)   
  ...
}

This enables us to compute Frac(1,3) * Frac(1,2) and Frac(1,3) * 2 but we still cannot calculate 2 * Frac(1,3) because there is no multiplication method for Int that takes a fraction. For that we need an implicit function, preferably defined within the companion object:

object Frac {
  def apply(num:Int, den:Int) = new Frac(num,den)
  implicit def int2Frac(num:Int):Frac = Frac(num,1)
}  

Depending on scope it might be necessary to import the implicit function of the companion object but then things work as expected:

import Frac.int2Frac
println(Frac(1,3) * Frac(1,2))    // prints 1/6
println(Frac(1,3) * 2)            // prints 2/3
println(2 * Frac(1,3))            // prints 2/3

When multiplying a fraction with an integer scalar we would like to get a fraction back. However, when multiplying with a floating point number we need a floating point number as a result. The implementation is similar; just the direction is different.

class Frac(val num:Int, val den:Int) {
  ...
  val toDouble = num/den.toDouble
  def *(c:Int):Frac = Frac(num*c, den)   // Frac * Int => Frac
  def *(c:Double):Double = toDouble*c    // Frac * Double => Double
  ...
}

object Frac {
  def apply(num:Int, den:Int) = new Frac(num,den)
  implicit def int2Frac(num:Int):Frac = Frac(num,1)
  implicit def frac2double(frac:Frac):Double = frac.toDouble
}  
import Frac._
println(Frac(1,2) * 2.5)   // prints 1.25
println(2.5 * Frac(1,2))   // prints 1.25

Alright, that it's. Of course, there is some functionality missing such as addition, subtraction and division operations. Furthermore, fractions should be normalized to the greatest common divisor as shown in Odersky's code. But I wanted to keep this example simple. To conclude let us put together what have now and see what we can do:

class Frac(val num:Int, val den:Int) extends Ordered[Frac] {
  require(den > 0)
  val toDouble = num/den.toDouble
  def *(that:Frac):Frac = Frac(this.num*that.num, this.den*that.den)
  def *(c:Int):Frac = Frac(num*c, den)   
  def *(c:Double):Double = toDouble*c   
  def compare(that:Frac):Int = 
    (this.num*that.den) - (that.num*this.den)     
  override def equals(other:Any) = other match {
    case that:Frac => (this eq that) || 
       (that.num == this.num && this.den == that.den)
    case _ => false
  }
  override def hashCode = 13*(num+13*den)
  override def toString = "%d/%d" format (num,den)
}

object Frac {
  def apply(num:Int, den:Int) = new Frac(num,den)
  implicit def int2Frac(num:Int):Frac = Frac(num,1)
  implicit def frac2double(frac:Frac):Double = frac.toDouble
}  
import Frac._
println(Frac(1,3) == Frac(1,2))
println(Frac(1,3) < Frac(1,2))
println(Frac(1,3) * Frac(1,2))
println(Frac(1,2) * 2.5)
println(2.5 * Frac(1,2))
println(Frac(1,2) * 2)
println(2 * Frac(1,2))