Saturday, January 17, 2015

Pitfalls for Scala beginners --- Part 1

Type erasure. Consider the following code:
class Box[E](e: E) {
  def unbox: E = e
  def rebox(e: E): Box[E] = new Box(e)
}
val box1 = new Box(1)
val boxes: Seq[Box[_]] = Seq(box1) // the type info of Int is lost due to "_"
val box2 = boxes.head 
box1.rebox(box1.unbox)  // ok
box2.rebox(box2.unbox)  // compile-time error!
The last line cannot compile because the Scala compiler cannot tell whether the return type of box2.unbox matches the type parameter of box. One way to solve this problem is to introduce a fresh type parameter for the rebox method, so that the argument type of rebox and the type parameter of Box are allowed to be different:
class Box[E](e: E) {
  def unbox: E = e
  def rebox[F](e: F): Box[F] = new Box(e)  // use a new type parameter for rebox
}
Capital identifiers. Scala's pattern matching mechanism automatically presumes within patterns that all identifiers with an initial capital letter are constants. Hence, val (a,b) = (1,2) is fine but val (A,b) = (1,2) cannot compile due to undefined A. On the other hand, if you set, say, A = 2 before the matching, then it will compile but instead rise a MatchError exception at runtime.

Iterated binding. According to section 4.1 of The Scala Language Specification, a value definition val p1,...,pn = e is a shorthand for the sequence of value definitions val p1 = e; ...; val pn = e. Hence, the following code
val next = { var n = 0; () => { n = n + 1; n } }
val p1, p2, p3, p4, p5, p6, p7, p8 = next()
binds $p_i$ to $i$ for each $i$.

Overridden val. The Scala compiler binds a val variable only once. Hence, if a val variable is overridden in a subclass, it will be initialized in that subclass and appear as its default value before that time, even though it is also initialized in the superclass. For example,
trait A { 
  val foo = 10 
  println(s"foo in A: $foo") 
}
class B extends A {
  override val foo = 11
  println(s"foo in B: $foo") 
}
new B // prints "foo in A: 0", "foo in B: 11"
The point is that access to foo is carried out through an accessor foo() overridden in B. Hence, even though there is a field foo in the superclass, it is effectively hidden from the constructors, and B.foo returns the value set in B after the parent constructors are called. Note that the same holds for Java: you access fields of the subclass through the overridden method, which gives the default value when the method is called in the constructor of the superclass.

Binding vs assignment. In Scala, val denotes binding and var denotes assignment. When one uses var to declare a variable, the closure is made over its reference instead of its value. Hence, the value of the variable is still mutable after the variable is enclosed. Consider the follows code:
val fs = Buffer[() => Unit]()
{// closure I
  val a = 1
  var b = 2 
  fs += (() => println(a + " " + b)) // closure II
  b = 3  
}
val a = 4
fs foreach { f => f() } // print 1 3
Inside closure II, a binds to a value (ie. 1) while b binds to a reference. As the value of b changes, the new value is visible to all closures that can access b. Hence, one has to use val to capture values in a closure.

Right-associative operators. All operators, including those custom ones, that ends in a ":" would be right-associative in Scala. For example, the List concatenation a ::: b ::: c is translated to method calls c.:::(b).:::(a).

Type widening. Suppose you have an implicit conversion from String to Option:
implicit def StrToOpt(s: String) = Option(s)
val opt1: Option[String] = "a".get()         // Ok due to conversion
val opt2: Option[String] = "a".getOrElse("") // Boo! type-mismatch error!
The problem is a result of incomplete tree traversal. The signature
 def getOrElse[B >: A](default: => B): B
allows type widening. Thus, when the compiler realizes that the return type of getOrElse is not as expected, it tries to generalize the type. In this case, it tries to identify a supertype of String, which is Serializable, and then gets stuck there. A solution to this is to use
val opt2: Option[String] = "a".getOrElse[String]("")
Now the return type is forced to be a String, so the compiler finds the implicit without wandering around the type hierarchy. The lesson: in the presence of type widening/narrowing, the compiler can make the type too general before checking for implicits. Putting explicit type annotation help avoid this problem.

Iterators vs Traversables. Iterators in Scala provide analogues of most of the methods that you find in the Traversable classes. For instance, they all provide a foreach method which executes a given procedure on each element. The biggest differences between iterators and traversables is that iterators has states. An iterator maintain a pointer that is advanced automatically and cannot be rewinded. Hence, you cannot reuse an iterator after invoking foreach on it. See this document for details.

Note: Do not confuse Iterator classes with Iterable classes. See this document for all full hierarchy of Scala's collections.

No comments:

Post a Comment