Thursday, January 14, 2016

Functional Programming in Scala

Functional programming (FP) is based on a simple premise that we only use pure functions with no side effects in programs which have far-reaching implications. A pure function is one that lacks side effects. A side effect is something which function does other than returning the result such as Modifying a variable or field in object, or writing to an external file.

A function f with input type A and output type B (written in Scala as a single type: A => B, pronounced “A to B” or “A arrow B”) is a computation that relates every value a of type A to exactly one value b of type B such that b is determined solely by the value of a. Any changing state of an internal or external process is irrelevant to computing the result f(a). Hence when a function has no observable effect on the execution of the program other than to compute a result given its inputs, then we say that it has no side effects. For example, a function intToString having type Int => String will take every integer to a corresponding string. A pure function is modular and composable as it separates the logic of the computation itself from “what to do with the result” and “how to obtain the input”. Such computation logic is reusable with no side effects.

Referential transparency and purity: An expression e is referentially transparent (RT) if, for all programs p, all occurrences of e in p can be replaced by the result of evaluating e without affecting the meaning of p. A function f is pure if the expression f(x) is referentially transparent for all referentially transparent x. In other words, an expression to be referentially transparent—in any program, when the expression can be replaced by its result without changing the meaning of the program. Referential transparency forces the invariant that everything a function does is represented by the value that it returns, according to the result type of the function. When expressions are referentially transparent the computation proceeds using substitution model were at each step we replace a term with an equivalent one.

Scala Basics
  • The class keyword introduces a class and contains the body within the curly braces. 
  • A method of a class is introduced by the def keyword which is followed by the name of the method, followed by the parameter list in parentheses and return type. The body of the method itself comes after a single equals sign. 
    def functionName ([list of parameters]) : [return type] = { }
  • Methods are implicitly declared abstract if the equals sign and method body is missing from the method declaration.
  • Scala uses the syntax keyword var to declare a variable, while uses the keyword val to declare a constant. The value of the constant declared using val cannot be changed hence called immutable variable.
  • The type of a variable is specified after the variable name with colon in between, and before equals sign (e.g. var sum:Int = 0). Variable may or may not have initial value during declaration.
  • Scala compiler can determine the type of the variable based on the initial value assigned to the variable, which is called as variable type inference.
  • In Scala statements are separated by newlines or by semicolons. Newlines delimit statements in a block.
  • The last statement within the block is automatically returned, without specifying the "return" keyword. The value returned from a method is simply whatever value results from evaluating the right-hand side. Scala compiler can infer the return types of methods based on the last statement but is considered bad style.
  • A pair of values can be returned by the method indicated with type enclosed within braces. A pair can be created by putting the values in parentheses separated by a comma.
    def buyCoffee(cc: CreditCard): (Coffee, Charge) = { .. }
  • A function, which does not return anything (called procedures), can return Unit which is equivalent to void in Java and indicates that function does not return anything.
  • Private methods cannot be called from the code outside the owning object.
  • A multi-line string literal is a sequence of characters enclosed in triple quotes """ ... """. The sequence of characters is arbitrary, except that it may contain three or more consuctive quote characters only at the very end.
  • Scala allows the last parameter to a function to be repeated indicated by '*' following the type, e.g. "String*" which actually is Array[String].
       def main(args: Array[String]) {
            printStrings("Hello", "Scala", "Python");
       }
    
       def printStrings( args:String* ) = {
          var i : Int = 0; 
          for( arg <- args ) {
             println("Arg value[" + i + "] = " + arg );
             i = i + 1;
          }
       }
    
  • Function parameters can have default values. The argument for such a parameter can optionally be omitted from a function call, in which case the corresponding argument will be filled in with the default.
       def main(args: Array[String]) {
            println( "Returned Value : " + addInt() );
       }
    
       def addInt( a:Int=5, b:Int=7 ) : Int = {
          var sum:Int = 0
          sum = a + b
    
          return sum
       }
  • In a normal function call, the arguments in the call are matched one by one in the order of the parameters of the called function. Named arguments allows to pass arguments to a function in a different order were each argument is preceded by a parameter name and an equals sign as below.
       def main(args: Array[String]) {
            printInt(b=5, a=7);
       }
    
       def printInt( a:Int, b:Int ) = {
          println("Value of a : " + a );
          println("Value of b : " + b );
       }
    
  • A class has a primary constructor whose argument list comes after the class name. The parameters in this list become public, unmodifiable (immutable) fields of the class and can be accessed using the usual object-oriented dot notation.
  • A class can be instantiated without the keyword new, by just using the class name followed by the list of arguments for its primary constructor.
  • A List[...] is an immutable singly linked list of values. The List.fill(n)(x) creates a List with n copies of x.
  • The unzip method splits a list of pairs into a pair of lists. E.g. List[(Coffee, Charge)] is split by destructuring the pair to declare two values (coffees and charges) on one line.
  • The reduce method reduces the entire list of values into a single value, using the combine method of the value class to combine values two at a time.
        e.g. val (coffees, charges) = purchases.unzip(coffees, charges.reduce((c1,c2) => c1.combine(c2)))
  • The object keyword creates a new singleton type, which is like a class that only has a single named instance similar to anonymous class in java. An object is scala's equivalent to java's static. E.g. object MyModule { ... }
  • Scala looks at the main method with a specific signature, which takes an Array of Strings as its argument, and its return type is Unit, inorder to execute the program. Unit serves a similar purpose to void in java. The literal syntax for unit is (), a pair of empty parentheses. In Scala, every method has to return some value as long as it doesn’t crash or hang.
  • Scala allows to define functions inside a function which are called local functions and are only visible inside the enclosing method.
  • Anonymous functions in scala also called function literals, can have multiple parameters or no parameter. 
        var mul = (x: Int, y: Int) => x*y
        println(mul(3, 4))
    
        var userDir = () => { System.getProperty("user.dir") }
        println( userDir )
    
  • Scala allows to apply functions partially to avoid passing redundant values when a method is invoked multiple times with the same value for a parameter. The constant parameter value can be eliminated by partially applying the argument to the method by binding a value to the constant parameter and leave the other parameters unbound by putting an underscore at their place. The resulting partially applied function is stored in a variable. 
       def main(args: Array[String]) {
         val date = new Date
         val logWithDateBound = log(date, _ : String)
         logWithDateBound("message1" )
       }
    
       def log(date: Date, message: String)  = {
         println(date + "----" + message)
       }
    
  • Currying transforms a function that takes multiple parameters into a chain of functions, each taking a single parameter. Curried functions are defined with multiple parameter lists. 
       def strcat(s1: String)(s2: String) = s1 + s2
       // Alternate Syntax
       def strcat(s1: String) = (s2: String) => s1 + s2
       strcat("foo")("bar")
    
  • In scala, parameters to the functions are usually passed by value but it also has call-by-name parameters, by putting the => symbol between the variable name and the type, which passes an expression to be evaluated within the called function. A call-by-name mechanism passes a code block to the callee and each time the callee accesses the parameter, the code block is executed and the value is calculated.
  • Every Java class is available in Scala. Scala does not have its own String class and uses java.lang.String. Since String class is immutable, StringBuilder should be used while making frequent String modifications.
Every value in Scala is what’s called an object, and each object may have zero or more members. A member can be a method declared with the def keyword, or it can be another object declared with val or object.
Any method name can be used infix, omitting the dot and parentheses when calling it with a single argument. E.g. instead of MyModule.abs(42) we can say MyModule abs 42 and get the same result.
All of an object’s (nonprivate) members can be brought into scope by using the underscore syntax. E.g. import MyModule._
In scala, functions are values and can be assigned to variables, stored in data structures, and passed as arguments to functions. A function that accepts other functions as arguments. This is called a higher-order function (HOF).
In Scala, functions that are local to the body of another function are called an inner function, or local definition. They are used to write loops functionally, without mutating a loop variable, by using a recursive function. Scala detects self-recursion and compiles it to the same sort of bytecode as would be emitted for a while loop,as long as the recursive call is in tail position. A call is said to be in tail position if the caller does nothing other than return the value of the recursive call. If all recursive calls made by a function are in tail position, Scala automatically compiles the recursion to iterative loops that don’t cona function literalsume call stack frames for each iteration. we can tell the Scala compiler about tail call elimination using the tailrec annotation.
def factorial(n: Int): Int = {
   def go(n: Int, acc: Int): Int =
      if (n <= 0) acc
      else go(n-1, n*acc)
   go(n, 1)
}

Example of passing function as the mthod parameter:
def formatResult(name: String, n: Int, f: Int => Int) = { ... }

Polymorphic Function:
A comma-separated list of type parameters, surrounded by square brackets ([A]), following the name of the function is used to define Polymorphic function for various types. The type parameter list introduces type variables that can be referenced in the rest of the type signature.
def findFirst[A](as: Array[A], p: A => Boolean): Int = {
  @annotation.tailrec
  def loop(n: Int): Int =
     if (n >= as.length) -1
     else if (p(as(n))) n
     else loop(n + 1)
        loop(0)
     }
The expression Array(7, 9, 13) is an array literal and it constructs a new array with three integers in it.
A function literal or anonymous function has the arguments to the function declared to the left of the => arrow, while to the right of the arrow is the body of the function were the parameters can be used. Example syntax: (x: Int) => x == 9 and (x: Int, y: Int) => x == y
A function literal under th hood is an object with a method called apply. Scala allows the objects that have a method with special name "apply" can be called as if they were themselves methods. Hence (a, b) => a < b syntactically looks as below, were calling lessThan(10, 20) actually calls the apply method:
val lessThan = new Function2[Int, Int, Boolean] {
   def apply(a: Int, b: Int) = a < b
}

Scala has Function1, Function2, Function3 and other interfaces known as traits provided by the standard Scala library which takes number of arguments indicated by the name. 
Scala’s standard library provides compose as a method on Function1, were two functions f and g can be composed by calling "f compose g". Also f andThen g is the same as g compose f.

A functional data structure is (not surprisingly) operated on using only pure functions. functional data structures are by definition immutable.

Trait
A trait is an abstract interface that may optionally contain implementations of some methods. When sealed is added in front trait it means that all implementations of the trait must be declared in this file. The implementations of the trait is denoted by 'case' keyword. By specifying the type parameter [+A] after sealed trait within the method signature and using inside the constructor, makes the data type to be polymorphic in the type of elements it contains. Hence it can be used with Int, Double, String elements etc. The + indicates that the type parameter A is covariant, hence for all types X and Y, if X is a subtype of Y, then List[X] is a subtype of List[Y]. Nothing is a subtype of all types, which means that in conjunction with the variance annotation, Nil can be considered a List[Int], a List[Double] etc.

sealed trait List[+A]
case object Nil extends List[Nothing]
case class Cons[+A](head: A, tail: List[A]) extends List[A]

Pattern Matching
Pattern matching works similar to switch statement which may descend into the structure of the expression it examines and extract subexpressions of that structure. It’s introduced with an expression (the target or scrutinee) like ints as below, followed by the keyword match, and a {}-wrapped sequence of cases. Each case in the match consists of a pattern (like Cons(x,xs)) to the left of the => and a result (like x + sum(xs)) to the right of the =>. If the target matches the pattern in a case, the result of that case becomes the result of the entire match expression. If multiple patterns match the target, Scala chooses the first matching case. The variable pattern '_' is generally used to match any expression, e.g. List(1,2,3) match { case Cons(h,_) => h } results in 1 as List(1,2,3) is equals to Cons(1, Cons(2, Cons(3, Nil))). A pattern matches the target if there exists an assignment of variables in the pattern to subexpressions of the target that make it structurally equivalent to the target. The resulting expression for a matching case will then have access to these variable assignments in its local scope.

def sum(ints: List[Int]): Int = ints match {
   case Nil => 0
   case Cons(x,xs) => x + sum(xs)
}

A companion object in addition to the data type and its data constructors is an object with the same name as the data type where we put various convenience functions for creating or working with values of the data type. For example a function to fill the List data type with n copies of element a. When functions are in the companion object they are called as fn(obj, arg1), while when inside the body of the trait they are called as obj.fn(arg1) or obj fn arg1.

Variadic functions are just providing a little syntactic sugar for creating and passing a Seq of elements explicitly. Seq is the interface in Scala’s collections library implemented
by sequence-like data structures such as lists, queues, and vectors. The special _* type annotation allows us to pass a Seq to a variadic method.

When data is immutable, sharing of immutable data allows to implement functions more efficiently without having to worry about subsequent code modifying the data. Hence there’s no need to pessimistically make copies to avoid modification or corruption. Functional data structures are persistent, meaning that existing references are never changed by operations on the data structure. E.g. Below function that adds all the elements of one list to the end of another:

def append[A](a1: List[A], a2: List[A]): List[A] =
  a1 match {
    case Nil => a2
    case Cons(h,t) => Cons(h, append(t, a2))
}

The takeWhile takes the elements from a list while the predicates is true. dropWhile on the other hand drops the elements while the predicate is true and returns the rest.
val xs: List[Int] = List(1,2,3,4,5)

val ex1 = dropWhile(xs, (x: Int) => x < 4)
// ex1 == List(1,2,3)

val ex2 = dropWhile(xs, (x: Int) => x < 4)
// ex2 == List(4,5)

Scala can infer type arguments for higher-order functions when we group function defination into argument groups. The syntax of calling dropWhile is 'dropWhile(xs)(f)' were dropWhile(xs) is returning a function, which then calls with the argument f as below. Hence more generally, when a function definition contains multiple argument groups, type information flows from left to right across these argument groups.

def dropWhile[A](as: List[A])(f: A => Boolean): List[A] =
   as match {
   case Cons(h,t) if f(h) => dropWhile(t)(f)
   case _ => as
}

val xs: List[Int] = List(1,2,3,4,5)
val ex1 = dropWhile(xs)(x => x < 4)

The functions can be generalized by pulling subexpressions out into function arguments. If a subexpression refers to any local variables e.g. + or * operation, turn the subexpression into a function that accepts these variables as arguments. In below example, the function takes an argument the value to return in the case of the empty list, and the function to add an element to the result in the case of a nonempty list. It essentially replaces the constructors of the list, Nil and Cons, with z and f, were Cons(1, Cons(2, Nil)) becomes f (1, f (2, z )).

def foldRight[A,B](as: List[A], z: B)(f: (A, B) => B): B =
  as match {
     case Nil => z
     case Cons(x, xs) => f(x, foldRight(xs, z)(f))
  }

def sum2(ns: List[Int]) = foldRight(ns, 0)((x,y) => x + y)
def product2(ns: List[Double]) = foldRight(ns, 1.0)(_ * _)

The anonymous function (x,y) => x + y can be written as _ + _ in situations where the types of x and y could be inferred by Scala. Each underscore in an anonymous function expression like _ + _ introduces a new (unnamed) function parameter and references it.
       List is a data type in the Scala standard library and the constructor is called ::, which associates to the right, so 1 :: 2 :: Nil is equal to 1 :: (2 :: Nil), which is equal to List(1,2). Below are few methods defined in List of standard library.
  • def take(n: Int): List[A] — Returns a list consisting of the first n elements of this.
  • def takeWhile(f: A => Boolean): List[A] — Returns a list consisting of the longest valid prefix of this whose elements all pass the predicate f.
  • def forall(f: A => Boolean): Boolean — Returns true if and only if all elements of this pass the predicate f.
  • def exists(f: A => Boolean): Boolean — Returns true if any element of this passes the predicate f.
  • scanLeft and scanRight — Like foldLeft and foldRight, but they return the List of partial results rather than just the final accumulated value.
An algebraic data type is a data type defined by one or more data constructors, each of which may contain zero or more arguments. The data type is the sum or union of its data constructors, and each data constructor is the product of its arguments.

Exception Handling
Scala represents failures and exceptions with ordinary values and uses higher-order functions that abstract out common patterns of error handling and recovery. Exceptions thrown in the functions make the method return value not referentially transparent. RT expressions does not depend on context and may be reasoned about locally, whereas the meaning of non-RT expressions is context-dependent and requires more global reasoning. For example the meaning of the RT expression 42 + 5 doesn’t depend on the larger expression it’s embedded in—it’s always equal to 47. On the other hand the meaning of the expression throw new Exception("fail") is very context-dependent as it takes on different meanings depending on which try block (if any) it’s nested within. Exceptions break RT and introduce context dependence, and should be used only for error handling and not for control flow. Exceptions are not type-safe and the compiler does not know nor can enforce handling unknown exceptions which won’t be detected until runtime. Java’s checked exceptions which force to handle exceptions add a significant boilerplate code and, don’t work for higher-order functions which are generic and can’t possibly be aware of the specific exceptions that could be raised by their arguments. Hence instead of throwing an exception, we return a value indicating that an exceptional condition has occurred and introduce a new generic type with the possible defined values.
    The Option data type is an explicit return type when the function may not have a return value. Option has two cases: it can be defined, in which case it will be a Some, or it can be undefined, in which case it will be None. Option is similar to a List were it can contain at most one element, and many of the List functions have analogous functions on Option.

def mean(xs: Seq[Double]): Option[Double] =
   if (xs.isEmpty) None
   else Some(xs.sum / xs.length)

The list of functions on Option include map, flatmap, getOrElse, orElse and filter as below. The map function can be used to transform the result inside an Option, if it exists or else if None it aborts the remaining operation. flatMap function is similar, except that the function provided to transform the result can itself fail. filter function is used to filter out the relevant values and mostly used within the chain of operations. It can also convert successes into failures if the successful values don’t match the given predicate. getOrElse is used for conversion and for error handling. A common idiom is to do o.getOrElse(throw new Exception("FAIL")) to convert the None case of an Option back to an exception. orElse is similar to getOrElse, except that we return another Option if the first is undefined.

trait Option[+A] {
   def map[B](f: A => B): Option[B]
   def flatMap[B](f: A => Option[B]): Option[B]
   def getOrElse[B >: A](default: => B): B
   def orElse[B >: A](ob: => Option[B]): Option[B]
   def filter(f: A => Boolean): Option[A]
}

The default: => B type annotation in above getOrElse function indicates that the argument is of type B, but won’t be evaluated until it’s needed by the function. Also, the B >: A type parameter indicates that B must be equal to or a supertype of A inorder to convince Scala that it’s still safe to declare Option[+A] as covariant in A.
   The advantage of Option is that we don’t have to check for None at each stage of the computation and can apply several transformations before checking/handling None. Also since Option[A] is a different type than A, the compiler warns to handle the possibility of None instead of defering it.

Ordinary functions can be lifted to operate on Option using lift function were the map turns a function f of type A => B into a function of type Option[A] => Option[B]. Hence any function can be transformed (via lift) to operate within the context of a single Option value. The lift(f) returns a function which maps None to None and applies f to the contents of Some. Here f need not be aware of the Option type at all.

def lift[A,B](f: A => B): Option[A] => Option[B] = _ map f

val absO: Option[Double] => Option[Double] = lift(math.abs)

The Try function below is a general-purpose function which can used to convert from an exception-based API to an Option-oriented API. This uses a non-strict or lazy argument, as indicated by the => A as the type of a. Functions accepting a single argument may be called with braces instead of parentheses, hence Try { age.toInt } is equivalent to Try(age.toInt).
def Try[A](a: => A): Option[A] =
  try Some(a)
  catch { case e: Exception => None }

The map2 function means that we never need to modify any existing functions of two arguments to make them “Option-aware.” We can lift them to operate in the context of Option after the fact.
def map2[A,B,C](a: Option[A], b: Option[B])(f: (A, B) => C):
  Option[C] =
    a flatMap ( aa =>
                b map (bb => f(aa, bb))
     )

map2(optAge, optTickes)(insuranceRateQuote)

Sequence combines a list of Options into one Option containing a list of all the Some values in the original list. If the original list contains None even once, the result of the function is None.
def sequence[A](a: List[Option[A]]): Option[List[A]]

Traverse allows to map elements in the list to Option and return all the Option elements as a combined Option list of values.
def traverse[A, B](a: List[A])(f: A => Option[B]): Option[List[B]]

Scala provides a syntactic construct called the for-comprehension that it expands automatically to a series of flatMap and map calls. A for-comprehension consists of a sequence of bindings, like aa <- a, followed by a yield after the closing brace, where the yield may make use of any of the values on the left side of any previous <- binding. The compiler desugars the bindings to flatMap calls, with the final binding and yield being converted to a call to map.
def map2[A,B,C](a: Option[A], b: Option[B])(f: (A, B) => C):
  Option[C] =
    for {
       aa <- a
       bb <- b
    } yield f(aa, bb)

The Either data type which is an extension to Option allows to track the reason for failure. Either has only two cases were each case carries one value. The Right constructor is reserved for the success case and Left is used for failure.

sealed trait Either[+E, +A]
case class Left[+E](value: E) extends Either[E, Nothing]
case class Right[+A](value: A) extends Either[Nothing, A]

def safeDiv(x: Int, y: Int): Either[Exception, Int] =
  try Right(x / y)
  catch { case e: Exception => Left(e) }

Strictness and laziness
Consider refining a list of data using multiple criterias. This can be achieved by chaining the map and filter functions to evaluate the result using multiple transformations. For each transformation the map and filter each perform their own traversal of the input and allocate temperory lists for the output. These transformations can be fused into a single pass thus avoiding creation of temporary data structures using non-strictness.

Non-strictness is a property of a function. A strict function always evaluates its arguments while a non-strict function does not evaluate one or more of its arguments. In Scala any function definition is strict unless specified otherwise. The Boolean functions && and || which may choose not to evaluate their arguments, are the examples of non-strict funtions. The if statement which is a built-in contruct is a function in Scala which takes 3 arguments, the boolean condition and two clauses to be executed when the condition is true or false. Since the execution of the clauses is optional depending upon the boolean value of the evaluated condition, if functon is non-strict. In Scala a non-strict function takes its arguments by name rather than by value.

def if2[A](cond: Boolean, onTrue: () => A, onFalse: () => A): A =
   if (cond) onTrue() else onFalse()
if2(a < 22,
   () => println("a"),
   () => println("b")
)

A value of type () => A is a function that accepts zero arguments and returns an A, which is an syntactic alias for the type Function0[A]. The unevaluated form of an expression is called a thunk. The thunk can be forced to evaluate the expression and get a result by invoking the function, passing an empty argument list, e.g. onTrue() or onFalse() as above. Overall here we pass a function of no arguments in place of each non-strict parameter, and then explicitly call the function to obtain a result in the body. Scala has common syntax for wrapping the expression in a thunk as below, were the unevaluated arguments are passed with an arrow => before their type. The arguments which then were passed unevaluated to a function will be evaluated once for each place it’s referenced in the body of the function without caching the previously evaluated results. The value can be cached explicitly to only evaluate the result once, by using the lazy keyword, preventing repeated evaluations triggered by subsequent references.

def if2[A](cond: Boolean, onTrue: => A, onFalse: => A): A =
  if (cond) onTrue else onFalse

def maybeTwice2(b: Boolean, i: => Int) = {
  lazy val j = i
  if (b) j+j else 0
}

Strictness: If the evaluation of an expression runs forever or throws an error instead of returning a definite value, we say that the expression doesn’t terminate, or that it evaluates to bottom. A function f is strict if the expression f(x) evaluates to bottom for all x that evaluate to bottom.

No comments: