Post

Functional Programming in Scala - Day 2

Functional Programming in Scala - Day 2

Read from start of Section 2.5 to section 2.5.1

Summary

What are polymorphic functions?

Section 2.5 goes deeper into what polymorphic functions are. These functions are essentially functions that are abstracted over the type. For example, the following findFirst function is a monomorphic function, because it only works for a type of String. It is also monomorphic in the sense that the matching function is essentially fixed to the == function.

1
2
3
4
5
6
7
8
9
def findFirst(strings: Array[String], key: String): Int = {
  @annotation.tailrec
  def loop(i: Int): Int = {
    if (i >= strings.length) return -1
    else if (strings(i) == key) i
    else loop(i+1)
  }
  loop(0)
}

A polymorphic version of this function would now look like the following.

1
2
3
4
5
6
7
8
9
10
def findFirst[A](as: Array[A], predicate: A => Boolean): Int = {
  @annotation.tailrec
  def loop(i: Int): Int = {
    if (i >= as.length) return -1
    else if (predicate(as(i)) return i
    else loop(i+1)
  }

  loop(0)
}

Notice how the polymorphic version has 2 key differences from the monomorphic version. The first is that the type of the array for the first argument is now generalized so that it can accept any type. Next, the matching function has now been replaced by a predicate rather than just relying on the == function. In this way, we can abstract this method over the type of the array and the function used to search a key within it.

Exercise 2.2

Exercise 2.2 is an exercise where the task is to implement an isSorted function that checks whether an Array[A] is sorted according to a given comparison function. The method signature looks like the following: def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean.

Attempt 1

Figuring out the solution itself was pretty simple. All I did was just write a double for loop to check to see if the ordered function returned true for every pair of elements in the array.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean = {
  @annotation.tailrec
  def outerLoop(i: Int): Boolean = {
    @annotation.tailrec
    def innerLoop(curr: Int, toCompare: Int): Boolean = {
      if (toCompare >= as.length) return true
      else if (!ordered(as(curr), as(toCompare))) return false
      else innerLoop(curr, toCompare+1)
    }

    if (i >= as.length) return true
    else if (!innerLoop(i-1, i)) return false
    else outerLoop(i+1)
  }
  outerLoop(1)
}

But then I realized that you don’t actually need to do a double for loop. Instead, it suffices to only use a single for loop and to compare if ordered(a, b) = true for every adjacent elements. By transitivity, if ordered(a, b) = true and ordered(b, c) = true, then ordered(a, c) = true. So when we find out that ordered(c, d) = true, then again by transitivity, we get that ordered(a, d) = true. So once we figure out that ordered is true for every adjacent element, then using transitivity, it is guaranteed that for every pair of elements, ordered holds. (This is all assuming that the first argument in ordered preceeds the second argument in the list). So the solution then simplifies to the following.

1
2
3
4
5
6
7
8
9
10
def isSorted[A](as: Array[A], ordered: (A, A) => Boolean): Boolean = {
  @annotation.tailrec
  def loop(i: Int): Boolean = {
    if (i >= as.length) return true
    else if (!ordered(as(i-1), as(i))) return false
    else loop(i+1)
  }
  
  loop(1)
}

Side Quest: Testing Polymorphic Functions

The Testing Code

As with any code, I wanted to write tests for the isSorted function. The interesting part came when I tried to mix the types that were assigned to A. As you can see, the last test is checking to see if the array ["hello", "world"] is in increasing order. We expect the result to be true.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
object IsSortedModule {
  def isSorted[A](as: Array[A], ordered: (A, A) => Boolean): Boolean = { ... }

  private def formatStatus[A]( ... ): String = {
    // Helper function that returns a "Pass" or "Fail" status message
    // based on if result == expected.
  }

  def main(args: Array[String]): Unit = {
    def isIntIncreasing(a: Int, b: Int) = a <= b
    def isIntDecreasing(a: Int, b: Int) = a >= b
    def isStringIncreasing(a: String, b: String) = a <= b

    var testCases = Array(
      (Array(1,2,3), isIntIncreasing, true),
      ...
      (Array(1, 2), isIntDecreasing, false),
      (Array("hello", "world"), isStringIncreasing, true),
    )

    def runCases(testCaseIndex: Int): Unit = {
      if (testCaseIndex >= testCases.length) return

      val (testCase, orderingFunction, expectedResult) = testCases(testCaseIndex)
      val result = isSorted(testCase, orderingFunction);
      val testStatus = formatStatus(testCaseIndex, testCase, result, expectedResult)
      println(testStatus)
      runCases(testCaseIndex+1)
    }

    runCases(0)
  }
}

The Error

However, when we try to compile this code, we get the following error. I could not figure it out for a while why this was happening, until I decided to actually take a look at the type with the REPL mode.

1
2
3
4
5
6
[error] Found:    (orderingFunction : (Int & String, Int & String) => Boolean)
[error] Required: (testCase.T, testCase.T) => Boolean
[error]       val result = isSorted(testCase, orderingFunction);
[error]                                       ^^^^^^^^^^^^^^^^
Error compiling project (Scala 3.7.4, JVM (21))
Compilation Failed

Scala is a statically typed language, where every variable’s type is known at compile time. Now since the array testCases is a mixed type array, its type is set to be Array[(Array[Int] | Array[String]), ((Int, Int) => Boolean | (String, String) => Boolean), Boolean]. Zooming in on the second argument of each test case, we see that the type of it is ((Int, Int) => Boolean | (String, String) => Boolean). In other words, it can be a function that takes in Int types, or String types. Since the function in the array testCases could be either (Int, Int) => Boolean or (String, String) => Boolean), the Scala compiler will set the type of orderingFunction to be something that can operate for both Int and String. Hence the line

1
[error] Found: (orderingFunction : (Int & String, Int & String) => Boolean)

If Scala had set the type of orderingFunction to take in either Int or String - i.e. (Int | String, Int | String) => Boolean) - then it would be possible to compile a piece of code where there is a type mismatch between the data of the test case and the function. For example, it would be possible to compile a piece of code like the following:

1
2
3
4
5
6
7
8
def isIntIncreasing(a: Int, b: Int) = a <= b
def isStringIncreasing(a: String, b: String) = a <= b

var testCases = Array(
  (Array("hello", "world"), isIntIncreasing, true),
  (Array(1,2,3), isStringIncreasing, true),
  ...
)

The error behavior would only be caught once the code actually runs, when it tries to do isIntIncreasing("hello", "world"), because obviously neither "hello" nor "world" are of type Int. So the compiler has to enforce that the function parameter must be of type (Int & String, Int & String) => Boolean. However, the isSorted function expects testCase.T in place of Int & String, where testCase.T is the type that is inferred from the data. In this case, the type inferred is Int | String so there is a type mismatch, leading to the compile error. The REPL is pretty nice to use to explore these things.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
scala> var arr = Array((Array(1,2,3), true), (Array("hello", "world"), true))
var arr: Array[(Array[Int] | Array[String], Boolean)] = Array((Array(1,2,3), true), (Array("hello", "world"), true)) 

scala> :type arr
Array[(Array[Int] | Array[String], Boolean)]

scala> :type arr(0)
(Array[Int] | Array[String], Boolean)

scala> :type arr(0)(0)
Array[Int] | Array[String]

scala> :type arr(0)(0)(0)
Int | String

The Fix

The solution that I came up with is to create a wrapper type that would preserve the type information. Then this can be fed into the test runner without the compiler getting confused about what type it is.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
object IsSortedModule {
  ...

  class TestCase[A,B](data: Array[A], fn: (A, A) => B, result: B) // <-- NEW!

  def main(args: Array[String]): Unit = {
    ...

    var testCases = Array(
      TestCase(Array(1,2,3), isIntIncreasing, true),
      ...
      TestCase(Array(1, 2), isIntDecreasing, false),
      TestCase(Array("hello", "world"), isStringIncreasing, true),
    )

    ...

    runCases(0)
  }
}

Notes

2.5 Polymorphic functions: abstracting over types

  • Up until this point, the book only talked about monomorphic functions, which are functions that operate on only one type of data.
  • This section introduces polymorphic functions.
    • These are functions that would work for any type that is given.
    • The notion of “polymorphism” is used in a slightly different meaning from OOP here. This form of polymorphism is sometimes called parametric polymorphism.
  • For example, the functions abs and factorial that we saw in the previous section are specific to arguments of type Int.
  • Even the HOF formatResult is fixed to operate on functions that take arguments of type Int.

2.5.1 An example of a polymorphic function

  • Polymorphic functions can be discovered by observing several monomorphic functions.
    • I mean, I have also felt this but I feel like there are several functions that can behave similarly, but not close enough so that you’d actually be able to create a polymorphic function out of them…
    • This also seems to be the difficult part of programming - finding abstractions.
      • I feel like this can also go very wrong when you find the incorrect abstraction.
      • I wonder if you can actually make something so abstract that changing it becomes a monumental task?
  • Below is an example of a monomorphic function that finds a String in an array.
    • 1
      2
      3
      4
      5
      6
      7
      8
      9
      
      def findFirst(ss: Array[String], key: String): Int = {
        @annotation.tailrec
        def loop(n: Int): Int =
          if (n >= ss.length) -1
          else if (ss(n) == key) n
          else loop(n+1)
      
        loop(0)
      }
      
    • This function is specialized for searching for a String in an Array of String values.
      • This sort of specialization for String types is what makes it a “monomorphic” function.
  • The important thing to realize that if you replace the type String in the monomorphic findFirst method with some generic type, then the method still has the same functionality.
  • Below is the polymorphic version of findFirst.
    • 1
      2
      3
      4
      5
      6
      7
      8
      9
      10
      11
      12
      13
      
      // Everything within the "[]" after the method name are the type parameters. 
      // So in this case "A" is a type parameter. You may call the type parameters 
      // anything you want. It is completely syntactically correct to replace "A" 
      // with say, "BEEPBOOP"
      def findFirst[A](as: Array[A], p: A => Boolean): Int = {
        @annotation.tailrec
        def loop(n: Int): Int = 
          if (n >= as.length) -1
          else if (p(as(n))) n
          else loop(n+1)
      
        loop(0)
      }
      
    • This is an example of a polymorphic function, sometimes called a generic function.
    • Notice how this is abstracting over the type of the array and the function used for searching in it.
    • The type parameter list introduces type variables that can be referenced in the rest of the type signature.
  • This exercise of turning monomorphic functions into polymorphic functions reminds me of the LazyArray problem, where you can continuously add maps.
    • I wonder if Scala would actually be a good tool for writing the code for that.

** Exercise 2.2 **

  • Implement isSorted, which checks whether an Array[A] is sorted according to a given comparison function.
  • The method signature must be def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean

** Attempt 1 **

  • To get something to work wasn’t actually that difficult.
  • The algorithm itself is a double loop, where the outer loop will choose one element, and then the second loop will check all of the elements after the chosen element to see if it follows the ordering.
  • 1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    
    object IsSortedModule {
      def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean = {
        @annotation.tailrec
        def outerLoop(n: Int): Boolean {
    
          @annotation.tailrec
          def innerLoop(curr: Int, toCompare: Int): Boolean = {
            if (toCompare >= as.length) true 
            else if (!ordered(as(curr), as(toCompare)) return false
            else innerLoop(curr, toCompare+1)
          }
    
          if (n >= as.length) true
          else if (!innerLoop(n-1, n)) return false
          else outerLoop(n+1)
        }
    
        outerLoop(1)
      }
    } 
    

** Attempt 2 **

  • However, after writing the solution to attempt 1, I realized that you don’t actually need 2 loops.
  • A single loop just comparing every adjacent pairs of elements if ordered(a,b) = true is enough.
  • The reason why I thought you needed two loops was because was to handle a case where there exists elements x and y where pos(y) - pos(x) > 1 and ordered(x,y) = false.
    • However, the single loop algorithm will take correctly return this as false.
    • Let’s say we have a list a_1, ..., a_i, ..., a_j, ..., a_n, where j - i > 1 and ordered(a_i, a_j) = false.
    • If there are any elements up until a_{j-1} where ordered returns false for adjacent pairs, then the algorithm is correct.
    • Now we are dealing with the case where ordered returns true for every adjacent pairs up until a_{j-1}.
    • By transitivity, it must be so that ordered(a_i, a_{j-1}) = true.
    • Then the assumption that ordered(a_{j-1}, a_j) = true cannot hold true because by transitivity, this would imply that ordered(a_i, a_j) = true. However, this is not the case.
    • Therefore, it must be so that ordered(a_{j-1}, a_j) = false.
    • This means that if we have a list a_1, ..., a_i, ..., a_j, ..., a_n, where j-i>1 and ordered(a_i, a_j) = false, then our algorithm would correctly capture that this is not an ordered list, as we have shown that in this situation, ordered(a_{j-1}, a_j) = false.
  • This means that the solution can be simplified to look like the following.
  • 1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    
    object IsSortedModule {
     def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean = {
       @annotation.tailrec
       def loop(curr: Int, next: Int): Boolean = {
         if (next >= as.length) return true
         else if (!ordered(as(curr), as(next))) return false
         else loop(next, next+1)
       }
       loop(0,1)
     }
    } 
    
  • Consulting the solutions, it pretty much has the same solution. The official solutions just has a single input argument for the current index though.
This post is licensed under CC BY 4.0 by the author.