Functional Programming in Scala - Day 2
Read from start of Section 2.5 to section 2.5.1
Summary
What are polymorphic functions?
Section 2.5 goes deeper into what polymorphic functions are. These functions are essentially functions that are abstracted over the type. For example, the following findFirst function is a monomorphic function, because it only works for a type of String. It is also monomorphic in the sense that the matching function is essentially fixed to the == function.
1
2
3
4
5
6
7
8
9
def findFirst(strings: Array[String], key: String): Int = {
@annotation.tailrec
def loop(i: Int): Int = {
if (i >= strings.length) return -1
else if (strings(i) == key) i
else loop(i+1)
}
loop(0)
}
A polymorphic version of this function would now look like the following.
1
2
3
4
5
6
7
8
9
10
def findFirst[A](as: Array[A], predicate: A => Boolean): Int = {
@annotation.tailrec
def loop(i: Int): Int = {
if (i >= as.length) return -1
else if (predicate(as(i)) return i
else loop(i+1)
}
loop(0)
}
Notice how the polymorphic version has 2 key differences from the monomorphic version. The first is that the type of the array for the first argument is now generalized so that it can accept any type. Next, the matching function has now been replaced by a predicate rather than just relying on the == function. In this way, we can abstract this method over the type of the array and the function used to search a key within it.
Exercise 2.2
Exercise 2.2 is an exercise where the task is to implement an isSorted function that checks whether an Array[A] is sorted according to a given comparison function. The method signature looks like the following: def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean.
Attempt 1
Figuring out the solution itself was pretty simple. All I did was just write a double for loop to check to see if the ordered function returned true for every pair of elements in the array.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean = {
@annotation.tailrec
def outerLoop(i: Int): Boolean = {
@annotation.tailrec
def innerLoop(curr: Int, toCompare: Int): Boolean = {
if (toCompare >= as.length) return true
else if (!ordered(as(curr), as(toCompare))) return false
else innerLoop(curr, toCompare+1)
}
if (i >= as.length) return true
else if (!innerLoop(i-1, i)) return false
else outerLoop(i+1)
}
outerLoop(1)
}
But then I realized that you don’t actually need to do a double for loop. Instead, it suffices to only use a single for loop and to compare if ordered(a, b) = true for every adjacent elements. By transitivity, if ordered(a, b) = true and ordered(b, c) = true, then ordered(a, c) = true. So when we find out that ordered(c, d) = true, then again by transitivity, we get that ordered(a, d) = true. So once we figure out that ordered is true for every adjacent element, then using transitivity, it is guaranteed that for every pair of elements, ordered holds. (This is all assuming that the first argument in ordered preceeds the second argument in the list). So the solution then simplifies to the following.
1
2
3
4
5
6
7
8
9
10
def isSorted[A](as: Array[A], ordered: (A, A) => Boolean): Boolean = {
@annotation.tailrec
def loop(i: Int): Boolean = {
if (i >= as.length) return true
else if (!ordered(as(i-1), as(i))) return false
else loop(i+1)
}
loop(1)
}
Side Quest: Testing Polymorphic Functions
The Testing Code
As with any code, I wanted to write tests for the isSorted function. The interesting part came when I tried to mix the types that were assigned to A. As you can see, the last test is checking to see if the array ["hello", "world"] is in increasing order. We expect the result to be true.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
object IsSortedModule {
def isSorted[A](as: Array[A], ordered: (A, A) => Boolean): Boolean = { ... }
private def formatStatus[A]( ... ): String = {
// Helper function that returns a "Pass" or "Fail" status message
// based on if result == expected.
}
def main(args: Array[String]): Unit = {
def isIntIncreasing(a: Int, b: Int) = a <= b
def isIntDecreasing(a: Int, b: Int) = a >= b
def isStringIncreasing(a: String, b: String) = a <= b
var testCases = Array(
(Array(1,2,3), isIntIncreasing, true),
...
(Array(1, 2), isIntDecreasing, false),
(Array("hello", "world"), isStringIncreasing, true),
)
def runCases(testCaseIndex: Int): Unit = {
if (testCaseIndex >= testCases.length) return
val (testCase, orderingFunction, expectedResult) = testCases(testCaseIndex)
val result = isSorted(testCase, orderingFunction);
val testStatus = formatStatus(testCaseIndex, testCase, result, expectedResult)
println(testStatus)
runCases(testCaseIndex+1)
}
runCases(0)
}
}
The Error
However, when we try to compile this code, we get the following error. I could not figure it out for a while why this was happening, until I decided to actually take a look at the type with the REPL mode.
1
2
3
4
5
6
[error] Found: (orderingFunction : (Int & String, Int & String) => Boolean)
[error] Required: (testCase.T, testCase.T) => Boolean
[error] val result = isSorted(testCase, orderingFunction);
[error] ^^^^^^^^^^^^^^^^
Error compiling project (Scala 3.7.4, JVM (21))
Compilation Failed
Scala is a statically typed language, where every variable’s type is known at compile time. Now since the array testCases is a mixed type array, its type is set to be Array[(Array[Int] | Array[String]), ((Int, Int) => Boolean | (String, String) => Boolean), Boolean]. Zooming in on the second argument of each test case, we see that the type of it is ((Int, Int) => Boolean | (String, String) => Boolean). In other words, it can be a function that takes in Int types, or String types. Since the function in the array testCases could be either (Int, Int) => Boolean or (String, String) => Boolean), the Scala compiler will set the type of orderingFunction to be something that can operate for both Int and String. Hence the line
1
[error] Found: (orderingFunction : (Int & String, Int & String) => Boolean)
If Scala had set the type of orderingFunction to take in either Int or String - i.e. (Int | String, Int | String) => Boolean) - then it would be possible to compile a piece of code where there is a type mismatch between the data of the test case and the function. For example, it would be possible to compile a piece of code like the following:
1
2
3
4
5
6
7
8
def isIntIncreasing(a: Int, b: Int) = a <= b
def isStringIncreasing(a: String, b: String) = a <= b
var testCases = Array(
(Array("hello", "world"), isIntIncreasing, true),
(Array(1,2,3), isStringIncreasing, true),
...
)
The error behavior would only be caught once the code actually runs, when it tries to do isIntIncreasing("hello", "world"), because obviously neither "hello" nor "world" are of type Int. So the compiler has to enforce that the function parameter must be of type (Int & String, Int & String) => Boolean. However, the isSorted function expects testCase.T in place of Int & String, where testCase.T is the type that is inferred from the data. In this case, the type inferred is Int | String so there is a type mismatch, leading to the compile error. The REPL is pretty nice to use to explore these things.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
scala> var arr = Array((Array(1,2,3), true), (Array("hello", "world"), true))
var arr: Array[(Array[Int] | Array[String], Boolean)] = Array((Array(1,2,3), true), (Array("hello", "world"), true))
scala> :type arr
Array[(Array[Int] | Array[String], Boolean)]
scala> :type arr(0)
(Array[Int] | Array[String], Boolean)
scala> :type arr(0)(0)
Array[Int] | Array[String]
scala> :type arr(0)(0)(0)
Int | String
The Fix
The solution that I came up with is to create a wrapper type that would preserve the type information. Then this can be fed into the test runner without the compiler getting confused about what type it is.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
object IsSortedModule {
...
class TestCase[A,B](data: Array[A], fn: (A, A) => B, result: B) // <-- NEW!
def main(args: Array[String]): Unit = {
...
var testCases = Array(
TestCase(Array(1,2,3), isIntIncreasing, true),
...
TestCase(Array(1, 2), isIntDecreasing, false),
TestCase(Array("hello", "world"), isStringIncreasing, true),
)
...
runCases(0)
}
}
Notes
2.5 Polymorphic functions: abstracting over types
- Up until this point, the book only talked about monomorphic functions, which are functions that operate on only one type of data.
- This section introduces polymorphic functions.
- These are functions that would work for any type that is given.
- The notion of “polymorphism” is used in a slightly different meaning from OOP here. This form of polymorphism is sometimes called parametric polymorphism.
- For example, the functions
absandfactorialthat we saw in the previous section are specific to arguments of typeInt. - Even the HOF
formatResultis fixed to operate on functions that take arguments of typeInt.
2.5.1 An example of a polymorphic function
- Polymorphic functions can be discovered by observing several monomorphic functions.
- I mean, I have also felt this but I feel like there are several functions that can behave similarly, but not close enough so that you’d actually be able to create a polymorphic function out of them…
- This also seems to be the difficult part of programming - finding abstractions.
- I feel like this can also go very wrong when you find the incorrect abstraction.
- I wonder if you can actually make something so abstract that changing it becomes a monumental task?
- Below is an example of a monomorphic function that finds a
Stringin an array.1 2 3 4 5 6 7 8 9
def findFirst(ss: Array[String], key: String): Int = { @annotation.tailrec def loop(n: Int): Int = if (n >= ss.length) -1 else if (ss(n) == key) n else loop(n+1) loop(0) }
- This function is specialized for searching for a
Stringin anArrayofStringvalues.- This sort of specialization for
Stringtypes is what makes it a “monomorphic” function.
- This sort of specialization for
- The important thing to realize that if you replace the type
Stringin the monomorphicfindFirstmethod with some generic type, then the method still has the same functionality. - Below is the polymorphic version of
findFirst.1 2 3 4 5 6 7 8 9 10 11 12 13
// Everything within the "[]" after the method name are the type parameters. // So in this case "A" is a type parameter. You may call the type parameters // anything you want. It is completely syntactically correct to replace "A" // with say, "BEEPBOOP" def findFirst[A](as: Array[A], p: A => Boolean): Int = { @annotation.tailrec def loop(n: Int): Int = if (n >= as.length) -1 else if (p(as(n))) n else loop(n+1) loop(0) }
- This is an example of a polymorphic function, sometimes called a generic function.
- Notice how this is abstracting over the type of the array and the function used for searching in it.
- The type parameter list introduces type variables that can be referenced in the rest of the type signature.
- This exercise of turning monomorphic functions into polymorphic functions reminds me of the LazyArray problem, where you can continuously add maps.
- I wonder if Scala would actually be a good tool for writing the code for that.
** Exercise 2.2 **
- Implement
isSorted, which checks whether anArray[A]is sorted according to a given comparison function. - The method signature must be
def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean
** Attempt 1 **
- To get something to work wasn’t actually that difficult.
- The algorithm itself is a double loop, where the outer loop will choose one element, and then the second loop will check all of the elements after the chosen element to see if it follows the ordering.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
object IsSortedModule { def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean = { @annotation.tailrec def outerLoop(n: Int): Boolean { @annotation.tailrec def innerLoop(curr: Int, toCompare: Int): Boolean = { if (toCompare >= as.length) true else if (!ordered(as(curr), as(toCompare)) return false else innerLoop(curr, toCompare+1) } if (n >= as.length) true else if (!innerLoop(n-1, n)) return false else outerLoop(n+1) } outerLoop(1) } }
** Attempt 2 **
- However, after writing the solution to attempt 1, I realized that you don’t actually need 2 loops.
- A single loop just comparing every adjacent pairs of elements if
ordered(a,b) = trueis enough. - The reason why I thought you needed two loops was because was to handle a case where there exists elements
xandywherepos(y) - pos(x) > 1andordered(x,y) = false.- However, the single loop algorithm will take correctly return this as false.
- Let’s say we have a list
a_1, ..., a_i, ..., a_j, ..., a_n, wherej - i > 1andordered(a_i, a_j) = false. - If there are any elements up until
a_{j-1}whereorderedreturns false for adjacent pairs, then the algorithm is correct. - Now we are dealing with the case where
orderedreturns true for every adjacent pairs up untila_{j-1}. - By transitivity, it must be so that
ordered(a_i, a_{j-1}) = true. - Then the assumption that
ordered(a_{j-1}, a_j) = truecannot hold true because by transitivity, this would imply thatordered(a_i, a_j) = true. However, this is not the case. - Therefore, it must be so that
ordered(a_{j-1}, a_j) = false. - This means that if we have a list
a_1, ..., a_i, ..., a_j, ..., a_n, wherej-i>1andordered(a_i, a_j) = false, then our algorithm would correctly capture that this is not an ordered list, as we have shown that in this situation,ordered(a_{j-1}, a_j) = false.
- This means that the solution can be simplified to look like the following.
1 2 3 4 5 6 7 8 9 10 11
object IsSortedModule { def isSorted[A](as: Array[A], ordered: (A,A) => Boolean): Boolean = { @annotation.tailrec def loop(curr: Int, next: Int): Boolean = { if (next >= as.length) return true else if (!ordered(as(curr), as(next))) return false else loop(next, next+1) } loop(0,1) } }
- Consulting the solutions, it pretty much has the same solution. The official solutions just has a single input argument for the current index though.