CS2030S Notes

AY2021/2022 Semester 2, written by See Toh Jin Wei
#nus #cs2030s

Summary Notes

Available here in PDF.

Stack & heap diagram summary is at the end of the above PDF.

Program and Compiler

Compiled vs Interpreted Programs

Compiler

A software tool that reads in the entire program written in a higher-level programming language and translates it into machine code.

C/C++ compiler -- clang.

Java can be compiled into bytecode and then interpreted and compiled on-the-fly by the Java Virutal Machine (JVM).

javac Hello.java -- to compile a Java program into a bytecode called Hello.class

java Hello -- to execute the Hello.class

Interpreter

Software that reads in the program one statement at a time interprets what the statement means, and executes its directly.

Python and Javascript executes like this.

Java can also be directly interepreted by Java interpreter.

jshell -- calls Java interpreter in interactive mode (exit with /exit)

jshell Hello.jsh to run all Java statements in a jshell file.


Variables and Types

Dynamic Typing vs Static Typing

Dynamic Typing

i = 4   # i is an integer
i = "5" # i is now a string

Static Typing

int i;   // declaring a variable
i = 4;   // ok
i = "5"; // error, cannot assign a string to an `int`

The type that a variable is assigned with is called the compile-time type.

Compiler will throw an error if there is a type mismatch.

Strong Typing vs Weak Typing

int i; // declare a variable
i = 4; // ok
i = (int)"5";

Line 3 is ok for weak typing, but will throw an error for strong typing.

| incompatible types: java.lang.String cannot be converted to int (Java example)

Types in Java

Primitive Types

Integral values

  • byte -- 8-bit signed integers
  • short -- 16-bit signed integers
  • int -- 32-bit signed integers
  • long -- 64-bit signed integers
  • char -- 16-bit unsigned integers representing UTF-16 Unicode characters

Floating-point values

  • float -- 16-bit floating-point numbers
  • double -- 32-bit floating-point numbers

Boolean values

  • true
  • false

Primitive type variables never share their value with each other.

int i = 1000;
int j = i; // j is 1000
i = i + 1; // i is updated

i; // 1001
j; // 1000 (unchanged)

Subtypes

Let S and T be two types. We say that T is a subtype of S if a piece of code written for variables of type S can also safely be used on variables of type T.

T <: S or S :> T to denote T is subtype of S.

* both transitive and reflexive (clearly not symmetric)

For Java: byte <: short <: int <: long <: float <: double char <: int

Right to left is narrowing (requires explicit type casting). Left to right is widening (always ok).

double d = 5.0;
int i = 5;
d = i; // ok
i = d; // error (incompatible types: possible lossy conversion from double to int)

Heap and Stack

JVM will partition memory into several regions.

method area for storing the code for the methods metaspace for storing meta information (metadata) about classes, like static fields which are shared amongst all class objects heap for storing dynamically allocated objects stack for storing local variables (including primitive types and object references) and call frames

Heap and stack are common to all execution environments.


Object Oriented Programming (OOP) Principles

  • Encapsulation
  • Abstraction
  • Inheritance
  • Polymorphism

Encapsulation

Abstraction: Composite Data Type

  • Groups primitive types together

(in C)

typedef struct {
  double x, y; // (x,y) coordinate of the center.
  double r; // radius
} circle;

// functions that operate on a circle
double circle_area(circle c) { ... };
bool   circle_contains_point(circle c, double x, double y) { ... };
  :

Abstraction: Class and Object (Encapsulation)

  • using classes

(in Java)

// Circle v0.1
class Circle {
  double x;
  double y;
  double r;

  double getArea() {
    return 3.141592653589793 * r * r;
  }
}

Circle c = new Circle(); // c is now a Circle object
c.r = 10; // set the radius to 10
c.getArea(); // returns 314.1592653589793

To model a problem in an object-oriented manner, we typically model the nouns as classes and objects, the properties or relationships among the classes as fields, and the verbs or actions of the corresponding objects as methods.

Circle class is a reference type.

Circle c1 = new Circle();
Circle c2 = c1;
System.out.println(c2.r); // print 0
c1.r = 10.0;
System.out.println(c2.r); // print 10.0

// c1 and c2 both refer to the same object because Circle is a reference type.

Special Reference Value: null

Circle c1;
c1.r = 10.0;  // error
|  Exception java.lang.NullPointerException

Remember to always instantiate a reference variable before using it.


Information Hiding

Data Hiding

// Circle v0.2
class Circle {
  private double x;
  private double y;
  private double r;

  public double getArea() {
    return 3.141592653589793 * r * r;
  }
}

If access modifier is not speciifed, it defaults to the default modifier.

  • Private modifier
    • not accessible and not modifiable outside of the class Circle
  • Public modifier
    • accessible and modifiable outside of the class Circle

Constructor

// Circle v0.3
class Circle {
  private double x;
  private double y;
  private double r;

  public Circle(double x, double y, double r) {
    // constructor method
    this.x = x;
    this.y = y;
    this.r = r;
  }

  public double getArea() {
    return 3.141592653589793 * this.r * this.r;
  }
}

Circle c = new Circle(0.0, 0.5, 10.0); // creates a Circle object

Constructor method

  • has the same name as the class
  • has no return type

The this keyword

Refers to the class variable, rather than a local variable or parameter.


Accessors and Mutators

// Circle v0.4
class Circle {
  private double x;
  private double y;
  private double r;

  public Circle(double x, double y, double r) {
    this.x = x;
    this.y = y;
    this.r = r;
  }

  public double getX() {
    return this.x;
  }

  public void setX(double x) {
    this.x = x;
  }

  public double getY() {
    return this.y;
  }

  public void setY(double y) {
    this.y = y;
  }

  public double getR() {
    return this.r;
  }

  public void setR(double r) {
    this.r = r;
  }

  public double getArea() {
    return 3.141592653589793 * this.r * this.r;
  }
}

Class methods responsible for accessing and modifying the class variables.

This is to avoid breaking the abstraction barrier.

Tell, Don't Ask Avoid using functions outside that directly reference the class variables. Instead, use class methods.

Try and avoid using getters and setters.


Class Fields

class Math {
  :
  public static final double PI = 3.141592653589793;
  :
  :
}

We call these static fields that are associated with a class as class fields, and fields that are associated with an object as instance fields. Note that, a static class field needs not be final and it needs not be public. Class fields are useful for storing pre-computed values or configuration parameters associated with a class rather than individual objects.

final variables are constants and cannot be re-assigned.

public double getArea() {
  return java.lang.Math.PI * this.r * this.r;
}
import java.lang.Math;

public double getArea() {
  return Math.PI * this.r * this.r;
  // as opposed to "return 3.141592653589793 * this.r * this.r;" as above
}

Use class fields whenever possible, instead of re-declaring a constant everywhere required.

"Final" Circle Class

// version 0.4
import java.lang.Math;

/**
 * A Circle object encapsulates a circle on a 2D plane.
 */
class Circle {
  private double x;  // x-coordinate of the center
  private double y;  // y-coordinate of the center
  private double r;  // the length of the radius

  /**
   * Create a circle centered on (x, y) with given radius
  */
  public Circle(double x, double y, double r) {
    this.x = x;
    this.y = y;
    this.r = r;
  }

  /**
   * Return the area of the circle.
   */
  public double getArea() {
    return Math.PI * this.r * this.r;
  }

  /**
   * Return true if the given point (x, y) is within the circle.
   */
  public boolean contains(double x, double y) {
    // didn't test, but should work.
    return ((x - this.x) * (x - this.x) + (y - this.y) * (y - this.y)) <= this.r * this.r;
  }
}

Class Methods

class Circle {
  private double x;  // x-coordinate of the center
  private double y;  // y-coordinate of the center
  private double r;  // the length of the radius
  private final int id; // identifier
  private static int lastId = 0; // the id of the latest circle instance

  /**
   * Create a circle centered on (x, y) with a given radius
   */
  public Circle(double x, double y, double r) {
    this.x = x;
    this.y = y;
    this.r = r;
    this.id = Circle.lastId;
    Circle.lastId += 1;
  }

  /**
   * Return how many circles have ever existed.
   */
  public static int getNumOfCircles() {
    return Circle.lastId;
  }
}

static method getNumOfCircles is associated with a class, not a instance of the class.

Circle.getNumOfCircles();

Main Method

public final static void main(String[] args) {
}

Every Java program has a class method called main.

Main method takes in command-line arguments as parameter args.


Composition

// version 0.5
import java.lang.Math;

/**
 * A Circle object encapsulates a circle on a 2D plane.
 */
class Circle {
  private Point c;   // the center
  private double r;  // the length of the radius

  /**
   * Create a circle centered on Point c with given radius r
  */
  public Circle(Point c, double r) {
    this.c = c;
    this.r = r;
  }

  /**
   * Return the area of the circle.
   */
  public double getArea() {
    return Math.PI * this.r * this.r;
  }

  /**
   * Return true if the given point p is within the circle.
   */
  public boolean contains(Point p) {
    // TODO Left as an exercise
    return false;
  }
}

Using a Point class to further abstract the coordinates system.

Sharing References (Aliasing)

Point p = new Point(0, 0);
Circle c1 = new Circle(p, 1);
Circle c2 = new Circle(p, 4);

p.moveTo(1, 1); // mutator method for Point

// now, c1 and c2 are both centered around (1, 1) instead of (0, 0)

If moveTo is instead a Circle method instead of Point method, doing c1.moveTo(p2) will mutate c1 but not c2.


Inheritance

Subtypes

Extending classes with inheritance.

Java classes only can extend (inherit) from 1 parent class. Java classes can implement multiple interfaces.

// version 0.3 (using inheritance)
class ColoredCircle extends Circle {
  private Color color;

  public ColoredCircle(Point center, double radius, Color color) {
    super(center, radius);  // call the parent's constructor
    this.color = color;
  }
}

Circle is the parent class or superclass of ColoredCircle. ColoredCircle is a subclass of Circle.

ColoredCircle inherits from Circle since all the public fields and methods of Circle are now accessible to ColoredCircle. NOT private.

super calls the constructor of the superclass.

ColoredCircle c = new Circle(p, 0); // error
Circle c = new ColoredCircle(p, 0, blue); // OK

// where p is a Point object and blue is a Color object.

Compile-time type of c is Circle. Run-time type of c is ColoredCircle.


Overriding

If a class does not explicitly inherit from another class, the class will inherit from Object class implicitly. Object is the at the root of the class hierarchy.

final declaration prevents a method from being overridden.

The toString Method

toString is invoked implicitly by Java, by default to convert a reference object to a String object during string concatenation using the operator +.

Circle c = new Circle(new Point(0, 0), 4.0);
String s = "Circle c is " + c; // Circle c is Circle@1ce92674

The 92674 behind the Circle@ is the reference to the object (and can change based on different runs).

Overriding toString method for a class

// version 0.6
class Circle {
  private Point c;   // the center
  private double r;  // the length of the radius
  :
  :
  /**
   * Return the string representation of this circle.
   */
  @Override
  public String toString() {
      return "{ center: " + this.c + ", radius: " + this.r + " }";
  }
}

Point p = new Point(0.0, 0.0);
Circle c = new Circle(p, 4.0);
c.toString(); // { center: (0.0, 0.0), radius: 4.0 }

This is known as method overriding, to alter the behaviour of an existing class.

Circle::toString has overriden Object::toString.

Annotation

@Override is an annotation.

It is not part of the program and does not affect the bytecode generated. It is merely a hint to the compiler.

But, recommended and expected to do this (in CS2030S).

Overriding Method that Throws Exception

The overriding method must throw only the same, or a more specific checked exception, than the overridden method. This follows Liskov Substitution Principle.


Polymorphism

One thing can have many forms. Access objects of different types through the same interface.

void say(Object obj) {
    System.out.println("Hi, I am " + obj.toString());
}

Point p = new Point(0, 0);
say(p); // Hi, I am (0.0, 0.0)
Circle c = new Circle(p, 4);
say(c); // Hi, I am { center: (0.0, 0.0), radius: 4.0 }

Dynamic Binding

The toString methods called are different.

Whichever method is invoked is decided during the run-time based on the run-time type of the obj. This is called dynamic binding or late binding or dynamic dispatch.

Applied to only instance method invocations.

The most specific method will be called.

Run-time type of target and compile-time type of arguments are used to determine the method that is called during execution of the program.

The compile-time return type is used to a certain extent as well (as a sanity check) to make sure that the required method actually exists.

The equals method

Circle c0 = new Circle(new Point(0, 0), 10);
Circle c1 = new Circle(new Point(0, 0), 10);
Circle c2 = c1;

c0.equals(c1); // false
c2.equals(c1); // true

Because c0 and c1 refer to different objects but c1 and c2 refer to the same object.

Overriding Object::equals

/**
 * Return true the object is the same circle (i.e., same center, same radius).
 */
@Override
public boolean equals(Object obj) {
    if (obj instanceof Circle) {
        Circle circle = (Circle) obj;
        return (circle.c.equals(this.c) && circle.r == this.r);
    }
    return false;
}
  • equals takes in a parameter of compile-time type Object. It only makes sense if we compare (during run-time) a circle with another circle. So, we first check if the run-time type of obj is a subtype of Circle. This is done using the instanceof operator. The operator returns true if obj has a run-time type that is a subtype of Circle.
  • To compare this circle with the given circle, we have to access the center c and radius r. But if we access obj.c or obj.r, the compiler will complain. As far as the compiler is concerned, obj has the compile-time type Object, and there is no such fields c and r in the class Object! This is why, after assuring that the run-time type of obj is a subtype of Circle, we assign obj to another variable circle that has the compile-time type Circle. We finally check if the two centers are equal (again, Point::equals is left as an exercise) and the two radii are equal2.
  • The statement that assigns obj to circle involves type casting. We mentioned before that Java is strongly typed and so it is very strict about type conversion. Here, Java allows type casting from type TT to SS if S<:TS<:T. This is called narrowing type conversion. Unlike widening type conversion, which is always allowed and always correct, a narrowing type conversion requires explicit typecasting and validation during run-time. If we do not ensure that obj has the correct run-time type, casting can lead to a run-time error (which if you recall, is bad).

Overriding Circle::equals MUCH better!

class Circle {
     :
  /**
   * Return true the object is the same circle (i.e., same center, same radius).
   */
  @Override
  public boolean equals(Circle circle) {
      return (circle.c.equals(this.c) && circle.r == this.r);
  }
}

Does not override Object::equals.


Liskov Substitution Principle (LSP)

This is a guideline!

The need to ensure that any inheritance with method overriding does not introduce bugs to existing code.

Changing expected behaviour of code violates LSP.

"Let ϕ(x)\phi(x) be a property provable about objects xx of type TT. Then ϕ(y)\phi(y) should be true for objects yy of type SS where S<:TS<:T".

Example of Violation of LSP

void displayGrade(Module m, double marks) {
    char grade = m.marksToGrade(marks);
    if (grade == 'A')) {
        System.out.println("well done");
    else if (grade == 'B') {
        System.out.println("good");
    else if (grade == 'C') {
        System.out.println("ok");
    } else {
        System.out.println("retake again");
    }
}

Module::marksToGrade returns 'A', 'B', 'C', 'F' (char)

Suppose CSCUModule where CSCUModule <: Module and CSCUModule::marksToGrade only returns 'S', 'U' (char).

This will cause displayGrade to fail to work properly because CSCUModule::marksToGrade violates LSP.


Abstract Class vs Concrete Class

Abstract Class

Good to make functions that take in more general classes. Abstract classes should contain at least 1 abstract method, otherwise, it does not make sense to be abstract.

// version 0.3
double findLargest(Shape[] array) {
  // as opposed to Circle[] array
  double maxArea = 0;
  for (Shape curr : array) {
    double area = curr.getArea();
    if (area > maxArea) {
      maxArea = area;
    }
  }
  return maxShape;
}

This function can work for an array containing different shapes too.

abstract class Shape {
    private int numOfAxesOfSymmetry ;

    public boolean isSymmetric() {
        return numOfAxesOfSymmetry > 0;
    }

    abstract public double getArea();
}

Shape s = new Shape(); // error, abstract class cannot be instantiated

An abstract class in Java is a class that has been made into something so general that it cannot and should not (cannot) be instantiated.

As long as one of the class methods is abstract, the class becomes abstract. Abstract classes can include "default" non-abstract methods that will be used if not overridden.

Shape::isSymmetric is a concrete method but Shape::getArea is abstract.

Concrete Class

A concrete class is a class that has no abstract methods.


Interface

An interface models what an entity can do, the name usually ends with the -able suffix. An interface is also a type.

interface GetAreable {
  double getArea();
}

All methods in an interface are public abstract by default. Interfaces can have static fields (normally only used for constants).

abstract class Shape implements GetAreable {
  private int numOfAxesOfSymmetry ;

  public boolean isSymmetric() {
    return numOfAxesOfSymmetry > 0;
  }
}

So now, Shape will have a public abstract double getArea() method associated with it.

class Flat extends RealEstate implements GetAreable {
    private int numOfRooms;
    private String block;
    private String street;
    private int floor;
    private int unit;

    @Override
    public double getArea() {
        :
    }
}

Flat is a concrete class that implements an interface. Flat is required to override all abstract methods from the interface to be considered concrete.

  • A class can only extend from one superclass, but it can implement multiple interfaces.
  • An interface can extend from one or more interfaces, but an interface cannot extend from class.

If a class CC implements an interface II, C<:IC<:I. This definition implies that a type can have multiple super-types.

Flat <: GetAreable and Flat <: RealEstate.

Once an interface is defined, it is difficult to change it, as all the classes that implement it would have to be changed to accommodate the new change as well.

Pure vs Impure Interfaces

Interfaces can have default methods. However, this is considered impure and used mostly for backwards compatibility reasons.

Don't implement default methods for CS2030S!


Wrapper Class

A wrapper class encapsulates a type, rather than fields and methods. It can be used and behaves like every other class.

All primitive wrapper classes are immutable.

Wrapper class Integer for int

Integer i = new Integer(4);
int j = i.intValue();

Auto-boxing and Unboxing

Integer i = 4; // auto-boxing
int j = i;     // unboxing

Treat the wrapper as a box.

Java automatically converts a wrapper class to its primitive type and vice versa auto-magically.

However, wrapper classes are much slower as they are immutable. (about 2 times slower)

Final Keyword

Variable -> cannot be re-assigned (can be initialised only once) Method -> cannot be overridden by child classes Class -> cannot be extended (inherited)

Method Overloading

Constructors can also be overloaded.

Overload by defining another method of the same name but different method signature. Method descriptor = return type + method signature. Overloading requires changing the order, number and/or type of arguments. (not name)

// original method
public double foo(int x, double y) {}

// not overload (changing return type -- will not compile)
public int foo(int x, double y) {}

// not overload (changing name of parameters)
public double foo(int y, double x) {}

// overloading (different arguments)
public double foo() {}

// overloading (different arguments, even different order is different)
public double foo(double x, int y) {}

// overloading (different arguments)
public double foo(int x, double y, float z) {}

// not overload (different method name)
public double bar(int x, double y) {}

Type Casting

// version 0.4
GetAreable findLargest(GetAreable[] array) {
  double maxArea = 0;
  GetAreable maxObj = null;
  for (GetAreable curr : array) {
    double area = curr.getArea();
    if (area > maxArea) {
      maxArea = area;
      maxObj = curr;
    }
  }
  return maxObj;
}

GetAreable[] circles = new GetAreable[] {
  new Circle(new Point(1, 1), 2),
  new Circle(new Point(0, 0), 5)
};

GetAreable ga = findLargest(circles);  // ok
Circle c1 = findLargest(circles); // error
Circle c2 = (Circle) findLargest(circles); // ok

findLargest returns a GetAreable object, thus need to type-cast it to be a object of compile-time type of Circle.

Note: Compiler cannot verify if object has a run-time type of Circle (or subtype).


Variance

Object[] objArray = new Object[] { new Integer(1), new Integer(2) };
Integer[] intArray = new Integer[] { new Integer(1), new Integer(2) };

contains(objArray, new Integer(1)); // ok
contains(intArray, new Integer(1)); // ok

It is possible to assign an instance with run-time type Integer[] to a variable with compile-time type of Object[].

Let C(S)C(S) be some complex type based on type SS. e.g. An array of type SS.

  • covariant if S<:TC(S)<:C(T)S<:T\to C(S)<:C(T)
  • contravariant if S<:TC(T)<:C(S)S<:T\to C(T)<:C(S)
  • invariant if neither covariant nor contravariant

Java Arrays are covariant.

Integer[] intArray = new Integer[2] {
  new Integer(10), new Integer(20)
};
Object[] objArray;
objArray = intArray; // ok because covariant
objArray[0] = "Hello!"; // <- compiles but will crash on runtime

Exceptions

try catch finally keywords!

Try catch finally blocks

try {
    // do something
} catch (FileNotFoundException e) { // or some other exception
    // handle exception
} finally {
    // clean up code
    // executes regardless of there is an exception or not
}

Can chain multiple catches.

Throwing Exceptions

  :
  public Circle(Point c, double r) throws IllegalArgumentException {
    if (r < 0) {
      throw new IllegalArgumentException("radius cannot be negative.");
    }
    this.c = c;
    this.r = r;
  }
}

throws and throw keywords!

throw new SomeException("Optional Message");

Throwing an exception will cause the method to immediately return.

Checked vs Unchecked Exceptions

Unchecked Exceptions Generally indicates that there is something wrong with the program and might cause run-time errors. Some examples: IllegalArgumentException, NullPointerException, ClassCastException.

Can mean some cases of input are not handled properly (if at all).

An unchecked exception, if not caught, will propagate automatically down the stack until either, it is caught or if it is not caught at all, resulting in an error message displayed to the user.

Checked Exceptions Exceptions that a programmer has no control over. For example, when attempting to open a file, in some cases, it cannot be opened.

In Java, unchecked exceptions are subclasses of RuntimeException.

Checked exceptions must be handled otherwise, will not compile.

// version 0.2 (handle where exception occur)
class Toy {
  static FileReader openFile(String filename) {
    try {
      return new FileReader(filename);
    } catch (FileNotFoundException e) {
      System.err.println("Unable to open " + filename + " " + e);
    }
  }
  public static void main(String[] args) {
    openFile();
  }
}

// version 0.3 (passing exception to caller)
class Toy {
  static FileReader openFile(String filename) throws FileNotFoundException {
    return new FileReader(filename);
  }
  public static void main(String[] args) {
    try {
      openFile();
    } catch (FileNotFoundException e) {
      // warn user and pop up dialog box to select another file.
    }
  }
}

A method has to either catch the exception or throws the exception onto the caller of that function.

Custom Exceptions

class IllegalCircleException extends IllegalArgumentException {
  Point center;
  IllegalCircleException(String message) {
    super(message);
  }
  IllegalCircleException(Point c, String message) {
    super(message);
    this.center = c;
  }
  @Override
    public String toString() {
      return "The circle centered at " + this.center + " cannot be created:" + getMessage();
    }
}

Should only create own exceptions if there is a good reason to do so, for example: to provide additional useful information to the exception handler.

Good Practices for Exception Handling

If exception occurs before some important deallocating process, need to think about how to circumvent this. e.g. Deallocate those resources in the finally block.

Bad Practices for Exception Handling

Do not use "catch-all"

catch (Exception e) { ... }

Do not exit a program just beacuse of an exception. This prevents the calling function from cleaning up the resources. Do not exit a program silently.

try {
  // your code
}
catch (Exception e) {
  System.exit(0);
}

Do not break abstraction barrier Try to handle the implementation-specific exceptions within the abstraction barrier.

Do not use exceptions as control flow mechanism (if/else)

Error

Java has an Error class for situations where the program should terminate as it is hard to recover from. OutOfMemoryError StackOverflowError Typically, do not need to create or handle these errors.


Generics

Generics are invariant.

We will get compile-time errors as opposed to run-time errors, if we use Object and type-casting.

Static fields cannot be generic.

Generic Type

class Pair<S,T> {
  private S first;
  private T second;

  public Pair(S first, T second) {
      this.first = first;
      this.second = second;
  }

  S getFirst() {
      return this.first;
  }

  T getSecond() {
      return this.second;
  }
}

Pair<String,Integer> foo() {
  return new Pair<String,Integer>("hello", 4);
}

Pair<String,Integer> p = foo();
Integer i = (Integer) p.getFirst(); // compile-time error

Method that returns a pair with String, Integer.

Must use Integer, since only reference types can be used as type arguments.

The types are bound during compile time.

Once a generic type is instantiated, it is called a parameterised type.

class DictEntry<T> extends Pair<String,T> {
     :
}

Can pass the type parameter of a generic type to another.

Generic Method

class A {
    // version 0.4 (with generics)
    public static <T> boolean contains(T[] array, T obj) {
      for (T curr : array) {
        if (curr.equals(obj)) {
          return true;
        }
      }
      return false;
    }
}

String[] strArray = new String[] { "hello", "world" };
A.<String>contains(strArray, 123); // type mismatch error
// second argument must be a string
class A {
    // version 0.5
    public static <T extends GetAreable> T findLargest(T[] array) {
        double maxArea = 0;
        T maxObj = null;
        for (T curr : array) {
            double area = curr.getArea();
            if (area > maxArea) {
                maxArea = area;
                maxObj = curr;
            }
        }
        return maxObj;
    }
}

Specifying that T must be a subtype of GetAreable. Uses extends even though GetAreable is an interface.

class Pair<S extends Comparable<S>,T> implements Comparable<Pair<S,T>> {
  private S first;
  private T second;

  public Pair(S first, T second) {
      this.first = first;
      this.second = second;
  }

  S getFirst() {
      return this.first;
  }

  T getSecond() {
      return this.second;
  }

  @Override
  public int compareTo(Pair<S,T> s1) {
      return this.first.compareTo(s1.first);
  }

  @Override
  public String toString() {
    return this.first + " " + this.second;
  }
}

Enabling Pair class to be compared by extending Comparable.

So, java.util.Arrays.sort will work.

Can use multiple bounds with Something<T extends A & B & C>.

a.compareTo(b) => a - b


Type Erasure

Java "erases" generics after compiling. Basically, generics are only used for type-checking during compile-time in Java and do not exist during run-time.

Unbounded types are replaced with Object. Bounded types are replaced with its bound. They are then type-casted around. This is done by compiler so its safe!

This is unlike most other statically-typed languages like C++, Rust, C#. They will compile into multiple methods.

// the class itself is also transformed

Integer i = new Pair<String,Integer>("hello", 4).getSecond();
// transformed to
Integer i = (Integer) new Pair("hello", 4).getSecond();

Arrays and Generics Can't Mix

Will cause heap pollution.

Arrays are reifiable type -- a type where full type information is available during run-time.

But Java generics are not reifiable due to type erasure.

Illegal Java syntax

Pair<String,Integer>[] pairArray = new Pair<String,Integer>[2];
new Pair<S,T>[2];
new T[2];

Unchecked Warnings

// should put a comment to explain why its safe to supress the warnings
// also should put SupressWarnings as close as possible to the actual thing you are supressing
@SupressWarnings("unchecked")
T[] array = (T[]) new Object[n];

Raw Types

A raw type is a generic type but used without type arguments.

Compiler will give a warning when using raw types because it can't do any type checking.

Array a = new Array(4);

Raw types only exists for backwards compatibility.

ArrayList<String> as = new ArrayList<String>();
as instanceof ArrayList // true
as instanceof ArrayList<String> // does not compile

Wildcards (?)

Useful because generics are invariant. So we need to explicitly specify what is allowed.

Note that types are also reflexively their own super/child classes.

PECS

"Producer Extends, Consumer Super."

Upper Bounded Wildcard

? extends T

public void copyFrom(Array<? extends T> src) {
  int len = Math.min(this.array.length, src.array.length);
  for (int i = 0; i < len; i++) {
      this.set(i, src.get(i));
  }
}

Only matches for classes that are child classes of T.

Use extends because src is the producer.

A<S> <: A<? extends S>

Covariant.

Lower Bounded Wildcard

? super T.

public void copyTo(Array<? super T> dest) {
  int len = Math.min(this.array.length, dest.array.length);
  for (int i = 0; i < len; i++) {
      dest.set(i, this.get(i));
  }
}

Only matches for classes that are superclasses of T.

Uses super because dest is the consumer.

A<S> <: A<? super S>

Contravariant.

Unbounded Wildcard

Array<?> will accept any type of Array<>.

Array<?> a1 = new Array<String>(0);       // Does compile
Array<?> a2 = new Array<Integer>(0);      // Does compile

Array<Object> a1 = new Array<String>(0);  // Does not compile
Array<Object> a2 = new Array<Integer>(0); // Does not compile

// this is because generics in Java are invariant.

For any type S: A<S> <: A<?> A<? super S> <: A<?> A<? extends S> <: A<?>

Unbounded Wildcards VS Raw Types

Array<?> is an array of objects of some specific, but unknown type. Array<Object> is an array of Object instances, with type checking by the compiler. Array is an array of Object instances, without type checking.

a instanceof A<?>

Type Inference

Diamond Operator

Pair<String,Integer> p = new Pair<>();
Pair<String,Integer> p = new Pair<String,Integer>();
// are both equivalent!

Target Typing

public static <T extends GetAreable> T findLargest(Array<? extends T> array);

Can also specify the type of parameter T.


Immutability

Aliasing is when reference types may share the same reference values.

Making classes immutable. Good to make the class itself final to disallow inheritance as subclasses may violate property of immutability. BUT not always because might have some utility class that want to extend this. (but can make some specific methods final instead)

Instance cannot have any visible changes outside its abstract barrier. Avoids common bug due to aliases (two objects sharing the same reference object, thus when one changes, both changes).

Make fields final.

Good to use factory methods to construct immutable objects instead of exposing the constructor (so that you can return a shared object, i.e. origin, empty_box).

Advantages of Immutability

  • Ease of understanding

  • Enabling safe sharing of objects

    • Example: ORIGIN Point, EMPTY_BOX Box
    • No problem of aliasing
  • Enabling safe sharing of internals

  • Enabling safe concurrent execution

  • can be used to represent same object at different stages

  • can "revert" objects to a previous state

Variadic Arguments

// @SafeVarargs ise used here because compiler would throw unchecked warning as generics and arrays do not mix well
@SafeVarargs
public static <T> ImmutableArray<T> of (T... items) {
  return new ImmutableArray(items, 0, items.length - 1);
}

Nested Classes

Nested classes act like fields. Used to group logically relevant classes together. Typically, a nested class is tightly coupled with the container class and would have no use outside of the container class. Nested classes can be used to encapsulate information within a container class. Nested class can access the private fields of the container class! Nested classes can be declared as private if there is no need to expose the class outside the barrier.

Nested Non-Static Classes

Also known as inner class.

Inner classes cannot have static fields because they themselves belong to an instance (hence does not make sense to have static fields). (unless they're constants written and known at compile time, e.g. int literals, string literals)

Nested Static Classes

class A {
  private int x;
  static int y;

  class B {
    void foo() {
      this.x = 1;   // error
      A.this.x = 1; // this is preferred
      y = 1; // accessing y from A is OK
    }
  }

  // the definition of static nested classes are stored in the metaspace
  static class C {
    void bar() {
      x = 1; // accessing x from A is not OK since C is static
      y = 1; // accessing y (static field) is OK
    }
  }
}

Local Classes

Classes inside a method. Scoped only inside the method.

// Comparator is a common use case for local classes
void sortNames(List<String> names) {

  class NameComparator implements Comparator<String> {
    public int compare(String s1, String s2) {
      return s1.length() - s2.length();
    }
  }

  names.sort(new NameComparator());
}

Variable Capture

class G {
  void doTask() {
    int x = 0;
    class H {
      int inc() {
        return x + 1;
      }
    }
    x = 1; // removing this statement will lead to this compiling

    H h = new H();
    h.inc();
  }
}

Local variables (including primitives) referenced from an inner class must be final or effectively final (only assigned once even though not explicitly final)!

Usually, when a method returns, all local variables of the method are removed from the stack. BUT an instance of a local class can still exist. Thus, local classes make a copy of (captures) local variables inside itself (even if unused as a failsafe).

Anonymous Class

names.sort(new Comparator<String>() {
  @Override
  public int compare(String s1, String s2) {
    return s1.length() - s2.length();
 }
});

Essentially instantiating an anonymous class that implements Comparator<String>.


Pure Functions

Just like mathematical functions Deterministic! No side effects! (Effect-free, does not affect external environment) -> once stack frame done, nothing should be left except for the final result Referential transparency (no internal state) -> always same output (can substitute)

Impure functions are hard to prove correctness and harder to understand compared to pure functions.

Java functions are not first-class objects! So we have to use some trick.

No Side Effects

Violated through:

  • print to screen
  • write to files
  • throw exceptions
    • be careful of division by zero, FileNotFoundException and other errors
  • change other variables
  • modifying values of arguments (of reference types)

Referential Transparency

No internal state

Violated through:

  • using instance variables (that are not final)

Functional Interface

@FunctionalInterface annotation

An interface in Java with only one abstract method is called a functional interface.

Comparator<T> is a functional interface.

Lambda Expressions

Functions can be first-class citizens (using anonymous classes) BUT very verbose, so use lambda expressions instead to achieve the same thing.

Can use lambda expression to replace anonymous class (only in functional interfaces!)

Note: Lambda expressions do not have to be lazy.

(x, y) -> x + y i -> i + 1

(x, y) -> {}

Don't have to specify the type of the arguments because of type inferencing. (because already declared in interface, and only got 1 method there)

Comparator<Integer> cmp = (x, y) -> x - y;

Only used variables are captured.

Currying Functions

Chaining functions to avoid issue of being allowed only one argument

// Transformer::transform only takes in 1 value
Transformer<Integer, Transformer<Integer, Integer>> add = x -> y -> (x + y);

add.transform(4).transform(5);
// 9

Lambda as Closure

Value of origin is captured by the lambda expression dist.

// origin has to be final or effectively final
Point origin = new Point(0,0);
Transformer<Point, Double> dist = p -> origin.distanceTo(p);
Transformer<Point, Double> dist = origin::distanceTo;
// above 2 are equivalent

Cross-Barrier State Manipulator

Lambda expressions allow you (as the client) to "customise" some implementation into the Implementor's methods (including modifying Implementor's internal states).

Maybe

This is so that we can have different behaviour for "having value" and "null" yet not having to create a special Null class for every type of object.

Lazy/Delayed Evaluation

Lazy = opposite of Eager. Done using lambda expressions.

@FunctionalInterface
interface Task {
  void run();
}

Task t = () -> { System.out.println("Execute"); };
Task t = () -> System.out.println("Execute");
// above 2 are equivalent

t.run(); // Execute printed from this

Infinite Lists

Making use of lazy evaluation to enable infinite lists. A list with eager evaluation cannot be infinite.

The head and tail are both producers and only evaluated when needed. Head: producer to get the current element Tail: producer to get the rest of the elements

Can use an EmptyList to denote the end of the infinite list (if truncating it).

Saves a lot of memory because don't actually hold the actual values in advance, only computed when needed.

But not applicable for all situations.

class InfiniteList<T> {
  private Producer<T> head;
  private Producer<InfiniteList<T>> tail;

  public InfiniteList(Producer<T> head, Producer<InfiniteList<T>> tail) {
    this.head = head;
    this.tail = tail;
  }

  public T head() {
    return this.head.produce();
  }

  public InfiniteList<T> tail() {
    return this.tail.produce();
  }
}

  public static <T> InfiniteList<T> generate(Producer<T> producer) {
    return new InfiniteList<T>(producer,
        () -> generate(producer));
}

public static <T> InfiniteList<T> iterate(T init, Transformer<T, T> next) {
    return new InfiniteList<T>(() -> init,
      () -> iterate(next.transform(init), next));
}

public <R> InfiniteList<R> map(Transformer<? super T, ? extends R> mapper) {
  return new InfiniteList<>(
      () -> mapper.transform(this.head()),
    () -> this.tail().map(mapper));
}

  public InfiniteList<T> filter(BooleanCondition<? super T> cond) {
    Producer<T> newHead = () -> cond.test(this.head()) ? this.head() : null;
    return new InfiniteList<>(newHead, () -> this.tail().filter(cond));
  // BUT this results in a null value (which InfiniteList will now no longer be able to store)
  // also have to change head() and tail() to ignore null
}

Memoize

Only makes sense when function is deterministic.

class Lazy<T> {
  private T value;
  private boolean evaluated;
  private Producer<T> producer;

  public Lazy(Producer<T> producer) {
    this.producer = producer;
    this.value = null;
    this.evaluated = false;
  }

  public T get() {
    // only evaluates the first time you get()
    if (!evaluated) {
      this.value = this.producer.get();
      this.evaluated = true;
    }
    return this.value;
  }
}

Tail-End Recursion

static long sum(long n, long result) {
  if (n == 0) {
    return result;
  } else {
    return sum(n - 1, n + result);
  }
}

Classic tail-end recursion example to compute 1 to n.

No computation is done after recursive call returns (no deferred operations).

Stack overflow error for large values sum(100000, 0).


Streams

https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/stream/Stream.html

// Streams need to be imported
import java.util.stream.*;
CS2030Sjava.util.function
BooleanCondition<T>::testPredicate<T>::test
Producer<T>::produceSupplier<T>::get
Transformer<T,R>::transformFunction<T,R>::apply
Transformer<T,T>::transformUnaryOp<T>::apply
Combiner<S,T,R>::combineBiFunction<S,T,R>::apply
Maybe<T>java.util.Optional<T>
Lazy<T>N/A
InfiniteList<T>java.util.stream.Stream<T>

Stream.iterate(0, x -> x + 1) generates an infinite list of sorted non-negative numbers.

Streams can only be operated on once. Attempting to iterate through a stream multiple times throws a IllegalStateException.

Java also has IntStream, LongStream, ... that contains primitive values instead of wrapper classes

range(x, y) x is inclusive, y is exclusive rangeClosed(x, y) both x and y are inclusive

List::stream() constructs a stream based on the values in a List.

Terminal Operations

Terminal operation is an operation on the stream that triggers the evaluation of the stream. A typical way of writing code that operates on streams is to chain a series of intermediate operations together, ending with a terminal operation.

forEach, reduce etc count() returns the count of the stream as a long

Stream.of(1, 2, 3).forEach(System.out::println);
Stream.generate(() -> 1).forEach(System.out::println); // infinite loop

Intermediate Stream Operations

An intermediate operation on a stream that returns another stream. They are lazy and do not cause the stream to be evaluated. Thus, order does not matter.

map, filter, flatMap etc

flatMap takes in a lambda expression that transforms every element in the stream into another stream. After which, the resulting streams are then flattened and concatenated together.

Stream.of("hello\nworld", "ciao\nmondo", "Bonjour\nle monde", "Hai\ndunia")
    .map(x -> x.lines()) // returns a stream of streams

Stream.of("hello\nworld", "ciao\nmondo", "Bonjour\nle monde", "Hai\ndunia")
    .flatMap(x -> x.lines()) // return a stream of strings

Stateful and Bounded Operations

Stateful intermediate operations need to keep track of some internal state.

sorted, distinct etc

sorted returns a stream with the elements sorted. Can also pass in a Comparator to tell sorted how to sort.

distinct returns a stream with only distinct elements.

Truncating Infinite Lists

limit takes in an int nn that returns a stream containing only the first nn elements of the stream.

takeWhile takes in a predicate and returns a stream containing the elements of the stream, until the predicate becomes false. Note: can potentially still be infinite if predicate never becomes false.

Peeking with a Consumer

Stream.iterate(0, x -> x + 1).peek(System.out::println).takeWhile(x -> x < 5).forEach(x -> {});

Reducing a Stream

reduce(init, (acc, x) -> ??)

reduce(BinaryOperator<T> accumulator) reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)

Also known as fold, accumulate in other languages. Repeatedly applies a lambda on elements of the stream to reduce it into a single value.

Stream.of(1, 2, 3).reduce(0, (x, y) -> x + y);
// returns the sum of all elements in a list.

Element Matching

noneMatch -> true if none of the elements pass the given predicate allMatch -> true if all the elements pass the given predicate anyMatch -> true if at least 1 element passes the given predicate

Prime Generating Example

private boolean isPrime(int n) {
  return n > 1 && IntStream
    .range(2, n)
    .noneMatch(x -> n % x == 0);
}

IntStream.iterate(2, x -> x + 1)
    .filter(x -> isPrime(x))
    .limit(500)
    .forEach(System.out::println);

Prints out first 500 primes.

Streams lead to code into becoming more declarative, allowing the code to be more succinct and less bug-prone. However, overusing streams instead of loops can become very messy if multiple nested components are required.


General Abstractions

Example using a Loggable class.

interface Transformer<T, R> {
  R transform(T t);
}

int incr(int x) {
  return x + 1;
}

int abs(int x) {
  return x > 0 ? x : -x;
}

class Loggable<T> {
  private final T value;
  private final String log;

  private Loggable(T value, String log) {
    this.value = value;
  	this.log = log;
  }

  public static <T> Loggable<T> of(T value) {
    return new Loggable<>(value, "");
  }

  public static <T> Loggable<T> of(T value, String log) {
    return new Loggable<>(value, log);
  }

  public <R> Loggable<R> flatMap(Transformer<? super T, ? extends Loggable<? extends R>> transformer) {
    Loggable<? extends R> l = transformer.transform(this.value);
  	return new Loggable<>(l.value, l.log + this.log);
  }

  public String toString() {
    return "value: " + this.value + ", log: " + this.log;
  }
}


Loggable<Integer> incrWithLog(int x) {
  return Loggable.of(incr(x), "incr " + x + "; ");
}

Loggable<Integer> absWithLog(int x) {
  return Loggable.of(abs(x), "abs " + x + "; ");
}

Loggable.of(4)
      .flatMap(x -> incrWithLog(x))
      .flatMap(x -> absWithLog(x))
// value: 5, log: abs 5; incr 4;

flatMap allows you to operate on the value encapsulated inside the object, along with some "extra information".

Monad

Monad and Functors are from Category Theory.

These are examples of monads: Maybe<T>, Lazy<T>, Loggable<T>, Stream<T>, CompletableFuture Note: InfiniteList<T> is not a monad because it does not have a flatMap operation.

Must have of and flatMap methods. (Actual names don't matter of course)

flatMap removes the original wrapper around the object.

map, on the other hand maintains the original wrapper around the object.

Need to explain why the laws hold. For flatMap, be careful of side information stored in both original Monad and Monad returned from the mapper (both should be retained, in the correct order).

Identity Law

Factory method should not do anything extra to the value and side information.

Left identity law says: Monad.of(x).flatMap(y -> f(y)) must be equivalent to f(x)

Right identity law says: monad.flatMap(x -> Monad.of(x)) must be equivalent to monad

Associative Law

monad.flatMap(x -> f(x)).flatMap(y -> g(y)) must be the same as monad.flatMap(x -> f(x).flatMap(y -> g(y))) (NOTE: The location of the closing brackets)

Functors

Functors is similar to monad but only ensures that lambdas can be applied sequentially to the value, without care about side information.

Functor is an abstraction that supports map. Must have map method. (Actual name don't matter of course)

Preserving Identity

functor.map(x -> x) must be the same as functor

Preserving Composition

functor.map(x -> f(x)).map(x -> g(x)) must be the same as functor.map(x -> g(f(x)))


Parallel and Concurrent Programming

Divide the computation into subtasks called threads.

Multi-thread programs are useful

  1. allow programmers to separate unrelated tasks into threads, and write each thread separately
  2. improves utilisation of the processor, e.g. I/O and UI is split up

Concurrency

Concurrency gives the illusion of subtasks running at the same time.

Parallelism

Parallel computing refers to when multiple subtasks are actually running at the same time.

Parallel Streams

Stream in Java allows for parallel operations on the elements of the stream.

parallel is a lazy intermediate operation. sequential will force the stream to be processed sequentially instead. Note that if multiple parallel and sequential are used on the last stream, the last one used will override.

parallelStream() instead of stream() also exists to be used on a collection.

IntStream.range(2_030_000, 2_040_000)
    .filter(x -> isPrime(x))
    .parallel()
    .forEach(System.out::println);

Output produced will not be in the correct order because there is no coordination among the parallel tasks on the order of printing. Thus, whichever task that completes first will output to the screen first.

forEachOrdered(...) will force the output to be ordered BUT will lose some benefits of parallelisation.

IntStream.range(2_030_000, 2_040_000)
    .filter(x -> isPrime(x))
    .parallel()
    .count();

Order does not matter for count(). Thus, it is a perfect use case for parallelisation.

What can be parallelised?

  1. Stream operations must not interfere with the stream data
  2. Most of the time must be stateless
  3. Side-effects should be kept to the minimum

Interference

List<String> list = new ArrayList<>(List.of("Luke", "Leia", "Han"));
list.stream()
    .peek(name -> {
         if (name.equals("Han")) {
           list.add("Chewie"); // they belong together
         }
      })
    .forEach(i -> {});

Interference will cause ConcurrentModificationException to be thrown.

Stateful vs Stateless

Stateful lambda is one where the result depends on any state that might change during the evaluation of the stream.

Stream.generate(scanner::nextInt)
	.map(i -> i + scanner.nextInt())
	.forEach(System.out::println)

// stateful because relies on the stdin

To ensure that the output is correct, need to also ensure that state updates are visible to all parallel subtasks

Side Effects

List<Integer> list = new ArrayList<>(
    Arrays.asList(1,3,5,7,9,11,13,15,17,19));
List<Integer> result = new ArrayList<>();
list.parallelStream()
    .filter(x -> isPrime(x))
    .forEach(x -> result.add(x));

The forEach above has a side effect of modifying result.

ArrayList is a non-thread-safe data structure. Thus, if 2 threads manipulate it at the same time, may lead to incorrect result.

Solution: collect

list.parallelStream()
    .filter(x -> isPrime(x))
	.collect(Collectors.toList())

Solution: CopyOnWriteArrayList

CopyOnWriteArrayList is available in the java.util.concurrent package. Several other thread-safe data structures are available there too.

List<Integer> result = new CopyOnWriteArrayList<>();
list.parallelStream()
    .filter(x -> isPrime(x))
    .forEach(x -> result.add(x));

Associativity

reduce operation is parallelisable but have to abide by some rules: reduce(identity, accumulator) reduce(identity, accumulator, combiner)

Accumulator: First parameter is the accumulated value, second parameter is the current value of the stream. Combiner: Used to combine the results of all the parallel sub-streams.

  1. combiner.apply(identity, i) must be the same as i
  2. combiner and accumulator must be associative
  • Parameters for the combiner and accumulator must be able to be swapped.
  1. combiner and accumulator must be compatible
  • combiner.apply(u, accumulator.apply(identity, t)) must be equal to accumulator.apply(u, t)

Multiplication is a valid operation that abides by these 3 rules.

If not following all these rules, you still might get the correct output sometimes but not consistent in different runs.

Performance of Parallel Stream

Parallelising streams does not always improve performance. Creating a thread to run a task incurs overhead. Overhead of creating too many threads can outweigh the benefits of parallelisation!

Ordered vs Unordered Source

Streams created from iterate and ordered collections (List, arrays) are ordered Streams created from generate and unordered collections (Set) are unordered

Stable Operations Stable operations preserve the original ordering of the elements.

distinct, sorted

Parallel versions of findFirst, limit, skip can be expensive on an ordered stream, since it needs to coordinate between streams to maintain order.


Synchronous Programming

Execution of program is stalled until a method returns. The method blocks until it returns. This is known as synchronous programming.

Synchronous programming is not very efficient when there are frequent method calls that block for a long period of time (methods that require expensive computation or calls to a remote server).

Wasted time that could be spent on other operations like refreshing the UI or other computations

Threads

java.lang.Thread https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Thread.html

A thread is a single flow of execution in a program.

// example of spawning 2 threads
new Thread(() -> {
  for (int i = 1; i < 100; i += 1) {
    System.out.print("_");
  }
}).start();

new Thread(() -> {
  for (int i = 2; i < 100; i += 1) {
    System.out.print("*");
  }
}).start();

new Thread(...) takes in a Runnable instance as an argument. Runnable is a functional interface with a method void run().

Thread::start() causes the given lambda expression to be ran. It does not return only after the given lambda expression completes its execution. This is asynchronous execution. Java also has different methods to query and control the threads.

There might be different interweaving of executions for different runs of the same program (decided by OS).

Thread Names

Every thread in Java has a name. Thread.currentThread() gets the reference of the current running thread. Thread.currentThread().getName() gets the name of the current running thread.

The main thread automatically created when the main method is invoked.

Thread Sleep

Thread.sleep(x) sleeps the thread for x milliseconds.

Is Alive

thread.isAlive() checks if a thread is running.

Note: The program exits only after all threads created run to their completion, including the main thread.

Limitations of Thread

Java Thread is already higher-level compared to pthread in C/C++.

  1. Consider series of tasks to be executed concurrently BUT some have to use a result from others.

  2. Handling exceptions gracefully. The tasks not dependent on the task throwing the exception should still continue.

  3. Using Thread requires careful coordination. No methods in Thread that return a value. Thus, they can only communicate through shared variables.

  4. There is also no mechanism to specify the execution order and dependencies.

Thread also has overhead because creating new threads take up resources. Thus, should reuse Thread instances as much as possible.

Completable Future Monad

java.util.concurrent.CompletableFuture is a Monad that encapsulates the promise to produce a value. https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/CompletableFuture.html

Similar abstraction is usually called promise. Promise in Javascript, std::promise in C++.

CompletableFuture<Integer> foo(int x) {
    CompletableFuture<Integer> a = CompletableFuture.completedFuture(x);
    CompletableFuture<Integer> b = a.thenComposeAsync(i -> taskB(i));
    CompletableFuture<Integer> c = a.thenComposeAsync(i -> taskC(i));
    CompletableFuture<Integer> d = a.thenComposeAsync(i -> taskD(i));
    CompletableFuture<Integer> e = b.thenCombineAsync(c, (i, j) -> taskE(i, j));
    return e;
}

foo(x).get();
// to wait for all concurrent tasks to complete and return the value

Creating a CompletableFuture

CompleteableFuture<U> completedFuture(U value) is equivalent to creating a task that is already completed and returns a value.

CompletableFuture<Void> runAsync(Runnable runnable) CompletableFuture<T> supplyAsync(Supplier<T> supplier) runAsync and supplyAsync Returned CompletableFuture completes when the given lambda expression finishes.

allOf and anyOf methods take in a variable number of other CompletableFuture instances. allOf is completed only when all given CompletableFuture completes. anyOf is completed when any given CompletableFuture completes.

runAfterBoth(CF other, Runnable r), runAfterEither(CF other, Runnable r) do what their names suggest.

Chaining CompletableFuture

thenApply analogous to map thenCompose analogous to flatMap thenCombine analogous to combine

The methods above run the given lambda expression in the same thread as the caller.

There are also asynchronous versions: thenApplyAsync, thenComposeAsync, thenCombineAsync which may cause the given lambda expression to run in a different thread (more concurrency).

Getting Result

get() is a synchronous call so it will block until CompletableFuture completes. Thus, we should only call it at the final step in the code.

CompletableFuture::get throws InterruptedException and ExceutionException. InterruptedException is thrown when the thread has been interrupted. ExecutionException is thrown when there are errors/exceptions during execution.

join() is an alternative to get() that is exactly the same except it does not throw checked exceptions.

thenRun, runAsync (both take in a Runnable), thenAccept (takes in a Consumer) also exist.

When using runAsync, the Java program may terminate before the Runnable is completed!

Examples

// findIthPrime returns the ith prime
private boolean isPrime(int n) {
  return n > 1 && IntStream
    .range(2, n)
    .noneMatch(x -> n % x == 0);
}
int findIthPrime(int i) {
    return Stream
        .iterate(2, x -> x + 1)
        .filter(x -> isPrime(x))
        .limit(i)
        .reduce((x, y) -> y)
        .orElse(0);
}

int i = 9999;
int j = 999;

CompletableFuture<Integer> ith = CompletableFuture.supplyAsync(() -> findIthPrime(i));
CompletableFuture<Integer> jth = CompletableFuture.supplyAsync(() -> findIthPrime(j));

CompletableFuture<Integer> diff = ith.thenCombine(jth, (x, y) -> x - y);

Integer result = diff.join();

CompletableFuture: Handling Exceptions

Exception handling is troublesome when doing asynchronous programming because often times you still want some of the output and not crash the entire program.

CompletableFuture<T> handles exceptions unlike Thread.

exceptionally, whenComplete, handle are methods that deal with exceptions.

CompletableFuture<T> stores exceptions thrown by a computation and passes it down the chain of calls until join is called which will then dump all the exceptions in a CompletionException.

handle method takes in a BiFunction (similar to cs2030s.fp.Combiner), first parameter is the value, second is the exception.

cf.thenApply(x -> x + 1)
  // Java uses null to indicate no exception thrown
  .handle((t, e) -> (e == null) ? t : 0)
  .join();

// can check for either
// if exception, t == null
// if no exception, e == null

https://mincong.io/2020/05/30/exception-handling-in-completable-future/ handle is used for transforming the data depending on if there is a exception. whenComplete is used for logging information. exceptionally is used exclusively for exception handling (and has no access to the data).

Thread Pool

Thread pool lets us reuse threads and therefore reduces the overhead of creating new threads.

Fork and Join. ForkJoinPool, an implementation of a thread pool, that is fine-tuned for fork-join model of recursive parallel execution.

Fork-Join model is a parallel divide-and-conquer model of computation. Forks the problem into smaller problems and then joins then.

Parallel Parallel does not mean asynchronous and asynchronous does not mean parallel but they are often used together.

Recursive Task

Abstract class RecursiveTask<T> supports fork() and join() and has an abstract method compute() which we will use to specify the computation.

RecursiveAction is a result-less version of Recursive Task.

class Summer extends RecursiveTask<Integer> {
    private static final int FORK_THRESHOLD = 2;
    private int low;
    private int high;
    private int[] array;

    public Summer(int low, int high, int[] array) {
      this.low = low;
      this.high = high;
      this.array = array;
    }

    @Override
    protected Integer compute() {
      // stop splitting into subtask if array is already small.
      if (high - low < FORK_THRESHOLD) {
        int sum = 0;
        for (int i = low; i < high; i++) {
          sum += array[i];
        }
        return sum;
      }

      int middle = (low + high) / 2;
      Summer left = new Summer(low, middle, array);
      Summer right = new Summer(middle, high, array);
      // !!
      // many ways to do next 2 lines
      // but using 1.fork(), 2.compute(), 1.join() is the best way
      left.fork();
      return right.compute() + left.join();
    }
  }

    Summer task = new Summer(0, array.length, array);
    int sum = task.compute();

1.fork(), 2.compute(), 1.join() is the most optimal compute() is likely faster than fork() then join() because this reduces the overhead of interacting with the ForkJoinPool

ForkJoinPool

Java manages the thread pool with fork-join tasks internally but the details are out of scope of CS2030S.

  1. Each thread has a deque of tasks
  2. When a thread is idle, it checks its queue of tasks.
  3. If queue is not empty, it picks up a task at the head of the queue to execute (invokes its compute() method)
  4. Otherwise, it picks up a task from the tail of the queue of another thread to run. This is known as work stealing. Picking up from the tail to minimise conflicts.
  5. When fork() is called, the caller adds itself to the head of the queue of the executing thread. This is similar to recursion stack.
  6. When join() is called, several different cases
  7. If the subtask to be joined hasn't been executed, its compute() method is called and the subtask is executed.
  8. If the subtask to be joined has been completed (other thread stole it), then the result is read and done.
  9. If the subtask to be joined has been stolen and is still being executed, current thread works on another task (in local queue or steals another task)

The threads are always looking for something to do and cooperate to maximise work done! Minimise the time each thread is spent idling.

Similar to .NET and Rust implementation.

Ordering of Fork and Join

Since the most recently forked task is likely to be executed next, we should join() the most recent fork() first.

Order of forking should be the reverse of the order of joining. ABCDDCBA

left.fork();
right.fork();
return right.join() + left.join();

is efficient.

Doing this order is better because its added to the queue from the front, thus the second fork will be the new head of the queue.

Join and Fork comes together so must be used in tandem.

Misc

new A().foo() is valid even when foo is static