Gespeichert von h2b am Mi., 30.12.2015 - 21:00

LinAlg -- The Linear-Algebra Scala Library

LinAlg provides data types and operations for algebraic vectors and matrices.

Vector and Matrix elements can be of arbitrary type, provided that a factory for that element type is available. Currently, factories exist for Double, Float, Long, Int, Short, Byte and Char.

Vector and matrix (row and column) indices can be any integer (to state more precisely, an index must be in the interval [Mindex, Maxdex], as defined in the Index object). But only elements corresponding to a subset of that whole integer range actually are stored (the concrete elements), while all other elements by definition are zero (virtual elements).

This concept (which I got to value a long time ago with the ALGOL68 library prelude TORRIX [1]) does not only provide a natural approach to operations with vectors and matrixes of different index ranges, but also prevents from index-out-of-bounds exceptions.

Concrete index ranges (those indices that correspond to concrete elements) implicitly or explicitly are assigned when a vector or matrix is created. By default, it starts at 1 and extends to the number of elements specified, but this is customizable.

Vectors and Matrixes both are immutable by design, so there exist no update methods or the like (you cannot do v(i) = something). There are, however, builder classes that allow you to build Vectors and Matrixes element by element.

Currently, operations on vectors and matrices require identical element types. For instance, you can add a Vector[Int] to another Vector[Int], but you cannot add a Vector[Int] to another Vector[Double].

L1 Indices
- L1-1 Vector
- L1-2 Matrix
L2 Creation
- L2-1 Vector
- L2-2 Matrix
L3 Operations
- L3-1 Vector
- L3-2 Matrix
L4 Iterators
- L4-1 Vector
- L4-2 Matrix
L5 Sparse Vectors and Matrices
- L5-1 Vector
- L5-2 Matrix
L6 Other Features
A Appendix

L1 Indices

L1-1 Vector

All vectors have an index range from Index.Mindex to Index.Maxdex (both inclusive). In addition, each individual vector v has a concrete index range from v.index.low to v.index.high (both inclusive).

 Mindex       low    high        Maxdex
    
   0   0   0   x   x   x   0   0   0
     
   virtual     concrete      virtual

Only the values corresponding to the concrete indices are stored while the rest are virtual zeroes. The representation of a virtual zero depends on the element type (e.g., it is 0.0 for Double or 0 for Int). Operations on vectors treat virtual elements like concrete ones, i.e., as if they were stored as concrete values.

The values of Mindex and Maxdex are implementation-dependent, but they will be very close to Int.MinValue and Int.MaxValue, respectively. So, as long as you don't explore the very edges of the whole Int range, you don't have to bother about index-out-of-bounds exceptions.

L1-2 Matrix

All matrices have both a row and column index range from Index.Mindex to Index.Maxdex (both inclusive). In addition, each individual matrix a has a concrete row index range from a.index.dim1.low to a.index.dim1.high and a concrete column index range from a.index.dim2.low to a.index.dim2.high (all inclusive).

 Mindex    dim2.low dim2.high    Maxdex

   0   0   0   0   0   0   0   0   0   Mindex
    
   0  vir  0   x   x   x   0  vir  0   dim1.low 
    
   0  tu   0   concrete    0   tu  0
    
   0  al   0   x   x   x   0   al  0   dim1.high
    
   0   0   0   0   0   0   0   0   0   Maxdex
    
 Mindex    dim2.low dim2.high    Maxdex

Only the values corresponding to the concrete indices are stored while the rest are virtual zeroes. The representation of a virtual zero depends on the element type (e.g., it is 0.0 for Double or 0 for Int). Depending on the implementation it is even likely that more zeroes as shown actually are virtual and not stored. Operations on matrices treat virtual elements like concrete ones, i.e., as if they were stored as concrete values.

L2 Creation

L2-1 Vector

The most common way to create a vector is by simply enumerating its elements. You get a vector according to the type of the element sequence with a concrete index range starting at 1:

val vi = Vector(1,2,3) //vi is Vector[Int] with v(1)==1, v(2)==2, v(3)==3 and v(i)==0 for all other i in [Mindex, Maxdex]
val vd = Vector(1.0,2.0,3.0) //vd is Vector[Double] with v(1)==1.0, v(2)==2.0, v(3)==3.0 and v(i)==0.0 for all other i in [Mindex, Maxdex]

To get a vector with customized index range, use Vector.at and provide index.low as a separate (first) parameter:

val vat0 = Vector.at(0)(1,2,3) //v0 is Vector[Int] with v(0)==1, v(1)==2, v(2)==3 and v(i)==0 for all other i in [Mindex, Maxdex]

Or provide a generating function f: Int => E (that maps indices to elements of type E) with explicit concrete index bounds:

val vf = Vector((i: Int) => i*i: Double, 1, 3) //vf is Vector[Double] with v(1)==1.0, v(2)==4.0, v(3)==9.0 and v(i)==0.0 for all other i in [Mindex, Maxdex]

If you cannot enumerate the vector elements in advance use a vector builder:

val vbuilder1 = Vector.newBuilder[Double]
vbuilder1 += 1.0
vbuilder1 += 2.0
vbuilder1 += 3.0
val vb1 = vbuilder1.result()

val vbuilder2 = Vector.newBuilder[Double]
vbuilder2(1) = 1.0
vbuilder2(3) = 3.0
val vb2 = vbuilder2.result()

In the first case you get a Vector[Double](1.0,2.0,3.0), in the second case the result is Vector[Double](1.0,0.0,3.0). Note that the second vector implicitly got a (concrete) element vb2(2)==0.0.

L2-2 Matrix

Matrices are built from rows which are established by vectors.

Create a matrix by enumerating its row vectors. The matrix' type is derived from the given vectors, its concrete row index starts at 1 by default and the concrete column range is implied from the row vectors again:

val a = Matrix(Vector(11,12,13), Vector(21,22,23)) //a is Matrix[Int] with a(1,1)==11, a(1,2)==12, ..., a(2,3)==23 and a(i,j)==0 for all not explicit or implicit mentioned (i,j) in [Mindex, Maxdex]

Customized vector indices will be honored as in

val b = Matrix(Vector.at(0)(10,11,12), Vector.at(0)(20,21,22)) //b is Matrix[Int] with b(1,0)==10, b(1,1)==11, ..., b(2,2)==22 and b(i,j)==0 for all not explicit or implicit mentioned (i,j) in [Mindex, Maxdex]

To customize the row range use Matrix.atRow with specified low index:

val c = Matrix.atRow(0)(Vector(1,2,3), Vector(11,12,13)) //b is Matrix[Int] with a(0,1)==1, a(0,2)==2, ..., a(1,3)==13 and a(i,j)==0 for all not explicit or implicit mentioned (i,j) in [Mindex, Maxdex]

Or provide a generating function f: Int => Vector[E] (that maps indices to elements of type Vector[E], i.e., row vectors) with explicit concrete row-index bounds:

val af = Matrix((i: Int) => Vector(i*10+1,i*10+2,i*10+3), 1, 2) //af is equal to a, see above

If you cannot enumerate the matrix rows in advance use a matrix builder:

val row1 = Vector(1,2,3)
val row2 = Vector(4,5,6)
val mbuilder1 = Matrix.newBuilder[Int]
mbuilder1 += row1
mbuilder1 += row2
val ab1 = mbuilder1.result()

val mbuilder2 = Matrix.newBuilder[Int]
mbuilder2(1) = row1
mbuilder2(3) = row2
val ab2 = mbuilder2.result()

In the first case you get a Matrix[Int](Vector(1,2,3),Vector(4,5,6)), in the second case the result is Matrix[Int](Vector(1,2,3),Vector(),Vector(4,5,6). Note that the second matrix implicitly got a (concrete but empty) vector for the second row.

L3 Operations

All operations on vectors and matrices require an identical element type.

Beyond, due to the concept of virtual elements (see above), operations on vectors or matrices with different index ranges resolve properly.

L3-1 Vector

Let u and v be vectors of some type Vector[E], let s be a scalar of type E and i be an integer, then the following operations can be done:

u.length //size of the concrete index range
u.isZero //if u contains no elements other than zero
u(i)     //element with index i, zero if i outside concrete index range
+u       //unary plus
-u       //unary minus
u + v    //sum
u - v    //difference
u * s    //multiplication by scalar
s * u    //multiplication by scalar (*)
u * v    //scalar product (result of type `E`)
u.norm   //Euklidian norm (result always of type `Double`)
u @@ i   //copy of u with lower index bound i

(*) To use this, you must import the package object by import de.h2b.scala.lib.math.linalg._. Beware that -- in contrast to the other operators -- this implementation of s * u always returns a Vector[E]; thus, subtypes should bring along their own implementation of this operator.

L3-2 Matrix

Let a and b be matrices of some type Matrix[E], let v be a vector of type Vector[E], let s be a scalar of type E and i and j be integer, then the following operations can be done:

a.height    //number of concrete rows
a.width     //number of concrete columns
a.isSquare  //if a is a square matrix (regarding to the concrete index range)
a.isZero    //if a contains no elements other than zero
a(i,j)      //element with index (i,j), zero if (i,j) outside concrete index range
a(i)        //same as a.row(i)
a.row(i)    //the row vector with index i, zero vector if i outside concrete index range
a.col(j)    //the column vector with index j, zero vector if j outside concrete index range
+a          //unary plus
-a          //unary minus
a + b       //sum
a - b       //difference
a * b       //product
a * v       //product
v ** a      //product (*) (**)
a * s       //multiplication by scalar
s * a       //multiplication by scalar (*)
a.transpose //transpose matrix
a.rowSum    //vector of row sums of a
a.colSum    //vector of column sums of a
a.atRow(i)  //copy of a with lower row-index bound i
a.atCol(j)  //copy of a with lower column-index bound j (all row vectors are shifted to this bound)
a @@ (i,j)  //copy of a with lower row-index bound i and lower column-index bound j (all row vectors are shifted to j)

(*) To use this, you must import the package object by import de.h2b.scala.lib.math.linalg._. Beware that -- in contrast to the other operators -- these implementations of s * a and v ** a always return a Matrix[E] and Vector[E], respectively; thus, subtypes should bring along their own implementations of these operators.

(**) This operator should be named *, but there is an ambiguity with scalar ops. Until this problem is solved, ** must be used.

L4 Iterators

Both vector and matrix extend Iterable and thus provide iterators over its elements. In addition, you can use all methods inherited from Iterable like map or foreach.

L4-1 Vector

Vector[E] extends Iterable[E] and provides an iterator over its elements:

for (x <- Vector(1,2,3)) print(x) //prints 123

L4-2 Matrix

Matrix[E] extends Iterable[Vector[E]].

A matrix a provides three iterators over its elements:

a.rowIterator  //iterates over the row vectors of a
a.colIterator  //iterates over the column vectors of a
a.elemIterator //iterates over the scalar elements of a

The first two yield vectors over which you can iterate (a(i,) or a(,j)) while the third one iterates over all bare elements (a(i,j)).

The rowIterator is the default one, thus

for (x <- Matrix(Vector(11,12,13), Vector(21,22,23))) println(x)
//prints
//(11,12,13)@1
//(21,22,23)@1

L5 Sparse Vectors and Matrices

Since version 3.0.0 sparse vectors and matrices are provided. They come with storage engines that only store values different from zero. In addition, they are integrated seamless into the LinAlg system, i.e., they are Vectors and Matrixes and inherit all operations from them. Also, operations between sparse and ordinary vectors and matrices can be intermixed.

L5-1 Vector

A sparse vector is created by a sequence of index-value mappings:

val sv = Vector(-3 → -3, 1 → 1, 3 → 3)

L5-2 Matrix

A sparse matrix is created by a sequence of index-value mappings for its row vectors:

val sm = Matrix(-1 → Vector(-11,-12,-13), +1 → Vector(11,12,13))

Note that such a matrix is sparse regarding to its rows but not necessarily to its columns. Of course, the row vectors itself can be (but not need to be) sparse vectors.

L6 Other Features

L6-1 Equality

Two vectors or matrices x and y are considered to be equal (i.e., x==y) if and only if

x and y are of the same type (which includes the element type),
x and y have the same concrete index ranges and
x and y have the same elements at equal index positions.

This means, e.g., Vector(1,2) != Vector(1,2,0) (because of rule 2) and of course Vector(1,2) != Vector(1,3) (because of rule 3). With these examples, only something like Vector(1,2) == Vector(1,2) is true. The same applies to matrices.

For both vectors and matrices, there exists a similarity operator ~~ that lowers these restrictions and disregards rule 2 above. Two vectors or matrices x and y are considered to be similar (i.e., x~~y) if and only if

x and y are of the same type (which includes the element type) and
x and y have the same elements at equal index positions.

This means, e.g., Vector(1,2) ~~ Vector(1,2,0). Of course, ! (Vector(1,2) ~~ Vector(1,3)) still holds and also Vector(1,2) ~~ Vector(1,2) is true. Again, the same applies to matrices.

L6-2 Zero Vector and Matrix

To create zero vectors or matrices (i.e., those containing no concrete elements) just provide empty element lists but specify the element type:

val v0 = Vector[Double]()
val a0 = Matrix[Double]()

L6-3 Subtyping

When subtyping the vector or matrix trait, of course all operations are inherited. Since version 2.0.0, a few measures ensure that these operations (as well as the ones inherited from the iterable trait which is extended by both the vector and matrix trait) will return proper result types.

For instance, let's implement a Point type by extending the vector trait. Our Point class is constructed from two double elements x and y and extends the DoubleVector class which is an implementation of Vector[Double]. For the result type of operations we mix in the IterableLike and VectorLike traits both with type parameters Double (the element type) and Point (the result type):

class Point private (val x: Double, val y: Double) extends
    DoubleVector(1, Seq(x,y)) with IterableLike[Double, Point] with
    VectorLike[Double, Point] with SimpleVectorStore[Double] {
  //override newBuilder
}

Note that we also need a VectorStore to instantiate a vector. In this case, for simplicity we use a SimpleVectorStore[Double] which is the default implementation trait.

The method we have to override here, is newBuilder which we can implement from a static version to be defined later:

override protected[this] def newBuilder: VectorBuilder[Double,Point] = Point.newBuilder

Note the return type: VectorBuilder[Double,Point], which is a vector builder that builds Points from Double elements.

The remaining declarations go into the companion object:

object Point {
  //define factory method
  //define newBuilder
  //define canBuildFrom
  //define implicit class for scalar multiplication from left (optional)
}

First, since we made our constructor private, we need a factory method (as usual):

def apply (x: Double, y: Double) = new Point(x, y)

Then, we define the static newBuilder method mentioned before:

def newBuilder: VectorBuilder[Double,Point] =
  new VectorBuilder[Double, Point] {
    def result: Point = {
      require(elems.length==2)
      Point(elems(0), elems(1))
    }
}

We use an anonymous implementation of the VectorBuilder trait. The trait provides an elems array which holds the elements put into the builder, so we just have to implement the result method to build a Point from the elements. Since our Point consists of x and y coordinates, we require that exactly 2 elements are there when building the result.

Third, we have to provide an implicit VectorCanBuildFrom implementation which follows the rules of the standard scala.collection.generic.CanBuildFrom[From, Elem, To] trait (i. e., define how to create a To collection from a From collection with elements of type Elem):

implicit def canBuildFrom: VectorCanBuildFrom[Point, Double, Point] =
  new VectorCanBuildFrom[Point, Double, Point] {
    def apply (): VectorBuilder[Double,Point] = newBuilder
    def apply (from: Point): VectorBuilder[Double,Point] = newBuilder
}

Here, we just reuse our newBuilder.

Last, to return a Point for scalar multiplication from left (which is special as mentioned above), we define this simple implicit class:

implicit class ScalarOps (s: Double) {
  def * (p: Point): Point = p * s
}

This last definition is optional, but nice.

The complete example is part of the VectorSubtypeTest class to be found under the src/test/scala folder. There is also a MatrixSubtypeTest with a similar example for subtyping the matrix trait.

L6-4 Storage

Since version 2.0.0 vectors and matrices each use a separate storage engine that can be overridden. To this end, a vector inherits from a VectorStore definition trait which in turn has to be implemented by some implementation trait that realizes the storage; a similar mechanism works for matrices where the definition trait is called MatrixStore. The default implementation traits SimpleVectorStore (which uses scala.collection.immutable.Vector as storage for the vector elements) and RowMatrixStore (which uses a similar storage for the row vectors of a matrix) are mixed in during the standard creation process of a vector or matrix; see for instance the DoubleVector companion object to find out where this happens.

To demonstrate how this mechanism can be used to implement an own vector store for a subtype, we retouch the point example from the previous section. There we utilized the default SimpleVectorStore[Double] to store the elements of our point. In fact, this is a duplication of storage, since we already have the fields x and y.

So, here is an alternative that makes a point to be its own vector store:

class PointOSt private (val x: Double, val y: Double) extends
    DoubleVector(1, Seq(x,y)) with IterableLike[Double, PointOSt] with
    VectorLike[Double, PointOSt] {
  //define index
  //define apply
  //define hash code
  //override newBuilder
}

Compared to the original Point, we left out the mix in of SimpleVectorStore[Double] and added three more definitions instead.

At first, a vector needs an index range. This is simple in this case, because we only have two elements numbered from 1 to 2:

val index: Index = Index(1, 2)

Second, we must define an apply method that maps an index to its value:

def apply (i: Int): Double = i match {
  case 1 ⇒ x
  case 2 ⇒ y
  case _ ⇒ 0.0
}

Note that we mapped each integer other than 1 or 2 to zero, following the LinAlg concept of virtual elements.

The third definition we have to add is a dataHashCode that provides a hash code of the stored data elements (x and y in this case). Here, we don't need to take the index range into account, since this is done automatically when an overall vector hash code is computed. So, the hash-code implementation could, e. g., look like this:

protected val dataHashCode: Int = {
  val prime = 31
  var result = 1
  result = prime*result+x.hashCode()
  result = prime*result+y.hashCode()
  result.toInt
}

We already had overridden the newBuilder method in the original Point example. We do not repeat the code here, since the only change is to replace the Point type by PointOSt; the same is true for the companion object that we can also copy from there with the change mentioned.

Beware that it is important to override newBuilder and to define an implicit canBuildFrom in the companion object, because otherwise the results of operations (like p+q for p and q of type PointOSt) would not use our own storage.

The complete example is part of the VectorSubtypeOwnStoreTest class to be found under the src/test/scala folder. There is also a MatrixSubtypeOwnStoreTest with a similar example for overriding the matrix storage engine.

A Appendix

API

See Scaladoc.

Maven Coordinates

See The Central Repository. Choose the version you want (of course, the latest one is recommended) and look under "Dependency Information".

License

LinAlg - Scala Library for Vector and Matrix Types and Operations

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Major Changes

Here we are describing compatibility-breaking changes.

Version 3.0.0

New package structure: building, factory and storage classes and traits are moved to respective sub packages. Implies that import statements for that classes and traits have to be adapted.
Zero vectors and matrices are now always equal regardless indices.
Shorten on zero vectors and matrices now disregards indices and always yields lower index 1.
The VectorFactory trait has an additional abstract create method.

Version 2.0.0

Vector-builder class replaced by trait with target type parameter. In existing code, replace new VectorBuilder... by VectorBuilder... or even better Vector.newBuilder....
Matrix-builder class replaced by trait with source and target type parameter. In existing code, replace new MatrixBuilder... by MatrixBuilder... or even better Matrix.newBuilder....
Made newbuilder methods more specific to return a vector or matrix builder rather than a general builder.
Specific CanBuildFrom traits introduced that return a vector or matrix builder rather than a general builder.
Vector and matrix arithmetics moved to separate implementation traits with target type parameter. This allows subtypes of Vector[E] and Matrix[E} to get proper result types of operations by overriding newBuilder and implementing a suitable CanBuildFrom implicit.
Storage-related stuff moved into separate traits, allowing subtypes to implement their own storage strategy. The implementation classes Double/Int/NumericVector and RowMatrix are abstract now. The default storage strategy is mixed in by the companion objects's apply method, respectively. As a consequence, the runtime class of a vector is not simply Double/Int/NumericVector anymore, but has the storage-implementation trait mixed in; likewise, the runtime class of a matrix is RowMatrix plus storage-implementation trait now.

References

[1] S. G. van der Meulen, P. Kühling, "Programmieren in ALGOL68", Bd. II (Berlin, New York: de Gruyter), 149-188 (1977).

LinAlg -- README