Here are some tips and guidelines for anyone adding collection processing functions to the F# core library, in particular those working on items from the F# 4.0 list of proposed additions.

Implementation

  • General
    • Follow the established coding style as best you can.
    • Follow the established border- and error-case behavior. When null or empty inputs are specified, indices are out of bounds, etc., follow the behavior established by existing, similar functions.
    • Author nice, concise XML doc comments (including exception types and parameter docs) in the relevant .fsi signature file.
  • Performance
    • Mutable state and procedural-style code are absolutely acceptable (and in fact encouraged) if they result in significant performance benefits. The core library code needs to be correct and performant before it needs to be idiomatic. Look at the implementation of similar existing functions to see examples of this.
    • Don't use array slicing syntax (arr.[i .. j]). Slicing is syntactic sugar that results in a bit of additional generated code and runtime cost. Instead, use lower-level, optimized constructs, e.g. Array.zeroCreateUnchecked + Array.Copy.
    • Performance testing is highly encouraged. In your pull request, describe what testing you did and what tradeoffs you encountered. This gives the community context when reviewing the change, and documents the work for others embarking on similar changes.
    • It might be most efficient to special-case small input collections, and to use a different or more general algorithm for larger collections. See, for example, List.foldBack or Seq.windowed.
    • Note the existence of optimized helpers and implementations (only legal in FSharp.Core) for List and Array in local.fs, e.g. inline IL and mutation of List elements. It might be appropriate to add to or consume these helpers.
    • Consider stack overflows when processing collections (non-tail) recursively
    • Consider the differences in behavior and performance when the collection contains reference types vs value types (vs large value types)
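
As a rough illustration of several points above (XML doc comments, explicit error-case checks, and a bulk copy instead of slicing), here is a minimal sketch. The function name takeFirst and the exception types it throws are assumptions made for this example, not the behavior of any existing FSharp.Core function. Array.zeroCreate stands in for the Array.zeroCreateUnchecked helper mentioned above, on the assumption that the latter is only reachable from inside FSharp.Core. In the library itself the doc comment would live in the .fsi signature file.

```fsharp
open System

/// <summary>Returns a new array containing the first <c>count</c> elements of <c>array</c>.</summary>
/// <param name="count">The number of elements to copy.</param>
/// <param name="array">The input array.</param>
/// <returns>The result array.</returns>
/// <exception cref="System.ArgumentNullException">Thrown when the input array is null.</exception>
/// <exception cref="System.ArgumentException">Thrown when <c>count</c> is negative or exceeds the array length.</exception>
let takeFirst (count: int) (array: 'T[]) : 'T[] =
    // Hypothetical example function; the error behavior below is an assumption.
    // Check for a null input before touching the array.
    match box array with
    | null -> nullArg "array"
    | _ -> ()
    if count < 0 || count > array.Length then
        invalidArg "count" "count must be between 0 and the length of the array"
    // Allocate the result once and bulk-copy, rather than writing array.[0 .. count - 1].
    let result : 'T[] = Array.zeroCreate count
    Array.Copy(array, 0, result, 0, count)
    result
```

For example, takeFirst 2 [| 1; 2; 3 |] yields [| 1; 2 |]. When writing a real addition, check the error cases against the closest existing function instead of copying the choices made here.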

Tests

  • Add unit tests to the existing test code for the relevant collection type, under src/fsharp/FSharp.Core.Unittests/FSharp.Core/Microsoft.FSharp.Collections, e.g. ArrayModule.fs
    • Follow the pattern established by the existing tests - add one [<Test>] method for the new function, with all validation in that method. Additional helper methods or functions are fine.
    • Make sure to validate standard equivalence classes for collections - null, empty, 1-element, multi-element
    • Make sure to validate exception types that should be thrown in negative cases
  • Update the FSharp.Core public surface area baselines in SurfaceArea.4.0.fs and SurfaceArea.Portable.fs
    • These tests simply reflect over all public types and members exposed by FSharp.Core, validating that nothing gets accidentally added, deleted, or modified (by a refactoring, for example).
    • If you are adding a new collection processing function, the change will be a one-line addition, similar to Microsoft.FSharp.Collections.ArrayModule: T[] MyNewArrayFunction[T](T[])
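
The equivalence classes and negative cases listed above might be covered along the following lines. This is only a sketch: it validates the existing Array.rev function, uses a plain function so that it runs without a test framework, and the throws helper and the name revValidation are hypothetical. In the real test file the body would sit in a single [<Test>] method and would use the helpers already present there.

```fsharp
open System

// Hypothetical helper: true only if f throws an exception of type 'TExn.
let throws<'TExn when 'TExn :> exn> (f: unit -> unit) : bool =
    try
        f ()
        false // did not throw at all
    with
    | :? 'TExn -> true
    | _ -> false // threw, but the wrong exception type

// All validation for one function lives in one place.
let revValidation () =
    // Multi-element input
    if Array.rev [| 1; 2; 3 |] <> [| 3; 2; 1 |] then failwith "multi-element"
    // 1-element input (reference-type elements this time)
    if Array.rev [| "a" |] <> [| "a" |] then failwith "1-element"
    // Empty input
    if Array.rev ([||] : int[]) <> [||] then failwith "empty"
    // Null input: validate the exception type in the negative case
    let nullArr : int[] = null
    if not (throws<ArgumentNullException> (fun () -> Array.rev nullArr |> ignore)) then
        failwith "null"

revValidation ()
```

Mixing value-type and reference-type element types, as done here with int and string, is a cheap way to exercise both cases mentioned in the Performance guidelines.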

Last edited Jul 1, 2014 at 5:34 PM by latkin, version 2