Yin Zhu, Tuesday, May 4, 2010
Before we go to the answer of this question, let's see some existing data mining environments.
Data mining environments
Following lists the notable features, which I think are helpful for data mining.
1. Succinct, static typing at the same time. These two features are combined together. When we write programs, we need to think about the programming logic and the programs, e.g. write the loop, define a variable, write the type for it, etc. Programming logic is the core thing, but every language has a syntax to obey, so we need to write a lot of other things, which could be distractive during programming because you need to think about more things at the same time. More succinct means you don't need to write so many 'declarations' for your programs. Python is also succinct, however it is not static typing, which means typing checking is done during runtime. Dynamic typing is slow. In a lot of cases, F# does not need to write types, its other syntax is also succinct, it is also static typing at the same time. Combining these two together means we can focus programming logic and at the same time keep the performance.
2. Functional and programming style. F# is functional, thus the A, B, C of a functional programming also applies to F#. Here A could be 'function as the first class', which means you could write:
a |> split |> Seq.filter notStopWord |> Seq.map stemmer
to perform string split, filtering using a self defined filter and doing word stemming in a single line!
In an OOP language, a pipe line design pattern is far more complex than this.