Wednesday, March 18, 2009

How I'm Learning F# - Interacting with the .Net Framework

Over the past few months, I've been hearing more and more about functional programming concepts and languages like Haskell and F#. While some of the initial musings that I've read revolved around how the concepts have been around for decades and how they make financial and scientific applications easier to read and write, I couldn't find a good reason to start learning them for my typical line-of-business application design and development job or even some of my basic hobby projects. Nonetheless, I kept getting drawn to the concept and have decided to focus on learning it.

This post marks the third entry of a series that I'll be writing to discuss how I'm going about learning F#. I'm not saying that my method of learning is ideal and should be followed by others; I'm just reporting how I'm going about doing it. Over the course of this series, my goal is to provide other .Net developers a(nother) resource for learning the F# language as well as to apply the language to some scenarios outside of finance and science.

 

Series Table of Contents:

  1. Finding Resources
  2. Writing the First Application
  3. Interacting with the .Net Framework

 

 

Project Overview:

In this post, I'm going to interact with libraries in the .Net Framework to illustrate how you'll be able to apply F# to functionality that you may normally write in another language. The basis of this project will incorporate 2 activities:

  1. Read from a comma-delimited text file
  2. Create a new fixed-width text file

Throughout the course of this project, we'll be dealing with a large amount of F# syntax as well as a couple of namespaces from the .Net Framework. This post is a bit of a step up from the previous post; however, I am hoping that a lot of the concepts will be reinforced through a slightly more comprehensive example.

 

What Does It Take to Read a Text File?

While I was writing this example, I began to think of how a person traditionally learns a language. While in our day-to-day jobs we may gloss over some of the granular steps of what it takes to read from a text file, I began to look for that level of detail in this program. To read a text file in any .Net language, the following basic steps must be done:

  1. Open the System.IO namespace
  2. Call the File.ReadAllLines(string) static function, passing the path to the file as a parameter.

What's nice about F# is that when you think about the detailed steps of a task, you begin to see the lines that you need to write. Here is the F# code that performs the above steps:

open System.IO
let readFile =
    File.ReadAllLines(@"C:\...\myFile.txt")

By writing the above code, we have just opened a .Net namespace (System.IO in this case) and declared a value that holds an array of strings, one for each line of the text file. We could have fully qualified the ReadAllLines() call instead of opening the namespace; however, we will be writing a new file through the same namespace in a moment, so this works out better.
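One detail worth a quick sketch: because readFile takes no parameters, F# treats it as a value that is evaluated once rather than as a function that runs on every call. If you would rather pass the path in, something along these lines should work. The readLines name and the temp-file path are my own, made up for illustration, and are not part of the project above.

```fsharp
open System.IO

// Hypothetical helper: taking the path as a parameter makes this a true
// function, so the file is read each time it is called.
let readLines (path : string) =
    File.ReadAllLines(path)

// Write a throwaway sample file so the sketch can run anywhere.
let samplePath = Path.Combine(Path.GetTempPath(), "fsharp-sample.csv")
File.WriteAllLines(samplePath, [| "apple,banana"; "cherry,date" |])

let lines = readLines samplePath
printfn "%d lines read" lines.Length
// prints: 2 lines read
```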

 

How Do I Manipulate This Array of Lines?

So, we have an array of strings representing the lines of delimited words. Now what? If we want to take these lines and turn them into fixed-width lines, we'll need to do the following things:

  1. Identify each word in each line (a.k.a. split the delimited string of words)
  2. Pad each word with spaces until its total length is 25
  3. Concatenate the words into one long string per line

Once again, we can directly map each of these steps to a line or function of code. Let's see how these steps would look when we translate them into functions:

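// Step 1: split a comma-delimited line into its words
let obtainWords (line : string) =
    line.Split(",".ToCharArray())

// Step 2: pad a word on the right with spaces, up to 25 characters
let padWord (word : string) =
    word.PadRight(25, ' ')

// Step 3: concatenate the words into one long string
let joinWords (words : string []) =
    System.String.Concat(words)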

Our 3 steps translate fairly easily into single-line functions thanks to some built-in String methods of the .Net Framework. Our obtainWords function takes a string and splits it using a comma as the delimiter. Next, our padWord function takes a string and pads it on the right with spaces until it is 25 characters in length (a word that is already longer than 25 characters is left unchanged rather than truncated). Lastly, joinWords calls System.String.Concat to take a string array and turn it into a single string.
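To get a feel for what each function returns, here is a small sketch that runs them against a single line. The sample line is made up for illustration and is not taken from the real input file.

```fsharp
// The three helpers from above, repeated so this sketch runs on its own.
let obtainWords (line : string) =
    line.Split(",".ToCharArray())

let padWord (word : string) =
    word.PadRight(25, ' ')

let joinWords (words : string []) =
    System.String.Concat(words)

// Made-up sample input.
let sampleLine = "apple,banana,cherry"

let words = obtainWords sampleLine                      // three words
let padded = padWord words.[0]                          // "apple" plus 20 spaces
let joined = joinWords [| padWord "a"; padWord "b" |]   // two 25-character columns

printfn "%d words, padded length %d, joined length %d" words.Length padded.Length joined.Length
// prints: 3 words, padded length 25, joined length 50
```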

We have our "what" to do to the lines but we haven't really answered the "how". In traditional C# or VB.net, we would probably loop over each line with a for loop and call each function in turn to update the variables. We would end up with something that looks like the following in C# (using our function names from above):

for(int x = 0; x<readFile.Length; x++)
{
    string[] words = obtainWords(readFile[x]);

    for(int i = 0; i<words.Length; i++)
    {
        words[i] = padWord(words[i]);
    }

    readFile[x] = joinWords(words);
}

Here we have a loop that iterates through each line held in readFile. We split each line into another array variable. Next we iterate over the words array and update its values to the padded versions. Finally, we replace the line with the combined string of our padded words.

The code is pretty straightforward, but I don't know many people who like inner loops. Also, some developers may fall into a trap and attempt to use a foreach loop instead of a for loop. If you are not aware of the difference here, the iteration variable of a foreach loop is read-only. I could iterate over the arrays in a foreach loop; however, I would have to add the values into a different variable altogether in order to "update" the values like I did above. Thankfully, F# has a special function that cleans this up for us.

 

Understanding Array.map()

One of the built-in functions that I have really enjoyed learning is Array.map(). The Array.map() function takes a function value and an array as parameters. It returns a new array made up of the results of applying the function to each element of the provided array. An example of this functionality is the inner loop of the C# example, where we took the words array of strings and updated every value in the array with the padWord() function. By using the Array.map() function in F#, we get code that does the same thing but looks like the following:

let newWordsArray =
    Array.map (padWord) words

This creates a new array (newWordsArray) whose values are the results of applying the padWord() function to each string in the words array. One nice thing about the Array.map() function is that it also allows a lambda expression to be used in place of the function value. Lambdas will be covered at a later time though. By using Array.map(), we can begin to chain our values together and take our delimited file and turn it into a fixed-width file. However, before we get into that, let's look at a technique in F# called pipelining that allows us to chain these together even more easily.
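As a quick check that Array.map() hands back a new array and leaves the original alone, here is a small sketch. The sample words are made up for illustration.

```fsharp
let padWord (word : string) =
    word.PadRight(25, ' ')

// Made-up sample data.
let words = [| "apple"; "banana"; "cherry" |]

// Apply padWord to every element; words itself is not modified.
let newWordsArray =
    Array.map padWord words

printfn "original: %d, mapped: %d" words.[0].Length newWordsArray.[0].Length
// prints: original: 5, mapped: 25
```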

 

Pipelining with |>

Pipelining is a technique where you take the value returned by one item and pass it as the parameter of another function. This is very similar to a technique in shell scripting that uses the pipe (|), greater than (>), and double greater than (>>) operators. For example, in a command window you can type in the dir command and see a directory listing; however, if you wanted to apply paging to the list you can type in dir | more. Likewise, if you wanted to send the directory listing from the dir command to a file, you could type in dir > file.txt to create or overwrite a file with the redirected output, or dir >> file.txt to append the directory listing to the contents of file.txt if the file already exists.

In F#, we can take a function and send its return value to another function using the |> operator. Building on our last examples, we can illustrate this by pipelining the return value of the Split call inside obtainWords (which returns a string array) into an applyPadding function that pads every word in an array. The resulting function (see below) returns a string array that already has every string in the array padded to 25 characters.

// applyPadding (defined in the full listing further down) must be declared before this function
let obtainWords (line : string) =
    line.Split(",".ToCharArray())
    |> applyPadding

At first glance, this may seem a bit confusing; however, remember that the output of the first expression (in this case line.Split()) is used as the parameter of the second (applyPadding). In essence, we are just reordering how we write a chain of events. This simple example just shows how we can remove the need for one additional value to hold the output of the split before passing it to applyPadding. Below is a more applicable example where we chain multiple steps in order to give our initial readFile value an already transformed array of strings.
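Before getting to that fuller example, here is a small sketch showing that a pipelined expression and the equivalent nested call give the same result. The sample line is made up for illustration.

```fsharp
let obtainWords (line : string) =
    line.Split(",".ToCharArray())

let padWord (word : string) =
    word.PadRight(25, ' ')

let applyPadding (words : string []) =
    Array.map padWord words

// Made-up sample input.
let sampleLine = "apple,banana"

// Nested call: read from the inside out.
let nested = applyPadding (obtainWords sampleLine)

// Pipelined: read left to right, in the order the steps happen.
let piped = sampleLine |> obtainWords |> applyPadding

// F# compares arrays element by element with =.
printfn "%b" (nested = piped)
// prints: true
```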

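open System.IO

// Split a comma-delimited line into its words
let obtainWords (line : string) =
    line.Split(",".ToCharArray())

// Pad a single word on the right, up to 25 characters
let padWord (word : string) =
    word.PadRight(25, ' ')

// Pad every word in an array
let applyPadding (words : string []) =
    Array.map (padWord) words

// Concatenate the padded words into one fixed-width line
let joinWords (words : string []) =
    System.String.Concat(words)

// Read the file, then split, pad, and rejoin each line
let readFile =
    File.ReadAllLines(@"C:\...\myFile.txt")
    |> Array.map(obtainWords)
    |> Array.map(applyPadding)
    |> Array.map(joinWords)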

Here, we open our System.IO namespace. Next we define the functions that will be used to transform the data in the file. Lastly, we create our value, readFile, which reads the lines into a string array and pipes that array into Array.map, which applies obtainWords to each line and returns an array of word arrays. Those words are then padded and subsequently joined. After all of those steps, readFile contains an array of strings that represents the fixed-width version of the comma-delimited file that was read. The final step is to write those lines to our output file.

File.WriteAllLines(@"C:\...\myOutputFile.txt", readFile)
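If you want to reuse the transformation for more than one file, the whole thing can be wrapped in a function that takes the two paths as parameters. This is only a sketch: the convertFile name and the temp-file paths are my own and are made up for illustration.

```fsharp
open System.IO

let obtainWords (line : string) =
    line.Split(",".ToCharArray())

let padWord (word : string) =
    word.PadRight(25, ' ')

let applyPadding (words : string []) =
    Array.map padWord words

let joinWords (words : string []) =
    System.String.Concat(words)

// Hypothetical wrapper: the same pipeline as above, with the paths as parameters.
let convertFile (inputPath : string) (outputPath : string) =
    let fixedWidthLines =
        File.ReadAllLines(inputPath)
        |> Array.map obtainWords
        |> Array.map applyPadding
        |> Array.map joinWords
    File.WriteAllLines(outputPath, fixedWidthLines)

// Throwaway files in the temp directory so the sketch can run anywhere.
let inputPath = Path.Combine(Path.GetTempPath(), "fsharp-demo-in.csv")
let outputPath = Path.Combine(Path.GetTempPath(), "fsharp-demo-out.txt")
File.WriteAllLines(inputPath, [| "apple,banana"; "cherry,date" |])

convertFile inputPath outputPath

let result = File.ReadAllLines(outputPath)
printfn "%d lines, each %d characters wide" result.Length result.[0].Length
// prints: 2 lines, each 50 characters wide
```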

 

Summary

At this point, another function can be established to do anything with the output file that you wish. You could use the System.Net.Mail namespace to gain access to the MailMessage object and email the new file to another process, or possibly even FTP/copy it somewhere. This is just a simple file transformation example to illustrate some more advanced functionality and how to interact with the .Net Framework using System.IO and System. If you know any other .Net language, the framework calls work the same way in F# as they do in VB or C#, from what I can tell so far.

This project was one where things began to click for me in F#. I knew the basics and had seen examples through Project Euler problems and such; however, working through an example like this showed me how easy and straightforward F# can make things.



1 comment:

  1. I've been playing with F# off and on for a bit now and really like pipelining.

    F# has a scripting feel to it, but with its static typing and Visual Studio integration you feel that you're going to get it right the first time.
