Friday, June 21, 2013

Using MS Solver Foundation and C# in Excel with Excel-DNA

In one of my latest posts http://mikejuniperhill.blogspot.fi/2013/06/using-excel-solver-with-vba.html I was presenting my approach for using Frontline Solver in Excel with VBA. I was also mentioning, that I would like to get rid of linking my program with concrete ranges in Excel worksheet. Why? To be able to create convinient User Defined Functions (UDF) for solving optimization routines. I would like to have an optimization tool, in which I take all given parameters and data inside to a program and perform any desired optimization routines inside a program, without interacting with Excel ranges. Now it is a good time to present one such a tool - Microsoft Solver Foundation (MSF).

You can use MSF for creating a wide range of different types of optimization models in C# for example. From there, you can link your solver model back to Excel with Excel-DNA http://exceldna.codeplex.com/. This means, that you can set up your data and other parameters in Excel workbook, then use C# and MSF to perform desired optimization routine without interacting with Excel and finally, push the results back to Excel worksheet. In other words, you can create Excel UDF for solving optimization routines.

Let us first go through, what we are going to reach in this post:
  1. We set up MSF to be used in Visual C# 2010 Express.
  2. We create optimization model in C# by using MSF.
  3. We set up Excel-DNA linking for C# program.
  4. We create interface function for handling data between Excel and C# program.
  5. We verify MSF model results against Excel Frontline Solver results.
So, let us get into it. First, open your Visual C# 2010 Express.

Microsoft Solver Foundation setup

First, we have to find MSF dll library file. We need to have Solver Foundation v3.0 - DLL only. I found it from here
http://archive.msdn.microsoft.com/solverfoundation/Release/ProjectReleases.aspx?ReleaseId=1799

When you have that dll file, we create a new C# Class Project in your Visual C# 2010 Express (File - New Project - Class Library - Name=MSF_Solver). Remember to save your project. Now, create reference to that MSF dll library file (Solution Explorer - References - Add Reference - Browse). If everything goes correctly, you should be able to build the following test program without any errors. This program is just for quick testing, that your MSF is ready for servicing.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.SolverFoundation.Services;

namespace MSF_Solver
{
    public class ExcelFunctions
    {
        SolverContext solver = SolverContext.GetContext();
    }
}

I assume that everything went correctly without any errors in your build. At this stage, we have set up MSF to be used in our Visual Studio. One word about the tools: I am using Visual C# 2010 Express and having .NET Framework 4.0 as my Target framework.

The C# Program

Now we need to create the actual program, which will be using MSF to perform desired optimization routine. First, we create ISolver interface, from which all possible Solver models are going to be implemented. This interface is having only one method (solve) and it returns Dictionary data structure (string, double). In your Visual Studio, add new Interface to your project (Solution Explorer - MSF_Solver Project - Add - Add New Item - Interface - Name=ISolver). You can copy-paste the following program into this interface.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace MSF_Solver
{
    public interface ISolver
    {
        Dictionary<string, double> solve();
    }
}

Next, we need to create an implementation for this interface. Add a new Class to your project (Solution Explorer - MSF_Solver Project - Add - Add New Item - Interface - Name=PolynomialFitting). You can copy-paste the following program into this class.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.SolverFoundation.Services;
//
namespace MSF_Solver
{
    //
    // PolynomialFitting implementation for ISolver interface is fitting nth degree polynomial 
    // approximation function to a given set of data points by using ordinary least squares method
    public class PolynomialFitting : ISolver
    {
        private Dictionary<double, double> curve; // data points to be fitted (x-value, y-value)
        private int degree; // define degree for polynomial function
        //
        public PolynomialFitting(Dictionary<double, double> curve, int degree)
        {
            this.curve = curve;
            this.degree = degree;
        }
        //
        // ISolver interface method implementation
        public Dictionary<string, double> solve()
        {
            // create solver model
            SolverContext solver = SolverContext.GetContext();
            solver.ClearModel();
            Model model = solver.CreateModel();
            //
            // create decision variables for solver model, based on given polynomial degree
            Dictionary<string, double> coefficients = new Dictionary<string, double>();
            for (int i = 0; i <= degree; i++)
            {
                coefficients.Add("b" + i.ToString(), 0.0);
            }
            //
            // add decision variables to solver model
            var decision = coefficients.Select(it => new Decision(Domain.Real, it.Key));
            model.AddDecisions(decision.ToArray());
            //
            // create objective function by using SumTermBuilder
            var objectiveFunction = new SumTermBuilder(curve.Count);
            //
            // itarate through all given data points
            foreach (KeyValuePair<double, double> kvp in curve)
            {
                // create term object incrementally for squared error (y estimate - y actual)^2
                Term subTerm = 0;
                for (int i = 0; i < model.Decisions.Count(); i++)
                {
                    // y-estimate = (b0 * maturity^0 + b1 * maturity^1 + ... + bn * maturity^n)
                    subTerm += model.Decisions.ElementAt(i) * Math.Pow(kvp.Key, i);
                }
                subTerm -= kvp.Value; //  subterm = (y estimate - y actual) = e
                subTerm *= subTerm; // subterm = e^2
                objectiveFunction.Add(subTerm); // add subterm to objectiveFunction (SumTermBuilder)
            }
            //
            // define optimization type and give objective function SUM(e^2) to be minimized
            model.AddGoal("SumOfSquaredErrors", GoalKind.Minimize, objectiveFunction.ToTerm());
            //
            // solve model and transfer results (optimized decision variables) from 
            // model into a dictionary object which will be returned for the caller
            Solution solution = solver.Solve();
            Dictionary<string, double> parameters = new Dictionary<string, double>();
            foreach (Decision parameter in model.Decisions)
            {
                parameters.Add(parameter.Name, parameter.ToDouble());
            }
            return parameters;
        }
    }
}

The previous class takes in a given set of data points (x, y) inside Dictionary data structure, along with the information about the required polynomial degree of the approximation function to be fitted with data. Relevant comments are included inside the program. More information on MSF can be found with Google. Also, Here are two excellent hands-on examples on using MSF from Mathias Brandewinder: 
http://www.clear-lines.com/blog/post/First-steps-with-the-Microsoft-Solver-Foundation.aspx http://www.clear-lines.com/blog/post/Create-optimization-programs-dynamically-with-C-and-the-Microsoft-Solver-Foundation.aspx

At this point, we have created our desired optimization model in C# by using MSF. Next, we need to set up Excel-DNA for linking our C# program with Excel Worksheet.

Excel-DNA part

First we have to find Excel-DNA files. I was able to find these from here http://exceldna.codeplex.com/ by going to Downloads. I have been using ExcelDna-0.29 so far, but there is now version 0.30 available, as I am writing this post. Download zip file and unzip the content into some appropriate folder.

Next, in our C# program, we need to create reference to Excel-DNA dll file (Solution Explorer - References - Add Reference - Browse). You should find ExcelDna.Integration.dll file. In Properties window, mark this reference as Copy Local = False.

Next, Add a new file (dna file) to the project (Solution Explorer - MSF_Solver Project - Add - Add New Item - Text File - Name=MSF_Solver.dna) with the following content:

<DnaLibrary Name="MSF_Solver" RuntimeVersion="v4.0">
  <ExternalLibrary Path="MSF_Solver.dll" />
</DnaLibrary>

In the properties for this file, set that this file will be copied to the Output directory. Set Copy to Output Directory = Copy if newer.

Then, create a copy of the ExcelDna.xll file (you can find this xll file in your unzipped Excel-Dna folder) into the project directory, and rename it to be MSF_Solver.xll. Add this file to the project (Solution Explorer - MSF_Solver Project - Add - Add Existing Item - MSF_Solver.xll), and in the properties for this file, ensure that it will be copied to the Output directory. Set Copy to Output Directory = Copy if newer.

After these important settings, build your solution. I assume everything has gone well to this point. Setting up Excel-DNA is not difficult, but still one needs to be very careful. One slip in settings and you will spend a lot of time with your compiler, wondering why this is not working. So, be careful out there.

Detailed instructions for setting up Excel-DNA functionalities in Visual Studio can be found from here http://exceldna.codeplex.com/documentationFind the following Word documentation Excel-DNA - Step-by-step C# add-in.doc In that documentation, find the section Creating a new add-in library in C#. The process of setting up Excel-DNA has been described in a very detailed way in that section, along with some screen captures.

At this point, we have set up Excel-DNA files for linking our C# program with Excel Workbook. Next, we are going to create interface function for handling data between Excel and C# program.

Interface function between Excel and C#

At this point, you should have that one Class (ExcelFunctions) existing in your current project (the one, in which we tested MSF originally). You can copy-paste the following code into that class.



using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using ExcelDna.Integration;

namespace MSF_Solver
{
    public class ExcelFunctions
    {
        [ExcelFunction(Category="Excel-DNA functions", Description = "Coefficients for nth degree polynomial curve fit to a given set of data points by using OLS method")]
        public static object[,] PolynomialCurveFitting([ExcelArgument(Description = "2-dimensional array of (x, y) values")] Object[,] curve, [ExcelArgument(Description = "Degree for polynomial function")] int degree)
        {
            // number of data points for a given curve is dynamic
            int rows = curve.GetUpperBound(0);
            //
            // read a given curve data into a dictionary
            Dictionary<double, double> dict = new Dictionary<double, double>();
            for (int i = 0; i <= rows; i++)
            {
                dict.Add((double)curve.GetValue(i, 0), (double)curve.GetValue(i, 1));
            }
            //
            // create appropriate solver model, perform curve fitting and receive optimized coefficients
            ISolver solver = new PolynomialFitting(dict, degree);
            Dictionary<string, double> coefficients = solver.solve();
            //
            // create array for output
            // key is coefficient name (string) and value is coefficient value (double)
            object[,] arr = new object[coefficients.Count , 2];
            //
            for (int i = 0; i < coefficients.Count; i++)
            {
                arr[i, 0] = (string)coefficients.ElementAt(i).Key;
                arr[i, 1] = (double)coefficients.ElementAt(i).Value;
            }
            return arr;
        }
    }
}

First, PolynomialCurveFitting method is receiving data as 2-dimensional object array from Excel. The program then reads that data into a Dictionary data structure. This Dictionary is a feed for ISolver implementation (solver), which performs the optimization part and returns optimized coefficients inside a new Dictionary back to this method (coefficients). Finally, the content of this result Dictionary is read into a new object array (arr) and returned back to Excel.

Now, build your fully completed project. At this stage, you should have the following four files in your Project directory (MSF_Solver\bin\Debug):
  1. MSF_Solver.dll
  2. MSF_Solver.dna
  3. MSF_Solver.pdb
  4. MSF_Solver.xll
Program test run - curve fitting

We test our program and verify our MSF model results against Frontline Solver results. First, set up the following swap curve data into a new Excel workbook.

0,08 0,19
0,25 0,27
0,5 0,29
1 0,35
2 0,51
5 1,38
10 2,46
20 3,17
30 3,32

While your Excel is open, go to your Project folder (MSF_Solver\bin\Debug) and double-click MSF_Solver.xll file. You may not notice anything, but behind the scenes your xll file is loaded into your Excel Workbook. If something has gone wrong, you will get an error message at this point.

Now, let us perform a curve fitting for that data with second degree polynomial function. We are going to have three output coefficients from the program (b0, b1, b2). Our C# method is returning the name of the coefficient and its optimized value. All in all, we are going to receive 3x2 output variant array in this example case. So, when using PolynomialCurveFitting function in Excel, remember to select first 3x2 array in your worksheet before creating your function formula and then press CTRL+SHIFT+ENTER to get the result array from C#.

Below here is my test run results. I have been testing this program also for other degree polynomial approximations and it seems to be working well at least against Excel Solver. We could make preliminary conclusion, that our C# MSF solver model is working as required.
MSF solver Excel solver
b0 0,1434 b0 0,1434
b1 0,2684 b1 0,2684
b2 -0,0055 b2 -0,0055

Before publishing this post, I have been following all my instructions described on this posting, re-created this program and used PolynomialCurveFitting function in Excel successfully two times. So, if you follow all the instructions carefully, you should get the working program.

Some Afterthoughts

Let us face one fact: people in banks are accustomed with Excel and they want to use it. Excel is - and probably will be for a long time - the daily workhorse. Tools like Excel-DNA can open up a lot of new doors what we can do in Excel behind the scenes. Personally I find this approach of creating a program completely outside Excel (VBA) and linking that program back to be used in Excel, to be a wonderful tool. It feels a bit like using new Rolls Royce engine in your old Volkswagen.

My personal Thank You this time goes directly to Govert Van Drimmelen, who AFAIK has developed Excel-DNA pretty much on his own. What this world really would be without such innovators? Thank You also for Mathias Brandewinder and his clear-lines.com blog for those two excellent articles on MSF. Also, Thank You for MSF team for developing such a great optimization tool.

And finally, Thank You for reading this blog again. I really hope that you have got something out from this. Have a great midsummer day!
-Mike

No comments:

Post a Comment