Tech Note: Batch Processing

Batch Processing with pro Fit

Introduction

This TechNote explains how to do batch processing with pro Fit, i.e. how to process several files automatically.

The techniques described here can be used for various applications where a large number of data files must be processed in some way. You may want to combine the data of all files into a single file. Or you may want to open each file, calculate the average of each column, and create a new data file that contains the average values of all data files.
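The column-averaging task mentioned above can be sketched in a few lines of plain Python, independent of pro Fit. This is only an illustration under stated assumptions: the input files are tab-delimited text with purely numeric columns and no header line, they end in ".txt", and the names column_averages and collect_averages are made up for this sketch.

```python
import csv
from pathlib import Path
from statistics import mean


def column_averages(path, delimiter="\t"):
    """Return the average of each column in one delimited text file."""
    with open(path, newline="") as handle:
        rows = [[float(cell) for cell in row]
                for row in csv.reader(handle, delimiter=delimiter) if row]
    # zip(*rows) transposes the list of rows into columns
    return [mean(column) for column in zip(*rows)]


def collect_averages(folder, output_path, pattern="*.txt", delimiter="\t"):
    """Write one line per input file: the file name, then its column averages."""
    output_path = Path(output_path)
    results = []
    for path in sorted(Path(folder).glob(pattern)):
        if path.resolve() == output_path.resolve():
            continue  # never read the results file as input
        results.append([path.name] + column_averages(path, delimiter))
    with open(output_path, "w", newline="") as handle:
        csv.writer(handle, delimiter=delimiter).writerows(results)
    return results
```

Calling collect_averages with a folder and an output path produces one row per data file, which mirrors the "one result per file" layout used by the fitting examples in this TechNote.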

In this TechNote we discuss the following example:

You have a large number of files, each file containing the data of an exponential decay. You want to fit the data from each file to pro Fit’s built-in exponential function. The results of all these fits are to be collected in a new data file.

The techniques we present for solving this problem can easily be applied to any other type of batch processing.

This TechNote and all example files can also be downloaded as a zipped file (16 kB).

Methods for batch processing

There are four basic methods that can be used for batch fitting our data files in pro Fit:

  1. You open all the files by hand (select them all in the Finder and double-click). Then run a pro Fit script that goes through all open data windows and runs a fit for each of them.
  2. You write a Python script in pro Fit that runs through all the files. For each file, the script opens it, runs the fit, collects the fit results, and then closes the file again.
  3. You write an AppleScript in Apple’s Script Editor that runs through all the files. For each file, the script tells pro Fit to open it, then calls a pro Fit script for fitting and collecting the fit results, and then closes the file again.
  4. You write an external module for cycling through all the files. The module opens each file in pro Fit, runs a fit, stores the fit results, and closes the file again.

In the following, we will discuss methods 1, 2 and 3. Method 4 requires an in-depth knowledge of MacOS programming, but is otherwise similar to methods 2 and 3 – we will therefore not discuss it in this TechNote.

In all the examples presented here, we will fit the data in each file to pro Fit’s built-in function “Exp” by varying the parameters “A”, “t0” and “const”. The parameter “x0” is to be set to 0 and held constant (because “A” and “x0” are mathematically redundant and they cannot be fitted simultaneously).
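The redundancy of “A” and “x0” can be checked numerically. The sketch below is plain Python outside pro Fit and assumes that the decay has the form A·exp(-(x - x0)/t0) + const. The exact definition of the built-in “Exp” function may differ in sign or parameter order, and exp_model and shift_to_zero are names made up for this illustration.

```python
import math


def exp_model(x, A, x0, t0, const):
    # Assumed form of the decay model; pro Fit's built-in "Exp" may use
    # a different sign convention or parameter order.
    return A * math.exp(-(x - x0) / t0) + const


def shift_to_zero(A, x0, t0, const):
    """Return an equivalent parameter set with x0 = 0.

    A*exp(-(x - x0)/t0) equals (A*exp(x0/t0)) * exp(-x/t0), so any shift
    in x0 can be absorbed into the amplitude A.
    """
    return A * math.exp(x0 / t0), 0.0, t0, const


original = (2.0, 1.5, 3.0, 0.5)
shifted = shift_to_zero(*original)
for x in (0.0, 1.0, 5.0):
    # both parameter sets describe the same curve
    print(x, exp_model(x, *original), exp_model(x, *shifted))
```

Because two different parameter sets give the same curve, a fit that varies both “A” and “x0” has no unique solution, which is why “x0” is held constant in the examples.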

Method 1: Batch processing from a Pascal script within pro Fit

This method is the easiest to script because the script does not have to find the files within the file system; it assumes that you have already opened them within pro Fit. Its drawback is that you must manually open all files to be processed before running the batch process. You can easily open a large batch of files from the Finder: Select all the files to be opened, then hit command-O or drag the files onto the pro Fit icon.

The short script that follows then does all the work for us.

program MultipleFit;

var wind, myDataWindow;
    windowCount;
    fResults: Object;

begin
 NewWindow(type dataType);                {open a new window for storing results}
 myDataWindow := FrontWindow;                            {save a reference to it}

 windowCount := 0;
 wind := NextWindow(myDataWindow);                    {get next window behind it}
 SelectFunction('Exp');                                {the function we will fit}
 SetParameterProperties(parameter 2, mode inactive);    {we don't want to fit x0}
 while (wind <> 0) do                                 {cycle through all windows}
 begin
   if GetWindowProperty(wind, type) = dataType then              {if data window}
   begin
     windowCount := windowCount+1;
     Writeln('Fitting window ', windowCount);
     SetColumnName(windowCount, GetWindowProperty(wind, name));
     fResults := 
     	CurveFit(window wind, xColumn 1, yColumn 2);                {run the fit}
     if FitResult(fitResultObject fResults, result nrFittedParameters) <> 0 then
     begin                           {if the fit was successful, get the results}
       data[1, windowCount] := FitResult(fitResultObject fResults, 
               result fittedParameter, index1 1);      
       data[2, windowCount] := FitResult(fitResultObject fResults, 
               result fittedParameter, index1 2);      
       data[3, windowCount] := FitResult(fitResultObject fResults, 
               result fittedParameter, index1 3);      
       data[4, windowCount] := FitResult(fitResultObject fResults, 
               result fittedParameter, index1 4);      
     end;
   end;{if}
   wind := NextWindow(wind);                                  {go to next window}
 end;
end;

The script first opens a data window to store the fit results. This window becomes the current data window. Then it cycles through all other presently open data windows, fits each one, and stores the results into the first data window.

To cycle through the data windows, the program uses the commands NextWindow and GetWindowProperty. NextWindow returns a reference ID for the window that lies behind another window. GetWindowProperty(wind, type) returns the window type, i.e. it tells whether a window is a data window, drawing window or text window. When the script finds a data window, it runs the fit (by calling CurveFit) and retrieves the results by calling FitResult.

Obviously, the same script could also be written as a Python script. For details, see the next section, where a Python script is used to retrieve the data files from the file system.

Method 2: Batch processing from a Python script

Method 1 has the disadvantage that you have to open all the files manually before running the program. 

It might be more convenient to have a method that runs a fit over all files in a given folder. There is no way to do that from a Pascal script, but it can be implemented by means of a Python script. This is described in the following.

Open a new Function window, and paste the following script into it:

import os

# the directory to iterate through, adapt this to your needs:
dataDir = pf.ChooseDirectory()

# select the function to fit and set parameter x0 to inactive
pf.SelectObject(pf.GetFunctionObject(function = "Exp"))
pf.SetParameterProperties(parameter = 2, mode = pf.paramInactive)

# open a new data window for storing the results
pf.NewWindow(type = pf.dataType)
myDataWindow = pf.FrontWindow()

windowCount = 0
# recursively iterate through the directory
for root, dirs, files in os.walk(dataDir):
    for name in files:
    
        # open the file
        p = os.path.join(root, name)
        print ('processing', p)
        pf.OpenFile(file = p)
        wind = pf.GetWindowObject(window = pf.FrontWindow())
        
        #if the opened window is a data window
        if wind.type == pf.dataType:
            windowCount = windowCount+1
            windowName = wind.name
            
            # fit the function, and close the opened window
            fResults = pf.CurveFit(window = wind, xColumn = 1, yColumn = 2)
            pf.CloseWindow(window = wind, saveOption = pf.dontSave)
            
            # get the results
            pf.SetColumnName(windowCount, windowName)
            if pf.FitResult(fitResultObject = fResults, result = pf.nrFittedParameters) != 0:
                pf.SetData(1, windowCount, pf.FitResult(fitResultObject = fResults,  \
                    result = pf.fittedParameter, index1 = 1))
                pf.SetData(2, windowCount, pf.FitResult(fitResultObject = fResults,  \
                    result = pf.fittedParameter, index1 = 2))
                pf.SetData(3, windowCount, pf.FitResult(fitResultObject = fResults,  \
                    result = pf.fittedParameter, index1 = 3))
                pf.SetData(4, windowCount, pf.FitResult(fitResultObject = fResults,  \
                    result = pf.fittedParameter, index1 = 4))
                
        else:
            pf.CloseWindow(window = wind, saveOption = pf.dontSave)
    
print ('done')

This script looks similar to the Pascal script above, but instead of getting the data from the presently opened data windows, it brings up a dialog that allows choosing a directory, whereupon it iterates through all files in that directory (assuming that pro Fit can open them). For each data file, it runs the fit, just as in the first example.
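Because os.walk visits every file in the chosen folder and all of its subfolders, it can be useful to narrow the list before handing files to pro Fit. The following sketch is plain Python and does not use any pro Fit calls. The helper name find_data_files and the extensions ".txt" and ".dat" are assumptions chosen for illustration; adapt them to your own files.

```python
import os


def find_data_files(data_dir, extensions=(".txt", ".dat")):
    """Return the sorted paths below data_dir whose extension is in extensions."""
    matches = []
    for root, dirs, files in os.walk(data_dir):
        # Editing dirs in place keeps os.walk from descending into hidden folders.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            if name.startswith("."):
                continue  # skip hidden files such as .DS_Store
            if os.path.splitext(name)[1].lower() in extensions:
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

The returned list could replace the nested os.walk loops in the script above, so that only the selected files are opened, and always in the same order.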

Method 3: Batch processing from an Apple Script

As an alternative to a Python script, you can also use an AppleScript for the same purpose.

To define the AppleScript, start Apple’s Script Editor, open a new AppleScript window, and enter the following:

-- bring up a dialog for selecting the folder
set myFolder to choose folder with prompt "Choose a folder with data files:"

-- create a list with all files in the folder
set myFiles to list folder myFolder -- a list of files in myFolder
set myFileCount to count myFiles -- the number of files in myFolder

-- now start fitting with pro Fit
tell application "pro Fit NAS"
	activate -- bring pro Fit to front
	set error alerts to false -- disable error reports within pro Fit
	make new table -- open a data window for storing our results
	set name of front window to "Result data" -- and set its name
	set globalData 0 to 0 -- this is our window counter
	repeat with i from 1 to myFileCount
		set theFile to item i of myFiles -- get the i-th file
		try
			open file ((myFolder as string) & theFile) as table -- open file
			write line "processing: " & theFile
			run program "SingleFit" -- run the program in pro Fit
			close window theFile saving no -- close without saving
		on error errText
			write line "cannot process: " & theFile & " (" & errText & ")"
		end try
	end repeat
	set error alerts to true -- enable error reports within pro Fit
end tell

The above script first brings up a dialog where you can choose the folder that contains your files. Then it opens a new data window in pro Fit and sets its name to “Result data”. This data window will contain our results. Then it sets the variable globalData[0] to 0. This variable can be accessed by the program described below and is used as a file counter. Then the script tries opening each file of the folder as a data file. If this is successful, it runs a program called “SingleFit” that must be defined in pro Fit. This program is described below. Then the script closes the data window and goes to the next one.

Note that you can save your script as a “compiled script” from your script editor. The script can then be opened as any other pro Fit module using the “Load Module” command in the Customize menu. Or you can put it in the “pro Fit modules” folder. In this way the script is always accessible inside pro Fit and you don’t need anything else to do your batch processing.

Before running the script, you must define the program “SingleFit”. To do this, switch to pro Fit and define the following program:

program SingleFit;
  var windowCount;
    fResults : Object;
begin
  windowCount := GetGlobalData(0) + 1;                {increase window counter}
  SetGlobalData(0, windowCount); 
  SetCurrentWindow(FrontWindow);             {use data of front window for fit}
  SetFunctionParam('Exp', 1, 1);                      {set starting parameters}
  SetFunctionParam('Exp', 2, 0);                       {these values depend on}
  SetFunctionParam('Exp', 3, 1);                          {your model and data}
  SetFunctionParam('Exp', 4, 0);
  fResults := CurveFit(function 'Exp',  xColumn 1, yColumn 2);    {run the fit}
  SetCurrentWindow(GetWindowID('Result data'));            {window for results}
  if FitResult(fitResultObject fResults, result nrFittedParameters) <> 0 then 
  begin                                             {if the fit was successful}
       data[1, windowCount] := FitResult(fitResultObject fResults, 
               result fittedParameter, index1 1);      
       data[2, windowCount] := FitResult(fitResultObject fResults, 
               result fittedParameter, index1 2);      
       data[3, windowCount] := FitResult(fitResultObject fResults, 
               result fittedParameter, index1 3);      
       data[4, windowCount] := FitResult(fitResultObject fResults, 
               result fittedParameter, index1 4);      
  end;
end;

This program simply fits the data in the front window (which was opened by our AppleScript) and writes the fitting results to the window titled “Result data”.

It uses SetGlobalData/GetGlobalData for storing the window counter so that the value can also be accessed from the AppleScript.

Notes

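  1. The program SingleFit sets the starting parameters of the function Exp before fitting; the scripts for methods 1 and 2 start from the parameter values currently set in pro Fit. In either case, the starting values may have to be adjusted to the data sets to be fitted.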
  2. The pro Fit package comes with an Apple Script called “batch processing”. This script can be found in the folder “Examples:AppleScript” – it shows some other tricks that can be done with Apple Scripts. The use of an Apple Script might appear a bit clumsy at first sight because it requires writing the script in a separate application. But Apple Scripts are a very powerful tool, and once you get used to their weird syntax, there’s nearly nothing you cannot do with them.
  3. If you use an AppleScript often, you may want to add it to pro Fit’s Prog menu. To do this, save the script as a “compiled script” from your script editor, then switch to pro Fit and choose “Load Module” from the Customize menu. To add the script permanently, place the compiled script into the folder “pro Fit Modules” in the pro Fit folder.
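As an illustration of note 1, starting values for a decay can often be estimated from the data itself. The sketch below is plain Python outside pro Fit and only a rough heuristic. It assumes a model of the form A·exp(-x/t0) + const with x0 fixed at 0, x values in increasing order starting near 0, and data that has mostly decayed to its baseline by the last point. The function name guess_decay_start is hypothetical.

```python
import math


def guess_decay_start(xs, ys):
    """Rough starting values (A, t0, const) for A*exp(-x/t0) + const."""
    const = ys[-1]                  # baseline: the last measured value
    A = ys[0] - const               # amplitude above (or below) the baseline
    target = const + A / math.e     # level reached after one decay time
    t0 = xs[-1] - xs[0]             # fallback if that level is never reached
    for x, y in zip(xs, ys):
        if (A > 0 and y <= target) or (A < 0 and y >= target):
            t0 = x - xs[0]          # first point at or past the 1/e level
            break
    return A, t0, const
```

The three values could then be passed to the SetFunctionParam calls in SingleFit in place of the fixed starting values. Noisy data may need smoothing before such an estimate is reliable.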

Further reading

pro Fit User Manual, Chapter 11 “Apple Script”