Sunday, July 16, 2017


--- Crosstabulation ---

2016

Until I figured out how to create stub and banner tables with LibreOffice pivot tables, I was considering making a tabling routine for DANSYSX but that would be redundant, so my focus for crosstabulation became just the 2-way and 3-way statistics programs. Since they are involved and, from scratch, it will give me a chance to talk about my structured approach to developing Calc routines.

XTab is a big (!) matrix function, but it's a sequential function, meaning that it has many procedures that are completed before the next process is started. My first task is to outline the processes with comments. A comment in LibreOffice Basic begins with a single apostrophe. Here, I include a header:

FUNCTION XTab(optional InMat,optional InType, optional CellCont as string,
_)

(It's not complete, because I'll be adding parameters as different procedures require them.) Then, comes a descriptive comment:

'Performs a complete analysis on crosstabulation data
'from a raw data table or a crosstabulation of data.
'The input is the data and the output is a matrix containing
'a stub-and-banner table of cell data followed by
'a table of statistics. InType=1 (Crosstab), 2 (raw data)
'CellCont is a binary string designated data to be displayed in
'cells of output: 10 places: counts, exp counts, row%, column%,
'Total%, residuals, stand. residuals, adj, residuals, chi square,
'likelihood chi square, odds, compare column proportions (Bonferroni).

and a declarations and initialization section:

'Declarations and initialization. If InMat isn't specified,
'InMat=MatArray1. If InType isn't specified, InType=1 (crosstab).
'If CellCont isn't specified, CellCont="100000000000"

DIM OutMat(), Xi(), Yj(), OutR as integer, OutC as integer
DIM I AS INTEGER, J AS INTEGER, K AS INTEGER, L AS INTEGER

That will also grow as I add code. I'll declare other variables in alphabetical order, inserting each as I use them, so I can keep up with names that I've used. Double declarations and accidentally using the same variable for multiple purposes without knowing it can be major headaches.

Then, I know that I'll have to crosstabulate data if a raw data table is fed to the function, so I annotate that section:

'Crosstabulation, if needed.


Redim OutMat(), Xi(1,2), Yj(1,2)

Thinking ahead, I know I'll have to redimension the output matrix when I know what the stub-and-banner part of the display will look like. Everything below that will be pretty straightforward and I can add size as I finish each section. Each section is independent of the ones below it, so I can test each one as I finish it. I have a test data on the Report sheet of DANSYSX that I will use to progressively test each section.

The final section will just display the OutMat matrix which, by that time, will already be constructed by the procedures I have coded. OutMat is a convention I use for all macros that display a matrix.

I will have the crosstabulation saved in MatArray3, just to increase it's utility and make it compatible with the matrix functions I've programmed for DANSYS and DANSYSX.

I use a lot of spreadsheet functions in the program and that's a little tricky. The topic isn't covered in the programming guide, but you can find information online.

To use a spreadsheet function in a Basic routine, you must first declare an object variable to hold the function. You only have to declare it once since you can't use more than one spreadsheet function at a time. I use the same name, svc, and declare it as an object variable:

DIM svc AS OBJECT

Then the variable has to be set as a function variable. That only has to be done once, also.

svc=createUnoService("com.sun.star.sheet.FunctionAccess")

Uno is the object manipulation language used by LibreOffice. This statement invokes a Uno service called FunctionAccess.

After the variable is set, it can be used to call spreadsheet functions such as :

PCS=svc.CallFunction("CHIDIST",Array(CSQW,dfCS))

This statement calls the CHIDIST spreadsheet function. CHIDIST evaluates a chi square value at a specified degrees of freedom value. The CallFunction method for a function variable requires the name of the function as a string (in quotes) followed by the information to be transferred to the function in an array. The function requires a value and degrees of freedom (if you look the function up in the Calc Function Wizard, you can see the structure of the function) to be passed as an array. Even if you pass an array to a spreadsheet function, you have to use the Array statement followed by the name of the array in parentheses. You can see that here:

XMIN=SVC.CALLFUNCTION("MIN",ARRAY(Eij))

in a statement that finds the minimum value in the array Eij.

To get the return value, a variable, PCS, is set equal to the function variable.

One of the aggravating things about LibreOffice Basic is you can't return a matrix like this, which is why I went about programming a complete matrix language for DANSYS.

Displaying the results is easy. I just set the function to the output matrix, like:

XTab=OutMat.

There was some debugging that had to be done, but you can see the final product when I post DANSYSX in the Therian Timeline, here:

http://www.theriantimeline.com/excursions/labbooks

Note: DANSYSX version 1.0 is available now on the timeline.



No comments: