Showing posts with label statistics. Show all posts

Sunday, October 13, 2019

A step back

For those who have found my data analysis package, DANSYS, useful (and have probably also found that some of it doesn't work), I have taken the opportunity presented by the recent turmoil in my life to take a step back and go through all the functions and subroutines. I fixed several bugs and added error handling. Now, if you try to feed a function with data it doesn't allow, it won't crash and open the program editor. It will just sit there staring at you so you can check your input. I have also cleaned up some of my sloppy documentation in case you want to get into the IDE (that's where you actually develop the programs) and do some modifications.

I have the excuse that beta testing would have found these errors much sooner and I haven't had access to beta testers. For those who aren't familiar with the terminology, beta testing is when some people actually use a program that's under development and, when they find problems, or just want to recommend some improvements, they shoot a note to the programmer. I am, by the way, open to suggestions through comments to this blog.

In future months, I will be refurbishing DANSYSX, the user guides, and the LabBooks. I hope you like the changes. For those who haven't looked DANSYS over and want to, you can find it right here:

http://www.theriantimeline.com/excursions/labbooks

Indiana Jones was an archeologist so you can bet that he was into statistics. I've used statistics in some of my studies in these blogs. Data analysis is one of my favorite pastimes. My DANSYS user guides are not just manuals on how to use DANSYS. They also cover the statistics themselves. Check them out and see why I (and Indiana Jones) like statistics!

Monday, March 18, 2019


--- The Highline Canal: Complexity ---


Here are several sources of information about the Highline Canal.

https://www.denverwater.org/sites/default/files/LargeMapFINAL.pdf

https://www.denverwater.org/recreation/high-line-canal/guide-to-the-high-line-canal-trail

https://www.denverwater.org/recreation/high-line-canal

https://highlinecanal.org

One thing you will notice right off is how the canal meanders across the countryside for 71 miles.

Actually, that is not exactly correct. Rivers meander - the canal does...something else.

The reason a stream meanders is that various factors pull it in different directions - the same kind of factors that make a thin stream of water meander across a gently sloping pane of glass. The glass isn't perfectly smooth so the stream has to flow around the tiny imperfections. Also, the earth is rotating out from under the water as it flows, creating the Coriolis effect.

The South Platte River shows this kind of meandering as it flows through its valley near Denver and, even more, as it flows out onto the plains.

https://www.google.com/maps/place/South+Platte+River/@41.056048,-101.5642861,12z/data=!4m5!3m4!1s0x8776f2ebc5765be1:0xbe93203c9541fa6c!8m2!3d41.1143877!4d-101.4849727

The Highline Canal is very convoluted for another reason. It follows the topographic contours of the eastern rim of the valley cut by the South Platte River. It flows by gravity so its gradient must always be down, and the Highline flows downhill at a gradual 2 feet every mile. If you've ever looked at a topographical map, you know that it's rare that a contour ever follows a straight line.

By the way, THE Highline Canal is not the only highline canal in the Denver area. There is also the Farmer's Highline Canal that flows through Golden to Westminster and Thornton. There are also several other highline canals in Colorado and, certainly, many in the world. A highline canal is simply a canal that follows a topographical high line (contour).

The Highline Canal, at a stretch, could be called fractal. That would not be completely accurate because fractality implies repetition at different scales. If you magnify a design and it looks the same at each magnification, that would be fractality. The classic example is the Mandelbrot set.

In physics, scale is capped by the size of the universe at the top and quantum "graininess" at the bottom. There are no such limits in mathematics. You can always come up with a smaller number. So when I say that the Mandelbrot set has infinite detail, I mean it. You can keep magnifying the Mandelbrot set forever and you will still discover more detail.

Check out this series of magnifications of the Mandelbrot set.




[Image series: magnifications of the Mandelbrot set]

That last one....is that a tiny Mandelbrot set that I see? Why, yes...I believe it is!

There are, in fact, an infinite number of Mandelbrot sets embedded in any Mandelbrot set. There are also other designs like this one.



[Image: detail from the Mandelbrot set]

A shocking detail is how the Mandelbrot set is generated. It grows from this simple equation.

z_{n+1} = z_n^2 + c     [Equation for the Mandelbrot set]

This equation tells you to set c to some complex value and then calculate the result of the equation when z is 0+0i. Next place the result back into the equation as the new value of z and keep doing that.

Some values of c, for instance c=0+0i, just sit there quietly. In the case of 0+0i, each step just returns 0+0i. For some values, like 0.804608667883013+0.820834819347395i, the equation "explodes" on the 11th step and my spreadsheet can't even calculate the 12th step because the result is too large. On the other hand, 0.107533656386414+0.479408179270122i shows no sign of getting out of hand through 25 steps.

To create the Mandelbrot set, you just do that for every value of the complex plane and, if a point "explodes", graph it. You can use different colors to indicate how quickly it explodes. The points that don't explode are members of the Mandelbrot set.
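If you'd rather try the iteration in code than in a spreadsheet, here is a minimal sketch in Python. It is not part of DANSYS, and the function name `escape_step` is made up for this example. It also makes one assumption that isn't in the description above: instead of waiting for the numbers to overflow, it uses the common shortcut of calling a point "exploded" as soon as the size of z passes 2, so the step it reports will be earlier than the step where a spreadsheet gives up.

```python
def escape_step(c, max_steps=25, bailout=2.0):
    """Iterate z -> z*z + c starting from z = 0.

    Returns the first step at which |z| exceeds `bailout`,
    or None if that never happens within `max_steps` steps.
    The bailout radius of 2 is an assumption of this sketch.
    """
    z = 0 + 0j
    for step in range(1, max_steps + 1):
        z = z * z + c
        if abs(z) > bailout:
            return step
    return None


if __name__ == "__main__":
    # The three values of c discussed in the text.
    samples = [
        0 + 0j,
        complex(0.804608667883013, 0.820834819347395),
        complex(0.107533656386414, 0.479408179270122),
    ]
    for c in samples:
        result = escape_step(c)
        label = "no escape in 25 steps" if result is None else f"escapes at step {result}"
        print(c, label)
```

To draw the picture, you would run `escape_step` over a grid of c values and color each point by the step it returns, leaving the `None` points as the set itself.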

If you look at a map of the Highline Canal, a mile-long section doesn't look like a 10-mile-long section and there doesn't seem to be a copy of the whole canal hidden within a shorter segment, so it would be a stretch to call it "fractal". But is complexity conserved at different scales? In other words, is a 5 mile segment just as wiggly as a 20 mile segment?

Obviously, this could only go so far. A 5 foot section of the canal is not complex at all - it would look straight! But what about "reasonable" lengths - something you could see on a map?

There are measures of complexity that require a computer and a very complex equation to calculate - I wanted something a little simpler. I decided to use a measure that compares the "hiking distance" of a section with an "as the crow flies" distance for the same section. The Highline Canal trail has been measured - I'm going to assume that the trail mileposts were set up according to hikers with pedometers, but I don't know. I can determine straight line distances using the Google Maps measuring utility.

Now for the comparisons. It would be easy enough to just subtract the straight line distance from the much longer hiker's distance, but then I couldn't compare a 5 mile section with a 10 mile section. I needed a standardized measure. A ratio would do the trick so I used the ratio of the hiker's distance to the straight line distance.

Ratios are dimensionless. If the numerator and denominator of a ratio have the same units, the units cancel out. That's good for my purposes because a ratio of two measurements will be the same whether the units are feet, miles, or kilometers. Ratios don't depend on the scale of the units.
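The ratio described above is easy to sketch in Python. This is only an illustration, not the DANSYS routine; the name `wiggle_ratio` is made up, and the inputs are the whole-trail figures quoted in this post (71 miles walking, 30 miles as the crow flies). The second calculation converts both distances to kilometers to show that the units really do cancel.

```python
MILES_TO_KM = 1.609344  # one international mile in kilometers


def wiggle_ratio(path_length, straight_length):
    """Ratio of the walked distance to the straight line distance.

    Both arguments must be in the same units. The name is
    hypothetical, chosen for this example only.
    """
    if straight_length <= 0:
        raise ValueError("straight line distance must be positive")
    return path_length / straight_length


# Whole-trail figures from this post, in miles.
walk_miles, crow_miles = 71, 30

ratio_in_miles = wiggle_ratio(walk_miles, crow_miles)
ratio_in_km = wiggle_ratio(walk_miles * MILES_TO_KM, crow_miles * MILES_TO_KM)

if __name__ == "__main__":
    print(f"ratio using miles:      {ratio_in_miles:.2f}")
    print(f"ratio using kilometers: {ratio_in_km:.2f}")
```

A perfectly straight trail would give a ratio of exactly 1, and the ratio grows as the trail gets wigglier.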

My measurements are in miles, to the nearest mile. I'm taking measurements off maps that show the positions of trail mileposts so I can't be more accurate than a half mile; therefore, I round map distances to the nearest mile and I can't do any better than that in calculations. That's okay. It's good enough for my purposes.

I have a ratio for the entire trail, four for the quarters that I'll be walking this year, and 15 for the roughly 5 mile segments shown in the Highline Canal Conservancy maps. I've also calculated the arithmetic and geometric means and standard deviations for the quarters and segments, and I have a histogram for the ratios for the segments.

[Image: histogram of the segment ratios]

The reason for the histogram is that, if the distribution of the ratios looks close to normal (which it probably is not), I can put some stock in the statistics I've calculated. Luckily, the histogram shows a curve that could be mistaken for normal - it's probably binomial but I'll ignore that.

The Highline Canal is 71 miles long, walking distance. As the crow flies, it's only 30 miles long. The ratio is about 2. How similar are segments of the trail at different scales?

The quarter ratios range from 2 to 3 for an average of about 2 and a very similar geometric average. The calculated standard deviation is about 0.4. Smaller standard deviations mean less variation. If you select a 20 mile segment of the canal trail, there is roughly a 68% chance that its ratio will be within 0.4 of the unrounded average: somewhere between 1.9 and 2.7.

The 5 mile segments look very similar in terms of complexity. The average and geometric average ratio is close to 2 which is well within the expected complexity of the quarter sections and not far off from the ratio for the whole trail. The standard deviation is about 0.7 - more variation than what is seen in the quarter sections, but still not a lot.

What all these numbers mean is that, in terms of how much the canal and its trail meander, 5 mile segments look a lot like 20 mile sections, and the 20 mile sections look a lot like the whole trail.
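For anyone who wants to run the same kind of summary outside a spreadsheet, here is a rough sketch using Python's standard `statistics` module (`geometric_mean` needs Python 3.8 or later). The ratios in `example_ratios` are made-up placeholders, not my measurements, and the sketch assumes the sample standard deviation, which may not be the version a given spreadsheet function uses. The "one standard deviation" range only means about 68% if the ratios are close to normally distributed.

```python
from statistics import geometric_mean, mean, stdev


def summarize(ratios):
    """Arithmetic mean, geometric mean, sample standard deviation,
    and the range one standard deviation either side of the mean."""
    avg = mean(ratios)
    sd = stdev(ratios)  # sample standard deviation (an assumption)
    return {
        "mean": avg,
        "geometric_mean": geometric_mean(ratios),
        "stdev": sd,
        "one_sd_range": (avg - sd, avg + sd),
    }


# Hypothetical ratios for illustration only, NOT the measured values.
example_ratios = [2.0, 2.1, 2.3, 2.8]

if __name__ == "__main__":
    summary = summarize(example_ratios)
    low, high = summary["one_sd_range"]
    print(f"mean:           {summary['mean']:.2f}")
    print(f"geometric mean: {summary['geometric_mean']:.2f}")
    print(f"std deviation:  {summary['stdev']:.2f}")
    print(f"about 68% of similar segments between {low:.2f} and {high:.2f}")
```

The geometric mean is worth reporting alongside the ordinary average because the data are ratios; when the two come out close together, as they did for the trail, the ratios aren't badly skewed.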

If you walk the trail, you'll see what I mean. You can see that it's very "wiggly" from the maps. There are places where you can walk 10 minutes and find yourself very close to where you were 10 minutes ago.

Some parks have very convoluted walking trails simply to pack longer trails in smaller areas. Complexity can be useful. The Highline Canal is convoluted so that it can follow the terrain with a constant, very shallow downhill gradient.

On a surface, more complexity provides more surface area. Your small intestine membranes are very complex, packed with tiny fingers jutting into the inner space of your gut. They increase the surface area of the membrane that absorbs nutrients from your food. Chemical engineers who design catalysts for things like catalytic converters and chemical reactors want the catalysts to have as much surface area as possible so that as much of the reactants as possible can get to the catalyst.

If the Mandelbrot set could be translated into a physical object, it would have infinite surface area and would make a great catalytic surface or absorber. Of course, it can't, but you can see why fractal surfaces can be useful.

If you would like to explore fractals, there are many Mandelbrot viewers online such as the one at:

http://math.hws.edu/eck/js/mandelbrot/MB.html

or you can download the Mandelbrot explorer at

https://www.mandel.org.uk

There are many examples of fractals in nature. The way limbs split off a larger limb looks a lot like the way the larger limbs split off even larger limbs, and the way veins in leaves divide. Smaller sections of the swirl of sunflower seeds look like the whole swirl. The next time you walk, look for fractals on the trail. You might be surprised at how common these intricate patterns are.


Thursday, May 3, 2018

DANSYSX Version 2.0

DANSYSX Version 2.0 is up and can be downloaded here:

http://www.theriantimeline.com/DANSYSX.ods

with its user guide:

http://www.theriantimeline.com/DANSYSXGuide2.0.ods

This version adds correlation procedures, contour charts, phasor math, digraphs, chart axes, and more.

It's free, but you have to have LibreOffice Calc to use it (but that's free, too).

Wednesday, June 21, 2017

DANSYSX

DANSYSX Version 1.0 is finished and has been posted on the Timeline:

http://theriantimeline.com/DANSYSX.ods

This update includes extensive complex math and combinatorics functions, graphic routines that allow you to specify and generate graphic objects on the spreadsheet or programmatically, 2-way and 3-way crosstabulation routines, ntile conversion, and Monte Carlo generation of sampling statistics (a way to convert misbehaving statistics from non-normal distributions to more civilized values).

There's also a programming language that allows you to code directly on a spreadsheet. It looks sorta like assembly language and is more a toy for me than anything else and, at present, doesn't work well enough to actually use for anything. The looping structure doesn't work yet, but, if you want to play around with it, the macros are accessible.

I'm working on a user guide to the basic DANSYS and I'll be working on one for this update of DANSYSX. When those are done, I'll post them. In the meantime, all the functions are documented on the DANSYS Functions sheet.

Have fun!

Friday, April 21, 2017


--- First impressions ---

When I was in college, I enjoyed classes that were just helping a researcher with their research. Universities do that. It helps academicians publish instead of perish and students have fun and improve their grades. Everybody wins.

You don't believe it? Well, try it out. There is a place on the Internet where you can take part in active psychological research. It's called the Online Psychology Laboratory and it's right here.

http://opl.apa.org/Main.aspx

I was interested in a study called "First Impressions" so I signed up as a participant.

I... won't tell you about the experiment beforehand, just in case you want to participate.

Be sure and read the write-up under "Studies" after you do the experiment. If you're into statistics, you can download the data from various groups of other subjects and see if those results match what you expect. Feel free to use my statistics packages for LibreOffice Calc, DANSYS and DANSYSX at

http://www.theriantimeline.com/excursions/labbooks

(If DANSYSX is not up yet, it will be soon. It's a new expanded version of DANSYS.)

One of the big problems with doing psychological or social experiments as an individual is that they usually require groups of people to participate. Students usually have access to groups (their class, families that don't mind helping with homework, groups being served by special programs at colleges or universities, etc.), but strangers look askance at unassociated folks who start asking them to fill out surveys or answer more or less personal questions. You can do chemistry or physics experiments all day and no one really cares (except, maybe, your boss at work or your significant other - "You make messes all day but you won't carry out the garbage for nothing!"), but when you start asking weird questions, that's another matter.

The Online Psychology Laboratory will give you a taste of the real thing without danger of prosecution.

By the way, when you finish one of these experiments, see if you can catch the principle explored by the experiment at work when you're out people-watching!