Math for Engineers

This file documents the Math for Engineers work I've been doing with Squeak.

To jump right in, file in the following change sets IN THE FOLLOWING ORDER:
MathDD1.cs
MathDD2.cs
PointDD1.cs
PointDD2.cs
CollectionDD.cs
CollectionStatistics.cs
CollectionReindexed.cs
StringsAsNumbers.cs
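
If you would rather not do that by hand, a workspace snippet along these lines should work. It is only a sketch: it assumes the eight files sit in the image's default directory, and it relies on FileStream class>>fileIn:, which is present in the Squeak images I know of but may be named differently in yours.

```smalltalk
"File in the change sets in dependency order.
 Assumes the files live in the default directory."
#('MathDD1.cs' 'MathDD2.cs'
  'PointDD1.cs' 'PointDD2.cs'
  'CollectionDD.cs' 'CollectionStatistics.cs'
  'CollectionReindexed.cs' 'StringsAsNumbers.cs')
    do: [:fileName | FileStream fileIn: fileName]
```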

Background

My education is actually as a Mechanical Engineer. I was fortunate enough, however, to get involved with Smalltalk early in my undergraduate years, and found that the ST environment was ideal for an engineer. I had done quite a bit of Fortran and a little C, and was just amazed at how easy it was to transcribe engineering ideas into simulation and problem solving using the Smalltalk environment. One of the most powerful things was being released from the doldrums of typed number systems, especially for the everyday little engineering tasks. It is not every day that engineers sit down and simultaneously solve systems of 47 equations, or other scary matrix stuff, yet this is all most "math libraries" ever seem to do for you. It was so nice to sit down in Smalltalk with a little workspace, open a file, parse out some numbers, and do some simple calculation (like computing the standard deviation) without having to compile and link a program. Where I really began to fall in love with ST was when I realized how easily I could extend the mathematical framework.

The work contained in these change sets is a more refined version of snippets I have been carrying around with me in VisualWorks, at school and at two companies. In fact, much of it was ported from VisualWorks (yes, I will make a VW version of the Squeak polish available soon).

Double Dispatching

The first change I had to make to the Squeak environment was to make it use double dispatching for the four basic mathematical operations (+,-,/,*). Double dispatching is a pattern or mechanism by which an object receiving a message can further refine the semantics of the message by turning around and sending a message back to the argument with type information embodied in the message. For example, using double dispatching, the method Float>>+ will look something like:
 
+ aNumeric
    ^aNumeric sumFromFloat: self
Different kinds of numerics (Integer, Fraction, Float, etc.) can then implement the sumFromFloat: message in a way that is appropriate for that combination of numeric types.
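
To show both halves of the round trip without touching the real number classes, here is a toy version in change-set (chunk) format. Everything in it is hypothetical (SketchWhole, SketchReal, amount, sumFromWhole:, sumFromReal:); it illustrates the pattern and is not the code in MathDD1.cs.

```smalltalk
"Toy double dispatch in chunk format. All names are hypothetical."!

Object subclass: #SketchWhole
	instanceVariableNames: 'amount'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'DD-Sketch'!

Object subclass: #SketchReal
	instanceVariableNames: 'amount'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'DD-Sketch'!

!SketchWhole methodsFor: 'sketch'!
amount
	^amount!
amount: aNumber
	amount := aNumber!
+ other
	"First dispatch: tell the argument what kind of receiver it met."
	^other sumFromWhole: self!
sumFromWhole: aWhole
	"Second dispatch, whole + whole: the result stays whole."
	^SketchWhole new amount: aWhole amount + amount!
sumFromReal: aReal
	"Second dispatch, real + whole: promote to a real."
	^SketchReal new amount: aReal amount + amount asFloat! !

!SketchReal methodsFor: 'sketch'!
amount
	^amount!
amount: aNumber
	amount := aNumber!
+ other
	"First dispatch: tell the argument what kind of receiver it met."
	^other sumFromReal: self!
sumFromWhole: aWhole
	"Second dispatch, whole + real: promote to a real."
	^SketchReal new amount: aWhole amount asFloat + amount!
sumFromReal: aReal
	"Second dispatch, real + real."
	^SketchReal new amount: aReal amount + amount! !
```

After filing that in, evaluating ((SketchWhole new amount: 3) + (SketchReal new amount: 0.5)) amount in a workspace should answer 3.5. Neither + method ever asks what class its argument is; the second message send makes that decision.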

Two change sets actually make the change to the system: MathDD1.cs and MathDD2.cs. These two need to be filed in before anything else, and in that order. I had to do two change sets because the first one puts in all of the double dispatching methods, and the second one actually changes the core math methods to use them. Switching the core methods over before the first set was in place in its entirety would create a situation where adding two integers raised a message not understood, which is not a good thing.

Points

The next thing I did was to change Points to take advantage of the double dispatching methods. The next two change sets are PointDD1.cs and PointDD2.cs. In addition to making points interface with single numerics using DD, I also cleaned up many of the derivative methods, such as //, \\, quo:, rem:, etc. The original implementations kind of assume that Point is at the top of some sort of coercion hierarchy, and that was not what I had in mind.

Collections

Up till now, nothing in your system has really changed. Now the fun begins. When doing a lot of engineering work, I noticed that it was a very common pattern that I would take two like-sized collections and add (or multiply, subtract, divide) their respective elements. At first, in my naivety, I created a subclass of Array called NumericArray. I wrote the appropriate asNumericArray and asArray messages and proceeded to litter my code with typecast-like messages. When I discovered that it's quite inefficient to copy large arrays just so you can send a different message to them, I got smart and made NumericArray a wrapper-like object (this got rid of NumericCollection, and numerous Numeric* thingies that had followed as well). Still, I found myself putting these cast-like methods everywhere. So I said heck with it, added the numeric protocol to collections and sequenceable collections, and made them DD with other numeric objects. Here are some examples:
#(1 2 3) + 4.0 -> #(5.0 6.0 7.0)
#(3 2 1) * #(4 5 6) -> #(12 10 6)
4@1 / #(2 4) -> #((2@(1/2)) (1@(1/4))) "not really legal syntax"
2 - #(-2 0 2) asSet -> #(4 2 0) asSet
#(5 6 7) - #(1 2 3) asSet -> Error "cannot mix unordereds with anything but single numbers"
To get this stuff, file in the CollectionDD.cs change set. At least at my places of employ and school, I have found this to be immensely beneficial, in particular because you can build more complex algorithms on them. Which brings us to...
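
For comparison, here is roughly what the first two examples stand for when written with stock enumeration messages only. This is a sketch of the equivalent loops, not the implementation in CollectionDD.cs; it uses nothing beyond collect: and with:collect:.

```smalltalk
| sums products |
"A single number on one side: apply the operation to every element."
sums := #(1 2 3) collect: [:each | each + 4.0].
"Two like-sized sequences: combine corresponding elements."
products := #(3 2 1) with: #(4 5 6) collect: [:a :b | a * b].
```

Here sums comes out as #(5.0 6.0 7.0) and products as #(12 10 6), matching the shorthand above.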

Statistic Type Stuff

Which is very easy to build up and nice to have in conjunction with the numerical stuff. The CollectionStatistics.cs change set adds things like max, min, and sum to collections. Once you have these, methods like range, average, deviation, variance, etc. get easy to add. Take for example plotting collections of x and y values in a given box on the screen. Sooner or later, you have to compute the actual pixel points from the x and y values. Using the collection stuff, it might look something like this:
 
(xVals - xVals min / xVals range) @
    (1 - (yVals - yVals min / yVals range))
    * box extent + box origin
Which is much more concise than the loops and vars one might expect otherwise.
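
As a rough illustration of how the derived statistics fall out of a sum, here is a workspace sketch using only inject:into:, so it does not depend on the change set. The data is made up, and dividing by the element count (population variance) is my assumption here; the change set's own deviation and variance may well divide by one less.

```smalltalk
| data sum average variance deviation |
data := #(2 4 4 4 5 5 7 9).
sum := data inject: 0 into: [:acc :each | acc + each].
average := sum / data size.
"Population variance: the mean squared distance from the average."
variance := (data inject: 0 into: [:acc :each |
    acc + (each - average) squared]) / data size.
deviation := variance sqrt.
```

For this data the sum is 40, the average 5, the variance 4, and the deviation 2.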

Reindexing Collections

One of the things that we've found ourselves doing a lot of lately is wanting to enumerate over every nth element of a sequence. This is particularly applicable in image processing, where colors are usually stored as repeating RGB values. For example, we may want to compute the standard deviation or average of the red content of an image. ReindexedCollections are collection wrappers which change the indexing scheme. They actually abstract and generalize the concept of reverseDo:. We don't send that message any more, we just send something like:
(sequence by: -1) do: [:each | ...work, work, work...]
The change set CollectionReindexed.cs has this stuff in it.
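
To make the every-nth idea concrete, here is the red-channel average written with a plain index loop, which is the sort of bookkeeping a reindexing wrapper is meant to hide. The pixel values are made up, and nothing here comes from CollectionReindexed.cs.

```smalltalk
| pixels redSum redCount redAverage |
"Flat R G B R G B ... data; the values are invented for the example."
pixels := #(10 200 30  20 210 40  30 220 50).
redSum := 0.
redCount := 0.
"Visit indices 1, 4, 7: the red component of each pixel."
1 to: pixels size by: 3 do: [:index |
    redSum := redSum + (pixels at: index).
    redCount := redCount + 1].
redAverage := redSum / redCount.
```

The three red values are 10, 20 and 30, so redAverage is 20.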

Perl Too (OK, not completely)

It has been my experience that when you add rather generic behaviors to rather standard objects, some people freak out and start tirading about errors being harder to detect because they're "one off" type errors. In the seven years of evolving this simple stuff, I've only had that happen once. It occurred because I had read a number string from an input field and forgotten to convert it to a number before mathematically combining it with another number, which resulted in the characters being modified, because Strings are SequenceableCollections.

I remember when someone was showing me Perl once, I thought it was really cool that you could add a string representation of a number to a normal number and get the resultant number. It seemed ideal for numeric parsing/scripting. So I've done something like that in StringsAsNumbers.cs to avoid ever having the above problem again. Double dispatching is used to actually treat the string as the number it represents and mix it with other numeric types (or other strings!). It's actually kind of cool, because I don't really worry about converting strings I've read if all that I'm going to do is start crunching numbers with them, because that will happen automatically.

#('3' 4 '5.0') + '1' -> #('4' 5 '6.0')
The only thing lacking in the StringsAsNumbers change set is that it uses the default asNumber message for Strings to extract the number. Unfortunately, 'abc' asNumber will return zero rather than raising a conversion error, which in this case I think it should. Eventually, I will get around to writing an asNumberOnError: message to use. But I was kind of waiting to see what would come in the way of exception handling in the future.
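
In the meantime, one possible shape for such a check, written as workspace blocks so it changes nothing in the image. Both blocks (looksNumeric, strictNumber) are hypothetical, and the accepted format is deliberately narrow: an optional leading minus, digits, and at most one period with a digit on each side.

```smalltalk
| looksNumeric strictNumber |
"Hypothetical validity test; rejects anything asNumber might quietly guess at."
looksNumeric := [:aString | | body |
    body := (aString notEmpty and: [aString first = $-])
        ifTrue: [aString allButFirst]
        ifFalse: [aString].
    body notEmpty
        and: [(body occurrencesOf: $.) <= 1
        and: [(body detect: [:ch | ch isDigit not and: [ch ~= $.]]
                ifNone: [nil]) isNil
        and: [body first isDigit and: [body last isDigit]]]]].
"Convert only when the string passes the test; otherwise run the error block."
strictNumber := [:aString :errorBlock |
    (looksNumeric value: aString)
        ifTrue: [aString asNumber]
        ifFalse: [errorBlock value]].
```

With these, strictNumber value: '5.0' value: [nil] answers 5.0, while strictNumber value: 'abc' value: [#notANumber] runs the error block instead of answering zero.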

Future Work

This is not the end. I do have quite a few other things I plan on doing in the future, such as: