Friday, September 9, 2011

MCU linear regression implementation

Due to noise in the circuit and real-time fluctuations in the phenomenon being measured the readings from a sensor are unsteady and would jump around from millisecond to millisecond just like the day to day up and down variations in the index of trading bourses. However, in the long run the value does show a trend as the variable being measured increases or decreases or remains relatively steady. Thus I'm trying out linear regression--fitting a line through the mess of data points--to tease out this trend. I've used the least squares method since that's what I'm familiar with (actually it's been years and had to hit the math textbooks to refresh my memory!).

What I need to know during every sensor read is the trend at that point. But of course I need some historical readings to compute for it. So I take the last n readings and use those to derive the line of best fit through the n data points. I only need to know the slope of the line (the y-intercept, i.e., the b in ax+b, is irrelevant). If the slope is positive then this implies the value being sensed is increasing, if it's negative it's decreasing, and if it's very close to zero then it's plateaued.

In the application that I have sensor readings are taken at fixed time intervals. The last n sensor readings are stored in an array. During every sensor read the slope is calculated. Because we're always taking the last n readings every time we compute for the slope what we're doing is analogous to computing for a moving average. In this case we can call it a moving linear regression. 

The slope of the line of best fit is given by

(nΣxy - ΣxΣy) / [nΣx2 - (Σx)2]
where

x = point in time
y = sensor reading

y is easy enough to understand. That's just the ADC reading of the analog sensor output. x, as I'm using it here, needs a little more clarification. Let's say our oldest reading of n readings is in cell y[1] and the latest reading is in y[n]. y[i] was taken at time x[i] . But x[i] does not store actual clock time, but rather normalized time. Thus x[1] = 1, x[2] = 2, ...., x[n] = n. The values of x are fixed even as we discard the old y value and shift in the latest. Moreover, the actual time interval between reads isn't or need not be reflected in x as well. Thus the actual elapsed time between two readings may be 5ms, 48ms, 100ms, 2000ms or what have you. As long as that time interval is fixed for all readings then we're good.

Because the values of x never change, the above equation can be greatly simplified. The formula for an arithmetic progression Σx = n(n+1)/2. Hence the least squares equation can be rewritten as

[nΣxy - n(n+1)(Σy)/2] / [nΣx2 - n2(n+1)2/4]

n is common to both the numerator and denominator and so cancel out. We're left with

[Σxy - (n+1)(Σy)/2] / [Σx2 - n(n+1)2/4]

At this point we can provide the firmware the value of n and let it do the calculations, but I'm using an MCU with only 2K of flash. And because I don't want to slow the processor down with unnecessary math computations--whose answers we already know beforehand because n is specified in the defines--I've simplified the formula further still.

Because the 8-bit PICs have very limited RAM and since the rest of the firmware also needs a lot of memory, I settled on n = 15 (I initially tried n = 31 but ran out of RAM). This number is based on the criterion n = 2i- 1, where i = integer, so that n+1 = 2i. The rationale for this constraint is that multiplying by 2i requires nothing more than left shifting the register bits. This reduces computing time and may also reduce code length. With n = 15 the slope of the line simplifies to

[Σxy - (15+1)(Σy)/2] / [Σx2 - 15(16)2/4]

[Σxy - 8(Σy)] / [Σx2 - 960]

Because x = 1, 2, ... n, Σx2 = 1 + 4 + 9 + .... + 225 = 1,240. So the above further simplifies to

[Σxy - 8(Σy)] / [1,240 - 960]

(Σxy - 8Σy) / 280

Keep in mind the above equation is valid only for n = 15. For other values of n the constants will be different.

Here a C implementation of the above equation. It's in mikroC v.5.0.0. Because there are type conversions--integer to floating point and back--it's important to get type casting correctly or you'll get garbage. If in doubt cast the variables explicitly.

Given n = 15, the latest sensor reading is stored in Y[15]. The 15 historical readings are in Y[0] to Y[14]. These 15 values are the ones used in the equation. Y[15] will be copied into Y[14] and used in the next cycle through the state machine (not shown below). Because I want to see the value of the slope, I've added  variable INTSLOPE. To reduce complications of deriving this from the floating point variable SLOPE when it's negative, the firmware computes for (Σxy - 8Σy) such that it's always positive. Because the slope can be and may often lie between -1 and +1, SLOPE is multiplied by 1000 before it's converted to an integer type and stored in INTSLOPE. This provides a resolution of 3 decimal places. For instance if INTSLOPE = 3571 then the actual slope = 3571/1000 = 3.571. The 16-bit INTSLOPE and the sign (plus or minus) of the slope are then transmitted out of the EUSART TX pin so I can view the values on a logic analyzer.



// MCU = PIC12F18xx or PIC16F18xx
// compiler = mikroC v5.0.0
// September 2011


#define  n                   15        // number of data points for linear regression
#define  positive            1         // sign of slope
#define  negative            0         // sign of slope

void LinReg()
{
  unsigned char i;           // counter
  unsigned int Y[n+1];       // newest reading is temporarily stored in Y[n]
                             // and then copied to Y[n-1]
                             // oldest reading is in Y[0] 
  unsigned int SUMY;         // sum of all Y
  unsigned int SUMXY;        // sum of all X*Y
  bit   sign;                // 1 = slope is positive; 0 = slope is negative
  float SLOPE;               // linear regression value of the slope
  unsigned int INTSLOPE;     // integer version of SLOPE. Actually equal to 1000*SLOPE

  Y[n] = CURR;               // CURR = latest sensor reading
  SUMXY = SUMY = 0;
  for (i=0; i<n; i++)
  {
    SUMXY += Y[i]*(i+1);
    SUMY += Y[i];
    Y[i] = Y[i+1];           // discard oldest reading and left-shift all readings including latest one
  }

  SUMY *= 8;
  if (SUMXY >= SUMY)
  {
    SUMXY -= SUMY;
    sign = positive;
  }
  else
  {
    SUMXY = SUMY - SUMXY;
    sign = negative;
  }
  
  SLOPE = SUMXY/280.0;       // this particular equation is valid only for n = 15
                             // SLOPE will always be positive.
                             // Check variable "sign" to determine if slope is negative or positive
  INTSLOPE = SLOPE*1000.0;   
  
  // transmit 16-bit slope and its sign via EUSART
  // EUSART is set up in a separate initialization routine
  while (!PIR1.TXIF);
  TXREG = sign;              // sign byte; 1 = positive slope, 0 = negative slope

  asm {nop}                  // datasheet says TXIF will give false value if read right after loading TXREG
  while (!PIR1.TXIF);
  TXREG = INTSLOPE >> 8;     // high byte

  asm {nop}                  // datasheet says TXIF will give false value if read right after loading TXREG
  while (!PIR1.TXIF);
  TXREG = INTSLOPE;          // low byte
}

----


IMPORTANT ERRATA
September 10 2011

I was fiddling with the circuit and found the following needed to be changed:

unsigned int SUMY;         // sum of all Y
unsigned int SUMXY;        // sum of all X*Y

to

unsigned long SUMY;         // sum of all Y
unsigned long SUMXY;        // sum of all X*Y

16 bits isn't enough when using a 10-bit ADC. SUMXY can exceed 0xFFFF. While Σy will not exceed 16 bits, 8Σy may. So to make sure, I changed both to 32-bit integers.

For a 10-bit ADC, each Y[i] will not exceed 16 bits of course. However, I found out after hours of going nuts that mikroC does not like the following expression/assignment

SUMXY += Y[i]*(i+1);

because, as far as I can tell, it goes bonkers due to the fact that SUMXY is 32 bits while Y[i] is 16 bits and i is 8 bits. To prevent SUMXY from ending up with garbage I had to change the above to

SUMXY = SUMXY + Y[i]*(i+1);

1 comment: