Without any explanation...I just didn't want to forget it. More to follow when I get the graphing stuff finished. Served as a good opportunity to explore numpy in more detail. No effort was made to simplify it down further. I will be adding shapefile reading capabilities as well. Just open it and run it...a simple data set is contained within. Cross-posted from:

https://geonet.esri.com/blogs/dan_patterson/2014/08/18/correlation-and-regression-using-numpy

'''

correlation.py

Author: Dan.Patterson@carleton.ca

Purpose:

calculates the correlation coefficient and regression parameters for simple

correlation using numpy

'''

import numpy as np

def correlation(xs,ys):

s_x = np.std(xs, ddof=1, dtype=np.float64)

s_y = np.std(ys, ddof=1, dtype=np.float64)

covar = np.cov(xs,ys)[0][1]

r = covar/(s_x * s_y)

return r

def regress1D(xs, ys, dim=1):

'''simple first-order least squares regression in the form y=mx+b'''

coeff = np.polyfit(xs,ys,dim)

polynomial = np

polynomial = np.poly1d(coeff)

y_cal = polynomial(xs)

return [coeff, y_cal]

if __name__ == "__main__":

xs = [0,1,2,3,4,5,6,7,8,9,10]; ys = [0.2,0.7,2.7,2.6,4.1,5,5.7,7.3,7.9,9.1,9.8]

r = correlation(xs,ys)

coeff, y_cal = regress1D(xs,ys,1)

text ='y = %.3fx + %.3f' % (coeff[0],coeff[1])

print "\nPearson's r: ", r

print "Equation (y=mx+b): ", text

# ObiDanGIS's Geomatics Blog

This is Dan Patterson's Geomatics blog. ObiDan is a nickname given to me by my former students. I will post ramblings about GIS and links that I find relevant to the student population and my former student-friends. Initially, I will focus on code samples that will be useful to students in the upcoming academic year. Feel free to contact me about any of the content or for help. ObiDan (ObiDanGIS@gmail.com)

## Tuesday, 19 August 2014

### Collections in numpy: producing frequency distributions and graphing

A simple verbose demo again, so that I don't forget. The script is just pasted here until the python format and copying with IE11 is fixed. The script is pretty self explanatory and no effort has been made to simplify it. Cross-posted at

https://geonet.esri.com/blogs/dan_patterson/2014/08/18/collections-in-numpy-producing-frequency-distributions-and-graphing

'''

CollectionsDemo.py

Author: Dan.Patterson@carleton.ca

Purpose:

To demonstrate the utility of using the collections module to

obtain a simple frequency distribution, in this case, a list

of random integers. It is written verbosely so that the user

can see the sequence of events and the results of the various

methods.

A sequence of random numbers is generated and a dictionary of

key:values is produced by collections.Counter. The resultant

keys are cloned to "classes" to prevent alteration of the

initial keys. An extra class is appended to the list to ensure

that the last class in the keys is included since the behaviour

of histogram is to combine the last two classes into

one frequency (long story). I just add a value of 1 to the last

class to produce an extra bin.

A histogram is produced which contains the classes and the frequency

for those classes.

'''

import collections

import random

import numpy as np

from matplotlib import pyplot as plt

rand_int = [random.randrange(1,6) for i in xrange(0,25)]

dict = collections.Counter(rand_int)

keys = dict.keys()

counts = dict.values()

classes = keys[:] #clone the keys

classes.append(classes[-1] + 1) #to ensure that the last bin has values

histo = np.histogram(rand_int,classes)

plt.hist(rand_int,bins=classes,align='left')

plt.title("Sample Histogram", loc='center')

plt.xlabel("class"); plt.ylabel("frequency")

plt.show()

print "\nCollections and pylab\n"

print "Random integers:\n", rand_int

print "Collections dict: ", dict

print " keys: ", keys

print " values (freq): ", counts

print "histogram ", histo

print " classes: ", histo[1]

print " frequency: ", histo[0]

https://geonet.esri.com/blogs/dan_patterson/2014/08/18/collections-in-numpy-producing-frequency-distributions-and-graphing

'''

CollectionsDemo.py

Author: Dan.Patterson@carleton.ca

Purpose:

To demonstrate the utility of using the collections module to

obtain a simple frequency distribution, in this case, a list

of random integers. It is written verbosely so that the user

can see the sequence of events and the results of the various

methods.

A sequence of random numbers is generated and a dictionary of

key:values is produced by collections.Counter. The resultant

keys are cloned to "classes" to prevent alteration of the

initial keys. An extra class is appended to the list to ensure

that the last class in the keys is included since the behaviour

of histogram is to combine the last two classes into

one frequency (long story). I just add a value of 1 to the last

class to produce an extra bin.

A histogram is produced which contains the classes and the frequency

for those classes.

'''

import collections

import random

import numpy as np

from matplotlib import pyplot as plt

rand_int = [random.randrange(1,6) for i in xrange(0,25)]

dict = collections.Counter(rand_int)

keys = dict.keys()

counts = dict.values()

classes = keys[:] #clone the keys

classes.append(classes[-1] + 1) #to ensure that the last bin has values

histo = np.histogram(rand_int,classes)

plt.hist(rand_int,bins=classes,align='left')

plt.title("Sample Histogram", loc='center')

plt.xlabel("class"); plt.ylabel("frequency")

plt.show()

print "\nCollections and pylab\n"

print "Random integers:\n", rand_int

print "Collections dict: ", dict

print " keys: ", keys

print " values (freq): ", counts

print "histogram ", histo

print " classes: ", histo[1]

print " frequency: ", histo[0]

## Sunday, 29 December 2013

### Nested Conditional Operators

From my last post....consider this...

>>> a = [1,2,3,4,5]

>>> for i in a: True if i < 3 else ("blah" if i == 4 else False)

...

True

True

False

'blah'

False

>>>

Which means you can extend the added function in Python 2.5 to include nested operators which is a handy function if you only have a few values to reclassify

So consider an integer field containing random numbers and an output text field, that you want to reclassify as

"True" if !Random! < 5 else "False"

would result in all the values in the Random field being reclassified to as either True or False. If we want to expand this reclassification to include more breaks, consider the following.

"True" if !Random! < 5 else ("sort of" if !Random! == 5 else "False")

which would result in values of 5 in Random being classified as "sort of" and values > 5 as "False"

>>> a = [1,2,3,4,5]

>>> for i in a: True if i < 3 else ("blah" if i == 4 else False)

...

True

True

False

'blah'

False

>>>

Which means you can extend the added function in Python 2.5 to include nested operators which is a handy function if you only have a few values to reclassify

So consider an integer field containing random numbers and an output text field, that you want to reclassify as

"True" if !Random! < 5 else "False"

would result in all the values in the Random field being reclassified to as either True or False. If we want to expand this reclassification to include more breaks, consider the following.

"True" if !Random! < 5 else ("sort of" if !Random! == 5 else "False")

which would result in values of 5 in Random being classified as "sort of" and values > 5 as "False"

## Saturday, 19 October 2013

### Conditional Expressions

Came across a couple of links that I will expand upon when I get some time.

http://docs.python.org/2/whatsnew/2.5.html#pep-308-conditional-expressions

http://forums.arcgis.com/threads/92241-In-line-if-statements-in-the-field-calculator-v10.0

Will greatly simplify field calculator expression.

http://docs.python.org/2/whatsnew/2.5.html#pep-308-conditional-expressions

http://forums.arcgis.com/threads/92241-In-line-if-statements-in-the-field-calculator-v10.0

Will greatly simplify field calculator expression.

## Saturday, 31 August 2013

### Circular mean for directional data

When working with angles or time (minutes, days, months or years) the mean (average) of a list of values can't be determined in the same fashion as we normally do. This example show one how to calculate the values of angles (azimuths specifically). The function can be used in any script once the CircularMean.py script is imported. I will get around to other descriptive measures for circular data.

'''

CircularMean

Author

Dan Patterson

obidangis@gmail.com

Dan_Patterson@carleton.ca

Date Aug 2013

Purpose:

To calculate the mean (average) for points along a circle using the

angular measure to these points from the center of the circle

Notes:

If the angles are given in degrees then there is no need to convert

If the units are time, such as hours (24 hour clock) or months (12) then

these need to be converted by

angle in degrees = time quantity / number of equal intervals

ie 0.5 = (12 hours / 24 hours per day)

or 180 degrees

required modules

arcpy, math

'''

def circularMean(angleList):

'''

angles to be given in degrees

math module needs to be imported in the main module

'''

cosList = []

cosSum = 0.0

sinSum = 0.0

for i in angleList:

theCos = math.cos(math.radians(float(i)))

theSin = math.sin(math.radians(float(i)))

cosSum += theCos

sinSum += theSin

N = len(angleList)

C = cosSum/N

S = sinSum/N

theMean = math.atan2(S,C)

if theMean < 0.0:

theMean += math.radians(360.0)

return theMean

if __name__ == "__main__":

import math

sampleData = [ [350,10],

[350, 5],

[355, 10],

[90,270],

[180,0],

[180,360],

[0,360],

[90,180,270]

]

print "\nCircular Mean Demo"

print "%20s %20s %20s" % ("angle List", "Mean radians", "Mean degrees")

for angleList in sampleData:

theMean = circularMean(angleList)

print "%20s %20.8f %20.8f" % (angleList, theMean, math.degrees(theMean))

'''

CircularMean

Author

Dan Patterson

obidangis@gmail.com

Dan_Patterson@carleton.ca

Date Aug 2013

Purpose:

To calculate the mean (average) for points along a circle using the

angular measure to these points from the center of the circle

Notes:

If the angles are given in degrees then there is no need to convert

If the units are time, such as hours (24 hour clock) or months (12) then

these need to be converted by

angle in degrees = time quantity / number of equal intervals

ie 0.5 = (12 hours / 24 hours per day)

or 180 degrees

required modules

arcpy, math

'''

def circularMean(angleList):

'''

angles to be given in degrees

math module needs to be imported in the main module

'''

cosList = []

cosSum = 0.0

sinSum = 0.0

for i in angleList:

theCos = math.cos(math.radians(float(i)))

theSin = math.sin(math.radians(float(i)))

cosSum += theCos

sinSum += theSin

N = len(angleList)

C = cosSum/N

S = sinSum/N

theMean = math.atan2(S,C)

if theMean < 0.0:

theMean += math.radians(360.0)

return theMean

if __name__ == "__main__":

import math

sampleData = [ [350,10],

[350, 5],

[355, 10],

[90,270],

[180,0],

[180,360],

[0,360],

[90,180,270]

]

print "\nCircular Mean Demo"

print "%20s %20s %20s" % ("angle List", "Mean radians", "Mean degrees")

for angleList in sampleData:

theMean = circularMean(angleList)

print "%20s %20.8f %20.8f" % (angleList, theMean, math.degrees(theMean))

## Saturday, 24 August 2013

### Segment Intersection Demo

An addendum to my previous post in the quest to find a solution to polygon diameter, antipodal distance and/or polygon fetch for concave polygons. This ditty checks to see if line segments intersect...may come in handy...read the header etc.

'''

SegmentIntersectionDemo.py

Author

Dan Patterson

obidangis@gmail.com

Dan_Patterson@carleton.ca

Purpose:

Determines whether two line segments intersect and returns it if it does

Detects parallel lines. This differs from line intersections, whereby the

point of intersection need not be on either line

Background:

http://gis.stackexchange.com/questions/69025/antipodal-distance-or-polygon \

-fetch-or-polygon-diameter-for-concave-polygons

'''

def SegmentIntersection(A,B,C,D):

#[read the point objects, A,B,C,D which are lists of X,Y coordinates

x0,y0 = A; x1, y1 = B; x2,y2 = C; x3,y3 = D

# calculate differences

Ax = x1-x0; Ay = y1-y0

Bx = x3-x2; By = y3-y2

Cx = x0-x2; Cy = y0-y2

# calculate the lengths of the two lines

len1 = math.sqrt(Ax*Ax + Ay*Ay)

len2 = math.sqrt(Bx*Bx + By*By)

# calculate angle between the two lines

dot = (Ax*Bx + Ay*By) #dot product

deg = dot/(len1*len2)

#if abs(angle) == 1: #lines are parallel, no intersection is possible

if(abs(deg) == 1):

return None

#find the intersection point (pt) between two lines

pt = [0,0]

div = By*Ax - Bx*Ay

ua = (Bx*Cy - By*Cx)/div

ub = (Ax*Cy - Ay*Cx)/div

pt[0] = x0 + ua*Ax

pt[1] = y0 + ua*Ay

# calculate the combined length of the two segments

# between pt- pnt0 and pt-pnt1

Ax = pt[0] - x0

Bx = pt[0] - x1

Ay = pt[1] - y0

By = pt[1] - y1

segmentLen1 = math.sqrt(Ax*Ax + Ay*Ay) + math.sqrt(Bx*Bx + By*By)

#print "segmentLen1 ", segmentLen1

# calculate the combined length of the two segments

# between Pt-p2 and Pt-p3

Ax = pt[0] - x2

Bx = pt[0] - x3

Ay = pt[1] - y2

By = pt[1] - y3

segmentLen2 = math.sqrt(Ax*Ax + Ay*Ay) + math.sqrt(Bx*Bx + By*By)

'''

if the lengths of both sets of segments are the same as

the lengths of the two lines the point is actually

on the line segment.

if the point isn't on the line, return None, else the intersection

'''

if(abs(len1-segmentLen1) > 0.0 or abs(len2-segmentLen2) > 0.0):

return None

else:

return pt

if __name__ == "__main__":

import math

print "\nSegment Intersection Demo...Tests to see if two line segments intersect"

dataList = [[[0.0,0.0],[1.0,1.0], [0.0,1.0],[1.0,0.0]],

[[0.0,0.0], [0.0,1.0],[0.0,1.0],[1.0,1.0]],

[[0.0,0.0], [0.0,1.0],[1.0,1.0],[0.0,1.0]],

[[0.0,0.0],[0.25,0.75],[0.0,1.0],[1.0,1.0]]

]

print dataList[0]

for pnts in dataList:

A,B,C,D = pnts

pt = SegmentIntersection(A,B,C,D)

print "\nintersection point = ", pt, " for pnts: ",A,B,C,D

'''

SegmentIntersectionDemo.py

Author

Dan Patterson

obidangis@gmail.com

Dan_Patterson@carleton.ca

Purpose:

Determines whether two line segments intersect and returns it if it does

Detects parallel lines. This differs from line intersections, whereby the

point of intersection need not be on either line

Background:

http://gis.stackexchange.com/questions/69025/antipodal-distance-or-polygon \

-fetch-or-polygon-diameter-for-concave-polygons

'''

def SegmentIntersection(A,B,C,D):

#[read the point objects, A,B,C,D which are lists of X,Y coordinates

x0,y0 = A; x1, y1 = B; x2,y2 = C; x3,y3 = D

# calculate differences

Ax = x1-x0; Ay = y1-y0

Bx = x3-x2; By = y3-y2

Cx = x0-x2; Cy = y0-y2

# calculate the lengths of the two lines

len1 = math.sqrt(Ax*Ax + Ay*Ay)

len2 = math.sqrt(Bx*Bx + By*By)

# calculate angle between the two lines

dot = (Ax*Bx + Ay*By) #dot product

deg = dot/(len1*len2)

#if abs(angle) == 1: #lines are parallel, no intersection is possible

if(abs(deg) == 1):

return None

#find the intersection point (pt) between two lines

pt = [0,0]

div = By*Ax - Bx*Ay

ua = (Bx*Cy - By*Cx)/div

ub = (Ax*Cy - Ay*Cx)/div

pt[0] = x0 + ua*Ax

pt[1] = y0 + ua*Ay

# calculate the combined length of the two segments

# between pt- pnt0 and pt-pnt1

Ax = pt[0] - x0

Bx = pt[0] - x1

Ay = pt[1] - y0

By = pt[1] - y1

segmentLen1 = math.sqrt(Ax*Ax + Ay*Ay) + math.sqrt(Bx*Bx + By*By)

#print "segmentLen1 ", segmentLen1

# calculate the combined length of the two segments

# between Pt-p2 and Pt-p3

Ax = pt[0] - x2

Bx = pt[0] - x3

Ay = pt[1] - y2

By = pt[1] - y3

segmentLen2 = math.sqrt(Ax*Ax + Ay*Ay) + math.sqrt(Bx*Bx + By*By)

'''

if the lengths of both sets of segments are the same as

the lengths of the two lines the point is actually

on the line segment.

if the point isn't on the line, return None, else the intersection

'''

if(abs(len1-segmentLen1) > 0.0 or abs(len2-segmentLen2) > 0.0):

return None

else:

return pt

if __name__ == "__main__":

import math

print "\nSegment Intersection Demo...Tests to see if two line segments intersect"

dataList = [[[0.0,0.0],[1.0,1.0], [0.0,1.0],[1.0,0.0]],

[[0.0,0.0], [0.0,1.0],[0.0,1.0],[1.0,1.0]],

[[0.0,0.0], [0.0,1.0],[1.0,1.0],[0.0,1.0]],

[[0.0,0.0],[0.25,0.75],[0.0,1.0],[1.0,1.0]]

]

print dataList[0]

for pnts in dataList:

A,B,C,D = pnts

pt = SegmentIntersection(A,B,C,D)

print "\nintersection point = ", pt, " for pnts: ",A,B,C,D

## Monday, 19 August 2013

### Antipodal Distance Demo

This is a variant of the antipodal distance which allows you to isolate points pairs within a convex polygon forming the longest path from one side to the other side of a polygon without following the polygon edge. If this algorithm were applied to concave polygons, then there would be no guarantee that the formed vector would be fully contained within the polygon.

Other variants of the antipodal distance can consider travel along the polygon edges, and there are many examples on the web. Enjoy

'''

AntipodalDistDemo.py

Author

Dan Patterson

obidangis@gmail.com

Dan_Patterson@carleton.ca

Purpose:

To calculate the largest distance between the points the form

a convex hull.

Requirements:

- A list of list of xy pairs eg [ [1,1],[3,4,[2,6], [1,1] ]

first and last points must be the same to close the polygon

- numpy

Limitations:

Does not account for donuts or multi-part polygons

'''

#-------------------------------------------------------------------

def antipodalDistance(pnts):

'''

Calculates the largest distance between the points

that make up a convex hull

Requires: a list of list of xy pairs eg [[1,1],[3,4,[2,6],[1,1]]

'''

N = len(pnts)

#thePairs = itertools.combinations(range(N), 2) #pairs of numbers in the range N

distList = []

edgeList = []

dmax = 0.0

for i in range(0,N-2):

for j in range(i+1, N-1):

dist = numpy.math.hypot(pnts[i][0] - pnts[j][0], pnts[i][1] - pnts[j][1])

diff = abs(i - j)

if (diff == 1) or diff == abs(N-2) :

edgeList.append( [dist, [i,j]] )

else:

distList.append( [dist, [i,j]] ) # compute the lower triangle of the matrix

#distList.append( [dist, [j,i]] ) # form the upper triangle of the matrix

distList.sort()

distList.reverse()

edgeList.sort()

dmax = distList[0]

return [dmax, distList, edgeList]

#---------------------------------------------------------

if __name__ == "__main__":

import numpy

pnts = [ [0,0], [0,1], [1,4],[3,5],[5,4],[4,1],[0,0] ] # a closed-loop convex polygon

#pnts = [ [0,0], [2,1], [1,4],[3,5],[5,4],[4,1],[0,0] ] # a closed-loop concave polygon

dmax, distList, edgeList = antipodalDistance(pnts)

print "\ndmax and index pair ", dmax

print "\nDistList"

for i in distList:

print i

print "\nedgeList:"

for i in edgeList:

print i

Other variants of the antipodal distance can consider travel along the polygon edges, and there are many examples on the web. Enjoy

'''

AntipodalDistDemo.py

Author

Dan Patterson

obidangis@gmail.com

Dan_Patterson@carleton.ca

Purpose:

To calculate the largest distance between the points the form

a convex hull.

Requirements:

- A list of list of xy pairs eg [ [1,1],[3,4,[2,6], [1,1] ]

first and last points must be the same to close the polygon

- numpy

Limitations:

Does not account for donuts or multi-part polygons

'''

#-------------------------------------------------------------------

def antipodalDistance(pnts):

'''

Calculates the largest distance between the points

that make up a convex hull

Requires: a list of list of xy pairs eg [[1,1],[3,4,[2,6],[1,1]]

'''

N = len(pnts)

#thePairs = itertools.combinations(range(N), 2) #pairs of numbers in the range N

distList = []

edgeList = []

dmax = 0.0

for i in range(0,N-2):

for j in range(i+1, N-1):

dist = numpy.math.hypot(pnts[i][0] - pnts[j][0], pnts[i][1] - pnts[j][1])

diff = abs(i - j)

if (diff == 1) or diff == abs(N-2) :

edgeList.append( [dist, [i,j]] )

else:

distList.append( [dist, [i,j]] ) # compute the lower triangle of the matrix

#distList.append( [dist, [j,i]] ) # form the upper triangle of the matrix

distList.sort()

distList.reverse()

edgeList.sort()

dmax = distList[0]

return [dmax, distList, edgeList]

#---------------------------------------------------------

if __name__ == "__main__":

import numpy

pnts = [ [0,0], [0,1], [1,4],[3,5],[5,4],[4,1],[0,0] ] # a closed-loop convex polygon

#pnts = [ [0,0], [2,1], [1,4],[3,5],[5,4],[4,1],[0,0] ] # a closed-loop concave polygon

dmax, distList, edgeList = antipodalDistance(pnts)

print "\ndmax and index pair ", dmax

print "\nDistList"

for i in distList:

print i

print "\nedgeList:"

for i in edgeList:

print i

Subscribe to:
Posts (Atom)