Tuesday, May 28, 2013

Dot Plots with pgfplots

I did some more tweaking and came up with this:

Much better, huh?  Here's the code... if you use it, replace each "\\" with a single "\".
% Preamble: \\pgfplotsset{width=7cm,compat=1.8}
x tick label style={major tick length=0pt}
] \\addplot+[only marks,mark=*] plot coordinates
{(0,3) (0,2) (0,1) (0,0) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2) (2,3) (2,4) (3,0) (3,1) (3,2) (3,3) (4,0) (4,1) (4,2)};
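
Typing out a coordinate list like the one above by hand gets old fast. Here is a minimal Python sketch of how it could be generated from raw data instead. The function names and the sample data are my own for illustration (the data is chosen so the counts match the coordinates above); nothing here is part of pgfplots itself.

```python
from collections import Counter


def stacked_coordinates(values):
    """Return (x, y) pairs that stack one dot per observation above each x."""
    counts = Counter(values)
    coords = []
    for x in sorted(counts):
        # Dots that share an x value sit at heights 0, 1, 2, ...
        coords.extend((x, y) for y in range(counts[x]))
    return coords


def to_pgfplots(coords):
    """Format the pairs as text that can be pasted into `coordinates {...}`."""
    return " ".join(f"({x},{y})" for x, y in coords)


# Illustrative data: four 0s, three 1s, five 2s, four 3s, three 4s.
data = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4]
coordinate_text = to_pgfplots(stacked_coordinates(data))
print(coordinate_text)
```

The order of the points within a column differs from my hand-typed list, but for an "only marks" plot the order should not matter.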

Monday, May 27, 2013

Making Plots

As I mentioned before, I've been using Asymptote to generate geometrical figures, but I would also like to create nicely formatted plots - box plots, dot plots, and histograms - with minimal effort.  I found a few options:

  • Asymptote does have a graphing library, and it would be nice to not have to learn yet another tool.  However, I couldn't see how I could make it format things in the way I'd like.  For example, I'd like my histogram to have labels showing the range of values for each bin.  I think Asymptote is outstanding as a tool for general technical drawing, and it can certainly do graphing, but for creating histograms, bar charts, and dot plots, I think that specialized tools (such as those below) will be more effective.
  • http://matplotlib.org/ - a python library
  • http://pgfplots.sourceforge.net/ 
    • creates histograms, box plots (shown in section 5.9.1 of the manual and in this question on tex.stackexchange).  
    • Getting dotplots in the format I'd like will take some doing... you basically use the scatterplot function to stack dots on top of each other.  Should probably move the x-axis labels further down, remove the y-axis labels, and obviously add a point at (2,2), but you get the general idea: 
    • It is possible to create a PDF that contains just the plot, using the directions in 7.1.2 of the manual: "Using the Externalization Framework of PGF 'By Hand'."  
  • R - 
    • Easy to export R graphs to PDF or PNG, though I was thrown off by the fact that some code that works to export to a PNG when executing line-by-line doesn't work when executing within a loop (see the ggplot2 section here)
    • When I tried to make the image smaller, I ended up with a funny image where the fonts were disproportionately large.  I'm not sure how much effort would be necessary to fix this.  
    • The ggplot2 package does have a nice dotplot feature... with very little futzing I was able to produce this:
    • I want to write a loop that will create a whole bunch of random plots and write the related information (mean, median, range) to the database.  It took me a while to figure out how to configure the ODBC database connection to MySQL - you have to make sure that everything is either 32-bit or 64-bit, otherwise R throws the error: [RODBC] ERROR: state IM014, code 0, message [Microsoft][ODBC Driver Manager] The specified DSN contains an architecture mismatch between the Driver and Application.
Decision: In the end I think pgfplots is the quickest way to produce nicely formatted images of the desired size.
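
For the loop idea mentioned in the R section (generate a bunch of random plots and store the mean, median, and range for each), here is a rough Python sketch of the shape of it. To keep it self-contained I've swapped in an in-memory SQLite database for the MySQL/ODBC connection, and the table name, column names, and value range are all assumptions made up for the example.

```python
import random
import sqlite3
import statistics


def summarize(values):
    """Return the mean, median, and range (max - min) of a list of numbers."""
    return (
        statistics.mean(values),
        statistics.median(values),
        max(values) - min(values),
    )


def build_plot_table(conn, n_plots, points_per_plot, seed=0):
    """Generate random data sets and store each one with its summary stats."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    conn.execute(
        "CREATE TABLE plots ("
        "id INTEGER PRIMARY KEY, data TEXT, "
        "mean REAL, median REAL, value_range REAL)"
    )
    for plot_id in range(n_plots):
        values = [rng.randint(0, 4) for _ in range(points_per_plot)]
        mean, median, value_range = summarize(values)
        conn.execute(
            "INSERT INTO plots VALUES (?, ?, ?, ?, ?)",
            (plot_id, " ".join(map(str, values)), mean, median, value_range),
        )
    conn.commit()


# In-memory SQLite stands in for the real database here.
conn = sqlite3.connect(":memory:")
build_plot_table(conn, n_plots=5, points_per_plot=19)
rows = conn.execute(
    "SELECT id, mean, median, value_range FROM plots"
).fetchall()
for row in rows:
    print(row)
```

The stored data column could then be turned into plot coordinates at rendering time.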

Thursday, May 23, 2013

Bogus Criticism of the Common Core

Here's a popular post at The Atlantic, criticizing the Common Core math standards.  The author approvingly quotes an email from a math teacher:
I am teaching the traditional algorithm this year to my third graders, but was told next year with Common Core I will not be allowed to. They should use mental math, and other strategies, to add. Crazy! I am so outraged that I have decided my child is NOT going to public schools until Common Core falls flat.
They can't use the standard algorithm, but instead must resort to "mental math" and "other strategies"?  Hmmmm... let's take a look at the standards for third grade, available at www.corestandards.org:
CCSS.Math.Content.3.NBT.A.2 Fluently add and subtract within 1000 using strategies and algorithms based on place value, properties of operations, and/or the relationship between addition and subtraction.
Please notice that this does not forbid the "standard algorithm" for addition.  Rather, as I read it, this means that the standard algorithm can be taught, but it must be taught with reference to place value, properties of operations, and/or the relationship between addition and subtraction.  When we, for example, do \(17+24\), we recognize that 17 is composed of one ten and seven ones, while 24 is composed of two tens and four ones.  We can add numbers in whatever order we like without changing the result, but if we are doing the standard algorithm we begin by adding the four and the seven, \(4+7\), which gives us 11, or one ten and one one.  We then add the tens (the one ten from 17, the two tens from 24, and the one ten from 11): \(10 + 20 + 10 = 40\).  Adding the remaining one gives our final result, 41.  This can be shown in the standard algorithm using the standard way of writing it (with one number on top of the other, lining up the tens and the ones), but we don't use terms like "carry" and we make sure to remind students that the first 1 in eleven is not "1" but rather "1 ten."
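
The same place-value reasoning can be spelled out step by step. This is just an illustrative Python sketch of the idea for two-digit numbers; the function name is made up, and it is not taken from the standards.

```python
def add_by_place_value(a, b):
    """Add two non-negative whole numbers below 100 by ones and tens."""
    tens_a, ones_a = divmod(a, 10)  # 17 is 1 ten and 7 ones
    tens_b, ones_b = divmod(b, 10)  # 24 is 2 tens and 4 ones

    ones_sum = ones_a + ones_b  # 7 + 4 = 11
    # 11 is regrouped as 1 ten and 1 one.
    regrouped_tens, ones = divmod(ones_sum, 10)

    tens_total = tens_a + tens_b + regrouped_tens  # 1 + 2 + 1 = 4 tens
    return tens_total * 10 + ones  # 40 + 1


result = add_by_place_value(17, 24)
print(result)  # 41
```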

So, I don't think the teacher's complaint is even slightly legitimate.  However, I do think it points to a real question - will the Common Core be implemented correctly?  Will teachers get the misimpression that they must stop teaching the standard algorithm?  As far as I can tell, the mathematics core standards are far better than the prior standards, but will the standards be used in a way that will improve the quality of math education?

Wednesday, May 22, 2013

6th Grade Common Core Math Worksheets:

These are the ones I've put together so far:

  • Ratios & Proportional Relationships
    • A.1 Describe a ratio relationship between two quantities
    • A.2 Use rate language in describing ratio relationships
    • A.3 Solve problems using ratio and rate reasoning
  • The Number System
    • B.2 Divide multi-digit numbers
    • B.3 Add, subtract, multiply, and divide multi-digit decimals.
    • B.4 Find the GCF and LCM. Use the GCF to factor the sum of two integers.
  • Expressions & Equations
    • A.1 Expressions with whole-number exponents.
    • A.2 Write, read, and evaluate expressions in which letters stand for numbers.
    • A.3 Apply the properties of operations to generate equivalent expressions.
    • A.4 Identify when two expressions are equivalent.
    • B.5 Determine whether a given number makes an equation or inequality true.
    • B.6 Write expressions using unknown variables to solve real world problems
  • Geometry
    • A.1 Find the area of regular polygons
    • A.2 Find the volume of rectangular prisms.
The most current version can be found here: http://www.mathtestnow.com/problem_index.php

Making my site crawlable

My site does terribly when I google it.  I realized there are a few major problems:
  • I haven't done the extra work necessary to get Google to index AJAX content
  • Most of the site's content is behind the login, so the bots can't find it.
  • I use JavaScript redirects rather than <a href="URL">link</a>
  • Solution:
    • Make sure that all the pages I want indexed can be reached via a simple href link that does NOT require login.  
      • As they explain here, buttons that use JavaScript can be changed so that they are really links which are simply styled as buttons.
    • Create a large repository of static content that does not require a login.  Direct traffic to this.

Tuesday, May 21, 2013

Updates to MathTestNow

I haven't been doing regular updates on my progress with MathTestNow.com, but things are moving along:

  • Posted sample worksheets to TeachersPayTeachers.  I plan to make some free and some $1.
  • Got some really wonderful help from a guy at StackOverflow about how to make AngularJS and MathJax play nicely together.
  • Had a friend who is a math teacher take a look at the website - made me realize that I had a number of issues with the user interface on the main page.
  • Creating more question types and improving existing ones.
  • Creating sample worksheets for each content area.
  • Redoing some of the image files that were too large
  • Having trouble formatting the PDFs when there are images.  Asked a question at http://tex.stackexchange.com

Wednesday, May 15, 2013

Final Project for Coursera "Passion Driven Statistics"

Title - The Association Between Marital Status and Voting in the 2000 Election

Many factors are thought to have an impact on voting.  Racial, cultural, and religious identity may all play a role, as well as views on economic, social, and foreign policy.  One of the factors which may be associated with voting patterns is marital status.  Since non-married people do not have a spouse to fall back on in case of a job loss, pregnancy, or disability, they may be more keenly aware of the need for government assistance and therefore more likely to favor a larger social welfare state, and therefore to vote for the Democratic candidate.  If this hypothesis is correct, non-married (widowed, divorced, or single) people who are caring for children are even more likely to feel the need for a large welfare state, and therefore perhaps more likely to support the Democratic candidate.

Research Questions
  1. For independent voters who voted in the 2000 election, are nonmarried people (single, widowed, divorced) more likely to vote for the Democratic candidate?
  2. Is the association between marital status and voting similar for individuals with and without a child under 18 at home?
    Sample - The 1987 to 2012 Values Merge File contains core values questions from 15 surveys conducted between May 1987 and April 2012, some of which were done face-to-face and some over the telephone.  The combined N=35,578.
    Measures - Data for voting, marital status, and dependents at home were based on telephone or in-person responses.  We filtered out the following: those who were not asked whether they had kids at home, those who did not vote for either the Republican or the Democrat, and those who were affiliated with either the Republican or the Democratic party.  In other words, we are looking only at those who voted but did not have an affiliation with either political party.  For marital status, we converted the data so that the values are either married, not married, or did not respond.

    Univariate - Independents (those not identifying as Republican or Democrat) who voted for a non-third party candidate in the 2000 election voted for the Democrat 40.26% of the time.  63.6% of the sample was married, and 31.29% reported having kids living at home.

    Bivariate - As expected, the chi-squared analysis shows that nonmarried people (single, widowed, divorced) are more likely to vote for the Democratic presidential candidate - in our sample, 46.85% of married voted for the Democrat, while 58.73% of unmarried voted for the Democrat, and the chi-squared test has a p-value below .001.  When we look at those with and without kids, we see that for both married and unmarried, those with kids at home under 18 are less likely to vote for the Democrat than those without.
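
For readers who want to see what a chi-squared test of this kind computes, here is a minimal Python sketch for a 2x2 table. The analysis itself was done in SAS, so this is a stand-in rather than the code that was actually run, and the counts are made up purely for illustration. They are NOT the Pew data. With one degree of freedom, the p-value can be computed with the standard library alone.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, the upper-tail probability of the
    # chi-squared distribution equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, not the survey data.
# Rows: married, not married.  Columns: voted Democrat, voted Republican.
made_up_counts = [[300, 340], [215, 150]]
statistic, p_value = chi_squared_2x2(made_up_counts)
print(f"chi-squared = {statistic:.2f}, p = {p_value:.4g}")
```

A small p-value only says the two variables are associated in the sample. It says nothing about which way any causation runs.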

These results confirm the well-known pattern that non-married people are more likely to vote for the Democrat than married people.  Further, they show that those without kids at home are more likely to vote for the Democrat than those with kids at home.  However, we do not know much about the causality - are Democratic voters less likely to marry?  Or does marriage cause a change in voting patterns?  This study does not give sufficient evidence to make a determination.  Also, this study only looks at the 2000 election, so the result might not generalize to other election years.  

For future research, it would be good to look at a wider array of demographic features, to see whether marital status was the best predictor to look at, or whether a different indicator (correlated with marital status) would better predict voting patterns.  It would also be interesting to look at the same voters across time - if someone moves from single status to married or from married to divorced or widowed, does this change cause a change in voting?

Thursday, May 2, 2013

Coursera - Passion Driven Statistics Assignment 2: Frequency Tables

I'm taking another MOOC - Passion Driven Statistics!  Mostly I chose it because the class includes free access to SAS OnDemand.  SAS training is ridunkulously expensive... the listed price for the course SAS Programming Introduction: Basic Concepts is currently $1100, and the combined cost for the two courses they recommend as preparation for the base certification comes to around $3.5K.  So why not get some training for free with Coursera?

Anyway, I'm using some data published by Pew.  The data was in SPSS format, but it's easy to import to SAS.  Here is my code & output for the second assignment:

libname mydata "/courses/u_coursera.org1/i_1006328/c_5333" access=readonly;

data new; set work.VALUES_MERGE; 

/*Change labels for race and education */
LABEL RACETHN="Race and Ethnicity"
EDUC="Last Grade Completed";
/* include only those who always or nearly always vote */
/* Exclude all who did not vote or voted for a 3rd party */
IF PARTY > 2; 
/* Exclude those who didn't vote for a major party candidate */

/*Treat Don't Know / Refused to answer as missing data. */
IF RACETHN = 9 then RACETHN = .;

/* Let those who voted for the dem be 0, those who voted for repub be 1.
This will allow us to see % repub by doing an average of PRESCATEGORY*/

PROC SORT;  by respid;

/* create the frequency tables */

The FREQ Procedure

How often do you vote?
Value                 Frequency  Percent  Cum. Frequency  Cum. Percent
Nearly always              3070    44.26            6936        100.00
Frequency Missing = 258

Race and Ethnicity
Value                 Frequency  Percent  Cum. Frequency  Cum. Percent
White Non-Hispanic         6084    85.39            6084         85.39
Black Non-Hispanic          469     6.58            6553         91.97
Other Non-Hispanic          258     3.62            7125        100.00
Frequency Missing = 69

Last Grade Completed
Value                 Frequency  Percent  Cum. Frequency  Cum. Percent
8th or less                 163     2.27             163          2.27
Some H.S.                   329     4.59             492          6.86
H.S. Grad                  1899    26.48            2391         33.34
Some College               1804    25.15            4529         63.15
College +                  2643    36.85            7172        100.00
Frequency Missing = 22

Party Identification
Value                 Frequency  Percent  Cum. Frequency  Cum. Percent
No preference               371     5.30            6893         98.51
Other party                 104     1.49            6997        100.00
Frequency Missing = 197