2014 FR questions, AP Statistics Exam. MY REVISED first attempt at solutions.

Posted on May 11, 2014 by roughlynormal

Hi all: I just worked through the 2014 AP Statistics Free response questions, which are publicly available here.

~~My attempts at solutions can be found here .~~

Possible Free response solutions 2014 frq, Second Draft!

That was my first attempt. Thanks to Corey Andreasen, Pat Humphrey and others who caught some errors!

Again, these are simply attempts at solutions, and they probably still have errors… so tear them apart! I invite corrections, critiques, questions, and commentary.

I look forward to the dialogue.

If they provide a starting point for further dialogue about the questions, then I have succeeded.

UPDATE: Thanks to Corey Andreasen for his on-point comments.

I agree with his critiques, but I also want to think more about 4a: Is there more to a complete solution than simply “means are pulled up by unusually high incomes, and medians aren’t?”
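
To make the 4a idea concrete, here is a minimal sketch using made-up incomes in thousands of dollars (the numbers are an assumption for illustration, not data from the exam). It compares how far the mean and the median move when a single unusually high income is added.

```python
import statistics

# Hypothetical household incomes in thousands of dollars (invented, not from the exam).
incomes = [32, 38, 41, 45, 47, 52, 58, 61, 66]
# Same data plus one unusually high income.
with_outlier = incomes + [900]

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("before:", round(mean_before, 1), median_before)
print("after: ", mean_after, median_after)
```

With these invented values the mean jumps from about 48.9 to 134, while the median only moves from 47 to 49.5. That is the resistance property in miniature; it says nothing yet about sampling variability.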

About roughlynormal

I have been a math/statistics teacher for 25 years. I currently teach at an independent school in southern California. I also coach teaching fellows for Math for America - Los Angeles chapter. I love my career, my colleagues, and my friends & family.


25 Responses to 2014 FR questions, AP Statistics Exam. MY REVISED first attempt at solutions.

Corey Andreasen says:

May 11, 2014 at 8:25 pm

I think you misread #3c. The question asks for the probability that none of the 3 days is a T, W, or H. That means all 3 are M or F. So I got (2/5)^3.

In #4a, I think the median varies more than the mean does in general. Maybe not for skewed distributions. But I suspect the key here is that the median is resistant to outliers (or skewness), which you address later.

Typo in #4b. “. I would choose method 1 over method 2 over method 1”

But it looks like we agree on almost everything. (I had made an embarrassing calculation error that Jared’s solution helped me catch.)
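
The hunch in #4a about which statistic varies more can be checked with a small simulation. The sketch below is only that: the populations, the sample size of 25, and the upper serial number 342 are all assumptions chosen for illustration. It draws many samples from a uniform population (in the spirit of the German Tanks setup) and from a right-skewed gamma population, then reports the standard deviation of the sample means and of the sample medians.

```python
import random
import statistics

def sampling_sds(draw, n=25, reps=4000, seed=1):
    """Return (SD of sample means, SD of sample medians) over `reps` samples of size n."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)

# Uniform population on 1..342 (342 is an arbitrary, made-up maximum).
uniform_sds = sampling_sds(lambda rng: rng.randint(1, 342))
# Right-skewed population: gamma with shape 4 and scale 1 (also an arbitrary choice).
skewed_sds = sampling_sds(lambda rng: rng.gammavariate(4.0, 1.0))

print("uniform (SD of means, SD of medians):", uniform_sds)
print("skewed  (SD of means, SD of medians):", skewed_sds)
```

For the uniform population the medians should come out clearly more spread out than the means. For skewed populations the ordering depends on the shape, and a heavier-tailed population can reverse it, so a single simulation should not be read as a general rule.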

- roughlynormal says:
  
  May 11, 2014 at 8:30 pm
  
  Thanks Corey: 100% agree that I misread #3c.
  
  Regarding 4a: Help convince me that the median would vary more. I wonder how you would answer the prompt about a statistical advantage without going into the variability of each statistic.
  
  - Corey Andreasen says:
    
    May 11, 2014 at 8:34 pm
    
    I can’t explain why, but I can tell you that every time I do the German Tanks problem, the distribution of median*2 is always way more spread out than mean*2. I don’t know if that’s true for different shaped distributions. I think the statistical advantage is simply resistance to outliers and skewness.
    
    - roughlynormal says:
      
      May 11, 2014 at 8:53 pm
      
      Hmm. Yes, I think I am agreeing with you: I just ran a simulation on Fathom where I tested out sampling distributions of a mean and a median for a skewed population – the mean was less variable.
      
    - roughlynormal says:
      
      May 11, 2014 at 9:01 pm
      
      It does beg the question, however:
      How is this a statistical advantage, if variability in the statistic is not reduced?
      
      I’m not 100% sure what could be classified as a “statistical” advantage.
      
      - Corey Andreasen says:
        
        May 11, 2014 at 9:04 pm
        
        I think what they’re looking for is that the median is resistant to outliers. I don’t think it’s about the sampling distribution.
        
Terri Daubert says:

May 11, 2014 at 8:28 pm

Doesn’t 3c actually ask for not Tuesday Wednesday Thursday rather than not Monday or Friday? So 2/5 cubed rather than 3/5?
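
The "2/5 cubed" figure can be checked both exactly and by simulation. The sketch below assumes, as the comments here do, that each of the 3 days is equally likely to be any of the 5 weekdays and that the days are independent; the function name `simulate` and the trial count are just illustrative choices.

```python
from fractions import Fraction
import random

# Assumption: each day is equally likely to be M, T, W, H, or F, independently.
p_mon_or_fri = Fraction(2, 5)
exact = p_mon_or_fri ** 3  # all 3 days are Monday or Friday
print(exact, float(exact))  # 8/125 0.064

def simulate(trials=100_000, seed=0):
    """Estimate P(none of the 3 days is T, W, or H) by random sampling."""
    rng = random.Random(seed)
    days = ["M", "T", "W", "H", "F"]
    hits = sum(
        all(rng.choice(days) in ("M", "F") for _ in range(3))
        for _ in range(trials)
    )
    return hits / trials

print(simulate())
```

The simulated proportion should land close to 0.064, in contrast to (3/5)^3 = 0.216 from the misread version.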

- roughlynormal says:
  
  May 11, 2014 at 8:30 pm
  
  Agreed! Thank you!
  
Dori Peterson says:

May 11, 2014 at 9:03 pm

On problem 6, I looked at the two graphs from the perspective of the LINER conditions for the linreg t test. I concluded that the variability was more equally distributed in graph III and that graph II tends towards systematic change, which leads me to believe that #2 includes some form of bias that would preclude making good predictions. I confess that linear regression is my weak point, so I may be misinterpreting these…

- roughlynormal says:
  
  May 11, 2014 at 9:11 pm
  
  Hi Dori: Thanks for reading!
  
  I think this problem asks the students to do something different from the inference for regression we see in the course outline. There’s no evidence of needing to check conditions for inference.
  
  Notice the labeling of each axis: The residuals are from the original regression model, not new regression models with the new variables. So comparing residuals from the old model vs engine size shows a pattern. That tells me that the unexplained variability in fuel consumption is correlated with engine size. So adding engine size to our model can improve our predictions.
  
Wolowitz says:

May 12, 2014 at 2:29 am

Is it just me or did the Free Response questions differ from test to test? I suppose I did not receive these questions during my AP Statistics Exam last Friday

- roughlynormal says:
  
  May 13, 2015 at 9:35 am
  
  I’m not sure what you mean. The AP Statistics Test will be administered later today for 2015. Every year, there are multiple versions of every AP test. My responses are for the “operational” exam, which is the version taken by the most students.
  
Pat Humphrey says:

May 12, 2014 at 2:32 am

I won’t address comments already made. Nice attempt. I did notice that in 6, you consistently referred to engine sizes as short or long (instead of small and large) – volume doesn’t have linear units. Also, wheel base would be correlated with overall length, which is a possible reason why that variable doesn’t add anything here.

4 is sneakingly familiar. I do believe it’s based on an old exercise. Can’t recall if it’s from a BVD or SYM version!
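
The wheel base point in #6 can be sketched with simulated data. Everything in the block below is invented for illustration (the coefficients, the sample size, and the assumption that wheel base affects fuel consumption only through length); none of it comes from the exam. It fits fuel consumption rate (FCR) on car length, then checks how strongly the leftover residuals are correlated with engine size and with wheel base.

```python
import random
import statistics

def fit_line(x, y):
    """Least-squares slope and intercept for y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def corr(x, y):
    """Pearson correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(2014)
n = 200
# All values below are hypothetical; they only mimic the structure of the problem.
length = [rng.gauss(180, 12) for _ in range(n)]              # car length
wheel_base = [0.6 * ln + rng.gauss(0, 2) for ln in length]   # closely tied to length
engine = [rng.gauss(3.0, 0.8) for _ in range(n)]             # engine size, independent of length here
fcr = [0.05 * ln + 1.5 * e + rng.gauss(0, 0.5) for ln, e in zip(length, engine)]

# Residuals from the original model: FCR predicted from length alone.
slope, intercept = fit_line(length, fcr)
residuals = [y - (intercept + slope * ln) for ln, y in zip(length, fcr)]

r_engine = corr(residuals, engine)
r_wheel = corr(residuals, wheel_base)
print("residuals vs engine size:", round(r_engine, 2))
print("residuals vs wheel base: ", round(r_wheel, 2))
```

In this made-up setting the residuals track engine size closely and are nearly uncorrelated with wheel base, because length has already accounted for most of what wheel base could explain.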

- roughlynormal says:
  
  May 12, 2014 at 5:22 am
  
  Thanks Pat!
  
  Ooh, crap… that’s a glaring error about engine size. Thanks for catching that.
  A nice commentary about wheel base vs car length.
  
  I do recall questions from the past about addressing unexplained variation in a variable:
  
  - AR says:
    
    May 12, 2014 at 1:35 pm
    
    I still think that students should have picked graph 2. If you look at the engine size most of the residuals of FCR based on length were negative or under predicted. The higher engine sizes were over predicted. The engine size should be taken into account. The wheel base did not have that kind of effect on the residuals from the first part. Clearly the engine size impacted your predictions.
    
    - AR says:
      
      May 12, 2014 at 1:37 pm
      
      Second sentence should have said that smaller engine sizes were under predicted.
      
    - roughlynormal says:
      
      May 19, 2014 at 10:56 am
      
      Yes: I agree!
      
shabaz says:

May 12, 2014 at 9:35 am

Are you sure number 3 is not a binomial problem? Just wondering.

Jack says:

May 12, 2014 at 3:41 pm

I think 6D has two possible correct answers. The answer you gave is likely correct, but I believe they will also accept choosing graph III as a better model because it meets the conditions for inference. They would have to go on to say that graph II has a pattern in the residuals, making a linear model inappropriate.

- roughlynormal says:
  
  May 19, 2014 at 10:56 am
  
  Hi Jack: I don’t see how the reasoning you propose will work.
  
  Firstly, I’m not sure how meeting the conditions for inference addresses the goal: “improving the prediction.” I think an improved prediction has less error. If there’s a pattern between the existing residuals and a new variable, then we can add that variable to the mathematical model. The resulting multiple regression model will explain more variation.
  
  Secondly, the two residual plots say nothing about meeting the inference conditions for a model with FCR and car length.
  
Kevin Adams says:

May 14, 2014 at 6:09 am

For 6c, the rubrics have traditionally been picky about using “compare” words like more than, less than, etc. You gave two strong interpretations, but didn’t truly compare. I wonder how critical they will be since this is part of the investigative task.
Thanks for putting these together.

- roughlynormal says:
  
  May 19, 2014 at 10:48 am
  
  You’re welcome, Kevin:
  I understand your point about the “compare” issue in the context of comparing two quantitative distributions. I’m not so sure, however, that strictness would make sense if applied to this problem. Maybe say graph II shows a stronger positive association than graph III? We shall see in June!
  
op says:

May 18, 2014 at 9:11 pm

is a 70 enough for a 5 on this test?
I feel like this test is a little bit easier

- roughlynormal says:
  
  May 19, 2014 at 10:43 am
  
  It’s hard to know: Every question is graded differently. The criteria for 1, 2, 3, or 4 are determined by the question leaders later next month. In past years cutoffs for a “5” were as low as 60 and as high as 70.
  