KDnuggets Home » News » 2018 » Mar » Tutorials, Overviews » R Fundamentals: Building a Simple Grade Calculator ( 18:n12 )

R Fundamentals: Building a Simple Grade Calculator





By Jeffrey M Li, Dataquest

R is one of the most popular languages for statistical analysis, data science, and reporting. At Dataquest, we have been adding R courses (you can learn more in our recent update). For a comparison of R and Python, check out our analysis here.

In this tutorial, we'll teach you the basics of R by building a simple grade calculator. While we do not assume any R-specific knowledge, you should be familiar with general programming concepts. You'll learn how to:

  • Make calculations
  • Store your values
  • Use specific functions to answer questions

This tutorial is based on part of our newly released introductory R course. The course is entirely free and includes a certificate of completion. Go here to start the course.

 

Calculating your grades

 
Let's say you're a high school senior and want to calculate your grade point average (GPA). A GPA represents the average value of the accumulated final scores earned in all your classes. You are taking seven classes, with exams, homework, and projects all equally weighted. We'll assume that the GPA is measured on a 0-100 scale.

In your math class, you've scored a 92 on exams, 87 on homework, and 85 on projects. To calculate the average math score, we could write the following:

# Math
(92 + 87 + 85)/3


We could perform tasks like calculating an average by hand. However, if we had to calculate the averages for a thousand students, hand calculations wouldn't be an effective use of our time.

Instead, we'll use programming to ask a computer to carry out the calculations.

 

Performing calculations

 
We'll start by using R as a basic calculator. We previously wrote the following to calculate the final grade for math class:

(92 + 87 + 85)/3


This entire line of code is called an expression. We write expressions in a text file called a script. A script is a set of instructions we're giving the computer. After writing our expressions in a script, the interpreter will run the code and display the results of the expression in a new window.

Let's run the print() statement with our math score expression between the parentheses ():

print((92 + 87 + 85)/3)


Running the expression, the interpreter will output the following value in a new window:

[1] 88


Note: The "[1]" will make sense when you dive deeper into vectors but you don't need to understand it for the context of this tutorial.

In our code, both the calculation and the print() statement have pairs of matching parentheses. To make this clearer, here's the same calculation split across several lines:

print(
    (92 + 87 + 85)/3
)


Running this expression will produce the same result as writing everything on one line.

Every starting parenthesis needs a closing parenthesis. Let's try removing the closing parenthesis:

print((92 + 87 + 85)/3


If there is a mistake in your code, the interpreter will tell you there's an error, and what that error is. In our case, the interpreter returns:

Error in parse(text = x, srcfile = src): <text>:7:0: unexpected end of input

The text unexpected end of input means that the input to the R interpreter (our code) ended before the expression was complete; here, it was missing a closing parenthesis ). You can try changing the above expression to see what other kinds of errors you get.
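
To look at the message without stopping a script, one option is to hand the broken code to R's parser as text and catch the resulting error. This is only a minimal sketch: the names broken_code and msg are made up for illustration, and the location prefix in the message (the part that looks like <text>:7:0) depends on how and where the code is run.

```r
# The same expression as above, deliberately missing its final ")"
broken_code <- "print((92 + 87 + 85)/3"

# parse() only checks the syntax; tryCatch() hands back the error text
# instead of halting the script
msg <- tryCatch(
    {
        parse(text = broken_code)
        "no error"
    },
    error = function(e) conditionMessage(e)
)

print(msg)
```

Adding the missing ) to broken_code and running the sketch again should make it print "no error" instead.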

 

Performing multiple calculations

 
Now that we've seen our results using the print() statement, let's dive deeper into how the R interpreter runs our code. It:

  1. Scans and looks for syntax errors.
  2. Interprets and runs each line of code, from top to bottom.
  3. Exits when the last line of code is run.
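
The three steps above can be imitated in R itself. In this rough sketch, parse() stands in for the syntax scan and eval() for running each expression; script_text, expressions, expr, and results are names chosen only for this illustration, and the real interpreter does more work than this.

```r
# A two-line "script" held as text
script_text <- "(92 + 87 + 85)/3
92 + 87"

# Step 1: check the syntax of the whole script (a syntax error would stop here)
expressions <- parse(text = script_text)

# Step 2: run each expression in order, from top to bottom
results <- c()
for (expr in expressions) {
    value <- eval(expr)
    print(value)
    results <- c(results, value)
}

# Step 3: nothing is left to run, so the script ends
```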

We've written one expression calculating your final score in math class. To understand the sequential way R code is interpreted, let's also add an expression calculating your chemistry score. In chemistry, your scores were 90, 81, and 92.

What happens if we run both calculations on separate lines?

print((92 + 87 + 85)/3)
print((90 + 81 + 92)/3)


Running this code, the R interpreter will display:

[1] 88
[1] 87.66667


Does R always display two lines if we write two lines of code? What if we break up our code into multiple lines?

print(
    (92 + 87 + 85)/3
)
print(
    (90 + 81 + 92)/3
)


The R interpreter will still display the same values:

[1] 88
[1] 87.66667


Notice how R interprets our code. Each print statement corresponds to its own line in the result, no matter how many lines the statement spans in the script.

If we want to calculate the average scores for writing and art, we can write these expressions on each subsequent line:

  • Writing: 84, 95, 79
  • Art: 95, 86, 93

print((92 + 87 + 85)/3) # Math
print((90 + 81 + 92)/3) # Chemistry
print((84 + 95 + 79)/3) # Writing
print((95 + 86 + 93)/3) # Art


Running these expressions would display the following results:

[1] 88
[1] 87.66667
[1] 86
[1] 91.33333


 

Performing calculations using arithmetic operators

 
+ and / are called arithmetic operators. Arithmetic operators are used to carry out mathematical operations. The most common ones are addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (^ or **).
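
As a quick illustration, here is one simple expression per operator. The numbers are arbitrary and are not part of the grade example; the comments name the operation each line performs.

```r
print(92 + 87)   # addition
print(92 - 87)   # subtraction
print(4 * 3)     # multiplication
print(264 / 3)   # division
print(4 ^ 3)     # exponentiation
print(4 ** 3)    # exponentiation, alternative spelling of ^
```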

For those unfamiliar with exponentiation, it is a way of multiplying a number by itself a specific number of times, using the ** or ^ operator. If we wanted to multiply three copies of the value 4 together, we could write the following with the multiplication operator *:

4 * 4 * 4


While multiplying 4 by itself three times using the multiplication operator isn't too cumbersome, if we wanted to multiply the value 4 by itself 20 times, using the multiplication operator isn't the most efficient method. Instead, we can express the calculation as an exponent:

4**20


Running 4**20 will return:

[1] 1.099512e+12
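
One way to convince yourself that the exponent is just repeated multiplication is to compare the two forms directly. The sketch below writes out all 20 factors by hand, then checks that ^ and ** agree, and finally uses format() with scientific = FALSE to show the same number without the e+12 notation.

```r
# 4 multiplied together 20 times, compared with the exponent form
print(
    4 * 4 * 4 * 4 * 4 * 4 * 4 * 4 * 4 * 4 *
    4 * 4 * 4 * 4 * 4 * 4 * 4 * 4 * 4 * 4 == 4**20
)

# ^ and ** are two spellings of the same operator
print(4^20 == 4**20)

# The same value written out in full instead of scientific notation
print(format(4**20, scientific = FALSE))
```

The first two lines display TRUE, and the last displays "1099511627776".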


Now that we understand arithmetic operators, let's calculate the final scores for our last three classes: history, music and physical education:

  • history: 77, 85, 90
  • music: 92, 90, 91
  • physical education: 85, 88, 95

print((77 + 85 + 90)/3) # History
print((92 + 90 + 91)/3) # Music
print((85 + 88 + 95)/3) # Physical Education


The interpreter would then display:

[1] 84
[1] 91
[1] 89.33333


 

Performing calculations with order of operations

 
Now that we've learned how to use arithmetic operators to calculate the average scores for each class, let's return to our average calculation for math:

print(
   (92 + 87 + 85)/3
)


What if we deleted the parentheses surrounding 92 + 87 + 85?

print(
   92 + 87 + 85/3  
)


This will display:

[1] 207.3333


By deleting the parentheses surrounding 92 + 87 + 85, the R interpreter makes a different calculation. When using multiple operators, there are rules that determine the order in which calculations are performed.

A simple way to control the order of your calculations is to put parentheses around the calculation you want performed first. This is useful for a more complex calculation like this:

print(
    (92 + 87 + 85 + 67 + 92 + 84)/6 - (77 + 90 + 98)/3
)


In this scenario, we've placed parentheses around 92 + 87 + 85 + 67 + 92 + 84 and around 77 + 90 + 98. We're telling the interpreter to perform the additions before the divisions.

The R interpreter follows the order of operations rules in mathematics. An easy way to remember this is PEMDAS:

  • Parentheses
  • Exponent
  • Multiplication or Division
  • Addition or Subtraction
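
A few small expressions make these rules concrete. The numbers below are arbitrary and are not part of the grade example; each comment gives the value the line evaluates to.

```r
print(2 + 3 * 4)     # multiplication first: 2 + 12 = 14
print((2 + 3) * 4)   # parentheses first: 5 * 4 = 20
print(2 * 3^2)       # exponent first: 2 * 9 = 18
print(10 - 4 + 1)    # same level, left to right: 6 + 1 = 7
print(12 / 3 * 2)    # same level, left to right: 4 * 2 = 8
```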

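Let's take a look at an example without the parentheses. For 92 + 87 + 85/3, the R interpreter will calculate the expression in this sequence:

  1. Division: 85/3 gives 28.33333.
  2. Addition, left to right: 92 + 87 gives 179.
  3. Addition: 179 + 28.33333 gives 207.3333.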

When you don't include parentheses around 92 + 87 + 85, the R interpreter follows PEMDAS and performs the division first.

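Now, let's add the parentheses back to our expression. For (92 + 87 + 85)/3, the R interpreter will calculate the expression in a different sequence:

  1. Parentheses, left to right: 92 + 87 gives 179.
  2. Parentheses: 179 + 85 gives 264.
  3. Division: 264/3 gives 88.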

Here are the final scores for each class:

  • math: 88
  • chemistry: 87.66667
  • writing: 86
  • art: 91.33333
  • history: 84
  • music: 91
  • physical_education: 89.33333

Let's calculate the overall average while keeping PEMDAS in mind. After calculating the overall average, in the same expression, subtract this overall average from the math score:

print(
    88 - ((88 + 87.66667 + 86 + 91.33333 + 84 + 91 + 89.33333)/7) 
)


Running this expression displays:

[1] -0.1904757
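
The class averages in that expression were typed in as rounded numbers, so the last digits of the result are slightly off. As a sketch of the same calculation with nothing rounded, each class average can be written out in its own pair of parentheses inside the larger expression. The scores are the ones used throughout this tutorial.

```r
print(
    88 - (
        ((92 + 87 + 85)/3 +   # math
         (90 + 81 + 92)/3 +   # chemistry
         (84 + 95 + 79)/3 +   # writing
         (95 + 86 + 93)/3 +   # art
         (77 + 85 + 90)/3 +   # history
         (92 + 90 + 91)/3 +   # music
         (85 + 88 + 95)/3     # physical education
        )/7
    )
)
```

This version displays [1] -0.1904762. The inner parentheses are evaluated first, then each division by 3, then the additions, then the division by 7, and finally the subtraction from 88.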


