Introduction to Programming C/Lectures/Variables

What are variables?

Variables are one of the central concepts of computer programming. Simply put, a variable is a named piece of data that you can use later on in the program. In fact, it's the basic way to hold on to data for later while the program runs. In some ways, programming can be seen as the art of moving data from one set of variables into another.

For those of you who are familiar with variables in algebra (all of you, I hope), the concept is similar. Imagine you had a series of algebraic equations where the values stayed the same from equation to equation, unless specifically redefined (by saying x=some equation). And imagine that a value was never unknown: you always know a variable's value (although you may change it occasionally). That's pretty much a C variable.

The 7 properties of variables

There are 7 properties of variables that exist in any programming language. You need to understand these things to understand how variables work. The reason for the last item (scope) may be hard to see until later on, so feel free to come back to it after we've studied functions.

Name

Every variable has a name. Whenever you want to use a variable, you specify which variable you want by using its name. For example, if you want to add 3 to a variable called foo, you would say foo+3.

How do you learn the name of a variable? You choose it. In C, a variable name can be any combination of letters, digits, and underscores (the _ character) you want, so long as the first character is not a digit and the name is not a C keyword such as int. The computer doesn't really care what name you pick. In fact, by the time the compiler is done with your program, it doesn't even know what name you picked. That does not mean picking a name isn't important. Picking a good name makes your program much easier to read and understand. Trust me: in 6 months, you won't remember what some random variable was meant to do. Heck, you'll be lucky to remember in 6 days. If you pick a good name, you won't have to remember, because the name itself will tell you. A good name describes exactly what data it holds. Does it hold the current temperature? Call it current_temperature, or something similar. The score of a video game? Use player_score. Neither of these leaves any doubt about what the data is. Whenever you pick a name, take a second look at it and make sure yours leaves no doubt either. If it does, pick a new one. This advice goes for anything you need to name in programming: clear naming is one of the most important things you can do to make your code more understandable.

One thing to be careful of in naming is using abbreviations. For example, current_temperature could be abbreviated to cur_temp. It saves you a bit of typing, but it can remove a lot of clarity from your code. You should really only use an abbreviation if there is only 1 possible interpretation for it, and if it's obvious what that interpretation is from the context of the program. Using HTTP as an abbreviation is fine; I know of no meaning other than Hypertext Transfer Protocol. Using temp isn't as good. Does it mean temperature? Temporary? Something else altogether? Just use the full name in those situations; it's not that many more keys.

Also, a note. There are two common ways of naming variables: the underscore way and the camel-hump way. A few examples of underscore: current_temperature, temp_variable, this_is_a_long_name. A few examples of camel-hump: currentTemperature, tempVariable, thisIsALongName. Both ways are fine; it's just a matter of preference and readability.
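Here is a small sketch of the naming rules and styles described above. All of the names (current_temperature, playerScore, add_three) are made up for illustration.

```c
/* Legal names use letters, digits, and underscores,
   and never start with a digit. */
int current_temperature = 72;   /* underscore style */
int playerScore = 1500;         /* camel-hump style */

/* int 2nd_place;   would not compile: a name cannot start with a digit */

/* Uses the variable called foo by name: adds 3 to its value. */
int add_three(int foo)
{
    return foo + 3;
}
```

Calling add_three(4) from another function gives back 7.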

Type

The type of a variable describes what kind of data it holds. The type and name of a variable combine to completely describe the variable. The type tells us the kind of data, and the name specifies which instance of that data we're storing. For example, we could have two variables of type Temperature called today_high and yesterday_high. After a bit of practice programming, the types become like units in physics. You can pretty much guess how you're going to use the variables by reading their names and types, without even reading the code. When figuring out the type of a variable, remember that the more specific, the better. Take temperature. How are we measuring temperature? Celsius? Fahrenheit? Kelvin? In 1999, NASA lost the Mars Climate Orbiter because one team's software reported values in US customary units while the other team's software expected metric units. Be specific, and you won't leave loopholes like that.

Storage type

Every language has some built-in types that are used to describe data. These are usually a very small set of types, mainly for describing letters and numbers. When you want to describe a more complex type of data, you need to pick one of these types to use for those variables. In C, there are 6 main types: char, short, int, long, float, and double. The first four describe different sizes of integers. Why do we need 4 of them? Some hold more data than others. The smallest of them is a char, and the largest of the four is a long. The exact sizes depend on the compiler and the machine. On 32-bit Intel processors, char is typically 8 bits, short is 16, and int and long are both 32; on many 64-bit systems, long is 64 bits. In addition, a char holds 1 letter or symbol, for when you want to hold text data.

Floats and doubles are for floating point values. Floating point variables are numbers stored in a form of scientific notation. We can use them to hold very small or very large numbers. On most current machines, floats are 32 bits and doubles are 64 bits. If you want to have anything other than integers, you need to use these.
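Since the sizes vary between machines, one way to check them is to ask the compiler. The sketch below uses sizeof and CHAR_BIT from the standard limits.h header. BITS_IN and print_storage_sizes are names made up for this example.

```c
#include <limits.h>
#include <stdio.h>

/* Number of bits a type occupies on the machine compiling this code.
   CHAR_BIT is the number of bits in one char. */
#define BITS_IN(type) (sizeof(type) * CHAR_BIT)

/* Prints the size of each of the 6 main storage types. */
void print_storage_sizes(void)
{
    printf("char:   %zu bits\n", BITS_IN(char));
    printf("short:  %zu bits\n", BITS_IN(short));
    printf("int:    %zu bits\n", BITS_IN(int));
    printf("long:   %zu bits\n", BITS_IN(long));
    printf("float:  %zu bits\n", BITS_IN(float));
    printf("double: %zu bits\n", BITS_IN(double));
}
```

The numbers printed depend on your compiler and processor, which is exactly why it is worth checking rather than assuming.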

Mapping storage types to real types

You should have noticed by now that storage types and real types are not the same. Unfortunately, you need to tell the compiler what your variables are in terms of storage types. So after you determine the type of your data, you need to decide upon a storage type to implement it. The most important question to ask here is: are my values discrete or continuous? In other words, do you have only certain values that are legal (such as 1, 2, 3, 4), or can any value be legal (such as 1.5 or 1/9)? In the first case, you want to use int. If the values are continuous, you need a floating point type, usually double.
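As a rough sketch of the discrete versus continuous choice, compare the two functions below. The function and parameter names are invented for this example.

```c
/* Discrete: you can't have part of a student, so int fits.
   Integer division throws away the remainder. */
int students_per_group(int students, int groups)
{
    return students / groups;
}

/* Continuous: an average can land anywhere between whole numbers,
   so double fits. */
double average_score(double total_points, int tests)
{
    return total_points / tests;
}
```

With these definitions, students_per_group(10, 4) gives 2, while average_score(10.0, 4) gives 2.5.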

Creating your own storage types

Wouldn't it be nice if you could tell the compiler real types instead of using storage types? Well, you can. You just need to define the mapping of real types to storage types first, then you can declare variables using the real type you defined. There are 3 ways to do this- typedefs, enums, and pseudo-enums.

Enums

Enum is short for enumeration. An enum allows you to use a different name for one of the built-in storage types, and simultaneously defines what the legal values for the enum are. You can declare an enum by using:

typedef enum {
  VALUE_ONE_NAME=value1,
  VALUE_TWO_NAME=value2,
  ...
} real_type;


From then on, you can use real_type where you would otherwise use the storage type. In the above example, the VALUE_XXX_NAME entries are names you can use for the different values it can take. If you were creating colors, you could use RED, WHITE, BLUE, BLACK, etc. instead of trying to remember that red is 0, white is 1, and so on. Also, the =valuex part is optional. If you don't use it, the value for the name is 1 greater than the last value, and the first name takes a value of 0. So if you don't use any, they count up 0, 1, 2, 3, 4.

When should you use enums? Well, the real type of your variable needs to be an integer- enums don't work for floats. You also need to have reasonable names for all the possible values- if the values don't have names (other than ONE, TWO, THREE), use a typedef instead.
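A minimal sketch of the color example follows. It declares the enum through typedef so that color can be used as a type name on its own. The name color_name and the explicit value 10 are just for illustration.

```c
typedef enum {
    RED,         /* 0: the first name starts at 0 */
    WHITE,       /* 1: one more than the previous name */
    BLUE = 10,   /* an explicit value */
    BLACK        /* 11: counting continues from the explicit value */
} color;

/* Returns a printable name for a color value. */
const char *color_name(color c)
{
    switch (c) {
    case RED:   return "red";
    case WHITE: return "white";
    case BLUE:  return "blue";
    case BLACK: return "black";
    }
    return "unknown";   /* a value that is not one of the named constants */
}
```

Because every legal value has a name, the switch can list them all, and many compilers can warn if one is left out.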

Typedefs

Typedef is short for type definition. A typedef basically allows you to use a different name for one of the built-in storage types. You can declare a typedef by using:

typedef storage_type real_type;

From then on, you can use real_type wherever you would normally use a storage type.

When should you use typedefs? Use one if your real type maps to a float or double storage type, or if you have an integer real type and the values don't have sensible names for an enum.
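For instance, temperatures map naturally to double. The sketch below assumes two made-up real types, celsius and fahrenheit, and a made-up conversion function.

```c
typedef double celsius;      /* real type: temperature in degrees Celsius */
typedef double fahrenheit;   /* real type: temperature in degrees Fahrenheit */

/* The types in the signature say which unit goes in and which comes out. */
fahrenheit celsius_to_fahrenheit(celsius temperature)
{
    return temperature * 9.0 / 5.0 + 32.0;
}
```

Keep in mind that to the compiler both names are still just double, so it will not stop you from passing a fahrenheit where a celsius is expected. The typedef helps the reader, not the compiler.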

Pseudo-enums

A pseudo-enum is a way of getting something very close to an enum for floats and doubles. First, you need to create a typedef to be able to use the name of the real type. Then you create a series of constants to give you names for the values. You can create a constant like this:

const type const_name=value;

You don't need to create constants for every possible value. You just need to make them for the important ones.

Why would you ever use a pseudo-enum? Well, floats and doubles can't be enums, and this gives you most of the functionality of an enum. And sometimes you may have an integer type where you only want to create names for some values, which you can't (or at least shouldn't) do with an enum. But in any other circumstance use a real enum, since compilers can help find odd bugs with real enums that they can't with pseudo-enums.
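Here is one possible pseudo-enum, assuming a made-up real type kelvin with named constants for a few important temperatures. The function is_liquid_water is also invented for the example.

```c
typedef double kelvin;   /* real type: temperature in kelvins */

/* Names for the important values only, not for every possible value. */
const kelvin ABSOLUTE_ZERO        = 0.0;
const kelvin WATER_FREEZING_POINT = 273.15;
const kelvin WATER_BOILING_POINT  = 373.15;

/* Returns 1 if water is liquid at this temperature
   (at normal pressure), 0 otherwise. */
int is_liquid_water(kelvin temperature)
{
    return temperature > WATER_FREEZING_POINT
        && temperature < WATER_BOILING_POINT;
}
```

The constants read almost like enum values, but the variable can still hold anything in between them.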

Value range

A value range is all the possible values a variable can take. For example, test grades can only go 0-100% (assuming no extra credit). An enum is meant to hold only the constants you defined. For storage types, it's physically impossible to store a value outside the range. The number just won't fit in the space. If you try, you'll end up with really odd results, like getting a very large negative number when you expect a positive one. For user-defined types, you aren't that lucky, because the compiler doesn't know which values are illegal. The exception here is enums: a compiler can sometimes see when you use a value that isn't one of the constants, and warn you about a possible problem. That's why you should always favor an enum if possible.
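The sketch below illustrates both halves of that paragraph. The names grade_percent, is_valid_grade, and next_after_largest_unsigned are made up. The second function uses an unsigned type because its wraparound is well defined in C, whereas overflowing a signed int is not something you can rely on.

```c
#include <limits.h>

typedef int grade_percent;   /* real type: legal values are 0 to 100 */

/* The compiler accepts any int as a grade_percent, so the program
   has to check the real type's range itself. */
int is_valid_grade(grade_percent grade)
{
    return grade >= 0 && grade <= 100;
}

/* A storage type cannot hold a value past its range.
   For unsigned int, going one past the largest value wraps to 0. */
unsigned int next_after_largest_unsigned(void)
{
    unsigned int largest = UINT_MAX;
    return largest + 1u;
}
```

So is_valid_grade(101) reports 0 (not valid) even though 101 fits in an int with no trouble, and next_after_largest_unsigned() gives 0.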

Value

A variable always has a value of some type. It keeps the same value from the time you set it until the variable stops existing (for many variables, the end of the program), unless you decide to change it. You can change it at any time; just use:

variable=value;

Value doesn't have to be a number. It can be an expression with other variables in it. You can read the value out of memory at any time, just by using the variable's name.

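One thing to watch out for: if you read a local variable's value before you set it, you get whatever garbage happens to be in memory, which is effectively a random value. (Global and static variables start out at 0 if you don't set them.) This is not a good thing. Always set a value for a variable before reading it. You have been warned.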
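A short sketch of setting and reading values follows. The names and the 10% tax rate are assumptions chosen only to make the arithmetic easy to follow.

```c
/* Returns a price in cents with a made-up 10% tax added. */
int total_with_tax_cents(int price_cents)
{
    int tax_cents = 0;     /* set a value before the first read */
    int total_cents = 0;

    tax_cents = price_cents / 10;            /* the value can be an expression */
    total_cents = price_cents + tax_cents;   /* reading variables by name */
    return total_cents;
}
```

For a price of 200 cents, tax_cents becomes 20 and the function returns 220.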

Scope

A variable's scope limits where in the program you can access it. That's right, you can't always access every variable. There are 4 types of scope in the C language:

  • Global scope. These variables can be accessed anywhere in the program. You should use these sparingly; if your program needs a lot of globals you are probably doing something wrong. You can declare a global variable just by defining it in any file outside of any functions. In any other file where you want to use it, you need to declare it with extern type variable;
  • File scope. These variables can be accessed anywhere in the file where they're declared. These are a bit better than global variables, but you should still try to minimize them. You can declare one by declaring a variable outside of any functions with the static keyword.
  • Local scope. These variables can be accessed only within the nearest set of {}. Any variable declared inside a set of {} (also known as inside a block) is automatically local scope. Every time you go inside the block you create a new version of the variable. The old value of the variable no longer exists. These are the vast, vast majority of variables.
  • Static local scope. These variables are just like local scope, except they keep their value between times you enter a block. You can declare one by creating a local variable with the static keyword. These should be used sparingly; it's extremely rare to see one of these.

Sometimes you may end up with two variables in different scopes with the same name (usually 1 file or global and 1 local). In that case, you end up getting the local variable, not the global one.

So, why does scope exist? Well, in a program you frequently want to call several things by the same name in different places. For example, a variable newBalance may exist in a checking program both in the code where you withdraw money and in the code where you deposit money. If all variables were global, you'd need two names. With local variables, you can use the same name for two different variables in two different blocks. Allowing you to reuse names like that makes it much easier to use good names. It also stops you from accidentally accessing variables that other parts of the program may be using because you unknowingly chose the same name.
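The four kinds of scope, and the same-name rule, can be sketched in one small file. Everything here (bank_is_open, balance_cents, deposit, withdraw, count_calls, shadow_demo) is a hypothetical name for illustration.

```c
int bank_is_open = 1;           /* global scope: other files can reach it
                                   with "extern int bank_is_open;" */
static int balance_cents = 0;   /* file scope: shared by this file only */

int deposit(int amount_cents)
{
    int new_balance = balance_cents + amount_cents;   /* local scope */
    balance_cents = new_balance;
    return new_balance;
}

int withdraw(int amount_cents)
{
    /* A different local variable that happens to reuse the same name. */
    int new_balance = balance_cents - amount_cents;
    balance_cents = new_balance;
    return new_balance;
}

int count_calls(void)
{
    static int calls = 0;   /* static local: keeps its value between calls */
    calls = calls + 1;
    return calls;
}

int shadow_demo(void)
{
    int bank_is_open = 0;   /* this local hides the global inside the block */
    return bank_is_open;
}
```

Starting from an empty balance, deposit(500) returns 500 and a following withdraw(200) returns 300. Two calls to count_calls() return 1 and then 2, and shadow_demo() returns 0 while the global bank_is_open is still 1.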

Declaring variables

Some programming languages make you declare a variable before you use it. This has pluses and minuses. The good part is that it explicitly states what variables exist, both name and type. This makes it much easier to read your program, because you can easily find out what variables exist and what data they hold. It also allows the compiler to find certain bugs that could otherwise cause a lot of pain. If you misspell a variable somewhere, the compiler will see an undeclared variable and tell you about the error. Without declared variables, it would think you were using a new variable. On the other hand, it can be slightly quicker to write a program if you don't have to declare variables, especially if you haven't planned out your algorithms ahead of time and are coding off the cuff.

Which is better? Programmers are still arguing about that, and most likely always will. One thing to keep in mind is that as a program increases in size, the time it takes to declare variables becomes trivial compared to the debugging time of typos. But in any case, in C it doesn't matter- you have to declare variables. You don't have a choice. To declare a variable, you just need to use:

type variable;

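One thing to make sure of: in older versions of C (C89), local variable declarations have to come right after the {, before any other line of code in the block. There cannot be any other statements between the { and the last variable declaration. Since C99, a declaration may appear anywhere in a block, as long as it comes before the variable's first use.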
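As a small sketch, here is a local declaration placed where every version of C accepts it. The function name rectangle_area is made up for the example.

```c
int rectangle_area(int width, int height)
{
    int area;   /* declaration: type, then name, right after the { */

    area = width * height;   /* other code comes after the declarations */
    return area;
}
```

Calling rectangle_area(3, 4) returns 12.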

Defining a variable, step by step

So, let's go step by step through defining a variable.

  1. Determine the real type of the variable. Remember to be detailed enough that what is being stored is obvious.
  2. Create a good, descriptive name for that type.
  3. Map the real type to a storage type via typedef or enum.
  4. Create a good, descriptive name for the variable.
  5. Determine what scope you want for the variable. This is usually local. If a value is used in multiple functions and persists between those functions, use file. If it persists and is accessed by multiple files, make it global.
  6. Declare it in the right place for the scope.
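The steps above might play out like this for a person's height. This is only one possible outcome: the type inches, the variable my_height, and the function height_after_growth are all invented for the example.

```c
/* Steps 1-3: the real type is a length measured in inches.
   Lengths are continuous, so it maps to double. */
typedef double inches;

/* Steps 4-6: a descriptive variable name, local scope,
   declared inside the block that uses it. */
inches height_after_growth(inches starting_height, inches growth)
{
    inches my_height = starting_height + growth;
    return my_height;
}
```

For example, height_after_growth(60.0, 2.5) returns 62.5.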

Naming again

Since naming is by far the most important thing to take away from this lesson, I'm going to hit it again. Choosing good names is one of the most important things to do when writing code. It will do more for making your code easy to read and understand than almost anything else. It's important to get into good habits now; too many programmers learn how to code and use x and y for all their variables. Let's use an example. Which of these gives you the most information?

  1. int x;
  2. int myHeight;
  3. inches x;
  4. inches myHeight;

The first tells you nothing. The second tells you what's being stored, but not how it's represented (inches? centimeters? microns?). The third tells you how something is being stored, but not what it is. The final one tells us not only what's being stored, but how it's represented. Now imagine a large program with 10,000 variables. Which do you think will take you less time to understand? So be kind to your fellow programmers, and use descriptive names.