Variables and Memory
02 February, 2013 - 9 min read
When a C++ program starts, 5 distinct areas of memory are created. These are:
- Code Space: where the executable instructions of the program are kept.
- Registers: part of the CPU, used for internal housekeeping. Among other things, they hold the instruction pointer, which identifies the next instruction to be executed, and the stack pointer.
- Global Name Space: contains objects allocated by the linker, which persist for the duration of the program.
- Stack: contains local variables, whose lifetime is defined by their scope.
- Free Store, or Heap: memory that is explicitly allocated and released with the new and delete operators.
This lesson will concentrate on the differences between the last three.
The Stack
The stack is where local variables and function parameters reside. It is called a stack because it follows the last-in, first-out principle. As data is added, or pushed, to the stack, it grows, and when data is removed, or popped, it shrinks. In reality, memory is not physically moved around every time data is pushed or popped. Instead the stack pointer, which as the name implies points to the memory address at the top of the stack, moves up and down. Everything below this address is considered to be on the stack and usable, whereas everything above it is off the stack and invalid. This bookkeeping is done automatically by code that the compiler generates, so the programmer never manages it by hand, and as a result stack memory is sometimes also called automatic memory. Before C++11, the keyword auto could be used to request this storage class explicitly, although it was almost never needed; since C++11, auto instead asks the compiler to deduce a variable's type. Normally, one declares variables on the stack like this:
void func () {
int i; float x[100];
...
}
Variables that are declared on the stack are only valid within the scope of their declaration. That means when the function func() listed above returns, i and x will no longer be accessible or valid.
There is another limitation to variables that are placed on the stack: the operating system only allocates a certain amount of space to the stack. Each time a function is called, enough stack space is reserved to hold all of its local variables. If the total exceeds the amount of memory that the OS has allowed for the stack, the program will crash with a stack overflow. While the maximum size of the stack can sometimes be changed through linker options or operating system settings, it is usually fairly small (often a few megabytes), and nowhere near the total amount of RAM available on a machine.
The Global Namespace
The fact that variables on the stack disappear as soon as they go out of scope limits their usefulness. Another class of variables exists that does not have this limitation. These are global and namespace variables, static class members, and static variables in functions. Global variables are accessible throughout the program, and are declared in this manner:
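#include <iostream>
using namespace std;

void func();

// Global variables: visible throughout the program.
int i = 5;
int j = 3;
float f = 10.0;

int main() {
    int j = 7;  // local j hides the global j inside main()
    cout << "i in main: " << i << endl;
    cout << "j in main: " << j << endl;
    cout << "global j: " << ::j << endl;  // :: selects the global j
    func();
    return 0;
}

void func() {
    float f = 20.0;  // local f hides the global f inside func()
    cout << "f in func: " << f << endl;
    cout << "global f: " << ::f << endl;
}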
The output of this program will be:
i in main: 5
j in main: 7
global j: 3
f in func: 20
global f: 10
Local variables take precedence over global variables of the same name. If both are defined, as shown above for the variable j, then j refers to the local copy, whereas ::j refers to the global copy.
Despite their attraction, global variables are very dangerous, and should be avoided. They permit uncontrolled access to data, which runs counter to the object-oriented nature of C++ programming.
When data is common to, or must be shared amongst, all instances of a class, one can use static member variables:
class Muon {
public:
    Muon(float E) : Energy(E) { MuonsInEvent++; }
    ~Muon() { MuonsInEvent--; }
    static int MuonsInEvent;
private:
    float Energy;
};

int Muon::MuonsInEvent = 0;

int main() {
    Muon *muons[10];
    int nummuons;
    for (int i = 0; i < 10; i++) {
        muons[i] = new Muon(0.);
    }
    nummuons = Muon::MuonsInEvent;
    ....
}
Here the integer MuonsInEvent is static, and common to all instances of the Muon class. The declaration of MuonsInEvent inside the class does not define an integer, and no storage space is set aside. So the variable must be defined and initialized outside the class, as is done by the line int Muon::MuonsInEvent = 0; immediately after the class definition. Since it was declared public, it can then be accessed directly in main() through the class name, and not through a specific instance of the class. If one wishes to restrict access to the static variable, as is often wise, it can be declared private, and then only member functions of the class can access it, just like a normal private variable.
Member functions of a class can also be declared static. They exist not in an object, but in the scope of a class, and can thus be called without having an object of that class:
class Muon {
public:
    Muon(float E) : Energy(E) { MuonsInEvent++; }
    ~Muon() { MuonsInEvent--; }
    static int GetNumMuons() { return MuonsInEvent; }
private:
    float Energy;
    static int MuonsInEvent;
};

int Muon::MuonsInEvent = 0;

int main() {
    Muon *muons[10];
    int nummuons;
    for (int i = 0; i < 10; i++) {
        muons[i] = new Muon(0.);
    }
    nummuons = Muon::GetNumMuons();
    ....
}
Note that static member functions do not have a this pointer, and therefore cannot be declared const. And since non-static member data is accessed in member functions through the this pointer, static member functions cannot access any non-static member variables.
Furthermore, static variables should not be used as pseudo-global variables. While there are certain distinct implementations where they are mandated, static variables and functions should be used with caution. If you find yourself using them frequently, there is a significant chance that your design is at fault. This is especially true for Level 3 code, as tools are instantiated at the beginning of the run, and thus their members are easily accessible at any time thereafter.
The Free Store
The main problem with local variables is that they don't persist. As soon as a function returns, all the local variables declared within it vanish. While it is possible to get around this with global variables, this is not wise. Instead, one can make use of the free store, or heap as it is often called.
Variables and objects are created on the heap using the keyword new, and are referenced using pointers. new will attempt to allocate the requested memory on the heap, and will throw a std::bad_alloc exception if it cannot. While running out of memory is not usually a problem, when dealing with large programs it is often wise to check:
int *pI;
try {
    pI = new int;
}
catch (const std::bad_alloc&) {  // std::bad_alloc is declared in <new>
    cout << "ERROR: no more memory\n";
    return 1;
}
.....
delete pI;
Once you're done with something allocated on the heap, it must be explicitly freed using the keyword delete. If this is not done, and the pointer is reassigned or goes out of scope, a memory leak occurs: the original memory remains allocated, but can no longer be accessed. If this happens many times, then eventually all the available memory will be used up, and your program will crash.
The heap is where large or long-lived variables should be placed. This is because the heap is far larger than the stack, and once something is placed there, it won't disappear until you tell it to, avoiding the use of those nasty global variables.
Note that many memory managers are greedy: even when memory is properly freed with delete, it is not necessarily returned to the OS. That is, if your program allocates 1000 bytes, the size of the program in memory will grow by about that amount. If you then free those 1000 bytes, the program will not release the memory back to the OS; it will still look like it owns 1000 bytes of system memory. However, the freed memory can be reused, so you would have to allocate 1001 bytes before the program size grows again. Only when the program ends will this memory be returned to the system.
An interesting note: on the SGI, the preceding paragraph is correct. However, it seems that the memory manager in Linux is more intelligent. When space on the heap is deleted in Linux, the allocated memory is returned to the operating system. Depending on the application, this can be a good or a bad thing.
When checking for an out of memory condition, one can get a little fancier and make use of the function set_new_handler(), declared in the header <new>, which specifies what new will do when it fails. You still need to try/catch the exception though:
void out_of_store() {
    cerr << "ERROR: operator new failed: out of store\n";
    throw std::bad_alloc();
}

int main() {
    ....
    int *pI;
    std::set_new_handler(out_of_store);  // declared in <new>
    try {
        pI = new int;
    }
    catch (const std::bad_alloc&) {
        return 1;
    }
    ....
}
If you want your code to follow the C convention of returning a null pointer when allocation fails, then the form new(nothrow) must be used:
int *pI;
pI = new(nothrow) int;
if (pI == NULL) {
    cout << "ERROR: no more memory\n";
    return 1;
}
Note:
Variables are usually stored in RAM: in the static data area (global and static variables), on the stack (variables declared within a method/function), or on the heap (anything created with new or malloc). All three are RAM, just different locations. Pointers need a little more care. The pointer itself follows the rules above (a pointer declared within a function is stored on the stack), but the data it points to can live anywhere. A memory block obtained from malloc, or an object created with new, is stored on the heap. You can also create pointers pointing to the stack (e.g. "int a = 10; int * b = &a;", b points to a and a is stored on the stack); only memory allocated using malloc or new counts towards heap memory.