Thursday, November 22, 2012

The static modifier

I'm tired of explaining this without being able to find a good page through Google to link people to, so I'm going to write it up myself. The static keyword in C and C++ has several distinct uses, so I'm going to clarify each one. The first two cases are the most talked about; it's the third case that I'm really interested in going over.

static local variables

A static local variable has static storage duration. That is, it's like a global variable in that there is only one copy of it and it exists for the whole program (it's initialized once and keeps its value between calls), but unlike a global, it can only be used by name inside the scope where it's declared. For example:

#include <iostream>

void function()
{
    int x = 0;        // Each call to function() gets its own copy of x because it's non-static
    static int y = 0; // Each call to function() re-uses the same y because it's static

    ++x;
    ++y;

    std::cout << "x: " << x << " y: " << y << std::endl;
}

int main()
{
    // Note that it's impossible for main() to access function()'s x or y variables
    function();
    function();
    function();
    function();
}

/*
Program output is:
x: 1 y: 1
x: 1 y: 2
x: 1 y: 3
x: 1 y: 4
*/
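
A common practical use of this is a function that needs to remember something between calls without exposing a global. Here's a minimal sketch; nextId(), makeInitial(), readCached(), and initCount are names I made up for illustration. The second half relies on C++ behavior: a static local with a non-constant initializer is initialized the first time execution reaches it, and never again (C only allows constant initializers here).

```cpp
// Hypothetical helper: hands out a new ID on every call.
int nextId()
{
    static int lastId = 0; // initialized once; keeps its value between calls
    return ++lastId;
}

// Counts how many times makeInitial() has run (illustration only).
int initCount = 0;

// Hypothetical initializer; imagine it does something expensive.
int makeInitial()
{
    ++initCount;
    return 100;
}

int readCached()
{
    // C++ only: makeInitial() runs the first time execution reaches this
    // line, and is skipped on every later call.
    static int cached = makeInitial();
    return cached;
}
```

Calling nextId() three times gives 1, 2, 3, and no matter how many times you call readCached(), makeInitial() only runs once.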

static member functions and variables

This one applies only to C++, not C. If you put static on a class member function or variable, it means you no longer need an instance of that class to use it. For example:

#include <iostream>

class MyClass
{
public:
    // You have to have an instance of MyClass to call f1() because it's non-static
    void f1()
    {
        // f1() can use both v1 and v2. It can use v1 because both f1() and v1
        // need an instance of MyClass. It can use v2 because v2 doesn't even
        // need an instance of MyClass.
        std::cout << "f1 - v1: " << v1 << " v2: " << v2 << std::endl;
    }

    // You don't have to have an instance of MyClass to call f2() because it's static
    static void f2()
    {
        // Note that f2() can't use v1. f2() is static and so is v2, which is
        // why it can use v2. But v1 is not static and requires an instance
        // of MyClass, which f2() does not.
        std::cout << "f2 - v2: " << v2 << std::endl;
    }

    int v1;        // Each instance of MyClass gets its own v1 variable
    static int v2; // Every instance of MyClass shares this v2 variable
};

// Because v2 is a static member variable of MyClass, we have to put this line
// in one (and only one) source file for it to work. It's just the rules of C++
// and I won't go into it today.
int MyClass::v2;

int main()
{
    MyClass m;

    // This next line produces a compiler error; v1 requires an instance to use
    //MyClass::v1 = 1;

    // This next line is fine; it has the instance m it needs
    m.v1 = 1;

    // This next line is fine; v2 is static and doesn't need an instance
    MyClass::v2 = 2;

    // This next line is fine, but remember every instance of MyClass is sharing the same v2!
    m.v2 = 2;

    // This next line produces a compiler error; f1() requires an instance
    //MyClass::f1();

    // This next line is fine; it has the instance m to work on
    m.f1();

    // This next line is fine; f2() is static and doesn't need an instance
    MyClass::f2();

    // This next line is fine, but weird; f2() doesn't need the instance m, but it still works
    m.f2();
}

/*
Program output is:
f1 - v1: 1 v2: 2
f2 - v2: 2
f2 - v2: 2
*/
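
To show why you'd actually want a shared member, here's a small sketch of a class that counts how many instances of itself currently exist. Tracked, liveCount, and live() are hypothetical names, not anything from a library. The static variable is shared by every instance, and the static function lets you read it without having an instance at all.

```cpp
class Tracked
{
public:
    Tracked()               { ++liveCount; } // a new instance appeared
    Tracked(const Tracked&) { ++liveCount; } // copies count too
    ~Tracked()              { --liveCount; } // an instance went away

    // static: callable as Tracked::live(), no instance required
    static int live() { return liveCount; }

private:
    static int liveCount; // one counter shared by every Tracked
};

// Same rule as MyClass::v2: define it in one (and only one) source file.
int Tracked::liveCount = 0;
```

Tracked::live() starts at 0, goes up by one for every Tracked that gets constructed, and goes back down as each one is destroyed.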

static free functions and variables

Ah, the final case of static. To understand this one, first a basic understanding of how C++ code is compiled is needed. Let's say you have two source files, f1.cpp and f2.cpp, and you're trying to compile them together into a single program. There are actually a few stages in the compiling process. In the first stage, f1.cpp and f2.cpp are compiled separately as two different translation units. The compiler will typically generate intermediate object files, f1.o and f2.o, that represent the compiled versions of f1.cpp and f2.cpp, respectively. However, you still don't have your program, so the second stage of compiling* is started: the linking stage.
*A more technically correct way of saying this is that linking is the second stage of generating your executable program (compiling was the first stage).

Linking is what it sounds like: linking compiled translation units together to get a final executable (or library, if that's what you're making). f1.o and f2.o need to be linked together, and if they use any functions from other libraries (like the standard library), they have to be linked to the libraries as well.

For example, let's say f1.cpp uses the standard library's std::abs function. The compiler will compile f1.cpp into f1.o and notice that it's trying to use a function named std::abs. The compiler doesn't have the code for std::abs, as it's only working with f1.cpp, so it makes a note in f1.o that basically says, "Hey linker, the code right here is trying to call a function named std::abs; I don't have it. When you link things together, could you find a function named std::abs and patch this part to call that function?" When the linker comes in and starts linking f1.o and f2.o, it looks for dependencies that it needs to resolve, like the std::abs dependency in f1.o. The linker looks at all of its available symbols and says, "Yeah, I've got a std::abs right here and its symbol matches the symbol that f1.o needs." If you ever compile a program and it gives you some "Undefined/unresolved symbol" error, it's basically the linker saying, "The compiler compiled some code and asked me to patch a call to a function it didn't have access to (like in our std::abs example), but I can't find anything that matches what the compiler said needed to go there!"

Here's where the static modifier comes in. When you apply it to a free function/variable (like a global function or variable), it basically says "This function/variable is only for this translation unit; it is not to be shared with other translation units." The reason your code can use std::abs is specifically because std::abs is not static. If it were, the linker would complain and give you an "Undefined/unresolved symbol" error because it couldn't find a std::abs function that both matched your needs and that you had access to use. Consider the following example:

f1.cpp:
#include <iostream>

void function()
{
    std::cout << "f1 function" << std::endl;
}

int main()
{
    function();
}

f2.cpp:
#include <iostream>

void function()
{
    std::cout << "f2 function" << std::endl;
}

Compiling output from command line:
$ c++ -o program f1.cpp f2.cpp 
duplicate symbol __Z8functionv in:
    /var/folders/3k/sr3xwzjn16v2n40ng0l0xgzr0000gn/T/f1-URDiYq.o
    /var/folders/3k/sr3xwzjn16v2n40ng0l0xgzr0000gn/T/f2-HKXeEk.o
ld: 1 duplicate symbol for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Here, the linker complained saying there was a duplicate symbol __Z8functionv (which is just the mangled symbol the compiler produced that represents void function()). The linker doesn't know if it should use function() from f1.cpp or function() from f2.cpp. The solution is to mark one (or both) of the functions as static. The linker tries to share non-static symbols from translation units, and if there's a conflict, it errors out. It can't share static symbols, however (because that's the whole point of static in this case), so marking one (or both) of them as static means that the linker doesn't have the conflict of trying to share two symbols with the same name.

static is also useful for "privatizing" functions in a file. For example:

f1.cpp:
void privateFunction();

int main()
{
    privateFunction();
}

f2.cpp:
#include <iostream>

void privateFunction()
{
    std::cout << "f2 privateFunction!" << std::endl;
}

In this example, f1.cpp is able to call f2.cpp's privateFunction(), which may not be what you want. However, if f2.cpp had marked privateFunction() as static, then f1.cpp couldn't use it, and the linker would error out saying it couldn't find a void privateFunction() symbol for f1.cpp to use.
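
Here's a rough sketch of what a "privatized" f2.cpp could look like. I've changed things from the example above for illustration: privateFunction() now returns a string instead of printing, and publicFunction() is a name I made up for the part of the file that other translation units are allowed to call.

```cpp
// f2.cpp (hypothetical revision)
#include <string>

// static: internal linkage. Only code in this translation unit can call it,
// so a declaration of privateFunction() in f1.cpp would fail to link.
static std::string privateFunction()
{
    return "f2 privateFunction!";
}

// Not static: external linkage. Other translation units can declare
// std::string publicFunction(); and the linker will find this definition.
std::string publicFunction()
{
    return "publicFunction -> " + privateFunction();
}
```

Other files can still get at the behavior, but only through publicFunction(); the helper itself stays an implementation detail of f2.cpp.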

In C++, there's an alternative way to accomplish what this third use of static does: unnamed namespaces (sometimes, technically incorrectly, called anonymous namespaces). I don't really like that StackOverflow question I linked to though, because in C++11, they've un-deprecated the static keyword when declaring objects in a namespace scope (7.3.1.1 paragraph 2 no longer exists in C++11). Also, note the subtle (but potentially significant) difference between static and unnamed namespaces in this answer.
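
For comparison, here's a minimal sketch of the two approaches side by side in one source file. All of the names (helperCalls, doubled(), tripled(), compute()) are hypothetical. In C++11 and later, everything inside the unnamed namespace has internal linkage, just like the function marked static.

```cpp
// Unnamed namespace: nothing in here is shared with other translation units.
namespace
{
    int helperCalls = 0; // private to this file

    int doubled(int value)
    {
        ++helperCalls;
        return value * 2;
    }
}

// The static way of getting the same "this file only" effect.
static int tripled(int value)
{
    return value * 3;
}

// Not static and not in the unnamed namespace, so other translation units
// can declare int compute(int); and link against this definition.
int compute(int value)
{
    return doubled(value) + tripled(value);
}
```

Another file could define its own doubled() or tripled() without any duplicate symbol complaints from the linker, because neither one is shared outside this file.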

And there you have it! The three different ways to use static. And now I have a resource I can just link to when I need to explain this instead of googling in vain for a good explanation of the third case.
