Welcome, Guest! Login | Register

An Introduction to STL [Print this Article]
Posted by: Persuter
Date posted: Jun 26 2003
User Rating: N/A
Number of views: 5070
Number of comments: 5
Description: (Standard Template Library)
An Introduction to STL

(I am NOT going to explain templates in this article. If you do not know what templates are, or if you do not understand them, this is required reading.)

The Standard Template Library was introduced formally to the C++ standard in 1994. The main idea behind it is the concept of "generic" programming, in which a general set of algorithms can be applied to a set of containers. This helps minimize the so-called MxN problem. Normally, sorting algorithms, for example, must be rewritten for each different data structure. So, if I have 3 sorting algorithms and 4 data structures, I end up with 12 implementations that I have to write. However, if I use generic programming, I write three sorting algorithms that interface in a standard way to a general container, and then four specific structures that conform to the general container type. Thus I have only written three algorithms, but they work with all the data structures.

First, I'll talk about some standard containers in the STL, and then I'll talk about the algorithms. You'll find that the containers come in handy a lot. Finding where the algorithms are useful takes a bit of practice and experience, so try to implement the containers first and then look for opportunities to use the algorithms. Also, follow along by looking up the containers and algorithms we reference at SGI's STL site, which is a really indispensable STL reference. It's pretty heavy on the mathematical aspect of the STL, but hey, math is good for you. user posted image

(Please note, by the way, that the example code does not include everything you need to make it compile. Most specifically, I omit the definition of MyType in all but the first sample, and I often omit using namespace std;, which is required for every example, and the relevant files.)

Containers

One of the simplest and most useful (and most efficient) containers is the vector class. This class allows us to pack a bunch of items of the same type in a handy-dandy vector, which we can address as if it were an array.

We use it like so:

 CODE (C++) 
#include <vector>

using namespace std;

class MyType {

public:

    MyType( void ) { a = 1; }
    MyType( int b ) { a = b; }
    MyType( const MyType& x ) { a = x.a; }

    MyType operator=( const MyType& x ) { a = x.a; return x; }

    int a;

};

void main( void )
{
    vector<MyType> vec;
    vec.push_back( MyType( 4 ) );
    vec.push_back( MyType( 3 ) );
    vec.push_back( MyType( 2 ) );

    for( vector<MyType>::iterator vecitr = vec.begin(); vecitr != vec.end(); vecitr++ )
    {
        printf( "%i ", (*vecitr).a );
    }

    printf( ": %i\n", vec[2].a );
}


In this code, I first defined a class that is a model of the concept Assignable. This means it must have a copy constructor, or MyType( const MyType& ), and an assign operator, operator=( const MyType& ). It also must have a swap function, but quite frankly I never write one, because unless you want to do something special, there is a generic swap function.

Then, I created a vector that holds MyType objects, by using the vector template. I then pushed three objects onto the back of the vector.

The for loop is a bit scary to many people who haven't seen the STL before. In it, I create a variable that is an iterator in a vector. Iterators are central to the idea of the STL; they are what allow algorithms to interface in the same way to every different data structure. You can use them exactly like pointers to MyType. vec.begin() is an iterator that points to the first object in the vector, while vec.end() is an imaginary element at the end of the vector, letting you know you've reached the end. Again, vec.end() will not return the last element in the vector, it returns garbage. That's why we break out of the for loop when vecitr == vec.end(). Dereferencing vec.end() would crash the program. As you can see, we use vecitr++ to tell the iterator to go to the next object in the vector.

As you can see, in the last line, we can also treat vectors exactly like regular arrays, indexing into them using the [] operator.

This program will output "4 3 2 : 2".

Another popular type is deque, which implements a double-ended queue. Another example using MyType:

 CODE (C++) 
#include <deque>

using namespace std;

void main( void )
{
    deque<MyType> deq;

    deq.push_back( MyType( 2 ) );
    deq.push_front( MyType( 3 ) );
    deq.push_front( MyType( 4 ) );
    deq.push_back( MyType( 1 ) );

    for( deque<MyType>::iterator deqitr = deq.begin(); deqitr != deq.end(); deqitr++ )
    {
        printf( "%i ", (*deqitr).a );
    }

    printf( ": %i %i\n", deq.front().a, deq.back().a );
}


This will print out "4 3 2 1 : 4 1". Deques can also be accessed using the [] operator just like vectors.

Now we'll look at a more interesting type, called map. Map is used to associate keys with data. For example, if you wanted to keep a bunch of files in memory, you might use this to associate the files with their names.

 CODE (C++) 
#include <map>

struct ltstr
{
  bool operator()(const char* s1, const char* s2) const
  {
    return strcmp(s1, s2) < 0;
  }
};

using namespace std;

void main( void )
{
    map<const char*, MyType, ltstr> Map;

    Map["MyType1"] = MyType( 1 );
    Map["MyType2"] = MyType( 2 );

    printf( "%i %i\n", Map["MyType2"].a, Map["MyType1"].a );
}


(Note that this code will probably give you a lot of warnings in MSVC. This is because the actual type that the compiler sees is VERY long, so it truncates it to 255 characters in the debug info, but don't worry, it won't affect your program.)

This will print out "2 1". Now, there are two interesting parts of this code. The first is the struct you see above the main function. This is called a "function object", and STL uses them quite often. As you can see, it uses the () operator. What this means is that if you have an object of type ltstr named obj, you can write the following code:

 CODE (C++) 
obj( "MyType1", "MyType2" )


This will return the bool corresponding to strcmp( "MyType1", "MyType2" ) < 0. Map requires a "partial ordering" of the keys, which means it must have a way to determine if one key is "less" than another. (For the interested, this is because it creates a binary tree of the keys with pointers to the associated data at every node.) These function objects are used in many other instances, mostly algorithms, which we'll look at in a bit.

The other interesting part is that you can use the [] operator to index into maps, thus making maps basically able to be used like an array where you can use anything, not just ints, to index into it. I'm sure you can already think of many instances where this might be useful.

Algorithms

So, now we get to the other end, where we learn how the STL's authors plan to alleviate the MxN problem. The answer is, simply, iterators. We'll look first at a nice and easy way to print out everything in a vector:

 CODE (C++) 
#include <vector>
#include <algorithm>
#include <iostream>

using namespace std;

ostream& operator<< (ostream& ostr, MyType obj)
{
    return ostr << obj.a;
}

void main( void )
{
    vector<MyType> vec;
    vec.push_back( MyType( 1 ) );
    vec.push_back( MyType( 2 ) );
    vec.push_back( MyType( 3 ) );
    vec.push_back( MyType( 4 ) );

    copy( vec.begin(), vec.end(), ostream_iterator<MyType>( cout, " " ) );
}


This will print out "1 2 3 4". Now, you may well be asking yourself, why does it do that? First, we define an operator << that allows us to do something like cout << obj on a MyType variable. We set up a vector like normal. Then, we use the copy algorithm to copy everything from begin to end into the "ostream_iterator", which is what is known as an "Output Iterator". Each item that is fed into this output iterator is printed to cout, with a " " in between. We can also use copy to copy things into another container, like so:

 CODE (C++) 
void main( void )
{
    vector<MyType> vec;
    vec.push_back( MyType( 1 ) );
    vec.push_back( MyType( 2 ) );
    vec.push_back( MyType( 3 ) );
    vec.push_back( MyType( 4 ) );

    deque<MyType> deq;

    copy( vec.begin(), vec.end(), back_insert_iterator< deque<MyType> >( deq ) );
    copy( deq.begin(), deq.end(), ostream_iterator<MyType>( cout, " " ) );
}


(Note before you go on that when you put templated types inside templates, as in the back_insert_iterator definition, the spaces are absolutely essential, at least in MSVC. It will give you an error stating unexpected end of file if you do not put spaces between the brackets.)

This will also print out "1 2 3 4". Note what we did here. We copied all the information from the vector to the deque using a generic iterator. This is how the STL solves the MxN problem: algorithms never deal with actual container objects, but rather simply with iterators.

Now we'll look at a slightly harder one, using iterators and function objects.

 CODE (C++) 
struct addfunc {

    addfunc( void ) { count = 0; }

    void operator()( MyType& obj )
    {
        count++;
        obj.a = obj.a + 1;
    }

    int count;
};

void main( void )
{
    vector<MyType> vec;
    vec.push_back( MyType( 1 ) );
    vec.push_back( MyType( 2 ) );
    vec.push_back( MyType( 3 ) );
    vec.push_back( MyType( 4 ) );

    addfunc myfunc = for_each( vec.begin(), vec.end(), addfunc() );
    copy( vec.begin(), vec.end(), ostream_iterator<MyType>( cout, " " ) );
    printf( ": %i\n", myfunc.count );
}


This should return "2 3 4 5 : 4". For_each is a useful algorithm that applies a "unary function object" to the members between begin and end. In this case, it takes a reference to each MyType object and adds 1 to the a variable. Thus, when we print the vector out, it prints "2 3 4 5". Now note an interesting thing here: for_each returns a variable of the type of the function object we passed in. This means that you can retain state between function calls, a very useful thing. In this case, we keep a count variable, and update it every time the function is called. Thus, when we print it out at the end it shows us how many items we dealt with, that is to say, 4.

We'll introduce two new algorithms now:
 CODE (C++) 
struct ltmt {
    bool operator()( MyType obj1, MyType obj2 )
    {
        return obj1.a < obj2.a;
    }
};

void main( void )
{
    vector<MyType> vec1;
    vec1.push_back( MyType( 1 ) );
    vec1.push_back( MyType( 2 ) );
    vec1.push_back( MyType( 3 ) );
    vec1.push_back( MyType( 4 ) );

    vector<MyType> vec2;
    vec2.push_back( MyType( 3 ) );
    vec2.push_back( MyType( 1 ) );
    vec2.push_back( MyType( 5 ) );
    vec2.push_back( MyType( -1 ) );

    sort( vec2.begin(), vec2.end(), ltmt() );
    set_intersection( vec1.begin(), vec1.end(), vec2.begin(), vec2.end(), ostream_iterator<MyType>( cout, " " ), ltmt() );
}


This will return "1 3". First note that we've again made a function object that determines whether one object is less than another. We then use the sort algorithm (this little baby is VERY useful) to sort vec2, because the next algorithm, set_intersection, requires both sets passed in to be sorted. We then use set_intersection to find the objects that are common to both sets, and use an ostream_iterator to print them out to the console.

I'm going to leave off here, as part of the fun part of the STL is looking at the interesting algorithms. Let me suggest checking out find, remove, replace, unique (and their variants, such as remove_copy_if), and any other algorithms that happen to catch your eye.

(Also, on SGI's page, there's a list of "functors". This is worth a look, but I haven't found them to be particularly useful, whereas the algorithms are.)

I just wanted to close off by providing a few examples of how the STL has come in handy for me personally. As part of my research earlier in the year, I had to find the intersection of two polygonal meshes, and then do some lengthy calculations on each vertex in the mesh. With the STL, it was a simple matter of reading the meshes in (easily accomplished with copy and istream_iterator), intersecting the two meshes (sort and set_intersection), calculating the curvature at each vertex (for_each), and finally printing out the final mesh and curvature to files (copy and ostream_iterator). For a simpler example, in one of my classes we were required to write a proxy cache for Web pages. While my classmates messed around with various ways to do it, writing, debugging, fixing, etc., I simply used two maps, and had the program done in a little under an hour and a half, with extremely clean code and no bugs whatsoever, since all the memory management, sorting, and data structures was handled by STL code.

This is the real advantage of STL code: by using pre-written algorithms and data structures, you ensure fewer bugs. Also, your code is much cleaner because it deals only in the high-level algorithms, rather than futzing about with low-level, uninteresting, and cluttered code. Moreover, it's quite fast. In the mesh example, I had to work with polygonal meshes of more than a million polygons. Yet the program ended up taking only twenty or thirty seconds, despite the extremely complex calculations, as the sorting and intersection was much quicker than I probably could have made it myself. I hope you're excited about using the STL in your own programs now, so start looking for places where you can use it, and you'll find your code to be nicer and less bug-prone in short order.

Rate This Article
This article has not yet been rated.

You have to register to rate this article.
User Comments Showing comments 1-5

Posted By: Persuter on Jan 21 2004 at 19:55:21
By the way, here's another advantage of the STL: it's very amenable to macros. For example, in every last one of my projects, I have the macro:

 CODE (C++) 
#define ForEachIn( a, b ) for( a = b.begin(); a != b.end(); a++ )


This way, if I want to iterate through ANY container, I can do it easily, like so:
 CODE (C++) 

vector<int> myvec;
vector<int>::iterator vecitr;

ForEachIn( vecitr, myvec )
{
   // some code that is run on each element of myvec, where the element is (*vecitr)
}


This comes in handy so often it's silly. For example, if I open up a project I'm working on right now and do Find in Files for "ForEachIn", I use it 20 times, in two thousand lines. That's once every hundred lines, and that's counting braces, function definitions, and so forth. (And it's that rare because I use algorithms a lot more. For example, I see I use copy alone 35 times.)

This, by the way, is good high-level coding. Every time you write some macro to keep you from having to write the same bit of code structure over and over again, you win in several ways. You don't run the risk of typing anything wrong (or if you do, the compiler will tell you that it doesn't recognize ForEhacIn). And, if you decide for whatever reason that you want to change that code structure, you can do it without having to redo everything.

By the way, another important use of macros is typedef for the often long and unwieldy map names. For example, let's take a real dilly I have in the above project:
 CODE (C++) 
    map< DifferentialEquation*, map< DifferentialEquation*, int > >


As you can imagine, that would get tiresome typing out again and again. It could get especially bad if I was doing actual objects, so I had to put in predicate types:
 CODE (C++) 
   map< DifferentialEquation, map< DifferentialEquation, int, ltdeq >, ltdeq >


So you typedef the map name:
 CODE (C++) 
typedef map< DifferentialEquation, map< DifferentialEquation, int, ltdeq >, ltdeq > DiffEqCouplingTable;


Now I've won again, in several different ways. First off, I obviously don't have to type this when creating the variable, nor its iterators (I can use DiffEqCouplingTable::iterator). Secondly, my type name has now become more descriptive. Though perhaps it was not obvious to you at first, I named it DiffEqCouplingTable because this particular variable describes the mathematical coupling between various differential equations. Thus the type is descriptive of what it does (it's a table). Finally, let's say that I happen to have another variable with the exact same type, but that does something different. Then I might also typedef its type like so:

 CODE (C++) 
typedef map< DifferentialEquation, map< DifferentialEquation, int, ltdeq >, ltdeq > DiffEqRelationTable


Now, if for some reason I need to change how I've stored the coupling but not the relations (there is no such thing, it's just an example word), I simply change the type in DiffEqCouplingTable. Now all my uses of that type change automatically. If I had wanted to do this without using typedefs, I would have had to replace each usage of map< DifferentialEquation, map< DifferentialEquation, int, ltdeq >, ltdeq >, and check each time whether I was dealing with the coupling table or the relations table.

So, this is further evidence of how cool and awesome macros are and how much other procedural languages suck for not having them. :D (Templates, typedefs, and actual macros all fall under the heading of macro.) Go forth and use macros like you've never used them before!

(P.S. Some of you may have been asking yourself, can I do the following:
 CODE (C++) 
template < typename A >
typedef map < DifferentialEquation, map< DifferentialEquation, A, ltdeq >, ltdeq > DiffEqTable< A >;


The answer, unfortunately, is no. The best you can do is the following:
 CODE  
template< typename A >
struct DiffEqTable
{
   typedef map< DifferentialEquation, map< DifferentialEquation, A, ltdeq >, ltdeq > Type;
};


Now you can do something like this:

 CODE (C++) 
DiffEqTable< int >::Type myTable;


That's the best you can do, unfortunately.)Edited by Persuter on Jan 21 2004, 19:59:36

Posted By: rkzad on Jan 21 2004 at 23:24:24
Holy comment, Persuter! It almost looks like this should be appended to yor article, or in a new one.

PS I didn't realize that you could do IBcode in comments... I guess I forgot lol.

Posted By: Persuter on Jan 22 2004 at 05:26:16
It is easily big enough to be a new one but I figured what the heck, I'd just put it in a little comment. Consider it an extension.

Posted By: Persuter on Feb 18 2004 at 05:33:11
Another helpful little comment. Getting annoyed at your compiler complaining about every last usage of map in your code because the ultimate type is longer than 256 characters? Put this handy-dandy little snippet at the beginning of code files:

 CODE (C++) 
#pragma warning (disable:4786)


This will, as it suggests, disable warning 4786, which is that warning. (You'll have to experiment with where to put it, I'm still not sure about the exact semantics. There's no performance penalty for using it, though, so just put it everywhere.)Edited by Persuter on Feb 18 2004, 05:33:48

Posted By: Persuter on Feb 03 2005 at 23:52:50
You know, reading over this article, it occurred to me for the first time ever that you could easily think that STL containers could ONLY store user-defined types. Not at all, the STL can work very easily with standard types, int, float, pointers, whatever. In fact, the STL can iterate over pointers as if they were iterators! For example:

 CODE  
int a[5] = { 1, 2, 3, 4, 5 };
int b[5];

copy( a, a+5, b );


That will copy the first array into the second. Isn't STL cool?Edited by Persuter on Feb 03 2005, 23:54:39


You must register to post a comment. If you have already registered, you must login.

Latest Articles
3rd person View in Multiplayer
Half-Life 2 | Coding | Client Side Tutorials
How to enable it in HL2DM

By: cct | Nov 13 2006

Making a Camera
Half-Life 2 | Level Design
This camera is good for when you join a map, it gives you a view of the map before you join a team

By: slackiller | Mar 05 2006

Making a camera , Part 2
Half-Life 2 | Level Design
these cameras are working monitors that turn on when a button is pushed.

By: slackiller | Mar 04 2006

Storing weapons on ladder
Half-Life 2 | Coding | Snippets
like Raven Sheild or BF2

By: British_Bomber | Dec 24 2005

Implementation of a string lookup table
Half-Life 2 | Coding | Snippets
A string lookup table is a set of functions that is used to convert strings to pre-defined values

By: deathz0rz | Nov 13 2005


Latest Comments
knock knock
General | News
By: omega | Dec 22 2016
 
knock knock
General | News
By: MIFUNE | Oct 10 2015
 
New HL HUD Message System
Half-Life | Coding | Shared Tutorials
By: chbrules | Dec 31 2011
 
knock knock
General | News
By: Whistler | Nov 05 2011
 
Particle Engine tutorial part 4
Half-Life | Coding | Client Side Tutorials
By: darkPhoenix | Feb 18 2010
 
Particle Engine tutorial part 2
Half-Life | Coding | Client Side Tutorials
By: darkPhoenix | Feb 11 2010
 
Particle Engine tutorial part 3
Half-Life | Coding | Client Side Tutorials
By: darkPhoenix | Feb 11 2010
 
Game Movement Series #2: Analog Jumping and Floating
Half-Life 2 | Coding | Shared Tutorials
By: mars3554 | Oct 26 2009
 
Particle Engine tutorial part 5
Half-Life | Coding | Client Side Tutorials
By: Deadpool | Aug 02 2009
 
Particle Engine tutorial part 5
Half-Life | Coding | Client Side Tutorials
By: Persuter | Aug 02 2009
 

Site Info
297 Approved Articless
8 Pending Articles
3940 Registered Members
0 People Online (3 guests)
About - Credits - Contact Us

Wavelength version: 3.0.0.9
Valid XHTML 1.0! Valid CSS!