CPlusPlus.pdf

1 Superficially: Not so different from C.

1.1 C and C++ both have header files, implementation files

C: suffixes .h and .c

C++: usually either (.h and .cpp), or (.hh and .cc).

1.2. Compiler: g++ replaces gcc

C:

    $ gcc -Wall myFile.c -o myProgram
    $ ./myProgram

C++:

    $ g++ -Wall myFile.cc -o myProgram
    $ ./myProgram

1.3. main() function. Required.

Get in the habit of writing out the main function’s arguments.

    int main(int argc, char *argv[])

1.4. Functions

a) format: same

    return-type function-name(type name, type name) {
        // calculations here;
        return something-matching-the-promised-return-type;
    }

b) pass by value, or pass by reference. Same.

c) My Guru Said “constify anything that will bear it.” The “const” modifier exists in both C and C++. It is used to protect arguments from modification.

    void myGreatFunction(const int *x, const double *y, double *z);

The const modifier protects the values that x and y point to from change, but the value at z can be revised in the function. This is how functions in the GSL or BLAS generally work.

d) prototypes: interface declarations, usually in separate *.h files. Function names;

e) Difference: C++ allows “overloaded” functions. Use the SAME NAME for a function over and over again, and vary the argument types.
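
A minimal sketch of overloading, assuming a made-up function name `twice` (it is not from any library). The same name is defined three times, and the compiler chooses the version whose argument type matches the call.

```cpp
#include <string>

// Three functions share one name; the compiler picks by argument type.
// The name `twice` is hypothetical, chosen only for illustration.
int twice(int x) { return 2 * x; }
double twice(double x) { return 2.0 * x; }
std::string twice(const std::string &s) { return s + s; }
```

Calling `twice(4)` uses the int version and `twice(1.5)` uses the double version. In C, each of these would need its own name.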

1.5. C variable types present in C++ as well

All of the types from C are available: int, float, double, char. C++ and its Standard Library offer easy access to a larger set of variable types.

1.6. Math: same! * / % += *= ++i i++

1.7. Control: if, else, for, while, do, ?:, and so forth

1.8. Scope concepts same/similar. Squiggly braces demarcate the area where local variables can live. As in C, a variable declared inside the braces, like i, only lives inside the braces.

    int x = 2;
    {
        int i = 7;
        cout << "i times x is " << x * i << endl;
    }
    // Note x still exists after that, but i was local to the scope.

1.9. Pointers. Like C, C++ allows pointers. Syntax is the same.

    type *x;

“type” can be “int”, “double”, etc. This declares a pointer we call x, and the value stored at the location it points to is *x.

    &x

asks x what its memory location is. C++ uses the “&” operator much more often than we did in Objective-C. I’m trying to find out if that is simply a “novice” thing textbooks teach. (Part of the reason is that C++ also uses “&” in declarations to mark reference types, as in int &r.)

Use pointers, as in C. Use pointers also to refer to more elaborate data structures (including C++ classes, discussed below).
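
A small sketch of the pointer syntax just described. The function names `doubleInPlace` and `pointerDemo` are hypothetical, invented for this illustration.

```cpp
// Hypothetical helper: doubles the value that p points to.
void doubleInPlace(int *p) { *p = 2 * (*p); }

// Takes the address of a local variable, changes it through the
// pointer, and reads the result back by dereferencing.
int pointerDemo() {
    int i = 32;
    int *x = &i;       // x holds the address of i
    doubleInPlace(x);  // modifies i through the pointer
    return *x;         // *x reads the value stored at that address
}
```

The pointer is only used while i is still in scope, so nothing here outlives the variable it points to.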

1.10. Fixed size arrays created as in C

    int x[7];

Recall this is “local” to a scope; it ceases to exist when the program exits that scope. To pass a vector to another function, or to grab a “really large” amount of storage, we need to dynamically allocate memory from the “heap” of RAM. Allocation is different in C++: we don’t use “malloc”; instead, we use new.

2 Initially Noticed Differences from C

namespaces and include statements.

The superficial difference is the syntax.

C:

    #include <stdlib.h>

C++:

    #include <cstdlib>

Note: no “.h” in C++ when asking for header files from the “built in libraries” from the system.

Header files in your own code are included the same way in both languages:

    #include "header-in-my-code-folder.h"

Quotation marks signal “don’t go look in the system for this header first, look in my current folder”. In C++, you’d only see the “.h” where it tells the compiler to look in your code for a header file.

Every installation of the g++ compiler will include the C++ Standard Library, which has many header files. It is larger than the C standard library. Many C libraries are adapted for use within C++; for example, stdlib.h is now accessed as:

    #include <cstdlib>

More substantial difference: simply including a header file does not make its functions “ready to use”. Functions are inside namespaces. That is discussed below.
Discourages printf; instead uses cout and endl with << between pieces. Somewhat reminiscent of Perl string printing (IMHO).

    cout << "Some message to user" << x << y << endl;

Arguably, this is more convenient than printf. To the novice user, it is the most prominent difference between C++ and C, and I think that’s unfortunate because it’s not really important at all in the larger scheme of things. I wish I could just ignore that, and use printf or Rprintf() instead. I honestly see no benefit in having a new way to do the old thing.

CAUTION: cout and endl are in the namespace “std” and we can’t use them unless we use that namespace (next topic).
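
For what it is worth, the printf family is still available in C++ through the header cstdio, so the two styles can be compared side by side. The sketch below formats the same text both ways. The function names are hypothetical, and std::ostringstream stands in for cout here only so the result can be returned as a string.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Format the text with the C-style printf family.
std::string formatWithPrintf(int x, double y) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "x = %d, y = %.1f", x, y);
    return std::string(buf);
}

// Build the same text with the << operator on a stream.
std::string formatWithStream(int x, double y) {
    std::ostringstream out;
    out << "x = " << x << ", y = " << y;
    return out.str();
}
```

The stream version needs no format codes because << is overloaded for each argument type, which is the convenience the notes concede.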
Functions and variables can be organized within “namespaces”. The using keyword tells the compiler to look for otherwise unfound functions in a namespace.

Quick R user note. In R, namespaces are now required for all packages and we can access functions using “::” notation. This is becoming more usual to R users, as more and more help pages in R use

    package::function()

That notation is more specific; it does not load a whole package when a function is required. Instead, we have often been taught to use

    library(package)
    function()

Let’s consider “cout”, which writes to the screen. If you are a “bare bones” namespace-respecting serious C++ programmer, your code explicitly refers to the namespace when cout is used. Like this:

    std::cout << "some words print on the screen" << std::endl;

In the olden days, say before 2005, it would be usual to simply import the WHOLE NAMESPACE like so, thus making all of its functions immediately available. Here, we tell it to look in a namespace “std”, and if it finds it, it uses it. I use “endl” here because it gives new lines at the end. Otherwise, we have to insert “\n” at the end.

    // from http://www.cplusplus.com/doc/tutorial/namespaces
    #include <iostream>
    using namespace std;

    int main() {
        cout << "some message to the reader" << endl;
        return 0;
    }

a) The previous makes ALL of the functions in the std namespace available:

    using namespace std;

Eubank & Kupresanin say this is “overkill” because it gives access to the whole standard template library. In various C++ Web forums, I find authors going further, arguing it is generally “bad form” to do this with any namespaces, because it gives the author poor control over access to function names. Suppose there are several “using” statements and there are functions with the same name in each? All hell breaks loose. See http://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-a-bad-practice-in-c.

b) A program can have several namespaces floating about; even within a user’s code, namespaces can be spawned, as shown here:

    #include <iostream>
    using namespace std;

    namespace first {
        int var = 5;
    }

    namespace second {
        double var = 3.1416;
    }

    int main() {
        cout << first::var << endl;
        cout << second::var << endl;
        return 0;
    }

c) There is an in-between phase, where the “using” statement can bring in just one function from a namespace, not the whole thing.

    #include <iostream>
    using std::cout;
    using std::endl;

    namespace first {
        int x = 5;
        int y = 10;
    }

    namespace second {
        double x = 3.1416;
        double y = 2.7183;
    }

    int main() {
        using namespace first;
        cout << x << endl;
        cout << y << endl;
        cout << second::x << endl;
        cout << second::y << endl;
        return 0;
    }

d) cout, endl, cin are in the namespace “std”. The C++ standard library puts those names in std.

e) Generally, we are advised to follow one of 2 coding standards.

i. Don’t import the namespace at all, but explicitly refer to the functions by their full names, like this:

    std::cout or std::endl

Example usage

    std::cout << "some silly message" << std::endl;

That seems verbose to me, but it seems to be what the experts prefer for clarity.

ii. Use a more focused using statement

    using std::cout;
    using std::endl;

This tells the compiler where to look for cout and endl.
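
The name-clash worry raised under (a) can be sketched as follows. The function `scale` and the wrapper `clashDemo` are hypothetical names made up for this illustration. Both namespaces define a function with the same signature, so once both are imported an unqualified call is ambiguous and the compiler rejects it.

```cpp
namespace first  { int scale(int x) { return 2 * x; } }
namespace second { int scale(int x) { return 10 * x; } }

int clashDemo() {
    using namespace first;
    using namespace second;
    // return scale(3);   // would not compile: the call is ambiguous
    // Fully qualified names always say which one is meant.
    return first::scale(3) + second::scale(3);
}
```

The two using statements compile without complaint on their own. The error only appears at the unqualified call, which is part of what makes the problem hard to spot in a large program.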
Strings!: More convenient usage of character strings

    #include <string>

This allows a new variable type, string s = "whatever you want in parens". And s can be printed with cout.

3 Memory Management

3.1 C++ allows pointers, manual allocation and freeing of memory

3.2 But they don’t want us to think of explicitly allocating arrays in the C style, anymore. Instead, we should use a pre-designed Vector class that will handle that for us. The language is gravitating to a point where the allocation of memory is concealed and automatic.

3.3 new and delete: replacements for malloc and free. If you ever do need to explicitly allocate memory, here’s how it can be done. new is an operator.
Use new to allocate memory for pointer variables. Don’t use “malloc” to allocate dynamic memory.

Allocate a pointer to a single variable:

    double *myX = new double;

For a single value, new works as well

    int *x = new int;  // same as: int *x; x = new int;
    // remember to delete x when you don't need it anymore!

When finished, remove it from memory.

    delete x;

Allocate storage for an array of n floating point numbers.

    double *x = new double[n];

The user can treat that like a pointer in C. x[0] grabs the first element. When finished, remove it from memory.

    delete[] x;

Summary: No malloc, the allocation of arrays uses new instead.
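
Putting those pieces together, here is a minimal sketch of the array form of new and its matching delete[]. The helper names `makeSequence` and `sumSequence` are hypothetical.

```cpp
// Hypothetical helper: allocate n doubles on the heap, fill with 0, 1, 2, ...
double *makeSequence(int n) {
    double *x = new double[n];   // array form of new
    for (int i = 0; i < n; i++) {
        x[i] = i;
    }
    return x;                    // the caller now owns the storage
}

// Sum the sequence, then release the storage.
double sumSequence(int n) {
    double *x = makeSequence(n);
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += x[i];
    }
    delete[] x;                  // delete[] pairs with new[]
    return total;
}
```

The pairing matters: storage obtained with new[] is released with delete[], and storage obtained with plain new is released with plain delete.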
A caution I should have emphasized in C, but will make a big point about it now in C++! Recall this example from the pointer discussion

    int *x;
    int i = 32;
    x = &i;  // x equals the address of i

Please note that this essentially lets *x “steal” the memory location created by allocating i. HOWEVER, when the process leaves that scope, i will cease to exist, and x is left dangling: it no longer points to anything valid. Thus, for our purposes, the most commonly illustrated usage of pointers is almost always wrong, or at least useless, since we DO need to pass information between scopes. Instead, we need to allocate memory for the pointer *x.

This declares a pointer, but does not allocate permanent storage:

    int *x;

It just means that, when storage is eventually created, we can refer to (the first position of it) by x. Arrays will automatically claim the memory for us.

    const char *aMessage = "hello";
    int *a = new int[285];  // dynamically allocated array of 285 ints
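
To make the caution concrete, here is a sketch of the safe pattern: storage claimed with new outlives the function that created it, so a pointer to it can be handed back to the caller. The name `makeCounter` is hypothetical. The unsafe variant is shown only as a comment.

```cpp
// Returns a pointer to heap storage, which outlives this function's scope.
int *makeCounter(int start) {
    int *p = new int;   // storage claimed from the heap
    *p = start;
    return p;           // still valid after the function returns
}

// By contrast, this would hand back a dangling pointer, because
// `local` ceases to exist when the function returns:
//   int *broken() { int local = 32; return &local; }
```

The caller is responsible for calling delete on the returned pointer when it is no longer needed.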
Recall in C, a two dimensional array (a matrix) is handled in either of 2 ways

a) Create a 1 dimensional array, and then get good at “stride” calculations to find the value that would be in row i, column j if it were a matrix. This is how FAST calculations are done in C programs. C++ can do that, of course.

b) Create a “star star” pointer, and use malloc to first allocate the initial positions of the rows, then add a for loop to allocate the storage for each row. Recall in C we used this idiom to create a matrix “myDynamArray”, which will have 10 rows and 5 columns. This was done with malloc:

    // C 2 dimensional array
    int nRows = 10;
    int nCols = 5;
    int i = 0, j = 0;
    double **myDynamArray;
    myDynamArray = malloc(nRows * sizeof(double *));
    if (!myDynamArray) {
        printf("Memory alloc failed");
        exit(1);
    }
    for (i = 0; i < nRows; i++) {
        myDynamArray[i] = malloc(nCols * sizeof(double));
        if (!myDynamArray[i]) exit(1);
    }

The C++ variant is a bit simpler looking. The new operator knows how much space to claim.

    int i;
    int nRows = 10;
    int nCols = 5;
    double **myDynamArray;
    myDynamArray = new double *[nRows];
    for (i = 0; i < nRows; i++) {
        myDynamArray[i] = new double[nCols];
    }

c) I hasten to remind you, however, that C++ users would not usually manually allocate memory in this way; they would instead find a pre-built class in some library and use that.

4 Standard Library

The Standard Template Library was an early attempt to provide a standard, large set of functions and data structures. Poor, incomplete implementations of the STL caused me a lot of frustration when I started programming; I decided it was better to use C and then get best-of-breed add-on libraries to fill the gaps that the STL was meant to cover.

The current C++ Standard Library is a large set of things, similar to the STL. Until very recently, I had believed they were the same thing.
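
To give a flavor of what the Standard Library offers, here is a minimal sketch using two of its containers, std::vector and std::map. The helper names `mean` and `countWords` are hypothetical. Neither function allocates or frees memory by hand, because the containers manage their own storage.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: mean of the values in a vector (0 if it is empty).
double mean(const std::vector<double> &v) {
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); i++) {
        total += v[i];
    }
    return v.empty() ? 0.0 : total / v.size();
}

// Hypothetical helper: count how often each word occurs, using a map
// from word to count. operator[] creates a zero entry on first use.
std::map<std::string, int> countWords(const std::vector<std::string> &words) {
    std::map<std::string, int> counts;
    for (std::size_t i = 0; i < words.size(); i++) {
        counts[words[i]]++;
    }
    return counts;
}
```

A vector grows with push_back and is indexed like a C array, while a map is indexed by key. Both release their storage automatically when they go out of scope.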
The C++ Standard Library was made part of the ISO C++ standard; the STL is not. See: http://en.wikipedia.org/wiki/C++_Standard_Library

Highlights: Collections, “doubly-linked lists”, “set”, “map”, “vector”

C’s Standard Library http://en.wikipedia.org/wiki/C_Standard_Library is also included. “Each header from the C Standard Library is included in the C++ Standard Library under a different name, generated by removing the .h, and adding a ’c’ at the start; for example, ’time.h’ becomes ’ctime’.”

Note: C’s Standard Library is a large list of header files, only one of which is “stdlib.h”.

Wish I had a couple of working examples of vector and map to insert here.

5 Big new C++ feature: Classes
Think informally for a moment.

a) We want to create separate “things” that have data and abilities. Call those things “optimizers” or “MCMC chains” or ...

i. Maybe an optimizer is a maximum likelihood estimation routine. All optimizers need to receive data, receive instructions. After that, they may go in separate directions searching for the answers.

ii. In MCMC simulations, we often need several separate “chains”. If we design the chain class well, then we can comfortably create several separate instances and let them grow their own separate chains.

b) We design a common framework in a “class”, and

c) Then create instances from that class. We tell each separate instance what to do. They can “go their own ways” by adjusting their variables separately.

Benefits

a) We preserve our sanity by isolating calculations into separate places. There’s less danger that we accidentally copy information from one object to another.

b) We economize on coding by placing some work into common, re-usable routines.

c) If we write good routines, we learn to trust them, and “abstract away” from the implementation details.

d) We more clearly conceptualize the differences among different types of things.
In C, we can approximate object-oriented computing, but the syntax and design may be tedious. It requires a re-thinking of the way we use C, and many people then say “why not just use C++ or Java instead?” The answer generally is, “C will outlast those other languages, so we will stick with that.”

a) In C, we usually think of a function as a thing with no “state”. We put arguments in, results come out.

i. We have to pass in ALL of the values that are required to make a calculation.

b) Pass by reference is one way to create the illusion of continuity, to “remember” by passing in a pointer variable. It allows a function to take note of the “current state” of the information, and then “revise variables.”

i. We have a pointer to some data, say a large struct containing all of our instance variables.

ii. we pass that pointer to a function,

iii. the function dereferences the pointer and uses the data to create new values,

iv. and the values the pointer refers to might themselves be changed.

v. Example: Recall the random generator functions in L’Ecuyer’s MCRG generator http://www.iro.umontreal.ca/~lecuyer/myftp/streams00/c2010. That keeps the internal state of the generator in a pointer variable that is declared like so. (typedef is a C convenience for creating new variable types).

    typedef struct RngStream_InfoState *RngStream;

    struct RngStream_InfoState {
        double Cg[6], Bg[6], Ig[6];
        int Anti;
        int IncPrec;
        char *name;
    };

An instance of that type RngStream is created, and then it is passed to every function that does work to generate random numbers.

    void RngStream_ResetStartStream(RngStream g);
    void RngStream_ResetNextSubstream(RngStream g);
    static double U01d(RngStream g);

vi. Problem: this is a bit of a “mental stretch” and not convenient to manage lots of variables.

c) BTW: Define “static”. Static has many grossly different meanings, it can be horribly confusing:

- A “static function” is visible only inside the current file. The opposite of “extern”, globally visible.
- A “static variable” inside a function is a value that is remembered between calls to the function. This can be used for continuity of calculations. I think most everyone would agree that it is weird (mistaken) to use the word static for such different ideas. And it has other uses as well when we come to classes.
- problem: a “static variable” value is not selectively remembered. It will be available to any caller of the function who comes along.

d) Function names cannot be re-used; there is no “overloading” that allows one function name to be used with various argument types.

e) There is no obvious method to “subclass” in C. If we have a struct with 41 variables in it, and we then want to specialize it to a new kind of struct that has all of those variables plus 10 more, we should “copy and paste” the old code to make the new struct?
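
The “static variable inside a function” meaning from item (c) can be sketched in a few lines. The function name `nextTicket` is hypothetical. The code is the same in C and C++.

```cpp
// Hypothetical function: counts how many times it has been called.
// `calls` is initialized once and keeps its value between calls.
int nextTicket() {
    static int calls = 0;
    calls++;
    return calls;
}
```

Each call returns one more than the previous call, no matter who the caller is. That illustrates the problem noted above: there is a single remembered value shared by every caller, not one per “instance”.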
A class

a) instance variables

b) constructor function. (Every class has at least one; the compiler supplies a default if you write none.)

c) destructor function.

d) “member functions” aka “methods” aka “method functions”

See example-3-class.cc example-4-class.cc example-5-class.cc example-6-class.cc

e) Varying levels of information access.

i. public: can be accessed from main with “.” or “->”

ii. private: can only be accessed from within the class’s own member functions.

iii. protected: like private, but also accessible from the member functions of derived classes.

Key idea: If some other type of object has a pointer to a MyClass object x, a public variable would allow

    x->var1;       // to retrieve value of var1
    x->var1 = 19;  // to set var1

Variables that are declared private or protected don’t allow that; they will insist that the other object types interact with “x” in a more polite way. MyClass should include “get” and “set” methods to allow values in and out.

    void MyClass::setVar1(int x) {
        var1 = x;
    }

    int MyClass::getVar1() {
        return var1;
    }

Clever C++ writers who like to use “overloaded functions” might use the same name for both of those, as in

    void MyClass::Var1(int x) { var1 = x; }
    int MyClass::Var1() { return var1; }

f) The things I called “class variables” (variables common among all instances of a class) can be created with the static modifier (const static when the shared value is a constant). That’s confusing, don’t worry about it at the moment. You’d really have to study C++ before you “get it”.
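
The pieces listed above can be seen together in one small sketch. The class name `Chain` and its members are hypothetical, chosen to echo the MCMC example. It has a constructor, a destructor, a private instance variable with get and set methods, and one static “class variable” that counts live instances.

```cpp
// Hypothetical class illustrating the parts of a class listed above.
class Chain {
public:
    Chain(int start) : var1(start) { count++; }  // constructor
    ~Chain() { count--; }                        // destructor
    void setVar1(int x) { var1 = x; }            // "set" method
    int getVar1() const { return var1; }         // "get" method
    static int howMany() { return count; }       // reads the class variable
private:
    int var1;          // instance variable, hidden from outside code
    static int count;  // class variable: one copy shared by all instances
};

int Chain::count = 0;  // a static member is defined once, outside the class
```

Code outside the class cannot touch var1 directly. It has to go through setVar1 and getVar1, which is the “more polite” interaction described above.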
this, *this, this->. “this” is a self-referential keyword that can be used inside methods, a pointer to the instance in which the C++ process is currently located. The most reasonable use of “this” is to pass “oneself” as a return value or a function argument. In many contexts, it is quite reasonable to return “this” or “*this” after setting variables inside an object. Here’s an example I revised from a forum post on (http://publib.boulder.ibm.com/infocenter) called “The this pointer (C++ only)”.

    // Suppose MyClass is defined, with instance variables:
    //   int len;
    //   char *myName;
    // And member functions (aka methods) prototyped
    int GetLen();
    char *GetName();
    MyClass& Set(char *);
    MyClass& Cat(char *);
    MyClass& Copy(MyClass&);

    char *MyClass::GetName() {
        return myName;
    }

    MyClass& MyClass::Set(char *pc) {
        len = strlen(pc);
        myName = new char[len + 1];  // +1 leaves room for the terminating '\0'
        strcpy(myName, pc);          // myName equals *pc
        return *this;
    }

    MyClass& MyClass::Cat(char *pc) {
        len += strlen(pc);
        // caution: assumes myName already has room for the extra characters
        strcat(myName, pc);  // makes myName longer by appending *pc
        return *this;
    }

    MyClass& MyClass::Copy(MyClass& x) {
        Set(x.GetName());  // copy name from some other instance x
        return *this;
    }

My guess would be that these functions would be more useful if they returned a pointer, rather than the “actual thing” *this. In an agent-based simulation model, for example, it would be much more usual to write:

    MyClass *MyClass::Copy(MyClass *x) {
        Set(x->GetName());
        return this;
    }
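
One practical payoff of returning *this is that calls can be chained. The sketch below is a simplified stand-in for the example above, not the same code: the class name `Label` is hypothetical, and it stores the name in a std::string so there is no manual memory management to get wrong.

```cpp
#include <string>

// Hypothetical class: each setter returns *this so calls can be chained.
class Label {
public:
    Label &Set(const std::string &s) { name = s; return *this; }
    Label &Cat(const std::string &s) { name += s; return *this; }
    const std::string &GetName() const { return name; }
    int GetLen() const { return static_cast<int>(name.size()); }
private:
    std::string name;
};
```

With this design, `a.Set("abc").Cat("def")` first sets the name and then appends to the same object, because each call hands back a reference to the object it was called on.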
Relationship between “structs” and “classes”.

a) Review “struct” in C. Like C, a C++ struct combines different variables into one “thing”. We access the struct data formally like so:

    struct whatever x;
    struct whatever *xptr = &x;
    x.var1 = 7;        // suppose var1 is a variable in the struct
    (*xptr).var1 = 7;  // same as -> next line
    xptr->var1 = 7;

b) A C++ class is really just a struct (a lot of variables grouped together) along with

- functions can be “members” of a struct.
- The “member functions” of a struct have free access to the data in all of the variables of the class (which is really just a struct). A class is a super-powered struct in C++
- Inside the member functions, one can be explicit in asking for values of our own instance variables.
    - “this” is a keyword, a pointer to the “instance” in which I currently am positioned.
    - Could access an instance variable by notation like this->var1
    - However, it is not necessary to do so. Inside a member function, the “this->” part is assumed. If you ask for a variable, the compiler looks in “this” for it.
    - The only counter-examples I know of are complicated mistakes in which programmers have used the same name for instance variables and the arguments to member functions. Suppose var1 is an instance variable. Suppose a function also declares an argument var1.

          void MyClass::someFunction(int var1) {
              // Danger, Will Robinson. If you use "var1", do you expect
              // the compiler to use the local automatic variable "var1"
              // or the instance variable var1?
              this->var1 = 2 * var1;  // multiply the argument by two, put the value into this->var1.
          }

    - Objective-C forbids this kind of mistake at compile time, but C++ tolerates the ambiguity (an unqualified var1 refers to the argument, which hides the instance variable).
    - Many C++ novices write “this->” compulsively in their member functions for “clarity,” but if you go ask in the Forums on StackOverflow or CPlusPlus.com, you see the experts uniformly say “stop doing that.”
    - Nevertheless, you will find plenty of “production quality” code that is littered with unnecessary “this->” usage. Example: Check the C++ source code for the R package RSiena. For example, from the Model.m constructor:

          Model::Model() {
              this->lconditional = false;
              this->lneedChain = false;
              this->lneedScores = false;
              this->lneedDerivatives = false;
              this->lparallelRun = false;
              this->linsertDiagonalProbability = 0;
              this->lcancelDiagonalProbability = 0;
              this->lpermuteProbability = 0;
              this->linsertPermuteProbability = 0;
              this->ldeletePermuteProbability = 0;
              this->linsertRandomMissingProbability = 0;
              this->ldeleteRandomMissingProbability = 0;
              this->lsimpleRates = 0;
              this->lmodelType = NORMAL;
          }

c) In fact, a struct in C++ is simply a class with the member variables and functions “public” by default (Eubank and Kupresanin, p. 60).
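
Two points from this discussion can be checked with a short sketch: a struct’s members are public by default while a class’s are private, and this-> is needed when an argument hides an instance variable of the same name. The type names `PointS` and `PointC` are hypothetical.

```cpp
// Hypothetical types: the same member, different default access.
struct PointS {
    int var1;   // public by default in a struct
};

class PointC {
    int var1;   // private by default in a class
public:
    PointC() : var1(0) {}
    // The argument has the same name as the instance variable, so
    // this-> is required to reach the instance variable.
    void setDoubled(int var1) { this->var1 = 2 * var1; }
    int getVar1() const { return var1; }
};
```

Outside code can write `s.var1 = 7` for a PointS object, but the same line for a PointC object would be rejected by the compiler because var1 is private there.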
Subclasses. Here’s where the power of the truly object-oriented language comes to the forefront. If class Citizen exists, we might create classes Voter, Politician, Teacher, and FactoryWorker as variants of it.

- Idea: “Citizen”, a collection of variables and methods.
- Other types are created from Citizen. Subclass
    - it “inherits” all of the variables and methods
    - It can add MORE variables and methods
    - It can OVERRIDE methods from Citizen (to specialize them, for example).

Benefits

a) conceptual clarity, model matches our theoretical idea of different object types

b) re-usable code. Write one really good set of methods for “Citizens”, then put them to use in all the different object types.
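
A minimal sketch of the Citizen idea, under the assumption that a Citizen has just a name. The member names (`getName`, `describe`, `votesCast`) are hypothetical. Voter inherits from Citizen, adds one variable and one method, and overrides one method.

```cpp
#include <string>

// Hypothetical base class: variables and methods shared by every Citizen.
class Citizen {
public:
    Citizen(const std::string &n) : name(n) {}
    virtual ~Citizen() {}
    std::string getName() const { return name; }
    // virtual lets a subclass replace this method.
    virtual std::string describe() const { return name + " is a citizen"; }
protected:
    std::string name;  // visible to subclasses, hidden from other code
};

// Voter inherits everything from Citizen, adds a variable and a method,
// and overrides describe() to specialize it.
class Voter : public Citizen {
public:
    Voter(const std::string &n, int v) : Citizen(n), votesCast(v) {}
    int getVotesCast() const { return votesCast; }
    virtual std::string describe() const { return name + " is a voter"; }
private:
    int votesCast;
};
```

A Voter can be used anywhere a Citizen is expected. Because describe() is virtual, calling it through a Citizen pointer that actually points at a Voter runs the Voter version.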

shamun/c++_notes.md

C++ Notes - http://pj.freefaculty.org/guides/c-programming/CPlusPlus/CPlusPlus.pdf
