General Notes on C

Notes in an attempt to document a C library.

Pointers

You can write C pointers in the following ways:

someType* somePtr;
someType *somePtr;
someType*somePtr;
someType * somePtr;
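
All four spellings mean the same thing to the compiler, but the * belongs to the name being declared, not to the base type. That matters when one statement declares several names. A minimal sketch (the function name is made up for illustration):

```c
#include <stddef.h>

/* Hypothetical helper: reports the size of the second name declared below. */
size_t size_of_second_name(void)
{
    /* The * attaches to `a` only: a is an int *, b is a plain int. */
    int* a = NULL, b = 0;
    (void)a;            /* silence unused-variable warnings */
    return sizeof b;    /* sizeof(int), not sizeof(int *) */
}
```

Writing the * next to the name (int *a, b;) makes this easier to spot, which is one reason that spelling is common.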

How to use pointers

From: https://www.tutorialspoint.com/cprogramming/c_pointers

#include <stdio.h>

int main () {

   int  var = 20;   /* actual variable declaration */
   int  *ip;        /* pointer variable declaration */

   ip = &var;  /* store address of var in pointer variable*/

   /* %p expects a void *; printing a pointer with %x is undefined behavior */
   printf("Address of var variable: %p\n", (void *)&var);

   /* address stored in pointer variable */
   printf("Address stored in ip variable: %p\n", (void *)ip);

   /* access the value using the pointer */
   printf("Value of *ip variable: %d\n", *ip );

   return 0;
}

Pointer Operations

Source: https://www.programiz.com/c-programming/c-pointers

&: address-of operator (sometimes called the reference operator). &p is the address of variable p.
*: dereference (indirection) operator; the same symbol also appears in pointer declarations. *(&p) gets the value stored at address &p, i.e. p itself.
->: arrow operator; accesses a member of a struct through a pointer. foo->bar is the same as (*foo).bar: the member bar of the struct that foo points to.
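
A short sketch of the three operators in use. The function and struct names here are invented for illustration, they are not from the linked tutorial:

```c
/* Swap two ints through pointers to them. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;   /* * reads the value a points at */
    *a = *b;        /* * on the left side writes through the pointer */
    *b = tmp;
}

struct counter {
    int count;
};

/* -> reaches a member through a pointer; same as (*c).count += 1 */
void increment(struct counter *c)
{
    c->count += 1;
}
```

A caller would pass addresses with &, e.g. swap_ints(&x, &y) or increment(&my_counter).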

Pointer arrow operations

Arrow operations (https://stackoverflow.com/questions/2575048/arrow-operator-usage-in-c):

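#include <stdlib.h>   // malloc, free

struct foo
{
  int x;
  float y;
};

int main(void)
{
  struct foo var;
  // sizeof(*pvar) is the size of the struct; sizeof(pvar) is only the size of the pointer
  struct foo* pvar = malloc(sizeof(*pvar));
  if (pvar == NULL) return 1;

  var.x = 5;          // var.x is 5
  (&var)->y = 14.3f;  // same as var.y = 14.3f
  pvar->y = 22.4f;
  (*pvar).x = 6;      // (*pvar).x is 6, same as pvar->x

  free(pvar);
  return 0;
}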

A good explanation on why, by Lukasz Matysiak:

I'd just add to the answers the "why?".

. is standard member access operator that has a higher precedence than * pointer operator.

When you are trying to access a struct's internals and you wrote it as *foo.bar then the compiler would think to want a 'bar' element of 'foo' (which is an address in memory) and obviously that mere address does not have any members.

Thus you need to ask the compiler to first dereference with (*foo) and then access the member element: (*foo).bar, which is a bit clumsy to write so the good folks have come up with a shorthand version: foo->bar which is sort of member access by pointer operator.
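
The precedence point can be checked with a struct whose member is itself a pointer, which is the one case where *foo.bar does compile. A sketch with made-up names:

```c
struct holder {
    int *bar;   /* a member that is itself a pointer */
};

/* *h.bar parses as *(h.bar): select the member first, then dereference it. */
int read_through_member(struct holder h)
{
    return *h.bar;
}

/* With a pointer to the struct, the struct must be dereferenced first. */
int read_through_pointer(const struct holder *ph)
{
    return *(*ph).bar;   /* equivalent to *ph->bar */
}
```

Both functions return the int that bar points at; only the second one starts from a pointer to the struct, so only it needs (*ph) or ->.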

C pointer weirdness

An excellent explainer on the weirdness of C pointer declarations: http://c-faq.com/decl/cdecl1.html

Q: How do I construct declarations of complicated types such as ``array of N pointers to functions returning pointers to functions returning pointers to char'', or figure out what similarly complicated declarations mean?

A: The first part of this question can be answered in at least three ways:

1. char *(*(*a[N])())();

2. Build the declaration up incrementally, using typedefs:

  typedef char *pc;	/* pointer to char */
  typedef pc fpc();	/* function returning pointer to char */
  typedef fpc *pfpc;	/* pointer to above */
  typedef pfpc fpfpc();	/* function returning... */
  typedef fpfpc *pfpfpc;	/* pointer to... */
  pfpfpc a[N];		/* array of... */

3. Use the cdecl program, which turns English into C and vice versa. You provide a longhand description of the type you want, and cdecl responds with the equivalent C declaration:

cdecl> declare a as array of pointer to function returning
  	pointer to function returning pointer to char

  char *(*(*a[])())()

cdecl can also explain complicated declarations (you give it a complicated declaration and it responds with an English description), help with casts, and indicate which set of parentheses the parameters go in (for complicated function definitions, like the one above). See question 18.1.

C's declarations can be confusing because they come in two parts: a base type, and a declarator which contains the identifier or name being declared, perhaps along with *'s and []'s and ()'s saying whether the name is a pointer to, array of, or function returning the base type, or some combination. For example, in

  char *pc;

the base type is char, the identifier is pc, and the declarator is *pc; this tells us that *pc is a char (this is what ``declaration mimics use'' means). One way to make sense of complicated C declarations is by reading them ``inside out,'' remembering that [] and () bind more tightly than *. For example, given

  char *(*pfpc)();

we can see that pfpc is a pointer (the inner *) to a function (the ()) returning a pointer (the outer *) to char. When we later use pfpc, the expression *(*pfpc)() (the value pointed to by the return value of a function pointed to by pfpc) will be a char.

Another way of analyzing these declarations is to decompose the declarator while composing the description, maintaining the ``declaration mimics use'' relationship:

  *(*pfpc)()	is a	char
  (*pfpc)()	is a	pointer to char
  (*pfpc)	is a	function returning pointer to char
  pfpc	is a	pointer to function returning pointer to char

If you'd like to make things clearer when declaring complicated types like these, you can make the analysis explicit by using a chain of typedefs as in option 2 above.
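
To see ``declaration mimics use'' in running code, here is a small sketch that reuses the fpc and pfpc typedef names from the FAQ. The greeting buffer and function names are made up, and the typedefs use (void) prototypes instead of the FAQ's empty parentheses:

```c
static char greeting[] = "hello";

/* A function returning pointer to char. */
char *get_greeting(void)
{
    return greeting;
}

typedef char *fpc(void);   /* function returning pointer to char */
typedef fpc *pfpc;         /* pointer to such a function */

/* Declaration mimics use: *(*p)() is a char. */
char first_char(pfpc p)
{
    return *(*p)();
}
```

Calling first_char(get_greeting) follows the same steps as the decomposition above: (*p) is the function, (*p)() is the char * it returns, and the leading * reads the first char.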

The pointer-to-function declarations in the examples above have not included parameter type information. When the parameters have complicated types, declarations can really get messy. (Modern versions of cdecl can help here, too.)

Additional links:

A message of mine explaining the difference between array-of-pointer vs. pointer-to-array declarations

David Anderson's ``Clockwise/Spiral Rule''

References: K&R2 Sec. 5.12 p. 122 ISO Sec. 6.5ff (esp. Sec. 6.5.4) H&S Sec. 4.5 pp. 85-92, Sec. 5.10.1 pp. 149-50

Pointer casting

const u8int *sp = (const u8int *)src;
u8int *dp = (u8int *)dest;
for(; len != 0; len--) *dp++ = *sp++;

Sometimes you'll see an expression such as (const u8int *)src on the right-hand side of an assignment. This is a pointer cast: it converts src to the pointer type named in the parentheses.

It is usually used when the type of the source pointer differs from, or is not known to match, the type on the left side. (In C, conversions to and from void * happen implicitly and need no cast.)

Some devs will declare a pointer type on the left side and also cast to that pointer type on the right side, as double assurance that it's the correct pointer type.

When the types already match, this practice gives no additional benefit and can be considered to add unnecessary complexity to the code: https://stackoverflow.com/questions/14464136/why-explicity-cast-right-side-pointer-operand-if-type-matches-left-side-operand
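
A sketch of when a cast is and is not needed, built around the copy loop above. The u8int typedef is an assumption (the notes don't show how it is defined), and both function names are hypothetical:

```c
#include <stddef.h>

typedef unsigned char u8int;   /* assumed definition; not shown in the notes */

/* Byte-wise copy in the style of the loop above. */
void *copy_bytes(void *dest, const void *src, size_t len)
{
    const u8int *sp = src;   /* void * converts implicitly in C: no cast needed */
    u8int *dp = dest;
    for (; len != 0; len--)
        *dp++ = *sp++;
    return dest;
}

/* Between unrelated object pointer types, the cast is required. */
unsigned char first_byte_of(const int *value)
{
    return *(const unsigned char *)value;
}
```

In C++ the conversion from void * is not implicit, which is one reason casts like the ones in the original snippet show up in code meant to compile as both languages.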

Types

https://www.tutorialspoint.com/cprogramming/c_data_types.htm

Basic Types: Integer types or Floating Point types. Chars are integer types, i.e. a char holds the numeric code of a character: in ASCII, 'a' has the value 97 (0x61 in hex).
Void:
- Indicates no value is available.
- A function that returns void returns no value. E.g. void exit(int status);
- A function that accepts no parameters can accept void. E.g. int rand(void);
- Pointers to void (void *) hold the address of an object without specifying its type, allowing you to convert to a concrete pointer type later. E.g. void * malloc(size_t size);
Enumerated: enum types are used to define variables that can only be assigned one of a fixed set of named int values.
Derived Types: Include pointers, arrays, structs, unions, and function types. These types are built from other types (e.g. pointer to int, array of char).
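
A compact sketch touching each category above. All names are made up for illustration, and the char value assumes an ASCII-compatible character set:

```c
#include <stdlib.h>

enum color { RED, GREEN, BLUE };   /* RED == 0, GREEN == 1, BLUE == 2 */

/* A char is a small integer: this returns 97 for 'a' on ASCII systems. */
int char_code(char c)
{
    return (int)c;
}

/* malloc returns void *; it converts to int * without a cast in C. */
int *new_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;   /* caller must free() the result */
}
```

new_int returns a derived type (pointer to int) built on a basic type, and its return value starts life as the untyped void * that malloc hands back.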

Structs

Composite type that allows you to combine data items of different types.

Looks like this:

struct [structure tag] {
  member definition;
  member definition;
  ...
} [one or more structure variables];

[structure tag]: optional
member definition: normal variable definition e.g. int i or float f
[one or more structure variables]: optional. This defines one or more variables of that struct type. Roughly comparable to the following in Go:

thisStruct := struct {
  name string
  email string
} {"zed","[email protected]"}

To define a struct:

struct point {
  int x;
  int y;
};
struct point this_point = {3, 7};
struct point *p = &this_point; /* declare p as a pointer to struct point and initialize it with the address of this_point */
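
A sketch of how such a struct is typically passed around, by pointer when the function should modify it and by value when it should not. The function names are hypothetical:

```c
struct point {
    int x;
    int y;
};

/* Pass a pointer so the function modifies the caller's struct. */
void translate(struct point *p, int dx, int dy)
{
    p->x += dx;
    p->y += dy;
}

/* Passing by value copies the whole struct; the caller's copy is untouched. */
int sum_coords(struct point pt)
{
    return pt.x + pt.y;
}
```

With the struct defined above, translate(&this_point, 1, -2) would change this_point in place, while sum_coords(this_point) only reads a copy.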

Interesting things about a C struct:

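It's a physically grouped set of data, which means it occupies one contiguous block of memory, with members laid out in declaration order.
This also means that it can be addressed by a single pointer.
The compiler may insert padding between members and at the end of the block so that each member meets its alignment requirement (see word-length boundaries in the Glossary).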
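
Because the block is contiguous, any member can be reached from the struct's base address plus a byte offset. A sketch using the standard offsetof macro (the function name is made up; in normal code you would just write p->y):

```c
#include <stddef.h>   /* offsetof */

struct point {
    int x;
    int y;
};

/* Reach member y from the struct's base address plus y's byte offset. */
int read_y_via_offset(const struct point *p)
{
    const unsigned char *base = (const unsigned char *)p;
    return *(const int *)(base + offsetof(struct point, y));
}
```

The first member always sits at offset 0, so a pointer to the struct and a pointer to its first member hold the same address.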


Functions

A function definition has this form (K&R, 25):

return-type function-name(parameters)
{
  declarations
  statements
}
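
A concrete function that follows this form, with each part labeled. This is a small integer power function written for illustration, assuming n is not negative:

```c
/* return-type function-name(parameters) */
int power(int base, int n)
{
    /* declarations */
    int i;
    int result = 1;

    /* statements */
    for (i = 0; i < n; i++)
        result *= base;
    return result;
}
```

Calling power(2, 5) multiplies 1 by 2 five times and returns 32.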

Glossary

JNI: Java Native Interface. Foreign function interface that allows Java code running inside a JVM to call and be called by programs written in other languages running outside the JVM.
COM: Component Object Model. "COM is a platform-independent, distributed, object-oriented system for creating binary software components that can interact. COM is the foundation technology for Microsoft's OLE (compound documents) and ActiveX (Internet-enabled components) technologies."
word-length boundaries: a word is the natural unit of data that a processor handles. Different platforms and architectures have different word sizes: 16-bit architectures have 16-bit (2-byte) words, 32-bit architectures have 4-byte words. Word size used to be tied to hardware, but can also be defined by a platform or VM (e.g. JVM or EVM). It matters for composite data structures like structs because the compiler pads members so that each one starts at an address suitable for its type, i.e. struct { int i; char a; }; typically occupies 8 bytes on a 32-bit platform (4 for int i, 1 for char a, plus 3 bytes of padding), even though char a itself only takes up 1 byte.
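
The padding can be measured directly with sizeof. A sketch with made-up names; the exact numbers are implementation-defined, so the comment gives a typical value, not a guaranteed one:

```c
#include <stddef.h>

struct mixed {
    int  i;
    char a;
};

/* Bytes of padding the compiler added to struct mixed (often 3). */
size_t padding_in_mixed(void)
{
    return sizeof(struct mixed) - (sizeof(int) + sizeof(char));
}
```

Ordering members from largest to smallest is a common way to keep this padding low in structs with many fields.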

zeddee/c-notes-aug2019.md