Created
May 22, 2021 14:26
Hey everyone, today I'd like to go over the differences between classes and structs in C#.
We'll start with how their memory is allocated.
Whenever you see a struct variable, the memory for that struct is stored in the same location as the variable.
So if you have a struct variable in a function, the memory for that struct is stored within that function's stack frame.
If you have an array of structs, the structs' memory is stored within the array itself.
And if you have a struct variable within a class, the struct's memory is stored directly inside each object.
Structs and other types with this behavior, like ints, floats, and bools, are known as "value types".
In contrast, whenever you see a class variable, the variable is only a reference to an object, which is stored somewhere on the heap.
So if you have a class variable in a function, that function's stack frame only contains a reference to an object somewhere on the heap.
If you have an array of objects, the array is actually just an array of references to objects.
And if you have a class variable inside an object, again it's just a reference.
Classes and other types with this behavior, like arrays, are known as "reference types".
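Here's a minimal sketch of that split, assuming two made-up types, PointStruct and PointClass (they are not from the video's example code). It just asks the runtime which category each type falls into.

```csharp
using System;

// Hypothetical example types, used only for illustration.
struct PointStruct { public int X; public int Y; }
class PointClass { public int X; public int Y; }

static class Program
{
    static void Main()
    {
        Console.WriteLine(typeof(PointStruct).IsValueType); // True
        Console.WriteLine(typeof(PointClass).IsValueType);  // False
        Console.WriteLine(typeof(int).IsValueType);         // True
        Console.WriteLine(typeof(int[]).IsValueType);       // False: arrays are reference types
    }
}
```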
The relevance of this difference becomes apparent when we start assigning class and struct variables and when we pass them into functions.
Whenever a struct variable comes into scope, you can assume that the memory for that struct has already been allocated.
So if we declare two struct variables, even if we don't do anything with them, when the variables come into scope, the system will allocate memory for them.
The system won't let you read from these variables before they're assigned, but that's just C# protecting you from shooting yourself in the foot. If you assign all the member variables of a struct by hand, you can start using it without ever explicitly allocating memory.
In practice, initializing a struct like this is really tedious, so we can instead just use the "new" keyword.
The "new" keyword for structs is basically a formality for calling one of the struct's constructors. It doesn't allocate memory like the "new" keyword for classes. Using the word "new" in both these places is honestly really confusing and makes classes and structs seem more similar than they actually are.
In this example it simply calls the struct's default constructor, which sets all the struct's variables to their default values.
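As a rough sketch of both ways to initialize a struct, assuming a hypothetical Point struct with public fields: SumByHand fills in every field without "new", and SumWithNew lets "new" set the defaults.

```csharp
using System;

// Hypothetical struct for illustration.
struct Point { public int X; public int Y; }

static class Program
{
    public static int SumByHand()
    {
        Point p;   // the storage already exists; no "new" required
        p.X = 3;
        p.Y = 4;   // every field is assigned now, so the compiler lets us read p
        return p.X + p.Y;
    }

    public static int SumWithNew()
    {
        Point p = new Point(); // sets X and Y to their default value, 0
        return p.X + p.Y;
    }

    static void Main()
    {
        Console.WriteLine(SumByHand());  // 7
        Console.WriteLine(SumWithNew()); // 0
    }
}
```

If you comment out the assignment to p.Y in SumByHand, the compiler should reject the read of p because it is not fully assigned.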
Now that we've initialized the first struct, let's assign it to the second struct variable.
The memory for the second variable has already been allocated, so when we assign the first variable to it, it simply copies the data from the first struct into the second struct.
Now we have two completely separate structs.
Modifying struct1 won't affect struct2.
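A small sketch of that copy, again assuming a hypothetical Point struct; the variable names struct1 and struct2 follow the narration.

```csharp
using System;

// Hypothetical struct for illustration.
struct Point { public int X; }

static class Program
{
    // Returns (struct1.X, struct2.X) after struct1 is modified.
    public static (int First, int Second) CopyThenModify()
    {
        Point struct1 = new Point();
        Point struct2 = struct1; // copies the data, not a reference
        struct1.X = 10;          // only struct1 changes
        return (struct1.X, struct2.X);
    }

    static void Main()
    {
        var result = CopyThenModify();
        Console.WriteLine(result.First);  // 10
        Console.WriteLine(result.Second); // 0
    }
}
```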
Now let's see how this works with classes.
Here we declare two class variables.
When these variables come into scope, the system only allocates enough space to hold a memory reference.
By default these memory references are "null" references, meaning they don't point to anything.
Now let's assign the first variable.
We create a new object using the "new" keyword.
This allocates a piece of memory on the heap and gives us a reference to where that memory is located.
We now assign that reference to our class variable.
Now let's assign the first variable to the second variable.
This simply copies the object reference from the first variable to the second variable.
Now both variables refer to the same location in memory, so any change made through object1 is also visible through object2.
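The same steps with a class look roughly like this, assuming a hypothetical Point class; object1 and object2 follow the narration.

```csharp
using System;

// Hypothetical class for illustration.
class Point { public int X; }

static class Program
{
    // Returns object2.X after the object is modified through object1.
    public static int ModifyThroughFirst()
    {
        Point object1 = new Point(); // allocates one object on the heap
        Point object2 = object1;     // copies the reference, not the object
        object1.X = 10;
        return object2.X;
    }

    static void Main()
    {
        Console.WriteLine(ModifyThroughFirst()); // 10

        Point a = new Point();
        Point b = a;
        Console.WriteLine(ReferenceEquals(a, b)); // True: one object, two references
    }
}
```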
Something similar happens when we pass a class variable into a function.
The function parameter simply gets a copy of the reference held by the variable you pass into it, so both point at the same object.
So if the object gets modified within the function, the object outside the function also gets modified.
This is not the case for structs.
If we define a function with a struct parameter, when that function is called and the parameter comes into scope, the system will automatically allocate memory to hold that struct. It then takes the struct you passed into the function and copies its values into that new memory.
If the function modifies the struct parameter, those changes do not affect the struct outside the function, because the structs are using different pieces of memory.
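Here's a sketch of both cases side by side. PointStruct, PointClass, and the two Modify overloads are hypothetical names made up for this example.

```csharp
using System;

// Hypothetical types for illustration.
struct PointStruct { public int X; }
class PointClass { public int X; }

static class Program
{
    static void Modify(PointStruct p) { p.X = 99; } // p is a private copy of the caller's struct
    static void Modify(PointClass p) { p.X = 99; }  // p refers to the caller's object

    public static int StructAfterCall()
    {
        PointStruct s = new PointStruct();
        s.X = 1;
        Modify(s);
        return s.X; // unchanged
    }

    public static int ClassAfterCall()
    {
        PointClass c = new PointClass();
        c.X = 1;
        Modify(c);
        return c.X; // changed by the function
    }

    static void Main()
    {
        Console.WriteLine(StructAfterCall()); // 1
        Console.WriteLine(ClassAfterCall());  // 99
    }
}
```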
Because new struct variables always allocate new memory, you might think that structs are immutable, but this is not the case.
If you assign a value to a member variable of a struct, that variable is modified as you would expect.
The duplication only occurs when you assign the struct variable itself.
There are some performance implications of using classes vs structs, but it is definitely not the case that structs are always faster.
Structs make memory management simpler and this can sometimes improve performance.
If you have an array of structs, the array itself is an object on the heap. But the structs are completely contained within the array, so the garbage collector only has one object it has to consider. Once nothing references the array anymore, the garbage collector can reclaim all the structs inside it in one go.
If any other code retrieved a struct out of that array, it got a copy of that struct, not the original, so the original can safely be reclaimed along with the array.
If we have an array of objects, the array as well as each object in the array are separate objects on the heap.
Each object could be stored in a totally different part of the heap, and other variables and arrays could have references to those same objects.
So when our original array is no longer referenced, that tells us very little about whether we can deallocate its objects.
The garbage collector has to consider each object individually and make sure there are no remaining references to it before deallocating it.
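To make the two array layouts a bit more concrete, here's a sketch with hypothetical PointStruct and PointClass types. It shows that reading a struct out of an array gives you a copy, while reading an object out of an array gives you a second reference to the same object.

```csharp
using System;

// Hypothetical types for illustration.
struct PointStruct { public int X; }
class PointClass { public int X; }

static class Program
{
    public static int StructElementAfterCopyEdit()
    {
        PointStruct[] points = new PointStruct[2]; // both structs live inside the array
        points[0].X = 5;                // modifies the element in place
        PointStruct copy = points[0];   // reading an element copies it out
        copy.X = 42;                    // only the copy changes
        return points[0].X;
    }

    public static int ObjectElementAfterAliasEdit()
    {
        PointClass[] objects = new PointClass[2]; // two null references, no objects yet
        objects[0] = new PointClass { X = 5 };    // a separate object on the heap
        PointClass alias = objects[0];            // a second reference to that object
        alias.X = 42;
        return objects[0].X;
    }

    static void Main()
    {
        Console.WriteLine(StructElementAfterCopyEdit());  // 5
        Console.WriteLine(ObjectElementAfterAliasEdit()); // 42
    }
}
```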
For CPU intensive tasks, structs provide another advantage.
Because of the way modern CPU caches work, a CPU can operate much faster if all the pieces of memory it's using are close to each other.
With an array of structs, all the structs are located contiguously in memory, so the CPU can intelligently fetch large pieces of memory, pulling in multiple structs in a single request.
With an array of objects, each object could be in a different place in memory, so as your code loops over the array, the CPU may need to pull in memory from all over the place, requiring a separate request for each object.
Structs only have this advantage when they consist of value types like ints, bools, and other structs.
If a struct contains reference types like classes and arrays, the memory for those variables will again be stored elsewhere on the heap, so if your CPU intensive code reads from those variables, the CPU will not be able to use its cache as effectively.
So if you're running into issues with garbage collection taking too long or you have a very CPU intensive process looping over large arrays, using structs may improve performance, but otherwise the difference may be negligible.
In some cases structs may actually be slower because of all the copying involved.
If your struct has lots of variables that need to be copied, converting it to a class and just passing references around may be faster.
Running a test and getting good profiling data is your best bet for figuring out what's best for your use case.
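As a starting point for that kind of test, here's a very rough timing sketch using Stopwatch. The ParticleStruct and ParticleClass types and the element count are assumptions made up for this example, and a single Stopwatch run is a crude measurement, so treat the numbers as a hint rather than a verdict. Also note that objects allocated back to back like this often end up next to each other on the heap, which can hide the cache effect you'd see in a long-running program.

```csharp
using System;
using System.Diagnostics;

// Hypothetical types for illustration.
struct ParticleStruct { public float X; }
class ParticleClass { public float X; }

static class Program
{
    public static float SumStructs(ParticleStruct[] items)
    {
        float sum = 0;
        for (int i = 0; i < items.Length; i++) sum += items[i].X; // reads directly from the array's memory
        return sum;
    }

    public static float SumObjects(ParticleClass[] items)
    {
        float sum = 0;
        for (int i = 0; i < items.Length; i++) sum += items[i].X; // follows a reference per element
        return sum;
    }

    static void Main()
    {
        const int count = 1_000_000; // arbitrary size chosen for the sketch
        var structs = new ParticleStruct[count];
        var objects = new ParticleClass[count];
        for (int i = 0; i < count; i++)
        {
            structs[i].X = 1;
            objects[i] = new ParticleClass { X = 1 };
        }

        var sw = Stopwatch.StartNew();
        float a = SumStructs(structs);
        sw.Stop();
        Console.WriteLine($"structs: {sw.Elapsed.TotalMilliseconds} ms (sum {a})");

        sw.Restart();
        float b = SumObjects(objects);
        sw.Stop();
        Console.WriteLine($"objects: {sw.Elapsed.TotalMilliseconds} ms (sum {b})");
    }
}
```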
There are a few other practical differences between classes and structs.
Structs cannot be null. When you consider how the memory for structs is automatically allocated, this makes sense. When you see a struct variable declaration, its memory has already been allocated.
It's possible to make a nullable struct by adding a question mark to the end of the type. This basically adds a boolean value to your struct to keep track of whether it has a value.
It's a bit of a hack, but it can be useful when you're unable to return a valid struct.
The nullable version of your struct is still a value type, so it still has all the memory characteristics of a regular struct.
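Here's a sketch of that use case, assuming a hypothetical Point struct and a made-up FirstPositive helper that returns null when it has nothing valid to return.

```csharp
using System;

// Hypothetical struct for illustration.
struct Point { public int X; }

static class Program
{
    // Hypothetical helper: returns null when no element qualifies.
    public static Point? FirstPositive(Point[] points)
    {
        foreach (Point p in points)
        {
            if (p.X > 0) return p;
        }
        return null;
    }

    static void Main()
    {
        Point? found = FirstPositive(new[] { new Point { X = -1 }, new Point { X = 7 } });
        Point? missing = FirstPositive(new Point[0]);

        Console.WriteLine(found.HasValue);             // True
        Console.WriteLine(found.Value.X);              // 7
        Console.WriteLine(missing.HasValue);           // False
        Console.WriteLine(typeof(Point?).IsValueType); // True: still a value type
    }
}
```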
Before C# 10, you could not write your own parameterless constructor for a struct. This is because of how structs are allocated.
When you create an array of structs, the runtime zeroes the array's memory, which leaves every struct at its default values without running any constructor code.
If it had to run a custom parameterless constructor instead, your code would execute once per element whenever someone creates an array of your structs.
If that constructor does something expensive or has side effects where it behaves differently each time you call it, it could result in lots of unexpected behavior.
If programmers didn't do foolish things like that it would be fine, but the authors of C# originally decided it wasn't worth the risk. Since C# 10 you can declare a parameterless struct constructor, but new arrays and default values still skip it and give you zeroed structs.
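A quick sketch of those default values, assuming a hypothetical Settings struct. No constructor is written anywhere, yet every element of the new array is immediately usable.

```csharp
using System;

// Hypothetical struct for illustration.
struct Settings { public int Volume; public bool Muted; }

static class Program
{
    public static Settings LastOfNewArray()
    {
        Settings[] all = new Settings[3]; // three zeroed structs, ready to read
        return all[2];
    }

    static void Main()
    {
        Settings s = LastOfNewArray();
        Console.WriteLine(s.Volume);                 // 0
        Console.WriteLine(s.Muted);                  // False
        Console.WriteLine(default(Settings).Volume); // 0
    }
}
```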
Constructors for structs must assign all the variables in that struct, whereas constructors for classes do not. (Since C# 11 the compiler fills in default values for any you skip.)
I've heard this has something to do with how heap memory is guaranteed to be zeroed out whereas stack memory is not, but honestly I don't know for sure.
There is no concept of inheritance for structs, though structs can implement interfaces.
If you store a struct in an interface variable, it gets "boxed": the struct is copied into an object on the heap, and the interface variable holds a reference to that copy, so it behaves like a class.
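Here's a sketch of what that looks like, assuming a hypothetical ICounter interface and Counter struct. The original struct and the boxed copy drift apart, while two interface variables share the same boxed copy.

```csharp
using System;

// Hypothetical interface and struct for illustration.
interface ICounter
{
    int Value { get; }
    void Increment();
}

struct Counter : ICounter
{
    public int Value { get; private set; }
    public void Increment() { Value++; }
}

static class Program
{
    // Returns (original.Value, boxed.Value) after incrementing through the interface.
    public static (int Original, int Boxed) BoxAndIncrement()
    {
        Counter original = new Counter();
        ICounter boxed = original; // copies the struct into a new heap object
        ICounter alias = boxed;    // copies the reference; same heap object
        boxed.Increment();
        alias.Increment();
        return (original.Value, boxed.Value);
    }

    static void Main()
    {
        var result = BoxAndIncrement();
        Console.WriteLine(result.Original); // 0: the original struct never changed
        Console.WriteLine(result.Boxed);    // 2: both increments hit the same boxed copy
    }
}
```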
There is one final difference that is a revealing consequence of how structs are stored.
If a struct definition has another struct variable inside of it, the memory for the inner struct will be contained inside the main struct.
For this reason a struct definition cannot contain a struct variable of the same type. This would basically create an infinite recursive definition.
Classes do not have this problem because a class variable is just a reference to another piece of memory somewhere on the heap.
A class variable could reference another object, reference nothing, or even reference the very object it lives in.
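A minimal sketch of that last point, assuming a hypothetical Node class. The struct version is left commented out because the compiler rejects it (error CS0523, a cycle in the struct layout).

```csharp
using System;

// Hypothetical class for illustration: a field of its own type is fine,
// because the field is only a reference.
class Node
{
    public int Value;
    public Node Next;
}

// The struct equivalent does not compile, since BadNode would have to contain itself:
// struct BadNode { public int Value; public BadNode Next; }

static class Program
{
    public static bool PointsToItself()
    {
        Node node = new Node(); // Next starts out as null (references nothing)
        node.Next = node;       // now the object references itself
        return ReferenceEquals(node.Next.Next, node);
    }

    static void Main()
    {
        Console.WriteLine(new Node().Next == null); // True
        Console.WriteLine(PointsToItself());        // True
    }
}
```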
I hope this video has clarified the most relevant differences between classes and structs.
Check the description for a link to the example code.
If you found this video informative, consider giving it a like. Thanks for watching!