1NF is the most basic of normal forms - each cell in a table must contain only one piece of information, and there can be no duplicate rows.
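As a minimal sketch of what 1NF asks for, the snippet below takes rows where one cell packs several semesters together and produces one row per value, dropping exact duplicates along the way. The function name `to_first_normal_form` and the packed input rows are hypothetical, made up purely for illustration.

```python
def to_first_normal_form(rows):
    """Split multi-valued semester cells into one row per value and drop duplicate rows."""
    seen = set()
    result = []
    for course_id, semesters in rows:
        for semester in semesters.split(","):
            row = (course_id, semester.strip())
            if row not in seen:  # 1NF also forbids duplicate rows
                seen.add(row)
                result.append(row)
    return result


# Hypothetical input: the second column packs several semesters into one cell.
packed_rows = [
    ("IT101", "2009-1, 2009-2"),
    ("IT102", "2009-1"),
    ("IT102", "2009-1"),  # exact duplicate of the row above
]

for row in to_first_normal_form(packed_rows):
    print(row)
```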
2NF and 3NF are all about being dependent on the primary key. Recall that a primary key can be made up of multiple columns. As Chris said in his response:
The data depends on the key [1NF], the whole key [2NF] and nothing but the key [3NF] (so help me Codd).
Say you have a table containing courses that are taken in a certain semester, and you have the following data:
|---Primary Key---|                uh oh    |
                                     V
CourseID | Semester | #Places | Course Name |
--------------------------------------------|
IT101    | 2009-1   | 100     | Programming |
IT101    | 2009-2   | 100     | Programming |
IT102    | 2009-1   | 200     | Databases   |
IT102    | 2010-1   | 150     | Databases   |
IT103    | 2009-2   | 120     | Web Design  |
This is not in 2NF, because the fourth column does not depend on the entire key, only on part of it. The course name depends on the CourseID, but has nothing to do with which semester the course is taken in. Thus, as you can see, we have duplicate information: several rows telling us that IT101 is Programming and IT102 is Databases. We fix that by moving the course name into another table, where CourseID is the ENTIRE key.
Primary Key |
CourseID    | Course Name |
--------------------------|
IT101       | Programming |
IT102       | Databases   |
IT103       | Web Design  |
No redundancy!
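The split described above can be sketched with Python's built-in `sqlite3` module. The table names `course` and `course_semester` and the lower-case column names are assumptions chosen for this example; the data is taken from the tables in this answer. A join gives back the original wide rows, so nothing is lost by storing each course name only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (
    course_id   TEXT PRIMARY KEY,          -- the ENTIRE key for course_name
    course_name TEXT NOT NULL
);
CREATE TABLE course_semester (
    course_id TEXT NOT NULL REFERENCES course(course_id),
    semester  TEXT NOT NULL,
    places    INTEGER NOT NULL,
    PRIMARY KEY (course_id, semester)      -- composite key
);
""")

conn.executemany("INSERT INTO course VALUES (?, ?)", [
    ("IT101", "Programming"),
    ("IT102", "Databases"),
    ("IT103", "Web Design"),
])
conn.executemany("INSERT INTO course_semester VALUES (?, ?, ?)", [
    ("IT101", "2009-1", 100),
    ("IT101", "2009-2", 100),
    ("IT102", "2009-1", 200),
    ("IT102", "2010-1", 150),
    ("IT103", "2009-2", 120),
])

# Joining the two tables reconstructs the original four-column rows.
joined_rows = conn.execute("""
    SELECT cs.course_id, cs.semester, cs.places, c.course_name
    FROM course_semester AS cs
    JOIN course AS c ON c.course_id = cs.course_id
    ORDER BY cs.course_id, cs.semester
""").fetchall()

for row in joined_rows:
    print(row)
```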
Okay, so let's say we also add the teacher of the course, and some details about them, to the table:
|--Primary Key--|                            uh oh    |
                                               V
Course | Semester | #Places | TeacherID | TeacherName |
------------------------------------------------------|
IT101  | 2009-1   | 100     | 332       | Mr Jones    |
IT101  | 2009-2   | 100     | 332       | Mr Jones    |
IT102  | 2009-1   | 200     | 495       | Mr Bentley  |
IT102  | 2010-1   | 150     | 332       | Mr Jones    |
IT103  | 2009-2   | 120     | 242       | Mrs Smith   |
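Now it should be obvious that TeacherName depends on TeacherID, which is not part of the primary key. TeacherName therefore depends on the key only indirectly, through TeacherID, so this table is not in 3NF. To fix this, we do much the same as we did for 2NF: take TeacherName out of this table and put it in its own table, which has TeacherID as the key.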
Primary Key |
TeacherID | TeacherName |
---------------------------|
332 | Mr Jones |
495 | Mr Bentley |
242 | Mrs Smith |
No redundancy!!
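Here is a small sketch of the same fix in Python's `sqlite3`, again with assumed table and column names (`teacher`, `course_semester`). It also shows a practical payoff: correcting a teacher's name touches exactly one row, and every course row picks up the change through the join. The new name "Mr A. Jones" is a made-up value for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE teacher (
    teacher_id   INTEGER PRIMARY KEY,
    teacher_name TEXT NOT NULL
);
CREATE TABLE course_semester (
    course_id  TEXT NOT NULL,
    semester   TEXT NOT NULL,
    places     INTEGER NOT NULL,
    teacher_id INTEGER NOT NULL REFERENCES teacher(teacher_id),
    PRIMARY KEY (course_id, semester)
);
""")

conn.executemany("INSERT INTO teacher VALUES (?, ?)", [
    (332, "Mr Jones"),
    (495, "Mr Bentley"),
    (242, "Mrs Smith"),
])
conn.executemany("INSERT INTO course_semester VALUES (?, ?, ?, ?)", [
    ("IT101", "2009-1", 100, 332),
    ("IT101", "2009-2", 100, 332),
    ("IT102", "2009-1", 200, 495),
    ("IT102", "2010-1", 150, 332),
    ("IT103", "2009-2", 120, 242),
])

# Hypothetical correction: the name is stored once, so one row changes.
cursor = conn.execute(
    "UPDATE teacher SET teacher_name = ? WHERE teacher_id = ?",
    ("Mr A. Jones", 332),
)
rows_changed = cursor.rowcount

# Every course taught by teacher 332 now shows the corrected name.
jones_rows = conn.execute("""
    SELECT cs.course_id, cs.semester, t.teacher_name
    FROM course_semester AS cs
    JOIN teacher AS t ON t.teacher_id = cs.teacher_id
    WHERE cs.teacher_id = 332
    ORDER BY cs.course_id, cs.semester
""").fetchall()

print(rows_changed)
print(jones_rows)
```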
One important thing to remember is that if something is not in 1NF, it is not in 2NF or 3NF either. So each additional Normal Form requires everything that the lower ones had, plus some extra conditions, which must all be fulfilled.
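The dependencies that break 2NF and 3NF in the examples above can also be spotted mechanically. The helper below, `determines`, is a hypothetical name for a rough check that one set of columns fixes the value of another set within some sample rows. Keep in mind that sample data can only disprove a dependency; whether it truly holds is a fact about the business rules, not about five rows.

```python
def determines(rows, lhs, rhs):
    """Return True if equal values in the lhs columns always come with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True


# Sample rows from the first table in this answer.
course_rows = [
    {"CourseID": "IT101", "Semester": "2009-1", "Places": 100, "CourseName": "Programming"},
    {"CourseID": "IT101", "Semester": "2009-2", "Places": 100, "CourseName": "Programming"},
    {"CourseID": "IT102", "Semester": "2009-1", "Places": 200, "CourseName": "Databases"},
    {"CourseID": "IT102", "Semester": "2010-1", "Places": 150, "CourseName": "Databases"},
    {"CourseID": "IT103", "Semester": "2009-2", "Places": 120, "CourseName": "Web Design"},
]

# CourseName is fixed by CourseID alone, i.e. by only part of the key: a 2NF violation.
name_depends_on_part_of_key = determines(course_rows, ["CourseID"], ["CourseName"])

# Places is not fixed by CourseID alone (IT102 has 200 and 150), so it needs the whole key.
places_depends_on_part_of_key = determines(course_rows, ["CourseID"], ["Places"])

print(name_depends_on_part_of_key, places_depends_on_part_of_key)
```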
To address your concern, @yuvarja402:
A table that is not yet in 3NF is one where a string column (TeacherName) has not been separated out behind its numeric key (TeacherID).
The reason 1NF, 2NF and 3NF matter is the processing speed of queries once your database starts to grow large.
So let's look at the example for 2NF. Notice that the primary key there, CourseID, stands in for a string, Course Name. This means that a query against the 2NF table only has to look for the CourseID and can then replace it with the course name. Instead of having to process "Programming" X times, only "IT101" is processed, saving a few characters each time. That may not seem like much, but if you consider 30k students on one campus, plus 15k students online, plus possible offshore campuses, those few characters add up to a considerable saving in the query.
So now let's look at the 3NF example, where teachers were added. It's clear Mr Jones is repeated multiple times. So instead of having to query "Mr Jones" 65K times, we look for 332 and then replace it with the string.
So the idea behind this kind of normalization is to replace these larger strings with a much simpler "primary key" in order to reduce overall query time. So now, when you have a table like this:
CourseID | Semester | #Places | Course Name |
--------------------------------------------|
IT101    | 2009-1   | 100     | Programming |
IT101    | 2009-2   | 100     | Programming |
IT102    | 2009-1   | 200     | Databases   |
IT102    | 2010-1   | 150     | Databases   |
IT103    | 2009-2   | 120     | Web Design  |
You are asking the computer to search through 25 characters per line, compared to 7 characters per line here:
CourseID | Semester | #Places |
------------------------------|
IT101    | 1        | 1       |
IT101    | 2        | 1       |
IT102    | 3        | 2       |
IT103    | 2        | 3       |
This equates to a difference of 18 bytes per line. Multiplied by the assumed 30k students, that is 540,000 bytes, roughly half a megabyte, all for just one class during one semester.
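The arithmetic behind those figures can be reproduced in a few lines. This sketch assumes one byte per character and counts only the cell contents of the first row of each table, ignoring separators and any storage overhead a real database would add.

```python
# First row of the wide table and of the compact table, as cell strings.
wide_row = ("IT101", "2009-1", "100", "Programming")
compact_row = ("IT101", "1", "1")

wide_chars = sum(len(cell) for cell in wide_row)        # 5 + 6 + 3 + 11 = 25
compact_chars = sum(len(cell) for cell in compact_row)  # 5 + 1 + 1 = 7
saved_per_row = wide_chars - compact_chars              # 18

students = 30_000  # the assumed campus size from the text
total_saved_bytes = saved_per_row * students            # 540000

print(wide_chars, compact_chars, saved_per_row, total_saved_bytes)
```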
Now imagine multiplying that number by the number of semesters, and then adding alumni requesting transcripts, current students checking requirements, and queries to show available classes. All of a sudden that impressive 16 GB of RAM starts to slow down.
This is a bit of an exaggeration, but I hope the point and the importance of normalization come across.