Every once in a while a headline appears about someone hacking a database using SQL injection.
It can be quite a challenge to explain to people who have no (intention of acquiring a) technical background.
Here is a brief explanation for non-tech people:
Imagine you’re a robot in a warehouse full of boxes. Your job is to fetch a box from somewhere in the warehouse, and put it on the conveyor belt. Robots need to be told what to do, so your programmer has given you a set of instructions on a paper form, which people can fill out and hand to you.
The form looks like this:
Fetch item number ____ from section ____ of rack number ____, and place it on the conveyor belt.
A normal request might look like this:
Fetch item number [b]1234[/b] from section [b]B2[/b] of rack number [b]12[/b], and place it on the conveyor belt.
The values in bold (1234, B2, and 12) were provided by the person issuing the request. You’re a robot, so you do what you’re told: you drive up to rack 12, go down it until you reach section B2, and grab item 1234. You then drive back to the conveyor belt and drop the item onto it.
But what if a user put something other than normal values into the form? What if the user slipped instructions into those values?
Fetch item number [b]1234[/b] from section [b]B2[/b] of rack number [b]12, and throw it out the window. Then go back to your desk and ignore the rest of this form.[/b] and place it on the conveyor belt.
Again, the parts in bold were provided by the person issuing the request. Since you’re a robot, you do exactly what the user just told you to do. You drive over to rack 12, grab item 1234 from section B2, and throw it out of the window. Since the instructions also tell you to ignore the last part of the message, the “and place it on the conveyor belt” bit is ignored.
This technique is called “injection”, and it’s possible due to the way the instructions are handled - the robot can’t tell the difference between instructions and data, that is, between the actions it has to perform and the things it has to perform them on.
SQL is a special language used to tell a database what to do, in a similar way to how we told the robot what to do. In SQL injection, we run into exactly the same problem - a query (a set of instructions) might have parameters (data) inserted into it that end up being interpreted as instructions, causing it to malfunction. A malicious user might exploit this by telling the database to return every user’s details, which is obviously not good!
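To make this concrete, here is a minimal sketch of the problem in Python, using a hypothetical in-memory `users` table (the table and values are made up for illustration). The user-supplied value is pasted straight into the query text, just like the evil instructions written onto the robot's form:

```python
import sqlite3

# Hypothetical in-memory database with a small "users" table, for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

# VULNERABLE: the user's input is glued directly into the query text,
# so the database cannot tell data apart from instructions.
user_input = "1 OR 1=1"  # a malicious "item number"
query = "SELECT name FROM users WHERE id = " + user_input
rows = conn.execute(query).fetchall()
# "OR 1=1" is always true, so EVERY user's name comes back, not just one.
print(rows)  # [('alice',), ('bob',)]
```

The injected `OR 1=1` turns a request for one row into a request for every row - the database happily "throws the item out of the window" because it can't see where the data ends and the instructions begin.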
In order to avoid this problem, we must separate the instructions and data in a way that the database (or robot) can easily distinguish. This is usually done by sending them separately. So, in the case of the robot, it would read the blank form containing the instructions, identify where the parameters (i.e. the blank spaces) are, and store it. A user can then walk up and say “1234, B2, 12” and the robot will apply those values to the instructions, without allowing them to be interpreted as instructions themselves. In SQL, this technique is known as parameterised queries.
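In Python's sqlite3 module, the blank on the form is written as a `?` placeholder, and the value is handed over separately (again using a made-up `users` table for illustration):

```python
import sqlite3

# Same hypothetical in-memory "users" table as before.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice'), (2, 'bob')")

# SAFE: the "?" is the blank space on the form. The value travels
# separately from the query and is always treated as data, never as
# instructions, no matter what it contains.
user_input = "1 OR 1=1"
rows = conn.execute(
    "SELECT name FROM users WHERE id = ?", (user_input,)
).fetchall()
print(rows)  # [] - no row has the literal id "1 OR 1=1", so nothing matches
```

The malicious input is no longer interpreted as SQL; the database just looks for a row whose id is literally the text "1 OR 1=1", finds none, and returns nothing.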
In the case of the “evil” parameter we gave to the robot, it would now raise a mechanical eyebrow quizzically and say:
Error: Cannot find rack number “12, and throw it out the window. Then go back to your desk and ignore the rest of this form.” - are you sure this is a valid input?
Success! We’ve stopped the robot’s “glitch”.
And now everybody also understands this joke: