What is a SQL injection (SQLi) and how does it work?
Web applications store the information in databases. You can imagine a database as a table with named columns where data is stored inside. When users navigate within the web application, they may want to search for specific information, such as “T-shirts” on a shopping webpage. To do this, the web application interacts with the database using SQL (Structured Query Language) queries.
Imagine this table named clothes_table:
id | Type | Description | Size | Price |
---|---|---|---|---|
1 | T-shirt | A normal t-shirt | M | 12.99€ |
2 | T-shirt | A cooler t-shirt | S | 15.99€ |
3 | Pants | 10% discount applied! | S | 22.99€ |
If the user only wants to see the T-shirts, the webpage will query the database something like:
SELECT * FROM clothes_table WHERE Type="T-shirt"
Then this information will be the only one returned and it can be displayed in the webpage:
1 | T-shirt | A normal t-shirt | M | 12.99€ |
2 | T-shirt | A cooler t-shirt | S | 15.99€ |
A SQL injection is a common web security vulnerability that allows an attacker to interfere with the SQL query that the web application makes to its database.
This may imply that the attacker can retrieve confidential data from that database and modify or delete it. It can even allow attackers to execute remote code and gain access to the victim’s server.
We will first use a PortSwigger lab that will help us understand how SQL works and how it can be exploited. This web represents a shop and we can filter by categories:
If we filter by “Gifts” we will only see some items. If we check the URL, we can see that a parameter named “category” is being used:
Let’s try to imagine how this website could be working in the backend. It may have a database with all the products, and they are returned using a SQL query that uses “category” as a parameter.
It could be that the webpage only shows “released” products, but maybe some other products can be found in the database and not in the webpage because they haven’t been released yet.
The query that the webpage uses may look like this (where released = 1 implies that the product is released)
You could translate this query as: “Select everything from the products table where it’s category is GIFTS and Relased field equals 1”
We can change the ‘GIFTS’ value by writing whatever we want in the URL, but we can’t directly interact with the RELEASED = ‘1’ part of the query.
Everything that we submit after the ‘category=’ will be included between two single quotes, making it a string. However, if we can insert the single quote character ‘ in the input, we will be able to close the string wherever we want and add some code that will be executed.
If instead of GIFTS we send GIFTS’ OR 1=1 —, the resulting query would look like this:
The single quote will close the string and allow the OR 1=1 — to be executed as code. The ‘ that was left and was supposed to close our input, is now behind the —, that comments everything behind it, so it is like it doesn’t exist. Since 1=1 is always true, if we use the OR boolean operator the whole query will be true and will return everything. Since the RELEASED = ‘1’ restriction is in the comment side (after the –), it won’t be effective and we will be able to see everything:
We are now able to see some items that were not visible before, even when using the ‘All’ filter:
We can also exploit SQL injections for bypassing authentification pages.
Let’s do a hands-on example using Burp Suite and the vulnerable OWASP Juice Shop web. We will first use this login webpage and capture the request using Burp Suite:
When I put my credentials, the request sent to the webpage is this one:
Let’s assume that the backend application validates my login credentials using this SQL query:
Since I have an account with that email and that password, the output of this query will return some data and I will be able to log in. However, if I try an incorrect password the output won’t return anything.
Now, imagine that we want to test this log in page and check if it is vulnerable to SQLi. If we try to login with incorrect credentials we get this error message:
But instead of trying incorrect credentials, let’s use the ‘ character as an email.
This error is suspicious. This implies that this login page could be vulnerable to SQLi. But, what is happening here? Let’s imagine again the SQL query. Note that when our email or password is inserted, it is located within two ‘, implying that the value is a string.
This is the normal query:
This is the query using ‘ as email:
As you can see, the strings change, and the part that was part of the query (AND PWD =) is now treated as a string. This query ends with an unclosed string (’;) and this results in an error.
This implies that we can inject code within the query, so we could inject specific things to get some specific outputs.
If I want to log in to my account without using the password, I could inject some “-” to comment everything that is behind, so when the query gets executed, the password validation is not checked
To do this I need to use this email: test@gmail.com’ — The ‘ will close the string and then everything that I write behind will be executed, so I write the double “-” and this will imply that everything behind it won’t be executed. Since the password check is behind the username check, the password won’t be used.
Let’s try this:
It worked!
If we have been able to log in int our account with only knowing our username, this implies that we can log in to any account without even knowing their passwords.
Using the UNION keyword
There are webpages that directly display the output of the query in the webpage. If you are able to do a SQLi, you may be able to drop all the contents of the table in the webpage and steal the data. But, going back to the first example, in this query we are only interacting with the “products” table, as stated after the FROM keyword.
What if we want to retrieve information from other tables instead of “products”?
Well, SQL language offers the UNION keyword that allows to retrieve information from several tables. Behind a UNION keyword you can write another SELECT query.
This example query will return all product_names, but you can see that they are retrieved from 2 different tables, spain_products and usa_products. As a result, we will obtain a single list with all the products that exist in both databases.
If we know that may ebe a table that stores users data named “users”, if we can modify the SQL query using SQLi, we could inject something such as:
1
‘ UNION SELECT passwords FROM users --
If we are able to inject this into a query, we may be able to retrieve the “passwords” column from the “users” table (if it exists). But wait… It’s not that easy.
For a UNION query to work, it must:
- Return the same number of rows
- The data types in each column must be compatible with the individual queries.
This implies that if we try to inject that into this query (using the GIFTS variable):
The result would be:
1
SELECT * FROM products WHERE CATEGORY = '' UNION SELECT passwords FROM users --' AND RELEASED = '1
Assuming that the products table has more than one column, this won’t work because in the first SELECT we are returning all columns (*) and in the second one just one (passwords).
Digging into UNION attacks
As we previously mentioned, in order to do a correct UNION attack you must return the same number of columns and they must have the same data type. Here we will discuss how we can obtain this information.
Obtaining the number of columns with ORDER BY
There are two ways of obtaining this information. The first one is using the ORDER BY keyword.
The ORDER BY keyword is used in SQL to order the results. Usually, the ORDER BY keyword will be followed by the column name that should be considered for ordering, but it also accepts the column index
So, if we have already discovered that is SQLi vulnerable, before starting to use the UNION, we can inject the ORDER BY in the normal query, several times until we get an error (or a different response).
If we inject:
1
' ORDER BY 1 --
The resulting query could look like this:
1
SELECT name,price FROM products WHERE CATEGORY = ''ORDER BY 1 --AND RELEASED = '1
This will return the same results but ordered by name. Since we didn’t get any error, we will keep trying, now with:
1
' ORDER BY 2 --
Again, the resulting query would be:
SELECT name,price FROM products WHERE CATEGORY = ‘‘ORDER BY 1 –AND RELEASED = ‘1
1
SELECT name,price FROM products WHERE CATEGORY = ''ORDER BY 2 --AND RELEASED = '1
We won’t get an error yet, but now the results should be ordered by price. Let’s keep trying:
1
' ORDER BY 3 --
And:
1
SELECT name,price FROM products WHERE CATEGORY = ''ORDER BY 3 --AND RELEASED = '1
Now we will get an error since the query is only returning 2 columns (name and price) and we are asking to order by the third column (that doesn’t exist). So now we know that when doing the union attack, we will need to return 2 values in the injected query if we want it to succeed.
Obtaining the number of columns with NULL
The other option to know the number of columns is using the NULL value within the UNION SELECT query. It is like we are asking the query to return nothing, but it allows us to learn the number of columns.
The NULL value can be used as any data type, so it allows us to use it even though we don’t know the data types from the original query.
This time, instead of searching for the error as we did when using ORDER BY, we will try to find the number of NULLs that doen’t generate an error. So we could start injecting:
1
2
3
' UNION SELECT NULL --
' UNION SELECT NULL,NULL --
' UNION SELECT NULL,NULL,NULL --
Following the last query example, the injection code that won’t return an error is the second one, that resolves to this query:
1
SELECT name,price FROM products WHERE CATEGORY = ''UNION SELECT NULL, NULL --AND RELEASED = '1
Searching for a “string” data type column
When we are trying to retrieve data, the interesting data is usually text (string). Hence, the column that we have to use to retrieve this data has to be one where in the first query a string is returned too. In other words, we need to find what columns in the first query are strings so we can use them.
The method that we will use also implies the use of the NULL keyword. Now that we know the number of columns, we can change one of the columns by a string, for example, ‘test’.
When using null, we specify that we want the query to return nothing for that column, however, when using a specific string, like ‘test’, we are making the query return that string for that column no matter what.
Since we are making the injected query return a string in a specific column, if that column is not of a string type in the first query, we will get an error/no response. We can keep trying until we get a different output.
Assuming that the query returns three columns, we can try:
1
2
3
' UNION SELECT 'test',NULL,NULL --
' UNION SELECT NULL,'test',NULL --
' UNION SELECT NULL,NULL,'test'--
Once we know the number of columns and the ones that return a string, we are ready to try our UNION attack. Assuming that the original query returns three parameters of which the last two return string, and assuming that there is a table named users that has a password and username columns, we could inject this:
1
GIFTS' UNION SELECT NULL,username,password FROM users --
The resulting query will look like this:
1
SELECT price,name,shop FROM products WHERE CATEGORY = 'GIFTS'UNION SELECT NULL,username,password --AND RELEASED = '1
And assuming that it displays the contents on the webpage, the results could look like this:
Price | Name | Shop |
---|---|---|
1 | Candy | Barcelona |
5 | Flowers | Barcelona |
50 | Jeweler’s | Paris |
admin | 12$”Adas8! | |
user1 | 122134da!32 |
What if we only have one column that we can use to display text, but we want to display text from two different columns (like username and password)? We can use a single returning column to display results from two columns by concatenating them. We can use || to concatenate text:
1 ... UNION SELECT NULL, username || ‘-’ || password FROM usersThis is going to return columns, the NULL and another one with username and password in the same column, but divided by a -.
Obtaining information about the database
Until now we have been making several assumptions, like the type of database, that there are tables named users with username and password column, etc.
We can’t assume all of this, but there are ways to know this information. We can use UNION SELECTS to ask the database about itself. The databases have well-known predefined tables that we can use to retrieve information.
For example, if we are against a PostgreSQL database, we can ask for the name of the tables using the information_schema.tables :
1
... UNION SELECT NULL,table_name FROM information_schema.tables --
If we are against another type of database, the way to obtain this information will be different. We can also obtain the version of the database, the column names for a specific table, etc.
I highly recommend this cheat sheet.
Blind SQL injections
Nowadays is difficult to find SQLi where the contents are directly displayed in the HTTP response in a way that you are able to just read the contents of the database. A Blind SQLi occurs when the application is vulnerable to the injection but we are not able to see the contents, hence, doing UNION attacks is useless.
But there are other clever ways to exploit this type of SQLi.
Exploiting Blind SQLi by analyzing the responses.
Imagine a webpage that uses cookies to maintain the session. The backend of the webpage confirms the cookie by doing an SQL query in the database.
1
SELECT session_id FOM Users WHERE session_cookie='sdaYSD92sdWcaAEL21'
Cookies are used by web browsers and applications for different purposes, one of them is to maintain a session within the application. Cookies are sent automatically in each request without the intervention of the user. To inject SQL in this case, we will need to intercept the request and modify the session_cookie variable using a web proxy.
If the query is successful, we may get a personalized webpage (for example when your username is displayed on the right top of the webpage). If the query is not successful, we may get an error or the webpage may look like when you are not logged in. The takeaway of this is that depending on the result of the query (true or false), the contents of the webpage change.
Even though we won’t be able to see the results of the query, we can inject some conditional code and see if the result is true or false. A basic example:
1
2
sdaYSD92sdWcaAEL21' OR 1=1 -- This is going to be true always
sdaYSD92sdWcaAEL21' AND 1=2 -- This is going to be false
Before continuing with the explanation, we need to understand the SUBSTRING function. This function is used to extract some characters from a string of text. It requires three parameters:
- The string
- The starting position
- Number of characters to extract (length)
1
*SUBSTRING(string, start, length)*
These are some output examples:
1
2
3
SUBSTRING("Hello world",1,5) = "Hello"
SUBSTRING("Hello world",1,4) = "Hell"
SUBSTRING("Hello world",7,5) = "world"
In the “string” input, we can insert another SQL query, so instead of directly providing the string, we can extract it from the database. Example:
1
2
3
SUBSTRING((SELECT password FROM users WHERE username='admin'),1,10) = "ab12345678"
SUBSTRING((SELECT password FROM users WHERE username='admin'),1,5) = "ab123"
SUBSTRING((SELECT password FROM users WHERE username='admin'),1,1) = "a"
How can we use this? Our situation is the following:
- We know that the cookie value is injectable
- We can’t get the results of the query
- If the query is true, we get a specific response, if it is false, we get another one
Now assume that we intercept the request and modify the cookie value with this payload:
1
session_cookie="sdaYSD92sdWcaAEL21 ' AND SUBSTRING ((SELECT password FROM users WHERE username 'admin'),1,1) = 'b"
(Assume that the admin’s password is “ab12345678”)
The resulting query will be:
1
SELECT session_id FOM Users WHERE session_cookie='sdaYSD92sdWcaAEL21' AND SUBSTRING ((SELECT password FROM users WHERE username 'admin'),1,1) = 'b'
Since SUBSTRING ((SELECT password FROM users WHERE username ‘admin’),1,1) will return ‘a’, this part: AND SUBSTRING ((SELECT password FROM users WHERE username ‘admin’),1,1) = ‘b’ will traduce to: AND ‘a’ = ‘b’. Since this is false, we will know it because all the query will be false (we are using AND instead of OR), and the webpage content will be the one displayed when the query is false.
However, if instead of SUBSTRING ((SELECT password FROM users WHERE username ‘admin’),1,1) = ‘b’ we use SUBSTRING ((SELECT password FROM users WHERE username ‘admin’),1,1) = ‘a’, this will result in AND ‘a’=’a’ and it will be true, then the webpage returned will be the other one and this implies that the first letter of the admin’s password is an ‘a’
Now we can start to guess the second letter, using the same query but changing the start parameter of the substring:
SUBSTRING ((SELECT password FROM users WHERE username ‘admin’),2,1)
With some time and patience, we could retrieve all the passwords by checking all possible values and analyzing when the webpage returned differs from the others.
Triggering conditional errors
Sometimes is not possible to see any differences in the response that we get, hence, the last method explained won’t be useful.
Even though we may not get any different response when the query is false or true, we may get a different response if we generate an unhandled error.
Note the difference between a query that results in ‘false’ and a query that triggers an error: 1=2 results in false, but the query gets executed successfully. 1/0 will result in an error because is impossible to divide by 0, so the query won’t get executed and will trigger an error.
If errors are treated correctly, we may be able to see a difference in the response. We can try to test this using the CASE keyword.
The CASE keyword will analyze a condition, if it is true, it will execute what’s behind the THEN keyword, if it is false, it will execute what’s behind the ELSE keyword.
Following the last example, we could inject this payload:
1
session_cookie="sdaYSD92sdWcaAEL21 ' AND (SELECT CASE WHEN (username = 'admin' AND SUBSTRING (password,1,1) = 'b') THEN 1/0 ELSE 'a' END FROM users)='a"
If we analyze this “SELECT CASE WHEN (username = ‘admin’ AND SUBSTRING (password,1,1) = ‘a’) THEN 1/0 ELSE ‘a’ END FROM users” if the first letter of the admin’s password is a ‘b’, the CASE will be true so the THEN will be executed, which is a divide by zero operation that will trigger an error. If we are able to see the error, we know that the first letter of the admin password is ‘b’. However, if it is not, the ELSE will be executed and it just returns an ‘a’. It will resolve to ‘a’ = ‘a’ and it won’t cause any error. If an error wasn’t caused, we will need to try with another letter instead of ‘b’
Using the error messages content
Sometimes, when triggering an error, we will be able to see the error message. Errors messages can be used to see the result of the query, so we can transform a Blind SQL injection into a SQL injection where we can see the results.
Using the CAST keyword (enables you to convert one data type to another), we could try to transform the result of the query into an incompatible data type. Assuming that what we want to extract is a string, we could try to transform it into an int. This will cause an error, and in the error message the string value that we are trying to transform could be displayed.
For example, a query containing this:
1
CAST((SELECT password FROM users where username = 'admin') AS int)
could lead to an error message like this:
ERROR: invalid input syntax for type integer: "ab12345678"
Where “ab12345678” is the admin’s password.
Time-based SQLi
What if data is not displayed, there are no differences in the responses, and errors are handled? Then we can do something similar to generating a conditional error, but instead of generating an error, make the database sleep for some seconds.
Since SQL queries are generally processed synchronously, if we delay the SQL query we will also delay the HTTP response, this means that the response may take longer to come if the time delay was executed successfully.
If we are able to inject this:
1
… AND IF (SELECT SUBSTRING(password,1,1) FROM users WHERE username = 'admin') = 'a' WAITFOR DELAY '0:0:10’
and the response takes 10 seconds or more to come, this means that the first letter of the admin’s password is an ‘a’
Out of Band SQL injections
When Time-based SQLi injections are not an option because the query is asynchronous or the database is configured with a maximum time, another option is to perform an Out of Band Injection.
To perform this, you are required to control another system, like a DNS server, so is more complex.
In this type of injection, you make the database perform a request (like a DNS lookup or a HTTP request) to a server you control. Within the request, there is the information that you extracted. So the data is not sent directly to you, but to another server. Then you can collect the retrieved data from that server.
If you want to go full hacker goblin mode, with more details about how to perform Out of Band SQLi, bypass WAF, and much more I recommend this webpage and the SQLi cheat sheet from Portswigger.
You can also inject SQL payloads in JSON or XML data sent by the webpage. However, you will have to considere that the query may be parsed differently (it may not involve direct concatenation) so you may need to identify the scenario. For example, it may not be necessary to close the string using a single quote (
'
).
You can use the Hackvector Burp extension to encode payloads and try to bypasss WAFs