JSP/Servlet | Passing Values Between Client and Server | Escaping Received Data

When learning how to exchange data between a client and server, there is something else you need to learn at the same time: data escaping.

When using values sent by users, keep in mind that you cannot know in advance what a user might send. JavaScript deserves particular attention. What happens if someone types a script inside a <script> tag into an input field and sends it? If your code displays that text on the screen as-is, the script may run when the page appears. Recent browsers often block this, but you should not rely on that. In systems such as bulletin boards or comment sections, if such a script is posted and displayed, it may run every time another person accesses the page.

XSS, or cross-site scripting, is a type of attack that exploits this kind of vulnerability. The small samples here may not cause a serious problem, but once you build a system that stores data in a database and displays it later, you cannot avoid this kind of issue. At this stage, let’s at least learn how to handle it.

The basic countermeasure for this kind of attack is to escape text before outputting it. For example, when text is displayed as ordinary page content, replacing the symbols used in HTML tags, < and >, with &lt; and &gt; is enough to keep the browser from treating that text as tags. By escaping symbols that have a special meaning, you neutralize whatever behavior the data was carrying.
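As a rough illustration of the difference, the sketch below builds the same HTML fragment twice, once with the raw text and once with < and > replaced. The class name EscapeDemo, its method names, and the sample comment text are made up for this illustration, and it handles only the two symbols named above.

```java
// Hypothetical demo class; not part of the sample page.
final class EscapeDemo {

    // Replaces only the tag delimiters, as described in the text above.
    static String escapeTags(String s) {
        return s.replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wraps a comment in a paragraph, the way a bulletin board page might.
    static String renderComment(String comment) {
        return "<p>" + comment + "</p>";
    }

    public static void main(String[] args) {
        String posted = "<script>alert('hi')</script>";

        // Raw output: the browser receives a real script element.
        System.out.println(renderComment(posted));

        // Escaped output: the browser shows the text instead of running it.
        System.out.println(renderComment(escapeTags(posted)));
    }
}
```

The second line printed contains no < or > from the posted text, so a browser would display it as plain characters rather than interpret it as a tag.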

Look at the very simple example below.

<%@ page language="java" contentType="text/html; charset=utf-8"
    pageEncoding="utf-8"%>
<%
String inpt = request.getParameter("input");
inpt = inpt == null ? "" : inpt;
String chk = request.getParameter("check");
chk = chk == null ? "OFF" : "ON";
String rd = request.getParameter("radio");
rd = rd == null ? "" : rd;

String str = "INPUT: " + getEscapedString(inpt) + "<br>" +
        "CHECK: " + getEscapedString(chk) + "<br>" +
        "RADIO: " + getEscapedString(rd) + "<br>";
%>
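<%!
// Escapes the characters that have a special meaning in HTML.
// "&" must be replaced first; otherwise the "&" in entities such as
// "&lt;" produced by the later steps would be escaped a second time.
public String getEscapedString(String s){
    String str = s;
    str = str.replace("&","&amp;");
    str = str.replace("<","&lt;");
    str = str.replace(">","&gt;");
    str = str.replace("\"","&quot;");
    return str;
}
%>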
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Sample jsp</title>
    <style>
    h1 {font-size:16pt; background:#AAFFAA; padding:5px; }
    </style>
</head>
<body>
    <h1>Sample jsp page</h1>
    <p>This page is a sample.</p>
    <p><%=str%></p>
    <form method="post" action="hello.jsp">
    <table>
        <tr>
            <td>Input</td>
            <td><input type="text" id="input" name="input"></td>
        </tr>
        <tr>
            <td></td>
            <td><input type="checkbox" id="c1" name="check" value="Une"><label for="c1">Checkbox</label></td>
        </tr>
        <tr>
            <td></td>
            <td>
                <input type="radio" name="radio" id="r1" value="first"><label for="r1">Radio button 1</label><br>
                <input type="radio" name="radio" id="r2" value="Second"><label for="r2">Radio button 2</label>
            </td>
        </tr>
        <tr>
            <td></td>
            <td><input type="submit" value="Send"></td>
        </tr>
    </table>
    </form>
</body>
</html>

Here, when text is sent, symbols such as <, >, ", and & in the submitted text are escaped before display. The escaping process is defined in a method named getEscapedString. When outputting text, pass the text to getEscapedString and display the escaped result.
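The same escaping can also be written outside a JSP as a plain Java class, which makes it easier to try on its own. The following is only a sketch: the class name HtmlEscaper is hypothetical, it walks the string once instead of chaining replace calls (so the order of replacements no longer matters), it returns an empty string for null, and it also escapes the single quote, which the sample page's method does not do.

```java
// Hypothetical standalone variant of the page's escaping helper.
final class HtmlEscaper {

    // Returns a copy of s with HTML special characters replaced by entities.
    static String escape(String s) {
        if (s == null) {
            return ""; // assumption: treat a missing value as empty text
        }
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break; // addition not in the sample page
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("<a href=\"x\">Tom & Jerry</a>"));
    }
}
```

Because each character is examined exactly once, text that already looks like an entity, such as &lt;, is escaped to &amp;lt; and is displayed literally as the characters the user typed.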

One point to be careful about: do not assume that only text input values need to be escaped. In this example, even the checkbox and radio button values are processed with getEscapedString. You might think, “Those submitted values are fixed, so isn’t this unnecessary?” However, getParameter returns values in the same way for both GET and POST requests. For example, someone can access a URL such as hello.jsp?radio=hogehoge and pass a value that neither radio button would send. (The check value in this sample is reduced to ON or OFF before display, but the radio value is displayed as received.) Unless the page checks getMethod and accepts only POST, that value is processed as-is, and even a POST request can be crafted by hand with arbitrary values.
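If you also want to reject unexpected values rather than only escape them, one possible approach is to compare the received value against the values the form can actually send. The sketch below is an assumption-laden illustration, not part of the sample page: the class RequestChecks and its methods are hypothetical, the allowed values are copied from the two radio buttons in the form above, and in a JSP the arguments would come from request.getParameter("radio") and request.getMethod().

```java
import java.util.Set;

// Hypothetical helper; not part of the sample page.
final class RequestChecks {

    // Values the two radio buttons in the sample form can send.
    private static final Set<String> ALLOWED_RADIO = Set.of("first", "Second");

    // Returns the value if the form could have sent it, otherwise an empty string.
    static String radioOrEmpty(String value) {
        return value != null && ALLOWED_RADIO.contains(value) ? value : "";
    }

    // True only when the HTTP method name is POST.
    static boolean isPost(String method) {
        return "POST".equalsIgnoreCase(method);
    }

    public static void main(String[] args) {
        System.out.println(radioOrEmpty("first"));    // a value the form sends
        System.out.println(radioOrEmpty("hogehoge")); // a tampered value becomes empty
        System.out.println(isPost("GET"));
    }
}
```

Checks like these narrow what the page accepts, but they do not replace escaping, since a hand-crafted POST request can still carry any text in the free-form input field.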

For this reason, whenever a program outputs text to the client side, it should always escape it.