Interesting as this is, how does it relate to security?
Well, consider a guestbook as an example. Here, users are invited to enter a message into a form, which then gets displayed on the HTML page along with everyone else’s messages. For now, we won’t go intodatabase security issues, the problems dealt with below can occur whether the data is stored in a database, a file, or some other construct.
If a user enters data which contains HTML, or even JavaScript, then when the data is included into your HTML for display later, their HTML or JavaScript will also get included.
If your guestbook page displayed whatever was entered into the form field, and a user entered the following,
Hi, I <b>love</b> your site.
Then the effect is minimal, when displayed later, this would appear as,
Hi, I love your site.
Of course, when the user enters JavaScript, things can get a lot worse. For example, the data below, when entered into a form which does not prevent JavaScript ending up in the final displayed page, will cause the page to redirect to a different website. Obviously, this only works if the client has JavaScript enabled in their browser, but the vast majority of users do.
Hi, I love your site. Its great!<script
language=”JavaScript”>document.location=”http://www.acunetix.com/”;</script>
For a split second when this is displayed, the user will see,
Hi, I love your site. Its great!
The browser will then kick in and the page will be refreshed from www.acunetix.com. In this case, a fairly harmless alternative page, although it does result in a denial of service attack; users can no longer get to your guestbook.
Consider a case where this was entered into an online order form. Your order dispatchers would not be able to view the data because every time they tried, their browser would redirect to another site. Worse still, if the redirection occurred on a critical page for a large business, or the redirection was to a site containing objectionable material, custom may be lost as a result of the attack.
Fortunately, PHP provides a way to prevent this style of PHP hack attack. The functions strip_tags(), nl2br() and htmlspecialchars() are your friends, here.
strip_tags() removes any PHP or HTML tags from a string. This prevents the HTML display problems, the JavaScript execution (the <script> tag will no longer be present) and a variety of problems where there is a chance that PHP code could be executed.
nl2br() converts newline characters in the input to <br /> HTML tags. This allows you to format multi-line input correctly, and is mentioned here only because it is important to run strip_tags() prior to running nl2br() on your data, otherwise the newly inserted <br /> tags will be stripped out when strip_tags() is run!
Finally, htmlspecialchars() will entity-quote characters such as <, > and & remaining in the input after strip_tags() has run. This prevents them being misinterpreted as HTML and makes sure they are displayed properly in any output.
Having presented those three functions, there are a few points to make about their usage. Clearly, nl2br() and htmlspecialchars() are suited for output formatting, called on data just before it is output, allowing the database or file-stored data to retain normal formatting such as newlines and characters such as &. These functions are designed mainly to ensure that output of data into an HTML page is presented neatly, even after running strip_tags() on any input.
strip_tags(), on the other hand, should be run immediately on input of data, before any other processing occurs. The code below is a function to clean user input of any PHP or HTML tags, and works for both GET and POST request methods.
function _INPUT($name)
{if ($_SERVER['REQUEST_METHOD'] == 'GET')