
Building a Spam-Filtering PHP Contact Form

 

The basic techniques of building contact forms are well known. Equally well known is that an unfiltered contact form will soon be inundated with spam, which may or may not be caught downstream before it reaches its recipients. Mail servers usually "whitelist" mail sent by the server itself, on the assumption that most folks won't spam themselves. Unfortunately, that setting can also let contact-form spam bypass downstream mail filters.

Most contact-form spam-filtering methods analyze the data after it's been submitted by the user. My system does a bit of that, too. But I've found in more than 20 years of writing forms that effective contact form spam-filtering has to begin with the form itself.

In fact, an effective contact form spam-filtering system starts working before the contact form page even opens on the user's screen. Before the page renders, the form will have gathered information about the user's visit and their device that will help the processing script decide whether or not to accept and relay the message.

 

Preventing Spammers from Bypassing Your Contact Form

Of course, for a contact form to be an effective part of a spam-filtering system, it has to be the only way messages can be sent. The most elegantly coded form in the world won't do a bit of good if spammers can bypass it.

PHP contact forms work by passing the user's input to a form processor, which is another script that collects the information that the user submitted and compiles it into an email that is sent to the recipient. Typically, the form processor also sends the user to either a success page or a failure page, depending on whether or not the submission was accepted.

One trick that both human and robotic spammers use to send spam through Web contact forms is to bypass the form page altogether and submit their spam directly to the processor script. This is pretty easy to do because the URL of the form processor is coded into the form and is human- and machine-readable in the page source. If the form processor has no way to verify that the submission came from the form, a script can be crafted to mimic the form and bombard the form processor with spam submissions.

Attempts to restrict the form processor to requests made through the form by using session variables, cookies, referrer information, or hidden form values are not very reliable, because a smart spammer can spoof all of them. There is, however, a highly reliable way to restrict a form processor's input to what is submitted through the form, and it doesn't require a database.

 

Using Temporary PHP Files to Restrict Access to the Mail Form Processor

The simplest and best way I know of to prevent spammers from bypassing your contact form is to create a temporary file when the form is loaded. The file will be populated with information collected by the form page that the form processor can use to check whether the submission came through the form, as well as other information that can be used for spam-filtering or other purposes. Because I work in PHP, I create a PHP file that can be easily included into the form processor script.

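For the purpose of this site's example, I will collect the following information when the example form page is loaded: the time the page was loaded, the browser's user-agent string, the visitor's IP address, and the referring URL, if available.

I'm also going to create an MD5 hash of the concatenated timestamp and remote IP address to use as a token.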
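
As a small illustration of the token, here is a sketch that hashes a sample timestamp and IP address. The helper name `makeToken()` and the sample values are made up for this example; they are not part of the form script itself.

```php
<?php
// Hypothetical helper: builds the token by hashing the concatenated
// timestamp and IP address with MD5, as described above.
function makeToken(int $timestamp, string $ip): string
{
    return md5($timestamp . $ip);
}

$token = makeToken(1700000000, '203.0.113.7'); // sample values only

// MD5 always yields a 32-character hex string, and the same inputs
// always yield the same token, so another script can recompute it.
var_dump(strlen($token) === 32);                                     // bool(true)
var_dump(hash_equals($token, makeToken(1700000000, '203.0.113.7'))); // bool(true)
```

Because both inputs can be known to the visitor, the token is best thought of as a fingerprint of the visit rather than a secret.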

The PHP file will be stored in the tmp/ directory, one level above the Web root (that is, one level above public_html, html, htdocs, httpdocs, or web, depending on your operating system and the server software you're using). That means it will not be accessible to Web users, but will be accessible to PHP.
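
If you would rather not hard-code the path, one possible approach is to derive it from the Web root. This is only a sketch: `tempDirFor()` is a hypothetical helper, and it assumes the tmp/ directory sits beside the Web root directory, as in the layout described above.

```php
<?php
// Hypothetical helper: returns the tmp/ directory located one level
// above the given Web root (that is, beside public_html, htdocs, etc.).
function tempDirFor(string $webRoot): string
{
    return dirname(rtrim($webRoot, '/')) . '/tmp';
}

echo tempDirFor('/home/yourusername/public_html'), "\n";
// prints /home/yourusername/tmp
```

On a live server you might call it as `tempDirFor($_SERVER['DOCUMENT_ROOT'])`, assuming your host sets that value to the actual Web root.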

The method I'll be outlining here also requires that we start a session. Although we will not be passing any session variables, we will be using the session ID as part of the temporary file name.

Let's look at the script.

Start the Session and Collect some Data

    <?php
    session_start();
    date_default_timezone_set('America/New_York'); // replace with the server's time zone
    $startTime = time(); // gets start time
    $startBrowser = $_SERVER['HTTP_USER_AGENT'] ?? ''; // gets browser, if the header was sent
    $startIP = $_SERVER['REMOTE_ADDR']; // gets IP address
    $startHash = md5($startTime . $startIP); // creates a hash to use as a token
    $startReferer = $_SERVER['HTTP_REFERER'] ?? ''; // gets referring URL, if available

Create the File Content

This section creates the code that will be inside the PHP file. Note that when you're using PHP to write a PHP file that contains PHP code, you have to escape the leading $ of variable names with \ . That will tell PHP to write the expression to the file rather than interpreting it.

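    /* The following can be placed on one line. It is split up here for clarity.
       var_export() writes each value as a safely quoted PHP literal, so
       client-supplied text (user agent, referrer) cannot break out of the
       string and run as code when the file is included. */
    $tempFileContent = "<?php " . "\$startTime = " . var_export($startTime, true) . "; "
                    . "\$startBrowser = " . var_export($startBrowser, true) . "; "
                    . "\$startIP = " . var_export($startIP, true) . "; "
                    . "\$startHash = " . var_export($startHash, true) . "; "
                    . "\$startReferer = " . var_export($startReferer, true) . "; ?>";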
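
To see what the backslash does, here is a tiny standalone demonstration. The sample IP address is made up for illustration.

```php
<?php
$startIP = '203.0.113.7'; // sample value for illustration only

// Unescaped: PHP substitutes the variable's value into the string.
$interpolated = "$startIP";

// Escaped: the backslash makes PHP emit a literal dollar sign,
// so the variable's name survives into the generated code.
$literal = "\$startIP";

echo $interpolated, "\n"; // prints 203.0.113.7
echo $literal, "\n";      // prints $startIP
```

Single-quoted strings never interpolate variables, so `'$startIP'` would also produce the literal name.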

Define the File Path and File Name and Write the File

This section defines a file named "[session ID].php" and writes the code generated in the previous step into it.

    $tempFileName = "/home/yourusername/tmp/" . session_id() . ".php"; // defines the path and file name
    file_put_contents($tempFileName,$tempFileContent); // writes the file
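
To show the round trip in isolation, the following sketch writes a small generated file and then reads the values back with `include`. It is not the form processor itself: the sample values, the use of the system temp directory, and the random file name are assumptions made so the example can run anywhere, and the values are quoted with `var_export()`.

```php
<?php
// Sample values standing in for what the form page would collect.
$startTime = 1700000000;
$startIP   = '203.0.113.7';

// Build the generated file's source; var_export() emits each value
// as a PHP literal.
$tempFileContent = "<?php "
    . "\$startTime = " . var_export($startTime, true) . "; "
    . "\$startIP = " . var_export($startIP, true) . "; ?>";

// Hypothetical location: the system temp directory plus a made-up name.
$tempFileName = sys_get_temp_dir() . '/example-' . bin2hex(random_bytes(8)) . '.php';
file_put_contents($tempFileName, $tempFileContent);

// Simulate a separate script: clear the variables, then read them back.
unset($startTime, $startIP);
include $tempFileName;

var_dump($startTime); // int(1700000000)
var_dump($startIP);   // string(11) "203.0.113.7"

unlink($tempFileName); // remove the sample file
```

A script that includes the file gets the stored values back as ordinary variables, with no database or session variables involved.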

Housekeeping

Because the PHP files created by the above script aren't really temporary files, we need to clean them up once in a while to prevent the tmp/ directory from filling up with garbage. One easy way to do this is to append the following code, which will check for and remove any PHP files in the tmp/ directory that are more than 96 hours old every time the form is loaded:

    $oldFiles = glob('/home/yourusername/tmp/*.php') ?: []; // get all file names ending with .php (empty array on failure)
    foreach ($oldFiles as $file) {
        $lastModifiedTime = filemtime($file);
        $currentTime = time();
        $timeDiff = abs($currentTime - $lastModifiedTime) / (60 * 60); // file age in hours
        if (is_file($file) && $timeDiff > 96) { // checks if file is more than 96 hours old
            unlink($file); // deletes the file
        }
    }
    ?>

Running that code every time the form page is loaded, rather than when the processor script runs, also ensures that files created by users who don't submit the form are eventually deleted.

I chose 96 hours because these are tiny files that don't take up much space, and it's not unheard of for users to open a form page, minimize the browser window, and return to the form a few days later. Because this is something a spammer would never do, there's no harm in leaving the files there for a while. An alternative would be to expire the page after a while.
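
If you want to experiment with different retention periods, the cleanup can be wrapped in a function. This is a sketch under my own assumptions: `purgeOldFiles()` is a hypothetical name, and the demonstration runs in a throwaway directory rather than the real tmp/ directory.

```php
<?php
// Hypothetical helper: deletes *.php files in $dir whose modification
// time is more than $maxAgeHours in the past; returns how many were removed.
function purgeOldFiles(string $dir, float $maxAgeHours): int
{
    $removed = 0;
    $cutoff  = time() - (int) ($maxAgeHours * 3600);

    foreach (glob($dir . '/*.php') ?: [] as $file) {
        if (is_file($file) && filemtime($file) < $cutoff && unlink($file)) {
            $removed++;
        }
    }
    return $removed;
}

// Demonstration in a throwaway directory.
$dir = sys_get_temp_dir() . '/purge-demo-' . bin2hex(random_bytes(4));
mkdir($dir);
touch($dir . '/old.php', time() - 97 * 3600); // pretend this file is 97 hours old
touch($dir . '/new.php');                      // just created

echo purgeOldFiles($dir, 96), "\n";        // prints 1
var_dump(file_exists($dir . '/new.php'));  // bool(true)

unlink($dir . '/new.php'); // tidy up the demonstration files
rmdir($dir);
```

Passing the age limit as a parameter keeps the 96-hour choice in one place if you later decide on a shorter or longer window.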

The next step would be to add the page HTML, including the form code. So the finished form submission page might look like this:

    <?php
    session_start();
    date_default_timezone_set('America/New_York'); // replace with the server's time zone
    $startTime = time(); // gets start time
    $startBrowser = $_SERVER['HTTP_USER_AGENT'] ?? ''; // gets browser, if the header was sent
    $startIP = $_SERVER['REMOTE_ADDR']; // gets IP address
    $startHash = md5($startTime . $startIP); // creates a hash to use as a token
    $startReferer = $_SERVER['HTTP_REFERER'] ?? ''; // gets referring URL, if available

    /* The following can be placed on one line. It is split up here for clarity.
       var_export() writes each value as a safely quoted PHP literal, so
       client-supplied text cannot break out of the string and run as code. */
    $tempFileContent = "<?php " . "\$startTime = " . var_export($startTime, true) . "; "
                    . "\$startBrowser = " . var_export($startBrowser, true) . "; "
                    . "\$startIP = " . var_export($startIP, true) . "; "
                    . "\$startHash = " . var_export($startHash, true) . "; "
                    . "\$startReferer = " . var_export($startReferer, true) . "; ?>";
    $tempFileName = "/home/yourusername/tmp/" . session_id() . ".php"; // defines the path and file name
    file_put_contents($tempFileName, $tempFileContent); // writes the file

    $oldFiles = glob('/home/yourusername/tmp/*.php') ?: []; // get all file names ending with .php
    foreach ($oldFiles as $file) {
        $lastModifiedTime = filemtime($file);
        $currentTime = time();
        $timeDiff = abs($currentTime - $lastModifiedTime) / (60 * 60); // file age in hours
        if (is_file($file) && $timeDiff > 96) { // checks if file is more than 96 hours old
            unlink($file); // deletes the file
        }
    }
    ?>

    <!DOCTYPE html>
    <html>
    <head>
        <title>Test Form Page</title>
    </head>

    <body>

    <form method="post" action="path-to-form-processor.php">
    
        <p>
        <input name="name" required>
        <label>Name</label>
        </p>

        <p>
        <input name="email" type="email" required>
        <label>Email</label>
        </p>

        <p>
        <input name="phone" required>
        <label>Phone</label>
        </p>

        <p>
        <select name="subject">
            <option label="Compliment" value="Compliment">Compliment</option>
            <option label="Complaint" value="Complaint">Complaint</option>
            <option label="Poem" value="Poem">Poem</option>
            <option label="Song" value="Song">Song</option>
        </select>
        <label>Subject</label>
        </p>

        <p>
        <textarea name="message" required></textarea>
        <label>Message</label>
        </p>

        <input id="submit" name="submit" type="submit" value="Submit">

    </form>

    </body>
    </html>

Easy enough, right? Let's look at the next page to learn how the temp file is used on the form processor end of things.