
Evaluating Data Submitted through PHP Contact Forms

 

On the previous page, we had PHP do some preliminary checking to make sure that the message had actually been submitted through the contact form, and that the form hadn't been submitted in less time than a fast human typist could fill it out. Now we're going to talk about evaluating the actual data that was submitted through the form.

Let's begin by reviewing the form processor code we've written thus far:

<?php
session_start();
date_default_timezone_set('America/New_York'); // replace with the server's time zone
$submitTime = time(); // gets submission time;
$submitBrowser = $_SERVER['HTTP_USER_AGENT'] ?? ''; // gets the current browser, if sent
$submitIP = $_SERVER['REMOTE_ADDR']; // gets the current IP address
$submitReferer = $_SERVER['HTTP_REFERER'] ?? ''; // gets the URL of page that sent the form data, if available

$checkFile = "/home/yourusername/tmp/" . session_id() . ".php";
if (!file_exists($checkFile)) {
    include("/success.php");
    die;
}

include("$checkFile");

$testHash = md5($startTime . $startIP);
if  (
    ($testHash !== $startHash) ||
    ($startBrowser !== $submitBrowser) ||
    ($submitTime - $startTime < 4)
    )
    {
        include("/success.php");
        die;
}

unlink($checkFile); // variable names are case-sensitive: $checkFile, not $checkfile
?>

Believe it or not, that simple code alone will eliminate the bulk of your spam submissions with a very low false-positive rate, as long as you are careful not to set the minimum time required to fill out the form too high. On a very long form this is easy, but it gets trickier with shorter forms. There are some scary-fast typists out there.

Another test you can do even before the form data is loaded into the processing script is to check the IP address that was used to submit the form against a blocklist of known Web form spammers. You can learn more about that here. It's especially useful if you're on shared hosting and don't have control over the server's firewall.
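As a rough illustration of the blocklist idea, here is a minimal sketch that checks the submitting IP address against a local text file. The function name isBlockedIp, the file location, and the one-address-per-line format are all assumptions made for this example; a real blocklist service may work quite differently.

```php
<?php
// Hypothetical helper: returns true if $ip appears in a plain-text
// blocklist file containing one IP address per line (assumed format).
function isBlockedIp(string $ip, string $blocklistFile): bool
{
    if (!is_readable($blocklistFile)) {
        return false; // no readable list means nothing is blocked
    }
    $blocked = file($blocklistFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($blocked === false) {
        return false;
    }
    // Tolerate stray whitespace around each entry.
    $blocked = array_map('trim', $blocked);
    return in_array($ip, $blocked, true);
}
```

In the form processor, a check like this could run right after $submitIP is assigned, discarding the message the same way the other preliminary tests do when the function returns true.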

Once a submitted message has passed the above preliminary tests, we can start collecting and evaluating the actual data submitted through the form. Using our sample form as a guide:

<!DOCTYPE html>
    <html>
    <head>
        <title>Test Form Page</title>
    </head>

    <body>

    <form method="post" action="path-to-form-processor.php">

        <p>
        <input name="name" required>
        <label>Name</label>
        </p>

        <p>
        <input name="email" type="email" required>
        <label>Email</label>
        </p>

        <p>
        <input name="phone" required>
        <label>Phone</label>
        </p>

        <p>
        <select name="subject">
            <option label="Compliment" value="Compliment">Compliment</option>
            <option label="Complaint" value="Complaint">Complaint</option>
            <option label="Poem" value="Poem">Poem</option>
            <option label="Song" value="Song">Song</option>
        </select>
        <label>Subject</label>
        </p>

        <p>
        <textarea name="message" required></textarea>
        <label>Message</label>
        </p>

        <input id="submit" name="submit" type="submit" value="Submit">

    </form>

    </body>
    </html>

We begin by assigning variables for the data submitted through the form:

$name = $_POST['name'];
$email = $_POST['email'];
$phone = $_POST['phone'];
$subject = $_POST['subject'];
$message = $_POST['message'];

And assigning the recipient email address:

$mailTo = "you@yourdomain.tld";
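Reading $_POST directly will raise a warning if a bot leaves a field out entirely. One possible way to guard against that is a small helper like the one below. The name postField is hypothetical and not part of the original script; it is only a sketch of the idea.

```php
<?php
// Hypothetical helper: fetch a form field as a trimmed string,
// returning '' when the field is missing or is not a string
// (for example, when a bot submits name[]=x to send an array).
function postField(array $post, string $key): string
{
    $value = $post[$key] ?? '';
    return is_string($value) ? trim($value) : '';
}

// Example use in the form processor:
$name = postField($_POST, 'name');
```

Passing the array in as a parameter, rather than reading $_POST inside the function, keeps the helper easy to test.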

Evaluating Form Data

Evaluating submitted form data can help reduce the amount of human-generated spam as well as robotic spam. But it can also lead to lots of false positives if you're not careful. It's best to limit your filtering to input that you are very confident, based on your own experience, is spam.

Where evaluating input comes in handy is for classifying borderline spam, by which I mean manually-entered offers of legitimate services that you didn't ask for, don't need, and aren't interested in receiving.

The way I usually evaluate submitted form data is to assign a number from 1 to 20 to suspicious entries, and then decide what to do with the message once all the numbers are added up. A total score of 20 or more will get a message deleted. A score from 10 to 19 will get the subject rewritten. A score of less than 10 will pass the message through without changes. Some examples of suspicious items may include:

The PHP preg_match function is a handy way to evaluate form submissions. Different matches can be assigned different values, and the total can determine the message's fate. For example, using a freemail address may only rate a 2 because many legitimate senders use freemail addresses. But an "http" in the name field can be assigned a 20, rendering that message spam all by itself.
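The weighted-match idea can also be written as a table of rules, which may be easier to maintain as the list grows. The sketch below is only one way to arrange it; the $rules array, the spamScore function, and the sample patterns and weights are illustrative assumptions.

```php
<?php
// Hypothetical rule table: each rule names a form field, a pattern,
// and the score to add when the pattern matches. Weights are examples.
$rules = [
    ['field' => 'name',  'pattern' => '/https?|www/i', 'score' => 20],
    ['field' => 'email', 'pattern' => '/@(yahoo|gmail|hotmail)\.com$/i', 'score' => 2],
];

// Add up the scores of every rule whose pattern matches its field.
function spamScore(array $fields, array $rules): int
{
    $score = 0;
    foreach ($rules as $rule) {
        $value = $fields[$rule['field']] ?? '';
        if (preg_match($rule['pattern'], $value) === 1) {
            $score += $rule['score'];
        }
    }
    return $score;
}
```

With these sample rules, a submission whose name contains a URL and whose address is at a freemail domain would score 22, while an ordinary name and a private domain would score 0.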

If I were a businessman getting a lot of unsolicited and unwanted offers for search engine optimization and business loans, I might use something like the following to evaluate form submissions:

$spam = 0;
// each pattern is a single regex; alternatives are separated by | inside one pair of delimiters
if(preg_match('/https?|www/i', $name)) { $spam = $spam + 20; }
if(preg_match('/(yahoo|gmail|hotmail)\.com/i', $email)) { $spam = $spam + 2; }
if(preg_match('/\bseo\b|search engine optimization|first page of google/i', $message)) { $spam = $spam + 10; }
if(preg_match('/business loan|business lender|quick approval/i', $message)) { $spam = $spam + 10; }
if($spam >= 20) {
    include("/success.php");
    die;
}
if($spam >= 10 ) {$subject = "**SPAM** " . $subject; }

Note that when you make successive evaluations based on numeric values, and messages over a certain score are going to be deleted outright, the deletion check should be the first of the threshold checks to run and should stop the script by way of exit or die as soon as it matches. Once you know that the message is garbage, there's no point in doing further processing.
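The ordering point above can be made explicit by putting the threshold checks in one small function that returns as soon as the highest threshold matches. This is just a sketch; the function name spamAction and the action labels are invented for the example, while the thresholds of 20 and 10 are the ones used on this page.

```php
<?php
// Hypothetical helper: map a total spam score to an action.
// The highest threshold is tested first so that a message known
// to be garbage is never processed any further.
function spamAction(int $score): string
{
    if ($score >= 20) {
        return 'discard';
    }
    if ($score >= 10) {
        return 'flag'; // e.g. rewrite the subject
    }
    return 'deliver';
}
```

The form processor could then branch on the returned label, calling die for 'discard' before any mail-related code runs.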

 

Sending the Mail

If the message is still alive after going through all the tests above, it's time to send it on its way. There are various ways to do this, but I'm going to use PHP's built-in mailer. A typical mail script would look something like this:


/* concatenate data for message */
$Body = "This is a Contact Form response from:";
$Body .= " ";
$Body .= $name;
$Body .= "\n";
$Body .= "Email Address: ";
$Body .= $email;
$Body .= "\n";
$Body .= "Phone: ";
$Body .= $phone;
$Body .= "\n";
$Body .= "Subject: ";
$Body .= $subject;
$Body .= "\n";
$Body .= "Message: ";
$Body .= $message;
$Body .= "\n";
$Body .= "Sent from IP address ";
$Body .= $submitIP;
$Body .= "\n";

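// send email (line breaks are stripped from $email so it can't inject extra headers)
$replyTo = str_replace(array("\r", "\n"), '', $email);
$headers = "From: robot@yourdomain.tld\r\n" .
    "Reply-To: $replyTo\r\n" .
    "X-Mailer: PHP/" . phpversion();
$success = mail($mailTo, $subject, $Body, $headers);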

// redirect to success or failure page
if ($success){
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://yourdomain.tld/success.php\">";
}
else{
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://yourdomain.tld/failed.php\">";
}
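Because the submitted email address is placed in the Reply-To header, it's worth validating it before it reaches mail(). The sketch below shows one possible approach using PHP's filter_var; the function name safeReplyTo is hypothetical, and returning an empty string for bad input is just one design choice among several.

```php
<?php
// Hypothetical helper: return a safe Reply-To address, or '' if the
// input is not a single, valid-looking email address.
function safeReplyTo(string $email): string
{
    $email = trim($email);
    // Reject anything with an embedded CR or LF, which could start a new header.
    if (preg_match('/[\r\n]/', $email) === 1) {
        return '';
    }
    $valid = filter_var($email, FILTER_VALIDATE_EMAIL);
    return $valid === false ? '' : $valid;
}
```

If the function returns an empty string, the script could either leave the Reply-To header out or add to the spam score, depending on how strict you want to be.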

Here's the entire form processing script, with annotations:

<?php
session_start();
date_default_timezone_set('America/New_York'); // replace with the server's time zone
$submitTime = time(); // gets submission time;
$submitBrowser = $_SERVER['HTTP_USER_AGENT'] ?? ''; // gets the current browser, if sent
$submitIP = $_SERVER['REMOTE_ADDR']; // gets the current IP address
$submitReferer = $_SERVER['HTTP_REFERER'] ?? ''; // gets the URL of page that sent the form data, if available

/* tell PHP where the temporary PHP check file is */
$checkFile = "/home/yourusername/tmp/" . session_id() . ".php";

/* silently discard the message if the checkfile is not found */
if (!file_exists($checkFile)) {
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://yourdomain.tld/success.php\">";
    die;
}

    /* or alternatively, redirect the user to your form page if the checkfile is not found
       (use this block in place of the one above; as written it would never be reached) */
    // if (!file_exists($checkFile)) {
    // print "<meta http-equiv=\"refresh\" content=\"0;URL=https://yourdomain.tld/yourformpage.php\">";
    // die;
    // }

include("$checkFile"); // include the checkfile

/* reconstruct the hash and do some spam tests */
$testHash = md5($startTime . $startIP);
if  (
    ($testHash !== $startHash) || // checks hash values
    ($startBrowser !== $submitBrowser) || // checks whether browsers match
    ($submitTime - $startTime < 4) // checks the time used to complete the form
    )
    {
        /* silently discard the message if any of the above fail */
        print "<meta http-equiv=\"refresh\" content=\"0;URL=https://yourdomain.tld/success.php\">";
        die;
}

unlink($checkFile); // delete the checkfile

/* import the form data */
$name = $_POST['name'];
$email = $_POST['email'];
$phone = $_POST['phone'];
$subject = $_POST['subject'];
$message = $_POST['message'];

$mailTo = "you@yourdomain.tld"; // assign the recipient email address

/* do some spam tests */
$spam = 0;
if(preg_match('/https?|www/i', $name)) { $spam = $spam + 20; }
if(preg_match('/(yahoo|gmail|hotmail)\.com/i', $email)) { $spam = $spam + 2; }
if(preg_match('/\bseo\b|search engine optimization|first page of google/i', $message)) { $spam = $spam + 10; }
if(preg_match('/business loan|business lender|quick approval/i', $message)) { $spam = $spam + 10; }

/* silently discard if $spam >= 20 */
if($spam >= 20) {
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://yourdomain.tld/success.php\">";
    die;
}
/* rewrite the subject if $spam >= 10 */
if($spam >= 10 ) {$subject = "**SPAM** " . $subject; }

    /* or alternatively, change the recipient email address if $spam >= 10 (uncomment to use) */
    // if($spam >= 10 ) {$mailTo = "someotheraddress@yourdomain.tld"; }

/* concatenate data for message */
$Body = "This is a Contact Form response from:";
$Body .= " ";
$Body .= $name;
$Body .= "\n";
$Body .= "Email Address: ";
$Body .= $email;
$Body .= "\n";
$Body .= "Phone: ";
$Body .= $phone;
$Body .= "\n";
$Body .= "Subject: ";
$Body .= $subject;
$Body .= "\n";
$Body .= "Message: ";
$Body .= $message;
$Body .= "\n";
$Body .= "Sent from IP address ";
$Body .= $submitIP;
$Body .= "\n";

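// send email (line breaks are stripped from $email so it can't inject extra headers)
$replyTo = str_replace(array("\r", "\n"), '', $email);
$headers = "From: robot@yourdomain.tld\r\n" .
    "Reply-To: $replyTo\r\n" .
    "X-Mailer: PHP/" . phpversion();
$success = mail($mailTo, $subject, $Body, $headers);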

// redirect to success or failure page
if ($success){
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://yourdomain.tld/success.php\">";
}
else{
    print "<meta http-equiv=\"refresh\" content=\"0;URL=https://yourdomain.tld/failed.php\">";
}
?>

And that's about it. Based on more than 20 years of experience writing contact forms, I've found that the simple scripts contained on this site, if carefully tweaked to your needs, will dramatically reduce the amount of spam you receive through your PHP contact form.

If you have any comments, questions, or suggestions, please feel free to contact me.