We recently experimented with crowdsourcing the review of outgoing campaigns from MailChimp’s servers. Normally, if our Omnivore algorithms detect something suspicious about a campaign, we’ll automatically suspend the account and follow up with a review by our internal Compliance Team. But we’ve been testing the idea of also sending the campaign to Amazon’s Mechanical Turk service for manual review by humans. We simply showed the email to a "turker" and asked them, "Is this spam?"
The experiment only involved sending roughly 7,000 email campaigns over to be reviewed. But within the first 2 days, we started getting back some unexpected, yet fascinating results.
In particular, there were certain email templates that kept getting repeatedly flagged as spam by these human reviewers, even though they weren’t spam at all.
All these "false positives" had some common design traits, so we thought we should share our findings…
How Did The Experiment Work?
When Omnivore detected an email that had traits of potential abuse, we sent it to Mechanical Turk. A copy of the email (sans private data, like recipient information) was displayed inside of an interface that looked something like this:
We listed some rules at the top, presented the campaign below them, then asked the reviewer to tell us, back at the top of the page, whether the email violated any of the listed rules. User Interface snobs will notice that this interface looks like it was QWERTY-fied (designed to slow users down a little). We could’ve used very simple "Is this spam? Yes/No" buttons, but you don’t want people judging too fast.
How Effective Was The Experiment?
The experiment went as well as you’d expect, using people who weren’t heavily trained on the intricacies of permission-based email marketing. Generally speaking, Turkers like to work fast, so they’re best for picking out the most egregious offenders (think along the lines of porno or pharma spam). To that end, they’re great at catching the really evil spammers who try to penetrate our system and send extremely bad stuff that would jeopardize our deliverability.
But when it came to reviewing an email from, say, a reputable business that purchased a not-so-reputable list from a local chamber of commerce, the reviewers experienced some difficulty. So crowdsourcing is good, but not a silver bullet with respect to abuse prevention (we are still crowdsourcing, but the experiment has changed significantly).
Though we weren’t thrilled with the initial results, this exercise revealed a lot about how people look at email design.
21 Seconds To Decide
Mechanical Turk measures how much time people spend performing each review, so we can tell when people are just clicking random stuff and moving on to their next task. On average, the human reviewers spent only 21 seconds reviewing these "false positive" emails. Now, we can’t read their minds, so there’s no reliable way of telling if they bothered to check for "permission reminders" or "CAN-SPAM compliance" in the footers. But it’s safe to say they weren’t doing a very thorough analysis. I’d wager that most of that 21 seconds was spent reading the criteria at the top of the interface, and not the email itself. They definitely weren’t visiting the senders’ websites to see if there was a proper signup form, and testing to see if they used opt-in best practices. They were making relatively quick, gut-level decisions on whether or not an email "looked spammy."
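To make the timing idea concrete, here is a minimal sketch of how review times could be averaged and how suspiciously fast reviews could be picked out. It is not our production code: the record layout, the sample numbers, and the 5-second cutoff are all assumptions made up for illustration.

```python
from statistics import mean

# Hypothetical review records: (reviewer_id, seconds_spent, flagged_as_spam).
# The layout and the values are invented for this example.
reviews = [
    ("r1", 4, True),
    ("r2", 25, True),
    ("r3", 34, False),
    ("r4", 21, True),
]

# Assumed cutoff: anything faster is treated as a random click.
MIN_SECONDS = 5


def average_seconds(records):
    """Return the mean time spent per review, in seconds."""
    return mean(seconds for _, seconds, _ in records)


def too_fast(records, min_seconds=MIN_SECONDS):
    """Return the ids of reviewers whose review took less than min_seconds."""
    return [reviewer for reviewer, seconds, _ in records if seconds < min_seconds]


print(average_seconds(reviews))
print(too_fast(reviews))
```

A cutoff like this only catches the random clickers. It says nothing about whether a 21-second review was spent on the rules or on the email itself.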
The False Positives
Below are some email designs that kept getting marked as spam by Mechanical Turk reviewers. Keep in mind that at the time of this experiment, none of the senders of these emails were determined to be abusive. Their email stats suggested they were sending permission-based emails. Their recipients probably knew the emails were legit — but our independent reviewers did not.
1. Want to learn Japanese or Chinese?
In general, I think the above email has some layout issues that make it look a bit sloppy. Their images are breaking the template. At the top, where people are accustomed to seeing a logo, the sender only used text. In fact, the text isn’t even the company’s name, but a bright red "salesy" kind of question: "Want to learn Japanese or Chinese?" Doesn’t exactly inspire confidence that you know your recipient, or what he’s interested in. Unfortunately, the Chinese characters don’t help their reputation much either. We’ve all received a bit too much of this in our inboxes:
2. The Red Flyer
I’m sure that loyal customers of this local pizzeria were happy to get an offer for a free t-shirt:
But I don’t think our human reviewers liked the "hyperlink blue" Verdana font, then the giant red "FREE" text below that (then the green text below that, then the blue text below that, then the gray text below that). Something about this email made it look more like a stock template for a flyer, not an email newsletter to loyal customers. I couldn’t help but think that the scrunched-up airplane logo looked like those images that spammers try to skew, in order to get around anti-spam filters that scan the content of images:
Aside from the image quality issues, some extra copy could’ve been added to demonstrate that this email was being sent to their customers. Don’t get me wrong. T-shirt giveaways can be extremely effective (here are some stats to prove it), but you should probably do more than just yell "FREE T-SHIRT!"
At the very least, an image of the actual t-shirt seems in order.
Here’s a nice example from ScoutMob:
3. Not Plain Enough Text
This email repeatedly got marked as spam by our reviewers:
You’ll notice it has no images. No branding, no logos, no photos.
Yes, one could make the case that plain, old-fashioned, text-only emails can be more personal, and therefore more effective under some circumstances.
But if you’re gonna go all-text, you need to go all the way, baby. Centered text, colored backgrounds, and colored borders look like you’re going for an HTML email look. But when you fail to include any logos or images, it looks half-baked. Like a spammer, getting all "Rich Text:"
Even if you don’t have a logo, one way of showing your brand is to include your website’s domain. But this sender used the bit.ly URL shortener instead:
In their defense, that’s probably because the link to the event they’re promoting was really long or something (webinar links get that way sometimes). The problem is that spammers are known to hide malicious links behind reputable URL shorteners (see: URL Shorteners and Blacklists), so that helpful little link just ends up hurting them.
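If you want to check your own campaigns for this before sending, a rough sketch like the one below would do. The list of shortener domains is a small illustrative sample, and the function name and regular expression are my own assumptions, not part of any MailChimp tool.

```python
import re
from urllib.parse import urlparse

# Illustrative sample of well-known shortener domains; a real list would be longer.
SHORTENER_DOMAINS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co"}

# Simple pattern for http(s) links in plain text or HTML (an assumption, not exhaustive).
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def shortened_links(email_body):
    """Return the links in email_body whose host is a known URL shortener."""
    links = URL_PATTERN.findall(email_body)
    return [link for link in links if urlparse(link).hostname in SHORTENER_DOMAINS]


body = "Register at http://bit.ly/abc123 or visit https://www.example.com/events"
print(shortened_links(body))
```

Any link this turns up is a candidate for swapping in a URL on your own domain, so readers and reviewers can see who they’re dealing with.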
4. Read it and Weep
This one was actually surprising to me, because I thought it was well designed:
The title font even looks customized (it’s not Arial, it’s not Verdana, and it’s certainly not Comic Sans). It’s laid out pretty nicely. The pink is a custom color, too. The only possible problem that I can see is that it’s extremely text-heavy, with zero images. To the untrained eye, it almost falls into that "not plain-enough text" category above, but this doesn’t look half-baked or sloppy at all to me. This email shows signs of actual craftsmanship and skill with typography (web design is 95% typography, right?). This sender’s subscribers are probably fine with all this text (the sender is an author, after all). But to our independent reviewers, this email apparently looked pretty spammy. In this case, I personally wouldn’t change my design or behavior. If I had to make recommendations, I’d consider adding elements that made it look more "newslettery." Perhaps a small avatar of the author could be worked into the template’s footer, or some "share this on social sites" icons. If this is all about the written word, and images are forbidden, text can be ornamental too.
5. Set it and Forget it
Senders that used one of our stock RSS-to-email templates seemed to get flagged the most:
As I write this article, we’re actually working on tweaking this template so that the header is more customizable (forcing the title to be ALL CAPS, in retrospect, was not a great idea).
But many of the bloggers who used this template didn’t bother customizing the RSS merge tags any further to include images from their posts. They didn’t customize the fonts, link colors, or anything at all, it seems.
I also wonder if, in some cases, the Table of Contents was so large, our independent reviewers didn’t bother scrolling down to look for real content. All they saw was a bunch of nonsensical looking TOC links. This happens if you update your blog frequently, but you schedule your RSS-to-email campaign to go out in weekly or monthly batches. Not that I’d change my behavior just for random Mechanical Turk reviewers. What your subscribers want is more important.
But there’s a broader lesson here on image vs. text balance. A similar example plucked from my spam folder in Gmail:
Why This Is Important To Email Marketers
When you send a lot of email marketing, even to a totally permission-based double opt-in list, you’re going to get some spam complaints from your recipients. It’s inevitable. Sometimes it’s because they’re too lazy to click your unsub link, sometimes they think the "spam" button is the unsub link, and sometimes they forgot signing up for your list (maybe because you send infrequently, like me).
So even if you do your list management right, and you design everything perfectly around your subscribers’ expectations, we always recommend that you give some consideration to this "secret" audience that also reads your email (See: "What makes a good permission reminder?"). Don’t bend over backwards for them, or anything.
It’s kind of like how your mother always told you to wear clean underwear, "in case you’re in an accident." Take a good look at your email templates, and ask yourself, "If my email got reported as spam, and some spamcop laid eyes on it, what would they think? Would mom be proud?"