We recently experimented with crowdsourcing the review of outgoing campaigns from MailChimp’s servers. Normally, if our Omnivore algorithms detect something suspicious about a campaign, we’ll automatically suspend the account and follow up with a review by our internal Compliance Team. But we’ve been testing the idea of also sending the campaign to Amazon’s Mechanical Turk service for manual review by humans. We simply showed the email to a “turker” and asked them, “Is this spam?”
The experiment only involved sending roughly 7,000 email campaigns over to be reviewed. But within the first 2 days, we started getting back some unexpected, yet fascinating results.
In particular, there were certain email templates that kept getting repeatedly flagged as spam by these human reviewers, even though they weren’t spam at all.
All these “false positives” had some common design traits, so we thought we should share our findings…
How Did The Experiment Work?
When Omnivore detected an email that had traits of potential abuse, we sent it to Mechanical Turk. A copy of the email (sans private data, like recipient information) was displayed inside of an interface that looked something like this:
In general, we listed some rules at the top, then presented the campaign below it, then asked the reviewer to tell us if the email violated any of the listed rules, back at the top of the page. User Interface snobs will notice that in general, this interface looks like it was QWERTY-fied (designed to slow users down a little). We could’ve used very simple “Is this spam? Yes/No” buttons, but you don’t want people judging too fast.
How Effective Was The Experiment?
The experiment went as well as you’d expect, using people who weren’t heavily trained on the intricacies of permission-based email marketing. Generally speaking, Turkers like to work fast, so they’re best for picking out the most egregious offenders (think along the lines of porno or pharma spam). To that end, they’re great at catching the really evil spammers who try to penetrate into our system and send extremely bad stuff that would jeopardize our deliverability.
But when it came to reviewing an email from, say, a reputable business that purchased a not-so-reputable list from a local chamber of commerce, the reviewers experienced some difficulty. So crowdsourcing is good, but not a silver bullet with respect to abuse prevention (we are still crowdsourcing, but the experiment has changed significantly).
Though we weren’t thrilled with the initial results, this exercise revealed a lot about how people look at email design.
21 Seconds To Decide
Mechanical Turk measures how much time people spend performing each review, so we can tell when people are just clicking random stuff and moving on to their next task. On average, the human reviewers spent only 21 seconds reviewing these “false positive” emails. Now, we can’t read their minds, so there’s no reliable way of telling if they bothered to check for “permission reminders” or “CAN-SPAM compliance” in the footers. But it’s safe to say they weren’t doing a very thorough analysis. I’d wager that most of that 21 seconds was spent reading the criteria at the top of the interface, and not the email itself. They definitely weren’t visiting the senders’ websites to see if there was a proper signup form, and testing to see if they used opt-in best practices. They were making relatively quick, gut-level decisions on whether or not an email “looked spammy.”
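For the curious, here is a minimal sketch of how review times like these could be summarized. It is not our actual tooling: the record layout, the reviewer IDs, the sample numbers, and the 10-second cutoff are all made up for illustration.

```python
from statistics import mean

# Hypothetical review records: (reviewer_id, seconds_spent, flagged_as_spam).
# These values are invented for illustration, not data from the experiment.
reviews = [
    ("r1", 4, True),
    ("r2", 21, True),
    ("r3", 35, False),
    ("r4", 24, True),
]

MIN_SECONDS = 10  # assumed cutoff for "probably just clicking through"


def summarize(reviews, min_seconds=MIN_SECONDS):
    """Return the average review time and the reviewers under the cutoff."""
    times = [seconds for _, seconds, _ in reviews]
    too_fast = [rid for rid, seconds, _ in reviews if seconds < min_seconds]
    return mean(times), too_fast


average, too_fast = summarize(reviews)
print(f"average review time: {average} seconds")
print(f"reviews under {MIN_SECONDS} seconds: {too_fast}")
```

A cutoff like this only catches the people who are obviously racing through. It says nothing about whether a 21-second review actually looked at the footer.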
The False Positives
Below are some email designs that kept getting marked as spam by Mechanical Turk reviewers. Keep in mind that at the time of this experiment, none of the senders of these emails were determined to be abusive. Their email stats suggested they were sending permission-based emails. Their recipients probably knew the emails were legit — but our independent reviewers did not.
1. Want to learn Japanese or Chinese?
In general, I think the above email has got some layout issues that make it look a bit sloppy. Their images are breaking the template. At the top, where people are accustomed to seeing a logo, the sender only used text. In fact, the text isn’t even the company’s name, but a bright red “salesy” kind of question: “Want to learn Japanese or Chinese?” Doesn’t exactly inspire confidence that you know your recipient, or what he’s interested in. Unfortunately, the Chinese characters don’t help their reputation much either. We’ve all received a bit too much of this in our inbox:
2. The Red Flyer
I’m sure that loyal customers of this local pizzeria were happy to get an offer for a free t-shirt:
But I don’t think our human reviewers liked the “hyperlink blue” Verdana font, then the giant red “FREE” text below that (then the green text below that, then the blue text below that, then the gray text below that). Something about this email made it look more like a stock template for a flyer, not an email newsletter to loyal customers. I couldn’t help but think that the scrunched up airplane logo looked like those images that spammers try to skew, in order to get around anti-spam filters that scan the content of images:
Aside from the image quality issues, some extra copy could’ve been added to demonstrate that this email was being sent to their customers. Don’t get me wrong. T-shirt giveaways can be extremely effective (here are some stats to prove it), but you should probably do more than just yell “FREE T-SHIRT!”
At the very least, an image of the actual t-shirt seems in order.
Here’s a nice example from ScoutMob:
3. Not Plain Enough Text
This email repeatedly got marked as spam by our reviewers:
You’ll notice it has no images. No branding, no logos, no photos.
Yes, one could make the case that plain, old-fashioned, text-only emails can be more personal, and therefore more effective under some circumstances.
But if you’re gonna go all-text, you need to go all the way, baby. Centered text, colored backgrounds, and colored borders look like you’re going for an HTML email look. But when you fail to include any logos or images, it looks half-baked. Like a spammer, getting all “Rich Text:”
Even if you don’t have a logo, one way of showing your brand is to include your website’s domain. But this sender used the bit.ly URL shortener instead:
In their defense, that’s probably because the link to the event they’re promoting was really long or something (webinar links get that way sometimes). The problem is that spammers are known to hide malicious links behind reputable URL shorteners (see: URL Shorteners and Blacklists), so that helpful little link just ends up hurting them.
4. Read it and Weep
This one was actually surprising to me, because I thought it was well designed:
The title font even looks customized (it’s not Arial, it’s not Verdana, and it’s certainly not Comic Sans). It’s laid out pretty nicely. The pink is a custom color, too. The only possible problem that I can see is that it’s extremely text-heavy, with zero images. To the untrained eye, it almost falls into that “not plain enough text” category above, but this doesn’t look half-baked or sloppy at all to me. This email shows signs of actual craftsmanship and skill with typography (web design is 95% typography, right?). This sender’s subscribers are probably fine with all this text (the sender is an author, after all). But to our independent reviewers, this email apparently looked pretty spammy. In this case, I personally wouldn’t change my design or behavior. If I had to make recommendations, I’d consider adding elements that made it look more “newslettery.” Perhaps a small avatar of the author could be worked into the template’s footer, or some “share this on social sites” icons. If this is all about the written word, and images are forbidden, text can be ornamental too.
5. Set it and Forget it
Senders that used one of our stock RSS-to-email templates seemed to get flagged the most:
As I write this article, we’re actually working on tweaking this template so that the header is more customizable (forcing the title to be ALL CAPS, in retrospect, was not a great idea).
But many of the bloggers who used this template didn’t bother customizing the RSS merge tags any further to include images from their posts. They didn’t customize the fonts, link colors, or anything at all, it seems.
I also wonder if, in some cases, the Table of Contents was so large, our independent reviewers didn’t bother scrolling down to look for real content. All they saw was a bunch of nonsensical looking TOC links. This happens if you update your blog frequently, but you schedule your RSS-to-email campaign to go out in weekly or monthly batches. Not that I’d change my behavior just for random Mechanical Turk reviewers. What your subscribers want is more important.
But there’s a broader lesson here on image vs. text balance. A similar example plucked from my spam folder in Gmail:
Why this is important to email marketers
When you send a lot of email marketing, even to a totally permission-based double opt-in list, you’re going to get some spam complaints from your recipients. It’s inevitable. Sometimes it’s because they’re too lazy to click your unsub link, sometimes they think the “spam” button is the unsub link, and sometimes it’s because they forgot signing up for your list (maybe because you send infrequently, like me).
And sometimes, when your email is marked as spam, a human from an ISP, or a human from an anti-spam organization, will actually do a manual review of your email (See: “Who’s secretly reading your emails?”). Some anti-spam organizations use volunteers, who are driven by passion more than pay (nothing wrong with that, but you have to wonder how detailed their training is). We’ve experienced enough “your client’s email has been reviewed by our team, and determined to be spam, so we’re blocking your IP range” situations to know that those reviewers don’t always do a thorough analysis of your list management practices (not part of their job description anyway). This is partly why our own terms of use seem so strict to some. ISPs get complaints, they look at your email, and they make a split-second decision to “blacklist or not.”
So even if you do your list management right, and you design everything perfectly around your subscribers’ expectations, we always recommend that you give some consideration to this “secret” audience that also reads your email (See: “What makes a good permission reminder?”). Don’t bend over backwards for them, or anything.
It’s kind of like how your mother always told you to wear clean underwear, “in case you’re in an accident.” Take a good look at your email templates, and ask yourself, “If my email got reported as spam, and some spamcop laid his eyes on it, what would they think? Would mom be proud?”
Related:
- How your email design can get you blacklisted
- Stupid Email Design Mistakes
- How to avoid spam filters (the non-human kind)
- Want 700,000 HTML email templates? (more fun w/Mechanical Turk)
- Is your email marketing human?
I’ve emailed your compliance team about a similar issue – I follow double opt in best practice on all lists yet MailChimp occasionally shuts down my account without warning.
When I mail a large list – say 100,000 recipients – all it takes is one person to make an unwarranted spam complaint and you shut down my account for several days while you investigate. It seems odd that larger lists are treated in the same way as someone who’s mailing just a handful of people. Particularly as it’s the users with larger lists who contribute disproportionately to your profits.
Similarly, when we send smaller campaigns via autoresponders it might take just one person to cite the campaign as spam to push us over the 1 per cent threshold that means you issue a warning. We send the responders daily and over the course of a week or even a month might get just the one complaint. Yet you look at each day in isolation.
I wondered whether other customers have been penalized by such seemingly arbitrary math, and whether you were planning to review your spam procedures?
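To make the arithmetic in the comment above concrete, here is a rough sketch of how one complaint can look very different depending on the window it is measured over. The daily send volumes are invented, and the per-day versus per-week comparison is only an assumption about how such a check might work, not a description of MailChimp’s actual calculation. The 1% figure is the threshold mentioned in the comment.

```python
THRESHOLD = 0.01  # the 1 per cent warning threshold mentioned in the comment


def complaint_rate(complaints, sent):
    """Complaints as a fraction of emails sent (0.0 if nothing was sent)."""
    return complaints / sent if sent else 0.0


# Hypothetical week of small autoresponder sends: (emails_sent, complaints).
daily = [(80, 0), (75, 0), (90, 1), (85, 0), (70, 0), (95, 0), (60, 0)]

# Each day judged in isolation: which days go over the threshold?
days_over = [
    day for day, (sent, complaints) in enumerate(daily)
    if complaint_rate(complaints, sent) > THRESHOLD
]

# The same week pooled into a single rate.
total_sent = sum(sent for sent, _ in daily)
total_complaints = sum(complaints for _, complaints in daily)
weekly = complaint_rate(total_complaints, total_sent)

print(f"days over threshold: {days_over}")
print(f"pooled weekly rate: {weekly:.2%} of {total_sent} emails")
```

With these made-up numbers, a single complaint on a 90-email day is about 1.1%, which trips the threshold for that day, while the same complaint spread over the whole week stays well under it.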
Justin, I think there are many different factors to consider. I’m not going to comment on your specific account, but here are some general points that you (and everyone here) might want to know:
* When we get feedback loop complaints (this is when someone using Hotmail or Yahoo or AOL or some other ISP webmail app clicks “this is spam”), you’ll get an automated warning. The warnings can be found on your Account Status page: http://blog.mailchimp.com/account-status-sasquatch-screen/ and we’ll post notifications to your Dashboard when you log in. These warnings are from our Omnivore system. They’re meant to be a heads-up that you might have some issues to look into. Down the road, if you got too many warnings, and we have to suspend the account, we don’t want you to be surprised. Nearly all cases where someone complains to us via email (or on twitter) about being shut down w/out warning, we see a very clear, long list of warnings on their Account Status page.
* When we get a “direct complaint,” that means a recipient reported the email as spam directly to our abuse desk. This is treated a little more severely than a feedback loop complaint. These recipients got your email, got angry, viewed the source code of the email, dissected the email headers, found the “report this campaignID to our abuse desk” link, and sent us an angry note. Or, they used a service like SpamCop.net to do the above. When this happens, a member of our Compliance Team will investigate your list stats (bounce, unsub, and complaint rates), and look for proof of opt-in with the recipient, and then decide how to proceed. This is a case where “just one complaint” can sometimes get you suspended.
* Except for only the most severe cases, we almost never “shut down accounts without warning.” That usually only happens when we see something extremely alarming in the account. Usually, it’s an account suspension, not a shut down. A suspension is where you can still access everything in your account, but not send any campaigns. Suspensions are equally inconvenient to some, but this is not meant to be punishment — it’s more of a safety switch. If we detect things that might get an IP range blocked, we have to “stop the line” and investigate.
Hi Ben
Thanks for coming back to me and outlining the differences between spam notifications and direct complaints.
I wondered whether you’re looking into taking a different approach for users with different sizes of list? As you say, an account suspension can cause great inconvenience – particularly if we promise our readers that we’ll send them details of, say, a new product or a new video on a particular date only to discover that our account is suspended. It doesn’t look good to our customers and in a worst case, may cost us several thousand dollars of lost revenue.
Also, once you’ve investigated an account and know that the user is following best practice, you don’t seem to ‘remember’ this information. Once an account is suspended, it can take several days for the suspension to be lifted while we resubmit information that you already have. As I say, that’s frustrating when we’ve spent lots of time implementing your recommendations on best practice.
If it’s easier to ask your compliance team to contact me by email, I’m at justin (at) wordtracker.com. I’ve tried emailing them several times, but haven’t had a reply.
Thanks again for the explanation and help.
Kind regards
Justin
This is Very interesting! My last campaign had a very high number of complaints and I suspected that it had to be something with the template as I had never had problems before. I do Love Mailchimp and I hope that you will be going through all the templates to correct anything that may flag our emails.
I would like to see more templates that have sections to them (That do not require an image as well) A box that each area can be separately customised with different color backgrounds.
This is great insight that few people would ever have enough data to build on their own… thanks for sharing it!
I am kind of curious as to how this plays into abuse complaints too.
It seems to me that the “turker” is sorta on par with an average opt in user. I am always flustered with the number of abuse complaints our large list gets and the occasional times it goes over 1%. There is a disproportionate number of abuse complaints coming from one email provider, and a “Report Spam” button is easier than the process a “turker” goes through (along with not knowing the basic criteria, or that they opted in and can opt out with a single click).
We’ve gone through a lot of template tweaks with help from the compliance team and the abuse rates are fairly steady. I sense a strong correlation between this experiment and the abuse reporting…
Love this post Ben: really brings home the importance of perceptions and the impact of subtle or small design elements.
Thank you, Austrian Chocolate Santa man! ;-)
So the Turk spent (on average) 21 seconds to determine the spamminess of a message. No one’s paying me to evaluate spam, and I haven’t done a study, but I’d estimate in the real world it takes less than 10 seconds to classify a message in one’s inbox.
I can’t imagine how it would be technically feasible, but I’d LOVE to know how long a reader spends looking at the monthly newsletter we send out via Mailchimp. Are they skimming the headlines to see if any articles interest them (maybe 15 seconds)? Do they actually spend 2-3 minutes reading each article? Or do 90% of them just immediately click Delete because they’re conscientious enough not to click the Report Spam button but too busy to click the Unsubscribe link?
Ben, Thanks!!
I LOVE MAIL CHIMP ” Easy Human Technology”
Great post Ben, Momma would be proud!
Hi Ben,
We are a B2B Company that helps IT Companies set up their channel anywhere in Europe. We had selected MailChimp to create an informative Newsletter for our customers/prospects that gives useful channel insight for free.
People subscribe directly (we added a subscription link at the bottom of each mail) or when we meet with them we ask them whether they are interested in the newsletter and we subscribe them to save them from the administrative burden. Through this process about 1000 people had signed up.
Our last campaign has reached 1,8% unsubscription rate. It means that out of 100 people, less than 2 had unsubscribed. The main reason being simply that during these times of changes and crisis many people have changed job or focus. Look at your own environment and take 100 of your friends, how many have changed job lately while staying at the same Company? I suspect more than 2, no?
Yet, for this reason, our account was suspended. And it was suspended the day we were about to launch a new email…
Again we had no complaint nor abuse report. It is just because less than 2 people out of 100 had changed focus…
We are a tiny Company with few resources. One person had worked hard to do a good and nice looking informative newsletter on which we had received many compliments, and spent a lot of time to understand the way your service works.
We had addressed the issue to your compliance team. But the only option that was left to us was to remove all the people in our list and send them all a mail through a normal email client (which most of them can’t do to prevent spamming) to ask them to re-subscribe… But we don’t want to do this as it would tell that we are not very serious and haven’t been able to maintain our list properly.
Except if you are able to offer an alternative option, I’m afraid we’d have to cancel our account.
I think you should make more clear that if 2 out of 100 people in a list are not interested any more to what they had been in the past and decide to unsubscribe, then the account would be cancelled.
I suspect you won’t do this, as many of your customers would run away indeed.
For your business I hope you will be able to fix this soon.
Laurent
We have the exact same problem. We are sending mails for a client, and when fewer than 10 people out of almost 700 decide they don’t want the newsletter anymore we get a warning that we might get suspended.
According to MailChimp support we don’t need to worry as it is only a warning … I’m not really convinced and we are still considering if we should find another ESP.
MailChimp seriously needs to address this issue. Professional marketeers cannot work with this threat of suspension all the time.
Henrik, we believe the issue can only be addressed when setting expectations with subscribers at the point of opt-in. In some rare cases, design matters. So we work hard to post research like this article.
The warnings are not to be ignored, but they don’t always mean there’s an imminent shut-down in your future. They’re meant to suggest that maybe there’s an issue somewhere with your list: 1) perhaps expectations were not set, which means problems will persist; 2) perhaps you send too infrequently, in which case the complainers will probably weed themselves out; 3) perhaps there’s a problem with the opt-in process at the website, in which case problems will persist.
We believe in free will, and would not stop anyone from seeking out a new ESP. But finding a provider who provides less data and warns you less about it will not ultimately help with any underlying issues. If you need help with best practices, you might check out: http://resources.mailchimp.com/ or seek email marketing expertise at: http://experts.mailchimp.com/
Ben – I think you could make this problem go away by being much more specific about the problems when sending out warnings. We have gotten 3 warnings in 7 months for unsubscribe rates which are lower than 1,5% but above the 1% limit (supermarket newsletters are clearly not the most popular…). We are using double-opt-in and everything but still we, honestly, have no idea if we are close to a suspension or not and this is a serious problem.
So basically, a much higher information level would probably make these complaints go away and make happy customers!
Fair enough. Thanks for the feedback! I’ll suggest this to the team.
Ben, I’m sorry to tell you that in our case, our account was “suspended” which means it is useless to us.
And I don’t think we can address this.
Indeed, we can’t tell our opt-in subscribers:”please, even if you change job, don’t unsubscribe!”
We can’t send our newsletter more frequently as it would dilute the topics and eventually may encourage our subscribers to unsubscribe!
Our proposal would have been simply to distinguish people who report an abuse and people who unsubscribe.
In the meantime, in the absence of a good option offered by your team, we have been forced to work with another ESP.
Still we’d rather work with your solutions that is well designed and might be happy to get back, should you have a solution along the lines of what I have just mentioned.
I entirely support Henrik’s statement.
Let us know.
Laurent, have you been in touch with our Compliance team about the issue with your account? If not, I urge you to get in touch by emailing compliance at mailchimp dot com so we can take a look.
One thing you may want to consider in the future is to segment your list and only send to those people with a high subscriber rating. See http://eepurl.com/hhpc and http://eepurl.com/hcPr for more information on how to accomplish this.
Hi Amanda,
Thanks for your reply.
We have been in touch with your compliance team who have asked how we were proceeding to subscribe our list members. We candidly told them that we’d leave our contacts the choice either to subscribe by themselves or to do it on their behalf (with their approval of course). As soon as we said this to your team, our account got immediately suspended!! They were the one to ask us to remove all members from the list and use our normal Outlook email system to ask the people to re-subscribe again! Which we haven’t done of course!
The solution that you are proposing would not solve the issue at all because the 5 stars subscribers may change job or focus like everyone else.
I’m wondering why is it so complicated to add a distinction between “abuse report” and “unsubscription”. You could add a link for the receivers likewise there is one for updating one’s profile or unsubscribing. You would just add “report abuse” and you might use this ratio instead for the unsubscription.
Would be fair no?
Laurent et al, based on the feedback received on this post, and via our abuse desk, we’ve re-calibrated how we measure abuse complaints, based on list size.
In short, if your list is small, you won’t be penalized as much for high abuse %.
Thanks to all for your feedback. When it comes to our product decisions and feature development, we believe in standing firm, but also listening hard.
These abuse desk algorithms are constantly tweaked, and I can’t promise this will always be in effect, or that we won’t be tweaking them even *more* during the extremely hectic holiday season, but this should hopefully amount to less worry about getting suspended.
Thanks again for all the constructive feedback.
Hi Ben
That’s great to hear. A really positive step.
Can I ask, will that also apply to people with larger lists that are sending autoresponders to small segments of the list (Eg, a series of welcome emails to new sign-ups)?
Justin
Ben, great test! And I must say what a great in-depth topic. I’ve been (like most of us) trial-and-erroring email newsletters since 2004 and have tried nearly everything. I think as far as the plain text emails, we’ve come to expect more, and far too many junk marketers use autoresponders that use plain text, so I think for many it’s a quick delete. Second big thing is your subject and first story and picture better match up. If your subject line is “Learn how to ice skate” and the first article is on “home loans,” you’re outta there. And, this one always amazes me: Everybody likes something different than you do. Most of the templates I like are often the last picked? Humm? Go figure. I will keep this post as a favorite. Thanks.
Wow! I’ve heard of spammers who think what they’re doing isn’t spam and the “real” spammers are the evil ones, but this is the first time I see it with my own eyes. You do exist.
1,8% subscription rates? Being flagged all the time as spam and you’re working on tools to come around that? Take the hint. You’re just like any regular spammer.
Apparently you did not get the point. 1.8% UNsubscription rate. Not Subscription rate. All people have subscribed of course.
Ben, great article.
On the basis that a glass can be half full, not half empty.
Any chance of posting some examples that did not attract ANY spam clicks by Mechanical Turk?
Hi Ben,
We send out “perfect” text-only messages, according to your definition above. However, they get redirected on some computers to the user’s Junk Email folder, only because the text/link ratio is high and eager spam filters eat them happily.
This is a problem I think I cannot fix: I will not put more blah-blah in my messages just to tweak this ratio.
All right, my subscribers get used to the fact they have to put the sender’s email address on their white list. Now I got the following complaint from one of my readers: he has to put arbitrary email addresses on his white list because the provider does not stick to one particular sender address but changes the domain part from time to time. This is our current sender address: …@mcsv185.net but who knows what will be next time?
Is there any hope to change this behavior?
The subscriber only needs to enter your “Reply-to” email address, not the “@mcsv.net” address.
It’s interesting that a lot of the ones that got flagged were low on graphics. We have a marketing specialist in NZ who states you need to keep emails text only to help get past spam filters (along with no forward to a friend features and other misguided advice). I might forward her this link so she can get her facts right.
I don’t think it’s an issue of “images vs. no images.” It’s an issue of using images properly. Use your logo to brand the email, and make it look reputable. Use images in your content to reinforce a message. You get the gist.
In terms of the “forward to friend” links she warned you about, she’s partly right. Some links like that have horrible reputation with spam filters. Many years ago, when we first created our forward-to-friend links, they got blacklisted. They were too “fresh” to legitimately be present inside of so many emails (so the spam filters thought). But if the links have been around a while, their reputation should be established. Some URL shorteners get blacklisted from time to time, too: http://blog.mailchimp.com/url-shorteners-and-blacklists/
Great points Ben, thanks for sharing this research as it would be hard for those of us with smaller lists to conduct this sort of test. I love mailchimp because you guys are all about compliance to best practise and that decreases spam in the world.
This is a very interesting study. Of the mail I send, I generally get about 10% back as SPAMMED. I think that this is pretty much unavoidable, unless you can get clients to stick you in their address book.
I use a free online spam checker to weed out any problems in my mail.
Also, as people surf the net as a daily routine, they click, click, click like it’s going out of fashion.
So my point is that internet users also have a responsibility that if they are going to opt in to lots of things and then forget, they still have to accept responsibility for their actions and maybe not be protected in a huge ball of cotton wool each time they receive an email they want to delete.
What SHOULD be enforced is an easy way to UNSUBSCRIBE, so they can do that.
If an email user had to go to a website and simply report spam or abuse, although it would be easy – it wouldn’t be done by mistake or out of laziness or disregard.
That may bring a truer result of actual complaints. Some of us are working hard to try and keep up with technology and competition – and give a great service and product.
Good article, GREAT discussion. We’re a small non-profit health org and are going to give e-newsletters a go for the first time ever soon. …and not an experienced one among us! Needless to say, I’m trying to research and learn and watch all I can before we actually give it a go, and comment threads like this one reveal a lot more about what goes into delivering and maintaining these lists/accounts/relationships than simple Tips & How To articles and videos.
I intended to propose MailChimp for us but haven’t actually decided yet. So far I love the company and the community.