Data Dumpster Dive – Six Steps To Trash Your Bad Data

If your database is overloaded with junk and dead leads, then it’s likely that your marketing automation platform database is over the size limit. Bad data in your database is a lot like trash: you have to get rid of it!

Not only does bad data clog your database and hike up your software subscription costs, it also makes your marketing efforts ineffective, costing your organization in more ways than one.

In this post, we cover strategy and execution tips to help you get the dirt out of your data and to develop a six-step best practice data retention policy including:

  • Step 1 – Get Rid of Duplicates
  • Step 2 – Eliminate People without Email Addresses
  • Step 3 – Purge Recently Deleted Salesforce Records
  • Step 4 – Delete Disqualified Records
  • Step 5 – Remove the Hard Bounces
  • Step 6 – Delete the Inactive (but only after a Wake The Dead campaign!)

Learn how to reduce your database size by 10-40% to remove junk, reduce software licensing costs, and improve marketing results.

Why Bad Data Hurts Your Business

Bad Data Costs Money

Did you know that a bad record can cost as much as $100 to maintain?

SiriusDecisions estimates that between the tangible and perceived costs, organizations spend an average of $100 to maintain every record in their database, even the dirty ones.

If you have 100,000 records in your database and 20% are inaccurate (and that’s a conservative estimate), that’s $2M down the drain. Even if you think that $100 per record is high and take a conservative approach at $10, that’s still $200K.
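The arithmetic above can be checked with a few lines of Python. This is only a sketch of the estimate as stated in this post; the function name `bad_data_cost` is made up for illustration, and the inputs are the figures quoted above.

```python
def bad_data_cost(total_records: int, bad_fraction: float, cost_per_record: int) -> int:
    """Estimated spend on maintaining bad records."""
    # Number of bad records, e.g. 20% of 100,000 = 20,000
    bad_records = round(total_records * bad_fraction)
    return bad_records * cost_per_record


for cost_per_record in (100, 10):
    cost = bad_data_cost(100_000, 0.20, cost_per_record)
    print(f"${cost_per_record} per record -> ${cost:,} on bad data")
# $100 per record -> $2,000,000 on bad data
# $10 per record -> $200,000 on bad data
```

Swap in your own record count, bad-data percentage, and per-record cost to get a rough number for your organization.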

This great infographic from RingLead breaks down all the costs. Check out the full article.

Bad Data Hikes Up Software Costs

“You are over your data limit.” How many times have you received this message from Marketo or other software services?

Most SaaS platforms charge based on record count. As your organization matures, you’ll have data living across CRMs, marketing automation solutions, ABM platforms, and others. If 20-30% of your data is bad, that’s needlessly hiking up the cost of multiple software subscriptions.

Bad Data Kills Marketing Results

Do you have low click-through rates or engagement activity? Does Management always ask why Marketing’s results are lower than industry standards?

Most likely, you are marketing to people who have no need for your service, and that drags down your overall results.

For example, we recently worked with a client whose click-through rates hovered around 0.5%, much lower than the 1-4% we see for general mailings. The reason, of course, was that they were mailing to a wide target, including many folks with bad data.

Bad Data Kills Segmentation and Personalization

If your data collection methods are all over the place, there is no good way to segment your database to offer a personalized experience.

For example, if you want to deliver persona-based content to IT executives but aren’t consistently collecting role or title information, you’ll be left with a generalized approach.

Bad Data Slows Down Your System

You know how when you buy a brand-new phone, it’s really fast? Then, six months later, it starts getting slower and slower?

Your marketing automation system and CRM behave similarly. From data syncs to record updates, it takes power to process every record. The bigger your database, the more processing it takes to make major changes. This is especially true for systems that are running inefficiently in the first place.

For example, we recently worked with a client whose database held more than 1 million records. A data change on the Salesforce side affected all records and kicked off a data sync to Marketo that took a while to complete. We’ve since worked on a plan to pare that database by 10% to help with these performance issues.

The moral of the story: why perform system updates on data that is bad or outdated?

The Data Dumpster Dive Methodology

How do you systematically reduce the bad data in your systems? The methodology is like sifting through a dumpster: leads pass through different filters, slowly sifting out the trash while leaving shiny treasures of clean data remaining at the end.

We’ve used this methodology with clients to reduce their database size by 10-40%, getting them under their software database limits while boosting performance.

Part strategy, part execution, the Dumpster Dive Methodology is a comprehensive, long-term approach to database cleansing. Don’t expect an overnight process (although there are a few quick wins along the way).

At each step, you’ll identify the criteria to review and define processes for performing the purging tasks. Once you have your custom plan in place, you can confidently maintain your database throughout the year.

Note that these steps do not have to come in this exact order. However, they tend to get harder the further down the path you go. This methodology applies to most platforms, but I’ll describe some specific examples around Marketo and Salesforce. Part two of this post will describe how to do it in Marketo.

Before You Get Started – Set Data Strategy

Before you dig into the six-step process, think through some big picture strategies around a holistic data retention policy.

  • Are there any regulations to consider (e.g. Financial)? Does your industry have a time-limit for keeping, or expunging, data?
  • How long is right for your company from a reporting perspective? One year? Two years? Keep in mind that your retention policy affects your reporting intelligence–once the data is gone, so is the reporting in most cases. Some companies elect a general policy of fifteen months to ensure a rolling four quarters of intelligence. You might also vary your policy on the types of data. For example, spam data might have a policy of one month while disqualified data might have a retention policy of fifteen months.
  • What is your risk tolerance? Some companies keep EVERYTHING while others like to purge more often.
  • What’s your philosophy on appending? Think about whether you want to delete your aging data or run it through an appending service like Oceanos, LeadSpace, ZoomInfo and others. (Check out Oceanos’ no-cost data assessment, which we use frequently with our clients.)
  • Do you value the data from hard bounces?
  • What are your data backup plans? Before you delete, have a backup policy in place to retain those deleted records.
  • What is your account-based strategy? If you have a heavy account-based focus, factor that into your data deletion policies. For example, if a VP-level person has bounced, you may want to leverage that intelligence to determine account coverage.
  • For organizations using Salesforce, do you treat Lead records differently than Contact records? For example, Lead records are less risky to delete. A Contact record might have an Opportunity attached to it.
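One way to make a retention policy like this concrete is to write it down as a small lookup table. The sketch below uses the example numbers from the list above (one month for spam, fifteen months for disqualified data and as a general default); the category names, the `RETENTION_MONTHS` table, and both helper functions are hypothetical and not part of Marketo or Salesforce.

```python
from datetime import date

# Hypothetical policy table: retention period in months per record category.
RETENTION_MONTHS = {"spam": 1, "disqualified": 15, "default": 15}


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


def is_past_retention(category: str, last_updated: date, today: date) -> bool:
    """True when a record has outlived the retention period for its category."""
    limit = RETENTION_MONTHS.get(category, RETENTION_MONTHS["default"])
    return months_between(last_updated, today) >= limit
```

For example, `is_past_retention("spam", date(2024, 1, 15), date(2024, 2, 15))` returns `True`, while a disqualified record last updated on the same day would be kept for another fourteen months.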

Step 1 – Get Rid of Duplicates

Duplicates are great with babies but not with your database.

The first step is to identify the duplicates by performing a quick assessment. If using a solution like Marketo, pull a list of likely duplicates with the built-in Smart List that identifies multiple records sharing the same email address.
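If you are working from an export rather than inside the platform, the same assessment can be sketched in Python. This is a minimal illustration, not how Marketo does it: the `id` and `email` field names are assumptions about your export, and treating emails as case-insensitive is a simplifying choice.

```python
from collections import defaultdict


def find_duplicate_emails(records):
    """Group record IDs by normalized email; keep only emails seen more than once."""
    by_email = defaultdict(list)
    for rec in records:
        # Normalize so "Pat@Example.com " and "pat@example.com" match.
        email = (rec.get("email") or "").strip().lower()
        if email:
            by_email[email].append(rec["id"])
    return {email: ids for email, ids in by_email.items() if len(ids) > 1}


records = [
    {"id": 1, "email": "pat@example.com"},
    {"id": 2, "email": "sam@example.com"},
    {"id": 3, "email": "Pat@Example.com "},
]
print(find_duplicate_emails(records))  # {'pat@example.com': [1, 3]}
```

The output only tells you which records collide; deciding which one survives a merge still depends on your business logic.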

Marketo offers a one-time deduplication service. If you want to perform the one-time deduplication yourself or with a partner, we like the DemandTools solution because it helps you work out the business logic from a Salesforce perspective, which is usually the system of record (e.g. a duplicate exists as two Contacts with two different Opportunities). RingLead is another option.

For ongoing deduplication management, check out Validity’s DupeBlocker and RingLead, which help prevent new duplicates from being created.

Make sure to read Josh Hill’s Deduping Leads in Marketo for other deduplication considerations.

Step 2 – Eliminate People without Email Addresses

In your marketing automation platform, records without email addresses are virtually useless. You can’t email them and the records increase the possibility for future duplicates.

Quick Win Candidate – If there is one step that is easy to adopt, getting rid of people without email addresses is an option to consider. As a first pass, focus on Salesforce Lead records. Contact records are more complex since they may have Opportunities associated with them.

We saw one client upload 10K+ names without email addresses, which pushed its database over the limit and triggered a letter from Marketo. This step identified those historic records and deleted them. Sales representatives can also add records without email addresses to your CRM, which is a bad process.

A solution involves identifying these records and deleting them on a one-time and ongoing basis.
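As a rough sketch of that identification step on exported data, the function below separates records with no email into Leads (the first-pass delete candidates) and everything else (to be reviewed, since Contacts may have Opportunities). The `id`, `email`, and `type` field names are assumptions for illustration.

```python
def split_missing_email(records):
    """Return (lead_ids_to_delete, other_ids_to_review) for records with no email."""
    deletable, review = [], []
    for rec in records:
        if (rec.get("email") or "").strip():
            continue  # record has an email address, leave it alone
        if rec.get("type") == "Lead":
            deletable.append(rec["id"])
        else:
            review.append(rec["id"])
    return deletable, review


records = [
    {"id": 1, "type": "Lead", "email": ""},
    {"id": 2, "type": "Contact", "email": None},
    {"id": 3, "type": "Lead", "email": "pat@example.com"},
]
print(split_missing_email(records))  # ([1], [2])
```

Running this on a schedule, rather than once, covers the ongoing part of the cleanup.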

Step 3 – Purge Recently Deleted Salesforce Records

By default, records that are deleted in Salesforce are NOT deleted in Marketo. On one hand, this serves as a nice backup in case a Sales representative deletes a bunch of data in Salesforce. On the other hand, this deleted data lives in Marketo forever unless you do something specific to delete these records.

Quick Win Candidate – With a simple campaign, clear out your deleted Salesforce records in Marketo on a regular basis.

Step 4 – Delete the Disqualified

This step seems easy but it’s not.

If Sales or Marketing has disqualified a record, it might be time to delete it. I say “might” because sometimes a Sales representative disqualifies a record when it should really be recycled. Or, it’s possible that your Disqualified lead lifecycle process is not quite ready for prime time.

Do you trust Disqualified data entered by Sales reps? If the answer is “No,” you’ll need to work on a consistent lead lifecycle process with your Sales team.

This step involves solidifying a Disqualified process with your Sales team so you can manage the data appropriately. You’ll also want to consider adding Disqualified Reasons to your sales process to get a more granular understanding of the data. For example, if a record is Disqualified with a Disqualified Reason of Spam, that’s different than a Disqualified Reason of No Current Budget.

Step 5 – Remove the Bounces

If someone’s email has bounced, that record should be an easy candidate to delete, right? It’s not so simple.

You might want to keep these around to maintain some reporting intelligence. You might also want to mail a few last times to mine data from those returned emails.

Additionally, a bounce might indicate a spam block rather than a mailbox that no longer exists, so you need to be careful to distinguish between the two.

Deliverability Best Practice: Put an automated campaign in place in Marketo to manage bounced emails. The program reviews your bounces for addresses/domains that are fake.
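To illustrate the distinction between a blocked message and a dead mailbox, here is a small sketch that sorts bounces by their enhanced SMTP status code (the codes defined in RFC 3463). It assumes your bounce data exposes such a code, which may not be the case in every platform, and mail servers do not always apply the codes consistently, so treat the groupings as a starting point rather than a rule.

```python
# Assumption: each bounce carries an RFC 3463 enhanced status code such as "5.1.1".
MAILBOX_GONE = {"5.1.1", "5.1.2"}  # bad destination mailbox / bad destination system
POLICY_BLOCK = {"5.7.1"}           # delivery not authorized, message refused


def classify_bounce(status_code: str) -> str:
    """Rough bucket for a bounce: delete candidate, block to investigate, or retry."""
    if status_code in MAILBOX_GONE:
        return "invalid_address"  # likely a no-longer-there bounce
    if status_code in POLICY_BLOCK:
        return "blocked"          # likely a spam or policy block, address may be fine
    if status_code.startswith("4."):
        return "temporary"        # transient failure, not a deletion candidate
    return "unknown"
```

Only the `invalid_address` bucket is a natural candidate for deletion; `blocked` records point to a deliverability problem on your side rather than bad data.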

Step 6 – Delete the Inactive – The Big Sweep

How long should the couch potatoes live in your database without activity before you delete?

If you set your other five filters conservatively, these inactive criteria can serve as the final sweep. We’ve seen companies use very complex criteria to define this last step since it’s the end of the line for some data. Here are a few criteria to start with.

  • 15 months of inactivity
  • Exclude/include select Lead Sources.
  • Exclude/include Sales generated records.
  • Salesforce Lead records only
  • No active opportunities
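The starting criteria above can be combined into a single check. The sketch below is one possible reading of them, with hypothetical field names (`type`, `open_opportunities`, `lead_source`, `last_activity`) and fifteen months approximated as 450 days; adjust both to match your own data and policy.

```python
from datetime import date

INACTIVITY_DAYS = 15 * 30  # rough stand-in for 15 months of inactivity


def is_sweep_candidate(rec, today, excluded_sources=frozenset()):
    """True when a record meets every starting criterion for the final sweep."""
    return (
        rec["type"] == "Lead"                            # Salesforce Lead records only
        and rec["open_opportunities"] == 0               # no active opportunities
        and rec["lead_source"] not in excluded_sources   # honor Lead Source exclusions
        and (today - rec["last_activity"]).days >= INACTIVITY_DAYS
    )


rec = {
    "type": "Lead",
    "open_opportunities": 0,
    "lead_source": "Webinar",
    "last_activity": date(2023, 1, 1),
}
print(is_sweep_candidate(rec, date(2024, 6, 1)))               # True
print(is_sweep_candidate(rec, date(2024, 6, 1), {"Webinar"}))  # False
```

Records that pass this check are candidates only; they should still go through the last-chance emails before anything is deleted.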

Tip: Before deleting, run a Wake the Dead campaign with a few last-chance emails, using copy like: “Hey, we love you but is it time to say goodbye? If you want to keep receiving great content, click here. Otherwise, we’ll remove you from our database.”


Dive deeper into data retention policies with our post: Guide to Deleting Bad Data in Marketo


Keeping bad data around is not a strategy for long-term success. The Dumpster Dive Data Strategy is a methodology that brings clarity to your data retention policies.

Once adopted, expect to see a database reduction of 10-40%+ which will boost your performance and decrease your data costs. Good luck mining.

If you have any questions or need help getting your data strategy up and running, send us a message at

Also, check out Data Management in Marketo in Marketing Rockstar Guides.

Your marketing technology experts.

At Digital Pi, we use technology to connect revenue to marketing efforts. We fuse marketing strategies, processes, data and applications to make marketing technology solutions work for clients' businesses.
