KK-DATA

KK-DATA Deduplication Vault Tutorial: How to Avoid Balance Waste with Cross-Task Deduplication (Console Operation Guide)

Tags: deduplication vault, tutorial, kkdata, cross-task deduplication

In overseas customer acquisition and B2B SaaS promotion, batch number screening is a critical step for verifying that target leads are active. Many teams hit the same pain point: identical numbers reappear across different batches and platform screening tasks, and each repeat screening wastes balance. KK-DATA’s Deduplication Vault is designed to solve this problem. It automatically filters out already-checked numbers across tasks, which prevents balance waste and improves screening efficiency. This article walks through using the Deduplication Vault in the console step by step, from entry points to best practices.

What is the Deduplication Vault? Why Do You Need It for Overseas Acquisition?

The Deduplication Vault is essentially a cross-task shared number database that stores all numbers you have already screened (or actively imported) and supports management by platform, status, and other conditions. When you submit a new screening task, the system automatically compares the numbers against the vault records, skips existing numbers, and only performs paid screening on new or unchecked numbers.

Typical scenarios:

  • You generate 1,000 US Telegram numbers and screen them, finding 500 active ones. Next week, you import another batch from a different source that includes 200 numbers already screened before—without the Deduplication Vault, those 200 would be screened again and charged.
  • You run both Telegram and WhatsApp campaigns, and the same phone number might appear in screening tasks for both platforms. With deduplication across all platforms enabled, each number is screened only once, regardless of the platform.

The core value of the Deduplication Vault: One screening, global deduplication, zero waste.

Core Working Logic of the Deduplication Vault

The Deduplication Vault integrates tightly with KK-DATA’s “Number Generation → Batch Screening” pipeline. Its workflow is as follows:

  1. Number Import: You can bulk-import historical numbers via CSV/TXT files, or directly import results from completed screening tasks. During import, the system automatically deduplicates (internally, the vault maintains uniqueness).
  2. Screening Task Initiation: When creating a new screening task, check the “Use Deduplication Vault” option.
  3. Automatic Filtering: The system compares the numbers to be screened against the vault records one by one. Numbers already in the vault are marked “already screened” and skipped.
  4. New Numbers Enter the Vault: After the screening task ends, all newly screened numbers (whether valid or not) can be automatically added to the vault (optional setting), making them available for future deduplication.

Key point: The deduplication operation itself does not consume balance. Only numbers actually submitted for screening are charged per number. Therefore, the Deduplication Vault helps you “filter duplicates for free,” paying only for numbers that truly need screening.
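The filtering logic described above can be pictured with a minimal sketch. This is not KK-DATA's implementation: the function name `filter_with_vault`, the in-memory `set` standing in for the vault, and the sample numbers are all assumptions made for illustration.

```python
def filter_with_vault(pending, vault, add_new_to_vault=True):
    """Split pending numbers into those to screen (billable) and those skipped.

    `vault` is a plain set standing in for the Deduplication Vault (assumption).
    """
    to_screen, skipped = [], []
    seen_in_task = set()  # also covers duplicates inside the same task
    for number in pending:
        if number in vault or number in seen_in_task:
            skipped.append(number)      # already known: no charge
        else:
            to_screen.append(number)    # new: would be screened and charged
            seen_in_task.add(number)
    if add_new_to_vault:
        # Mirrors the optional "add screened numbers to the vault" setting.
        vault.update(to_screen)
    return to_screen, skipped


vault = {"+12025550101", "+12025550102"}
pending = ["+12025550101", "+12025550103", "+12025550103", "+12025550104"]
to_screen, skipped = filter_with_vault(pending, vault)
print(to_screen)  # numbers that would actually be submitted for screening
print(skipped)    # numbers filtered out at no cost
```

Only the numbers in `to_screen` would be billed in this model; the lookup against the vault itself costs nothing, which matches the key point above.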

How to Access the Deduplication Vault? Console Entry and Interface Preview

After logging into the KK-DATA Console, find “Data Vault” (or “Deduplication Vault” – the name may vary by version, but the functionality is the same) in the left navigation menu. Click to enter the main interface.

Main interface areas:

  • Number List: Shows all numbers stored in the vault, with filtering options by number, platform, import time, status, etc.
  • Import Button: Supports uploading CSV/TXT files or one-click import from historical tasks.
  • Deduplication Rule Settings: Configure matching strategies (detailed below).
  • Historical Task Records: Shows all operation logs associated with the vault, including import tasks and deduplication records from screening tasks.

You can also manually edit, export, or delete numbers in the vault to flexibly manage your data assets.

How to Create and Execute Cross-Task Deduplication? Step-by-Step Guide

Follow these steps to complete a full cross-task deduplication process from scratch.

Step 1 – Import Existing Numbers into the Vault

On the Deduplication Vault page, click the “Import Numbers” button.

  • Method 1: File Import: Select a CSV or TXT file with one number per line. You can optionally include a platform identifier (e.g., +8613800138000,telegram). If not specified, it defaults to all-platform deduplication.
  • Method 2: Import from Historical Tasks: Select a completed screening task from the dropdown list, and the system will automatically add all screened numbers (both valid and invalid) from that task to the vault.
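Before uploading, it can help to check that a file follows the one-number-per-line format with an optional platform column. The sketch below is a local pre-check only, under stated assumptions: the function name `parse_import_lines` and the `"all"` marker for lines without a platform are hypothetical and not part of the console.

```python
import csv
import io


def parse_import_lines(text):
    """Parse 'number[,platform]' lines into a set of (number, platform) pairs."""
    records = set()  # a set drops exact duplicates, like the vault's uniqueness
    for row in csv.reader(io.StringIO(text)):
        if not row or not row[0].strip():
            continue  # ignore blank lines
        number = row[0].strip()
        has_platform = len(row) > 1 and row[1].strip()
        # "all" is a hypothetical marker for all-platform deduplication.
        platform = row[1].strip().lower() if has_platform else "all"
        records.add((number, platform))
    return records


sample = "+8613800138000,telegram\n+8613800138001\n\n+8613800138000,telegram\n"
parsed = parse_import_lines(sample)
print(sorted(parsed))
```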

Note: Importing Numbers is Free

Importing, querying, and deleting numbers in the Deduplication Vault do not consume balance. You are charged per number only when numbers are actually screened in a task, whether they come from the vault or from another source with the “Use Deduplication Vault” option checked. Feel free to centralize your historical data in the vault.

Step 2 – Set Deduplication Matching Rules

In the vault settings, you can define what counts as a duplicate. Currently, the following rules are supported (based on the actual options in the console):

| Rule Option | Description |
| --- | --- |
| Strict match | Numbers must be exactly identical (including country code) to be considered duplicates |
| Ignore country code | Match only the numeric part; e.g., +8613800138000 and 13800138000 are considered duplicates |
| Deduplicate by platform | Deduplicate only within the same platform (e.g., Telegram numbers are compared only against the Telegram vault) |
| Deduplicate across all platforms | All platform numbers are deduplicated uniformly |

We recommend selecting “Ignore country code + Deduplicate across all platforms” for most scenarios, because the same phone number may appear in different formats across different contexts, and cross-platform deduplication avoids duplicate screening across platforms.
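One way to think about these rules is as different choices of comparison key. The sketch below is an illustration under assumptions, not the console's matching logic: `match_key` and its parameters are hypothetical, and the country code is passed in explicitly because reliably detecting it requires numbering-plan data that this example does not include.

```python
import re


def match_key(number, platform, ignore_country_code=True,
              across_platforms=True, country_code="86"):
    """Build the key two records are compared on (hypothetical helper)."""
    digits = re.sub(r"\D", "", number)  # keep digits only
    # Strip the country code only from numbers written in "+<code>..." form.
    if (ignore_country_code and number.strip().startswith("+")
            and digits.startswith(country_code)):
        digits = digits[len(country_code):]
    # Across platforms: the number alone is the key.
    # By platform: the platform is part of the key.
    return digits if across_platforms else (digits, platform.lower())


# "Ignore country code + Deduplicate across all platforms": treated as duplicates.
print(match_key("+8613800138000", "telegram") == match_key("13800138000", "whatsapp"))

# "Strict match": the same two strings no longer collide.
print(match_key("+8613800138000", "telegram", ignore_country_code=False)
      == match_key("13800138000", "telegram", ignore_country_code=False))
```

Two records count as duplicates exactly when their keys are equal, so switching rules only changes how the key is built.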

Step 3 – Submit a New Screening Task with “Vault Deduplication” Enabled

Go back to the screening task creation page (e.g., create a new Telegram activity detection task). Under “Advanced Settings” or “Deduplication Options”, check the “Use Deduplication Vault” option. The system will immediately calculate how many numbers in the pending list already exist in the vault (and will be skipped) and display the estimated cost. This estimate covers only the numbers that actually need screening.

After confirming, submit the task. During execution, skipped (deduplicated) numbers will not incur any charges. After the task completes, you can view the task details and see the count of “deduplicated” entries.
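The estimate shown at submission can be reproduced roughly by hand. The sketch below assumes a flat per-number price; `estimate_task` is a hypothetical helper, and the unit price is a placeholder, not a real KK-DATA rate (check the console for current pricing).

```python
from decimal import Decimal


def estimate_task(pending, vault, unit_price):
    """Estimate skipped and billable counts, and the cost of a screening task."""
    skipped = sum(1 for number in pending if number in vault)
    billable = len(pending) - skipped
    duplication_rate = skipped / len(pending) if pending else 0.0
    return {
        "submitted": len(pending),
        "skipped": skipped,                       # already in the vault, free
        "billable": billable,                     # charged per number
        "duplication_rate": duplication_rate,     # share of balance saved
        "estimated_cost": unit_price * billable,
    }


vault = {"+12025550101", "+12025550102"}
pending = ["+12025550101", "+12025550102", "+12025550103", "+12025550104"]
# Placeholder price per screened number; not an actual rate.
estimate = estimate_task(pending, vault, Decimal("0.002"))
print(estimate)
```

Under this model the fraction of balance saved equals the duplication rate of the submitted list, which is why savings vary from team to team.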

Best Practices and Precautions for the Deduplication Vault

Regularly Export and Clean Vault Data

If the vault accumulates more than a certain number of records (e.g., 500,000), the retrieval speed may slightly decrease. We recommend performing maintenance once per quarter:

  1. On the vault page, click “Export” to save all records to a local CSV.
  2. Delete numbers that are older than 90 days and no longer needed (e.g., invalid numbers, test numbers).
  3. Cleaning operations do not affect completed screening results. They only remove records from vault storage.
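To decide what to delete, the exported records can be reviewed offline first. The sketch below is an assumption-laden example: the field names `status` and `imported_at` and the status values are hypothetical, so adapt them to the columns in your actual export.

```python
from datetime import datetime, timedelta, timezone


def select_for_cleanup(records, now, max_age_days=90,
                       statuses=("invalid", "test")):
    """Return records older than max_age_days whose status marks them as disposable."""
    cutoff = now - timedelta(days=max_age_days)
    return [r for r in records
            if r["imported_at"] < cutoff and r["status"] in statuses]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
exported = [
    {"number": "+12025550101", "status": "invalid",
     "imported_at": now - timedelta(days=120)},   # old and invalid: candidate
    {"number": "+12025550102", "status": "active",
     "imported_at": now - timedelta(days=120)},   # old but still useful: keep
    {"number": "+12025550103", "status": "invalid",
     "imported_at": now - timedelta(days=10)},    # invalid but recent: keep
]
stale = select_for_cleanup(exported, now)
print([r["number"] for r in stale])
```

Keep in mind that deleting invalid numbers means they are no longer skipped, so future tasks may screen and charge for them again.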

Note: Vault Capacity Affects Performance

If a single vault accumulates over 1 million records, some queries may become noticeably slower. Exporting and cleaning once per quarter frees up space and keeps lookups fast.

Combine with the “Number Generation → Screening” Pipeline

The most efficient workflow is:

  1. Generate global numbers (using KK-DATA’s number generator, which can randomly generate numbers from 240+ countries/regions).
  2. Import into the vault: Import the generated numbers into the Deduplication Vault in one go.
  3. Set deduplication: Enable vault deduplication in the screening task.
  4. Batch screening: When submitting the task, numbers already screened before (e.g., from historical tasks) in the generated set are automatically skipped.
  5. Export active results: Export the valid/active numbers for subsequent marketing.

The vault can also retain invalid numbers (e.g., numbers not registered on Telegram); the system will automatically skip them in future generations, avoiding re-screening.

Note: Deduplication Can Proceed Even When Balance Is Low

The deduplication operation itself consumes no balance. However, if you submit a screening task and some numbers have no match in the vault, those numbers will be screened and charged normally. Therefore, ensure sufficient balance before submitting a task. If your balance is insufficient, the task will not be submitted, and you will receive a notification. Check the Billing Page for current prices and top-up options (USDT TRC20, minimum ~50 USDT).

Cross-Task Deduplication vs. Single-Task Self-Deduplication: What’s the Difference?

KK-DATA’s screening tasks also have built-in “single-task self-deduplication” functionality, meaning duplicate numbers within the same task batch are screened only once. Cross-task deduplication goes a step further. The comparison table is as follows:

| Feature | Single-Task Self-Deduplication | Cross-Task Deduplication (Deduplication Vault) |
| --- | --- | --- |
| Deduplication scope | Only numbers within the current task | All historical tasks + manually imported numbers |
| Data source | The number list submitted with the task | All records in the Deduplication Vault |
| Billing impact | Avoids duplicate charging within the same task | Avoids duplicate charging across tasks |
| Use case | One-time small batch verification | Long-term, multi-batch, multi-channel acquisition operations |
| Configuration | Enabled by default, no extra action needed | Requires manually checking “Use Vault” |

Recommendation: Teams that use KK-DATA for batch screening on a regular basis should enable cross-task deduplication. It can significantly reduce duplicate screenings, potentially saving 30%–50% of balance depending on the duplication rate. Single-task self-deduplication serves as a basic safeguard, while cross-task deduplication is an upgrade strategy. Combining both yields the best results.

Frequently Asked Questions

Q: Do numbers stored in the Deduplication Vault consume my screening balance?

A: No. The Deduplication Vault only stores number lists and performs no screening. Balance is deducted only when screening actually occurs: either you submit numbers from the vault for screening, or you submit numbers from other sources with the “Use Deduplication Vault” option checked and they are actually screened.

Q: Which platforms and detection types are supported for cross-task deduplication?

A: Currently, it supports all live platforms including Telegram, WhatsApp, iMessage, RCS, and number validity detection. The system matches based on the number and platform identifier (e.g., TGID, WSID). We recommend choosing “Deduplicate across all platforms” in the rules to maximize savings.

Q: If I clear the Deduplication Vault, will I lose historical screening results?

A: No. The Deduplication Vault only stores the list of numbers for deduplication comparison. Deleting the vault will not affect the result files of completed screening tasks (which can still be downloaded from task details). However, after clearing, new tasks will not automatically skip previously screened numbers, which leads to duplicate charges, so proceed with caution.

Q: Can I manually delete or batch clean numbers in the vault?

A: Yes. On the Deduplication Vault page, you can select one or more numbers and click “Delete”; you can also use filters (e.g., by import time, status) to select a batch and delete it. Delete operations are irreversible, so we recommend exporting a backup first.

Q: I have topped up my balance with USDT. Does the Deduplication Vault require extra payment?

A: No. The Deduplication Vault is a free feature. All registered KK-DATA users can use it at no cost. Only actual screening operations consume balance (see Console Real-Time Pricing for details).


Now go ahead and try the cross-task deduplication feature to get the most out of every screening. 👉 Log in to the console and start screening. If you have any operational questions, feel free to contact customer service via https://t.me/kkdata_robot. For more detailed tutorials, refer to the KK-DATA Documentation.