Design an Ad Click Aggregator

By a former Meta Staff engineer and Co-founder of hellointerview.com

Evan King
Dec 4, 2024
Final Design

Understanding the Problem

🖱️ What is an Ad Click Aggregator? An Ad Click Aggregator is a system that collects and aggregates data on ad clicks. It is used by advertisers to track the performance of their ads and optimize their campaigns. For our purposes, we will assume these are ads displayed on a website or app, like Facebook.

Functional Requirements

Core Requirements

  1. Users can click on an ad and be redirected to the advertiser’s website
  2. Advertisers can query ad click metrics over time with a minimum granularity of 1 minute

Below the line (out of scope):

  • Ad targeting
  • Ad serving
  • Cross device tracking
  • Integration with offline marketing channels

Non-Functional Requirements

Before we jump into our non-functional requirements, it’s important to ask your interviewer about the scale of the system. For this design in particular, the scale will have a large impact on the database design and the overall architecture.

We are going to design for a system that has 10M active ads and a peak of 10k clicks per second. The total number of clicks per day will be around 100M.
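As a quick sanity check on these numbers, the short calculation below derives the average click rate implied by 100M clicks per day and compares it with the stated peak. It is only back-of-the-envelope arithmetic on the figures above; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

clicks_per_day = 100_000_000      # stated daily volume
peak_clicks_per_second = 10_000   # stated peak rate

# Average rate if the daily volume were spread evenly across the day.
average_clicks_per_second = clicks_per_day / SECONDS_PER_DAY
peak_to_average_ratio = peak_clicks_per_second / average_clicks_per_second

print(f"average: {average_clicks_per_second:,.0f} clicks/s")   # average: 1,157 clicks/s
print(f"peak is {peak_to_average_ratio:.1f}x the average")     # peak is 8.6x the average
```

The peak is roughly an order of magnitude above the average, so the system should be sized for bursts rather than for the daily mean.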

With that in mind, let’s document the non-functional requirements:

Core Requirements

  1. Scalable to support a peak of 10k clicks per second
  2. Low latency analytics queries for advertisers (sub-second response time)
  3. Fault tolerant and accurate data collection. We should not lose any click data.
  4. As realtime as possible. Advertisers should be able to query data as soon as possible after the click.
  5. Idempotent click tracking. We should not count the same click multiple times.

Below the line (out of scope):

  • Fraud or spam detection
  • Demographic and geo profiling of users
  • Conversion tracking
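To make the idempotency requirement concrete, here is a minimal in-memory sketch. It assumes every click carries a unique identifier (the `click_id` parameter is hypothetical; the requirements above do not say how clicks are identified), and it ignores persistence and concurrency entirely.

```python
from typing import Dict, Set

click_counts: Dict[str, int] = {}   # ad_id -> number of distinct clicks
seen_click_ids: Set[str] = set()    # IDs of clicks that were already counted


def record_click(click_id: str, ad_id: str) -> bool:
    """Count the click once; return False if click_id was already seen."""
    if click_id in seen_click_ids:
        return False
    seen_click_ids.add(click_id)
    click_counts[ad_id] = click_counts.get(ad_id, 0) + 1
    return True


first = record_click("click-1", "ad-123")
retry = record_click("click-1", "ad-123")  # e.g. the same request sent twice
```

Submitting the same click twice leaves the count for `ad-123` at 1, which is the behaviour the requirement asks for.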

Here’s how it might look on your whiteboard:

The Set Up

Planning the Approach

For this question, which is less of a user-facing product and more focused on data processing, we’re going to follow the delivery framework outlined here, focusing on the system interface and the data flow.

System Interface

For data processing questions like this one, it helps to start by defining the system’s interface. This includes clearly outlining what data the system receives and what it outputs, establishing a clear boundary of the system’s functionality. The inputs and outputs of this system are very simple, but it’s important to get these right!

  1. Input: Ad click data from users.
  2. Output: Ad click metrics for advertisers.
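One way to pin this interface down is to write the input and output as plain data types. The sketch below is an assumption about what those records might contain; none of the field names come from the design itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ClickEvent:
    """Input: one click on one ad (field names are assumptions)."""
    ad_id: str
    user_id: str
    clicked_at: datetime


@dataclass(frozen=True)
class ClickMetric:
    """Output: the number of clicks on one ad within one time window."""
    ad_id: str
    window_start: datetime
    window_seconds: int
    click_count: int


event = ClickEvent(
    "ad-123", "user-456", datetime(2024, 12, 4, 12, 0, 30, tzinfo=timezone.utc)
)
metric = ClickMetric(
    "ad-123", datetime(2024, 12, 4, 12, 0, tzinfo=timezone.utc), 60, 1
)
```

A 60-second window matches the 1-minute minimum granularity from the functional requirements.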

Data Flow

The data flow is the sequential series of steps we’ll cover in order to get from the inputs to our system to the outputs. Clarifying this flow early will help to align with our interviewer before the high-level design. For the ad click aggregator:

  1. User clicks on an ad on a website.
  2. The click is tracked and stored in the system.
  3. The user is redirected to the advertiser’s website.
  4. Advertisers query the system for aggregated click metrics.
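The four steps above can be sketched end to end with an in-memory counter. This is a toy model under stated assumptions, not the final design: the `REDIRECT_URLS` table, the function names, and the choice to bucket clicks by minute at write time are all illustrative.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical lookup from ad ID to the advertiser's landing page.
REDIRECT_URLS = {"ad-123": "https://advertiser.example/landing"}

# (ad_id, start of minute) -> number of clicks in that minute.
clicks_per_minute: Counter = Counter()


def minute_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its minute."""
    return ts.replace(second=0, microsecond=0)


def handle_click(ad_id: str, clicked_at: datetime) -> str:
    """Steps 2 and 3: record the click, then return the redirect target."""
    clicks_per_minute[(ad_id, minute_bucket(clicked_at))] += 1
    return REDIRECT_URLS[ad_id]


def query_clicks(ad_id: str, start: datetime, end: datetime) -> int:
    """Step 4: total clicks for an ad in minutes where start <= minute < end."""
    return sum(
        count
        for (bucket_ad, bucket), count in clicks_per_minute.items()
        if bucket_ad == ad_id and start <= bucket < end
    )


def at(minute: int, second: int) -> datetime:
    """Helper for the example: a UTC time on Dec 4, 2024 at 12:MM:SS."""
    return datetime(2024, 12, 4, 12, minute, second, tzinfo=timezone.utc)


target = handle_click("ad-123", at(0, 5))
handle_click("ad-123", at(0, 50))
handle_click("ad-123", at(1, 10))

first_minute = query_clicks("ad-123", at(0, 0), at(1, 0))  # 2
two_minutes = query_clicks("ad-123", at(0, 0), at(2, 0))   # 3
```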

Note that this flow is deliberately simple. We will improve upon it as we go, but it’s important to start simple and build up from there.

High-Level Design

1) Users can click on ads and be redirected to the target

Let’s start with the easy part. When a user clicks on an ad in their browser, we need to make sure that they’re redirected to the advertiser’s website. We’ll introduce an Ad Placement Service, which will be responsible for placing ads on the website and associating them with the correct redirect URL.

When a user clicks on an ad which was placed by the Ad Placement Service, we will send a request to our /click endpoint, which will track the click and then redirect the user to the advertiser’s website.

There are two ways we can handle this redirect, with one being simpler and the other being more robust.

Good Solution: Client side redirect

Approach

The simplest thing we can do is send over a redirect URL with each ad that’s placed on the website. When a user clicks on the ad, the browser will automatically redirect them to the target URL. It’s simple, straightforward, and requires no additional server side logic. We would then, in parallel, POST to our /click endpoint to track the click.

Challenges

The downside with this approach is that users could go to an advertiser’s website without us knowing about it. This could lead to discrepancies in our click data and make it harder for advertisers to track the performance of their ads. Sophisticated users could grab the URL off of the page and navigate to it directly, bypassing our click tracking entirely. Someone would probably build a browser extension to do this.

Great Solution: Server side redirect

Approach

A more robust solution is to have the user click on the ad, which will then send a request to our server. Our server can then track the click and respond with a redirect to the advertiser’s website via a 302 (redirect) status code.

This way, we can ensure that we track every click and provide a consistent experience for users and advertisers. This approach also allows us to append additional tracking parameters to the URL, which can help the advertiser track the source of the click, but this is out of scope so we won’t go into more detail here.
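A minimal sketch of this server-side redirect, using only Python’s standard library, might look like the following. It assumes the ad link is a GET request to `/click` with an `ad_id` query parameter and that redirect URLs live in an in-memory table; both are illustrative choices, and a real service would hand the click to durable storage rather than a list.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List, Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Hypothetical lookup from ad ID to the advertiser's landing page.
REDIRECT_URLS = {"ad-123": "https://advertiser.example/landing"}
tracked_clicks: List[str] = []  # stand-in for the real click store


def resolve_click(path: str) -> Tuple[int, Optional[str]]:
    """Track the click and return (status code, redirect URL or None)."""
    parsed = urlparse(path)
    ad_id = parse_qs(parsed.query).get("ad_id", [""])[0]
    target = REDIRECT_URLS.get(ad_id)
    if parsed.path != "/click" or target is None:
        return 404, None
    tracked_clicks.append(ad_id)  # track first, then redirect
    return 302, target


class ClickHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, target = resolve_click(self.path)
        self.send_response(status)
        if target is not None:
            # The browser follows the Location header to the advertiser.
            self.send_header("Location", target)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ClickHandler).serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8080/click?ad_id=ad-123` in a browser should record the click and then land on the redirect target. Because the advertiser’s URL is only revealed in the server’s response, the click is recorded before the user ever reaches it.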

Challenges

The only downside is the added complexity: every click now makes a round trip to our server before the redirect, which could slow down the user experience. We need to make sure that our system can handle the additional load and respond quickly to the user’s request.

2) Advertisers can query ad click metrics over time at 1 minute intervals

Our users were successfully redirected, so now let’s focus on the advertisers. They need to be able to quickly query metrics about their ads to see how they’re performing. We’ll expand on the click processor path that we introduced above by breaking down some options for how a click is processed and stored.

Once our /click endpoint receives a request, what happens next?

Read the rest of this article (for free, no signup) here!
