Finding the Duplicate Number with Cycle Sort

In this post, we will solve another interesting coding interview problem using Cycle Sort technique.

3 min readMar 4, 2024

In my previous post, we looked at what cycle sort is and how it can be used to solve certain coding questions. We will continue to apply this technique to solve another coding problem. Cycle sort is useful in special case input scenarios, e.g. only one number missing in a given array, finding one duplicate number in the array etc.

Problem Statement

Imagine you’re given an array of n+1 integers where each integer is between 1 and n (inclusive), and there's a guarantee that at least one duplicate number exists. Your mission, should you choose to accept it, is to find any duplicate number in the array without modifying it and, importantly, in a way that exceeds O(n^2) time complexity and uses only constant extra space.

Example:

Consider the array [1,3,4,2,2]. Here, the duplicate number is 2.

How do we uncover this duplicity efficiently?

The Strategy: Cycle Sort Inspired Approach

Our solution takes cues from the Floyd’s Tortoise and Hare cycle detection algorithm, which, like cycle sort, relies on cyclic patterns within the data. Here’s the essence of our strategy:

Visualize the Problem as a Linked List: Imagine each element in the array points to the next element by its value. This visualization turns the array into a “virtual” linked list where the duplicate number forms a cycle.
Detect the Cycle: Use the slow (Tortoise) and fast (Hare) pointer technique to find the intersection point within the cycle.
Find the Entrance to the Cycle: Knowing a cycle exists, the next step is to find where it begins, which corresponds to our duplicate number.

Java Solution with Step-by-Step Explanation

public class Solution {
    public int findDuplicate(int[] nums) {
        // Step 1: Detect the cycle using the Tortoise and Hare approach
        int tortoise = nums[0];
        int hare = nums[0];
        do {
            tortoise = nums[tortoise];
            hare = nums[nums[hare]];
        } while (tortoise != hare);

        // Step 2: Find the entrance to the cycle (duplicate number)
        tortoise = nums[0];
        while (tortoise != hare) {
            tortoise = nums[tortoise];
            hare = nums[hare];
        }

        return hare; // hare (or tortoise) is now at the start of the cycle (duplicate number)
    }
}

Breaking It Down:

Visualizing the Array as a Linked List: We start with the assumption that each element in the array points to another element, forming a virtual linked list.
Detecting the Cycle: By advancing one pointer (tortoise) by one step and another (hare) by two steps, we're bound to find a meeting point if a cycle exists. This is guaranteed by Floyd's cycle detection principle.
Locating the Cycle’s Start: Once a cycle is detected, we reset one pointer to the start and then move both pointers at the same speed. The point where they meet again is the entrance to the cycle, i.e., our duplicate number.

Time and Space Complexity

Time Complexity: O(n). Both the cycle detection and entrance finding steps traverse the list in a linear fashion.
Space Complexity: O(1). Our solution only uses a few variables, maintaining constant space usage.

Please follow me on Medium for more such insightful explorations into coding challenges.