Published on

Algorithm 1 - Arrays & hashing

Welcome to our comprehensive guide to algorithmic problems, where we'll be delving into the intriguing world of Arrays & Hashing! 🚀 Whether you're a seasoned programmer looking to sharpen your skills or a coding enthusiast eager to explore new challenges, this series of problems is tailor-made for you.

Arrays are fundamental data structures that store elements of the same type, allowing us to organize and manipulate data efficiently. On the other hand, Hashing involves the use of hash functions to map data to unique indices, providing fast access and retrieval. 🗂️💨

Throughout this article, we'll explore various problem-solving techniques, best practices, and clever algorithms that will help you tackle a wide range of array and hashing challenges. So, buckle up and get ready to unlock the secrets behind these powerful tools, and elevate your problem-solving prowess to new heights! Let's embark on this exciting journey together. Happy coding! 💻🤩

Problem: Contains Duplicate (Easy)


You are given an array of integers, and you need to determine if there are any duplicate elements present in the array. Your task is to write a function that returns true if any value appears at least twice in the array, and false if every element is distinct.

Examples:

  • Example 1:

      Input: nums = [1, 2, 3, 1] Output: true
    
      Explanation: The array contains the duplicate element '1', which appears more than once.
    
  • Example 2:

      Input: nums = [3, 5, 9, 2] Output: false
    
      Explanation: The array contains no duplicate elements. All elements are distinct.
    

Solutions:

1. Brute Force

If you haven't learned about hashsets yet, this brute force approach might be the one that you can come up with. The idea is to iterate through the array and compare each element to every other element in the array. If we find a duplicate, we can return true immediately. Otherwise, we can return false after checking every element in the array. However it's time complexity is not reasonable, so it's not recommended for efficiency.

def containsDuplicate(nums):
    n = len(nums)
    for i in range(n):
        for j in range(i + 1, n):
            if nums[i] == nums[j]:
                return True
    return False
  • Time Complexity: O(n^2)
  • Space Complexity: O(1)

2. Sorting the Array

Another way which is not using hashset is that you can sort the array first in ascending order, and then check if any adjacent elements are equal. If we find a duplicate, we can return true immediately. Otherwise, we can return false after checking every element in the array. In terms of efficiency, this approach is better than the brute force approach, but it's still not the most efficient way to solve this problem. However, it's a good practice to sort the array first, because it can help you solve other problems

def containsDuplicate(nums):
    nums.sort()
    for i in range(1, len(nums)):
        if nums[i] == nums[i - 1]:
            return True
    return False
  • Time Complexity: O(n log n)
  • Space Complexity: O(1)

3. Using a Set

def containsDuplicate(nums):
    seen = set()
    for num in nums:
        if num in seen:
            return True
        seen.add(num)
    return False
  • Time Complexity: O(n)
  • Space Complexity: O(n)

The idea behind this solution is to utilize a set data structure to keep track of the elements encountered while traversing the input array. As we iterate through the array, we add each element to the set. If an element is already present in the set, it indicates that we have found a duplicate, and we can immediately return true. On the other hand, if the loop completes without finding any duplicates, we return false.

This approach leverages the constant-time average lookup property of sets, making it highly efficient. Sets store unique elements, so if we encounter a duplicate, it will be detected in constant time.

Summarize

So after this first problem, you can have a rough idea how the hashset works. It's an extremely useful data structure that can help you solve a lot of problems with high efficiency. Let's move on to the next problem!

Problem: Top K Frequent Elements (Medium)

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

Examples:

  • Example 1:

    Input: (nums = [1, 1, 1, 2, 2, 3]), (k = 2)
    
    Output: [1, 2]
    
  • Example 2:

    Input: (nums = [1]), (k = 1)
    
    Output: [1]
    

1. Brute Force

Let's explore a brute force solution for finding the k most frequent elements in the array. The brute force approach involves counting the frequency of each element and then sorting the elements based on their frequency. We can then return the top k elements from the sorted list.

def k_most_frequent(nums, k):
    # Step 1: Create a frequency map
    frequency_map = {}
    for num in nums:
        frequency_map[num] = frequency_map.get(num, 0) + 1

    # Step 2: Sort elements based on frequency
    sorted_elements = sorted(frequency_map.keys(), key=lambda x: frequency_map[x], reverse=True)

    # Step 3: Pick the top k elements
    return sorted_elements[:k]

In this brute force approach, the time complexity is dominated by the sorting step, which has a time complexity of O(n log n) using Python's built-in sorting algorithm. The frequency counting step has a time complexity of O(n), so the overall time complexity of this brute force approach is O(n log n).

The space complexity is O(n) as we use additional space to store the frequency map.

2. Bucket Sort algorithm

To solve this problem with better than O(n log n) time complexity, we can use a variation of the Bucket Sort algorithm. The basic idea is to count the frequency of each element in the array and then use a bucket to store elements with the same frequency. We'll then pick the top k elements from the bucket.

Here's a step-by-step approach to implement this algorithm:

  1. Create a frequency map to count the occurrences of each element in the input array nums.
  2. Create a list of buckets, where each bucket will represent the elements with the same frequency. The index of the bucket will represent the frequency, and the bucket will contain the elements with that frequency.
  3. Traverse the frequency map and place each element in the corresponding bucket based on its frequency.
  4. Traverse the buckets from the end and pick the top k elements until k elements are selected.
def k_most_frequent(nums, k):
    # Step 1: Create a frequency map
    frequency_map = {}
    for num in nums:
        frequency_map[num] = frequency_map.get(num, 0) + 1

    # Step 2: Create a list of buckets
    max_frequency = max(frequency_map.values())
    buckets = [[] for _ in range(max_frequency + 1)]

    # Step 3: Place elements in the corresponding bucket
    for num, frequency in frequency_map.items():
        buckets[frequency].append(num)

    # Step 4: Pick the top k elements from the buckets
    result = []
    for i in range(max_frequency, 0, -1):
        result.extend(buckets[i])
        if len(result) >= k:
            break

    return result[:k]

This algorithm's time complexity is O(n) because we only need to traverse the array and count the frequency of elements once. The space complexity is also O(n) since we use additional space for the frequency map and the buckets.

Summarize

In this problem, we use another kind of hashing technique called bucket sort, and you can see how efficient hashing is when it comes to solving problems.

Problem: Longest Consecutive Sequence (Medium)

Given an unsorted array of integers nums, return the length of the longest consecutive elements sequence.

Example 1:

Input: nums = [100,4,200,1,3,2]

Output: 4

Explanation: The longest consecutive elements sequence is [1, 2, 3, 4]. Therefore its length is 4.

Example 2:

Input: nums = [0, 3, 7, 2, 5, 8, 4, 6, 0, 1]

Output: 9

1. Brute Force

A brute force approach involves sorting the array and iterating through it to find the longest consecutive sequence.

def longest_consecutive(nums):
    if not nums:
        return 0

    nums.sort()
    max_length = 1
    current_length = 1

    for i in range(1, len(nums)):
        if nums[i] == nums[i-1] + 1:
            current_length += 1
        elif nums[i] != nums[i-1]:
            current_length = 1

        max_length = max(max_length, current_length)

    return max_length

The sorting step takes O(n log n) time, and iterating through the sorted array takes O(n) time. So the total time complexity is O(n log n).

2. Hash Set

The most efficient approach to solve this problem is by using a HashSet.

def longest_consecutive(nums):
    num_set = set(nums)
    max_length = 0

    for num in num_set:
        if num - 1 not in num_set:
            current_num = num
            current_length = 1

            while current_num + 1 in num_set:
                current_num += 1
                current_length += 1

            max_length = max(max_length, current_length)

    return max_length

When the algorithm finds a number that is the start of a consecutive sequence (i.e., num - 1 is not in the HashSet), it initiates a while loop to explore the consecutive sequence by incrementing the current number. Since each number is explored at most once, the overall time complexity remains linear, and the total time complexity is O(n).

Summarize

In this problem, we explored two different approaches to find the length of the longest consecutive elements sequence in an unsorted array of integers. The brute force approach involved sorting the array and then iterating through it to find the longest sequence, resulting in a time complexity of O(n log n). On the other hand, the efficient approach utilized a HashSet to identify consecutive sequences in a single pass through the array, leading to an optimal time complexity of O(n).

Wrap up

🔍 Through these problems, we have witnessed the power and ease of using the hashing technique to solve a wide range of problems efficiently. As we continue on this algorithm journey, let's study together and explore more fascinating problem-solving techniques! 🤝📚