Implementing parallel search algorithms with `std::jthread`

With the increasing availability of multi-core processors, parallel programming has become a necessity in order to fully utilize the computing power of modern systems. When it comes to searching algorithms, parallelizing the search process can significantly improve performance, especially for large data sets. In this blog post, we will explore how to implement parallel search algorithms using std::jthread, a new lightweight threading library introduced in C++20.

What is std::jthread?

std::jthread is a new threading library introduced in the C++20 standard. It provides a high-level interface for managing threads, making it easier to write concurrent code. std::jthread is similar to std::thread, but with some additional features like automatic thread cleanup.

Linear search is a simple search algorithm that checks each element in a data set sequentially until the desired element is found. Let’s consider a simple linear search implementation in C++:

int linearSearch(const std::vector<int>& data, int target) {
    for (int i = 0; i < data.size(); ++i) {
        if (data[i] == target) {
            return i;  // Return the position of the element if found
        }
    }
    return -1;  // Element not found
}

To parallelize this linear search algorithm, we can divide the data set into smaller chunks and assign each chunk to a separate thread. Each thread will search a subset of the data in parallel. Here’s an example implementation using std::jthread:

#include <iostream>
#include <vector>
#include <algorithm>
#include <thread>
#include <future>

int parallelLinearSearch(const std::vector<int>& data, int target) {
    int numThreads = std::thread::hardware_concurrency();
    int chunkSize = data.size() / numThreads;

    std::vector<std::jthread> threads(numThreads);
    std::vector<std::future<int>> futures(numThreads);

    for (int i = 0; i < numThreads; ++i) {
        int startIndex = i * chunkSize;
        int endIndex = (i == numThreads - 1) ? data.size() : (startIndex + chunkSize);

        threads[i] = std::jthread([&, startIndex, endIndex](std::stop_token token) {
            auto it = std::find(data.begin() + startIndex, data.begin() + endIndex, target);
            if (it != data.begin() + endIndex && !token.stop_requested()) {
                // Element found, stop other threads
                std::stop_source source;
                for (int j = 0; j < numThreads; ++j) {
                    if (j != i) {
                        threads[j].request_stop();
                    }
                }
            }
            futures[i].set_value(it - data.begin());
        });
    }

    for (int i = 0; i < numThreads; ++i) {
        threads[i].join();
    }

    for (int i = 0; i < numThreads; ++i) {
        if (futures[i].valid() && futures[i].get() != -1) {
            return futures[i].get();
        }
    }

    return -1;  // Element not found
}

In this parallelized version, we determine the number of threads based on the hardware concurrency and divide the data set into equal-sized chunks. Each thread searches its assigned chunk using the std::find algorithm. If an element is found, the thread signals other threads to stop using std::jthread’s std::stop_token. Finally, we wait for all threads to complete and return the position of the found element, or -1 if the element is not found.

Conclusion

Implementing parallel search algorithms can greatly improve the performance of search operations, especially for large data sets. With the introduction of std::jthread in C++20, parallel programming becomes easier and more intuitive. By dividing the data set into smaller chunks and assigning each chunk to a separate thread, we can leverage the power of multi-core processors and achieve faster search times.