Prefix search algorithm using multiple cores

I have a task to filter a list (vector) of words by a given prefix. The algorithm should make use of a modern multi-core processor.


My idea for a solution is to process the list with many threads.


//      PrintWriter writer = new PrintWriter("C:\\DemoList.txt", "UTF-8");
//
//      for (char i = 'A'; i <= 'Z'; i++) {
//          for (char j = 'A'; j <= 'Z'; j++) {
//              for (char n = 'A'; n <= 'Z'; n++) {
//                  for (char m = 'A'; m <= 'Z'; m++) {
//                      writer.println("" + i + j + n + m);
//                  }
//              }
//          }
//      }

    List<String> allLines = Files.readAllLines(Paths.get("C:\\", "DemoList.txt"));
    Collections.shuffle(allLines);
    String pattern = "AA";

    List<String> result = new ArrayList<>();
    int cores = Runtime.getRuntime().availableProcessors();
    int threadsNum = allLines.size() / cores; // intended chunk size per thread; not used yet

    long start_time = System.nanoTime();

    for (String word : allLines) {
        if (word.startsWith(pattern))
            result.add(word);
    }

    long end_time = System.nanoTime();
    double difference = (end_time - start_time) / 1e6;
    System.out.println("Time difference in Milliseconds with Brute-Force: " + difference);

    // With parallelism:
    long new_start_time = System.nanoTime();

    List<String> filteredList = allLines.parallelStream().filter(s -> s.startsWith(pattern))
            .collect(Collectors.toList());

    long new_end_time = System.nanoTime();
    double new_difference = (new_end_time - new_start_time) / 1e6;
    System.out.println("Time difference in Milliseconds with Stream from Java 8: " + new_difference);

Result:
Time difference in Milliseconds with Brute-Force: 33.033602
Time difference in Milliseconds with Stream from Java 8: 65.017069


Each thread should filter one range of the list.
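A minimal sketch of that range-per-thread idea (the pool size and chunking here are chosen arbitrarily, and the class and method names are made up for illustration) could look like this:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class RangeFilterSketch {

        // Splits the list into one range per core and filters each range in its own task.
        static List<String> filterByPrefix(List<String> words, String prefix) throws Exception {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);
            try {
                int chunk = Math.max(1, (words.size() + cores - 1) / cores); // ceiling division
                List<Callable<List<String>>> tasks = new ArrayList<>();
                for (int from = 0; from < words.size(); from += chunk) {
                    List<String> range = words.subList(from, Math.min(from + chunk, words.size()));
                    tasks.add(() -> {
                        List<String> matches = new ArrayList<>();
                        for (String w : range) {
                            if (w.startsWith(prefix)) {
                                matches.add(w);
                            }
                        }
                        return matches;
                    });
                }
                List<String> combined = new ArrayList<>();
                for (Future<List<String>> f : pool.invokeAll(tasks)) {
                    combined.addAll(f.get()); // preserves the original range order
                }
                return combined;
            } finally {
                pool.shutdown();
            }
        }
    }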


Do you have a better idea? Do you think I should sort the original list and then run a binary search on it instead? Should I use multiple threads for the sorting as well, or just Collections.sort()? How would you implement it?
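For reference, a rough sketch of the sort-then-binary-search variant (the helper names below are made up for illustration) might look like the following; it only pays off when the list is sorted once and then queried for many prefixes, since each query then costs O(log N + matches) instead of O(N):

    import java.util.List;

    public class SortedPrefixSearch {

        // Usage (one-time sort, then many cheap lookups):
        //   Collections.sort(allLines);
        //   List<String> hits = prefixRange(allLines, "AA");

        // Assumes 'sorted' is in natural (lexicographic) order and that
        // words do not contain the char U+FFFF (used as an upper sentinel).
        static List<String> prefixRange(List<String> sorted, String prefix) {
            int from = lowerBound(sorted, prefix);
            int to = lowerBound(sorted, prefix + Character.MAX_VALUE);
            return sorted.subList(from, to);
        }

        // Index of the first element >= key (classic lower-bound binary search).
        private static int lowerBound(List<String> sorted, String key) {
            int lo = 0, hi = sorted.size();
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (sorted.get(mid).compareTo(key) < 0) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            return lo;
        }
    }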


holdtom
2 Answers

一只萌萌小番薯

Judging from your code sample, your way of measuring time is essentially a micro-benchmark, and for a micro-benchmark timing a single execution is misleading. You can find a detailed discussion in this StackOverflow post: How do I write a correct micro-benchmark in Java? I wrote a more accurate benchmark to get a more precise measurement of your sample code. It was run on a quad-core i7 with multithreading (but it is a laptop, so because of power and thermal management the results may be slightly unfair to the multi-threaded code, which generates more heat).

@Benchmark
public void testSequentialFor(Blackhole bh, Words words) {
    List<String> filtered = new ArrayList<>();
    for (String word : words.toSort) {
        if (word.startsWith(words.prefix)) {
            filtered.add(word);
        }
    }
    bh.consume(filtered);
}

@Benchmark
public void testParallelStream(Blackhole bh, Words words) {
    bh.consume(words.toSort.parallelStream()
            .filter(w -> w.startsWith(words.prefix))
            .collect(Collectors.toList())
    );
}

@Benchmark
public void testManualThreading(Blackhole bh, Words words) {
    // This is quick and dirty, but gives a decent baseline as to
    // what a manually threaded partitioning can achieve.
    List<Future<List<String>>> async = new ArrayList<>();
    // this has to be optimized to avoid creating "almost empty" work units
    int batchSize = words.size / ForkJoinPool.commonPool().getParallelism();
    int numberOfDispatchedWords = 0;
    while (numberOfDispatchedWords < words.toSort.size()) {
        int start = numberOfDispatchedWords;
        int end = numberOfDispatchedWords + batchSize;
        async.add(words.threadPool.submit(() -> {
            List<String> batch = words.toSort.subList(start, Math.min(end, words.toSort.size()));
            List<String> result = new ArrayList<>();
            for (String word : batch) {
                if (word.startsWith(words.prefix)) {
                    result.add(word);
                }
            }
            return result;
        }));
        numberOfDispatchedWords += batchSize;
    }
    List<String> result = new ArrayList<>();
    for (Future<List<String>> asyncResult : async) {
        try {
            result.addAll(asyncResult.get());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    bh.consume(result);
}

@State(Scope.Benchmark)
public static class Words {

    ExecutorService threadPool = ForkJoinPool.commonPool();

    List<String> toSort;

    @Param({"100", "1000", "10000", "100000"})
    private int size;

    private String prefix = "AA";

    @Setup
    public void prepare() {
        // a 4 to 13 letters long, random word
        // for more precision, it should not be that random (use a fixed seed),
        // but given the simple nature of the filtering, I guess it's ok this way
        Supplier<String> wordCreator = () -> RandomStringUtils.random(4 + ThreadLocalRandom.current().nextInt(10));
        toSort = Stream.generate(wordCreator).limit(size).collect(Collectors.toList());
    }
}

Here are the results:

Benchmark                      (size)   Mode  Cnt        Score        Error  Units
PerfTest.testManualThreading      100  thrpt    6    95971,811 ±  1100,589  ops/s
PerfTest.testManualThreading     1000  thrpt    6    76293,983 ±  1632,959  ops/s
PerfTest.testManualThreading    10000  thrpt    6    34630,814 ±  2660,058  ops/s
PerfTest.testManualThreading   100000  thrpt    6     5956,552 ±   529,368  ops/s
PerfTest.testParallelStream       100  thrpt    6    69965,462 ±   451,418  ops/s
PerfTest.testParallelStream      1000  thrpt    6    59968,271 ±   774,859  ops/s
PerfTest.testParallelStream     10000  thrpt    6    29079,957 ±   513,244  ops/s
PerfTest.testParallelStream    100000  thrpt    6     4217,146 ±   172,781  ops/s
PerfTest.testSequentialFor        100  thrpt    6  3553930,640 ± 21142,150  ops/s
PerfTest.testSequentialFor       1000  thrpt    6   356217,937 ±  7446,137  ops/s
PerfTest.testSequentialFor      10000  thrpt    6    28894,748 ±   674,929  ops/s
PerfTest.testSequentialFor     100000  thrpt    6     1725,735 ±    13,273  ops/s

So the sequential version looks way faster up to a few thousand elements; it is on a par with manual threading a bit before 10k elements and with the parallel stream a bit after, and the threaded code does better from there on.

Considering the amount of code needed to write the "manually threaded variant", and the risk of introducing bugs there or of losing efficiency by miscalculating the batch size, I would probably not pick that option, even though it looks like it can be faster than the stream for huge lists.

I would not try to sort first and then binary search, because filtering is an O(N) operation while sorting is O(N log N) (and on top of that you still have to add the binary search). So unless you have a very precise pattern in your data, I do not see it working in your favor.

Be careful, though, not to draw conclusions that this benchmark cannot support. For one thing, it is based on the assumption that the filtering is the only thing happening in the program and fighting for CPU time. If you are in any kind of "multi-user" application (e.g. a web application), this is probably not true, and you may very well lose everything you thought you would gain from multithreading.
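For anyone trying to reproduce this: the methods above are JMH benchmarks (RandomStringUtils comes from Apache Commons Lang), so they need a JMH harness to run. A minimal runner, assuming the benchmark class is called PerfTest as in the results table, could look like this:

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class PerfTestRunner {
    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(PerfTest.class.getSimpleName()) // run all @Benchmark methods in PerfTest
                .forks(1)
                .warmupIterations(3)
                .measurementIterations(6) // the Cnt column above is forks * measurement iterations
                .build();
        new Runner(opt).run();
    }
}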

呼啦一阵风

Starting with Java 8, you can solve this with a stream in just a few lines:

    List<String> yourList = new ArrayList<>(); // A list whose size requires parallelism
    String yourPrefix = ""; // Your custom prefix

    final List<String> filteredList = yourList.parallelStream()
            .filter(s -> s.startsWith(yourPrefix))
            .collect(Collectors.toList());

I suggest you read this article and have a look at this question to see whether parallelism will actually help you.
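If you need to cap how many threads such a parallel stream uses (for example inside a web application, where the common pool is shared), a widely used workaround is to run the terminal operation inside your own ForkJoinPool. Note that this relies on an implementation detail of how parallel streams pick their pool, not on a documented guarantee; a sketch:

    import java.util.List;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.ForkJoinPool;
    import java.util.stream.Collectors;

    public class BoundedParallelFilter {

        // Runs the parallel stream inside a dedicated pool so it does not
        // saturate the shared ForkJoinPool.commonPool().
        static List<String> filter(List<String> words, String prefix, int threads)
                throws InterruptedException, ExecutionException {
            ForkJoinPool pool = new ForkJoinPool(threads);
            try {
                return pool.submit(() ->
                        words.parallelStream()
                             .filter(s -> s.startsWith(prefix))
                             .collect(Collectors.toList())
                ).get();
            } finally {
                pool.shutdown();
            }
        }
    }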