为什么按位置读取 JDBC 结果集比按名称快多少,速度有多快?

在宣布休眠 6 时,休眠团队声称,通过在 JDBC ResultSet 中从按名称读取切换到按位置读取,他们获得了性能优势。

高负载性能测试表明,Hibernate 按名称从 ResultSet 读取值的方法是其在吞吐量中扩展的最大限制因素。

这是否意味着他们正在将调用从 getString(字符串列标签)更改为 getString(int 列索引)

为什么这样更快?

就像接口一样,性能提升不取决于实现它的JDBC驱动程序吗?ResultSet

收益有多大?


RISEBY
浏览 62回答 2
2回答

慕无忌1623718

作为 JDBC 驱动程序维护者(我承认,我进行了一些不一定适用于所有 JDBC 驱动程序的全面概括),行值通常存储在数组或列表中,因为这最自然地与从数据库服务器接收数据的方式相匹配。因此,按索引检索值将是最简单的。它可能像这样的东西一样简单(忽略实现JDBC驱动程序的一些更令人讨厌的细节):public Object getObject(int index) throws SQLException {&nbsp; &nbsp; checkValidRow();&nbsp; &nbsp; checkValidIndex(index);&nbsp; &nbsp; return currentRow[index - 1];}这几乎是最快的。另一方面,按列名查找需要更多的工作。列名需要处理为不区分大小写的,无论使用小写还是大写进行规范化,或者使用不区分大小写的查找,这都会产生额外的成本。TreeMap一个简单的实现可能是这样的:public Object getObject(String columnLabel) throws SQLException {&nbsp; &nbsp; return getObject(getIndexByLabel(columnLabel));}private int getIndexByLabel(String columnLabel) {&nbsp; &nbsp; Map<String, Integer> indexMap = createOrGetIndexMap();&nbsp; &nbsp; Integer columnIndex = indexMap.get(columnLabel.toLowerCase());&nbsp; &nbsp; if (columnIndex == null) {&nbsp; &nbsp; &nbsp; &nbsp; throw new SQLException("Column label " + columnLabel + " does not exist in the result set");&nbsp; &nbsp; }&nbsp; &nbsp; return columnIndex;}private Map<String, Integer> createOrGetIndexMap() throws SQLException {&nbsp; &nbsp; if (this.indexMap != null) {&nbsp; &nbsp; &nbsp; &nbsp; return this.indexMap;&nbsp; &nbsp; }&nbsp; &nbsp; ResultSetMetaData rsmd = getMetaData();&nbsp; &nbsp; Map<String, Integer> map = new HashMap<>(rsmd.getColumnCount());&nbsp; &nbsp; // reverse loop to ensure first occurrence of a column label is retained&nbsp; &nbsp; for (int idx = rsmd.getColumnCount(); idx > 0; idx--) {&nbsp; &nbsp; &nbsp; &nbsp; String label = rsmd.getColumnLabel(idx).toLowerCase();&nbsp; &nbsp; &nbsp; &nbsp; map.put(label, idx);&nbsp; &nbsp; }&nbsp; &nbsp; return this.indexMap = map;}根据数据库的 API 和可用的语句元数据,可能需要额外的处理来确定查询的实际列标签。根据开销,这可能仅在实际需要时才确定(按名称访问列标签时,或检索结果集元数据时)。换句话说,成本可能相当高。createOrGetIndexMap()但是,即使该成本可以忽略不计(例如,从数据库服务器准备元数据的语句包括列标签),将列标签映射到索引然后按索引检索的开销显然高于直接按索引检索的开销。驱动程序甚至可以每次都循环访问结果集元数据,并使用标签匹配的第一个;这可能比为具有少量列的结果集构建和访问哈希映射更便宜,但成本仍然高于通过索引直接访问。正如我所说,这是一个全面的概括,但是如果这(按名称查找索引,然后按索引检索)不是它在大多数JDBC驱动程序中的工作方式,我会感到惊讶,这意味着我希望按索引查找通常会更快。快速浏览一下许多驱动程序,情况如下:火鸟(杰伯德,披露:我维护这个司机)MySQL (MySQL Connector/J)PostgreSQL神谕断续器SQL Server (Microsoft JDBC Driver for SQL Server)我不知道JDBC驱动程序按列名检索的成本会相等,甚至更便宜。

互换的青春

在制作&nbsp;jOOQ&nbsp;的早期,我考虑了这两个选项,即按索引或名称访问 JDBC 值。出于以下原因,我选择按索引访问内容:ResultSet数据库管理系统支持并非所有 JDBC 驱动程序实际上都支持按名称访问列。我忘记了哪些没有,如果它们仍然没有,因为我在13年内再也没有接触过JDBC API的那一部分。但有些人没有,这对我来说已经是一个节目的阻碍。名称的语义此外,在那些支持列名的列名中,列名有不同的语义,主要是两个,JDBC调用:列名如结果集元数据::获取列名列标签如结果集MetaData::get列列标签关于上述两个的实现有很多歧义,尽管我认为意图非常明确:列名应该产生列的名称,而不管别名如何,例如 如果投影表达式是TITLEBOOK.TITLE AS X列标签应该生成列的标签(或别名),如果没有可用的别名,则生成名称,例如 如果投影表达式是XBOOK.TITLE AS X因此,名称/标签的这种模糊性已经非常令人困惑和担忧。一般来说,ORM似乎不应该依赖它,尽管在Hibernate的情况下,人们可以争辩说休眠控制着大多数SQL的生成,至少是为获取实体而生成的SQL。但是,如果用户编写 HQL 或本机 SQL 查询,我将不愿意依赖名称/标签 - 至少不要先在 中查找内容。ResultSetMetaData歧义在SQL中,在顶层有不明确的列名是完全可以的,例如:SELECT&nbsp;id,&nbsp;id,&nbsp;not_the_id&nbsp;AS&nbsp;id FROM&nbsp;book这是完全有效的 SQL。不能将此查询嵌套为派生表,因为不允许出现多义词,但在顶级中可以。现在,您将如何处理顶层的重复标签?您无法确定在按名称访问事物时会得到哪一个。前两个可能相同,但第三个非常不同。SELECTID清楚地区分列的唯一方法是按索引,索引是唯一的:, , 。123性能我当时也尝试过表演。我不再有基准测试结果,但很容易快速编写另一个基准测试。在下面的基准测试中,我正在对 H2 内存中实例运行一个简单的查询,并使用访问内容:ResultSet按索引按名称结果令人震惊:Benchmark&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Mode&nbsp; Cnt&nbsp; &nbsp; &nbsp; &nbsp; Score&nbsp; &nbsp; &nbsp; &nbsp;Error&nbsp; UnitsJDBCResultSetBenchmark.indexAccess&nbsp; thrpt&nbsp; &nbsp; 7&nbsp; 1130734.076 ±&nbsp; 9035.404&nbsp; ops/sJDBCResultSetBenchmark.nameAccess&nbsp; &nbsp;thrpt&nbsp; &nbsp; 7&nbsp; &nbsp;600540.553 ± 13217.954&nbsp; ops/s尽管基准测试在每次调用时运行整个查询,但按索引访问的速度几乎是其两倍!你可以看看H2的代码,它是开源的。它执行以下操作(版本 2.1.212):private int getColumnIndex(String columnLabel) {&nbsp; &nbsp; checkClosed();&nbsp; &nbsp; if (columnLabel == null) {&nbsp; &nbsp; &nbsp; &nbsp; throw DbException.getInvalidValueException("columnLabel", null);&nbsp; &nbsp; }&nbsp; &nbsp; if (columnCount >= 3) {&nbsp; &nbsp; &nbsp; &nbsp; // use a hash table if more than 2 columns&nbsp; &nbsp; &nbsp; &nbsp; if (columnLabelMap == null) {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; HashMap<String, Integer> map = new HashMap<>();&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; // [ ... ]&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; columnLabelMap = map;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; if (preparedStatement != null) {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; preparedStatement.setCachedColumnLabelMap(columnLabelMap);&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; &nbsp; &nbsp; Integer index = columnLabelMap.get(StringUtils.toUpperEnglish(columnLabel));&nbsp; &nbsp; &nbsp; &nbsp; if (index == null) {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; throw DbException.get(ErrorCode.COLUMN_NOT_FOUND_1, columnLabel);&nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; &nbsp; &nbsp; return index + 1;&nbsp; &nbsp; }&nbsp; &nbsp; // [ ... ]所以,有一个带有上写字母的哈希图,每次查找也执行上写。至少,它将映射缓存在预准备语句中,因此:您可以在每行上重复使用它您可以在语句的多次执行中重用它(至少这是我解释代码的方式)因此,对于非常大的结果集,它可能不再那么重要,但对于较小的结果集,它肯定很重要。关于管理权的结论像休眠或jOOQ这样的ORM可以控制大量的SQL和结果集。它确切地知道哪个列在什么位置,这项工作在生成SQL查询时已经完成。因此,当结果集从数据库服务器返回时,绝对没有理由进一步依赖列名。每个值都将位于预期位置。在休眠中,使用列名一定是一些历史性的事情。这可能也是为什么他们曾经生成这些不那么可读的列别名,以确保每个别名都是不明确的。这似乎是一个明显的改进,无论在现实世界(非基准)查询中的实际收益如何。即使改进只有 2%,也是值得的,因为它会影响每个基于 Hibernate 的应用程序执行的每个查询。下面的基准代码,用于复制package org.jooq.test.benchmarks.local;import java.io.*;import java.sql.*;import java.util.Properties;import org.openjdk.jmh.annotations.*;import org.openjdk.jmh.infra.*;@Fork(value = 1)@Warmup(iterations = 3, time = 3)@Measurement(iterations = 7, time = 3)public class JDBCResultSetBenchmark {&nbsp; &nbsp; @State(Scope.Benchmark)&nbsp; &nbsp; public static class BenchmarkState {&nbsp; &nbsp; &nbsp; &nbsp; Connection connection;&nbsp; &nbsp; &nbsp; &nbsp; @Setup(Level.Trial)&nbsp; &nbsp; &nbsp; &nbsp; public void setup() throws Exception {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; try (InputStream is = BenchmarkState.class.getResourceAsStream("/config.properties")) {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Properties p = new Properties();&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; p.load(is);&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; connection = DriverManager.getConnection(&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; p.getProperty("db.url"),&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; p.getProperty("db.username"),&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; p.getProperty("db.password")&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; );&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; &nbsp; &nbsp; @TearDown(Level.Trial)&nbsp; &nbsp; &nbsp; &nbsp; public void teardown() throws Exception {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; connection.close();&nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; }&nbsp; &nbsp; @FunctionalInterface&nbsp; &nbsp; interface ThrowingConsumer<T> {&nbsp; &nbsp; &nbsp; &nbsp; void accept(T t) throws SQLException;&nbsp; &nbsp; }&nbsp; &nbsp; private void run(BenchmarkState state, ThrowingConsumer<ResultSet> c) throws SQLException {&nbsp; &nbsp; &nbsp; &nbsp; try (Statement s = state.connection.createStatement();&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ResultSet rs = s.executeQuery("select c as c1, c as c2, c as c3, c as c4 from system_range(1, 10) as t(c);")) {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; c.accept(rs);&nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; }&nbsp; &nbsp; @Benchmark&nbsp; &nbsp; public void indexAccess(Blackhole blackhole, BenchmarkState state) throws SQLException {&nbsp; &nbsp; &nbsp; &nbsp; run(state, rs -> {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; while (rs.next()) {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; blackhole.consume(rs.getInt(1));&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; blackhole.consume(rs.getInt(2));&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; blackhole.consume(rs.getInt(3));&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; blackhole.consume(rs.getInt(4));&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; &nbsp; &nbsp; });&nbsp; &nbsp; }&nbsp; &nbsp; @Benchmark&nbsp; &nbsp; public void nameAccess(Blackhole blackhole, BenchmarkState state) throws SQLException {&nbsp; &nbsp; &nbsp; &nbsp; run(state, rs -> {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; while (rs.next()) {&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; blackhole.consume(rs.getInt("C1"));&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; blackhole.consume(rs.getInt("C2"));&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; blackhole.consume(rs.getInt("C3"));&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; blackhole.consume(rs.getInt("C4"));&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }&nbsp; &nbsp; &nbsp; &nbsp; });&nbsp; &nbsp; }}
打开App,查看更多内容
随时随地看视频慕课网APP

相关分类

Java