龙空技术网

DBCP数据库连接打满原因分析

东东程序猿 94

前言:

此时姐妹们对“removeabandoned默认值”都比较关怀,同学们都想要知道一些“removeabandoned默认值”的相关文章。那么小编也在网摘上汇集了一些关于“removeabandoned默认值””的相关资讯,希望同学们能喜欢,咱们一起来了解一下吧!

实验背景

近一年来发生几起的数据库连接被打满的情况,初步分析是应用使用连接数量“超过了”连接池(DBCP1.4)的上限,导致数据库连接被打满,其中一个结论是连接池的bug导致

问题分析

1、DBCP连接池的Bug导致连接数超过大小设置,其根本原因是在连接池初始化的时候,有可能创建多个连接池导致

protected synchronized DataSource createDataSource()        throws SQLException {        if (closed) {            throw new SQLException("Data source is closed");        }        // Return the pool if we have already created it        if (dataSource != null) {            return (dataSource);        }        // create factory which returns raw physical connections        ConnectionFactory driverConnectionFactory = createConnectionFactory();        // create a pool for our connections        createConnectionPool();        // Set up statement pool, if desired        GenericKeyedObjectPoolFactory statementPoolFactory = null;        if (isPoolPreparedStatements()) {            statementPoolFactory = new GenericKeyedObjectPoolFactory(null,                        -1, // unlimited maxActive (per key)                        GenericKeyedObjectPool.WHEN_EXHAUSTED_FAIL,                        0, // maxWait                        1, // maxIdle (per key)                        maxOpenPreparedStatements);        }        // Set up the poolable connection factory        createPoolableConnectionFactory(driverConnectionFactory, statementPoolFactory, abandonedConfig);        // Create and return the pooling data source to manage the connections        createDataSourceInstance();        try {            for (int i = 0 ; i < initialSize ; i++) {                connectionPool.addObject();            }        } catch (Exception e) {            throw new SQLNestedException("Error preloading the connection pool", e);        }        return dataSource;    }

createDataSource方法里边调用createConnectionPool方法,如果后面方式执行失败,比如createPoolableConnectionFactory,当再次调用createDataSource,又会再次调用createConnectionPool,从而初始化多次连接池,下面是createConnectionPool

 protected void createConnectionPool() {        // Create an object pool to contain our active connections        GenericObjectPool gop;        if ((abandonedConfig != null) && (abandonedConfig.getRemoveAbandoned())) {            gop = new AbandonedObjectPool(null,abandonedConfig);        }        else {            gop = new GenericObjectPool();        }        gop.setMaxActive(maxActive);        gop.setMaxIdle(maxIdle);        gop.setMinIdle(minIdle);        gop.setMaxWait(maxWait);        gop.setTestOnBorrow(testOnBorrow);        gop.setTestOnReturn(testOnReturn);        gop.setTimeBetweenEvictionRunsMillis(timeBetweenEvictionRunsMillis);        gop.setNumTestsPerEvictionRun(numTestsPerEvictionRun);        gop.setMinEvictableIdleTimeMillis(minEvictableIdleTimeMillis);        gop.setTestWhileIdle(testWhileIdle);        connectionPool = gop;    }

显然这个情况只能是模块启动初始化的时候产生,且连接都不会被使用,而线上出现的问题是连接都在执行sql,不是这种情况!在1.4x的这个bug已经修复

2、启动参数分析,从启动参数入手,看一下是不是连接池主动释放连接,导致正在使用的连接被释放,从而创建新的连接,给我们的现象是“连接数”超过了连接池的限制,下面是一个入库的连接池参数

partition1.driverClassName=com.mysql.jdbc.Driverpartition1.initialSize=2partition1.maxActive=25partition1.minIdle=2partition1.maxIdle=5partition1.maxWait=3000partition1.threadPoolSize=10partition1.logAbandoned=truepartition1.testWhileIdle=truepartition1.testOnReturn=falsepartition1.testOnBorrow=truepartition1.validationQuery=select now()//在每次空闲连接回收器线程(如果有)运行时检查的连接数量partition1.numTestsPerEvictionRun=5//在空闲连接回收器线程运行期间休眠的时间值,以毫秒为单位partition1.timeBetweenEvictionRunsMillis=30000//连接在池中保持空闲而不被空闲连接回收器线程partition1.minEvictableIdleTimeMillis=180000//设置了rmoveAbandoned=true 那么在getNumActive()快要到getMaxActive()的时候,系统会进行无效的Connection的回收,回收的 Connection为removeAbandonedTimeout(默认300秒)中设置的秒数后没有使用的Connectionpartition1.removeAbandoned=true//强制回收连接的时间,单位秒partition1.removeAbandonedTimeout=18

注意connectionProperties参数

<bean id="partition[j]" class="org.apache.commons.dbcp.BasicDataSource"          destroy-method="close">        <property name="driverClassName" value="${partition[j].driverClassName}" ></property>        <property name="url" value="${partition[j].url}" ></property>        <property name="username" value="${partition[j].username}" ></property>        <property name="password" value="${partition[j].password}" ></property>        <property name="defaultAutoCommit" value="false" ></property>        <property name="maxActive" value="${partition[j].maxActive}" ></property>        <property name="maxIdle" value="${partition[j].maxIdle}" ></property>        <property name="maxWait" value="${partition[j].maxWait}" ></property>        <property name="initialSize" value="${partition[j].initialSize}" ></property>        <property name="minIdle" value="${partition[j].minIdle}" ></property>        <property name="logAbandoned" value="${partition[j].logAbandoned}" ></property>        <property name="testWhileIdle" value="${partition[j].testWhileIdle}" ></property>        <property name="testOnReturn" value="${partition[j].testOnReturn}" ></property>        <property name="testOnBorrow" value="${partition[j].testOnBorrow}" ></property>        <property name="validationQuery" value="${partition[j].validationQuery}" ></property>        <property name="numTestsPerEvictionRun" value="${partition[j].numTestsPerEvictionRun}" ></property>        <property name="timeBetweenEvictionRunsMillis" value="${partition[j].timeBetweenEvictionRunsMillis}" ></property>        <property name="minEvictableIdleTimeMillis" value="${partition[j].minEvictableIdleTimeMillis}" ></property>        <property name="removeAbandoned" value="${partition[j].removeAbandoned}" ></property>        <property name="removeAbandonedTimeout" value="${partition[j].removeAbandonedTimeout}" ></property>        <property name="connectionProperties" value="useUnicode=true;   characterEncoding=utf8;initialTimeout=1;connectTimeout=1000;socketTimeout=6000;   rewriteBatchedStatements=true;autoReconnectForPools=true;autoReconnect=true;maxReconnects=1;   failOverReadOnly=false;roundRobinLoadBalance=true;allowMultiQueries=true"></property>    </bean>

从上面参数我们重点关注removeAbandonedTimeout 这个参数的意义

  //创建连接   public Object borrowObject() throws Exception {        if (config != null                && config.getRemoveAbandoned()                && (getNumIdle() < 2)                && (getNumActive() > getMaxActive() - 3) ) {            removeAbandoned();        }        Object obj = super.borrowObject();        if (obj instanceof AbandonedTrace) {            ((AbandonedTrace) obj).setStackTrace();        }        if (obj != null && config != null && config.getRemoveAbandoned()) {            synchronized (trace) {                trace.add(obj);            }        }    ...    private void removeAbandoned() {        // Generate a list of abandoned connections to remove        long now = System.currentTimeMillis();        long timeout = now - (config.getRemoveAbandonedTimeout() * 1000);        ArrayList remove = new ArrayList();        synchronized (trace) {            Iterator it = trace.iterator();            while (it.hasNext()) {                AbandonedTrace pc = (AbandonedTrace) it.next();                if (pc.getLastUsed() > timeout) {                    continue;                }                if (pc.getLastUsed() > 0) {                    remove.add(pc);                }            }        }        // Now remove the abandoned connections        Iterator it = remove.iterator();        while (it.hasNext()) {            AbandonedTrace pc = (AbandonedTrace) it.next();            if (config.getLogAbandoned()) {                pc.printStackTrace();            }                         try {                invalidateObject(pc);            } catch (Exception e) {                e.printStackTrace();            }        }    }

代从代码可以看出,配置了removeAbandonedOnBorrow和removeAbandonedTimeout就是触发正在使用的连接remove,假如这个时候,数据库正在执行sql,就会出现连接池已经断开连接,而数据库的连接还在执行,造成数据库的连接超过连接池的显现,如果少量慢sql执行时间大于removeAbandonedTimeout,不会出现问题,出现大量的慢sql,就会导致数据库中的这种连接越来越多,最后把数据库打满

实验过程

创建测试库

mkdir /usr/local/data/mysqldocker run -d -e MYSQL_ROOT_PASSWORD=root --name centos/mysql-57-centos7 -v /usr/local/data/mysql:/var/lib/mysql -p 3306:3306 mysql --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci --lower_case_table_names=1

创建测试库

create database test ;

构建测试类,核心代码

public void setUp() throws Exception {        ds = createDataSource();        ds.setDriverClassName("com.mysql.jdbc.Driver");        ds.setUrl("jdbc:mysql://127.0.0.1:3306/test?allowMultiQueries=true&characterEncoding=UTF-8&autoReconnect=true&failOverReadOnly=false&useSSL=false&allowPublicKeyRetrieval=true");        ds.setUsername("root");        ds.setPassword("Htbuy@2016");        ds.setMaxActive(1);        ds.setMaxWait(1000);        ds.setTestWhileIdle(true);        ds.setTestOnBorrow(true);        ds.setTestOnReturn(false);        ds.setValidationQuery("select now()");        ds.setNumTestsPerEvictionRun(5);        ds.setMinEvictableIdleTimeMillis(2000);        ds.setLogAbandoned(true);        ds.setRemoveAbandoned(true);        ds.setRemoveAbandonedTimeout(1);}public void testAbandoned() throws Exception {        for (int i = 0; i < 20; i++) {            Thread t = new Thread(new Runnable() {                @Override                public void run() {                    try {                        Connection conn = ds.getConnection();                        Statement statement = conn.createStatement();                        ResultSet resultSet = null;     #模拟慢sql                        resultSet = statement.executeQuery("select  sleep(100),now()");                        while (resultSet.next()) {                            System.out.println("result+" + resultSet.getString(1));                        }                        resultSet.close();                        statement.close();                        conn.close();                    } catch (Exception ex) {                        System.out.println(ex.getMessage());                    }                    System.out.println(Thread.currentThread().getName() + "---------------------- end----------------------");                }            });            t.setName(i + "");            t.start();            Thread.sleep((i + 1) * 1000);        }        System.out.println(Thread.currentThread().getName() + "---------------------- end----------------------");        Thread.sleep(1000000);    }

通过运行上面代码,并没有像我们所期望那样,超过连接的最大限制1,通过RemoveAbandoned=true和RemoveAbandonedTimeout=1 应该已经触发了连接池的Abandoned机制,但是都阻塞到下面的一行代码,DelegatingStatement的close方法

/**     * Close this DelegatingStatement, and close     * any ResultSets that were not explicitly closed.     */    public void close() throws SQLException {        try {            try {                if (_conn != null) {                    _conn.removeTrace(this);                    _conn = null;                }                // The JDBC spec requires that a statment close any open                // ResultSet's when it is closed.                // FIXME The PreparedStatement we're wrapping should handle this for us.                // See bug 17301 for what could happen when ResultSets are closed twice.                List resultSets = getTrace();                if( resultSets != null) {                    ResultSet[] set = (ResultSet[]) resultSets.toArray(new ResultSet[resultSets.size()]);                    for (int i = 0; i < set.length; i++) {                        set[i].close();                    }                    clearTrace();                }//阻塞地方                _stmt.close();            }            catch (SQLException e) {                handleException(e);            }        }        finally {            _closed = true;        }    }

当我们在url设置socketTimeout=1000,这时候阻塞的地方成功执行完成,超过数据库出现大量的连接数1,问题重现!同时出现典型的日志The last packet successfully received from the server was 1,001 milliseconds ago. The last packet sent successfully to the server was 1,001 milliseconds ago.

当设置socketTimeout=1000,RemoveAbandoned=false ,并没有重现问题,且都是大量的等待连接超时

原因是下面的代码差异,开启RemoveAbandoned,使用的是AbandonedObjectPool,而不开启是默认的GenericObjectPool,AbandonedObjectPool增加了Abandoned逻辑

 */    protected void createConnectionPool() {        // Create an object pool to contain our active connections        GenericObjectPool gop;        if ((abandonedConfig != null) && (abandonedConfig.getRemoveAbandoned())) {            gop = new AbandonedObjectPool(null,abandonedConfig);        }        else {            gop = new GenericObjectPool();        }        gop.setMaxActive(maxActive);        gop.setMaxIdle(maxIdle);        gop.setMinIdle(minIdle);        gop.setMaxWait(maxWait);        gop.setTestOnBorrow(testOnBorrow);        gop.setTestOnReturn(testOnReturn);        gop.setTimeBetweenEvictionRunsMillis(timeBetweenEvictionRunsMillis);        gop.setNumTestsPerEvictionRun(numTestsPerEvictionRun);        gop.setMinEvictableIdleTimeMillis(minEvictableIdleTimeMillis);        gop.setTestWhileIdle(testWhileIdle);        connectionPool = gop;    }
总结

当RemoveAbandoned=true, 且执行时间超过socketTimeout ,达到RemoveAbandonedTimeout的触发点时,就会导致数据库连接数超过连接池的限制,注意这种情况关闭模块是没用的,sql还在数据库中执行,应该直接kill或者切库操作!!

优化方案

1、关闭RemoveAbandoned设置,目前wms系统的还是AP和TP混合型,有很大的几率触发这种情况

2、经过邱玉堃复核模块测试,需要关闭socketTimeout参数,引用的1.4.jar版本与源码的1.4版本存在差异

3、 注意当mysql-connector-java 版本小于5.1.45就会出现这个bug

标签: #removeabandoned默认值