一文详解MySQL text能存多少个字符

 

前言

今天测试给提了个bug, 排查原因是插入数据时字段长度不够导致。我使用的是MySQL8的数据库,在给某个表新增一条数据,其中一个字段submit_info使用的是text的类型预设用来存储大的json字符串。之前对text到底能存多长的字符串没概念,恰好这次预到问题决定调查一下。这里记录一下

 

一、字符集字符长度和字节长度

首先需要知道字符长度和字节长度、和字符集,不同的字符集存储的中文字符时占的字节长度不一样。

MySQL在当前库中通过 show variables like '%char%';可查看当前数据库的字符集character_set_database->utf8mb4 。

show variables like ‘%char%’;

Variable_nameValue
character_set_clientutf8mb4
character_set_connectionutf8mb4
character_set_databaseutf8mb4
character_set_filesystembinary
character_set_resultsutf8mb4
character_set_serverutf8mb4
character_set_systemutf8
character_sets_dirC:\Program Files\MySQL\MySQL Server 8.0\share\charsets\

在 utf8mb4字符集下英文占用1个字节长度,一般汉字占3-4个字节长度。可用 length(字段),char_length(字段)进行区分

select version(), submit_info,length(submit_info),char_length(submit_info) from appro_flow_main where id in(35,36);

version()submit_infolength(submit_info)char_length(submit_info)
8.0.18吕吕吕93
8.0.18aaa33

根据MySQL官网的资料显示,text的长度是 L < 2^16 = 65536个字节长度,注意这里是字节长度,而不是字符长度(varchar(n)这里的n是字符长度),所以说 text 在 character_set_database->utf8mb4 字符集下,大约能存 65535 / 3 = 21845个汉字 并不多,如果存json串实际上并存储不了多大的json对象尤其是包含中文比较多的情况下。

String Type Storage Requirements

In the following table, M represents the declared column length in characters for nonbinary string types and bytes for binary string types. L represents the actual length in bytes of a given string value.

Data TypeStorage Required
CHAR(*M*)The compact family of InnoDB row formats optimize storage for variable-length character sets. See COMPACT Row Format Storage Characteristics. Otherwise, M w bytes, <= *M* <= 255, where w is the number of bytes required for the maximum-length character in the character set.
BINARY(*M*)M bytes, 0 <= *M* <= 255
VARCHAR(*M*), VARBINARY(*M*)L + 1 bytes if column values require 0 − 255 bytes, L + 2 bytes if values may require more than 255 bytes
TINYBLOB, TINYTEXTL + 1 bytes, where L < 2^8
BLOB, TEXTL + 2 bytes, where L < 2^16
MEDIUMBLOB, MEDIUMTEXTL + 3 bytes, where L < 2^24
LONGBLOB, LONGTEXTL + 4 bytes, where L < 2^32
ENUM('*value1*','*value2*',...)1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum)
SET('*value1*','*value2*',...)1, 2, 3, 4, or 8 bytes, depending on the number of set members (64 members maximum)

 

二、验证案例

(一)表结构DDL和Java Entity

CREATE TABLE `appro_flow_main` (
`id` int(9) NOT NULL AUTO_INCREMENT COMMENT '物理主键id',
`flow_instance_no` varchar(64) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '流程实例号',
`flow_instance_name` varchar(128) COLLATE utf8mb4_general_ci NOT NULL COMMENT '流程实例名称',
`flow_no` varchar(32) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '流程号',
`appro_status` char(1) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '流程审核状态(进行中:1、已结束:0、保存:2、撤销)',
`submit_info` text COLLATE utf8mb4_general_ci COMMENT '提交信息json格式【元数据格式:name、key、value、type】',
`sponsor` varchar(128) COLLATE utf8mb4_general_ci DEFAULT NULL COMMENT '流程发起人',
`initi_date` int(8) DEFAULT NULL COMMENT '流程发起日期',
`create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
`update_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '更新时间',
`table_id` varchar(128) COLLATE utf8mb4_general_ci DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uk_fin_fn` (`flow_instance_no`,`flow_no`)
) ENGINE=InnoDB AUTO_INCREMENT=37 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci COMMENT='审批流程主表'
/**
* 流程主表
*
* @author lvzb
* @date 2022/08/09  10:40
**/
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
@TableName("appro_flow_main")
public class FlowMain implements Serializable {

  @TableId(type = IdType.AUTO)
  private Integer id;
  private String flowInstanceNo;
  private String flowInstanceName;
  private String flowNo;
  private String approStatus;
  private String submitInfo;
  private String sponsor;
  private Integer initiateDate;
  private Integer branch;
  private Date createTime;
  private Date updateTime;

}

(二)Java代码

使用的mybatis-plus 这里只给出核心代码。注意如果for循环里 i 从 0.0开始的话,第一条insert就会直接抛异常 text的最大长度是65535个字节长度,2^16 = 65536

    @Test
  void createWorld() {
      double num = Math.pow(2, 16);
      System.out.println("长度:" + num);
      StringBuilder sb = new StringBuilder();
      StringBuilder cb = new StringBuilder();
      for (double i = 1.0; i < num; i++) {
          sb.append("吕");
          cb.append("a");
      }

      flowMainMapper.insert(FlowMain.builder()
              .flowNo("" + System.nanoTime())
              .flowInstanceName("a")
              .flowInstanceNo("aaa")
              .submitInfo(cb.toString())
              .build());

      flowMainMapper.insert(FlowMain.builder()
              .flowNo("" + System.nanoTime())
              .flowInstanceName("a")
              .flowInstanceNo("aaa")
              .submitInfo(sb.toString())
              .build());
  }

控制台打印如下:

长度:65536.0
Creating a new SqlSession
SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@38f502fc] was not registered for synchronization because synchronization is not active
2022-10-27 15:57:53.448 INFO 58032 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2022-10-27 15:57:53.922 INFO 58032 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
JDBC Connection [HikariProxyConnection@1440325059 wrapping com.mysql.cj.jdbc.ConnectionImpl@521a506c] will not be managed by Spring
==> Preparing: INSERT INTO appro_flow_main ( flow_instance_no, flow_instance_name, flow_no, submit_info ) VALUES ( ?, ?, ?, ? )
==> Parameters: aaa(String), a(String), 360639398372500(String), aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...........
<== Updates: 1
Closing non transactional SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@38f502fc]
Creating a new SqlSession
SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@4fc3529] was not registered for synchronization because synchronization is not active
JDBC Connection [HikariProxyConnection@954492773 wrapping com.mysql.cj.jdbc.ConnectionImpl@521a506c] will not be managed by Spring
==> Preparing: INSERT INTO appro_flow_main ( flow_instance_no, flow_instance_name, flow_no, submit_info ) VALUES ( ?, ?, ?, ? )
==> Parameters: aaa(String), a(String), 360640025716500(String), 吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕吕..............
Closing non transactional SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@4fc3529]


org.springframework.dao.DataIntegrityViolationException:
### Error updating database. Cause: com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'submit_info' at row 1
### The error may exist in demo/mybatis/plus/dao/mapper/FlowMainMapper.java (best guess)
### The error may involve demo.mybatis.plus.dao.mapper.FlowMainMapper.insert-Inline
### The error occurred while setting parameters
### SQL: INSERT INTO appro_flow_main ( flow_instance_no, flow_instance_name, flow_no, submit_info ) VALUES ( ?, ?, ?, ? )
### Cause: com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'submit_info' at row 1
; Data truncation: Data too long for column 'submit_info' at row 1; nested exception is com.mysql.cj.jdbc.exceptions.MysqlDataTruncation: Data truncation: Data too long for column 'submit_info' at row 1

 

三、结论

实际上根据 以上的分析

  • text 可以存储 65535个字节=> 65535 (byte)/1024 ≈ 64KB 大小
  • 在utf8mb4的字符集下 text 最多可以存储 65535 / 3(一个汉字的占用字节长度) = 21845, 2万个左右的中文字符

由上可知,比较大的JSON串尤其是含中文较多的不适合使用text进行存储。根据需求使用 更长的 MEDIUMTEXT、LONGTEXT。

参考资料

  • MySQL官网|String Type Storage Requirements
  • byte 和 KB 换算 参考

 

总结

关于MySQLtext能存多少个字符的文章就介绍至此,更多相关MySQLtext存多少字符内容请搜索编程宝库以前的文章,希望以后支持编程宝库

当开启MySQL数据库主从时,会产生大量如mysql-bin.00000* log的文件,这会大量耗费您的硬盘空间。mysql-bin.000001mysql-bin.000002mysql-b ...