菜单

PHPExcel解决内存占用过大问题-设置单元格对象缓存

2012年06月6日 - php

PHPExcel是一个很强大的处理Excel的PHP开源类,但是很大的一个问题就是它占用内存太大,从1.7.3开始,它支持设置cell的缓存方式,但是推荐使用目前稳定的版本1.7.6,因为之前的版本都会不同程度的存在bug,以下是其官方文档:

PHPExcel1.7.6官方文档 写道

PHPExcel uses an average of about 1k/cell in your worksheets, so large workbooks can quickly use up available memory. Cell caching provides a mechanism that allows PHPExcel to maintain the cell objects in a smaller size of memory, on disk, or in APC, memcache or Wincache, rather than in PHP memory. This allows you to reduce the memory usage for large workbooks, although at a cost of speed to access cell data.

PHPExcel平均下来使用1k/单元格的内存,因此大的文档会导致内存消耗的也很快。单元格缓存机制能够允许PHPExcel将内存中的小的单元格对象缓存在磁盘或者APC,memcache或者Wincache中,尽管会在读取数据上消耗一些时间,但是能够帮助你降低内存的消耗。

PHPExcel1.76.官方文档 写道
By default, PHPExcel still holds all cell objects in memory, but you can specify alternatives. To enable cell caching, you must call the PHPExcel_Settings::setCacheStorageMethod() method, passing in the caching method that you wish to use.

默认情况下,PHPExcel依然将单元格对象保存在内存中,但是你可以自定义。你可以使用PHPExcel_Settings::setCacheStorageMethod()方法,将缓存方式作为参数传递给这个方法来设置缓存的方式。

Php代码

$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_in_memory;  
PHPExcel_Settings::setCacheStorageMethod($cacheMethod);  

PHPExcel1.7.6官方文档 写道
setCacheStorageMethod() will return a boolean true on success, false on failure (for example if trying to cache to APC when APC is not enabled).
setCacheStorageMethod()方法会返回一个BOOL型变量用于表示是否成功设置(比如,如果APC不能使用的时候,你设置使用APC缓存,将会返回false)

PHPExcel1.7.6官方文档 写道
A separate cache is maintained for each individual worksheet, and is automatically created when the worksheet is instantiated based on the caching method and settings that you have configured. You cannot change the configuration settings once you have started to read a workbook, or have created your first worksheet.

每一个worksheet都会有一个独立的缓存,当一个worksheet实例化时,就会根据设置或配置的缓存方式来自动创建。一旦你开始读取一个文件或者你已经创建了第一个worksheet,就不能在改变缓存的方式了。

PHPExcel1.7.6官方文档 写道
Currently, the following caching methods are available.
目前,有以下几种缓存方式可以使用:

Php代码
PHPExcel_CachedObjectStorageFactory::cache_in_memory;
PHPExcel1.7.6官方文档 写道
The default. If you don’t initialise any caching method, then this is the method that PHPExcel will use. Cell objects are maintained in PHP memory as at present.
默认情况下,如果你不初始化任何缓存方式,PHPExcel将使用内存缓存的方式。
===============================================

Php代码
PHPExcel_CachedObjectStorageFactory::cache_in_memory_serialized;
PHPExcle1.7.6官方文档 写道
Using this caching method, cells are held in PHP memory as an array of serialized objects, which reduces the memory footprint with minimal performance overhead.
使用这种缓存方式,单元格会以序列化的方式保存在内存中,这是降低内存使用率性能比较高的一种方案。
===============================================

Php代码
PHPExcel_CachedObjectStorageFactory::cache_in_memory_gzip;
PHPExcel1.7.6官方文档 写道
Like cache_in_memory_serialized, this method holds cells in PHP memory as an array of serialized objects, but gzipped to reduce the memory usage still further, although access to read or write a cell is slightly slower.
与序列化的方式类似,这种方法在序列化之后,又进行gzip压缩之后再放入内存中,这回跟进一步降低内存的使用,但是读取和写入时会有一些慢。

===========================================================

Php代码
PHPExcel_CachedObjectStorageFactory::cache_to_discISAM;
PHPExcel1.7.6官方文档 写道
When using cache_to_discISAM all cells are held in a temporary disk file, with only an index to their location in that file maintained in PHP memory. This is slower than any of the cache_in_memory methods, but significantly reduces the memory footprint.
The temporary disk file is automatically deleted when your script terminates.
当使用cache_to_discISAM这种方式时,所有的单元格将会保存在一个临时的磁盘文件中,只把他们的在文件中的位置保存在PHP的内存中,这会比任何一种缓存在内存中的方式都慢,但是能显著的降低内存的使用。临时磁盘文件在脚本运行结束是会自动删除。

===========================================================

Php代码
PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
PHPExcel1.7.6官方文档 写道
Like cache_to_discISAM, when using cache_to_phpTemp all cells are held in the php://temp I/O stream, with only an index to their location maintained in PHP memory. In PHP, the php://memory wrapper stores data in the memory: php://temp behaves similarly, but uses a temporary file for storing the data when a certain memory limit is reached. The default is 1 MB, but you can change this when initialising cache_to_phpTemp.
类似cache_to_discISAM这种方式,使用cache_to_phpTemp时,所有的单元格会还存在php://temp I/O流中,只把他们的位置保存在PHP的内存中。PHP的php://memory包裹器将数据保存在内存中,php://temp的行为类似,但是当存储的数据大小超过内存限制时,会将数据保存在临时文件中,默认的大小是1MB,但是你可以在初始化时修改它:
Php代码
$cacheMethod = PHPExcel_CachedObjectStorageFactory:: cache_to_phpTemp;
$cacheSettings = array( ‘ memoryCacheSize ‘ => ‘8MB’
);
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
PHPExcel1.7.6官方文档 写道
The php://temp file is automatically deleted when your script terminates.
php://temp文件在脚本结束是会自动删除。

===========================================================

Php代码
PHPExcel_CachedObjectStorageFactory::cache_to_apc;
PHPExcle1.7.6官方文档 写道
When using cache_to_apc, cell objects are maintained in APC with only an index maintained in PHP memory to identify that the cell exists. By default, an APC cache timeout of 600 seconds is used, which should be enough for most applications: although it is possible to change this when initialising cache_to_APC.
当使用cach_to_apc时,单元格保存在APC中,只在内存中保存索引。APC缓存默认超时时间时600秒,对绝大多数应用是足够了,当然你也可以在初始化时进行修改:
Php代码
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_APC;
$cacheSettings = array( ‘cacheTime’ => 600
);
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
PHPExcel1.7.6官方文档 写道
When your script terminates all entries will be cleared from APC, regardless of the cacheTime value, so it cannot be used for persistent storage using this mechanism.
当脚本运行结束时,所有的数据都会从APC中清楚(忽略缓存时间),不能使用此机制作为持久缓存。

===========================================================
Php代码
PHPExcel_CachedObjectStorageFactory::cache_to_memcache
PHPExcel1.7.6官方文档 写道
When using cache_to_memcache, cell objects are maintained in memcache with only an index maintained in PHP memory to identify that the cell exists.
By default, PHPExcel looks for a memcache server on localhost at port 11211. It also sets a memcache timeout limit of 600 seconds. If you are running memcache on a different server or port, then you can change these defaults when you initialise cache_to_memcache:
使用cache_to_memory时,单元格对象保存在memcache中,只在内存中保存索引。默认情况下,PHPExcel会在localhost和端口11211寻找memcache服务,超时时间600秒,如果你在其他服务器或其他端口运行memcache服务,可以在初始化时进行修改:
Php代码
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_memcache;
$cacheSettings = array( ‘memcacheServer’ => ‘localhost’,
‘memcachePort’ => 11211,
‘cacheTime’ => 600
);
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
从初始化设置的形式上看,MS还不支持多台memcache服务器轮询的方式,比较遗憾。
PHPExcel1.7.6官方文档 写道
When your script terminates all entries will be cleared from memcache, regardless of the cacheTime value, so it cannot be used for persistent storage using this mechanism.
当脚本结束时,所有的数据都会从memcache清空(忽略缓存时间),不能使用该机制进行持久存储。

===========================================================
Php代码
PHPExcel_CachedObjectStorageFactory::cache_to_wincache;
PHPExcel1.7.6官方文档 写道
When using cache_to_wincache, cell objects are maintained in Wincache with only an index maintained in PHP memory to identify that the cell exists. By default, a Wincache cache timeout of 600 seconds is used, which should be enough for most applications: although it is possible to change this when initialising cache_to_wincache.
使用cache_towincache方式,单元格对象会保存在Wincache中,只在内存中保存索引,默认情况下Wincache过期时间为600秒,对绝大多数应用是足够了,当然也可以在初始化时修改:
Php代码
$cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_wincache;
$cacheSettings = array( ‘cacheTime’ => 600
);
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);
PHPExcel官方文档1.7.6 写道
When your script terminates all entries will be cleared from Wincache, regardless of the cacheTime value, so it cannot be used for persistent storage using this mechanism.

呃,终于“又”写完了,之前写的一版,有的文字是直接从word粘过来的,带了一大堆格式,被截断了,悲了个剧的,翻译文档也不是件容易的事啊……PHPExcel还是比较强大的,最大的问题就是内存占用的问题,我之前用的1.7.2,还没有这种机制,导出2W+数据,占用了400M+内存,改成1.7.6,使用cach_to_diskISAM方式,内存降低到200M-,效果还是很明显的,不过依然还是够高的,excel文件5.1M,就使用了200M-和未知大小的磁盘空间,PHPExcel啥时候能出一个轻量级的版本,不需要那么多花哨的功能,只需要导出最普通的数据的版本就好了!

发表评论

电子邮件地址不会被公开。 必填项已用*标注