Reading and writing files in Java: notes on writing Java file I/O code

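Contents

Background
Scenario analysis
  Scenario 1: small single-file compression
    Method 1: the approach circulating online (the legend passed around online is really a rose with thorns)
    Method 2: using a buffer
    Method 3: using a channel
    Method 4: using mmp
  Scenario 2: large single-file compression
  Scenario 3: large multi-file compression
Analysis and conclusions
The secrets behind it
  1. The approach circulating online
  2. Using a buffer
  3. Using a channel
  4. Using mmp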

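Background

I have recently been exploring Kafka. Why is it so fast? What is the secret behind it?

With that curiosity, I started peeling it apart layer by layer, like an onion, to uncover the real reasons Kafka outperforms other message queues. Having understood them, I have to say: Kafka is the GOAT (yyds).

I learned about its use of sequential writes.

I discovered its use of sparse indexes.

I learned the power of its zero-copy technique.

And I came across a stroke of genius: mmp (memory-mapped files, more commonly abbreviated mmap).

If mmp is this powerful, could applying it to file compression make compression fast as well?

With that question in mind, I decided to test it in practice (otherwise our knowledge is only armchair theory).

Coding is our core skill, and curiosity is our lasting weapon. We must not lose either.

A BA once told me about his experience: after moving from DEV to BA, his coding grew rusty. He then made himself pick up one small requirement in every iteration to keep in practice.

A senior colleague once told me: even if you grow into an architect, or higher, you must not lose coding as your weapon. Otherwise you end up in an awkward position: being called a "requirements translator".

This is not chicken soup for the soul. It is heartfelt advice, and I have come to realize that coding really is a craft of lifelong learning.

Seeing many excellent colleagues leave, and talking with them, has made me feel this even more strongly.

So please remember: when you learn something, make the effort to apply it, because that is how it sticks. And do not settle for a superficial understanding: learn the topic thoroughly, including why things are done the way they are.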

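Scenario analysis

Scenario 1: small single-file compression

Method 1: the approach circulating online

    // Needs java.io.* and java.util.zip.*; INPUT_FILE and printResult(...)
    // are referenced by the author's code but not shown in this article.
    public void zipFileWithoutBuffer(String outFile) {
        long beginTime = System.currentTimeMillis();
        File zipFile = new File(outFile);
        File inputFile = new File(INPUT_FILE);
        try (ZipOutputStream zipOutputStream = new ZipOutputStream(new FileOutputStream(zipFile))) {
            try (InputStream inputStream = new FileInputStream(inputFile)) {
                zipOutputStream.putNextEntry(new ZipEntry(inputFile.getName()));
                int temp;
                // Reads and writes a single byte per iteration, with no buffering.
                while ((temp = inputStream.read()) != -1) {
                    zipOutputStream.write(temp);
                }
            }
            printResult(beginTime, "withoutBuffer");
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("error" + e.getMessage());
        }
    }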

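Method 2: using a buffer

Xiao Wang was pleased: he committed the code and moved the requirement to "ready for acceptance".

Xiao Hua, a senior technical expert on the team, was taken aback during code review: "Did you find this online? It is going to be slow. Look into it a bit more."

So Xiao Wang revised the code to use the buffering provided by BufferedOutputStream.

Result (much faster):

zipMethod=withBuffer

costTime=5170ms

The code: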

    public void zipFileWithBuffer(String outFile) {
        long beginTime = System.currentTimeMillis();
        File zipFile = new File(outFile);
        File inputFile = new File(INPUT_FILE);
        try (ZipOutputStream zipOutputStream = new ZipOutputStream(new FileOutputStream(zipFile));
             BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(zipOutputStream)) {
            try (BufferedInputStream bufferedInputStream = new BufferedInputStream(new FileInputStream(inputFile))) {
                zipOutputStream.putNextEntry(new ZipEntry(inputFile.getName()));
                int temp;
                while ((temp = bufferedInputStream.read()) != -1) {
                    bufferedOutputStream.write(temp);
                }
            }
            printResult(beginTime, "withBuffer");
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("error" + e.getMessage());
        }
    }

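Method 3: using a channel

Nervously, Xiao Wang called everyone together for another code walkthrough.

Xiao Hua: "The speed requirement is not that strict. This is good enough, you can commit the code."

But I had been studying Kafka and had come across NIO, which offers a mechanism called a channel (Channel).

Result (very fast):

zipMethod=withChannel

costTime=1642ms

The code: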

    // Additionally needs java.nio.channels.Channels, FileChannel and WritableByteChannel.
    public void zipFileWithChannel(String outFile) {
        long beginTime = System.currentTimeMillis();
        File zipFile = new File(outFile);
        File inputFile = new File(INPUT_FILE);
        try (ZipOutputStream zipOutputStream = new ZipOutputStream(new FileOutputStream(zipFile));
             WritableByteChannel writableByteChannel = Channels.newChannel(zipOutputStream)) {
            try (FileChannel fileChannel = new FileInputStream(inputFile).getChannel()) {
                zipOutputStream.putNextEntry(new ZipEntry(inputFile.getName()));
                // transferTo is allowed to transfer fewer bytes than requested;
                // this single call relies on it moving the whole file at once.
                fileChannel.transferTo(0, inputFile.length(), writableByteChannel);
            }
            printResult(beginTime, "withChannel");
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("error" + e.getMessage());
        }
    }
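The Javadoc for FileChannel.transferTo states that fewer bytes than requested may be transferred, so a more defensive version would call it in a loop. The following is only a sketch of that idea, not the author's benchmarked code: the class name ChannelZip and the method zipWithChannel are hypothetical, and it uses java.nio.file.Path where the article's code uses File.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of the article's code.
final class ChannelZip {

    /** Zips one file into zipFile, repeating transferTo until every byte has been moved. */
    static void zipWithChannel(Path input, Path zipFile) throws IOException {
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zipOut = new ZipOutputStream(out);
             FileChannel source = FileChannel.open(input, StandardOpenOption.READ)) {
            zipOut.putNextEntry(new ZipEntry(input.getFileName().toString()));
            // Not closed on its own: closing this channel would also close zipOut.
            WritableByteChannel target = Channels.newChannel(zipOut);
            long size = source.size();
            long position = 0;
            while (position < size) {
                long transferred = source.transferTo(position, size - position, target);
                if (transferred <= 0) {
                    throw new IOException("transferTo made no progress at position " + position);
                }
                position += transferred;
            }
            zipOut.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempFile("channel-zip", ".csv");
        Path zip = Files.createTempFile("channel-zip", ".zip");
        Files.write(input, "id,name\n1,kafka\n".getBytes());
        zipWithChannel(input, zip);
        System.out.println("zip size in bytes: " + Files.size(zip));
    }
}
```

The loop matters mostly for very large inputs; for the file sizes measured in this article a single call is likely to behave the same way.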

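Method 4: using mmp

While studying Kafka I learned not only about NIO channels but also about another technique: mmp (memory-mapped files).

Result (very fast):

zipMethod=withMmp

costTime=1554ms

The code: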

    // Additionally needs java.io.RandomAccessFile and java.nio.MappedByteBuffer.
    public void zipFileWithMmp(String outFile) {
        long beginTime = System.currentTimeMillis();
        File zipFile = new File(outFile);
        File inputFile = new File(INPUT_FILE);
        // The RandomAccessFile is declared as a resource so that it is closed on exit
        // (the flattened original never closed it).
        try (ZipOutputStream zipOutputStream = new ZipOutputStream(new FileOutputStream(zipFile));
             WritableByteChannel writableByteChannel = Channels.newChannel(zipOutputStream);
             RandomAccessFile randomAccessFile = new RandomAccessFile(INPUT_FILE, "r")) {
            zipOutputStream.putNextEntry(new ZipEntry(inputFile.getName()));
            // A single mapping can cover at most Integer.MAX_VALUE bytes.
            MappedByteBuffer mappedByteBuffer = randomAccessFile.getChannel()
                    .map(FileChannel.MapMode.READ_ONLY, 0, inputFile.length());
            writableByteChannel.write(mappedByteBuffer);
            printResult(beginTime, "withMmp");
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("error" + e.getMessage());
        }
    }
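Because FileChannel.map rejects a size above Integer.MAX_VALUE, the method above cannot map a file larger than about 2 GB in one go. One possible workaround, sketched here under assumptions, is to map the file in consecutive windows. The class MappedZip, the method zipWithMappedWindows and the windowSize parameter are hypothetical names introduced for illustration only.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of the article's code.
final class MappedZip {

    /** Zips one file by mapping it window by window instead of in a single mapping. */
    static void zipWithMappedWindows(Path input, Path zipFile, long windowSize) throws IOException {
        if (windowSize <= 0 || windowSize > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("windowSize must be in 1..Integer.MAX_VALUE");
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zipOut = new ZipOutputStream(out);
             FileChannel source = FileChannel.open(input, StandardOpenOption.READ)) {
            zipOut.putNextEntry(new ZipEntry(input.getFileName().toString()));
            // Not closed on its own: closing this channel would also close zipOut.
            WritableByteChannel target = Channels.newChannel(zipOut);
            long size = source.size();
            for (long position = 0; position < size; position += windowSize) {
                long length = Math.min(windowSize, size - position);
                MappedByteBuffer window = source.map(FileChannel.MapMode.READ_ONLY, position, length);
                // Drain the whole window before mapping the next one.
                while (window.hasRemaining()) {
                    target.write(window);
                }
            }
            zipOut.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempFile("mapped-zip", ".csv");
        Path zip = Files.createTempFile("mapped-zip", ".zip");
        Files.write(input, new byte[10_000]);
        zipWithMappedWindows(input, zip, 4096);
        System.out.println("zip size in bytes: " + Files.size(zip));
    }
}
```

The tiny window in main is only there to exercise the loop; a realistic window would be much larger, and the best size would have to be measured.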

Scenario 2: large single-file compression

1. Source file: 585 MB, CSV, a single file
2. Techniques compared: buffer, channel, mmp
3. Results:

    Buffer             Channel            mmp
    costTime=46034ms   costTime=11885ms   costTime=10810ms

Scenario 3: large multi-file compression

1. Source files: 585 MB, CSV, 5 files
2. Techniques compared: buffer, channel, mmp
3. Results:

    Buffer              Channel            mmp
    costTime=173122ms   costTime=53982ms   costTime=50543ms
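The article does not show the code used for the multi-file scenario. As a rough sketch of what the channel variant could look like, assuming each input file becomes one entry in a single archive and that the file names are distinct (duplicate entry names make ZipOutputStream throw), with MultiFileZip and zipFiles as hypothetical names:

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; the article's own multi-file code is not shown.
final class MultiFileZip {

    /** Writes every input file as its own entry into one zip archive. */
    static void zipFiles(List<Path> inputs, Path zipFile) throws IOException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(zipFile));
             ZipOutputStream zipOut = new ZipOutputStream(out)) {
            // One channel over the zip stream is reused for all entries.
            WritableByteChannel target = Channels.newChannel(zipOut);
            for (Path input : inputs) {
                zipOut.putNextEntry(new ZipEntry(input.getFileName().toString()));
                try (FileChannel source = FileChannel.open(input, StandardOpenOption.READ)) {
                    long size = source.size();
                    long position = 0;
                    while (position < size) {
                        long transferred = source.transferTo(position, size - position, target);
                        if (transferred <= 0) {
                            throw new IOException("transferTo made no progress for " + input);
                        }
                        position += transferred;
                    }
                }
                zipOut.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path first = Files.createTempFile("multi-a", ".csv");
        Path second = Files.createTempFile("multi-b", ".csv");
        Files.write(first, "a,b\n1,2\n".getBytes());
        Files.write(second, "c,d\n3,4\n".getBytes());
        Path zip = Files.createTempFile("multi", ".zip");
        zipFiles(Arrays.asList(first, second), zip);
        System.out.println("zip size in bytes: " + Files.size(zip));
    }
}
```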

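Analysis and conclusions

1. The comparison is shown in the table below.

    Scenario                                  Online approach   Buffer     Channel   mmp
    Scenario 1: small single file (60 MB)     327000ms          5170ms     1642ms    1554ms
    Scenario 2: large single file (585 MB)    --                46034ms    11885ms   10810ms
    Scenario 3: large multi-file (5 x 585 MB) --                173122ms   53982ms   50543ms
    Scenario 4: 100 KB single file            --                28ms       26ms      24ms
    Scenario 5: 5 KB single file              --                18ms       20ms      23ms
    Scenario 6: 1 KB single file              --                15ms       21ms      24ms

Conclusions:

1) The approach circulating online should not be used; it is by far the least efficient.
2) Using a buffer gives acceptable performance, but it is still much slower than the two NIO techniques (channel and mmp). The gap is most visible when compressing medium-sized files (around 500 MB), both single-file and multi-file.
3) Channel and mmp perform about the same: there is essentially no difference on small files, and the gap on large files is within a few seconds.
4) For files larger than 10 KB, the channel or mmp technique is recommended.
5) For files smaller than 10 KB, the buffer technique is recommended (it performed better than both NIO techniques).
6) If your team uses a library API for this, check whether its source uses NIO. If it does not, consider switching to the approach in this article. More generally, for any file operation it is worth trying NIO and measuring the result; in theory the gains should be considerable.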
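Conclusions 4) and 5) amount to a size-based choice of technique. A minimal sketch of that rule follows, assuming a hypothetical ZipStrategySelector class and Strategy enum. The article does not say what to do at exactly 10 KB, so treating that size as "large" is an assumption, and the threshold itself comes only from the measurements above.

```java
// Hypothetical helper for illustration; the names are not from the article.
final class ZipStrategySelector {

    enum Strategy { BUFFER, CHANNEL }

    // 10 KB threshold taken from conclusions 4) and 5); worth re-measuring on your own hardware.
    static final long THRESHOLD_BYTES = 10 * 1024;

    /** Returns BUFFER for files under the threshold and CHANNEL otherwise (mmp performed similarly). */
    static Strategy choose(long fileSizeBytes) {
        if (fileSizeBytes < 0) {
            throw new IllegalArgumentException("fileSizeBytes must not be negative");
        }
        return fileSizeBytes < THRESHOLD_BYTES ? Strategy.BUFFER : Strategy.CHANNEL;
    }

    public static void main(String[] args) {
        System.out.println("5 KB file: " + choose(5 * 1024));
        System.out.println("60 MB file: " + choose(60L * 1024 * 1024));
    }
}
```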

The secrets behind it

1. The approach circulating online

The read method of FileInputStream is as follows:

    /**
     * Reads a byte of data from this input stream. This method blocks
     * if no input is yet available.
     *
     * @return     the next byte of data, or <code>-1</code> if the end of the
     *             file is reached.
     * @exception  IOException  if an I/O error occurs.
     */
    public int read() throws IOException {
        return read0();
    }

    private native int read0() throws IOException;

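This calls a native method that interacts with the underlying operating system to read data from disk. The native method is invoked once for every byte read, and each such interaction is expensive.

Because only one byte is read per call, the overhead becomes enormous for a large file.

2. Using a buffer

The read method of BufferedInputStream is as follows: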

    /**
     * See
     * the general contract of the <code>read</code>
     * method of <code>InputStream</code>.
     *
     * @return     the next byte of data, or <code>-1</code> if the end of the
     *             stream is reached.
     * @exception  IOException  if this input stream has been closed by
     *                          invoking its {@link #close()} method,
     *                          or an I/O error occurs.
     * @see        java.io.FilterInputStream#in
     */
    public synchronized int read() throws IOException {
        if (pos >= count) {
            fill();
            if (pos >= count)
                return -1;
        }
        return getBufIfOpen()[pos++] & 0xff;
    }

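This still returns one byte per call, but it does not go to the underlying system every time. Instead, a single call to the underlying system reads up to buf.length bytes into the buf array, and later reads take one byte at a time from buf. This reduces the overhead of calling the underlying interface so frequently.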
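The effect described above can be made visible by counting how many read calls actually reach the underlying stream. The sketch below is only an illustration: it uses an in-memory ByteArrayInputStream in place of a file, so it counts calls rather than measuring real system-call cost, and ReadCallCounter and CountingInputStream are hypothetical names.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo for illustration; not part of the article's code.
final class ReadCallCounter {

    /** Wraps a stream and counts every read call that reaches it. */
    static final class CountingInputStream extends FilterInputStream {
        long calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads dataSize bytes one at a time and returns the number of calls seen by the underlying stream. */
    static long countUnderlyingCalls(int dataSize, boolean buffered) throws IOException {
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        try (InputStream in = buffered ? new BufferedInputStream(counting) : counting) {
            while (in.read() != -1) {
                // Consume byte by byte, as the loops in Methods 1 and 2 do.
            }
        }
        return counting.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered calls: " + countUnderlyingCalls(100_000, false));
        System.out.println("buffered calls:   " + countUnderlyingCalls(100_000, true));
    }
}
```

Without buffering the underlying stream is hit once per byte (plus one call that reports end of stream), while the buffered variant needs roughly one call per filled buffer.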

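3. Using a channel

When copying large files, FileChannel is nearly one third faster than BufferedInputStream/BufferedOutputStream, which shows its speed advantage. The structure of an NIO Channel matches the way the operating system performs I/O more closely, so it is noticeably faster than traditional IO.

Many operating systems can transfer bytes directly from the file-system cache to the target Channel without an actual copy step (copy here means moving the data from kernel space into user space).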
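The direct transfer described here is easiest to see when both ends are files. Whether the copy through user space is really avoided depends on the platform and on the type of the target channel; when the target wraps a ZipOutputStream, as in the compression methods above, the bytes still have to pass through the compressor in user space. A minimal file-to-file sketch, with ChannelCopy and copy as hypothetical names:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not part of the article's code.
final class ChannelCopy {

    /** Copies source to target via FileChannel.transferTo, looping until all bytes are moved. */
    static void copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long transferred = in.transferTo(position, size - position, out);
                if (transferred <= 0) {
                    throw new IOException("transferTo made no progress at position " + position);
                }
                position += transferred;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("copy-src", ".bin");
        Path target = Files.createTempFile("copy-dst", ".bin");
        Files.write(source, new byte[1_000_000]);
        copy(source, target);
        System.out.println("copied bytes: " + Files.size(target));
    }
}
```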