fwrite vs. write, fread vs. read

To read and write files we use functions such as fwrite/write and fread/read. How do these functions differ in the situations they suit?


fwrite vs. write:
The signature of fwrite is:

size_t fwrite(const void *ptr, size_t size, size_t count, FILE *stream);

While the signature of write is:

ssize_t write(int fd, const void *buf, size_t count);

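Main difference:
fwrite is a C library function, part of standard I/O, and it buffers data in user space. write is a system call and has no user-space buffer.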
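To make the difference in calling style concrete, here is a minimal sketch that writes the same message once through each interface. The helper names save_with_fwrite and save_with_write are made up for this illustration, and the code assumes a POSIX system. For brevity it treats a short write as an error instead of retrying.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: writes msg to path through the buffered stdio layer.
 * Returns 0 on success, -1 on failure. */
int save_with_fwrite(const char *path, const char *msg)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    size_t len = strlen(msg);
    /* With size 1, fwrite returns the number of bytes it accepted. */
    size_t n = fwrite(msg, 1, len, fp);
    /* fclose flushes the stdio buffer, so a failed flush shows up here. */
    if (fclose(fp) != 0 || n != len)
        return -1;
    return 0;
}

/* Hypothetical helper: writes msg to path with the write system call.
 * Returns 0 on success, -1 on failure (a short write counts as failure). */
int save_with_write(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    size_t len = strlen(msg);
    /* Each call to write enters the kernel; there is no stdio buffer. */
    ssize_t n = write(fd, msg, len);
    if (close(fd) != 0 || n != (ssize_t)len)
        return -1;
    return 0;
}
```

fwrite operates on a FILE * stream obtained from fopen, while write operates on an integer file descriptor obtained from open.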

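Let us look at these differences in more detail:
1) Standard I/O functions such as fwrite
The standard I/O library manages buffering automatically for every I/O stream (it typically calls malloc to allocate the buffer). It provides three types of buffering:
- Fully buffered. Actual I/O takes place only when the standard I/O buffer is full. Files on disk are normally fully buffered.
- Line buffered. The library performs actual I/O when a newline character is encountered on input or output, or when the buffer fills. stdin and stdout are normally line buffered when they refer to a terminal.
- Unbuffered. The library does no buffering of its own, so each call is roughly equivalent to calling read or write directly. stderr is normally unbuffered so that error messages appear as soon as possible.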
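The three buffering modes can be observed with setvbuf, which selects the mode for a stream through the constants _IOFBF, _IOLBF and _IONBF. The sketch below is only an illustration: size_before_flush is a hypothetical helper that writes some text and then checks, with stat, how many bytes have reached the file before the stream is flushed or closed.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: writes text to path under the given buffering mode
 * (_IOFBF, _IOLBF or _IONBF) and returns the file size visible before the
 * stream is flushed or closed. Returns -1 on error. */
long size_before_flush(const char *path, int mode, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    /* setvbuf must be called before any other operation on the stream.
     * Passing NULL lets the library allocate the buffer itself. */
    if (setvbuf(fp, NULL, mode, BUFSIZ) != 0 || fputs(text, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    struct stat st;
    long size = (stat(path, &st) == 0) ? (long)st.st_size : -1;
    fclose(fp);
    return size;
}
```

With the text "hello\n" on a typical Linux system, one would expect 0 for _IOFBF (the six bytes are still in the stdio buffer), and 6 for both _IOLBF (the newline triggers the write) and _IONBF (nothing is held back). The C standard leaves some of these details to the implementation, so treat the numbers as typical rather than guaranteed.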

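With fwrite, data first accumulates in the user-space standard I/O buffer. The library calls write only when that buffer fills, or when the stream is flushed or closed. write in turn only copies the data into the kernel's buffer cache. The kernel does not queue the data for disk output right away: it waits until the cache buffer is full or needs to be reused for other data, then places it on the disk output queue, and only when that request is serviced does the data actually reach the disk. This technique is called delayed write.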

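In general the system chooses the buffer size and allocates the buffer automatically, and the standard I/O library frees the buffer when the stream is closed. You can also call fflush() to push all unwritten data from the stdio buffer into the kernel buffer, and fsync() to force the kernel's buffered data for a file out to the file on disk.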
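The two-step flush described above can be sketched as follows. durable_write is a hypothetical helper name, and the code assumes a POSIX system. Note that fsync takes a file descriptor rather than a FILE *, so fileno is used to obtain the descriptor behind the stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes data to path and pushes it through both
 * buffering layers before returning. Returns 0 on success, -1 on failure. */
int durable_write(const char *path, const char *data)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    size_t len = strlen(data);
    int ok = fwrite(data, 1, len, fp) == len;
    /* Step 1: stdio buffer (user space) -> kernel buffer. */
    ok = ok && fflush(fp) == 0;
    /* Step 2: kernel buffer -> storage device. */
    ok = ok && fsync(fileno(fp)) == 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

The order matters: calling fsync without a preceding fflush would leave any data still held in the stdio buffer untouched, because the kernel has not seen it yet.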

2) System calls such as write
write is a system call. It copies data from the process's buffer into a kernel buffer.

So when should you use write, and when should you use fwrite?
A detailed analysis of this question follows.

Timing my application with an input of 10Mb in size and echoing it to /dev/null, and making sure the file is not cached, I've found that libc's fwrite is faster by a LARGE scale when using very small buffers (1 byte in this case).

fwrite works on streams, which are buffered. Therefore many small buffers will be faster because it won't run a costly system call until the buffer fills up (or you flush it or close the stream). On the other hand, small buffers being sent to write will run a costly system call for each buffer - that's where you're losing the speed. With a 1024 byte stream buffer, and writing 1 byte buffers, you're looking at 1024 write calls for each kilobyte, rather than 1024 fwrite calls turning into one write - see the difference?

For big buffers the difference will be small, because there will be less buffering, and therefore a more consistent number of system calls between fwrite and write.

In other words, fwrite(3) is just a library routine that collects up output into chunks, and then calls write(2). Now, write(2) is a system call which traps into the kernel. That's where the I/O actually happens. There is some overhead for simply calling into the kernel, and then there is the time it takes to actually write something. If you use large buffers, you will find that write(2) is faster because it eventually has to be called anyway, and if you are writing one or more times per fwrite then the fwrite buffering overhead is just that: more overhead.
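The 1-byte experiment described above can be reproduced with a rough sketch like the following. The function names time_write_bytes and time_fwrite_bytes are made up for this illustration, the code assumes a POSIX system, and the timings it returns depend entirely on the machine, so no particular ratio is promised.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in seconds. */
static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical helper: one write(2) call per byte, so n system calls.
 * Returns elapsed seconds, or -1.0 on error. */
double time_write_bytes(const char *path, size_t n)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1.0;
    const char byte = 'x';
    double start = now_sec();
    for (size_t i = 0; i < n; i++) {
        if (write(fd, &byte, 1) != 1) {
            close(fd);
            return -1.0;
        }
    }
    double elapsed = now_sec() - start;
    close(fd);
    return elapsed;
}

/* Hypothetical helper: one fwrite(3) call per byte. The library batches
 * the bytes and calls write only when its buffer fills or on fclose.
 * Returns elapsed seconds, or -1.0 on error. */
double time_fwrite_bytes(const char *path, size_t n)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1.0;
    const char byte = 'x';
    double start = now_sec();
    for (size_t i = 0; i < n; i++) {
        if (fwrite(&byte, 1, 1, fp) != 1) {
            fclose(fp);
            return -1.0;
        }
    }
    /* The final flush is part of the cost, so fclose is timed too. */
    int rc = fclose(fp);
    double elapsed = now_sec() - start;
    return rc == 0 ? elapsed : -1.0;
}
```

Calling both functions with "/dev/null" and a large n (say, ten million) and comparing the results should show the effect discussed above: the per-byte write loop pays for a kernel transition on every byte, while the fwrite loop pays for one only each time the stream buffer fills.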

If you want to read more about it, you can have a look at this document, which explains standard I/O streams.


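fread vs. read:
The same reasoning applies: fread is the buffered standard I/O library function, and read is the unbuffered system call.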
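As a small illustration of the read side, the sketch below counts the bytes in a file one byte at a time through each interface. The names count_with_fread and count_with_read are hypothetical, and the code assumes a POSIX system. Both functions return the same count; the difference is that fread serves most requests from the stream's buffer, while the read version makes one system call per byte.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: counts bytes by requesting one byte per fread call.
 * The library refills its buffer with larger reads behind the scenes.
 * Returns the byte count, or -1 on error. */
long count_with_fread(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    long total = 0;
    char c;
    while (fread(&c, 1, 1, fp) == 1)
        total++;
    /* fread returns 0 at end of file and on error; ferror tells them apart. */
    int failed = ferror(fp);
    fclose(fp);
    return failed ? -1 : total;
}

/* Hypothetical helper: counts bytes with one read(2) system call per byte.
 * Returns the byte count, or -1 on error. */
long count_with_read(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    long total = 0;
    char c;
    ssize_t n;
    /* read returns 0 at end of file and -1 on error. */
    while ((n = read(fd, &c, 1)) == 1)
        total++;
    close(fd);
    return n < 0 ? -1 : total;
}
```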