12

In Windows 7, if you open Device Manager, bring up the properties of a disk, and go to the Policies tab, there are two checkboxes: the write cache, which this question is not about,

and

[X] Turn off Windows write-cache buffer flushing on the device <--- this one only!

Microsoft puts a disclaimer on the tab for that item. "To prevent data loss, do not select this checkbox unless the device has a separate power supply that allows the device to flush its buffer in case of a power loss."

In simple terms, what does this change for file writing, file saving, and file copying?

1. Changing write actions for paranoid programs (fact or fiction):
Does it change the way write flushes work for a program that forces a cache flush? Some programs are very intent on finishing the write without speculating. Are these programs still able to continue their protective writing, or does the behavior change for them too?

2. Types of programs affected:
What types of actions or programs would or would not be affected by the change? By type I mean: some programs stream, some do quick write-outs, some write continuously, some are protective (or any other type you could define in simple terms).

3. Did you see anything, or even a benchmark:
If the setting is on, what are the observable changes in writing? Are there any loose examples of an observed change in behavior, or of no observed change?

4. What is the holdup or delay:
We know most of these actions are very fast on most computers, and the data will eventually be written. Relative to the speed of the drive, is the amount of time significant?

For the purposes of my question, the risk involved is not one of the questions. If you would like to cover it, it would not get in the way.

The question What does "Write cache buffer flushing" mean is almost a duplicate of this one, but it is about a different OS. Although its answer has some information, even the term used there is not the same. It also does not answer the most significant things that a user would want to know, which I have tried to outline here.

Psycogeek
  • 2
    http://serverfault.com/questions/65096/battery-backed-write-cache – Zoredache Feb 16 '12 at 07:56
  • 1
    NTFS uses journaling to protect against filesystem metadata corruption (though file contents are not journaled), but it only works if certain writes can be guaranteed to happen in the correct order, and Windows flushes the write cache at certain times to ensure correct ordering. – David Dec 23 '14 at 15:20

2 Answers

9
  1. Your assertion in the first question is fiction. Windows API calls such as FlushFileBuffers() will still ensure that the data gets all the way out to the physical media, even with write buffer flushing disabled. So, programs that are "safe" and know what they're doing are going to be just fine. Calls such as FileStream.Flush() in .NET, etc. eventually call this API.

  2. Programs that do a lot of disk I/O without calling FlushFileBuffers() directly, or any helper API that eventually calls it, would see the most noticeable performance increase. For instance, if you were running non-essential I/O where it's okay if data gets lost, such as BOINC (if it gets lost you just re-download the file or re-compute the calculations), you could avoid calling FlushFileBuffers() and just call an API like WriteFile(). The data will get buffered to be written, but it may not actually be written for a long time, for example not until the file handle is closed or the program exits. Unfortunately, if the system crashes in the meantime (such as with a BSOD), all of that data can be lost. So if you are dealing with any kind of valuable or non-replaceable data, it is really important that you do call FlushFileBuffers(), whether buffer flushing is enabled or not! Otherwise a simple driver bug (for instance in your graphics driver) could cause you to lose a lot of data.

  3. Can't find any benchmarks, but you'll notice it more with programs that fit the description in the second item above.

  4. Syncing data to disk isn't actually that fast, especially if it is done frequently in a tight loop. If I recall correctly from reading the Windows Internals books, NTFS by default syncs all dirty filesystem buffers to disk every 5 seconds. This is apparently a decent tradeoff between stability and performance. The problem with syncing data frequently is that it makes the hard drive do a lot of seeks and writes.

Consider the following pseudocode:

1: seek to a certain block (1)
2: write a couple megabytes of data into blocks starting at (1)
3: wait 2 seconds
4: seek to another block (2)
5: write some more megabytes of data into blocks starting at (2)
6: seek back to block (1)
7: write some more megabytes of data into blocks starting at (1)
8: wait 10 minutes
9: seek to block (1)
10: write some megabytes of data into blocks starting at (1)
11: wait 5 seconds
12: seek to block (2)
13: write some megabytes of data into blocks starting at (2)
14: explicit call to FlushFileBuffers()

With automatic 5 second buffer flushing on:

  • The writes on lines 2, 5 and 7 occur in RAM and the disk doesn't move until 5 seconds have elapsed since the first write. At that point the latest data for block (1) (from line 7) and the only data for block (2) (from line 5) get written to disk.
  • The writes on lines 10 and 13, which overwrite data in blocks (1) and (2), have to be written out to disk again.
  • So block (1) was written to RAM 3 times and to disk 2 times. Block (2) was written to RAM 2 times and to disk 2 times.

With automatic 5 second buffer flushing off (the effect of the checkbox in your question):

  • The writes occurring on lines 2, 5, 7, 10 and 13 occur in RAM and the disk doesn't move, until line 14 is executed, and then the latest data (from lines 10 and 13) gets written into blocks (1) and (2). The old data from lines 2, 5, and 7 never hits the hard disk!

Considering that a busy system can experience anywhere from hundreds to tens of thousands of file writes per second, this is great for performance, especially on traditional spinning hard drives (it's less impressive on SSDs). As a rough general measure, RAM is about 20 times faster than a hard drive, although the gap is smaller with SSDs.

The reason they say you should use a battery backup is that you don't want to have 35 minutes' worth of written data buffered in RAM, not yet written to disk just because your programmer was lazy and didn't call FlushFileBuffers(), and then have a power failure. Of course, a battery backup doesn't protect you against driver bugs that cause a BSOD.

0

In support of ChatBot John Cavil’s answer, I've written a little test program:

// ...
byteEx btTest;
btTest.resize(1024*1024, 0xff); // 1MB data

CSysFile sfTest(byT("test.bin"));

swTest.Start(); // Start timing (uses the `QueryPerformanceCounter` API)
for (UINT i=0; i<10000; ++i) // Write the 1MB buffer 10000 times
{
    sfTest.SeekBegin();   // Rewind so every pass overwrites the same 1MB
    sfTest.Write(btTest); // Calls the `WriteFile` API
//  sfTest.Flush();       // Calls the `FlushFileBuffers` API (uncommented for the second run)
}
swTest.Stop(); // Stop timing; elapsed time is measured from `swTest.Start()`
// ...

I ran it on a Samsung 950 PRO NVMe disk with the “Turn off Windows write-cache buffer flushing on the device” option enabled.

The result is:

D:\tmp> test        // without sfTest.Flush();
00:00:00.729766     // use 0.73 seconds without FlushFileBuffers()

D:\tmp> test        // with sfTest.Flush();
00:00:06.736167     // use 6.74 seconds with FlushFileBuffers()

So you can see that the FlushFileBuffers request is not dropped by the system: Windows does not ignore a FlushFileBuffers call even when the option is enabled.

ASBai
  • Please remove your commentary from your answer. It is *never* acceptable to submit commentary as an answer. – Ramhound Apr 10 '17 at 17:14
  • @ASBai: (1) I know C++ (I *assume* that’s what your program is written in), but I don’t know the Windows API.  Can you explain your code a bit?  (Bear in mind that some users of [SU] are not programmers at all, *per se*.)  In particular, what is `swTest` (and why is it not declared)? (2) Are you saying that you made two copies of your program, one including the `sfTest.Flush()` call and one not (i.e., with it commented out), and compared them?  Please explain. (3) I know English, but I cannot understand your last sentence. – Scott - Слава Україні Apr 10 '17 at 17:35
  • @Ramhound But I don't have enough reputation to vote or leave a comment. How do I solve that? – ASBai Apr 11 '17 at 17:31
  • @Scott (1) swTest is a high-resolution timer; it uses the QueryPerformanceCounter API on Windows to do the timing (I think it is not a critical point :-). (2) Yes, exactly. (3) Sorry for my bad English. I just want to say: ChatBot John Cavil is right, Windows does not ignore the `FlushFileBuffers` call even if the option is enabled (I saw some other sources that said the call would be ignored when this option is enabled). I will add some more comments to the answer, thanks :-) – ASBai Apr 11 '17 at 17:46
  • @ASBai - You should not submit commentary as an answer. It does not really matter that you don't have the reputation required to submit a comment, because a comment should never be submitted as an answer. – Ramhound Apr 11 '17 at 22:44
  • @Ramhound (1) As a new user on this site, I can only do two things: 1. ask a new question, 2. post a new answer. So what would you do in this situation? Just be quiet? (2) My post is a long text and needs to be well formatted for readability; it does not fit within the constraints of a comment. – ASBai Apr 12 '17 at 16:04
  • @ASBai - The commentary was removed by an editor, which is why the answer currently has none. I was responding to your "I don't have enough reputation to submit a comment" with the simple response that commentary should not be submitted as an answer. If a question requires a comment from you for clarification, move on to the next question that doesn't, so that you can simply answer it. I encourage you to submit answers, but our standards are high, so comments submitted as answers will simply be deleted. – Ramhound Apr 12 '17 at 16:16
  • @Ramhound At least 95% of my current answer is unchanged. So you no longer consider it a "comment"? In other words, you think it was a comment only because the first sentence of the original said I don't have enough permission to post a comment? – ASBai Apr 13 '17 at 17:20
  • As I said, the only reason I even replied is that you pinged me and indicated that your answer contained commentary because of your reputation. My response is that this isn't a valid reason, and I kindly explained that commentary should not be submitted as an answer. Yes, you have resolved the problem. – Ramhound Apr 13 '17 at 17:25