On Wed, 2011-04-20 at 23:40 -0400, Christoph Hellwig wrote:
> > +#define WRITE_BARRIER (REQ_WRITE | REQ_FLUSH | REQ_FUA)
> Any in case you want to keep usingthis POS out of tree you really need
> to fix this. REQ_FLUSH | REQ_FUA is not equivalent to the old barrier
> semantics that the Xen disk protocol exported to the guest. You need
> to drain the whole queue of outstanding I/Os as some old guest (at least
> those using reiserfs, e.g. old SLES) rely on it.
Yes, everybody is aware that the semantics were broken. But note it's
not even a consistency issue at this point, because there's currently no
frontend which relies on the original ordering semantics either. Take
xen-blkfront, since blk_flush it uses the barrier op for a flush, being
just a superset when ordering is enforced.
>From here on, there's two ways we can proceed:
We can add the reorderable flush command you asked for a while ago.
But before we just enumerate a new command, a potentially more viable
option would be FLUSH+FUA flags on the WRITE operation. As if mapping
The advantage is that it avoids the extra round trip implied by having
the frontend driving writes through FSEQ_PREFLUSH on their own. I'd
expect that to make much more of a performance difference. Somewhat
differentiating PV from the low physical layer.
Would you, maybe did you, consider that? I think it sounds interesting
enough to gather performance data, just asking beforehand.
Blk-flush presently has no concept of _not_ sequencing the preflush, but
eventually adding a REQ_FLUSH_FUA bit to the queue settings doesn't look
like a big diff compared to what's currently going on.
Note that it'd only be a device option for backends to offer, or not.
Userspace backends will almost certainly prefer to start out with FLUSH,
maybe FUA, because it's stateless and just maps to fdatasync().
Xen-devel mailing list