Hi Alex,
We are very sorry to have kept you waiting for so long.
>Hi Masaki,
>
> Thanks for the write-up, generally looks like a good approach to me.
>A few comments and questions:
>
> How do you plan to handle the mismatch between dom0's vCPUs and the
>pCPUs reporting errors. For instance, will all pCPU's CMCs be injected
>into dom0 vCPU0? Will all CPE records be returned from all pCPUs when
>dom0 does a SAL_GET_STATE_INFO from vCPU0? SAL_GET_STATE_INFO_SIZE may
>need to return the platform state info size * number of pCPUs to allow
>dom0 enough space to save the records. On big SMP systems we need to
>make sure that's not more than can reasonable be allocated in the kernel
>by dom0.
>
Our design is to inject all CMCs/CPEs into dom0 vCPU0. We think this is
sufficient because the goal of this initial support is logging of
hardware errors, not recovery. See the detailed flow below.
Step1: Xen receives a CMC/CPE interrupt(1)(2) from each pCPU, and
queues(3)(4) these interrupts.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| +----------------------------------+ |
+--------------------------------------+
| +-vCPU0-+ |Xen
| | | +-------+ +-------+ |
| | ----> pCPU0 ---> pCPU1 | |
| | | +-------+ +-------+ |
| +-------+ |status | |status | |
| +-------+ +-------+ |
| A A |
| +-CMC/CPE handler-|----------|-----+ |
| | |(3) |(4) | |
| | queues interrupts | |
| | with a handling state | |
| | A A | |
| +---------|----------------|-------+ |
+-----------|----------------|---------+
| +-pCPU0-+ | +-pCPU1-+ | |Hardware
| | (1)---+ | (2)---+ |
| +-+-----+ +-+-----+ |
| | | |
| +-+-----+ +-+-----+ |
| |record0| |record1| |
| +-------+ +-------+ |
+--------------------------------------+
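The queueing in Step1 could be sketched roughly as below. This is only
an illustration under our own assumptions: the names (pending_event,
event_queue, queue_event, next_event) and the MAX_PENDING limit are
hypothetical and are not taken from the Xen source, and the locking
that a real interrupt handler needs is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 64          /* hypothetical bound on queued events */

enum event_type  { EVT_CMC, EVT_CPE };
enum event_state { EVT_QUEUED, EVT_INJECTED };  /* the "handling state" */

struct pending_event {
    int pcpu;                   /* pCPU that raised the interrupt */
    enum event_type type;
    enum event_state state;
};

/* FIFO ring of events waiting to be injected into dom0 vCPU0. */
struct event_queue {
    struct pending_event slot[MAX_PENDING];
    size_t head;                /* index of the oldest event */
    size_t count;               /* number of events in the ring */
};

/* (3)(4): record one interrupt; returns -1 when the ring is full. */
int queue_event(struct event_queue *q, int pcpu, enum event_type type)
{
    size_t tail;

    assert(q != NULL);
    if (q->count == MAX_PENDING)
        return -1;
    tail = (q->head + q->count) % MAX_PENDING;
    q->slot[tail].pcpu = pcpu;
    q->slot[tail].type = type;
    q->slot[tail].state = EVT_QUEUED;
    q->count++;
    return 0;
}

/* Oldest event still waiting, or NULL when the queue is empty. */
struct pending_event *next_event(struct event_queue *q)
{
    assert(q != NULL);
    return q->count ? &q->slot[q->head] : NULL;
}
```

With this shape, an interrupt from pCPU0 followed by one from pCPU1
leaves pCPU0 at the head, which matches the order in the figure.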
Step2: Xen injects(5) the queued CMC/CPEs into dom0 vCPU0 one at a time.
Then dom0 issues(6) SAL_GET_STATE_INFO.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| | SAL_GET_STATE_INFO | |
| | (6) | |
| | | A | |
| +--|------|------------------------+ |
+--(trap)---|--------------------------+
| | | |Xen
| V | |
| +-vCPU0-+ | |
| | (5)---+ |
| | | +-------+ +-------+ |
| | ----> pCPU0 ---> pCPU1 | |
| | | +-------+ +-------+ |
| +-------+ |status | |status | |
| +-------+ +-------+ |
+--------------------------------------+
| +-pCPU0-+ +-pCPU1-+ |Hardware
| | | | | |
| +-+-----+ +-+-----+ |
| | | |
| +-+-----+ +-+-----+ |
| |record0| |record1| |
| +-------+ +-------+ |
+--------------------------------------+
Step3: Xen traps this SAL call.
If the pCPU whose SAL record is to be read is the one the vCPU
is running on, Xen issues(7) a normal SAL call on that pCPU.
Xen then copies(8) the SAL record to dom0.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| | (8) Get SAL record | |
| | A | |
| +---------|------------------------+ |
+-----------|--------------------------+
| +-vCPU0-+ | |Xen
| | | | |
| | | | +-------+ +-------+ |
| | ---|--> pCPU0 ---> pCPU1 | |
| | | | +-------+ +-------+ |
| +-------+ | |status | |status | |
| | +-------+ +-------+ |
| | |
| SAL_GET_STATE_INFO |
| (7) | |
| | [Buffer] |
| | A |
+----|------|--------------------------+
| V | |Hardware
| +-pCPU0-+ | +-pCPU1-+ |
| | | | | | |
| +-+-----+ | +-+-----+ |
| | | | |
| +-+-----+ | +-+-----+ |
| |record0+-+ |record1| |
| +-------+ +-------+ |
+--------------------------------------+
Step4: Dom0 issues(9) SAL_CLEAR_STATE_INFO.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| | SAL_CLEAR_STATE_INFO | |
| | (9) | |
| | | | |
| +--|-------------------------------+ |
+--(trap)------------------------------+
| | |Xen
| V |
| +-vCPU0-+ |
| | | +-------+ +-------+ |
| | ----> pCPU0 ---> pCPU1 | |
| | | +-------+ +-------+ |
| +-------+ |status | |status | |
| +-------+ +-------+ |
+--------------------------------------+
| +-pCPU0-+ +-pCPU1-+ |Hardware
| | | | | |
| +-+-----+ +-+-----+ |
| | | |
| +-+-----+ +-+-----+ |
| |record0| |record1| |
| +-------+ +-------+ |
+--------------------------------------+
Step5: Xen traps this SAL call.
If the pCPU whose SAL record is to be cleared is the one the vCPU
is running on, Xen issues(10) a normal SAL call on that pCPU.
Xen then frees(11) the pCPU0 information.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| +----------------------------------+ |
+--------------------------------------+
| +-vCPU0-+ |Xen
| | | |
| | | +-------+ |
| | -----------------> pCPU1 | |
| | | (11) +-------+ |
| +-------+ |status | |
| +-------+ |
| SAL_CLEAR_STATE_INFO |
| (10) |
+----|---------------------------------+
| V |Hardware
| +-pCPU0-+ +-pCPU1-+ |
| | | | | |
| +-------+ +-+-----+ |
| | |
| +-+-----+ |
| |record1| |
| +-------+ |
+--------------------------------------+
Step6: Xen injects(12) the next CMC/CPE into dom0 vCPU0.
Then dom0 issues(13) SAL_GET_STATE_INFO.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| | SAL_GET_STATE_INFO | |
| | (13) | |
| | | A | |
| +--|------|------------------------+ |
+--(trap)---|--------------------------+
| | | |Xen
| V | |
| +-vCPU0-+ | |
| | (12)---+ |
| | | +-------+ |
| | ---------------> pCPU1 | |
| | | +-------+ |
| +-------+ |status | |
| +-------+ |
+--------------------------------------+
| +-pCPU0-+ +-pCPU1-+ |Hardware
| | | | | |
| +-------+ +-+-----+ |
| | |
| +-+-----+ |
| |record1| |
| +-------+ |
+--------------------------------------+
Step7: Xen traps this SAL call.
If the pCPU whose SAL record is to be read is not the one the
vCPU is running on, Xen sends(14) an IPI to that other pCPU, and
Xen on that pCPU issues(15) the SAL call.
Xen then copies(16) the SAL record to dom0.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| | (16) Get SAL record | |
| | A | |
| +---------|------------------------+ |
+-----------|--------------------------+
| +-vCPU0-+ | |Xen
| | | | |
| | | | +-------+ |
| | ---|-------------> pCPU1 | |
| | | | +-------+ |
| +-------+ | |status | |
| | +-------+ |
| | |
| | SAL_GET_STATE_INFO |
| send IPI | (15) |
| (14) | A | |
| | [Buffer] | | |
| | A | | |
| | | | | |
+----|------|---------|--|-------------+
| | +---------------------+ |Hardware
| | | | | |
| V | V | |
| +-pCPU0-+ +-pCPU1-+ | |
| | |------->| | | |
| +-------+ +-+-----+ | |
| | | |
| +-+-----+ | |
| |record1+------+ |
| +-------+ |
+--------------------------------------+
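The choice between the local path (Step3) and the IPI path (Step7)
could be sketched as below. This is a user-space illustration only: all
names are hypothetical, the SAL call and the IPI are replaced by simple
stubs, and RECORD_MAX is an arbitrary size chosen for the sketch.

```c
#include <assert.h>
#include <string.h>

#define NR_PCPUS   2
#define RECORD_MAX 64           /* hypothetical record size for the sketch */

enum fetch_path { PATH_LOCAL, PATH_IPI };

/* Stand-in for the per-pCPU error records held by firmware. */
char firmware_record[NR_PCPUS][RECORD_MAX];

/* Stand-in for the real SAL_GET_STATE_INFO call issued on `pcpu`. */
static void sal_get_state_info_stub(int pcpu, char *buf)
{
    memcpy(buf, firmware_record[pcpu], RECORD_MAX);
}

/*
 * Stand-in for "send an IPI and wait for completion". A hypervisor would
 * run the callback on the target pCPU; here it is simply called directly.
 */
static void run_on_pcpu_stub(int pcpu, void (*fn)(int, char *), char *buf)
{
    fn(pcpu, buf);
}

/*
 * Fetch the record of `target_pcpu` for a vCPU currently running on
 * `current_pcpu`, then copy it from Xen's buffer to dom0's buffer.
 */
enum fetch_path fetch_record(int current_pcpu, int target_pcpu,
                             char *dom0_buf)
{
    char xen_buf[RECORD_MAX];
    enum fetch_path path;

    assert(target_pcpu >= 0 && target_pcpu < NR_PCPUS);
    if (target_pcpu == current_pcpu) {
        /* (7): same pCPU, so issue the SAL call directly. */
        sal_get_state_info_stub(target_pcpu, xen_buf);
        path = PATH_LOCAL;
    } else {
        /* (14)(15): ask the other pCPU to issue the SAL call. */
        run_on_pcpu_stub(target_pcpu, sal_get_state_info_stub, xen_buf);
        path = PATH_IPI;
    }
    /* (8)/(16): hand the record to dom0. */
    memcpy(dom0_buf, xen_buf, RECORD_MAX);
    return path;
}
```

In both paths the record passes through Xen's buffer before it reaches
dom0, as in the figures for Step3 and Step7.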
Step8: Dom0 issues(17) SAL_CLEAR_STATE_INFO.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| | SAL_CLEAR_STATE_INFO | |
| | (17) | |
| | | | |
| +--|-------------------------------+ |
+--(trap)------------------------------+
| | |Xen
| V |
| +-vCPU0-+ |
| | | +-------+ |
| | ---------------> pCPU1 | |
| | | +-------+ |
| +-------+ |status | |
| +-------+ |
+--------------------------------------+
| +-pCPU0-+ +-pCPU1-+ |Hardware
| | | | | |
| +-------+ +-+-----+ |
| | |
| +-+-----+ |
| |record1| |
| +-------+ |
+--------------------------------------+
Step9: Xen traps this SAL call.
If the pCPU whose SAL record is to be cleared is not the one the
vCPU is running on, Xen sends(18) an IPI to that other pCPU, and
Xen on that pCPU issues(19) the SAL call.
Xen then frees(20) the pCPU1 information.
+--------------------------------------+
| +-CMC/CPE handler------------------+ |dom0
| | | |
| +----------------------------------+ |
+--------------------------------------+
| +-vCPU0-+ |Xen
| | | |
| | | (20) |
| | | |
| | | |
| +-------+ |
| SAL_CLEAR_STATE_INFO |
| send IPI (19) |
| (18) A | |
+----|----------------|--|-------------+
| | | | |Hardware
| V | V |
| +-pCPU0-+ +-pCPU1-+ |
| | |------->| | |
| +-------+ +-------+ |
+--------------------------------------+
> What about clearing error records? We need to be careful that error
>records read by Xen and cleared before being passed to dom0 are volatile
>and could be lost if the system crashes or if dom0 doesn't retrieve
>them. It's best to only clear the log after the error record has been
>received by dom0 and dom0 issues a SAL_CLEAR_STATE_INFO. This will get
>complicated if we need to clear error records on all pCPUs in response
>to a SAL_CLEAR_STATE_INFO on dom0 vCPU0.
>
In our new design, Xen issues SAL_CLEAR_STATE_INFO in step with the
SAL_CLEAR_STATE_INFO that dom0 issues (Step5 and Step9 above), so a
record is not cleared before dom0 has received it.
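A minimal sketch of this ordering follows. It is a simulation, not the
real code: the two arrays stand in for firmware state and for Xen's
per-pCPU status entries, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_PCPUS 2

/* Simulated firmware state: true while a record is still logged. */
bool record_present[NR_PCPUS];
/* Simulated Xen bookkeeping: true while a status entry is queued. */
bool status_queued[NR_PCPUS];

/*
 * Step1: an error interrupt arrives. In this simulation the function also
 * marks the record as logged (the hardware does that in reality). Xen only
 * queues a status entry and does not clear anything.
 */
void xen_on_error_interrupt(int pcpu)
{
    assert(pcpu >= 0 && pcpu < NR_PCPUS);
    record_present[pcpu] = true;
    status_queued[pcpu] = true;
}

/*
 * Step5/Step9: runs only when dom0's SAL_CLEAR_STATE_INFO has been trapped.
 * Returns -1 if no event is outstanding for this pCPU.
 */
int xen_on_dom0_clear(int pcpu)
{
    assert(pcpu >= 0 && pcpu < NR_PCPUS);
    if (!status_queued[pcpu])
        return -1;
    record_present[pcpu] = false;  /* (10)/(19): the SAL clear goes here */
    status_queued[pcpu] = false;   /* (11)/(20): free Xen's information */
    return 0;
}
```

The point of the sketch is that the record stays in firmware from the
interrupt until dom0 asks for it to be cleared, so it should survive if
dom0 never retrieves it.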
> Do you plan to support CMC and CPE throttling in Xen (ie. switching
>between interrupt driven and polling handlers under load) and dynamic
>polling intervals?
>
Yes, our design supports CMC and CPE throttling in Xen, as well as
dynamic polling intervals. We think that Xen must not go down or slow
down under a storm of CMC and CPE interrupts.
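As a rough illustration of what we mean by throttling, the switch
between interrupt mode and polling mode could look like the sketch
below. The thresholds (HISTORY_LEN, STORM_WINDOW_MS, MIN_POLL_MS,
MAX_POLL_MS) and all names are assumptions made for this example and
are not values from our implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HISTORY_LEN     5       /* hypothetical: interrupts remembered */
#define STORM_WINDOW_MS 1000UL  /* hypothetical: this many ms is a storm */
#define MIN_POLL_MS     1000UL  /* hypothetical shortest poll interval */
#define MAX_POLL_MS     8000UL  /* hypothetical longest poll interval */

struct throttle {
    unsigned long history[HISTORY_LEN]; /* timestamps of recent interrupts */
    unsigned long seen;                 /* interrupts recorded so far */
    bool polling;                       /* true while in polling mode */
    unsigned long poll_interval_ms;
};

/* Record an interrupt; returns true if the handler should poll from now. */
bool throttle_on_interrupt(struct throttle *t, unsigned long now_ms)
{
    assert(t != NULL);
    t->history[t->seen % HISTORY_LEN] = now_ms;
    t->seen++;
    if (t->seen >= HISTORY_LEN) {
        /* The slot that would be overwritten next holds the oldest entry. */
        unsigned long oldest = t->history[t->seen % HISTORY_LEN];
        if (now_ms - oldest < STORM_WINDOW_MS) {
            t->polling = true;
            t->poll_interval_ms = MIN_POLL_MS;
        }
    }
    return t->polling;
}

/* Adapt the interval after each poll; go back to interrupts when quiet. */
void throttle_on_poll(struct throttle *t, bool found_error)
{
    assert(t != NULL);
    if (found_error) {
        t->poll_interval_ms /= 2;               /* errors: poll faster */
        if (t->poll_interval_ms < MIN_POLL_MS)
            t->poll_interval_ms = MIN_POLL_MS;
    } else if (t->poll_interval_ms >= MAX_POLL_MS) {
        t->polling = false;                     /* quiet: use interrupts */
        t->seen = 0;
    } else {
        t->poll_interval_ms *= 2;               /* no errors: poll slower */
        if (t->poll_interval_ms > MAX_POLL_MS)
            t->poll_interval_ms = MAX_POLL_MS;
    }
}
```

Five interrupts inside the window switch the handler to polling at the
shortest interval. The interval then grows while polls find nothing and
shrinks again when errors are found.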
> It may be overly complicated to support CPEI on dom0 (fake MADT
>entries, trapping IOSAPIC write, maybe an entirely virtual IOSAPIC in
>order to describe a valid GSI for the CPEI, etc...). Probably best to
>start out with just letting dom0 poll for CPE records. Thanks,
>
Thanks for your advice. We are not very familiar with MADT and IOSAPIC,
so we would welcome advice from you and everyone else.
Your suggestion means modifying the dom0 Linux kernel (mca.c), doesn't
it? If so, we will modify the dom0 Linux kernel so that CPE supports
polling mode only.
BTW, a new member, Kaz, has joined our team.
> Alex
>
>--
>Alex Williamson HP Open Source & Linux Org.
Best regards,
Yutaka Ezaki(You)
Kazuhiro Suzuki(Kaz)
Masaki Kanno(Kan)
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel