Xiaowei,
In the original code the logic for encoding domain switches was buggy.
The whole approach of encoding the domain switch as a CPU mode change
does not seem right to me.
The code was somewhat difficult to follow, but I could find at least the
following bug.
This was the original code used to add a domain switch on the CPU
buffer:
+static void xenoprof_handle_passive(void)
+{
+ int i, j;
+
+ for (i = 0; i < pdomains; i++)
+ for (j = 0; j < passive_domains[i].nbuf; j++) {
+ xenoprof_buf_t *buf = p_xenoprof_buf[i][j];
+ if (buf->event_head == buf->event_tail)
+ continue;
+ oprofile_add_pc(IGNORED_PC,
CPU_MODE_PASSIVE_START, passive_domains[i].domain_id);
+ xenoprof_add_pc(buf, 1);
+ oprofile_add_pc(IGNORED_PC,
CPU_MODE_PASSIVE_STOP, passive_domains[i].domain_id);
+ }
+}
+
This code inserts the sequence (ESCAPE_CODE, CPU_MODE_PASSIVE_START,
IGNORED_PC, domain_id) in the CPU buffer, to indicate a domain switch.
Since ESCAPE_CODE = IGNORED_PC = 0, the inserted sequence is (0,
CPU_MODE_PASSIVE_START, 0, domain_id).
Now the code which reads the sample from the CPU buffer and copies them
to the event buffer is the following:
+ for (i = 0; i < available; ++i) {
+ struct op_sample * s =
&cpu_buf->buffer[cpu_buf->tail_pos];
+
+ if (is_code(s->eip)) {
+ if (s->event < CPU_TRACE_BEGIN) {
+ /* kernel/userspace switch */
+ cpu_mode = s->event;
+ if (state == sb_buffer_start)
+ state = sb_sample_start;
+
+ if (s->event == CPU_MODE_PASSIVE_START)
+ domain_switch =
DOMAIN_SWITCH_START_EVENT1;
+ else if (s->event ==
CPU_MODE_PASSIVE_STOP)
+ domain_switch =
DOMAIN_SWITCH_STOP_EVENT1;
+
+ if (domain_switch !=
DOMAIN_SWITCH_START_EVENT2)
+ add_cpu_mode_switch(s->event);
+ } else if (s->event == CPU_TRACE_BEGIN) {
+ state = sb_bt_start;
+ add_trace_begin();
+ } else {
+ struct mm_struct * oldmm = mm;
+
+ /* userspace context switch */
+ new = (struct task_struct *)s->event;
+
+ release_mm(oldmm);
+ mm = take_tasks_mm(new);
+ if (mm != oldmm)
+ cookie = get_exec_dcookie(mm);
+ add_user_ctx_switch(new, cookie);
+ }
+ } else {
+ if (domain_switch == DOMAIN_SWITCH_START_EVENT1)
{
(1)>>>>> add_event_entry(s->event);
+ domain_switch =
DOMAIN_SWITCH_START_EVENT2;
(2)>>>>> } else if (domain_switch ==
DOMAIN_SWITCH_START_EVENT1) {
+ add_sample_entry(s->eip, s->event);
+ } else if (domain_switch ==
DOMAIN_SWITCH_STOP_EVENT1) {
+ domain_switch = NO_DOMAIN_SWITCH;
+ } else {
+ if (state >= sb_bt_start &&
+ !add_sample(mm, s, cpu_mode)) {
+ if (state == sb_bt_start) {
+ state = sb_bt_ignore;
+
atomic_inc(&oprofile_stats.bt_lost_no_mapping);
+ }
+ }
+ }
+ }
+
+ increment_tail(cpu_buf);
+ }
The line marked (1)>>>>>, seems to be adding the domain id of the
passive domain switch.
However, because the domain_id was encoded as an "event" with "eip=0",
this code is not executed when it needs to. The code executes the "if"
part of the statement instead, since "s->eip=0" causes is_code() to
evaluate to true. This should cause samples to be misinterpreted.
Also the if statement on line (2)>>>> will never be executed since the
condition is inside an "else" for the same condition
(domain_switch == DOMAIN_SWITCH_START_EVENT1). Since the code was buggy
I changed the logic for representing domain switches. In the process of
doing that I eliminated the CPU_MODE_PASSIVE stop, as I have asked you
to do before. There is really no need to encode a START and a STOP. You
just need a mark to separate samples belonging to different domains.
I compared the results of oprofile with --passive-domains option and
with --active-domains while running a TCP network benchmark.
Here are the first few lines reported by oprofile in both cases:
--active-domains option:
29583 22.8896 vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU __copy_to_user_ll
4353 3.3681 vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU tcp_v4_rcv
2986 2.3104 vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU eth_type_trans
2820 2.1820 vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU netif_poll
2751 2.1286 vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU gnttab_end_foreign_transfer_ref
2402 1.8585 vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU network_tx_buf_gc
2332 1.8044 vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU network_start_xmit
--passive-domains option:
28805 10.5522 pvmlinux1-syms iowrite32_rep
3706 1.3576 pvmlinux1-syms iret_exc
2642 0.9679 pvmlinux1-syms netlink_recvmsg
2576 0.9437 pvmlinux1-syms ip4_datagram_connect
2541 0.9309 pvmlinux1-syms input_devices_read
2499 0.9155 pvmlinux1-syms hypercall_page
2204 0.8074 pvmlinux1-syms ip_setsockopt
1942 0.7114 pvmlinux1-syms skbuff_ctor
The set of functions reported in each case are completely different,
suggesting that the passive-domain case was in fact assigning samples to
the wrong functions.
After the modifications included in the patch that I submited I got the
following output for oprofile with --passive-domains:
23818 11.8026 pvmlinux3-syms __copy_to_user_ll
3272 1.6214 pvmlinux3-syms tcp_v4_rcv
2312 1.1457 pvmlinux3-syms eth_type_trans
2151 1.0659 pvmlinux3-syms netif_poll
2128 1.0545 pvmlinux3-syms hypercall_page
2024 1.0030 pvmlinux3-syms network_start_xmit
2023 1.0025 pvmlinux3-syms network_tx_buf_gc
1489 0.7378 pvmlinux3-syms ip_queue_xmit
This is much closer to the active-domain case and gives me confident
that the code is now doing the right thing
Anyway, thanks for your help on getting the passive domain support in.
If you had not generated the patch we would not have passive domain
support yet. I do appreciate your effort on getting this into
xen-unstable.
Thanks
Renato
>> -----Original Message-----
>> From: Yang, Xiaowei [mailto:xiaowei.yang@xxxxxxxxx]
>> Sent: Monday, July 10, 2006 12:18 AM
>> To: Santos, Jose Renato G; Keir Fraser
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: RE: [PATCH] Xenoprof passive domain support fixes
>>
>> >Currently, passive domain samples are being assigned to the wrong
>> >kernel functions.
>>
>> Can you explain more?
>> Thanks,
>> xiaowei
>>
passive-orig.txt
Description: passive-orig.txt
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|