WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] RE: [PATCH] Xenoprof passive domain support fixes

To: "Yang, Xiaowei" <xiaowei.yang@xxxxxxxxx>, "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>
Subject: [Xen-devel] RE: [PATCH] Xenoprof passive domain support fixes
From: "Santos, Jose Renato G" <joserenato.santos@xxxxxx>
Date: Mon, 10 Jul 2006 11:19:37 -0700
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Mon, 10 Jul 2006 11:20:27 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <E42673822206994F8293A048721B91A506132A@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acahc5FTBt/KOWTuTEact0++HxDp/ACfVxwQABW2urA=
Thread-topic: [PATCH] Xenoprof passive domain support fixes
Xiaowei,

In the original code the logic for encoding domain switches was buggy.
The whole approach of encoding the domain switch as a CPU mode change
does not seem right to me.
The code was somewhat difficult to follow, but I could find at least the
following bug.

This was the original code used to add a domain switch on the CPU
buffer:

+static void xenoprof_handle_passive(void)
+{
+       int i, j;
+
+       for (i = 0; i < pdomains; i++)
+               for (j = 0; j < passive_domains[i].nbuf; j++) {
+                       xenoprof_buf_t *buf = p_xenoprof_buf[i][j];
+                       if (buf->event_head == buf->event_tail)
+                               continue;
+                        oprofile_add_pc(IGNORED_PC,
CPU_MODE_PASSIVE_START, passive_domains[i].domain_id);
+                       xenoprof_add_pc(buf, 1);
+                        oprofile_add_pc(IGNORED_PC,
CPU_MODE_PASSIVE_STOP, passive_domains[i].domain_id);
+               }                       
+}
+

This code inserts the sequence (ESCAPE_CODE, CPU_MODE_PASSIVE_START,
IGNORED_PC, domain_id) in the CPU buffer, to indicate a domain switch.
Since ESCAPE_CODE = IGNORED_PC = 0, the inserted sequence is (0,
CPU_MODE_PASSIVE_START, 0, domain_id).

Now the code which reads the sample from the CPU buffer and copies them
to the event buffer is the following:

+       for (i = 0; i < available; ++i) {
+               struct op_sample * s =
&cpu_buf->buffer[cpu_buf->tail_pos];
+ 
+               if (is_code(s->eip)) {
+                       if (s->event < CPU_TRACE_BEGIN) {
+                               /* kernel/userspace switch */
+                               cpu_mode = s->event;
+                               if (state == sb_buffer_start)
+                                       state = sb_sample_start;
+
+                               if (s->event == CPU_MODE_PASSIVE_START)
+                                       domain_switch =
DOMAIN_SWITCH_START_EVENT1;
+                               else if (s->event ==
CPU_MODE_PASSIVE_STOP)
+                                       domain_switch =
DOMAIN_SWITCH_STOP_EVENT1;
+
+                               if (domain_switch !=
DOMAIN_SWITCH_START_EVENT2)
+                                       add_cpu_mode_switch(s->event);
+                       } else if (s->event == CPU_TRACE_BEGIN) {
+                               state = sb_bt_start;
+                               add_trace_begin();
+                       } else {
+                               struct mm_struct * oldmm = mm;
+
+                               /* userspace context switch */
+                               new = (struct task_struct *)s->event;
+
+                               release_mm(oldmm);
+                               mm = take_tasks_mm(new);
+                               if (mm != oldmm)
+                                       cookie = get_exec_dcookie(mm);
+                               add_user_ctx_switch(new, cookie);
+                       }
+               } else {
+                       if (domain_switch == DOMAIN_SWITCH_START_EVENT1)
{
(1)>>>>>                add_event_entry(s->event);
+                               domain_switch =
DOMAIN_SWITCH_START_EVENT2;
(2)>>>>>                } else if (domain_switch ==
DOMAIN_SWITCH_START_EVENT1) {
+                               add_sample_entry(s->eip, s->event);
+                       } else if (domain_switch ==
DOMAIN_SWITCH_STOP_EVENT1) {
+                               domain_switch = NO_DOMAIN_SWITCH;
+                       } else {
+                               if (state >= sb_bt_start &&
+                                   !add_sample(mm, s, cpu_mode)) {
+                                       if (state == sb_bt_start) {
+                                               state = sb_bt_ignore;
+
atomic_inc(&oprofile_stats.bt_lost_no_mapping);
+                                       }
+                               }
+                       }
+               }
+
+               increment_tail(cpu_buf);
+       }

The line marked (1)>>>>>, seems to be adding the domain id of the
passive domain switch.
However, because the domain_id was encoded as an "event" with "eip=0",
this code is not executed when it needs to. The code executes the "if"
part of the statement instead, since "s->eip=0" causes is_code() to
evaluate to true. This should cause samples to be misinterpreted.
Also the if statement on line (2)>>>> will never be executed since the
condition is inside an "else" for the same condition
(domain_switch == DOMAIN_SWITCH_START_EVENT1). Since the code was buggy
I changed the logic for representing domain switches. In the process of
doing that I eliminated the CPU_MODE_PASSIVE stop, as I have asked you
to do before. There is really no need to encode a START and a STOP. You
just need a mark to separate samples belonging to different domains.
 
I compared the results of oprofile with --passive-domains option and
with --active-domains while running a TCP network benchmark.
Here are the first few lines reported by oprofile in both cases:

--active-domains option:
29583    22.8896  vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU __copy_to_user_ll
4353      3.3681  vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU tcp_v4_rcv
2986      2.3104  vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU eth_type_trans
2820      2.1820  vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU netif_poll
2751      2.1286  vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU gnttab_end_foreign_transfer_ref
2402      1.8585  vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU network_tx_buf_gc
2332      1.8044  vmlinux-syms-2.6.16.13-xenU
vmlinux-syms-2.6.16.13-xenU network_start_xmit

--passive-domains option:
28805    10.5522  pvmlinux1-syms           iowrite32_rep
3706      1.3576  pvmlinux1-syms           iret_exc
2642      0.9679  pvmlinux1-syms           netlink_recvmsg
2576      0.9437  pvmlinux1-syms           ip4_datagram_connect
2541      0.9309  pvmlinux1-syms           input_devices_read
2499      0.9155  pvmlinux1-syms           hypercall_page
2204      0.8074  pvmlinux1-syms           ip_setsockopt
1942      0.7114  pvmlinux1-syms           skbuff_ctor


The set of functions reported in each case are completely different,
suggesting that the passive-domain case was in fact assigning samples to
the wrong functions.

After the modifications included in the patch that I submited I got the
following output for oprofile with --passive-domains:
23818    11.8026  pvmlinux3-syms           __copy_to_user_ll
3272      1.6214  pvmlinux3-syms           tcp_v4_rcv
2312      1.1457  pvmlinux3-syms           eth_type_trans
2151      1.0659  pvmlinux3-syms           netif_poll
2128      1.0545  pvmlinux3-syms           hypercall_page
2024      1.0030  pvmlinux3-syms           network_start_xmit
2023      1.0025  pvmlinux3-syms           network_tx_buf_gc
1489      0.7378  pvmlinux3-syms           ip_queue_xmit

This is much closer to the active-domain case and gives me confident
that the code is now doing the right thing

Anyway, thanks for your help on getting the passive domain support in.
If you had not generated the patch we would not have passive domain
support yet. I do appreciate your effort on getting this into
xen-unstable.

Thanks

Renato






>> -----Original Message-----
>> From: Yang, Xiaowei [mailto:xiaowei.yang@xxxxxxxxx] 
>> Sent: Monday, July 10, 2006 12:18 AM
>> To: Santos, Jose Renato G; Keir Fraser
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: RE: [PATCH] Xenoprof passive domain support fixes
>> 
>> >Currently, passive domain samples are being assigned to the wrong 
>> >kernel functions.
>> 
>> Can you explain more? 
>> Thanks,
>> xiaowei
>> 

Attachment: passive-orig.txt
Description: passive-orig.txt

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel