[Xen-devel] Re: [RFC][PATCH] Per-cpu xentrace buffers

Final release is still a few weeks away. It should probably go in for rc2
then.

 -- Keir

On 20/01/2010 18:06, "George Dunlap" <George.Dunlap@xxxxxxxxxxxxx> wrote:

> How long between rc2 and expected release (if no other candidates are
> considered)?  It's more of a debugging feature, so it's not going to
> screw over any production systems if it's got some subtle bugs.  (The
> "tb_init_done" flag that turns it on or off is exactly the same.)  I
> could try to put it through its paces this week and early next week, and
> if nothing turns up, it's probably fine to go in.
> 
> It will definitely require a tools rebuild if anyone's using xentrace,
> which people may not expect. :-)
> 
>  -George
> 
> Keir Fraser wrote:
>> Oh, I'm fine with it. I wasn't sure about putting it in for 4.0.0, but
>> actually plenty is going in for rc2. What do you think?
>> 
>>  -- Keir
>> 
>> On 20/01/2010 17:38, "George Dunlap" <George.Dunlap@xxxxxxxxxxxxx> wrote:
>> 
>>   
>>> Keir, would you mind commenting on this new design in the next few
>>> days?  If it looks like a good design, I'd like to do some more
>>> testing and get this into our next XenServer release.
>>> 
>>>  -George
>>> 
>>> On Thu, Jan 7, 2010 at 3:13 PM, George Dunlap <dunlapg@xxxxxxxxx> wrote:
>>>     
>>>> In the current xentrace configuration, xentrace buffers are all
>>>> allocated in a single contiguous chunk, and then divided among logical
>>>> cpus, one buffer per cpu.  The size of an allocatable chunk is fairly
>>>> limited, in my experience about 128 pages (512KiB).  As the number of
>>>> logical cores increases, this means a much smaller maximum trace
>>>> buffer per cpu; on my dual-socket quad-core Nehalem box with
>>>> hyperthreading (16 logical cpus), that comes to 8 pages per logical
>>>> cpu.
>>>> 
>>>> The attached patch addresses this issue by allocating per-cpu buffers
>>>> separately.  This allows larger trace buffers; however, it requires an
>>>> interface change to xentrace, which is why I'm making a Request For
>>>> Comments.  (I'm not expecting this patch to be included in the 4.0
>>>> release.)
>>>> 
>>>> The old interface to get trace buffers was fairly simple: you ask for
>>>> the info, and it gives you:
>>>> * the mfn of the first page in the buffer allocation
>>>> * the total size of the trace buffer
>>>> 
>>>> The tools then mapped [mfn,mfn+size), calculated where the per-pcpu
>>>> buffers were, and went on to consume records from them.
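
To make the old layout concrete, here is a minimal sketch of the per-cpu arithmetic. It assumes 4KiB pages and an even split of the chunk among cpus; the helper names are hypothetical and are not taken from the xentrace tools.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u  /* assumption: 4KiB pages */

/* Hypothetical helper: pages each cpu gets when one contiguous chunk
 * is divided evenly among all logical cpus. */
unsigned int pages_per_cpu(unsigned int total_pages, unsigned int ncpus)
{
    assert(ncpus > 0);
    return total_pages / ncpus;
}

/* Hypothetical helper: byte offset of one cpu's buffer inside the
 * mapped [mfn, mfn+size) region. */
size_t cpu_buf_offset(unsigned int total_pages, unsigned int ncpus,
                      unsigned int cpu)
{
    assert(cpu < ncpus);
    return (size_t)cpu * pages_per_cpu(total_pages, ncpus) * PAGE_SIZE;
}
```

With the figures quoted above, pages_per_cpu(128, 16) gives the 8 pages per logical cpu mentioned for the 16-way box.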
>>>> 
>>>> -- Interface --
>>>> 
>>>> The proposed interface works as follows.
>>>> 
>>>> * XEN_SYSCTL_TBUFOP_get_info still returns an mfn and a size (so no
>>>> changes to the library).  However, these now describe a trace buffer
>>>> info area (t_info), allocated once at boot time.  The trace buffer
>>>> info area contains mfns of the per-pcpu buffers.
>>>> * The t_info struct contains an array of "offset pointers", one per
>>>> pcpu.  These are an offset into the t_info data area of an array of
>>>> mfns for that pcpu.  So logically, the layout looks like this:
>>>> struct {
>>>>  int16_t tbuf_size; /* Number of pages per cpu */
>>>>  int16_t offset[NR_CPUS]; /* Offset into the t_info area of the array */
>>>>  uint32_t mfn[NR_CPUS][TBUF_SIZE];
>>>> };
>>>> 
>>>> So if NR_CPUS was 16, and TBUF_SIZE was 32, we'd have:
>>>> struct {
>>>>  int16_t tbuf_size; /* Number of pages per cpu */
>>>>  int16_t offset[16]; /* Offset into the t_info area of the array */
>>>>  uint32_t p0_mfn_list[32];
>>>>  uint32_t p1_mfn_list[32];
>>>>  ...
>>>>  uint32_t p15_mfn_list[32];
>>>> };
>>>> * So the new way to map trace buffers is as follows:
>>>>  + Call TBUFOP_get_info to get the mfn and size of the t_info area, and map
>>>> it.
>>>>  + Get the number of cpus
>>>>  + For each cpu:
>>>>    - Calculate the offset into the t_info area thus: unsigned long
>>>>      *mfn_list = ((unsigned long*)t_info) + t_info->offset[cpu];
>>>>    - Map t_info->tbuf_size mfns from mfn_list using xc_map_foreign_batch()
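
The lookup step can be sketched as below. This is an illustration only: it assumes the offsets are counted in uint32_t units from the start of the t_info area (matching the uint32_t mfn arrays in the layout above, rather than the unsigned long cast in the steps), and it stops short of the actual mapping call. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the 16-bit tbuf_size field at the start of the t_info area.
 * memcpy is used so the area can be treated as uint32_t words elsewhere. */
int16_t t_info_tbuf_size(const uint32_t *area)
{
    int16_t v;
    memcpy(&v, area, sizeof v);
    return v;
}

/* Read offset[cpu]; the offset array follows tbuf_size directly. */
int16_t t_info_offset(const uint32_t *area, unsigned int cpu)
{
    int16_t v;
    const unsigned char *p = (const unsigned char *)area;
    memcpy(&v, p + sizeof(int16_t) * (1 + (size_t)cpu), sizeof v);
    return v;
}

/* Locate the mfn list for one cpu.  Assumption: the offset is an index
 * in uint32_t units from the start of the area. */
const uint32_t *t_info_mfn_list(const uint32_t *area, unsigned int ncpus,
                                unsigned int cpu)
{
    assert(cpu < ncpus);
    return area + t_info_offset(area, cpu);
}
```

A caller would then pass t_info_tbuf_size(area) entries from the returned list to the mapping function, once per cpu.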
>>>> 
>>>> In the current implementation, the t_info size is fixed at 2 pages,
>>>> allowing about 2000 pages total to be mapped.  For a 32-way system,
>>>> this would allow up to 63 pages per cpu (252KiB).  Bumping the t_info
>>>> size up to 4 pages would allow even larger systems if required.
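
The capacity figure can be checked with a short calculation. The sketch below assumes 4KiB pages, 16-bit header fields and 32-bit mfn entries as in the layout above, and ignores any alignment padding; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumption: 4KiB pages */

/* Hypothetical helper: largest per-cpu trace buffer, in pages, whose mfn
 * list still fits in a t_info area of the given size. */
unsigned int max_pages_per_cpu(unsigned int t_info_pages, unsigned int ncpus)
{
    assert(ncpus > 0);
    size_t area_bytes = (size_t)t_info_pages * PAGE_SIZE;
    /* Header: tbuf_size plus one offset per cpu, 16 bits each. */
    size_t header_bytes = sizeof(int16_t) * (1 + (size_t)ncpus);
    assert(header_bytes <= area_bytes);
    /* Everything after the header holds 32-bit mfn entries. */
    size_t mfn_slots = (area_bytes - header_bytes) / sizeof(uint32_t);
    return (unsigned int)(mfn_slots / ncpus);
}
```

For a 2-page t_info area and 32 cpus this yields 63 pages per cpu, in line with the figure above; with 4 pages it yields 127.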
>>>> 
>>>> The current implementation also allocates each trace buffer
>>>> contiguously, since that's the easiest way to get contiguous virtual
>>>> address space.  But this interface allows Xen the flexibility, in the
>>>> future, to allocate buffers in several chunks if necessary, without
>>>> having to change the interface again.
>>>> 
>>>> -- Implementation notes --
>>>> 
>>>> The t_info area is allocated once at boot.  Trace buffers are
>>>> allocated either at boot (if a parameter is passed) or when
>>>> TBUFOP_set_size is called.  Due to the complexity of tracking pages
>>>> mapped by dom0, unmapping or resizing trace buffers is not supported.
>>>> 
>>>> I introduced a new per-cpu spinlock guarding trace data and buffers.
>>>> This allows per-cpu data to be safely accessed and modified without
>>>> racing with concurrent tracing events.  The per-cpu spinlock is grabbed
>>>> whenever a trace event is generated; but in the (very very very)
>>>> common case, the lock should be in the cache already.
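
As a rough illustration of the locking pattern only (not the code in the patch), the sketch below guards per-cpu trace state with one spinlock per cpu. It uses C11 atomic_flag in place of Xen's own spinlock type, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define NCPUS 4  /* hypothetical cpu count for the sketch */

/* Per-cpu trace state, each guarded by its own lock. */
struct cpu_trace {
    atomic_flag lock;      /* stand-in for a per-cpu spinlock */
    unsigned int records;  /* records written since the last reset */
};

struct cpu_trace traces[NCPUS];

void trace_init(void)
{
    for (unsigned int i = 0; i < NCPUS; i++) {
        atomic_flag_clear(&traces[i].lock);
        traces[i].records = 0;
    }
}

static void cpu_trace_lock(struct cpu_trace *t)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ;
}

static void cpu_trace_unlock(struct cpu_trace *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Event path: normally uncontended, since only this cpu takes the lock. */
void trace_event(unsigned int cpu)
{
    assert(cpu < NCPUS);
    cpu_trace_lock(&traces[cpu]);
    traces[cpu].records++;
    cpu_trace_unlock(&traces[cpu]);
}

/* Control path: another cpu can safely read and reset the state. */
unsigned int trace_reset(unsigned int cpu)
{
    assert(cpu < NCPUS);
    cpu_trace_lock(&traces[cpu]);
    unsigned int n = traces[cpu].records;
    traces[cpu].records = 0;
    cpu_trace_unlock(&traces[cpu]);
    return n;
}
```

Because each lock is normally taken only by its own cpu, the event path stays cheap, while a control operation on another cpu can still take the lock to change that cpu's state safely.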
>>>> 
>>>> Feedback welcome.
>>>> 
>>>>  -George
>>>> 
>>>>       
>> 
>> 
>>   
> 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel