|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH for 4-21 v4] xen/riscv: identify specific ISA supported by cpu
On 04.02.2025 08:20, Oleksii Kurochko wrote:
> +/*
> + * The canonical order of ISA extension names in the ISA string is defined in
> + * chapter 27 of the unprivileged specification.
> + *
> + * The specification uses vague wording, such as should, when it comes to
> + * ordering, so for our purposes the following rules apply:
> + *
> + * 1. All multi-letter extensions must be separated from other extensions by
> an
> + * underscore.
> + *
> + * 2. Additional standard extensions (starting with 'Z') must be sorted after
> + * single-letter extensions and before any higher-privileged extensions.
> + *
> + * 3. The first letter following the 'Z' conventionally indicates the most
> + * closely related alphabetical extension category, IMAFDQLCBKJTPVH.
> + * If multiple 'Z' extensions are named, they must be ordered first by
> + * category, then alphabetically within a category.
> + *
> + * 4. Standard supervisor-level extensions (starting with 'S') must be listed
> + * after standard unprivileged extensions. If multiple supervisor-level
> + * extensions are listed, they must be ordered alphabetically.
> + *
> + * 5. Standard machine-level extensions (starting with 'Zxm') must be listed
> + * after any lower-privileged, standard extensions. If multiple
> + * machine-level extensions are listed, they must be ordered
> + * alphabetically.
> + *
> + * 6. Non-standard extensions (starting with 'X') must be listed after all
> + * standard extensions. If multiple non-standard extensions are listed,
> they
> + * must be ordered alphabetically.
> + *
> + * An example string following the order is:
> + * rv64imadc_zifoo_zigoo_zafoo_sbar_scar_zxmbaz_xqux_xrux
> + *
> + * New entries to this struct should follow the ordering rules described
> above.
> + *
> + * Extension name must be all lowercase (according to device-tree binding)
> + * and strncmp() is used in match_isa_ext() to compare extension names
> instead
> + * of strncasecmp().
> + */
> +const struct riscv_isa_ext_data __initconst riscv_isa_ext[] = {
> + RISCV_ISA_EXT_DATA(i, RISCV_ISA_EXT_i),
> + RISCV_ISA_EXT_DATA(m, RISCV_ISA_EXT_m),
> + RISCV_ISA_EXT_DATA(a, RISCV_ISA_EXT_a),
> + RISCV_ISA_EXT_DATA(f, RISCV_ISA_EXT_f),
> + RISCV_ISA_EXT_DATA(d, RISCV_ISA_EXT_d),
> + RISCV_ISA_EXT_DATA(q, RISCV_ISA_EXT_q),
> + RISCV_ISA_EXT_DATA(h, RISCV_ISA_EXT_h),
> + RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR),
> + RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR),
> + RISCV_ISA_EXT_DATA(zifencei, RISCV_ISA_EXT_ZIFENCEI),
> + RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE),
> + RISCV_ISA_EXT_DATA(zihpm, RISCV_ISA_EXT_ZIHPM),
> + RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB),
> + RISCV_ISA_EXT_DATA(smaia, RISCV_ISA_EXT_SMAIA),
> + RISCV_ISA_EXT_DATA(ssaia, RISCV_ISA_EXT_SSAIA),
> +};
> +
> +static const struct riscv_isa_ext_data __initconst required_extensions[] = {
> + RISCV_ISA_EXT_DATA(zicsr, RISCV_ISA_EXT_ZICSR),
> + RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE),
> + RISCV_ISA_EXT_DATA(zbb, RISCV_ISA_EXT_ZBB),
> +};
Coming back to my earlier question regarding the B (pseudo-)extension:
Since riscv_isa_ext[] only contains Zbb, is it precluded anywhere in
the spec that DT may mention just B when all of its constituents are
supported?
Which gets me on to G, which is somewhat similar in nature to B. We
require G when RISCV_ISA_RV64G=y, yet required_extensions[] doesn't
name it or its constituents. Much like we require C when RISCV_ISA_C=y,
yet it's not in the table.
> +static bool is_lowercase_extension_name(const char *str)
> +{
> + /*
> + * `str` could contain full riscv,isa string from device tree so one
> + * of the stop condionitions is checking for '_' as extensions are
Nit: conditions
> + * separated by '_'.
> + */
> + for ( unsigned int i = 0; (str[i] != '\0') && (str[i] != '_'); i++ )
> + if ( !isdigit(str[i]) && !islower(str[i]) )
> + return false;
> +
> + return true;
> +}
> +
> +static void __init match_isa_ext(const char *name, const char *name_end,
> + unsigned long *bitmap)
> +{
> + const size_t riscv_isa_ext_count = ARRAY_SIZE(riscv_isa_ext);
> +
> + for ( unsigned int i = 0; i < riscv_isa_ext_count; i++ )
> + {
> + const struct riscv_isa_ext_data *ext = &riscv_isa_ext[i];
> +
> + /*
> + * `ext->name` (according to initialization of riscv_isa_ext[]
> + * elements) must be all in lowercase.
> + */
> + ASSERT(is_lowercase_extension_name(ext->name));
> +
> + if ( (name_end - name == strlen(ext->name)) &&
> + !strncmp(name, ext->name, name_end - name) )
memcmp() is generally more efficient, as it doesn't need to check for
a nul terminator.
> + {
> + __set_bit(ext->id, bitmap);
> + break;
> + }
> + }
> +}
> +
> +static int __init riscv_isa_parse_string(const char *isa,
> + unsigned long *out_bitmap)
> +{
> + if ( (isa[0] != 'r') && (isa[1] != 'v') )
> + return -EINVAL;
> +
> +#if defined(CONFIG_RISCV_32)
> + if ( isa[2] != '3' && isa[3] != '2' )
> + return -EINVAL;
> +#elif defined(CONFIG_RISCV_64)
> + if ( isa[2] != '6' && isa[3] != '4' )
> + return -EINVAL;
> +#else
> +# error "unsupported RISC-V bitness"
> +#endif
> +
> + isa += 4;
> +
> + while ( *isa )
> + {
> + const char *ext = isa++;
> + const char *ext_end = isa;
> + bool ext_err = false;
> +
> + switch ( *ext )
> + {
> + case 'x':
> + printk_once("Vendor extensions are ignored in riscv,isa\n");
> + /*
> + * To skip an extension, we find its end.
> + * As multi-letter extensions must be split from other
> multi-letter
> + * extensions with an "_", the end of a multi-letter extension
> will
> + * either be the null character or the "_" at the start of the
> next
> + * multi-letter extension.
> + */
> + for ( ; *isa && *isa != '_'; ++isa )
> + ;
> + ext_err = true;
> + break;
I was wondering why ext_end isn't updated here. Looks like that's because
ext_err is set to true, and the checking below the switch looks at ext_err
first. That's technically fine, but - to me at least - a little confusing.
Could setting ext_end to NULL take the role of ext_err? For the code here
this would then immediately clarify why ext_end isn't otherwise maintained.
(The "err" in the name is also somewhat odd: The log message above says
"ignored", and that's what the code below also does. There's nothing
error-ish in here; in fact there may be nothing wrong about the specific
vendor extension we're ignoring.)
> + case 's':
> + /*
> + * Workaround for invalid single-letter 's' & 'u' (QEMU):
> + * Before QEMU 7.1 it was an issue with misa to ISA string
> + * conversion:
> + *
> https://patchwork.kernel.org/project/qemu-devel/patch/dee09d708405075420b29115c1e9e87910b8da55.1648270894.git.research_trasio@xxxxxxxxxxxx/#24792587
> + * Additional details of the workaround on Linux kernel side:
> + *
> https://lore.kernel.org/linux-riscv/ae93358e-e117-b43d-faad-772c529f846c@xxxxxxxxxxxx/#t
> + *
> + * No need to set the bit in riscv_isa as 's' & 'u' are
> + * not valid ISA extensions. It works unless the first
> + * multi-letter extension in the ISA string begins with
> + * "Su" and is not prefixed with an underscore.
> + */
> + if ( ext[-1] != '_' && ext[1] == 'u' )
> + {
> + ++isa;
> + ext_err = true;
> + break;
> + }
> + fallthrough;
> + case 'z':
> + /*
> + * Before attempting to parse the extension itself, we find its
> end.
> + * As multi-letter extensions must be split from other
> multi-letter
> + * extensions with an "_", the end of a multi-letter extension
> will
> + * either be the null character or the "_" at the start of the
> next
> + * multi-letter extension.
> + *
> + * Next, as the extensions version is currently ignored, we
> + * eliminate that portion. This is done by parsing backwards from
> + * the end of the extension, removing any numbers. This may be a
> + * major or minor number however, so the process is repeated if a
> + * minor number was found.
> + *
> + * ext_end is intended to represent the first character *after*
> the
> + * name portion of an extension, but will be decremented to the
> last
> + * character itself while eliminating the extensions version
> number.
> + * A simple re-increment solves this problem.
> + */
> + for ( ; *isa && *isa != '_'; ++isa )
> + if ( unlikely(!isalnum(*isa)) )
> + ext_err = true;
> +
> + ext_end = isa;
> + if ( unlikely(ext_err) )
> + break;
This, otoh, is an error. Which probably shouldn't go silently.
Considering the earlier remark, ext_end would then perhaps also be more
logical to update after the above if().
> + if ( !isdigit(ext_end[-1]) )
> + break;
> +
> + while ( isdigit(*--ext_end) )
> + ;
> +
> + if ( tolower(ext_end[0]) != 'p' || !isdigit(ext_end[-1]) )
Leftover tolower()?
> + {
> + ++ext_end;
> + break;
> + }
> +
> + while ( isdigit(*--ext_end) )
> + ;
> +
> + ++ext_end;
> + break;
> +
> + default:
> + /*
> + * Things are a little easier for single-letter extensions, as
> they
> + * are parsed forwards.
> + *
> + * After checking that our starting position is valid, we need to
> + * ensure that, when isa was incremented at the start of the
> loop,
> + * that it arrived at the start of the next extension.
> + *
> + * If we are already on a non-digit, there is nothing to do.
> Either
> + * we have a multi-letter extension's _, or the start of an
> + * extension.
> + *
> + * Otherwise we have found the current extension's major version
> + * number. Parse past it, and a subsequent p/minor version number
> + * if present. The `p` extension must not appear immediately
> after
> + * a number, so there is no fear of missing it.
> + */
> + if ( unlikely(!isalpha(*ext)) )
> + {
> + ext_err = true;
> + break;
> + }
Like above this also looks to be a situation that maybe better would be
left go entirely silently. I even wonder whether you can safely continue
the parsing in that case: How do you know whether what follows is indeed
the start of an extension name?
> +static void __init riscv_fill_hwcap_from_isa_string(void)
> +{
> + const struct dt_device_node *cpus = dt_find_node_by_path("/cpus");
> + const struct dt_device_node *cpu;
> +
> + if ( !cpus )
> + {
> + printk("Missing /cpus node in the device tree?\n");
> + return;
> + }
> +
> + dt_for_each_child_node(cpus, cpu)
> + {
> + DECLARE_BITMAP(this_isa, RISCV_ISA_EXT_MAX);
> + const char *isa;
> + unsigned long cpuid;
> +
> + if ( !dt_device_type_is_equal(cpu, "cpu") )
> + continue;
> +
> + if ( dt_get_cpuid_from_node(cpu, &cpuid) < 0 )
> + continue;
> +
> + if ( dt_property_read_string(cpu, "riscv,isa", &isa) )
> + {
> + printk("Unable to find \"riscv,isa\" devicetree entry "
> + "for DT's cpu%ld node\n", cpuid);
> + continue;
> + }
> +
> + for ( unsigned int i = 0; (isa[i] != '\0'); i++ )
> + if ( !isdigit(isa[i]) && (isa[i] != '_') && !islower(isa[i]) )
> + panic("According to DT binding riscv,isa must be
> lowercase\n");
> +
> + riscv_isa_parse_string(isa, this_isa);
> +
> + /*
> + * In the unpriv. spec is mentioned:
> + * A RISC-V ISA is defined as a base integer ISA, which must be
> + * present in any implementation, plus optional extensions to
> + * the base ISA.
> + * What means that isa should contain, at least, I or E thereby
> + * this_isa can't be empty too.
> + *
> + * Ensure that this_isa is not empty, so riscv_isa won't be empty
> + * during initialization. This prevents ending up with extensions
> + * not supported by one of the CPUs.
> + */
> + ASSERT(!bitmap_empty(this_isa, RISCV_ISA_EXT_MAX));
This is again an assertion on input we consume. Afaict it would actually
trigger not only on all kinds of invalid inputs, but on something valid
like "rv32e".
> +void __init riscv_fill_hwcap(void)
> +{
> + unsigned int i;
> + size_t req_extns_amount = ARRAY_SIZE(required_extensions);
Imo if you really want such a local "variable", it would better be const.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |