[PATCH v3 00/17] hw/sd/sdcard: Accumulation of cleanups and fixes

Since v2: - Tested-by from Cédric recorded - more patches added :S Since v1: - various patches merged, few more added Various SD card cleanups and fixes accumulated over the years. Various have been useful to help integrating eMMC support (which will come later). Full series for testing: https://gitlab.com/philmd/qemu/-/tags/emmc-v4 Cédric Le Goater (1): hw/sd/sdcard: Introduce definitions for EXT_CSD register Philippe Mathieu-Daudé (16): hw/sd/sdcard: Deprecate support for spec v1.10 hw/sd/sdcard: Use spec v3.01 by default hw/sd/sdcard: Track last command used to help logging hw/sd/sdcard: Trace block offset in READ/WRITE data accesses hw/sd/sdcard: Trace requested address computed by sd_req_get_address() hw/sd/sdcard: Do not store vendor data on block drive (CMD56) hw/sd/sdcard: Send WRITE_PROT bits MSB first (CMD30) hw/sd/sdcard: Send NUM_WR_BLOCKS bits MSB first (ACMD22) hw/sd/sdcard: Use READY_FOR_DATA definition instead of magic value hw/sd/sdcard: Assign SDCardStates enum values hw/sd/sdcard: Simplify sd_inactive_state handling hw/sd/sdcard: Restrict SWITCH_FUNCTION to sd_transfer_state (CMD6) hw/sd/sdcard: Add direct reference to SDProto in SDState hw/sd/sdcard: Extract sd_blk_len() helper tests/qtest: Disable npcm7xx_sdhci tests using hardcoded RCA hw/sd/sdcard: Generate random RCA value docs/about/deprecated.rst | 6 ++ hw/sd/sdmmc-internal.h | 97 +++++++++++++++++++++ hw/sd/sd.c | 145 ++++++++++++++++++------------- tests/qtest/npcm7xx_sdhci-test.c | 7 ++ hw/sd/trace-events | 6 +- 5 files changed, 199 insertions(+), 62 deletions(-) -- 2.41.0

We use the v2.00 spec by default since commit 2f0939c234 ("sdcard: Add a 'spec_version' property, default to Spec v2.00"). Time to deprecate the v1.10 which doesn't bring much, and is not tested. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> --- docs/about/deprecated.rst | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst index ff3da68208..02cdef14aa 100644 --- a/docs/about/deprecated.rst +++ b/docs/about/deprecated.rst @@ -362,6 +362,12 @@ recommending to switch to their stable counterparts: - "Zve64f" should be replaced with "zve64f" - "Zve64d" should be replaced with "zve64d" +``-device sd-card,spec_version=1`` (since 9.1) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +SD physical layer specification v2.00 supersedes the v1.10 one. +v2.00 is the default since QEMU 3.0.0. + Block device options '''''''''''''''''''' -- 2.41.0

Recent SDHCI expect cards to support the v3.01 spec to negociate lower I/O voltage. Select it by default. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index a48010cfc1..d0a1d5db18 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -2280,7 +2280,7 @@ static void sd_realize(DeviceState *dev, Error **errp) static Property sd_properties[] = { DEFINE_PROP_UINT8("spec_version", SDState, - spec_version, SD_PHY_SPECv2_00_VERS), + spec_version, SD_PHY_SPECv3_01_VERS), DEFINE_PROP_DRIVE("drive", SDState, blk), /* We do not model the chip select pin, so allow the board to select * whether card should be in SSI or MMC/SD mode. It is also up to the -- 2.41.0

The command is selected on the I/O lines, and further processing might be done on the DAT lines via the sd_read_byte() and sd_write_byte() handlers. Since these methods can't distinct between normal and APP commands, keep the name of the current command in the SDState and use it in the DAT handlers. This fixes a bug that all normal commands were displayed as APP commands. Fixes: 2ed61fb57b ("sdcard: Display command name when tracing CMD/ACMD") Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index d0a1d5db18..bc87807793 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -133,6 +133,7 @@ struct SDState { uint32_t pwd_len; uint8_t function_group[6]; uint8_t current_cmd; + const char *last_cmd_name; /* True if we will handle the next command as an ACMD. Note that this does * *not* track the APP_CMD status bit! */ @@ -1154,12 +1155,13 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) uint16_t rca; uint64_t addr; + sd->last_cmd_name = sd_cmd_name(req.cmd); /* CMD55 precedes an ACMD, so we are not interested in tracing it. * However there is no ACMD55, so we want to trace this particular case. */ if (req.cmd != 55 || sd->expecting_acmd) { trace_sdcard_normal_command(sd_proto(sd)->name, - sd_cmd_name(req.cmd), req.cmd, + sd->last_cmd_name, req.cmd, req.arg, sd_state_name(sd->state)); } @@ -1620,7 +1622,8 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) static sd_rsp_type_t sd_app_command(SDState *sd, SDRequest req) { - trace_sdcard_app_command(sd_proto(sd)->name, sd_acmd_name(req.cmd), + sd->last_cmd_name = sd_acmd_name(req.cmd); + trace_sdcard_app_command(sd_proto(sd)->name, sd->last_cmd_name, req.cmd, req.arg, sd_state_name(sd->state)); sd->card_status |= APP_CMD; @@ -1913,7 +1916,7 @@ void sd_write_byte(SDState *sd, uint8_t value) return; trace_sdcard_write_data(sd_proto(sd)->name, - sd_acmd_name(sd->current_cmd), + sd->last_cmd_name, sd->current_cmd, value); switch (sd->current_cmd) { case 24: /* CMD24: WRITE_SINGLE_BLOCK */ @@ -2069,7 +2072,7 @@ uint8_t sd_read_byte(SDState *sd) io_len = (sd->ocr & (1 << 30)) ? 512 : sd->blk_len; trace_sdcard_read_data(sd_proto(sd)->name, - sd_acmd_name(sd->current_cmd), + sd->last_cmd_name, sd->current_cmd, io_len); switch (sd->current_cmd) { case 6: /* CMD6: SWITCH_FUNCTION */ @@ -2214,6 +2217,7 @@ static void sd_instance_init(Object *obj) { SDState *sd = SD_CARD(obj); + sd->last_cmd_name = "UNSET"; sd->enable = true; sd->ocr_power_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, sd_ocr_powerup, sd); } -- 2.41.0

Useful to detect out of bound accesses. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 4 ++-- hw/sd/trace-events | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index bc87807793..090a6fdcdb 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1917,7 +1917,7 @@ void sd_write_byte(SDState *sd, uint8_t value) trace_sdcard_write_data(sd_proto(sd)->name, sd->last_cmd_name, - sd->current_cmd, value); + sd->current_cmd, sd->data_offset, value); switch (sd->current_cmd) { case 24: /* CMD24: WRITE_SINGLE_BLOCK */ sd->data[sd->data_offset ++] = value; @@ -2073,7 +2073,7 @@ uint8_t sd_read_byte(SDState *sd) trace_sdcard_read_data(sd_proto(sd)->name, sd->last_cmd_name, - sd->current_cmd, io_len); + sd->current_cmd, sd->data_offset, io_len); switch (sd->current_cmd) { case 6: /* CMD6: SWITCH_FUNCTION */ ret = sd->data[sd->data_offset ++]; diff --git a/hw/sd/trace-events b/hw/sd/trace-events index 724365efc3..0eee98a646 100644 --- a/hw/sd/trace-events +++ b/hw/sd/trace-events @@ -52,8 +52,8 @@ sdcard_lock(void) "" sdcard_unlock(void) "" sdcard_read_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x" sdcard_write_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x" -sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint8_t value) "%s %20s/ CMD%02d value 0x%02x" -sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint32_t length) "%s %20s/ CMD%02d len %" PRIu32 +sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint32_t offset, uint8_t value) "%s %20s/ CMD%02d ofs %"PRIu32" value 0x%02x" +sdcard_read_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint32_t offset, uint32_t length) "%s %20s/ CMD%02d ofs %"PRIu32" len %" PRIu32 sdcard_set_voltage(uint16_t millivolts) "%u mV" # pxa2xx_mmci.c -- 2.41.0

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> --- hw/sd/sd.c | 9 +++++++-- hw/sd/trace-events | 1 + 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 090a6fdcdb..464576751a 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -608,10 +608,15 @@ static void sd_response_r7_make(SDState *sd, uint8_t *response) static uint64_t sd_req_get_address(SDState *sd, SDRequest req) { + uint64_t addr; + if (FIELD_EX32(sd->ocr, OCR, CARD_CAPACITY)) { - return (uint64_t) req.arg << HWBLOCK_SHIFT; + addr = (uint64_t) req.arg << HWBLOCK_SHIFT; + } else { + addr = req.arg; } - return req.arg; + trace_sdcard_req_addr(req.arg, addr); + return addr; } static inline uint64_t sd_addr_to_wpnum(uint64_t addr) diff --git a/hw/sd/trace-events b/hw/sd/trace-events index 0eee98a646..43eaeba149 100644 --- a/hw/sd/trace-events +++ b/hw/sd/trace-events @@ -50,6 +50,7 @@ sdcard_ejected(void) "" sdcard_erase(uint32_t first, uint32_t last) "addr first 0x%" PRIx32" last 0x%" PRIx32 sdcard_lock(void) "" sdcard_unlock(void) "" +sdcard_req_addr(uint32_t req_arg, uint64_t addr) "req 0x%" PRIx32 " addr 0x%" PRIx64 sdcard_read_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x" sdcard_write_block(uint64_t addr, uint32_t len) "addr 0x%" PRIx64 " size 0x%x" sdcard_write_data(const char *proto, const char *cmd_desc, uint8_t cmd, uint32_t offset, uint8_t value) "%s %20s/ CMD%02d ofs %"PRIu32" value 0x%02x" -- 2.41.0

"General command" (GEN_CMD, CMD56) is described as: GEN_CMD is the same as the single block read or write commands (CMD24 or CMD17). The difference is that [...] the data block is not a memory payload data but has a vendor specific format and meaning. Thus this block must not be stored overwriting data block on underlying storage drive. Keep it in a dedicated 'vendor_data[]' array. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- RFC: Is it safe to reuse VMSTATE_UNUSED_V() (which happens to be the same size)? Cc: Peter Xu <peterx@redhat.com> Cc: Fabiano Rosas <farosas@suse.de> --- hw/sd/sd.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 464576751a..1f3eea6e84 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -142,6 +142,8 @@ struct SDState { uint64_t data_start; uint32_t data_offset; uint8_t data[512]; + uint8_t vendor_data[512]; + qemu_irq readonly_cb; qemu_irq inserted_cb; QEMUTimer *ocr_power_timer; @@ -656,6 +658,7 @@ static void sd_reset(DeviceState *dev) sd->wp_switch = sd->blk ? !blk_is_writable(sd->blk) : false; sd->wp_group_bits = sect; sd->wp_group_bmap = bitmap_new(sd->wp_group_bits); + memset(sd->vendor_data, 0xec, sizeof(sd->vendor_data)); memset(sd->function_group, 0, sizeof(sd->function_group)); sd->erase_start = INVALID_ADDRESS; sd->erase_end = INVALID_ADDRESS; @@ -771,7 +774,7 @@ static const VMStateDescription sd_vmstate = { VMSTATE_UINT64(data_start, SDState), VMSTATE_UINT32(data_offset, SDState), VMSTATE_UINT8_ARRAY(data, SDState, 512), - VMSTATE_UNUSED_V(1, 512), + VMSTATE_UINT8_ARRAY(vendor_data, SDState, 512), VMSTATE_BOOL(enable, SDState), VMSTATE_END_OF_LIST() }, @@ -2029,9 +2032,8 @@ void sd_write_byte(SDState *sd, uint8_t value) break; case 56: /* CMD56: GEN_CMD */ - sd->data[sd->data_offset ++] = value; - if (sd->data_offset >= sd->blk_len) { - APP_WRITE_BLOCK(sd->data_start, sd->data_offset); + sd->vendor_data[sd->data_offset ++] = value; + if (sd->data_offset >= sizeof(sd->vendor_data)) { sd->state = sd_transfer_state; } break; @@ -2165,12 +2167,11 @@ uint8_t sd_read_byte(SDState *sd) break; case 56: /* CMD56: GEN_CMD */ - if (sd->data_offset == 0) - APP_READ_BLOCK(sd->data_start, sd->blk_len); - ret = sd->data[sd->data_offset ++]; + ret = sd->vendor_data[sd->data_offset ++]; - if (sd->data_offset >= sd->blk_len) + if (sd->data_offset >= sizeof(sd->vendor_data)) { sd->state = sd_transfer_state; + } break; default: -- 2.41.0

Philippe Mathieu-Daudé <philmd@linaro.org> writes:
"General command" (GEN_CMD, CMD56) is described as:
GEN_CMD is the same as the single block read or write commands (CMD24 or CMD17). The difference is that [...] the data block is not a memory payload data but has a vendor specific format and meaning.
Thus this block must not be stored overwriting data block on underlying storage drive. Keep it in a dedicated 'vendor_data[]' array.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- RFC: Is it safe to reuse VMSTATE_UNUSED_V() (which happens to be the same size)?
Hi, sorry it took some time to get to this, I had just left for vacation when you first posted. I think it's ok: { "field": "unused", "version_id": 1, "field_exists": false, "size": 512 }, vs. { "field": "vendor_data", "version_id": 0, "field_exists": false, "num": 512, "size": 1 }, The unused field was introduced in 2016 so there's no chance of migrating a QEMU that old to/from 9.1.

On Tue, Jul 09, 2024 at 05:38:54PM -0300, Fabiano Rosas wrote:
Philippe Mathieu-Daudé <philmd@linaro.org> writes:
"General command" (GEN_CMD, CMD56) is described as:
GEN_CMD is the same as the single block read or write commands (CMD24 or CMD17). The difference is that [...] the data block is not a memory payload data but has a vendor specific format and meaning.
Thus this block must not be stored overwriting data block on underlying storage drive. Keep it in a dedicated 'vendor_data[]' array.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- RFC: Is it safe to reuse VMSTATE_UNUSED_V() (which happens to be the same size)?
Hi, sorry it took some time to get to this, I had just left for vacation when you first posted.
And I totally overlooked there's the email.. until you replied. Welcome back.
I think it's ok:
{ "field": "unused", "version_id": 1, "field_exists": false, "size": 512 },
vs.
{ "field": "vendor_data", "version_id": 0, "field_exists": false, "num": 512, "size": 1 },
The unused field was introduced in 2016 so there's no chance of migrating a QEMU that old to/from 9.1.
What happens if an old qemu 9.0 sends rubbish here to a new QEMU, while the new QEMU would consider it meaningful data? -- Peter Xu

Peter Xu <peterx@redhat.com> writes:
On Tue, Jul 09, 2024 at 05:38:54PM -0300, Fabiano Rosas wrote:
Philippe Mathieu-Daudé <philmd@linaro.org> writes:
"General command" (GEN_CMD, CMD56) is described as:
GEN_CMD is the same as the single block read or write commands (CMD24 or CMD17). The difference is that [...] the data block is not a memory payload data but has a vendor specific format and meaning.
Thus this block must not be stored overwriting data block on underlying storage drive. Keep it in a dedicated 'vendor_data[]' array.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- RFC: Is it safe to reuse VMSTATE_UNUSED_V() (which happens to be the same size)?
Hi, sorry it took some time to get to this, I had just left for vacation when you first posted.
And I totally overlooked there's the email.. until you replied. Welcome back.
Thanks!
I think it's ok:
{ "field": "unused", "version_id": 1, "field_exists": false, "size": 512 },
vs.
{ "field": "vendor_data", "version_id": 0, "field_exists": false, "num": 512, "size": 1 },
The unused field was introduced in 2016 so there's no chance of migrating a QEMU that old to/from 9.1.
What happens if an old qemu 9.0 sends rubbish here to a new QEMU, while the new QEMU would consider it meaningful data?
It will send zeros, no? The code will have to cope with that. The alternative is to put the vendor_data in a subsection and the code will also have to cope with the lack of data when the old QEMU doesn't send it.

On Wed, Jul 10, 2024 at 11:08:20AM -0300, Fabiano Rosas wrote:
I think it's ok:
{ "field": "unused", "version_id": 1, "field_exists": false, "size": 512 },
vs.
{ "field": "vendor_data", "version_id": 0, "field_exists": false, "num": 512, "size": 1 },
The unused field was introduced in 2016 so there's no chance of migrating a QEMU that old to/from 9.1.
What happens if an old qemu 9.0 sends rubbish here to a new QEMU, while the new QEMU would consider it meaningful data?
It will send zeros, no? The code will have to cope with that. The alternative is to put the vendor_data in a subsection and the code will also have to cope with the lack of data when the old QEMU doesn't send it.
Ah indeed, that "static const uint8_t buf[1024]" is there at least since 2017. So yes, probably always sending zeros. Nothing I can think of otherwise indeed, if we want to trust that nothing will migrate before 2016. It's just that we may want to know how that "2016" is justified to be safe if we would like to allow that in the future. One thing _could_ be that "rule of thumb" is we plan to obsolete machines with 6 years, so anything "UNUSED" older than 6 years can be over-written? -- Peter Xu

Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 11:08:20AM -0300, Fabiano Rosas wrote:
I think it's ok:
{ "field": "unused", "version_id": 1, "field_exists": false, "size": 512 },
vs.
{ "field": "vendor_data", "version_id": 0, "field_exists": false, "num": 512, "size": 1 },
The unused field was introduced in 2016 so there's no chance of migrating a QEMU that old to/from 9.1.
What happens if an old qemu 9.0 sends rubbish here to a new QEMU, while the new QEMU would consider it meaningful data?
It will send zeros, no? The code will have to cope with that. The alternative is to put the vendor_data in a subsection and the code will also have to cope with the lack of data when the old QEMU doesn't send it.
Ah indeed, that "static const uint8_t buf[1024]" is there at least since 2017. So yes, probably always sending zeros.
@Philippe, can vendor_data be 0 after migration? Otherwise 9.0 -> 9.1 migration might crash.
Nothing I can think of otherwise indeed, if we want to trust that nothing will migrate before 2016. It's just that we may want to know how that "2016" is justified to be safe if we would like to allow that in the future.
It's not about trust, we simply don't support migrations other than n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
One thing _could_ be that "rule of thumb" is we plan to obsolete machines with 6 years, so anything "UNUSED" older than 6 years can be over-written?

On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote:
It's not about trust, we simply don't support migrations other than n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
Where does it come from? I thought we suppport that.. The same question would be: are we requesting an OpenStack cluster to always upgrade QEMU with +1 versions, otherwise migration will fail? -- Peter Xu

Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote:
It's not about trust, we simply don't support migrations other than n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
Where does it come from? I thought we suppport that..
I'm taking that from: docs/devel/migration/main.rst: "In general QEMU tries to maintain forward migration compatibility (i.e. migrating from QEMU n->n+1) and there are users who benefit from backward compatibility as well." But of course it doesn't say whether that comes with a transitive rule allowing n->n+2 migrations.
The same question would be: are we requesting an OpenStack cluster to always upgrade QEMU with +1 versions, otherwise migration will fail?
Will an OpenStack cluster be using upstream QEMU? If not, then that's a question for the distro. In a very practical sense, we're not requesting anything. We barely test n->n+1/n->n-1, even if we had a strong support statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU 9.1 should succeed.

On Wed, Jul 10, 2024 at 04:48:23PM -0300, Fabiano Rosas wrote:
Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote:
It's not about trust, we simply don't support migrations other than n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
Where does it come from? I thought we suppport that..
I'm taking that from:
docs/devel/migration/main.rst: "In general QEMU tries to maintain forward migration compatibility (i.e. migrating from QEMU n->n+1) and there are users who benefit from backward compatibility as well."
But of course it doesn't say whether that comes with a transitive rule allowing n->n+2 migrations.
I'd say that "i.e." implies n->n+1 is not the only forward migration we would support. I _think_ we should support all forward migration as long as the machine type matches.
The same question would be: are we requesting an OpenStack cluster to always upgrade QEMU with +1 versions, otherwise migration will fail?
Will an OpenStack cluster be using upstream QEMU? If not, then that's a
It's an example to show what I meant! :) Nothing else. Definitely not saying that everyone should use an upstream released QEMU (but in reality, it's not a problem, I think, and I do feel like people use them, perhaps more with the stable releases).
question for the distro. In a very practical sense, we're not requesting anything. We barely test n->n+1/n->n-1, even if we had a strong support statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU 9.1 should succeed.
No matter what we test in CI, I don't think we should break that for >1 versions.. I hope 2.7->9.1 keeps working, otherwise I think it's legal to file a bug by anyone. For example, I randomly fetched a bug report: https://gitlab.com/qemu-project/qemu/-/issues/1937 QEMU version: 6.2 and 7.2.5 And I believe that's the common case even for upstream. If we don't do that right for upstream, it can be impossible tasks for downstream and for all of us to maintain. -- Peter Xu

Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 04:48:23PM -0300, Fabiano Rosas wrote:
Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote:
It's not about trust, we simply don't support migrations other than n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
Where does it come from? I thought we suppport that..
I'm taking that from:
docs/devel/migration/main.rst: "In general QEMU tries to maintain forward migration compatibility (i.e. migrating from QEMU n->n+1) and there are users who benefit from backward compatibility as well."
But of course it doesn't say whether that comes with a transitive rule allowing n->n+2 migrations.
I'd say that "i.e." implies n->n+1 is not the only forward migration we would support.
I _think_ we should support all forward migration as long as the machine type matches.
The same question would be: are we requesting an OpenStack cluster to always upgrade QEMU with +1 versions, otherwise migration will fail?
Will an OpenStack cluster be using upstream QEMU? If not, then that's a
It's an example to show what I meant! :) Nothing else. Definitely not saying that everyone should use an upstream released QEMU (but in reality, it's not a problem, I think, and I do feel like people use them, perhaps more with the stable releases).
question for the distro. In a very practical sense, we're not requesting anything. We barely test n->n+1/n->n-1, even if we had a strong support statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU 9.1 should succeed.
No matter what we test in CI, I don't think we should break that for >1 versions.. I hope 2.7->9.1 keeps working, otherwise I think it's legal to file a bug by anyone.
For example, I randomly fetched a bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1937
QEMU version: 6.2 and 7.2.5
And I believe that's the common case even for upstream. If we don't do that right for upstream, it can be impossible tasks for downstream and for all of us to maintain.
But do we do that right currently? I have no idea. Have we ever done it? And we're here discussing a hypothetical 2.7->9.1 ... So we cannot reuse the UNUSED field because QEMU from 2016 might send their data and QEMU from 2024 would interpret it wrong. How do we proceed? Add a subsection. And make the code survive when receiving 0. @Peter is that it? What about backwards-compat? We'll need a property as well it seems.

On Wed, Jul 10, 2024 at 06:38:26PM -0300, Fabiano Rosas wrote:
Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 04:48:23PM -0300, Fabiano Rosas wrote:
Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote:
It's not about trust, we simply don't support migrations other than n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
Where does it come from? I thought we suppport that..
I'm taking that from:
docs/devel/migration/main.rst: "In general QEMU tries to maintain forward migration compatibility (i.e. migrating from QEMU n->n+1) and there are users who benefit from backward compatibility as well."
But of course it doesn't say whether that comes with a transitive rule allowing n->n+2 migrations.
I'd say that "i.e." implies n->n+1 is not the only forward migration we would support.
I _think_ we should support all forward migration as long as the machine type matches.
The same question would be: are we requesting an OpenStack cluster to always upgrade QEMU with +1 versions, otherwise migration will fail?
Will an OpenStack cluster be using upstream QEMU? If not, then that's a
It's an example to show what I meant! :) Nothing else. Definitely not saying that everyone should use an upstream released QEMU (but in reality, it's not a problem, I think, and I do feel like people use them, perhaps more with the stable releases).
question for the distro. In a very practical sense, we're not requesting anything. We barely test n->n+1/n->n-1, even if we had a strong support statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU 9.1 should succeed.
No matter what we test in CI, I don't think we should break that for >1 versions.. I hope 2.7->9.1 keeps working, otherwise I think it's legal to file a bug by anyone.
For example, I randomly fetched a bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1937
QEMU version: 6.2 and 7.2.5
And I believe that's the common case even for upstream. If we don't do that right for upstream, it can be impossible tasks for downstream and for all of us to maintain.
But do we do that right currently? I have no idea. Have we ever done it? And we're here discussing a hypothetical 2.7->9.1 ...
So we cannot reuse the UNUSED field because QEMU from 2016 might send their data and QEMU from 2024 would interpret it wrong.
How do we proceed? Add a subsection. And make the code survive when receiving 0.
@Peter is that it? What about backwards-compat? We'll need a property as well it seems.
Compat property is definitely one way to go, but I think it's you who more or less persuaded me that reusing it seems possible! At least I can't yet think of anything bad if it's ancient unused buffers. And that's why I was asking about a sane way to describe the "magic year".. And I was very serious when I said "6 years" to follow the deprecation of machine types, because it'll be something we can follow to say when an unused buffer can be obsolete and it could make some sense: if we will start to deprecate machine types, then it may not make sense to keep any UNUSED compatible forever, after all. And we need that ruler to be as accurate as "always 6 years to follow machine type deprecation procedure", in case someone else tomorrow asks us something that was only UNUSED since 2018. We need a rule of thumb if we want to reuse it, and if all of you agree we can start with this one, perhaps with a comment above the field (before we think all through and make it a rule on deprecating UNUSED)? -- Peter Xu

Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 06:38:26PM -0300, Fabiano Rosas wrote:
Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 04:48:23PM -0300, Fabiano Rosas wrote:
Peter Xu <peterx@redhat.com> writes:
On Wed, Jul 10, 2024 at 01:21:51PM -0300, Fabiano Rosas wrote:
It's not about trust, we simply don't support migrations other than n->n+1 and (maybe) n->n-1. So QEMU from 2016 is certainly not included.
Where does it come from? I thought we suppport that..
I'm taking that from:
docs/devel/migration/main.rst: "In general QEMU tries to maintain forward migration compatibility (i.e. migrating from QEMU n->n+1) and there are users who benefit from backward compatibility as well."
But of course it doesn't say whether that comes with a transitive rule allowing n->n+2 migrations.
I'd say that "i.e." implies n->n+1 is not the only forward migration we would support.
I _think_ we should support all forward migration as long as the machine type matches.
The same question would be: are we requesting an OpenStack cluster to always upgrade QEMU with +1 versions, otherwise migration will fail?
Will an OpenStack cluster be using upstream QEMU? If not, then that's a
It's an example to show what I meant! :) Nothing else. Definitely not saying that everyone should use an upstream released QEMU (but in reality, it's not a problem, I think, and I do feel like people use them, perhaps more with the stable releases).
question for the distro. In a very practical sense, we're not requesting anything. We barely test n->n+1/n->n-1, even if we had a strong support statement I wouldn't be confident saying migration from QEMU 2.7 -> QEMU 9.1 should succeed.
No matter what we test in CI, I don't think we should break that for >1 versions.. I hope 2.7->9.1 keeps working, otherwise I think it's legal to file a bug by anyone.
For example, I randomly fetched a bug report:
https://gitlab.com/qemu-project/qemu/-/issues/1937
QEMU version: 6.2 and 7.2.5
And I believe that's the common case even for upstream. If we don't do that right for upstream, it can be impossible tasks for downstream and for all of us to maintain.
But do we do that right currently? I have no idea. Have we ever done it? And we're here discussing a hypothetical 2.7->9.1 ...
So we cannot reuse the UNUSED field because QEMU from 2016 might send their data and QEMU from 2024 would interpret it wrong.
How do we proceed? Add a subsection. And make the code survive when receiving 0.
@Peter is that it? What about backwards-compat? We'll need a property as well it seems.
Compat property is definitely one way to go, but I think it's you who more or less persuaded me that reusing it seems possible! At least I can't yet think of anything bad if it's ancient unused buffers.
Since we're allowing any old QEMU version to migrate to the most recent one, we need to think of the data that was there before the introduction of the UNUSED field. If that QEMU version is used, then it's not going to be zeroes on the stream, but whatever data was there before. The new QEMU will be expecting the vendor_data introduced in this patch.
And that's why I was asking about a sane way to describe the "magic year".. And I was very serious when I said "6 years" to follow the deprecation of machine types, because it'll be something we can follow to say when an unused buffer can be obsolete and it could make some sense: if we will start to deprecate machine types, then it may not make sense to keep any UNUSED compatible forever, after all.
Is there an easy way to look at a field and tell in which machine type's timeframe it was introduced? If the machine type of that era has been removed, then the field is free to go as well. I'd prefer if we had a hard link instead of just counting years. Maybe we should to that mapping at the machine deprecation time? As in, "look at the unused fields introduced in that timeframe and mark them free".
And we need that ruler to be as accurate as "always 6 years to follow machine type deprecation procedure", in case someone else tomorrow asks us something that was only UNUSED since 2018. We need a rule of thumb if we want to reuse it, and if all of you agree we can start with this one, perhaps with a comment above the field (before we think all through and make it a rule on deprecating UNUSED)?

On Thu, Jul 11, 2024 at 10:34:12AM -0300, Fabiano Rosas wrote:
Is there an easy way to look at a field and tell in which machine type's timeframe it was introduced?
I am not aware of any.
If the machine type of that era has been removed, then the field is free to go as well. I'd prefer if we had a hard link instead of just counting years. Maybe we should to that mapping at the machine deprecation time? As in, "look at the unused fields introduced in that timeframe and mark them free".
We can do that, but depending on how easy it would be. That can be an overkill to me if it's non-trivial. When it becomes complicated, I'd rather make machine compat property easier to use so we always stick with that. Currently it's not as easy to use. Maybe we shouldn't make it a common rule to let people reuse the UNUSED fields, even if in this case it's probably fine? E.g. I don't think it's a huge deal to keep all UNUSED fields forever - sending 512B zeros for only one specific device isn't an issue even if kept forever. If "over 6 years" would be okay and simple enough, then maybe we can stick with that (and only if people would like to reuse a field and ask; that's after all not required..). If more than that I doubt whether we should spend time working on covering all the fields. -- Peter Xu

Peter Xu <peterx@redhat.com> writes:
On Thu, Jul 11, 2024 at 10:34:12AM -0300, Fabiano Rosas wrote:
Is there an easy way to look at a field and tell in which machine type's timeframe it was introduced?
I am not aware of any.
If the machine type of that era has been removed, then the field is free to go as well. I'd prefer if we had a hard link instead of just counting years. Maybe we should to that mapping at the machine deprecation time? As in, "look at the unused fields introduced in that timeframe and mark them free".
We can do that, but depending on how easy it would be. That can be an overkill to me if it's non-trivial. When it becomes complicated, I'd rather make machine compat property easier to use so we always stick with that. Currently it's not as easy to use.
Maybe we shouldn't make it a common rule to let people reuse the UNUSED fields, even if in this case it's probably fine?
E.g. I don't think it's a huge deal to keep all UNUSED fields forever - sending 512B zeros for only one specific device isn't an issue even if kept forever.
If "over 6 years" would be okay and simple enough, then maybe we can stick with that (and only if people would like to reuse a field and ask; that's after all not required..). If more than that I doubt whether we should spend time working on covering all the fields.
I'm fine with a simple rule. But of course, that means we cannot claim to support all kinds of forward migrations anymore. Only those in the 6 year period.

On Thu, Jul 11, 2024 at 11:44:08AM -0300, Fabiano Rosas wrote:
But of course, that means we cannot claim to support all kinds of forward migrations anymore. Only those in the 6 year period.
That "6 years" comes from machine type deprecation period, and migration compatibility is mostly only attached to machine types, and we only ever allowed migration with the same machine type. It means, >6 years migration will never work anyway as soon as we start to deprecate machine types (irrelevant of any reuse of UNUSED), because the >6 years machine types will simply be gone.. See configuration_post_load(), where it'll throw an error upfront when machine type mismatched. -- Peter Xu

Peter Xu <peterx@redhat.com> writes:
On Thu, Jul 11, 2024 at 11:44:08AM -0300, Fabiano Rosas wrote:
But of course, that means we cannot claim to support all kinds of forward migrations anymore. Only those in the 6 year period.
That "6 years" comes from machine type deprecation period, and migration compatibility is mostly only attached to machine types, and we only ever allowed migration with the same machine type.
It means, >6 years migration will never work anyway as soon as we start to deprecate machine types (irrelevant of any reuse of UNUSED), because the >6 years machine types will simply be gone.. See configuration_post_load(), where it'll throw an error upfront when machine type mismatched.
Yes, duh! What am I talking about...

Per sections 3.6.1 (SD Bus Protocol) and 7.3.2 (Responses): In the CMD line the Most Significant Bit is transmitted first. Use the stl_be_p() helper to store the value in big-endian. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- RFC because I'm surprised this has been unnoticed for 17 years (commit a1bb27b1e9 "initial SD card emulation", April 2007). Cc: Peter Maydell <peter.maydell@linaro.org> --- hw/sd/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 1f3eea6e84..4e09640852 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1507,7 +1507,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) } sd->state = sd_sendingdata_state; - *(uint32_t *) sd->data = sd_wpbits(sd, req.arg); + stl_be_p(sd->data, sd_wpbits(sd, req.arg)); sd->data_start = addr; sd->data_offset = 0; return sd_r1; -- 2.41.0

Per sections 3.6.1 (SD Bus Protocol), 4.3.4 "Data Write" and 7.3.2 (Responses): In the CMD line the Most Significant Bit is transmitted first. Use the stl_be_p() helper to store the value in big-endian. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- RFC because I'm surprised this has been unnoticed for 17 years (commit a1bb27b1e9 "initial SD card emulation", April 2007). Cc: Peter Maydell <peter.maydell@linaro.org> --- hw/sd/sd.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 4e09640852..1f37d9c93a 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1668,8 +1668,7 @@ static sd_rsp_type_t sd_app_command(SDState *sd, case 22: /* ACMD22: SEND_NUM_WR_BLOCKS */ switch (sd->state) { case sd_transfer_state: - *(uint32_t *) sd->data = sd->blk_written; - + stl_be_p(sd->data, sd->blk_written); sd->state = sd_sendingdata_state; sd->data_start = 0; sd->data_offset = 0; -- 2.41.0

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 1f37d9c93a..135b7d2e23 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -561,7 +561,7 @@ FIELD(CSR, OUT_OF_RANGE, 31, 1) static void sd_set_cardstatus(SDState *sd) { - sd->card_status = 0x00000100; + sd->card_status = READY_FOR_DATA; } static void sd_set_sdstatus(SDState *sd) -- 2.41.0

SDCardStates enum values are specified, so assign them correspondingly. It will be useful later when we add states from later specs, which might not be continuous. See CURRENT_STATE bits in section 4.10.1 "Card Status". Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 135b7d2e23..fbdfafa3a6 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -75,16 +75,16 @@ enum SDCardModes { }; enum SDCardStates { - sd_inactive_state = -1, - sd_idle_state = 0, - sd_ready_state, - sd_identification_state, - sd_standby_state, - sd_transfer_state, - sd_sendingdata_state, - sd_receivingdata_state, - sd_programming_state, - sd_disconnect_state, + sd_inactive_state = -1, + sd_idle_state = 0, + sd_ready_state = 1, + sd_identification_state = 2, + sd_standby_state = 3, + sd_transfer_state = 4, + sd_sendingdata_state = 5, + sd_receivingdata_state = 6, + sd_programming_state = 7, + sd_disconnect_state = 8, }; typedef sd_rsp_type_t (*sd_cmd_handler)(SDState *sd, SDRequest req); -- 2.41.0

Card entering sd_inactive_state powers off, and won't respond anymore. Handle that once when entering sd_do_command(). Remove condition always true in sd_cmd_GO_IDLE_STATE(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index fbdfafa3a6..7533a78cf6 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1081,10 +1081,8 @@ static sd_rsp_type_t sd_cmd_unimplemented(SDState *sd, SDRequest req) /* CMD0 */ static sd_rsp_type_t sd_cmd_GO_IDLE_STATE(SDState *sd, SDRequest req) { - if (sd->state != sd_inactive_state) { - sd->state = sd_idle_state; - sd_reset(DEVICE(sd)); - } + sd->state = sd_idle_state; + sd_reset(DEVICE(sd)); return sd_is_spi(sd) ? sd_r1 : sd_r0; } @@ -1579,7 +1577,6 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) switch (sd->state) { case sd_ready_state: case sd_identification_state: - case sd_inactive_state: return sd_illegal; case sd_idle_state: if (rca) { @@ -1800,6 +1797,11 @@ int sd_do_command(SDState *sd, SDRequest *req, return 0; } + if (sd->state == sd_inactive_state) { + rtype = sd_illegal; + goto send_response; + } + if (sd_req_crc_validate(req)) { sd->card_status |= COM_CRC_ERROR; rtype = sd_illegal; -- 2.41.0

SWITCH_FUNCTION is only allowed in TRANSFER state (See 4.8 "Card State Transition Table). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 7533a78cf6..8f441e418c 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -1205,6 +1205,10 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) if (sd->mode != sd_data_transfer_mode) { return sd_invalid_mode_for_cmd(sd, req); } + if (sd->state != sd_transfer_state) { + return sd_invalid_state_for_cmd(sd, req); + } + sd_function_switch(sd, req.arg); sd->state = sd_sendingdata_state; sd->data_start = 0; -- 2.41.0

Keep direct reference to SDProto in SDState, remove then unnecessary sd_proto(). Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 37 +++++++++++++++++-------------------- 1 file changed, 17 insertions(+), 20 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 8f441e418c..aaa50ab2c5 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -116,6 +116,8 @@ struct SDState { uint8_t spec_version; BlockBackend *blk; + const SDProto *proto; + /* Runtime changeables */ uint32_t mode; /* current card mode, one of SDCardModes */ @@ -154,18 +156,11 @@ struct SDState { static void sd_realize(DeviceState *dev, Error **errp); -static const struct SDProto *sd_proto(SDState *sd) -{ - SDCardClass *sc = SD_CARD_GET_CLASS(sd); - - return sc->proto; -} - static const SDProto sd_proto_spi; static bool sd_is_spi(SDState *sd) { - return sd_proto(sd) == &sd_proto_spi; + return sd->proto == &sd_proto_spi; } static const char *sd_version_str(enum SDPhySpecificationVersion version) @@ -1044,7 +1039,7 @@ static bool address_in_range(SDState *sd, const char *desc, static sd_rsp_type_t sd_invalid_state_for_cmd(SDState *sd, SDRequest req) { qemu_log_mask(LOG_GUEST_ERROR, "%s: CMD%i in a wrong state: %s (spec %s)\n", - sd_proto(sd)->name, req.cmd, sd_state_name(sd->state), + sd->proto->name, req.cmd, sd_state_name(sd->state), sd_version_str(sd->spec_version)); return sd_illegal; @@ -1053,7 +1048,7 @@ static sd_rsp_type_t sd_invalid_state_for_cmd(SDState *sd, SDRequest req) static sd_rsp_type_t sd_invalid_mode_for_cmd(SDState *sd, SDRequest req) { qemu_log_mask(LOG_GUEST_ERROR, "%s: CMD%i in a wrong mode: %s (spec %s)\n", - sd_proto(sd)->name, req.cmd, sd_mode_name(sd->mode), + sd->proto->name, req.cmd, sd_mode_name(sd->mode), sd_version_str(sd->spec_version)); return sd_illegal; @@ -1062,7 +1057,7 @@ static sd_rsp_type_t sd_invalid_mode_for_cmd(SDState *sd, SDRequest req) static sd_rsp_type_t sd_cmd_illegal(SDState *sd, SDRequest req) { qemu_log_mask(LOG_GUEST_ERROR, "%s: Unknown CMD%i for spec %s\n", - sd_proto(sd)->name, req.cmd, + sd->proto->name, req.cmd, sd_version_str(sd->spec_version)); return sd_illegal; @@ -1073,7 +1068,7 @@ __attribute__((unused)) static sd_rsp_type_t sd_cmd_unimplemented(SDState *sd, SDRequest req) { qemu_log_mask(LOG_UNIMP, "%s: CMD%i not implemented\n", - sd_proto(sd)->name, req.cmd); + sd->proto->name, req.cmd); return sd_illegal; } @@ -1166,7 +1161,7 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) * However there is no ACMD55, so we want to trace this particular case. */ if (req.cmd != 55 || sd->expecting_acmd) { - trace_sdcard_normal_command(sd_proto(sd)->name, + trace_sdcard_normal_command(sd->proto->name, sd->last_cmd_name, req.cmd, req.arg, sd_state_name(sd->state)); } @@ -1185,8 +1180,8 @@ static sd_rsp_type_t sd_normal_command(SDState *sd, SDRequest req) return sd_illegal; } - if (sd_proto(sd)->cmd[req.cmd]) { - return sd_proto(sd)->cmd[req.cmd](sd, req); + if (sd->proto->cmd[req.cmd]) { + return sd->proto->cmd[req.cmd](sd, req); } switch (req.cmd) { @@ -1632,12 +1627,12 @@ static sd_rsp_type_t sd_app_command(SDState *sd, SDRequest req) { sd->last_cmd_name = sd_acmd_name(req.cmd); - trace_sdcard_app_command(sd_proto(sd)->name, sd->last_cmd_name, + trace_sdcard_app_command(sd->proto->name, sd->last_cmd_name, req.cmd, req.arg, sd_state_name(sd->state)); sd->card_status |= APP_CMD; - if (sd_proto(sd)->acmd[req.cmd]) { - return sd_proto(sd)->acmd[req.cmd](sd, req); + if (sd->proto->acmd[req.cmd]) { + return sd->proto->acmd[req.cmd](sd, req); } switch (req.cmd) { @@ -1928,7 +1923,7 @@ void sd_write_byte(SDState *sd, uint8_t value) if (sd->card_status & (ADDRESS_ERROR | WP_VIOLATION)) return; - trace_sdcard_write_data(sd_proto(sd)->name, + trace_sdcard_write_data(sd->proto->name, sd->last_cmd_name, sd->current_cmd, sd->data_offset, value); switch (sd->current_cmd) { @@ -2083,7 +2078,7 @@ uint8_t sd_read_byte(SDState *sd) io_len = (sd->ocr & (1 << 30)) ? 512 : sd->blk_len; - trace_sdcard_read_data(sd_proto(sd)->name, + trace_sdcard_read_data(sd->proto->name, sd->last_cmd_name, sd->current_cmd, sd->data_offset, io_len); switch (sd->current_cmd) { @@ -2227,7 +2222,9 @@ static const SDProto sd_proto_sd = { static void sd_instance_init(Object *obj) { SDState *sd = SD_CARD(obj); + SDCardClass *sc = SD_CARD_GET_CLASS(sd); + sd->proto = sc->proto; sd->last_cmd_name = "UNSET"; sd->enable = true; sd->ocr_power_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, sd_ocr_powerup, sd); -- 2.41.0

From: Philippe Mathieu-Daudé <f4bug@amsat.org> Extract sd_blk_len() helper, use definitions instead of magic values. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> --- hw/sd/sd.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index aaa50ab2c5..5997e13107 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -603,6 +603,14 @@ static void sd_response_r7_make(SDState *sd, uint8_t *response) stl_be_p(response, sd->vhs); } +static uint32_t sd_blk_len(SDState *sd) +{ + if (FIELD_EX32(sd->ocr, OCR, CARD_CAPACITY)) { + return 1 << HWBLOCK_SHIFT; + } + return sd->blk_len; +} + static uint64_t sd_req_get_address(SDState *sd, SDRequest req) { uint64_t addr; @@ -2076,7 +2084,7 @@ uint8_t sd_read_byte(SDState *sd) if (sd->card_status & (ADDRESS_ERROR | WP_VIOLATION)) return 0x00; - io_len = (sd->ocr & (1 << 30)) ? 512 : sd->blk_len; + io_len = sd_blk_len(sd); trace_sdcard_read_data(sd->proto->name, sd->last_cmd_name, -- 2.41.0

Disable tests using 0x4567 hardcoded RCA otherwise when using random RCA we get: ERROR:../../tests/qtest/npcm7xx_sdhci-test.c:69:write_sdread: assertion failed: (ret == len) not ok /arm/npcm7xx_sdhci/read_sd - ERROR:../../tests/qtest/npcm7xx_sdhci-test.c:69:write_sdread: assertion failed: (ret == len) Bail out! See https://lore.kernel.org/qemu-devel/37f83be9-deb5-42a1-b704-14984351d803@lina... Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> --- Cc: Hao Wu <wuhaotsh@google.com> Cc: Shengtan Mao <stmao@google.com> Cc: Tyrone Ting <kfting@nuvoton.com> --- tests/qtest/npcm7xx_sdhci-test.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/tests/qtest/npcm7xx_sdhci-test.c b/tests/qtest/npcm7xx_sdhci-test.c index 5d68540e52..6a42b142ad 100644 --- a/tests/qtest/npcm7xx_sdhci-test.c +++ b/tests/qtest/npcm7xx_sdhci-test.c @@ -44,6 +44,7 @@ static QTestState *setup_sd_card(void) sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0x41200000, 0, (41 << 8)); sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0, 0, SDHC_ALL_SEND_CID); sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0, 0, SDHC_SEND_RELATIVE_ADDR); + g_test_skip("hardcoded 0x4567 card address"); sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0x45670000, 0, SDHC_SELECT_DESELECT_CARD); @@ -76,6 +77,9 @@ static void test_read_sd(void) { QTestState *qts = setup_sd_card(); + g_test_skip("hardcoded 0x4567 card address used in setup_sd_card()"); + return; + write_sdread(qts, "hello world"); write_sdread(qts, "goodbye"); @@ -108,6 +112,9 @@ static void test_write_sd(void) { QTestState *qts = setup_sd_card(); + g_test_skip("hardcoded 0x4567 card address used in setup_sd_card()"); + return; + sdwrite_read(qts, "hello world"); sdwrite_read(qts, "goodbye"); -- 2.41.0

On 27/06/2024 18.22, Philippe Mathieu-Daudé wrote:
Disable tests using 0x4567 hardcoded RCA otherwise when using random RCA we get:
ERROR:../../tests/qtest/npcm7xx_sdhci-test.c:69:write_sdread: assertion failed: (ret == len) not ok /arm/npcm7xx_sdhci/read_sd - ERROR:../../tests/qtest/npcm7xx_sdhci-test.c:69:write_sdread: assertion failed: (ret == len) Bail out!
See https://lore.kernel.org/qemu-devel/37f83be9-deb5-42a1-b704-14984351d803@lina...
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> --- Cc: Hao Wu <wuhaotsh@google.com> Cc: Shengtan Mao <stmao@google.com> Cc: Tyrone Ting <kfting@nuvoton.com> --- tests/qtest/npcm7xx_sdhci-test.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tests/qtest/npcm7xx_sdhci-test.c b/tests/qtest/npcm7xx_sdhci-test.c index 5d68540e52..6a42b142ad 100644 --- a/tests/qtest/npcm7xx_sdhci-test.c +++ b/tests/qtest/npcm7xx_sdhci-test.c @@ -44,6 +44,7 @@ static QTestState *setup_sd_card(void) sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0x41200000, 0, (41 << 8)); sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0, 0, SDHC_ALL_SEND_CID); sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0, 0, SDHC_SEND_RELATIVE_ADDR); + g_test_skip("hardcoded 0x4567 card address");
This g_test_skip here does not make too much sense (since you're doing it in the caller site, too) ... could you please replace it with a proper comment why this code needs to be reworked? Thanks! Thomas
sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0x45670000, 0, SDHC_SELECT_DESELECT_CARD);
@@ -76,6 +77,9 @@ static void test_read_sd(void) { QTestState *qts = setup_sd_card();
+ g_test_skip("hardcoded 0x4567 card address used in setup_sd_card()"); + return; + write_sdread(qts, "hello world"); write_sdread(qts, "goodbye");
@@ -108,6 +112,9 @@ static void test_write_sd(void) { QTestState *qts = setup_sd_card();
+ g_test_skip("hardcoded 0x4567 card address used in setup_sd_card()"); + return; + sdwrite_read(qts, "hello world"); sdwrite_read(qts, "goodbye");

On 27/6/24 18:47, Thomas Huth wrote:
On 27/06/2024 18.22, Philippe Mathieu-Daudé wrote:
Disable tests using 0x4567 hardcoded RCA otherwise when using random RCA we get:
ERROR:../../tests/qtest/npcm7xx_sdhci-test.c:69:write_sdread: assertion failed: (ret == len) not ok /arm/npcm7xx_sdhci/read_sd - ERROR:../../tests/qtest/npcm7xx_sdhci-test.c:69:write_sdread: assertion failed: (ret == len) Bail out!
See https://lore.kernel.org/qemu-devel/37f83be9-deb5-42a1-b704-14984351d803@lina...
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> --- Cc: Hao Wu <wuhaotsh@google.com> Cc: Shengtan Mao <stmao@google.com> Cc: Tyrone Ting <kfting@nuvoton.com> --- tests/qtest/npcm7xx_sdhci-test.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tests/qtest/npcm7xx_sdhci-test.c b/tests/qtest/npcm7xx_sdhci-test.c index 5d68540e52..6a42b142ad 100644 --- a/tests/qtest/npcm7xx_sdhci-test.c +++ b/tests/qtest/npcm7xx_sdhci-test.c @@ -44,6 +44,7 @@ static QTestState *setup_sd_card(void) sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0x41200000, 0, (41 << 8)); sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0, 0, SDHC_ALL_SEND_CID); sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0, 0, SDHC_SEND_RELATIVE_ADDR); + g_test_skip("hardcoded 0x4567 card address");
This g_test_skip here does not make too much sense (since you're doing it in the caller site, too) ... could you please replace it with a proper comment why this code needs to be reworked? Thanks!
Sorry I forgot to tag "NOTFORMERGE". Ideally I'd wait Google maintainers to fix the test so we don't need this patch. If they don't I'll update as you suggested. (An alternative I haven't investigated is to run the test using the -seed argument to force deterministic mode). Thanks!
Thomas
sdhci_cmd_regs(qts, NPCM7XX_MMC_BA, 0, 0, 0x45670000, 0, SDHC_SELECT_DESELECT_CARD); @@ -76,6 +77,9 @@ static void test_read_sd(void) { QTestState *qts = setup_sd_card(); + g_test_skip("hardcoded 0x4567 card address used in setup_sd_card()"); + return;

Rather than using the obscure 0x4567 magic value, use a real random one. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Cédric Le Goater <clg@redhat.com> --- hw/sd/sd.c | 11 ++++++++--- hw/sd/trace-events | 1 + 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/hw/sd/sd.c b/hw/sd/sd.c index 5997e13107..d85b2906f4 100644 --- a/hw/sd/sd.c +++ b/hw/sd/sd.c @@ -46,6 +46,7 @@ #include "qemu/error-report.h" #include "qemu/timer.h" #include "qemu/log.h" +#include "qemu/guest-random.h" #include "qemu/module.h" #include "sdmmc-internal.h" #include "trace.h" @@ -488,9 +489,10 @@ static void sd_set_csd(SDState *sd, uint64_t size) /* Relative Card Address register */ -static void sd_set_rca(SDState *sd) +static void sd_set_rca(SDState *sd, uint16_t value) { - sd->rca += 0x4567; + trace_sdcard_set_rca(value); + sd->rca = value; } static uint16_t sd_req_get_rca(SDState *s, SDRequest req) @@ -1113,11 +1115,14 @@ static sd_rsp_type_t sd_cmd_ALL_SEND_CID(SDState *sd, SDRequest req) /* CMD3 */ static sd_rsp_type_t sd_cmd_SEND_RELATIVE_ADDR(SDState *sd, SDRequest req) { + uint16_t random_rca; + switch (sd->state) { case sd_identification_state: case sd_standby_state: sd->state = sd_standby_state; - sd_set_rca(sd); + qemu_guest_getrandom_nofail(&random_rca, sizeof(random_rca)); + sd_set_rca(sd, random_rca); return sd_r6; default: diff --git a/hw/sd/trace-events b/hw/sd/trace-events index 43eaeba149..6a51b0e906 100644 --- a/hw/sd/trace-events +++ b/hw/sd/trace-events @@ -43,6 +43,7 @@ sdcard_response(const char *rspdesc, int rsplen) "%s (sz:%d)" sdcard_powerup(void) "" sdcard_inquiry_cmd41(void) "" sdcard_reset(void) "" +sdcard_set_rca(uint16_t value) "new RCA: 0x%04x" sdcard_set_blocklen(uint16_t length) "block len 0x%03x" sdcard_set_block_count(uint32_t cnt) "block cnt 0x%"PRIx32 sdcard_inserted(bool readonly) "read_only: %u" -- 2.41.0

From: Cédric Le Goater <clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> --- hw/sd/sdmmc-internal.h | 97 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 97 insertions(+) diff --git a/hw/sd/sdmmc-internal.h b/hw/sd/sdmmc-internal.h index d8bf17d204..306ffa7f53 100644 --- a/hw/sd/sdmmc-internal.h +++ b/hw/sd/sdmmc-internal.h @@ -11,6 +11,103 @@ #ifndef SDMMC_INTERNAL_H #define SDMMC_INTERNAL_H +/* + * EXT_CSD fields + */ + +#define EXT_CSD_CMDQ_MODE_EN 15 /* R/W */ +#define EXT_CSD_FLUSH_CACHE 32 /* W */ +#define EXT_CSD_CACHE_CTRL 33 /* R/W */ +#define EXT_CSD_POWER_OFF_NOTIFICATION 34 /* R/W */ +#define EXT_CSD_PACKED_FAILURE_INDEX 35 /* RO */ +#define EXT_CSD_PACKED_CMD_STATUS 36 /* RO */ +#define EXT_CSD_EXP_EVENTS_STATUS 54 /* RO, 2 bytes */ +#define EXT_CSD_EXP_EVENTS_CTRL 56 /* R/W, 2 bytes */ +#define EXT_CSD_DATA_SECTOR_SIZE 61 /* R */ +#define EXT_CSD_GP_SIZE_MULT 143 /* R/W */ +#define EXT_CSD_PARTITION_SETTING_COMPLETED 155 /* R/W */ +#define EXT_CSD_PARTITION_ATTRIBUTE 156 /* R/W */ +#define EXT_CSD_PARTITION_SUPPORT 160 /* RO */ +#define EXT_CSD_HPI_MGMT 161 /* R/W */ +#define EXT_CSD_RST_N_FUNCTION 162 /* R/W */ +#define EXT_CSD_BKOPS_EN 163 /* R/W */ +#define EXT_CSD_BKOPS_START 164 /* W */ +#define EXT_CSD_SANITIZE_START 165 /* W */ +#define EXT_CSD_WR_REL_PARAM 166 /* RO */ +#define EXT_CSD_RPMB_MULT 168 /* RO */ +#define EXT_CSD_FW_CONFIG 169 /* R/W */ +#define EXT_CSD_BOOT_WP 173 /* R/W */ +#define EXT_CSD_ERASE_GROUP_DEF 175 /* R/W */ +#define EXT_CSD_PART_CONFIG 179 /* R/W */ +#define EXT_CSD_ERASED_MEM_CONT 181 /* RO */ +#define EXT_CSD_BUS_WIDTH 183 /* R/W */ +#define EXT_CSD_STROBE_SUPPORT 184 /* RO */ +#define EXT_CSD_HS_TIMING 185 /* R/W */ +#define EXT_CSD_POWER_CLASS 187 /* R/W */ +#define EXT_CSD_REV 192 /* RO */ +#define EXT_CSD_STRUCTURE 194 /* RO */ +#define EXT_CSD_CARD_TYPE 196 /* RO */ +#define EXT_CSD_DRIVER_STRENGTH 197 /* RO */ +#define EXT_CSD_OUT_OF_INTERRUPT_TIME 198 /* RO */ +#define EXT_CSD_PART_SWITCH_TIME 199 /* RO */ +#define EXT_CSD_PWR_CL_52_195 200 /* RO */ +#define EXT_CSD_PWR_CL_26_195 201 /* RO */ +#define EXT_CSD_PWR_CL_52_360 202 /* RO */ +#define EXT_CSD_PWR_CL_26_360 203 /* RO */ +#define EXT_CSD_SEC_CNT 212 /* RO, 4 bytes */ +#define EXT_CSD_S_A_TIMEOUT 217 /* RO */ +#define EXT_CSD_S_C_VCCQ 219 /* RO */ +#define EXT_CSD_S_C_VCC 220 /* RO */ +#define EXT_CSD_REL_WR_SEC_C 222 /* RO */ +#define EXT_CSD_HC_WP_GRP_SIZE 221 /* RO */ +#define EXT_CSD_ERASE_TIMEOUT_MULT 223 /* RO */ +#define EXT_CSD_HC_ERASE_GRP_SIZE 224 /* RO */ +#define EXT_CSD_ACC_SIZE 225 /* RO */ +#define EXT_CSD_BOOT_MULT 226 /* RO */ +#define EXT_CSD_BOOT_INFO 228 /* RO */ +#define EXT_CSD_SEC_TRIM_MULT 229 /* RO */ +#define EXT_CSD_SEC_ERASE_MULT 230 /* RO */ +#define EXT_CSD_SEC_FEATURE_SUPPORT 231 /* RO */ +#define EXT_CSD_TRIM_MULT 232 /* RO */ +#define EXT_CSD_PWR_CL_200_195 236 /* RO */ +#define EXT_CSD_PWR_CL_200_360 237 /* RO */ +#define EXT_CSD_PWR_CL_DDR_52_195 238 /* RO */ +#define EXT_CSD_PWR_CL_DDR_52_360 239 /* RO */ +#define EXT_CSD_BKOPS_STATUS 246 /* RO */ +#define EXT_CSD_POWER_OFF_LONG_TIME 247 /* RO */ +#define EXT_CSD_GENERIC_CMD6_TIME 248 /* RO */ +#define EXT_CSD_CACHE_SIZE 249 /* RO, 4 bytes */ +#define EXT_CSD_PWR_CL_DDR_200_360 253 /* RO */ +#define EXT_CSD_FIRMWARE_VERSION 254 /* RO, 8 bytes */ +#define EXT_CSD_PRE_EOL_INFO 267 /* RO */ +#define EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_A 268 /* RO */ +#define EXT_CSD_DEVICE_LIFE_TIME_EST_TYP_B 269 /* RO */ +#define EXT_CSD_CMDQ_DEPTH 307 /* RO */ +#define EXT_CSD_CMDQ_SUPPORT 308 /* RO */ +#define EXT_CSD_SUPPORTED_MODE 493 /* RO */ +#define EXT_CSD_TAG_UNIT_SIZE 498 /* RO */ +#define EXT_CSD_DATA_TAG_SUPPORT 499 /* RO */ +#define EXT_CSD_MAX_PACKED_WRITES 500 /* RO */ +#define EXT_CSD_MAX_PACKED_READS 501 /* RO */ +#define EXT_CSD_BKOPS_SUPPORT 502 /* RO */ +#define EXT_CSD_HPI_FEATURES 503 /* RO */ +#define EXT_CSD_S_CMD_SET 504 /* RO */ + +/* + * EXT_CSD field definitions + */ + +#define EXT_CSD_WR_REL_PARAM_EN (1 << 2) +#define EXT_CSD_WR_REL_PARAM_EN_RPMB_REL_WR (1 << 4) + +#define EXT_CSD_PART_CONFIG_ACC_MASK (0x7) +#define EXT_CSD_PART_CONFIG_ACC_DEFAULT (0x0) +#define EXT_CSD_PART_CONFIG_ACC_BOOT0 (0x1) + +#define EXT_CSD_PART_CONFIG_EN_MASK (0x7 << 3) +#define EXT_CSD_PART_CONFIG_EN_BOOT0 (0x1 << 3) +#define EXT_CSD_PART_CONFIG_EN_USER (0x7 << 3) + #define SDMMC_CMD_MAX 64 /** -- 2.41.0
participants (4)
-
Fabiano Rosas
-
Peter Xu
-
Philippe Mathieu-Daudé
-
Thomas Huth