[Question] How to use Vc to handle partial loads and stores from Vc::Vector<T> types like float_v #362

I am working with a codebase that copies data from a `float_v` object into a `float[]` array. The code checks whether enough elements remain in the array (for example 4 for `__m128`) to cast the destination to `float_v`; otherwise it falls back to a masked gather for the partial case.
In that code I see the implementation as such, in a basic sample form:
```cpp
const std::size_t data_size{7} ;
float data[ data_size ] = {0.0f, 1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f} ;
std::size_t index{5} ;
const float_v tmp_data = float_v::Zero();
const std::size_t float_vLen = float_v::Size;
if( (index+float_vLen) < data_size) {
    reinterpret_cast<float_v&>(data[index]) = tmp_data;
} else {
    const uint_v indices(uint_v::IndexesFromZero());
    (reinterpret_cast<float_v&>(data[index])).gather(reinterpret_cast<const float*>(&tmp_data), indices, simd_cast<float_m>(indices<(data_size - index)));
}
```
## Problem

Is the gather implementation in the `else` branch safe? I have seen mentions that casting from `float*` to `__m128*` is okay but that the other way around is not safe. For reference, see [this Stack Overflow answer](https://stackoverflow.com/a/52117639/17843293) and [this Stack Overflow post](https://stackoverflow.com/questions/71364764/gcc-avx-m256i-cast-to-int-array-leads-to-wrong-values).
I feel like I should `scatter` or `store` to a `float[4]` array and then pass its address instead. Or is the current implementation safe?
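To make that alternative concrete, here is a minimal sketch of what I have in mind: store the whole vector into a temporary buffer, then copy only the lanes that still fit into the array. `store_partial` is a hypothetical helper name, and `FakeFloatV` is only a stand-in so the snippet compiles without Vc. The sketch assumes the real vector type offers `Size` and a `store(float*)` member in the same way; I have not verified this against every Vc version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for Vc::float_v so the sketch compiles without Vc.
// Only the two members used below are modelled: Size and store().
struct FakeFloatV {
    static constexpr std::size_t Size = 4;
    float lanes[Size];
    void store(float* mem) const { std::memcpy(mem, lanes, sizeof(lanes)); }
};

// Writes at most V::Size lanes of `v` to dst[index...] and never writes
// past dst[dst_size - 1]. Returns the number of floats actually written.
template <typename V>
std::size_t store_partial(const V& v, float* dst, std::size_t dst_size,
                          std::size_t index) {
    assert(dst != nullptr);
    if (index >= dst_size) return 0;
    const std::size_t width = V::Size;
    const std::size_t n = std::min(width, dst_size - index);
    alignas(64) float tmp[V::Size];  // bounce buffer, aligned for any store flavour
    v.store(tmp);                    // full-width store into memory we own
    std::memcpy(dst + index, tmp, n * sizeof(float));  // copy only the valid lanes
    return n;
}
```

With `data_size` 7 and `index` 5 as in the sample above, this would write only `data[5]` and `data[6]`, and no `float_v` reference is ever formed into the `float` array.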

Another question, which would be really helpful if you could answer:
I am not sure about the cast to reference type `float_v&` in the above implementation. It doesn't "feel" right. I can think of four different ways to load:
1.
```cpp
alignas(16) float float_arr[4] = {0, 1, 2, 3} ;
float_v load_simd;
load_simd.load(float_arr);
```
2.
```cpp
alignas(16) float float_arr[4] = {0, 1, 2, 3} ;
float_v gather_simd;
gather_simd.gather(float_arr, uint_v::IndexesFromZero());
```
3.
```cpp
alignas(16) float float_arr[4] = {0, 1, 2, 3} ;
float_v* cast_ptr_simd = reinterpret_cast<float_v*>(float_arr) ; 
```
4.
```cpp
alignas(16) float float_arr[4] = {0, 1, 2, 3} ;
float_v& cast_ref_simd = reinterpret_cast<float_v&>(float_arr) ; 
```

Which would be the "best way" or good practice from your expert perspective?
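For the partial case, option 1 could perhaps be extended as in the sketch below: copy the elements that remain into a zero-padded temporary buffer and do a normal full-width `load` from it. `load_partial` is a hypothetical helper name and `FakeFloatV` is again only a stand-in for `float_v`. The sketch assumes the vector type is default constructible and offers `Size` and `load(const float*)`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for Vc::float_v so the sketch compiles without Vc.
// Only the two members used below are modelled: Size and load().
struct FakeFloatV {
    static constexpr std::size_t Size = 4;
    float lanes[Size];
    void load(const float* mem) { std::memcpy(lanes, mem, sizeof(lanes)); }
};

// Loads up to V::Size floats starting at src[index]. Lanes that would lie
// at or beyond src[src_size] are filled with zero instead of being read.
template <typename V>
V load_partial(const float* src, std::size_t src_size, std::size_t index) {
    assert(src != nullptr);
    const std::size_t width = V::Size;
    alignas(64) float tmp[V::Size] = {};  // zero-filled bounce buffer
    if (index < src_size) {
        const std::size_t n = std::min(width, src_size - index);
        std::memcpy(tmp, src + index, n * sizeof(float));  // copy valid elements only
    }
    V v;
    v.load(tmp);  // full-width load from memory we own
    return v;
}
```

This never reads past the end of the source array and avoids both the pointer cast (option 3) and the reference cast (option 4), at the cost of one extra copy through the buffer.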
