Generate separate stores for partially swizzled memory stores

Full-vector stores and fully specified swizzle stores are not affected by this change; only partial swizzles, i.e. swizzles with fewer components than the vector being stored to, are.

Previously, the vector being stored to was loaded, and any components not specified in the swizzle were taken from that loaded value to create a full store to the vector.

While this change generates more SPIR-V instructions, it is necessary for correctness.
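
A minimal sketch of why the old lowering can be incorrect. The message does not spell out the failure mode, so this assumes the problem is two shader invocations writing different components of the same vector at the same time. It is a plain Python model of memory, not glslang code, and every function name in it is hypothetical. In the test output below, the old form corresponds to the Load / VectorShuffle / Store sequence and the new form to one AccessChain / CompositeExtract / Store group per written component.

```python
# Toy model: a vec4 in memory is a list of four floats.
# All names here are illustrative; none of them come from glslang.

def rmw_load(memory):
    """Old lowering, step 1: load the whole destination vector."""
    return list(memory)


def rmw_store(memory, snapshot, indices, values):
    """Old lowering, step 2: shuffle the new values into the loaded copy,
    then store all four components back."""
    merged = list(snapshot)
    for i, v in zip(indices, values):
        merged[i] = v
    memory[:] = merged


def per_component_store(memory, indices, values):
    """New lowering: one scalar store per swizzled component;
    components not named in the swizzle are never touched."""
    for i, v in zip(indices, values):
        memory[i] = v


def simulate_old():
    """Assumed interleaving: both invocations load before either stores."""
    memory = [0.0, 0.0, 0.0, 0.0]
    snap_a = rmw_load(memory)                # invocation A: v.x = 1.0
    snap_b = rmw_load(memory)                # invocation B: v.y = 2.0
    rmw_store(memory, snap_a, [0], [1.0])
    rmw_store(memory, snap_b, [1], [2.0])    # writes back the stale x
    return memory


def simulate_new():
    """Same two writes, lowered as separate per-component stores."""
    memory = [0.0, 0.0, 0.0, 0.0]
    per_component_store(memory, [0], [1.0])  # invocation A: v.x = 1.0
    per_component_store(memory, [1], [2.0])  # invocation B: v.y = 2.0
    return memory


if __name__ == "__main__":
    print("old:", simulate_old())
    print("new:", simulate_new())
```

Under that assumed interleaving the old form loses invocation A's write to x, while the per-component form keeps both writes.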

Fixes #2518.
Jeremy Hayes 2021-07-16 15:07:16 -06:00
parent 9158061398
commit 6d5b40f051
50 changed files with 31343 additions and 26594 deletions

@ -1,13 +1,13 @@
spv.320.meshShaderUserDefined.mesh
// Module Version 10000
// Generated by (magic number): 8000a
// Id's are bound by 140
// Id's are bound by 143
Capability MeshShadingNV
Extension "SPV_NV_mesh_shader"
1: ExtInstImport "GLSL.std.450"
MemoryModel Logical GLSL450
EntryPoint MeshNV 4 "main" 12 19 37 103
EntryPoint MeshNV 4 "main" 12 19 37 106
ExecutionMode 4 LocalSize 32 1 1
ExecutionMode 4 OutputVertices 81
ExecutionMode 4 OutputPrimitivesNV 32
@ -27,11 +27,11 @@ spv.320.meshShaderUserDefined.mesh
MemberName 33(myblock) 4 "m"
MemberName 33(myblock) 5 "mArr"
Name 37 "blk"
Name 99 "myblock2"
MemberName 99(myblock2) 0 "f"
MemberName 99(myblock2) 1 "pos"
MemberName 99(myblock2) 2 "m"
Name 103 "blk2"
Name 102 "myblock2"
MemberName 102(myblock2) 0 "f"
MemberName 102(myblock2) 1 "pos"
MemberName 102(myblock2) 2 "m"
Name 106 "blk2"
Decorate 12(gl_LocalInvocationID) BuiltIn LocalInvocationId
Decorate 19(gl_WorkGroupID) BuiltIn WorkgroupId
MemberDecorate 33(myblock) 0 PerPrimitiveNV
@ -42,9 +42,9 @@ spv.320.meshShaderUserDefined.mesh
MemberDecorate 33(myblock) 5 PerPrimitiveNV
Decorate 33(myblock) Block
Decorate 37(blk) Location 0
Decorate 99(myblock2) Block
Decorate 103(blk2) Location 20
Decorate 139 BuiltIn WorkgroupSize
Decorate 102(myblock2) Block
Decorate 106(blk2) Location 20
Decorate 142 BuiltIn WorkgroupSize
2: TypeVoid
3: TypeFunction 2
6: TypeInt 32 1
@ -82,31 +82,31 @@ spv.320.meshShaderUserDefined.mesh
57: 26(fvec3) ConstantComposite 54 55 56
58: TypePointer Output 26(fvec3)
64: 6(int) Constant 3
69: TypePointer Output 27(fvec4)
74: 6(int) Constant 4
76: 23(float) Constant 1098907648
77: 27(fvec4) ConstantComposite 56 54 55 76
82: 6(int) Constant 5
85: 9(int) Constant 3
88: 9(int) Constant 1
93: 23(float) Constant 1099431936
94: 23(float) Constant 1099956224
95: 23(float) Constant 1100480512
96: 26(fvec3) ConstantComposite 93 94 95
98: 9(int) Constant 264
99(myblock2): TypeStruct 23(float) 27(fvec4) 29
100: 9(int) Constant 81
101: TypeArray 99(myblock2) 100
102: TypePointer Output 101
103(blk2): 102(ptr) Variable Output
109: 23(float) Constant 1101004800
113: 23(float) Constant 1101529088
114: 23(float) Constant 1102053376
115: 23(float) Constant 1102577664
116: 23(float) Constant 1103101952
117: 27(fvec4) ConstantComposite 113 114 115 116
129: 23(float) Constant 1105723392
139: 10(ivec3) ConstantComposite 34 88 88
69: 9(int) Constant 1
74: 9(int) Constant 3
78: 6(int) Constant 4
80: 23(float) Constant 1098907648
81: 27(fvec4) ConstantComposite 56 54 55 80
82: TypePointer Output 27(fvec4)
87: 6(int) Constant 5
96: 23(float) Constant 1099431936
97: 23(float) Constant 1099956224
98: 23(float) Constant 1100480512
99: 26(fvec3) ConstantComposite 96 97 98
101: 9(int) Constant 264
102(myblock2): TypeStruct 23(float) 27(fvec4) 29
103: 9(int) Constant 81
104: TypeArray 102(myblock2) 103
105: TypePointer Output 104
106(blk2): 105(ptr) Variable Output
112: 23(float) Constant 1101004800
116: 23(float) Constant 1101529088
117: 23(float) Constant 1102053376
118: 23(float) Constant 1102577664
119: 23(float) Constant 1103101952
120: 27(fvec4) ConstantComposite 116 117 118 119
132: 23(float) Constant 1105723392
142: 10(ivec3) ConstantComposite 34 69 69
4(main): 2 Function None 3
5: Label
8(iid): 7(ptr) Variable Function
@ -142,64 +142,69 @@ spv.320.meshShaderUserDefined.mesh
66: 6(int) SDiv 65 52
67: 58(ptr) AccessChain 37(blk) 66 52
68: 26(fvec3) Load 67
70: 69(ptr) AccessChain 37(blk) 63 64 44
71: 27(fvec4) Load 70
72: 27(fvec4) VectorShuffle 71 68 0 4 5 6
Store 70 72
73: 6(int) Load 8(iid)
75: 6(int) SDiv 73 74
78: 69(ptr) AccessChain 37(blk) 75 74 52
79: 27(fvec4) Load 78
80: 27(fvec4) VectorShuffle 79 77 7 6 5 4
Store 78 80
81: 6(int) Load 8(iid)
83: 6(int) Load 8(iid)
84: 6(int) SDiv 83 74
86: 41(ptr) AccessChain 37(blk) 84 74 52 85
87: 23(float) Load 86
89: 41(ptr) AccessChain 37(blk) 81 82 39 44 88
Store 89 87
90: 6(int) Load 8(iid)
91: 6(int) IMul 90 74
92: 6(int) Load 18(gid)
97: 58(ptr) AccessChain 37(blk) 91 82 44 92
Store 97 96
MemoryBarrier 88 98
ControlBarrier 31 31 98
104: 6(int) Load 8(iid)
105: 6(int) Load 8(iid)
106: 6(int) ISub 105 44
107: 41(ptr) AccessChain 103(blk2) 106 39
108: 23(float) Load 107
110: 23(float) FAdd 108 109
111: 41(ptr) AccessChain 103(blk2) 104 39
Store 111 110
112: 6(int) Load 8(iid)
118: 69(ptr) AccessChain 103(blk2) 112 44
Store 118 117
119: 6(int) Load 8(iid)
120: 6(int) IAdd 119 44
121: 6(int) Load 18(gid)
70: 41(ptr) AccessChain 37(blk) 63 64 44 69
71: 23(float) CompositeExtract 68 0
Store 70 71
72: 41(ptr) AccessChain 37(blk) 63 64 44 31
73: 23(float) CompositeExtract 68 1
Store 72 73
75: 41(ptr) AccessChain 37(blk) 63 64 44 74
76: 23(float) CompositeExtract 68 2
Store 75 76
77: 6(int) Load 8(iid)
79: 6(int) SDiv 77 78
83: 82(ptr) AccessChain 37(blk) 79 78 52
84: 27(fvec4) Load 83
85: 27(fvec4) VectorShuffle 84 81 7 6 5 4
Store 83 85
86: 6(int) Load 8(iid)
88: 6(int) Load 8(iid)
89: 6(int) SDiv 88 78
90: 41(ptr) AccessChain 37(blk) 89 78 52 74
91: 23(float) Load 90
92: 41(ptr) AccessChain 37(blk) 86 87 39 44 69
Store 92 91
93: 6(int) Load 8(iid)
94: 6(int) IMul 93 78
95: 6(int) Load 18(gid)
100: 58(ptr) AccessChain 37(blk) 94 87 44 95
Store 100 99
MemoryBarrier 69 101
ControlBarrier 31 31 101
107: 6(int) Load 8(iid)
108: 6(int) Load 8(iid)
109: 6(int) ISub 108 44
110: 41(ptr) AccessChain 106(blk2) 109 39
111: 23(float) Load 110
113: 23(float) FAdd 111 112
114: 41(ptr) AccessChain 106(blk2) 107 39
Store 114 113
115: 6(int) Load 8(iid)
121: 82(ptr) AccessChain 106(blk2) 115 44
Store 121 120
122: 6(int) Load 8(iid)
123: 69(ptr) AccessChain 103(blk2) 122 44
124: 27(fvec4) Load 123
125: 69(ptr) AccessChain 103(blk2) 120 52 121
Store 125 124
126: 6(int) Load 8(iid)
127: 6(int) IAdd 126 44
128: 6(int) Load 18(gid)
130: 41(ptr) AccessChain 103(blk2) 127 52 128 31
Store 130 129
131: 6(int) Load 8(iid)
132: 6(int) IAdd 131 52
133: 6(int) Load 8(iid)
134: 6(int) IAdd 133 44
135: 6(int) Load 18(gid)
136: 69(ptr) AccessChain 103(blk2) 134 52 135
137: 27(fvec4) Load 136
138: 69(ptr) AccessChain 103(blk2) 132 52 64
Store 138 137
MemoryBarrier 88 98
ControlBarrier 31 31 98
123: 6(int) IAdd 122 44
124: 6(int) Load 18(gid)
125: 6(int) Load 8(iid)
126: 82(ptr) AccessChain 106(blk2) 125 44
127: 27(fvec4) Load 126
128: 82(ptr) AccessChain 106(blk2) 123 52 124
Store 128 127
129: 6(int) Load 8(iid)
130: 6(int) IAdd 129 44
131: 6(int) Load 18(gid)
133: 41(ptr) AccessChain 106(blk2) 130 52 131 31
Store 133 132
134: 6(int) Load 8(iid)
135: 6(int) IAdd 134 52
136: 6(int) Load 8(iid)
137: 6(int) IAdd 136 44
138: 6(int) Load 18(gid)
139: 82(ptr) AccessChain 106(blk2) 137 52 138
140: 27(fvec4) Load 139
141: 82(ptr) AccessChain 106(blk2) 135 52 64
Store 141 140
MemoryBarrier 69 101
ControlBarrier 31 31 101
Return
FunctionEnd