I think slices and interfaces in Go are both 2×usize width, in which case you just need double-width CAS to make them safe from data races - which most modern architectures support.
Slices are base pointer, length, and capacity, so three words. But there's a garbage collector, so you can install a thunk if the slice is updated in a way that torn reads cause problems. It makes reads slightly slower, of course.
Two-word loads could be used for interface values (assuming that alignment is increased), except if support older x86-64 is needed. There are implementations that require using CMPXCHG16B, which is a really slow way to load two words. Both Intel and AMD updated their ISA manuals that using VMOVDQA etc. is fine (once the CPU supports AVX and the memory is cacheable).