Omgboom@lemmy.zip to Lemmy Shitpost@lemmy.world · 8 days ago
Should have seen it coming (image, lemmy.zip) · 74 comments
theunknownmuncher@lemmy.world · 8 days ago
https://arxiv.org/abs/2405.20304 They invented their own reinforcement learning framework called Group Relative Policy Optimization.
Sanctus@lemmy.world · 8 days ago
Yeah, the original comment in this chain more describes US telcos and shit, not this particular instance.