Hacker Next
Batched reward model inference and Best-of-N sampling (raw.sh)
34 points by rawsh 37 days ago | 0 comments