• brucethemoose@lemmy.world
      2 days ago

      Honestly, it’s not that big a deal. Knowing how it was trained (beyond the config) wouldn’t help you modify it, and it’s still highly modifiable as is. Nobody can afford to replicate it anyway.

      It would be nice if they published the hyperparameters for research purposes, but… shrug.

      A subset of the exact training data and hyperparameters might help with quantization-aware training, but that’s all I’ve got.