- https://arxiv.org/abs/2402.17764
--  https://arxiv.org/pdf/2402.17764.pdf

- https://github.com/Beomi/BitNet-Transformers/
- https://github.com/frodo821/BitNet-Transformers
>
https://twitter.com/BoufrawFrodo2/status/1763435835935047789
I modified the original BitNet to use ternary values, following the 1.58b paper.
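
A minimal sketch of what that ternary (1.58-bit) conversion amounts to, assuming the absmean quantizer described in the paper (scale by the mean absolute weight, then round and clip to {-1, 0, +1}); the function name is illustrative and not taken from the linked repo:

 import torch

 def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
     """Quantize a weight matrix to {-1, 0, +1} via absmean scaling (BitNet b1.58 style)."""
     gamma = w.abs().mean()                          # per-tensor absmean scale
     w_q = (w / (gamma + eps)).round().clamp_(-1, 1)
     return w_q, gamma                               # w is approximated by w_q * gamma

 # usage: quantize a random weight matrix and check the value set
 w = torch.randn(256, 256)
 w_q, gamma = absmean_ternary_quantize(w)
 print(w_q.unique())   # tensor([-1., 0., 1.])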

- Microsoft releases a 1.58-bit large language model: matrix calculations can be turned into additions, drastically reducing compute cost
--  https://gigazine.net/news/20240229-microsoft-1bit-llm/
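
The headline's claim that matrix multiplication collapses into addition follows from the weights taking only the values {-1, 0, +1}: each output element is just a signed sum of selected activations. A toy sketch in plain Python (names are illustrative):

 def ternary_matvec(w_rows, x):
     """y = W @ x with every weight in {-1, 0, +1}: only additions/subtractions, no multiplies."""
     y = []
     for row in w_rows:
         acc = 0.0
         for w, v in zip(row, x):
             if w == 1:
                 acc += v
             elif w == -1:
                 acc -= v
             # w == 0 contributes nothing
         y.append(acc)
     return y

 # usage
 W = [[1, 0, -1], [-1, 1, 0]]
 x = [0.5, 2.0, -1.5]
 print(ternary_matvec(W, x))   # [2.0, 1.5]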

- The 1-bit LLM shock! 8.9x faster at 70B, all inference with additions only! GPUs may even become unnecessary
--  https://topics.smt.docomo.ne.jp/article/wirelesswire/business/wirelesswire-20240286094?

- https://twitter.com/andrew_n_carr/status/1770487200234213758?s=12
>
1.58 bit code is out (in an appendix)
https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf

https://pbs.twimg.com/media/GJIIs1ZbcAA7hzM?format=jpg&name=small#.jpg
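
The linked PDF carries the reference BitLinear code; the following is only a simplified sketch of the forward pass it describes (absmean ternary weights, per-token absmax 8-bit activations, straight-through estimator), with all details here assumed rather than copied from the appendix:

 import torch
 import torch.nn as nn
 import torch.nn.functional as F

 class BitLinearSketch(nn.Linear):
     """Rough BitLinear-style layer: ternary (absmean) weights, 8-bit (absmax) activations,
     straight-through estimator so the full-precision weights still receive gradients."""

     def forward(self, x):
         eps = 1e-5
         # weight: scale by mean(|W|), round/clip to {-1, 0, +1}, then rescale
         gamma = self.weight.abs().mean()
         w_q = (self.weight / (gamma + eps)).round().clamp(-1, 1) * gamma
         # activation: per-token absmax scaling into the int8 range [-128, 127]
         scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=eps)
         x_q = (x * scale).round().clamp(-128, 127) / scale
         # straight-through estimator: quantized values forward, identity backward
         w_q = self.weight + (w_q - self.weight).detach()
         x_q = x + (x_q - x).detach()
         return F.linear(x_q, w_q, self.bias)

 # usage: drop-in replacement for nn.Linear inside a Transformer block
 layer = BitLinearSketch(512, 512, bias=False)
 y = layer(torch.randn(4, 512))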


- https://twitter.com/teortaxesTex/status/1773861506674741570
>
It seems that results of that Microsoft paper about ternary LLMs can be replicated after all – for 3B@100B at least.
https://huggingface.co/1bitLLM/bitnet_b1_58-3B

https://pbs.twimg.com/media/GJ4E0F4XgAAcKal?format=jpg&name=small#.jpg


