Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models

by Siyan Zhao, Daniel Israel, Guy Van den Broeck and Aditya Grover
Reference:
Siyan Zhao, Daniel Israel, Guy Van den Broeck and Aditya Grover. Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models. In arXiv, 2024.
Bibtex Entry:
@inproceedings{ZhaoArxiv24,
  author    = {Zhao, Siyan and Israel, Daniel and Van den Broeck, Guy and Grover, Aditya},
  title     = {Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models},
  booktitle = {Arxiv},
  url       = "https://arxiv.org/pdf/2404.09529.pdf",
  eprint    = {2404.09529},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG},
  month     = apr,
  year      = {2024},
  keywords  = {techreport}
}