Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models
by Siyan Zhao, Daniel Israel, Guy Van den Broeck and Aditya Grover
Reference:
Siyan Zhao, Daniel Israel, Guy Van den Broeck, and Aditya Grover. Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models. arXiv preprint arXiv:2404.09529, 2024.
Bibtex Entry:
@inproceedings{ZhaoArxiv24,
  author = {Zhao, Siyan and Israel, Daniel and Van den Broeck, Guy and Grover, Aditya},
  title = {Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models},
  booktitle = {Arxiv},
  url = "https://arxiv.org/pdf/2404.09529.pdf",
  eprint = {2404.09529},
  archivePrefix = {arXiv},
  primaryClass = {cs.LG},
  month = apr,
  year = {2024},
  keywords = {techreport}
}